Merge branch 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace

Pull siginfo cleanups from Eric Biederman:
 "Long ago when 2.4 was just a testing release copy_siginfo_to_user was
  made to copy individual fields to userspace, possibly for efficiency
  and to ensure initialized values were not copied to userspace.

  Unfortunately the design was complex, it's assumptions unstated, and
  humans are fallible and so while it worked much of the time that
  design failed to ensure unitialized memory is not copied to userspace.

  This set of changes is part of a new design to clean up siginfo and
  simplify things, and hopefully make the siginfo handling robust enough
  that a simple inspection of the code can be made to ensure we don't
  copy any unitializied fields to userspace.

  The design is to unify struct siginfo and struct compat_siginfo into a
  single definition that is shared between all architectures so that
  anyone adding to the set of information shared with struct siginfo can
  see the whole picture. Hopefully ensuring all future si_code
  assignments are arch independent.

  The design is to unify copy_siginfo_to_user32 and
  copy_siginfo_from_user32 so that those function are complete and cope
  with all of the different cases documented in signinfo_layout. I don't
  think there was a single implementation of either of those functions
  that was complete and correct before my changes unified them.

  The design is to introduce a series of helpers including
  force_siginfo_fault that take the values that are needed in struct
  siginfo and build the siginfo structure for their callers. Ensuring
  struct siginfo is built correctly.

  The remaining work for 4.17 (unless someone thinks it is post -rc1
  material) is to push usage of those helpers down into the
  architectures so that architecture specific code will not need to deal
  with the fiddly work of intializing struct siginfo, and then when
  struct siginfo is guaranteed to be fully initialized change copy
  siginfo_to_user into a simple wrapper around copy_to_user.

  Further there is work in progress on the issues that have been
  documented requires arch specific knowledge to sort out.

  The changes below fix or at least document all of the issues that have
  been found with siginfo generation. Then proceed to unify struct
  siginfo the 32 bit helpers that copy siginfo to and from userspace,
  and generally clean up anything that is not arch specific with regards
  to siginfo generation.

  It is a lot but with the unification you can of siginfo you can
  already see the code reduction in the kernel"

* 'siginfo-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace: (45 commits)
  signal/memory-failure: Use force_sig_mceerr and send_sig_mceerr
  mm/memory_failure: Remove unused trapno from memory_failure
  signal/ptrace: Add force_sig_ptrace_errno_trap and use it where needed
  signal/powerpc: Remove unnecessary signal_code parameter of do_send_trap
  signal: Helpers for faults with specialized siginfo layouts
  signal: Add send_sig_fault and force_sig_fault
  signal: Replace memset(info,...) with clear_siginfo for clarity
  signal: Don't use structure initializers for struct siginfo
  signal/arm64: Better isolate the COMPAT_TASK portion of ptrace_hbptriggered
  ptrace: Use copy_siginfo in setsiginfo and getsiginfo
  signal: Unify and correct copy_siginfo_to_user32
  signal: Remove the code to clear siginfo before calling copy_siginfo_from_user32
  signal: Unify and correct copy_siginfo_from_user32
  signal/blackfin: Remove pointless UID16_SIGINFO_COMPAT_NEEDED
  signal/blackfin: Move the blackfin specific si_codes to asm-generic/siginfo.h
  signal/tile: Move the tile specific si_codes to asm-generic/siginfo.h
  signal/frv: Move the frv specific si_codes to asm-generic/siginfo.h
  signal/ia64: Move the ia64 specific si_codes to asm-generic/siginfo.h
  signal/powerpc: Remove redefinition of NSIGTRAP on powerpc
  signal: Move addr_lsb into the _sigfault union for clarity
  ...
diff --git a/.mailmap b/.mailmap
index 1469ff0..e18cab7 100644
--- a/.mailmap
+++ b/.mailmap
@@ -107,6 +107,7 @@
 Maciej W. Rozycki <macro@mips.com> <macro@imgtec.com>
 Marcin Nowakowski <marcin.nowakowski@mips.com> <marcin.nowakowski@imgtec.com>
 Mark Brown <broonie@sirena.org.uk>
+Mark Yao <markyao0591@gmail.com> <mark.yao@rock-chips.com>
 Martin Kepplinger <martink@posteo.de> <martin.kepplinger@theobroma-systems.com>
 Martin Kepplinger <martink@posteo.de> <martin.kepplinger@ginzinger.com>
 Matthieu CASTET <castet.matthieu@free.fr>
diff --git a/Documentation/ABI/testing/sysfs-bus-iio-dfsdm-adc-stm32 b/Documentation/ABI/testing/sysfs-bus-iio-dfsdm-adc-stm32
new file mode 100644
index 0000000..da98223
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-iio-dfsdm-adc-stm32
@@ -0,0 +1,16 @@
+What:		/sys/bus/iio/devices/iio:deviceX/in_voltage_spi_clk_freq
+KernelVersion:	4.14
+Contact:	arnaud.pouliquen@st.com
+Description:
+		For audio purpose only.
+		Used by audio driver to set/get the spi input frequency.
+		This is mandatory if DFSDM is slave on SPI bus, to
+		provide information on the SPI clock frequency during runtime
+		Notice that the SPI frequency should be a multiple of sample
+		frequency to ensure the precision.
+		if DFSDM input is SPI master
+			Reading  SPI clkout frequency,
+			error on writing
+		If DFSDM input is SPI Slave:
+			Reading returns value previously set.
+			Writing value before starting conversions.
\ No newline at end of file
diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu b/Documentation/ABI/testing/sysfs-devices-system-cpu
index d6d862d..bfd29bc 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -375,3 +375,19 @@
 Description:	information about CPUs heterogeneity.
 
 		cpu_capacity: capacity of cpu#.
+
+What:		/sys/devices/system/cpu/vulnerabilities
+		/sys/devices/system/cpu/vulnerabilities/meltdown
+		/sys/devices/system/cpu/vulnerabilities/spectre_v1
+		/sys/devices/system/cpu/vulnerabilities/spectre_v2
+Date:		January 2018
+Contact:	Linux kernel mailing list <linux-kernel@vger.kernel.org>
+Description:	Information about CPU vulnerabilities
+
+		The files are named after the code names of CPU
+		vulnerabilities. The output of those files reflects the
+		state of the CPUs in the system. Possible output values:
+
+		"Not affected"	  CPU is not affected by the vulnerability
+		"Vulnerable"	  CPU is affected and no mitigation in effect
+		"Mitigation: $M"  CPU is affected and mitigation $M is in effect
diff --git a/Documentation/IRQ-domain.txt b/Documentation/IRQ-domain.txt
index 4a1cd76..507775c 100644
--- a/Documentation/IRQ-domain.txt
+++ b/Documentation/IRQ-domain.txt
@@ -265,37 +265,5 @@
 
 === Debugging ===
 
-If you switch on CONFIG_IRQ_DOMAIN_DEBUG (which depends on
-CONFIG_IRQ_DOMAIN and CONFIG_DEBUG_FS), you will find a new file in
-your debugfs mount point, called irq_domain_mapping. This file
-contains a live snapshot of all the IRQ domains in the system:
-
- name              mapped  linear-max  direct-max  devtree-node
- pl061                  8           8           0  /smb/gpio@e0080000
- pl061                  8           8           0  /smb/gpio@e1050000
- pMSI                   0           0           0  /interrupt-controller@e1101000/v2m@e0080000
- MSI                   37           0           0  /interrupt-controller@e1101000/v2m@e0080000
- GICv2m                37           0           0  /interrupt-controller@e1101000/v2m@e0080000
- GICv2                448         448           0  /interrupt-controller@e1101000
-
-it also iterates over the interrupts to display their mapping in the
-domains, and makes the domain stacking visible:
-
-
-irq    hwirq    chip name        chip data           active  type            domain
-    1  0x00019  GICv2            0xffff00000916bfd8     *    LINEAR          GICv2
-    2  0x0001d  GICv2            0xffff00000916bfd8          LINEAR          GICv2
-    3  0x0001e  GICv2            0xffff00000916bfd8     *    LINEAR          GICv2
-    4  0x0001b  GICv2            0xffff00000916bfd8     *    LINEAR          GICv2
-    5  0x0001a  GICv2            0xffff00000916bfd8          LINEAR          GICv2
-[...]
-   96  0x81808  MSI              0x          (null)           RADIX          MSI
-   96+ 0x00063  GICv2m           0xffff8003ee116980           RADIX          GICv2m
-   96+ 0x00063  GICv2            0xffff00000916bfd8          LINEAR          GICv2
-   97  0x08800  MSI              0x          (null)     *     RADIX          MSI
-   97+ 0x00064  GICv2m           0xffff8003ee116980     *     RADIX          GICv2m
-   97+ 0x00064  GICv2            0xffff00000916bfd8     *    LINEAR          GICv2
-
-Here, interrupts 1-5 are only using a single domain, while 96 and 97
-are build out of a stack of three domain, each level performing a
-particular function.
+Most of the internals of the IRQ subsystem are exposed in debugfs by
+turning CONFIG_GENERIC_IRQ_DEBUGFS on.
diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.html b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
index 38d6d80..6c06e10 100644
--- a/Documentation/RCU/Design/Data-Structures/Data-Structures.html
+++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.html
@@ -1097,7 +1097,8 @@
 its next exit from idle.
 Finally, the <tt>rcu_qs_ctr_snap</tt> field is used to detect
 cases where a given operation has resulted in a quiescent state
-for all flavors of RCU, for example, <tt>cond_resched_rcu_qs()</tt>.
+for all flavors of RCU, for example, <tt>cond_resched()</tt>
+when RCU has indicated a need for quiescent states.
 
 <h5>RCU Callback Handling</h5>
 
@@ -1182,8 +1183,8 @@
 Its fields are as follows:
 
 <pre>
-  1   int dynticks_nesting;
-  2   int dynticks_nmi_nesting;
+  1   long dynticks_nesting;
+  2   long dynticks_nmi_nesting;
   3   atomic_t dynticks;
   4   bool rcu_need_heavy_qs;
   5   unsigned long rcu_qs_ctr;
@@ -1191,15 +1192,31 @@
 </pre>
 
 <p>The <tt>-&gt;dynticks_nesting</tt> field counts the
-nesting depth of normal interrupts.
-In addition, this counter is incremented when exiting dyntick-idle
-mode and decremented when entering it.
+nesting depth of process execution, so that in normal circumstances
+this counter has value zero or one.
+NMIs, irqs, and tracers are counted by the <tt>-&gt;dynticks_nmi_nesting</tt>
+field.
+Because NMIs cannot be masked, changes to this variable have to be
+undertaken carefully using an algorithm provided by Andy Lutomirski.
+The initial transition from idle adds one, and nested transitions
+add two, so that a nesting level of five is represented by a
+<tt>-&gt;dynticks_nmi_nesting</tt> value of nine.
 This counter can therefore be thought of as counting the number
 of reasons why this CPU cannot be permitted to enter dyntick-idle
-mode, aside from non-maskable interrupts (NMIs).
-NMIs are counted by the <tt>-&gt;dynticks_nmi_nesting</tt>
-field, except that NMIs that interrupt non-dyntick-idle execution
-are not counted.
+mode, aside from process-level transitions.
+
+<p>However, it turns out that when running in non-idle kernel context,
+the Linux kernel is fully capable of entering interrupt handlers that
+never exit and perhaps also vice versa.
+Therefore, whenever the <tt>-&gt;dynticks_nesting</tt> field is
+incremented up from zero, the <tt>-&gt;dynticks_nmi_nesting</tt> field
+is set to a large positive number, and whenever the
+<tt>-&gt;dynticks_nesting</tt> field is decremented down to zero,
+the the <tt>-&gt;dynticks_nmi_nesting</tt> field is set to zero.
+Assuming that the number of misnested interrupts is not sufficient
+to overflow the counter, this approach corrects the
+<tt>-&gt;dynticks_nmi_nesting</tt> field every time the corresponding
+CPU enters the idle loop from process context.
 
 </p><p>The <tt>-&gt;dynticks</tt> field counts the corresponding
 CPU's transitions to and from dyntick-idle mode, so that this counter
@@ -1231,14 +1248,16 @@
 <tr><th>&nbsp;</th></tr>
 <tr><th align="left">Quick Quiz:</th></tr>
 <tr><td>
-	Why not just count all NMIs?
-	Wouldn't that be simpler and less error prone?
+	Why not simply combine the <tt>-&gt;dynticks_nesting</tt>
+	and <tt>-&gt;dynticks_nmi_nesting</tt> counters into a
+	single counter that just counts the number of reasons that
+	the corresponding CPU is non-idle?
 </td></tr>
 <tr><th align="left">Answer:</th></tr>
 <tr><td bgcolor="#ffffff"><font color="ffffff">
-	It seems simpler only until you think hard about how to go about
-	updating the <tt>rcu_dynticks</tt> structure's
-	<tt>-&gt;dynticks</tt> field.
+	Because this would fail in the presence of interrupts whose
+	handlers never return and of handlers that manage to return
+	from a made-up interrupt.
 </font></td></tr>
 <tr><td>&nbsp;</td></tr>
 </table>
diff --git a/Documentation/RCU/Design/Requirements/Requirements.html b/Documentation/RCU/Design/Requirements/Requirements.html
index 62e847b..4969022 100644
--- a/Documentation/RCU/Design/Requirements/Requirements.html
+++ b/Documentation/RCU/Design/Requirements/Requirements.html
@@ -581,7 +581,8 @@
 DYNIX/ptx used an explicit memory barrier for publication, but had nothing
 resembling <tt>rcu_dereference()</tt> for subscription, nor did it
 have anything resembling the <tt>smp_read_barrier_depends()</tt>
-that was later subsumed into <tt>rcu_dereference()</tt>.
+that was later subsumed into <tt>rcu_dereference()</tt> and later
+still into <tt>READ_ONCE()</tt>.
 The need for these operations made itself known quite suddenly at a
 late-1990s meeting with the DEC Alpha architects, back in the days when
 DEC was still a free-standing company.
@@ -2797,7 +2798,7 @@
 executing in usermode (which is one use case for
 <tt>CONFIG_NO_HZ_FULL=y</tt>) or in the kernel.
 That said, CPU-bound loops in the kernel must execute
-<tt>cond_resched_rcu_qs()</tt> at least once per few tens of milliseconds
+<tt>cond_resched()</tt> at least once per few tens of milliseconds
 in order to avoid receiving an IPI from RCU.
 
 <p>
@@ -3128,7 +3129,7 @@
 is to have implicit
 read-side critical sections that are delimited by voluntary context
 switches, that is, calls to <tt>schedule()</tt>,
-<tt>cond_resched_rcu_qs()</tt>, and
+<tt>cond_resched()</tt>, and
 <tt>synchronize_rcu_tasks()</tt>.
 In addition, transitions to and from userspace execution also delimit
 tasks-RCU read-side critical sections.
diff --git a/Documentation/RCU/rcu_dereference.txt b/Documentation/RCU/rcu_dereference.txt
index 1acb26b..ab96227 100644
--- a/Documentation/RCU/rcu_dereference.txt
+++ b/Documentation/RCU/rcu_dereference.txt
@@ -122,11 +122,7 @@
 		Note that if checks for being within an RCU read-side
 		critical section are not required and the pointer is never
 		dereferenced, rcu_access_pointer() should be used in place
-		of rcu_dereference(). The rcu_access_pointer() primitive
-		does not require an enclosing read-side critical section,
-		and also omits the smp_read_barrier_depends() included in
-		rcu_dereference(), which in turn should provide a small
-		performance gain in some CPUs (e.g., the DEC Alpha).
+		of rcu_dereference().
 
 	o	The comparison is against a pointer that references memory
 		that was initialized "a long time ago."  The reason
diff --git a/Documentation/RCU/stallwarn.txt b/Documentation/RCU/stallwarn.txt
index a08f928..4259f95 100644
--- a/Documentation/RCU/stallwarn.txt
+++ b/Documentation/RCU/stallwarn.txt
@@ -23,12 +23,10 @@
 o	A CPU looping with bottom halves disabled.  This condition can
 	result in RCU-sched and RCU-bh stalls.
 
-o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the
-	kernel without invoking schedule().  Note that cond_resched()
-	does not necessarily prevent RCU CPU stall warnings.  Therefore,
-	if the looping in the kernel is really expected and desirable
-	behavior, you might need to replace some of the cond_resched()
-	calls with calls to cond_resched_rcu_qs().
+o	For !CONFIG_PREEMPT kernels, a CPU looping anywhere in the kernel
+	without invoking schedule().  If the looping in the kernel is
+	really expected and desirable behavior, you might need to add
+	some calls to cond_resched().
 
 o	Booting Linux using a console connection that is too slow to
 	keep up with the boot-time console-message rate.  For example,
diff --git a/Documentation/RCU/whatisRCU.txt b/Documentation/RCU/whatisRCU.txt
index df62466..a27fbfb 100644
--- a/Documentation/RCU/whatisRCU.txt
+++ b/Documentation/RCU/whatisRCU.txt
@@ -600,8 +600,7 @@
 
 	#define rcu_dereference(p) \
 	({ \
-		typeof(p) _________p1 = p; \
-		smp_read_barrier_depends(); \
+		typeof(p) _________p1 = READ_ONCE(p); \
 		(_________p1); \
 	})
 
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index af7104a..b98048b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -114,7 +114,6 @@
 			This facility can be used to prevent such uncontrolled
 			GPE floodings.
 			Format: <int>
-			Support masking of GPEs numbered from 0x00 to 0x7f.
 
 	acpi_no_auto_serialize	[HW,ACPI]
 			Disable auto-serialization of AML methods
@@ -223,7 +222,7 @@
 
 	acpi_sleep=	[HW,ACPI] Sleep options
 			Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig,
-				  old_ordering, nonvs, sci_force_enable }
+				  old_ordering, nonvs, sci_force_enable, nobl }
 			See Documentation/power/video.txt for information on
 			s3_bios and s3_mode.
 			s3_beep is for debugging; it makes the PC's speaker beep
@@ -239,6 +238,9 @@
 			sci_force_enable causes the kernel to set SCI_EN directly
 			on resume from S1/S3 (which is against the ACPI spec,
 			but some broken systems don't work without it).
+			nobl causes the internal blacklist of systems known to
+			behave incorrectly in some ways with respect to system
+			suspend and resume to be ignored (use wisely).
 
 	acpi_use_timer_override [HW,ACPI]
 			Use timer override. For some broken Nvidia NF5 boards
@@ -713,9 +715,6 @@
 			It will be ignored when crashkernel=X,high is not used
 			or memory reserved is below 4G.
 
-	crossrelease_fullstack
-			[KNL] Allow to record full stack trace in cross-release
-
 	cryptomgr.notests
                         [KNL] Disable crypto self-tests
 
@@ -2053,9 +2052,6 @@
 			This tests the locking primitive's ability to
 			transition abruptly to and from idle.
 
-	locktorture.torture_runnable= [BOOT]
-			Start locktorture running at boot time.
-
 	locktorture.torture_type= [KNL]
 			Specify the locking implementation to test.
 
@@ -2626,6 +2622,11 @@
 	nosmt		[KNL,S390] Disable symmetric multithreading (SMT).
 			Equivalent to smt=1.
 
+	nospectre_v2	[X86] Disable all mitigations for the Spectre variant 2
+			(indirect branch prediction) vulnerability. System may
+			allow data leaks with this option, which is equivalent
+			to spectre_v2=off.
+
 	noxsave		[BUGS=X86] Disables x86 extended register state save
 			and restore using xsave. The kernel will fallback to
 			enabling legacy floating-point and sse state.
@@ -2712,8 +2713,6 @@
 			steal time is computed, but won't influence scheduler
 			behaviour
 
-	nopti		[X86-64] Disable kernel page table isolation
-
 	nolapic		[X86-32,APIC] Do not enable or use the local APIC.
 
 	nolapic_timer	[X86-32,APIC] Do not use the local APIC timer.
@@ -3100,6 +3099,12 @@
 		pcie_scan_all	Scan all possible PCIe devices.  Otherwise we
 				only look for one device below a PCIe downstream
 				port.
+		big_root_window	Try to add a big 64bit memory window to the PCIe
+				root complex on AMD CPUs. Some GFX hardware
+				can resize a BAR to allow access to all VRAM.
+				Adding the window is slightly risky (it may
+				conflict with unreported devices), so this
+				taints the kernel.
 
 	pcie_aspm=	[PCIE] Forcibly enable or disable PCIe Active State Power
 			Management.
@@ -3288,11 +3293,20 @@
 	pt.		[PARIDE]
 			See Documentation/blockdev/paride.txt.
 
-	pti=		[X86_64]
-			Control user/kernel address space isolation:
-			on - enable
-			off - disable
-			auto - default setting
+	pti=		[X86_64] Control Page Table Isolation of user and
+			kernel address spaces.  Disabling this feature
+			removes hardening, but improves performance of
+			system calls and interrupts.
+
+			on   - unconditionally enable
+			off  - unconditionally disable
+			auto - kernel detects whether your CPU model is
+			       vulnerable to issues that PTI mitigates
+
+			Not specifying this option is equivalent to pti=auto.
+
+	nopti		[X86_64]
+			Equivalent to pti=off
 
 	pty.legacy_count=
 			[KNL] Number of legacy pty's. Overwrites compiled-in
@@ -3471,9 +3485,6 @@
 			the same as for rcuperf.nreaders.
 			N, where N is the number of CPUs
 
-	rcuperf.perf_runnable= [BOOT]
-			Start rcuperf running at boot time.
-
 	rcuperf.perf_type= [KNL]
 			Specify the RCU implementation to test.
 
@@ -3607,9 +3618,6 @@
 			Test RCU's dyntick-idle handling.  See also the
 			rcutorture.shuffle_interval parameter.
 
-	rcutorture.torture_runnable= [BOOT]
-			Start rcutorture running at boot time.
-
 	rcutorture.torture_type= [KNL]
 			Specify the RCU implementation to test.
 
@@ -3667,7 +3675,8 @@
 
 	rdt=		[HW,X86,RDT]
 			Turn on/off individual RDT features. List is:
-			cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, mba.
+			cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
+			mba.
 			E.g. to turn on cmt and turn off mba use:
 				rdt=cmt,!mba
 
@@ -3943,6 +3952,29 @@
 	sonypi.*=	[HW] Sony Programmable I/O Control Device driver
 			See Documentation/laptops/sonypi.txt
 
+	spectre_v2=	[X86] Control mitigation of Spectre variant 2
+			(indirect branch speculation) vulnerability.
+
+			on   - unconditionally enable
+			off  - unconditionally disable
+			auto - kernel detects whether your CPU model is
+			       vulnerable
+
+			Selecting 'on' will, and 'auto' may, choose a
+			mitigation method at run time according to the
+			CPU, the available microcode, the setting of the
+			CONFIG_RETPOLINE configuration option, and the
+			compiler with which the kernel was built.
+
+			Specific mitigations can also be selected manually:
+
+			retpoline	  - replace indirect branches
+			retpoline,generic - google's original retpoline
+			retpoline,amd     - AMD-specific minimal thunk
+
+			Not specifying this option is equivalent to
+			spectre_v2=auto.
+
 	spia_io_base=	[HW,MTD]
 	spia_fio_base=
 	spia_pedr=
diff --git a/Documentation/arm64/cpu-feature-registers.txt b/Documentation/arm64/cpu-feature-registers.txt
index bd9b3fa..a70090b 100644
--- a/Documentation/arm64/cpu-feature-registers.txt
+++ b/Documentation/arm64/cpu-feature-registers.txt
@@ -110,7 +110,9 @@
      x--------------------------------------------------x
      | Name                         |  bits   | visible |
      |--------------------------------------------------|
-     | RES0                         | [63-48] |    n    |
+     | RES0                         | [63-52] |    n    |
+     |--------------------------------------------------|
+     | FHM                          | [51-48] |    y    |
      |--------------------------------------------------|
      | DP                           | [47-44] |    y    |
      |--------------------------------------------------|
diff --git a/Documentation/arm64/elf_hwcaps.txt b/Documentation/arm64/elf_hwcaps.txt
index 89edba1..57324ee 100644
--- a/Documentation/arm64/elf_hwcaps.txt
+++ b/Documentation/arm64/elf_hwcaps.txt
@@ -158,3 +158,7 @@
 HWCAP_SVE
 
     Functionality implied by ID_AA64PFR0_EL1.SVE == 0b0001.
+
+HWCAP_ASIMDFHM
+
+   Functionality implied by ID_AA64ISAR0_EL1.FHM == 0b0001.
diff --git a/Documentation/arm64/silicon-errata.txt b/Documentation/arm64/silicon-errata.txt
index fc1c884..c1d520d 100644
--- a/Documentation/arm64/silicon-errata.txt
+++ b/Documentation/arm64/silicon-errata.txt
@@ -72,7 +72,7 @@
 | Hisilicon      | Hip0{6,7}       | #161010701      | N/A                         |
 | Hisilicon      | Hip07           | #161600802      | HISILICON_ERRATUM_161600802 |
 |                |                 |                 |                             |
-| Qualcomm Tech. | Falkor v1       | E1003           | QCOM_FALKOR_ERRATUM_1003    |
+| Qualcomm Tech. | Kryo/Falkor v1  | E1003           | QCOM_FALKOR_ERRATUM_1003    |
 | Qualcomm Tech. | Falkor v1       | E1009           | QCOM_FALKOR_ERRATUM_1009    |
 | Qualcomm Tech. | QDF2400 ITS     | E0065           | QCOM_QDF2400_ERRATUM_0065   |
 | Qualcomm Tech. | Falkor v{1,2}   | E1041           | QCOM_FALKOR_ERRATUM_1041    |
diff --git a/Documentation/circular-buffers.txt b/Documentation/circular-buffers.txt
index d462817..53e51ca 100644
--- a/Documentation/circular-buffers.txt
+++ b/Documentation/circular-buffers.txt
@@ -220,8 +220,7 @@
 
 Note the use of READ_ONCE() and smp_load_acquire() to read the
 opposition index.  This prevents the compiler from discarding and
-reloading its cached value - which some compilers will do across
-smp_read_barrier_depends().  This isn't strictly needed if you can
+reloading its cached value.  This isn't strictly needed if you can
 be sure that the opposition index will _only_ be used the once.
 The smp_load_acquire() additionally forces the CPU to order against
 subsequent memory references.  Similarly, smp_store_release() is used
diff --git a/Documentation/devicetree/bindings/arm/arm-dsu-pmu.txt b/Documentation/devicetree/bindings/arm/arm-dsu-pmu.txt
new file mode 100644
index 0000000..6efabba
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/arm-dsu-pmu.txt
@@ -0,0 +1,27 @@
+* ARM DynamIQ Shared Unit (DSU) Performance Monitor Unit (PMU)
+
+ARM DyanmIQ Shared Unit (DSU) integrates one or more CPU cores
+with a shared L3 memory system, control logic and external interfaces to
+form a multicore cluster. The PMU enables to gather various statistics on
+the operations of the DSU. The PMU provides independent 32bit counters that
+can count any of the supported events, along with a 64bit cycle counter.
+The PMU is accessed via CPU system registers and has no MMIO component.
+
+** DSU PMU required properties:
+
+- compatible	: should be one of :
+
+		"arm,dsu-pmu"
+
+- interrupts	: Exactly 1 SPI must be listed.
+
+- cpus		: List of phandles for the CPUs connected to this DSU instance.
+
+
+** Example:
+
+dsu-pmu-0 {
+	compatible = "arm,dsu-pmu";
+	interrupts = <GIC_SPI 02 IRQ_TYPE_LEVEL_HIGH>;
+	cpus = <&cpu_0>, <&cpu_1>;
+};
diff --git a/Documentation/devicetree/bindings/arm/firmware/sdei.txt b/Documentation/devicetree/bindings/arm/firmware/sdei.txt
new file mode 100644
index 0000000..ee3f0ff
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/firmware/sdei.txt
@@ -0,0 +1,42 @@
+* Software Delegated Exception Interface (SDEI)
+
+Firmware implementing the SDEI functions described in ARM document number
+ARM DEN 0054A ("Software Delegated Exception Interface") can be used by
+Linux to receive notification of events such as those generated by
+firmware-first error handling, or from an IRQ that has been promoted to
+a firmware-assisted NMI.
+
+The interface provides a number of API functions for registering callbacks
+and enabling/disabling events. Functions are invoked by trapping to the
+privilege level of the SDEI firmware (specified as part of the binding
+below) and passing arguments in a manner specified by the "SMC Calling
+Convention (ARM DEN 0028B):
+
+	 r0		=> 32-bit Function ID / return value
+	{r1 - r3}	=> Parameters
+
+Note that the immediate field of the trapping instruction must be set
+to #0.
+
+The SDEI_EVENT_REGISTER function registers a callback in the kernel
+text to handle the specified event number.
+
+The sdei node should be a child node of '/firmware' and have required
+properties:
+
+ - compatible    : should contain:
+	* "arm,sdei-1.0" : For implementations complying to SDEI version 1.x.
+
+ - method        : The method of calling the SDEI firmware. Permitted
+                   values are:
+	* "smc" : SMC #0, with the register assignments specified in this
+	          binding.
+	* "hvc" : HVC #0, with the register assignments specified in this
+	          binding.
+Example:
+	firmware {
+		sdei {
+			compatible	= "arm,sdei-1.0";
+			method		= "smc";
+		};
+	};
diff --git a/Documentation/devicetree/bindings/arm/marvell/armada-37xx.txt b/Documentation/devicetree/bindings/arm/marvell/armada-37xx.txt
index 51336e5..35c3c34 100644
--- a/Documentation/devicetree/bindings/arm/marvell/armada-37xx.txt
+++ b/Documentation/devicetree/bindings/arm/marvell/armada-37xx.txt
@@ -14,3 +14,22 @@
 Example:
 
 compatible = "marvell,armada-3720-db", "marvell,armada3720", "marvell,armada3710";
+
+
+Power management
+----------------
+
+For power management (particularly DVFS and AVS), the North Bridge
+Power Management component is needed:
+
+Required properties:
+- compatible     : should contain "marvell,armada-3700-nb-pm", "syscon";
+- reg            : the register start and length for the North Bridge
+		    Power Management
+
+Example:
+
+nb_pm: syscon@14000 {
+	compatible = "marvell,armada-3700-nb-pm", "syscon";
+	reg = <0x14000 0x60>;
+}
diff --git a/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt b/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
index 367c8203..3ac0298 100644
--- a/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
+++ b/Documentation/devicetree/bindings/hwmon/aspeed-pwm-tacho.txt
@@ -22,8 +22,9 @@
 - compatible : should be "aspeed,ast2400-pwm-tacho" for AST2400 and
 	       "aspeed,ast2500-pwm-tacho" for AST2500.
 
-- clocks : a fixed clock providing input clock frequency(PWM
-	   and Fan Tach clock)
+- clocks : phandle to clock provider with the clock number in the second cell
+
+- resets : phandle to reset controller with the reset number in the second cell
 
 fan subnode format:
 ===================
@@ -48,19 +49,14 @@
 
 Examples:
 
-pwm_tacho_fixed_clk: fixedclk {
-	compatible = "fixed-clock";
-	#clock-cells = <0>;
-	clock-frequency = <24000000>;
-};
-
 pwm_tacho: pwmtachocontroller@1e786000 {
 	#address-cells = <1>;
 	#size-cells = <1>;
 	#cooling-cells = <2>;
 	reg = <0x1E786000 0x1000>;
 	compatible = "aspeed,ast2500-pwm-tacho";
-	clocks = <&pwm_tacho_fixed_clk>;
+	clocks = <&syscon ASPEED_CLK_APB>;
+	resets = <&syscon ASPEED_RESET_PWM>;
 	pinctrl-names = "default";
 	pinctrl-0 = <&pinctrl_pwm0_default &pinctrl_pwm1_default>;
 
diff --git a/Documentation/devicetree/bindings/iio/adc/sigma-delta-modulator.txt b/Documentation/devicetree/bindings/iio/adc/sigma-delta-modulator.txt
new file mode 100644
index 0000000..e9ebb8a
--- /dev/null
+++ b/Documentation/devicetree/bindings/iio/adc/sigma-delta-modulator.txt
@@ -0,0 +1,13 @@
+Device-Tree bindings for sigma delta modulator
+
+Required properties:
+- compatible: should be "ads1201", "sd-modulator". "sd-modulator" can be use
+	as a generic SD modulator if modulator not specified in compatible list.
+- #io-channel-cells = <1>: See the IIO bindings section "IIO consumers".
+
+Example node:
+
+	ads1202: adc@0 {
+		compatible = "sd-modulator";
+		#io-channel-cells = <1>;
+	};
diff --git a/Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.txt b/Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.txt
new file mode 100644
index 0000000..911492da
--- /dev/null
+++ b/Documentation/devicetree/bindings/iio/adc/st,stm32-dfsdm-adc.txt
@@ -0,0 +1,128 @@
+STMicroelectronics STM32 DFSDM ADC device driver
+
+
+STM32 DFSDM ADC is a sigma delta analog-to-digital converter dedicated to
+interface external sigma delta modulators to STM32 micro controllers.
+It is mainly targeted for:
+- Sigma delta modulators (motor control, metering...)
+- PDM microphones (audio digital microphone)
+
+It features up to 8 serial digital interfaces (SPI or Manchester) and
+up to 4 filters on stm32h7.
+
+Each child node match with a filter instance.
+
+Contents of a STM32 DFSDM root node:
+------------------------------------
+Required properties:
+- compatible: Should be "st,stm32h7-dfsdm".
+- reg: Offset and length of the DFSDM block register set.
+- clocks: IP and serial interfaces clocking. Should be set according
+		to rcc clock ID and "clock-names".
+- clock-names: Input clock name "dfsdm" must be defined,
+		"audio" is optional. If defined CLKOUT is based on the audio
+		clock, else "dfsdm" is used.
+- #interrupt-cells = <1>;
+- #address-cells = <1>;
+- #size-cells = <0>;
+
+Optional properties:
+- spi-max-frequency: Requested only for SPI master mode.
+		  SPI clock OUT frequency (Hz). This clock must be set according
+		  to "clock" property. Frequency must be a multiple of the rcc
+		  clock frequency. If not, SPI CLKOUT frequency will not be
+		  accurate.
+
+Contents of a STM32 DFSDM child nodes:
+--------------------------------------
+
+Required properties:
+- compatible: Must be:
+	"st,stm32-dfsdm-adc" for sigma delta ADCs
+	"st,stm32-dfsdm-dmic" for audio digital microphone.
+- reg: Specifies the DFSDM filter instance used.
+- interrupts: IRQ lines connected to each DFSDM filter instance.
+- st,adc-channels:	List of single-ended channels muxed for this ADC.
+			valid values:
+				"st,stm32h7-dfsdm" compatibility: 0 to 7.
+- st,adc-channel-names:	List of single-ended channel names.
+- st,filter-order:  SinC filter order from 0 to 5.
+			0: FastSinC
+			[1-5]: order 1 to 5.
+			For audio purpose it is recommended to use order 3 to 5.
+- #io-channel-cells = <1>: See the IIO bindings section "IIO consumers".
+
+Required properties for "st,stm32-dfsdm-adc" compatibility:
+- io-channels: From common IIO binding. Used to pipe external sigma delta
+		modulator or internal ADC output to DFSDM channel.
+		This is not required for "st,stm32-dfsdm-pdm" compatibility as
+		PDM microphone is binded in Audio DT node.
+
+Required properties for "st,stm32-dfsdm-pdm" compatibility:
+- #sound-dai-cells: Must be set to 0.
+- dma: DMA controller phandle and DMA request line associated to the
+		filter instance (specified by the field "reg")
+- dma-names: Must be "rx"
+
+Optional properties:
+- st,adc-channel-types:	Single-ended channel input type.
+			- "SPI_R": SPI with data on rising edge (default)
+			- "SPI_F": SPI with data on falling edge
+			- "MANCH_R": manchester codec, rising edge = logic 0
+			- "MANCH_F": manchester codec, falling edge = logic 1
+- st,adc-channel-clk-src: Conversion clock source.
+			  - "CLKIN": external SPI clock (CLKIN x)
+			  - "CLKOUT": internal SPI clock (CLKOUT) (default)
+			  - "CLKOUT_F": internal SPI clock divided by 2 (falling edge).
+			  - "CLKOUT_R": internal SPI clock divided by 2 (rising edge).
+
+- st,adc-alt-channel: Must be defined if two sigma delta modulator are
+			  connected on same SPI input.
+			  If not set, channel n is connected to SPI input n.
+			  If set, channel n is connected to SPI input n + 1.
+
+- st,filter0-sync: Set to 1 to synchronize with DFSDM filter instance 0.
+		   Used for multi microphones synchronization.
+
+Example of a sigma delta adc connected on DFSDM SPI port 0
+and a pdm microphone connected on DFSDM SPI port 1:
+
+	ads1202: simple_sd_adc@0 {
+		compatible = "ads1202";
+		#io-channel-cells = <1>;
+	};
+
+	dfsdm: dfsdm@40017000 {
+		compatible = "st,stm32h7-dfsdm";
+		reg = <0x40017000 0x400>;
+		clocks = <&rcc DFSDM1_CK>;
+		clock-names = "dfsdm";
+		#interrupt-cells = <1>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		dfsdm_adc0: filter@0 {
+			compatible = "st,stm32-dfsdm-adc";
+			#io-channel-cells = <1>;
+			reg = <0>;
+			interrupts = <110>;
+			st,adc-channels = <0>;
+			st,adc-channel-names = "sd_adc0";
+			st,adc-channel-types = "SPI_F";
+			st,adc-channel-clk-src = "CLKOUT";
+			io-channels = <&ads1202 0>;
+			st,filter-order = <3>;
+		};
+		dfsdm_pdm1: filter@1 {
+			compatible = "st,stm32-dfsdm-dmic";
+			reg = <1>;
+			interrupts = <111>;
+			dmas = <&dmamux1 102 0x400 0x00>;
+			dma-names = "rx";
+			st,adc-channels = <1>;
+			st,adc-channel-names = "dmic1";
+			st,adc-channel-types = "SPI_R";
+			st,adc-channel-clk-src = "CLKOUT";
+			st,filter-order = <5>;
+		};
+	}
diff --git a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2836-l1-intc.txt b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2836-l1-intc.txt
index f320dcd..8ced169 100644
--- a/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2836-l1-intc.txt
+++ b/Documentation/devicetree/bindings/interrupt-controller/brcm,bcm2836-l1-intc.txt
@@ -12,7 +12,7 @@
 			  registers
 - interrupt-controller:	Identifies the node as an interrupt controller
 - #interrupt-cells:	Specifies the number of cells needed to encode an
-			  interrupt source. The value shall be 1
+			  interrupt source. The value shall be 2
 
 Please refer to interrupts.txt in this directory for details of the common
 Interrupt Controllers bindings used by client devices.
@@ -32,6 +32,6 @@
 	compatible = "brcm,bcm2836-l1-intc";
 	reg = <0x40000000 0x100>;
 	interrupt-controller;
-	#interrupt-cells = <1>;
+	#interrupt-cells = <2>;
 	interrupt-parent = <&local_intc>;
 };
diff --git a/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt b/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt
new file mode 100644
index 0000000..35f7527
--- /dev/null
+++ b/Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt
@@ -0,0 +1,30 @@
+Android Goldfish PIC
+
+Android Goldfish programmable interrupt device used by Android
+emulator.
+
+Required properties:
+
+- compatible : should contain "google,goldfish-pic"
+- reg        : <registers mapping>
+- interrupts : <interrupt mapping>
+
+Example for mips when used in cascade mode:
+
+        cpuintc {
+                #interrupt-cells = <0x1>;
+                #address-cells = <0>;
+                interrupt-controller;
+                compatible = "mti,cpu-interrupt-controller";
+        };
+
+        interrupt-controller@1f000000 {
+                compatible = "google,goldfish-pic";
+                reg = <0x1f000000 0x1000>;
+
+                interrupt-controller;
+                #interrupt-cells = <0x1>;
+
+                interrupt-parent = <&cpuintc>;
+                interrupts = <0x2>;
+        };
diff --git a/Documentation/devicetree/bindings/mfd/mc13xxx.txt b/Documentation/devicetree/bindings/mfd/mc13xxx.txt
index ac235fe..8261ea7 100644
--- a/Documentation/devicetree/bindings/mfd/mc13xxx.txt
+++ b/Documentation/devicetree/bindings/mfd/mc13xxx.txt
@@ -130,7 +130,7 @@
 			#size-cells = <0>;
 			led-control = <0x000 0x000 0x0e0 0x000>;
 
-			sysled {
+			sysled@3 {
 				reg = <3>;
 				label = "system:red:live";
 				linux,default-trigger = "heartbeat";
diff --git a/Documentation/devicetree/bindings/mfd/syscon.txt b/Documentation/devicetree/bindings/mfd/syscon.txt
index 8b92d45..25d9e9c 100644
--- a/Documentation/devicetree/bindings/mfd/syscon.txt
+++ b/Documentation/devicetree/bindings/mfd/syscon.txt
@@ -16,9 +16,17 @@
 Optional property:
 - reg-io-width: the size (in bytes) of the IO accesses that should be
   performed on the device.
+- hwlocks: reference to a phandle of a hardware spinlock provider node.
 
 Examples:
 gpr: iomuxc-gpr@20e0000 {
 	compatible = "fsl,imx6q-iomuxc-gpr", "syscon";
 	reg = <0x020e0000 0x38>;
+	hwlocks = <&hwlock1 1>;
+};
+
+hwlock1: hwspinlock@40500000 {
+	...
+	reg = <0x40500000 0x1000>;
+	#hwlock-cells = <1>;
 };
diff --git a/Documentation/devicetree/bindings/mmc/mtk-sd.txt b/Documentation/devicetree/bindings/mmc/mtk-sd.txt
index 72d2a73..9b80176 100644
--- a/Documentation/devicetree/bindings/mmc/mtk-sd.txt
+++ b/Documentation/devicetree/bindings/mmc/mtk-sd.txt
@@ -12,6 +12,8 @@
 	"mediatek,mt8173-mmc": for mmc host ip compatible with mt8173
 	"mediatek,mt2701-mmc": for mmc host ip compatible with mt2701
 	"mediatek,mt2712-mmc": for mmc host ip compatible with mt2712
+	"mediatek,mt7623-mmc", "mediatek,mt2701-mmc": for MT7623 SoC
+
 - reg: physical base address of the controller and length
 - interrupts: Should contain MSDC interrupt number
 - clocks: Should contain phandle for the clock feeding the MMC controller
diff --git a/Documentation/devicetree/bindings/mmc/tmio_mmc.txt b/Documentation/devicetree/bindings/mmc/tmio_mmc.txt
index 3c67624..d8685cb 100644
--- a/Documentation/devicetree/bindings/mmc/tmio_mmc.txt
+++ b/Documentation/devicetree/bindings/mmc/tmio_mmc.txt
@@ -26,6 +26,7 @@
 		"renesas,sdhi-r8a7794" - SDHI IP on R8A7794 SoC
 		"renesas,sdhi-r8a7795" - SDHI IP on R8A7795 SoC
 		"renesas,sdhi-r8a7796" - SDHI IP on R8A7796 SoC
+		"renesas,sdhi-r8a77995" - SDHI IP on R8A77995 SoC
 		"renesas,sdhi-shmobile" - a generic sh-mobile SDHI controller
 		"renesas,rcar-gen1-sdhi" - a generic R-Car Gen1 SDHI controller
 		"renesas,rcar-gen2-sdhi" - a generic R-Car Gen2 or RZ/G1
diff --git a/Documentation/devicetree/bindings/mtd/fsl-quadspi.txt b/Documentation/devicetree/bindings/mtd/fsl-quadspi.txt
index c34aa6f..63d4d626 100644
--- a/Documentation/devicetree/bindings/mtd/fsl-quadspi.txt
+++ b/Documentation/devicetree/bindings/mtd/fsl-quadspi.txt
@@ -12,7 +12,7 @@
   - reg-names: Should contain the reg names "QuadSPI" and "QuadSPI-memory"
   - interrupts : Should contain the interrupt for the device
   - clocks : The clocks needed by the QuadSPI controller
-  - clock-names : the name of the clocks
+  - clock-names : Should contain the name of the clocks: "qspi_en" and "qspi".
 
 Optional properties:
   - fsl,qspi-has-second-chip: The controller has two buses, bus A and bus B.
diff --git a/Documentation/devicetree/bindings/mtd/gpmc-onenand.txt b/Documentation/devicetree/bindings/mtd/gpmc-onenand.txt
index b6e8bfd..e9f01a9 100644
--- a/Documentation/devicetree/bindings/mtd/gpmc-onenand.txt
+++ b/Documentation/devicetree/bindings/mtd/gpmc-onenand.txt
@@ -9,13 +9,14 @@
 
 Required properties:
 
+ - compatible:		"ti,omap2-onenand"
  - reg:			The CS line the peripheral is connected to
- - gpmc,device-width	Width of the ONENAND device connected to the GPMC
+ - gpmc,device-width:	Width of the ONENAND device connected to the GPMC
 			in bytes. Must be 1 or 2.
 
 Optional properties:
 
- - dma-channel:		DMA Channel index
+ - int-gpios:		GPIO specifier for the INT pin.
 
 For inline partition table parsing (optional):
 
@@ -35,6 +36,7 @@
 		#size-cells = <1>;
 
 		onenand@0 {
+			compatible = "ti,omap2-onenand";
 			reg = <0 0 0>; /* CS0, offset 0 */
 			gpmc,device-width = <2>;
 
diff --git a/Documentation/devicetree/bindings/mtd/marvell-nand.txt b/Documentation/devicetree/bindings/mtd/marvell-nand.txt
new file mode 100644
index 0000000..c08fb47
--- /dev/null
+++ b/Documentation/devicetree/bindings/mtd/marvell-nand.txt
@@ -0,0 +1,123 @@
+Marvell NAND Flash Controller (NFC)
+
+Required properties:
+- compatible: can be one of the following:
+    * "marvell,armada-8k-nand-controller"
+    * "marvell,armada370-nand-controller"
+    * "marvell,pxa3xx-nand-controller"
+    * "marvell,armada-8k-nand" (deprecated)
+    * "marvell,armada370-nand" (deprecated)
+    * "marvell,pxa3xx-nand" (deprecated)
+  Compatibles marked deprecated support only the old bindings described
+  at the bottom.
+- reg: NAND flash controller memory area.
+- #address-cells: shall be set to 1. Encode the NAND CS.
+- #size-cells: shall be set to 0.
+- interrupts: shall define the NAND controller interrupt.
+- clocks: shall reference the NAND controller clock.
+- marvell,system-controller: Set to retrieve the syscon node that handles
+  NAND controller related registers (only required with the
+  "marvell,armada-8k-nand[-controller]" compatibles).
+
+Optional properties:
+- label: see partition.txt. New platforms shall omit this property.
+- dmas: shall reference DMA channel associated to the NAND controller.
+  This property is only used with "marvell,pxa3xx-nand[-controller]"
+  compatible strings.
+- dma-names: shall be "rxtx".
+  This property is only used with "marvell,pxa3xx-nand[-controller]"
+  compatible strings.
+
+Optional children nodes:
+Children nodes represent the available NAND chips.
+
+Required properties:
+- reg: shall contain the native Chip Select ids (0-3).
+- nand-rb: see nand.txt (0-1).
+
+Optional properties:
+- marvell,nand-keep-config: orders the driver not to take the timings
+  from the core and leaving them completely untouched. Bootloader
+  timings will then be used.
+- label: MTD name.
+- nand-on-flash-bbt: see nand.txt.
+- nand-ecc-mode: see nand.txt. Will use hardware ECC if not specified.
+- nand-ecc-algo: see nand.txt. This property is essentially useful when
+  not using hardware ECC. Howerver, it may be added when using hardware
+  ECC for clarification but will be ignored by the driver because ECC
+  mode is chosen depending on the page size and the strength required by
+  the NAND chip. This value may be overwritten with nand-ecc-strength
+  property.
+- nand-ecc-strength: see nand.txt.
+- nand-ecc-step-size: see nand.txt. Marvell's NAND flash controller does
+  use fixed strength (1-bit for Hamming, 16-bit for BCH), so the actual
+  step size will shrink or grow in order to fit the required strength.
+  Step sizes are not completely random for all and follow certain
+  patterns described in AN-379, "Marvell SoC NFC ECC".
+
+See Documentation/devicetree/bindings/mtd/nand.txt for more details on
+generic bindings.
+
+
+Example:
+nand_controller: nand-controller@d0000 {
+	compatible = "marvell,armada370-nand-controller";
+	reg = <0xd0000 0x54>;
+	#address-cells = <1>;
+	#size-cells = <0>;
+	interrupts = <GIC_SPI 84 IRQ_TYPE_LEVEL_HIGH>;
+	clocks = <&coredivclk 0>;
+
+	nand@0 {
+		reg = <0>;
+		label = "main-storage";
+		nand-rb = <0>;
+		nand-ecc-mode = "hw";
+		marvell,nand-keep-config;
+		nand-on-flash-bbt;
+		nand-ecc-strength = <4>;
+		nand-ecc-step-size = <512>;
+
+		partitions {
+			compatible = "fixed-partitions";
+			#address-cells = <1>;
+			#size-cells = <1>;
+
+			partition@0 {
+				label = "Rootfs";
+				reg = <0x00000000 0x40000000>;
+			};
+		};
+	};
+};
+
+
+Note on legacy bindings: One can find, in not-updated device trees,
+bindings slightly different than described above with other properties
+described below as well as the partitions node at the root of a so
+called "nand" node (without clear controller/chip separation).
+
+Legacy properties:
+- marvell,nand-enable-arbiter: To enable the arbiter, all boards blindly
+  used it, this bit was set by the bootloader for many boards and even if
+  it is marked reserved in several datasheets, it might be needed to set
+  it (otherwise it is harmless) so whether or not this property is set,
+  the bit is selected by the driver.
+- num-cs: Number of chip-select lines to use, all boards blindly set 1
+  to this and for a reason, other values would have failed. The value of
+  this property is ignored.
+
+Example:
+
+	nand0: nand@43100000 {
+		compatible = "marvell,pxa3xx-nand";
+		reg = <0x43100000 90>;
+		interrupts = <45>;
+		dmas = <&pdma 97 0>;
+		dma-names = "rxtx";
+		#address-cells = <1>;
+		marvell,nand-keep-config;
+		marvell,nand-enable-arbiter;
+		num-cs = <1>;
+		/* Partitions (optional) */
+       };
diff --git a/Documentation/devicetree/bindings/mtd/mtk-nand.txt b/Documentation/devicetree/bindings/mtd/mtk-nand.txt
index 0431841..1c88526 100644
--- a/Documentation/devicetree/bindings/mtd/mtk-nand.txt
+++ b/Documentation/devicetree/bindings/mtd/mtk-nand.txt
@@ -12,8 +12,10 @@
 
 The first part of NFC is NAND Controller Interface (NFI) HW.
 Required NFI properties:
-- compatible:			Should be one of "mediatek,mt2701-nfc",
-				"mediatek,mt2712-nfc".
+- compatible:			Should be one of
+				"mediatek,mt2701-nfc",
+				"mediatek,mt2712-nfc",
+				"mediatek,mt7622-nfc".
 - reg:				Base physical address and size of NFI.
 - interrupts:			Interrupts of NFI.
 - clocks:			NFI required clocks.
@@ -142,7 +144,10 @@
 ==============
 
 Required BCH properties:
-- compatible:	Should be one of "mediatek,mt2701-ecc", "mediatek,mt2712-ecc".
+- compatible:	Should be one of
+		"mediatek,mt2701-ecc",
+		"mediatek,mt2712-ecc",
+		"mediatek,mt7622-ecc".
 - reg:		Base physical address and size of ECC.
 - interrupts:	Interrupts of ECC.
 - clocks:	ECC required clocks.
diff --git a/Documentation/devicetree/bindings/mtd/nand.txt b/Documentation/devicetree/bindings/mtd/nand.txt
index 133f381..8bb11d8 100644
--- a/Documentation/devicetree/bindings/mtd/nand.txt
+++ b/Documentation/devicetree/bindings/mtd/nand.txt
@@ -43,6 +43,7 @@
 		     This is particularly useful when only the in-band area is
 		     used by the upper layers, and you want to make your NAND
 		     as reliable as possible.
+- nand-rb: shall contain the native Ready/Busy ids.
 
 The ECC strength and ECC step size properties define the correction capability
 of a controller. Together, they say a controller can correct "{strength} bit
diff --git a/Documentation/devicetree/bindings/opp/opp.txt b/Documentation/devicetree/bindings/opp/opp.txt
index 9d733af..4e4f302 100644
--- a/Documentation/devicetree/bindings/opp/opp.txt
+++ b/Documentation/devicetree/bindings/opp/opp.txt
@@ -45,6 +45,11 @@
 phandle to a OPP table in their DT node. The OPP core will use this phandle to
 find the operating points for the device.
 
+This can contain more than one phandle for power domain providers that provide
+multiple power domains. That is, one phandle for each power domain. If only one
+phandle is available, then the same OPP table will be used for all power domains
+provided by the power domain provider.
+
 If required, this can be extended for SoC vendor specific bindings. Such bindings
 should be documented as Documentation/devicetree/bindings/power/<vendor>-opp.txt
 and should have a compatible description like: "operating-points-v2-<vendor>".
@@ -154,6 +159,14 @@
 
 - status: Marks the node enabled/disabled.
 
+- required-opp: This contains phandle to an OPP node in another device's OPP
+  table. It may contain an array of phandles, where each phandle points to an
+  OPP of a different device. It should not contain multiple phandles to the OPP
+  nodes in the same OPP table. This specifies the minimum required OPP of the
+  device(s), whose OPP's phandle is present in this property, for the
+  functioning of the current device at the current OPP (where this property is
+  present).
+
 Example 1: Single cluster Dual-core ARM cortex A9, switch DVFS states together.
 
 / {
diff --git a/Documentation/devicetree/bindings/opp/ti-omap5-opp-supply.txt b/Documentation/devicetree/bindings/opp/ti-omap5-opp-supply.txt
new file mode 100644
index 0000000..832346e
--- /dev/null
+++ b/Documentation/devicetree/bindings/opp/ti-omap5-opp-supply.txt
@@ -0,0 +1,63 @@
+Texas Instruments OMAP compatible OPP supply description
+
+OMAP5, DRA7, and AM57 family of SoCs have Class0 AVS eFuse registers which
+contain data that can be used to adjust voltages programmed for some of their
+supplies for more efficient operation. This binding provides the information
+needed to read these values and use them to program the main regulator during
+an OPP transitions.
+
+Also, some supplies may have an associated vbb-supply which is an Adaptive Body
+Bias regulator which much be transitioned in a specific sequence with regards
+to the vdd-supply and clk when making an OPP transition. By supplying two
+regulators to the device that will undergo OPP transitions we can make use
+of the multi regulator binding that is part of the OPP core described here [1]
+to describe both regulators needed by the platform.
+
+[1] Documentation/devicetree/bindings/opp/opp.txt
+
+Required Properties for Device Node:
+- vdd-supply: phandle to regulator controlling VDD supply
+- vbb-supply: phandle to regulator controlling Body Bias supply
+	      (Usually Adaptive Body Bias regulator)
+
+Required Properties for opp-supply node:
+- compatible: Should be one of:
+	"ti,omap-opp-supply" - basic OPP supply controlling VDD and VBB
+	"ti,omap5-opp-supply" - OMAP5+ optimized voltages in efuse(class0)VDD
+			    along with VBB
+	"ti,omap5-core-opp-supply" - OMAP5+ optimized voltages in efuse(class0) VDD
+			    but no VBB.
+- reg: Address and length of the efuse register set for the device (mandatory
+	only for "ti,omap5-opp-supply")
+- ti,efuse-settings: An array of u32 tuple items providing information about
+	optimized efuse configuration. Each item consists of the following:
+	volt: voltage in uV - reference voltage (OPP voltage)
+	efuse_offseet: efuse offset from reg where the optimized voltage is stored.
+- ti,absolute-max-voltage-uv: absolute maximum voltage for the OPP supply.
+
+Example:
+
+/* Device Node (CPU)  */
+cpus {
+	cpu0: cpu@0 {
+		device_type = "cpu";
+
+		...
+
+		vdd-supply = <&vcc>;
+		vbb-supply = <&abb_mpu>;
+	};
+};
+
+/* OMAP OPP Supply with Class0 registers */
+opp_supply_mpu: opp_supply@4a003b20 {
+	compatible = "ti,omap5-opp-supply";
+	reg = <0x4a003b20 0x8>;
+	ti,efuse-settings = <
+	/* uV   offset */
+	1060000 0x0
+	1160000 0x4
+	1210000 0x8
+	>;
+	ti,absolute-max-voltage-uv = <1500000>;
+};
diff --git a/Documentation/devicetree/bindings/power/power_domain.txt b/Documentation/devicetree/bindings/power/power_domain.txt
index 14bd9e9..f335531 100644
--- a/Documentation/devicetree/bindings/power/power_domain.txt
+++ b/Documentation/devicetree/bindings/power/power_domain.txt
@@ -40,6 +40,12 @@
   domain's idle states. In the absence of this property, the domain would be
   considered as capable of being powered-on or powered-off.
 
+- operating-points-v2 : Phandles to the OPP tables of power domains provided by
+  a power domain provider. If the provider provides a single power domain only
+  or all the power domains provided by the provider have identical OPP tables,
+  then this shall contain a single phandle. Refer to ../opp/opp.txt for more
+  information.
+
 Example:
 
 	power: power-controller@12340000 {
@@ -120,4 +126,63 @@
 inside a PM domain with index 0 of a power controller represented by a node
 with the label "power".
 
+Optional properties:
+- required-opp: This contains phandle to an OPP node in another device's OPP
+  table. It may contain an array of phandles, where each phandle points to an
+  OPP of a different device. It should not contain multiple phandles to the OPP
+  nodes in the same OPP table. This specifies the minimum required OPP of the
+  device(s), whose OPP's phandle is present in this property, for the
+  functioning of the current device at the current OPP (where this property is
+  present).
+
+Example:
+- OPP table for domain provider that provides two domains.
+
+	domain0_opp_table: opp-table0 {
+		compatible = "operating-points-v2";
+
+		domain0_opp_0: opp-1000000000 {
+			opp-hz = /bits/ 64 <1000000000>;
+			opp-microvolt = <975000 970000 985000>;
+		};
+		domain0_opp_1: opp-1100000000 {
+			opp-hz = /bits/ 64 <1100000000>;
+			opp-microvolt = <1000000 980000 1010000>;
+		};
+	};
+
+	domain1_opp_table: opp-table1 {
+		compatible = "operating-points-v2";
+
+		domain1_opp_0: opp-1200000000 {
+			opp-hz = /bits/ 64 <1200000000>;
+			opp-microvolt = <975000 970000 985000>;
+		};
+		domain1_opp_1: opp-1300000000 {
+			opp-hz = /bits/ 64 <1300000000>;
+			opp-microvolt = <1000000 980000 1010000>;
+		};
+	};
+
+	power: power-controller@12340000 {
+		compatible = "foo,power-controller";
+		reg = <0x12340000 0x1000>;
+		#power-domain-cells = <1>;
+		operating-points-v2 = <&domain0_opp_table>, <&domain1_opp_table>;
+	};
+
+	leaky-device0@12350000 {
+		compatible = "foo,i-leak-current";
+		reg = <0x12350000 0x1000>;
+		power-domains = <&power 0>;
+		required-opp = <&domain0_opp_0>;
+	};
+
+	leaky-device1@12350000 {
+		compatible = "foo,i-leak-current";
+		reg = <0x12350000 0x1000>;
+		power-domains = <&power 1>;
+		required-opp = <&domain1_opp_1>;
+	};
+
 [1]. Documentation/devicetree/bindings/power/domain-idle-state.txt
diff --git a/Documentation/devicetree/bindings/regulator/regulator.txt b/Documentation/devicetree/bindings/regulator/regulator.txt
index 3cbf56ce..2babe15b 100644
--- a/Documentation/devicetree/bindings/regulator/regulator.txt
+++ b/Documentation/devicetree/bindings/regulator/regulator.txt
@@ -42,8 +42,16 @@
 - regulator-state-[mem/disk] node has following common properties:
 	- regulator-on-in-suspend: regulator should be on in suspend state.
 	- regulator-off-in-suspend: regulator should be off in suspend state.
-	- regulator-suspend-microvolt: regulator should be set to this voltage
-	  in suspend.
+	- regulator-suspend-min-microvolt: minimum voltage may be set in
+	  suspend state.
+	- regulator-suspend-max-microvolt: maximum voltage may be set in
+	  suspend state.
+	- regulator-suspend-microvolt: the default voltage which regulator
+	  would be set in suspend. This property is now deprecated, instead
+	  setting voltage for suspend mode via the API which regulator
+	  driver provides is recommended.
+	- regulator-changeable-in-suspend: whether the default voltage and
+	  the regulator on/off in suspend can be changed in runtime.
 	- regulator-mode: operating mode in the given suspend state.
 	  The set of possible operating modes depends on the capabilities of
 	  every hardware so the valid modes are documented on each regulator
diff --git a/Documentation/devicetree/bindings/regulator/sprd,sc2731-regulator.txt b/Documentation/devicetree/bindings/regulator/sprd,sc2731-regulator.txt
new file mode 100644
index 0000000..63dc078
--- /dev/null
+++ b/Documentation/devicetree/bindings/regulator/sprd,sc2731-regulator.txt
@@ -0,0 +1,43 @@
+Spreadtrum SC2731 Voltage regulators
+
+The SC2731 integrates low-voltage and low quiescent current DCDC/LDO.
+14 LDO and 3 DCDCs are designed for external use. All DCDCs/LDOs have
+their own bypass (power-down) control signals. External tantalum or MLCC
+ceramic capacitors are recommended to use with these LDOs.
+
+Required properties:
+ - compatible: should be "sprd,sc27xx-regulator".
+
+List of regulators provided by this controller. It is named according to
+its regulator type, BUCK_<name> and LDO_<name>. The definition for each
+of these nodes is defined using the standard binding for regulators at
+Documentation/devicetree/bindings/regulator/regulator.txt.
+
+The valid names for regulators are:
+BUCK:
+	BUCK_CPU0, BUCK_CPU1, BUCK_RF
+LDO:
+	LDO_CAMA0, LDO_CAMA1, LDO_CAMMOT, LDO_VLDO, LDO_EMMCCORE, LDO_SDCORE,
+	LDO_SDIO, LDO_WIFIPA, LDO_USB33, LDO_CAMD0, LDO_CAMD1, LDO_CON,
+	LDO_CAMIO, LDO_SRAM
+
+Example:
+	regulators {
+		compatible = "sprd,sc27xx-regulator";
+
+		vddarm0: BUCK_CPU0 {
+			regulator-name = "vddarm0";
+			regulator-min-microvolt = <400000>;
+			regulator-max-microvolt = <1996875>;
+			regulator-ramp-delay = <25000>;
+			regulator-always-on;
+		};
+
+		vddcama0: LDO_CAMA0 {
+			regulator-name = "vddcama0";
+			regulator-min-microvolt = <1200000>;
+			regulator-max-microvolt = <3750000>;
+			regulator-enable-ramp-delay = <100>;
+		};
+		...
+	};
diff --git a/Documentation/devicetree/bindings/sound/dmic.txt b/Documentation/devicetree/bindings/sound/dmic.txt
index 54c8ef6..f7bf656 100644
--- a/Documentation/devicetree/bindings/sound/dmic.txt
+++ b/Documentation/devicetree/bindings/sound/dmic.txt
@@ -7,10 +7,12 @@
 
 Optional properties:
 	- dmicen-gpios: GPIO specifier for dmic to control start and stop
+	- num-channels: Number of microphones on this DAI
 
 Example node:
 
 	dmic_codec: dmic@0 {
 		compatible = "dmic-codec";
 		dmicen-gpios = <&gpio4 3 GPIO_ACTIVE_HIGH>;
+		num-channels = <1>;
 	};
diff --git a/Documentation/devicetree/bindings/sound/max98373.txt b/Documentation/devicetree/bindings/sound/max98373.txt
new file mode 100644
index 0000000..456cb1c
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/max98373.txt
@@ -0,0 +1,40 @@
+Maxim Integrated MAX98373 Speaker Amplifier
+
+This device supports I2C.
+
+Required properties:
+
+ - compatible : "maxim,max98373"
+
+ - reg : the I2C address of the device.
+
+Optional properties:
+
+  - maxim,vmon-slot-no : slot number used to send voltage information
+                   or in inteleave mode this will be used as
+                   interleave slot.
+                   slot range : 0 ~ 15,  Default : 0
+
+  - maxim,imon-slot-no : slot number used to send current information
+                   slot range : 0 ~ 15,  Default : 0
+
+  - maxim,spkfb-slot-no : slot number used to send speaker feedback information
+                   slot range : 0 ~ 15,  Default : 0
+
+  - maxim,interleave-mode : For cases where a single combined channel
+		   for the I/V sense data is not sufficient, the device can also be configured
+		   to share a single data output channel on alternating frames.
+		   In this configuration, the current and voltage data will be frame interleaved
+		   on a single output channel.
+                   Boolean, define to enable the interleave mode, Default : false
+
+Example:
+
+codec: max98373@31 {
+   compatible = "maxim,max98373";
+   reg = <0x31>;
+   maxim,vmon-slot-no = <0>;
+   maxim,imon-slot-no = <1>;
+   maxim,spkfb-slot-no = <2>;
+   maxim,interleave-mode;
+};
diff --git a/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt b/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt
index 77a57f8..6df87b9 100644
--- a/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt
+++ b/Documentation/devicetree/bindings/sound/mt2701-afe-pcm.txt
@@ -2,153 +2,143 @@
 
 Required properties:
 - compatible = "mediatek,mt2701-audio";
-- reg: register location and size
 - interrupts: should contain AFE and ASYS interrupts
 - interrupt-names: should be "afe" and "asys"
 - power-domains: should define the power domain
+- clocks: Must contain an entry for each entry in clock-names
+  See ../clocks/clock-bindings.txt for details
 - clock-names: should have these clock names:
 		"infra_sys_audio_clk",
 		"top_audio_mux1_sel",
 		"top_audio_mux2_sel",
-		"top_audio_mux1_div",
-		"top_audio_mux2_div",
-		"top_audio_48k_timing",
-		"top_audio_44k_timing",
-		"top_audpll_mux_sel",
-		"top_apll_sel",
-		"top_aud1_pll_98M",
-		"top_aud2_pll_90M",
-		"top_hadds2_pll_98M",
-		"top_hadds2_pll_294M",
-		"top_audpll",
-		"top_audpll_d4",
-		"top_audpll_d8",
-		"top_audpll_d16",
-		"top_audpll_d24",
-		"top_audintbus_sel",
-		"clk_26m",
-		"top_syspll1_d4",
-		"top_aud_k1_src_sel",
-		"top_aud_k2_src_sel",
-		"top_aud_k3_src_sel",
-		"top_aud_k4_src_sel",
-		"top_aud_k5_src_sel",
-		"top_aud_k6_src_sel",
-		"top_aud_k1_src_div",
-		"top_aud_k2_src_div",
-		"top_aud_k3_src_div",
-		"top_aud_k4_src_div",
-		"top_aud_k5_src_div",
-		"top_aud_k6_src_div",
-		"top_aud_i2s1_mclk",
-		"top_aud_i2s2_mclk",
-		"top_aud_i2s3_mclk",
-		"top_aud_i2s4_mclk",
-		"top_aud_i2s5_mclk",
-		"top_aud_i2s6_mclk",
-		"top_asm_m_sel",
-		"top_asm_h_sel",
-		"top_univpll2_d4",
-		"top_univpll2_d2",
-		"top_syspll_d5";
+		"top_audio_a1sys_hp",
+		"top_audio_a2sys_hp",
+		"i2s0_src_sel",
+		"i2s1_src_sel",
+		"i2s2_src_sel",
+		"i2s3_src_sel",
+		"i2s0_src_div",
+		"i2s1_src_div",
+		"i2s2_src_div",
+		"i2s3_src_div",
+		"i2s0_mclk_en",
+		"i2s1_mclk_en",
+		"i2s2_mclk_en",
+		"i2s3_mclk_en",
+		"i2so0_hop_ck",
+		"i2so1_hop_ck",
+		"i2so2_hop_ck",
+		"i2so3_hop_ck",
+		"i2si0_hop_ck",
+		"i2si1_hop_ck",
+		"i2si2_hop_ck",
+		"i2si3_hop_ck",
+		"asrc0_out_ck",
+		"asrc1_out_ck",
+		"asrc2_out_ck",
+		"asrc3_out_ck",
+		"audio_afe_pd",
+		"audio_afe_conn_pd",
+		"audio_a1sys_pd",
+		"audio_a2sys_pd",
+		"audio_mrgif_pd";
+- assigned-clocks: list of input clocks and dividers for the audio system.
+		   See ../clocks/clock-bindings.txt for details.
+- assigned-clocks-parents: parent of input clocks of assigned clocks.
+- assigned-clock-rates: list of clock frequencies of assigned clocks.
+
+Must be a subnode of MediaTek audsys device tree node.
+See ../arm/mediatek/mediatek,audsys.txt for details about the parent node.
 
 Example:
 
-	afe: mt2701-afe-pcm@11220000 {
-		compatible = "mediatek,mt2701-audio";
-		reg = <0 0x11220000 0 0x2000>,
-		      <0 0x112A0000 0 0x20000>;
-		interrupts = <GIC_SPI 104 IRQ_TYPE_LEVEL_LOW>,
-			     <GIC_SPI 132 IRQ_TYPE_LEVEL_LOW>;
-		interrupt-names	= "afe", "asys";
-		power-domains = <&scpsys MT2701_POWER_DOMAIN_IFR_MSC>;
-		clocks = <&infracfg CLK_INFRA_AUDIO>,
-			 <&topckgen CLK_TOP_AUD_MUX1_SEL>,
-			 <&topckgen CLK_TOP_AUD_MUX2_SEL>,
-			 <&topckgen CLK_TOP_AUD_MUX1_DIV>,
-			 <&topckgen CLK_TOP_AUD_MUX2_DIV>,
-			 <&topckgen CLK_TOP_AUD_48K_TIMING>,
-			 <&topckgen CLK_TOP_AUD_44K_TIMING>,
-			 <&topckgen CLK_TOP_AUDPLL_MUX_SEL>,
-			 <&topckgen CLK_TOP_APLL_SEL>,
-			 <&topckgen CLK_TOP_AUD1PLL_98M>,
-			 <&topckgen CLK_TOP_AUD2PLL_90M>,
-			 <&topckgen CLK_TOP_HADDS2PLL_98M>,
-			 <&topckgen CLK_TOP_HADDS2PLL_294M>,
-			 <&topckgen CLK_TOP_AUDPLL>,
-			 <&topckgen CLK_TOP_AUDPLL_D4>,
-			 <&topckgen CLK_TOP_AUDPLL_D8>,
-			 <&topckgen CLK_TOP_AUDPLL_D16>,
-			 <&topckgen CLK_TOP_AUDPLL_D24>,
-			 <&topckgen CLK_TOP_AUDINTBUS_SEL>,
-			 <&clk26m>,
-			 <&topckgen CLK_TOP_SYSPLL1_D4>,
-			 <&topckgen CLK_TOP_AUD_K1_SRC_SEL>,
-			 <&topckgen CLK_TOP_AUD_K2_SRC_SEL>,
-			 <&topckgen CLK_TOP_AUD_K3_SRC_SEL>,
-			 <&topckgen CLK_TOP_AUD_K4_SRC_SEL>,
-			 <&topckgen CLK_TOP_AUD_K5_SRC_SEL>,
-			 <&topckgen CLK_TOP_AUD_K6_SRC_SEL>,
-			 <&topckgen CLK_TOP_AUD_K1_SRC_DIV>,
-			 <&topckgen CLK_TOP_AUD_K2_SRC_DIV>,
-			 <&topckgen CLK_TOP_AUD_K3_SRC_DIV>,
-			 <&topckgen CLK_TOP_AUD_K4_SRC_DIV>,
-			 <&topckgen CLK_TOP_AUD_K5_SRC_DIV>,
-			 <&topckgen CLK_TOP_AUD_K6_SRC_DIV>,
-			 <&topckgen CLK_TOP_AUD_I2S1_MCLK>,
-			 <&topckgen CLK_TOP_AUD_I2S2_MCLK>,
-			 <&topckgen CLK_TOP_AUD_I2S3_MCLK>,
-			 <&topckgen CLK_TOP_AUD_I2S4_MCLK>,
-			 <&topckgen CLK_TOP_AUD_I2S5_MCLK>,
-			 <&topckgen CLK_TOP_AUD_I2S6_MCLK>,
-			 <&topckgen CLK_TOP_ASM_M_SEL>,
-			 <&topckgen CLK_TOP_ASM_H_SEL>,
-			 <&topckgen CLK_TOP_UNIVPLL2_D4>,
-			 <&topckgen CLK_TOP_UNIVPLL2_D2>,
-			 <&topckgen CLK_TOP_SYSPLL_D5>;
+	audsys: audio-subsystem@11220000 {
+		compatible = "mediatek,mt2701-audsys", "syscon", "simple-mfd";
+		...
 
-		clock-names = "infra_sys_audio_clk",
-			      "top_audio_mux1_sel",
-			      "top_audio_mux2_sel",
-			      "top_audio_mux1_div",
-			      "top_audio_mux2_div",
-			      "top_audio_48k_timing",
-			      "top_audio_44k_timing",
-			      "top_audpll_mux_sel",
-			      "top_apll_sel",
-			      "top_aud1_pll_98M",
-			      "top_aud2_pll_90M",
-			      "top_hadds2_pll_98M",
-			      "top_hadds2_pll_294M",
-			      "top_audpll",
-			      "top_audpll_d4",
-			      "top_audpll_d8",
-			      "top_audpll_d16",
-			      "top_audpll_d24",
-			      "top_audintbus_sel",
-			      "clk_26m",
-			      "top_syspll1_d4",
-			      "top_aud_k1_src_sel",
-			      "top_aud_k2_src_sel",
-			      "top_aud_k3_src_sel",
-			      "top_aud_k4_src_sel",
-			      "top_aud_k5_src_sel",
-			      "top_aud_k6_src_sel",
-			      "top_aud_k1_src_div",
-			      "top_aud_k2_src_div",
-			      "top_aud_k3_src_div",
-			      "top_aud_k4_src_div",
-			      "top_aud_k5_src_div",
-			      "top_aud_k6_src_div",
-			      "top_aud_i2s1_mclk",
-			      "top_aud_i2s2_mclk",
-			      "top_aud_i2s3_mclk",
-			      "top_aud_i2s4_mclk",
-			      "top_aud_i2s5_mclk",
-			      "top_aud_i2s6_mclk",
-			      "top_asm_m_sel",
-			      "top_asm_h_sel",
-			      "top_univpll2_d4",
-			      "top_univpll2_d2",
-			      "top_syspll_d5";
+		afe: audio-controller {
+			compatible = "mediatek,mt2701-audio";
+			interrupts =  <GIC_SPI 104 IRQ_TYPE_LEVEL_LOW>,
+				      <GIC_SPI 132 IRQ_TYPE_LEVEL_LOW>;
+			interrupt-names	= "afe", "asys";
+			power-domains = <&scpsys MT2701_POWER_DOMAIN_IFR_MSC>;
+
+			clocks = <&infracfg CLK_INFRA_AUDIO>,
+				 <&topckgen CLK_TOP_AUD_MUX1_SEL>,
+				 <&topckgen CLK_TOP_AUD_MUX2_SEL>,
+				 <&topckgen CLK_TOP_AUD_48K_TIMING>,
+				 <&topckgen CLK_TOP_AUD_44K_TIMING>,
+				 <&topckgen CLK_TOP_AUD_K1_SRC_SEL>,
+				 <&topckgen CLK_TOP_AUD_K2_SRC_SEL>,
+				 <&topckgen CLK_TOP_AUD_K3_SRC_SEL>,
+				 <&topckgen CLK_TOP_AUD_K4_SRC_SEL>,
+				 <&topckgen CLK_TOP_AUD_K1_SRC_DIV>,
+				 <&topckgen CLK_TOP_AUD_K2_SRC_DIV>,
+				 <&topckgen CLK_TOP_AUD_K3_SRC_DIV>,
+				 <&topckgen CLK_TOP_AUD_K4_SRC_DIV>,
+				 <&topckgen CLK_TOP_AUD_I2S1_MCLK>,
+				 <&topckgen CLK_TOP_AUD_I2S2_MCLK>,
+				 <&topckgen CLK_TOP_AUD_I2S3_MCLK>,
+				 <&topckgen CLK_TOP_AUD_I2S4_MCLK>,
+				 <&audsys CLK_AUD_I2SO1>,
+				 <&audsys CLK_AUD_I2SO2>,
+				 <&audsys CLK_AUD_I2SO3>,
+				 <&audsys CLK_AUD_I2SO4>,
+				 <&audsys CLK_AUD_I2SIN1>,
+				 <&audsys CLK_AUD_I2SIN2>,
+				 <&audsys CLK_AUD_I2SIN3>,
+				 <&audsys CLK_AUD_I2SIN4>,
+				 <&audsys CLK_AUD_ASRCO1>,
+				 <&audsys CLK_AUD_ASRCO2>,
+				 <&audsys CLK_AUD_ASRCO3>,
+				 <&audsys CLK_AUD_ASRCO4>,
+				 <&audsys CLK_AUD_AFE>,
+				 <&audsys CLK_AUD_AFE_CONN>,
+				 <&audsys CLK_AUD_A1SYS>,
+				 <&audsys CLK_AUD_A2SYS>,
+				 <&audsys CLK_AUD_AFE_MRGIF>;
+
+			clock-names = "infra_sys_audio_clk",
+				      "top_audio_mux1_sel",
+				      "top_audio_mux2_sel",
+				      "top_audio_a1sys_hp",
+				      "top_audio_a2sys_hp",
+				      "i2s0_src_sel",
+				      "i2s1_src_sel",
+				      "i2s2_src_sel",
+				      "i2s3_src_sel",
+				      "i2s0_src_div",
+				      "i2s1_src_div",
+				      "i2s2_src_div",
+				      "i2s3_src_div",
+				      "i2s0_mclk_en",
+				      "i2s1_mclk_en",
+				      "i2s2_mclk_en",
+				      "i2s3_mclk_en",
+				      "i2so0_hop_ck",
+				      "i2so1_hop_ck",
+				      "i2so2_hop_ck",
+				      "i2so3_hop_ck",
+				      "i2si0_hop_ck",
+				      "i2si1_hop_ck",
+				      "i2si2_hop_ck",
+				      "i2si3_hop_ck",
+				      "asrc0_out_ck",
+				      "asrc1_out_ck",
+				      "asrc2_out_ck",
+				      "asrc3_out_ck",
+				      "audio_afe_pd",
+				      "audio_afe_conn_pd",
+				      "audio_a1sys_pd",
+				      "audio_a2sys_pd",
+				      "audio_mrgif_pd";
+
+			assigned-clocks = <&topckgen CLK_TOP_AUD_MUX1_SEL>,
+					  <&topckgen CLK_TOP_AUD_MUX2_SEL>,
+					  <&topckgen CLK_TOP_AUD_MUX1_DIV>,
+					  <&topckgen CLK_TOP_AUD_MUX2_DIV>;
+			assigned-clock-parents = <&topckgen CLK_TOP_AUD1PLL_98M>,
+						 <&topckgen CLK_TOP_AUD2PLL_90M>;
+			assigned-clock-rates = <0>, <0>, <49152000>, <45158400>;
+		};
 	};
diff --git a/Documentation/devicetree/bindings/sound/mxs-audio-sgtl5000.txt b/Documentation/devicetree/bindings/sound/mxs-audio-sgtl5000.txt
index 601c518..4eb980b 100644
--- a/Documentation/devicetree/bindings/sound/mxs-audio-sgtl5000.txt
+++ b/Documentation/devicetree/bindings/sound/mxs-audio-sgtl5000.txt
@@ -1,10 +1,31 @@
 * Freescale MXS audio complex with SGTL5000 codec
 
 Required properties:
-- compatible: "fsl,mxs-audio-sgtl5000"
-- model: The user-visible name of this sound complex
-- saif-controllers: The phandle list of the MXS SAIF controller
-- audio-codec: The phandle of the SGTL5000 audio codec
+- compatible		: "fsl,mxs-audio-sgtl5000"
+- model			: The user-visible name of this sound complex
+- saif-controllers	: The phandle list of the MXS SAIF controller
+- audio-codec		: The phandle of the SGTL5000 audio codec
+- audio-routing		: A list of the connections between audio components.
+			  Each entry is a pair of strings, the first being the
+			  connection's sink, the second being the connection's
+			  source. Valid names could be power supplies, SGTL5000
+			  pins, and the jacks on the board:
+
+			  Power supplies:
+			   * Mic Bias
+
+			  SGTL5000 pins:
+			   * MIC_IN
+			   * LINE_IN
+			   * HP_OUT
+			   * LINE_OUT
+
+			  Board connectors:
+			   * Mic Jack
+			   * Line In Jack
+			   * Headphone Jack
+			   * Line Out Jack
+			   * Ext Spk
 
 Example:
 
@@ -14,4 +35,8 @@
 	model = "imx28-evk-sgtl5000";
 	saif-controllers = <&saif0 &saif1>;
 	audio-codec = <&sgtl5000>;
+	audio-routing =
+		"MIC_IN", "Mic Jack",
+		"Mic Jack", "Mic Bias",
+		"Headphone Jack", "HP_OUT";
 };
diff --git a/Documentation/devicetree/bindings/sound/nau8825.txt b/Documentation/devicetree/bindings/sound/nau8825.txt
index 2f5e973..d16d968 100644
--- a/Documentation/devicetree/bindings/sound/nau8825.txt
+++ b/Documentation/devicetree/bindings/sound/nau8825.txt
@@ -69,7 +69,7 @@
   - nuvoton,jack-insert-debounce: number from 0 to 7 that sets debounce time to 2^(n+2) ms
   - nuvoton,jack-eject-debounce: number from 0 to 7 that sets debounce time to 2^(n+2) ms
 
-  - nuvoton,crosstalk-bypass: make crosstalk function bypass if set.
+  - nuvoton,crosstalk-enable: make crosstalk function enable if set.
 
   - clocks: list of phandle and clock specifier pairs according to common clock bindings for the
       clocks described in clock-names
@@ -98,7 +98,7 @@
       nuvoton,short-key-debounce = <2>;
       nuvoton,jack-insert-debounce = <7>;
       nuvoton,jack-eject-debounce = <7>;
-      nuvoton,crosstalk-bypass;
+      nuvoton,crosstalk-enable;
 
       clock-names = "mclk";
       clocks = <&tegra_car TEGRA210_CLK_CLK_OUT_2>;
diff --git a/Documentation/devicetree/bindings/sound/pcm186x.txt b/Documentation/devicetree/bindings/sound/pcm186x.txt
new file mode 100644
index 0000000..1087f48
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/pcm186x.txt
@@ -0,0 +1,42 @@
+Texas Instruments PCM186x Universal Audio ADC
+
+These devices support both I2C and SPI (configured with pin strapping
+on the board).
+
+Required properties:
+
+ - compatible : "ti,pcm1862",
+                "ti,pcm1863",
+                "ti,pcm1864",
+                "ti,pcm1865"
+
+ - reg : The I2C address of the device for I2C, the chip select
+         number for SPI.
+
+ - avdd-supply: Analog core power supply (3.3v)
+ - dvdd-supply: Digital core power supply
+ - iovdd-supply: Digital IO power supply
+        See regulator/regulator.txt for more information
+
+CODEC input pins:
+ * VINL1
+ * VINR1
+ * VINL2
+ * VINR2
+ * VINL3
+ * VINR3
+ * VINL4
+ * VINR4
+
+The pins can be used in referring sound node's audio-routing property.
+
+Example:
+
+	pcm186x: audio-codec@4a {
+		compatible = "ti,pcm1865";
+		reg = <0x4a>;
+
+		avdd-supply = <&reg_3v3_analog>;
+		dvdd-supply = <&reg_3v3>;
+		iovdd-supply = <&reg_1v8>;
+	};
diff --git a/Documentation/devicetree/bindings/sound/renesas,rsnd.txt b/Documentation/devicetree/bindings/sound/renesas,rsnd.txt
index 085bec3..5bed9a5 100644
--- a/Documentation/devicetree/bindings/sound/renesas,rsnd.txt
+++ b/Documentation/devicetree/bindings/sound/renesas,rsnd.txt
@@ -4,7 +4,7 @@
 * Modules
 =============================================
 
-Renesas R-Car sound is constructed from below modules
+Renesas R-Car and RZ/G sound is constructed from below modules
 (for Gen2 or later)
 
  SCU		: Sampling Rate Converter Unit
@@ -197,12 +197,17 @@
 	[MEM] -> [SRC2] -> [CTU03] -+
 
 	sound {
+		#address-cells = <1>;
+		#size-cells = <0>;
+
 		compatible = "simple-scu-audio-card";
 		...
-		simple-audio-card,cpu-0 {
+		simple-audio-card,cpu@0 {
+			reg = <0>;
 			sound-dai = <&rcar_sound 0>;
 		};
-		simple-audio-card,cpu-1 {
+		simple-audio-card,cpu@1 {
+			reg = <1>;
 			sound-dai = <&rcar_sound 1>;
 		};
 		simple-audio-card,codec {
@@ -334,9 +339,11 @@
 
 - compatible			: "renesas,rcar_sound-<soctype>", fallbacks
 				  "renesas,rcar_sound-gen1" if generation1, and
-				  "renesas,rcar_sound-gen2" if generation2
+				  "renesas,rcar_sound-gen2" if generation2 (or RZ/G1)
 				  "renesas,rcar_sound-gen3" if generation3
 				  Examples with soctypes are:
+				    - "renesas,rcar_sound-r8a7743" (RZ/G1M)
+				    - "renesas,rcar_sound-r8a7745" (RZ/G1E)
 				    - "renesas,rcar_sound-r8a7778" (R-Car M1A)
 				    - "renesas,rcar_sound-r8a7779" (R-Car H1)
 				    - "renesas,rcar_sound-r8a7790" (R-Car H2)
diff --git a/Documentation/devicetree/bindings/sound/simple-card.txt b/Documentation/devicetree/bindings/sound/simple-card.txt
index 166f229..17c13e7 100644
--- a/Documentation/devicetree/bindings/sound/simple-card.txt
+++ b/Documentation/devicetree/bindings/sound/simple-card.txt
@@ -140,6 +140,7 @@
 	simple-audio-card,name = "Cubox Audio";
 
 	simple-audio-card,dai-link@0 {		/* I2S - HDMI */
+		reg = <0>;
 		format = "i2s";
 		cpu {
 			sound-dai = <&audio1 0>;
@@ -150,6 +151,7 @@
 	};
 
 	simple-audio-card,dai-link@1 {		/* S/PDIF - HDMI */
+		reg = <1>;
 		cpu {
 			sound-dai = <&audio1 1>;
 		};
@@ -159,6 +161,7 @@
 	};
 
 	simple-audio-card,dai-link@2 {		/* S/PDIF - S/PDIF */
+		reg = <2>;
 		cpu {
 			sound-dai = <&audio1 1>;
 		};
diff --git a/Documentation/devicetree/bindings/sound/st,stm32-adfsdm.txt b/Documentation/devicetree/bindings/sound/st,stm32-adfsdm.txt
new file mode 100644
index 0000000..864f5b0
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/st,stm32-adfsdm.txt
@@ -0,0 +1,63 @@
+STMicroelectronics Audio Digital Filter Sigma Delta modulators(DFSDM)
+
+The DFSDM allows PDM microphones capture through SPI interface. The Audio
+interface is seems as a sub block of the DFSDM device.
+For details on DFSDM bindings refer to ../iio/adc/st,stm32-dfsdm-adc.txt
+
+Required properties:
+  - compatible: "st,stm32h7-dfsdm-dai".
+
+  - #sound-dai-cells : Must be equal to 0
+
+  - io-channels : phandle to iio dfsdm instance node.
+
+Example of a sound card using audio DFSDM node.
+
+	sound_card {
+		compatible = "audio-graph-card";
+
+		dais = <&cpu_port>;
+	};
+
+	dfsdm: dfsdm@40017000 {
+		compatible = "st,stm32h7-dfsdm";
+		reg = <0x40017000 0x400>;
+		clocks = <&rcc DFSDM1_CK>;
+		clock-names = "dfsdm";
+		#interrupt-cells = <1>;
+		#address-cells = <1>;
+		#size-cells = <0>;
+
+		dfsdm_adc0: filter@0 {
+			compatible = "st,stm32-dfsdm-dmic";
+			reg = <0>;
+			interrupts = <110>;
+			dmas = <&dmamux1 101 0x400 0x00>;
+			dma-names = "rx";
+			st,adc-channels = <1>;
+			st,adc-channel-names = "dmic0";
+			st,adc-channel-types = "SPI_R";
+			st,adc-channel-clk-src = "CLKOUT";
+			st,filter-order = <5>;
+
+			dfsdm_dai0: dfsdm-dai {
+				compatible = "st,stm32h7-dfsdm-dai";
+				#sound-dai-cells = <0>;
+				io-channels = <&dfsdm_adc0 0>;
+				cpu_port: port {
+				dfsdm_endpoint: endpoint {
+					remote-endpoint = <&dmic0_endpoint>;
+				};
+			};
+		};
+	};
+
+	dmic0: dmic@0 {
+		compatible = "dmic-codec";
+		#sound-dai-cells = <0>;
+		port {
+			dmic0_endpoint: endpoint {
+				remote-endpoint = <&dfsdm_endpoint>;
+			};
+		};
+	};
diff --git a/Documentation/devicetree/bindings/sound/st,stm32-sai.txt b/Documentation/devicetree/bindings/sound/st,stm32-sai.txt
index 1f9cd70..b1acc1a 100644
--- a/Documentation/devicetree/bindings/sound/st,stm32-sai.txt
+++ b/Documentation/devicetree/bindings/sound/st,stm32-sai.txt
@@ -20,11 +20,6 @@
 
 Optional properties:
   - resets: Reference to a reset controller asserting the SAI
-  - st,sync: specify synchronization mode.
-	By default SAI sub-block is in asynchronous mode.
-	This property sets SAI sub-block as slave of another SAI sub-block.
-	Must contain the phandle and index of the sai sub-block providing
-	the synchronization.
 
 SAI subnodes:
 Two subnodes corresponding to SAI sub-block instances A et B can be defined.
@@ -44,6 +39,13 @@
   - pinctrl-names: should contain only value "default"
   - pinctrl-0: see Documentation/devicetree/bindings/pinctrl/pinctrl-stm32.txt
 
+SAI subnodes Optional properties:
+  - st,sync: specify synchronization mode.
+	By default SAI sub-block is in asynchronous mode.
+	This property sets SAI sub-block as slave of another SAI sub-block.
+	Must contain the phandle and index of the sai sub-block providing
+	the synchronization.
+
 The device node should contain one 'port' child node with one child 'endpoint'
 node, according to the bindings defined in Documentation/devicetree/bindings/
 graph.txt.
diff --git a/Documentation/devicetree/bindings/sound/sun4i-i2s.txt b/Documentation/devicetree/bindings/sound/sun4i-i2s.txt
index 05d7135..b9d50d6 100644
--- a/Documentation/devicetree/bindings/sound/sun4i-i2s.txt
+++ b/Documentation/devicetree/bindings/sound/sun4i-i2s.txt
@@ -8,6 +8,7 @@
 - compatible: should be one of the following:
    - "allwinner,sun4i-a10-i2s"
    - "allwinner,sun6i-a31-i2s"
+   - "allwinner,sun8i-a83t-i2s"
    - "allwinner,sun8i-h3-i2s"
 - reg: physical base address of the controller and length of memory mapped
   region.
@@ -23,6 +24,7 @@
 
 Required properties for the following compatibles:
 	- "allwinner,sun6i-a31-i2s"
+	- "allwinner,sun8i-a83t-i2s"
 	- "allwinner,sun8i-h3-i2s"
 - resets: phandle to the reset line for this codec
 
diff --git a/Documentation/devicetree/bindings/sound/tas5720.txt b/Documentation/devicetree/bindings/sound/tas5720.txt
index 40d94f8..7481653f 100644
--- a/Documentation/devicetree/bindings/sound/tas5720.txt
+++ b/Documentation/devicetree/bindings/sound/tas5720.txt
@@ -6,10 +6,12 @@
 
 http://www.ti.com/product/TAS5720L
 http://www.ti.com/product/TAS5720M
+http://www.ti.com/product/TAS5722L
 
 Required properties:
 
-- compatible : "ti,tas5720"
+- compatible : "ti,tas5720",
+               "ti,tas5722"
 - reg : I2C slave address
 - dvdd-supply : phandle to a 3.3-V supply for the digital circuitry
 - pvdd-supply : phandle to a supply used for the Class-D amp and the analog
diff --git a/Documentation/devicetree/bindings/sound/tfa9879.txt b/Documentation/devicetree/bindings/sound/tfa9879.txt
index 23ba522..1620e68 100644
--- a/Documentation/devicetree/bindings/sound/tfa9879.txt
+++ b/Documentation/devicetree/bindings/sound/tfa9879.txt
@@ -6,18 +6,18 @@
 
 - reg : the I2C address of the device
 
+- #sound-dai-cells : must be 0.
+
 Example:
 
 &i2c1 {
-	clock-frequency = <100000>;
 	pinctrl-names = "default";
 	pinctrl-0 = <&pinctrl_i2c1>;
-	status = "okay";
 
-	codec: tfa9879@6c {
+	amp: amp@6c {
 		#sound-dai-cells = <0>;
 		compatible = "nxp,tfa9879";
 		reg = <0x6c>;
-        };
+	};
 };
 
diff --git a/Documentation/devicetree/bindings/sound/ti,tas6424.txt b/Documentation/devicetree/bindings/sound/ti,tas6424.txt
new file mode 100644
index 0000000..1c4ada0
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/ti,tas6424.txt
@@ -0,0 +1,20 @@
+Texas Instruments TAS6424 Quad-Channel Audio amplifier
+
+The TAS6424 serial control bus communicates through I2C protocols.
+
+Required properties:
+	- compatible: "ti,tas6424" - TAS6424
+	- reg: I2C slave address
+	- sound-dai-cells: must be equal to 0
+
+Example:
+
+tas6424: tas6424@6a {
+	compatible = "ti,tas6424";
+	reg = <0x6a>;
+
+	#sound-dai-cells = <0>;
+};
+
+For more product information please see the link below:
+http://www.ti.com/product/TAS6424-Q1
diff --git a/Documentation/devicetree/bindings/sound/tlv320aic31xx.txt b/Documentation/devicetree/bindings/sound/tlv320aic31xx.txt
index 6fbba56..5b3c33b 100644
--- a/Documentation/devicetree/bindings/sound/tlv320aic31xx.txt
+++ b/Documentation/devicetree/bindings/sound/tlv320aic31xx.txt
@@ -22,7 +22,7 @@
 
 Optional properties:
 
-- gpio-reset - gpio pin number used for codec reset
+- reset-gpios - GPIO specification for the active low RESET input.
 - ai31xx-micbias-vg - MicBias Voltage setting
         1 or MICBIAS_2_0V - MICBIAS output is powered to 2.0V
         2 or MICBIAS_2_5V - MICBIAS output is powered to 2.5V
@@ -30,6 +30,10 @@
 	If this node is not mentioned or if the value is unknown, then
 	micbias	is set to 2.0V.
 
+Deprecated properties:
+
+- gpio-reset - gpio pin number used for codec reset
+
 CODEC output pins:
   * HPL
   * HPR
@@ -48,6 +52,7 @@
 The pins can be used in referring sound node's audio-routing property.
 
 Example:
+#include <dt-bindings/gpio/gpio.h>
 #include <dt-bindings/sound/tlv320aic31xx-micbias.h>
 
 tlv320aic31xx: tlv320aic31xx@18 {
@@ -56,6 +61,8 @@
 
 	ai31xx-micbias-vg = <MICBIAS_OFF>;
 
+	reset-gpios = <&gpio1 17 GPIO_ACTIVE_LOW>;
+
 	HPVDD-supply = <&regulator>;
 	SPRVDD-supply = <&regulator>;
 	SPLVDD-supply = <&regulator>;
diff --git a/Documentation/devicetree/bindings/sound/tlv320aic3x.txt b/Documentation/devicetree/bindings/sound/tlv320aic3x.txt
index ba5b45c..9796c463 100644
--- a/Documentation/devicetree/bindings/sound/tlv320aic3x.txt
+++ b/Documentation/devicetree/bindings/sound/tlv320aic3x.txt
@@ -17,7 +17,7 @@
 
 Optional properties:
 
-- gpio-reset - gpio pin number used for codec reset
+- reset-gpios - GPIO specification for the active low RESET input.
 - ai3x-gpio-func - <array of 2 int> - AIC3X_GPIO1 & AIC3X_GPIO2 Functionality
 				    - Not supported on tlv320aic3104
 - ai3x-micbias-vg - MicBias Voltage required.
@@ -34,6 +34,10 @@
 - AVDD-supply, IOVDD-supply, DRVDD-supply, DVDD-supply : power supplies for the
   device as covered in Documentation/devicetree/bindings/regulator/regulator.txt
 
+Deprecated properties:
+
+- gpio-reset - gpio pin number used for codec reset
+
 CODEC output pins:
   * LLOUT
   * RLOUT
@@ -61,10 +65,14 @@
 
 Example:
 
+#include <dt-bindings/gpio/gpio.h>
+
 tlv320aic3x: tlv320aic3x@1b {
 	compatible = "ti,tlv320aic3x";
 	reg = <0x1b>;
 
+	reset-gpios = <&gpio1 17 GPIO_ACTIVE_LOW>;
+
 	AVDD-supply = <&regulator>;
 	IOVDD-supply = <&regulator>;
 	DRVDD-supply = <&regulator>;
diff --git a/Documentation/devicetree/bindings/sound/tscs42xx.txt b/Documentation/devicetree/bindings/sound/tscs42xx.txt
new file mode 100644
index 0000000..2ac2f09
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/tscs42xx.txt
@@ -0,0 +1,16 @@
+TSCS42XX Audio CODEC
+
+Required Properties:
+
+	- compatible :	"tempo,tscs42A1" for analog mic
+			"tempo,tscs42A2" for digital mic
+
+	- reg : 	<0x71> for analog mic
+			<0x69> for digital mic
+
+Example:
+
+wookie: codec@69 {
+	compatible = "tempo,tscs42A2";
+	reg = <0x69>;
+};
diff --git a/Documentation/devicetree/bindings/sound/uniphier,evea.txt b/Documentation/devicetree/bindings/sound/uniphier,evea.txt
new file mode 100644
index 0000000..3f31b23
--- /dev/null
+++ b/Documentation/devicetree/bindings/sound/uniphier,evea.txt
@@ -0,0 +1,26 @@
+Socionext EVEA - UniPhier SoC internal codec driver
+
+Required properties:
+- compatible      : should be "socionext,uniphier-evea".
+- reg             : offset and length of the register set for the device.
+- clock-names     : should include following entries:
+                    "evea", "exiv"
+- clocks          : a list of phandle, should contain an entry for each
+                    entries in clock-names.
+- reset-names     : should include following entries:
+                    "evea", "exiv", "adamv"
+- resets          : a list of phandle, should contain reset entries of
+                    reset-names.
+- #sound-dai-cells: should be 1.
+
+Example:
+
+	codec {
+		compatible = "socionext,uniphier-evea";
+		reg = <0x57900000 0x1000>;
+		clock-names = "evea", "exiv";
+		clocks = <&sys_clk 41>, <&sys_clk 42>;
+		reset-names = "evea", "exiv", "adamv";
+		resets = <&sys_rst 41>, <&sys_rst 42>, <&adamv_rst 0>;
+		#sound-dai-cells = <1>;
+	};
diff --git a/Documentation/devicetree/bindings/spi/sh-msiof.txt b/Documentation/devicetree/bindings/spi/sh-msiof.txt
index bdd8395..80710f0 100644
--- a/Documentation/devicetree/bindings/spi/sh-msiof.txt
+++ b/Documentation/devicetree/bindings/spi/sh-msiof.txt
@@ -36,7 +36,21 @@
 
 Optional properties:
 - clocks               : Must contain a reference to the functional clock.
-- num-cs               : Total number of chip-selects (default is 1)
+- num-cs               : Total number of chip selects (default is 1).
+			 Up to 3 native chip selects are supported:
+			   0: MSIOF_SYNC
+			   1: MSIOF_SS1
+			   2: MSIOF_SS2
+			 Hardware limitations related to chip selects:
+			   - Native chip selects are always deasserted in
+			     between transfers that are part of the same
+			     message.  Use cs-gpios to work around this.
+			   - All slaves using native chip selects must use the
+			     same spi-cs-high configuration.  Use cs-gpios to
+			     work around this.
+			   - When using GPIO chip selects, at least one native
+			     chip select must be left unused, as it will be
+			     driven anyway.
 - dmas                 : Must contain a list of two references to DMA
 			 specifiers, one for transmission, and one for
 			 reception.
diff --git a/Documentation/devicetree/bindings/spi/spi-meson.txt b/Documentation/devicetree/bindings/spi/spi-meson.txt
index 825c39c..b7f5e86 100644
--- a/Documentation/devicetree/bindings/spi/spi-meson.txt
+++ b/Documentation/devicetree/bindings/spi/spi-meson.txt
@@ -27,7 +27,9 @@
 communications with dedicated 16 words RX/TX PIO FIFOs.
 
 Required properties:
- - compatible: should be "amlogic,meson-gx-spicc" on Amlogic GX SoCs.
+ - compatible: should be:
+	"amlogic,meson-gx-spicc" on Amlogic GX and compatible SoCs.
+	"amlogic,meson-axg-spicc" on Amlogic AXG and compatible SoCs
  - reg: physical base address and length of the controller registers
  - interrupts: The interrupt specifier
  - clock-names: Must contain "core"
diff --git a/Documentation/devicetree/bindings/spi/spi-orion.txt b/Documentation/devicetree/bindings/spi/spi-orion.txt
index df8ec31..8434a65 100644
--- a/Documentation/devicetree/bindings/spi/spi-orion.txt
+++ b/Documentation/devicetree/bindings/spi/spi-orion.txt
@@ -18,8 +18,17 @@
 	The eight register sets following the control registers refer to
 	chip-select lines 0 through 7 respectively.
 - cell-index : Which of multiple SPI controllers is this.
+- clocks : pointers to the reference clocks for this device, the first
+	   one is the one used for the clock on the spi bus, the
+	   second one is optional and is the clock used for the
+	   functional part of the controller
+
 Optional properties:
 - interrupts : Is currently not used.
+- clock-names : names of used clocks, mandatory if the second clock is
+		used, the name must be "core", and "axi" (the latter
+		is only for Armada 7K/8K).
+
 
 Example:
        spi@10600 {
diff --git a/Documentation/devicetree/bindings/spi/spi-xilinx.txt b/Documentation/devicetree/bindings/spi/spi-xilinx.txt
index c7b7856..7bf61ef 100644
--- a/Documentation/devicetree/bindings/spi/spi-xilinx.txt
+++ b/Documentation/devicetree/bindings/spi/spi-xilinx.txt
@@ -2,7 +2,7 @@
 -------------------------------------------------
 
 Required properties:
-- compatible		: Should be "xlnx,xps-spi-2.00.a" or "xlnx,xps-spi-2.00.b"
+- compatible		: Should be "xlnx,xps-spi-2.00.a", "xlnx,xps-spi-2.00.b" or "xlnx,axi-quad-spi-1.00.a"
 - reg			: Physical base address and size of SPI registers map.
 - interrupts		: Property with a value describing the interrupt
 			  number.
diff --git a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
index e3c28da..977054f 100644
--- a/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
+++ b/Documentation/devicetree/bindings/timer/actions,owl-timer.txt
@@ -2,6 +2,7 @@
 
 Required properties:
 - compatible      :  "actions,s500-timer" for S500
+                     "actions,s700-timer" for S700
                      "actions,s900-timer" for S900
 - reg             :  Offset and length of the register set for the device.
 - interrupts      :  Should contain the interrupts.
diff --git a/Documentation/devicetree/bindings/timer/spreadtrum,sprd-timer.txt b/Documentation/devicetree/bindings/timer/spreadtrum,sprd-timer.txt
new file mode 100644
index 0000000..6d97e7d
--- /dev/null
+++ b/Documentation/devicetree/bindings/timer/spreadtrum,sprd-timer.txt
@@ -0,0 +1,20 @@
+Spreadtrum timers
+
+The Spreadtrum SC9860 platform provides 3 general-purpose timers.
+These timers can support 32bit or 64bit counter, as well as supporting
+period mode or one-shot mode, and they are can be wakeup source
+during deep sleep.
+
+Required properties:
+- compatible: should be "sprd,sc9860-timer" for SC9860 platform.
+- reg: The register address of the timer device.
+- interrupts: Should contain the interrupt for the timer device.
+- clocks: The phandle to the source clock (usually a 32.768 KHz fixed clock).
+
+Example:
+	timer@40050000 {
+		compatible = "sprd,sc9860-timer";
+		reg = <0 0x40050000 0 0x20>;
+		interrupts = <GIC_SPI 26 IRQ_TYPE_LEVEL_HIGH>;
+		clocks = <&ext_32k>;
+	};
diff --git a/Documentation/devicetree/bindings/vendor-prefixes.txt b/Documentation/devicetree/bindings/vendor-prefixes.txt
index 0994bdd..f776fb8 100644
--- a/Documentation/devicetree/bindings/vendor-prefixes.txt
+++ b/Documentation/devicetree/bindings/vendor-prefixes.txt
@@ -347,6 +347,7 @@
 tcl	Toby Churchill Ltd.
 technexion	TechNexion
 technologic	Technologic Systems
+tempo	Tempo Semiconductor
 terasic	Terasic Inc.
 thine	THine Electronics, Inc.
 ti	Texas Instruments
diff --git a/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt b/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt
new file mode 100644
index 0000000..3de9618
--- /dev/null
+++ b/Documentation/devicetree/bindings/watchdog/zii,rave-sp-wdt.txt
@@ -0,0 +1,39 @@
+Zodiac Inflight Innovations RAVE Supervisory Processor Watchdog Bindings
+
+RAVE SP watchdog device is a "MFD cell" device corresponding to
+watchdog functionality of RAVE Supervisory Processor. It is expected
+that its Device Tree node is specified as a child of the node
+corresponding to the parent RAVE SP device (as documented in
+Documentation/devicetree/bindings/mfd/zii,rave-sp.txt)
+
+Required properties:
+
+- compatible: Depending on wire protocol implemented by RAVE SP
+  firmware, should be one of:
+	- "zii,rave-sp-watchdog"
+	- "zii,rave-sp-watchdog-legacy"
+
+Optional properties:
+
+- wdt-timeout:	Two byte nvmem cell specified as per
+		Documentation/devicetree/bindings/nvmem/nvmem.txt
+
+Example:
+
+	rave-sp {
+		compatible = "zii,rave-sp-rdu1";
+		current-speed = <38400>;
+
+		eeprom {
+			wdt_timeout: wdt-timeout@8E {
+				reg = <0x8E 2>;
+			};
+		};
+
+		watchdog {
+			compatible = "zii,rave-sp-watchdog";
+			nvmem-cells = <&wdt_timeout>;
+			nvmem-cell-names = "wdt-timeout";
+		};
+	}
+
diff --git a/Documentation/driver-api/iio/hw-consumer.rst b/Documentation/driver-api/iio/hw-consumer.rst
new file mode 100644
index 0000000..8facce6
--- /dev/null
+++ b/Documentation/driver-api/iio/hw-consumer.rst
@@ -0,0 +1,51 @@
+===========
+HW consumer
+===========
+An IIO device can be directly connected to another device in hardware. in this
+case the buffers between IIO provider and IIO consumer are handled by hardware.
+The Industrial I/O HW consumer offers a way to bond these IIO devices without
+software buffer for data. The implementation can be found under
+:file:`drivers/iio/buffer/hw-consumer.c`
+
+
+* struct :c:type:`iio_hw_consumer` — Hardware consumer structure
+* :c:func:`iio_hw_consumer_alloc` — Allocate IIO hardware consumer
+* :c:func:`iio_hw_consumer_free` — Free IIO hardware consumer
+* :c:func:`iio_hw_consumer_enable` — Enable IIO hardware consumer
+* :c:func:`iio_hw_consumer_disable` — Disable IIO hardware consumer
+
+
+HW consumer setup
+=================
+
+As standard IIO device the implementation is based on IIO provider/consumer.
+A typical IIO HW consumer setup looks like this::
+
+	static struct iio_hw_consumer *hwc;
+
+	static const struct iio_info adc_info = {
+		.read_raw = adc_read_raw,
+	};
+
+	static int adc_read_raw(struct iio_dev *indio_dev,
+				struct iio_chan_spec const *chan, int *val,
+				int *val2, long mask)
+	{
+		ret = iio_hw_consumer_enable(hwc);
+
+		/* Acquire data */
+
+		ret = iio_hw_consumer_disable(hwc);
+	}
+
+	static int adc_probe(struct platform_device *pdev)
+	{
+		hwc = devm_iio_hw_consumer_alloc(&iio->dev);
+	}
+
+More details
+============
+.. kernel-doc:: include/linux/iio/hw-consumer.h
+.. kernel-doc:: drivers/iio/buffer/industrialio-hw-consumer.c
+   :export:
+
diff --git a/Documentation/driver-api/iio/index.rst b/Documentation/driver-api/iio/index.rst
index e5c3922..7fba341 100644
--- a/Documentation/driver-api/iio/index.rst
+++ b/Documentation/driver-api/iio/index.rst
@@ -15,3 +15,4 @@
    buffers
    triggers
    triggered-buffers
+   hw-consumer
diff --git a/Documentation/driver-api/pm/devices.rst b/Documentation/driver-api/pm/devices.rst
index 53c1b0b..1128705 100644
--- a/Documentation/driver-api/pm/devices.rst
+++ b/Documentation/driver-api/pm/devices.rst
@@ -777,17 +777,51 @@
 runtime suspend at the beginning of the ``suspend_late`` phase of system-wide
 suspend (or in the ``poweroff_late`` phase of hibernation), when runtime PM
 has been disabled for it, under the assumption that its state should not change
-after that point until the system-wide transition is over.  If that happens, the
-driver's system-wide resume callbacks, if present, may still be invoked during
-the subsequent system-wide resume transition and the device's runtime power
-management status may be set to "active" before enabling runtime PM for it,
-so the driver must be prepared to cope with the invocation of its system-wide
-resume callbacks back-to-back with its ``->runtime_suspend`` one (without the
-intervening ``->runtime_resume`` and so on) and the final state of the device
-must reflect the "active" status for runtime PM in that case.
+after that point until the system-wide transition is over (the PM core itself
+does that for devices whose "noirq", "late" and "early" system-wide PM callbacks
+are executed directly by it).  If that happens, the driver's system-wide resume
+callbacks, if present, may still be invoked during the subsequent system-wide
+resume transition and the device's runtime power management status may be set
+to "active" before enabling runtime PM for it, so the driver must be prepared to
+cope with the invocation of its system-wide resume callbacks back-to-back with
+its ``->runtime_suspend`` one (without the intervening ``->runtime_resume`` and
+so on) and the final state of the device must reflect the "active" runtime PM
+status in that case.
 
 During system-wide resume from a sleep state it's easiest to put devices into
 the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
-Refer to that document for more information regarding this particular issue as
+[Refer to that document for more information regarding this particular issue as
 well as for information on the device runtime power management framework in
-general.
+general.]
+
+However, it often is desirable to leave devices in suspend after system
+transitions to the working state, especially if those devices had been in
+runtime suspend before the preceding system-wide suspend (or analogous)
+transition.  Device drivers can use the ``DPM_FLAG_LEAVE_SUSPENDED`` flag to
+indicate to the PM core (and middle-layer code) that they prefer the specific
+devices handled by them to be left suspended and they have no problems with
+skipping their system-wide resume callbacks for this reason.  Whether or not the
+devices will actually be left in suspend may depend on their state before the
+given system suspend-resume cycle and on the type of the system transition under
+way.  In particular, devices are not left suspended if that transition is a
+restore from hibernation, as device states are not guaranteed to be reflected
+by the information stored in the hibernation image in that case.
+
+The middle-layer code involved in the handling of the device is expected to
+indicate to the PM core if the device may be left in suspend by setting its
+:c:member:`power.may_skip_resume` status bit which is checked by the PM core
+during the "noirq" phase of the preceding system-wide suspend (or analogous)
+transition.  The middle layer is then responsible for handling the device as
+appropriate in its "noirq" resume callback, which is executed regardless of
+whether or not the device is left suspended, but the other resume callbacks
+(except for ``->complete``) will be skipped automatically by the PM core if the
+device really can be left in suspend.
+
+For devices whose "noirq", "late" and "early" driver callbacks are invoked
+directly by the PM core, all of the system-wide resume callbacks are skipped if
+``DPM_FLAG_LEAVE_SUSPENDED`` is set and the device is in runtime suspend during
+the ``suspend_noirq`` (or analogous) phase or the transition under way is a
+proper system suspend (rather than anything related to hibernation) and the
+device's wakeup settings are suitable for runtime PM (that is, it cannot
+generate wakeup signals at all or it is allowed to wake up the system from
+sleep).
diff --git a/Documentation/driver-model/devres.txt b/Documentation/driver-model/devres.txt
index c180045..7c1bb3d 100644
--- a/Documentation/driver-model/devres.txt
+++ b/Documentation/driver-model/devres.txt
@@ -384,6 +384,9 @@
   devm_reset_control_get()
   devm_reset_controller_register()
 
+SERDEV
+  devm_serdev_device_open()
+
 SLAVE DMA ENGINE
   devm_acpi_dma_controller_register()
 
diff --git a/Documentation/features/debug/KASAN/arch-support.txt b/Documentation/features/debug/KASAN/arch-support.txt
index f377290..3406fae 100644
--- a/Documentation/features/debug/KASAN/arch-support.txt
+++ b/Documentation/features/debug/KASAN/arch-support.txt
@@ -35,5 +35,5 @@
     |          um: | TODO |
     |   unicore32: | TODO |
     |         x86: |  ok  | 64-bit only
-    |      xtensa: | TODO |
+    |      xtensa: |  ok  |
     -----------------------
diff --git a/Documentation/features/debug/stackprotector/arch-support.txt b/Documentation/features/debug/stackprotector/arch-support.txt
index d7acd7b..59a4c9f 100644
--- a/Documentation/features/debug/stackprotector/arch-support.txt
+++ b/Documentation/features/debug/stackprotector/arch-support.txt
@@ -35,5 +35,5 @@
     |          um: | TODO |
     |   unicore32: | TODO |
     |         x86: |  ok  |
-    |      xtensa: | TODO |
+    |      xtensa: |  ok  |
     -----------------------
diff --git a/Documentation/filesystems/nilfs2.txt b/Documentation/filesystems/nilfs2.txt
index c0727dc..f2f3f85 100644
--- a/Documentation/filesystems/nilfs2.txt
+++ b/Documentation/filesystems/nilfs2.txt
@@ -25,8 +25,8 @@
 cleaner or garbage collector) are required.  Details on the tools are
 described in the man pages included in the package.
 
-Project web page:    http://nilfs.sourceforge.net/
-Download page:       http://nilfs.sourceforge.net/en/download.html
+Project web page:    https://nilfs.sourceforge.io/
+Download page:       https://nilfs.sourceforge.io/en/download.html
 List info:           http://vger.kernel.org/vger-lists.html#linux-nilfs
 
 Caveats
diff --git a/Documentation/gpu/i915.rst b/Documentation/gpu/i915.rst
index 2e7ee03..e94d3ac 100644
--- a/Documentation/gpu/i915.rst
+++ b/Documentation/gpu/i915.rst
@@ -341,10 +341,7 @@
 GuC-specific firmware loader
 ----------------------------
 
-.. kernel-doc:: drivers/gpu/drm/i915/intel_guc_loader.c
-   :doc: GuC-specific firmware loader
-
-.. kernel-doc:: drivers/gpu/drm/i915/intel_guc_loader.c
+.. kernel-doc:: drivers/gpu/drm/i915/intel_guc_fw.c
    :internal:
 
 GuC-based command submission
diff --git a/Documentation/hwmon/lm25066 b/Documentation/hwmon/lm25066
index 3fa6bf8..51b32aa 100644
--- a/Documentation/hwmon/lm25066
+++ b/Documentation/hwmon/lm25066
@@ -8,11 +8,6 @@
     Datasheets:
 	http://www.ti.com/lit/gpn/lm25056
 	http://www.ti.com/lit/gpn/lm25056a
-  * TI LM25063
-    Prefix: 'lm25063'
-    Addresses scanned: -
-    Datasheet:
-	To be announced
   * National Semiconductor LM25066
     Prefix: 'lm25066'
     Addresses scanned: -
@@ -42,7 +37,7 @@
 -----------
 
 This driver supports hardware monitoring for National Semiconductor / TI LM25056,
-LM25063, LM25066, LM5064, and LM5066/LM5066I Power Management, Monitoring,
+LM25066, LM5064, and LM5066/LM5066I Power Management, Monitoring,
 Control, and Protection ICs.
 
 The driver is a client driver to the core PMBus driver. Please see
@@ -74,12 +69,8 @@
 in1_average		Average measured input voltage.
 in1_min			Minimum input voltage.
 in1_max			Maximum input voltage.
-in1_crit		Critical high input voltage (LM25063 only).
-in1_lcrit		Critical low input voltage (LM25063 only).
 in1_min_alarm		Input voltage low alarm.
 in1_max_alarm		Input voltage high alarm.
-in1_lcrit_alarm		Input voltage critical low alarm (LM25063 only).
-in1_crit_alarm		Input voltage critical high alarm. (LM25063 only).
 
 in2_label		"vmon"
 in2_input		Measured voltage on VAUX pin
@@ -94,16 +85,12 @@
 in3_average		Average measured output voltage.
 in3_min			Minimum output voltage.
 in3_min_alarm		Output voltage low alarm.
-in3_highest		Historical minimum output voltage (LM25063 only).
-in3_lowest		Historical maximum output voltage (LM25063 only).
 
 curr1_label		"iin"
 curr1_input		Measured input current.
 curr1_average		Average measured input current.
 curr1_max		Maximum input current.
-curr1_crit		Critical input current (LM25063 only).
 curr1_max_alarm		Input current high alarm.
-curr1_crit_alarm	Input current critical high alarm (LM25063 only).
 
 power1_label		"pin"
 power1_input		Measured input power.
@@ -113,11 +100,6 @@
 power1_input_highest	Historical maximum power.
 power1_reset_history	Write any value to reset maximum power history.
 
-power2_label		"pout". LM25063 only.
-power2_input		Measured output power.
-power2_max		Maximum output power limit.
-power2_crit		Critical output power limit.
-
 temp1_input		Measured temperature.
 temp1_max		Maximum temperature.
 temp1_crit		Critical high temperature.
diff --git a/Documentation/hwmon/max31785 b/Documentation/hwmon/max31785
index 45fb609..270c5f8 100644
--- a/Documentation/hwmon/max31785
+++ b/Documentation/hwmon/max31785
@@ -17,8 +17,9 @@
 features are provided, including PWM frequency control, temperature hysteresis,
 dual tachometer measurements, and fan health monitoring.
 
-For dual rotor fan configuration, the MAX31785 exposes the slowest rotor of the
-two in the fan[1-4]_input attributes.
+For dual-rotor configurations the MAX31785A exposes the second rotor tachometer
+readings in attributes fan[5-8]_input. By contrast the MAX31785 only exposes
+the slowest rotor measurement, and does so in the fan[1-4]_input attributes.
 
 Usage Notes
 -----------
@@ -31,7 +32,9 @@
 
 fan[1-4]_alarm		Fan alarm.
 fan[1-4]_fault		Fan fault.
-fan[1-4]_input		Fan RPM.
+fan[1-8]_input		Fan RPM. On the MAX31785A, inputs 5-8 correspond to the
+			second rotor of fans 1-4
+fan[1-4]_target		Fan input target
 
 in[1-6]_crit		Critical maximum output voltage
 in[1-6]_crit_alarm	Output voltage critical high alarm
@@ -44,6 +47,12 @@
 in[1-6]_min		Minimum output voltage
 in[1-6]_min_alarm	Output voltage low alarm
 
+pwm[1-4]		Fan target duty cycle (0..255)
+pwm[1-4]_enable		0: Full-speed
+			1: Manual PWM control
+			2: Automatic PWM (tach-feedback RPM fan-control)
+			3: Automatic closed-loop (temp-feedback fan-control)
+
 temp[1-11]_crit		Critical high temperature
 temp[1-11]_crit_alarm	Chip temperature critical high alarm
 temp[1-11]_input	Measured temperature
diff --git a/Documentation/hwmon/w83773g b/Documentation/hwmon/w83773g
new file mode 100644
index 0000000..4cc6c0b
--- /dev/null
+++ b/Documentation/hwmon/w83773g
@@ -0,0 +1,33 @@
+Kernel driver w83773g
+====================
+
+Supported chips:
+  * Nuvoton W83773G
+    Prefix: 'w83773g'
+    Addresses scanned: I2C 0x4c and 0x4d
+    Datasheet: https://www.nuvoton.com/resource-files/W83773G_SG_DatasheetV1_2.pdf
+
+Authors:
+	Lei YU <mine260309@gmail.com>
+
+Description
+-----------
+
+This driver implements support for Nuvoton W83773G temperature sensor
+chip. This chip implements one local and two remote sensors.
+The chip also features offsets for the two remote sensors which get added to
+the input readings. The chip does all the scaling by itself and the driver
+therefore reports true temperatures that don't need any user-space adjustments.
+Temperature is measured in degrees Celsius.
+The chip is wired over I2C/SMBus and specified over a temperature
+range of -40 to +125 degrees Celsius (for local sensor) and -40 to +127
+degrees Celsius (for remote sensors).
+Resolution for both the local and remote channels is 0.125 degree C.
+
+The chip supports only temperature measurement. The driver exports
+the temperature values via the following sysfs files:
+
+temp[1-3]_input
+temp[2-3]_fault
+temp[2-3]_offset
+update_interval
diff --git a/Documentation/kbuild/kconfig-language.txt b/Documentation/kbuild/kconfig-language.txt
index 262722d..c4a293a 100644
--- a/Documentation/kbuild/kconfig-language.txt
+++ b/Documentation/kbuild/kconfig-language.txt
@@ -200,10 +200,14 @@
 <expr> ::= <symbol>                             (1)
            <symbol> '=' <symbol>                (2)
            <symbol> '!=' <symbol>               (3)
-           '(' <expr> ')'                       (4)
-           '!' <expr>                           (5)
-           <expr> '&&' <expr>                   (6)
-           <expr> '||' <expr>                   (7)
+           <symbol1> '<' <symbol2>              (4)
+           <symbol1> '>' <symbol2>              (4)
+           <symbol1> '<=' <symbol2>             (4)
+           <symbol1> '>=' <symbol2>             (4)
+           '(' <expr> ')'                       (5)
+           '!' <expr>                           (6)
+           <expr> '&&' <expr>                   (7)
+           <expr> '||' <expr>                   (8)
 
 Expressions are listed in decreasing order of precedence. 
 
@@ -214,10 +218,13 @@
     otherwise 'n'.
 (3) If the values of both symbols are equal, it returns 'n',
     otherwise 'y'.
-(4) Returns the value of the expression. Used to override precedence.
-(5) Returns the result of (2-/expr/).
-(6) Returns the result of min(/expr/, /expr/).
-(7) Returns the result of max(/expr/, /expr/).
+(4) If value of <symbol1> is respectively lower, greater, lower-or-equal,
+    or greater-or-equal than value of <symbol2>, it returns 'y',
+    otherwise 'n'.
+(5) Returns the value of the expression. Used to override precedence.
+(6) Returns the result of (2-/expr/).
+(7) Returns the result of min(/expr/, /expr/).
+(8) Returns the result of max(/expr/, /expr/).
 
 An expression can have a value of 'n', 'm' or 'y' (or 0, 1, 2
 respectively for calculations). A menu entry becomes visible when its
diff --git a/Documentation/locking/locktorture.txt b/Documentation/locking/locktorture.txt
index a2ef3a9..6a8df4c 100644
--- a/Documentation/locking/locktorture.txt
+++ b/Documentation/locking/locktorture.txt
@@ -57,11 +57,6 @@
 
 		     o "rwsem_lock": read/write down() and up() semaphore pairs.
 
-torture_runnable  Start locktorture at boot time in the case where the
-		  module is built into the kernel, otherwise wait for
-		  torture_runnable to be set via sysfs before starting.
-		  By default it will begin once the module is loaded.
-
 
 	    ** Torture-framework (RCU + locking) **
 
diff --git a/Documentation/memory-barriers.txt b/Documentation/memory-barriers.txt
index 479ecec..a863009 100644
--- a/Documentation/memory-barriers.txt
+++ b/Documentation/memory-barriers.txt
@@ -227,17 +227,20 @@
  (*) On any given CPU, dependent memory accesses will be issued in order, with
      respect to itself.  This means that for:
 
-	Q = READ_ONCE(P); smp_read_barrier_depends(); D = READ_ONCE(*Q);
+	Q = READ_ONCE(P); D = READ_ONCE(*Q);
 
      the CPU will issue the following memory operations:
 
 	Q = LOAD P, D = LOAD *Q
 
-     and always in that order.  On most systems, smp_read_barrier_depends()
-     does nothing, but it is required for DEC Alpha.  The READ_ONCE()
-     is required to prevent compiler mischief.  Please note that you
-     should normally use something like rcu_dereference() instead of
-     open-coding smp_read_barrier_depends().
+     and always in that order.  However, on DEC Alpha, READ_ONCE() also
+     emits a memory-barrier instruction, so that a DEC Alpha CPU will
+     instead issue the following memory operations:
+
+	Q = LOAD P, MEMORY_BARRIER, D = LOAD *Q, MEMORY_BARRIER
+
+     Whether on DEC Alpha or not, the READ_ONCE() also prevents compiler
+     mischief.
 
  (*) Overlapping loads and stores within a particular CPU will appear to be
      ordered within that CPU.  This means that for:
@@ -1815,7 +1818,7 @@
 	GENERAL		mb()			smp_mb()
 	WRITE		wmb()			smp_wmb()
 	READ		rmb()			smp_rmb()
-	DATA DEPENDENCY	read_barrier_depends()	smp_read_barrier_depends()
+	DATA DEPENDENCY				READ_ONCE()
 
 
 All memory barriers except the data dependency barriers imply a compiler
@@ -2864,7 +2867,10 @@
 
 Other CPUs may also have split caches, but must coordinate between the various
 cachelets for normal memory accesses.  The semantics of the Alpha removes the
-need for coordination in the absence of memory barriers.
+need for hardware coordination in the absence of memory barriers, which
+permitted Alpha to sport higher CPU clock rates back in the day.  However,
+please note that smp_read_barrier_depends() should not be used except in
+Alpha arch-specific code and within the READ_ONCE() macro.
 
 
 CACHE COHERENCY VS DMA
diff --git a/Documentation/mtd/spi-nor.txt b/Documentation/mtd/spi-nor.txt
index 548d630..da1fbff 100644
--- a/Documentation/mtd/spi-nor.txt
+++ b/Documentation/mtd/spi-nor.txt
@@ -60,3 +60,6 @@
 initialize the necessary fields for spi_nor{}. Please see
 drivers/mtd/spi-nor/spi-nor.c for detail. Please also refer to fsl-quadspi.c
 when you want to write a new driver for a SPI NOR controller.
+Another API is spi_nor_restore(), this is used to restore the status of SPI
+flash chip such as addressing mode. Call it whenever detach the driver from
+device or reboot the system.
diff --git a/Documentation/networking/index.rst b/Documentation/networking/index.rst
index 66e62086..7d4b159 100644
--- a/Documentation/networking/index.rst
+++ b/Documentation/networking/index.rst
@@ -9,6 +9,7 @@
    batman-adv
    kapi
    z8530book
+   msg_zerocopy
 
 .. only::  subproject
 
@@ -16,4 +17,3 @@
    =======
 
    * :ref:`genindex`
-
diff --git a/Documentation/networking/msg_zerocopy.rst b/Documentation/networking/msg_zerocopy.rst
index 77f6d7e..291a012 100644
--- a/Documentation/networking/msg_zerocopy.rst
+++ b/Documentation/networking/msg_zerocopy.rst
@@ -72,6 +72,10 @@
 	if (setsockopt(fd, SOL_SOCKET, SO_ZEROCOPY, &one, sizeof(one)))
 		error(1, errno, "setsockopt zerocopy");
 
+Setting the socket option only works when the socket is in its initial
+(TCP_CLOSED) state.  Trying to set the option for a socket returned by accept(),
+for example, will lead to an EBUSY error. In this case, the option should be set
+to the listening socket and it will be inherited by the accepted sockets.
 
 Transmission
 ------------
diff --git a/Documentation/perf/arm_dsu_pmu.txt b/Documentation/perf/arm_dsu_pmu.txt
new file mode 100644
index 0000000..d611e15
--- /dev/null
+++ b/Documentation/perf/arm_dsu_pmu.txt
@@ -0,0 +1,28 @@
+ARM DynamIQ Shared Unit (DSU) PMU
+==================================
+
+ARM DynamIQ Shared Unit integrates one or more cores with an L3 memory system,
+control logic and external interfaces to form a multicore cluster. The PMU
+allows counting the various events related to the L3 cache, Snoop Control Unit
+etc, using 32bit independent counters. It also provides a 64bit cycle counter.
+
+The PMU can only be accessed via CPU system registers and are common to the
+cores connected to the same DSU. Like most of the other uncore PMUs, DSU
+PMU doesn't support process specific events and cannot be used in sampling mode.
+
+The DSU provides a bitmap for a subset of implemented events via hardware
+registers. There is no way for the driver to determine if the other events
+are available or not. Hence the driver exposes only those events advertised
+by the DSU, in "events" directory under :
+
+  /sys/bus/event_sources/devices/arm_dsu_<N>/
+
+The user should refer to the TRM of the product to figure out the supported events
+and use the raw event code for the unlisted events.
+
+The driver also exposes the CPUs connected to the DSU instance in "associated_cpus".
+
+
+e.g usage :
+
+	perf stat -a -e arm_dsu_0/cycles/
diff --git a/Documentation/power/pci.txt b/Documentation/power/pci.txt
index 704cd36..8eaf9ee 100644
--- a/Documentation/power/pci.txt
+++ b/Documentation/power/pci.txt
@@ -994,6 +994,17 @@
 the function will set the power.direct_complete flag for it (to make the PM core
 skip the subsequent "thaw" callbacks for it) and return.
 
+Setting the DPM_FLAG_LEAVE_SUSPENDED flag means that the driver prefers the
+device to be left in suspend after system-wide transitions to the working state.
+This flag is checked by the PM core, but the PCI bus type informs the PM core
+which devices may be left in suspend from its perspective (that happens during
+the "noirq" phase of system-wide suspend and analogous transitions) and next it
+uses the dev_pm_may_skip_resume() helper to decide whether or not to return from
+pci_pm_resume_noirq() early, as the PM core will skip the remaining resume
+callbacks for the device during the transition under way and will set its
+runtime PM status to "suspended" if dev_pm_may_skip_resume() returns "true" for
+it.
+
 3.2. Device Runtime Power Management
 ------------------------------------
 In addition to providing device power management callbacks PCI device drivers
diff --git a/Documentation/power/regulator/machine.txt b/Documentation/power/regulator/machine.txt
index 757e3b5..eff4dca 100644
--- a/Documentation/power/regulator/machine.txt
+++ b/Documentation/power/regulator/machine.txt
@@ -23,16 +23,12 @@
 e.g. for the machine above
 
 static struct regulator_consumer_supply regulator1_consumers[] = {
-{
-	.dev_name	= "dev_name(consumer B)",
-	.supply		= "Vcc",
-},};
+	REGULATOR_SUPPLY("Vcc", "consumer B"),
+};
 
 static struct regulator_consumer_supply regulator2_consumers[] = {
-{
-	.dev	= "dev_name(consumer A"),
-	.supply	= "Vcc",
-},};
+	REGULATOR_SUPPLY("Vcc", "consumer A"),
+};
 
 This maps Regulator-1 to the 'Vcc' supply for Consumer B and maps Regulator-2
 to the 'Vcc' supply for Consumer A.
@@ -78,20 +74,20 @@
 Finally the regulator devices must be registered in the usual manner.
 
 static struct platform_device regulator_devices[] = {
-{
-	.name = "regulator",
-	.id = DCDC_1,
-	.dev = {
-		.platform_data = &regulator1_data,
+	{
+		.name = "regulator",
+		.id = DCDC_1,
+		.dev = {
+			.platform_data = &regulator1_data,
+		},
 	},
-},
-{
-	.name = "regulator",
-	.id = DCDC_2,
-	.dev = {
-		.platform_data = &regulator2_data,
+	{
+		.name = "regulator",
+		.id = DCDC_2,
+		.dev = {
+			.platform_data = &regulator2_data,
+		},
 	},
-},
 };
 /* register regulator 1 device */
 platform_device_register(&regulator_devices[0]);
diff --git a/Documentation/process/kernel-enforcement-statement.rst b/Documentation/process/kernel-enforcement-statement.rst
index b317067..bfa6a78 100644
--- a/Documentation/process/kernel-enforcement-statement.rst
+++ b/Documentation/process/kernel-enforcement-statement.rst
@@ -118,6 +118,7 @@
   - Mike Marshall
   - Chris Mason
   - Paul E. McKenney
+  - Arnaldo Carvalho de Melo
   - David S. Miller
   - Ingo Molnar
   - Kuninori Morimoto
diff --git a/Documentation/thermal/cpu-cooling-api.txt b/Documentation/thermal/cpu-cooling-api.txt
index 7165358..7df567e 100644
--- a/Documentation/thermal/cpu-cooling-api.txt
+++ b/Documentation/thermal/cpu-cooling-api.txt
@@ -26,39 +26,16 @@
    clip_cpus: cpumask of cpus where the frequency constraints will happen.
 
 1.1.2 struct thermal_cooling_device *of_cpufreq_cooling_register(
-	struct device_node *np, const struct cpumask *clip_cpus)
+					struct cpufreq_policy *policy)
 
     This interface function registers the cpufreq cooling device with
     the name "thermal-cpufreq-%x" linking it with a device tree node, in
     order to bind it via the thermal DT code. This api can support multiple
     instances of cpufreq cooling devices.
 
-    np: pointer to the cooling device device tree node
-    clip_cpus: cpumask of cpus where the frequency constraints will happen.
+    policy: CPUFreq policy.
 
-1.1.3 struct thermal_cooling_device *cpufreq_power_cooling_register(
-    const struct cpumask *clip_cpus, u32 capacitance,
-    get_static_t plat_static_func)
-
-Similar to cpufreq_cooling_register, this function registers a cpufreq
-cooling device.  Using this function, the cooling device will
-implement the power extensions by using a simple cpu power model.  The
-cpus must have registered their OPPs using the OPP library.
-
-The additional parameters are needed for the power model (See 2. Power
-models).  "capacitance" is the dynamic power coefficient (See 2.1
-Dynamic power).  "plat_static_func" is a function to calculate the
-static power consumed by these cpus (See 2.2 Static power).
-
-1.1.4 struct thermal_cooling_device *of_cpufreq_power_cooling_register(
-    struct device_node *np, const struct cpumask *clip_cpus, u32 capacitance,
-    get_static_t plat_static_func)
-
-Similar to cpufreq_power_cooling_register, this function register a
-cpufreq cooling device with power extensions using the device tree
-information supplied by the np parameter.
-
-1.1.5 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
+1.1.3 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 
     This interface function unregisters the "thermal-cpufreq-%x" cooling device.
 
@@ -67,20 +44,14 @@
 2. Power models
 
 The power API registration functions provide a simple power model for
-CPUs.  The current power is calculated as dynamic + (optionally)
-static power.  This power model requires that the operating-points of
+CPUs.  The current power is calculated as dynamic power (static power isn't
+supported currently).  This power model requires that the operating-points of
 the CPUs are registered using the kernel's opp library and the
 `cpufreq_frequency_table` is assigned to the `struct device` of the
 cpu.  If you are using CONFIG_CPUFREQ_DT then the
 `cpufreq_frequency_table` should already be assigned to the cpu
 device.
 
-The `plat_static_func` parameter of `cpufreq_power_cooling_register()`
-and `of_cpufreq_power_cooling_register()` is optional.  If you don't
-provide it, only dynamic power will be considered.
-
-2.1 Dynamic power
-
 The dynamic power consumption of a processor depends on many factors.
 For a given processor implementation the primary factors are:
 
@@ -119,79 +90,3 @@
 from 100 to 500.  For reference, the approximate values for the SoC in
 ARM's Juno Development Platform are 530 for the Cortex-A57 cluster and
 140 for the Cortex-A53 cluster.
-
-
-2.2 Static power
-
-Static leakage power consumption depends on a number of factors.  For a
-given circuit implementation the primary factors are:
-
-- Time the circuit spends in each 'power state'
-- Temperature
-- Operating voltage
-- Process grade
-
-The time the circuit spends in each 'power state' for a given
-evaluation period at first order means OFF or ON.  However,
-'retention' states can also be supported that reduce power during
-inactive periods without loss of context.
-
-Note: The visibility of state entries to the OS can vary, according to
-platform specifics, and this can then impact the accuracy of a model
-based on OS state information alone.  It might be possible in some
-cases to extract more accurate information from system resources.
-
-The temperature, operating voltage and process 'grade' (slow to fast)
-of the circuit are all significant factors in static leakage power
-consumption.  All of these have complex relationships to static power.
-
-Circuit implementation specific factors include the chosen silicon
-process as well as the type, number and size of transistors in both
-the logic gates and any RAM elements included.
-
-The static power consumption modelling must take into account the
-power managed regions that are implemented.  Taking the example of an
-ARM processor cluster, the modelling would take into account whether
-each CPU can be powered OFF separately or if only a single power
-region is implemented for the complete cluster.
-
-In one view, there are others, a static power consumption model can
-then start from a set of reference values for each power managed
-region (e.g. CPU, Cluster/L2) in each state (e.g. ON, OFF) at an
-arbitrary process grade, voltage and temperature point.  These values
-are then scaled for all of the following: the time in each state, the
-process grade, the current temperature and the operating voltage.
-However, since both implementation specific and complex relationships
-dominate the estimate, the appropriate interface to the model from the
-cpu cooling device is to provide a function callback that calculates
-the static power in this platform.  When registering the cpu cooling
-device pass a function pointer that follows the `get_static_t`
-prototype:
-
-    int plat_get_static(cpumask_t *cpumask, int interval,
-                        unsigned long voltage, u32 &power);
-
-`cpumask` is the cpumask of the cpus involved in the calculation.
-`voltage` is the voltage at which they are operating.  The function
-should calculate the average static power for the last `interval`
-milliseconds.  It returns 0 on success, -E* on error.  If it
-succeeds, it should store the static power in `power`.  Reading the
-temperature of the cpus described by `cpumask` is left for
-plat_get_static() to do as the platform knows best which thermal
-sensor is closest to the cpu.
-
-If `plat_static_func` is NULL, static power is considered to be
-negligible for this platform and only dynamic power is considered.
-
-The platform specific callback can then use any combination of tables
-and/or equations to permute the estimated value.  Process grade
-information is not passed to the model since access to such data, from
-on-chip measurement capability or manufacture time data, is platform
-specific.
-
-Note: the significance of static power for CPUs in comparison to
-dynamic power is highly dependent on implementation.  Given the
-potential complexity in implementation, the importance and accuracy of
-its inclusion when using cpu cooling devices should be assessed on a
-case by case basis.
-
diff --git a/Documentation/usb/gadget-testing.txt b/Documentation/usb/gadget-testing.txt
index 441a4b9b..5908a21 100644
--- a/Documentation/usb/gadget-testing.txt
+++ b/Documentation/usb/gadget-testing.txt
@@ -693,7 +693,7 @@
 in each line. The rules stated above are best illustrated with an example:
 
 # mkdir functions/uvc.usb0/control/header/h
-# cd functions/uvc.usb0/control/header/h
+# cd functions/uvc.usb0/control/
 # ln -s header/h class/fs
 # ln -s header/h class/ss
 # mkdir -p functions/uvc.usb0/streaming/uncompressed/u/360p
diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index 57d3ee9..fc3ae95 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -3403,6 +3403,52 @@
 or if no page table is present for the addresses (e.g. when using
 hugepages).
 
+4.108 KVM_PPC_GET_CPU_CHAR
+
+Capability: KVM_CAP_PPC_GET_CPU_CHAR
+Architectures: powerpc
+Type: vm ioctl
+Parameters: struct kvm_ppc_cpu_char (out)
+Returns: 0 on successful completion
+	 -EFAULT if struct kvm_ppc_cpu_char cannot be written
+
+This ioctl gives userspace information about certain characteristics
+of the CPU relating to speculative execution of instructions and
+possible information leakage resulting from speculative execution (see
+CVE-2017-5715, CVE-2017-5753 and CVE-2017-5754).  The information is
+returned in struct kvm_ppc_cpu_char, which looks like this:
+
+struct kvm_ppc_cpu_char {
+	__u64	character;		/* characteristics of the CPU */
+	__u64	behaviour;		/* recommended software behaviour */
+	__u64	character_mask;		/* valid bits in character */
+	__u64	behaviour_mask;		/* valid bits in behaviour */
+};
+
+For extensibility, the character_mask and behaviour_mask fields
+indicate which bits of character and behaviour have been filled in by
+the kernel.  If the set of defined bits is extended in future then
+userspace will be able to tell whether it is running on a kernel that
+knows about the new bits.
+
+The character field describes attributes of the CPU which can help
+with preventing inadvertent information disclosure - specifically,
+whether there is an instruction to flash-invalidate the L1 data cache
+(ori 30,30,0 or mtspr SPRN_TRIG2,rN), whether the L1 data cache is set
+to a mode where entries can only be used by the thread that created
+them, whether the bcctr[l] instruction prevents speculation, and
+whether a speculation barrier instruction (ori 31,31,0) is provided.
+
+The behaviour field describes actions that software should take to
+prevent inadvertent information disclosure, and thus describes which
+vulnerabilities the hardware is subject to; specifically whether the
+L1 data cache should be flushed when returning to user mode from the
+kernel, and whether a speculation barrier should be placed between an
+array bounds check and the array access.
+
+These fields use the same bit definitions as the new
+H_GET_CPU_CHARACTERISTICS hypercall.
+
 5. The kvm_run structure
 ------------------------
 
diff --git a/Documentation/x86/intel_rdt_ui.txt b/Documentation/x86/intel_rdt_ui.txt
index 6851854..756fd76 100644
--- a/Documentation/x86/intel_rdt_ui.txt
+++ b/Documentation/x86/intel_rdt_ui.txt
@@ -7,15 +7,24 @@
 Vikas Shivappa <vikas.shivappa@intel.com>
 
 This feature is enabled by the CONFIG_INTEL_RDT Kconfig and the
-X86 /proc/cpuinfo flag bits "rdt", "cqm", "cat_l3" and "cdp_l3".
+X86 /proc/cpuinfo flag bits:
+RDT (Resource Director Technology) Allocation - "rdt_a"
+CAT (Cache Allocation Technology) - "cat_l3", "cat_l2"
+CDP (Code and Data Prioritization ) - "cdp_l3", "cdp_l2"
+CQM (Cache QoS Monitoring) - "cqm_llc", "cqm_occup_llc"
+MBM (Memory Bandwidth Monitoring) - "cqm_mbm_total", "cqm_mbm_local"
+MBA (Memory Bandwidth Allocation) - "mba"
 
 To use the feature mount the file system:
 
- # mount -t resctrl resctrl [-o cdp] /sys/fs/resctrl
+ # mount -t resctrl resctrl [-o cdp[,cdpl2]] /sys/fs/resctrl
 
 mount options are:
 
 "cdp": Enable code/data prioritization in L3 cache allocations.
+"cdpl2": Enable code/data prioritization in L2 cache allocations.
+
+L2 and L3 CDP are controlled seperately.
 
 RDT features are orthogonal. A particular system may support only
 monitoring, only control, or both monitoring and control.
diff --git a/Documentation/x86/pti.txt b/Documentation/x86/pti.txt
new file mode 100644
index 0000000..5cd58439
--- /dev/null
+++ b/Documentation/x86/pti.txt
@@ -0,0 +1,186 @@
+Overview
+========
+
+Page Table Isolation (pti, previously known as KAISER[1]) is a
+countermeasure against attacks on the shared user/kernel address
+space such as the "Meltdown" approach[2].
+
+To mitigate this class of attacks, we create an independent set of
+page tables for use only when running userspace applications.  When
+the kernel is entered via syscalls, interrupts or exceptions, the
+page tables are switched to the full "kernel" copy.  When the system
+switches back to user mode, the user copy is used again.
+
+The userspace page tables contain only a minimal amount of kernel
+data: only what is needed to enter/exit the kernel such as the
+entry/exit functions themselves and the interrupt descriptor table
+(IDT).  There are a few strictly unnecessary things that get mapped
+such as the first C function when entering an interrupt (see
+comments in pti.c).
+
+This approach helps to ensure that side-channel attacks leveraging
+the paging structures do not function when PTI is enabled.  It can be
+enabled by setting CONFIG_PAGE_TABLE_ISOLATION=y at compile time.
+Once enabled at compile-time, it can be disabled at boot with the
+'nopti' or 'pti=' kernel parameters (see kernel-parameters.txt).
+
+Page Table Management
+=====================
+
+When PTI is enabled, the kernel manages two sets of page tables.
+The first set is very similar to the single set which is present in
+kernels without PTI.  This includes a complete mapping of userspace
+that the kernel can use for things like copy_to_user().
+
+Although _complete_, the user portion of the kernel page tables is
+crippled by setting the NX bit in the top level.  This ensures
+that any missed kernel->user CR3 switch will immediately crash
+userspace upon executing its first instruction.
+
+The userspace page tables map only the kernel data needed to enter
+and exit the kernel.  This data is entirely contained in the 'struct
+cpu_entry_area' structure which is placed in the fixmap which gives
+each CPU's copy of the area a compile-time-fixed virtual address.
+
+For new userspace mappings, the kernel makes the entries in its
+page tables like normal.  The only difference is when the kernel
+makes entries in the top (PGD) level.  In addition to setting the
+entry in the main kernel PGD, a copy of the entry is made in the
+userspace page tables' PGD.
+
+This sharing at the PGD level also inherently shares all the lower
+layers of the page tables.  This leaves a single, shared set of
+userspace page tables to manage.  One PTE to lock, one set of
+accessed bits, dirty bits, etc...
+
+Overhead
+========
+
+Protection against side-channel attacks is important.  But,
+this protection comes at a cost:
+
+1. Increased Memory Use
+  a. Each process now needs an order-1 PGD instead of order-0.
+     (Consumes an additional 4k per process).
+  b. The 'cpu_entry_area' structure must be 2MB in size and 2MB
+     aligned so that it can be mapped by setting a single PMD
+     entry.  This consumes nearly 2MB of RAM once the kernel
+     is decompressed, but no space in the kernel image itself.
+
+2. Runtime Cost
+  a. CR3 manipulation to switch between the page table copies
+     must be done at interrupt, syscall, and exception entry
+     and exit (it can be skipped when the kernel is interrupted,
+     though.)  Moves to CR3 are on the order of a hundred
+     cycles, and are required at every entry and exit.
+  b. A "trampoline" must be used for SYSCALL entry.  This
+     trampoline depends on a smaller set of resources than the
+     non-PTI SYSCALL entry code, so requires mapping fewer
+     things into the userspace page tables.  The downside is
+     that stacks must be switched at entry time.
+  c. Global pages are disabled for all kernel structures not
+     mapped into both kernel and userspace page tables.  This
+     feature of the MMU allows different processes to share TLB
+     entries mapping the kernel.  Losing the feature means more
+     TLB misses after a context switch.  The actual loss of
+     performance is very small, however, never exceeding 1%.
+  d. Process Context IDentifiers (PCID) is a CPU feature that
+     allows us to skip flushing the entire TLB when switching page
+     tables by setting a special bit in CR3 when the page tables
+     are changed.  This makes switching the page tables (at context
+     switch, or kernel entry/exit) cheaper.  But, on systems with
+     PCID support, the context switch code must flush both the user
+     and kernel entries out of the TLB.  The user PCID TLB flush is
+     deferred until the exit to userspace, minimizing the cost.
+     See intel.com/sdm for the gory PCID/INVPCID details.
+  e. The userspace page tables must be populated for each new
+     process.  Even without PTI, the shared kernel mappings
+     are created by copying top-level (PGD) entries into each
+     new process.  But, with PTI, there are now *two* kernel
+     mappings: one in the kernel page tables that maps everything
+     and one for the entry/exit structures.  At fork(), we need to
+     copy both.
+  f. In addition to the fork()-time copying, there must also
+     be an update to the userspace PGD any time a set_pgd() is done
+     on a PGD used to map userspace.  This ensures that the kernel
+     and userspace copies always map the same userspace
+     memory.
+  g. On systems without PCID support, each CR3 write flushes
+     the entire TLB.  That means that each syscall, interrupt
+     or exception flushes the TLB.
+  h. INVPCID is a TLB-flushing instruction which allows flushing
+     of TLB entries for non-current PCIDs.  Some systems support
+     PCIDs, but do not support INVPCID.  On these systems, addresses
+     can only be flushed from the TLB for the current PCID.  When
+     flushing a kernel address, we need to flush all PCIDs, so a
+     single kernel address flush will require a TLB-flushing CR3
+     write upon the next use of every PCID.
+
+Possible Future Work
+====================
+1. We can be more careful about not actually writing to CR3
+   unless its value is actually changed.
+2. Allow PTI to be enabled/disabled at runtime in addition to the
+   boot-time switching.
+
+Testing
+========
+
+To test stability of PTI, the following test procedure is recommended,
+ideally doing all of these in parallel:
+
+1. Set CONFIG_DEBUG_ENTRY=y
+2. Run several copies of all of the tools/testing/selftests/x86/ tests
+   (excluding MPX and protection_keys) in a loop on multiple CPUs for
+   several minutes.  These tests frequently uncover corner cases in the
+   kernel entry code.  In general, old kernels might cause these tests
+   themselves to crash, but they should never crash the kernel.
+3. Run the 'perf' tool in a mode (top or record) that generates many
+   frequent performance monitoring non-maskable interrupts (see "NMI"
+   in /proc/interrupts).  This exercises the NMI entry/exit code which
+   is known to trigger bugs in code paths that did not expect to be
+   interrupted, including nested NMIs.  Using "-c" boosts the rate of
+   NMIs, and using two -c with separate counters encourages nested NMIs
+   and less deterministic behavior.
+
+	while true; do perf record -c 10000 -e instructions,cycles -a sleep 10; done
+
+4. Launch a KVM virtual machine.
+5. Run 32-bit binaries on systems supporting the SYSCALL instruction.
+   This has been a lightly-tested code path and needs extra scrutiny.
+
+Debugging
+=========
+
+Bugs in PTI cause a few different signatures of crashes
+that are worth noting here.
+
+ * Failures of the selftests/x86 code.  Usually a bug in one of the
+   more obscure corners of entry_64.S
+ * Crashes in early boot, especially around CPU bringup.  Bugs
+   in the trampoline code or mappings cause these.
+ * Crashes at the first interrupt.  Caused by bugs in entry_64.S,
+   like screwing up a page table switch.  Also caused by
+   incorrectly mapping the IRQ handler entry code.
+ * Crashes at the first NMI.  The NMI code is separate from main
+   interrupt handlers and can have bugs that do not affect
+   normal interrupts.  Also caused by incorrectly mapping NMI
+   code.  NMIs that interrupt the entry code must be very
+   careful and can be the cause of crashes that show up when
+   running perf.
+ * Kernel crashes at the first exit to userspace.  entry_64.S
+   bugs, or failing to map some of the exit code.
+ * Crashes at first interrupt that interrupts userspace. The paths
+   in entry_64.S that return to userspace are sometimes separate
+   from the ones that return to the kernel.
+ * Double faults: overflowing the kernel stack because of page
+   faults upon page faults.  Caused by touching non-pti-mapped
+   data in the entry code, or forgetting to switch to kernel
+   CR3 before calling into C functions which are not pti-mapped.
+ * Userspace segfaults early in boot, sometimes manifesting
+   as mount(8) failing to mount the rootfs.  These have
+   tended to be TLB invalidation issues.  Usually invalidating
+   the wrong PCID, or otherwise missing an invalidation.
+
+1. https://gruss.cc/files/kaiser.pdf
+2. https://meltdownattack.com/meltdown.pdf
diff --git a/Documentation/x86/x86_64/mm.txt b/Documentation/x86/x86_64/mm.txt
index ad41b38..ea91cb6 100644
--- a/Documentation/x86/x86_64/mm.txt
+++ b/Documentation/x86/x86_64/mm.txt
@@ -12,8 +12,9 @@
 ... unused hole ...
 ffffec0000000000 - fffffbffffffffff (=44 bits) kasan shadow memory (16TB)
 ... unused hole ...
-fffffe0000000000 - fffffe7fffffffff (=39 bits) LDT remap for PTI
-fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
+				    vaddr_end for KASLR
+fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
+fffffe8000000000 - fffffeffffffffff (=39 bits) LDT remap for PTI
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
@@ -37,13 +38,15 @@
 ... unused hole ...
 ffdf000000000000 - fffffc0000000000 (=53 bits) kasan shadow memory (8PB)
 ... unused hole ...
-fffffe8000000000 - fffffeffffffffff (=39 bits) cpu_entry_area mapping
+				    vaddr_end for KASLR
+fffffe0000000000 - fffffe7fffffffff (=39 bits) cpu_entry_area mapping
+... unused hole ...
 ffffff0000000000 - ffffff7fffffffff (=39 bits) %esp fixup stacks
 ... unused hole ...
 ffffffef00000000 - fffffffeffffffff (=64 GB) EFI region mapping space
 ... unused hole ...
 ffffffff80000000 - ffffffff9fffffff (=512 MB)  kernel text mapping, from phys 0
-ffffffffa0000000 - [fixmap start]   (~1526 MB) module mapping space
+ffffffffa0000000 - fffffffffeffffff (1520 MB) module mapping space
 [fixmap start]   - ffffffffff5fffff kernel-internal fixmap range
 ffffffffff600000 - ffffffffff600fff (=4 kB) legacy vsyscall ABI
 ffffffffffe00000 - ffffffffffffffff (=2 MB) unused hole
@@ -67,9 +70,10 @@
 The mappings are not part of any other kernel PGD and are only available
 during EFI runtime calls.
 
-The module mapping space size changes based on the CONFIG requirements for the
-following fixmap section.
-
 Note that if CONFIG_RANDOMIZE_MEMORY is enabled, the direct mapping of all
 physical memory, vmalloc/ioremap space and virtual memory map are randomized.
 Their order is preserved but their base will be offset early at boot time.
+
+Be very careful vs. KASLR when changing anything here. The KASLR address
+range must not overlap with anything except the KASAN shadow area, which is
+correct as KASAN disables KASLR.
diff --git a/Documentation/xtensa/mmu.txt b/Documentation/xtensa/mmu.txt
index 5de8715..318114d 100644
--- a/Documentation/xtensa/mmu.txt
+++ b/Documentation/xtensa/mmu.txt
@@ -69,8 +69,19 @@
 | Userspace        |                           0x00000000  TASK_SIZE
 +------------------+                           0x40000000
 +------------------+
-| Page table       |                           0x80000000
-+------------------+                           0x80400000
+| Page table       |  XCHAL_PAGE_TABLE_VADDR   0x80000000  XCHAL_PAGE_TABLE_SIZE
++------------------+
+| KASAN shadow map |  KASAN_SHADOW_START       0x80400000  KASAN_SHADOW_SIZE
++------------------+                           0x8e400000
++------------------+
+| VMALLOC area     |  VMALLOC_START            0xc0000000  128MB - 64KB
++------------------+  VMALLOC_END
+| Cache aliasing   |  TLBTEMP_BASE_1           0xc7ff0000  DCACHE_WAY_SIZE
+| remap area 1     |
++------------------+
+| Cache aliasing   |  TLBTEMP_BASE_2                       DCACHE_WAY_SIZE
+| remap area 2     |
++------------------+
 +------------------+
 | KMAP area        |  PKMAP_BASE                           PTRS_PER_PTE *
 |                  |                                       DCACHE_N_COLORS *
@@ -81,16 +92,7 @@
 |                  |                                       NR_CPUS *
 |                  |                                       DCACHE_N_COLORS *
 |                  |                                       PAGE_SIZE
-+------------------+  FIXADDR_TOP              0xbffff000
-+------------------+
-| VMALLOC area     |  VMALLOC_START            0xc0000000  128MB - 64KB
-+------------------+  VMALLOC_END
-| Cache aliasing   |  TLBTEMP_BASE_1           0xc7ff0000  DCACHE_WAY_SIZE
-| remap area 1     |
-+------------------+
-| Cache aliasing   |  TLBTEMP_BASE_2                       DCACHE_WAY_SIZE
-| remap area 2     |
-+------------------+
++------------------+  FIXADDR_TOP              0xcffff000
 +------------------+
 | Cached KSEG      |  XCHAL_KSEG_CACHED_VADDR  0xd0000000  128MB
 +------------------+
@@ -109,8 +111,19 @@
 | Userspace        |                           0x00000000  TASK_SIZE
 +------------------+                           0x40000000
 +------------------+
-| Page table       |                           0x80000000
-+------------------+                           0x80400000
+| Page table       |  XCHAL_PAGE_TABLE_VADDR   0x80000000  XCHAL_PAGE_TABLE_SIZE
++------------------+
+| KASAN shadow map |  KASAN_SHADOW_START       0x80400000  KASAN_SHADOW_SIZE
++------------------+                           0x8e400000
++------------------+
+| VMALLOC area     |  VMALLOC_START            0xa0000000  128MB - 64KB
++------------------+  VMALLOC_END
+| Cache aliasing   |  TLBTEMP_BASE_1           0xa7ff0000  DCACHE_WAY_SIZE
+| remap area 1     |
++------------------+
+| Cache aliasing   |  TLBTEMP_BASE_2                       DCACHE_WAY_SIZE
+| remap area 2     |
++------------------+
 +------------------+
 | KMAP area        |  PKMAP_BASE                           PTRS_PER_PTE *
 |                  |                                       DCACHE_N_COLORS *
@@ -121,16 +134,7 @@
 |                  |                                       NR_CPUS *
 |                  |                                       DCACHE_N_COLORS *
 |                  |                                       PAGE_SIZE
-+------------------+  FIXADDR_TOP              0x9ffff000
-+------------------+
-| VMALLOC area     |  VMALLOC_START            0xa0000000  128MB - 64KB
-+------------------+  VMALLOC_END
-| Cache aliasing   |  TLBTEMP_BASE_1           0xa7ff0000  DCACHE_WAY_SIZE
-| remap area 1     |
-+------------------+
-| Cache aliasing   |  TLBTEMP_BASE_2                       DCACHE_WAY_SIZE
-| remap area 2     |
-+------------------+
++------------------+  FIXADDR_TOP              0xaffff000
 +------------------+
 | Cached KSEG      |  XCHAL_KSEG_CACHED_VADDR  0xb0000000  256MB
 +------------------+
@@ -150,8 +154,19 @@
 | Userspace        |                           0x00000000  TASK_SIZE
 +------------------+                           0x40000000
 +------------------+
-| Page table       |                           0x80000000
-+------------------+                           0x80400000
+| Page table       |  XCHAL_PAGE_TABLE_VADDR   0x80000000  XCHAL_PAGE_TABLE_SIZE
++------------------+
+| KASAN shadow map |  KASAN_SHADOW_START       0x80400000  KASAN_SHADOW_SIZE
++------------------+                           0x8e400000
++------------------+
+| VMALLOC area     |  VMALLOC_START            0x90000000  128MB - 64KB
++------------------+  VMALLOC_END
+| Cache aliasing   |  TLBTEMP_BASE_1           0x97ff0000  DCACHE_WAY_SIZE
+| remap area 1     |
++------------------+
+| Cache aliasing   |  TLBTEMP_BASE_2                       DCACHE_WAY_SIZE
+| remap area 2     |
++------------------+
 +------------------+
 | KMAP area        |  PKMAP_BASE                           PTRS_PER_PTE *
 |                  |                                       DCACHE_N_COLORS *
@@ -162,16 +177,7 @@
 |                  |                                       NR_CPUS *
 |                  |                                       DCACHE_N_COLORS *
 |                  |                                       PAGE_SIZE
-+------------------+  FIXADDR_TOP              0x8ffff000
-+------------------+
-| VMALLOC area     |  VMALLOC_START            0x90000000  128MB - 64KB
-+------------------+  VMALLOC_END
-| Cache aliasing   |  TLBTEMP_BASE_1           0x97ff0000  DCACHE_WAY_SIZE
-| remap area 1     |
-+------------------+
-| Cache aliasing   |  TLBTEMP_BASE_2                       DCACHE_WAY_SIZE
-| remap area 2     |
-+------------------+
++------------------+  FIXADDR_TOP              0x9ffff000
 +------------------+
 | Cached KSEG      |  XCHAL_KSEG_CACHED_VADDR  0xa0000000  512MB
 +------------------+
diff --git a/MAINTAINERS b/MAINTAINERS
index b46c9ce..98ee6fe 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -62,7 +62,15 @@
 
 7.	When sending security related changes or reports to a maintainer
 	please Cc: security@kernel.org, especially if the maintainer
-	does not respond.
+	does not respond. Please keep in mind that the security team is
+	a small set of people who can be efficient only when working on
+	verified bugs. Please only Cc: this list when you have identified
+	that the bug would present a short-term risk to other users if it
+	were publicly disclosed. For example, reports of address leaks do
+	not represent an immediate threat and are better handled publicly,
+	and ideally, should come with a patch proposal. Please do not send
+	automated reports to this list either. Such bugs will be handled
+	better and faster in the usual public places.
 
 8.	Happy hacking.
 
@@ -321,7 +329,7 @@
 
 ACPI COMPONENT ARCHITECTURE (ACPICA)
 M:	Robert Moore <robert.moore@intel.com>
-M:	Lv Zheng <lv.zheng@intel.com>
+M:	Erik Schmauss <erik.schmauss@intel.com>
 M:	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
 L:	linux-acpi@vger.kernel.org
 L:	devel@acpica.org
@@ -867,6 +875,12 @@
 F:	drivers/android/
 F:	drivers/staging/android/
 
+ANDROID GOLDFISH PIC DRIVER
+M:	Miodrag Dinic <miodrag.dinic@mips.com>
+S:	Supported
+F:	Documentation/devicetree/bindings/interrupt-controller/google,goldfish-pic.txt
+F:	drivers/irqchip/irq-goldfish-pic.c
+
 ANDROID GOLDFISH RTC DRIVER
 M:	Miodrag Dinic <miodrag.dinic@mips.com>
 S:	Supported
@@ -1313,7 +1327,8 @@
 F:	tools/perf/arch/arm/util/auxtrace.c
 F:	tools/perf/arch/arm/util/cs-etm.c
 F:	tools/perf/arch/arm/util/cs-etm.h
-F:	tools/perf/util/cs-etm.h
+F:	tools/perf/util/cs-etm.*
+F:	tools/perf/util/cs-etm-decoder/*
 
 ARM/CORGI MACHINE SUPPORT
 M:	Richard Purdie <rpurdie@rpsys.net>
@@ -1583,6 +1598,7 @@
 F:	arch/arm/configs/mvebu_*_defconfig
 F:	arch/arm/mach-mvebu/
 F:	arch/arm64/boot/dts/marvell/armada*
+F:	drivers/cpufreq/armada-37xx-cpufreq.c
 F:	drivers/cpufreq/mvebu-cpufreq.c
 F:	drivers/irqchip/irq-armada-370-xp.c
 F:	drivers/irqchip/irq-mvebu-*
@@ -2383,13 +2399,6 @@
 F:	drivers/input/touchscreen/atmel_mxt_ts.c
 F:	include/linux/platform_data/atmel_mxt_ts.h
 
-ATMEL NAND DRIVER
-M:	Wenyou Yang <wenyou.yang@atmel.com>
-M:	Josh Wu <rainyfeeling@outlook.com>
-L:	linux-mtd@lists.infradead.org
-S:	Supported
-F:	drivers/mtd/nand/atmel/*
-
 ATMEL SAMA5D2 ADC DRIVER
 M:	Ludovic Desroches <ludovic.desroches@microchip.com>
 L:	linux-iio@vger.kernel.org
@@ -5139,6 +5148,12 @@
 S:	Maintained
 F:	drivers/edac/skx_edac.c
 
+EDAC-TI
+M:	Tero Kristo <t-kristo@ti.com>
+L:	linux-edac@vger.kernel.org
+S:	Maintained
+F:	drivers/edac/ti_edac.c
+
 EDIROL UA-101/UA-1000 DRIVER
 M:	Clemens Ladisch <clemens@ladisch.de>
 L:	alsa-devel@alsa-project.org (moderated for non-subscribers)
@@ -5149,15 +5164,15 @@
 EFI TEST DRIVER
 L:	linux-efi@vger.kernel.org
 M:	Ivan Hu <ivan.hu@canonical.com>
-M:	Matt Fleming <matt@codeblueprint.co.uk>
+M:	Ard Biesheuvel <ard.biesheuvel@linaro.org>
 S:	Maintained
 F:	drivers/firmware/efi/test/
 
 EFI VARIABLE FILESYSTEM
 M:	Matthew Garrett <matthew.garrett@nebula.com>
 M:	Jeremy Kerr <jk@ozlabs.org>
-M:	Matt Fleming <matt@codeblueprint.co.uk>
-T:	git git://git.kernel.org/pub/scm/linux/kernel/git/mfleming/efi.git
+M:	Ard Biesheuvel <ard.biesheuvel@linaro.org>
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git
 L:	linux-efi@vger.kernel.org
 S:	Maintained
 F:	fs/efivarfs/
@@ -5318,7 +5333,6 @@
 F:	security/integrity/evm/
 
 EXTENSIBLE FIRMWARE INTERFACE (EFI)
-M:	Matt Fleming <matt@codeblueprint.co.uk>
 M:	Ard Biesheuvel <ard.biesheuvel@linaro.org>
 L:	linux-efi@vger.kernel.org
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/efi/efi.git
@@ -6610,16 +6624,6 @@
 S:	Maintained
 F:	drivers/i2c/i2c-stub.c
 
-i386 BOOT CODE
-M:	"H. Peter Anvin" <hpa@zytor.com>
-S:	Maintained
-F:	arch/x86/boot/
-
-i386 SETUP CODE / CPU ERRATA WORKAROUNDS
-M:	"H. Peter Anvin" <hpa@zytor.com>
-T:	git git://git.kernel.org/pub/scm/linux/kernel/git/hpa/linux-2.6-x86setup.git
-S:	Maintained
-
 IA64 (Itanium) PLATFORM
 M:	Tony Luck <tony.luck@intel.com>
 M:	Fenghua Yu <fenghua.yu@intel.com>
@@ -8193,6 +8197,7 @@
 F:	include/linux/seqlock.h
 F:	lib/locking*.[ch]
 F:	kernel/locking/
+X:	kernel/locking/locktorture.c
 
 LOGICAL DISK MANAGER SUPPORT (LDM, Windows 2000/XP/Vista Dynamic Disks)
 M:	"Richard Russon (FlatCap)" <ldm@flatcap.org>
@@ -8408,6 +8413,13 @@
 S:	Odd Fixes
 F:	drivers/net/wireless/marvell/mwl8k.c
 
+MARVELL NAND CONTROLLER DRIVER
+M:	Miquel Raynal <miquel.raynal@free-electrons.com>
+L:	linux-mtd@lists.infradead.org
+S:	Maintained
+F:	drivers/mtd/nand/marvell_nand.c
+F:	Documentation/devicetree/bindings/mtd/marvell-nand.txt
+
 MARVELL SOC MMC/SD/SDIO CONTROLLER DRIVER
 M:	Nicolas Pitre <nico@fluxnic.net>
 S:	Odd Fixes
@@ -8955,7 +8967,7 @@
 W:	http://www.linux-mtd.infradead.org/
 Q:	http://patchwork.ozlabs.org/project/linux-mtd/list/
 T:	git git://git.infradead.org/linux-mtd.git master
-T:	git git://git.infradead.org/l2-mtd.git master
+T:	git git://git.infradead.org/linux-mtd.git mtd/next
 S:	Maintained
 F:	Documentation/devicetree/bindings/mtd/
 F:	drivers/mtd/
@@ -9044,6 +9056,14 @@
 F:	drivers/media/platform/atmel/atmel-isc-regs.h
 F:	devicetree/bindings/media/atmel-isc.txt
 
+MICROCHIP / ATMEL NAND DRIVER
+M:	Wenyou Yang <wenyou.yang@microchip.com>
+M:	Josh Wu <rainyfeeling@outlook.com>
+L:	linux-mtd@lists.infradead.org
+S:	Supported
+F:	drivers/mtd/nand/atmel/*
+F:	Documentation/devicetree/bindings/mtd/atmel-nand.txt
+
 MICROCHIP KSZ SERIES ETHERNET SWITCH DRIVER
 M:	Woojung Huh <Woojung.Huh@microchip.com>
 M:	Microchip Linux Driver Support <UNGLinuxDriver@microchip.com>
@@ -9086,6 +9106,7 @@
 
 MIPS
 M:	Ralf Baechle <ralf@linux-mips.org>
+M:	James Hogan <jhogan@kernel.org>
 L:	linux-mips@linux-mips.org
 W:	http://www.linux-mips.org/
 T:	git git://git.linux-mips.org/pub/scm/ralf/linux.git
@@ -9343,7 +9364,7 @@
 W:	http://www.linux-mtd.infradead.org/
 Q:	http://patchwork.ozlabs.org/project/linux-mtd/list/
 T:	git git://git.infradead.org/linux-mtd.git nand/fixes
-T:	git git://git.infradead.org/l2-mtd.git nand/next
+T:	git git://git.infradead.org/linux-mtd.git nand/next
 S:	Maintained
 F:	drivers/mtd/nand/
 F:	include/linux/mtd/*nand*.h
@@ -9639,8 +9660,8 @@
 NILFS2 FILESYSTEM
 M:	Ryusuke Konishi <konishi.ryusuke@lab.ntt.co.jp>
 L:	linux-nilfs@vger.kernel.org
-W:	http://nilfs.sourceforge.net/
-W:	http://nilfs.osdn.jp/
+W:	https://nilfs.sourceforge.io/
+W:	https://nilfs.osdn.jp/
 T:	git git://github.com/konis/nilfs2.git
 S:	Supported
 F:	Documentation/filesystems/nilfs2.txt
@@ -9744,6 +9765,15 @@
 F:	Documentation/filesystems/ntfs.txt
 F:	fs/ntfs/
 
+NUBUS SUBSYSTEM
+M:	Finn Thain <fthain@telegraphics.com.au>
+L:	linux-m68k@lists.linux-m68k.org
+S:	Maintained
+F:	arch/*/include/asm/nubus.h
+F:	drivers/nubus/
+F:	include/linux/nubus.h
+F:	include/uapi/linux/nubus.h
+
 NVIDIA (rivafb and nvidiafb) FRAMEBUFFER DRIVER
 M:	Antonino Daplas <adaplas@gmail.com>
 L:	linux-fbdev@vger.kernel.org
@@ -9804,6 +9834,7 @@
 M:	Peter Rosin <peda@axentia.se>
 L:	alsa-devel@alsa-project.org (moderated for non-subscribers)
 S:	Maintained
+F:	Documentation/devicetree/bindings/sound/tfa9879.txt
 F:	sound/soc/codecs/tfa9879*
 
 NXP-NCI NFC DRIVER
@@ -10135,7 +10166,7 @@
 F:	drivers/irqchip/irq-or1k-*
 
 OPENVSWITCH
-M:	Pravin Shelar <pshelar@nicira.com>
+M:	Pravin B Shelar <pshelar@ovn.org>
 L:	netdev@vger.kernel.org
 L:	dev@openvswitch.org
 W:	http://openvswitch.org
@@ -10890,6 +10921,7 @@
 F:	include/linux/pm_*
 F:	include/linux/powercap.h
 F:	drivers/powercap/
+F:	kernel/configs/nopm.config
 
 POWER STATE COORDINATION INTERFACE (PSCI)
 M:	Mark Rutland <mark.rutland@arm.com>
@@ -11450,15 +11482,6 @@
 S:	Orphan
 F:	drivers/net/wireless/ray*
 
-RCUTORTURE MODULE
-M:	Josh Triplett <josh@joshtriplett.org>
-M:	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
-L:	linux-kernel@vger.kernel.org
-S:	Supported
-T:	git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
-F:	Documentation/RCU/torture.txt
-F:	kernel/rcu/rcutorture.c
-
 RCUTORTURE TEST FRAMEWORK
 M:	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
 M:	Josh Triplett <josh@joshtriplett.org>
@@ -11651,8 +11674,8 @@
 RISC-V ARCHITECTURE
 M:	Palmer Dabbelt <palmer@sifive.com>
 M:	Albert Ou <albert@sifive.com>
-L:	patches@groups.riscv.org
-T:	git https://github.com/riscv/riscv-linux
+L:	linux-riscv@lists.infradead.org
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/palmer/riscv-linux.git
 S:	Supported
 F:	arch/riscv/
 K:	riscv
@@ -12233,7 +12256,7 @@
 S:	Supported
 
 SECURITY SUBSYSTEM
-M:	James Morris <james.l.morris@oracle.com>
+M:	James Morris <jmorris@namei.org>
 M:	"Serge E. Hallyn" <serge@hallyn.com>
 L:	linux-security-module@vger.kernel.org (suggested Cc:)
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security.git
@@ -12592,6 +12615,12 @@
 F:	drivers/media/i2c/soc_camera/
 F:	drivers/media/platform/soc_camera/
 
+SOCIONEXT UNIPHIER SOUND DRIVER
+M:	Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>
+L:	alsa-devel@alsa-project.org (moderated for non-subscribers)
+S:	Maintained
+F:	sound/soc/uniphier/
+
 SOEKRIS NET48XX LED SUPPORT
 M:	Chris Boot <bootc@bootc.net>
 S:	Maintained
@@ -12616,6 +12645,15 @@
 S:	Supported
 F:	drivers/media/pci/solo6x10/
 
+SOFTWARE DELEGATED EXCEPTION INTERFACE (SDEI)
+M:	James Morse <james.morse@arm.com>
+L:	linux-arm-kernel@lists.infradead.org
+S:	Maintained
+F:	Documentation/devicetree/bindings/arm/firmware/sdei.txt
+F:	drivers/firmware/arm_sdei.c
+F:	include/linux/sdei.h
+F:	include/uapi/linux/sdei.h
+
 SOFTWARE RAID (Multiple Disks) SUPPORT
 M:	Shaohua Li <shli@kernel.org>
 L:	linux-raid@vger.kernel.org
@@ -12780,7 +12818,7 @@
 W:	http://www.linux-mtd.infradead.org/
 Q:	http://patchwork.ozlabs.org/project/linux-mtd/list/
 T:	git git://git.infradead.org/linux-mtd.git spi-nor/fixes
-T:	git git://git.infradead.org/l2-mtd.git spi-nor/next
+T:	git git://git.infradead.org/linux-mtd.git spi-nor/next
 S:	Maintained
 F:	drivers/mtd/spi-nor/
 F:	include/linux/mtd/spi-nor.h
@@ -13767,6 +13805,18 @@
 S:	Maintained
 F:	drivers/platform/x86/topstar-laptop.c
 
+TORTURE-TEST MODULES
+M:	Davidlohr Bueso <dave@stgolabs.net>
+M:	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
+M:	Josh Triplett <josh@joshtriplett.org>
+L:	linux-kernel@vger.kernel.org
+S:	Supported
+T:	git git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git
+F:	Documentation/RCU/torture.txt
+F:	kernel/torture.c
+F:	kernel/rcu/rcutorture.c
+F:	kernel/locking/locktorture.c
+
 TOSHIBA ACPI EXTRAS DRIVER
 M:	Azael Avalos <coproscefalo@gmail.com>
 L:	platform-driver-x86@vger.kernel.org
@@ -13850,6 +13900,13 @@
 S:	Maintained
 K:	^Subject:.*(?i)trivial
 
+TEMPO SEMICONDUCTOR DRIVERS
+M:	Steven Eckhoff <steven.eckhoff.opensource@gmail.com>
+S:	Maintained
+F:	sound/soc/codecs/tscs*.c
+F:	sound/soc/codecs/tscs*.h
+F:	Documentation/devicetree/bindings/sound/tscs*.txt
+
 TTY LAYER
 M:	Greg Kroah-Hartman <gregkh@linuxfoundation.org>
 M:	Jiri Slaby <jslaby@suse.com>
@@ -14648,6 +14705,7 @@
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/broonie/regulator.git
 S:	Supported
 F:	Documentation/devicetree/bindings/regulator/
+F:	Documentation/power/regulator/
 F:	drivers/regulator/
 F:	include/dt-bindings/regulator/
 F:	include/linux/regulator/
@@ -14858,7 +14916,7 @@
 X86 ARCHITECTURE (32-BIT AND 64-BIT)
 M:	Thomas Gleixner <tglx@linutronix.de>
 M:	Ingo Molnar <mingo@redhat.com>
-M:	"H. Peter Anvin" <hpa@zytor.com>
+R:	"H. Peter Anvin" <hpa@zytor.com>
 M:	x86@kernel.org
 L:	linux-kernel@vger.kernel.org
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git x86/core
diff --git a/Makefile b/Makefile
index eb1f597..c8b8e90 100644
--- a/Makefile
+++ b/Makefile
@@ -2,7 +2,7 @@
 VERSION = 4
 PATCHLEVEL = 15
 SUBLEVEL = 0
-EXTRAVERSION = -rc6
+EXTRAVERSION =
 NAME = Fearless Coyote
 
 # *DOCUMENTATION*
@@ -484,26 +484,6 @@
 endif
 KBUILD_CFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC)
 KBUILD_AFLAGS += $(CLANG_TARGET) $(CLANG_GCC_TC)
-KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
-KBUILD_CFLAGS += $(call cc-disable-warning, unused-variable)
-KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier)
-KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
-KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
-# Quiet clang warning: comparison of unsigned expression < 0 is always false
-KBUILD_CFLAGS += $(call cc-disable-warning, tautological-compare)
-# CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the
-# source of a reference will be _MergedGlobals and not on of the whitelisted names.
-# See modpost pattern 2
-KBUILD_CFLAGS += $(call cc-option, -mno-global-merge,)
-KBUILD_CFLAGS += $(call cc-option, -fcatch-undefined-behavior)
-KBUILD_CFLAGS += $(call cc-option, -no-integrated-as)
-KBUILD_AFLAGS += $(call cc-option, -no-integrated-as)
-else
-
-# These warnings generated too much noise in a regular build.
-# Use make W=1 to enable them (see scripts/Makefile.extrawarn)
-KBUILD_CFLAGS += $(call cc-disable-warning, unused-but-set-variable)
-KBUILD_CFLAGS += $(call cc-disable-warning, unused-const-variable)
 endif
 
 ifeq ($(config-targets),1)
@@ -716,6 +696,29 @@
 endif
 KBUILD_CFLAGS += $(stackp-flag)
 
+ifeq ($(cc-name),clang)
+KBUILD_CPPFLAGS += $(call cc-option,-Qunused-arguments,)
+KBUILD_CFLAGS += $(call cc-disable-warning, unused-variable)
+KBUILD_CFLAGS += $(call cc-disable-warning, format-invalid-specifier)
+KBUILD_CFLAGS += $(call cc-disable-warning, gnu)
+KBUILD_CFLAGS += $(call cc-disable-warning, address-of-packed-member)
+# Quiet clang warning: comparison of unsigned expression < 0 is always false
+KBUILD_CFLAGS += $(call cc-disable-warning, tautological-compare)
+# CLANG uses a _MergedGlobals as optimization, but this breaks modpost, as the
+# source of a reference will be _MergedGlobals and not on of the whitelisted names.
+# See modpost pattern 2
+KBUILD_CFLAGS += $(call cc-option, -mno-global-merge,)
+KBUILD_CFLAGS += $(call cc-option, -fcatch-undefined-behavior)
+KBUILD_CFLAGS += $(call cc-option, -no-integrated-as)
+KBUILD_AFLAGS += $(call cc-option, -no-integrated-as)
+else
+
+# These warnings generated too much noise in a regular build.
+# Use make W=1 to enable them (see scripts/Makefile.extrawarn)
+KBUILD_CFLAGS += $(call cc-disable-warning, unused-but-set-variable)
+KBUILD_CFLAGS += $(call cc-disable-warning, unused-const-variable)
+endif
+
 ifdef CONFIG_FRAME_POINTER
 KBUILD_CFLAGS	+= -fno-omit-frame-pointer -fno-optimize-sibling-calls
 else
diff --git a/arch/Kconfig b/arch/Kconfig
index 400b9e1b..a26d6f8 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -234,8 +234,8 @@
 config ARCH_HAS_SET_MEMORY
 	bool
 
-# Select if arch init_task initializer is different to init/init_task.c
-config ARCH_INIT_TASK
+# Select if arch init_task must go in the __init_task_data section
+config ARCH_TASK_STRUCT_ON_STACK
        bool
 
 # Select if arch has its private alloc_task_struct() function
diff --git a/arch/alpha/include/asm/thread_info.h b/arch/alpha/include/asm/thread_info.h
index 8c20c5e..807d7b9 100644
--- a/arch/alpha/include/asm/thread_info.h
+++ b/arch/alpha/include/asm/thread_info.h
@@ -39,9 +39,6 @@ struct thread_info {
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* How to get the thread information struct from C.  */
 register struct thread_info *__current_thread_info __asm__("$8");
 #define current_thread_info()  __current_thread_info
diff --git a/arch/alpha/kernel/sys_sio.c b/arch/alpha/kernel/sys_sio.c
index 37bd6d9..a6bdc1d 100644
--- a/arch/alpha/kernel/sys_sio.c
+++ b/arch/alpha/kernel/sys_sio.c
@@ -102,6 +102,15 @@ sio_pci_route(void)
 				   alpha_mv.sys.sio.route_tab);
 }
 
+static bool sio_pci_dev_irq_needs_level(const struct pci_dev *dev)
+{
+	if ((dev->class >> 16 == PCI_BASE_CLASS_BRIDGE) &&
+	    (dev->class >> 8 != PCI_CLASS_BRIDGE_PCMCIA))
+		return false;
+
+	return true;
+}
+
 static unsigned int __init
 sio_collect_irq_levels(void)
 {
@@ -110,8 +119,7 @@ sio_collect_irq_levels(void)
 
 	/* Iterate through the devices, collecting IRQ levels.  */
 	for_each_pci_dev(dev) {
-		if ((dev->class >> 16 == PCI_BASE_CLASS_BRIDGE) &&
-		    (dev->class >> 8 != PCI_CLASS_BRIDGE_PCMCIA))
+		if (!sio_pci_dev_irq_needs_level(dev))
 			continue;
 
 		if (dev->irq)
@@ -120,8 +128,7 @@ sio_collect_irq_levels(void)
 	return level_bits;
 }
 
-static void __init
-sio_fixup_irq_levels(unsigned int level_bits)
+static void __sio_fixup_irq_levels(unsigned int level_bits, bool reset)
 {
 	unsigned int old_level_bits;
 
@@ -139,12 +146,21 @@ sio_fixup_irq_levels(unsigned int level_bits)
 	 */
 	old_level_bits = inb(0x4d0) | (inb(0x4d1) << 8);
 
-	level_bits |= (old_level_bits & 0x71ff);
+	if (reset)
+		old_level_bits &= 0x71ff;
+
+	level_bits |= old_level_bits;
 
 	outb((level_bits >> 0) & 0xff, 0x4d0);
 	outb((level_bits >> 8) & 0xff, 0x4d1);
 }
 
+static inline void
+sio_fixup_irq_levels(unsigned int level_bits)
+{
+	__sio_fixup_irq_levels(level_bits, true);
+}
+
 static inline int
 noname_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
 {
@@ -181,7 +197,14 @@ noname_map_irq(const struct pci_dev *dev, u8 slot, u8 pin)
 	const long min_idsel = 6, max_idsel = 14, irqs_per_slot = 5;
 	int irq = COMMON_TABLE_LOOKUP, tmp;
 	tmp = __kernel_extbl(alpha_mv.sys.sio.route_tab, irq);
-	return irq >= 0 ? tmp : -1;
+
+	irq = irq >= 0 ? tmp : -1;
+
+	/* Fixup IRQ level if an actual IRQ mapping is detected */
+	if (sio_pci_dev_irq_needs_level(dev) && irq >= 0)
+		__sio_fixup_irq_levels(1 << irq, false);
+
+	return irq;
 }
 
 static inline int
diff --git a/arch/alpha/lib/ev6-memset.S b/arch/alpha/lib/ev6-memset.S
index 316a99a..1cfcfbb 100644
--- a/arch/alpha/lib/ev6-memset.S
+++ b/arch/alpha/lib/ev6-memset.S
@@ -18,7 +18,7 @@
  * The algorithm for the leading and trailing quadwords remains the same,
  * however the loop has been unrolled to enable better memory throughput,
  * and the code has been replicated for each of the entry points: __memset
- * and __memsetw to permit better scheduling to eliminate the stalling
+ * and __memset16 to permit better scheduling to eliminate the stalling
  * encountered during the mask replication.
  * A future enhancement might be to put in a byte store loop for really
  * small (say < 32 bytes) memset()s.  Whether or not that change would be
@@ -34,7 +34,7 @@
 	.globl memset
 	.globl __memset
 	.globl ___memset
-	.globl __memsetw
+	.globl __memset16
 	.globl __constant_c_memset
 
 	.ent ___memset
@@ -415,9 +415,9 @@
 	 * to mask stalls.  Note that entry point names also had to change
 	 */
 	.align 5
-	.ent __memsetw
+	.ent __memset16
 
-__memsetw:
+__memset16:
 	.frame $30,0,$26,0
 	.prologue 0
 
@@ -596,8 +596,8 @@
 	nop
 	ret $31,($26),1		# L0 :
 
-	.end __memsetw
-	EXPORT_SYMBOL(__memsetw)
+	.end __memset16
+	EXPORT_SYMBOL(__memset16)
 
 memset = ___memset
 __memset = ___memset
diff --git a/arch/arc/boot/dts/axc003.dtsi b/arch/arc/boot/dts/axc003.dtsi
index 4e6e9f5..dc91c66 100644
--- a/arch/arc/boot/dts/axc003.dtsi
+++ b/arch/arc/boot/dts/axc003.dtsi
@@ -35,6 +35,14 @@
 			reg = <0x80 0x10>, <0x100 0x10>;
 			#clock-cells = <0>;
 			clocks = <&input_clk>;
+
+			/*
+			 * Set initial core pll output frequency to 90MHz.
+			 * It will be applied at the core pll driver probing
+			 * on early boot.
+			 */
+			assigned-clocks = <&core_clk>;
+			assigned-clock-rates = <90000000>;
 		};
 
 		core_intc: archs-intc@cpu {
diff --git a/arch/arc/boot/dts/axc003_idu.dtsi b/arch/arc/boot/dts/axc003_idu.dtsi
index 63954a8..69ff4895 100644
--- a/arch/arc/boot/dts/axc003_idu.dtsi
+++ b/arch/arc/boot/dts/axc003_idu.dtsi
@@ -35,6 +35,14 @@
 			reg = <0x80 0x10>, <0x100 0x10>;
 			#clock-cells = <0>;
 			clocks = <&input_clk>;
+
+			/*
+			 * Set initial core pll output frequency to 100MHz.
+			 * It will be applied at the core pll driver probing
+			 * on early boot.
+			 */
+			assigned-clocks = <&core_clk>;
+			assigned-clock-rates = <100000000>;
 		};
 
 		core_intc: archs-intc@cpu {
diff --git a/arch/arc/boot/dts/hsdk.dts b/arch/arc/boot/dts/hsdk.dts
index 8f627c2..006aa3d 100644
--- a/arch/arc/boot/dts/hsdk.dts
+++ b/arch/arc/boot/dts/hsdk.dts
@@ -114,6 +114,14 @@
 			reg = <0x00 0x10>, <0x14B8 0x4>;
 			#clock-cells = <0>;
 			clocks = <&input_clk>;
+
+			/*
+			 * Set initial core pll output frequency to 1GHz.
+			 * It will be applied at the core pll driver probing
+			 * on early boot.
+			 */
+			assigned-clocks = <&core_clk>;
+			assigned-clock-rates = <1000000000>;
 		};
 
 		serial: serial@5000 {
diff --git a/arch/arc/configs/hsdk_defconfig b/arch/arc/configs/hsdk_defconfig
index 7b8f8fa..ac6b0ed 100644
--- a/arch/arc/configs/hsdk_defconfig
+++ b/arch/arc/configs/hsdk_defconfig
@@ -49,10 +49,11 @@
 CONFIG_SERIAL_OF_PLATFORM=y
 # CONFIG_HW_RANDOM is not set
 # CONFIG_HWMON is not set
+CONFIG_DRM=y
+# CONFIG_DRM_FBDEV_EMULATION is not set
+CONFIG_DRM_UDL=y
 CONFIG_FB=y
-CONFIG_FB_UDL=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
-CONFIG_USB=y
 CONFIG_USB_EHCI_HCD=y
 CONFIG_USB_EHCI_HCD_PLATFORM=y
 CONFIG_USB_OHCI_HCD=y
diff --git a/arch/arc/include/asm/thread_info.h b/arch/arc/include/asm/thread_info.h
index 2d79e52..c85947b 100644
--- a/arch/arc/include/asm/thread_info.h
+++ b/arch/arc/include/asm/thread_info.h
@@ -62,9 +62,6 @@ struct thread_info {
 	.addr_limit = KERNEL_DS,		\
 }
 
-#define init_thread_info    (init_thread_union.thread_info)
-#define init_stack          (init_thread_union.stack)
-
 static inline __attribute_const__ struct thread_info *current_thread_info(void)
 {
 	register unsigned long sp asm("sp");
diff --git a/arch/arc/include/asm/uaccess.h b/arch/arc/include/asm/uaccess.h
index f35974e..c9173c0 100644
--- a/arch/arc/include/asm/uaccess.h
+++ b/arch/arc/include/asm/uaccess.h
@@ -668,6 +668,7 @@ __arc_strncpy_from_user(char *dst, const char __user *src, long count)
 		return 0;
 
 	__asm__ __volatile__(
+	"	mov	lp_count, %5		\n"
 	"	lp	3f			\n"
 	"1:	ldb.ab  %3, [%2, 1]		\n"
 	"	breq.d	%3, 0, 3f               \n"
@@ -684,8 +685,8 @@ __arc_strncpy_from_user(char *dst, const char __user *src, long count)
 	"	.word   1b, 4b			\n"
 	"	.previous			\n"
 	: "+r"(res), "+r"(dst), "+r"(src), "=r"(val)
-	: "g"(-EFAULT), "l"(count)
-	: "memory");
+	: "g"(-EFAULT), "r"(count)
+	: "lp_count", "lp_start", "lp_end", "memory");
 
 	return res;
 }
diff --git a/arch/arc/kernel/setup.c b/arch/arc/kernel/setup.c
index 7ef7d9a..9d27331 100644
--- a/arch/arc/kernel/setup.c
+++ b/arch/arc/kernel/setup.c
@@ -199,7 +199,7 @@ static void read_arc_build_cfg_regs(void)
 			unsigned int exec_ctrl;
 
 			READ_BCR(AUX_EXEC_CTRL, exec_ctrl);
-			cpu->extn.dual_enb = exec_ctrl & 1;
+			cpu->extn.dual_enb = !(exec_ctrl & 1);
 
 			/* dual issue always present for this core */
 			cpu->extn.dual = 1;
diff --git a/arch/arc/kernel/stacktrace.c b/arch/arc/kernel/stacktrace.c
index 74315f3..bf40e06 100644
--- a/arch/arc/kernel/stacktrace.c
+++ b/arch/arc/kernel/stacktrace.c
@@ -163,7 +163,7 @@ arc_unwind_core(struct task_struct *tsk, struct pt_regs *regs,
  */
 static int __print_sym(unsigned int address, void *unused)
 {
-	__print_symbol("  %s\n", address);
+	printk("  %pS\n", (void *)address);
 	return 0;
 }
 
diff --git a/arch/arc/kernel/traps.c b/arch/arc/kernel/traps.c
index c720678..b123558 100644
--- a/arch/arc/kernel/traps.c
+++ b/arch/arc/kernel/traps.c
@@ -85,6 +85,7 @@ DO_ERROR_INFO(SIGILL, "Illegal Insn (or Seq)", insterror_is_error, ILL_ILLOPC)
 DO_ERROR_INFO(SIGBUS, "Invalid Mem Access", __weak do_memory_error, BUS_ADRERR)
 DO_ERROR_INFO(SIGTRAP, "Breakpoint Set", trap_is_brkpt, TRAP_BRKPT)
 DO_ERROR_INFO(SIGBUS, "Misaligned Access", do_misaligned_error, BUS_ADRALN)
+DO_ERROR_INFO(SIGSEGV, "gcc generated __builtin_trap", do_trap5_error, 0)
 
 /*
  * Entry Point for Misaligned Data access Exception, for emulating in software
@@ -117,6 +118,8 @@ void do_machine_check_fault(unsigned long address, struct pt_regs *regs)
  * Thus TRAP_S <n> can be used for specific purpose
  *  -1 used for software breakpointing (gdb)
  *  -2 used by kprobes
+ *  -5 __builtin_trap() generated by gcc (2018.03 onwards) for toggle such as
+ *     -fno-isolate-erroneous-paths-dereference
  */
 void do_non_swi_trap(unsigned long address, struct pt_regs *regs)
 {
@@ -136,6 +139,9 @@ void do_non_swi_trap(unsigned long address, struct pt_regs *regs)
 		kgdb_trap(regs);
 		break;
 
+	case 5:
+		do_trap5_error(address, regs);
+		break;
 	default:
 		break;
 	}
@@ -157,3 +163,11 @@ void do_insterror_or_kprobe(unsigned long address, struct pt_regs *regs)
 
 	insterror_is_error(address, regs);
 }
+
+/*
+ * abort() call generated by older gcc for __builtin_trap()
+ */
+void abort(void)
+{
+	__asm__ __volatile__("trap_s  5\n");
+}
diff --git a/arch/arc/kernel/troubleshoot.c b/arch/arc/kernel/troubleshoot.c
index 7d8c1d6..6e9a0a9 100644
--- a/arch/arc/kernel/troubleshoot.c
+++ b/arch/arc/kernel/troubleshoot.c
@@ -163,6 +163,9 @@ static void show_ecr_verbose(struct pt_regs *regs)
 		else
 			pr_cont("Bus Error, check PRM\n");
 #endif
+	} else if (vec == ECR_V_TRAP) {
+		if (regs->ecr_param == 5)
+			pr_cont("gcc generated __builtin_trap\n");
 	} else {
 		pr_cont("Check Programmer's Manual\n");
 	}
diff --git a/arch/arc/plat-axs10x/axs10x.c b/arch/arc/plat-axs10x/axs10x.c
index f1ac679..46544e8 100644
--- a/arch/arc/plat-axs10x/axs10x.c
+++ b/arch/arc/plat-axs10x/axs10x.c
@@ -317,25 +317,23 @@ static void __init axs103_early_init(void)
 	 * Instead of duplicating defconfig/DT for SMP/QUAD, add a small hack
 	 * of fudging the freq in DT
 	 */
+#define AXS103_QUAD_CORE_CPU_FREQ_HZ	50000000
+
 	unsigned int num_cores = (read_aux_reg(ARC_REG_MCIP_BCR) >> 16) & 0x3F;
 	if (num_cores > 2) {
-		u32 freq = 50, orig;
-		/*
-		 * TODO: use cpu node "cpu-freq" param instead of platform-specific
-		 * "/cpu_card/core_clk" as it works only if we use fixed-clock for cpu.
-		 */
+		u32 freq;
 		int off = fdt_path_offset(initial_boot_params, "/cpu_card/core_clk");
 		const struct fdt_property *prop;
 
 		prop = fdt_get_property(initial_boot_params, off,
-					"clock-frequency", NULL);
-		orig = be32_to_cpu(*(u32*)(prop->data)) / 1000000;
+					"assigned-clock-rates", NULL);
+		freq = be32_to_cpu(*(u32 *)(prop->data));
 
 		/* Patching .dtb in-place with new core clock value */
-		if (freq != orig ) {
-			freq = cpu_to_be32(freq * 1000000);
+		if (freq != AXS103_QUAD_CORE_CPU_FREQ_HZ) {
+			freq = cpu_to_be32(AXS103_QUAD_CORE_CPU_FREQ_HZ);
 			fdt_setprop_inplace(initial_boot_params, off,
-					    "clock-frequency", &freq, sizeof(freq));
+					    "assigned-clock-rates", &freq, sizeof(freq));
 		}
 	}
 #endif
diff --git a/arch/arc/plat-hsdk/platform.c b/arch/arc/plat-hsdk/platform.c
index fd0ae5e..2958aed 100644
--- a/arch/arc/plat-hsdk/platform.c
+++ b/arch/arc/plat-hsdk/platform.c
@@ -38,42 +38,6 @@ static void __init hsdk_init_per_cpu(unsigned int cpu)
 #define CREG_PAE		(CREG_BASE + 0x180)
 #define CREG_PAE_UPDATE		(CREG_BASE + 0x194)
 
-#define CREG_CORE_IF_CLK_DIV	(CREG_BASE + 0x4B8)
-#define CREG_CORE_IF_CLK_DIV_2	0x1
-#define CGU_BASE		ARC_PERIPHERAL_BASE
-#define CGU_PLL_STATUS		(ARC_PERIPHERAL_BASE + 0x4)
-#define CGU_PLL_CTRL		(ARC_PERIPHERAL_BASE + 0x0)
-#define CGU_PLL_STATUS_LOCK	BIT(0)
-#define CGU_PLL_STATUS_ERR	BIT(1)
-#define CGU_PLL_CTRL_1GHZ	0x3A10
-#define HSDK_PLL_LOCK_TIMEOUT	500
-
-#define HSDK_PLL_LOCKED() \
-	!!(ioread32((void __iomem *) CGU_PLL_STATUS) & CGU_PLL_STATUS_LOCK)
-
-#define HSDK_PLL_ERR() \
-	!!(ioread32((void __iomem *) CGU_PLL_STATUS) & CGU_PLL_STATUS_ERR)
-
-static void __init hsdk_set_cpu_freq_1ghz(void)
-{
-	u32 timeout = HSDK_PLL_LOCK_TIMEOUT;
-
-	/*
-	 * As we set cpu clock which exceeds 500MHz, the divider for the interface
-	 * clock must be programmed to div-by-2.
-	 */
-	iowrite32(CREG_CORE_IF_CLK_DIV_2, (void __iomem *) CREG_CORE_IF_CLK_DIV);
-
-	/* Set cpu clock to 1GHz */
-	iowrite32(CGU_PLL_CTRL_1GHZ, (void __iomem *) CGU_PLL_CTRL);
-
-	while (!HSDK_PLL_LOCKED() && timeout--)
-		cpu_relax();
-
-	if (!HSDK_PLL_LOCKED() || HSDK_PLL_ERR())
-		pr_err("Failed to setup CPU frequency to 1GHz!");
-}
-
 #define SDIO_BASE		(ARC_PERIPHERAL_BASE + 0xA000)
 #define SDIO_UHS_REG_EXT	(SDIO_BASE + 0x108)
 #define SDIO_UHS_REG_EXT_DIV_2	(2 << 30)
@@ -98,12 +62,6 @@ static void __init hsdk_init_early(void)
 	 * minimum possible div-by-2.
 	 */
 	iowrite32(SDIO_UHS_REG_EXT_DIV_2, (void __iomem *) SDIO_UHS_REG_EXT);
-
-	/*
-	 * Setup CPU frequency to 1GHz.
-	 * TODO: remove it after smart hsdk pll driver will be introduced.
-	 */
-	hsdk_set_cpu_freq_1ghz();
 }
 
 static const char *hsdk_compat[] __initconst = {
diff --git a/arch/arm/boot/dts/aspeed-g4.dtsi b/arch/arm/boot/dts/aspeed-g4.dtsi
index 45d815a..de08d90 100644
--- a/arch/arm/boot/dts/aspeed-g4.dtsi
+++ b/arch/arm/boot/dts/aspeed-g4.dtsi
@@ -219,7 +219,7 @@
 				compatible = "aspeed,ast2400-vuart";
 				reg = <0x1e787000 0x40>;
 				reg-shift = <2>;
-				interrupts = <10>;
+				interrupts = <8>;
 				clocks = <&clk_uart>;
 				no-loopback-test;
 				status = "disabled";
diff --git a/arch/arm/boot/dts/at91-tse850-3.dts b/arch/arm/boot/dts/at91-tse850-3.dts
index 5f29010..9b82cc8 100644
--- a/arch/arm/boot/dts/at91-tse850-3.dts
+++ b/arch/arm/boot/dts/at91-tse850-3.dts
@@ -221,6 +221,7 @@
 	jc42@18 {
 		compatible = "nxp,se97b", "jedec,jc-42.4-temp";
 		reg = <0x18>;
+		smbus-timeout-disable;
 	};
 
 	dpot: mcp4651-104@28 {
diff --git a/arch/arm/boot/dts/bcm2836.dtsi b/arch/arm/boot/dts/bcm2836.dtsi
index 61e1580..1dfd764 100644
--- a/arch/arm/boot/dts/bcm2836.dtsi
+++ b/arch/arm/boot/dts/bcm2836.dtsi
@@ -13,24 +13,24 @@
 			compatible = "brcm,bcm2836-l1-intc";
 			reg = <0x40000000 0x100>;
 			interrupt-controller;
-			#interrupt-cells = <1>;
+			#interrupt-cells = <2>;
 			interrupt-parent = <&local_intc>;
 		};
 
 		arm-pmu {
 			compatible = "arm,cortex-a7-pmu";
 			interrupt-parent = <&local_intc>;
-			interrupts = <9>;
+			interrupts = <9 IRQ_TYPE_LEVEL_HIGH>;
 		};
 	};
 
 	timer {
 		compatible = "arm,armv7-timer";
 		interrupt-parent = <&local_intc>;
-		interrupts = <0>, // PHYS_SECURE_PPI
-			     <1>, // PHYS_NONSECURE_PPI
-			     <3>, // VIRT_PPI
-			     <2>; // HYP_PPI
+		interrupts = <0 IRQ_TYPE_LEVEL_HIGH>, // PHYS_SECURE_PPI
+			     <1 IRQ_TYPE_LEVEL_HIGH>, // PHYS_NONSECURE_PPI
+			     <3 IRQ_TYPE_LEVEL_HIGH>, // VIRT_PPI
+			     <2 IRQ_TYPE_LEVEL_HIGH>; // HYP_PPI
 		always-on;
 	};
 
@@ -76,7 +76,7 @@
 	compatible = "brcm,bcm2836-armctrl-ic";
 	reg = <0x7e00b200 0x200>;
 	interrupt-parent = <&local_intc>;
-	interrupts = <8>;
+	interrupts = <8 IRQ_TYPE_LEVEL_HIGH>;
 };
 
 &cpu_thermal {
diff --git a/arch/arm/boot/dts/bcm2837.dtsi b/arch/arm/boot/dts/bcm2837.dtsi
index bc1cca5..efa7d33 100644
--- a/arch/arm/boot/dts/bcm2837.dtsi
+++ b/arch/arm/boot/dts/bcm2837.dtsi
@@ -12,7 +12,7 @@
 			compatible = "brcm,bcm2836-l1-intc";
 			reg = <0x40000000 0x100>;
 			interrupt-controller;
-			#interrupt-cells = <1>;
+			#interrupt-cells = <2>;
 			interrupt-parent = <&local_intc>;
 		};
 	};
@@ -20,10 +20,10 @@
 	timer {
 		compatible = "arm,armv7-timer";
 		interrupt-parent = <&local_intc>;
-		interrupts = <0>, // PHYS_SECURE_PPI
-			     <1>, // PHYS_NONSECURE_PPI
-			     <3>, // VIRT_PPI
-			     <2>; // HYP_PPI
+		interrupts = <0 IRQ_TYPE_LEVEL_HIGH>, // PHYS_SECURE_PPI
+			     <1 IRQ_TYPE_LEVEL_HIGH>, // PHYS_NONSECURE_PPI
+			     <3 IRQ_TYPE_LEVEL_HIGH>, // VIRT_PPI
+			     <2 IRQ_TYPE_LEVEL_HIGH>; // HYP_PPI
 		always-on;
 	};
 
@@ -73,7 +73,7 @@
 	compatible = "brcm,bcm2836-armctrl-ic";
 	reg = <0x7e00b200 0x200>;
 	interrupt-parent = <&local_intc>;
-	interrupts = <8>;
+	interrupts = <8 IRQ_TYPE_LEVEL_HIGH>;
 };
 
 &cpu_thermal {
diff --git a/arch/arm/boot/dts/bcm283x.dtsi b/arch/arm/boot/dts/bcm283x.dtsi
index dcde93c..18db25a 100644
--- a/arch/arm/boot/dts/bcm283x.dtsi
+++ b/arch/arm/boot/dts/bcm283x.dtsi
@@ -2,6 +2,7 @@
 #include <dt-bindings/clock/bcm2835.h>
 #include <dt-bindings/clock/bcm2835-aux.h>
 #include <dt-bindings/gpio/gpio.h>
+#include <dt-bindings/interrupt-controller/irq.h>
 
 /* firmware-provided startup stubs live here, where the secondary CPUs are
  * spinning.
diff --git a/arch/arm/boot/dts/da850-lcdk.dts b/arch/arm/boot/dts/da850-lcdk.dts
index eed89e6..a1f4d6d 100644
--- a/arch/arm/boot/dts/da850-lcdk.dts
+++ b/arch/arm/boot/dts/da850-lcdk.dts
@@ -293,12 +293,12 @@
 					label = "u-boot env";
 					reg = <0 0x020000>;
 				};
-				partition@0x020000 {
+				partition@20000 {
 					/* The LCDK defaults to booting from this partition */
 					label = "u-boot";
 					reg = <0x020000 0x080000>;
 				};
-				partition@0x0a0000 {
+				partition@a0000 {
 					label = "free space";
 					reg = <0x0a0000 0>;
 				};
diff --git a/arch/arm/boot/dts/da850-lego-ev3.dts b/arch/arm/boot/dts/da850-lego-ev3.dts
index 413dbd5..81942ae 100644
--- a/arch/arm/boot/dts/da850-lego-ev3.dts
+++ b/arch/arm/boot/dts/da850-lego-ev3.dts
@@ -178,7 +178,7 @@
 	 */
 	battery {
 		pinctrl-names = "default";
-		pintctrl-0 = <&battery_pins>;
+		pinctrl-0 = <&battery_pins>;
 		compatible = "lego,ev3-battery";
 		io-channels = <&adc 4>, <&adc 3>;
 		io-channel-names = "voltage", "current";
@@ -392,7 +392,7 @@
 	batt_volt_en {
 		gpio-hog;
 		gpios = <6 GPIO_ACTIVE_HIGH>;
-		output-low;
+		output-high;
 	};
 };
 
diff --git a/arch/arm/boot/dts/exynos5800-peach-pi.dts b/arch/arm/boot/dts/exynos5800-peach-pi.dts
index b2b95ff..0029ec2 100644
--- a/arch/arm/boot/dts/exynos5800-peach-pi.dts
+++ b/arch/arm/boot/dts/exynos5800-peach-pi.dts
@@ -664,6 +664,10 @@
 	status = "okay";
 };
 
+&mixer {
+	status = "okay";
+};
+
 /* eMMC flash */
 &mmc_0 {
 	status = "okay";
diff --git a/arch/arm/boot/dts/imx6ul.dtsi b/arch/arm/boot/dts/imx6ul.dtsi
index d5181f8..963e169 100644
--- a/arch/arm/boot/dts/imx6ul.dtsi
+++ b/arch/arm/boot/dts/imx6ul.dtsi
@@ -68,12 +68,14 @@
 			clock-latency = <61036>; /* two CLK32 periods */
 			operating-points = <
 				/* kHz	uV */
+				696000	1275000
 				528000	1175000
 				396000	1025000
 				198000	950000
 			>;
 			fsl,soc-operating-points = <
 				/* KHz	uV */
+				696000	1275000
 				528000	1175000
 				396000	1175000
 				198000	1175000
diff --git a/arch/arm/boot/dts/kirkwood-openblocks_a7.dts b/arch/arm/boot/dts/kirkwood-openblocks_a7.dts
index cf2f524..27cc913 100644
--- a/arch/arm/boot/dts/kirkwood-openblocks_a7.dts
+++ b/arch/arm/boot/dts/kirkwood-openblocks_a7.dts
@@ -53,7 +53,8 @@
 		};
 
 		pinctrl: pin-controller@10000 {
-			pinctrl-0 = <&pmx_dip_switches &pmx_gpio_header>;
+			pinctrl-0 = <&pmx_dip_switches &pmx_gpio_header
+				     &pmx_gpio_header_gpo>;
 			pinctrl-names = "default";
 
 			pmx_uart0: pmx-uart0 {
@@ -85,11 +86,16 @@
 			 * ground.
 			 */
 			pmx_gpio_header: pmx-gpio-header {
-				marvell,pins = "mpp17", "mpp7", "mpp29", "mpp28",
+				marvell,pins = "mpp17", "mpp29", "mpp28",
 					       "mpp35", "mpp34", "mpp40";
 				marvell,function = "gpio";
 			};
 
+			pmx_gpio_header_gpo: pxm-gpio-header-gpo {
+				marvell,pins = "mpp7";
+				marvell,function = "gpo";
+			};
+
 			pmx_gpio_init: pmx-init {
 				marvell,pins = "mpp38";
 				marvell,function = "gpio";
diff --git a/arch/arm/boot/dts/ls1021a-qds.dts b/arch/arm/boot/dts/ls1021a-qds.dts
index 9408753..67b4de0 100644
--- a/arch/arm/boot/dts/ls1021a-qds.dts
+++ b/arch/arm/boot/dts/ls1021a-qds.dts
@@ -215,7 +215,7 @@
 				reg = <0x2a>;
 				VDDA-supply = <&reg_3p3v>;
 				VDDIO-supply = <&reg_3p3v>;
-				clocks = <&sys_mclk 1>;
+				clocks = <&sys_mclk>;
 			};
 		};
 	};
diff --git a/arch/arm/boot/dts/ls1021a-twr.dts b/arch/arm/boot/dts/ls1021a-twr.dts
index a8b148a..44715c8 100644
--- a/arch/arm/boot/dts/ls1021a-twr.dts
+++ b/arch/arm/boot/dts/ls1021a-twr.dts
@@ -187,7 +187,7 @@
 		reg = <0x0a>;
 		VDDA-supply = <&reg_3p3v>;
 		VDDIO-supply = <&reg_3p3v>;
-		clocks = <&sys_mclk 1>;
+		clocks = <&sys_mclk>;
 	};
 };
 
diff --git a/arch/arm/boot/dts/omap2420-n8x0-common.dtsi b/arch/arm/boot/dts/omap2420-n8x0-common.dtsi
index 1df3ace..63b0b49 100644
--- a/arch/arm/boot/dts/omap2420-n8x0-common.dtsi
+++ b/arch/arm/boot/dts/omap2420-n8x0-common.dtsi
@@ -52,6 +52,7 @@
 	onenand@0,0 {
 		#address-cells = <1>;
 		#size-cells = <1>;
+		compatible = "ti,omap2-onenand";
 		reg = <0 0 0x20000>;	/* CS0, offset 0, IO size 128K */
 
 		gpmc,sync-read;
diff --git a/arch/arm/boot/dts/omap3-igep.dtsi b/arch/arm/boot/dts/omap3-igep.dtsi
index 4ad7d55..f33cc80 100644
--- a/arch/arm/boot/dts/omap3-igep.dtsi
+++ b/arch/arm/boot/dts/omap3-igep.dtsi
@@ -147,32 +147,32 @@
 		gpmc,sync-read;
 		gpmc,sync-write;
 		gpmc,burst-length = <16>;
-		gpmc,burst-read;
 		gpmc,burst-wrap;
+		gpmc,burst-read;
 		gpmc,burst-write;
 		gpmc,device-width = <2>; /* GPMC_DEVWIDTH_16BIT */
 		gpmc,mux-add-data = <2>; /* GPMC_MUX_AD */
 		gpmc,cs-on-ns = <0>;
-		gpmc,cs-rd-off-ns = <87>;
-		gpmc,cs-wr-off-ns = <87>;
+		gpmc,cs-rd-off-ns = <96>;
+		gpmc,cs-wr-off-ns = <96>;
 		gpmc,adv-on-ns = <0>;
-		gpmc,adv-rd-off-ns = <10>;
-		gpmc,adv-wr-off-ns = <10>;
-		gpmc,oe-on-ns = <15>;
-		gpmc,oe-off-ns = <87>;
+		gpmc,adv-rd-off-ns = <12>;
+		gpmc,adv-wr-off-ns = <12>;
+		gpmc,oe-on-ns = <18>;
+		gpmc,oe-off-ns = <96>;
 		gpmc,we-on-ns = <0>;
-		gpmc,we-off-ns = <87>;
-		gpmc,rd-cycle-ns = <112>;
-		gpmc,wr-cycle-ns = <112>;
-		gpmc,access-ns = <81>;
-		gpmc,page-burst-access-ns = <15>;
+		gpmc,we-off-ns = <96>;
+		gpmc,rd-cycle-ns = <114>;
+		gpmc,wr-cycle-ns = <114>;
+		gpmc,access-ns = <90>;
+		gpmc,page-burst-access-ns = <12>;
 		gpmc,bus-turnaround-ns = <0>;
 		gpmc,cycle2cycle-delay-ns = <0>;
 		gpmc,wait-monitoring-ns = <0>;
-		gpmc,clk-activation-ns = <5>;
+		gpmc,clk-activation-ns = <6>;
 		gpmc,wr-data-mux-bus-ns = <30>;
-		gpmc,wr-access-ns = <81>;
-		gpmc,sync-clk-ps = <15000>;
+		gpmc,wr-access-ns = <90>;
+		gpmc,sync-clk-ps = <12000>;
 
 		#address-cells = <1>;
 		#size-cells = <1>;
diff --git a/arch/arm/boot/dts/omap3-n900.dts b/arch/arm/boot/dts/omap3-n900.dts
index 669c51c..e7c7b8e 100644
--- a/arch/arm/boot/dts/omap3-n900.dts
+++ b/arch/arm/boot/dts/omap3-n900.dts
@@ -838,6 +838,7 @@
 	onenand@0,0 {
 		#address-cells = <1>;
 		#size-cells = <1>;
+		compatible = "ti,omap2-onenand";
 		reg = <0 0 0x20000>;	/* CS0, offset 0, IO size 128K */
 
 		gpmc,sync-read;
diff --git a/arch/arm/boot/dts/omap3-n950-n9.dtsi b/arch/arm/boot/dts/omap3-n950-n9.dtsi
index 12fbb3d..0d9b853 100644
--- a/arch/arm/boot/dts/omap3-n950-n9.dtsi
+++ b/arch/arm/boot/dts/omap3-n950-n9.dtsi
@@ -367,6 +367,7 @@
 	onenand@0,0 {
 		#address-cells = <1>;
 		#size-cells = <1>;
+		compatible = "ti,omap2-onenand";
 		reg = <0 0 0x20000>;	/* CS0, offset 0, IO size 128K */
 
 		gpmc,sync-read;
diff --git a/arch/arm/boot/dts/omap3430-sdp.dts b/arch/arm/boot/dts/omap3430-sdp.dts
index 908951e..d652708 100644
--- a/arch/arm/boot/dts/omap3430-sdp.dts
+++ b/arch/arm/boot/dts/omap3430-sdp.dts
@@ -154,6 +154,7 @@
 		linux,mtd-name= "samsung,kfm2g16q2m-deb8";
 		#address-cells = <1>;
 		#size-cells = <1>;
+		compatible = "ti,omap2-onenand";
 		reg = <2 0 0x20000>;	/* CS2, offset 0, IO size 4 */
 
 		gpmc,device-width = <2>;
diff --git a/arch/arm/boot/dts/rk3066a-marsboard.dts b/arch/arm/boot/dts/rk3066a-marsboard.dts
index c6d92c2..d23ee6d 100644
--- a/arch/arm/boot/dts/rk3066a-marsboard.dts
+++ b/arch/arm/boot/dts/rk3066a-marsboard.dts
@@ -83,6 +83,10 @@
 	};
 };
 
+&cpu0 {
+	cpu0-supply = <&vdd_arm>;
+};
+
 &i2c1 {
 	status = "okay";
 	clock-frequency = <400000>;
diff --git a/arch/arm/boot/dts/rk3288.dtsi b/arch/arm/boot/dts/rk3288.dtsi
index cd24894..6102e4e 100644
--- a/arch/arm/boot/dts/rk3288.dtsi
+++ b/arch/arm/boot/dts/rk3288.dtsi
@@ -956,7 +956,7 @@
 	iep_mmu: iommu@ff900800 {
 		compatible = "rockchip,iommu";
 		reg = <0x0 0xff900800 0x0 0x40>;
-		interrupts = <GIC_SPI 17 IRQ_TYPE_LEVEL_HIGH 0>;
+		interrupts = <GIC_SPI 17 IRQ_TYPE_LEVEL_HIGH>;
 		interrupt-names = "iep_mmu";
 		#iommu-cells = <0>;
 		status = "disabled";
diff --git a/arch/arm/boot/dts/sun4i-a10.dtsi b/arch/arm/boot/dts/sun4i-a10.dtsi
index b91300d..4f2f2eea 100644
--- a/arch/arm/boot/dts/sun4i-a10.dtsi
+++ b/arch/arm/boot/dts/sun4i-a10.dtsi
@@ -502,8 +502,8 @@
 			reg = <0x01c16000 0x1000>;
 			interrupts = <58>;
 			clocks = <&ccu CLK_AHB_HDMI0>, <&ccu CLK_HDMI>,
-				 <&ccu 9>,
-				 <&ccu 18>;
+				 <&ccu CLK_PLL_VIDEO0_2X>,
+				 <&ccu CLK_PLL_VIDEO1_2X>;
 			clock-names = "ahb", "mod", "pll-0", "pll-1";
 			dmas = <&dma SUN4I_DMA_NORMAL 16>,
 			       <&dma SUN4I_DMA_NORMAL 16>,
@@ -1104,7 +1104,7 @@
 
 					be1_out_tcon0: endpoint@0 {
 						reg = <0>;
-						remote-endpoint = <&tcon1_in_be0>;
+						remote-endpoint = <&tcon0_in_be1>;
 					};
 
 					be1_out_tcon1: endpoint@1 {
diff --git a/arch/arm/boot/dts/sun5i-a10s.dtsi b/arch/arm/boot/dts/sun5i-a10s.dtsi
index 6ae4d95..316cb8b 100644
--- a/arch/arm/boot/dts/sun5i-a10s.dtsi
+++ b/arch/arm/boot/dts/sun5i-a10s.dtsi
@@ -82,8 +82,8 @@
 			reg = <0x01c16000 0x1000>;
 			interrupts = <58>;
 			clocks = <&ccu CLK_AHB_HDMI>, <&ccu CLK_HDMI>,
-				 <&ccu 9>,
-				 <&ccu 16>;
+				 <&ccu CLK_PLL_VIDEO0_2X>,
+				 <&ccu CLK_PLL_VIDEO1_2X>;
 			clock-names = "ahb", "mod", "pll-0", "pll-1";
 			dmas = <&dma SUN4I_DMA_NORMAL 16>,
 			       <&dma SUN4I_DMA_NORMAL 16>,
diff --git a/arch/arm/boot/dts/sun6i-a31.dtsi b/arch/arm/boot/dts/sun6i-a31.dtsi
index 8bfa12b..72d3fe4 100644
--- a/arch/arm/boot/dts/sun6i-a31.dtsi
+++ b/arch/arm/boot/dts/sun6i-a31.dtsi
@@ -429,8 +429,8 @@
 			interrupts = <GIC_SPI 88 IRQ_TYPE_LEVEL_HIGH>;
 			clocks = <&ccu CLK_AHB1_HDMI>, <&ccu CLK_HDMI>,
 				 <&ccu CLK_HDMI_DDC>,
-				 <&ccu 7>,
-				 <&ccu 13>;
+				 <&ccu CLK_PLL_VIDEO0_2X>,
+				 <&ccu CLK_PLL_VIDEO1_2X>;
 			clock-names = "ahb", "mod", "ddc", "pll-0", "pll-1";
 			resets = <&ccu RST_AHB1_HDMI>;
 			reset-names = "ahb";
diff --git a/arch/arm/boot/dts/sun7i-a20.dtsi b/arch/arm/boot/dts/sun7i-a20.dtsi
index 68dfa82..bd0cd32 100644
--- a/arch/arm/boot/dts/sun7i-a20.dtsi
+++ b/arch/arm/boot/dts/sun7i-a20.dtsi
@@ -581,8 +581,8 @@
 			reg = <0x01c16000 0x1000>;
 			interrupts = <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH>;
 			clocks = <&ccu CLK_AHB_HDMI0>, <&ccu CLK_HDMI>,
-				 <&ccu 9>,
-				 <&ccu 18>;
+				 <&ccu CLK_PLL_VIDEO0_2X>,
+				 <&ccu CLK_PLL_VIDEO1_2X>;
 			clock-names = "ahb", "mod", "pll-0", "pll-1";
 			dmas = <&dma SUN4I_DMA_NORMAL 16>,
 			       <&dma SUN4I_DMA_NORMAL 16>,
@@ -1354,7 +1354,7 @@
 
 					be1_out_tcon0: endpoint@0 {
 						reg = <0>;
-						remote-endpoint = <&tcon1_in_be0>;
+						remote-endpoint = <&tcon0_in_be1>;
 					};
 
 					be1_out_tcon1: endpoint@1 {
diff --git a/arch/arm/boot/dts/sun8i-a83t-tbs-a711.dts b/arch/arm/boot/dts/sun8i-a83t-tbs-a711.dts
index 9871553..a021ee6 100644
--- a/arch/arm/boot/dts/sun8i-a83t-tbs-a711.dts
+++ b/arch/arm/boot/dts/sun8i-a83t-tbs-a711.dts
@@ -146,6 +146,7 @@
 	status = "okay";
 
 	axp81x: pmic@3a3 {
+		compatible = "x-powers,axp813";
 		reg = <0x3a3>;
 		interrupt-parent = <&r_intc>;
 		interrupts = <0 IRQ_TYPE_LEVEL_LOW>;
diff --git a/arch/arm/boot/dts/tango4-common.dtsi b/arch/arm/boot/dts/tango4-common.dtsi
index 0ec1b0a3..ff72a8e 100644
--- a/arch/arm/boot/dts/tango4-common.dtsi
+++ b/arch/arm/boot/dts/tango4-common.dtsi
@@ -156,7 +156,6 @@
 			reg = <0x6e000 0x400>;
 			ranges = <0 0x6e000 0x400>;
 			interrupt-parent = <&gic>;
-			interrupt-controller;
 			#address-cells = <1>;
 			#size-cells = <1>;
 
diff --git a/arch/arm/configs/aspeed_g4_defconfig b/arch/arm/configs/aspeed_g4_defconfig
index d23b9d5..95946de 100644
--- a/arch/arm/configs/aspeed_g4_defconfig
+++ b/arch/arm/configs/aspeed_g4_defconfig
@@ -1,7 +1,6 @@
 CONFIG_KERNEL_XZ=y
 # CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ_IDLE=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_LOG_BUF_SHIFT=14
diff --git a/arch/arm/configs/aspeed_g5_defconfig b/arch/arm/configs/aspeed_g5_defconfig
index c0ad7b8..8c7ea03 100644
--- a/arch/arm/configs/aspeed_g5_defconfig
+++ b/arch/arm/configs/aspeed_g5_defconfig
@@ -1,7 +1,6 @@
 CONFIG_KERNEL_XZ=y
 # CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ_IDLE=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_LOG_BUF_SHIFT=14
diff --git a/arch/arm/configs/hisi_defconfig b/arch/arm/configs/hisi_defconfig
index b2e340b..74d611e 100644
--- a/arch/arm/configs/hisi_defconfig
+++ b/arch/arm/configs/hisi_defconfig
@@ -1,4 +1,3 @@
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_BLK_DEV_INITRD=y
diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig
index 61509c4..b659244 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -1,6 +1,5 @@
 CONFIG_SYSVIPC=y
 CONFIG_FHANDLE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_CGROUPS=y
diff --git a/arch/arm/configs/mvebu_v7_defconfig b/arch/arm/configs/mvebu_v7_defconfig
index 6955370..ddaeda4 100644
--- a/arch/arm/configs/mvebu_v7_defconfig
+++ b/arch/arm/configs/mvebu_v7_defconfig
@@ -1,6 +1,5 @@
 CONFIG_SYSVIPC=y
 CONFIG_FHANDLE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_LOG_BUF_SHIFT=14
 CONFIG_BLK_DEV_INITRD=y
@@ -57,7 +56,7 @@
 CONFIG_MTD_PHYSMAP_OF=y
 CONFIG_MTD_M25P80=y
 CONFIG_MTD_NAND=y
-CONFIG_MTD_NAND_PXA3xx=y
+CONFIG_MTD_NAND_MARVELL=y
 CONFIG_MTD_SPI_NOR=y
 CONFIG_SRAM=y
 CONFIG_MTD_UBI=y
diff --git a/arch/arm/configs/pxa_defconfig b/arch/arm/configs/pxa_defconfig
index 830e817..837d0c9 100644
--- a/arch/arm/configs/pxa_defconfig
+++ b/arch/arm/configs/pxa_defconfig
@@ -1,7 +1,6 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_FHANDLE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_BSD_PROCESS_ACCT=y
diff --git a/arch/arm/configs/sama5_defconfig b/arch/arm/configs/sama5_defconfig
index 6529cb4..2080025 100644
--- a/arch/arm/configs/sama5_defconfig
+++ b/arch/arm/configs/sama5_defconfig
@@ -2,7 +2,6 @@
 # CONFIG_SWAP is not set
 CONFIG_SYSVIPC=y
 CONFIG_FHANDLE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ_IDLE=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_LOG_BUF_SHIFT=14
diff --git a/arch/arm/configs/sunxi_defconfig b/arch/arm/configs/sunxi_defconfig
index 5caaf97..df433ab 100644
--- a/arch/arm/configs/sunxi_defconfig
+++ b/arch/arm/configs/sunxi_defconfig
@@ -10,6 +10,7 @@
 CONFIG_NR_CPUS=8
 CONFIG_AEABI=y
 CONFIG_HIGHMEM=y
+CONFIG_CMA=y
 CONFIG_ARM_APPENDED_DTB=y
 CONFIG_ARM_ATAG_DTB_COMPAT=y
 CONFIG_CPU_FREQ=y
@@ -33,6 +34,7 @@
 # CONFIG_WIRELESS is not set
 CONFIG_DEVTMPFS=y
 CONFIG_DEVTMPFS_MOUNT=y
+CONFIG_DMA_CMA=y
 CONFIG_BLK_DEV_SD=y
 CONFIG_ATA=y
 CONFIG_AHCI_SUNXI=y
diff --git a/arch/arm/configs/tegra_defconfig b/arch/arm/configs/tegra_defconfig
index 6678f29..c819be0 100644
--- a/arch/arm/configs/tegra_defconfig
+++ b/arch/arm/configs/tegra_defconfig
@@ -1,5 +1,4 @@
 CONFIG_SYSVIPC=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_IKCONFIG=y
diff --git a/arch/arm/configs/vt8500_v6_v7_defconfig b/arch/arm/configs/vt8500_v6_v7_defconfig
index 1bfaa7b..9b85326 100644
--- a/arch/arm/configs/vt8500_v6_v7_defconfig
+++ b/arch/arm/configs/vt8500_v6_v7_defconfig
@@ -1,4 +1,3 @@
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_BLK_DEV_INITRD=y
diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index a9f7d3f..acbf9ec 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -238,6 +238,9 @@ int kvm_arm_coproc_set_reg(struct kvm_vcpu *vcpu, const struct kvm_one_reg *);
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
 
+static inline void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
+				     int exception_index) {}
+
 static inline void __cpu_init_hyp_mode(phys_addr_t pgd_ptr,
 				       unsigned long hyp_stack_ptr,
 				       unsigned long vector_ptr)
@@ -301,4 +304,6 @@ int kvm_arm_vcpu_arch_has_attr(struct kvm_vcpu *vcpu,
 /* All host FP/SIMD state is restored on guest exit, so nothing to save: */
 static inline void kvm_fpsimd_flush_cpu_state(void) {}
 
+static inline void kvm_arm_vhe_guest_enter(void) {}
+static inline void kvm_arm_vhe_guest_exit(void) {}
 #endif /* __ARM_KVM_HOST_H__ */
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index fa6f217..a2d176a 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -211,6 +211,11 @@ static inline bool __kvm_cpu_uses_extended_idmap(void)
 	return false;
 }
 
+static inline unsigned long __kvm_idmap_ptrs_per_pgd(void)
+{
+	return PTRS_PER_PGD;
+}
+
 static inline void __kvm_extend_hypmap(pgd_t *boot_hyp_pgd,
 				       pgd_t *hyp_pgd,
 				       pgd_t *merged_hyp_pgd,
@@ -221,6 +226,18 @@ static inline unsigned int kvm_get_vmid_bits(void)
 	return 8;
 }
 
+static inline void *kvm_get_hyp_vector(void)
+{
+	return kvm_ksym_ref(__kvm_hyp_vector);
+}
+
+static inline int kvm_map_vectors(void)
+{
+	return 0;
+}
+
+#define kvm_phys_to_vttbr(addr)		(addr)
+
 #endif	/* !__ASSEMBLY__ */
 
 #endif /* __ARM_KVM_MMU_H__ */
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 776757d..e71cc35 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -75,9 +75,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,					\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /*
  * how to get the current stack pointer in C
  */
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index 5cf0488..3e26c6f 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -793,7 +793,6 @@ void abort(void)
 	/* if that doesn't kill us, halt */
 	panic("Oops failed to kill thread");
 }
-EXPORT_SYMBOL(abort);
 
 void __init trap_init(void)
 {
diff --git a/arch/arm/mach-davinci/dm365.c b/arch/arm/mach-davinci/dm365.c
index 8be04ec..5ace938 100644
--- a/arch/arm/mach-davinci/dm365.c
+++ b/arch/arm/mach-davinci/dm365.c
@@ -868,10 +868,10 @@ static const struct dma_slave_map dm365_edma_map[] = {
 	{ "spi_davinci.0", "rx", EDMA_FILTER_PARAM(0, 17) },
 	{ "spi_davinci.3", "tx", EDMA_FILTER_PARAM(0, 18) },
 	{ "spi_davinci.3", "rx", EDMA_FILTER_PARAM(0, 19) },
-	{ "dm6441-mmc.0", "rx", EDMA_FILTER_PARAM(0, 26) },
-	{ "dm6441-mmc.0", "tx", EDMA_FILTER_PARAM(0, 27) },
-	{ "dm6441-mmc.1", "rx", EDMA_FILTER_PARAM(0, 30) },
-	{ "dm6441-mmc.1", "tx", EDMA_FILTER_PARAM(0, 31) },
+	{ "da830-mmc.0", "rx", EDMA_FILTER_PARAM(0, 26) },
+	{ "da830-mmc.0", "tx", EDMA_FILTER_PARAM(0, 27) },
+	{ "da830-mmc.1", "rx", EDMA_FILTER_PARAM(0, 30) },
+	{ "da830-mmc.1", "tx", EDMA_FILTER_PARAM(0, 31) },
 };
 
 static struct edma_soc_info dm365_edma_pdata = {
@@ -925,12 +925,14 @@ static struct resource edma_resources[] = {
 	/* not using TC*_ERR */
 };
 
-static struct platform_device dm365_edma_device = {
-	.name			= "edma",
-	.id			= 0,
-	.dev.platform_data	= &dm365_edma_pdata,
-	.num_resources		= ARRAY_SIZE(edma_resources),
-	.resource		= edma_resources,
+static const struct platform_device_info dm365_edma_device __initconst = {
+	.name		= "edma",
+	.id		= 0,
+	.dma_mask	= DMA_BIT_MASK(32),
+	.res		= edma_resources,
+	.num_res	= ARRAY_SIZE(edma_resources),
+	.data		= &dm365_edma_pdata,
+	.size_data	= sizeof(dm365_edma_pdata),
 };
 
 static struct resource dm365_asp_resources[] = {
@@ -1428,13 +1430,18 @@ int __init dm365_init_video(struct vpfe_config *vpfe_cfg,
 
 static int __init dm365_init_devices(void)
 {
+	struct platform_device *edma_pdev;
 	int ret = 0;
 
 	if (!cpu_is_davinci_dm365())
 		return 0;
 
 	davinci_cfg_reg(DM365_INT_EDMA_CC);
-	platform_device_register(&dm365_edma_device);
+	edma_pdev = platform_device_register_full(&dm365_edma_device);
+	if (IS_ERR(edma_pdev)) {
+		pr_warn("%s: Failed to register eDMA\n", __func__);
+		return PTR_ERR(edma_pdev);
+	}
 
 	platform_device_register(&dm365_mdio_device);
 	platform_device_register(&dm365_emac_device);
diff --git a/arch/arm/mach-omap2/Makefile b/arch/arm/mach-omap2/Makefile
index 2f722a8..c15bbca 100644
--- a/arch/arm/mach-omap2/Makefile
+++ b/arch/arm/mach-omap2/Makefile
@@ -232,6 +232,3 @@
 obj-y					+= omap_phy_internal.o
 
 obj-$(CONFIG_MACH_OMAP2_TUSB6010)	+= usb-tusb6010.o
-
-onenand-$(CONFIG_MTD_ONENAND_OMAP2)	:= gpmc-onenand.o
-obj-y					+= $(onenand-m) $(onenand-y)
diff --git a/arch/arm/mach-omap2/gpmc-onenand.c b/arch/arm/mach-omap2/gpmc-onenand.c
deleted file mode 100644
index 2944af8..0000000
--- a/arch/arm/mach-omap2/gpmc-onenand.c
+++ /dev/null
@@ -1,409 +0,0 @@
-/*
- * linux/arch/arm/mach-omap2/gpmc-onenand.c
- *
- * Copyright (C) 2006 - 2009 Nokia Corporation
- * Contacts:	Juha Yrjola
- *		Tony Lindgren
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#include <linux/string.h>
-#include <linux/kernel.h>
-#include <linux/platform_device.h>
-#include <linux/mtd/onenand_regs.h>
-#include <linux/io.h>
-#include <linux/omap-gpmc.h>
-#include <linux/platform_data/mtd-onenand-omap2.h>
-#include <linux/err.h>
-
-#include <asm/mach/flash.h>
-
-#include "soc.h"
-
-#define	ONENAND_IO_SIZE	SZ_128K
-
-#define	ONENAND_FLAG_SYNCREAD	(1 << 0)
-#define	ONENAND_FLAG_SYNCWRITE	(1 << 1)
-#define	ONENAND_FLAG_HF		(1 << 2)
-#define	ONENAND_FLAG_VHF	(1 << 3)
-
-static unsigned onenand_flags;
-static unsigned latency;
-
-static struct omap_onenand_platform_data *gpmc_onenand_data;
-
-static struct resource gpmc_onenand_resource = {
-	.flags		= IORESOURCE_MEM,
-};
-
-static struct platform_device gpmc_onenand_device = {
-	.name		= "omap2-onenand",
-	.id		= -1,
-	.num_resources	= 1,
-	.resource	= &gpmc_onenand_resource,
-};
-
-static struct gpmc_settings onenand_async = {
-	.device_width	= GPMC_DEVWIDTH_16BIT,
-	.mux_add_data	= GPMC_MUX_AD,
-};
-
-static struct gpmc_settings onenand_sync = {
-	.burst_read	= true,
-	.burst_wrap	= true,
-	.burst_len	= GPMC_BURST_16,
-	.device_width	= GPMC_DEVWIDTH_16BIT,
-	.mux_add_data	= GPMC_MUX_AD,
-	.wait_pin	= 0,
-};
-
-static void omap2_onenand_calc_async_timings(struct gpmc_timings *t)
-{
-	struct gpmc_device_timings dev_t;
-	const int t_cer = 15;
-	const int t_avdp = 12;
-	const int t_aavdh = 7;
-	const int t_ce = 76;
-	const int t_aa = 76;
-	const int t_oe = 20;
-	const int t_cez = 20; /* max of t_cez, t_oez */
-	const int t_wpl = 40;
-	const int t_wph = 30;
-
-	memset(&dev_t, 0, sizeof(dev_t));
-
-	dev_t.t_avdp_r = max_t(int, t_avdp, t_cer) * 1000;
-	dev_t.t_avdp_w = dev_t.t_avdp_r;
-	dev_t.t_aavdh = t_aavdh * 1000;
-	dev_t.t_aa = t_aa * 1000;
-	dev_t.t_ce = t_ce * 1000;
-	dev_t.t_oe = t_oe * 1000;
-	dev_t.t_cez_r = t_cez * 1000;
-	dev_t.t_cez_w = dev_t.t_cez_r;
-	dev_t.t_wpl = t_wpl * 1000;
-	dev_t.t_wph = t_wph * 1000;
-
-	gpmc_calc_timings(t, &onenand_async, &dev_t);
-}
-
-static void omap2_onenand_set_async_mode(void __iomem *onenand_base)
-{
-	u32 reg;
-
-	/* Ensure sync read and sync write are disabled */
-	reg = readw(onenand_base + ONENAND_REG_SYS_CFG1);
-	reg &= ~ONENAND_SYS_CFG1_SYNC_READ & ~ONENAND_SYS_CFG1_SYNC_WRITE;
-	writew(reg, onenand_base + ONENAND_REG_SYS_CFG1);
-}
-
-static void set_onenand_cfg(void __iomem *onenand_base)
-{
-	u32 reg = ONENAND_SYS_CFG1_RDY | ONENAND_SYS_CFG1_INT;
-
-	reg |=	(latency << ONENAND_SYS_CFG1_BRL_SHIFT) |
-		ONENAND_SYS_CFG1_BL_16;
-	if (onenand_flags & ONENAND_FLAG_SYNCREAD)
-		reg |= ONENAND_SYS_CFG1_SYNC_READ;
-	else
-		reg &= ~ONENAND_SYS_CFG1_SYNC_READ;
-	if (onenand_flags & ONENAND_FLAG_SYNCWRITE)
-		reg |= ONENAND_SYS_CFG1_SYNC_WRITE;
-	else
-		reg &= ~ONENAND_SYS_CFG1_SYNC_WRITE;
-	if (onenand_flags & ONENAND_FLAG_HF)
-		reg |= ONENAND_SYS_CFG1_HF;
-	else
-		reg &= ~ONENAND_SYS_CFG1_HF;
-	if (onenand_flags & ONENAND_FLAG_VHF)
-		reg |= ONENAND_SYS_CFG1_VHF;
-	else
-		reg &= ~ONENAND_SYS_CFG1_VHF;
-
-	writew(reg, onenand_base + ONENAND_REG_SYS_CFG1);
-}
-
-static int omap2_onenand_get_freq(struct omap_onenand_platform_data *cfg,
-				  void __iomem *onenand_base)
-{
-	u16 ver = readw(onenand_base + ONENAND_REG_VERSION_ID);
-	int freq;
-
-	switch ((ver >> 4) & 0xf) {
-	case 0:
-		freq = 40;
-		break;
-	case 1:
-		freq = 54;
-		break;
-	case 2:
-		freq = 66;
-		break;
-	case 3:
-		freq = 83;
-		break;
-	case 4:
-		freq = 104;
-		break;
-	default:
-		pr_err("onenand rate not detected, bad GPMC async timings?\n");
-		freq = 0;
-	}
-
-	return freq;
-}
-
-static void omap2_onenand_calc_sync_timings(struct gpmc_timings *t,
-					    unsigned int flags,
-					    int freq)
-{
-	struct gpmc_device_timings dev_t;
-	const int t_cer  = 15;
-	const int t_avdp = 12;
-	const int t_cez  = 20; /* max of t_cez, t_oez */
-	const int t_wpl  = 40;
-	const int t_wph  = 30;
-	int min_gpmc_clk_period, t_ces, t_avds, t_avdh, t_ach, t_aavdh, t_rdyo;
-	int div, gpmc_clk_ns;
-
-	if (flags & ONENAND_SYNC_READ)
-		onenand_flags = ONENAND_FLAG_SYNCREAD;
-	else if (flags & ONENAND_SYNC_READWRITE)
-		onenand_flags = ONENAND_FLAG_SYNCREAD | ONENAND_FLAG_SYNCWRITE;
-
-	switch (freq) {
-	case 104:
-		min_gpmc_clk_period = 9600; /* 104 MHz */
-		t_ces   = 3;
-		t_avds  = 4;
-		t_avdh  = 2;
-		t_ach   = 3;
-		t_aavdh = 6;
-		t_rdyo  = 6;
-		break;
-	case 83:
-		min_gpmc_clk_period = 12000; /* 83 MHz */
-		t_ces   = 5;
-		t_avds  = 4;
-		t_avdh  = 2;
-		t_ach   = 6;
-		t_aavdh = 6;
-		t_rdyo  = 9;
-		break;
-	case 66:
-		min_gpmc_clk_period = 15000; /* 66 MHz */
-		t_ces   = 6;
-		t_avds  = 5;
-		t_avdh  = 2;
-		t_ach   = 6;
-		t_aavdh = 6;
-		t_rdyo  = 11;
-		break;
-	default:
-		min_gpmc_clk_period = 18500; /* 54 MHz */
-		t_ces   = 7;
-		t_avds  = 7;
-		t_avdh  = 7;
-		t_ach   = 9;
-		t_aavdh = 7;
-		t_rdyo  = 15;
-		onenand_flags &= ~ONENAND_FLAG_SYNCWRITE;
-		break;
-	}
-
-	div = gpmc_calc_divider(min_gpmc_clk_period);
-	gpmc_clk_ns = gpmc_ticks_to_ns(div);
-	if (gpmc_clk_ns < 15) /* >66MHz */
-		onenand_flags |= ONENAND_FLAG_HF;
-	else
-		onenand_flags &= ~ONENAND_FLAG_HF;
-	if (gpmc_clk_ns < 12) /* >83MHz */
-		onenand_flags |= ONENAND_FLAG_VHF;
-	else
-		onenand_flags &= ~ONENAND_FLAG_VHF;
-	if (onenand_flags & ONENAND_FLAG_VHF)
-		latency = 8;
-	else if (onenand_flags & ONENAND_FLAG_HF)
-		latency = 6;
-	else if (gpmc_clk_ns >= 25) /* 40 MHz*/
-		latency = 3;
-	else
-		latency = 4;
-
-	/* Set synchronous read timings */
-	memset(&dev_t, 0, sizeof(dev_t));
-
-	if (onenand_flags & ONENAND_FLAG_SYNCREAD)
-		onenand_sync.sync_read = true;
-	if (onenand_flags & ONENAND_FLAG_SYNCWRITE) {
-		onenand_sync.sync_write = true;
-		onenand_sync.burst_write = true;
-	} else {
-		dev_t.t_avdp_w = max(t_avdp, t_cer) * 1000;
-		dev_t.t_wpl = t_wpl * 1000;
-		dev_t.t_wph = t_wph * 1000;
-		dev_t.t_aavdh = t_aavdh * 1000;
-	}
-	dev_t.ce_xdelay = true;
-	dev_t.avd_xdelay = true;
-	dev_t.oe_xdelay = true;
-	dev_t.we_xdelay = true;
-	dev_t.clk = min_gpmc_clk_period;
-	dev_t.t_bacc = dev_t.clk;
-	dev_t.t_ces = t_ces * 1000;
-	dev_t.t_avds = t_avds * 1000;
-	dev_t.t_avdh = t_avdh * 1000;
-	dev_t.t_ach = t_ach * 1000;
-	dev_t.cyc_iaa = (latency + 1);
-	dev_t.t_cez_r = t_cez * 1000;
-	dev_t.t_cez_w = dev_t.t_cez_r;
-	dev_t.cyc_aavdh_oe = 1;
-	dev_t.t_rdyo = t_rdyo * 1000 + min_gpmc_clk_period;
-
-	gpmc_calc_timings(t, &onenand_sync, &dev_t);
-}
-
-static int omap2_onenand_setup_async(void __iomem *onenand_base)
-{
-	struct gpmc_timings t;
-	int ret;
-
-	/*
-	 * Note that we need to keep sync_write set for the call to
-	 * omap2_onenand_set_async_mode() to work to detect the onenand
-	 * supported clock rate for the sync timings.
-	 */
-	if (gpmc_onenand_data->of_node) {
-		gpmc_read_settings_dt(gpmc_onenand_data->of_node,
-				      &onenand_async);
-		if (onenand_async.sync_read || onenand_async.sync_write) {
-			if (onenand_async.sync_write)
-				gpmc_onenand_data->flags |=
-					ONENAND_SYNC_READWRITE;
-			else
-				gpmc_onenand_data->flags |= ONENAND_SYNC_READ;
-			onenand_async.sync_read = false;
-		}
-	}
-
-	onenand_async.sync_write = true;
-	omap2_onenand_calc_async_timings(&t);
-
-	ret = gpmc_cs_program_settings(gpmc_onenand_data->cs, &onenand_async);
-	if (ret < 0)
-		return ret;
-
-	ret = gpmc_cs_set_timings(gpmc_onenand_data->cs, &t, &onenand_async);
-	if (ret < 0)
-		return ret;
-
-	omap2_onenand_set_async_mode(onenand_base);
-
-	return 0;
-}
-
-static int omap2_onenand_setup_sync(void __iomem *onenand_base, int *freq_ptr)
-{
-	int ret, freq = *freq_ptr;
-	struct gpmc_timings t;
-
-	if (!freq) {
-		/* Very first call freq is not known */
-		freq = omap2_onenand_get_freq(gpmc_onenand_data, onenand_base);
-		if (!freq)
-			return -ENODEV;
-		set_onenand_cfg(onenand_base);
-	}
-
-	if (gpmc_onenand_data->of_node) {
-		gpmc_read_settings_dt(gpmc_onenand_data->of_node,
-				      &onenand_sync);
-	} else {
-		/*
-		 * FIXME: Appears to be legacy code from initial ONENAND commit.
-		 * Unclear what boards this is for and if this can be removed.
-		 */
-		if (!cpu_is_omap34xx())
-			onenand_sync.wait_on_read = true;
-	}
-
-	omap2_onenand_calc_sync_timings(&t, gpmc_onenand_data->flags, freq);
-
-	ret = gpmc_cs_program_settings(gpmc_onenand_data->cs, &onenand_sync);
-	if (ret < 0)
-		return ret;
-
-	ret = gpmc_cs_set_timings(gpmc_onenand_data->cs, &t, &onenand_sync);
-	if (ret < 0)
-		return ret;
-
-	set_onenand_cfg(onenand_base);
-
-	*freq_ptr = freq;
-
-	return 0;
-}
-
-static int gpmc_onenand_setup(void __iomem *onenand_base, int *freq_ptr)
-{
-	struct device *dev = &gpmc_onenand_device.dev;
-	unsigned l = ONENAND_SYNC_READ | ONENAND_SYNC_READWRITE;
-	int ret;
-
-	ret = omap2_onenand_setup_async(onenand_base);
-	if (ret) {
-		dev_err(dev, "unable to set to async mode\n");
-		return ret;
-	}
-
-	if (!(gpmc_onenand_data->flags & l))
-		return 0;
-
-	ret = omap2_onenand_setup_sync(onenand_base, freq_ptr);
-	if (ret)
-		dev_err(dev, "unable to set to sync mode\n");
-	return ret;
-}
-
-int gpmc_onenand_init(struct omap_onenand_platform_data *_onenand_data)
-{
-	int err;
-	struct device *dev = &gpmc_onenand_device.dev;
-
-	gpmc_onenand_data = _onenand_data;
-	gpmc_onenand_data->onenand_setup = gpmc_onenand_setup;
-	gpmc_onenand_device.dev.platform_data = gpmc_onenand_data;
-
-	if (cpu_is_omap24xx() &&
-			(gpmc_onenand_data->flags & ONENAND_SYNC_READWRITE)) {
-		dev_warn(dev, "OneNAND using only SYNC_READ on 24xx\n");
-		gpmc_onenand_data->flags &= ~ONENAND_SYNC_READWRITE;
-		gpmc_onenand_data->flags |= ONENAND_SYNC_READ;
-	}
-
-	if (cpu_is_omap34xx())
-		gpmc_onenand_data->flags |= ONENAND_IN_OMAP34XX;
-	else
-		gpmc_onenand_data->flags &= ~ONENAND_IN_OMAP34XX;
-
-	err = gpmc_cs_request(gpmc_onenand_data->cs, ONENAND_IO_SIZE,
-				(unsigned long *)&gpmc_onenand_resource.start);
-	if (err < 0) {
-		dev_err(dev, "Cannot request GPMC CS %d, error %d\n",
-			gpmc_onenand_data->cs, err);
-		return err;
-	}
-
-	gpmc_onenand_resource.end = gpmc_onenand_resource.start +
-							ONENAND_IO_SIZE - 1;
-
-	err = platform_device_register(&gpmc_onenand_device);
-	if (err) {
-		dev_err(dev, "Unable to register OneNAND device\n");
-		gpmc_cs_free(gpmc_onenand_data->cs);
-	}
-
-	return err;
-}
diff --git a/arch/arm/net/bpf_jit_32.c b/arch/arm/net/bpf_jit_32.c
index c199990..323a4df 100644
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -27,14 +27,58 @@
 
 int bpf_jit_enable __read_mostly;
 
+/*
+ * eBPF prog stack layout:
+ *
+ *                         high
+ * original ARM_SP =>     +-----+
+ *                        |     | callee saved registers
+ *                        +-----+ <= (BPF_FP + SCRATCH_SIZE)
+ *                        | ... | eBPF JIT scratch space
+ * eBPF fp register =>    +-----+
+ *   (BPF_FP)             | ... | eBPF prog stack
+ *                        +-----+
+ *                        |RSVD | JIT scratchpad
+ * current ARM_SP =>      +-----+ <= (BPF_FP - STACK_SIZE + SCRATCH_SIZE)
+ *                        |     |
+ *                        | ... | Function call stack
+ *                        |     |
+ *                        +-----+
+ *                          low
+ *
+ * The callee saved registers depends on whether frame pointers are enabled.
+ * With frame pointers (to be compliant with the ABI):
+ *
+ *                                high
+ * original ARM_SP =>     +------------------+ \
+ *                        |        pc        | |
+ * current ARM_FP =>      +------------------+ } callee saved registers
+ *                        |r4-r8,r10,fp,ip,lr| |
+ *                        +------------------+ /
+ *                                low
+ *
+ * Without frame pointers:
+ *
+ *                                high
+ * original ARM_SP =>     +------------------+
+ *                        | r4-r8,r10,fp,lr  | callee saved registers
+ * current ARM_FP =>      +------------------+
+ *                                low
+ *
+ * When popping registers off the stack at the end of a BPF function, we
+ * reference them via the current ARM_FP register.
+ */
+#define CALLEE_MASK	(1 << ARM_R4 | 1 << ARM_R5 | 1 << ARM_R6 | \
+			 1 << ARM_R7 | 1 << ARM_R8 | 1 << ARM_R10 | \
+			 1 << ARM_FP)
+#define CALLEE_PUSH_MASK (CALLEE_MASK | 1 << ARM_LR)
+#define CALLEE_POP_MASK  (CALLEE_MASK | 1 << ARM_PC)
+
 #define STACK_OFFSET(k)	(k)
 #define TMP_REG_1	(MAX_BPF_JIT_REG + 0)	/* TEMP Register 1 */
 #define TMP_REG_2	(MAX_BPF_JIT_REG + 1)	/* TEMP Register 2 */
 #define TCALL_CNT	(MAX_BPF_JIT_REG + 2)	/* Tail Call Count */
 
-/* Flags used for JIT optimization */
-#define SEEN_CALL	(1 << 0)
-
 #define FLAG_IMM_OVERFLOW	(1 << 0)
 
 /*
@@ -95,7 +139,6 @@ static const u8 bpf2a32[][2] = {
  * idx			:	index of current last JITed instruction.
  * prologue_bytes	:	bytes used in prologue.
  * epilogue_offset	:	offset of epilogue starting.
- * seen			:	bit mask used for JIT optimization.
  * offsets		:	array of eBPF instruction offsets in
  *				JITed code.
  * target		:	final JITed code.
@@ -110,7 +153,6 @@ struct jit_ctx {
 	unsigned int idx;
 	unsigned int prologue_bytes;
 	unsigned int epilogue_offset;
-	u32 seen;
 	u32 flags;
 	u32 *offsets;
 	u32 *target;
@@ -179,8 +221,13 @@ static void jit_fill_hole(void *area, unsigned int size)
 		*ptr++ = __opcode_to_mem_arm(ARM_INST_UDF);
 }
 
-/* Stack must be multiples of 16 Bytes */
-#define STACK_ALIGN(sz) (((sz) + 3) & ~3)
+#if defined(CONFIG_AEABI) && (__LINUX_ARM_ARCH__ >= 5)
+/* EABI requires the stack to be aligned to 64-bit boundaries */
+#define STACK_ALIGNMENT	8
+#else
+/* Stack must be aligned to 32-bit boundaries */
+#define STACK_ALIGNMENT	4
+#endif
 
 /* Stack space for BPF_REG_2, BPF_REG_3, BPF_REG_4,
  * BPF_REG_5, BPF_REG_7, BPF_REG_8, BPF_REG_9,
@@ -194,7 +241,7 @@ static void jit_fill_hole(void *area, unsigned int size)
 	 + SCRATCH_SIZE + \
 	 + 4 /* extra for skb_copy_bits buffer */)
 
-#define STACK_SIZE STACK_ALIGN(_STACK_SIZE)
+#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
 
 /* Get the offset of eBPF REGISTERs stored on scratch space. */
 #define STACK_VAR(off) (STACK_SIZE-off-4)
@@ -285,16 +332,19 @@ static inline void emit_mov_i(const u8 rd, u32 val, struct jit_ctx *ctx)
 		emit_mov_i_no8m(rd, val, ctx);
 }
 
-static inline void emit_blx_r(u8 tgt_reg, struct jit_ctx *ctx)
+static void emit_bx_r(u8 tgt_reg, struct jit_ctx *ctx)
 {
-	ctx->seen |= SEEN_CALL;
-#if __LINUX_ARM_ARCH__ < 5
-	emit(ARM_MOV_R(ARM_LR, ARM_PC), ctx);
-
 	if (elf_hwcap & HWCAP_THUMB)
 		emit(ARM_BX(tgt_reg), ctx);
 	else
 		emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
+}
+
+static inline void emit_blx_r(u8 tgt_reg, struct jit_ctx *ctx)
+{
+#if __LINUX_ARM_ARCH__ < 5
+	emit(ARM_MOV_R(ARM_LR, ARM_PC), ctx);
+	emit_bx_r(tgt_reg, ctx);
 #else
 	emit(ARM_BLX_R(tgt_reg), ctx);
 #endif
@@ -354,7 +404,6 @@ static inline void emit_udivmod(u8 rd, u8 rm, u8 rn, struct jit_ctx *ctx, u8 op)
 	}
 
 	/* Call appropriate function */
-	ctx->seen |= SEEN_CALL;
 	emit_mov_i(ARM_IP, op == BPF_DIV ?
 		   (u32)jit_udiv32 : (u32)jit_mod32, ctx);
 	emit_blx_r(ARM_IP, ctx);
@@ -620,8 +669,6 @@ static inline void emit_a32_lsh_r64(const u8 dst[], const u8 src[], bool dstk,
 	/* Do LSH operation */
 	emit(ARM_SUB_I(ARM_IP, rt, 32), ctx);
 	emit(ARM_RSB_I(tmp2[0], rt, 32), ctx);
-	/* As we are using ARM_LR */
-	ctx->seen |= SEEN_CALL;
 	emit(ARM_MOV_SR(ARM_LR, rm, SRTYPE_ASL, rt), ctx);
 	emit(ARM_ORR_SR(ARM_LR, ARM_LR, rd, SRTYPE_ASL, ARM_IP), ctx);
 	emit(ARM_ORR_SR(ARM_IP, ARM_LR, rd, SRTYPE_LSR, tmp2[0]), ctx);
@@ -656,8 +703,6 @@ static inline void emit_a32_arsh_r64(const u8 dst[], const u8 src[], bool dstk,
 	/* Do the ARSH operation */
 	emit(ARM_RSB_I(ARM_IP, rt, 32), ctx);
 	emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
-	/* As we are using ARM_LR */
-	ctx->seen |= SEEN_CALL;
 	emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_LSR, rt), ctx);
 	emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_ASL, ARM_IP), ctx);
 	_emit(ARM_COND_MI, ARM_B(0), ctx);
@@ -692,8 +737,6 @@ static inline void emit_a32_lsr_r64(const u8 dst[], const u8 src[], bool dstk,
 	/* Do LSH operation */
 	emit(ARM_RSB_I(ARM_IP, rt, 32), ctx);
 	emit(ARM_SUBS_I(tmp2[0], rt, 32), ctx);
-	/* As we are using ARM_LR */
-	ctx->seen |= SEEN_CALL;
 	emit(ARM_MOV_SR(ARM_LR, rd, SRTYPE_LSR, rt), ctx);
 	emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_ASL, ARM_IP), ctx);
 	emit(ARM_ORR_SR(ARM_LR, ARM_LR, rm, SRTYPE_LSR, tmp2[0]), ctx);
@@ -828,8 +871,6 @@ static inline void emit_a32_mul_r64(const u8 dst[], const u8 src[], bool dstk,
 	/* Do Multiplication */
 	emit(ARM_MUL(ARM_IP, rd, rn), ctx);
 	emit(ARM_MUL(ARM_LR, rm, rt), ctx);
-	/* As we are using ARM_LR */
-	ctx->seen |= SEEN_CALL;
 	emit(ARM_ADD_R(ARM_LR, ARM_IP, ARM_LR), ctx);
 
 	emit(ARM_UMULL(ARM_IP, rm, rd, rt), ctx);
@@ -872,33 +913,53 @@ static inline void emit_str_r(const u8 dst, const u8 src, bool dstk,
 }
 
 /* dst = *(size*)(src + off) */
-static inline void emit_ldx_r(const u8 dst, const u8 src, bool dstk,
-			      const s32 off, struct jit_ctx *ctx, const u8 sz){
+static inline void emit_ldx_r(const u8 dst[], const u8 src, bool dstk,
+			      s32 off, struct jit_ctx *ctx, const u8 sz){
 	const u8 *tmp = bpf2a32[TMP_REG_1];
-	u8 rd = dstk ? tmp[1] : dst;
+	const u8 *rd = dstk ? tmp : dst;
 	u8 rm = src;
+	s32 off_max;
 
-	if (off) {
+	if (sz == BPF_H)
+		off_max = 0xff;
+	else
+		off_max = 0xfff;
+
+	if (off < 0 || off > off_max) {
 		emit_a32_mov_i(tmp[0], off, false, ctx);
 		emit(ARM_ADD_R(tmp[0], tmp[0], src), ctx);
 		rm = tmp[0];
+		off = 0;
+	} else if (rd[1] == rm) {
+		emit(ARM_MOV_R(tmp[0], rm), ctx);
+		rm = tmp[0];
 	}
 	switch (sz) {
-	case BPF_W:
-		/* Load a Word */
-		emit(ARM_LDR_I(rd, rm, 0), ctx);
+	case BPF_B:
+		/* Load a Byte */
+		emit(ARM_LDRB_I(rd[1], rm, off), ctx);
+		emit_a32_mov_i(dst[0], 0, dstk, ctx);
 		break;
 	case BPF_H:
 		/* Load a HalfWord */
-		emit(ARM_LDRH_I(rd, rm, 0), ctx);
+		emit(ARM_LDRH_I(rd[1], rm, off), ctx);
+		emit_a32_mov_i(dst[0], 0, dstk, ctx);
 		break;
-	case BPF_B:
-		/* Load a Byte */
-		emit(ARM_LDRB_I(rd, rm, 0), ctx);
+	case BPF_W:
+		/* Load a Word */
+		emit(ARM_LDR_I(rd[1], rm, off), ctx);
+		emit_a32_mov_i(dst[0], 0, dstk, ctx);
+		break;
+	case BPF_DW:
+		/* Load a Double Word */
+		emit(ARM_LDR_I(rd[1], rm, off), ctx);
+		emit(ARM_LDR_I(rd[0], rm, off + 4), ctx);
 		break;
 	}
 	if (dstk)
-		emit(ARM_STR_I(rd, ARM_SP, STACK_VAR(dst)), ctx);
+		emit(ARM_STR_I(rd[1], ARM_SP, STACK_VAR(dst[1])), ctx);
+	if (dstk && sz == BPF_DW)
+		emit(ARM_STR_I(rd[0], ARM_SP, STACK_VAR(dst[0])), ctx);
 }
 
 /* Arithmatic Operation */
@@ -906,7 +967,6 @@ static inline void emit_ar_r(const u8 rd, const u8 rt, const u8 rm,
 			     const u8 rn, struct jit_ctx *ctx, u8 op) {
 	switch (op) {
 	case BPF_JSET:
-		ctx->seen |= SEEN_CALL;
 		emit(ARM_AND_R(ARM_IP, rt, rn), ctx);
 		emit(ARM_AND_R(ARM_LR, rd, rm), ctx);
 		emit(ARM_ORRS_R(ARM_IP, ARM_LR, ARM_IP), ctx);
@@ -945,7 +1005,7 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
 	const u8 *tcc = bpf2a32[TCALL_CNT];
 	const int idx0 = ctx->idx;
 #define cur_offset (ctx->idx - idx0)
-#define jmp_offset (out_offset - (cur_offset))
+#define jmp_offset (out_offset - (cur_offset) - 2)
 	u32 off, lo, hi;
 
 	/* if (index >= array->map.max_entries)
@@ -956,7 +1016,7 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
 	emit_a32_mov_i(tmp[1], off, false, ctx);
 	emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r2[1])), ctx);
 	emit(ARM_LDR_R(tmp[1], tmp2[1], tmp[1]), ctx);
-	/* index (64 bit) */
+	/* index is 32-bit for arrays */
 	emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r3[1])), ctx);
 	/* index >= array->map.max_entries */
 	emit(ARM_CMP_R(tmp2[1], tmp[1]), ctx);
@@ -997,7 +1057,7 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
 	emit_a32_mov_i(tmp2[1], off, false, ctx);
 	emit(ARM_LDR_R(tmp[1], tmp[1], tmp2[1]), ctx);
 	emit(ARM_ADD_I(tmp[1], tmp[1], ctx->prologue_bytes), ctx);
-	emit(ARM_BX(tmp[1]), ctx);
+	emit_bx_r(tmp[1], ctx);
 
 	/* out: */
 	if (out_offset == -1)
@@ -1070,54 +1130,22 @@ static void build_prologue(struct jit_ctx *ctx)
 	const u8 r2 = bpf2a32[BPF_REG_1][1];
 	const u8 r3 = bpf2a32[BPF_REG_1][0];
 	const u8 r4 = bpf2a32[BPF_REG_6][1];
-	const u8 r5 = bpf2a32[BPF_REG_6][0];
-	const u8 r6 = bpf2a32[TMP_REG_1][1];
-	const u8 r7 = bpf2a32[TMP_REG_1][0];
-	const u8 r8 = bpf2a32[TMP_REG_2][1];
-	const u8 r10 = bpf2a32[TMP_REG_2][0];
 	const u8 fplo = bpf2a32[BPF_REG_FP][1];
 	const u8 fphi = bpf2a32[BPF_REG_FP][0];
-	const u8 sp = ARM_SP;
 	const u8 *tcc = bpf2a32[TCALL_CNT];
 
-	u16 reg_set = 0;
-
-	/*
-	 * eBPF prog stack layout
-	 *
-	 *                         high
-	 * original ARM_SP =>     +-----+ eBPF prologue
-	 *                        |FP/LR|
-	 * current ARM_FP =>      +-----+
-	 *                        | ... | callee saved registers
-	 * eBPF fp register =>    +-----+ <= (BPF_FP)
-	 *                        | ... | eBPF JIT scratch space
-	 *                        |     | eBPF prog stack
-	 *                        +-----+
-	 *			  |RSVD | JIT scratchpad
-	 * current A64_SP =>      +-----+ <= (BPF_FP - STACK_SIZE)
-	 *                        |     |
-	 *                        | ... | Function call stack
-	 *                        |     |
-	 *                        +-----+
-	 *                          low
-	 */
-
 	/* Save callee saved registers. */
-	reg_set |= (1<<r4) | (1<<r5) | (1<<r6) | (1<<r7) | (1<<r8) | (1<<r10);
 #ifdef CONFIG_FRAME_POINTER
-	reg_set |= (1<<ARM_FP) | (1<<ARM_IP) | (1<<ARM_LR) | (1<<ARM_PC);
-	emit(ARM_MOV_R(ARM_IP, sp), ctx);
+	u16 reg_set = CALLEE_PUSH_MASK | 1 << ARM_IP | 1 << ARM_PC;
+	emit(ARM_MOV_R(ARM_IP, ARM_SP), ctx);
 	emit(ARM_PUSH(reg_set), ctx);
 	emit(ARM_SUB_I(ARM_FP, ARM_IP, 4), ctx);
 #else
-	/* Check if call instruction exists in BPF body */
-	if (ctx->seen & SEEN_CALL)
-		reg_set |= (1<<ARM_LR);
-	emit(ARM_PUSH(reg_set), ctx);
+	emit(ARM_PUSH(CALLEE_PUSH_MASK), ctx);
+	emit(ARM_MOV_R(ARM_FP, ARM_SP), ctx);
 #endif
 	/* Save frame pointer for later */
-	emit(ARM_SUB_I(ARM_IP, sp, SCRATCH_SIZE), ctx);
+	emit(ARM_SUB_I(ARM_IP, ARM_SP, SCRATCH_SIZE), ctx);
 
 	ctx->stack_size = imm8m(STACK_SIZE);
 
@@ -1140,33 +1168,19 @@ static void build_prologue(struct jit_ctx *ctx)
 	/* end of prologue */
 }
 
+/* restore callee saved registers. */
 static void build_epilogue(struct jit_ctx *ctx)
 {
-	const u8 r4 = bpf2a32[BPF_REG_6][1];
-	const u8 r5 = bpf2a32[BPF_REG_6][0];
-	const u8 r6 = bpf2a32[TMP_REG_1][1];
-	const u8 r7 = bpf2a32[TMP_REG_1][0];
-	const u8 r8 = bpf2a32[TMP_REG_2][1];
-	const u8 r10 = bpf2a32[TMP_REG_2][0];
-	u16 reg_set = 0;
-
-	/* unwind function call stack */
-	emit(ARM_ADD_I(ARM_SP, ARM_SP, ctx->stack_size), ctx);
-
-	/* restore callee saved registers. */
-	reg_set |= (1<<r4) | (1<<r5) | (1<<r6) | (1<<r7) | (1<<r8) | (1<<r10);
 #ifdef CONFIG_FRAME_POINTER
-	/* the first instruction of the prologue was: mov ip, sp */
-	reg_set |= (1<<ARM_FP) | (1<<ARM_SP) | (1<<ARM_PC);
+	/* When using frame pointers, some additional registers need to
+	 * be loaded. */
+	u16 reg_set = CALLEE_POP_MASK | 1 << ARM_SP;
+	emit(ARM_SUB_I(ARM_SP, ARM_FP, hweight16(reg_set) * 4), ctx);
 	emit(ARM_LDM(ARM_SP, reg_set), ctx);
 #else
-	if (ctx->seen & SEEN_CALL)
-		reg_set |= (1<<ARM_PC);
 	/* Restore callee saved registers. */
-	emit(ARM_POP(reg_set), ctx);
-	/* Return back to the callee function */
-	if (!(ctx->seen & SEEN_CALL))
-		emit(ARM_BX(ARM_LR), ctx);
+	emit(ARM_MOV_R(ARM_SP, ARM_FP), ctx);
+	emit(ARM_POP(CALLEE_POP_MASK), ctx);
 #endif
 }
 
@@ -1394,8 +1408,6 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 			emit_rev32(rt, rt, ctx);
 			goto emit_bswap_uxt;
 		case 64:
-			/* Because of the usage of ARM_LR */
-			ctx->seen |= SEEN_CALL;
 			emit_rev32(ARM_LR, rt, ctx);
 			emit_rev32(rt, rd, ctx);
 			emit(ARM_MOV_R(rd, ARM_LR), ctx);
@@ -1448,22 +1460,7 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx)
 		rn = sstk ? tmp2[1] : src_lo;
 		if (sstk)
 			emit(ARM_LDR_I(rn, ARM_SP, STACK_VAR(src_lo)), ctx);
-		switch (BPF_SIZE(code)) {
-		case BPF_W:
-			/* Load a Word */
-		case BPF_H:
-			/* Load a Half-Word */
-		case BPF_B:
-			/* Load a Byte */
-			emit_ldx_r(dst_lo, rn, dstk, off, ctx, BPF_SIZE(code));
-			emit_a32_mov_i(dst_hi, 0, dstk, ctx);
-			break;
-		case BPF_DW:
-			/* Load a double word */
-			emit_ldx_r(dst_lo, rn, dstk, off, ctx, BPF_W);
-			emit_ldx_r(dst_hi, rn, dstk, off+4, ctx, BPF_W);
-			break;
-		}
+		emit_ldx_r(dst, rn, dstk, off, ctx, BPF_SIZE(code));
 		break;
 	/* R0 = ntohx(*(size *)(((struct sk_buff *)R6)->data + imm)) */
 	case BPF_LD | BPF_ABS | BPF_W:
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index c9a7e9e..b488076 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -522,20 +522,13 @@
 config QCOM_FALKOR_ERRATUM_1003
 	bool "Falkor E1003: Incorrect translation due to ASID change"
 	default y
-	select ARM64_PAN if ARM64_SW_TTBR0_PAN
 	help
 	  On Falkor v1, an incorrect ASID may be cached in the TLB when ASID
-	  and BADDR are changed together in TTBRx_EL1. The workaround for this
-	  issue is to use a reserved ASID in cpu_do_switch_mm() before
-	  switching to the new ASID. Saying Y here selects ARM64_PAN if
-	  ARM64_SW_TTBR0_PAN is selected. This is done because implementing and
-	  maintaining the E1003 workaround in the software PAN emulation code
-	  would be an unnecessary complication. The affected Falkor v1 CPU
-	  implements ARMv8.1 hardware PAN support and using hardware PAN
-	  support versus software PAN emulation is mutually exclusive at
-	  runtime.
-
-	  If unsure, say Y.
+	  and BADDR are changed together in TTBRx_EL1. Since we keep the ASID
+	  in TTBR1_EL1, this situation only occurs in the entry trampoline and
+	  then only for entries in the walk cache, since the leaf translation
+	  is unchanged. Work around the erratum by invalidating the walk cache
+	  entries for the trampoline before entering the kernel proper.
 
 config QCOM_FALKOR_ERRATUM_1009
 	bool "Falkor E1009: Prematurely complete a DSB after a TLBI"
@@ -656,6 +649,35 @@
 	default 47 if ARM64_VA_BITS_47
 	default 48 if ARM64_VA_BITS_48
 
+choice
+	prompt "Physical address space size"
+	default ARM64_PA_BITS_48
+	help
+	  Choose the maximum physical address range that the kernel will
+	  support.
+
+config ARM64_PA_BITS_48
+	bool "48-bit"
+
+config ARM64_PA_BITS_52
+	bool "52-bit (ARMv8.2)"
+	depends on ARM64_64K_PAGES
+	depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
+	help
+	  Enable support for a 52-bit physical address space, introduced as
+	  part of the ARMv8.2-LPA extension.
+
+	  With this enabled, the kernel will also continue to work on CPUs that
+	  do not support ARMv8.2-LPA, but with some added memory overhead (and
+	  minor performance overhead).
+
+endchoice
+
+config ARM64_PA_BITS
+	int
+	default 48 if ARM64_PA_BITS_48
+	default 52 if ARM64_PA_BITS_52
+
 config CPU_BIG_ENDIAN
        bool "Build big-endian kernel"
        help
@@ -850,6 +872,35 @@
 	  However for 4K, we choose a higher default value, 11 as opposed to 10, giving us
 	  4M allocations matching the default size used by generic code.
 
+config UNMAP_KERNEL_AT_EL0
+	bool "Unmap kernel when running in userspace (aka \"KAISER\")" if EXPERT
+	default y
+	help
+	  Speculation attacks against some high-performance processors can
+	  be used to bypass MMU permission checks and leak kernel data to
+	  userspace. This can be defended against by unmapping the kernel
+	  when running in userspace, mapping it back in on exception entry
+	  via a trampoline page in the vector table.
+
+	  If unsure, say Y.
+
+config HARDEN_BRANCH_PREDICTOR
+	bool "Harden the branch predictor against aliasing attacks" if EXPERT
+	default y
+	help
+	  Speculation attacks against some high-performance processors rely on
+	  being able to manipulate the branch predictor for a victim context by
+	  executing aliasing branches in the attacker context.  Such attacks
+	  can be partially mitigated against by clearing internal branch
+	  predictor state and limiting the prediction logic in some situations.
+
+	  This config option will take CPU-specific actions to harden the
+	  branch predictor against aliasing attacks and may rely on specific
+	  instruction sequences or control bits being set by the system
+	  firmware.
+
+	  If unsure, say Y.
+
 menuconfig ARMV8_DEPRECATED
 	bool "Emulate deprecated/obsolete ARMv8 instructions"
 	depends on COMPAT
@@ -1021,6 +1072,22 @@
 	  operations if DC CVAP is not supported (following the behaviour of
 	  DC CVAP itself if the system does not define a point of persistence).
 
+config ARM64_RAS_EXTN
+	bool "Enable support for RAS CPU Extensions"
+	default y
+	help
+	  CPUs that support the Reliability, Availability and Serviceability
+	  (RAS) Extensions, part of ARMv8.2 are able to track faults and
+	  errors, classify them and report them to software.
+
+	  On CPUs with these extensions system software can use additional
+	  barriers to determine if faults are pending and read the
+	  classification from a new set of registers.
+
+	  Selecting this feature will allow the kernel to use these barriers
+	  and access the new registers if the system supports the extension.
+	  Platform RAS features may additionally depend on firmware support.
+
 endmenu
 
 config ARM64_SVE
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
index 45bdbfb..4a8d3f8 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-bananapi-m64.dts
@@ -75,6 +75,7 @@
 	pinctrl-0 = <&rgmii_pins>;
 	phy-mode = "rgmii";
 	phy-handle = <&ext_rgmii_phy>;
+	phy-supply = <&reg_dc1sw>;
 	status = "okay";
 };
 
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
index 806442d..604cdae 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-pine64.dts
@@ -77,6 +77,7 @@
 	pinctrl-0 = <&rmii_pins>;
 	phy-mode = "rmii";
 	phy-handle = <&ext_rmii_phy1>;
+	phy-supply = <&reg_dc1sw>;
 	status = "okay";
 
 };
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts b/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
index 0eb2ace..abe179d 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine-baseboard.dts
@@ -82,6 +82,7 @@
 	pinctrl-0 = <&rgmii_pins>;
 	phy-mode = "rgmii";
 	phy-handle = <&ext_rgmii_phy>;
+	phy-supply = <&reg_dc1sw>;
 	status = "okay";
 };
 
@@ -95,7 +96,7 @@
 &mmc2 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc2_pins>;
-	vmmc-supply = <&reg_vcc3v3>;
+	vmmc-supply = <&reg_dcdc1>;
 	vqmmc-supply = <&reg_vcc1v8>;
 	bus-width = <8>;
 	non-removable;
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine.dtsi
index a5da18a..43418bd 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64-sopine.dtsi
@@ -45,19 +45,10 @@
 
 #include "sun50i-a64.dtsi"
 
-/ {
-	reg_vcc3v3: vcc3v3 {
-		compatible = "regulator-fixed";
-		regulator-name = "vcc3v3";
-		regulator-min-microvolt = <3300000>;
-		regulator-max-microvolt = <3300000>;
-	};
-};
-
 &mmc0 {
 	pinctrl-names = "default";
 	pinctrl-0 = <&mmc0_pins>;
-	vmmc-supply = <&reg_vcc3v3>;
+	vmmc-supply = <&reg_dcdc1>;
 	non-removable;
 	disable-wp;
 	bus-width = <4>;
diff --git a/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-zero-plus2.dts b/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-zero-plus2.dts
index b6b7a56..a42fd79 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-zero-plus2.dts
+++ b/arch/arm64/boot/dts/allwinner/sun50i-h5-orangepi-zero-plus2.dts
@@ -71,7 +71,7 @@
 	pinctrl-0 = <&mmc0_pins_a>, <&mmc0_cd_pin>;
 	vmmc-supply = <&reg_vcc3v3>;
 	bus-width = <4>;
-	cd-gpios = <&pio 5 6 GPIO_ACTIVE_HIGH>;
+	cd-gpios = <&pio 5 6 GPIO_ACTIVE_LOW>;
 	status = "okay";
 };
 
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
index 7c9bdc7..9db1931 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10.dtsi
@@ -66,6 +66,7 @@
 				     <&cpu1>,
 				     <&cpu2>,
 				     <&cpu3>;
+		interrupt-parent = <&intc>;
 	};
 
 	psci {
diff --git a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
index e3b64d0..9c7724e 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110-master.dtsi
@@ -63,8 +63,10 @@
 			cpm_ethernet: ethernet@0 {
 				compatible = "marvell,armada-7k-pp22";
 				reg = <0x0 0x100000>, <0x129000 0xb000>;
-				clocks = <&cpm_clk 1 3>, <&cpm_clk 1 9>, <&cpm_clk 1 5>;
-				clock-names = "pp_clk", "gop_clk", "mg_clk";
+				clocks = <&cpm_clk 1 3>, <&cpm_clk 1 9>,
+					 <&cpm_clk 1 5>, <&cpm_clk 1 18>;
+				clock-names = "pp_clk", "gop_clk",
+					      "mg_clk","axi_clk";
 				marvell,system-controller = <&cpm_syscon0>;
 				status = "disabled";
 				dma-coherent;
@@ -155,7 +157,8 @@
 				#size-cells = <0>;
 				compatible = "marvell,orion-mdio";
 				reg = <0x12a200 0x10>;
-				clocks = <&cpm_clk 1 9>, <&cpm_clk 1 5>;
+				clocks = <&cpm_clk 1 9>, <&cpm_clk 1 5>,
+					 <&cpm_clk 1 6>, <&cpm_clk 1 18>;
 				status = "disabled";
 			};
 
@@ -338,8 +341,8 @@
 				compatible = "marvell,armada-cp110-sdhci";
 				reg = <0x780000 0x300>;
 				interrupts = <ICU_GRP_NSR 27 IRQ_TYPE_LEVEL_HIGH>;
-				clock-names = "core";
-				clocks = <&cpm_clk 1 4>;
+				clock-names = "core","axi";
+				clocks = <&cpm_clk 1 4>, <&cpm_clk 1 18>;
 				dma-coherent;
 				status = "disabled";
 			};
diff --git a/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi b/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
index 0d51096..87ac68b 100644
--- a/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
+++ b/arch/arm64/boot/dts/marvell/armada-cp110-slave.dtsi
@@ -63,8 +63,10 @@
 			cps_ethernet: ethernet@0 {
 				compatible = "marvell,armada-7k-pp22";
 				reg = <0x0 0x100000>, <0x129000 0xb000>;
-				clocks = <&cps_clk 1 3>, <&cps_clk 1 9>, <&cps_clk 1 5>;
-				clock-names = "pp_clk", "gop_clk", "mg_clk";
+				clocks = <&cps_clk 1 3>, <&cps_clk 1 9>,
+					 <&cps_clk 1 5>, <&cps_clk 1 18>;
+				clock-names = "pp_clk", "gop_clk",
+					      "mg_clk", "axi_clk";
 				marvell,system-controller = <&cps_syscon0>;
 				status = "disabled";
 				dma-coherent;
@@ -155,7 +157,8 @@
 				#size-cells = <0>;
 				compatible = "marvell,orion-mdio";
 				reg = <0x12a200 0x10>;
-				clocks = <&cps_clk 1 9>, <&cps_clk 1 5>;
+				clocks = <&cps_clk 1 9>, <&cps_clk 1 5>,
+					 <&cps_clk 1 6>, <&cps_clk 1 18>;
 				status = "disabled";
 			};
 
diff --git a/arch/arm64/boot/dts/renesas/salvator-common.dtsi b/arch/arm64/boot/dts/renesas/salvator-common.dtsi
index a298df7..dbe2648 100644
--- a/arch/arm64/boot/dts/renesas/salvator-common.dtsi
+++ b/arch/arm64/boot/dts/renesas/salvator-common.dtsi
@@ -255,7 +255,6 @@
 &avb {
 	pinctrl-0 = <&avb_pins>;
 	pinctrl-names = "default";
-	renesas,no-ether-link;
 	phy-handle = <&phy0>;
 	status = "okay";
 
diff --git a/arch/arm64/boot/dts/renesas/ulcb.dtsi b/arch/arm64/boot/dts/renesas/ulcb.dtsi
index 0d85b31..73439cf 100644
--- a/arch/arm64/boot/dts/renesas/ulcb.dtsi
+++ b/arch/arm64/boot/dts/renesas/ulcb.dtsi
@@ -145,7 +145,6 @@
 &avb {
 	pinctrl-0 = <&avb_pins>;
 	pinctrl-names = "default";
-	renesas,no-ether-link;
 	phy-handle = <&phy0>;
 	status = "okay";
 
diff --git a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
index d4f8078..3890468 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3328-rock64.dts
@@ -132,6 +132,8 @@
 	assigned-clocks = <&cru SCLK_MAC2IO>, <&cru SCLK_MAC2IO_EXT>;
 	assigned-clock-parents = <&gmac_clkin>, <&gmac_clkin>;
 	clock_in_out = "input";
+	/* shows instability at 1GBit right now */
+	max-speed = <100>;
 	phy-supply = <&vcc_io>;
 	phy-mode = "rgmii";
 	pinctrl-names = "default";
diff --git a/arch/arm64/boot/dts/rockchip/rk3328.dtsi b/arch/arm64/boot/dts/rockchip/rk3328.dtsi
index 41d6184..2426da6 100644
--- a/arch/arm64/boot/dts/rockchip/rk3328.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3328.dtsi
@@ -514,7 +514,7 @@
 	tsadc: tsadc@ff250000 {
 		compatible = "rockchip,rk3328-tsadc";
 		reg = <0x0 0xff250000 0x0 0x100>;
-		interrupts = <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH 0>;
+		interrupts = <GIC_SPI 58 IRQ_TYPE_LEVEL_HIGH>;
 		assigned-clocks = <&cru SCLK_TSADC>;
 		assigned-clock-rates = <50000>;
 		clocks = <&cru SCLK_TSADC>, <&cru PCLK_TSADC>;
diff --git a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
index 910628d..1fc5060 100644
--- a/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3399-puma.dtsi
@@ -155,17 +155,6 @@
 		regulator-min-microvolt = <5000000>;
 		regulator-max-microvolt = <5000000>;
 	};
-
-	vdd_log: vdd-log {
-		compatible = "pwm-regulator";
-		pwms = <&pwm2 0 25000 0>;
-		regulator-name = "vdd_log";
-		regulator-min-microvolt = <800000>;
-		regulator-max-microvolt = <1400000>;
-		regulator-always-on;
-		regulator-boot-on;
-		status = "okay";
-	};
 };
 
 &cpu_b0 {
diff --git a/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi b/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
index 48e7331..0ac2ace 100644
--- a/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
+++ b/arch/arm64/boot/dts/socionext/uniphier-pxs3.dtsi
@@ -198,8 +198,8 @@
 			gpio-controller;
 			#gpio-cells = <2>;
 			gpio-ranges = <&pinctrl 0 0 0>,
-				      <&pinctrl 96 0 0>,
-				      <&pinctrl 160 0 0>;
+				      <&pinctrl 104 0 0>,
+				      <&pinctrl 168 0 0>;
 			gpio-ranges-group-names = "gpio_range0",
 						  "gpio_range1",
 						  "gpio_range2";
diff --git a/arch/arm64/configs/defconfig b/arch/arm64/configs/defconfig
index 6356c6d..b20fa9b 100644
--- a/arch/arm64/configs/defconfig
+++ b/arch/arm64/configs/defconfig
@@ -161,7 +161,7 @@
 CONFIG_MTD_M25P80=y
 CONFIG_MTD_NAND=y
 CONFIG_MTD_NAND_DENALI_DT=y
-CONFIG_MTD_NAND_PXA3xx=y
+CONFIG_MTD_NAND_MARVELL=y
 CONFIG_MTD_SPI_NOR=y
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_NBD=m
diff --git a/arch/arm64/include/asm/alternative.h b/arch/arm64/include/asm/alternative.h
index 4a85c69..6690281 100644
--- a/arch/arm64/include/asm/alternative.h
+++ b/arch/arm64/include/asm/alternative.h
@@ -12,6 +12,8 @@
 #include <linux/stddef.h>
 #include <linux/stringify.h>
 
+extern int alternatives_applied;
+
 struct alt_instr {
 	s32 orig_offset;	/* offset to original instruction */
 	s32 alt_offset;		/* offset to replacement instruction */
diff --git a/arch/arm64/include/asm/arm_dsu_pmu.h b/arch/arm64/include/asm/arm_dsu_pmu.h
new file mode 100644
index 0000000..82e5cc3
--- /dev/null
+++ b/arch/arm64/include/asm/arm_dsu_pmu.h
@@ -0,0 +1,129 @@
+/*
+ * ARM DynamIQ Shared Unit (DSU) PMU Low level register access routines.
+ *
+ * Copyright (C) ARM Limited, 2017.
+ *
+ * Author: Suzuki K Poulose <suzuki.poulose@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2, as published by the Free Software Foundation.
+ */
+
+#include <linux/bitops.h>
+#include <linux/build_bug.h>
+#include <linux/compiler.h>
+#include <linux/types.h>
+#include <asm/barrier.h>
+#include <asm/sysreg.h>
+
+
+#define CLUSTERPMCR_EL1			sys_reg(3, 0, 15, 5, 0)
+#define CLUSTERPMCNTENSET_EL1		sys_reg(3, 0, 15, 5, 1)
+#define CLUSTERPMCNTENCLR_EL1		sys_reg(3, 0, 15, 5, 2)
+#define CLUSTERPMOVSSET_EL1		sys_reg(3, 0, 15, 5, 3)
+#define CLUSTERPMOVSCLR_EL1		sys_reg(3, 0, 15, 5, 4)
+#define CLUSTERPMSELR_EL1		sys_reg(3, 0, 15, 5, 5)
+#define CLUSTERPMINTENSET_EL1		sys_reg(3, 0, 15, 5, 6)
+#define CLUSTERPMINTENCLR_EL1		sys_reg(3, 0, 15, 5, 7)
+#define CLUSTERPMCCNTR_EL1		sys_reg(3, 0, 15, 6, 0)
+#define CLUSTERPMXEVTYPER_EL1		sys_reg(3, 0, 15, 6, 1)
+#define CLUSTERPMXEVCNTR_EL1		sys_reg(3, 0, 15, 6, 2)
+#define CLUSTERPMMDCR_EL1		sys_reg(3, 0, 15, 6, 3)
+#define CLUSTERPMCEID0_EL1		sys_reg(3, 0, 15, 6, 4)
+#define CLUSTERPMCEID1_EL1		sys_reg(3, 0, 15, 6, 5)
+
+static inline u32 __dsu_pmu_read_pmcr(void)
+{
+	return read_sysreg_s(CLUSTERPMCR_EL1);
+}
+
+static inline void __dsu_pmu_write_pmcr(u32 val)
+{
+	write_sysreg_s(val, CLUSTERPMCR_EL1);
+	isb();
+}
+
+static inline u32 __dsu_pmu_get_reset_overflow(void)
+{
+	u32 val = read_sysreg_s(CLUSTERPMOVSCLR_EL1);
+	/* Clear the bit */
+	write_sysreg_s(val, CLUSTERPMOVSCLR_EL1);
+	isb();
+	return val;
+}
+
+static inline void __dsu_pmu_select_counter(int counter)
+{
+	write_sysreg_s(counter, CLUSTERPMSELR_EL1);
+	isb();
+}
+
+static inline u64 __dsu_pmu_read_counter(int counter)
+{
+	__dsu_pmu_select_counter(counter);
+	return read_sysreg_s(CLUSTERPMXEVCNTR_EL1);
+}
+
+static inline void __dsu_pmu_write_counter(int counter, u64 val)
+{
+	__dsu_pmu_select_counter(counter);
+	write_sysreg_s(val, CLUSTERPMXEVCNTR_EL1);
+	isb();
+}
+
+static inline void __dsu_pmu_set_event(int counter, u32 event)
+{
+	__dsu_pmu_select_counter(counter);
+	write_sysreg_s(event, CLUSTERPMXEVTYPER_EL1);
+	isb();
+}
+
+static inline u64 __dsu_pmu_read_pmccntr(void)
+{
+	return read_sysreg_s(CLUSTERPMCCNTR_EL1);
+}
+
+static inline void __dsu_pmu_write_pmccntr(u64 val)
+{
+	write_sysreg_s(val, CLUSTERPMCCNTR_EL1);
+	isb();
+}
+
+static inline void __dsu_pmu_disable_counter(int counter)
+{
+	write_sysreg_s(BIT(counter), CLUSTERPMCNTENCLR_EL1);
+	isb();
+}
+
+static inline void __dsu_pmu_enable_counter(int counter)
+{
+	write_sysreg_s(BIT(counter), CLUSTERPMCNTENSET_EL1);
+	isb();
+}
+
+static inline void __dsu_pmu_counter_interrupt_enable(int counter)
+{
+	write_sysreg_s(BIT(counter), CLUSTERPMINTENSET_EL1);
+	isb();
+}
+
+static inline void __dsu_pmu_counter_interrupt_disable(int counter)
+{
+	write_sysreg_s(BIT(counter), CLUSTERPMINTENCLR_EL1);
+	isb();
+}
+
+
+static inline u32 __dsu_pmu_read_pmceid(int n)
+{
+	switch (n) {
+	case 0:
+		return read_sysreg_s(CLUSTERPMCEID0_EL1);
+	case 1:
+		return read_sysreg_s(CLUSTERPMCEID1_EL1);
+	default:
+		BUILD_BUG();
+		return 0;
+	}
+}
diff --git a/arch/arm64/include/asm/asm-uaccess.h b/arch/arm64/include/asm/asm-uaccess.h
index b3da6c8..4128bec 100644
--- a/arch/arm64/include/asm/asm-uaccess.h
+++ b/arch/arm64/include/asm/asm-uaccess.h
@@ -4,6 +4,7 @@
 
 #include <asm/alternative.h>
 #include <asm/kernel-pgtable.h>
+#include <asm/mmu.h>
 #include <asm/sysreg.h>
 #include <asm/assembler.h>
 
@@ -12,52 +13,63 @@
  */
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 	.macro	__uaccess_ttbr0_disable, tmp1
-	mrs	\tmp1, ttbr1_el1		// swapper_pg_dir
-	add	\tmp1, \tmp1, #SWAPPER_DIR_SIZE	// reserved_ttbr0 at the end of swapper_pg_dir
-	msr	ttbr0_el1, \tmp1		// set reserved TTBR0_EL1
+	mrs	\tmp1, ttbr1_el1			// swapper_pg_dir
+	bic	\tmp1, \tmp1, #TTBR_ASID_MASK
+	sub	\tmp1, \tmp1, #RESERVED_TTBR0_SIZE	// reserved_ttbr0 just before swapper_pg_dir
+	msr	ttbr0_el1, \tmp1			// set reserved TTBR0_EL1
+	isb
+	add	\tmp1, \tmp1, #RESERVED_TTBR0_SIZE
+	msr	ttbr1_el1, \tmp1		// set reserved ASID
 	isb
 	.endm
 
-	.macro	__uaccess_ttbr0_enable, tmp1
+	.macro	__uaccess_ttbr0_enable, tmp1, tmp2
 	get_thread_info \tmp1
 	ldr	\tmp1, [\tmp1, #TSK_TI_TTBR0]	// load saved TTBR0_EL1
+	mrs	\tmp2, ttbr1_el1
+	extr    \tmp2, \tmp2, \tmp1, #48
+	ror     \tmp2, \tmp2, #16
+	msr	ttbr1_el1, \tmp2		// set the active ASID
+	isb
 	msr	ttbr0_el1, \tmp1		// set the non-PAN TTBR0_EL1
 	isb
 	.endm
 
-	.macro	uaccess_ttbr0_disable, tmp1
-alternative_if_not ARM64_HAS_PAN
-	__uaccess_ttbr0_disable \tmp1
-alternative_else_nop_endif
-	.endm
-
-	.macro	uaccess_ttbr0_enable, tmp1, tmp2
+	.macro	uaccess_ttbr0_disable, tmp1, tmp2
 alternative_if_not ARM64_HAS_PAN
 	save_and_disable_irq \tmp2		// avoid preemption
-	__uaccess_ttbr0_enable \tmp1
+	__uaccess_ttbr0_disable \tmp1
 	restore_irq \tmp2
 alternative_else_nop_endif
 	.endm
+
+	.macro	uaccess_ttbr0_enable, tmp1, tmp2, tmp3
+alternative_if_not ARM64_HAS_PAN
+	save_and_disable_irq \tmp3		// avoid preemption
+	__uaccess_ttbr0_enable \tmp1, \tmp2
+	restore_irq \tmp3
+alternative_else_nop_endif
+	.endm
 #else
-	.macro	uaccess_ttbr0_disable, tmp1
+	.macro	uaccess_ttbr0_disable, tmp1, tmp2
 	.endm
 
-	.macro	uaccess_ttbr0_enable, tmp1, tmp2
+	.macro	uaccess_ttbr0_enable, tmp1, tmp2, tmp3
 	.endm
 #endif
 
 /*
  * These macros are no-ops when UAO is present.
  */
-	.macro	uaccess_disable_not_uao, tmp1
-	uaccess_ttbr0_disable \tmp1
+	.macro	uaccess_disable_not_uao, tmp1, tmp2
+	uaccess_ttbr0_disable \tmp1, \tmp2
 alternative_if ARM64_ALT_PAN_NOT_UAO
 	SET_PSTATE_PAN(1)
 alternative_else_nop_endif
 	.endm
 
-	.macro	uaccess_enable_not_uao, tmp1, tmp2
-	uaccess_ttbr0_enable \tmp1, \tmp2
+	.macro	uaccess_enable_not_uao, tmp1, tmp2, tmp3
+	uaccess_ttbr0_enable \tmp1, \tmp2, \tmp3
 alternative_if ARM64_ALT_PAN_NOT_UAO
 	SET_PSTATE_PAN(0)
 alternative_else_nop_endif
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 8b16828..3873dd7 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -26,7 +26,6 @@
 #include <asm/asm-offsets.h>
 #include <asm/cpufeature.h>
 #include <asm/debug-monitors.h>
-#include <asm/mmu_context.h>
 #include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
@@ -110,6 +109,13 @@
 	.endm
 
 /*
+ * RAS Error Synchronization barrier
+ */
+	.macro  esb
+	hint    #16
+	.endm
+
+/*
  * NOP sequence
  */
 	.macro	nops, num
@@ -255,7 +261,11 @@ lr	.req	x30		// link register
 #else
 	adr_l	\dst, \sym
 #endif
+alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
 	mrs	\tmp, tpidr_el1
+alternative_else
+	mrs	\tmp, tpidr_el2
+alternative_endif
 	add	\dst, \dst, \tmp
 	.endm
 
@@ -266,7 +276,11 @@ lr	.req	x30		// link register
 	 */
 	.macro ldr_this_cpu dst, sym, tmp
 	adr_l	\dst, \sym
+alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
 	mrs	\tmp, tpidr_el1
+alternative_else
+	mrs	\tmp, tpidr_el2
+alternative_endif
 	ldr	\dst, [\dst, \tmp]
 	.endm
 
@@ -344,10 +358,26 @@ alternative_endif
  * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
  */
 	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
-#ifndef CONFIG_ARM64_VA_BITS_48
 	ldr_l	\tmpreg, idmap_t0sz
 	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
-#endif
+	.endm
+
+/*
+ * tcr_compute_pa_size - set TCR.(I)PS to the highest supported
+ * ID_AA64MMFR0_EL1.PARange value
+ *
+ *	tcr:		register with the TCR_ELx value to be updated
+ *	pos:		IPS or PS bitfield position
+ *	tmp{0,1}:	temporary registers
+ */
+	.macro	tcr_compute_pa_size, tcr, pos, tmp0, tmp1
+	mrs	\tmp0, ID_AA64MMFR0_EL1
+	// Narrow PARange to fit the PS field in TCR_ELx
+	ubfx	\tmp0, \tmp0, #ID_AA64MMFR0_PARANGE_SHIFT, #3
+	mov	\tmp1, #ID_AA64MMFR0_PARANGE_MAX
+	cmp	\tmp0, \tmp1
+	csel	\tmp0, \tmp1, \tmp0, hi
+	bfi	\tcr, \tmp0, \pos, #3
 	.endm
 
 /*
@@ -478,37 +508,18 @@ alternative_endif
 	.endm
 
 /*
- * Errata workaround prior to TTBR0_EL1 update
+ * Arrange a physical address in a TTBR register, taking care of 52-bit
+ * addresses.
  *
- * 	val:	TTBR value with new BADDR, preserved
- * 	tmp0:	temporary register, clobbered
- * 	tmp1:	other temporary register, clobbered
+ * 	phys:	physical address, preserved
+ * 	ttbr:	returns the TTBR value
  */
-	.macro	pre_ttbr0_update_workaround, val, tmp0, tmp1
-#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003
-alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1003
-	mrs	\tmp0, ttbr0_el1
-	mov	\tmp1, #FALKOR_RESERVED_ASID
-	bfi	\tmp0, \tmp1, #48, #16		// reserved ASID + old BADDR
-	msr	ttbr0_el1, \tmp0
-	isb
-	bfi	\tmp0, \val, #0, #48		// reserved ASID + new BADDR
-	msr	ttbr0_el1, \tmp0
-	isb
-alternative_else_nop_endif
-#endif
-	.endm
-
-/*
- * Errata workaround post TTBR0_EL1 update.
- */
-	.macro	post_ttbr0_update_workaround
-#ifdef CONFIG_CAVIUM_ERRATUM_27456
-alternative_if ARM64_WORKAROUND_CAVIUM_27456
-	ic	iallu
-	dsb	nsh
-	isb
-alternative_else_nop_endif
+	.macro	phys_to_ttbr, phys, ttbr
+#ifdef CONFIG_ARM64_PA_BITS_52
+	orr	\ttbr, \phys, \phys, lsr #46
+	and	\ttbr, \ttbr, #TTBR_BADDR_MASK_52
+#else
+	mov	\ttbr, \phys
 #endif
 	.endm
 
diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index 2ff7c5e..bb26382 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -41,7 +41,11 @@
 #define ARM64_WORKAROUND_CAVIUM_30115		20
 #define ARM64_HAS_DCPOP				21
 #define ARM64_SVE				22
+#define ARM64_UNMAP_KERNEL_AT_EL0		23
+#define ARM64_HARDEN_BRANCH_PREDICTOR		24
+#define ARM64_HARDEN_BP_POST_GUEST_EXIT		25
+#define ARM64_HAS_RAS_EXTN			26
 
-#define ARM64_NCAPS				23
+#define ARM64_NCAPS				27
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
index cbf08d7..be7bd19 100644
--- a/arch/arm64/include/asm/cputype.h
+++ b/arch/arm64/include/asm/cputype.h
@@ -79,28 +79,37 @@
 #define ARM_CPU_PART_AEM_V8		0xD0F
 #define ARM_CPU_PART_FOUNDATION		0xD00
 #define ARM_CPU_PART_CORTEX_A57		0xD07
+#define ARM_CPU_PART_CORTEX_A72		0xD08
 #define ARM_CPU_PART_CORTEX_A53		0xD03
 #define ARM_CPU_PART_CORTEX_A73		0xD09
+#define ARM_CPU_PART_CORTEX_A75		0xD0A
 
 #define APM_CPU_PART_POTENZA		0x000
 
 #define CAVIUM_CPU_PART_THUNDERX	0x0A1
 #define CAVIUM_CPU_PART_THUNDERX_81XX	0x0A2
 #define CAVIUM_CPU_PART_THUNDERX_83XX	0x0A3
+#define CAVIUM_CPU_PART_THUNDERX2	0x0AF
 
 #define BRCM_CPU_PART_VULCAN		0x516
 
 #define QCOM_CPU_PART_FALKOR_V1		0x800
 #define QCOM_CPU_PART_FALKOR		0xC00
+#define QCOM_CPU_PART_KRYO		0x200
 
 #define MIDR_CORTEX_A53 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A53)
 #define MIDR_CORTEX_A57 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A57)
+#define MIDR_CORTEX_A72 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A72)
 #define MIDR_CORTEX_A73 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A73)
+#define MIDR_CORTEX_A75 MIDR_CPU_MODEL(ARM_CPU_IMP_ARM, ARM_CPU_PART_CORTEX_A75)
 #define MIDR_THUNDERX	MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX)
 #define MIDR_THUNDERX_81XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_81XX)
 #define MIDR_THUNDERX_83XX MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX_83XX)
+#define MIDR_CAVIUM_THUNDERX2 MIDR_CPU_MODEL(ARM_CPU_IMP_CAVIUM, CAVIUM_CPU_PART_THUNDERX2)
+#define MIDR_BRCM_VULCAN MIDR_CPU_MODEL(ARM_CPU_IMP_BRCM, BRCM_CPU_PART_VULCAN)
 #define MIDR_QCOM_FALKOR_V1 MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR_V1)
 #define MIDR_QCOM_FALKOR MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_FALKOR)
+#define MIDR_QCOM_KRYO MIDR_CPU_MODEL(ARM_CPU_IMP_QCOM, QCOM_CPU_PART_KRYO)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/include/asm/efi.h b/arch/arm64/include/asm/efi.h
index c4cd508..8389050 100644
--- a/arch/arm64/include/asm/efi.h
+++ b/arch/arm64/include/asm/efi.h
@@ -121,19 +121,21 @@ static inline void efi_set_pgd(struct mm_struct *mm)
 		if (mm != current->active_mm) {
 			/*
 			 * Update the current thread's saved ttbr0 since it is
-			 * restored as part of a return from exception. Set
-			 * the hardware TTBR0_EL1 using cpu_switch_mm()
-			 * directly to enable potential errata workarounds.
+			 * restored as part of a return from exception. Enable
+			 * access to the valid TTBR0_EL1 and invoke the errata
+			 * workaround directly since there is no return from
+			 * exception when invoking the EFI run-time services.
 			 */
 			update_saved_ttbr0(current, mm);
-			cpu_switch_mm(mm->pgd, mm);
+			uaccess_ttbr0_enable();
+			post_ttbr_update_workaround();
 		} else {
 			/*
 			 * Defer the switch to the current thread's TTBR0_EL1
 			 * until uaccess_enable(). Restore the current
 			 * thread's saved ttbr0 corresponding to its active_mm
 			 */
-			cpu_set_reserved_ttbr0();
+			uaccess_ttbr0_disable();
 			update_saved_ttbr0(current, current->active_mm);
 		}
 	}
diff --git a/arch/arm64/include/asm/esr.h b/arch/arm64/include/asm/esr.h
index 014d7d8..803443d 100644
--- a/arch/arm64/include/asm/esr.h
+++ b/arch/arm64/include/asm/esr.h
@@ -86,6 +86,18 @@
 #define ESR_ELx_WNR_SHIFT	(6)
 #define ESR_ELx_WNR		(UL(1) << ESR_ELx_WNR_SHIFT)
 
+/* Asynchronous Error Type */
+#define ESR_ELx_IDS_SHIFT	(24)
+#define ESR_ELx_IDS		(UL(1) << ESR_ELx_IDS_SHIFT)
+#define ESR_ELx_AET_SHIFT	(10)
+#define ESR_ELx_AET		(UL(0x7) << ESR_ELx_AET_SHIFT)
+
+#define ESR_ELx_AET_UC		(UL(0) << ESR_ELx_AET_SHIFT)
+#define ESR_ELx_AET_UEU		(UL(1) << ESR_ELx_AET_SHIFT)
+#define ESR_ELx_AET_UEO		(UL(2) << ESR_ELx_AET_SHIFT)
+#define ESR_ELx_AET_UER		(UL(3) << ESR_ELx_AET_SHIFT)
+#define ESR_ELx_AET_CE		(UL(6) << ESR_ELx_AET_SHIFT)
+
 /* Shared ISS field definitions for Data/Instruction aborts */
 #define ESR_ELx_SET_SHIFT	(11)
 #define ESR_ELx_SET_MASK	(UL(3) << ESR_ELx_SET_SHIFT)
@@ -100,6 +112,7 @@
 #define ESR_ELx_FSC		(0x3F)
 #define ESR_ELx_FSC_TYPE	(0x3C)
 #define ESR_ELx_FSC_EXTABT	(0x10)
+#define ESR_ELx_FSC_SERROR	(0x11)
 #define ESR_ELx_FSC_ACCESS	(0x08)
 #define ESR_ELx_FSC_FAULT	(0x04)
 #define ESR_ELx_FSC_PERM	(0x0C)
@@ -127,6 +140,13 @@
 #define ESR_ELx_WFx_ISS_WFE	(UL(1) << 0)
 #define ESR_ELx_xVC_IMM_MASK	((1UL << 16) - 1)
 
+#define DISR_EL1_IDS		(UL(1) << 24)
+/*
+ * DISR_EL1 and ESR_ELx share the bottom 13 bits, but the RES0 bits may mean
+ * different things in the future...
+ */
+#define DISR_EL1_ESR_MASK	(ESR_ELx_AET | ESR_ELx_EA | ESR_ELx_FSC)
+
 /* ESR value templates for specific events */
 
 /* BRK instruction trap from AArch64 state */
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index 0c2eec4..bc30429 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -18,6 +18,8 @@
 #ifndef __ASM_EXCEPTION_H
 #define __ASM_EXCEPTION_H
 
+#include <asm/esr.h>
+
 #include <linux/interrupt.h>
 
 #define __exception	__attribute__((section(".exception.text")))
@@ -27,4 +29,16 @@
 #define __exception_irq_entry	__exception
 #endif
 
+static inline u32 disr_to_esr(u64 disr)
+{
+	unsigned int esr = ESR_ELx_EC_SERROR << ESR_ELx_EC_SHIFT;
+
+	if ((disr & DISR_EL1_IDS) == 0)
+		esr |= (disr & DISR_EL1_ESR_MASK);
+	else
+		esr |= (disr & ESR_ELx_ISS_MASK);
+
+	return esr;
+}
+
 #endif	/* __ASM_EXCEPTION_H */
diff --git a/arch/arm64/include/asm/fixmap.h b/arch/arm64/include/asm/fixmap.h
index 4052ec3..ec1e6d6 100644
--- a/arch/arm64/include/asm/fixmap.h
+++ b/arch/arm64/include/asm/fixmap.h
@@ -58,6 +58,11 @@ enum fixed_addresses {
 	FIX_APEI_GHES_NMI,
 #endif /* CONFIG_ACPI_APEI_GHES */
 
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	FIX_ENTRY_TRAMP_DATA,
+	FIX_ENTRY_TRAMP_TEXT,
+#define TRAMP_VALIAS		(__fix_to_virt(FIX_ENTRY_TRAMP_TEXT))
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
 	__end_of_permanent_fixed_addresses,
 
 	/*
diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
index 74f3439..8857a0f 100644
--- a/arch/arm64/include/asm/fpsimd.h
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -71,7 +71,7 @@ extern void fpsimd_flush_thread(void);
 extern void fpsimd_signal_preserve_current_state(void);
 extern void fpsimd_preserve_current_state(void);
 extern void fpsimd_restore_current_state(void);
-extern void fpsimd_update_current_state(struct fpsimd_state *state);
+extern void fpsimd_update_current_state(struct user_fpsimd_state const *state);
 
 extern void fpsimd_flush_task_state(struct task_struct *target);
 extern void sve_flush_cpu_state(void);
diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 7803343..82386e8 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -52,7 +52,52 @@
 #define IDMAP_PGTABLE_LEVELS	(ARM64_HW_PGTABLE_LEVELS(PHYS_MASK_SHIFT))
 #endif
 
-#define SWAPPER_DIR_SIZE	(SWAPPER_PGTABLE_LEVELS * PAGE_SIZE)
+
+/*
+ * If KASLR is enabled, then an offset K is added to the kernel address
+ * space. The bottom 21 bits of this offset are zero to guarantee 2MB
+ * alignment for PA and VA.
+ *
+ * For each pagetable level of the swapper, we know that the shift will
+ * be larger than 21 (for the 4KB granule case we use section maps thus
+ * the smallest shift is actually 30) thus there is the possibility that
+ * KASLR can increase the number of pagetable entries by 1, so we make
+ * room for this extra entry.
+ *
+ * Note KASLR cannot increase the number of required entries for a level
+ * by more than one because it increments both the virtual start and end
+ * addresses equally (the extra entry comes from the case where the end
+ * address is just pushed over a boundary and the start address isn't).
+ */
+
+#ifdef CONFIG_RANDOMIZE_BASE
+#define EARLY_KASLR	(1)
+#else
+#define EARLY_KASLR	(0)
+#endif
+
+#define EARLY_ENTRIES(vstart, vend, shift) (((vend) >> (shift)) \
+					- ((vstart) >> (shift)) + 1 + EARLY_KASLR)
+
+#define EARLY_PGDS(vstart, vend) (EARLY_ENTRIES(vstart, vend, PGDIR_SHIFT))
+
+#if SWAPPER_PGTABLE_LEVELS > 3
+#define EARLY_PUDS(vstart, vend) (EARLY_ENTRIES(vstart, vend, PUD_SHIFT))
+#else
+#define EARLY_PUDS(vstart, vend) (0)
+#endif
+
+#if SWAPPER_PGTABLE_LEVELS > 2
+#define EARLY_PMDS(vstart, vend) (EARLY_ENTRIES(vstart, vend, SWAPPER_TABLE_SHIFT))
+#else
+#define EARLY_PMDS(vstart, vend) (0)
+#endif
+
+#define EARLY_PAGES(vstart, vend) ( 1 			/* PGDIR page */				\
+			+ EARLY_PGDS((vstart), (vend)) 	/* each PGDIR needs a next level page table */	\
+			+ EARLY_PUDS((vstart), (vend))	/* each PUD needs a next level page table */	\
+			+ EARLY_PMDS((vstart), (vend)))	/* each PMD needs a next level page table */
+#define SWAPPER_DIR_SIZE (PAGE_SIZE * EARLY_PAGES(KIMAGE_VADDR + TEXT_OFFSET, _end))
 #define IDMAP_DIR_SIZE		(IDMAP_PGTABLE_LEVELS * PAGE_SIZE)
 
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
@@ -78,8 +123,16 @@
 /*
  * Initial memory map attributes.
  */
-#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define _SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
+#define _SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define SWAPPER_PTE_FLAGS	(_SWAPPER_PTE_FLAGS | PTE_NG)
+#define SWAPPER_PMD_FLAGS	(_SWAPPER_PMD_FLAGS | PMD_SECT_NG)
+#else
+#define SWAPPER_PTE_FLAGS	_SWAPPER_PTE_FLAGS
+#define SWAPPER_PMD_FLAGS	_SWAPPER_PMD_FLAGS
+#endif
 
 #if ARM64_SWAPPER_USES_SECTION_MAPS
 #define SWAPPER_MM_MMUFLAGS	(PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 715d395..b0c8417 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -23,6 +23,8 @@
 #include <asm/types.h>
 
 /* Hyp Configuration Register (HCR) bits */
+#define HCR_TEA		(UL(1) << 37)
+#define HCR_TERR	(UL(1) << 36)
 #define HCR_E2H		(UL(1) << 34)
 #define HCR_ID		(UL(1) << 33)
 #define HCR_CD		(UL(1) << 32)
diff --git a/arch/arm64/include/asm/kvm_asm.h b/arch/arm64/include/asm/kvm_asm.h
index ab4d0a9..24961b7 100644
--- a/arch/arm64/include/asm/kvm_asm.h
+++ b/arch/arm64/include/asm/kvm_asm.h
@@ -68,6 +68,8 @@ extern u32 __kvm_get_mdcr_el2(void);
 
 extern u32 __init_stage2_translation(void);
 
+extern void __qcom_hyp_sanitize_btac_predictors(void);
+
 #endif
 
 #endif /* __ARM_KVM_ASM_H__ */
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index 5f28dfa..413dc82 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -50,6 +50,13 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu)
 	vcpu->arch.hcr_el2 = HCR_GUEST_FLAGS;
 	if (is_kernel_in_hyp_mode())
 		vcpu->arch.hcr_el2 |= HCR_E2H;
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN)) {
+		/* route synchronous external abort exceptions to EL2 */
+		vcpu->arch.hcr_el2 |= HCR_TEA;
+		/* trap error record accesses */
+		vcpu->arch.hcr_el2 |= HCR_TERR;
+	}
+
 	if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features))
 		vcpu->arch.hcr_el2 &= ~HCR_RW;
 }
@@ -64,6 +71,11 @@ static inline void vcpu_set_hcr(struct kvm_vcpu *vcpu, unsigned long hcr)
 	vcpu->arch.hcr_el2 = hcr;
 }
 
+static inline void vcpu_set_vsesr(struct kvm_vcpu *vcpu, u64 vsesr)
+{
+	vcpu->arch.vsesr_el2 = vsesr;
+}
+
 static inline unsigned long *vcpu_pc(const struct kvm_vcpu *vcpu)
 {
 	return (unsigned long *)&vcpu_gp_regs(vcpu)->regs.pc;
@@ -171,6 +183,11 @@ static inline phys_addr_t kvm_vcpu_get_fault_ipa(const struct kvm_vcpu *vcpu)
 	return ((phys_addr_t)vcpu->arch.fault.hpfar_el2 & HPFAR_MASK) << 8;
 }
 
+static inline u64 kvm_vcpu_get_disr(const struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.fault.disr_el1;
+}
+
 static inline u32 kvm_vcpu_hvc_get_imm(const struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_get_hsr(vcpu) & ESR_ELx_xVC_IMM_MASK;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index ea6cb5b..4485ae8 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -25,6 +25,7 @@
 #include <linux/types.h>
 #include <linux/kvm_types.h>
 #include <asm/cpufeature.h>
+#include <asm/daifflags.h>
 #include <asm/fpsimd.h>
 #include <asm/kvm.h>
 #include <asm/kvm_asm.h>
@@ -89,6 +90,7 @@ struct kvm_vcpu_fault_info {
 	u32 esr_el2;		/* Hyp Syndrom Register */
 	u64 far_el2;		/* Hyp Fault Address Register */
 	u64 hpfar_el2;		/* Hyp IPA Fault Address Register */
+	u64 disr_el1;		/* Deferred [SError] Status Register */
 };
 
 /*
@@ -120,6 +122,7 @@ enum vcpu_sysreg {
 	PAR_EL1,	/* Physical Address Register */
 	MDSCR_EL1,	/* Monitor Debug System Control Register */
 	MDCCINT_EL1,	/* Monitor Debug Comms Channel Interrupt Enable Reg */
+	DISR_EL1,	/* Deferred Interrupt Status Register */
 
 	/* Performance Monitors Registers */
 	PMCR_EL0,	/* Control Register */
@@ -192,6 +195,8 @@ struct kvm_cpu_context {
 		u64 sys_regs[NR_SYS_REGS];
 		u32 copro[NR_COPRO_REGS];
 	};
+
+	struct kvm_vcpu *__hyp_running_vcpu;
 };
 
 typedef struct kvm_cpu_context kvm_cpu_context_t;
@@ -277,6 +282,9 @@ struct kvm_vcpu_arch {
 
 	/* Detect first run of a vcpu */
 	bool has_run_once;
+
+	/* Virtual SError ESR to restore when HCR_EL2.VSE is set */
+	u64 vsesr_el2;
 };
 
 #define vcpu_gp_regs(v)		(&(v)->arch.ctxt.gp_regs)
@@ -340,6 +348,8 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		int exception_index);
+void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
+		       int exception_index);
 
 int kvm_perf_init(void);
 int kvm_perf_teardown(void);
@@ -396,4 +406,13 @@ static inline void kvm_fpsimd_flush_cpu_state(void)
 		sve_flush_cpu_state();
 }
 
+static inline void kvm_arm_vhe_guest_enter(void)
+{
+	local_daif_mask();
+}
+
+static inline void kvm_arm_vhe_guest_exit(void)
+{
+	local_daif_restore(DAIF_PROCCTX_NOIRQ);
+}
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 672c868..72e279d 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -273,15 +273,26 @@ void kvm_toggle_cache(struct kvm_vcpu *vcpu, bool was_enabled);
 
 static inline bool __kvm_cpu_uses_extended_idmap(void)
 {
-	return __cpu_uses_extended_idmap();
+	return __cpu_uses_extended_idmap_level();
 }
 
+static inline unsigned long __kvm_idmap_ptrs_per_pgd(void)
+{
+	return idmap_ptrs_per_pgd;
+}
+
+/*
+ * Can't use pgd_populate here, because the extended idmap adds an extra level
+ * above CONFIG_PGTABLE_LEVELS (which is 2 or 3 if we're using the extended
+ * idmap), and pgd_populate is only available if CONFIG_PGTABLE_LEVELS = 4.
+ */
 static inline void __kvm_extend_hypmap(pgd_t *boot_hyp_pgd,
 				       pgd_t *hyp_pgd,
 				       pgd_t *merged_hyp_pgd,
 				       unsigned long hyp_idmap_start)
 {
 	int idmap_idx;
+	u64 pgd_addr;
 
 	/*
 	 * Use the first entry to access the HYP mappings. It is
@@ -289,7 +300,8 @@ static inline void __kvm_extend_hypmap(pgd_t *boot_hyp_pgd,
 	 * extended idmap.
 	 */
 	VM_BUG_ON(pgd_val(merged_hyp_pgd[0]));
-	merged_hyp_pgd[0] = __pgd(__pa(hyp_pgd) | PMD_TYPE_TABLE);
+	pgd_addr = __phys_to_pgd_val(__pa(hyp_pgd));
+	merged_hyp_pgd[0] = __pgd(pgd_addr | PMD_TYPE_TABLE);
 
 	/*
 	 * Create another extended level entry that points to the boot HYP map,
@@ -299,7 +311,8 @@ static inline void __kvm_extend_hypmap(pgd_t *boot_hyp_pgd,
 	 */
 	idmap_idx = hyp_idmap_start >> VA_BITS;
 	VM_BUG_ON(pgd_val(merged_hyp_pgd[idmap_idx]));
-	merged_hyp_pgd[idmap_idx] = __pgd(__pa(boot_hyp_pgd) | PMD_TYPE_TABLE);
+	pgd_addr = __phys_to_pgd_val(__pa(boot_hyp_pgd));
+	merged_hyp_pgd[idmap_idx] = __pgd(pgd_addr | PMD_TYPE_TABLE);
 }
 
 static inline unsigned int kvm_get_vmid_bits(void)
@@ -309,5 +322,45 @@ static inline unsigned int kvm_get_vmid_bits(void)
 	return (cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
 }
 
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+#include <asm/mmu.h>
+
+static inline void *kvm_get_hyp_vector(void)
+{
+	struct bp_hardening_data *data = arm64_get_bp_hardening_data();
+	void *vect = kvm_ksym_ref(__kvm_hyp_vector);
+
+	if (data->fn) {
+		vect = __bp_harden_hyp_vecs_start +
+		       data->hyp_vectors_slot * SZ_2K;
+
+		if (!has_vhe())
+			vect = lm_alias(vect);
+	}
+
+	return vect;
+}
+
+static inline int kvm_map_vectors(void)
+{
+	return create_hyp_mappings(kvm_ksym_ref(__bp_harden_hyp_vecs_start),
+				   kvm_ksym_ref(__bp_harden_hyp_vecs_end),
+				   PAGE_HYP_EXEC);
+}
+
+#else
+static inline void *kvm_get_hyp_vector(void)
+{
+	return kvm_ksym_ref(__kvm_hyp_vector);
+}
+
+static inline int kvm_map_vectors(void)
+{
+	return 0;
+}
+#endif
+
+#define kvm_phys_to_vttbr(addr)		phys_to_ttbr(addr)
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 0d34bf0..a050d4f 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -17,6 +17,11 @@
 #define __ASM_MMU_H
 
 #define MMCF_AARCH32	0x1	/* mm context flag for AArch32 executables */
+#define USER_ASID_BIT	48
+#define USER_ASID_FLAG	(UL(1) << USER_ASID_BIT)
+#define TTBR_ASID_MASK	(UL(0xffff) << 48)
+
+#ifndef __ASSEMBLY__
 
 typedef struct {
 	atomic64_t	id;
@@ -31,6 +36,49 @@ typedef struct {
  */
 #define ASID(mm)	((mm)->context.id.counter & 0xffff)
 
+static inline bool arm64_kernel_unmapped_at_el0(void)
+{
+	return IS_ENABLED(CONFIG_UNMAP_KERNEL_AT_EL0) &&
+	       cpus_have_const_cap(ARM64_UNMAP_KERNEL_AT_EL0);
+}
+
+typedef void (*bp_hardening_cb_t)(void);
+
+struct bp_hardening_data {
+	int			hyp_vectors_slot;
+	bp_hardening_cb_t	fn;
+};
+
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+extern char __bp_harden_hyp_vecs_start[], __bp_harden_hyp_vecs_end[];
+
+DECLARE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
+
+static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
+{
+	return this_cpu_ptr(&bp_hardening_data);
+}
+
+static inline void arm64_apply_bp_hardening(void)
+{
+	struct bp_hardening_data *d;
+
+	if (!cpus_have_const_cap(ARM64_HARDEN_BRANCH_PREDICTOR))
+		return;
+
+	d = arm64_get_bp_hardening_data();
+	if (d->fn)
+		d->fn();
+}
+#else
+static inline struct bp_hardening_data *arm64_get_bp_hardening_data(void)
+{
+	return NULL;
+}
+
+static inline void arm64_apply_bp_hardening(void)	{ }
+#endif	/* CONFIG_HARDEN_BRANCH_PREDICTOR */
+
 extern void paging_init(void);
 extern void bootmem_init(void);
 extern void __iomem *early_io_map(phys_addr_t phys, unsigned long virt);
@@ -41,4 +89,5 @@ extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
 extern void mark_linear_text_alias_ro(void);
 
+#endif	/* !__ASSEMBLY__ */
 #endif
diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
index 9d155fa..8d33319 100644
--- a/arch/arm64/include/asm/mmu_context.h
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -19,8 +19,6 @@
 #ifndef __ASM_MMU_CONTEXT_H
 #define __ASM_MMU_CONTEXT_H
 
-#define FALKOR_RESERVED_ASID	1
-
 #ifndef __ASSEMBLY__
 
 #include <linux/compiler.h>
@@ -51,23 +49,39 @@ static inline void contextidr_thread_switch(struct task_struct *next)
  */
 static inline void cpu_set_reserved_ttbr0(void)
 {
-	unsigned long ttbr = __pa_symbol(empty_zero_page);
+	unsigned long ttbr = phys_to_ttbr(__pa_symbol(empty_zero_page));
 
 	write_sysreg(ttbr, ttbr0_el1);
 	isb();
 }
 
+static inline void cpu_switch_mm(pgd_t *pgd, struct mm_struct *mm)
+{
+	BUG_ON(pgd == swapper_pg_dir);
+	cpu_set_reserved_ttbr0();
+	cpu_do_switch_mm(virt_to_phys(pgd),mm);
+}
+
 /*
  * TCR.T0SZ value to use when the ID map is active. Usually equals
  * TCR_T0SZ(VA_BITS), unless system RAM is positioned very high in
  * physical memory, in which case it will be smaller.
  */
 extern u64 idmap_t0sz;
+extern u64 idmap_ptrs_per_pgd;
 
 static inline bool __cpu_uses_extended_idmap(void)
 {
-	return (!IS_ENABLED(CONFIG_ARM64_VA_BITS_48) &&
-		unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS)));
+	return unlikely(idmap_t0sz != TCR_T0SZ(VA_BITS));
+}
+
+/*
+ * True if the extended ID map requires an extra level of translation table
+ * to be configured.
+ */
+static inline bool __cpu_uses_extended_idmap_level(void)
+{
+	return ARM64_HW_PGTABLE_LEVELS(64 - idmap_t0sz) > CONFIG_PGTABLE_LEVELS;
 }
 
 /*
@@ -170,7 +184,7 @@ static inline void update_saved_ttbr0(struct task_struct *tsk,
 	else
 		ttbr = virt_to_phys(mm->pgd) | ASID(mm) << 48;
 
-	task_thread_info(tsk)->ttbr0 = ttbr;
+	WRITE_ONCE(task_thread_info(tsk)->ttbr0, ttbr);
 }
 #else
 static inline void update_saved_ttbr0(struct task_struct *tsk,
@@ -225,6 +239,7 @@ switch_mm(struct mm_struct *prev, struct mm_struct *next,
 #define activate_mm(prev,next)	switch_mm(prev, next, current)
 
 void verify_cpu_asid_bits(void);
+void post_ttbr_update_workaround(void);
 
 #endif /* !__ASSEMBLY__ */
 
diff --git a/arch/arm64/include/asm/percpu.h b/arch/arm64/include/asm/percpu.h
index 3bd498e..4339320 100644
--- a/arch/arm64/include/asm/percpu.h
+++ b/arch/arm64/include/asm/percpu.h
@@ -16,11 +16,15 @@
 #ifndef __ASM_PERCPU_H
 #define __ASM_PERCPU_H
 
+#include <asm/alternative.h>
 #include <asm/stack_pointer.h>
 
 static inline void set_my_cpu_offset(unsigned long off)
 {
-	asm volatile("msr tpidr_el1, %0" :: "r" (off) : "memory");
+	asm volatile(ALTERNATIVE("msr tpidr_el1, %0",
+				 "msr tpidr_el2, %0",
+				 ARM64_HAS_VIRT_HOST_EXTN)
+			:: "r" (off) : "memory");
 }
 
 static inline unsigned long __my_cpu_offset(void)
@@ -31,7 +35,10 @@ static inline unsigned long __my_cpu_offset(void)
 	 * We want to allow caching the value, so avoid using volatile and
 	 * instead use a fake stack read to hazard against barrier().
 	 */
-	asm("mrs %0, tpidr_el1" : "=r" (off) :
+	asm(ALTERNATIVE("mrs %0, tpidr_el1",
+			"mrs %0, tpidr_el2",
+			ARM64_HAS_VIRT_HOST_EXTN)
+		: "=r" (off) :
 		"Q" (*(const unsigned long *)current_stack_pointer));
 
 	return off;
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 5ca6a57..e9d9f1b 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -44,7 +44,7 @@ static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
 
 static inline void __pud_populate(pud_t *pud, phys_addr_t pmd, pudval_t prot)
 {
-	set_pud(pud, __pud(pmd | prot));
+	set_pud(pud, __pud(__phys_to_pud_val(pmd) | prot));
 }
 
 static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
@@ -73,7 +73,7 @@ static inline void pud_free(struct mm_struct *mm, pud_t *pud)
 
 static inline void __pgd_populate(pgd_t *pgdp, phys_addr_t pud, pgdval_t prot)
 {
-	set_pgd(pgdp, __pgd(pud | prot));
+	set_pgd(pgdp, __pgd(__phys_to_pgd_val(pud) | prot));
 }
 
 static inline void pgd_populate(struct mm_struct *mm, pgd_t *pgd, pud_t *pud)
@@ -129,7 +129,7 @@ static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
 static inline void __pmd_populate(pmd_t *pmdp, phys_addr_t pte,
 				  pmdval_t prot)
 {
-	set_pmd(pmdp, __pmd(pte | prot));
+	set_pmd(pmdp, __pmd(__phys_to_pmd_val(pte) | prot));
 }
 
 /*
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index eb0c2bd..f42836d 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -16,6 +16,8 @@
 #ifndef __ASM_PGTABLE_HWDEF_H
 #define __ASM_PGTABLE_HWDEF_H
 
+#include <asm/memory.h>
+
 /*
  * Number of page-table levels required to address 'va_bits' wide
  * address, without section mapping. We resolve the top (va_bits - PAGE_SHIFT)
@@ -116,9 +118,9 @@
  * Level 1 descriptor (PUD).
  */
 #define PUD_TYPE_TABLE		(_AT(pudval_t, 3) << 0)
-#define PUD_TABLE_BIT		(_AT(pgdval_t, 1) << 1)
-#define PUD_TYPE_MASK		(_AT(pgdval_t, 3) << 0)
-#define PUD_TYPE_SECT		(_AT(pgdval_t, 1) << 0)
+#define PUD_TABLE_BIT		(_AT(pudval_t, 1) << 1)
+#define PUD_TYPE_MASK		(_AT(pudval_t, 3) << 0)
+#define PUD_TYPE_SECT		(_AT(pudval_t, 1) << 0)
 
 /*
  * Level 2 descriptor (PMD).
@@ -166,6 +168,14 @@
 #define PTE_UXN			(_AT(pteval_t, 1) << 54)	/* User XN */
 #define PTE_HYP_XN		(_AT(pteval_t, 1) << 54)	/* HYP XN */
 
+#define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
+#ifdef CONFIG_ARM64_PA_BITS_52
+#define PTE_ADDR_HIGH		(_AT(pteval_t, 0xf) << 12)
+#define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
+#else
+#define PTE_ADDR_MASK		PTE_ADDR_LOW
+#endif
+
 /*
  * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
  */
@@ -196,7 +206,7 @@
 /*
  * Highest possible physical address supported.
  */
-#define PHYS_MASK_SHIFT		(48)
+#define PHYS_MASK_SHIFT		(CONFIG_ARM64_PA_BITS)
 #define PHYS_MASK		((UL(1) << PHYS_MASK_SHIFT) - 1)
 
 /*
@@ -272,9 +282,23 @@
 #define TCR_TG1_4K		(UL(2) << TCR_TG1_SHIFT)
 #define TCR_TG1_64K		(UL(3) << TCR_TG1_SHIFT)
 
+#define TCR_IPS_SHIFT		32
+#define TCR_IPS_MASK		(UL(7) << TCR_IPS_SHIFT)
+#define TCR_A1			(UL(1) << 22)
 #define TCR_ASID16		(UL(1) << 36)
 #define TCR_TBI0		(UL(1) << 37)
 #define TCR_HA			(UL(1) << 39)
 #define TCR_HD			(UL(1) << 40)
 
+/*
+ * TTBR.
+ */
+#ifdef CONFIG_ARM64_PA_BITS_52
+/*
+ * This should be GENMASK_ULL(47, 2).
+ * TTBR_ELx[1] is RES0 in this configuration.
+ */
+#define TTBR_BADDR_MASK_52	(((UL(1) << 46) - 1) << 2)
+#endif
+
 #endif
diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 0a5635f..22a9268 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -34,8 +34,16 @@
 
 #include <asm/pgtable-types.h>
 
-#define PROT_DEFAULT		(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define PROT_SECT_DEFAULT	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define _PROT_DEFAULT		(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
+#define _PROT_SECT_DEFAULT	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define PROT_DEFAULT		(_PROT_DEFAULT | PTE_NG)
+#define PROT_SECT_DEFAULT	(_PROT_SECT_DEFAULT | PMD_SECT_NG)
+#else
+#define PROT_DEFAULT		_PROT_DEFAULT
+#define PROT_SECT_DEFAULT	_PROT_SECT_DEFAULT
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
 
 #define PROT_DEVICE_nGnRnE	(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRnE))
 #define PROT_DEVICE_nGnRE	(PROT_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_ATTRINDX(MT_DEVICE_nGnRE))
@@ -48,6 +56,7 @@
 #define PROT_SECT_NORMAL_EXEC	(PROT_SECT_DEFAULT | PMD_SECT_UXN | PMD_ATTRINDX(MT_NORMAL))
 
 #define _PAGE_DEFAULT		(PROT_DEFAULT | PTE_ATTRINDX(MT_NORMAL))
+#define _HYP_PAGE_DEFAULT	(_PAGE_DEFAULT & ~PTE_NG)
 
 #define PAGE_KERNEL		__pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_WRITE)
 #define PAGE_KERNEL_RO		__pgprot(_PAGE_DEFAULT | PTE_PXN | PTE_UXN | PTE_DIRTY | PTE_RDONLY)
@@ -55,15 +64,15 @@
 #define PAGE_KERNEL_EXEC	__pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE)
 #define PAGE_KERNEL_EXEC_CONT	__pgprot(_PAGE_DEFAULT | PTE_UXN | PTE_DIRTY | PTE_WRITE | PTE_CONT)
 
-#define PAGE_HYP		__pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_HYP_XN)
-#define PAGE_HYP_EXEC		__pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY)
-#define PAGE_HYP_RO		__pgprot(_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
+#define PAGE_HYP		__pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_HYP_XN)
+#define PAGE_HYP_EXEC		__pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY)
+#define PAGE_HYP_RO		__pgprot(_HYP_PAGE_DEFAULT | PTE_HYP | PTE_RDONLY | PTE_HYP_XN)
 #define PAGE_HYP_DEVICE		__pgprot(PROT_DEVICE_nGnRE | PTE_HYP)
 
 #define PAGE_S2			__pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_NORMAL) | PTE_S2_RDONLY)
 #define PAGE_S2_DEVICE		__pgprot(PROT_DEFAULT | PTE_S2_MEMATTR(MT_S2_DEVICE_nGnRE) | PTE_S2_RDONLY | PTE_UXN)
 
-#define PAGE_NONE		__pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_PXN | PTE_UXN)
+#define PAGE_NONE		__pgprot(((_PAGE_DEFAULT) & ~PTE_VALID) | PTE_PROT_NONE | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
 #define PAGE_SHARED		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_UXN | PTE_WRITE)
 #define PAGE_SHARED_EXEC	__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_PXN | PTE_WRITE)
 #define PAGE_READONLY		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN | PTE_UXN)
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index bdcc7f1..89167c4 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -59,9 +59,22 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 
 #define pte_ERROR(pte)		__pte_error(__FILE__, __LINE__, pte_val(pte))
 
-#define pte_pfn(pte)		((pte_val(pte) & PHYS_MASK) >> PAGE_SHIFT)
+/*
+ * Macros to convert between a physical address and its placement in a
+ * page table entry, taking care of 52-bit addresses.
+ */
+#ifdef CONFIG_ARM64_PA_BITS_52
+#define __pte_to_phys(pte)	\
+	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
+#define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
+#else
+#define __pte_to_phys(pte)	(pte_val(pte) & PTE_ADDR_MASK)
+#define __phys_to_pte_val(phys)	(phys)
+#endif
 
-#define pfn_pte(pfn,prot)	(__pte(((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)))
+#define pte_pfn(pte)		(__pte_to_phys(pte) >> PAGE_SHIFT)
+#define pfn_pte(pfn,prot)	\
+	__pte(__phys_to_pte_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
 
 #define pte_none(pte)		(!pte_val(pte))
 #define pte_clear(mm,addr,ptep)	set_pte(ptep, __pte(0))
@@ -292,6 +305,11 @@ static inline int pte_same(pte_t pte_a, pte_t pte_b)
 
 #define __HAVE_ARCH_PTE_SPECIAL
 
+static inline pte_t pgd_pte(pgd_t pgd)
+{
+	return __pte(pgd_val(pgd));
+}
+
 static inline pte_t pud_pte(pud_t pud)
 {
 	return __pte(pud_val(pud));
@@ -357,15 +375,24 @@ static inline int pmd_protnone(pmd_t pmd)
 
 #define pmd_mkhuge(pmd)		(__pmd(pmd_val(pmd) & ~PMD_TABLE_BIT))
 
-#define pmd_pfn(pmd)		(((pmd_val(pmd) & PMD_MASK) & PHYS_MASK) >> PAGE_SHIFT)
-#define pfn_pmd(pfn,prot)	(__pmd(((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)))
+#define __pmd_to_phys(pmd)	__pte_to_phys(pmd_pte(pmd))
+#define __phys_to_pmd_val(phys)	__phys_to_pte_val(phys)
+#define pmd_pfn(pmd)		((__pmd_to_phys(pmd) & PMD_MASK) >> PAGE_SHIFT)
+#define pfn_pmd(pfn,prot)	__pmd(__phys_to_pmd_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
 #define mk_pmd(page,prot)	pfn_pmd(page_to_pfn(page),prot)
 
 #define pud_write(pud)		pte_write(pud_pte(pud))
-#define pud_pfn(pud)		(((pud_val(pud) & PUD_MASK) & PHYS_MASK) >> PAGE_SHIFT)
+
+#define __pud_to_phys(pud)	__pte_to_phys(pud_pte(pud))
+#define __phys_to_pud_val(phys)	__phys_to_pte_val(phys)
+#define pud_pfn(pud)		((__pud_to_phys(pud) & PUD_MASK) >> PAGE_SHIFT)
+#define pfn_pud(pfn,prot)	__pud(__phys_to_pud_val((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot))
 
 #define set_pmd_at(mm, addr, pmdp, pmd)	set_pte_at(mm, addr, (pte_t *)pmdp, pmd_pte(pmd))
 
+#define __pgd_to_phys(pgd)	__pte_to_phys(pgd_pte(pgd))
+#define __phys_to_pgd_val(phys)	__phys_to_pte_val(phys)
+
 #define __pgprot_modify(prot,mask,bits) \
 	__pgprot((pgprot_val(prot) & ~(mask)) | (bits))
 
@@ -416,7 +443,7 @@ static inline void pmd_clear(pmd_t *pmdp)
 
 static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
 {
-	return pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK;
+	return __pmd_to_phys(pmd);
 }
 
 /* Find an entry in the third-level page table. */
@@ -434,7 +461,7 @@ static inline phys_addr_t pmd_page_paddr(pmd_t pmd)
 #define pte_set_fixmap_offset(pmd, addr)	pte_set_fixmap(pte_offset_phys(pmd, addr))
 #define pte_clear_fixmap()		clear_fixmap(FIX_PTE)
 
-#define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
+#define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(__pmd_to_phys(pmd)))
 
 /* use ONLY for statically allocated translation tables */
 #define pte_offset_kimg(dir,addr)	((pte_t *)__phys_to_kimg(pte_offset_phys((dir), (addr))))
@@ -467,7 +494,7 @@ static inline void pud_clear(pud_t *pudp)
 
 static inline phys_addr_t pud_page_paddr(pud_t pud)
 {
-	return pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK;
+	return __pud_to_phys(pud);
 }
 
 /* Find an entry in the second-level page table. */
@@ -480,7 +507,7 @@ static inline phys_addr_t pud_page_paddr(pud_t pud)
 #define pmd_set_fixmap_offset(pud, addr)	pmd_set_fixmap(pmd_offset_phys(pud, addr))
 #define pmd_clear_fixmap()		clear_fixmap(FIX_PMD)
 
-#define pud_page(pud)		pfn_to_page(__phys_to_pfn(pud_val(pud) & PHYS_MASK))
+#define pud_page(pud)		pfn_to_page(__phys_to_pfn(__pud_to_phys(pud)))
 
 /* use ONLY for statically allocated translation tables */
 #define pmd_offset_kimg(dir,addr)	((pmd_t *)__phys_to_kimg(pmd_offset_phys((dir), (addr))))
@@ -519,7 +546,7 @@ static inline void pgd_clear(pgd_t *pgdp)
 
 static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 {
-	return pgd_val(pgd) & PHYS_MASK & (s32)PAGE_MASK;
+	return __pgd_to_phys(pgd);
 }
 
 /* Find an entry in the frst-level page table. */
@@ -532,7 +559,7 @@ static inline phys_addr_t pgd_page_paddr(pgd_t pgd)
 #define pud_set_fixmap_offset(pgd, addr)	pud_set_fixmap(pud_offset_phys(pgd, addr))
 #define pud_clear_fixmap()		clear_fixmap(FIX_PUD)
 
-#define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(pgd_val(pgd) & PHYS_MASK))
+#define pgd_page(pgd)		pfn_to_page(__phys_to_pfn(__pgd_to_phys(pgd)))
 
 /* use ONLY for statically allocated translation tables */
 #define pud_offset_kimg(dir,addr)	((pud_t *)__phys_to_kimg(pud_offset_phys((dir), (addr))))
@@ -682,7 +709,9 @@ static inline void pmdp_set_wrprotect(struct mm_struct *mm,
 #endif
 
 extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
+extern pgd_t swapper_pg_end[];
 extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
+extern pgd_t tramp_pg_dir[PTRS_PER_PGD];
 
 /*
  * Encode and decode a swap entry:
@@ -736,6 +765,12 @@ static inline void update_mmu_cache(struct vm_area_struct *vma,
 #define kc_vaddr_to_offset(v)	((v) & ~VA_START)
 #define kc_offset_to_vaddr(o)	((o) | VA_START)
 
+#ifdef CONFIG_ARM64_PA_BITS_52
+#define phys_to_ttbr(addr)	(((addr) | ((addr) >> 46)) & TTBR_BADDR_MASK_52)
+#else
+#define phys_to_ttbr(addr)	(addr)
+#endif
+
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_PGTABLE_H */
diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h
index 14ad6e4e..16cef2e 100644
--- a/arch/arm64/include/asm/proc-fns.h
+++ b/arch/arm64/include/asm/proc-fns.h
@@ -35,12 +35,6 @@ extern u64 cpu_do_resume(phys_addr_t ptr, u64 idmap_ttbr);
 
 #include <asm/memory.h>
 
-#define cpu_switch_mm(pgd,mm)				\
-do {							\
-	BUG_ON(pgd == swapper_pg_dir);			\
-	cpu_do_switch_mm(virt_to_phys(pgd),mm);		\
-} while (0)
-
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* __ASM_PROCFNS_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
index 023cacb..cee4ae2 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -216,6 +216,7 @@ static inline void spin_lock_prefetch(const void *ptr)
 
 int cpu_enable_pan(void *__unused);
 int cpu_enable_cache_maint_trap(void *__unused);
+int cpu_clear_disr(void *__unused);
 
 /* Userspace interface for PR_SVE_{SET,GET}_VL prctl()s: */
 #define SVE_SET_VL(arg)	sve_set_current_vl(arg)
diff --git a/arch/arm64/include/asm/sdei.h b/arch/arm64/include/asm/sdei.h
new file mode 100644
index 0000000..e073e68
--- /dev/null
+++ b/arch/arm64/include/asm/sdei.h
@@ -0,0 +1,57 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2017 Arm Ltd.
+#ifndef __ASM_SDEI_H
+#define __ASM_SDEI_H
+
+/* Values for sdei_exit_mode */
+#define SDEI_EXIT_HVC  0
+#define SDEI_EXIT_SMC  1
+
+#define SDEI_STACK_SIZE		IRQ_STACK_SIZE
+
+#ifndef __ASSEMBLY__
+
+#include <linux/linkage.h>
+#include <linux/preempt.h>
+#include <linux/types.h>
+
+#include <asm/virt.h>
+
+extern unsigned long sdei_exit_mode;
+
+/* Software Delegated Exception entry point from firmware*/
+asmlinkage void __sdei_asm_handler(unsigned long event_num, unsigned long arg,
+				   unsigned long pc, unsigned long pstate);
+
+/* and its CONFIG_UNMAP_KERNEL_AT_EL0 trampoline */
+asmlinkage void __sdei_asm_entry_trampoline(unsigned long event_num,
+						   unsigned long arg,
+						   unsigned long pc,
+						   unsigned long pstate);
+
+/*
+ * The above entry point does the minimum to call C code. This function does
+ * anything else, before calling the driver.
+ */
+struct sdei_registered_event;
+asmlinkage unsigned long __sdei_handler(struct pt_regs *regs,
+					struct sdei_registered_event *arg);
+
+unsigned long sdei_arch_get_entry_point(int conduit);
+#define sdei_arch_get_entry_point(x)	sdei_arch_get_entry_point(x)
+
+bool _on_sdei_stack(unsigned long sp);
+static inline bool on_sdei_stack(unsigned long sp)
+{
+	if (!IS_ENABLED(CONFIG_VMAP_STACK))
+		return false;
+	if (!IS_ENABLED(CONFIG_ARM_SDE_INTERFACE))
+		return false;
+	if (in_nmi())
+		return _on_sdei_stack(sp);
+
+	return false;
+}
+
+#endif /* __ASSEMBLY__ */
+#endif	/* __ASM_SDEI_H */
diff --git a/arch/arm64/include/asm/sections.h b/arch/arm64/include/asm/sections.h
index 941267c..caab039 100644
--- a/arch/arm64/include/asm/sections.h
+++ b/arch/arm64/include/asm/sections.h
@@ -28,5 +28,6 @@ extern char __initdata_begin[], __initdata_end[];
 extern char __inittext_begin[], __inittext_end[];
 extern char __irqentry_text_start[], __irqentry_text_end[];
 extern char __mmuoff_data_start[], __mmuoff_data_end[];
+extern char __entry_tramp_text_start[], __entry_tramp_text_end[];
 
 #endif /* __ASM_SECTIONS_H */
diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h
index 74a9d30..b299929 100644
--- a/arch/arm64/include/asm/sparsemem.h
+++ b/arch/arm64/include/asm/sparsemem.h
@@ -17,7 +17,7 @@
 #define __ASM_SPARSEMEM_H
 
 #ifdef CONFIG_SPARSEMEM
-#define MAX_PHYSMEM_BITS	48
+#define MAX_PHYSMEM_BITS	CONFIG_ARM64_PA_BITS
 #define SECTION_SIZE_BITS	30
 #endif
 
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index 6ad3077..472ef94 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -22,6 +22,7 @@
 
 #include <asm/memory.h>
 #include <asm/ptrace.h>
+#include <asm/sdei.h>
 
 struct stackframe {
 	unsigned long fp;
@@ -85,6 +86,8 @@ static inline bool on_accessible_stack(struct task_struct *tsk, unsigned long sp
 		return true;
 	if (on_overflow_stack(sp))
 		return true;
+	if (on_sdei_stack(sp))
+		return true;
 
 	return false;
 }
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 08cc885..0e1960c 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -20,6 +20,7 @@
 #ifndef __ASM_SYSREG_H
 #define __ASM_SYSREG_H
 
+#include <asm/compiler.h>
 #include <linux/stringify.h>
 
 /*
@@ -175,6 +176,16 @@
 #define SYS_AFSR0_EL1			sys_reg(3, 0, 5, 1, 0)
 #define SYS_AFSR1_EL1			sys_reg(3, 0, 5, 1, 1)
 #define SYS_ESR_EL1			sys_reg(3, 0, 5, 2, 0)
+
+#define SYS_ERRIDR_EL1			sys_reg(3, 0, 5, 3, 0)
+#define SYS_ERRSELR_EL1			sys_reg(3, 0, 5, 3, 1)
+#define SYS_ERXFR_EL1			sys_reg(3, 0, 5, 4, 0)
+#define SYS_ERXCTLR_EL1			sys_reg(3, 0, 5, 4, 1)
+#define SYS_ERXSTATUS_EL1		sys_reg(3, 0, 5, 4, 2)
+#define SYS_ERXADDR_EL1			sys_reg(3, 0, 5, 4, 3)
+#define SYS_ERXMISC0_EL1		sys_reg(3, 0, 5, 5, 0)
+#define SYS_ERXMISC1_EL1		sys_reg(3, 0, 5, 5, 1)
+
 #define SYS_FAR_EL1			sys_reg(3, 0, 6, 0, 0)
 #define SYS_PAR_EL1			sys_reg(3, 0, 7, 4, 0)
 
@@ -278,6 +289,7 @@
 #define SYS_AMAIR_EL1			sys_reg(3, 0, 10, 3, 0)
 
 #define SYS_VBAR_EL1			sys_reg(3, 0, 12, 0, 0)
+#define SYS_DISR_EL1			sys_reg(3, 0, 12, 1, 1)
 
 #define SYS_ICC_IAR0_EL1		sys_reg(3, 0, 12, 8, 0)
 #define SYS_ICC_EOIR0_EL1		sys_reg(3, 0, 12, 8, 1)
@@ -353,8 +365,10 @@
 
 #define SYS_DACR32_EL2			sys_reg(3, 4, 3, 0, 0)
 #define SYS_IFSR32_EL2			sys_reg(3, 4, 5, 0, 1)
+#define SYS_VSESR_EL2			sys_reg(3, 4, 5, 2, 3)
 #define SYS_FPEXC32_EL2			sys_reg(3, 4, 5, 3, 0)
 
+#define SYS_VDISR_EL2			sys_reg(3, 4, 12, 1,  1)
 #define __SYS__AP0Rx_EL2(x)		sys_reg(3, 4, 12, 8, x)
 #define SYS_ICH_AP0R0_EL2		__SYS__AP0Rx_EL2(0)
 #define SYS_ICH_AP0R1_EL2		__SYS__AP0Rx_EL2(1)
@@ -398,27 +412,85 @@
 
 /* Common SCTLR_ELx flags. */
 #define SCTLR_ELx_EE    (1 << 25)
+#define SCTLR_ELx_IESB	(1 << 21)
+#define SCTLR_ELx_WXN	(1 << 19)
 #define SCTLR_ELx_I	(1 << 12)
 #define SCTLR_ELx_SA	(1 << 3)
 #define SCTLR_ELx_C	(1 << 2)
 #define SCTLR_ELx_A	(1 << 1)
 #define SCTLR_ELx_M	1
 
+#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M  | SCTLR_ELx_A | SCTLR_ELx_C | \
+			 SCTLR_ELx_SA | SCTLR_ELx_I | SCTLR_ELx_IESB)
+
+/* SCTLR_EL2 specific flags. */
 #define SCTLR_EL2_RES1	((1 << 4)  | (1 << 5)  | (1 << 11) | (1 << 16) | \
 			 (1 << 18) | (1 << 22) | (1 << 23) | (1 << 28) | \
 			 (1 << 29))
+#define SCTLR_EL2_RES0	((1 << 6)  | (1 << 7)  | (1 << 8)  | (1 << 9)  | \
+			 (1 << 10) | (1 << 13) | (1 << 14) | (1 << 15) | \
+			 (1 << 17) | (1 << 20) | (1 << 24) | (1 << 26) | \
+			 (1 << 27) | (1 << 30) | (1 << 31))
 
-#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | \
-			 SCTLR_ELx_SA | SCTLR_ELx_I)
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define ENDIAN_SET_EL2		SCTLR_ELx_EE
+#define ENDIAN_CLEAR_EL2	0
+#else
+#define ENDIAN_SET_EL2		0
+#define ENDIAN_CLEAR_EL2	SCTLR_ELx_EE
+#endif
+
+/* SCTLR_EL2 value used for the hyp-stub */
+#define SCTLR_EL2_SET	(SCTLR_ELx_IESB   | ENDIAN_SET_EL2   | SCTLR_EL2_RES1)
+#define SCTLR_EL2_CLEAR	(SCTLR_ELx_M      | SCTLR_ELx_A    | SCTLR_ELx_C   | \
+			 SCTLR_ELx_SA     | SCTLR_ELx_I    | SCTLR_ELx_WXN | \
+			 ENDIAN_CLEAR_EL2 | SCTLR_EL2_RES0)
+
+/* Check all the bits are accounted for */
+#define SCTLR_EL2_BUILD_BUG_ON_MISSING_BITS	BUILD_BUG_ON((SCTLR_EL2_SET ^ SCTLR_EL2_CLEAR) != ~0)
+
 
 /* SCTLR_EL1 specific flags. */
 #define SCTLR_EL1_UCI		(1 << 26)
+#define SCTLR_EL1_E0E		(1 << 24)
 #define SCTLR_EL1_SPAN		(1 << 23)
+#define SCTLR_EL1_NTWE		(1 << 18)
+#define SCTLR_EL1_NTWI		(1 << 16)
 #define SCTLR_EL1_UCT		(1 << 15)
+#define SCTLR_EL1_DZE		(1 << 14)
+#define SCTLR_EL1_UMA		(1 << 9)
 #define SCTLR_EL1_SED		(1 << 8)
+#define SCTLR_EL1_ITD		(1 << 7)
 #define SCTLR_EL1_CP15BEN	(1 << 5)
+#define SCTLR_EL1_SA0		(1 << 4)
+
+#define SCTLR_EL1_RES1	((1 << 11) | (1 << 20) | (1 << 22) | (1 << 28) | \
+			 (1 << 29))
+#define SCTLR_EL1_RES0  ((1 << 6)  | (1 << 10) | (1 << 13) | (1 << 17) | \
+			 (1 << 27) | (1 << 30) | (1 << 31))
+
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define ENDIAN_SET_EL1		(SCTLR_EL1_E0E | SCTLR_ELx_EE)
+#define ENDIAN_CLEAR_EL1	0
+#else
+#define ENDIAN_SET_EL1		0
+#define ENDIAN_CLEAR_EL1	(SCTLR_EL1_E0E | SCTLR_ELx_EE)
+#endif
+
+#define SCTLR_EL1_SET	(SCTLR_ELx_M    | SCTLR_ELx_C    | SCTLR_ELx_SA   |\
+			 SCTLR_EL1_SA0  | SCTLR_EL1_SED  | SCTLR_ELx_I    |\
+			 SCTLR_EL1_DZE  | SCTLR_EL1_UCT  | SCTLR_EL1_NTWI |\
+			 SCTLR_EL1_NTWE | SCTLR_ELx_IESB | SCTLR_EL1_SPAN |\
+			 ENDIAN_SET_EL1 | SCTLR_EL1_UCI  | SCTLR_EL1_RES1)
+#define SCTLR_EL1_CLEAR	(SCTLR_ELx_A   | SCTLR_EL1_CP15BEN | SCTLR_EL1_ITD    |\
+			 SCTLR_EL1_UMA | SCTLR_ELx_WXN     | ENDIAN_CLEAR_EL1 |\
+			 SCTLR_EL1_RES0)
+
+/* Check all the bits are accounted for */
+#define SCTLR_EL1_BUILD_BUG_ON_MISSING_BITS	BUILD_BUG_ON((SCTLR_EL1_SET ^ SCTLR_EL1_CLEAR) != ~0)
 
 /* id_aa64isar0 */
+#define ID_AA64ISAR0_FHM_SHIFT		48
 #define ID_AA64ISAR0_DP_SHIFT		44
 #define ID_AA64ISAR0_SM4_SHIFT		40
 #define ID_AA64ISAR0_SM3_SHIFT		36
@@ -437,7 +509,10 @@
 #define ID_AA64ISAR1_DPB_SHIFT		0
 
 /* id_aa64pfr0 */
+#define ID_AA64PFR0_CSV3_SHIFT		60
+#define ID_AA64PFR0_CSV2_SHIFT		56
 #define ID_AA64PFR0_SVE_SHIFT		32
+#define ID_AA64PFR0_RAS_SHIFT		28
 #define ID_AA64PFR0_GIC_SHIFT		24
 #define ID_AA64PFR0_ASIMD_SHIFT		20
 #define ID_AA64PFR0_FP_SHIFT		16
@@ -447,6 +522,7 @@
 #define ID_AA64PFR0_EL0_SHIFT		0
 
 #define ID_AA64PFR0_SVE			0x1
+#define ID_AA64PFR0_RAS_V1		0x1
 #define ID_AA64PFR0_FP_NI		0xf
 #define ID_AA64PFR0_FP_SUPPORTED	0x0
 #define ID_AA64PFR0_ASIMD_NI		0xf
@@ -471,6 +547,14 @@
 #define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
 #define ID_AA64MMFR0_TGRAN16_NI		0x0
 #define ID_AA64MMFR0_TGRAN16_SUPPORTED	0x1
+#define ID_AA64MMFR0_PARANGE_48		0x5
+#define ID_AA64MMFR0_PARANGE_52		0x6
+
+#ifdef CONFIG_ARM64_PA_BITS_52
+#define ID_AA64MMFR0_PARANGE_MAX	ID_AA64MMFR0_PARANGE_52
+#else
+#define ID_AA64MMFR0_PARANGE_MAX	ID_AA64MMFR0_PARANGE_48
+#endif
 
 /* id_aa64mmfr1 */
 #define ID_AA64MMFR1_PAN_SHIFT		20
@@ -582,6 +666,7 @@
 
 #else
 
+#include <linux/build_bug.h>
 #include <linux/types.h>
 
 asm(
@@ -638,6 +723,9 @@ static inline void config_sctlr_el1(u32 clear, u32 set)
 {
 	u32 val;
 
+	SCTLR_EL2_BUILD_BUG_ON_MISSING_BITS;
+	SCTLR_EL1_BUILD_BUG_ON_MISSING_BITS;
+
 	val = read_sysreg(sctlr_el1);
 	val &= ~clear;
 	val |= set;
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index eb43128..740aa03c 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -51,8 +51,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,					\
 }
 
-#define init_stack		(init_thread_union.stack)
-
 #define thread_saved_pc(tsk)	\
 	((unsigned long)(tsk->thread.cpu_context.pc))
 #define thread_saved_sp(tsk)	\
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index af1c769..9e82dd7 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -23,6 +23,7 @@
 
 #include <linux/sched.h>
 #include <asm/cputype.h>
+#include <asm/mmu.h>
 
 /*
  * Raw TLBI operations.
@@ -54,6 +55,11 @@
 
 #define __tlbi(op, ...)		__TLBI_N(op, ##__VA_ARGS__, 1, 0)
 
+#define __tlbi_user(op, arg) do {						\
+	if (arm64_kernel_unmapped_at_el0())					\
+		__tlbi(op, (arg) | USER_ASID_FLAG);				\
+} while (0)
+
 /*
  *	TLB Management
  *	==============
@@ -115,6 +121,7 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
 
 	dsb(ishst);
 	__tlbi(aside1is, asid);
+	__tlbi_user(aside1is, asid);
 	dsb(ish);
 }
 
@@ -125,6 +132,7 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 
 	dsb(ishst);
 	__tlbi(vale1is, addr);
+	__tlbi_user(vale1is, addr);
 	dsb(ish);
 }
 
@@ -151,10 +159,13 @@ static inline void __flush_tlb_range(struct vm_area_struct *vma,
 
 	dsb(ishst);
 	for (addr = start; addr < end; addr += 1 << (PAGE_SHIFT - 12)) {
-		if (last_level)
+		if (last_level) {
 			__tlbi(vale1is, addr);
-		else
+			__tlbi_user(vale1is, addr);
+		} else {
 			__tlbi(vae1is, addr);
+			__tlbi_user(vae1is, addr);
+		}
 	}
 	dsb(ish);
 }
@@ -194,6 +205,7 @@ static inline void __flush_tlb_pgtable(struct mm_struct *mm,
 	unsigned long addr = uaddr >> 12 | (ASID(mm) << 48);
 
 	__tlbi(vae1is, addr);
+	__tlbi_user(vae1is, addr);
 	dsb(ish);
 }
 
diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
index 1696f9d..178e338 100644
--- a/arch/arm64/include/asm/traps.h
+++ b/arch/arm64/include/asm/traps.h
@@ -19,6 +19,7 @@
 #define __ASM_TRAP_H
 
 #include <linux/list.h>
+#include <asm/esr.h>
 #include <asm/sections.h>
 
 struct pt_regs;
@@ -66,4 +67,57 @@ static inline int in_entry_text(unsigned long ptr)
 	return ptr >= (unsigned long)&__entry_text_start &&
 	       ptr < (unsigned long)&__entry_text_end;
 }
+
+/*
+ * CPUs with the RAS extensions have an Implementation-Defined-Syndrome bit
+ * to indicate whether this ESR has a RAS encoding. CPUs without this feature
+ * have a ISS-Valid bit in the same position.
+ * If this bit is set, we know its not a RAS SError.
+ * If its clear, we need to know if the CPU supports RAS. Uncategorized RAS
+ * errors share the same encoding as an all-zeros encoding from a CPU that
+ * doesn't support RAS.
+ */
+static inline bool arm64_is_ras_serror(u32 esr)
+{
+	WARN_ON(preemptible());
+
+	if (esr & ESR_ELx_IDS)
+		return false;
+
+	if (this_cpu_has_cap(ARM64_HAS_RAS_EXTN))
+		return true;
+	else
+		return false;
+}
+
+/*
+ * Return the AET bits from a RAS SError's ESR.
+ *
+ * It is implementation defined whether Uncategorized errors are containable.
+ * We treat them as Uncontainable.
+ * Non-RAS SError's are reported as Uncontained/Uncategorized.
+ */
+static inline u32 arm64_ras_serror_get_severity(u32 esr)
+{
+	u32 aet = esr & ESR_ELx_AET;
+
+	if (!arm64_is_ras_serror(esr)) {
+		/* Not a RAS error, we can't interpret the ESR. */
+		return ESR_ELx_AET_UC;
+	}
+
+	/*
+	 * AET is RES0 if 'the value returned in the DFSC field is not
+	 * [ESR_ELx_FSC_SERROR]'
+	 */
+	if ((esr & ESR_ELx_FSC) != ESR_ELx_FSC_SERROR) {
+		/* No severity information : Uncategorized */
+		return ESR_ELx_AET_UC;
+	}
+
+	return aet;
+}
+
+bool arm64_is_fatal_ras_serror(struct pt_regs *regs, unsigned int esr);
+void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr);
 #endif
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index fc0f9eb..59fda529 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -105,17 +105,23 @@ static inline void set_fs(mm_segment_t fs)
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 static inline void __uaccess_ttbr0_disable(void)
 {
-	unsigned long ttbr;
+	unsigned long flags, ttbr;
 
-	/* reserved_ttbr0 placed at the end of swapper_pg_dir */
-	ttbr = read_sysreg(ttbr1_el1) + SWAPPER_DIR_SIZE;
-	write_sysreg(ttbr, ttbr0_el1);
+	local_irq_save(flags);
+	ttbr = read_sysreg(ttbr1_el1);
+	ttbr &= ~TTBR_ASID_MASK;
+	/* reserved_ttbr0 placed before swapper_pg_dir */
+	write_sysreg(ttbr - RESERVED_TTBR0_SIZE, ttbr0_el1);
 	isb();
+	/* Set reserved ASID */
+	write_sysreg(ttbr, ttbr1_el1);
+	isb();
+	local_irq_restore(flags);
 }
 
 static inline void __uaccess_ttbr0_enable(void)
 {
-	unsigned long flags;
+	unsigned long flags, ttbr0, ttbr1;
 
 	/*
 	 * Disable interrupts to avoid preemption between reading the 'ttbr0'
@@ -123,7 +129,17 @@ static inline void __uaccess_ttbr0_enable(void)
 	 * roll-over and an update of 'ttbr0'.
 	 */
 	local_irq_save(flags);
-	write_sysreg(current_thread_info()->ttbr0, ttbr0_el1);
+	ttbr0 = READ_ONCE(current_thread_info()->ttbr0);
+
+	/* Restore active ASID */
+	ttbr1 = read_sysreg(ttbr1_el1);
+	ttbr1 &= ~TTBR_ASID_MASK;		/* safety measure */
+	ttbr1 |= ttbr0 & TTBR_ASID_MASK;
+	write_sysreg(ttbr1, ttbr1_el1);
+	isb();
+
+	/* Restore user page table */
+	write_sysreg(ttbr0, ttbr0_el1);
 	isb();
 	local_irq_restore(flags);
 }
@@ -155,6 +171,18 @@ static inline bool uaccess_ttbr0_enable(void)
 }
 #endif
 
+static inline void __uaccess_disable_hw_pan(void)
+{
+	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(0), ARM64_HAS_PAN,
+			CONFIG_ARM64_PAN));
+}
+
+static inline void __uaccess_enable_hw_pan(void)
+{
+	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
+			CONFIG_ARM64_PAN));
+}
+
 #define __uaccess_disable(alt)						\
 do {									\
 	if (!uaccess_ttbr0_disable())					\
diff --git a/arch/arm64/include/asm/vmap_stack.h b/arch/arm64/include/asm/vmap_stack.h
new file mode 100644
index 0000000..0b5ec6e
--- /dev/null
+++ b/arch/arm64/include/asm/vmap_stack.h
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2017 Arm Ltd.
+#ifndef __ASM_VMAP_STACK_H
+#define __ASM_VMAP_STACK_H
+
+#include <linux/bug.h>
+#include <linux/gfp.h>
+#include <linux/kconfig.h>
+#include <linux/vmalloc.h>
+#include <asm/memory.h>
+#include <asm/pgtable.h>
+#include <asm/thread_info.h>
+
+/*
+ * To ensure that VMAP'd stack overflow detection works correctly, all VMAP'd
+ * stacks need to have the same alignment.
+ */
+static inline unsigned long *arch_alloc_vmap_stack(size_t stack_size, int node)
+{
+	BUILD_BUG_ON(!IS_ENABLED(CONFIG_VMAP_STACK));
+
+	return __vmalloc_node_range(stack_size, THREAD_ALIGN,
+				    VMALLOC_START, VMALLOC_END,
+				    THREADINFO_GFP, PAGE_KERNEL, 0, node,
+				    __builtin_return_address(0));
+}
+
+#endif /* __ASM_VMAP_STACK_H */
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index cda76fa..f018c3d 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -43,5 +43,6 @@
 #define HWCAP_ASIMDDP		(1 << 20)
 #define HWCAP_SHA512		(1 << 21)
 #define HWCAP_SVE		(1 << 22)
+#define HWCAP_ASIMDFHM		(1 << 23)
 
 #endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 067baac..b875413 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -52,6 +52,11 @@
 arm64-obj-$(CONFIG_ARM64_RELOC_TEST)	+= arm64-reloc-test.o
 arm64-reloc-test-y := reloc_test_core.o reloc_test_syms.o
 arm64-obj-$(CONFIG_CRASH_DUMP)		+= crash_dump.o
+arm64-obj-$(CONFIG_ARM_SDE_INTERFACE)	+= sdei.o
+
+ifeq ($(CONFIG_KVM),y)
+arm64-obj-$(CONFIG_HARDEN_BRANCH_PREDICTOR)	+= bpi.o
+endif
 
 obj-y					+= $(arm64-obj-y) vdso/ probes/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/acpi.c b/arch/arm64/kernel/acpi.c
index b3162715..252396a 100644
--- a/arch/arm64/kernel/acpi.c
+++ b/arch/arm64/kernel/acpi.c
@@ -117,7 +117,7 @@ bool __init acpi_psci_present(void)
 }
 
 /* Whether HVC must be used instead of SMC as the PSCI conduit */
-bool __init acpi_psci_use_hvc(void)
+bool acpi_psci_use_hvc(void)
 {
 	return acpi_gbl_FADT.arm_boot_flags & ACPI_FADT_PSCI_USE_HVC;
 }
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c
index 6dd0a3a3..414288a 100644
--- a/arch/arm64/kernel/alternative.c
+++ b/arch/arm64/kernel/alternative.c
@@ -32,6 +32,8 @@
 #define ALT_ORIG_PTR(a)		__ALT_PTR(a, orig_offset)
 #define ALT_REPL_PTR(a)		__ALT_PTR(a, alt_offset)
 
+int alternatives_applied;
+
 struct alt_region {
 	struct alt_instr *begin;
 	struct alt_instr *end;
@@ -143,7 +145,6 @@ static void __apply_alternatives(void *alt_region, bool use_linear_alias)
  */
 static int __apply_alternatives_multi_stop(void *unused)
 {
-	static int patched = 0;
 	struct alt_region region = {
 		.begin	= (struct alt_instr *)__alt_instructions,
 		.end	= (struct alt_instr *)__alt_instructions_end,
@@ -151,14 +152,14 @@ static int __apply_alternatives_multi_stop(void *unused)
 
 	/* We always have a CPU 0 at this point (__init) */
 	if (smp_processor_id()) {
-		while (!READ_ONCE(patched))
+		while (!READ_ONCE(alternatives_applied))
 			cpu_relax();
 		isb();
 	} else {
-		BUG_ON(patched);
+		BUG_ON(alternatives_applied);
 		__apply_alternatives(&region, true);
 		/* Barriers provided by the cache flushing */
-		WRITE_ONCE(patched, 1);
+		WRITE_ONCE(alternatives_applied, 1);
 	}
 
 	return 0;
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 71bf088..1303e04 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -18,12 +18,14 @@
  * along with this program.  If not, see <http://www.gnu.org/licenses/>.
  */
 
+#include <linux/arm_sdei.h>
 #include <linux/sched.h>
 #include <linux/mm.h>
 #include <linux/dma-mapping.h>
 #include <linux/kvm_host.h>
 #include <linux/suspend.h>
 #include <asm/cpufeature.h>
+#include <asm/fixmap.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/smp_plat.h>
@@ -130,6 +132,7 @@ int main(void)
   BLANK();
 #ifdef CONFIG_KVM_ARM_HOST
   DEFINE(VCPU_CONTEXT,		offsetof(struct kvm_vcpu, arch.ctxt));
+  DEFINE(VCPU_FAULT_DISR,	offsetof(struct kvm_vcpu, arch.fault.disr_el1));
   DEFINE(CPU_GP_REGS,		offsetof(struct kvm_cpu_context, gp_regs));
   DEFINE(CPU_USER_PT_REGS,	offsetof(struct kvm_regs, regs));
   DEFINE(CPU_FP_REGS,		offsetof(struct kvm_regs, fp_regs));
@@ -148,11 +151,18 @@ int main(void)
   DEFINE(ARM_SMCCC_RES_X2_OFFS,		offsetof(struct arm_smccc_res, a2));
   DEFINE(ARM_SMCCC_QUIRK_ID_OFFS,	offsetof(struct arm_smccc_quirk, id));
   DEFINE(ARM_SMCCC_QUIRK_STATE_OFFS,	offsetof(struct arm_smccc_quirk, state));
-
   BLANK();
   DEFINE(HIBERN_PBE_ORIG,	offsetof(struct pbe, orig_address));
   DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
   DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
   DEFINE(ARM64_FTR_SYSVAL,	offsetof(struct arm64_ftr_reg, sys_val));
+  BLANK();
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+  DEFINE(TRAMP_VALIAS,		TRAMP_VALIAS);
+#endif
+#ifdef CONFIG_ARM_SDE_INTERFACE
+  DEFINE(SDEI_EVENT_INTREGS,	offsetof(struct sdei_registered_event, interrupted_regs));
+  DEFINE(SDEI_EVENT_PRIORITY,	offsetof(struct sdei_registered_event, priority));
+#endif
   return 0;
 }
diff --git a/arch/arm64/kernel/bpi.S b/arch/arm64/kernel/bpi.S
new file mode 100644
index 0000000..76225c2
--- /dev/null
+++ b/arch/arm64/kernel/bpi.S
@@ -0,0 +1,87 @@
+/*
+ * Contains CPU specific branch predictor invalidation sequences
+ *
+ * Copyright (C) 2018 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+
+.macro ventry target
+	.rept 31
+	nop
+	.endr
+	b	\target
+.endm
+
+.macro vectors target
+	ventry \target + 0x000
+	ventry \target + 0x080
+	ventry \target + 0x100
+	ventry \target + 0x180
+
+	ventry \target + 0x200
+	ventry \target + 0x280
+	ventry \target + 0x300
+	ventry \target + 0x380
+
+	ventry \target + 0x400
+	ventry \target + 0x480
+	ventry \target + 0x500
+	ventry \target + 0x580
+
+	ventry \target + 0x600
+	ventry \target + 0x680
+	ventry \target + 0x700
+	ventry \target + 0x780
+.endm
+
+	.align	11
+ENTRY(__bp_harden_hyp_vecs_start)
+	.rept 4
+	vectors __kvm_hyp_vector
+	.endr
+ENTRY(__bp_harden_hyp_vecs_end)
+ENTRY(__psci_hyp_bp_inval_start)
+	sub	sp, sp, #(8 * 18)
+	stp	x16, x17, [sp, #(16 * 0)]
+	stp	x14, x15, [sp, #(16 * 1)]
+	stp	x12, x13, [sp, #(16 * 2)]
+	stp	x10, x11, [sp, #(16 * 3)]
+	stp	x8, x9, [sp, #(16 * 4)]
+	stp	x6, x7, [sp, #(16 * 5)]
+	stp	x4, x5, [sp, #(16 * 6)]
+	stp	x2, x3, [sp, #(16 * 7)]
+	stp	x0, x1, [sp, #(16 * 8)]
+	mov	x0, #0x84000000
+	smc	#0
+	ldp	x16, x17, [sp, #(16 * 0)]
+	ldp	x14, x15, [sp, #(16 * 1)]
+	ldp	x12, x13, [sp, #(16 * 2)]
+	ldp	x10, x11, [sp, #(16 * 3)]
+	ldp	x8, x9, [sp, #(16 * 4)]
+	ldp	x6, x7, [sp, #(16 * 5)]
+	ldp	x4, x5, [sp, #(16 * 6)]
+	ldp	x2, x3, [sp, #(16 * 7)]
+	ldp	x0, x1, [sp, #(16 * 8)]
+	add	sp, sp, #(8 * 18)
+ENTRY(__psci_hyp_bp_inval_end)
+
+ENTRY(__qcom_hyp_sanitize_link_stack_start)
+	stp     x29, x30, [sp, #-16]!
+	.rept	16
+	bl	. + 4
+	.endr
+	ldp	x29, x30, [sp], #16
+ENTRY(__qcom_hyp_sanitize_link_stack_end)
diff --git a/arch/arm64/kernel/cpu_errata.c b/arch/arm64/kernel/cpu_errata.c
index 0e27f86..ed68818 100644
--- a/arch/arm64/kernel/cpu_errata.c
+++ b/arch/arm64/kernel/cpu_errata.c
@@ -30,6 +30,20 @@ is_affected_midr_range(const struct arm64_cpu_capabilities *entry, int scope)
 				       entry->midr_range_max);
 }
 
+static bool __maybe_unused
+is_kryo_midr(const struct arm64_cpu_capabilities *entry, int scope)
+{
+	u32 model;
+
+	WARN_ON(scope != SCOPE_LOCAL_CPU || preemptible());
+
+	model = read_cpuid_id();
+	model &= MIDR_IMPLEMENTOR_MASK | (0xf00 << MIDR_PARTNUM_SHIFT) |
+		 MIDR_ARCHITECTURE_MASK;
+
+	return model == entry->midr_model;
+}
+
 static bool
 has_mismatched_cache_line_size(const struct arm64_cpu_capabilities *entry,
 				int scope)
@@ -46,6 +60,127 @@ static int cpu_enable_trap_ctr_access(void *__unused)
 	return 0;
 }
 
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+#include <asm/mmu_context.h>
+#include <asm/cacheflush.h>
+
+DEFINE_PER_CPU_READ_MOSTLY(struct bp_hardening_data, bp_hardening_data);
+
+#ifdef CONFIG_KVM
+extern char __psci_hyp_bp_inval_start[], __psci_hyp_bp_inval_end[];
+extern char __qcom_hyp_sanitize_link_stack_start[];
+extern char __qcom_hyp_sanitize_link_stack_end[];
+
+static void __copy_hyp_vect_bpi(int slot, const char *hyp_vecs_start,
+				const char *hyp_vecs_end)
+{
+	void *dst = lm_alias(__bp_harden_hyp_vecs_start + slot * SZ_2K);
+	int i;
+
+	for (i = 0; i < SZ_2K; i += 0x80)
+		memcpy(dst + i, hyp_vecs_start, hyp_vecs_end - hyp_vecs_start);
+
+	flush_icache_range((uintptr_t)dst, (uintptr_t)dst + SZ_2K);
+}
+
+static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
+				      const char *hyp_vecs_start,
+				      const char *hyp_vecs_end)
+{
+	static int last_slot = -1;
+	static DEFINE_SPINLOCK(bp_lock);
+	int cpu, slot = -1;
+
+	spin_lock(&bp_lock);
+	for_each_possible_cpu(cpu) {
+		if (per_cpu(bp_hardening_data.fn, cpu) == fn) {
+			slot = per_cpu(bp_hardening_data.hyp_vectors_slot, cpu);
+			break;
+		}
+	}
+
+	if (slot == -1) {
+		last_slot++;
+		BUG_ON(((__bp_harden_hyp_vecs_end - __bp_harden_hyp_vecs_start)
+			/ SZ_2K) <= last_slot);
+		slot = last_slot;
+		__copy_hyp_vect_bpi(slot, hyp_vecs_start, hyp_vecs_end);
+	}
+
+	__this_cpu_write(bp_hardening_data.hyp_vectors_slot, slot);
+	__this_cpu_write(bp_hardening_data.fn, fn);
+	spin_unlock(&bp_lock);
+}
+#else
+#define __psci_hyp_bp_inval_start		NULL
+#define __psci_hyp_bp_inval_end			NULL
+#define __qcom_hyp_sanitize_link_stack_start	NULL
+#define __qcom_hyp_sanitize_link_stack_end	NULL
+
+static void __install_bp_hardening_cb(bp_hardening_cb_t fn,
+				      const char *hyp_vecs_start,
+				      const char *hyp_vecs_end)
+{
+	__this_cpu_write(bp_hardening_data.fn, fn);
+}
+#endif	/* CONFIG_KVM */
+
+static void  install_bp_hardening_cb(const struct arm64_cpu_capabilities *entry,
+				     bp_hardening_cb_t fn,
+				     const char *hyp_vecs_start,
+				     const char *hyp_vecs_end)
+{
+	u64 pfr0;
+
+	if (!entry->matches(entry, SCOPE_LOCAL_CPU))
+		return;
+
+	pfr0 = read_cpuid(ID_AA64PFR0_EL1);
+	if (cpuid_feature_extract_unsigned_field(pfr0, ID_AA64PFR0_CSV2_SHIFT))
+		return;
+
+	__install_bp_hardening_cb(fn, hyp_vecs_start, hyp_vecs_end);
+}
+
+#include <linux/psci.h>
+
+static int enable_psci_bp_hardening(void *data)
+{
+	const struct arm64_cpu_capabilities *entry = data;
+
+	if (psci_ops.get_version)
+		install_bp_hardening_cb(entry,
+				       (bp_hardening_cb_t)psci_ops.get_version,
+				       __psci_hyp_bp_inval_start,
+				       __psci_hyp_bp_inval_end);
+
+	return 0;
+}
+
+static void qcom_link_stack_sanitization(void)
+{
+	u64 tmp;
+
+	asm volatile("mov	%0, x30		\n"
+		     ".rept	16		\n"
+		     "bl	. + 4		\n"
+		     ".endr			\n"
+		     "mov	x30, %0		\n"
+		     : "=&r" (tmp));
+}
+
+static int qcom_enable_link_stack_sanitization(void *data)
+{
+	const struct arm64_cpu_capabilities *entry = data;
+
+	install_bp_hardening_cb(entry, qcom_link_stack_sanitization,
+				__qcom_hyp_sanitize_link_stack_start,
+				__qcom_hyp_sanitize_link_stack_end);
+
+	return 0;
+}
+#endif	/* CONFIG_HARDEN_BRANCH_PREDICTOR */
+
 #define MIDR_RANGE(model, min, max) \
 	.def_scope = SCOPE_LOCAL_CPU, \
 	.matches = is_affected_midr_range, \
@@ -169,6 +304,13 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 			   MIDR_CPU_VAR_REV(0, 0),
 			   MIDR_CPU_VAR_REV(0, 0)),
 	},
+	{
+		.desc = "Qualcomm Technologies Kryo erratum 1003",
+		.capability = ARM64_WORKAROUND_QCOM_FALKOR_E1003,
+		.def_scope = SCOPE_LOCAL_CPU,
+		.midr_model = MIDR_QCOM_KRYO,
+		.matches = is_kryo_midr,
+	},
 #endif
 #ifdef CONFIG_QCOM_FALKOR_ERRATUM_1009
 	{
@@ -187,6 +329,47 @@ const struct arm64_cpu_capabilities arm64_errata[] = {
 		MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
 	},
 #endif
+#ifdef CONFIG_HARDEN_BRANCH_PREDICTOR
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A57),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A72),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A73),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CORTEX_A75),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
+		.enable = qcom_enable_link_stack_sanitization,
+	},
+	{
+		.capability = ARM64_HARDEN_BP_POST_GUEST_EXIT,
+		MIDR_ALL_VERSIONS(MIDR_QCOM_FALKOR_V1),
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_BRCM_VULCAN),
+		.enable = enable_psci_bp_hardening,
+	},
+	{
+		.capability = ARM64_HARDEN_BRANCH_PREDICTOR,
+		MIDR_ALL_VERSIONS(MIDR_CAVIUM_THUNDERX2),
+		.enable = enable_psci_bp_hardening,
+	},
+#endif
 	{
 	}
 };
@@ -200,15 +383,18 @@ void verify_local_cpu_errata_workarounds(void)
 {
 	const struct arm64_cpu_capabilities *caps = arm64_errata;
 
-	for (; caps->matches; caps++)
-		if (!cpus_have_cap(caps->capability) &&
-			caps->matches(caps, SCOPE_LOCAL_CPU)) {
+	for (; caps->matches; caps++) {
+		if (cpus_have_cap(caps->capability)) {
+			if (caps->enable)
+				caps->enable((void *)caps);
+		} else if (caps->matches(caps, SCOPE_LOCAL_CPU)) {
 			pr_crit("CPU%d: Requires work around for %s, not detected"
 					" at boot time\n",
 				smp_processor_id(),
 				caps->desc ? : "an erratum");
 			cpu_die_early();
 		}
+	}
 }
 
 void update_cpu_errata_workarounds(void)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index a73a592..0fb6a31 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -123,6 +123,7 @@ cpufeature_pan_not_uao(const struct arm64_cpu_capabilities *entry, int __unused)
  * sync with the documentation of the CPU feature register ABI.
  */
 static const struct arm64_ftr_bits ftr_id_aa64isar0[] = {
+	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_FHM_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_DP_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SM4_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64ISAR0_SM3_SHIFT, 4, 0),
@@ -145,8 +146,11 @@ static const struct arm64_ftr_bits ftr_id_aa64isar1[] = {
 };
 
 static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV3_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_NONSTRICT, FTR_LOWER_SAFE, ID_AA64PFR0_CSV2_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_SVE),
 				   FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_SVE_SHIFT, 4, 0),
+	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_RAS_SHIFT, 4, 0),
 	ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_GIC_SHIFT, 4, 0),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_ASIMD_SHIFT, 4, ID_AA64PFR0_ASIMD_NI),
 	S_ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR0_FP_SHIFT, 4, ID_AA64PFR0_FP_NI),
@@ -846,6 +850,67 @@ static bool has_no_fpsimd(const struct arm64_cpu_capabilities *entry, int __unus
 					ID_AA64PFR0_FP_SHIFT) < 0;
 }
 
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+static int __kpti_forced; /* 0: not forced, >0: forced on, <0: forced off */
+
+static bool unmap_kernel_at_el0(const struct arm64_cpu_capabilities *entry,
+				int __unused)
+{
+	u64 pfr0 = read_sanitised_ftr_reg(SYS_ID_AA64PFR0_EL1);
+
+	/* Forced on command line? */
+	if (__kpti_forced) {
+		pr_info_once("kernel page table isolation forced %s by command line option\n",
+			     __kpti_forced > 0 ? "ON" : "OFF");
+		return __kpti_forced > 0;
+	}
+
+	/* Useful for KASLR robustness */
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE))
+		return true;
+
+	/* Don't force KPTI for CPUs that are not vulnerable */
+	switch (read_cpuid_id() & MIDR_CPU_MODEL_MASK) {
+	case MIDR_CAVIUM_THUNDERX2:
+	case MIDR_BRCM_VULCAN:
+		return false;
+	}
+
+	/* Defer to CPU feature registers */
+	return !cpuid_feature_extract_unsigned_field(pfr0,
+						     ID_AA64PFR0_CSV3_SHIFT);
+}
+
+static int __init parse_kpti(char *str)
+{
+	bool enabled;
+	int ret = strtobool(str, &enabled);
+
+	if (ret)
+		return ret;
+
+	__kpti_forced = enabled ? 1 : -1;
+	return 0;
+}
+__setup("kpti=", parse_kpti);
+#endif	/* CONFIG_UNMAP_KERNEL_AT_EL0 */
+
+static int cpu_copy_el2regs(void *__unused)
+{
+	/*
+	 * Copy register values that aren't redirected by hardware.
+	 *
+	 * Before code patching, we only set tpidr_el1, all CPUs need to copy
+	 * this value to tpidr_el2 before we patch the code. Once we've done
+	 * that, freshly-onlined CPUs will set tpidr_el2, so we don't need to
+	 * do anything here.
+	 */
+	if (!alternatives_applied)
+		write_sysreg(read_sysreg(tpidr_el1), tpidr_el2);
+
+	return 0;
+}
+
 static const struct arm64_cpu_capabilities arm64_features[] = {
 	{
 		.desc = "GIC system register CPU interface",
@@ -915,6 +980,7 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.capability = ARM64_HAS_VIRT_HOST_EXTN,
 		.def_scope = SCOPE_SYSTEM,
 		.matches = runs_at_el2,
+		.enable = cpu_copy_el2regs,
 	},
 	{
 		.desc = "32-bit EL0 Support",
@@ -932,6 +998,14 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.def_scope = SCOPE_SYSTEM,
 		.matches = hyp_offset_low,
 	},
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	{
+		.desc = "Kernel page table isolation (KPTI)",
+		.capability = ARM64_UNMAP_KERNEL_AT_EL0,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = unmap_kernel_at_el0,
+	},
+#endif
 	{
 		/* FP/SIMD is not implemented */
 		.capability = ARM64_HAS_NO_FPSIMD,
@@ -963,6 +1037,19 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
 		.enable = sve_kernel_enable,
 	},
 #endif /* CONFIG_ARM64_SVE */
+#ifdef CONFIG_ARM64_RAS_EXTN
+	{
+		.desc = "RAS Extension Support",
+		.capability = ARM64_HAS_RAS_EXTN,
+		.def_scope = SCOPE_SYSTEM,
+		.matches = has_cpuid_feature,
+		.sys_reg = SYS_ID_AA64PFR0_EL1,
+		.sign = FTR_UNSIGNED,
+		.field_pos = ID_AA64PFR0_RAS_SHIFT,
+		.min_field_value = ID_AA64PFR0_RAS_V1,
+		.enable = cpu_clear_disr,
+	},
+#endif /* CONFIG_ARM64_RAS_EXTN */
 	{},
 };
 
@@ -992,6 +1079,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
 	HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM3_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SM3),
 	HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_SM4_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_SM4),
 	HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_DP_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_ASIMDDP),
+	HWCAP_CAP(SYS_ID_AA64ISAR0_EL1, ID_AA64ISAR0_FHM_SHIFT, FTR_UNSIGNED, 1, CAP_HWCAP, HWCAP_ASIMDFHM),
 	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, HWCAP_FP),
 	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_FP_SHIFT, FTR_SIGNED, 1, CAP_HWCAP, HWCAP_FPHP),
 	HWCAP_CAP(SYS_ID_AA64PFR0_EL1, ID_AA64PFR0_ASIMD_SHIFT, FTR_SIGNED, 0, CAP_HWCAP, HWCAP_ASIMD),
@@ -1071,6 +1159,25 @@ static void __init setup_elf_hwcaps(const struct arm64_cpu_capabilities *hwcaps)
 			cap_set_elf_hwcap(hwcaps);
 }
 
+/*
+ * Check if the current CPU has a given feature capability.
+ * Should be called from non-preemptible context.
+ */
+static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array,
+			       unsigned int cap)
+{
+	const struct arm64_cpu_capabilities *caps;
+
+	if (WARN_ON(preemptible()))
+		return false;
+
+	for (caps = cap_array; caps->matches; caps++)
+		if (caps->capability == cap &&
+		    caps->matches(caps, SCOPE_LOCAL_CPU))
+			return true;
+	return false;
+}
+
 void update_cpu_capabilities(const struct arm64_cpu_capabilities *caps,
 			    const char *info)
 {
@@ -1106,7 +1213,7 @@ void __init enable_cpu_capabilities(const struct arm64_cpu_capabilities *caps)
 			 * uses an IPI, giving us a PSTATE that disappears when
 			 * we return.
 			 */
-			stop_machine(caps->enable, NULL, cpu_online_mask);
+			stop_machine(caps->enable, (void *)caps, cpu_online_mask);
 		}
 	}
 }
@@ -1134,8 +1241,9 @@ verify_local_elf_hwcaps(const struct arm64_cpu_capabilities *caps)
 }
 
 static void
-verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
+verify_local_cpu_features(const struct arm64_cpu_capabilities *caps_list)
 {
+	const struct arm64_cpu_capabilities *caps = caps_list;
 	for (; caps->matches; caps++) {
 		if (!cpus_have_cap(caps->capability))
 			continue;
@@ -1143,13 +1251,13 @@ verify_local_cpu_features(const struct arm64_cpu_capabilities *caps)
 		 * If the new CPU misses an advertised feature, we cannot proceed
 		 * further, park the cpu.
 		 */
-		if (!caps->matches(caps, SCOPE_LOCAL_CPU)) {
+		if (!__this_cpu_has_cap(caps_list, caps->capability)) {
 			pr_crit("CPU%d: missing feature: %s\n",
 					smp_processor_id(), caps->desc);
 			cpu_die_early();
 		}
 		if (caps->enable)
-			caps->enable(NULL);
+			caps->enable((void *)caps);
 	}
 }
 
@@ -1189,6 +1297,9 @@ static void verify_local_cpu_capabilities(void)
 
 	if (system_supports_sve())
 		verify_sve_features();
+
+	if (system_uses_ttbr0_pan())
+		pr_info("Emulating Privileged Access Never (PAN) using TTBR0_EL1 switching\n");
 }
 
 void check_local_cpu_capabilities(void)
@@ -1225,25 +1336,6 @@ static void __init mark_const_caps_ready(void)
 	static_branch_enable(&arm64_const_caps_ready);
 }
 
-/*
- * Check if the current CPU has a given feature capability.
- * Should be called from non-preemptible context.
- */
-static bool __this_cpu_has_cap(const struct arm64_cpu_capabilities *cap_array,
-			       unsigned int cap)
-{
-	const struct arm64_cpu_capabilities *caps;
-
-	if (WARN_ON(preemptible()))
-		return false;
-
-	for (caps = cap_array; caps->desc; caps++)
-		if (caps->capability == cap && caps->matches)
-			return caps->matches(caps, SCOPE_LOCAL_CPU);
-
-	return false;
-}
-
 extern const struct arm64_cpu_capabilities arm64_errata[];
 
 bool this_cpu_has_cap(unsigned int cap)
@@ -1387,3 +1479,11 @@ static int __init enable_mrs_emulation(void)
 }
 
 core_initcall(enable_mrs_emulation);
+
+int cpu_clear_disr(void *__unused)
+{
+	/* Firmware may have left a deferred SError in this register. */
+	write_sysreg_s(0, SYS_DISR_EL1);
+
+	return 0;
+}
diff --git a/arch/arm64/kernel/cpuidle.c b/arch/arm64/kernel/cpuidle.c
index fd69108..f2d1381 100644
--- a/arch/arm64/kernel/cpuidle.c
+++ b/arch/arm64/kernel/cpuidle.c
@@ -47,6 +47,8 @@ int arm_cpuidle_suspend(int index)
 
 #include <acpi/processor.h>
 
+#define ARM64_LPI_IS_RETENTION_STATE(arch_flags) (!(arch_flags))
+
 int acpi_processor_ffh_lpi_probe(unsigned int cpu)
 {
 	return arm_cpuidle_init(cpu);
@@ -54,6 +56,10 @@ int acpi_processor_ffh_lpi_probe(unsigned int cpu)
 
 int acpi_processor_ffh_lpi_enter(struct acpi_lpi_state *lpi)
 {
-	return CPU_PM_CPU_IDLE_ENTER(arm_cpuidle_suspend, lpi->index);
+	if (ARM64_LPI_IS_RETENTION_STATE(lpi->arch_flags))
+		return CPU_PM_CPU_IDLE_ENTER_RETENTION(arm_cpuidle_suspend,
+						lpi->index);
+	else
+		return CPU_PM_CPU_IDLE_ENTER(arm_cpuidle_suspend, lpi->index);
 }
 #endif
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 1e25545..7f94623 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -76,6 +76,7 @@ static const char *const hwcap_str[] = {
 	"asimddp",
 	"sha512",
 	"sve",
+	"asimdfhm",
 	NULL
 };
 
diff --git a/arch/arm64/kernel/efi.c b/arch/arm64/kernel/efi.c
index 82cd075..f85ac58 100644
--- a/arch/arm64/kernel/efi.c
+++ b/arch/arm64/kernel/efi.c
@@ -48,7 +48,9 @@ static __init pteval_t create_mapping_protection(efi_memory_desc_t *md)
 		return pgprot_val(PAGE_KERNEL_ROX);
 
 	/* RW- */
-	if (attr & EFI_MEMORY_XP || type != EFI_RUNTIME_SERVICES_CODE)
+	if (((attr & (EFI_MEMORY_RP | EFI_MEMORY_WP | EFI_MEMORY_XP)) ==
+	     EFI_MEMORY_XP) ||
+	    type != EFI_RUNTIME_SERVICES_CODE)
 		return pgprot_val(PAGE_KERNEL);
 
 	/* RWX */
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 6d14b8f..b34e717 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -28,6 +28,8 @@
 #include <asm/errno.h>
 #include <asm/esr.h>
 #include <asm/irq.h>
+#include <asm/memory.h>
+#include <asm/mmu.h>
 #include <asm/processor.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
@@ -69,8 +71,21 @@
 #define BAD_FIQ		2
 #define BAD_ERROR	3
 
-	.macro kernel_ventry	label
+	.macro kernel_ventry, el, label, regsize = 64
 	.align 7
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+alternative_if ARM64_UNMAP_KERNEL_AT_EL0
+	.if	\el == 0
+	.if	\regsize == 64
+	mrs	x30, tpidrro_el0
+	msr	tpidrro_el0, xzr
+	.else
+	mov	x30, xzr
+	.endif
+	.endif
+alternative_else_nop_endif
+#endif
+
 	sub	sp, sp, #S_FRAME_SIZE
 #ifdef CONFIG_VMAP_STACK
 	/*
@@ -82,7 +97,7 @@
 	tbnz	x0, #THREAD_SHIFT, 0f
 	sub	x0, sp, x0			// x0'' = sp' - x0' = (sp + x0) - sp = x0
 	sub	sp, sp, x0			// sp'' = sp' - x0 = (sp + x0) - x0 = sp
-	b	\label
+	b	el\()\el\()_\label
 
 0:
 	/*
@@ -114,7 +129,12 @@
 	sub	sp, sp, x0
 	mrs	x0, tpidrro_el0
 #endif
-	b	\label
+	b	el\()\el\()_\label
+	.endm
+
+	.macro tramp_alias, dst, sym
+	mov_q	\dst, TRAMP_VALIAS
+	add	\dst, \dst, #(\sym - .entry.tramp.text)
 	.endm
 
 	.macro	kernel_entry, el, regsize = 64
@@ -185,7 +205,7 @@
 
 	.if	\el != 0
 	mrs	x21, ttbr0_el1
-	tst	x21, #0xffff << 48		// Check for the reserved ASID
+	tst	x21, #TTBR_ASID_MASK		// Check for the reserved ASID
 	orr	x23, x23, #PSR_PAN_BIT		// Set the emulated PAN in the saved SPSR
 	b.eq	1f				// TTBR0 access already disabled
 	and	x23, x23, #~PSR_PAN_BIT		// Clear the emulated PAN in the saved SPSR
@@ -248,7 +268,7 @@
 	tbnz	x22, #22, 1f			// Skip re-enabling TTBR0 access if the PSR_PAN_BIT is set
 	.endif
 
-	__uaccess_ttbr0_enable x0
+	__uaccess_ttbr0_enable x0, x1
 
 	.if	\el == 0
 	/*
@@ -257,7 +277,7 @@
 	 * Cavium erratum 27456 (broadcast TLBI instructions may cause I-cache
 	 * corruption).
 	 */
-	post_ttbr0_update_workaround
+	bl	post_ttbr_update_workaround
 	.endif
 1:
 	.if	\el != 0
@@ -269,18 +289,20 @@
 	.if	\el == 0
 	ldr	x23, [sp, #S_SP]		// load return stack pointer
 	msr	sp_el0, x23
+	tst	x22, #PSR_MODE32_BIT		// native task?
+	b.eq	3f
+
 #ifdef CONFIG_ARM64_ERRATUM_845719
 alternative_if ARM64_WORKAROUND_845719
-	tbz	x22, #4, 1f
 #ifdef CONFIG_PID_IN_CONTEXTIDR
 	mrs	x29, contextidr_el1
 	msr	contextidr_el1, x29
 #else
 	msr contextidr_el1, xzr
 #endif
-1:
 alternative_else_nop_endif
 #endif
+3:
 	.endif
 
 	msr	elr_el1, x21			// set up the return data
@@ -302,7 +324,21 @@
 	ldp	x28, x29, [sp, #16 * 14]
 	ldr	lr, [sp, #S_LR]
 	add	sp, sp, #S_FRAME_SIZE		// restore sp
-	eret					// return to kernel
+
+	.if	\el == 0
+alternative_insn eret, nop, ARM64_UNMAP_KERNEL_AT_EL0
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	bne	4f
+	msr	far_el1, x30
+	tramp_alias	x30, tramp_exit_native
+	br	x30
+4:
+	tramp_alias	x30, tramp_exit_compat
+	br	x30
+#endif
+	.else
+	eret
+	.endif
 	.endm
 
 	.macro	irq_stack_entry
@@ -367,31 +403,31 @@
 
 	.align	11
 ENTRY(vectors)
-	kernel_ventry	el1_sync_invalid		// Synchronous EL1t
-	kernel_ventry	el1_irq_invalid			// IRQ EL1t
-	kernel_ventry	el1_fiq_invalid			// FIQ EL1t
-	kernel_ventry	el1_error_invalid		// Error EL1t
+	kernel_ventry	1, sync_invalid			// Synchronous EL1t
+	kernel_ventry	1, irq_invalid			// IRQ EL1t
+	kernel_ventry	1, fiq_invalid			// FIQ EL1t
+	kernel_ventry	1, error_invalid		// Error EL1t
 
-	kernel_ventry	el1_sync			// Synchronous EL1h
-	kernel_ventry	el1_irq				// IRQ EL1h
-	kernel_ventry	el1_fiq_invalid			// FIQ EL1h
-	kernel_ventry	el1_error			// Error EL1h
+	kernel_ventry	1, sync				// Synchronous EL1h
+	kernel_ventry	1, irq				// IRQ EL1h
+	kernel_ventry	1, fiq_invalid			// FIQ EL1h
+	kernel_ventry	1, error			// Error EL1h
 
-	kernel_ventry	el0_sync			// Synchronous 64-bit EL0
-	kernel_ventry	el0_irq				// IRQ 64-bit EL0
-	kernel_ventry	el0_fiq_invalid			// FIQ 64-bit EL0
-	kernel_ventry	el0_error			// Error 64-bit EL0
+	kernel_ventry	0, sync				// Synchronous 64-bit EL0
+	kernel_ventry	0, irq				// IRQ 64-bit EL0
+	kernel_ventry	0, fiq_invalid			// FIQ 64-bit EL0
+	kernel_ventry	0, error			// Error 64-bit EL0
 
 #ifdef CONFIG_COMPAT
-	kernel_ventry	el0_sync_compat			// Synchronous 32-bit EL0
-	kernel_ventry	el0_irq_compat			// IRQ 32-bit EL0
-	kernel_ventry	el0_fiq_invalid_compat		// FIQ 32-bit EL0
-	kernel_ventry	el0_error_compat		// Error 32-bit EL0
+	kernel_ventry	0, sync_compat, 32		// Synchronous 32-bit EL0
+	kernel_ventry	0, irq_compat, 32		// IRQ 32-bit EL0
+	kernel_ventry	0, fiq_invalid_compat, 32	// FIQ 32-bit EL0
+	kernel_ventry	0, error_compat, 32		// Error 32-bit EL0
 #else
-	kernel_ventry	el0_sync_invalid		// Synchronous 32-bit EL0
-	kernel_ventry	el0_irq_invalid			// IRQ 32-bit EL0
-	kernel_ventry	el0_fiq_invalid			// FIQ 32-bit EL0
-	kernel_ventry	el0_error_invalid		// Error 32-bit EL0
+	kernel_ventry	0, sync_invalid, 32		// Synchronous 32-bit EL0
+	kernel_ventry	0, irq_invalid, 32		// IRQ 32-bit EL0
+	kernel_ventry	0, fiq_invalid, 32		// FIQ 32-bit EL0
+	kernel_ventry	0, error_invalid, 32		// Error 32-bit EL0
 #endif
 END(vectors)
 
@@ -685,12 +721,15 @@
 	 * Instruction abort handling
 	 */
 	mrs	x26, far_el1
-	enable_daif
+	enable_da_f
+#ifdef CONFIG_TRACE_IRQFLAGS
+	bl	trace_hardirqs_off
+#endif
 	ct_user_exit
 	mov	x0, x26
 	mov	x1, x25
 	mov	x2, sp
-	bl	do_mem_abort
+	bl	do_el0_ia_bp_hardening
 	b	ret_to_user
 el0_fpsimd_acc:
 	/*
@@ -943,6 +982,124 @@
 
 	.popsection				// .entry.text
 
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+/*
+ * Exception vectors trampoline.
+ */
+	.pushsection ".entry.tramp.text", "ax"
+
+	.macro tramp_map_kernel, tmp
+	mrs	\tmp, ttbr1_el1
+	add	\tmp, \tmp, #(PAGE_SIZE + RESERVED_TTBR0_SIZE)
+	bic	\tmp, \tmp, #USER_ASID_FLAG
+	msr	ttbr1_el1, \tmp
+#ifdef CONFIG_QCOM_FALKOR_ERRATUM_1003
+alternative_if ARM64_WORKAROUND_QCOM_FALKOR_E1003
+	/* ASID already in \tmp[63:48] */
+	movk	\tmp, #:abs_g2_nc:(TRAMP_VALIAS >> 12)
+	movk	\tmp, #:abs_g1_nc:(TRAMP_VALIAS >> 12)
+	/* 2MB boundary containing the vectors, so we nobble the walk cache */
+	movk	\tmp, #:abs_g0_nc:((TRAMP_VALIAS & ~(SZ_2M - 1)) >> 12)
+	isb
+	tlbi	vae1, \tmp
+	dsb	nsh
+alternative_else_nop_endif
+#endif /* CONFIG_QCOM_FALKOR_ERRATUM_1003 */
+	.endm
+
+	.macro tramp_unmap_kernel, tmp
+	mrs	\tmp, ttbr1_el1
+	sub	\tmp, \tmp, #(PAGE_SIZE + RESERVED_TTBR0_SIZE)
+	orr	\tmp, \tmp, #USER_ASID_FLAG
+	msr	ttbr1_el1, \tmp
+	/*
+	 * We avoid running the post_ttbr_update_workaround here because the
+	 * user and kernel ASIDs don't have conflicting mappings, so any
+	 * "blessing" as described in:
+	 *
+	 *   http://lkml.kernel.org/r/56BB848A.6060603@caviumnetworks.com
+	 *
+	 * will not hurt correctness. Whilst this may partially defeat the
+	 * point of using split ASIDs in the first place, it avoids
+	 * the hit of invalidating the entire I-cache on every return to
+	 * userspace.
+	 */
+	.endm
+
+	.macro tramp_ventry, regsize = 64
+	.align	7
+1:
+	.if	\regsize == 64
+	msr	tpidrro_el0, x30	// Restored in kernel_ventry
+	.endif
+	/*
+	 * Defend against branch aliasing attacks by pushing a dummy
+	 * entry onto the return stack and using a RET instruction to
+	 * enter the full-fat kernel vectors.
+	 */
+	bl	2f
+	b	.
+2:
+	tramp_map_kernel	x30
+#ifdef CONFIG_RANDOMIZE_BASE
+	adr	x30, tramp_vectors + PAGE_SIZE
+alternative_insn isb, nop, ARM64_WORKAROUND_QCOM_FALKOR_E1003
+	ldr	x30, [x30]
+#else
+	ldr	x30, =vectors
+#endif
+	prfm	plil1strm, [x30, #(1b - tramp_vectors)]
+	msr	vbar_el1, x30
+	add	x30, x30, #(1b - tramp_vectors)
+	isb
+	ret
+	.endm
+
+	.macro tramp_exit, regsize = 64
+	adr	x30, tramp_vectors
+	msr	vbar_el1, x30
+	tramp_unmap_kernel	x30
+	.if	\regsize == 64
+	mrs	x30, far_el1
+	.endif
+	eret
+	.endm
+
+	.align	11
+ENTRY(tramp_vectors)
+	.space	0x400
+
+	tramp_ventry
+	tramp_ventry
+	tramp_ventry
+	tramp_ventry
+
+	tramp_ventry	32
+	tramp_ventry	32
+	tramp_ventry	32
+	tramp_ventry	32
+END(tramp_vectors)
+
+ENTRY(tramp_exit_native)
+	tramp_exit
+END(tramp_exit_native)
+
+ENTRY(tramp_exit_compat)
+	tramp_exit	32
+END(tramp_exit_compat)
+
+	.ltorg
+	.popsection				// .entry.tramp.text
+#ifdef CONFIG_RANDOMIZE_BASE
+	.pushsection ".rodata", "a"
+	.align PAGE_SHIFT
+	.globl	__entry_tramp_data_start
+__entry_tramp_data_start:
+	.quad	vectors
+	.popsection				// .rodata
+#endif /* CONFIG_RANDOMIZE_BASE */
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+
 /*
  * Special system call wrappers.
  */
@@ -996,3 +1153,180 @@
 	b	ret_to_user
 ENDPROC(ret_from_fork)
 NOKPROBE(ret_from_fork)
+
+#ifdef CONFIG_ARM_SDE_INTERFACE
+
+#include <asm/sdei.h>
+#include <uapi/linux/arm_sdei.h>
+
+.macro sdei_handler_exit exit_mode
+	/* On success, this call never returns... */
+	cmp	\exit_mode, #SDEI_EXIT_SMC
+	b.ne	99f
+	smc	#0
+	b	.
+99:	hvc	#0
+	b	.
+.endm
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+/*
+ * The regular SDEI entry point may have been unmapped along with the rest of
+ * the kernel. This trampoline restores the kernel mapping to make the x1 memory
+ * argument accessible.
+ *
+ * This clobbers x4, __sdei_handler() will restore this from firmware's
+ * copy.
+ */
+.ltorg
+.pushsection ".entry.tramp.text", "ax"
+ENTRY(__sdei_asm_entry_trampoline)
+	mrs	x4, ttbr1_el1
+	tbz	x4, #USER_ASID_BIT, 1f
+
+	tramp_map_kernel tmp=x4
+	isb
+	mov	x4, xzr
+
+	/*
+	 * Use reg->interrupted_regs.addr_limit to remember whether to unmap
+	 * the kernel on exit.
+	 */
+1:	str	x4, [x1, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
+
+#ifdef CONFIG_RANDOMIZE_BASE
+	adr	x4, tramp_vectors + PAGE_SIZE
+	add	x4, x4, #:lo12:__sdei_asm_trampoline_next_handler
+	ldr	x4, [x4]
+#else
+	ldr	x4, =__sdei_asm_handler
+#endif
+	br	x4
+ENDPROC(__sdei_asm_entry_trampoline)
+NOKPROBE(__sdei_asm_entry_trampoline)
+
+/*
+ * Make the exit call and restore the original ttbr1_el1
+ *
+ * x0 & x1: setup for the exit API call
+ * x2: exit_mode
+ * x4: struct sdei_registered_event argument from registration time.
+ */
+ENTRY(__sdei_asm_exit_trampoline)
+	ldr	x4, [x4, #(SDEI_EVENT_INTREGS + S_ORIG_ADDR_LIMIT)]
+	cbnz	x4, 1f
+
+	tramp_unmap_kernel	tmp=x4
+
+1:	sdei_handler_exit exit_mode=x2
+ENDPROC(__sdei_asm_exit_trampoline)
+NOKPROBE(__sdei_asm_exit_trampoline)
+	.ltorg
+.popsection		// .entry.tramp.text
+#ifdef CONFIG_RANDOMIZE_BASE
+.pushsection ".rodata", "a"
+__sdei_asm_trampoline_next_handler:
+	.quad	__sdei_asm_handler
+.popsection		// .rodata
+#endif /* CONFIG_RANDOMIZE_BASE */
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+
+/*
+ * Software Delegated Exception entry point.
+ *
+ * x0: Event number
+ * x1: struct sdei_registered_event argument from registration time.
+ * x2: interrupted PC
+ * x3: interrupted PSTATE
+ * x4: maybe clobbered by the trampoline
+ *
+ * Firmware has preserved x0->x17 for us, we must save/restore the rest to
+ * follow SMC-CC. We save (or retrieve) all the registers as the handler may
+ * want them.
+ */
+ENTRY(__sdei_asm_handler)
+	stp     x2, x3, [x1, #SDEI_EVENT_INTREGS + S_PC]
+	stp     x4, x5, [x1, #SDEI_EVENT_INTREGS + 16 * 2]
+	stp     x6, x7, [x1, #SDEI_EVENT_INTREGS + 16 * 3]
+	stp     x8, x9, [x1, #SDEI_EVENT_INTREGS + 16 * 4]
+	stp     x10, x11, [x1, #SDEI_EVENT_INTREGS + 16 * 5]
+	stp     x12, x13, [x1, #SDEI_EVENT_INTREGS + 16 * 6]
+	stp     x14, x15, [x1, #SDEI_EVENT_INTREGS + 16 * 7]
+	stp     x16, x17, [x1, #SDEI_EVENT_INTREGS + 16 * 8]
+	stp     x18, x19, [x1, #SDEI_EVENT_INTREGS + 16 * 9]
+	stp     x20, x21, [x1, #SDEI_EVENT_INTREGS + 16 * 10]
+	stp     x22, x23, [x1, #SDEI_EVENT_INTREGS + 16 * 11]
+	stp     x24, x25, [x1, #SDEI_EVENT_INTREGS + 16 * 12]
+	stp     x26, x27, [x1, #SDEI_EVENT_INTREGS + 16 * 13]
+	stp     x28, x29, [x1, #SDEI_EVENT_INTREGS + 16 * 14]
+	mov	x4, sp
+	stp     lr, x4, [x1, #SDEI_EVENT_INTREGS + S_LR]
+
+	mov	x19, x1
+
+#ifdef CONFIG_VMAP_STACK
+	/*
+	 * entry.S may have been using sp as a scratch register, find whether
+	 * this is a normal or critical event and switch to the appropriate
+	 * stack for this CPU.
+	 */
+	ldrb	w4, [x19, #SDEI_EVENT_PRIORITY]
+	cbnz	w4, 1f
+	ldr_this_cpu dst=x5, sym=sdei_stack_normal_ptr, tmp=x6
+	b	2f
+1:	ldr_this_cpu dst=x5, sym=sdei_stack_critical_ptr, tmp=x6
+2:	mov	x6, #SDEI_STACK_SIZE
+	add	x5, x5, x6
+	mov	sp, x5
+#endif
+
+	/*
+	 * We may have interrupted userspace, or a guest, or exit-from or
+	 * return-to either of these. We can't trust sp_el0, restore it.
+	 */
+	mrs	x28, sp_el0
+	ldr_this_cpu	dst=x0, sym=__entry_task, tmp=x1
+	msr	sp_el0, x0
+
+	/* If we interrupted the kernel point to the previous stack/frame. */
+	and     x0, x3, #0xc
+	mrs     x1, CurrentEL
+	cmp     x0, x1
+	csel	x29, x29, xzr, eq	// fp, or zero
+	csel	x4, x2, xzr, eq		// elr, or zero
+
+	stp	x29, x4, [sp, #-16]!
+	mov	x29, sp
+
+	add	x0, x19, #SDEI_EVENT_INTREGS
+	mov	x1, x19
+	bl	__sdei_handler
+
+	msr	sp_el0, x28
+	/* restore regs >x17 that we clobbered */
+	mov	x4, x19         // keep x4 for __sdei_asm_exit_trampoline
+	ldp	x28, x29, [x4, #SDEI_EVENT_INTREGS + 16 * 14]
+	ldp	x18, x19, [x4, #SDEI_EVENT_INTREGS + 16 * 9]
+	ldp	lr, x1, [x4, #SDEI_EVENT_INTREGS + S_LR]
+	mov	sp, x1
+
+	mov	x1, x0			// address to complete_and_resume
+	/* x0 = (x0 <= 1) ? EVENT_COMPLETE:EVENT_COMPLETE_AND_RESUME */
+	cmp	x0, #1
+	mov_q	x2, SDEI_1_0_FN_SDEI_EVENT_COMPLETE
+	mov_q	x3, SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME
+	csel	x0, x2, x3, ls
+
+	ldr_l	x2, sdei_exit_mode
+
+alternative_if_not ARM64_UNMAP_KERNEL_AT_EL0
+	sdei_handler_exit exit_mode=x2
+alternative_else_nop_endif
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	tramp_alias	dst=x5, sym=__sdei_asm_exit_trampoline
+	br	x5
+#endif
+ENDPROC(__sdei_asm_handler)
+NOKPROBE(__sdei_asm_handler)
+#endif /* CONFIG_ARM_SDE_INTERFACE */
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
index ad0edf3..e7226c4 100644
--- a/arch/arm64/kernel/fpsimd.c
+++ b/arch/arm64/kernel/fpsimd.c
@@ -1036,14 +1036,14 @@ void fpsimd_restore_current_state(void)
  * flag that indicates that the FPSIMD register contents are the most recent
  * FPSIMD state of 'current'
  */
-void fpsimd_update_current_state(struct fpsimd_state *state)
+void fpsimd_update_current_state(struct user_fpsimd_state const *state)
 {
 	if (!system_supports_fpsimd())
 		return;
 
 	local_bh_disable();
 
-	current->thread.fpsimd_state.user_fpsimd = state->user_fpsimd;
+	current->thread.fpsimd_state.user_fpsimd = *state;
 	if (system_supports_sve() && test_thread_flag(TIF_SVE))
 		fpsimd_to_sve(current);
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index e3cb9fb..ba3ab047 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -148,6 +148,26 @@
 ENDPROC(preserve_boot_args)
 
 /*
+ * Macro to arrange a physical address in a page table entry, taking care of
+ * 52-bit addresses.
+ *
+ * Preserves:	phys
+ * Returns:	pte
+ */
+	.macro	phys_to_pte, phys, pte
+#ifdef CONFIG_ARM64_PA_BITS_52
+	/*
+	 * We assume \phys is 64K aligned and this is guaranteed by only
+	 * supporting this configuration with 64K pages.
+	 */
+	orr	\pte, \phys, \phys, lsr #36
+	and	\pte, \pte, #PTE_ADDR_MASK
+#else
+	mov	\pte, \phys
+#endif
+	.endm
+
+/*
  * Macro to create a table entry to the next page.
  *
  *	tbl:	page table address
@@ -156,54 +176,124 @@
  *	ptrs:	#imm pointers per table page
  *
  * Preserves:	virt
- * Corrupts:	tmp1, tmp2
+ * Corrupts:	ptrs, tmp1, tmp2
  * Returns:	tbl -> next level table page address
  */
 	.macro	create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
-	lsr	\tmp1, \virt, #\shift
-	and	\tmp1, \tmp1, #\ptrs - 1	// table index
-	add	\tmp2, \tbl, #PAGE_SIZE
+	add	\tmp1, \tbl, #PAGE_SIZE
+	phys_to_pte \tmp1, \tmp2
 	orr	\tmp2, \tmp2, #PMD_TYPE_TABLE	// address of next table and entry type
+	lsr	\tmp1, \virt, #\shift
+	sub	\ptrs, \ptrs, #1
+	and	\tmp1, \tmp1, \ptrs		// table index
 	str	\tmp2, [\tbl, \tmp1, lsl #3]
 	add	\tbl, \tbl, #PAGE_SIZE		// next level table page
 	.endm
 
 /*
- * Macro to populate the PGD (and possibily PUD) for the corresponding
- * block entry in the next level (tbl) for the given virtual address.
+ * Macro to populate page table entries, these entries can be pointers to the next level
+ * or last level entries pointing to physical memory.
  *
- * Preserves:	tbl, next, virt
- * Corrupts:	tmp1, tmp2
+ *	tbl:	page table address
+ *	rtbl:	pointer to page table or physical memory
+ *	index:	start index to write
+ *	eindex:	end index to write - [index, eindex] written to
+ *	flags:	flags for pagetable entry to or in
+ *	inc:	increment to rtbl between each entry
+ *	tmp1:	temporary variable
+ *
+ * Preserves:	tbl, eindex, flags, inc
+ * Corrupts:	index, tmp1
+ * Returns:	rtbl
  */
-	.macro	create_pgd_entry, tbl, virt, tmp1, tmp2
-	create_table_entry \tbl, \virt, PGDIR_SHIFT, PTRS_PER_PGD, \tmp1, \tmp2
-#if SWAPPER_PGTABLE_LEVELS > 3
-	create_table_entry \tbl, \virt, PUD_SHIFT, PTRS_PER_PUD, \tmp1, \tmp2
-#endif
-#if SWAPPER_PGTABLE_LEVELS > 2
-	create_table_entry \tbl, \virt, SWAPPER_TABLE_SHIFT, PTRS_PER_PTE, \tmp1, \tmp2
-#endif
+	.macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1
+.Lpe\@:	phys_to_pte \rtbl, \tmp1
+	orr	\tmp1, \tmp1, \flags	// tmp1 = table entry
+	str	\tmp1, [\tbl, \index, lsl #3]
+	add	\rtbl, \rtbl, \inc	// rtbl = pa next level
+	add	\index, \index, #1
+	cmp	\index, \eindex
+	b.ls	.Lpe\@
 	.endm
 
 /*
- * Macro to populate block entries in the page table for the start..end
- * virtual range (inclusive).
+ * Compute indices of table entries from virtual address range. If multiple entries
+ * were needed in the previous page table level then the next page table level is assumed
+ * to be composed of multiple pages. (This effectively scales the end index).
  *
- * Preserves:	tbl, flags
- * Corrupts:	phys, start, end, pstate
+ *	vstart:	virtual address of start of range
+ *	vend:	virtual address of end of range
+ *	shift:	shift used to transform virtual address into index
+ *	ptrs:	number of entries in page table
+ *	istart:	index in table corresponding to vstart
+ *	iend:	index in table corresponding to vend
+ *	count:	On entry: how many extra entries were required in previous level, scales
+ *			  our end index.
+ *		On exit: returns how many extra entries required for next page table level
+ *
+ * Preserves:	vstart, vend, shift, ptrs
+ * Returns:	istart, iend, count
  */
-	.macro	create_block_map, tbl, flags, phys, start, end
-	lsr	\phys, \phys, #SWAPPER_BLOCK_SHIFT
-	lsr	\start, \start, #SWAPPER_BLOCK_SHIFT
-	and	\start, \start, #PTRS_PER_PTE - 1	// table index
-	orr	\phys, \flags, \phys, lsl #SWAPPER_BLOCK_SHIFT	// table entry
-	lsr	\end, \end, #SWAPPER_BLOCK_SHIFT
-	and	\end, \end, #PTRS_PER_PTE - 1		// table end index
-9999:	str	\phys, [\tbl, \start, lsl #3]		// store the entry
-	add	\start, \start, #1			// next entry
-	add	\phys, \phys, #SWAPPER_BLOCK_SIZE		// next block
-	cmp	\start, \end
-	b.ls	9999b
+	.macro compute_indices, vstart, vend, shift, ptrs, istart, iend, count
+	lsr	\iend, \vend, \shift
+	mov	\istart, \ptrs
+	sub	\istart, \istart, #1
+	and	\iend, \iend, \istart	// iend = (vend >> shift) & (ptrs - 1)
+	mov	\istart, \ptrs
+	mul	\istart, \istart, \count
+	add	\iend, \iend, \istart	// iend += (count - 1) * ptrs
+					// our entries span multiple tables
+
+	lsr	\istart, \vstart, \shift
+	mov	\count, \ptrs
+	sub	\count, \count, #1
+	and	\istart, \istart, \count
+
+	sub	\count, \iend, \istart
+	.endm
+
+/*
+ * Map memory for specified virtual address range. Each level of page table needed supports
+ * multiple entries. If a level requires n entries the next page table level is assumed to be
+ * formed from n pages.
+ *
+ *	tbl:	location of page table
+ *	rtbl:	address to be used for first level page table entry (typically tbl + PAGE_SIZE)
+ *	vstart:	start address to map
+ *	vend:	end address to map - we map [vstart, vend]
+ *	flags:	flags to use to map last level entries
+ *	phys:	physical address corresponding to vstart - physical memory is contiguous
+ *	pgds:	the number of pgd entries
+ *
+ * Temporaries:	istart, iend, tmp, count, sv - these need to be different registers
+ * Preserves:	vstart, vend, flags
+ * Corrupts:	tbl, rtbl, istart, iend, tmp, count, sv
+ */
+	.macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, tmp, count, sv
+	add \rtbl, \tbl, #PAGE_SIZE
+	mov \sv, \rtbl
+	mov \count, #0
+	compute_indices \vstart, \vend, #PGDIR_SHIFT, \pgds, \istart, \iend, \count
+	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+	mov \tbl, \sv
+	mov \sv, \rtbl
+
+#if SWAPPER_PGTABLE_LEVELS > 3
+	compute_indices \vstart, \vend, #PUD_SHIFT, #PTRS_PER_PUD, \istart, \iend, \count
+	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+	mov \tbl, \sv
+	mov \sv, \rtbl
+#endif
+
+#if SWAPPER_PGTABLE_LEVELS > 2
+	compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #PTRS_PER_PMD, \istart, \iend, \count
+	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+	mov \tbl, \sv
+#endif
+
+	compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #PTRS_PER_PTE, \istart, \iend, \count
+	bic \count, \phys, #SWAPPER_BLOCK_SIZE - 1
+	populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp
 	.endm
 
 /*
@@ -221,14 +311,16 @@
 	 * dirty cache lines being evicted.
 	 */
 	adrp	x0, idmap_pg_dir
-	ldr	x1, =(IDMAP_DIR_SIZE + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
+	adrp	x1, swapper_pg_end
+	sub	x1, x1, x0
 	bl	__inval_dcache_area
 
 	/*
 	 * Clear the idmap and swapper page tables.
 	 */
 	adrp	x0, idmap_pg_dir
-	ldr	x1, =(IDMAP_DIR_SIZE + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
+	adrp	x1, swapper_pg_end
+	sub	x1, x1, x0
 1:	stp	xzr, xzr, [x0], #16
 	stp	xzr, xzr, [x0], #16
 	stp	xzr, xzr, [x0], #16
@@ -244,16 +336,34 @@
 	adrp	x0, idmap_pg_dir
 	adrp	x3, __idmap_text_start		// __pa(__idmap_text_start)
 
-#ifndef CONFIG_ARM64_VA_BITS_48
+	/*
+	 * VA_BITS may be too small to allow for an ID mapping to be created
+	 * that covers system RAM if that is located sufficiently high in the
+	 * physical address space. So for the ID map, use an extended virtual
+	 * range in that case, and configure an additional translation level
+	 * if needed.
+	 *
+	 * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
+	 * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
+	 * this number conveniently equals the number of leading zeroes in
+	 * the physical address of __idmap_text_end.
+	 */
+	adrp	x5, __idmap_text_end
+	clz	x5, x5
+	cmp	x5, TCR_T0SZ(VA_BITS)	// default T0SZ small enough?
+	b.ge	1f			// .. then skip VA range extension
+
+	adr_l	x6, idmap_t0sz
+	str	x5, [x6]
+	dmb	sy
+	dc	ivac, x6		// Invalidate potentially stale cache line
+
+#if (VA_BITS < 48)
 #define EXTRA_SHIFT	(PGDIR_SHIFT + PAGE_SHIFT - 3)
-#define EXTRA_PTRS	(1 << (48 - EXTRA_SHIFT))
+#define EXTRA_PTRS	(1 << (PHYS_MASK_SHIFT - EXTRA_SHIFT))
 
 	/*
-	 * If VA_BITS < 48, it may be too small to allow for an ID mapping to be
-	 * created that covers system RAM if that is located sufficiently high
-	 * in the physical address space. So for the ID map, use an extended
-	 * virtual range in that case, by configuring an additional translation
-	 * level.
+	 * If VA_BITS < 48, we have to configure an additional table level.
 	 * First, we have to verify our assumption that the current value of
 	 * VA_BITS was chosen such that all translation levels are fully
 	 * utilised, and that lowering T0SZ will always result in an additional
@@ -263,30 +373,22 @@
 #error "Mismatch between VA_BITS and page size/number of translation levels"
 #endif
 
+	mov	x4, EXTRA_PTRS
+	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
+#else
 	/*
-	 * Calculate the maximum allowed value for TCR_EL1.T0SZ so that the
-	 * entire ID map region can be mapped. As T0SZ == (64 - #bits used),
-	 * this number conveniently equals the number of leading zeroes in
-	 * the physical address of __idmap_text_end.
+	 * If VA_BITS == 48, we don't have to configure an additional
+	 * translation level, but the top-level table has more entries.
 	 */
-	adrp	x5, __idmap_text_end
-	clz	x5, x5
-	cmp	x5, TCR_T0SZ(VA_BITS)	// default T0SZ small enough?
-	b.ge	1f			// .. then skip additional level
-
-	adr_l	x6, idmap_t0sz
-	str	x5, [x6]
-	dmb	sy
-	dc	ivac, x6		// Invalidate potentially stale cache line
-
-	create_table_entry x0, x3, EXTRA_SHIFT, EXTRA_PTRS, x5, x6
-1:
+	mov	x4, #1 << (PHYS_MASK_SHIFT - PGDIR_SHIFT)
+	str_l	x4, idmap_ptrs_per_pgd, x5
 #endif
-
-	create_pgd_entry x0, x3, x5, x6
+1:
+	ldr_l	x4, idmap_ptrs_per_pgd
 	mov	x5, x3				// __pa(__idmap_text_start)
 	adr_l	x6, __idmap_text_end		// __pa(__idmap_text_end)
-	create_block_map x0, x7, x3, x5, x6
+
+	map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x13, x14
 
 	/*
 	 * Map the kernel image (starting with PHYS_OFFSET).
@@ -294,12 +396,13 @@
 	adrp	x0, swapper_pg_dir
 	mov_q	x5, KIMAGE_VADDR + TEXT_OFFSET	// compile time __va(_text)
 	add	x5, x5, x23			// add KASLR displacement
-	create_pgd_entry x0, x5, x3, x6
+	mov	x4, PTRS_PER_PGD
 	adrp	x6, _end			// runtime __pa(_end)
 	adrp	x3, _text			// runtime __pa(_text)
 	sub	x6, x6, x3			// _end - _text
 	add	x6, x6, x5			// runtime __va(_end)
-	create_block_map x0, x7, x3, x5, x6
+
+	map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x13, x14
 
 	/*
 	 * Since the page tables have been populated with non-cacheable
@@ -307,7 +410,8 @@
 	 * tables again to remove any speculatively loaded cache lines.
 	 */
 	adrp	x0, idmap_pg_dir
-	ldr	x1, =(IDMAP_DIR_SIZE + SWAPPER_DIR_SIZE + RESERVED_TTBR0_SIZE)
+	adrp	x1, swapper_pg_end
+	sub	x1, x1, x0
 	dmb	sy
 	bl	__inval_dcache_area
 
@@ -388,17 +492,13 @@
 	mrs	x0, CurrentEL
 	cmp	x0, #CurrentEL_EL2
 	b.eq	1f
-	mrs	x0, sctlr_el1
-CPU_BE(	orr	x0, x0, #(3 << 24)	)	// Set the EE and E0E bits for EL1
-CPU_LE(	bic	x0, x0, #(3 << 24)	)	// Clear the EE and E0E bits for EL1
+	mov_q	x0, (SCTLR_EL1_RES1 | ENDIAN_SET_EL1)
 	msr	sctlr_el1, x0
 	mov	w0, #BOOT_CPU_MODE_EL1		// This cpu booted in EL1
 	isb
 	ret
 
-1:	mrs	x0, sctlr_el2
-CPU_BE(	orr	x0, x0, #(1 << 25)	)	// Set the EE bit for EL2
-CPU_LE(	bic	x0, x0, #(1 << 25)	)	// Clear the EE bit for EL2
+1:	mov_q	x0, (SCTLR_EL2_RES1 | ENDIAN_SET_EL2)
 	msr	sctlr_el2, x0
 
 #ifdef CONFIG_ARM64_VHE
@@ -514,10 +614,7 @@
 	 * requires no configuration, and all non-hyp-specific EL2 setup
 	 * will be done via the _EL1 system register aliases in __cpu_setup.
 	 */
-	/* sctlr_el1 */
-	mov	x0, #0x0800			// Set/clear RES{1,0} bits
-CPU_BE(	movk	x0, #0x33d0, lsl #16	)	// Set EE and E0E on BE systems
-CPU_LE(	movk	x0, #0x30d0, lsl #16	)	// Clear EE and E0E on LE systems
+	mov_q	x0, (SCTLR_EL1_RES1 | ENDIAN_SET_EL1)
 	msr	sctlr_el1, x0
 
 	/* Coprocessor traps. */
@@ -679,8 +776,10 @@
 	update_early_cpu_boot_status 0, x1, x2
 	adrp	x1, idmap_pg_dir
 	adrp	x2, swapper_pg_dir
-	msr	ttbr0_el1, x1			// load TTBR0
-	msr	ttbr1_el1, x2			// load TTBR1
+	phys_to_ttbr x1, x3
+	phys_to_ttbr x2, x4
+	msr	ttbr0_el1, x3			// load TTBR0
+	msr	ttbr1_el1, x4			// load TTBR1
 	isb
 	msr	sctlr_el1, x0
 	isb
diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S
index e56d848..84f5d52f 100644
--- a/arch/arm64/kernel/hibernate-asm.S
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -33,12 +33,14 @@
  * Even switching to our copied tables will cause a changed output address at
  * each stage of the walk.
  */
-.macro break_before_make_ttbr_switch zero_page, page_table
-	msr	ttbr1_el1, \zero_page
+.macro break_before_make_ttbr_switch zero_page, page_table, tmp
+	phys_to_ttbr \zero_page, \tmp
+	msr	ttbr1_el1, \tmp
 	isb
 	tlbi	vmalle1
 	dsb	nsh
-	msr	ttbr1_el1, \page_table
+	phys_to_ttbr \page_table, \tmp
+	msr	ttbr1_el1, \tmp
 	isb
 .endm
 
@@ -78,7 +80,7 @@
 	 * We execute from ttbr0, change ttbr1 to our copied linear map tables
 	 * with a break-before-make via the zero page
 	 */
-	break_before_make_ttbr_switch	x5, x0
+	break_before_make_ttbr_switch	x5, x0, x6
 
 	mov	x21, x1
 	mov	x30, x2
@@ -109,7 +111,7 @@
 	dsb	ish		/* wait for PoU cleaning to finish */
 
 	/* switch to the restored kernels page tables */
-	break_before_make_ttbr_switch	x25, x21
+	break_before_make_ttbr_switch	x25, x21, x6
 
 	ic	ialluis
 	dsb	ish
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 3009b8b..f20cf7e 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -247,8 +247,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
 	}
 
 	pte = pte_offset_kernel(pmd, dst_addr);
-	set_pte(pte, __pte(virt_to_phys((void *)dst) |
-			 pgprot_val(PAGE_KERNEL_EXEC)));
+	set_pte(pte, pfn_pte(virt_to_pfn(dst), PAGE_KERNEL_EXEC));
 
 	/*
 	 * Load our new page tables. A strict BBM approach requires that we
@@ -264,7 +263,7 @@ static int create_safe_exec_page(void *src_start, size_t length,
 	 */
 	cpu_set_reserved_ttbr0();
 	local_flush_tlb_all();
-	write_sysreg(virt_to_phys(pgd), ttbr0_el1);
+	write_sysreg(phys_to_ttbr(virt_to_phys(pgd)), ttbr0_el1);
 	isb();
 
 	*phys_dst_addr = virt_to_phys((void *)dst);
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 713561e..60e5fc6 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -29,6 +29,7 @@
 #include <linux/irqchip.h>
 #include <linux/seq_file.h>
 #include <linux/vmalloc.h>
+#include <asm/vmap_stack.h>
 
 unsigned long irq_err_count;
 
@@ -58,17 +59,7 @@ static void init_irq_stacks(void)
 	unsigned long *p;
 
 	for_each_possible_cpu(cpu) {
-		/*
-		* To ensure that VMAP'd stack overflow detection works
-		* correctly, the IRQ stacks need to have the same
-		* alignment as other stacks.
-		*/
-		p = __vmalloc_node_range(IRQ_STACK_SIZE, THREAD_ALIGN,
-					 VMALLOC_START, VMALLOC_END,
-					 THREADINFO_GFP, PAGE_KERNEL,
-					 0, cpu_to_node(cpu),
-					 __builtin_return_address(0));
-
+		p = arch_alloc_vmap_stack(IRQ_STACK_SIZE, cpu_to_node(cpu));
 		per_cpu(irq_stack_ptr, cpu) = p;
 	}
 }
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 6b7dcf4..583fd81 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -370,16 +370,14 @@ void tls_preserve_current_state(void)
 
 static void tls_thread_switch(struct task_struct *next)
 {
-	unsigned long tpidr, tpidrro;
-
 	tls_preserve_current_state();
 
-	tpidr = *task_user_tls(next);
-	tpidrro = is_compat_thread(task_thread_info(next)) ?
-		  next->thread.tp_value : 0;
+	if (is_compat_thread(task_thread_info(next)))
+		write_sysreg(next->thread.tp_value, tpidrro_el0);
+	else if (!arm64_kernel_unmapped_at_el0())
+		write_sysreg(0, tpidrro_el0);
 
-	write_sysreg(tpidr, tpidr_el0);
-	write_sysreg(tpidrro, tpidrro_el0);
+	write_sysreg(*task_user_tls(next), tpidr_el0);
 }
 
 /* Restore the UAO state depending on next's addr_limit */
diff --git a/arch/arm64/kernel/sdei.c b/arch/arm64/kernel/sdei.c
new file mode 100644
index 0000000..6b8d90d5
--- /dev/null
+++ b/arch/arm64/kernel/sdei.c
@@ -0,0 +1,235 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2017 Arm Ltd.
+#define pr_fmt(fmt) "sdei: " fmt
+
+#include <linux/arm_sdei.h>
+#include <linux/hardirq.h>
+#include <linux/irqflags.h>
+#include <linux/sched/task_stack.h>
+#include <linux/uaccess.h>
+
+#include <asm/alternative.h>
+#include <asm/kprobes.h>
+#include <asm/mmu.h>
+#include <asm/ptrace.h>
+#include <asm/sections.h>
+#include <asm/sysreg.h>
+#include <asm/vmap_stack.h>
+
+unsigned long sdei_exit_mode;
+
+/*
+ * VMAP'd stacks checking for stack overflow on exception using sp as a scratch
+ * register, meaning SDEI has to switch to its own stack. We need two stacks as
+ * a critical event may interrupt a normal event that has just taken a
+ * synchronous exception, and is using sp as scratch register. For a critical
+ * event interrupting a normal event, we can't reliably tell if we were on the
+ * sdei stack.
+ * For now, we allocate stacks when the driver is probed.
+ */
+DECLARE_PER_CPU(unsigned long *, sdei_stack_normal_ptr);
+DECLARE_PER_CPU(unsigned long *, sdei_stack_critical_ptr);
+
+#ifdef CONFIG_VMAP_STACK
+DEFINE_PER_CPU(unsigned long *, sdei_stack_normal_ptr);
+DEFINE_PER_CPU(unsigned long *, sdei_stack_critical_ptr);
+#endif
+
+static void _free_sdei_stack(unsigned long * __percpu *ptr, int cpu)
+{
+	unsigned long *p;
+
+	p = per_cpu(*ptr, cpu);
+	if (p) {
+		per_cpu(*ptr, cpu) = NULL;
+		vfree(p);
+	}
+}
+
+static void free_sdei_stacks(void)
+{
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		_free_sdei_stack(&sdei_stack_normal_ptr, cpu);
+		_free_sdei_stack(&sdei_stack_critical_ptr, cpu);
+	}
+}
+
+static int _init_sdei_stack(unsigned long * __percpu *ptr, int cpu)
+{
+	unsigned long *p;
+
+	p = arch_alloc_vmap_stack(SDEI_STACK_SIZE, cpu_to_node(cpu));
+	if (!p)
+		return -ENOMEM;
+	per_cpu(*ptr, cpu) = p;
+
+	return 0;
+}
+
+static int init_sdei_stacks(void)
+{
+	int cpu;
+	int err = 0;
+
+	for_each_possible_cpu(cpu) {
+		err = _init_sdei_stack(&sdei_stack_normal_ptr, cpu);
+		if (err)
+			break;
+		err = _init_sdei_stack(&sdei_stack_critical_ptr, cpu);
+		if (err)
+			break;
+	}
+
+	if (err)
+		free_sdei_stacks();
+
+	return err;
+}
+
+bool _on_sdei_stack(unsigned long sp)
+{
+	unsigned long low, high;
+
+	if (!IS_ENABLED(CONFIG_VMAP_STACK))
+		return false;
+
+	low = (unsigned long)raw_cpu_read(sdei_stack_critical_ptr);
+	high = low + SDEI_STACK_SIZE;
+
+	if (low <= sp && sp < high)
+		return true;
+
+	low = (unsigned long)raw_cpu_read(sdei_stack_normal_ptr);
+	high = low + SDEI_STACK_SIZE;
+
+	return (low <= sp && sp < high);
+}
+
+unsigned long sdei_arch_get_entry_point(int conduit)
+{
+	/*
+	 * SDEI works between adjacent exception levels. If we booted at EL1 we
+	 * assume a hypervisor is marshalling events. If we booted at EL2 and
+	 * dropped to EL1 because we don't support VHE, then we can't support
+	 * SDEI.
+	 */
+	if (is_hyp_mode_available() && !is_kernel_in_hyp_mode()) {
+		pr_err("Not supported on this hardware/boot configuration\n");
+		return 0;
+	}
+
+	if (IS_ENABLED(CONFIG_VMAP_STACK)) {
+		if (init_sdei_stacks())
+			return 0;
+	}
+
+	sdei_exit_mode = (conduit == CONDUIT_HVC) ? SDEI_EXIT_HVC : SDEI_EXIT_SMC;
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	if (arm64_kernel_unmapped_at_el0()) {
+		unsigned long offset;
+
+		offset = (unsigned long)__sdei_asm_entry_trampoline -
+			 (unsigned long)__entry_tramp_text_start;
+		return TRAMP_VALIAS + offset;
+	} else
+#endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
+		return (unsigned long)__sdei_asm_handler;
+
+}
+
+/*
+ * __sdei_handler() returns one of:
+ *  SDEI_EV_HANDLED -  success, return to the interrupted context.
+ *  SDEI_EV_FAILED  -  failure, return this error code to firmare.
+ *  virtual-address -  success, return to this address.
+ */
+static __kprobes unsigned long _sdei_handler(struct pt_regs *regs,
+					     struct sdei_registered_event *arg)
+{
+	u32 mode;
+	int i, err = 0;
+	int clobbered_registers = 4;
+	u64 elr = read_sysreg(elr_el1);
+	u32 kernel_mode = read_sysreg(CurrentEL) | 1;	/* +SPSel */
+	unsigned long vbar = read_sysreg(vbar_el1);
+
+	if (arm64_kernel_unmapped_at_el0())
+		clobbered_registers++;
+
+	/* Retrieve the missing registers values */
+	for (i = 0; i < clobbered_registers; i++) {
+		/* from within the handler, this call always succeeds */
+		sdei_api_event_context(i, &regs->regs[i]);
+	}
+
+	/*
+	 * We didn't take an exception to get here, set PAN. UAO will be cleared
+	 * by sdei_event_handler()s set_fs(USER_DS) call.
+	 */
+	__uaccess_enable_hw_pan();
+
+	err = sdei_event_handler(regs, arg);
+	if (err)
+		return SDEI_EV_FAILED;
+
+	if (elr != read_sysreg(elr_el1)) {
+		/*
+		 * We took a synchronous exception from the SDEI handler.
+		 * This could deadlock, and if you interrupt KVM it will
+		 * hyp-panic instead.
+		 */
+		pr_warn("unsafe: exception during handler\n");
+	}
+
+	mode = regs->pstate & (PSR_MODE32_BIT | PSR_MODE_MASK);
+
+	/*
+	 * If we interrupted the kernel with interrupts masked, we always go
+	 * back to wherever we came from.
+	 */
+	if (mode == kernel_mode && !interrupts_enabled(regs))
+		return SDEI_EV_HANDLED;
+
+	/*
+	 * Otherwise, we pretend this was an IRQ. This lets user space tasks
+	 * receive signals before we return to them, and KVM to invoke it's
+	 * world switch to do the same.
+	 *
+	 * See DDI0487B.a Table D1-7 'Vector offsets from vector table base
+	 * address'.
+	 */
+	if (mode == kernel_mode)
+		return vbar + 0x280;
+	else if (mode & PSR_MODE32_BIT)
+		return vbar + 0x680;
+
+	return vbar + 0x480;
+}
+
+
+asmlinkage __kprobes notrace unsigned long
+__sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
+{
+	unsigned long ret;
+	bool do_nmi_exit = false;
+
+	/*
+	 * nmi_enter() deals with printk() re-entrance and use of RCU when
+	 * RCU believed this CPU was idle. Because critical events can
+	 * interrupt normal events, we may already be in_nmi().
+	 */
+	if (!in_nmi()) {
+		nmi_enter();
+		do_nmi_exit = true;
+	}
+
+	ret = _sdei_handler(regs, arg);
+
+	if (do_nmi_exit)
+		nmi_exit();
+
+	return ret;
+}
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index b120111..f60c052 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -178,7 +178,8 @@ static void __user *apply_user_offset(
 
 static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
 {
-	struct fpsimd_state *fpsimd = &current->thread.fpsimd_state;
+	struct user_fpsimd_state const *fpsimd =
+		&current->thread.fpsimd_state.user_fpsimd;
 	int err;
 
 	/* copy the FP and status/control registers */
@@ -195,7 +196,7 @@ static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
 
 static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
 {
-	struct fpsimd_state fpsimd;
+	struct user_fpsimd_state fpsimd;
 	__u32 magic, size;
 	int err = 0;
 
@@ -266,7 +267,7 @@ static int restore_sve_fpsimd_context(struct user_ctxs *user)
 {
 	int err;
 	unsigned int vq;
-	struct fpsimd_state fpsimd;
+	struct user_fpsimd_state fpsimd;
 	struct sve_context sve;
 
 	if (__copy_from_user(&sve, user->sve, sizeof(sve)))
diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
index cbc4edd..79feb86 100644
--- a/arch/arm64/kernel/signal32.c
+++ b/arch/arm64/kernel/signal32.c
@@ -148,7 +148,8 @@ union __fpsimd_vreg {
 
 static int compat_preserve_vfp_context(struct compat_vfp_sigframe __user *frame)
 {
-	struct fpsimd_state *fpsimd = &current->thread.fpsimd_state;
+	struct user_fpsimd_state const *fpsimd =
+		&current->thread.fpsimd_state.user_fpsimd;
 	compat_ulong_t magic = VFP_MAGIC;
 	compat_ulong_t size = VFP_STORAGE_SIZE;
 	compat_ulong_t fpscr, fpexc;
@@ -197,7 +198,7 @@ static int compat_preserve_vfp_context(struct compat_vfp_sigframe __user *frame)
 
 static int compat_restore_vfp_context(struct compat_vfp_sigframe __user *frame)
 {
-	struct fpsimd_state fpsimd;
+	struct user_fpsimd_state fpsimd;
 	compat_ulong_t magic = VFP_MAGIC;
 	compat_ulong_t size = VFP_STORAGE_SIZE;
 	compat_ulong_t fpscr;
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 551eb07..3b8ad7b 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -18,6 +18,7 @@
  */
 
 #include <linux/acpi.h>
+#include <linux/arm_sdei.h>
 #include <linux/delay.h>
 #include <linux/init.h>
 #include <linux/spinlock.h>
@@ -836,6 +837,7 @@ static void ipi_cpu_stop(unsigned int cpu)
 	set_cpu_online(cpu, false);
 
 	local_daif_mask();
+	sdei_mask_local_cpu();
 
 	while (1)
 		cpu_relax();
@@ -853,6 +855,7 @@ static void ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
 	atomic_dec(&waiting_for_crash_ipi);
 
 	local_irq_disable();
+	sdei_mask_local_cpu();
 
 #ifdef CONFIG_HOTPLUG_CPU
 	if (cpu_ops[cpu]->cpu_die)
@@ -972,6 +975,8 @@ void smp_send_stop(void)
 	if (num_online_cpus() > 1)
 		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
 			   cpumask_pr_args(cpu_online_mask));
+
+	sdei_mask_local_cpu();
 }
 
 #ifdef CONFIG_KEXEC_CORE
@@ -990,8 +995,10 @@ void crash_smp_send_stop(void)
 
 	cpus_stopped = 1;
 
-	if (num_online_cpus() == 1)
+	if (num_online_cpus() == 1) {
+		sdei_mask_local_cpu();
 		return;
+	}
 
 	cpumask_copy(&mask, cpu_online_mask);
 	cpumask_clear_cpu(smp_processor_id(), &mask);
@@ -1009,6 +1016,8 @@ void crash_smp_send_stop(void)
 	if (atomic_read(&waiting_for_crash_ipi) > 0)
 		pr_warning("SMP: failed to stop secondary CPUs %*pbl\n",
 			   cpumask_pr_args(&mask));
+
+	sdei_mask_local_cpu();
 }
 
 bool smp_crash_stop_failed(void)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 3fe5ad8..a307b9e 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -2,6 +2,7 @@
 #include <linux/ftrace.h>
 #include <linux/percpu.h>
 #include <linux/slab.h>
+#include <linux/uaccess.h>
 #include <asm/alternative.h>
 #include <asm/cacheflush.h>
 #include <asm/cpufeature.h>
@@ -51,8 +52,7 @@ void notrace __cpu_suspend_exit(void)
 	 * PSTATE was not saved over suspend/resume, re-enable any detected
 	 * features that might not have been set correctly.
 	 */
-	asm(ALTERNATIVE("nop", SET_PSTATE_PAN(1), ARM64_HAS_PAN,
-			CONFIG_ARM64_PAN));
+	__uaccess_enable_hw_pan();
 	uao_thread_switch(current);
 
 	/*
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 8d48b23..2186853 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -37,18 +37,14 @@ static int __init get_cpu_for_node(struct device_node *node)
 	if (!cpu_node)
 		return -1;
 
-	for_each_possible_cpu(cpu) {
-		if (of_get_cpu_node(cpu, NULL) == cpu_node) {
-			topology_parse_cpu_capacity(cpu_node, cpu);
-			of_node_put(cpu_node);
-			return cpu;
-		}
-	}
-
-	pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
+	cpu = of_cpu_node_to_id(cpu_node);
+	if (cpu >= 0)
+		topology_parse_cpu_capacity(cpu_node, cpu);
+	else
+		pr_crit("Unable to find CPU node for %pOF\n", cpu_node);
 
 	of_node_put(cpu_node);
-	return -1;
+	return cpu;
 }
 
 static int __init parse_core(struct device_node *core, int cluster_id,
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 3d3588f..bbb0fde 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -662,17 +662,58 @@ asmlinkage void handle_bad_stack(struct pt_regs *regs)
 }
 #endif
 
-asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+void __noreturn arm64_serror_panic(struct pt_regs *regs, u32 esr)
 {
-	nmi_enter();
-
 	console_verbose();
 
 	pr_crit("SError Interrupt on CPU%d, code 0x%08x -- %s\n",
 		smp_processor_id(), esr, esr_get_class_string(esr));
-	__show_regs(regs);
+	if (regs)
+		__show_regs(regs);
 
-	panic("Asynchronous SError Interrupt");
+	nmi_panic(regs, "Asynchronous SError Interrupt");
+
+	cpu_park_loop();
+	unreachable();
+}
+
+bool arm64_is_fatal_ras_serror(struct pt_regs *regs, unsigned int esr)
+{
+	u32 aet = arm64_ras_serror_get_severity(esr);
+
+	switch (aet) {
+	case ESR_ELx_AET_CE:	/* corrected error */
+	case ESR_ELx_AET_UEO:	/* restartable, not yet consumed */
+		/*
+		 * The CPU can make progress. We may take UEO again as
+		 * a more severe error.
+		 */
+		return false;
+
+	case ESR_ELx_AET_UEU:	/* Uncorrected Unrecoverable */
+	case ESR_ELx_AET_UER:	/* Uncorrected Recoverable */
+		/*
+		 * The CPU can't make progress. The exception may have
+		 * been imprecise.
+		 */
+		return true;
+
+	case ESR_ELx_AET_UC:	/* Uncontainable or Uncategorized error */
+	default:
+		/* Error has been silently propagated */
+		arm64_serror_panic(regs, esr);
+	}
+}
+
+asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
+{
+	nmi_enter();
+
+	/* non-RAS errors are not containable */
+	if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
+		arm64_serror_panic(regs, esr);
+
+	nmi_exit();
 }
 
 void __pte_error(const char *file, int line, unsigned long val)
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 7da3e5c..0221aca 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -57,6 +57,17 @@
 #define HIBERNATE_TEXT
 #endif
 
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define TRAMP_TEXT					\
+	. = ALIGN(PAGE_SIZE);				\
+	VMLINUX_SYMBOL(__entry_tramp_text_start) = .;	\
+	*(.entry.tramp.text)				\
+	. = ALIGN(PAGE_SIZE);				\
+	VMLINUX_SYMBOL(__entry_tramp_text_end) = .;
+#else
+#define TRAMP_TEXT
+#endif
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -113,6 +124,7 @@
 			HYPERVISOR_TEXT
 			IDMAP_TEXT
 			HIBERNATE_TEXT
+			TRAMP_TEXT
 			*(.fixup)
 			*(.gnu.warning)
 		. = ALIGN(16);
@@ -206,13 +218,19 @@
 	. = ALIGN(PAGE_SIZE);
 	idmap_pg_dir = .;
 	. += IDMAP_DIR_SIZE;
-	swapper_pg_dir = .;
-	. += SWAPPER_DIR_SIZE;
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+	tramp_pg_dir = .;
+	. += PAGE_SIZE;
+#endif
 
 #ifdef CONFIG_ARM64_SW_TTBR0_PAN
 	reserved_ttbr0 = .;
 	. += RESERVED_TTBR0_SIZE;
 #endif
+	swapper_pg_dir = .;
+	. += SWAPPER_DIR_SIZE;
+	swapper_pg_end = .;
 
 	__pecoff_data_size = ABSOLUTE(. - __initdata_begin);
 	_end = .;
@@ -234,7 +252,10 @@
 ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
 	<= SZ_4K, "Hibernate exit text too big or misaligned")
 #endif
-
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+ASSERT((__entry_tramp_text_end - __entry_tramp_text_start) == PAGE_SIZE,
+	"Entry trampoline text too big")
+#endif
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
  */
diff --git a/arch/arm64/kvm/handle_exit.c b/arch/arm64/kvm/handle_exit.c
index 304203f..520b0da 100644
--- a/arch/arm64/kvm/handle_exit.c
+++ b/arch/arm64/kvm/handle_exit.c
@@ -23,18 +23,26 @@
 #include <linux/kvm_host.h>
 
 #include <asm/esr.h>
+#include <asm/exception.h>
 #include <asm/kvm_asm.h>
 #include <asm/kvm_coproc.h>
 #include <asm/kvm_emulate.h>
 #include <asm/kvm_mmu.h>
 #include <asm/kvm_psci.h>
 #include <asm/debug-monitors.h>
+#include <asm/traps.h>
 
 #define CREATE_TRACE_POINTS
 #include "trace.h"
 
 typedef int (*exit_handle_fn)(struct kvm_vcpu *, struct kvm_run *);
 
+static void kvm_handle_guest_serror(struct kvm_vcpu *vcpu, u32 esr)
+{
+	if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(NULL, esr))
+		kvm_inject_vabt(vcpu);
+}
+
 static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
 	int ret;
@@ -45,7 +53,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 	ret = kvm_psci_call(vcpu);
 	if (ret < 0) {
-		kvm_inject_undefined(vcpu);
+		vcpu_set_reg(vcpu, 0, ~0UL);
 		return 1;
 	}
 
@@ -54,7 +62,7 @@ static int handle_hvc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 
 static int handle_smc(struct kvm_vcpu *vcpu, struct kvm_run *run)
 {
-	kvm_inject_undefined(vcpu);
+	vcpu_set_reg(vcpu, 0, ~0UL);
 	return 1;
 }
 
@@ -242,7 +250,6 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 			*vcpu_pc(vcpu) -= adj;
 		}
 
-		kvm_inject_vabt(vcpu);
 		return 1;
 	}
 
@@ -252,7 +259,6 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 	case ARM_EXCEPTION_IRQ:
 		return 1;
 	case ARM_EXCEPTION_EL1_SERROR:
-		kvm_inject_vabt(vcpu);
 		/* We may still need to return for single-step */
 		if (!(*vcpu_cpsr(vcpu) & DBG_SPSR_SS)
 			&& kvm_arm_handle_step_debug(vcpu, run))
@@ -275,3 +281,25 @@ int handle_exit(struct kvm_vcpu *vcpu, struct kvm_run *run,
 		return 0;
 	}
 }
+
+/* For exit types that need handling before we can be preempted */
+void handle_exit_early(struct kvm_vcpu *vcpu, struct kvm_run *run,
+		       int exception_index)
+{
+	if (ARM_SERROR_PENDING(exception_index)) {
+		if (this_cpu_has_cap(ARM64_HAS_RAS_EXTN)) {
+			u64 disr = kvm_vcpu_get_disr(vcpu);
+
+			kvm_handle_guest_serror(vcpu, disr_to_esr(disr));
+		} else {
+			kvm_inject_vabt(vcpu);
+		}
+
+		return;
+	}
+
+	exception_index = ARM_EXCEPTION_CODE(exception_index);
+
+	if (exception_index == ARM_EXCEPTION_EL1_SERROR)
+		kvm_handle_guest_serror(vcpu, kvm_vcpu_get_hsr(vcpu));
+}
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 870828c..e086c6ef 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -63,7 +63,8 @@
 	cmp	x0, #HVC_STUB_HCALL_NR
 	b.lo	__kvm_handle_stub_hvc
 
-	msr	ttbr0_el2, x0
+	phys_to_ttbr x0, x4
+	msr	ttbr0_el2, x4
 
 	mrs	x4, tcr_el1
 	ldr	x5, =TCR_EL2_MASK
@@ -71,30 +72,27 @@
 	mov	x5, #TCR_EL2_RES1
 	orr	x4, x4, x5
 
-#ifndef CONFIG_ARM64_VA_BITS_48
 	/*
-	 * If we are running with VA_BITS < 48, we may be running with an extra
-	 * level of translation in the ID map. This is only the case if system
-	 * RAM is out of range for the currently configured page size and number
-	 * of translation levels, in which case we will also need the extra
-	 * level for the HYP ID map, or we won't be able to enable the EL2 MMU.
+	 * The ID map may be configured to use an extended virtual address
+	 * range. This is only the case if system RAM is out of range for the
+	 * currently configured page size and VA_BITS, in which case we will
+	 * also need the extended virtual range for the HYP ID map, or we won't
+	 * be able to enable the EL2 MMU.
 	 *
 	 * However, at EL2, there is only one TTBR register, and we can't switch
 	 * between translation tables *and* update TCR_EL2.T0SZ at the same
-	 * time. Bottom line: we need the extra level in *both* our translation
-	 * tables.
+	 * time. Bottom line: we need to use the extended range with *both* our
+	 * translation tables.
 	 *
 	 * So use the same T0SZ value we use for the ID map.
 	 */
 	ldr_l	x5, idmap_t0sz
 	bfi	x4, x5, TCR_T0SZ_OFFSET, TCR_TxSZ_WIDTH
-#endif
+
 	/*
-	 * Read the PARange bits from ID_AA64MMFR0_EL1 and set the PS bits in
-	 * TCR_EL2.
+	 * Set the PS bits in TCR_EL2.
 	 */
-	mrs	x5, ID_AA64MMFR0_EL1
-	bfi	x4, x5, #16, #3
+	tcr_compute_pa_size x4, #TCR_EL2_PS_SHIFT, x5, x6
 
 	msr	tcr_el2, x4
 
@@ -122,6 +120,10 @@
 	kern_hyp_va	x2
 	msr	vbar_el2, x2
 
+	/* copy tpidr_el1 into tpidr_el2 for use by HYP */
+	mrs	x1, tpidr_el1
+	msr	tpidr_el2, x1
+
 	/* Hello, World! */
 	eret
 ENDPROC(__kvm_hyp_init)
diff --git a/arch/arm64/kvm/hyp/entry.S b/arch/arm64/kvm/hyp/entry.S
index 12ee62d..fdd1068 100644
--- a/arch/arm64/kvm/hyp/entry.S
+++ b/arch/arm64/kvm/hyp/entry.S
@@ -62,8 +62,8 @@
 	// Store the host regs
 	save_callee_saved_regs x1
 
-	// Store the host_ctxt for use at exit time
-	str	x1, [sp, #-16]!
+	// Store host_ctxt and vcpu for use at exit time
+	stp	x1, x0, [sp, #-16]!
 
 	add	x18, x0, #VCPU_CONTEXT
 
@@ -124,6 +124,17 @@
 	// Now restore the host regs
 	restore_callee_saved_regs x2
 
+alternative_if ARM64_HAS_RAS_EXTN
+	// If we have the RAS extensions we can consume a pending error
+	// without an unmask-SError and isb.
+	esb
+	mrs_s	x2, SYS_DISR_EL1
+	str	x2, [x1, #(VCPU_FAULT_DISR - VCPU_CONTEXT)]
+	cbz	x2, 1f
+	msr_s	SYS_DISR_EL1, xzr
+	orr	x0, x0, #(1<<ARM_EXIT_WITH_SERROR_BIT)
+1:	ret
+alternative_else
 	// If we have a pending asynchronous abort, now is the
 	// time to find out. From your VAXorcist book, page 666:
 	// "Threaten me not, oh Evil one!  For I speak with
@@ -134,7 +145,9 @@
 	mov	x5, x0
 
 	dsb	sy		// Synchronize against in-flight ld/st
+	nop
 	msr	daifclr, #4	// Unmask aborts
+alternative_endif
 
 	// This is our single instruction exception window. A pending
 	// SError is guaranteed to occur at the earliest when we unmask
@@ -159,6 +172,10 @@
 ENDPROC(__guest_exit)
 
 ENTRY(__fpsimd_guest_restore)
+	// x0: esr
+	// x1: vcpu
+	// x2-x29,lr: vcpu regs
+	// vcpu x0-x1 on the stack
 	stp	x2, x3, [sp, #-16]!
 	stp	x4, lr, [sp, #-16]!
 
@@ -173,7 +190,7 @@
 alternative_endif
 	isb
 
-	mrs	x3, tpidr_el2
+	mov	x3, x1
 
 	ldr	x0, [x3, #VCPU_HOST_CONTEXT]
 	kern_hyp_va x0
@@ -196,3 +213,15 @@
 
 	eret
 ENDPROC(__fpsimd_guest_restore)
+
+ENTRY(__qcom_hyp_sanitize_btac_predictors)
+	/**
+	 * Call SMC64 with Silicon provider serviceID 23<<8 (0xc2001700)
+	 * 0xC2000000-0xC200FFFF: assigned to SiP Service Calls
+	 * b15-b0: contains SiP functionID
+	 */
+	movz    x0, #0x1700
+	movk    x0, #0xc200, lsl #16
+	smc     #0
+	ret
+ENDPROC(__qcom_hyp_sanitize_btac_predictors)
diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index 5170ce1..e4f37b9 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -104,6 +104,7 @@
 	/*
 	 * x0: ESR_EC
 	 */
+	ldr	x1, [sp, #16 + 8]	// vcpu stored by __guest_enter
 
 	/*
 	 * We trap the first access to the FP/SIMD to save the host context
@@ -116,19 +117,18 @@
 	b.eq	__fpsimd_guest_restore
 alternative_else_nop_endif
 
-	mrs	x1, tpidr_el2
 	mov	x0, #ARM_EXCEPTION_TRAP
 	b	__guest_exit
 
 el1_irq:
 	stp     x0, x1, [sp, #-16]!
-	mrs	x1, tpidr_el2
+	ldr	x1, [sp, #16 + 8]
 	mov	x0, #ARM_EXCEPTION_IRQ
 	b	__guest_exit
 
 el1_error:
 	stp     x0, x1, [sp, #-16]!
-	mrs	x1, tpidr_el2
+	ldr	x1, [sp, #16 + 8]
 	mov	x0, #ARM_EXCEPTION_EL1_SERROR
 	b	__guest_exit
 
@@ -163,6 +163,18 @@
 	eret
 ENDPROC(__hyp_do_panic)
 
+ENTRY(__hyp_panic)
+	/*
+	 * '=kvm_host_cpu_state' is a host VA from the constant pool, it may
+	 * not be accessible by this address from EL2, hyp_panic() converts
+	 * it with kern_hyp_va() before use.
+	 */
+	ldr	x0, =kvm_host_cpu_state
+	mrs	x1, tpidr_el2
+	add	x0, x0, x1
+	b	hyp_panic
+ENDPROC(__hyp_panic)
+
 .macro invalid_vector	label, target = __hyp_panic
 	.align	2
 \label:
diff --git a/arch/arm64/kvm/hyp/s2-setup.c b/arch/arm64/kvm/hyp/s2-setup.c
index a81f5e1..603e1ee 100644
--- a/arch/arm64/kvm/hyp/s2-setup.c
+++ b/arch/arm64/kvm/hyp/s2-setup.c
@@ -32,6 +32,8 @@ u32 __hyp_text __init_stage2_translation(void)
 	 * PS is only 3. Fortunately, bit 19 is RES0 in VTCR_EL2...
 	 */
 	parange = read_sysreg(id_aa64mmfr0_el1) & 7;
+	if (parange > ID_AA64MMFR0_PARANGE_MAX)
+		parange = ID_AA64MMFR0_PARANGE_MAX;
 	val |= parange << 16;
 
 	/* Compute the actual PARange... */
diff --git a/arch/arm64/kvm/hyp/switch.c b/arch/arm64/kvm/hyp/switch.c
index f7c651f..036e1f3 100644
--- a/arch/arm64/kvm/hyp/switch.c
+++ b/arch/arm64/kvm/hyp/switch.c
@@ -17,6 +17,7 @@
 
 #include <linux/types.h>
 #include <linux/jump_label.h>
+#include <uapi/linux/psci.h>
 
 #include <asm/kvm_asm.h>
 #include <asm/kvm_emulate.h>
@@ -52,7 +53,7 @@ static void __hyp_text __activate_traps_vhe(void)
 	val &= ~(CPACR_EL1_FPEN | CPACR_EL1_ZEN);
 	write_sysreg(val, cpacr_el1);
 
-	write_sysreg(__kvm_hyp_vector, vbar_el1);
+	write_sysreg(kvm_get_hyp_vector(), vbar_el1);
 }
 
 static void __hyp_text __activate_traps_nvhe(void)
@@ -93,6 +94,9 @@ static void __hyp_text __activate_traps(struct kvm_vcpu *vcpu)
 
 	write_sysreg(val, hcr_el2);
 
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN) && (val & HCR_VSE))
+		write_sysreg_s(vcpu->arch.vsesr_el2, SYS_VSESR_EL2);
+
 	/* Trap on AArch32 cp15 c15 accesses (EL1 or EL0) */
 	write_sysreg(1 << 15, hstr_el2);
 	/*
@@ -235,11 +239,12 @@ static bool __hyp_text __translate_far_to_hpfar(u64 far, u64 *hpfar)
 
 static bool __hyp_text __populate_fault_info(struct kvm_vcpu *vcpu)
 {
-	u64 esr = read_sysreg_el2(esr);
-	u8 ec = ESR_ELx_EC(esr);
+	u8 ec;
+	u64 esr;
 	u64 hpfar, far;
 
-	vcpu->arch.fault.esr_el2 = esr;
+	esr = vcpu->arch.fault.esr_el2;
+	ec = ESR_ELx_EC(esr);
 
 	if (ec != ESR_ELx_EC_DABT_LOW && ec != ESR_ELx_EC_IABT_LOW)
 		return true;
@@ -305,9 +310,9 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	u64 exit_code;
 
 	vcpu = kern_hyp_va(vcpu);
-	write_sysreg(vcpu, tpidr_el2);
 
 	host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
+	host_ctxt->__hyp_running_vcpu = vcpu;
 	guest_ctxt = &vcpu->arch.ctxt;
 
 	__sysreg_save_host_state(host_ctxt);
@@ -332,6 +337,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	exit_code = __guest_enter(vcpu, host_ctxt);
 	/* And we're baaack! */
 
+	if (ARM_EXCEPTION_CODE(exit_code) != ARM_EXCEPTION_IRQ)
+		vcpu->arch.fault.esr_el2 = read_sysreg_el2(esr);
 	/*
 	 * We're using the raw exception code in order to only process
 	 * the trap if no SError is pending. We will come back to the
@@ -341,6 +348,18 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	if (exit_code == ARM_EXCEPTION_TRAP && !__populate_fault_info(vcpu))
 		goto again;
 
+	if (exit_code == ARM_EXCEPTION_TRAP &&
+	    (kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC64 ||
+	     kvm_vcpu_trap_get_class(vcpu) == ESR_ELx_EC_HVC32) &&
+	    vcpu_get_reg(vcpu, 0) == PSCI_0_2_FN_PSCI_VERSION) {
+		u64 val = PSCI_RET_NOT_SUPPORTED;
+		if (test_bit(KVM_ARM_VCPU_PSCI_0_2, vcpu->arch.features))
+			val = 2;
+
+		vcpu_set_reg(vcpu, 0, val);
+		goto again;
+	}
+
 	if (static_branch_unlikely(&vgic_v2_cpuif_trap) &&
 	    exit_code == ARM_EXCEPTION_TRAP) {
 		bool valid;
@@ -393,6 +412,14 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 		/* 0 falls through to be handled out of EL2 */
 	}
 
+	if (cpus_have_const_cap(ARM64_HARDEN_BP_POST_GUEST_EXIT)) {
+		u32 midr = read_cpuid_id();
+
+		/* Apply BTAC predictors mitigation to all Falkor chips */
+		if ((midr & MIDR_CPU_MODEL_MASK) == MIDR_QCOM_FALKOR_V1)
+			__qcom_hyp_sanitize_btac_predictors();
+	}
+
 	fp_enabled = __fpsimd_enabled();
 
 	__sysreg_save_guest_state(guest_ctxt);
@@ -422,7 +449,8 @@ int __hyp_text __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 
 static const char __hyp_panic_string[] = "HYP panic:\nPS:%08llx PC:%016llx ESR:%08llx\nFAR:%016llx HPFAR:%016llx PAR:%016llx\nVCPU:%p\n";
 
-static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par)
+static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par,
+					     struct kvm_vcpu *vcpu)
 {
 	unsigned long str_va;
 
@@ -436,35 +464,35 @@ static void __hyp_text __hyp_call_panic_nvhe(u64 spsr, u64 elr, u64 par)
 	__hyp_do_panic(str_va,
 		       spsr,  elr,
 		       read_sysreg(esr_el2),   read_sysreg_el2(far),
-		       read_sysreg(hpfar_el2), par,
-		       (void *)read_sysreg(tpidr_el2));
+		       read_sysreg(hpfar_el2), par, vcpu);
 }
 
-static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par)
+static void __hyp_text __hyp_call_panic_vhe(u64 spsr, u64 elr, u64 par,
+					    struct kvm_vcpu *vcpu)
 {
 	panic(__hyp_panic_string,
 	      spsr,  elr,
 	      read_sysreg_el2(esr),   read_sysreg_el2(far),
-	      read_sysreg(hpfar_el2), par,
-	      (void *)read_sysreg(tpidr_el2));
+	      read_sysreg(hpfar_el2), par, vcpu);
 }
 
 static hyp_alternate_select(__hyp_call_panic,
 			    __hyp_call_panic_nvhe, __hyp_call_panic_vhe,
 			    ARM64_HAS_VIRT_HOST_EXTN);
 
-void __hyp_text __noreturn __hyp_panic(void)
+void __hyp_text __noreturn hyp_panic(struct kvm_cpu_context *__host_ctxt)
 {
+	struct kvm_vcpu *vcpu = NULL;
+
 	u64 spsr = read_sysreg_el2(spsr);
 	u64 elr = read_sysreg_el2(elr);
 	u64 par = read_sysreg(par_el1);
 
 	if (read_sysreg(vttbr_el2)) {
-		struct kvm_vcpu *vcpu;
 		struct kvm_cpu_context *host_ctxt;
 
-		vcpu = (struct kvm_vcpu *)read_sysreg(tpidr_el2);
-		host_ctxt = kern_hyp_va(vcpu->arch.host_cpu_context);
+		host_ctxt = kern_hyp_va(__host_ctxt);
+		vcpu = host_ctxt->__hyp_running_vcpu;
 		__timer_disable_traps(vcpu);
 		__deactivate_traps(vcpu);
 		__deactivate_vm(vcpu);
@@ -472,7 +500,7 @@ void __hyp_text __noreturn __hyp_panic(void)
 	}
 
 	/* Call panic for real */
-	__hyp_call_panic()(spsr, elr, par);
+	__hyp_call_panic()(spsr, elr, par, vcpu);
 
 	unreachable();
 }
diff --git a/arch/arm64/kvm/hyp/sysreg-sr.c b/arch/arm64/kvm/hyp/sysreg-sr.c
index 9341376..2c17afd 100644
--- a/arch/arm64/kvm/hyp/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/sysreg-sr.c
@@ -27,8 +27,8 @@ static void __hyp_text __sysreg_do_nothing(struct kvm_cpu_context *ctxt) { }
 /*
  * Non-VHE: Both host and guest must save everything.
  *
- * VHE: Host must save tpidr*_el[01], actlr_el1, mdscr_el1, sp0, pc,
- * pstate, and guest must save everything.
+ * VHE: Host must save tpidr*_el0, actlr_el1, mdscr_el1, sp_el0,
+ * and guest must save everything.
  */
 
 static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
@@ -36,11 +36,8 @@ static void __hyp_text __sysreg_save_common_state(struct kvm_cpu_context *ctxt)
 	ctxt->sys_regs[ACTLR_EL1]	= read_sysreg(actlr_el1);
 	ctxt->sys_regs[TPIDR_EL0]	= read_sysreg(tpidr_el0);
 	ctxt->sys_regs[TPIDRRO_EL0]	= read_sysreg(tpidrro_el0);
-	ctxt->sys_regs[TPIDR_EL1]	= read_sysreg(tpidr_el1);
 	ctxt->sys_regs[MDSCR_EL1]	= read_sysreg(mdscr_el1);
 	ctxt->gp_regs.regs.sp		= read_sysreg(sp_el0);
-	ctxt->gp_regs.regs.pc		= read_sysreg_el2(elr);
-	ctxt->gp_regs.regs.pstate	= read_sysreg_el2(spsr);
 }
 
 static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)
@@ -62,10 +59,16 @@ static void __hyp_text __sysreg_save_state(struct kvm_cpu_context *ctxt)
 	ctxt->sys_regs[AMAIR_EL1]	= read_sysreg_el1(amair);
 	ctxt->sys_regs[CNTKCTL_EL1]	= read_sysreg_el1(cntkctl);
 	ctxt->sys_regs[PAR_EL1]		= read_sysreg(par_el1);
+	ctxt->sys_regs[TPIDR_EL1]	= read_sysreg(tpidr_el1);
 
 	ctxt->gp_regs.sp_el1		= read_sysreg(sp_el1);
 	ctxt->gp_regs.elr_el1		= read_sysreg_el1(elr);
 	ctxt->gp_regs.spsr[KVM_SPSR_EL1]= read_sysreg_el1(spsr);
+	ctxt->gp_regs.regs.pc		= read_sysreg_el2(elr);
+	ctxt->gp_regs.regs.pstate	= read_sysreg_el2(spsr);
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
+		ctxt->sys_regs[DISR_EL1] = read_sysreg_s(SYS_VDISR_EL2);
 }
 
 static hyp_alternate_select(__sysreg_call_save_host_state,
@@ -89,11 +92,8 @@ static void __hyp_text __sysreg_restore_common_state(struct kvm_cpu_context *ctx
 	write_sysreg(ctxt->sys_regs[ACTLR_EL1],	  actlr_el1);
 	write_sysreg(ctxt->sys_regs[TPIDR_EL0],	  tpidr_el0);
 	write_sysreg(ctxt->sys_regs[TPIDRRO_EL0], tpidrro_el0);
-	write_sysreg(ctxt->sys_regs[TPIDR_EL1],	  tpidr_el1);
 	write_sysreg(ctxt->sys_regs[MDSCR_EL1],	  mdscr_el1);
 	write_sysreg(ctxt->gp_regs.regs.sp,	  sp_el0);
-	write_sysreg_el2(ctxt->gp_regs.regs.pc,	  elr);
-	write_sysreg_el2(ctxt->gp_regs.regs.pstate, spsr);
 }
 
 static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)
@@ -115,10 +115,16 @@ static void __hyp_text __sysreg_restore_state(struct kvm_cpu_context *ctxt)
 	write_sysreg_el1(ctxt->sys_regs[AMAIR_EL1],	amair);
 	write_sysreg_el1(ctxt->sys_regs[CNTKCTL_EL1], 	cntkctl);
 	write_sysreg(ctxt->sys_regs[PAR_EL1],		par_el1);
+	write_sysreg(ctxt->sys_regs[TPIDR_EL1],		tpidr_el1);
 
 	write_sysreg(ctxt->gp_regs.sp_el1,		sp_el1);
 	write_sysreg_el1(ctxt->gp_regs.elr_el1,		elr);
 	write_sysreg_el1(ctxt->gp_regs.spsr[KVM_SPSR_EL1],spsr);
+	write_sysreg_el2(ctxt->gp_regs.regs.pc,		elr);
+	write_sysreg_el2(ctxt->gp_regs.regs.pstate,	spsr);
+
+	if (cpus_have_const_cap(ARM64_HAS_RAS_EXTN))
+		write_sysreg_s(ctxt->sys_regs[DISR_EL1], SYS_VDISR_EL2);
 }
 
 static hyp_alternate_select(__sysreg_call_restore_host_state,
diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index 8ecbcb4..60666a0 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -164,14 +164,25 @@ void kvm_inject_undefined(struct kvm_vcpu *vcpu)
 		inject_undef64(vcpu);
 }
 
+static void pend_guest_serror(struct kvm_vcpu *vcpu, u64 esr)
+{
+	vcpu_set_vsesr(vcpu, esr);
+	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+}
+
 /**
  * kvm_inject_vabt - inject an async abort / SError into the guest
  * @vcpu: The VCPU to receive the exception
  *
  * It is assumed that this code is called from the VCPU thread and that the
  * VCPU therefore is not currently executing guest code.
+ *
+ * Systems with the RAS Extensions specify an imp-def ESR (ISV/IDS = 1) with
+ * the remaining ISS all-zeros so that this error is not interpreted as an
+ * uncategorized RAS error. Without the RAS Extensions we can't specify an ESR
+ * value, so the CPU generates an imp-def value.
  */
 void kvm_inject_vabt(struct kvm_vcpu *vcpu)
 {
-	vcpu_set_hcr(vcpu, vcpu_get_hcr(vcpu) | HCR_VSE);
+	pend_guest_serror(vcpu, ESR_ELx_ISV);
 }
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 1830ebc..50a43c7 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1159,6 +1159,16 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_AFSR0_EL1), access_vm_reg, reset_unknown, AFSR0_EL1 },
 	{ SYS_DESC(SYS_AFSR1_EL1), access_vm_reg, reset_unknown, AFSR1_EL1 },
 	{ SYS_DESC(SYS_ESR_EL1), access_vm_reg, reset_unknown, ESR_EL1 },
+
+	{ SYS_DESC(SYS_ERRIDR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERRSELR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXFR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXCTLR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXSTATUS_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXADDR_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXMISC0_EL1), trap_raz_wi },
+	{ SYS_DESC(SYS_ERXMISC1_EL1), trap_raz_wi },
+
 	{ SYS_DESC(SYS_FAR_EL1), access_vm_reg, reset_unknown, FAR_EL1 },
 	{ SYS_DESC(SYS_PAR_EL1), NULL, reset_unknown, PAR_EL1 },
 
@@ -1169,6 +1179,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
 	{ SYS_DESC(SYS_AMAIR_EL1), access_vm_reg, reset_amair_el1, AMAIR_EL1 },
 
 	{ SYS_DESC(SYS_VBAR_EL1), NULL, reset_val, VBAR_EL1, 0 },
+	{ SYS_DESC(SYS_DISR_EL1), NULL, reset_val, DISR_EL1, 0 },
 
 	{ SYS_DESC(SYS_ICC_IAR0_EL1), write_to_read_only },
 	{ SYS_DESC(SYS_ICC_EOIR0_EL1), read_from_write_only },
diff --git a/arch/arm64/lib/clear_user.S b/arch/arm64/lib/clear_user.S
index e88fb99..3d69a8d 100644
--- a/arch/arm64/lib/clear_user.S
+++ b/arch/arm64/lib/clear_user.S
@@ -30,7 +30,7 @@
  * Alignment fixed up by hardware.
  */
 ENTRY(__clear_user)
-	uaccess_enable_not_uao x2, x3
+	uaccess_enable_not_uao x2, x3, x4
 	mov	x2, x1			// save the size for fixup return
 	subs	x1, x1, #8
 	b.mi	2f
@@ -50,7 +50,7 @@
 	b.mi	5f
 uao_user_alternative 9f, strb, sttrb, wzr, x0, 0
 5:	mov	x0, #0
-	uaccess_disable_not_uao x2
+	uaccess_disable_not_uao x2, x3
 	ret
 ENDPROC(__clear_user)
 
diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
index 4b5d826..20305d4 100644
--- a/arch/arm64/lib/copy_from_user.S
+++ b/arch/arm64/lib/copy_from_user.S
@@ -64,10 +64,10 @@
 
 end	.req	x5
 ENTRY(__arch_copy_from_user)
-	uaccess_enable_not_uao x3, x4
+	uaccess_enable_not_uao x3, x4, x5
 	add	end, x0, x2
 #include "copy_template.S"
-	uaccess_disable_not_uao x3
+	uaccess_disable_not_uao x3, x4
 	mov	x0, #0				// Nothing to copy
 	ret
 ENDPROC(__arch_copy_from_user)
diff --git a/arch/arm64/lib/copy_in_user.S b/arch/arm64/lib/copy_in_user.S
index b24a830..fbb090f 100644
--- a/arch/arm64/lib/copy_in_user.S
+++ b/arch/arm64/lib/copy_in_user.S
@@ -65,10 +65,10 @@
 
 end	.req	x5
 ENTRY(raw_copy_in_user)
-	uaccess_enable_not_uao x3, x4
+	uaccess_enable_not_uao x3, x4, x5
 	add	end, x0, x2
 #include "copy_template.S"
-	uaccess_disable_not_uao x3
+	uaccess_disable_not_uao x3, x4
 	mov	x0, #0
 	ret
 ENDPROC(raw_copy_in_user)
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
index 351f076..fda6172 100644
--- a/arch/arm64/lib/copy_to_user.S
+++ b/arch/arm64/lib/copy_to_user.S
@@ -63,10 +63,10 @@
 
 end	.req	x5
 ENTRY(__arch_copy_to_user)
-	uaccess_enable_not_uao x3, x4
+	uaccess_enable_not_uao x3, x4, x5
 	add	end, x0, x2
 #include "copy_template.S"
-	uaccess_disable_not_uao x3
+	uaccess_disable_not_uao x3, x4
 	mov	x0, #0
 	ret
 ENDPROC(__arch_copy_to_user)
diff --git a/arch/arm64/lib/tishift.S b/arch/arm64/lib/tishift.S
index 0179a43..d3db9b2 100644
--- a/arch/arm64/lib/tishift.S
+++ b/arch/arm64/lib/tishift.S
@@ -38,19 +38,19 @@
 ENDPROC(__ashlti3)
 
 ENTRY(__ashrti3)
-	cbz	x2, 3f
+	cbz	x2, 1f
 	mov	x3, #64
 	sub	x3, x3, x2
 	cmp	x3, #0
-	b.le	4f
+	b.le	2f
 	lsr	x0, x0, x2
 	lsl	x3, x1, x3
 	asr	x2, x1, x2
 	orr	x0, x0, x3
 	mov	x1, x2
-3:
+1:
 	ret
-4:
+2:
 	neg	w0, w3
 	asr	x2, x1, #63
 	asr	x0, x1, x0
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 7f1dbe9..91464e7 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -49,7 +49,7 @@
  *	- end     - virtual end address of region
  */
 ENTRY(__flush_cache_user_range)
-	uaccess_ttbr0_enable x2, x3
+	uaccess_ttbr0_enable x2, x3, x4
 	dcache_line_size x2, x3
 	sub	x3, x2, #1
 	bic	x4, x0, x3
@@ -72,7 +72,7 @@
 	isb
 	mov	x0, #0
 1:
-	uaccess_ttbr0_disable x1
+	uaccess_ttbr0_disable x1, x2
 	ret
 9:
 	mov	x0, #-EFAULT
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 6f401704..301417a 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -39,7 +39,16 @@ static cpumask_t tlb_flush_pending;
 
 #define ASID_MASK		(~GENMASK(asid_bits - 1, 0))
 #define ASID_FIRST_VERSION	(1UL << asid_bits)
-#define NUM_USER_ASIDS		ASID_FIRST_VERSION
+
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+#define NUM_USER_ASIDS		(ASID_FIRST_VERSION >> 1)
+#define asid2idx(asid)		(((asid) & ~ASID_MASK) >> 1)
+#define idx2asid(idx)		(((idx) << 1) & ~ASID_MASK)
+#else
+#define NUM_USER_ASIDS		(ASID_FIRST_VERSION)
+#define asid2idx(asid)		((asid) & ~ASID_MASK)
+#define idx2asid(idx)		asid2idx(idx)
+#endif
 
 /* Get the ASIDBits supported by the current CPU */
 static u32 get_cpu_asid_bits(void)
@@ -79,13 +88,6 @@ void verify_cpu_asid_bits(void)
 	}
 }
 
-static void set_reserved_asid_bits(void)
-{
-	if (IS_ENABLED(CONFIG_QCOM_FALKOR_ERRATUM_1003) &&
-	    cpus_have_const_cap(ARM64_WORKAROUND_QCOM_FALKOR_E1003))
-		__set_bit(FALKOR_RESERVED_ASID, asid_map);
-}
-
 static void flush_context(unsigned int cpu)
 {
 	int i;
@@ -94,8 +96,6 @@ static void flush_context(unsigned int cpu)
 	/* Update the list of reserved ASIDs and the ASID bitmap. */
 	bitmap_clear(asid_map, 0, NUM_USER_ASIDS);
 
-	set_reserved_asid_bits();
-
 	for_each_possible_cpu(i) {
 		asid = atomic64_xchg_relaxed(&per_cpu(active_asids, i), 0);
 		/*
@@ -107,7 +107,7 @@ static void flush_context(unsigned int cpu)
 		 */
 		if (asid == 0)
 			asid = per_cpu(reserved_asids, i);
-		__set_bit(asid & ~ASID_MASK, asid_map);
+		__set_bit(asid2idx(asid), asid_map);
 		per_cpu(reserved_asids, i) = asid;
 	}
 
@@ -162,16 +162,16 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
 		 * We had a valid ASID in a previous life, so try to re-use
 		 * it if possible.
 		 */
-		asid &= ~ASID_MASK;
-		if (!__test_and_set_bit(asid, asid_map))
+		if (!__test_and_set_bit(asid2idx(asid), asid_map))
 			return newasid;
 	}
 
 	/*
 	 * Allocate a free ASID. If we can't find one, take a note of the
-	 * currently active ASIDs and mark the TLBs as requiring flushes.
-	 * We always count from ASID #1, as we use ASID #0 when setting a
-	 * reserved TTBR0 for the init_mm.
+	 * currently active ASIDs and mark the TLBs as requiring flushes.  We
+	 * always count from ASID #2 (index 1), as we use ASID #0 when setting
+	 * a reserved TTBR0 for the init_mm and we allocate ASIDs in even/odd
+	 * pairs.
 	 */
 	asid = find_next_zero_bit(asid_map, NUM_USER_ASIDS, cur_idx);
 	if (asid != NUM_USER_ASIDS)
@@ -188,32 +188,35 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
 set_asid:
 	__set_bit(asid, asid_map);
 	cur_idx = asid;
-	return asid | generation;
+	return idx2asid(asid) | generation;
 }
 
 void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
 {
 	unsigned long flags;
-	u64 asid;
+	u64 asid, old_active_asid;
 
 	asid = atomic64_read(&mm->context.id);
 
 	/*
 	 * The memory ordering here is subtle.
-	 * If our ASID matches the current generation, then we update
-	 * our active_asids entry with a relaxed xchg. Racing with a
-	 * concurrent rollover means that either:
+	 * If our active_asids is non-zero and the ASID matches the current
+	 * generation, then we update the active_asids entry with a relaxed
+	 * cmpxchg. Racing with a concurrent rollover means that either:
 	 *
-	 * - We get a zero back from the xchg and end up waiting on the
+	 * - We get a zero back from the cmpxchg and end up waiting on the
 	 *   lock. Taking the lock synchronises with the rollover and so
 	 *   we are forced to see the updated generation.
 	 *
-	 * - We get a valid ASID back from the xchg, which means the
+	 * - We get a valid ASID back from the cmpxchg, which means the
 	 *   relaxed xchg in flush_context will treat us as reserved
 	 *   because atomic RmWs are totally ordered for a given location.
 	 */
-	if (!((asid ^ atomic64_read(&asid_generation)) >> asid_bits)
-	    && atomic64_xchg_relaxed(&per_cpu(active_asids, cpu), asid))
+	old_active_asid = atomic64_read(&per_cpu(active_asids, cpu));
+	if (old_active_asid &&
+	    !((asid ^ atomic64_read(&asid_generation)) >> asid_bits) &&
+	    atomic64_cmpxchg_relaxed(&per_cpu(active_asids, cpu),
+				     old_active_asid, asid))
 		goto switch_mm_fastpath;
 
 	raw_spin_lock_irqsave(&cpu_asid_lock, flags);
@@ -231,6 +234,9 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
 	raw_spin_unlock_irqrestore(&cpu_asid_lock, flags);
 
 switch_mm_fastpath:
+
+	arm64_apply_bp_hardening();
+
 	/*
 	 * Defer TTBR0_EL1 setting for user threads to uaccess_enable() when
 	 * emulating PAN.
@@ -239,6 +245,15 @@ void check_and_switch_context(struct mm_struct *mm, unsigned int cpu)
 		cpu_switch_mm(mm->pgd, mm);
 }
 
+/* Errata workaround post TTBRx_EL1 update. */
+asmlinkage void post_ttbr_update_workaround(void)
+{
+	asm(ALTERNATIVE("nop; nop; nop",
+			"ic iallu; dsb nsh; isb",
+			ARM64_WORKAROUND_CAVIUM_27456,
+			CONFIG_CAVIUM_ERRATUM_27456));
+}
+
 static int asids_init(void)
 {
 	asid_bits = get_cpu_asid_bits();
@@ -254,8 +269,6 @@ static int asids_init(void)
 		panic("Failed to allocate bitmap for %lu ASIDs\n",
 		      NUM_USER_ASIDS);
 
-	set_reserved_asid_bits();
-
 	pr_info("ASID allocator initialised with %lu entries\n", NUM_USER_ASIDS);
 	return 0;
 }
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index abe2005..ce441d2 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -707,6 +707,23 @@ asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
 	arm64_notify_die("", regs, &info, esr);
 }
 
+asmlinkage void __exception do_el0_ia_bp_hardening(unsigned long addr,
+						   unsigned int esr,
+						   struct pt_regs *regs)
+{
+	/*
+	 * We've taken an instruction abort from userspace and not yet
+	 * re-enabled IRQs. If the address is a kernel address, apply
+	 * BP hardening prior to enabling IRQs and pre-emption.
+	 */
+	if (addr > TASK_SIZE)
+		arm64_apply_bp_hardening();
+
+	local_irq_enable();
+	do_mem_abort(addr, esr, regs);
+}
+
+
 asmlinkage void __exception do_sp_pc_abort(unsigned long addr,
 					   unsigned int esr,
 					   struct pt_regs *regs)
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 00e7b90..c903f7c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -366,6 +366,9 @@ void __init arm64_memblock_init(void)
 	/* Handle linux,usable-memory-range property */
 	fdt_enforce_memory_region();
 
+	/* Remove memory above our supported physical address size */
+	memblock_remove(1ULL << PHYS_MASK_SHIFT, ULLONG_MAX);
+
 	/*
 	 * Ensure that the linear region takes up exactly half of the kernel
 	 * virtual address space. This way, we can distinguish a linear address
@@ -600,49 +603,6 @@ void __init mem_init(void)
 
 	mem_init_print_info(NULL);
 
-#define MLK(b, t) b, t, ((t) - (b)) >> 10
-#define MLM(b, t) b, t, ((t) - (b)) >> 20
-#define MLG(b, t) b, t, ((t) - (b)) >> 30
-#define MLK_ROUNDUP(b, t) b, t, DIV_ROUND_UP(((t) - (b)), SZ_1K)
-
-	pr_notice("Virtual kernel memory layout:\n");
-#ifdef CONFIG_KASAN
-	pr_notice("    kasan   : 0x%16lx - 0x%16lx   (%6ld GB)\n",
-		MLG(KASAN_SHADOW_START, KASAN_SHADOW_END));
-#endif
-	pr_notice("    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n",
-		MLM(MODULES_VADDR, MODULES_END));
-	pr_notice("    vmalloc : 0x%16lx - 0x%16lx   (%6ld GB)\n",
-		MLG(VMALLOC_START, VMALLOC_END));
-	pr_notice("      .text : 0x%p" " - 0x%p" "   (%6ld KB)\n",
-		MLK_ROUNDUP(_text, _etext));
-	pr_notice("    .rodata : 0x%p" " - 0x%p" "   (%6ld KB)\n",
-		MLK_ROUNDUP(__start_rodata, __init_begin));
-	pr_notice("      .init : 0x%p" " - 0x%p" "   (%6ld KB)\n",
-		MLK_ROUNDUP(__init_begin, __init_end));
-	pr_notice("      .data : 0x%p" " - 0x%p" "   (%6ld KB)\n",
-		MLK_ROUNDUP(_sdata, _edata));
-	pr_notice("       .bss : 0x%p" " - 0x%p" "   (%6ld KB)\n",
-		MLK_ROUNDUP(__bss_start, __bss_stop));
-	pr_notice("    fixed   : 0x%16lx - 0x%16lx   (%6ld KB)\n",
-		MLK(FIXADDR_START, FIXADDR_TOP));
-	pr_notice("    PCI I/O : 0x%16lx - 0x%16lx   (%6ld MB)\n",
-		MLM(PCI_IO_START, PCI_IO_END));
-#ifdef CONFIG_SPARSEMEM_VMEMMAP
-	pr_notice("    vmemmap : 0x%16lx - 0x%16lx   (%6ld GB maximum)\n",
-		MLG(VMEMMAP_START, VMEMMAP_START + VMEMMAP_SIZE));
-	pr_notice("              0x%16lx - 0x%16lx   (%6ld MB actual)\n",
-		MLM((unsigned long)phys_to_page(memblock_start_of_DRAM()),
-		    (unsigned long)virt_to_page(high_memory)));
-#endif
-	pr_notice("    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n",
-		MLM(__phys_to_virt(memblock_start_of_DRAM()),
-		    (unsigned long)high_memory));
-
-#undef MLK
-#undef MLM
-#undef MLK_ROUNDUP
-
 	/*
 	 * Check boundaries twice: Some fundamental inconsistencies can be
 	 * detected at build time already.
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 267d2b7..b44992e 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -50,6 +50,7 @@
 #define NO_CONT_MAPPINGS	BIT(1)
 
 u64 idmap_t0sz = TCR_T0SZ(VA_BITS);
+u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;
 
 u64 kimage_voffset __ro_after_init;
 EXPORT_SYMBOL(kimage_voffset);
@@ -525,6 +526,35 @@ static int __init parse_rodata(char *arg)
 }
 early_param("rodata", parse_rodata);
 
+#ifdef CONFIG_UNMAP_KERNEL_AT_EL0
+static int __init map_entry_trampoline(void)
+{
+	pgprot_t prot = rodata_enabled ? PAGE_KERNEL_ROX : PAGE_KERNEL_EXEC;
+	phys_addr_t pa_start = __pa_symbol(__entry_tramp_text_start);
+
+	/* The trampoline is always mapped and can therefore be global */
+	pgprot_val(prot) &= ~PTE_NG;
+
+	/* Map only the text into the trampoline page table */
+	memset(tramp_pg_dir, 0, PGD_SIZE);
+	__create_pgd_mapping(tramp_pg_dir, pa_start, TRAMP_VALIAS, PAGE_SIZE,
+			     prot, pgd_pgtable_alloc, 0);
+
+	/* Map both the text and data into the kernel page table */
+	__set_fixmap(FIX_ENTRY_TRAMP_TEXT, pa_start, prot);
+	if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
+		extern char __entry_tramp_data_start[];
+
+		__set_fixmap(FIX_ENTRY_TRAMP_DATA,
+			     __pa_symbol(__entry_tramp_data_start),
+			     PAGE_KERNEL_RO);
+	}
+
+	return 0;
+}
+core_initcall(map_entry_trampoline);
+#endif
+
 /*
  * Create fine-grained mappings for the kernel.
  */
@@ -570,8 +600,8 @@ static void __init map_kernel(pgd_t *pgd)
 		 * entry instead.
 		 */
 		BUG_ON(!IS_ENABLED(CONFIG_ARM64_16K_PAGES));
-		set_pud(pud_set_fixmap_offset(pgd, FIXADDR_START),
-			__pud(__pa_symbol(bm_pmd) | PUD_TYPE_TABLE));
+		pud_populate(&init_mm, pud_set_fixmap_offset(pgd, FIXADDR_START),
+			     lm_alias(bm_pmd));
 		pud_clear_fixmap();
 	} else {
 		BUG();
@@ -612,7 +642,8 @@ void __init paging_init(void)
 	 * allocated with it.
 	 */
 	memblock_free(__pa_symbol(swapper_pg_dir) + PAGE_SIZE,
-		      SWAPPER_DIR_SIZE - PAGE_SIZE);
+		      __pa_symbol(swapper_pg_end) - __pa_symbol(swapper_pg_dir)
+		      - PAGE_SIZE);
 }
 
 /*
@@ -686,7 +717,7 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node)
 			if (!p)
 				return -ENOMEM;
 
-			set_pmd(pmd, __pmd(__pa(p) | PROT_SECT_NORMAL));
+			pmd_set_huge(pmd, __pa(p), __pgprot(PROT_SECT_NORMAL));
 		} else
 			vmemmap_verify((pte_t *)pmd, node, addr, next);
 	} while (addr = next, addr != end);
@@ -879,15 +910,19 @@ int __init arch_ioremap_pmd_supported(void)
 
 int pud_set_huge(pud_t *pud, phys_addr_t phys, pgprot_t prot)
 {
+	pgprot_t sect_prot = __pgprot(PUD_TYPE_SECT |
+					pgprot_val(mk_sect_prot(prot)));
 	BUG_ON(phys & ~PUD_MASK);
-	set_pud(pud, __pud(phys | PUD_TYPE_SECT | pgprot_val(mk_sect_prot(prot))));
+	set_pud(pud, pfn_pud(__phys_to_pfn(phys), sect_prot));
 	return 1;
 }
 
 int pmd_set_huge(pmd_t *pmd, phys_addr_t phys, pgprot_t prot)
 {
+	pgprot_t sect_prot = __pgprot(PMD_TYPE_SECT |
+					pgprot_val(mk_sect_prot(prot)));
 	BUG_ON(phys & ~PMD_MASK);
-	set_pmd(pmd, __pmd(phys | PMD_TYPE_SECT | pgprot_val(mk_sect_prot(prot))));
+	set_pmd(pmd, pfn_pmd(__phys_to_pfn(phys), sect_prot));
 	return 1;
 }
 
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index 051e71e..289f911 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -49,6 +49,14 @@ void __init pgd_cache_init(void)
 	if (PGD_SIZE == PAGE_SIZE)
 		return;
 
+#ifdef CONFIG_ARM64_PA_BITS_52
+	/*
+	 * With 52-bit physical addresses, the architecture requires the
+	 * top-level table to be aligned to at least 64 bytes.
+	 */
+	BUILD_BUG_ON(PGD_SIZE < 64);
+#endif
+
 	/*
 	 * Naturally aligned pgds required by the architecture.
 	 */
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 95233df..9f177aa 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -70,7 +70,11 @@
 	mrs	x8, mdscr_el1
 	mrs	x9, oslsr_el1
 	mrs	x10, sctlr_el1
+alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
 	mrs	x11, tpidr_el1
+alternative_else
+	mrs	x11, tpidr_el2
+alternative_endif
 	mrs	x12, sp_el0
 	stp	x2, x3, [x0]
 	stp	x4, xzr, [x0, #16]
@@ -116,7 +120,11 @@
 	msr	mdscr_el1, x10
 
 	msr	sctlr_el1, x12
+alternative_if_not ARM64_HAS_VIRT_HOST_EXTN
 	msr	tpidr_el1, x13
+alternative_else
+	msr	tpidr_el2, x13
+alternative_endif
 	msr	sp_el0, x14
 	/*
 	 * Restore oslsr_el1 by writing oslar_el1
@@ -124,6 +132,11 @@
 	ubfx	x11, x11, #1, #1
 	msr	oslar_el1, x11
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
+
+alternative_if ARM64_HAS_RAS_EXTN
+	msr_s	SYS_DISR_EL1, xzr
+alternative_else_nop_endif
+
 	isb
 	ret
 ENDPROC(cpu_do_resume)
@@ -138,13 +151,18 @@
  *	- pgd_phys - physical address of new TTB
  */
 ENTRY(cpu_do_switch_mm)
-	pre_ttbr0_update_workaround x0, x2, x3
+	mrs	x2, ttbr1_el1
 	mmid	x1, x1				// get mm->context.id
-	bfi	x0, x1, #48, #16		// set the ASID
-	msr	ttbr0_el1, x0			// set TTBR0
+	phys_to_ttbr x0, x3
+#ifdef CONFIG_ARM64_SW_TTBR0_PAN
+	bfi	x3, x1, #48, #16		// set the ASID field in TTBR0
+#endif
+	bfi	x2, x1, #48, #16		// set the ASID
+	msr	ttbr1_el1, x2			// in TTBR1 (since TCR.A1 is set)
 	isb
-	post_ttbr0_update_workaround
-	ret
+	msr	ttbr0_el1, x3			// now update TTBR0
+	isb
+	b	post_ttbr_update_workaround	// Back to C code...
 ENDPROC(cpu_do_switch_mm)
 
 	.pushsection ".idmap.text", "ax"
@@ -158,14 +176,16 @@
 	save_and_disable_daif flags=x2
 
 	adrp	x1, empty_zero_page
-	msr	ttbr1_el1, x1
+	phys_to_ttbr x1, x3
+	msr	ttbr1_el1, x3
 	isb
 
 	tlbi	vmalle1
 	dsb	nsh
 	isb
 
-	msr	ttbr1_el1, x0
+	phys_to_ttbr x0, x3
+	msr	ttbr1_el1, x3
 	isb
 
 	restore_daif x2
@@ -214,25 +234,19 @@
 	/*
 	 * Prepare SCTLR
 	 */
-	adr	x5, crval
-	ldp	w5, w6, [x5]
-	mrs	x0, sctlr_el1
-	bic	x0, x0, x5			// clear bits
-	orr	x0, x0, x6			// set bits
+	mov_q	x0, SCTLR_EL1_SET
 	/*
 	 * Set/prepare TCR and TTBR. We use 512GB (39-bit) address range for
 	 * both user and kernel.
 	 */
 	ldr	x10, =TCR_TxSZ(VA_BITS) | TCR_CACHE_FLAGS | TCR_SMP_FLAGS | \
-			TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0
+			TCR_TG_FLAGS | TCR_ASID16 | TCR_TBI0 | TCR_A1
 	tcr_set_idmap_t0sz	x10, x9
 
 	/*
-	 * Read the PARange bits from ID_AA64MMFR0_EL1 and set the IPS bits in
-	 * TCR_EL1.
+	 * Set the IPS bits in TCR_EL1.
 	 */
-	mrs	x9, ID_AA64MMFR0_EL1
-	bfi	x10, x9, #32, #3
+	tcr_compute_pa_size x10, #TCR_IPS_SHIFT, x5, x6
 #ifdef CONFIG_ARM64_HW_AFDBM
 	/*
 	 * Hardware update of the Access and Dirty bits.
@@ -249,21 +263,3 @@
 	msr	tcr_el1, x10
 	ret					// return to head.S
 ENDPROC(__cpu_setup)
-
-	/*
-	 * We set the desired value explicitly, including those of the
-	 * reserved bits. The values of bits EE & E0E were set early in
-	 * el2_setup, which are left untouched below.
-	 *
-	 *                 n n            T
-	 *       U E      WT T UD     US IHBS
-	 *       CE0      XWHW CZ     ME TEEA S
-	 * .... .IEE .... NEAI TE.I ..AD DEN0 ACAM
-	 * 0011 0... 1101 ..0. ..0. 10.. .0.. .... < hardware reserved
-	 * .... .1.. .... 01.1 11.1 ..01 0.01 1101 < software settings
-	 */
-	.type	crval, #object
-crval:
-	.word	0xfcffffff			// clear
-	.word	0x34d5d91d			// set
-	.popsection
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index ba38d40..bb32f7f 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -148,7 +148,8 @@ static inline int epilogue_offset(const struct jit_ctx *ctx)
 /* Stack must be multiples of 16B */
 #define STACK_ALIGN(sz) (((sz) + 15) & ~15)
 
-#define PROLOGUE_OFFSET 8
+/* Tail call offset to jump into */
+#define PROLOGUE_OFFSET 7
 
 static int build_prologue(struct jit_ctx *ctx)
 {
@@ -200,19 +201,19 @@ static int build_prologue(struct jit_ctx *ctx)
 	/* Initialize tail_call_cnt */
 	emit(A64_MOVZ(1, tcc, 0, 0), ctx);
 
-	/* 4 byte extra for skb_copy_bits buffer */
-	ctx->stack_size = prog->aux->stack_depth + 4;
-	ctx->stack_size = STACK_ALIGN(ctx->stack_size);
-
-	/* Set up function call stack */
-	emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
-
 	cur_offset = ctx->idx - idx0;
 	if (cur_offset != PROLOGUE_OFFSET) {
 		pr_err_once("PROLOGUE_OFFSET = %d, expected %d!\n",
 			    cur_offset, PROLOGUE_OFFSET);
 		return -1;
 	}
+
+	/* 4 byte extra for skb_copy_bits buffer */
+	ctx->stack_size = prog->aux->stack_depth + 4;
+	ctx->stack_size = STACK_ALIGN(ctx->stack_size);
+
+	/* Set up function call stack */
+	emit(A64_SUB_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
 	return 0;
 }
 
@@ -260,11 +261,12 @@ static int emit_bpf_tail_call(struct jit_ctx *ctx)
 	emit(A64_LDR64(prg, tmp, prg), ctx);
 	emit(A64_CBZ(1, prg, jmp_offset), ctx);
 
-	/* goto *(prog->bpf_func + prologue_size); */
+	/* goto *(prog->bpf_func + prologue_offset); */
 	off = offsetof(struct bpf_prog, bpf_func);
 	emit_a64_mov_i64(tmp, off, ctx);
 	emit(A64_LDR64(tmp, prg, tmp), ctx);
 	emit(A64_ADD_I(1, tmp, tmp, sizeof(u32) * PROLOGUE_OFFSET), ctx);
+	emit(A64_ADD_I(1, A64_SP, A64_SP, ctx->stack_size), ctx);
 	emit(A64_BR(tmp), ctx);
 
 	/* out: */
diff --git a/arch/arm64/xen/hypercall.S b/arch/arm64/xen/hypercall.S
index 401ceb7..c5f05c4 100644
--- a/arch/arm64/xen/hypercall.S
+++ b/arch/arm64/xen/hypercall.S
@@ -101,12 +101,12 @@
 	 * need the explicit uaccess_enable/disable if the TTBR0 PAN emulation
 	 * is enabled (it implies that hardware UAO and PAN disabled).
 	 */
-	uaccess_ttbr0_enable x6, x7
+	uaccess_ttbr0_enable x6, x7, x8
 	hvc XEN_IMM
 
 	/*
 	 * Disable userspace access from kernel once the hyp call completed.
 	 */
-	uaccess_ttbr0_disable x6
+	uaccess_ttbr0_disable x6, x7
 	ret
 ENDPROC(privcmd_call);
diff --git a/arch/blackfin/include/asm/thread_info.h b/arch/blackfin/include/asm/thread_info.h
index 2966b93..a5aeab4 100644
--- a/arch/blackfin/include/asm/thread_info.h
+++ b/arch/blackfin/include/asm/thread_info.h
@@ -56,8 +56,6 @@ struct thread_info {
 	.cpu		= 0,			\
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
 }
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
 
 /* Given a task stack pointer, you can find its corresponding
  * thread_info structure just by masking it to the THREAD_SIZE
diff --git a/arch/c6x/include/asm/thread_info.h b/arch/c6x/include/asm/thread_info.h
index acc70c1..59a5697 100644
--- a/arch/c6x/include/asm/thread_info.h
+++ b/arch/c6x/include/asm/thread_info.h
@@ -60,9 +60,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* get the thread information struct of current task */
 static inline __attribute__((const))
 struct thread_info *current_thread_info(void)
diff --git a/arch/cris/include/asm/processor.h b/arch/cris/include/asm/processor.h
index 124dd5e..ee4d8b0 100644
--- a/arch/cris/include/asm/processor.h
+++ b/arch/cris/include/asm/processor.h
@@ -26,13 +26,6 @@ struct task_struct;
  */
 #define TASK_UNMAPPED_BASE      (PAGE_ALIGN(TASK_SIZE / 3))
 
-/* THREAD_SIZE is the size of the thread_info/kernel_stack combo.
- * normally, the stack is found by doing something like p + THREAD_SIZE
- * in CRIS, a page is 8192 bytes, which seems like a sane size
- */
-#define THREAD_SIZE       PAGE_SIZE
-#define THREAD_SIZE_ORDER (0)
-
 /*
  * At user->kernel entry, the pt_regs struct is stacked on the top of the kernel-stack.
  * This macro allows us to find those regs for a task.
@@ -59,8 +52,6 @@ static inline void release_thread(struct task_struct *dead_task)
         /* Nothing needs to be done.  */
 }
 
-#define init_stack      (init_thread_union.stack)
-
 #define cpu_relax()     barrier()
 
 void default_idle(void);
diff --git a/arch/cris/include/asm/thread_info.h b/arch/cris/include/asm/thread_info.h
index 472830c..996fef3 100644
--- a/arch/cris/include/asm/thread_info.h
+++ b/arch/cris/include/asm/thread_info.h
@@ -20,6 +20,13 @@
 #endif
 
 
+/* THREAD_SIZE is the size of the thread_info/kernel_stack combo.
+ * normally, the stack is found by doing something like p + THREAD_SIZE
+ * in CRIS, a page is 8192 bytes, which seems like a sane size
+ */
+#define THREAD_SIZE       PAGE_SIZE
+#define THREAD_SIZE_ORDER (0)
+
 /*
  * low level task data that entry.S needs immediate access to
  * - this struct should fit entirely inside of one cache line
@@ -56,8 +63,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,			\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-
 #endif /* !__ASSEMBLY__ */
 
 /*
diff --git a/arch/cris/kernel/vmlinux.lds.S b/arch/cris/kernel/vmlinux.lds.S
index 6d1dbc1..9b232e0 100644
--- a/arch/cris/kernel/vmlinux.lds.S
+++ b/arch/cris/kernel/vmlinux.lds.S
@@ -11,6 +11,7 @@
 
 #include <asm-generic/vmlinux.lds.h>
 #include <asm/page.h>
+#include <asm/thread_info.h>
 
 #ifdef CONFIG_ETRAX_VMEM_SIZE
 #define __CONFIG_ETRAX_VMEM_SIZE CONFIG_ETRAX_VMEM_SIZE
diff --git a/arch/frv/include/asm/thread_info.h b/arch/frv/include/asm/thread_info.h
index ccba3b6..0f95084 100644
--- a/arch/frv/include/asm/thread_info.h
+++ b/arch/frv/include/asm/thread_info.h
@@ -64,9 +64,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 register struct thread_info *__current_thread_info asm("gr15");
 
diff --git a/arch/h8300/include/asm/thread_info.h b/arch/h8300/include/asm/thread_info.h
index 072b92c..0cdaa30 100644
--- a/arch/h8300/include/asm/thread_info.h
+++ b/arch/h8300/include/asm/thread_info.h
@@ -46,9 +46,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 static inline struct thread_info *current_thread_info(void)
 {
diff --git a/arch/hexagon/include/asm/thread_info.h b/arch/hexagon/include/asm/thread_info.h
index b80fe1d..f41f9c6 100644
--- a/arch/hexagon/include/asm/thread_info.h
+++ b/arch/hexagon/include/asm/thread_info.h
@@ -84,9 +84,6 @@ struct thread_info {
 	.regs = NULL,			\
 }
 
-#define init_thread_info        (init_thread_union.thread_info)
-#define init_stack              (init_thread_union.stack)
-
 /* Tacky preprocessor trickery */
 #define	qqstr(s) qstr(s)
 #define qstr(s) #s
diff --git a/arch/hexagon/kernel/vmlinux.lds.S b/arch/hexagon/kernel/vmlinux.lds.S
index ec87e67..ad69d18 100644
--- a/arch/hexagon/kernel/vmlinux.lds.S
+++ b/arch/hexagon/kernel/vmlinux.lds.S
@@ -22,6 +22,8 @@
 #include <asm/asm-offsets.h>	/*  Most of the kernel defines are here  */
 #include <asm/mem-layout.h>	/*  except for page_offset  */
 #include <asm/cache.h>		/*  and now we're pulling cache line size  */
+#include <asm/thread_info.h>	/*  and we need THREAD_SIZE too */
+
 OUTPUT_ARCH(hexagon)
 ENTRY(stext)
 
diff --git a/arch/ia64/Kconfig b/arch/ia64/Kconfig
index 49583c5..315c51f 100644
--- a/arch/ia64/Kconfig
+++ b/arch/ia64/Kconfig
@@ -43,7 +43,7 @@
 	select ARCH_HAVE_NMI_SAFE_CMPXCHG
 	select GENERIC_IOMAP
 	select GENERIC_SMP_IDLE_THREAD
-	select ARCH_INIT_TASK
+	select ARCH_TASK_STRUCT_ON_STACK
 	select ARCH_TASK_STRUCT_ALLOCATOR
 	select ARCH_THREAD_STACK_ALLOCATOR
 	select ARCH_CLOCKSOURCE_DATA
diff --git a/arch/ia64/Makefile b/arch/ia64/Makefile
index c100d78..2dd7f51 100644
--- a/arch/ia64/Makefile
+++ b/arch/ia64/Makefile
@@ -42,7 +42,7 @@
 endif
 
 KBUILD_CFLAGS += $(cflags-y)
-head-y := arch/ia64/kernel/head.o arch/ia64/kernel/init_task.o
+head-y := arch/ia64/kernel/head.o
 
 libs-y				+= arch/ia64/lib/
 core-y				+= arch/ia64/kernel/ arch/ia64/mm/
diff --git a/arch/ia64/include/asm/atomic.h b/arch/ia64/include/asm/atomic.h
index 28e02c9..762eeb0f 100644
--- a/arch/ia64/include/asm/atomic.h
+++ b/arch/ia64/include/asm/atomic.h
@@ -65,29 +65,30 @@ ia64_atomic_fetch_##op (int i, atomic_t *v)				\
 ATOMIC_OPS(add, +)
 ATOMIC_OPS(sub, -)
 
-#define atomic_add_return(i,v)						\
+#ifdef __OPTIMIZE__
+#define __ia64_atomic_const(i)	__builtin_constant_p(i) ?		\
+		((i) == 1 || (i) == 4 || (i) == 8 || (i) == 16 ||	\
+		 (i) == -1 || (i) == -4 || (i) == -8 || (i) == -16) : 0
+
+#define atomic_add_return(i, v)						\
 ({									\
-	int __ia64_aar_i = (i);						\
-	(__builtin_constant_p(i)					\
-	 && (   (__ia64_aar_i ==  1) || (__ia64_aar_i ==   4)		\
-	     || (__ia64_aar_i ==  8) || (__ia64_aar_i ==  16)		\
-	     || (__ia64_aar_i == -1) || (__ia64_aar_i ==  -4)		\
-	     || (__ia64_aar_i == -8) || (__ia64_aar_i == -16)))		\
-		? ia64_fetch_and_add(__ia64_aar_i, &(v)->counter)	\
-		: ia64_atomic_add(__ia64_aar_i, v);			\
+	int __i = (i);							\
+	static const int __ia64_atomic_p = __ia64_atomic_const(i);	\
+	__ia64_atomic_p ? ia64_fetch_and_add(__i, &(v)->counter) :	\
+				ia64_atomic_add(__i, v);		\
 })
 
-#define atomic_sub_return(i,v)						\
+#define atomic_sub_return(i, v)						\
 ({									\
-	int __ia64_asr_i = (i);						\
-	(__builtin_constant_p(i)					\
-	 && (   (__ia64_asr_i ==   1) || (__ia64_asr_i ==   4)		\
-	     || (__ia64_asr_i ==   8) || (__ia64_asr_i ==  16)		\
-	     || (__ia64_asr_i ==  -1) || (__ia64_asr_i ==  -4)		\
-	     || (__ia64_asr_i ==  -8) || (__ia64_asr_i == -16)))	\
-		? ia64_fetch_and_add(-__ia64_asr_i, &(v)->counter)	\
-		: ia64_atomic_sub(__ia64_asr_i, v);			\
+	int __i = (i);							\
+	static const int __ia64_atomic_p = __ia64_atomic_const(i);	\
+	__ia64_atomic_p ? ia64_fetch_and_add(-__i, &(v)->counter) :	\
+				ia64_atomic_sub(__i, v);		\
 })
+#else
+#define atomic_add_return(i, v)	ia64_atomic_add(i, v)
+#define atomic_sub_return(i, v)	ia64_atomic_sub(i, v)
+#endif
 
 #define atomic_fetch_add(i,v)						\
 ({									\
diff --git a/arch/ia64/include/asm/thread_info.h b/arch/ia64/include/asm/thread_info.h
index 1d172a4..64a1011 100644
--- a/arch/ia64/include/asm/thread_info.h
+++ b/arch/ia64/include/asm/thread_info.h
@@ -12,6 +12,8 @@
 #include <asm/processor.h>
 #include <asm/ptrace.h>
 
+#define THREAD_SIZE			KERNEL_STACK_SIZE
+
 #ifndef __ASSEMBLY__
 
 /*
@@ -41,8 +43,6 @@ struct thread_info {
 #endif
 };
 
-#define THREAD_SIZE			KERNEL_STACK_SIZE
-
 #define INIT_THREAD_INFO(tsk)			\
 {						\
 	.task		= &tsk,			\
diff --git a/arch/ia64/kernel/Makefile b/arch/ia64/kernel/Makefile
index 14ad79f..0b4c65a 100644
--- a/arch/ia64/kernel/Makefile
+++ b/arch/ia64/kernel/Makefile
@@ -7,7 +7,7 @@
 CFLAGS_REMOVE_ftrace.o = -pg
 endif
 
-extra-y	:= head.o init_task.o vmlinux.lds
+extra-y	:= head.o vmlinux.lds
 
 obj-y := entry.o efi.o efi_stub.o gate-data.o fsys.o ia64_ksyms.o irq.o irq_ia64.o	\
 	 irq_lsapic.o ivt.o machvec.o pal.o patch.o process.o perfmon.o ptrace.o sal.o		\
diff --git a/arch/ia64/kernel/acpi.c b/arch/ia64/kernel/acpi.c
index 1d29b2f..1dacbf5e 100644
--- a/arch/ia64/kernel/acpi.c
+++ b/arch/ia64/kernel/acpi.c
@@ -504,6 +504,11 @@ acpi_numa_memory_affinity_init(struct acpi_srat_mem_affinity *ma)
 	if (!(ma->flags & ACPI_SRAT_MEM_ENABLED))
 		return -1;
 
+	if (num_node_memblks >= NR_NODE_MEMBLKS) {
+		pr_err("NUMA: too many memblk ranges\n");
+		return -EINVAL;
+	}
+
 	/* record this node in proximity bitmap */
 	pxm_bit_set(pxm);
 
diff --git a/arch/ia64/kernel/init_task.c b/arch/ia64/kernel/init_task.c
deleted file mode 100644
index 8df9245..0000000
--- a/arch/ia64/kernel/init_task.c
+++ /dev/null
@@ -1,44 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-/*
- * This is where we statically allocate and initialize the initial
- * task.
- *
- * Copyright (C) 1999, 2002-2003 Hewlett-Packard Co
- *	David Mosberger-Tang <davidm@hpl.hp.com>
- */
-
-#include <linux/init.h>
-#include <linux/mm.h>
-#include <linux/fs.h>
-#include <linux/module.h>
-#include <linux/sched.h>
-#include <linux/init_task.h>
-#include <linux/mqueue.h>
-
-#include <linux/uaccess.h>
-#include <asm/pgtable.h>
-
-static struct signal_struct init_signals = INIT_SIGNALS(init_signals);
-static struct sighand_struct init_sighand = INIT_SIGHAND(init_sighand);
-/*
- * Initial task structure.
- *
- * We need to make sure that this is properly aligned due to the way process stacks are
- * handled. This is done by having a special ".data..init_task" section...
- */
-#define init_thread_info	init_task_mem.s.thread_info
-#define init_stack		init_task_mem.stack
-
-union {
-	struct {
-		struct task_struct task;
-		struct thread_info thread_info;
-	} s;
-	unsigned long stack[KERNEL_STACK_SIZE/sizeof (unsigned long)];
-} init_task_mem asm ("init_task") __init_task_data =
-	{{
-	.task =		INIT_TASK(init_task_mem.s.task),
-	.thread_info =	INIT_THREAD_INFO(init_task_mem.s.task)
-}};
-
-EXPORT_SYMBOL(init_task);
diff --git a/arch/ia64/kernel/time.c b/arch/ia64/kernel/time.c
index c6ecb97..9025699 100644
--- a/arch/ia64/kernel/time.c
+++ b/arch/ia64/kernel/time.c
@@ -88,7 +88,7 @@ void vtime_flush(struct task_struct *tsk)
 	}
 
 	if (ti->softirq_time) {
-		delta = cycle_to_nsec(ti->softirq_time));
+		delta = cycle_to_nsec(ti->softirq_time);
 		account_system_index_time(tsk, delta, CPUTIME_SOFTIRQ);
 	}
 
diff --git a/arch/ia64/kernel/vmlinux.lds.S b/arch/ia64/kernel/vmlinux.lds.S
index 58db59d..b0b2070 100644
--- a/arch/ia64/kernel/vmlinux.lds.S
+++ b/arch/ia64/kernel/vmlinux.lds.S
@@ -3,6 +3,7 @@
 #include <asm/cache.h>
 #include <asm/ptrace.h>
 #include <asm/pgtable.h>
+#include <asm/thread_info.h>
 
 #include <asm-generic/vmlinux.lds.h>
 
diff --git a/arch/m32r/include/asm/thread_info.h b/arch/m32r/include/asm/thread_info.h
index b3a215b..ba00f10 100644
--- a/arch/m32r/include/asm/thread_info.h
+++ b/arch/m32r/include/asm/thread_info.h
@@ -56,9 +56,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 static inline struct thread_info *current_thread_info(void)
 {
diff --git a/arch/m32r/kernel/traps.c b/arch/m32r/kernel/traps.c
index cb79fba..b88a8dd 100644
--- a/arch/m32r/kernel/traps.c
+++ b/arch/m32r/kernel/traps.c
@@ -122,7 +122,6 @@ void abort(void)
 	/* if that doesn't kill us, halt */
 	panic("Oops failed to kill thread");
 }
-EXPORT_SYMBOL(abort);
 
 void __init trap_init(void)
 {
diff --git a/arch/m68k/configs/amiga_defconfig b/arch/m68k/configs/amiga_defconfig
index 5b5fa98..e0b285e 100644
--- a/arch/m68k/configs/amiga_defconfig
+++ b/arch/m68k/configs/amiga_defconfig
@@ -454,7 +454,6 @@
 CONFIG_PPS_CLIENT_PARPORT=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FB_CIRRUS=y
 CONFIG_FB_AMIGA=y
@@ -595,6 +594,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -624,6 +624,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -653,3 +654,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/apollo_defconfig b/arch/m68k/configs/apollo_defconfig
index 72a7764..3281026 100644
--- a/arch/m68k/configs/apollo_defconfig
+++ b/arch/m68k/configs/apollo_defconfig
@@ -422,7 +422,6 @@
 CONFIG_PPS_CLIENT_LDISC=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
 CONFIG_LOGO=y
@@ -554,6 +553,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -583,6 +583,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -612,3 +613,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/atari_defconfig b/arch/m68k/configs/atari_defconfig
index 884b43a..e943fad 100644
--- a/arch/m68k/configs/atari_defconfig
+++ b/arch/m68k/configs/atari_defconfig
@@ -437,7 +437,6 @@
 CONFIG_PPS_CLIENT_PARPORT=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FB_ATARI=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
@@ -576,6 +575,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -605,6 +605,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -634,3 +635,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/bvme6000_defconfig b/arch/m68k/configs/bvme6000_defconfig
index fcfa60d..700c231 100644
--- a/arch/m68k/configs/bvme6000_defconfig
+++ b/arch/m68k/configs/bvme6000_defconfig
@@ -420,7 +420,6 @@
 CONFIG_PPS_CLIENT_LDISC=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_HID=m
 CONFIG_HIDRAW=y
 CONFIG_UHID=m
@@ -546,6 +545,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -575,6 +575,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -604,3 +605,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/hp300_defconfig b/arch/m68k/configs/hp300_defconfig
index 9d597bb..271d57f 100644
--- a/arch/m68k/configs/hp300_defconfig
+++ b/arch/m68k/configs/hp300_defconfig
@@ -425,7 +425,6 @@
 CONFIG_PPS_CLIENT_LDISC=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
 CONFIG_LOGO=y
@@ -556,6 +555,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -585,6 +585,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -614,3 +615,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/mac_defconfig b/arch/m68k/configs/mac_defconfig
index 45da20d..88761b8 100644
--- a/arch/m68k/configs/mac_defconfig
+++ b/arch/m68k/configs/mac_defconfig
@@ -447,7 +447,6 @@
 CONFIG_PPS_CLIENT_LDISC=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FB_VALKYRIE=y
 CONFIG_FB_MAC=y
@@ -578,6 +577,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -607,6 +607,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -636,3 +637,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/multi_defconfig b/arch/m68k/configs/multi_defconfig
index fda880c..7cb35da 100644
--- a/arch/m68k/configs/multi_defconfig
+++ b/arch/m68k/configs/multi_defconfig
@@ -504,7 +504,6 @@
 CONFIG_PPS_CLIENT_PARPORT=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FB_CIRRUS=y
 CONFIG_FB_AMIGA=y
@@ -658,6 +657,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -687,6 +687,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -716,3 +717,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/mvme147_defconfig b/arch/m68k/configs/mvme147_defconfig
index 7d5e486..b139d7b 100644
--- a/arch/m68k/configs/mvme147_defconfig
+++ b/arch/m68k/configs/mvme147_defconfig
@@ -420,7 +420,6 @@
 CONFIG_PPS_CLIENT_LDISC=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_HID=m
 CONFIG_HIDRAW=y
 CONFIG_UHID=m
@@ -546,6 +545,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -575,6 +575,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -604,3 +605,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/mvme16x_defconfig b/arch/m68k/configs/mvme16x_defconfig
index 7763b71..3983461 100644
--- a/arch/m68k/configs/mvme16x_defconfig
+++ b/arch/m68k/configs/mvme16x_defconfig
@@ -420,7 +420,6 @@
 CONFIG_PPS_CLIENT_LDISC=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_HID=m
 CONFIG_HIDRAW=y
 CONFIG_UHID=m
@@ -546,6 +545,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -575,6 +575,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -604,3 +605,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/q40_defconfig b/arch/m68k/configs/q40_defconfig
index 17eaebf..14c6083 100644
--- a/arch/m68k/configs/q40_defconfig
+++ b/arch/m68k/configs/q40_defconfig
@@ -437,7 +437,6 @@
 CONFIG_PPS_CLIENT_PARPORT=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
 CONFIG_LOGO=y
@@ -569,6 +568,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -598,6 +598,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -627,3 +628,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/sun3_defconfig b/arch/m68k/configs/sun3_defconfig
index d1cb7a0..97dec0b 100644
--- a/arch/m68k/configs/sun3_defconfig
+++ b/arch/m68k/configs/sun3_defconfig
@@ -419,7 +419,6 @@
 CONFIG_PPS_CLIENT_LDISC=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
 CONFIG_LOGO=y
@@ -548,6 +547,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -576,6 +576,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -605,3 +606,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/configs/sun3x_defconfig b/arch/m68k/configs/sun3x_defconfig
index ea3a331..56df28d 100644
--- a/arch/m68k/configs/sun3x_defconfig
+++ b/arch/m68k/configs/sun3x_defconfig
@@ -419,7 +419,6 @@
 CONFIG_PPS_CLIENT_LDISC=m
 CONFIG_PTP_1588_CLOCK=m
 # CONFIG_HWMON is not set
-# CONFIG_RC_CORE is not set
 CONFIG_FB=y
 CONFIG_FRAMEBUFFER_CONSOLE=y
 CONFIG_LOGO=y
@@ -548,6 +547,7 @@
 CONFIG_TEST_HASH=m
 CONFIG_TEST_USER_COPY=m
 CONFIG_TEST_BPF=m
+CONFIG_TEST_FIND_BIT=m
 CONFIG_TEST_FIRMWARE=m
 CONFIG_TEST_SYSCTL=m
 CONFIG_TEST_UDELAY=m
@@ -577,6 +577,7 @@
 CONFIG_CRYPTO_RMD320=m
 CONFIG_CRYPTO_SHA512=m
 CONFIG_CRYPTO_SHA3=m
+CONFIG_CRYPTO_SM3=m
 CONFIG_CRYPTO_TGR192=m
 CONFIG_CRYPTO_WP512=m
 CONFIG_CRYPTO_AES_TI=m
@@ -606,3 +607,4 @@
 # CONFIG_CRYPTO_HW is not set
 CONFIG_CRC32_SELFTEST=m
 CONFIG_XZ_DEC_TEST=m
+CONFIG_STRING_SELFTEST=m
diff --git a/arch/m68k/include/asm/macintosh.h b/arch/m68k/include/asm/macintosh.h
index f42c274..9b840c0 100644
--- a/arch/m68k/include/asm/macintosh.h
+++ b/arch/m68k/include/asm/macintosh.h
@@ -33,7 +33,7 @@ struct mac_model
 	char ide_type;
 	char scc_type;
 	char ether_type;
-	char nubus_type;
+	char expansion_type;
 	char floppy_type;
 };
 
@@ -73,8 +73,11 @@ struct mac_model
 #define MAC_ETHER_SONIC		1
 #define MAC_ETHER_MACE		2
 
-#define MAC_NO_NUBUS		0
-#define MAC_NUBUS		1
+#define MAC_EXP_NONE		0
+#define MAC_EXP_PDS		1 /* Accepts only a PDS card */
+#define MAC_EXP_NUBUS		2 /* Accepts only NuBus card(s) */
+#define MAC_EXP_PDS_NUBUS	3 /* Accepts PDS card and/or NuBus card(s) */
+#define MAC_EXP_PDS_COMM	4 /* Accepts PDS card or Comm Slot card */
 
 #define MAC_FLOPPY_IWM		0
 #define MAC_FLOPPY_SWIM_ADDR1	1
diff --git a/arch/m68k/include/asm/thread_info.h b/arch/m68k/include/asm/thread_info.h
index 9280355..015f1ca 100644
--- a/arch/m68k/include/asm/thread_info.h
+++ b/arch/m68k/include/asm/thread_info.h
@@ -41,8 +41,6 @@ struct thread_info {
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
 }
 
-#define init_stack		(init_thread_union.stack)
-
 #ifndef __ASSEMBLY__
 /* how to get the thread information struct from C */
 static inline struct thread_info *current_thread_info(void)
@@ -58,8 +56,6 @@ static inline struct thread_info *current_thread_info(void)
 }
 #endif
 
-#define init_thread_info	(init_thread_union.thread_info)
-
 /* entry.S relies on these definitions!
  * bits 0-7 are tested at every exception exit
  * bits 8-15 are also tested at syscall exit
diff --git a/arch/m68k/mac/config.c b/arch/m68k/mac/config.c
index 16cd5ce..d3d4352 100644
--- a/arch/m68k/mac/config.c
+++ b/arch/m68k/mac/config.c
@@ -212,7 +212,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_II,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_IWM,
 	},
 
@@ -227,7 +227,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_II,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_IWM,
 	}, {
 		.ident		= MAC_MODEL_IIX,
@@ -236,7 +236,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_II,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_IICX,
@@ -245,7 +245,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_II,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_SE30,
@@ -254,7 +254,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_II,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	},
 
@@ -272,7 +272,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_IIFX,
@@ -281,7 +281,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_IIFX,
 		.scc_type	= MAC_SCC_IOP,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_IOP,
 	}, {
 		.ident		= MAC_MODEL_IISI,
@@ -290,7 +290,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_IIVI,
@@ -299,7 +299,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_IIVX,
@@ -308,7 +308,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	},
 
@@ -323,7 +323,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_CCL,
@@ -332,7 +331,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_CCLII,
@@ -341,7 +340,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	},
 
@@ -356,7 +355,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_LCII,
@@ -365,7 +364,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_LCIII,
@@ -374,7 +373,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	},
 
@@ -395,7 +394,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_Q605_ACC,
@@ -404,7 +403,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_Q610,
@@ -414,7 +413,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_QUADRA,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_Q630,
@@ -424,8 +423,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.ide_type	= MAC_IDE_QUADRA,
 		.scc_type	= MAC_SCC_QUADRA,
-		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_COMM,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_Q650,
@@ -435,7 +433,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_QUADRA,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	},
 	/* The Q700 does have a NS Sonic */
@@ -447,7 +445,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA2,
 		.scc_type	= MAC_SCC_QUADRA,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_Q800,
@@ -457,7 +455,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_QUADRA,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_Q840,
@@ -467,7 +465,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA3,
 		.scc_type	= MAC_SCC_PSC,
 		.ether_type	= MAC_ETHER_MACE,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_AV,
 	}, {
 		.ident		= MAC_MODEL_Q900,
@@ -477,7 +475,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA2,
 		.scc_type	= MAC_SCC_IOP,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_IOP,
 	}, {
 		.ident		= MAC_MODEL_Q950,
@@ -487,7 +485,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA2,
 		.scc_type	= MAC_SCC_IOP,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_IOP,
 	},
 
@@ -502,7 +500,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_P475,
@@ -511,7 +509,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_P475F,
@@ -520,7 +518,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_P520,
@@ -529,7 +527,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_P550,
@@ -538,7 +536,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	},
 	/* These have the comm slot, and therefore possibly SONIC ethernet */
@@ -549,8 +547,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_II,
-		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_COMM,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_P588,
@@ -560,8 +557,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.ide_type	= MAC_IDE_QUADRA,
 		.scc_type	= MAC_SCC_II,
-		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_COMM,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_TV,
@@ -570,7 +566,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_P600,
@@ -579,7 +574,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_LC,
 		.scc_type	= MAC_SCC_II,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	},
 
@@ -596,7 +591,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_QUADRA,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_C650,
@@ -606,7 +601,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA,
 		.scc_type	= MAC_SCC_QUADRA,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR1,
 	}, {
 		.ident		= MAC_MODEL_C660,
@@ -616,7 +611,7 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_QUADRA3,
 		.scc_type	= MAC_SCC_PSC,
 		.ether_type	= MAC_ETHER_MACE,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_PDS_NUBUS,
 		.floppy_type	= MAC_FLOPPY_AV,
 	},
 
@@ -633,7 +628,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB145,
@@ -642,7 +636,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB150,
@@ -652,7 +645,6 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_OLD,
 		.ide_type	= MAC_IDE_PB,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB160,
@@ -661,7 +653,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB165,
@@ -670,7 +661,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB165C,
@@ -679,7 +669,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB170,
@@ -688,7 +677,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB180,
@@ -697,7 +685,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB180C,
@@ -706,7 +693,6 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_QUADRA,
 		.scsi_type	= MAC_SCSI_OLD,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB190,
@@ -716,7 +702,6 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_LATE,
 		.ide_type	= MAC_IDE_BABOON,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB520,
@@ -726,7 +711,6 @@ static struct mac_model mac_data_table[] = {
 		.scsi_type	= MAC_SCSI_LATE,
 		.scc_type	= MAC_SCC_QUADRA,
 		.ether_type	= MAC_ETHER_SONIC,
-		.nubus_type	= MAC_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	},
 
@@ -743,7 +727,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_DUO,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB230,
@@ -752,7 +736,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_DUO,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB250,
@@ -761,7 +745,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_DUO,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB270C,
@@ -770,7 +754,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_DUO,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB280,
@@ -779,7 +763,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_DUO,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	}, {
 		.ident		= MAC_MODEL_PB280C,
@@ -788,7 +772,7 @@ static struct mac_model mac_data_table[] = {
 		.via_type	= MAC_VIA_IICI,
 		.scsi_type	= MAC_SCSI_DUO,
 		.scc_type	= MAC_SCC_QUADRA,
-		.nubus_type	= MAC_NUBUS,
+		.expansion_type	= MAC_EXP_NUBUS,
 		.floppy_type	= MAC_FLOPPY_SWIM_ADDR2,
 	},
 
@@ -1100,14 +1084,12 @@ int __init mac_platform_init(void)
 	 * Ethernet device
 	 */
 
-	switch (macintosh_config->ether_type) {
-	case MAC_ETHER_SONIC:
+	if (macintosh_config->ether_type == MAC_ETHER_SONIC ||
+	    macintosh_config->expansion_type == MAC_EXP_PDS_COMM)
 		platform_device_register_simple("macsonic", -1, NULL, 0);
-		break;
-	case MAC_ETHER_MACE:
+
+	if (macintosh_config->ether_type == MAC_ETHER_MACE)
 		platform_device_register_simple("macmace", -1, NULL, 0);
-		break;
-	}
 
 	return 0;
 }
diff --git a/arch/m68k/mac/oss.c b/arch/m68k/mac/oss.c
index 3f81892..921e6c0 100644
--- a/arch/m68k/mac/oss.c
+++ b/arch/m68k/mac/oss.c
@@ -53,56 +53,41 @@ void __init oss_init(void)
 }
 
 /*
- * Handle miscellaneous OSS interrupts.
+ * Handle OSS interrupts.
+ * XXX how do you clear a pending IRQ? is it even necessary?
  */
 
-static void oss_irq(struct irq_desc *desc)
+static void oss_iopism_irq(struct irq_desc *desc)
 {
-	int events = oss->irq_pending &
-		(OSS_IP_IOPSCC | OSS_IP_SCSI | OSS_IP_IOPISM);
-
-	if (events & OSS_IP_IOPSCC) {
-		oss->irq_pending &= ~OSS_IP_IOPSCC;
-		generic_handle_irq(IRQ_MAC_SCC);
-	}
-
-	if (events & OSS_IP_SCSI) {
-		oss->irq_pending &= ~OSS_IP_SCSI;
-		generic_handle_irq(IRQ_MAC_SCSI);
-	}
-
-	if (events & OSS_IP_IOPISM) {
-		oss->irq_pending &= ~OSS_IP_IOPISM;
-		generic_handle_irq(IRQ_MAC_ADB);
-	}
+	generic_handle_irq(IRQ_MAC_ADB);
 }
 
-/*
- * Nubus IRQ handler, OSS style
- *
- * Unlike the VIA/RBV this is on its own autovector interrupt level.
- */
+static void oss_scsi_irq(struct irq_desc *desc)
+{
+	generic_handle_irq(IRQ_MAC_SCSI);
+}
 
 static void oss_nubus_irq(struct irq_desc *desc)
 {
-	int events, irq_bit, i;
+	u16 events, irq_bit;
+	int irq_num;
 
 	events = oss->irq_pending & OSS_IP_NUBUS;
-	if (!events)
-		return;
-
-	/* There are only six slots on the OSS, not seven */
-
-	i = 6;
-	irq_bit = 0x40;
+	irq_num = NUBUS_SOURCE_BASE + 5;
+	irq_bit = OSS_IP_NUBUS5;
 	do {
-		--i;
-		irq_bit >>= 1;
 		if (events & irq_bit) {
-			oss->irq_pending &= ~irq_bit;
-			generic_handle_irq(NUBUS_SOURCE_BASE + i);
+			events &= ~irq_bit;
+			generic_handle_irq(irq_num);
 		}
-	} while(events & (irq_bit - 1));
+		--irq_num;
+		irq_bit >>= 1;
+	} while (events);
+}
+
+static void oss_iopscc_irq(struct irq_desc *desc)
+{
+	generic_handle_irq(IRQ_MAC_SCC);
 }
 
 /*
@@ -122,14 +107,14 @@ static void oss_nubus_irq(struct irq_desc *desc)
 
 void __init oss_register_interrupts(void)
 {
-	irq_set_chained_handler(OSS_IRQLEV_IOPISM, oss_irq);
-	irq_set_chained_handler(OSS_IRQLEV_SCSI,   oss_irq);
+	irq_set_chained_handler(OSS_IRQLEV_IOPISM, oss_iopism_irq);
+	irq_set_chained_handler(OSS_IRQLEV_SCSI,   oss_scsi_irq);
 	irq_set_chained_handler(OSS_IRQLEV_NUBUS,  oss_nubus_irq);
-	irq_set_chained_handler(OSS_IRQLEV_IOPSCC, oss_irq);
+	irq_set_chained_handler(OSS_IRQLEV_IOPSCC, oss_iopscc_irq);
 	irq_set_chained_handler(OSS_IRQLEV_VIA1,   via1_irq);
 
 	/* OSS_VIA1 gets enabled here because it has no machspec interrupt. */
-	oss->irq_level[OSS_VIA1] = IRQ_AUTO_6;
+	oss->irq_level[OSS_VIA1] = OSS_IRQLEV_VIA1;
 }
 
 /*
diff --git a/arch/metag/include/asm/thread_info.h b/arch/metag/include/asm/thread_info.h
index 554f73a..a1a9c7f 100644
--- a/arch/metag/include/asm/thread_info.h
+++ b/arch/metag/include/asm/thread_info.h
@@ -74,9 +74,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the current stack pointer from C */
 register unsigned long current_stack_pointer asm("A0StP") __used;
 
diff --git a/arch/microblaze/include/asm/thread_info.h b/arch/microblaze/include/asm/thread_info.h
index e7e8954e..9afe4b5 100644
--- a/arch/microblaze/include/asm/thread_info.h
+++ b/arch/microblaze/include/asm/thread_info.h
@@ -86,9 +86,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 static inline struct thread_info *current_thread_info(void)
 {
diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index 350a990..8e0b370 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -259,6 +259,7 @@
 	select LEDS_GPIO_REGISTER
 	select BCM47XX_NVRAM
 	select BCM47XX_SPROM
+	select BCM47XX_SSB if !BCM47XX_BCMA
 	help
 	 Support for BCM47XX based boards
 
@@ -389,6 +390,7 @@
 	select SYS_SUPPORTS_32BIT_KERNEL
 	select SYS_SUPPORTS_MIPS16
 	select SYS_SUPPORTS_MULTITHREADING
+	select SYS_SUPPORTS_VPE_LOADER
 	select SYS_HAS_EARLY_PRINTK
 	select GPIOLIB
 	select SWAP_IO_SPACE
@@ -516,6 +518,7 @@
 	select SYS_SUPPORTS_MIPS16
 	select SYS_SUPPORTS_MULTITHREADING
 	select SYS_SUPPORTS_SMARTMIPS
+	select SYS_SUPPORTS_VPE_LOADER
 	select SYS_SUPPORTS_ZBOOT
 	select SYS_SUPPORTS_RELOCATABLE
 	select USE_OF
@@ -2281,9 +2284,16 @@
 	  The only reason this is a build-time option is to save ~14K from the
 	  final kernel image.
 
+config SYS_SUPPORTS_VPE_LOADER
+	bool
+	depends on SYS_SUPPORTS_MULTITHREADING
+	help
+	  Indicates that the platform supports the VPE loader, and provides
+	  physical_memsize.
+
 config MIPS_VPE_LOADER
 	bool "VPE loader support."
-	depends on SYS_SUPPORTS_MULTITHREADING && MODULES
+	depends on SYS_SUPPORTS_VPE_LOADER && MODULES
 	select CPU_MIPSR2_IRQ_VI
 	select CPU_MIPSR2_IRQ_EI
 	select MIPS_MT
diff --git a/arch/mips/Kconfig.debug b/arch/mips/Kconfig.debug
index 464af5e..0749c37 100644
--- a/arch/mips/Kconfig.debug
+++ b/arch/mips/Kconfig.debug
@@ -124,30 +124,36 @@
 
 	  If unsure, say N.
 
-menuconfig MIPS_CPS_NS16550
+menuconfig MIPS_CPS_NS16550_BOOL
 	bool "CPS SMP NS16550 UART output"
 	depends on MIPS_CPS
 	help
 	  Output debug information via an ns16550 compatible UART if exceptions
 	  occur early in the boot process of a secondary core.
 
-if MIPS_CPS_NS16550
+if MIPS_CPS_NS16550_BOOL
+
+config MIPS_CPS_NS16550
+	def_bool MIPS_CPS_NS16550_BASE != 0
 
 config MIPS_CPS_NS16550_BASE
 	hex "UART Base Address"
 	default 0x1b0003f8 if MIPS_MALTA
+	default 0
 	help
 	  The base address of the ns16550 compatible UART on which to output
 	  debug information from the early stages of core startup.
 
+	  This is only used if non-zero.
+
 config MIPS_CPS_NS16550_SHIFT
 	int "UART Register Shift"
-	default 0 if MIPS_MALTA
+	default 0
 	help
 	  The number of bits to shift ns16550 register indices by in order to
 	  form their addresses. That is, log base 2 of the span between
 	  adjacent ns16550 registers in the system.
 
-endif # MIPS_CPS_NS16550
+endif # MIPS_CPS_NS16550_BOOL
 
 endmenu
diff --git a/arch/mips/ar7/platform.c b/arch/mips/ar7/platform.c
index 4674f1e..e1675c2 100644
--- a/arch/mips/ar7/platform.c
+++ b/arch/mips/ar7/platform.c
@@ -575,7 +575,7 @@ static int __init ar7_register_uarts(void)
 	uart_port.type		= PORT_AR7;
 	uart_port.uartclk	= clk_get_rate(bus_clk) / 2;
 	uart_port.iotype	= UPIO_MEM32;
-	uart_port.flags		= UPF_FIXED_TYPE;
+	uart_port.flags		= UPF_FIXED_TYPE | UPF_BOOT_AUTOCONF;
 	uart_port.regshift	= 2;
 
 	uart_port.line		= 0;
diff --git a/arch/mips/ath25/devices.c b/arch/mips/ath25/devices.c
index e115634..301a902 100644
--- a/arch/mips/ath25/devices.c
+++ b/arch/mips/ath25/devices.c
@@ -73,6 +73,7 @@ const char *get_system_type(void)
 
 void __init ath25_serial_setup(u32 mapbase, int irq, unsigned int uartclk)
 {
+#ifdef CONFIG_SERIAL_8250_CONSOLE
 	struct uart_port s;
 
 	memset(&s, 0, sizeof(s));
@@ -85,6 +86,7 @@ void __init ath25_serial_setup(u32 mapbase, int irq, unsigned int uartclk)
 	s.uartclk = uartclk;
 
 	early_serial_setup(&s);
+#endif /* CONFIG_SERIAL_8250_CONSOLE */
 }
 
 int __init ath25_add_wmac(int nr, u32 base, int irq)
diff --git a/arch/mips/include/asm/thread_info.h b/arch/mips/include/asm/thread_info.h
index 5e8927f..4993db4 100644
--- a/arch/mips/include/asm/thread_info.h
+++ b/arch/mips/include/asm/thread_info.h
@@ -49,9 +49,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* How to get the thread information struct from C.  */
 register struct thread_info *__current_thread_info __asm__("$28");
 
diff --git a/arch/mips/kernel/cps-vec.S b/arch/mips/kernel/cps-vec.S
index c7ed260..e68e6e0 100644
--- a/arch/mips/kernel/cps-vec.S
+++ b/arch/mips/kernel/cps-vec.S
@@ -235,6 +235,7 @@
 	has_mt	t0, 3f
 
 	.set	push
+	.set	MIPS_ISA_LEVEL_RAW
 	.set	mt
 
 	/* Only allow 1 TC per VPE to execute... */
@@ -388,6 +389,7 @@
 #elif defined(CONFIG_MIPS_MT)
 
 	.set	push
+	.set	MIPS_ISA_LEVEL_RAW
 	.set	mt
 
 	/* If the core doesn't support MT then return */
diff --git a/arch/mips/kernel/mips-cm.c b/arch/mips/kernel/mips-cm.c
index dd5567b..8f5bd04 100644
--- a/arch/mips/kernel/mips-cm.c
+++ b/arch/mips/kernel/mips-cm.c
@@ -292,7 +292,6 @@ void mips_cm_lock_other(unsigned int cluster, unsigned int core,
 				  *this_cpu_ptr(&cm_core_lock_flags));
 	} else {
 		WARN_ON(cluster != 0);
-		WARN_ON(vp != 0);
 		WARN_ON(block != CM_GCR_Cx_OTHER_BLOCK_LOCAL);
 
 		/*
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 45d0b6b..57028d4 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -705,6 +705,18 @@ int mips_set_process_fp_mode(struct task_struct *task, unsigned int value)
 	struct task_struct *t;
 	int max_users;
 
+	/* If nothing to change, return right away, successfully.  */
+	if (value == mips_get_process_fp_mode(task))
+		return 0;
+
+	/* Only accept a mode change if 64-bit FP enabled for o32.  */
+	if (!IS_ENABLED(CONFIG_MIPS_O32_FP64_SUPPORT))
+		return -EOPNOTSUPP;
+
+	/* And only for o32 tasks.  */
+	if (IS_ENABLED(CONFIG_64BIT) && !test_thread_flag(TIF_32BIT_REGS))
+		return -EOPNOTSUPP;
+
 	/* Check the value is valid */
 	if (value & ~known_bits)
 		return -EOPNOTSUPP;
diff --git a/arch/mips/kernel/ptrace.c b/arch/mips/kernel/ptrace.c
index efbd8df..0b23b1a 100644
--- a/arch/mips/kernel/ptrace.c
+++ b/arch/mips/kernel/ptrace.c
@@ -419,25 +419,38 @@ static int gpr64_set(struct task_struct *target,
 
 #endif /* CONFIG_64BIT */
 
-static int fpr_get(struct task_struct *target,
-		   const struct user_regset *regset,
-		   unsigned int pos, unsigned int count,
-		   void *kbuf, void __user *ubuf)
+/*
+ * Copy the floating-point context to the supplied NT_PRFPREG buffer,
+ * !CONFIG_CPU_HAS_MSA variant.  FP context's general register slots
+ * correspond 1:1 to buffer slots.  Only general registers are copied.
+ */
+static int fpr_get_fpa(struct task_struct *target,
+		       unsigned int *pos, unsigned int *count,
+		       void **kbuf, void __user **ubuf)
 {
-	unsigned i;
-	int err;
+	return user_regset_copyout(pos, count, kbuf, ubuf,
+				   &target->thread.fpu,
+				   0, NUM_FPU_REGS * sizeof(elf_fpreg_t));
+}
+
+/*
+ * Copy the floating-point context to the supplied NT_PRFPREG buffer,
+ * CONFIG_CPU_HAS_MSA variant.  Only lower 64 bits of FP context's
+ * general register slots are copied to buffer slots.  Only general
+ * registers are copied.
+ */
+static int fpr_get_msa(struct task_struct *target,
+		       unsigned int *pos, unsigned int *count,
+		       void **kbuf, void __user **ubuf)
+{
+	unsigned int i;
 	u64 fpr_val;
+	int err;
 
-	/* XXX fcr31  */
-
-	if (sizeof(target->thread.fpu.fpr[i]) == sizeof(elf_fpreg_t))
-		return user_regset_copyout(&pos, &count, &kbuf, &ubuf,
-					   &target->thread.fpu,
-					   0, sizeof(elf_fpregset_t));
-
+	BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t));
 	for (i = 0; i < NUM_FPU_REGS; i++) {
 		fpr_val = get_fpr64(&target->thread.fpu.fpr[i], 0);
-		err = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+		err = user_regset_copyout(pos, count, kbuf, ubuf,
 					  &fpr_val, i * sizeof(elf_fpreg_t),
 					  (i + 1) * sizeof(elf_fpreg_t));
 		if (err)
@@ -447,27 +460,64 @@ static int fpr_get(struct task_struct *target,
 	return 0;
 }
 
-static int fpr_set(struct task_struct *target,
+/*
+ * Copy the floating-point context to the supplied NT_PRFPREG buffer.
+ * Choose the appropriate helper for general registers, and then copy
+ * the FCSR register separately.
+ */
+static int fpr_get(struct task_struct *target,
 		   const struct user_regset *regset,
 		   unsigned int pos, unsigned int count,
-		   const void *kbuf, const void __user *ubuf)
+		   void *kbuf, void __user *ubuf)
 {
-	unsigned i;
+	const int fcr31_pos = NUM_FPU_REGS * sizeof(elf_fpreg_t);
 	int err;
+
+	if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))
+		err = fpr_get_fpa(target, &pos, &count, &kbuf, &ubuf);
+	else
+		err = fpr_get_msa(target, &pos, &count, &kbuf, &ubuf);
+	if (err)
+		return err;
+
+	err = user_regset_copyout(&pos, &count, &kbuf, &ubuf,
+				  &target->thread.fpu.fcr31,
+				  fcr31_pos, fcr31_pos + sizeof(u32));
+
+	return err;
+}
+
+/*
+ * Copy the supplied NT_PRFPREG buffer to the floating-point context,
+ * !CONFIG_CPU_HAS_MSA variant.   Buffer slots correspond 1:1 to FP
+ * context's general register slots.  Only general registers are copied.
+ */
+static int fpr_set_fpa(struct task_struct *target,
+		       unsigned int *pos, unsigned int *count,
+		       const void **kbuf, const void __user **ubuf)
+{
+	return user_regset_copyin(pos, count, kbuf, ubuf,
+				  &target->thread.fpu,
+				  0, NUM_FPU_REGS * sizeof(elf_fpreg_t));
+}
+
+/*
+ * Copy the supplied NT_PRFPREG buffer to the floating-point context,
+ * CONFIG_CPU_HAS_MSA variant.  Buffer slots are copied to lower 64
+ * bits only of FP context's general register slots.  Only general
+ * registers are copied.
+ */
+static int fpr_set_msa(struct task_struct *target,
+		       unsigned int *pos, unsigned int *count,
+		       const void **kbuf, const void __user **ubuf)
+{
+	unsigned int i;
 	u64 fpr_val;
-
-	/* XXX fcr31  */
-
-	init_fp_ctx(target);
-
-	if (sizeof(target->thread.fpu.fpr[i]) == sizeof(elf_fpreg_t))
-		return user_regset_copyin(&pos, &count, &kbuf, &ubuf,
-					  &target->thread.fpu,
-					  0, sizeof(elf_fpregset_t));
+	int err;
 
 	BUILD_BUG_ON(sizeof(fpr_val) != sizeof(elf_fpreg_t));
-	for (i = 0; i < NUM_FPU_REGS && count >= sizeof(elf_fpreg_t); i++) {
-		err = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+	for (i = 0; i < NUM_FPU_REGS && *count > 0; i++) {
+		err = user_regset_copyin(pos, count, kbuf, ubuf,
 					 &fpr_val, i * sizeof(elf_fpreg_t),
 					 (i + 1) * sizeof(elf_fpreg_t));
 		if (err)
@@ -478,6 +528,53 @@ static int fpr_set(struct task_struct *target,
 	return 0;
 }
 
+/*
+ * Copy the supplied NT_PRFPREG buffer to the floating-point context.
+ * Choose the appropriate helper for general registers, and then copy
+ * the FCSR register separately.
+ *
+ * We optimize for the case where `count % sizeof(elf_fpreg_t) == 0',
+ * which is supposed to have been guaranteed by the kernel before
+ * calling us, e.g. in `ptrace_regset'.  We enforce that requirement,
+ * so that we can safely avoid preinitializing temporaries for
+ * partial register writes.
+ */
+static int fpr_set(struct task_struct *target,
+		   const struct user_regset *regset,
+		   unsigned int pos, unsigned int count,
+		   const void *kbuf, const void __user *ubuf)
+{
+	const int fcr31_pos = NUM_FPU_REGS * sizeof(elf_fpreg_t);
+	u32 fcr31;
+	int err;
+
+	BUG_ON(count % sizeof(elf_fpreg_t));
+
+	if (pos + count > sizeof(elf_fpregset_t))
+		return -EIO;
+
+	init_fp_ctx(target);
+
+	if (sizeof(target->thread.fpu.fpr[0]) == sizeof(elf_fpreg_t))
+		err = fpr_set_fpa(target, &pos, &count, &kbuf, &ubuf);
+	else
+		err = fpr_set_msa(target, &pos, &count, &kbuf, &ubuf);
+	if (err)
+		return err;
+
+	if (count > 0) {
+		err = user_regset_copyin(&pos, &count, &kbuf, &ubuf,
+					 &fcr31,
+					 fcr31_pos, fcr31_pos + sizeof(u32));
+		if (err)
+			return err;
+
+		ptrace_setfcr31(target, fcr31);
+	}
+
+	return err;
+}
+
 enum mips_regset {
 	REGSET_GPR,
 	REGSET_FPR,
diff --git a/arch/mips/lib/Makefile b/arch/mips/lib/Makefile
index 78c2aff..e84e126 100644
--- a/arch/mips/lib/Makefile
+++ b/arch/mips/lib/Makefile
@@ -16,4 +16,5 @@
 obj-$(CONFIG_CPU_TX39XX)	+= r3k_dump_tlb.o
 
 # libgcc-style stuff needed in the kernel
-obj-y += ashldi3.o ashrdi3.o bswapsi.o bswapdi.o cmpdi2.o lshrdi3.o ucmpdi2.o
+obj-y += ashldi3.o ashrdi3.o bswapsi.o bswapdi.o cmpdi2.o lshrdi3.o multi3.o \
+	 ucmpdi2.o
diff --git a/arch/mips/lib/libgcc.h b/arch/mips/lib/libgcc.h
index 28002ed..199a7f9 100644
--- a/arch/mips/lib/libgcc.h
+++ b/arch/mips/lib/libgcc.h
@@ -10,10 +10,18 @@ typedef int word_type __attribute__ ((mode (__word__)));
 struct DWstruct {
 	int high, low;
 };
+
+struct TWstruct {
+	long long high, low;
+};
 #elif defined(__LITTLE_ENDIAN)
 struct DWstruct {
 	int low, high;
 };
+
+struct TWstruct {
+	long long low, high;
+};
 #else
 #error I feel sick.
 #endif
@@ -23,4 +31,13 @@ typedef union {
 	long long ll;
 } DWunion;
 
+#if defined(CONFIG_64BIT) && defined(CONFIG_CPU_MIPSR6)
+typedef int ti_type __attribute__((mode(TI)));
+
+typedef union {
+	struct TWstruct s;
+	ti_type ti;
+} TWunion;
+#endif
+
 #endif /* __ASM_LIBGCC_H */
diff --git a/arch/mips/lib/multi3.c b/arch/mips/lib/multi3.c
new file mode 100644
index 0000000..111ad47
--- /dev/null
+++ b/arch/mips/lib/multi3.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/export.h>
+
+#include "libgcc.h"
+
+/*
+ * GCC 7 suboptimally generates __multi3 calls for mips64r6, so for that
+ * specific case only we'll implement it here.
+ *
+ * See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82981
+ */
+#if defined(CONFIG_64BIT) && defined(CONFIG_CPU_MIPSR6) && (__GNUC__ == 7)
+
+/* multiply 64-bit values, low 64-bits returned */
+static inline long long notrace dmulu(long long a, long long b)
+{
+	long long res;
+
+	asm ("dmulu %0,%1,%2" : "=r" (res) : "r" (a), "r" (b));
+	return res;
+}
+
+/* multiply 64-bit unsigned values, high 64-bits of 128-bit result returned */
+static inline long long notrace dmuhu(long long a, long long b)
+{
+	long long res;
+
+	asm ("dmuhu %0,%1,%2" : "=r" (res) : "r" (a), "r" (b));
+	return res;
+}
+
+/* multiply 128-bit values, low 128-bits returned */
+ti_type notrace __multi3(ti_type a, ti_type b)
+{
+	TWunion res, aa, bb;
+
+	aa.ti = a;
+	bb.ti = b;
+
+	/*
+	 * a * b =           (a.lo * b.lo)
+	 *         + 2^64  * (a.hi * b.lo + a.lo * b.hi)
+	 *        [+ 2^128 * (a.hi * b.hi)]
+	 */
+	res.s.low = dmulu(aa.s.low, bb.s.low);
+	res.s.high = dmuhu(aa.s.low, bb.s.low);
+	res.s.high += dmulu(aa.s.high, bb.s.low);
+	res.s.high += dmulu(aa.s.low, bb.s.high);
+
+	return res.ti;
+}
+EXPORT_SYMBOL(__multi3);
+
+#endif /* 64BIT && CPU_MIPSR6 && GCC7 */
diff --git a/arch/mips/mm/uasm-micromips.c b/arch/mips/mm/uasm-micromips.c
index cdb5a19..9bb6baa 100644
--- a/arch/mips/mm/uasm-micromips.c
+++ b/arch/mips/mm/uasm-micromips.c
@@ -40,7 +40,7 @@
 
 #include "uasm.c"
 
-static const struct insn const insn_table_MM[insn_invalid] = {
+static const struct insn insn_table_MM[insn_invalid] = {
 	[insn_addu]	= {M(mm_pool32a_op, 0, 0, 0, 0, mm_addu32_op), RT | RS | RD},
 	[insn_addiu]	= {M(mm_addiu32_op, 0, 0, 0, 0, 0), RT | RS | SIMM},
 	[insn_and]	= {M(mm_pool32a_op, 0, 0, 0, 0, mm_and_op), RT | RS | RD},
diff --git a/arch/mips/ralink/timer.c b/arch/mips/ralink/timer.c
index d4469b2..4f46a45 100644
--- a/arch/mips/ralink/timer.c
+++ b/arch/mips/ralink/timer.c
@@ -109,9 +109,9 @@ static int rt_timer_probe(struct platform_device *pdev)
 	}
 
 	rt->irq = platform_get_irq(pdev, 0);
-	if (!rt->irq) {
+	if (rt->irq < 0) {
 		dev_err(&pdev->dev, "failed to load irq\n");
-		return -ENOENT;
+		return rt->irq;
 	}
 
 	rt->membase = devm_ioremap_resource(&pdev->dev, res);
diff --git a/arch/mips/rb532/Makefile b/arch/mips/rb532/Makefile
index efdecdb..8186afc 100644
--- a/arch/mips/rb532/Makefile
+++ b/arch/mips/rb532/Makefile
@@ -2,4 +2,6 @@
 # Makefile for the RB532 board specific parts of the kernel
 #
 
-obj-y	 += irq.o time.o setup.o serial.o prom.o gpio.o devices.o
+obj-$(CONFIG_SERIAL_8250_CONSOLE) += serial.o
+
+obj-y	 += irq.o time.o setup.o prom.o gpio.o devices.o
diff --git a/arch/mips/rb532/devices.c b/arch/mips/rb532/devices.c
index 32ea3e6..354d258 100644
--- a/arch/mips/rb532/devices.c
+++ b/arch/mips/rb532/devices.c
@@ -310,6 +310,8 @@ static int __init plat_setup_devices(void)
 	return platform_add_devices(rb532_devs, ARRAY_SIZE(rb532_devs));
 }
 
+#ifdef CONFIG_NET
+
 static int __init setup_kmac(char *s)
 {
 	printk(KERN_INFO "korina mac = %s\n", s);
@@ -322,4 +324,6 @@ static int __init setup_kmac(char *s)
 
 __setup("kmac=", setup_kmac);
 
+#endif /* CONFIG_NET */
+
 arch_initcall(plat_setup_devices);
diff --git a/arch/mn10300/include/asm/thread_info.h b/arch/mn10300/include/asm/thread_info.h
index f5f90bb..1748a7b 100644
--- a/arch/mn10300/include/asm/thread_info.h
+++ b/arch/mn10300/include/asm/thread_info.h
@@ -79,8 +79,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
 #define init_uregs							\
 	((struct pt_regs *)						\
 	 ((unsigned long) init_stack + THREAD_SIZE - sizeof(struct pt_regs)))
diff --git a/arch/mn10300/kernel/mn10300-serial.c b/arch/mn10300/kernel/mn10300-serial.c
index d7ef123..4994b57 100644
--- a/arch/mn10300/kernel/mn10300-serial.c
+++ b/arch/mn10300/kernel/mn10300-serial.c
@@ -550,7 +550,7 @@ static void mn10300_serial_receive_interrupt(struct mn10300_serial_port *port)
 		return;
 	}
 
-	smp_read_barrier_depends();
+	/* READ_ONCE() enforces dependency, but dangerous through integer!!! */
 	ch = port->rx_buffer[ix++];
 	st = port->rx_buffer[ix++];
 	smp_mb();
@@ -1728,7 +1728,10 @@ static int mn10300_serial_poll_get_char(struct uart_port *_port)
 			if (CIRC_CNT(port->rx_inp, ix, MNSC_BUFFER_SIZE) == 0)
 				return NO_POLL_CHAR;
 
-			smp_read_barrier_depends();
+			/*
+			 * READ_ONCE() enforces dependency, but dangerous
+			 * through integer!!!
+			 */
 			ch = port->rx_buffer[ix++];
 			st = port->rx_buffer[ix++];
 			smp_mb();
diff --git a/arch/nios2/include/asm/thread_info.h b/arch/nios2/include/asm/thread_info.h
index d69c338..7349a4f 100644
--- a/arch/nios2/include/asm/thread_info.h
+++ b/arch/nios2/include/asm/thread_info.h
@@ -63,9 +63,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 static inline struct thread_info *current_thread_info(void)
 {
diff --git a/arch/openrisc/include/asm/processor.h b/arch/openrisc/include/asm/processor.h
index 396d8f3..af31a9f 100644
--- a/arch/openrisc/include/asm/processor.h
+++ b/arch/openrisc/include/asm/processor.h
@@ -84,8 +84,6 @@ void start_thread(struct pt_regs *regs, unsigned long nip, unsigned long sp);
 void release_thread(struct task_struct *);
 unsigned long get_wchan(struct task_struct *p);
 
-#define init_stack      (init_thread_union.stack)
-
 #define cpu_relax()     barrier()
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/openrisc/include/asm/thread_info.h b/arch/openrisc/include/asm/thread_info.h
index c229aa6..5c15dfa 100644
--- a/arch/openrisc/include/asm/thread_info.h
+++ b/arch/openrisc/include/asm/thread_info.h
@@ -79,8 +79,6 @@ struct thread_info {
 	.ksp            = 0,                            \
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-
 /* how to get the thread information struct from C */
 register struct thread_info *current_thread_info_reg asm("r10");
 #define current_thread_info()   (current_thread_info_reg)
diff --git a/arch/openrisc/kernel/vmlinux.lds.S b/arch/openrisc/kernel/vmlinux.lds.S
index 00ddb78..953bdcd 100644
--- a/arch/openrisc/kernel/vmlinux.lds.S
+++ b/arch/openrisc/kernel/vmlinux.lds.S
@@ -28,6 +28,7 @@
 
 #include <asm/page.h>
 #include <asm/cache.h>
+#include <asm/thread_info.h>
 #include <asm-generic/vmlinux.lds.h>
 
 #ifdef __OR1K__
diff --git a/arch/parisc/include/asm/ldcw.h b/arch/parisc/include/asm/ldcw.h
index dd5a08a..3eb4bfc 100644
--- a/arch/parisc/include/asm/ldcw.h
+++ b/arch/parisc/include/asm/ldcw.h
@@ -12,6 +12,7 @@
    for the semaphore.  */
 
 #define __PA_LDCW_ALIGNMENT	16
+#define __PA_LDCW_ALIGN_ORDER	4
 #define __ldcw_align(a) ({					\
 	unsigned long __ret = (unsigned long) &(a)->lock[0];	\
 	__ret = (__ret + __PA_LDCW_ALIGNMENT - 1)		\
@@ -29,6 +30,7 @@
    ldcd). */
 
 #define __PA_LDCW_ALIGNMENT	4
+#define __PA_LDCW_ALIGN_ORDER	2
 #define __ldcw_align(a) (&(a)->slock)
 #define __LDCW	"ldcw,co"
 
diff --git a/arch/parisc/include/asm/thread_info.h b/arch/parisc/include/asm/thread_info.h
index 598c8d6..28575754 100644
--- a/arch/parisc/include/asm/thread_info.h
+++ b/arch/parisc/include/asm/thread_info.h
@@ -25,9 +25,6 @@ struct thread_info {
 	.preempt_count	= INIT_PREEMPT_COUNT,	\
 }
 
-#define init_thread_info        (init_thread_union.thread_info)
-#define init_stack              (init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 #define current_thread_info()	((struct thread_info *)mfctl(30))
 
diff --git a/arch/parisc/kernel/drivers.c b/arch/parisc/kernel/drivers.c
index d8f7735..29b99b8 100644
--- a/arch/parisc/kernel/drivers.c
+++ b/arch/parisc/kernel/drivers.c
@@ -870,7 +870,7 @@ static void print_parisc_device(struct parisc_device *dev)
 	static int count;
 
 	print_pa_hwpath(dev, hw_path);
-	printk(KERN_INFO "%d. %s at 0x%p [%s] { %d, 0x%x, 0x%.3x, 0x%.5x }",
+	printk(KERN_INFO "%d. %s at 0x%px [%s] { %d, 0x%x, 0x%.3x, 0x%.5x }",
 		++count, dev->name, (void*) dev->hpa.start, hw_path, dev->id.hw_type,
 		dev->id.hversion_rev, dev->id.hversion, dev->id.sversion);
 
diff --git a/arch/parisc/kernel/entry.S b/arch/parisc/kernel/entry.S
index f3cecf5..e95207c 100644
--- a/arch/parisc/kernel/entry.S
+++ b/arch/parisc/kernel/entry.S
@@ -35,6 +35,7 @@
 #include <asm/pgtable.h>
 #include <asm/signal.h>
 #include <asm/unistd.h>
+#include <asm/ldcw.h>
 #include <asm/thread_info.h>
 
 #include <linux/linkage.h>
@@ -46,6 +47,14 @@
 #endif
 
 	.import		pa_tlb_lock,data
+	.macro  load_pa_tlb_lock reg
+#if __PA_LDCW_ALIGNMENT > 4
+	load32	PA(pa_tlb_lock) + __PA_LDCW_ALIGNMENT-1, \reg
+	depi	0,31,__PA_LDCW_ALIGN_ORDER, \reg
+#else
+	load32	PA(pa_tlb_lock), \reg
+#endif
+	.endm
 
 	/* space_to_prot macro creates a prot id from a space id */
 
@@ -457,7 +466,7 @@
 	.macro		tlb_lock	spc,ptp,pte,tmp,tmp1,fault
 #ifdef CONFIG_SMP
 	cmpib,COND(=),n	0,\spc,2f
-	load32		PA(pa_tlb_lock),\tmp
+	load_pa_tlb_lock \tmp
 1:	LDCW		0(\tmp),\tmp1
 	cmpib,COND(=)	0,\tmp1,1b
 	nop
@@ -480,7 +489,7 @@
 	/* Release pa_tlb_lock lock. */
 	.macro		tlb_unlock1	spc,tmp
 #ifdef CONFIG_SMP
-	load32		PA(pa_tlb_lock),\tmp
+	load_pa_tlb_lock \tmp
 	tlb_unlock0	\spc,\tmp
 #endif
 	.endm
diff --git a/arch/parisc/kernel/pacache.S b/arch/parisc/kernel/pacache.S
index adf7187..2d40c4f 100644
--- a/arch/parisc/kernel/pacache.S
+++ b/arch/parisc/kernel/pacache.S
@@ -36,6 +36,7 @@
 #include <asm/assembly.h>
 #include <asm/pgtable.h>
 #include <asm/cache.h>
+#include <asm/ldcw.h>
 #include <linux/linkage.h>
 
 	.text
@@ -333,8 +334,12 @@
 
 	.macro	tlb_lock	la,flags,tmp
 #ifdef CONFIG_SMP
-	ldil		L%pa_tlb_lock,%r1
-	ldo		R%pa_tlb_lock(%r1),\la
+#if __PA_LDCW_ALIGNMENT > 4
+	load32		pa_tlb_lock + __PA_LDCW_ALIGNMENT-1, \la
+	depi		0,31,__PA_LDCW_ALIGN_ORDER, \la
+#else
+	load32		pa_tlb_lock, \la
+#endif
 	rsm		PSW_SM_I,\flags
 1:	LDCW		0(\la),\tmp
 	cmpib,<>,n	0,\tmp,3f
diff --git a/arch/parisc/kernel/process.c b/arch/parisc/kernel/process.c
index 30f9239..cad3e86 100644
--- a/arch/parisc/kernel/process.c
+++ b/arch/parisc/kernel/process.c
@@ -39,6 +39,7 @@
 #include <linux/kernel.h>
 #include <linux/mm.h>
 #include <linux/fs.h>
+#include <linux/cpu.h>
 #include <linux/module.h>
 #include <linux/personality.h>
 #include <linux/ptrace.h>
@@ -184,6 +185,44 @@ int dump_task_fpu (struct task_struct *tsk, elf_fpregset_t *r)
 }
 
 /*
+ * Idle thread support
+ *
+ * Detect when running on QEMU with SeaBIOS PDC Firmware and let
+ * QEMU idle the host too.
+ */
+
+int running_on_qemu __read_mostly;
+
+void __cpuidle arch_cpu_idle_dead(void)
+{
+	/* nop on real hardware, qemu will offline CPU. */
+	asm volatile("or %%r31,%%r31,%%r31\n":::);
+}
+
+void __cpuidle arch_cpu_idle(void)
+{
+	local_irq_enable();
+
+	/* nop on real hardware, qemu will idle sleep. */
+	asm volatile("or %%r10,%%r10,%%r10\n":::);
+}
+
+static int __init parisc_idle_init(void)
+{
+	const char *marker;
+
+	/* check QEMU/SeaBIOS marker in PAGE0 */
+	marker = (char *) &PAGE0->pad0;
+	running_on_qemu = (memcmp(marker, "SeaBIOS", 8) == 0);
+
+	if (!running_on_qemu)
+		cpu_idle_poll_ctrl(1);
+
+	return 0;
+}
+arch_initcall(parisc_idle_init);
+
+/*
  * Copy architecture-specific thread state
  */
 int
diff --git a/arch/parisc/mm/init.c b/arch/parisc/mm/init.c
index 13f7854..48f4139 100644
--- a/arch/parisc/mm/init.c
+++ b/arch/parisc/mm/init.c
@@ -631,11 +631,11 @@ void __init mem_init(void)
 	mem_init_print_info(NULL);
 #ifdef CONFIG_DEBUG_KERNEL /* double-sanity-check paranoia */
 	printk("virtual kernel memory layout:\n"
-	       "    vmalloc : 0x%p - 0x%p   (%4ld MB)\n"
-	       "    memory  : 0x%p - 0x%p   (%4ld MB)\n"
-	       "      .init : 0x%p - 0x%p   (%4ld kB)\n"
-	       "      .data : 0x%p - 0x%p   (%4ld kB)\n"
-	       "      .text : 0x%p - 0x%p   (%4ld kB)\n",
+	       "    vmalloc : 0x%px - 0x%px   (%4ld MB)\n"
+	       "    memory  : 0x%px - 0x%px   (%4ld MB)\n"
+	       "      .init : 0x%px - 0x%px   (%4ld kB)\n"
+	       "      .data : 0x%px - 0x%px   (%4ld kB)\n"
+	       "      .text : 0x%px - 0x%px   (%4ld kB)\n",
 
 	       (void*)VMALLOC_START, (void*)VMALLOC_END,
 	       (VMALLOC_END - VMALLOC_START) >> 20,
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index c51e6ce..2ed525a 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -166,6 +166,7 @@
 	select GENERIC_CLOCKEVENTS_BROADCAST	if SMP
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
+	select GENERIC_CPU_VULNERABILITIES	if PPC_BOOK3S_64
 	select GENERIC_IRQ_SHOW
 	select GENERIC_IRQ_SHOW_LEVEL
 	select GENERIC_SMP_IDLE_THREAD
diff --git a/arch/powerpc/configs/fsl-emb-nonhw.config b/arch/powerpc/configs/fsl-emb-nonhw.config
index cc49c95..e0567dc 100644
--- a/arch/powerpc/configs/fsl-emb-nonhw.config
+++ b/arch/powerpc/configs/fsl-emb-nonhw.config
@@ -71,7 +71,6 @@
 CONFIG_IP_ROUTE_VERBOSE=y
 CONFIG_IP_SCTP=m
 CONFIG_IPV6=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_ISO9660_FS=m
 CONFIG_JFFS2_FS_DEBUG=1
 CONFIG_JFFS2_FS=y
diff --git a/arch/powerpc/configs/powernv_defconfig b/arch/powerpc/configs/powernv_defconfig
index 4891bbe..73dab7a 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -4,7 +4,6 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_AUDIT=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_TASKSTATS=y
diff --git a/arch/powerpc/configs/ppc64_defconfig b/arch/powerpc/configs/ppc64_defconfig
index 6ddca80..5033e63 100644
--- a/arch/powerpc/configs/ppc64_defconfig
+++ b/arch/powerpc/configs/ppc64_defconfig
@@ -1,7 +1,6 @@
 CONFIG_PPC64=y
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_TASKSTATS=y
diff --git a/arch/powerpc/configs/pseries_defconfig b/arch/powerpc/configs/pseries_defconfig
index bde2cd1..0dd5cf7 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -3,7 +3,6 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_AUDIT=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_TASKSTATS=y
diff --git a/arch/powerpc/include/asm/exception-64e.h b/arch/powerpc/include/asm/exception-64e.h
index a703452..555e22d 100644
--- a/arch/powerpc/include/asm/exception-64e.h
+++ b/arch/powerpc/include/asm/exception-64e.h
@@ -209,5 +209,11 @@ exc_##label##_book3e:
 	ori	r3,r3,vector_offset@l;		\
 	mtspr	SPRN_IVOR##vector_number,r3;
 
+#define RFI_TO_KERNEL							\
+	rfi
+
+#define RFI_TO_USER							\
+	rfi
+
 #endif /* _ASM_POWERPC_EXCEPTION_64E_H */
 
diff --git a/arch/powerpc/include/asm/exception-64s.h b/arch/powerpc/include/asm/exception-64s.h
index b272052..7197b17 100644
--- a/arch/powerpc/include/asm/exception-64s.h
+++ b/arch/powerpc/include/asm/exception-64s.h
@@ -74,6 +74,59 @@
  */
 #define EX_R3		EX_DAR
 
+/*
+ * Macros for annotating the expected destination of (h)rfid
+ *
+ * The nop instructions allow us to insert one or more instructions to flush the
+ * L1-D cache when returning to userspace or a guest.
+ */
+#define RFI_FLUSH_SLOT							\
+	RFI_FLUSH_FIXUP_SECTION;					\
+	nop;								\
+	nop;								\
+	nop
+
+#define RFI_TO_KERNEL							\
+	rfid
+
+#define RFI_TO_USER							\
+	RFI_FLUSH_SLOT;							\
+	rfid;								\
+	b	rfi_flush_fallback
+
+#define RFI_TO_USER_OR_KERNEL						\
+	RFI_FLUSH_SLOT;							\
+	rfid;								\
+	b	rfi_flush_fallback
+
+#define RFI_TO_GUEST							\
+	RFI_FLUSH_SLOT;							\
+	rfid;								\
+	b	rfi_flush_fallback
+
+#define HRFI_TO_KERNEL							\
+	hrfid
+
+#define HRFI_TO_USER							\
+	RFI_FLUSH_SLOT;							\
+	hrfid;								\
+	b	hrfi_flush_fallback
+
+#define HRFI_TO_USER_OR_KERNEL						\
+	RFI_FLUSH_SLOT;							\
+	hrfid;								\
+	b	hrfi_flush_fallback
+
+#define HRFI_TO_GUEST							\
+	RFI_FLUSH_SLOT;							\
+	hrfid;								\
+	b	hrfi_flush_fallback
+
+#define HRFI_TO_UNKNOWN							\
+	RFI_FLUSH_SLOT;							\
+	hrfid;								\
+	b	hrfi_flush_fallback
+
 #ifdef CONFIG_RELOCATABLE
 #define __EXCEPTION_RELON_PROLOG_PSERIES_1(label, h)			\
 	mfspr	r11,SPRN_##h##SRR0;	/* save SRR0 */			\
@@ -218,7 +271,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	mtspr	SPRN_##h##SRR0,r12;					\
 	mfspr	r12,SPRN_##h##SRR1;	/* and SRR1 */			\
 	mtspr	SPRN_##h##SRR1,r10;					\
-	h##rfid;							\
+	h##RFI_TO_KERNEL;						\
 	b	.	/* prevent speculative execution */
 #define EXCEPTION_PROLOG_PSERIES_1(label, h)				\
 	__EXCEPTION_PROLOG_PSERIES_1(label, h)
@@ -232,7 +285,7 @@ END_FTR_SECTION_NESTED(ftr,ftr,943)
 	mtspr	SPRN_##h##SRR0,r12;					\
 	mfspr	r12,SPRN_##h##SRR1;	/* and SRR1 */			\
 	mtspr	SPRN_##h##SRR1,r10;					\
-	h##rfid;							\
+	h##RFI_TO_KERNEL;						\
 	b	.	/* prevent speculative execution */
 
 #define EXCEPTION_PROLOG_PSERIES_1_NORI(label, h)			\
diff --git a/arch/powerpc/include/asm/feature-fixups.h b/arch/powerpc/include/asm/feature-fixups.h
index 8f88f77..1e82eb3 100644
--- a/arch/powerpc/include/asm/feature-fixups.h
+++ b/arch/powerpc/include/asm/feature-fixups.h
@@ -187,7 +187,20 @@ label##3:					       	\
 	FTR_ENTRY_OFFSET label##1b-label##3b;		\
 	.popsection;
 
+#define RFI_FLUSH_FIXUP_SECTION				\
+951:							\
+	.pushsection __rfi_flush_fixup,"a";		\
+	.align 2;					\
+952:							\
+	FTR_ENTRY_OFFSET 951b-952b;			\
+	.popsection;
+
+
 #ifndef __ASSEMBLY__
+#include <linux/types.h>
+
+extern long __start___rfi_flush_fixup, __stop___rfi_flush_fixup;
+
 void apply_feature_fixups(void);
 void setup_feature_keys(void);
 #endif
diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index a409177..eca3f9c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -241,6 +241,7 @@
 #define H_GET_HCA_INFO          0x1B8
 #define H_GET_PERF_COUNT        0x1BC
 #define H_MANAGE_TRACE          0x1C0
+#define H_GET_CPU_CHARACTERISTICS 0x1C8
 #define H_FREE_LOGICAL_LAN_BUFFER 0x1D4
 #define H_QUERY_INT_STATE       0x1E4
 #define H_POLL_PENDING		0x1D8
@@ -330,6 +331,17 @@
 #define H_SIGNAL_SYS_RESET_ALL_OTHERS		-2
 /* >= 0 values are CPU number */
 
+/* H_GET_CPU_CHARACTERISTICS return values */
+#define H_CPU_CHAR_SPEC_BAR_ORI31	(1ull << 63) // IBM bit 0
+#define H_CPU_CHAR_BCCTRL_SERIALISED	(1ull << 62) // IBM bit 1
+#define H_CPU_CHAR_L1D_FLUSH_ORI30	(1ull << 61) // IBM bit 2
+#define H_CPU_CHAR_L1D_FLUSH_TRIG2	(1ull << 60) // IBM bit 3
+#define H_CPU_CHAR_L1D_THREAD_PRIV	(1ull << 59) // IBM bit 4
+
+#define H_CPU_BEHAV_FAVOUR_SECURITY	(1ull << 63) // IBM bit 0
+#define H_CPU_BEHAV_L1D_FLUSH_PR	(1ull << 62) // IBM bit 1
+#define H_CPU_BEHAV_BNDS_CHK_SPEC_BAR	(1ull << 61) // IBM bit 2
+
 /* Flag values used in H_REGISTER_PROC_TBL hcall */
 #define PROC_TABLE_OP_MASK	0x18
 #define PROC_TABLE_DEREG	0x10
@@ -341,6 +353,7 @@
 #define PROC_TABLE_GTSE		0x01
 
 #ifndef __ASSEMBLY__
+#include <linux/types.h>
 
 /**
  * plpar_hcall_norets: - Make a pseries hypervisor call with no return arguments
@@ -436,6 +449,11 @@ static inline unsigned int get_longbusy_msecs(int longbusy_rc)
 	}
 }
 
+struct h_cpu_char_result {
+	u64 character;
+	u64 behaviour;
+};
+
 #endif /* __ASSEMBLY__ */
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_HVCALL_H */
diff --git a/arch/powerpc/include/asm/paca.h b/arch/powerpc/include/asm/paca.h
index 3892db9..23ac7fc 100644
--- a/arch/powerpc/include/asm/paca.h
+++ b/arch/powerpc/include/asm/paca.h
@@ -232,6 +232,16 @@ struct paca_struct {
 	struct sibling_subcore_state *sibling_subcore_state;
 #endif
 #endif
+#ifdef CONFIG_PPC_BOOK3S_64
+	/*
+	 * rfi fallback flush must be in its own cacheline to prevent
+	 * other paca data leaking into the L1d
+	 */
+	u64 exrfi[EX_SIZE] __aligned(0x80);
+	void *rfi_flush_fallback_area;
+	u64 l1d_flush_congruence;
+	u64 l1d_flush_sets;
+#endif
 };
 
 extern void copy_mm_to_paca(struct mm_struct *mm);
diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index 7f01b22..55eddf5 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -326,4 +326,18 @@ static inline long plapr_signal_sys_reset(long cpu)
 	return plpar_hcall_norets(H_SIGNAL_SYS_RESET, cpu);
 }
 
+static inline long plpar_get_cpu_characteristics(struct h_cpu_char_result *p)
+{
+	unsigned long retbuf[PLPAR_HCALL_BUFSIZE];
+	long rc;
+
+	rc = plpar_hcall(H_GET_CPU_CHARACTERISTICS, retbuf);
+	if (rc == H_SUCCESS) {
+		p->character = retbuf[0];
+		p->behaviour = retbuf[1];
+	}
+
+	return rc;
+}
+
 #endif /* _ASM_POWERPC_PLPAR_WRAPPERS_H */
diff --git a/arch/powerpc/include/asm/setup.h b/arch/powerpc/include/asm/setup.h
index cf00ec2..469b7fd 100644
--- a/arch/powerpc/include/asm/setup.h
+++ b/arch/powerpc/include/asm/setup.h
@@ -39,6 +39,19 @@ static inline void pseries_big_endian_exceptions(void) {}
 static inline void pseries_little_endian_exceptions(void) {}
 #endif /* CONFIG_PPC_PSERIES */
 
+void rfi_flush_enable(bool enable);
+
+/* These are bit flags */
+enum l1d_flush_type {
+	L1D_FLUSH_NONE		= 0x1,
+	L1D_FLUSH_FALLBACK	= 0x2,
+	L1D_FLUSH_ORI		= 0x4,
+	L1D_FLUSH_MTTRIG	= 0x8,
+};
+
+void __init setup_rfi_flush(enum l1d_flush_type, bool enable);
+void do_rfi_flush_fixups(enum l1d_flush_type types);
+
 #endif /* !__ASSEMBLY__ */
 
 #endif	/* _ASM_POWERPC_SETUP_H */
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index a264c3a..4a12c00 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -58,9 +58,6 @@ struct thread_info {
 	.flags =	0,			\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 #define THREAD_SIZE_ORDER	(THREAD_SHIFT - PAGE_SHIFT)
 
 /* how to get the thread information struct from C */
diff --git a/arch/powerpc/include/uapi/asm/kvm.h b/arch/powerpc/include/uapi/asm/kvm.h
index 61d6049..637b726 100644
--- a/arch/powerpc/include/uapi/asm/kvm.h
+++ b/arch/powerpc/include/uapi/asm/kvm.h
@@ -443,6 +443,31 @@ struct kvm_ppc_rmmu_info {
 	__u32	ap_encodings[8];
 };
 
+/* For KVM_PPC_GET_CPU_CHAR */
+struct kvm_ppc_cpu_char {
+	__u64	character;		/* characteristics of the CPU */
+	__u64	behaviour;		/* recommended software behaviour */
+	__u64	character_mask;		/* valid bits in character */
+	__u64	behaviour_mask;		/* valid bits in behaviour */
+};
+
+/*
+ * Values for character and character_mask.
+ * These are identical to the values used by H_GET_CPU_CHARACTERISTICS.
+ */
+#define KVM_PPC_CPU_CHAR_SPEC_BAR_ORI31		(1ULL << 63)
+#define KVM_PPC_CPU_CHAR_BCCTRL_SERIALISED	(1ULL << 62)
+#define KVM_PPC_CPU_CHAR_L1D_FLUSH_ORI30	(1ULL << 61)
+#define KVM_PPC_CPU_CHAR_L1D_FLUSH_TRIG2	(1ULL << 60)
+#define KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV	(1ULL << 59)
+#define KVM_PPC_CPU_CHAR_BR_HINT_HONOURED	(1ULL << 58)
+#define KVM_PPC_CPU_CHAR_MTTRIG_THR_RECONF	(1ULL << 57)
+#define KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS	(1ULL << 56)
+
+#define KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY	(1ULL << 63)
+#define KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR		(1ULL << 62)
+#define KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR	(1ULL << 61)
+
 /* Per-vcpu XICS interrupt controller state */
 #define KVM_REG_PPC_ICP_STATE	(KVM_REG_PPC | KVM_REG_SIZE_U64 | 0x8c)
 
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 6b95841..f390d57 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -237,6 +237,11 @@ int main(void)
 	OFFSET(PACA_NMI_EMERG_SP, paca_struct, nmi_emergency_sp);
 	OFFSET(PACA_IN_MCE, paca_struct, in_mce);
 	OFFSET(PACA_IN_NMI, paca_struct, in_nmi);
+	OFFSET(PACA_RFI_FLUSH_FALLBACK_AREA, paca_struct, rfi_flush_fallback_area);
+	OFFSET(PACA_EXRFI, paca_struct, exrfi);
+	OFFSET(PACA_L1D_FLUSH_CONGRUENCE, paca_struct, l1d_flush_congruence);
+	OFFSET(PACA_L1D_FLUSH_SETS, paca_struct, l1d_flush_sets);
+
 #endif
 	OFFSET(PACAHWCPUID, paca_struct, hw_cpu_id);
 	OFFSET(PACAKEXECSTATE, paca_struct, kexec_state);
diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 3320bca..2748584 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -37,6 +37,11 @@
 #include <asm/tm.h>
 #include <asm/ppc-opcode.h>
 #include <asm/export.h>
+#ifdef CONFIG_PPC_BOOK3S
+#include <asm/exception-64s.h>
+#else
+#include <asm/exception-64e.h>
+#endif
 
 /*
  * System calls.
@@ -262,13 +267,23 @@
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 
 	ld	r13,GPR13(r1)	/* only restore r13 if returning to usermode */
+	ld	r2,GPR2(r1)
+	ld	r1,GPR1(r1)
+	mtlr	r4
+	mtcr	r5
+	mtspr	SPRN_SRR0,r7
+	mtspr	SPRN_SRR1,r8
+	RFI_TO_USER
+	b	.	/* prevent speculative execution */
+
+	/* exit to kernel */
 1:	ld	r2,GPR2(r1)
 	ld	r1,GPR1(r1)
 	mtlr	r4
 	mtcr	r5
 	mtspr	SPRN_SRR0,r7
 	mtspr	SPRN_SRR1,r8
-	RFI
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */
 
 .Lsyscall_error:
@@ -397,8 +412,7 @@
 	mtmsrd	r10, 1
 	mtspr	SPRN_SRR0, r11
 	mtspr	SPRN_SRR1, r12
-
-	rfid
+	RFI_TO_USER
 	b	.	/* prevent speculative execution */
 #endif
 _ASM_NOKPROBE_SYMBOL(system_call_common);
@@ -878,7 +892,7 @@
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	ACCOUNT_CPU_USER_EXIT(r13, r2, r4)
 	REST_GPR(13, r1)
-1:
+
 	mtspr	SPRN_SRR1,r3
 
 	ld	r2,_CCR(r1)
@@ -891,8 +905,22 @@
 	ld	r3,GPR3(r1)
 	ld	r4,GPR4(r1)
 	ld	r1,GPR1(r1)
+	RFI_TO_USER
+	b	.	/* prevent speculative execution */
 
-	rfid
+1:	mtspr	SPRN_SRR1,r3
+
+	ld	r2,_CCR(r1)
+	mtcrf	0xFF,r2
+	ld	r2,_NIP(r1)
+	mtspr	SPRN_SRR0,r2
+
+	ld	r0,GPR0(r1)
+	ld	r2,GPR2(r1)
+	ld	r3,GPR3(r1)
+	ld	r4,GPR4(r1)
+	ld	r1,GPR1(r1)
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */
 
 #endif /* CONFIG_PPC_BOOK3E */
@@ -1073,7 +1101,7 @@
 	
 	mtspr	SPRN_SRR0,r5
 	mtspr	SPRN_SRR1,r6
-	rfid
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */
 
 rtas_return_loc:
@@ -1098,7 +1126,7 @@
 
 	mtspr	SPRN_SRR0,r3
 	mtspr	SPRN_SRR1,r4
-	rfid
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */
 _ASM_NOKPROBE_SYMBOL(__enter_rtas)
 _ASM_NOKPROBE_SYMBOL(rtas_return_loc)
@@ -1171,7 +1199,7 @@
 	LOAD_REG_IMMEDIATE(r12, MSR_SF | MSR_ISF | MSR_LE)
 	andc	r11,r11,r12
 	mtsrr1	r11
-	rfid
+	RFI_TO_KERNEL
 #endif /* CONFIG_PPC_BOOK3E */
 
 1:	/* Return from OF */
diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index e441b46..2dc10bf 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -256,7 +256,7 @@
 	LOAD_HANDLER(r12, machine_check_handle_early)
 1:	mtspr	SPRN_SRR0,r12
 	mtspr	SPRN_SRR1,r11
-	rfid
+	RFI_TO_KERNEL
 	b	.	/* prevent speculative execution */
 2:
 	/* Stack overflow. Stay on emergency stack and panic.
@@ -445,7 +445,7 @@
 	li	r3,MSR_ME
 	andc	r10,r10,r3		/* Turn off MSR_ME */
 	mtspr	SPRN_SRR1,r10
-	rfid
+	RFI_TO_KERNEL
 	b	.
 2:
 	/*
@@ -463,7 +463,7 @@
 	 */
 	bl	machine_check_queue_event
 	MACHINE_CHECK_HANDLER_WINDUP
-	rfid
+	RFI_TO_USER_OR_KERNEL
 9:
 	/* Deliver the machine check to host kernel in V mode. */
 	MACHINE_CHECK_HANDLER_WINDUP
@@ -598,6 +598,9 @@
 	stw	r9,PACA_EXSLB+EX_CCR(r13)	/* save CR in exc. frame */
 	std	r10,PACA_EXSLB+EX_LR(r13)	/* save LR */
 
+	andi.	r9,r11,MSR_PR	// Check for exception from userspace
+	cmpdi	cr4,r9,MSR_PR	// And save the result in CR4 for later
+
 	/*
 	 * Test MSR_RI before calling slb_allocate_realmode, because the
 	 * MSR in r11 gets clobbered. However we still want to allocate
@@ -624,9 +627,12 @@
 
 	/* All done -- return from exception. */
 
+	bne	cr4,1f		/* returning to kernel */
+
 .machine	push
 .machine	"power4"
 	mtcrf	0x80,r9
+	mtcrf	0x08,r9		/* MSR[PR] indication is in cr4 */
 	mtcrf	0x04,r9		/* MSR[RI] indication is in cr5 */
 	mtcrf	0x02,r9		/* I/D indication is in cr6 */
 	mtcrf	0x01,r9		/* slb_allocate uses cr0 and cr7 */
@@ -640,8 +646,29 @@
 	ld	r11,PACA_EXSLB+EX_R11(r13)
 	ld	r12,PACA_EXSLB+EX_R12(r13)
 	ld	r13,PACA_EXSLB+EX_R13(r13)
-	rfid
+	RFI_TO_USER
 	b	.	/* prevent speculative execution */
+1:
+.machine	push
+.machine	"power4"
+	mtcrf	0x80,r9
+	mtcrf	0x08,r9		/* MSR[PR] indication is in cr4 */
+	mtcrf	0x04,r9		/* MSR[RI] indication is in cr5 */
+	mtcrf	0x02,r9		/* I/D indication is in cr6 */
+	mtcrf	0x01,r9		/* slb_allocate uses cr0 and cr7 */
+.machine	pop
+
+	RESTORE_CTR(r9, PACA_EXSLB)
+	RESTORE_PPR_PACA(PACA_EXSLB, r9)
+	mr	r3,r12
+	ld	r9,PACA_EXSLB+EX_R9(r13)
+	ld	r10,PACA_EXSLB+EX_R10(r13)
+	ld	r11,PACA_EXSLB+EX_R11(r13)
+	ld	r12,PACA_EXSLB+EX_R12(r13)
+	ld	r13,PACA_EXSLB+EX_R13(r13)
+	RFI_TO_KERNEL
+	b	.	/* prevent speculative execution */
+
 
 2:	std     r3,PACA_EXSLB+EX_DAR(r13)
 	mr	r3,r12
@@ -651,7 +678,7 @@
 	mtspr	SPRN_SRR0,r10
 	ld	r10,PACAKMSR(r13)
 	mtspr	SPRN_SRR1,r10
-	rfid
+	RFI_TO_KERNEL
 	b	.
 
 8:	std     r3,PACA_EXSLB+EX_DAR(r13)
@@ -662,7 +689,7 @@
 	mtspr	SPRN_SRR0,r10
 	ld	r10,PACAKMSR(r13)
 	mtspr	SPRN_SRR1,r10
-	rfid
+	RFI_TO_KERNEL
 	b	.
 
 EXC_COMMON_BEGIN(unrecov_slb)
@@ -901,7 +928,7 @@
 	mtspr	SPRN_SRR0,r10 ; 				\
 	ld	r10,PACAKMSR(r13) ;				\
 	mtspr	SPRN_SRR1,r10 ; 				\
-	rfid ; 							\
+	RFI_TO_KERNEL ;						\
 	b	. ;	/* prevent speculative execution */
 
 #ifdef CONFIG_PPC_FAST_ENDIAN_SWITCH
@@ -917,7 +944,7 @@
 	xori	r12,r12,MSR_LE ;				\
 	mtspr	SPRN_SRR1,r12 ;					\
 	mr	r13,r9 ;					\
-	rfid ;		/* return to userspace */		\
+	RFI_TO_USER ;	/* return to userspace */		\
 	b	. ;	/* prevent speculative execution */
 #else
 #define SYSCALL_FASTENDIAN_TEST
@@ -1063,7 +1090,7 @@
 	mtcr	r11
 	REST_GPR(11, r1)
 	ld	r1,GPR1(r1)
-	hrfid
+	HRFI_TO_USER_OR_KERNEL
 
 1:	mtcr	r11
 	REST_GPR(11, r1)
@@ -1314,7 +1341,7 @@
 	ld	r11,PACA_EXGEN+EX_R11(r13)
 	ld	r12,PACA_EXGEN+EX_R12(r13)
 	ld	r13,PACA_EXGEN+EX_R13(r13)
-	HRFID
+	HRFI_TO_UNKNOWN
 	b	.
 #endif
 
@@ -1418,10 +1445,94 @@
 	ld	r10,PACA_EXGEN+EX_R10(r13);		\
 	ld	r11,PACA_EXGEN+EX_R11(r13);		\
 	/* returns to kernel where r13 must be set up, so don't restore it */ \
-	##_H##rfid;					\
+	##_H##RFI_TO_KERNEL;				\
 	b	.;					\
 	MASKED_DEC_HANDLER(_H)
 
+TRAMP_REAL_BEGIN(rfi_flush_fallback)
+	SET_SCRATCH0(r13);
+	GET_PACA(r13);
+	std	r9,PACA_EXRFI+EX_R9(r13)
+	std	r10,PACA_EXRFI+EX_R10(r13)
+	std	r11,PACA_EXRFI+EX_R11(r13)
+	std	r12,PACA_EXRFI+EX_R12(r13)
+	std	r8,PACA_EXRFI+EX_R13(r13)
+	mfctr	r9
+	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+	ld	r11,PACA_L1D_FLUSH_SETS(r13)
+	ld	r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
+	/*
+	 * The load adresses are at staggered offsets within cachelines,
+	 * which suits some pipelines better (on others it should not
+	 * hurt).
+	 */
+	addi	r12,r12,8
+	mtctr	r11
+	DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+	/* order ld/st prior to dcbt stop all streams with flushing */
+	sync
+1:	li	r8,0
+	.rept	8 /* 8-way set associative */
+	ldx	r11,r10,r8
+	add	r8,r8,r12
+	xor	r11,r11,r11	// Ensure r11 is 0 even if fallback area is not
+	add	r8,r8,r11	// Add 0, this creates a dependency on the ldx
+	.endr
+	addi	r10,r10,128 /* 128 byte cache line */
+	bdnz	1b
+
+	mtctr	r9
+	ld	r9,PACA_EXRFI+EX_R9(r13)
+	ld	r10,PACA_EXRFI+EX_R10(r13)
+	ld	r11,PACA_EXRFI+EX_R11(r13)
+	ld	r12,PACA_EXRFI+EX_R12(r13)
+	ld	r8,PACA_EXRFI+EX_R13(r13)
+	GET_SCRATCH0(r13);
+	rfid
+
+TRAMP_REAL_BEGIN(hrfi_flush_fallback)
+	SET_SCRATCH0(r13);
+	GET_PACA(r13);
+	std	r9,PACA_EXRFI+EX_R9(r13)
+	std	r10,PACA_EXRFI+EX_R10(r13)
+	std	r11,PACA_EXRFI+EX_R11(r13)
+	std	r12,PACA_EXRFI+EX_R12(r13)
+	std	r8,PACA_EXRFI+EX_R13(r13)
+	mfctr	r9
+	ld	r10,PACA_RFI_FLUSH_FALLBACK_AREA(r13)
+	ld	r11,PACA_L1D_FLUSH_SETS(r13)
+	ld	r12,PACA_L1D_FLUSH_CONGRUENCE(r13)
+	/*
+	 * The load adresses are at staggered offsets within cachelines,
+	 * which suits some pipelines better (on others it should not
+	 * hurt).
+	 */
+	addi	r12,r12,8
+	mtctr	r11
+	DCBT_STOP_ALL_STREAM_IDS(r11) /* Stop prefetch streams */
+
+	/* order ld/st prior to dcbt stop all streams with flushing */
+	sync
+1:	li	r8,0
+	.rept	8 /* 8-way set associative */
+	ldx	r11,r10,r8
+	add	r8,r8,r12
+	xor	r11,r11,r11	// Ensure r11 is 0 even if fallback area is not
+	add	r8,r8,r11	// Add 0, this creates a dependency on the ldx
+	.endr
+	addi	r10,r10,128 /* 128 byte cache line */
+	bdnz	1b
+
+	mtctr	r9
+	ld	r9,PACA_EXRFI+EX_R9(r13)
+	ld	r10,PACA_EXRFI+EX_R10(r13)
+	ld	r11,PACA_EXRFI+EX_R11(r13)
+	ld	r12,PACA_EXRFI+EX_R12(r13)
+	ld	r8,PACA_EXRFI+EX_R13(r13)
+	GET_SCRATCH0(r13);
+	hrfid
+
 /*
  * Real mode exceptions actually use this too, but alternate
  * instruction code patches (which end up in the common .text area)
@@ -1441,7 +1552,7 @@
 	addi	r13, r13, 4
 	mtspr	SPRN_SRR0, r13
 	GET_SCRATCH0(r13)
-	rfid
+	RFI_TO_KERNEL
 	b	.
 
 TRAMP_REAL_BEGIN(kvmppc_skip_Hinterrupt)
@@ -1453,7 +1564,7 @@
 	addi	r13, r13, 4
 	mtspr	SPRN_HSRR0, r13
 	GET_SCRATCH0(r13)
-	hrfid
+	HRFI_TO_KERNEL
 	b	.
 #endif
 
diff --git a/arch/powerpc/kernel/mce.c b/arch/powerpc/kernel/mce.c
index 742e465..71e8a1b 100644
--- a/arch/powerpc/kernel/mce.c
+++ b/arch/powerpc/kernel/mce.c
@@ -273,7 +273,7 @@ static void machine_process_ue_event(struct work_struct *work)
 
 				pfn = evt->u.ue_error.physical_address >>
 					PAGE_SHIFT;
-				memory_failure(pfn, SIGBUS, 0);
+				memory_failure(pfn, 0);
 			} else
 				pr_warn("Failed to identify bad address from "
 					"where the uncorrectable error (UE) "
diff --git a/arch/powerpc/kernel/setup-common.c b/arch/powerpc/kernel/setup-common.c
index 9d21354..8fd3a70 100644
--- a/arch/powerpc/kernel/setup-common.c
+++ b/arch/powerpc/kernel/setup-common.c
@@ -242,14 +242,6 @@ static int show_cpuinfo(struct seq_file *m, void *v)
 	unsigned short maj;
 	unsigned short min;
 
-	/* We only show online cpus: disable preempt (overzealous, I
-	 * knew) to prevent cpu going down. */
-	preempt_disable();
-	if (!cpu_online(cpu_id)) {
-		preempt_enable();
-		return 0;
-	}
-
 #ifdef CONFIG_SMP
 	pvr = per_cpu(cpu_pvr, cpu_id);
 #else
@@ -358,9 +350,6 @@ static int show_cpuinfo(struct seq_file *m, void *v)
 #ifdef CONFIG_SMP
 	seq_printf(m, "\n");
 #endif
-
-	preempt_enable();
-
 	/* If this is the last cpu, print the summary */
 	if (cpumask_next(cpu_id, cpu_online_mask) >= nr_cpu_ids)
 		show_cpuinfo_summary(m);
diff --git a/arch/powerpc/kernel/setup_64.c b/arch/powerpc/kernel/setup_64.c
index 8956a98..e67413f 100644
--- a/arch/powerpc/kernel/setup_64.c
+++ b/arch/powerpc/kernel/setup_64.c
@@ -38,6 +38,7 @@
 #include <linux/memory.h>
 #include <linux/nmi.h>
 
+#include <asm/debugfs.h>
 #include <asm/io.h>
 #include <asm/kdump.h>
 #include <asm/prom.h>
@@ -801,3 +802,141 @@ static int __init disable_hardlockup_detector(void)
 	return 0;
 }
 early_initcall(disable_hardlockup_detector);
+
+#ifdef CONFIG_PPC_BOOK3S_64
+static enum l1d_flush_type enabled_flush_types;
+static void *l1d_flush_fallback_area;
+static bool no_rfi_flush;
+bool rfi_flush;
+
+static int __init handle_no_rfi_flush(char *p)
+{
+	pr_info("rfi-flush: disabled on command line.");
+	no_rfi_flush = true;
+	return 0;
+}
+early_param("no_rfi_flush", handle_no_rfi_flush);
+
+/*
+ * The RFI flush is not KPTI, but because users will see doco that says to use
+ * nopti we hijack that option here to also disable the RFI flush.
+ */
+static int __init handle_no_pti(char *p)
+{
+	pr_info("rfi-flush: disabling due to 'nopti' on command line.\n");
+	handle_no_rfi_flush(NULL);
+	return 0;
+}
+early_param("nopti", handle_no_pti);
+
+static void do_nothing(void *unused)
+{
+	/*
+	 * We don't need to do the flush explicitly, just enter+exit kernel is
+	 * sufficient, the RFI exit handlers will do the right thing.
+	 */
+}
+
+void rfi_flush_enable(bool enable)
+{
+	if (rfi_flush == enable)
+		return;
+
+	if (enable) {
+		do_rfi_flush_fixups(enabled_flush_types);
+		on_each_cpu(do_nothing, NULL, 1);
+	} else
+		do_rfi_flush_fixups(L1D_FLUSH_NONE);
+
+	rfi_flush = enable;
+}
+
+static void init_fallback_flush(void)
+{
+	u64 l1d_size, limit;
+	int cpu;
+
+	l1d_size = ppc64_caches.l1d.size;
+	limit = min(safe_stack_limit(), ppc64_rma_size);
+
+	/*
+	 * Align to L1d size, and size it at 2x L1d size, to catch possible
+	 * hardware prefetch runoff. We don't have a recipe for load patterns to
+	 * reliably avoid the prefetcher.
+	 */
+	l1d_flush_fallback_area = __va(memblock_alloc_base(l1d_size * 2, l1d_size, limit));
+	memset(l1d_flush_fallback_area, 0, l1d_size * 2);
+
+	for_each_possible_cpu(cpu) {
+		/*
+		 * The fallback flush is currently coded for 8-way
+		 * associativity. Different associativity is possible, but it
+		 * will be treated as 8-way and may not evict the lines as
+		 * effectively.
+		 *
+		 * 128 byte lines are mandatory.
+		 */
+		u64 c = l1d_size / 8;
+
+		paca[cpu].rfi_flush_fallback_area = l1d_flush_fallback_area;
+		paca[cpu].l1d_flush_congruence = c;
+		paca[cpu].l1d_flush_sets = c / 128;
+	}
+}
+
+void __init setup_rfi_flush(enum l1d_flush_type types, bool enable)
+{
+	if (types & L1D_FLUSH_FALLBACK) {
+		pr_info("rfi-flush: Using fallback displacement flush\n");
+		init_fallback_flush();
+	}
+
+	if (types & L1D_FLUSH_ORI)
+		pr_info("rfi-flush: Using ori type flush\n");
+
+	if (types & L1D_FLUSH_MTTRIG)
+		pr_info("rfi-flush: Using mttrig type flush\n");
+
+	enabled_flush_types = types;
+
+	if (!no_rfi_flush)
+		rfi_flush_enable(enable);
+}
+
+#ifdef CONFIG_DEBUG_FS
+static int rfi_flush_set(void *data, u64 val)
+{
+	if (val == 1)
+		rfi_flush_enable(true);
+	else if (val == 0)
+		rfi_flush_enable(false);
+	else
+		return -EINVAL;
+
+	return 0;
+}
+
+static int rfi_flush_get(void *data, u64 *val)
+{
+	*val = rfi_flush ? 1 : 0;
+	return 0;
+}
+
+DEFINE_SIMPLE_ATTRIBUTE(fops_rfi_flush, rfi_flush_get, rfi_flush_set, "%llu\n");
+
+static __init int rfi_flush_debugfs_init(void)
+{
+	debugfs_create_file("rfi_flush", 0600, powerpc_debugfs_root, NULL, &fops_rfi_flush);
+	return 0;
+}
+device_initcall(rfi_flush_debugfs_init);
+#endif
+
+ssize_t cpu_show_meltdown(struct device *dev, struct device_attribute *attr, char *buf)
+{
+	if (rfi_flush)
+		return sprintf(buf, "Mitigation: RFI Flush\n");
+
+	return sprintf(buf, "Vulnerable\n");
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
diff --git a/arch/powerpc/kernel/vmlinux.lds.S b/arch/powerpc/kernel/vmlinux.lds.S
index 0494e15..307843d 100644
--- a/arch/powerpc/kernel/vmlinux.lds.S
+++ b/arch/powerpc/kernel/vmlinux.lds.S
@@ -132,6 +132,15 @@
 	/* Read-only data */
 	RO_DATA(PAGE_SIZE)
 
+#ifdef CONFIG_PPC64
+	. = ALIGN(8);
+	__rfi_flush_fixup : AT(ADDR(__rfi_flush_fixup) - LOAD_OFFSET) {
+		__start___rfi_flush_fixup = .;
+		*(__rfi_flush_fixup)
+		__stop___rfi_flush_fixup = .;
+	}
+#endif
+
 	EXCEPTION_TABLE(0)
 
 	NOTES :kernel :notes
diff --git a/arch/powerpc/kvm/book3s_64_mmu.c b/arch/powerpc/kvm/book3s_64_mmu.c
index 29ebe2f..a93d719 100644
--- a/arch/powerpc/kvm/book3s_64_mmu.c
+++ b/arch/powerpc/kvm/book3s_64_mmu.c
@@ -235,6 +235,7 @@ static int kvmppc_mmu_book3s_64_xlate(struct kvm_vcpu *vcpu, gva_t eaddr,
 		gpte->may_read = true;
 		gpte->may_write = true;
 		gpte->page_size = MMU_PAGE_4K;
+		gpte->wimg = HPTE_R_M;
 
 		return 0;
 	}
diff --git a/arch/powerpc/kvm/book3s_64_mmu_hv.c b/arch/powerpc/kvm/book3s_64_mmu_hv.c
index 9660972..b73dbc9 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_hv.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_hv.c
@@ -65,11 +65,17 @@ struct kvm_resize_hpt {
 	u32 order;
 
 	/* These fields protected by kvm->lock */
-	int error;
-	bool prepare_done;
 
-	/* Private to the work thread, until prepare_done is true,
-	 * then protected by kvm->resize_hpt_sem */
+	/* Possible values and their usage:
+	 *  <0     an error occurred during allocation,
+	 *  -EBUSY allocation is in the progress,
+	 *  0      allocation made successfuly.
+	 */
+	int error;
+
+	/* Private to the work thread, until error != -EBUSY,
+	 * then protected by kvm->lock.
+	 */
 	struct kvm_hpt_info hpt;
 };
 
@@ -159,8 +165,6 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order)
 		 * Reset all the reverse-mapping chains for all memslots
 		 */
 		kvmppc_rmap_reset(kvm);
-		/* Ensure that each vcpu will flush its TLB on next entry. */
-		cpumask_setall(&kvm->arch.need_tlb_flush);
 		err = 0;
 		goto out;
 	}
@@ -176,6 +180,10 @@ long kvmppc_alloc_reset_hpt(struct kvm *kvm, int order)
 	kvmppc_set_hpt(kvm, &info);
 
 out:
+	if (err == 0)
+		/* Ensure that each vcpu will flush its TLB on next entry. */
+		cpumask_setall(&kvm->arch.need_tlb_flush);
+
 	mutex_unlock(&kvm->lock);
 	return err;
 }
@@ -1413,16 +1421,20 @@ static void resize_hpt_pivot(struct kvm_resize_hpt *resize)
 
 static void resize_hpt_release(struct kvm *kvm, struct kvm_resize_hpt *resize)
 {
-	BUG_ON(kvm->arch.resize_hpt != resize);
+	if (WARN_ON(!mutex_is_locked(&kvm->lock)))
+		return;
 
 	if (!resize)
 		return;
 
-	if (resize->hpt.virt)
-		kvmppc_free_hpt(&resize->hpt);
+	if (resize->error != -EBUSY) {
+		if (resize->hpt.virt)
+			kvmppc_free_hpt(&resize->hpt);
+		kfree(resize);
+	}
 
-	kvm->arch.resize_hpt = NULL;
-	kfree(resize);
+	if (kvm->arch.resize_hpt == resize)
+		kvm->arch.resize_hpt = NULL;
 }
 
 static void resize_hpt_prepare_work(struct work_struct *work)
@@ -1431,17 +1443,41 @@ static void resize_hpt_prepare_work(struct work_struct *work)
 						     struct kvm_resize_hpt,
 						     work);
 	struct kvm *kvm = resize->kvm;
-	int err;
+	int err = 0;
 
-	resize_hpt_debug(resize, "resize_hpt_prepare_work(): order = %d\n",
-			 resize->order);
-
-	err = resize_hpt_allocate(resize);
+	if (WARN_ON(resize->error != -EBUSY))
+		return;
 
 	mutex_lock(&kvm->lock);
 
+	/* Request is still current? */
+	if (kvm->arch.resize_hpt == resize) {
+		/* We may request large allocations here:
+		 * do not sleep with kvm->lock held for a while.
+		 */
+		mutex_unlock(&kvm->lock);
+
+		resize_hpt_debug(resize, "resize_hpt_prepare_work(): order = %d\n",
+				 resize->order);
+
+		err = resize_hpt_allocate(resize);
+
+		/* We have strict assumption about -EBUSY
+		 * when preparing for HPT resize.
+		 */
+		if (WARN_ON(err == -EBUSY))
+			err = -EINPROGRESS;
+
+		mutex_lock(&kvm->lock);
+		/* It is possible that kvm->arch.resize_hpt != resize
+		 * after we grab kvm->lock again.
+		 */
+	}
+
 	resize->error = err;
-	resize->prepare_done = true;
+
+	if (kvm->arch.resize_hpt != resize)
+		resize_hpt_release(kvm, resize);
 
 	mutex_unlock(&kvm->lock);
 }
@@ -1466,14 +1502,12 @@ long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm,
 
 	if (resize) {
 		if (resize->order == shift) {
-			/* Suitable resize in progress */
-			if (resize->prepare_done) {
-				ret = resize->error;
-				if (ret != 0)
-					resize_hpt_release(kvm, resize);
-			} else {
+			/* Suitable resize in progress? */
+			ret = resize->error;
+			if (ret == -EBUSY)
 				ret = 100; /* estimated time in ms */
-			}
+			else if (ret)
+				resize_hpt_release(kvm, resize);
 
 			goto out;
 		}
@@ -1493,6 +1527,8 @@ long kvm_vm_ioctl_resize_hpt_prepare(struct kvm *kvm,
 		ret = -ENOMEM;
 		goto out;
 	}
+
+	resize->error = -EBUSY;
 	resize->order = shift;
 	resize->kvm = kvm;
 	INIT_WORK(&resize->work, resize_hpt_prepare_work);
@@ -1547,16 +1583,12 @@ long kvm_vm_ioctl_resize_hpt_commit(struct kvm *kvm,
 	if (!resize || (resize->order != shift))
 		goto out;
 
-	ret = -EBUSY;
-	if (!resize->prepare_done)
-		goto out;
-
 	ret = resize->error;
-	if (ret != 0)
+	if (ret)
 		goto out;
 
 	ret = resize_hpt_rehash(resize);
-	if (ret != 0)
+	if (ret)
 		goto out;
 
 	resize_hpt_pivot(resize);
diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
index 2659844..9c61f73 100644
--- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S
@@ -79,7 +79,7 @@
 	mtmsrd	r0,1		/* clear RI in MSR */
 	mtsrr0	r5
 	mtsrr1	r6
-	RFI
+	RFI_TO_KERNEL
 
 kvmppc_call_hv_entry:
 BEGIN_FTR_SECTION
@@ -199,7 +199,7 @@
 	mtmsrd	r6, 1			/* Clear RI in MSR */
 	mtsrr0	r8
 	mtsrr1	r7
-	RFI
+	RFI_TO_KERNEL
 
 	/* Virtual-mode return */
 .Lvirt_return:
@@ -1167,8 +1167,7 @@
 
 	ld	r0, VCPU_GPR(R0)(r4)
 	ld	r4, VCPU_GPR(R4)(r4)
-
-	hrfid
+	HRFI_TO_GUEST
 	b	.
 
 secondary_too_late:
@@ -3320,7 +3319,7 @@
 	ld	r4, PACAKMSR(r13)
 	mtspr	SPRN_SRR0, r3
 	mtspr	SPRN_SRR1, r4
-	rfid
+	RFI_TO_KERNEL
 9:	addi	r3, r1, STACK_FRAME_OVERHEAD
 	bl	kvmppc_bad_interrupt
 	b	9b
diff --git a/arch/powerpc/kvm/book3s_pr.c b/arch/powerpc/kvm/book3s_pr.c
index d0dc862..7deaeeb 100644
--- a/arch/powerpc/kvm/book3s_pr.c
+++ b/arch/powerpc/kvm/book3s_pr.c
@@ -60,6 +60,7 @@ static void kvmppc_giveup_fac(struct kvm_vcpu *vcpu, ulong fac);
 #define MSR_USER32 MSR_USER
 #define MSR_USER64 MSR_USER
 #define HW_PAGE_SIZE PAGE_SIZE
+#define HPTE_R_M   _PAGE_COHERENT
 #endif
 
 static bool kvmppc_is_split_real(struct kvm_vcpu *vcpu)
@@ -557,6 +558,7 @@ int kvmppc_handle_pagefault(struct kvm_run *run, struct kvm_vcpu *vcpu,
 		pte.eaddr = eaddr;
 		pte.vpage = eaddr >> 12;
 		pte.page_size = MMU_PAGE_64K;
+		pte.wimg = HPTE_R_M;
 	}
 
 	switch (kvmppc_get_msr(vcpu) & (MSR_DR|MSR_IR)) {
diff --git a/arch/powerpc/kvm/book3s_rmhandlers.S b/arch/powerpc/kvm/book3s_rmhandlers.S
index 42a4b23..34a5ade 100644
--- a/arch/powerpc/kvm/book3s_rmhandlers.S
+++ b/arch/powerpc/kvm/book3s_rmhandlers.S
@@ -46,6 +46,9 @@
 
 #define FUNC(name)		name
 
+#define RFI_TO_KERNEL	RFI
+#define RFI_TO_GUEST	RFI
+
 .macro INTERRUPT_TRAMPOLINE intno
 
 .global kvmppc_trampoline_\intno
@@ -141,7 +144,7 @@
 	GET_SCRATCH0(r13)
 
 	/* And get back into the code */
-	RFI
+	RFI_TO_KERNEL
 #endif
 
 /*
@@ -164,6 +167,6 @@
 	ori	r5, r5, MSR_EE
 	mtsrr0	r7
 	mtsrr1	r6
-	RFI
+	RFI_TO_KERNEL
 
 #include "book3s_segment.S"
diff --git a/arch/powerpc/kvm/book3s_segment.S b/arch/powerpc/kvm/book3s_segment.S
index 2a2b96d..93a180c 100644
--- a/arch/powerpc/kvm/book3s_segment.S
+++ b/arch/powerpc/kvm/book3s_segment.S
@@ -156,7 +156,7 @@
 	PPC_LL	r9, SVCPU_R9(r3)
 	PPC_LL	r3, (SVCPU_R3)(r3)
 
-	RFI
+	RFI_TO_GUEST
 kvmppc_handler_trampoline_enter_end:
 
 
@@ -407,5 +407,5 @@
 	cmpwi	r12, BOOK3S_INTERRUPT_DOORBELL
 	beqa	BOOK3S_INTERRUPT_DOORBELL
 
-	RFI
+	RFI_TO_KERNEL
 kvmppc_handler_trampoline_exit_end:
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c
index 1915e86..0a7c887 100644
--- a/arch/powerpc/kvm/powerpc.c
+++ b/arch/powerpc/kvm/powerpc.c
@@ -39,6 +39,10 @@
 #include <asm/iommu.h>
 #include <asm/switch_to.h>
 #include <asm/xive.h>
+#ifdef CONFIG_PPC_PSERIES
+#include <asm/hvcall.h>
+#include <asm/plpar_wrappers.h>
+#endif
 
 #include "timing.h"
 #include "irq.h"
@@ -548,6 +552,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 #ifdef CONFIG_KVM_XICS
 	case KVM_CAP_IRQ_XICS:
 #endif
+	case KVM_CAP_PPC_GET_CPU_CHAR:
 		r = 1;
 		break;
 
@@ -1759,6 +1764,124 @@ static int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 	return r;
 }
 
+#ifdef CONFIG_PPC_BOOK3S_64
+/*
+ * These functions check whether the underlying hardware is safe
+ * against attacks based on observing the effects of speculatively
+ * executed instructions, and whether it supplies instructions for
+ * use in workarounds.  The information comes from firmware, either
+ * via the device tree on powernv platforms or from an hcall on
+ * pseries platforms.
+ */
+#ifdef CONFIG_PPC_PSERIES
+static int pseries_get_cpu_char(struct kvm_ppc_cpu_char *cp)
+{
+	struct h_cpu_char_result c;
+	unsigned long rc;
+
+	if (!machine_is(pseries))
+		return -ENOTTY;
+
+	rc = plpar_get_cpu_characteristics(&c);
+	if (rc == H_SUCCESS) {
+		cp->character = c.character;
+		cp->behaviour = c.behaviour;
+		cp->character_mask = KVM_PPC_CPU_CHAR_SPEC_BAR_ORI31 |
+			KVM_PPC_CPU_CHAR_BCCTRL_SERIALISED |
+			KVM_PPC_CPU_CHAR_L1D_FLUSH_ORI30 |
+			KVM_PPC_CPU_CHAR_L1D_FLUSH_TRIG2 |
+			KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV |
+			KVM_PPC_CPU_CHAR_BR_HINT_HONOURED |
+			KVM_PPC_CPU_CHAR_MTTRIG_THR_RECONF |
+			KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS;
+		cp->behaviour_mask = KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY |
+			KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR |
+			KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR;
+	}
+	return 0;
+}
+#else
+static int pseries_get_cpu_char(struct kvm_ppc_cpu_char *cp)
+{
+	return -ENOTTY;
+}
+#endif
+
+static inline bool have_fw_feat(struct device_node *fw_features,
+				const char *state, const char *name)
+{
+	struct device_node *np;
+	bool r = false;
+
+	np = of_get_child_by_name(fw_features, name);
+	if (np) {
+		r = of_property_read_bool(np, state);
+		of_node_put(np);
+	}
+	return r;
+}
+
+static int kvmppc_get_cpu_char(struct kvm_ppc_cpu_char *cp)
+{
+	struct device_node *np, *fw_features;
+	int r;
+
+	memset(cp, 0, sizeof(*cp));
+	r = pseries_get_cpu_char(cp);
+	if (r != -ENOTTY)
+		return r;
+
+	np = of_find_node_by_name(NULL, "ibm,opal");
+	if (np) {
+		fw_features = of_get_child_by_name(np, "fw-features");
+		of_node_put(np);
+		if (!fw_features)
+			return 0;
+		if (have_fw_feat(fw_features, "enabled",
+				 "inst-spec-barrier-ori31,31,0"))
+			cp->character |= KVM_PPC_CPU_CHAR_SPEC_BAR_ORI31;
+		if (have_fw_feat(fw_features, "enabled",
+				 "fw-bcctrl-serialized"))
+			cp->character |= KVM_PPC_CPU_CHAR_BCCTRL_SERIALISED;
+		if (have_fw_feat(fw_features, "enabled",
+				 "inst-l1d-flush-ori30,30,0"))
+			cp->character |= KVM_PPC_CPU_CHAR_L1D_FLUSH_ORI30;
+		if (have_fw_feat(fw_features, "enabled",
+				 "inst-l1d-flush-trig2"))
+			cp->character |= KVM_PPC_CPU_CHAR_L1D_FLUSH_TRIG2;
+		if (have_fw_feat(fw_features, "enabled",
+				 "fw-l1d-thread-split"))
+			cp->character |= KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV;
+		if (have_fw_feat(fw_features, "enabled",
+				 "fw-count-cache-disabled"))
+			cp->character |= KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS;
+		cp->character_mask = KVM_PPC_CPU_CHAR_SPEC_BAR_ORI31 |
+			KVM_PPC_CPU_CHAR_BCCTRL_SERIALISED |
+			KVM_PPC_CPU_CHAR_L1D_FLUSH_ORI30 |
+			KVM_PPC_CPU_CHAR_L1D_FLUSH_TRIG2 |
+			KVM_PPC_CPU_CHAR_L1D_THREAD_PRIV |
+			KVM_PPC_CPU_CHAR_COUNT_CACHE_DIS;
+
+		if (have_fw_feat(fw_features, "enabled",
+				 "speculation-policy-favor-security"))
+			cp->behaviour |= KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY;
+		if (!have_fw_feat(fw_features, "disabled",
+				  "needs-l1d-flush-msr-pr-0-to-1"))
+			cp->behaviour |= KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR;
+		if (!have_fw_feat(fw_features, "disabled",
+				  "needs-spec-barrier-for-bound-checks"))
+			cp->behaviour |= KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR;
+		cp->behaviour_mask = KVM_PPC_CPU_BEHAV_FAVOUR_SECURITY |
+			KVM_PPC_CPU_BEHAV_L1D_FLUSH_PR |
+			KVM_PPC_CPU_BEHAV_BNDS_CHK_SPEC_BAR;
+
+		of_node_put(fw_features);
+	}
+
+	return 0;
+}
+#endif
+
 long kvm_arch_vm_ioctl(struct file *filp,
                        unsigned int ioctl, unsigned long arg)
 {
@@ -1861,6 +1984,14 @@ long kvm_arch_vm_ioctl(struct file *filp,
 			r = -EFAULT;
 		break;
 	}
+	case KVM_PPC_GET_CPU_CHAR: {
+		struct kvm_ppc_cpu_char cpuchar;
+
+		r = kvmppc_get_cpu_char(&cpuchar);
+		if (r >= 0 && copy_to_user(argp, &cpuchar, sizeof(cpuchar)))
+			r = -EFAULT;
+		break;
+	}
 	default: {
 		struct kvm *kvm = filp->private_data;
 		r = kvm->arch.kvm_ops->arch_vm_ioctl(filp, ioctl, arg);
diff --git a/arch/powerpc/lib/feature-fixups.c b/arch/powerpc/lib/feature-fixups.c
index 41cf5ae..a95ea00 100644
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -116,6 +116,47 @@ void do_feature_fixups(unsigned long value, void *fixup_start, void *fixup_end)
 	}
 }
 
+#ifdef CONFIG_PPC_BOOK3S_64
+void do_rfi_flush_fixups(enum l1d_flush_type types)
+{
+	unsigned int instrs[3], *dest;
+	long *start, *end;
+	int i;
+
+	start = PTRRELOC(&__start___rfi_flush_fixup),
+	end = PTRRELOC(&__stop___rfi_flush_fixup);
+
+	instrs[0] = 0x60000000; /* nop */
+	instrs[1] = 0x60000000; /* nop */
+	instrs[2] = 0x60000000; /* nop */
+
+	if (types & L1D_FLUSH_FALLBACK)
+		/* b .+16 to fallback flush */
+		instrs[0] = 0x48000010;
+
+	i = 0;
+	if (types & L1D_FLUSH_ORI) {
+		instrs[i++] = 0x63ff0000; /* ori 31,31,0 speculation barrier */
+		instrs[i++] = 0x63de0000; /* ori 30,30,0 L1d flush*/
+	}
+
+	if (types & L1D_FLUSH_MTTRIG)
+		instrs[i++] = 0x7c12dba6; /* mtspr TRIG2,r0 (SPR #882) */
+
+	for (i = 0; start < end; start++, i++) {
+		dest = (void *)start + *start;
+
+		pr_devel("patching dest %lx\n", (unsigned long)dest);
+
+		patch_instruction(dest, instrs[0]);
+		patch_instruction(dest + 1, instrs[1]);
+		patch_instruction(dest + 2, instrs[2]);
+	}
+
+	printk(KERN_DEBUG "rfi-flush: patched %d locations\n", i);
+}
+#endif /* CONFIG_PPC_BOOK3S_64 */
+
 void do_lwsync_fixups(unsigned long value, void *fixup_start, void *fixup_end)
 {
 	long *start, *end;
diff --git a/arch/powerpc/mm/fault.c b/arch/powerpc/mm/fault.c
index 4797d08..6e1e390 100644
--- a/arch/powerpc/mm/fault.c
+++ b/arch/powerpc/mm/fault.c
@@ -145,6 +145,11 @@ static noinline int bad_area(struct pt_regs *regs, unsigned long address)
 	return __bad_area(regs, address, SEGV_MAPERR);
 }
 
+static noinline int bad_access(struct pt_regs *regs, unsigned long address)
+{
+	return __bad_area(regs, address, SEGV_ACCERR);
+}
+
 static int do_sigbus(struct pt_regs *regs, unsigned long address,
 		     unsigned int fault)
 {
@@ -490,7 +495,7 @@ static int __do_page_fault(struct pt_regs *regs, unsigned long address,
 
 good_area:
 	if (unlikely(access_error(is_write, is_exec, vma)))
-		return bad_area(regs, address);
+		return bad_access(regs, address);
 
 	/*
 	 * If for any reason at all we couldn't handle the fault,
diff --git a/arch/powerpc/platforms/powernv/setup.c b/arch/powerpc/platforms/powernv/setup.c
index 1edfbc1..4fb21e1 100644
--- a/arch/powerpc/platforms/powernv/setup.c
+++ b/arch/powerpc/platforms/powernv/setup.c
@@ -37,13 +37,62 @@
 #include <asm/kexec.h>
 #include <asm/smp.h>
 #include <asm/tm.h>
+#include <asm/setup.h>
 
 #include "powernv.h"
 
+static void pnv_setup_rfi_flush(void)
+{
+	struct device_node *np, *fw_features;
+	enum l1d_flush_type type;
+	int enable;
+
+	/* Default to fallback in case fw-features are not available */
+	type = L1D_FLUSH_FALLBACK;
+	enable = 1;
+
+	np = of_find_node_by_name(NULL, "ibm,opal");
+	fw_features = of_get_child_by_name(np, "fw-features");
+	of_node_put(np);
+
+	if (fw_features) {
+		np = of_get_child_by_name(fw_features, "inst-l1d-flush-trig2");
+		if (np && of_property_read_bool(np, "enabled"))
+			type = L1D_FLUSH_MTTRIG;
+
+		of_node_put(np);
+
+		np = of_get_child_by_name(fw_features, "inst-l1d-flush-ori30,30,0");
+		if (np && of_property_read_bool(np, "enabled"))
+			type = L1D_FLUSH_ORI;
+
+		of_node_put(np);
+
+		/* Enable unless firmware says NOT to */
+		enable = 2;
+		np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-hv-1-to-0");
+		if (np && of_property_read_bool(np, "disabled"))
+			enable--;
+
+		of_node_put(np);
+
+		np = of_get_child_by_name(fw_features, "needs-l1d-flush-msr-pr-0-to-1");
+		if (np && of_property_read_bool(np, "disabled"))
+			enable--;
+
+		of_node_put(np);
+		of_node_put(fw_features);
+	}
+
+	setup_rfi_flush(type, enable > 0);
+}
+
 static void __init pnv_setup_arch(void)
 {
 	set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
 
+	pnv_setup_rfi_flush();
+
 	/* Initialize SMP */
 	pnv_smp_init();
 
diff --git a/arch/powerpc/platforms/pseries/dlpar.c b/arch/powerpc/platforms/pseries/dlpar.c
index 6e35780..a0b20c0 100644
--- a/arch/powerpc/platforms/pseries/dlpar.c
+++ b/arch/powerpc/platforms/pseries/dlpar.c
@@ -574,11 +574,26 @@ static ssize_t dlpar_show(struct class *class, struct class_attribute *attr,
 
 static CLASS_ATTR_RW(dlpar);
 
-static int __init pseries_dlpar_init(void)
+int __init dlpar_workqueue_init(void)
 {
+	if (pseries_hp_wq)
+		return 0;
+
 	pseries_hp_wq = alloc_workqueue("pseries hotplug workqueue",
-					WQ_UNBOUND, 1);
+			WQ_UNBOUND, 1);
+
+	return pseries_hp_wq ? 0 : -ENOMEM;
+}
+
+static int __init dlpar_sysfs_init(void)
+{
+	int rc;
+
+	rc = dlpar_workqueue_init();
+	if (rc)
+		return rc;
+
 	return sysfs_create_file(kernel_kobj, &class_attr_dlpar.attr);
 }
-machine_device_initcall(pseries, pseries_dlpar_init);
+machine_device_initcall(pseries, dlpar_sysfs_init);
 
diff --git a/arch/powerpc/platforms/pseries/pseries.h b/arch/powerpc/platforms/pseries/pseries.h
index 4470a31..1ae1d9f 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -98,4 +98,6 @@ static inline unsigned long cmo_get_page_size(void)
 	return CMO_PageSize;
 }
 
+int dlpar_workqueue_init(void);
+
 #endif /* _PSERIES_PSERIES_H */
diff --git a/arch/powerpc/platforms/pseries/ras.c b/arch/powerpc/platforms/pseries/ras.c
index 4923ffe..81d8614 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -69,7 +69,8 @@ static int __init init_ras_IRQ(void)
 	/* Hotplug Events */
 	np = of_find_node_by_path("/event-sources/hot-plug-events");
 	if (np != NULL) {
-		request_event_sources_irqs(np, ras_hotplug_interrupt,
+		if (dlpar_workqueue_init() == 0)
+			request_event_sources_irqs(np, ras_hotplug_interrupt,
 					   "RAS_HOTPLUG");
 		of_node_put(np);
 	}
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index a8531e0..ae4f596 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -459,6 +459,39 @@ static void __init find_and_init_phbs(void)
 	of_pci_check_probe_only();
 }
 
+static void pseries_setup_rfi_flush(void)
+{
+	struct h_cpu_char_result result;
+	enum l1d_flush_type types;
+	bool enable;
+	long rc;
+
+	/* Enable by default */
+	enable = true;
+
+	rc = plpar_get_cpu_characteristics(&result);
+	if (rc == H_SUCCESS) {
+		types = L1D_FLUSH_NONE;
+
+		if (result.character & H_CPU_CHAR_L1D_FLUSH_TRIG2)
+			types |= L1D_FLUSH_MTTRIG;
+		if (result.character & H_CPU_CHAR_L1D_FLUSH_ORI30)
+			types |= L1D_FLUSH_ORI;
+
+		/* Use fallback if nothing set in hcall */
+		if (types == L1D_FLUSH_NONE)
+			types = L1D_FLUSH_FALLBACK;
+
+		if (!(result.behaviour & H_CPU_BEHAV_L1D_FLUSH_PR))
+			enable = false;
+	} else {
+		/* Default to fallback if case hcall is not available */
+		types = L1D_FLUSH_FALLBACK;
+	}
+
+	setup_rfi_flush(types, enable);
+}
+
 static void __init pSeries_setup_arch(void)
 {
 	set_arch_panic_timeout(10, ARCH_PANIC_TIMEOUT);
@@ -476,6 +509,8 @@ static void __init pSeries_setup_arch(void)
 
 	fwnmi_init();
 
+	pseries_setup_rfi_flush();
+
 	/* By default, only probe PCI (can be overridden by rtas_pci) */
 	pci_add_flags(PCI_PROBE_ONLY);
 
diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index cab24f5..0ddc7ac 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -2344,10 +2344,10 @@ static void dump_one_paca(int cpu)
 	DUMP(p, kernel_toc, "lx");
 	DUMP(p, kernelbase, "lx");
 	DUMP(p, kernel_msr, "lx");
-	DUMP(p, emergency_sp, "p");
+	DUMP(p, emergency_sp, "px");
 #ifdef CONFIG_PPC_BOOK3S_64
-	DUMP(p, nmi_emergency_sp, "p");
-	DUMP(p, mc_emergency_sp, "p");
+	DUMP(p, nmi_emergency_sp, "px");
+	DUMP(p, mc_emergency_sp, "px");
 	DUMP(p, in_nmi, "x");
 	DUMP(p, in_mce, "x");
 	DUMP(p, hmi_event_available, "x");
@@ -2375,17 +2375,21 @@ static void dump_one_paca(int cpu)
 	DUMP(p, slb_cache_ptr, "x");
 	for (i = 0; i < SLB_CACHE_ENTRIES; i++)
 		printf(" slb_cache[%d]:        = 0x%016lx\n", i, p->slb_cache[i]);
+
+	DUMP(p, rfi_flush_fallback_area, "px");
+	DUMP(p, l1d_flush_congruence, "llx");
+	DUMP(p, l1d_flush_sets, "llx");
 #endif
 	DUMP(p, dscr_default, "llx");
 #ifdef CONFIG_PPC_BOOK3E
-	DUMP(p, pgd, "p");
-	DUMP(p, kernel_pgd, "p");
-	DUMP(p, tcd_ptr, "p");
-	DUMP(p, mc_kstack, "p");
-	DUMP(p, crit_kstack, "p");
-	DUMP(p, dbg_kstack, "p");
+	DUMP(p, pgd, "px");
+	DUMP(p, kernel_pgd, "px");
+	DUMP(p, tcd_ptr, "px");
+	DUMP(p, mc_kstack, "px");
+	DUMP(p, crit_kstack, "px");
+	DUMP(p, dbg_kstack, "px");
 #endif
-	DUMP(p, __current, "p");
+	DUMP(p, __current, "px");
 	DUMP(p, kstack, "lx");
 	printf(" kstack_base          = 0x%016lx\n", p->kstack & ~(THREAD_SIZE - 1));
 	DUMP(p, stab_rr, "lx");
@@ -2403,7 +2407,7 @@ static void dump_one_paca(int cpu)
 #endif
 
 #ifdef CONFIG_PPC_POWERNV
-	DUMP(p, core_idle_state_ptr, "p");
+	DUMP(p, core_idle_state_ptr, "px");
 	DUMP(p, thread_idle_state, "x");
 	DUMP(p, thread_mask, "x");
 	DUMP(p, subcore_sibling_mask, "x");
diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
index e69de29..47dacf0 100644
--- a/arch/riscv/configs/defconfig
+++ b/arch/riscv/configs/defconfig
@@ -0,0 +1,75 @@
+CONFIG_SMP=y
+CONFIG_PCI=y
+CONFIG_PCIE_XILINX=y
+CONFIG_SYSVIPC=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_CGROUPS=y
+CONFIG_CGROUP_SCHED=y
+CONFIG_CFS_BANDWIDTH=y
+CONFIG_CGROUP_BPF=y
+CONFIG_NAMESPACES=y
+CONFIG_USER_NS=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_EXPERT=y
+CONFIG_CHECKPOINT_RESTORE=y
+CONFIG_BPF_SYSCALL=y
+CONFIG_NET=y
+CONFIG_PACKET=y
+CONFIG_UNIX=y
+CONFIG_INET=y
+CONFIG_IP_MULTICAST=y
+CONFIG_IP_ADVANCED_ROUTER=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_IP_PNP_BOOTP=y
+CONFIG_IP_PNP_RARP=y
+CONFIG_NETLINK_DIAG=y
+CONFIG_DEVTMPFS=y
+CONFIG_BLK_DEV_LOOP=y
+CONFIG_VIRTIO_BLK=y
+CONFIG_BLK_DEV_SD=y
+CONFIG_BLK_DEV_SR=y
+CONFIG_ATA=y
+CONFIG_SATA_AHCI=y
+CONFIG_SATA_AHCI_PLATFORM=y
+CONFIG_NETDEVICES=y
+CONFIG_VIRTIO_NET=y
+CONFIG_MACB=y
+CONFIG_E1000E=y
+CONFIG_R8169=y
+CONFIG_MICROSEMI_PHY=y
+CONFIG_INPUT_MOUSEDEV=y
+CONFIG_SERIAL_8250=y
+CONFIG_SERIAL_8250_CONSOLE=y
+CONFIG_SERIAL_OF_PLATFORM=y
+# CONFIG_PTP_1588_CLOCK is not set
+CONFIG_DRM=y
+CONFIG_DRM_RADEON=y
+CONFIG_FRAMEBUFFER_CONSOLE=y
+CONFIG_USB=y
+CONFIG_USB_XHCI_HCD=y
+CONFIG_USB_XHCI_PLATFORM=y
+CONFIG_USB_EHCI_HCD=y
+CONFIG_USB_EHCI_HCD_PLATFORM=y
+CONFIG_USB_OHCI_HCD=y
+CONFIG_USB_OHCI_HCD_PLATFORM=y
+CONFIG_USB_STORAGE=y
+CONFIG_USB_UAS=y
+CONFIG_VIRTIO_MMIO=y
+CONFIG_RAS=y
+CONFIG_EXT4_FS=y
+CONFIG_EXT4_FS_POSIX_ACL=y
+CONFIG_AUTOFS4_FS=y
+CONFIG_MSDOS_FS=y
+CONFIG_VFAT_FS=y
+CONFIG_TMPFS=y
+CONFIG_TMPFS_POSIX_ACL=y
+CONFIG_NFS_FS=y
+CONFIG_NFS_V4=y
+CONFIG_NFS_V4_1=y
+CONFIG_NFS_V4_2=y
+CONFIG_ROOT_NFS=y
+# CONFIG_RCU_TRACE is not set
+CONFIG_CRYPTO_USER_API_HASH=y
diff --git a/arch/riscv/include/asm/csr.h b/arch/riscv/include/asm/csr.h
index 0d64bc9..3c7a2c9 100644
--- a/arch/riscv/include/asm/csr.h
+++ b/arch/riscv/include/asm/csr.h
@@ -17,10 +17,10 @@
 #include <linux/const.h>
 
 /* Status register flags */
-#define SR_IE   _AC(0x00000002, UL) /* Interrupt Enable */
-#define SR_PIE  _AC(0x00000020, UL) /* Previous IE */
-#define SR_PS   _AC(0x00000100, UL) /* Previously Supervisor */
-#define SR_SUM  _AC(0x00040000, UL) /* Supervisor may access User Memory */
+#define SR_SIE	_AC(0x00000002, UL) /* Supervisor Interrupt Enable */
+#define SR_SPIE	_AC(0x00000020, UL) /* Previous Supervisor IE */
+#define SR_SPP	_AC(0x00000100, UL) /* Previously Supervisor */
+#define SR_SUM	_AC(0x00040000, UL) /* Supervisor may access User Memory */
 
 #define SR_FS           _AC(0x00006000, UL) /* Floating-point Status */
 #define SR_FS_OFF       _AC(0x00000000, UL)
diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
index a82ce59..b269451 100644
--- a/arch/riscv/include/asm/io.h
+++ b/arch/riscv/include/asm/io.h
@@ -21,8 +21,6 @@
 
 #include <linux/types.h>
 
-#ifdef CONFIG_MMU
-
 extern void __iomem *ioremap(phys_addr_t offset, unsigned long size);
 
 /*
@@ -36,8 +34,6 @@ extern void __iomem *ioremap(phys_addr_t offset, unsigned long size);
 
 extern void iounmap(volatile void __iomem *addr);
 
-#endif /* CONFIG_MMU */
-
 /* Generic IO read/write.  These perform native-endian accesses. */
 #define __raw_writeb __raw_writeb
 static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
diff --git a/arch/riscv/include/asm/irqflags.h b/arch/riscv/include/asm/irqflags.h
index 6fdc860..07a3c6d 100644
--- a/arch/riscv/include/asm/irqflags.h
+++ b/arch/riscv/include/asm/irqflags.h
@@ -27,25 +27,25 @@ static inline unsigned long arch_local_save_flags(void)
 /* unconditionally enable interrupts */
 static inline void arch_local_irq_enable(void)
 {
-	csr_set(sstatus, SR_IE);
+	csr_set(sstatus, SR_SIE);
 }
 
 /* unconditionally disable interrupts */
 static inline void arch_local_irq_disable(void)
 {
-	csr_clear(sstatus, SR_IE);
+	csr_clear(sstatus, SR_SIE);
 }
 
 /* get status and disable interrupts */
 static inline unsigned long arch_local_irq_save(void)
 {
-	return csr_read_clear(sstatus, SR_IE);
+	return csr_read_clear(sstatus, SR_SIE);
 }
 
 /* test flags */
 static inline int arch_irqs_disabled_flags(unsigned long flags)
 {
-	return !(flags & SR_IE);
+	return !(flags & SR_SIE);
 }
 
 /* test hardware interrupt enable bit */
@@ -57,7 +57,7 @@ static inline int arch_irqs_disabled(void)
 /* set interrupt enabled status */
 static inline void arch_local_irq_restore(unsigned long flags)
 {
-	csr_set(sstatus, flags & SR_IE);
+	csr_set(sstatus, flags & SR_SIE);
 }
 
 #endif /* _ASM_RISCV_IRQFLAGS_H */
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 2cbd92e..1630196 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -20,8 +20,6 @@
 
 #ifndef __ASSEMBLY__
 
-#ifdef CONFIG_MMU
-
 /* Page Upper Directory not used in RISC-V */
 #include <asm-generic/pgtable-nopud.h>
 #include <asm/page.h>
@@ -413,8 +411,6 @@ static inline void pgtable_cache_init(void)
 	/* No page table caches to initialize */
 }
 
-#endif /* CONFIG_MMU */
-
 #define VMALLOC_SIZE     (KERN_VIRT_SIZE >> 1)
 #define VMALLOC_END      (PAGE_OFFSET - 1)
 #define VMALLOC_START    (PAGE_OFFSET - VMALLOC_SIZE)
diff --git a/arch/riscv/include/asm/ptrace.h b/arch/riscv/include/asm/ptrace.h
index 93b8956..2c5df94 100644
--- a/arch/riscv/include/asm/ptrace.h
+++ b/arch/riscv/include/asm/ptrace.h
@@ -66,7 +66,7 @@ struct pt_regs {
 #define REG_FMT "%08lx"
 #endif
 
-#define user_mode(regs) (((regs)->sstatus & SR_PS) == 0)
+#define user_mode(regs) (((regs)->sstatus & SR_SPP) == 0)
 
 
 /* Helpers for working with the instruction pointer */
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index 22c3536..f8fa1cd 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -64,8 +64,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_stack		(init_thread_union.stack)
-
 #endif /* !__ASSEMBLY__ */
 
 /*
diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
index 715b0f1..7b9c24e 100644
--- a/arch/riscv/include/asm/tlbflush.h
+++ b/arch/riscv/include/asm/tlbflush.h
@@ -15,8 +15,6 @@
 #ifndef _ASM_RISCV_TLBFLUSH_H
 #define _ASM_RISCV_TLBFLUSH_H
 
-#ifdef CONFIG_MMU
-
 #include <linux/mm_types.h>
 
 /*
@@ -64,6 +62,4 @@ static inline void flush_tlb_kernel_range(unsigned long start,
 	flush_tlb_all();
 }
 
-#endif /* CONFIG_MMU */
-
 #endif /* _ASM_RISCV_TLBFLUSH_H */
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h
index 27b90d6..14b0b22 100644
--- a/arch/riscv/include/asm/uaccess.h
+++ b/arch/riscv/include/asm/uaccess.h
@@ -127,7 +127,6 @@ extern int fixup_exception(struct pt_regs *state);
  * call.
  */
 
-#ifdef CONFIG_MMU
 #define __get_user_asm(insn, x, ptr, err)			\
 do {								\
 	uintptr_t __tmp;					\
@@ -153,13 +152,11 @@ do {								\
 	__disable_user_access();				\
 	(x) = __x;						\
 } while (0)
-#endif /* CONFIG_MMU */
 
 #ifdef CONFIG_64BIT
 #define __get_user_8(x, ptr, err) \
 	__get_user_asm("ld", x, ptr, err)
 #else /* !CONFIG_64BIT */
-#ifdef CONFIG_MMU
 #define __get_user_8(x, ptr, err)				\
 do {								\
 	u32 __user *__ptr = (u32 __user *)(ptr);		\
@@ -193,7 +190,6 @@ do {								\
 	(x) = (__typeof__(x))((__typeof__((x)-(x)))(		\
 		(((u64)__hi << 32) | __lo)));			\
 } while (0)
-#endif /* CONFIG_MMU */
 #endif /* CONFIG_64BIT */
 
 
@@ -267,8 +263,6 @@ do {								\
 		((x) = 0, -EFAULT);				\
 })
 
-
-#ifdef CONFIG_MMU
 #define __put_user_asm(insn, x, ptr, err)			\
 do {								\
 	uintptr_t __tmp;					\
@@ -292,14 +286,11 @@ do {								\
 		: "rJ" (__x), "i" (-EFAULT));			\
 	__disable_user_access();				\
 } while (0)
-#endif /* CONFIG_MMU */
-
 
 #ifdef CONFIG_64BIT
 #define __put_user_8(x, ptr, err) \
 	__put_user_asm("sd", x, ptr, err)
 #else /* !CONFIG_64BIT */
-#ifdef CONFIG_MMU
 #define __put_user_8(x, ptr, err)				\
 do {								\
 	u32 __user *__ptr = (u32 __user *)(ptr);		\
@@ -329,7 +320,6 @@ do {								\
 		: "rJ" (__x), "rJ" (__x >> 32), "i" (-EFAULT));	\
 	__disable_user_access();				\
 } while (0)
-#endif /* CONFIG_MMU */
 #endif /* CONFIG_64BIT */
 
 
@@ -438,7 +428,6 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n)
  * will set "err" to -EFAULT, while successful accesses return the previous
  * value.
  */
-#ifdef CONFIG_MMU
 #define __cmpxchg_user(ptr, old, new, err, size, lrb, scb)	\
 ({								\
 	__typeof__(ptr) __ptr = (ptr);				\
@@ -508,6 +497,5 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n)
 	(err) = __err;						\
 	__ret;							\
 })
-#endif /* CONFIG_MMU */
 
 #endif /* _ASM_RISCV_UACCESS_H */
diff --git a/arch/riscv/include/asm/unistd.h b/arch/riscv/include/asm/unistd.h
index 9f250ed0..2f704a5 100644
--- a/arch/riscv/include/asm/unistd.h
+++ b/arch/riscv/include/asm/unistd.h
@@ -14,3 +14,4 @@
 #define __ARCH_HAVE_MMU
 #define __ARCH_WANT_SYS_CLONE
 #include <uapi/asm/unistd.h>
+#include <uapi/asm/syscalls.h>
diff --git a/arch/riscv/include/asm/vdso-syscalls.h b/arch/riscv/include/asm/vdso-syscalls.h
deleted file mode 100644
index a2ccf18..0000000
--- a/arch/riscv/include/asm/vdso-syscalls.h
+++ /dev/null
@@ -1,28 +0,0 @@
-/*
- * Copyright (C) 2017 SiFive
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef _ASM_RISCV_VDSO_SYSCALLS_H
-#define _ASM_RISCV_VDSO_SYSCALLS_H
-
-#ifdef CONFIG_SMP
-
-/* These syscalls are only used by the vDSO and are not in the uapi. */
-#define __NR_riscv_flush_icache (__NR_arch_specific_syscall + 15)
-__SYSCALL(__NR_riscv_flush_icache, sys_riscv_flush_icache)
-
-#endif
-
-#endif /* _ASM_RISCV_VDSO_H */
diff --git a/arch/riscv/include/uapi/asm/syscalls.h b/arch/riscv/include/uapi/asm/syscalls.h
new file mode 100644
index 0000000..818655b
--- /dev/null
+++ b/arch/riscv/include/uapi/asm/syscalls.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2017 SiFive
+ */
+
+#ifndef _ASM__UAPI__SYSCALLS_H
+#define _ASM__UAPI__SYSCALLS_H
+
+/*
+ * Allows the instruction cache to be flushed from userspace.  Despite RISC-V
+ * having a direct 'fence.i' instruction available to userspace (which we
+ * can't trap!), that's not actually viable when running on Linux because the
+ * kernel might schedule a process on another hart.  There is no way for
+ * userspace to handle this without invoking the kernel (as it doesn't know the
+ * thread->hart mappings), so we've defined a RISC-V specific system call to
+ * flush the instruction cache.
+ *
+ * __NR_riscv_flush_icache is defined to flush the instruction cache over an
+ * address range, with the flush applying to either all threads or just the
+ * caller.  We don't currently do anything with the address range, that's just
+ * in there for forwards compatibility.
+ */
+#define __NR_riscv_flush_icache (__NR_arch_specific_syscall + 15)
+__SYSCALL(__NR_riscv_flush_icache, sys_riscv_flush_icache)
+
+#endif
diff --git a/arch/riscv/kernel/entry.S b/arch/riscv/kernel/entry.S
index 20ee86f..7404ec2 100644
--- a/arch/riscv/kernel/entry.S
+++ b/arch/riscv/kernel/entry.S
@@ -196,7 +196,7 @@
 	addi s2, s2, 0x4
 	REG_S s2, PT_SEPC(sp)
 	/* System calls run with interrupts enabled */
-	csrs sstatus, SR_IE
+	csrs sstatus, SR_SIE
 	/* Trace syscalls, but only if requested by the user. */
 	REG_L t0, TASK_TI_FLAGS(tp)
 	andi t0, t0, _TIF_SYSCALL_TRACE
@@ -224,8 +224,8 @@
 
 ret_from_exception:
 	REG_L s0, PT_SSTATUS(sp)
-	csrc sstatus, SR_IE
-	andi s0, s0, SR_PS
+	csrc sstatus, SR_SIE
+	andi s0, s0, SR_SPP
 	bnez s0, restore_all
 
 resume_userspace:
@@ -255,7 +255,7 @@
 	bnez s1, work_resched
 work_notifysig:
 	/* Handle pending signals and notify-resume requests */
-	csrs sstatus, SR_IE /* Enable interrupts for do_notify_resume() */
+	csrs sstatus, SR_SIE /* Enable interrupts for do_notify_resume() */
 	move a0, sp /* pt_regs */
 	move a1, s0 /* current_thread_info->flags */
 	tail do_notify_resume
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 0d90dcc..d74d4ad 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -76,7 +76,7 @@ void show_regs(struct pt_regs *regs)
 void start_thread(struct pt_regs *regs, unsigned long pc,
 	unsigned long sp)
 {
-	regs->sstatus = SR_PIE /* User mode, irqs on */ | SR_FS_INITIAL;
+	regs->sstatus = SR_SPIE /* User mode, irqs on */ | SR_FS_INITIAL;
 	regs->sepc = pc;
 	regs->sp = sp;
 	set_fs(USER_DS);
@@ -110,7 +110,7 @@ int copy_thread(unsigned long clone_flags, unsigned long usp,
 		const register unsigned long gp __asm__ ("gp");
 		memset(childregs, 0, sizeof(struct pt_regs));
 		childregs->gp = gp;
-		childregs->sstatus = SR_PS | SR_PIE; /* Supervisor, irqs on */
+		childregs->sstatus = SR_SPP | SR_SPIE; /* Supervisor, irqs on */
 
 		p->thread.ra = (unsigned long)ret_from_kernel_thread;
 		p->thread.s[0] = usp; /* fn */
diff --git a/arch/riscv/kernel/syscall_table.c b/arch/riscv/kernel/syscall_table.c
index a5bd640..ade52b9 100644
--- a/arch/riscv/kernel/syscall_table.c
+++ b/arch/riscv/kernel/syscall_table.c
@@ -23,5 +23,4 @@
 void *sys_call_table[__NR_syscalls] = {
 	[0 ... __NR_syscalls - 1] = sys_ni_syscall,
 #include <asm/unistd.h>
-#include <asm/vdso-syscalls.h>
 };
diff --git a/arch/riscv/kernel/vdso/flush_icache.S b/arch/riscv/kernel/vdso/flush_icache.S
index b0fbad7..023e4d4 100644
--- a/arch/riscv/kernel/vdso/flush_icache.S
+++ b/arch/riscv/kernel/vdso/flush_icache.S
@@ -13,7 +13,6 @@
 
 #include <linux/linkage.h>
 #include <asm/unistd.h>
-#include <asm/vdso-syscalls.h>
 
 	.text
 /* int __vdso_flush_icache(void *start, void *end, unsigned long flags); */
diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
index df2ca3c..0713f3c 100644
--- a/arch/riscv/mm/fault.c
+++ b/arch/riscv/mm/fault.c
@@ -63,7 +63,7 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
 		goto vmalloc_fault;
 
 	/* Enable interrupts if they were enabled in the parent context. */
-	if (likely(regs->sstatus & SR_PIE))
+	if (likely(regs->sstatus & SR_SPIE))
 		local_irq_enable();
 
 	/*
diff --git a/arch/s390/include/asm/kvm_host.h b/arch/s390/include/asm/kvm_host.h
index e14f381..c1b0a9a 100644
--- a/arch/s390/include/asm/kvm_host.h
+++ b/arch/s390/include/asm/kvm_host.h
@@ -207,7 +207,8 @@ struct kvm_s390_sie_block {
 	__u16	ipa;			/* 0x0056 */
 	__u32	ipb;			/* 0x0058 */
 	__u32	scaoh;			/* 0x005c */
-	__u8	reserved60;		/* 0x0060 */
+#define FPF_BPBC 	0x20
+	__u8	fpf;			/* 0x0060 */
 #define ECB_GS		0x40
 #define ECB_TE		0x10
 #define ECB_SRSI	0x04
diff --git a/arch/s390/include/asm/thread_info.h b/arch/s390/include/asm/thread_info.h
index 0880a37..25d6ec3 100644
--- a/arch/s390/include/asm/thread_info.h
+++ b/arch/s390/include/asm/thread_info.h
@@ -42,8 +42,6 @@ struct thread_info {
 	.flags		= 0,			\
 }
 
-#define init_stack		(init_thread_union.stack)
-
 void arch_release_task_struct(struct task_struct *tsk);
 int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 
diff --git a/arch/s390/include/uapi/asm/kvm.h b/arch/s390/include/uapi/asm/kvm.h
index 38535a57..4cdaa55 100644
--- a/arch/s390/include/uapi/asm/kvm.h
+++ b/arch/s390/include/uapi/asm/kvm.h
@@ -224,6 +224,7 @@ struct kvm_guest_debug_arch {
 #define KVM_SYNC_RICCB  (1UL << 7)
 #define KVM_SYNC_FPRS   (1UL << 8)
 #define KVM_SYNC_GSCB   (1UL << 9)
+#define KVM_SYNC_BPBC   (1UL << 10)
 /* length and alignment of the sdnx as a power of two */
 #define SDNXC 8
 #define SDNXL (1UL << SDNXC)
@@ -247,7 +248,9 @@ struct kvm_sync_regs {
 	};
 	__u8  reserved[512];	/* for future vector expansion */
 	__u32 fpc;		/* valid on KVM_SYNC_VRS or KVM_SYNC_FPRS */
-	__u8 padding1[52];	/* riccb needs to be 64byte aligned */
+	__u8 bpbc : 1;		/* bp mode */
+	__u8 reserved2 : 7;
+	__u8 padding1[51];	/* riccb needs to be 64byte aligned */
 	__u8 riccb[64];		/* runtime instrumentation controls block */
 	__u8 padding2[192];	/* sdnx needs to be 256byte aligned */
 	union {
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c
index ec8b68e..1371dff 100644
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -421,6 +421,9 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_S390_GS:
 		r = test_facility(133);
 		break;
+	case KVM_CAP_S390_BPB:
+		r = test_facility(82);
+		break;
 	default:
 		r = 0;
 	}
@@ -766,7 +769,7 @@ static void kvm_s390_sync_request_broadcast(struct kvm *kvm, int req)
 
 /*
  * Must be called with kvm->srcu held to avoid races on memslots, and with
- * kvm->lock to avoid races with ourselves and kvm_s390_vm_stop_migration.
+ * kvm->slots_lock to avoid races with ourselves and kvm_s390_vm_stop_migration.
  */
 static int kvm_s390_vm_start_migration(struct kvm *kvm)
 {
@@ -792,11 +795,12 @@ static int kvm_s390_vm_start_migration(struct kvm *kvm)
 
 	if (kvm->arch.use_cmma) {
 		/*
-		 * Get the last slot. They should be sorted by base_gfn, so the
-		 * last slot is also the one at the end of the address space.
-		 * We have verified above that at least one slot is present.
+		 * Get the first slot. They are reverse sorted by base_gfn, so
+		 * the first slot is also the one at the end of the address
+		 * space. We have verified above that at least one slot is
+		 * present.
 		 */
-		ms = slots->memslots + slots->used_slots - 1;
+		ms = slots->memslots;
 		/* round up so we only use full longs */
 		ram_pages = roundup(ms->base_gfn + ms->npages, BITS_PER_LONG);
 		/* allocate enough bytes to store all the bits */
@@ -821,7 +825,7 @@ static int kvm_s390_vm_start_migration(struct kvm *kvm)
 }
 
 /*
- * Must be called with kvm->lock to avoid races with ourselves and
+ * Must be called with kvm->slots_lock to avoid races with ourselves and
  * kvm_s390_vm_start_migration.
  */
 static int kvm_s390_vm_stop_migration(struct kvm *kvm)
@@ -836,6 +840,8 @@ static int kvm_s390_vm_stop_migration(struct kvm *kvm)
 
 	if (kvm->arch.use_cmma) {
 		kvm_s390_sync_request_broadcast(kvm, KVM_REQ_STOP_MIGRATION);
+		/* We have to wait for the essa emulation to finish */
+		synchronize_srcu(&kvm->srcu);
 		vfree(mgs->pgste_bitmap);
 	}
 	kfree(mgs);
@@ -845,14 +851,12 @@ static int kvm_s390_vm_stop_migration(struct kvm *kvm)
 static int kvm_s390_vm_set_migration(struct kvm *kvm,
 				     struct kvm_device_attr *attr)
 {
-	int idx, res = -ENXIO;
+	int res = -ENXIO;
 
-	mutex_lock(&kvm->lock);
+	mutex_lock(&kvm->slots_lock);
 	switch (attr->attr) {
 	case KVM_S390_VM_MIGRATION_START:
-		idx = srcu_read_lock(&kvm->srcu);
 		res = kvm_s390_vm_start_migration(kvm);
-		srcu_read_unlock(&kvm->srcu, idx);
 		break;
 	case KVM_S390_VM_MIGRATION_STOP:
 		res = kvm_s390_vm_stop_migration(kvm);
@@ -860,7 +864,7 @@ static int kvm_s390_vm_set_migration(struct kvm *kvm,
 	default:
 		break;
 	}
-	mutex_unlock(&kvm->lock);
+	mutex_unlock(&kvm->slots_lock);
 
 	return res;
 }
@@ -1750,7 +1754,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = -EFAULT;
 		if (copy_from_user(&args, argp, sizeof(args)))
 			break;
+		mutex_lock(&kvm->slots_lock);
 		r = kvm_s390_get_cmma_bits(kvm, &args);
+		mutex_unlock(&kvm->slots_lock);
 		if (!r) {
 			r = copy_to_user(argp, &args, sizeof(args));
 			if (r)
@@ -1764,7 +1770,9 @@ long kvm_arch_vm_ioctl(struct file *filp,
 		r = -EFAULT;
 		if (copy_from_user(&args, argp, sizeof(args)))
 			break;
+		mutex_lock(&kvm->slots_lock);
 		r = kvm_s390_set_cmma_bits(kvm, &args);
+		mutex_unlock(&kvm->slots_lock);
 		break;
 	}
 	default:
@@ -2197,6 +2205,8 @@ int kvm_arch_vcpu_init(struct kvm_vcpu *vcpu)
 	kvm_s390_set_prefix(vcpu, 0);
 	if (test_kvm_facility(vcpu->kvm, 64))
 		vcpu->run->kvm_valid_regs |= KVM_SYNC_RICCB;
+	if (test_kvm_facility(vcpu->kvm, 82))
+		vcpu->run->kvm_valid_regs |= KVM_SYNC_BPBC;
 	if (test_kvm_facility(vcpu->kvm, 133))
 		vcpu->run->kvm_valid_regs |= KVM_SYNC_GSCB;
 	/* fprs can be synchronized via vrs, even if the guest has no vx. With
@@ -2338,6 +2348,7 @@ static void kvm_s390_vcpu_initial_reset(struct kvm_vcpu *vcpu)
 	current->thread.fpu.fpc = 0;
 	vcpu->arch.sie_block->gbea = 1;
 	vcpu->arch.sie_block->pp = 0;
+	vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
 	vcpu->arch.pfault_token = KVM_S390_PFAULT_TOKEN_INVALID;
 	kvm_clear_async_pf_completion_queue(vcpu);
 	if (!kvm_s390_user_cpu_state_ctrl(vcpu->kvm))
@@ -3297,6 +3308,11 @@ static void sync_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 		vcpu->arch.sie_block->ecd |= ECD_HOSTREGMGMT;
 		vcpu->arch.gs_enabled = 1;
 	}
+	if ((kvm_run->kvm_dirty_regs & KVM_SYNC_BPBC) &&
+	    test_kvm_facility(vcpu->kvm, 82)) {
+		vcpu->arch.sie_block->fpf &= ~FPF_BPBC;
+		vcpu->arch.sie_block->fpf |= kvm_run->s.regs.bpbc ? FPF_BPBC : 0;
+	}
 	save_access_regs(vcpu->arch.host_acrs);
 	restore_access_regs(vcpu->run->s.regs.acrs);
 	/* save host (userspace) fprs/vrs */
@@ -3343,6 +3359,7 @@ static void store_regs(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
 	kvm_run->s.regs.pft = vcpu->arch.pfault_token;
 	kvm_run->s.regs.pfs = vcpu->arch.pfault_select;
 	kvm_run->s.regs.pfc = vcpu->arch.pfault_compare;
+	kvm_run->s.regs.bpbc = (vcpu->arch.sie_block->fpf & FPF_BPBC) == FPF_BPBC;
 	save_access_regs(vcpu->run->s.regs.acrs);
 	restore_access_regs(vcpu->arch.host_acrs);
 	/* Save guest register state */
diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c
index 572496c..0714bfa 100644
--- a/arch/s390/kvm/priv.c
+++ b/arch/s390/kvm/priv.c
@@ -1006,7 +1006,7 @@ static inline int do_essa(struct kvm_vcpu *vcpu, const int orc)
 		cbrlo[entries] = gfn << PAGE_SHIFT;
 	}
 
-	if (orc) {
+	if (orc && gfn < ms->bitmap_size) {
 		/* increment only if we are really flipping the bit to 1 */
 		if (!test_and_set_bit(gfn, ms->pgste_bitmap))
 			atomic64_inc(&ms->dirty_pages);
diff --git a/arch/s390/kvm/vsie.c b/arch/s390/kvm/vsie.c
index 5d6ae03..7513483 100644
--- a/arch/s390/kvm/vsie.c
+++ b/arch/s390/kvm/vsie.c
@@ -223,6 +223,12 @@ static void unshadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 	memcpy(scb_o->gcr, scb_s->gcr, 128);
 	scb_o->pp = scb_s->pp;
 
+	/* branch prediction */
+	if (test_kvm_facility(vcpu->kvm, 82)) {
+		scb_o->fpf &= ~FPF_BPBC;
+		scb_o->fpf |= scb_s->fpf & FPF_BPBC;
+	}
+
 	/* interrupt intercept */
 	switch (scb_s->icptcode) {
 	case ICPT_PROGI:
@@ -265,6 +271,7 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 	scb_s->ecb3 = 0;
 	scb_s->ecd = 0;
 	scb_s->fac = 0;
+	scb_s->fpf = 0;
 
 	rc = prepare_cpuflags(vcpu, vsie_page);
 	if (rc)
@@ -324,6 +331,9 @@ static int shadow_scb(struct kvm_vcpu *vcpu, struct vsie_page *vsie_page)
 			prefix_unmapped(vsie_page);
 		scb_s->ecb |= scb_o->ecb & ECB_TE;
 	}
+	/* branch prediction */
+	if (test_kvm_facility(vcpu->kvm, 82))
+		scb_s->fpf |= scb_o->fpf & FPF_BPBC;
 	/* SIMD */
 	if (test_kvm_facility(vcpu->kvm, 129)) {
 		scb_s->eca |= scb_o->eca & ECA_VX;
diff --git a/arch/s390/lib/uaccess.c b/arch/s390/lib/uaccess.c
index cae5a1e..c4f8039 100644
--- a/arch/s390/lib/uaccess.c
+++ b/arch/s390/lib/uaccess.c
@@ -89,11 +89,11 @@ EXPORT_SYMBOL(enable_sacf_uaccess);
 
 void disable_sacf_uaccess(mm_segment_t old_fs)
 {
+	current->thread.mm_segment = old_fs;
 	if (old_fs == USER_DS && test_facility(27)) {
 		__ctl_load(S390_lowcore.user_asce, 1, 1);
 		clear_cpu_flag(CIF_ASCE_PRIMARY);
 	}
-	current->thread.mm_segment = old_fs;
 }
 EXPORT_SYMBOL(disable_sacf_uaccess);
 
diff --git a/arch/s390/pci/pci_dma.c b/arch/s390/pci/pci_dma.c
index f7aa5a7..2d15d84 100644
--- a/arch/s390/pci/pci_dma.c
+++ b/arch/s390/pci/pci_dma.c
@@ -181,6 +181,9 @@ static int __dma_update_trans(struct zpci_dev *zdev, unsigned long pa,
 static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
 			   size_t size, int flags)
 {
+	unsigned long irqflags;
+	int ret;
+
 	/*
 	 * With zdev->tlb_refresh == 0, rpcit is not required to establish new
 	 * translations when previously invalid translation-table entries are
@@ -196,8 +199,22 @@ static int __dma_purge_tlb(struct zpci_dev *zdev, dma_addr_t dma_addr,
 			return 0;
 	}
 
-	return zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
-				  PAGE_ALIGN(size));
+	ret = zpci_refresh_trans((u64) zdev->fh << 32, dma_addr,
+				 PAGE_ALIGN(size));
+	if (ret == -ENOMEM && !s390_iommu_strict) {
+		/* enable the hypervisor to free some resources */
+		if (zpci_refresh_global(zdev))
+			goto out;
+
+		spin_lock_irqsave(&zdev->iommu_bitmap_lock, irqflags);
+		bitmap_andnot(zdev->iommu_bitmap, zdev->iommu_bitmap,
+			      zdev->lazy_bitmap, zdev->iommu_pages);
+		bitmap_zero(zdev->lazy_bitmap, zdev->iommu_pages);
+		spin_unlock_irqrestore(&zdev->iommu_bitmap_lock, irqflags);
+		ret = 0;
+	}
+out:
+	return ret;
 }
 
 static int dma_update_trans(struct zpci_dev *zdev, unsigned long pa,
diff --git a/arch/s390/pci/pci_insn.c b/arch/s390/pci/pci_insn.c
index 19bcb3b..f069929 100644
--- a/arch/s390/pci/pci_insn.c
+++ b/arch/s390/pci/pci_insn.c
@@ -89,6 +89,9 @@ int zpci_refresh_trans(u64 fn, u64 addr, u64 range)
 	if (cc)
 		zpci_err_insn(cc, status, addr, range);
 
+	if (cc == 1 && (status == 4 || status == 16))
+		return -ENOMEM;
+
 	return (cc) ? -EIO : 0;
 }
 
diff --git a/arch/score/include/asm/thread_info.h b/arch/score/include/asm/thread_info.h
index ad51b56..bc4c7c9 100644
--- a/arch/score/include/asm/thread_info.h
+++ b/arch/score/include/asm/thread_info.h
@@ -58,9 +58,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* How to get the thread information struct from C. */
 register struct thread_info *__current_thread_info __asm__("r28");
 #define current_thread_info()	__current_thread_info
diff --git a/arch/sh/boards/mach-se/770x/setup.c b/arch/sh/boards/mach-se/770x/setup.c
index 77c3535..412326d 100644
--- a/arch/sh/boards/mach-se/770x/setup.c
+++ b/arch/sh/boards/mach-se/770x/setup.c
@@ -9,6 +9,7 @@
  */
 #include <linux/init.h>
 #include <linux/platform_device.h>
+#include <linux/sh_eth.h>
 #include <mach-se/mach/se.h>
 #include <mach-se/mach/mrshpc.h>
 #include <asm/machvec.h>
@@ -115,13 +116,23 @@ static struct platform_device heartbeat_device = {
 #if defined(CONFIG_CPU_SUBTYPE_SH7710) ||\
 	defined(CONFIG_CPU_SUBTYPE_SH7712)
 /* SH771X Ethernet driver */
+static struct sh_eth_plat_data sh_eth_plat = {
+	.phy = PHY_ID,
+	.phy_interface = PHY_INTERFACE_MODE_MII,
+};
+
 static struct resource sh_eth0_resources[] = {
 	[0] = {
 		.start = SH_ETH0_BASE,
-		.end = SH_ETH0_BASE + 0x1B8,
+		.end = SH_ETH0_BASE + 0x1B8 - 1,
 		.flags = IORESOURCE_MEM,
 	},
 	[1] = {
+		.start = SH_TSU_BASE,
+		.end = SH_TSU_BASE + 0x200 - 1,
+		.flags = IORESOURCE_MEM,
+	},
+	[2] = {
 		.start = SH_ETH0_IRQ,
 		.end = SH_ETH0_IRQ,
 		.flags = IORESOURCE_IRQ,
@@ -132,7 +143,7 @@ static struct platform_device sh_eth0_device = {
 	.name = "sh771x-ether",
 	.id = 0,
 	.dev = {
-		.platform_data = PHY_ID,
+		.platform_data = &sh_eth_plat,
 	},
 	.num_resources = ARRAY_SIZE(sh_eth0_resources),
 	.resource = sh_eth0_resources,
@@ -141,10 +152,15 @@ static struct platform_device sh_eth0_device = {
 static struct resource sh_eth1_resources[] = {
 	[0] = {
 		.start = SH_ETH1_BASE,
-		.end = SH_ETH1_BASE + 0x1B8,
+		.end = SH_ETH1_BASE + 0x1B8 - 1,
 		.flags = IORESOURCE_MEM,
 	},
 	[1] = {
+		.start = SH_TSU_BASE,
+		.end = SH_TSU_BASE + 0x200 - 1,
+		.flags = IORESOURCE_MEM,
+	},
+	[2] = {
 		.start = SH_ETH1_IRQ,
 		.end = SH_ETH1_IRQ,
 		.flags = IORESOURCE_IRQ,
@@ -155,7 +171,7 @@ static struct platform_device sh_eth1_device = {
 	.name = "sh771x-ether",
 	.id = 1,
 	.dev = {
-		.platform_data = PHY_ID,
+		.platform_data = &sh_eth_plat,
 	},
 	.num_resources = ARRAY_SIZE(sh_eth1_resources),
 	.resource = sh_eth1_resources,
diff --git a/arch/sh/include/asm/thread_info.h b/arch/sh/include/asm/thread_info.h
index becb798..cf5c792 100644
--- a/arch/sh/include/asm/thread_info.h
+++ b/arch/sh/include/asm/thread_info.h
@@ -63,9 +63,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the current stack pointer from C */
 register unsigned long current_stack_pointer asm("r15") __used;
 
diff --git a/arch/sh/include/mach-se/mach/se.h b/arch/sh/include/mach-se/mach/se.h
index 4246ef9..aa83fe1 100644
--- a/arch/sh/include/mach-se/mach/se.h
+++ b/arch/sh/include/mach-se/mach/se.h
@@ -100,6 +100,7 @@
 /* Base address */
 #define SH_ETH0_BASE 0xA7000000
 #define SH_ETH1_BASE 0xA7000400
+#define SH_TSU_BASE  0xA7000800
 /* PHY ID */
 #if defined(CONFIG_CPU_SUBTYPE_SH7710)
 # define PHY_ID 0x00
diff --git a/arch/sparc/crypto/Makefile b/arch/sparc/crypto/Makefile
index 818d3aa..d257186 100644
--- a/arch/sparc/crypto/Makefile
+++ b/arch/sparc/crypto/Makefile
@@ -10,7 +10,7 @@
 
 obj-$(CONFIG_CRYPTO_AES_SPARC64) += aes-sparc64.o
 obj-$(CONFIG_CRYPTO_DES_SPARC64) += des-sparc64.o
-obj-$(CONFIG_CRYPTO_DES_SPARC64) += camellia-sparc64.o
+obj-$(CONFIG_CRYPTO_CAMELLIA_SPARC64) += camellia-sparc64.o
 
 obj-$(CONFIG_CRYPTO_CRC32C_SPARC64) += crc32c-sparc64.o
 
diff --git a/arch/sparc/include/asm/thread_info_32.h b/arch/sparc/include/asm/thread_info_32.h
index febaaeb..548b366 100644
--- a/arch/sparc/include/asm/thread_info_32.h
+++ b/arch/sparc/include/asm/thread_info_32.h
@@ -63,9 +63,6 @@ struct thread_info {
 	.preempt_count	=	INIT_PREEMPT_COUNT,	\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 register struct thread_info *current_thread_info_reg asm("g6");
 #define current_thread_info()   (current_thread_info_reg)
diff --git a/arch/sparc/include/asm/thread_info_64.h b/arch/sparc/include/asm/thread_info_64.h
index caf9153..f7e7b0b 100644
--- a/arch/sparc/include/asm/thread_info_64.h
+++ b/arch/sparc/include/asm/thread_info_64.h
@@ -120,9 +120,6 @@ struct thread_info {
 	.preempt_count	=	INIT_PREEMPT_COUNT,	\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 register struct thread_info *current_thread_info_reg asm("g6");
 #define current_thread_info()	(current_thread_info_reg)
diff --git a/arch/tile/include/asm/thread_info.h b/arch/tile/include/asm/thread_info.h
index b7659b8..2adcacd 100644
--- a/arch/tile/include/asm/thread_info.h
+++ b/arch/tile/include/asm/thread_info.h
@@ -59,9 +59,6 @@ struct thread_info {
 	.align_ctl	= 0,			\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 #endif /* !__ASSEMBLY__ */
 
 #if PAGE_SIZE < 8192
diff --git a/arch/um/include/asm/processor-generic.h b/arch/um/include/asm/processor-generic.h
index 86942a49..b58b746 100644
--- a/arch/um/include/asm/processor-generic.h
+++ b/arch/um/include/asm/processor-generic.h
@@ -58,7 +58,10 @@ static inline void release_thread(struct task_struct *task)
 {
 }
 
-#define init_stack	(init_thread_union.stack)
+static inline void mm_copy_segments(struct mm_struct *from_mm,
+				    struct mm_struct *new_mm)
+{
+}
 
 /*
  * User space process size: 3GB (default).
diff --git a/arch/um/include/asm/thread_info.h b/arch/um/include/asm/thread_info.h
index 9300f76..4eecd96 100644
--- a/arch/um/include/asm/thread_info.h
+++ b/arch/um/include/asm/thread_info.h
@@ -6,6 +6,9 @@
 #ifndef __UM_THREAD_INFO_H
 #define __UM_THREAD_INFO_H
 
+#define THREAD_SIZE_ORDER CONFIG_KERNEL_STACK_ORDER
+#define THREAD_SIZE ((1 << CONFIG_KERNEL_STACK_ORDER) * PAGE_SIZE)
+
 #ifndef __ASSEMBLY__
 
 #include <asm/types.h>
@@ -37,10 +40,6 @@ struct thread_info {
 	.real_thread = NULL,			\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
-#define THREAD_SIZE ((1 << CONFIG_KERNEL_STACK_ORDER) * PAGE_SIZE)
 /* how to get the thread information struct from C */
 static inline struct thread_info *current_thread_info(void)
 {
@@ -53,8 +52,6 @@ static inline struct thread_info *current_thread_info(void)
 	return ti;
 }
 
-#define THREAD_SIZE_ORDER CONFIG_KERNEL_STACK_ORDER
-
 #endif
 
 #define TIF_SYSCALL_TRACE	0	/* syscall trace active */
diff --git a/arch/um/include/asm/vmlinux.lds.h b/arch/um/include/asm/vmlinux.lds.h
new file mode 100644
index 0000000..149494a
--- /dev/null
+++ b/arch/um/include/asm/vmlinux.lds.h
@@ -0,0 +1,2 @@
+#include <asm/thread_info.h>
+#include <asm-generic/vmlinux.lds.h>
diff --git a/arch/um/kernel/dyn.lds.S b/arch/um/kernel/dyn.lds.S
index d417e38..5568cf8 100644
--- a/arch/um/kernel/dyn.lds.S
+++ b/arch/um/kernel/dyn.lds.S
@@ -1,5 +1,4 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-#include <asm-generic/vmlinux.lds.h>
+#include <asm/vmlinux.lds.h>
 #include <asm/page.h>
 
 OUTPUT_FORMAT(ELF_FORMAT)
diff --git a/arch/um/kernel/um_arch.c b/arch/um/kernel/um_arch.c
index f433690..a818cce 100644
--- a/arch/um/kernel/um_arch.c
+++ b/arch/um/kernel/um_arch.c
@@ -54,7 +54,7 @@ struct cpuinfo_um boot_cpu_data = {
 
 union thread_union cpu0_irqstack
 	__attribute__((__section__(".data..init_irqstack"))) =
-		{ INIT_THREAD_INFO(init_task) };
+		{ .thread_info = INIT_THREAD_INFO(init_task) };
 
 /* Changed in setup_arch, which is called in early boot */
 static char host_info[(__NEW_UTS_LEN + 1) * 5];
diff --git a/arch/um/kernel/uml.lds.S b/arch/um/kernel/uml.lds.S
index 3d6ed6b..36b07ec 100644
--- a/arch/um/kernel/uml.lds.S
+++ b/arch/um/kernel/uml.lds.S
@@ -1,5 +1,5 @@
 /* SPDX-License-Identifier: GPL-2.0 */
-#include <asm-generic/vmlinux.lds.h>
+#include <asm/vmlinux.lds.h>
 #include <asm/page.h>
 
 OUTPUT_FORMAT(ELF_FORMAT)
diff --git a/arch/unicore32/include/asm/thread_info.h b/arch/unicore32/include/asm/thread_info.h
index e79ad6d..5fb728f 100644
--- a/arch/unicore32/include/asm/thread_info.h
+++ b/arch/unicore32/include/asm/thread_info.h
@@ -87,9 +87,6 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,					\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /*
  * how to get the thread information struct from C
  */
diff --git a/arch/unicore32/kernel/traps.c b/arch/unicore32/kernel/traps.c
index 5f25b39..c4ac604 100644
--- a/arch/unicore32/kernel/traps.c
+++ b/arch/unicore32/kernel/traps.c
@@ -298,7 +298,6 @@ void abort(void)
 	/* if that doesn't kill us, halt */
 	panic("Oops failed to kill thread");
 }
-EXPORT_SYMBOL(abort);
 
 void __init trap_init(void)
 {
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index d4fc98c..423e4b6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -55,7 +55,6 @@
 	select ARCH_HAS_GCOV_PROFILE_ALL
 	select ARCH_HAS_KCOV			if X86_64
 	select ARCH_HAS_PMEM_API		if X86_64
-	# Causing hangs/crashes, see the commit that added this change for details.
 	select ARCH_HAS_REFCOUNT
 	select ARCH_HAS_UACCESS_FLUSHCACHE	if X86_64
 	select ARCH_HAS_SET_MEMORY
@@ -89,6 +88,7 @@
 	select GENERIC_CLOCKEVENTS_MIN_ADJUST
 	select GENERIC_CMOS_UPDATE
 	select GENERIC_CPU_AUTOPROBE
+	select GENERIC_CPU_VULNERABILITIES
 	select GENERIC_EARLY_IOREMAP
 	select GENERIC_FIND_FIRST_BIT
 	select GENERIC_IOMAP
@@ -429,6 +429,19 @@
        def_bool y
        depends on X86_GOLDFISH
 
+config RETPOLINE
+	bool "Avoid speculative indirect branches in kernel"
+	default y
+	help
+	  Compile kernel with the retpoline compiler options to guard against
+	  kernel-to-user data leaks by avoiding speculative indirect
+	  branches. Requires a compiler with -mindirect-branch=thunk-extern
+	  support for full protection. The kernel may run slower.
+
+	  Without compiler support, at least indirect branches in assembler
+	  code are eliminated. Since this includes the syscall entry path,
+	  it is not entirely pointless.
+
 config INTEL_RDT
 	bool "Intel Resource Director Technology support"
 	default n
@@ -797,6 +810,15 @@
 config PARAVIRT_CLOCK
 	bool
 
+config JAILHOUSE_GUEST
+	bool "Jailhouse non-root cell support"
+	depends on X86_64 && PCI
+	select X86_PM_TIMER
+	---help---
+	  This option allows to run Linux as guest in a Jailhouse non-root
+	  cell. You can leave this option disabled if you only want to start
+	  Jailhouse and run Linux afterwards in the root cell.
+
 endif #HYPERVISOR_GUEST
 
 config NO_BOOTMEM
diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug
index 672441c..192e4d2 100644
--- a/arch/x86/Kconfig.debug
+++ b/arch/x86/Kconfig.debug
@@ -169,14 +169,6 @@
 	  options. See Documentation/x86/x86_64/boot-options.txt for more
 	  details.
 
-config IOMMU_STRESS
-	bool "Enable IOMMU stress-test mode"
-	---help---
-	  This option disables various optimizations in IOMMU related
-	  code to do real stress testing of the IOMMU code. This option
-	  will cause a performance drop and should only be enabled for
-	  testing.
-
 config IOMMU_LEAK
 	bool "IOMMU leak tracing"
 	depends on IOMMU_DEBUG && DMA_API_DEBUG
diff --git a/arch/x86/Makefile b/arch/x86/Makefile
index 3e73bc2..fad5516 100644
--- a/arch/x86/Makefile
+++ b/arch/x86/Makefile
@@ -230,6 +230,14 @@
 #
 KBUILD_CFLAGS += -fno-asynchronous-unwind-tables
 
+# Avoid indirect branches in kernel to deal with Spectre
+ifdef CONFIG_RETPOLINE
+    RETPOLINE_CFLAGS += $(call cc-option,-mindirect-branch=thunk-extern -mindirect-branch-register)
+    ifneq ($(RETPOLINE_CFLAGS),)
+        KBUILD_CFLAGS += $(RETPOLINE_CFLAGS) -DRETPOLINE
+    endif
+endif
+
 archscripts: scripts_basic
 	$(Q)$(MAKE) $(build)=arch/x86/tools relocs
 
diff --git a/arch/x86/crypto/aesni-intel_asm.S b/arch/x86/crypto/aesni-intel_asm.S
index 16627fe..3d09e3a 100644
--- a/arch/x86/crypto/aesni-intel_asm.S
+++ b/arch/x86/crypto/aesni-intel_asm.S
@@ -32,6 +32,7 @@
 #include <linux/linkage.h>
 #include <asm/inst.h>
 #include <asm/frame.h>
+#include <asm/nospec-branch.h>
 
 /*
  * The following macros are used to move an (un)aligned 16 byte value to/from
@@ -2884,7 +2885,7 @@
 	pxor INC, STATE4
 	movdqu IV, 0x30(OUTP)
 
-	call *%r11
+	CALL_NOSPEC %r11
 
 	movdqu 0x00(OUTP), INC
 	pxor INC, STATE1
@@ -2929,7 +2930,7 @@
 	_aesni_gf128mul_x_ble()
 	movups IV, (IVP)
 
-	call *%r11
+	CALL_NOSPEC %r11
 
 	movdqu 0x40(OUTP), INC
 	pxor INC, STATE1
diff --git a/arch/x86/crypto/camellia-aesni-avx-asm_64.S b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
index f7c495e..a14af6e 100644
--- a/arch/x86/crypto/camellia-aesni-avx-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx-asm_64.S
@@ -17,6 +17,7 @@
 
 #include <linux/linkage.h>
 #include <asm/frame.h>
+#include <asm/nospec-branch.h>
 
 #define CAMELLIA_TABLE_BYTE_LEN 272
 
@@ -1227,7 +1228,7 @@
 	vpxor 14 * 16(%rax), %xmm15, %xmm14;
 	vpxor 15 * 16(%rax), %xmm15, %xmm15;
 
-	call *%r9;
+	CALL_NOSPEC %r9;
 
 	addq $(16 * 16), %rsp;
 
diff --git a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
index eee5b39..b66bbfa 100644
--- a/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
+++ b/arch/x86/crypto/camellia-aesni-avx2-asm_64.S
@@ -12,6 +12,7 @@
 
 #include <linux/linkage.h>
 #include <asm/frame.h>
+#include <asm/nospec-branch.h>
 
 #define CAMELLIA_TABLE_BYTE_LEN 272
 
@@ -1343,7 +1344,7 @@
 	vpxor 14 * 32(%rax), %ymm15, %ymm14;
 	vpxor 15 * 32(%rax), %ymm15, %ymm15;
 
-	call *%r9;
+	CALL_NOSPEC %r9;
 
 	addq $(16 * 32), %rsp;
 
diff --git a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
index 7a7de27..d9b734d 100644
--- a/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
+++ b/arch/x86/crypto/crc32c-pcl-intel-asm_64.S
@@ -45,6 +45,7 @@
 
 #include <asm/inst.h>
 #include <linux/linkage.h>
+#include <asm/nospec-branch.h>
 
 ## ISCSI CRC 32 Implementation with crc32 and pclmulqdq Instruction
 
@@ -172,7 +173,7 @@
 	movzxw  (bufp, %rax, 2), len
 	lea	crc_array(%rip), bufp
 	lea     (bufp, len, 1), bufp
-	jmp     *bufp
+	JMP_NOSPEC bufp
 
 	################################################################
 	## 2a) PROCESS FULL BLOCKS:
diff --git a/arch/x86/entry/calling.h b/arch/x86/entry/calling.h
index 45a63e0..3f48f69 100644
--- a/arch/x86/entry/calling.h
+++ b/arch/x86/entry/calling.h
@@ -198,8 +198,11 @@ For 32-bit we have the following conventions - kernel is built with
  * PAGE_TABLE_ISOLATION PGDs are 8k.  Flip bit 12 to switch between the two
  * halves:
  */
-#define PTI_SWITCH_PGTABLES_MASK	(1<<PAGE_SHIFT)
-#define PTI_SWITCH_MASK		(PTI_SWITCH_PGTABLES_MASK|(1<<X86_CR3_PTI_SWITCH_BIT))
+#define PTI_USER_PGTABLE_BIT		PAGE_SHIFT
+#define PTI_USER_PGTABLE_MASK		(1 << PTI_USER_PGTABLE_BIT)
+#define PTI_USER_PCID_BIT		X86_CR3_PTI_PCID_USER_BIT
+#define PTI_USER_PCID_MASK		(1 << PTI_USER_PCID_BIT)
+#define PTI_USER_PGTABLE_AND_PCID_MASK  (PTI_USER_PCID_MASK | PTI_USER_PGTABLE_MASK)
 
 .macro SET_NOFLUSH_BIT	reg:req
 	bts	$X86_CR3_PCID_NOFLUSH_BIT, \reg
@@ -208,7 +211,7 @@ For 32-bit we have the following conventions - kernel is built with
 .macro ADJUST_KERNEL_CR3 reg:req
 	ALTERNATIVE "", "SET_NOFLUSH_BIT \reg", X86_FEATURE_PCID
 	/* Clear PCID and "PAGE_TABLE_ISOLATION bit", point CR3 at kernel pagetables: */
-	andq    $(~PTI_SWITCH_MASK), \reg
+	andq    $(~PTI_USER_PGTABLE_AND_PCID_MASK), \reg
 .endm
 
 .macro SWITCH_TO_KERNEL_CR3 scratch_reg:req
@@ -239,15 +242,19 @@ For 32-bit we have the following conventions - kernel is built with
 	/* Flush needed, clear the bit */
 	btr	\scratch_reg, THIS_CPU_user_pcid_flush_mask
 	movq	\scratch_reg2, \scratch_reg
-	jmp	.Lwrcr3_\@
+	jmp	.Lwrcr3_pcid_\@
 
 .Lnoflush_\@:
 	movq	\scratch_reg2, \scratch_reg
 	SET_NOFLUSH_BIT \scratch_reg
 
+.Lwrcr3_pcid_\@:
+	/* Flip the ASID to the user version */
+	orq	$(PTI_USER_PCID_MASK), \scratch_reg
+
 .Lwrcr3_\@:
-	/* Flip the PGD and ASID to the user version */
-	orq     $(PTI_SWITCH_MASK), \scratch_reg
+	/* Flip the PGD to the user version */
+	orq     $(PTI_USER_PGTABLE_MASK), \scratch_reg
 	mov	\scratch_reg, %cr3
 .Lend_\@:
 .endm
@@ -263,17 +270,12 @@ For 32-bit we have the following conventions - kernel is built with
 	movq	%cr3, \scratch_reg
 	movq	\scratch_reg, \save_reg
 	/*
-	 * Is the "switch mask" all zero?  That means that both of
-	 * these are zero:
-	 *
-	 *	1. The user/kernel PCID bit, and
-	 *	2. The user/kernel "bit" that points CR3 to the
-	 *	   bottom half of the 8k PGD
-	 *
-	 * That indicates a kernel CR3 value, not a user CR3.
+	 * Test the user pagetable bit. If set, then the user page tables
+	 * are active. If clear CR3 already has the kernel page table
+	 * active.
 	 */
-	testq	$(PTI_SWITCH_MASK), \scratch_reg
-	jz	.Ldone_\@
+	bt	$PTI_USER_PGTABLE_BIT, \scratch_reg
+	jnc	.Ldone_\@
 
 	ADJUST_KERNEL_CR3 \scratch_reg
 	movq	\scratch_reg, %cr3
@@ -290,7 +292,7 @@ For 32-bit we have the following conventions - kernel is built with
 	 * KERNEL pages can always resume with NOFLUSH as we do
 	 * explicit flushes.
 	 */
-	bt	$X86_CR3_PTI_SWITCH_BIT, \save_reg
+	bt	$PTI_USER_PGTABLE_BIT, \save_reg
 	jnc	.Lnoflush_\@
 
 	/*
diff --git a/arch/x86/entry/entry_32.S b/arch/x86/entry/entry_32.S
index ace8f32..2a35b1e 100644
--- a/arch/x86/entry/entry_32.S
+++ b/arch/x86/entry/entry_32.S
@@ -44,6 +44,7 @@
 #include <asm/asm.h>
 #include <asm/smap.h>
 #include <asm/frame.h>
+#include <asm/nospec-branch.h>
 
 	.section .entry.text, "ax"
 
@@ -243,6 +244,18 @@
 	movl	%ebx, PER_CPU_VAR(stack_canary)+stack_canary_offset
 #endif
 
+#ifdef CONFIG_RETPOLINE
+	/*
+	 * When switching from a shallower to a deeper call stack
+	 * the RSB may either underflow or use entries populated
+	 * with userspace addresses. On CPUs where those concerns
+	 * exist, overwrite the RSB with entries which capture
+	 * speculative execution to prevent attack.
+	 */
+	/* Clobbers %ebx */
+	FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+#endif
+
 	/* restore callee-saved registers */
 	popl	%esi
 	popl	%edi
@@ -290,7 +303,7 @@
 
 	/* kernel thread */
 1:	movl	%edi, %eax
-	call	*%ebx
+	CALL_NOSPEC %ebx
 	/*
 	 * A kernel thread is allowed to return here after successfully
 	 * calling do_execve().  Exit to userspace to complete the execve()
@@ -919,7 +932,7 @@
 	movl	%ecx, %es
 	TRACE_IRQS_OFF
 	movl	%esp, %eax			# pt_regs pointer
-	call	*%edi
+	CALL_NOSPEC %edi
 	jmp	ret_from_exception
 END(common_exception)
 
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index f048e38..a835704 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -37,6 +37,7 @@
 #include <asm/pgtable_types.h>
 #include <asm/export.h>
 #include <asm/frame.h>
+#include <asm/nospec-branch.h>
 #include <linux/err.h>
 
 #include "calling.h"
@@ -191,7 +192,7 @@
 	 */
 	pushq	%rdi
 	movq	$entry_SYSCALL_64_stage2, %rdi
-	jmp	*%rdi
+	JMP_NOSPEC %rdi
 END(entry_SYSCALL_64_trampoline)
 
 	.popsection
@@ -270,7 +271,12 @@
 	 * It might end up jumping to the slow path.  If it jumps, RAX
 	 * and all argument registers are clobbered.
 	 */
+#ifdef CONFIG_RETPOLINE
+	movq	sys_call_table(, %rax, 8), %rax
+	call	__x86_indirect_thunk_rax
+#else
 	call	*sys_call_table(, %rax, 8)
+#endif
 .Lentry_SYSCALL_64_after_fastpath_call:
 
 	movq	%rax, RAX(%rsp)
@@ -442,7 +448,7 @@
 	jmp	entry_SYSCALL64_slow_path
 
 1:
-	jmp	*%rax				/* Called from C */
+	JMP_NOSPEC %rax				/* Called from C */
 END(stub_ptregs_64)
 
 .macro ptregs_stub func
@@ -485,6 +491,18 @@
 	movq	%rbx, PER_CPU_VAR(irq_stack_union)+stack_canary_offset
 #endif
 
+#ifdef CONFIG_RETPOLINE
+	/*
+	 * When switching from a shallower to a deeper call stack
+	 * the RSB may either underflow or use entries populated
+	 * with userspace addresses. On CPUs where those concerns
+	 * exist, overwrite the RSB with entries which capture
+	 * speculative execution to prevent attack.
+	 */
+	/* Clobbers %rbx */
+	FILL_RETURN_BUFFER RSB_CLEAR_LOOPS, X86_FEATURE_RSB_CTXSW
+#endif
+
 	/* restore callee-saved registers */
 	popq	%r15
 	popq	%r14
@@ -521,7 +539,7 @@
 1:
 	/* kernel thread */
 	movq	%r12, %rdi
-	call	*%rbx
+	CALL_NOSPEC %rbx
 	/*
 	 * A kernel thread is allowed to return here after successfully
 	 * calling do_execve().  Exit to userspace to complete the execve()
@@ -1247,7 +1265,7 @@
 #endif
 
 #ifdef CONFIG_X86_MCE
-idtentry machine_check					has_error_code=0	paranoid=1 do_sym=*machine_check_vector(%rip)
+idtentry machine_check		do_mce			has_error_code=0	paranoid=1
 #endif
 
 /*
diff --git a/arch/x86/entry/entry_64_compat.S b/arch/x86/entry/entry_64_compat.S
index 40f1700..98d5358 100644
--- a/arch/x86/entry/entry_64_compat.S
+++ b/arch/x86/entry/entry_64_compat.S
@@ -190,8 +190,13 @@
 	/* Interrupts are off on entry. */
 	swapgs
 
-	/* Stash user ESP and switch to the kernel stack. */
+	/* Stash user ESP */
 	movl	%esp, %r8d
+
+	/* Use %rsp as scratch reg. User ESP is stashed in r8 */
+	SWITCH_TO_KERNEL_CR3 scratch_reg=%rsp
+
+	/* Switch to the kernel stack */
 	movq	PER_CPU_VAR(cpu_current_top_of_stack), %rsp
 
 	/* Construct struct pt_regs on stack */
@@ -220,12 +225,6 @@
 	pushq   $0			/* pt_regs->r15 = 0 */
 
 	/*
-	 * We just saved %rdi so it is safe to clobber.  It is not
-	 * preserved during the C calls inside TRACE_IRQS_OFF anyway.
-	 */
-	SWITCH_TO_KERNEL_CR3 scratch_reg=%rdi
-
-	/*
 	 * User mode is traced as though IRQs are on, and SYSENTER
 	 * turned them off.
 	 */
diff --git a/arch/x86/events/amd/power.c b/arch/x86/events/amd/power.c
index a6eee5a..2aefacf 100644
--- a/arch/x86/events/amd/power.c
+++ b/arch/x86/events/amd/power.c
@@ -277,7 +277,7 @@ static int __init amd_power_pmu_init(void)
 	int ret;
 
 	if (!x86_match_cpu(cpu_match))
-		return 0;
+		return -ENODEV;
 
 	if (!boot_cpu_has(X86_FEATURE_ACC_POWER))
 		return -ENODEV;
diff --git a/arch/x86/events/intel/bts.c b/arch/x86/events/intel/bts.c
index 141e07b..24ffa1e 100644
--- a/arch/x86/events/intel/bts.c
+++ b/arch/x86/events/intel/bts.c
@@ -582,6 +582,24 @@ static __init int bts_init(void)
 	if (!boot_cpu_has(X86_FEATURE_DTES64) || !x86_pmu.bts)
 		return -ENODEV;
 
+	if (boot_cpu_has(X86_FEATURE_PTI)) {
+		/*
+		 * BTS hardware writes through a virtual memory map we must
+		 * either use the kernel physical map, or the user mapping of
+		 * the AUX buffer.
+		 *
+		 * However, since this driver supports per-CPU and per-task inherit
+		 * we cannot use the user mapping since it will not be availble
+		 * if we're not running the owning process.
+		 *
+		 * With PTI we can't use the kernal map either, because its not
+		 * there when we run userspace.
+		 *
+		 * For now, disable this driver when using PTI.
+		 */
+		return -ENODEV;
+	}
+
 	bts_pmu.capabilities	= PERF_PMU_CAP_AUX_NO_SG | PERF_PMU_CAP_ITRACE |
 				  PERF_PMU_CAP_EXCLUSIVE;
 	bts_pmu.task_ctx_nr	= perf_sw_context;
diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index 8f0aace..18c25ab 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -5,6 +5,7 @@
 
 #include <asm/cpu_entry_area.h>
 #include <asm/perf_event.h>
+#include <asm/tlbflush.h>
 #include <asm/insn.h>
 
 #include "../perf_event.h"
@@ -283,20 +284,35 @@ static DEFINE_PER_CPU(void *, insn_buffer);
 
 static void ds_update_cea(void *cea, void *addr, size_t size, pgprot_t prot)
 {
+	unsigned long start = (unsigned long)cea;
 	phys_addr_t pa;
 	size_t msz = 0;
 
 	pa = virt_to_phys(addr);
+
+	preempt_disable();
 	for (; msz < size; msz += PAGE_SIZE, pa += PAGE_SIZE, cea += PAGE_SIZE)
 		cea_set_pte(cea, pa, prot);
+
+	/*
+	 * This is a cross-CPU update of the cpu_entry_area, we must shoot down
+	 * all TLB entries for it.
+	 */
+	flush_tlb_kernel_range(start, start + size);
+	preempt_enable();
 }
 
 static void ds_clear_cea(void *cea, size_t size)
 {
+	unsigned long start = (unsigned long)cea;
 	size_t msz = 0;
 
+	preempt_disable();
 	for (; msz < size; msz += PAGE_SIZE, cea += PAGE_SIZE)
 		cea_set_pte(cea, 0, PAGE_NONE);
+
+	flush_tlb_kernel_range(start, start + size);
+	preempt_enable();
 }
 
 static void *dsalloc_pages(size_t size, gfp_t flags, int cpu)
@@ -356,10 +372,9 @@ static int alloc_pebs_buffer(int cpu)
 static void release_pebs_buffer(int cpu)
 {
 	struct cpu_hw_events *hwev = per_cpu_ptr(&cpu_hw_events, cpu);
-	struct debug_store *ds = hwev->ds;
 	void *cea;
 
-	if (!ds || !x86_pmu.pebs)
+	if (!x86_pmu.pebs)
 		return;
 
 	kfree(per_cpu(insn_buffer, cpu));
@@ -368,7 +383,6 @@ static void release_pebs_buffer(int cpu)
 	/* Clear the fixmap */
 	cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.pebs_buffer;
 	ds_clear_cea(cea, x86_pmu.pebs_buffer_size);
-	ds->pebs_buffer_base = 0;
 	dsfree_pages(hwev->ds_pebs_vaddr, x86_pmu.pebs_buffer_size);
 	hwev->ds_pebs_vaddr = NULL;
 }
@@ -403,16 +417,14 @@ static int alloc_bts_buffer(int cpu)
 static void release_bts_buffer(int cpu)
 {
 	struct cpu_hw_events *hwev = per_cpu_ptr(&cpu_hw_events, cpu);
-	struct debug_store *ds = hwev->ds;
 	void *cea;
 
-	if (!ds || !x86_pmu.bts)
+	if (!x86_pmu.bts)
 		return;
 
 	/* Clear the fixmap */
 	cea = &get_cpu_entry_area(cpu)->cpu_debug_buffers.bts_buffer;
 	ds_clear_cea(cea, BTS_BUFFER_SIZE);
-	ds->bts_buffer_base = 0;
 	dsfree_pages(hwev->ds_bts_vaddr, BTS_BUFFER_SIZE);
 	hwev->ds_bts_vaddr = NULL;
 }
@@ -438,16 +450,22 @@ void release_ds_buffers(void)
 	if (!x86_pmu.bts && !x86_pmu.pebs)
 		return;
 
-	get_online_cpus();
-	for_each_online_cpu(cpu)
+	for_each_possible_cpu(cpu)
+		release_ds_buffer(cpu);
+
+	for_each_possible_cpu(cpu) {
+		/*
+		 * Again, ignore errors from offline CPUs, they will no longer
+		 * observe cpu_hw_events.ds and not program the DS_AREA when
+		 * they come up.
+		 */
 		fini_debug_store_on_cpu(cpu);
+	}
 
 	for_each_possible_cpu(cpu) {
 		release_pebs_buffer(cpu);
 		release_bts_buffer(cpu);
-		release_ds_buffer(cpu);
 	}
-	put_online_cpus();
 }
 
 void reserve_ds_buffers(void)
@@ -467,8 +485,6 @@ void reserve_ds_buffers(void)
 	if (!x86_pmu.pebs)
 		pebs_err = 1;
 
-	get_online_cpus();
-
 	for_each_possible_cpu(cpu) {
 		if (alloc_ds_buffer(cpu)) {
 			bts_err = 1;
@@ -505,11 +521,14 @@ void reserve_ds_buffers(void)
 		if (x86_pmu.pebs && !pebs_err)
 			x86_pmu.pebs_active = 1;
 
-		for_each_online_cpu(cpu)
+		for_each_possible_cpu(cpu) {
+			/*
+			 * Ignores wrmsr_on_cpu() errors for offline CPUs they
+			 * will get this call through intel_pmu_cpu_starting().
+			 */
 			init_debug_store_on_cpu(cpu);
+		}
 	}
-
-	put_online_cpus();
 }
 
 /*
diff --git a/arch/x86/events/intel/rapl.c b/arch/x86/events/intel/rapl.c
index 005908e..a2efb49 100644
--- a/arch/x86/events/intel/rapl.c
+++ b/arch/x86/events/intel/rapl.c
@@ -755,14 +755,14 @@ static const struct x86_cpu_id rapl_cpu_match[] __initconst = {
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_IVYBRIDGE_X, snbep_rapl_init),
 
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_CORE, hsw_rapl_init),
-	X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_X,    hsw_rapl_init),
+	X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_X,    hsx_rapl_init),
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_ULT,  hsw_rapl_init),
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_HASWELL_GT3E, hsw_rapl_init),
 
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_CORE,   hsw_rapl_init),
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_GT3E,   hsw_rapl_init),
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_X,	  hsx_rapl_init),
-	X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_XEON_D, hsw_rapl_init),
+	X86_RAPL_MODEL_MATCH(INTEL_FAM6_BROADWELL_XEON_D, hsx_rapl_init),
 
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_XEON_PHI_KNL, knl_rapl_init),
 	X86_RAPL_MODEL_MATCH(INTEL_FAM6_XEON_PHI_KNM, knl_rapl_init),
diff --git a/arch/x86/events/msr.c b/arch/x86/events/msr.c
index 14efaa0..18e2628 100644
--- a/arch/x86/events/msr.c
+++ b/arch/x86/events/msr.c
@@ -10,7 +10,9 @@ enum perf_msr_id {
 	PERF_MSR_SMI			= 4,
 	PERF_MSR_PTSC			= 5,
 	PERF_MSR_IRPERF			= 6,
-
+	PERF_MSR_THERM			= 7,
+	PERF_MSR_THERM_SNAP		= 8,
+	PERF_MSR_THERM_UNIT		= 9,
 	PERF_MSR_EVENT_MAX,
 };
 
@@ -29,6 +31,11 @@ static bool test_irperf(int idx)
 	return boot_cpu_has(X86_FEATURE_IRPERF);
 }
 
+static bool test_therm_status(int idx)
+{
+	return boot_cpu_has(X86_FEATURE_DTHERM);
+}
+
 static bool test_intel(int idx)
 {
 	if (boot_cpu_data.x86_vendor != X86_VENDOR_INTEL ||
@@ -95,22 +102,28 @@ struct perf_msr {
 	bool	(*test)(int idx);
 };
 
-PMU_EVENT_ATTR_STRING(tsc,    evattr_tsc,    "event=0x00");
-PMU_EVENT_ATTR_STRING(aperf,  evattr_aperf,  "event=0x01");
-PMU_EVENT_ATTR_STRING(mperf,  evattr_mperf,  "event=0x02");
-PMU_EVENT_ATTR_STRING(pperf,  evattr_pperf,  "event=0x03");
-PMU_EVENT_ATTR_STRING(smi,    evattr_smi,    "event=0x04");
-PMU_EVENT_ATTR_STRING(ptsc,   evattr_ptsc,   "event=0x05");
-PMU_EVENT_ATTR_STRING(irperf, evattr_irperf, "event=0x06");
+PMU_EVENT_ATTR_STRING(tsc,				evattr_tsc,		"event=0x00"	);
+PMU_EVENT_ATTR_STRING(aperf,				evattr_aperf,		"event=0x01"	);
+PMU_EVENT_ATTR_STRING(mperf,				evattr_mperf,		"event=0x02"	);
+PMU_EVENT_ATTR_STRING(pperf,				evattr_pperf,		"event=0x03"	);
+PMU_EVENT_ATTR_STRING(smi,				evattr_smi,		"event=0x04"	);
+PMU_EVENT_ATTR_STRING(ptsc,				evattr_ptsc,		"event=0x05"	);
+PMU_EVENT_ATTR_STRING(irperf,				evattr_irperf,		"event=0x06"	);
+PMU_EVENT_ATTR_STRING(cpu_thermal_margin,		evattr_therm,		"event=0x07"	);
+PMU_EVENT_ATTR_STRING(cpu_thermal_margin.snapshot,	evattr_therm_snap,	"1"		);
+PMU_EVENT_ATTR_STRING(cpu_thermal_margin.unit,		evattr_therm_unit,	"C"		);
 
 static struct perf_msr msr[] = {
-	[PERF_MSR_TSC]    = { 0,		&evattr_tsc,	NULL,		 },
-	[PERF_MSR_APERF]  = { MSR_IA32_APERF,	&evattr_aperf,	test_aperfmperf, },
-	[PERF_MSR_MPERF]  = { MSR_IA32_MPERF,	&evattr_mperf,	test_aperfmperf, },
-	[PERF_MSR_PPERF]  = { MSR_PPERF,	&evattr_pperf,	test_intel,	 },
-	[PERF_MSR_SMI]    = { MSR_SMI_COUNT,	&evattr_smi,	test_intel,	 },
-	[PERF_MSR_PTSC]   = { MSR_F15H_PTSC,	&evattr_ptsc,	test_ptsc,	 },
-	[PERF_MSR_IRPERF] = { MSR_F17H_IRPERF,	&evattr_irperf,	test_irperf,	 },
+	[PERF_MSR_TSC]		= { 0,				&evattr_tsc,		NULL,			},
+	[PERF_MSR_APERF]	= { MSR_IA32_APERF,		&evattr_aperf,		test_aperfmperf,	},
+	[PERF_MSR_MPERF]	= { MSR_IA32_MPERF,		&evattr_mperf,		test_aperfmperf,	},
+	[PERF_MSR_PPERF]	= { MSR_PPERF,			&evattr_pperf,		test_intel,		},
+	[PERF_MSR_SMI]		= { MSR_SMI_COUNT,		&evattr_smi,		test_intel,		},
+	[PERF_MSR_PTSC]		= { MSR_F15H_PTSC,		&evattr_ptsc,		test_ptsc,		},
+	[PERF_MSR_IRPERF]	= { MSR_F17H_IRPERF,		&evattr_irperf,		test_irperf,		},
+	[PERF_MSR_THERM]	= { MSR_IA32_THERM_STATUS,	&evattr_therm,		test_therm_status,	},
+	[PERF_MSR_THERM_SNAP]	= { MSR_IA32_THERM_STATUS,	&evattr_therm_snap,	test_therm_status,	},
+	[PERF_MSR_THERM_UNIT]	= { MSR_IA32_THERM_STATUS,	&evattr_therm_unit,	test_therm_status,	},
 };
 
 static struct attribute *events_attrs[PERF_MSR_EVENT_MAX + 1] = {
@@ -161,9 +174,9 @@ static int msr_event_init(struct perf_event *event)
 	if (!msr[cfg].attr)
 		return -EINVAL;
 
-	event->hw.idx = -1;
-	event->hw.event_base = msr[cfg].msr;
-	event->hw.config = cfg;
+	event->hw.idx		= -1;
+	event->hw.event_base	= msr[cfg].msr;
+	event->hw.config	= cfg;
 
 	return 0;
 }
@@ -184,7 +197,7 @@ static void msr_event_update(struct perf_event *event)
 	u64 prev, now;
 	s64 delta;
 
-	/* Careful, an NMI might modify the previous event value. */
+	/* Careful, an NMI might modify the previous event value: */
 again:
 	prev = local64_read(&event->hw.prev_count);
 	now = msr_read_counter(event);
@@ -193,17 +206,22 @@ static void msr_event_update(struct perf_event *event)
 		goto again;
 
 	delta = now - prev;
-	if (unlikely(event->hw.event_base == MSR_SMI_COUNT))
+	if (unlikely(event->hw.event_base == MSR_SMI_COUNT)) {
 		delta = sign_extend64(delta, 31);
-
-	local64_add(delta, &event->count);
+		local64_add(delta, &event->count);
+	} else if (unlikely(event->hw.event_base == MSR_IA32_THERM_STATUS)) {
+		/* If valid, extract digital readout, otherwise set to -1: */
+		now = now & (1ULL << 31) ? (now >> 16) & 0x3f :  -1;
+		local64_set(&event->count, now);
+	} else {
+		local64_add(delta, &event->count);
+	}
 }
 
 static void msr_event_start(struct perf_event *event, int flags)
 {
-	u64 now;
+	u64 now = msr_read_counter(event);
 
-	now = msr_read_counter(event);
 	local64_set(&event->hw.prev_count, now);
 }
 
@@ -250,9 +268,7 @@ static int __init msr_init(void)
 	for (i = PERF_MSR_TSC + 1; i < PERF_MSR_EVENT_MAX; i++) {
 		u64 val;
 
-		/*
-		 * Virt sucks arse; you cannot tell if a R/O MSR is present :/
-		 */
+		/* Virt sucks; you cannot tell if a R/O MSR is present :/ */
 		if (!msr[i].test(i) || rdmsrl_safe(msr[i].msr, &val))
 			msr[i].attr = NULL;
 	}
diff --git a/arch/x86/hyperv/mmu.c b/arch/x86/hyperv/mmu.c
index 9cc9e1c..56c9eba 100644
--- a/arch/x86/hyperv/mmu.c
+++ b/arch/x86/hyperv/mmu.c
@@ -137,7 +137,12 @@ static void hyperv_flush_tlb_others(const struct cpumask *cpus,
 	}
 
 	if (info->mm) {
+		/*
+		 * AddressSpace argument must match the CR3 with PCID bits
+		 * stripped out.
+		 */
 		flush->address_space = virt_to_phys(info->mm->pgd);
+		flush->address_space &= CR3_ADDR_MASK;
 		flush->flags = 0;
 	} else {
 		flush->address_space = 0;
@@ -219,7 +224,12 @@ static void hyperv_flush_tlb_others_ex(const struct cpumask *cpus,
 	}
 
 	if (info->mm) {
+		/*
+		 * AddressSpace argument must match the CR3 with PCID bits
+		 * stripped out.
+		 */
 		flush->address_space = virt_to_phys(info->mm->pgd);
+		flush->address_space &= CR3_ADDR_MASK;
 		flush->flags = 0;
 	} else {
 		flush->address_space = 0;
@@ -278,8 +288,6 @@ void hyperv_setup_mmu_ops(void)
 	if (!(ms_hyperv.hints & HV_X64_REMOTE_TLB_FLUSH_RECOMMENDED))
 		return;
 
-	setup_clear_cpu_cap(X86_FEATURE_PCID);
-
 	if (!(ms_hyperv.hints & HV_X64_EX_PROCESSOR_MASKS_RECOMMENDED)) {
 		pr_info("Using hypercall for remote TLB flush\n");
 		pv_mmu_ops.flush_tlb_others = hyperv_flush_tlb_others;
diff --git a/arch/x86/include/asm/acpi.h b/arch/x86/include/asm/acpi.h
index 8d0ec9d..44f5d79 100644
--- a/arch/x86/include/asm/acpi.h
+++ b/arch/x86/include/asm/acpi.h
@@ -49,7 +49,7 @@ extern int acpi_fix_pin2_polarity;
 extern int acpi_disable_cmcff;
 
 extern u8 acpi_sci_flags;
-extern int acpi_sci_override_gsi;
+extern u32 acpi_sci_override_gsi;
 void acpi_pic_sci_set_trigger(unsigned int, u16);
 
 struct device;
diff --git a/arch/x86/include/asm/alternative.h b/arch/x86/include/asm/alternative.h
index dbfd085..cf5961c 100644
--- a/arch/x86/include/asm/alternative.h
+++ b/arch/x86/include/asm/alternative.h
@@ -140,7 +140,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	".popsection\n"							\
 	".pushsection .altinstr_replacement, \"ax\"\n"			\
 	ALTINSTR_REPLACEMENT(newinstr, feature, 1)			\
-	".popsection"
+	".popsection\n"
 
 #define ALTERNATIVE_2(oldinstr, newinstr1, feature1, newinstr2, feature2)\
 	OLDINSTR_2(oldinstr, 1, 2)					\
@@ -151,7 +151,7 @@ static inline int alternatives_text_reserved(void *start, void *end)
 	".pushsection .altinstr_replacement, \"ax\"\n"			\
 	ALTINSTR_REPLACEMENT(newinstr1, feature1, 1)			\
 	ALTINSTR_REPLACEMENT(newinstr2, feature2, 2)			\
-	".popsection"
+	".popsection\n"
 
 /*
  * Alternative instructions for different CPU types or capabilities.
diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index a9e57f0..9872277 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -136,6 +136,7 @@ extern void disconnect_bsp_APIC(int virt_wire_setup);
 extern void disable_local_APIC(void);
 extern void lapic_shutdown(void);
 extern void sync_Arb_IDs(void);
+extern void init_bsp_APIC(void);
 extern void apic_intr_mode_init(void);
 extern void setup_local_APIC(void);
 extern void init_apic_mappings(void);
diff --git a/arch/x86/include/asm/asm-prototypes.h b/arch/x86/include/asm/asm-prototypes.h
index ff700d8..4d11161 100644
--- a/arch/x86/include/asm/asm-prototypes.h
+++ b/arch/x86/include/asm/asm-prototypes.h
@@ -11,7 +11,34 @@
 #include <asm/pgtable.h>
 #include <asm/special_insns.h>
 #include <asm/preempt.h>
+#include <asm/asm.h>
 
 #ifndef CONFIG_X86_CMPXCHG64
 extern void cmpxchg8b_emu(void);
 #endif
+
+#ifdef CONFIG_RETPOLINE
+#ifdef CONFIG_X86_32
+#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
+#else
+#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
+INDIRECT_THUNK(8)
+INDIRECT_THUNK(9)
+INDIRECT_THUNK(10)
+INDIRECT_THUNK(11)
+INDIRECT_THUNK(12)
+INDIRECT_THUNK(13)
+INDIRECT_THUNK(14)
+INDIRECT_THUNK(15)
+#endif
+INDIRECT_THUNK(ax)
+INDIRECT_THUNK(bx)
+INDIRECT_THUNK(cx)
+INDIRECT_THUNK(dx)
+INDIRECT_THUNK(si)
+INDIRECT_THUNK(di)
+INDIRECT_THUNK(bp)
+asmlinkage void __fill_rsb(void);
+asmlinkage void __clear_rsb(void);
+
+#endif /* CONFIG_RETPOLINE */
diff --git a/arch/x86/include/asm/cpufeature.h b/arch/x86/include/asm/cpufeature.h
index ea9a7dd..70eddb3 100644
--- a/arch/x86/include/asm/cpufeature.h
+++ b/arch/x86/include/asm/cpufeature.h
@@ -29,6 +29,7 @@ enum cpuid_leafs
 	CPUID_8000_000A_EDX,
 	CPUID_7_ECX,
 	CPUID_8000_0007_EBX,
+	CPUID_7_EDX,
 };
 
 #ifdef CONFIG_X86_FEATURE_NAMES
@@ -79,8 +80,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 15, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 16, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 17, feature_bit) ||	\
+	   CHECK_BIT_IN_MASK_WORD(REQUIRED_MASK, 18, feature_bit) ||	\
 	   REQUIRED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
 
 #define DISABLED_MASK_BIT_SET(feature_bit)				\
 	 ( CHECK_BIT_IN_MASK_WORD(DISABLED_MASK,  0, feature_bit) ||	\
@@ -101,8 +103,9 @@ extern const char * const x86_bug_flags[NBUGINTS*32];
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 15, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 16, feature_bit) ||	\
 	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 17, feature_bit) ||	\
+	   CHECK_BIT_IN_MASK_WORD(DISABLED_MASK, 18, feature_bit) ||	\
 	   DISABLED_MASK_CHECK					  ||	\
-	   BUILD_BUG_ON_ZERO(NCAPINTS != 18))
+	   BUILD_BUG_ON_ZERO(NCAPINTS != 19))
 
 #define cpu_has(c, bit)							\
 	(__builtin_constant_p(bit) && REQUIRED_MASK_BIT_SET(bit) ? 1 :	\
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 07cdd17..1d9199e 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -13,7 +13,7 @@
 /*
  * Defines x86 CPU feature bits
  */
-#define NCAPINTS			18	   /* N 32-bit words worth of info */
+#define NCAPINTS			19	   /* N 32-bit words worth of info */
 #define NBUGINTS			1	   /* N 32-bit bug flags */
 
 /*
@@ -203,12 +203,15 @@
 #define X86_FEATURE_PROC_FEEDBACK	( 7*32+ 9) /* AMD ProcFeedbackInterface */
 #define X86_FEATURE_SME			( 7*32+10) /* AMD Secure Memory Encryption */
 #define X86_FEATURE_PTI			( 7*32+11) /* Kernel Page Table Isolation enabled */
+#define X86_FEATURE_RETPOLINE		( 7*32+12) /* "" Generic Retpoline mitigation for Spectre variant 2 */
+#define X86_FEATURE_RETPOLINE_AMD	( 7*32+13) /* "" AMD Retpoline mitigation for Spectre variant 2 */
 #define X86_FEATURE_INTEL_PPIN		( 7*32+14) /* Intel Processor Inventory Number */
-#define X86_FEATURE_INTEL_PT		( 7*32+15) /* Intel Processor Trace */
-#define X86_FEATURE_AVX512_4VNNIW	( 7*32+16) /* AVX-512 Neural Network Instructions */
-#define X86_FEATURE_AVX512_4FMAPS	( 7*32+17) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_CDP_L2		( 7*32+15) /* Code and Data Prioritization L2 */
 
 #define X86_FEATURE_MBA			( 7*32+18) /* Memory Bandwidth Allocation */
+#define X86_FEATURE_RSB_CTXSW		( 7*32+19) /* "" Fill RSB on context switches */
+
+#define X86_FEATURE_USE_IBPB		( 7*32+21) /* "" Indirect Branch Prediction Barrier enabled */
 
 /* Virtualization flags: Linux defined, word 8 */
 #define X86_FEATURE_TPR_SHADOW		( 8*32+ 0) /* Intel TPR Shadow */
@@ -243,6 +246,7 @@
 #define X86_FEATURE_AVX512IFMA		( 9*32+21) /* AVX-512 Integer Fused Multiply-Add instructions */
 #define X86_FEATURE_CLFLUSHOPT		( 9*32+23) /* CLFLUSHOPT instruction */
 #define X86_FEATURE_CLWB		( 9*32+24) /* CLWB instruction */
+#define X86_FEATURE_INTEL_PT		( 9*32+25) /* Intel Processor Trace */
 #define X86_FEATURE_AVX512PF		( 9*32+26) /* AVX-512 Prefetch */
 #define X86_FEATURE_AVX512ER		( 9*32+27) /* AVX-512 Exponential and Reciprocal */
 #define X86_FEATURE_AVX512CD		( 9*32+28) /* AVX-512 Conflict Detection */
@@ -268,6 +272,9 @@
 #define X86_FEATURE_CLZERO		(13*32+ 0) /* CLZERO instruction */
 #define X86_FEATURE_IRPERF		(13*32+ 1) /* Instructions Retired Count */
 #define X86_FEATURE_XSAVEERPTR		(13*32+ 2) /* Always save/restore FP error pointers */
+#define X86_FEATURE_IBPB		(13*32+12) /* Indirect Branch Prediction Barrier */
+#define X86_FEATURE_IBRS		(13*32+14) /* Indirect Branch Restricted Speculation */
+#define X86_FEATURE_STIBP		(13*32+15) /* Single Thread Indirect Branch Predictors */
 
 /* Thermal and Power Management Leaf, CPUID level 0x00000006 (EAX), word 14 */
 #define X86_FEATURE_DTHERM		(14*32+ 0) /* Digital Thermal Sensor */
@@ -316,6 +323,13 @@
 #define X86_FEATURE_SUCCOR		(17*32+ 1) /* Uncorrectable error containment and recovery */
 #define X86_FEATURE_SMCA		(17*32+ 3) /* Scalable MCA */
 
+/* Intel-defined CPU features, CPUID level 0x00000007:0 (EDX), word 18 */
+#define X86_FEATURE_AVX512_4VNNIW	(18*32+ 2) /* AVX-512 Neural Network Instructions */
+#define X86_FEATURE_AVX512_4FMAPS	(18*32+ 3) /* AVX-512 Multiply Accumulation Single precision */
+#define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
+#define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
+#define X86_FEATURE_ARCH_CAPABILITIES	(18*32+29) /* IA32_ARCH_CAPABILITIES MSR (Intel) */
+
 /*
  * BUG word(s)
  */
@@ -341,6 +355,8 @@
 #define X86_BUG_SWAPGS_FENCE		X86_BUG(11) /* SWAPGS without input dep on GS */
 #define X86_BUG_MONITOR			X86_BUG(12) /* IPI required to wake up remote CPU */
 #define X86_BUG_AMD_E400		X86_BUG(13) /* CPU is among the affected by Erratum 400 */
-#define X86_BUG_CPU_INSECURE		X86_BUG(14) /* CPU is insecure and needs kernel page table isolation */
+#define X86_BUG_CPU_MELTDOWN		X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */
+#define X86_BUG_SPECTRE_V1		X86_BUG(15) /* CPU is affected by Spectre variant 1 attack with conditional branches */
+#define X86_BUG_SPECTRE_V2		X86_BUG(16) /* CPU is affected by Spectre variant 2 attack with indirect branches */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index b027633..33833d1 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -77,6 +77,7 @@
 #define DISABLED_MASK15	0
 #define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP)
 #define DISABLED_MASK17	0
-#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
+#define DISABLED_MASK18	0
+#define DISABLED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
 
 #endif /* _ASM_X86_DISABLED_FEATURES_H */
diff --git a/arch/x86/include/asm/hypervisor.h b/arch/x86/include/asm/hypervisor.h
index 96aa6b9..8c5aaba 100644
--- a/arch/x86/include/asm/hypervisor.h
+++ b/arch/x86/include/asm/hypervisor.h
@@ -28,6 +28,7 @@ enum x86_hypervisor_type {
 	X86_HYPER_XEN_PV,
 	X86_HYPER_XEN_HVM,
 	X86_HYPER_KVM,
+	X86_HYPER_JAILHOUSE,
 };
 
 #ifdef CONFIG_HYPERVISOR_GUEST
diff --git a/arch/x86/include/asm/i8259.h b/arch/x86/include/asm/i8259.h
index c8376b4..5cdcdbd 100644
--- a/arch/x86/include/asm/i8259.h
+++ b/arch/x86/include/asm/i8259.h
@@ -69,6 +69,11 @@ struct legacy_pic {
 extern struct legacy_pic *legacy_pic;
 extern struct legacy_pic null_legacy_pic;
 
+static inline bool has_legacy_pic(void)
+{
+	return legacy_pic != &null_legacy_pic;
+}
+
 static inline int nr_legacy_irqs(void)
 {
 	return legacy_pic->nr_legacy_irqs;
diff --git a/arch/x86/include/asm/jailhouse_para.h b/arch/x86/include/asm/jailhouse_para.h
new file mode 100644
index 0000000..875b543
--- /dev/null
+++ b/arch/x86/include/asm/jailhouse_para.h
@@ -0,0 +1,26 @@
+/* SPDX-License-Identifier: GPL2.0 */
+
+/*
+ * Jailhouse paravirt_ops implementation
+ *
+ * Copyright (c) Siemens AG, 2015-2017
+ *
+ * Authors:
+ *  Jan Kiszka <jan.kiszka@siemens.com>
+ */
+
+#ifndef _ASM_X86_JAILHOUSE_PARA_H
+#define _ASM_X86_JAILHOUSE_PARA_H
+
+#include <linux/types.h>
+
+#ifdef CONFIG_JAILHOUSE_GUEST
+bool jailhouse_paravirt(void);
+#else
+static inline bool jailhouse_paravirt(void)
+{
+	return false;
+}
+#endif
+
+#endif /* _ASM_X86_JAILHOUSE_PARA_H */
diff --git a/arch/x86/include/asm/mce.h b/arch/x86/include/asm/mce.h
index b1e8d8d..96ea4b5 100644
--- a/arch/x86/include/asm/mce.h
+++ b/arch/x86/include/asm/mce.h
@@ -376,6 +376,7 @@ struct smca_bank {
 extern struct smca_bank smca_banks[MAX_NR_BANKS];
 
 extern const char *smca_get_long_name(enum smca_bank_types t);
+extern bool amd_mce_is_memory_error(struct mce *m);
 
 extern int mce_threshold_create_device(unsigned int cpu);
 extern int mce_threshold_remove_device(unsigned int cpu);
@@ -384,6 +385,7 @@ extern int mce_threshold_remove_device(unsigned int cpu);
 
 static inline int mce_threshold_create_device(unsigned int cpu) { return 0; };
 static inline int mce_threshold_remove_device(unsigned int cpu) { return 0; };
+static inline bool amd_mce_is_memory_error(struct mce *m) { return false; };
 
 #endif
 
diff --git a/arch/x86/include/asm/mem_encrypt.h b/arch/x86/include/asm/mem_encrypt.h
index c9459a4..22c5f3e 100644
--- a/arch/x86/include/asm/mem_encrypt.h
+++ b/arch/x86/include/asm/mem_encrypt.h
@@ -39,7 +39,7 @@ void __init sme_unmap_bootdata(char *real_mode_data);
 
 void __init sme_early_init(void);
 
-void __init sme_encrypt_kernel(void);
+void __init sme_encrypt_kernel(struct boot_params *bp);
 void __init sme_enable(struct boot_params *bp);
 
 int __init early_set_memory_decrypted(unsigned long vaddr, unsigned long size);
@@ -67,7 +67,7 @@ static inline void __init sme_unmap_bootdata(char *real_mode_data) { }
 
 static inline void __init sme_early_init(void) { }
 
-static inline void __init sme_encrypt_kernel(void) { }
+static inline void __init sme_encrypt_kernel(struct boot_params *bp) { }
 static inline void __init sme_enable(struct boot_params *bp) { }
 
 static inline bool sme_active(void) { return false; }
diff --git a/arch/x86/include/asm/mpspec_def.h b/arch/x86/include/asm/mpspec_def.h
index a6bec80..6fb923a 100644
--- a/arch/x86/include/asm/mpspec_def.h
+++ b/arch/x86/include/asm/mpspec_def.h
@@ -128,9 +128,17 @@ enum mp_irq_source_types {
 	mp_ExtINT = 3
 };
 
-#define MP_IRQDIR_DEFAULT	0
-#define MP_IRQDIR_HIGH		1
-#define MP_IRQDIR_LOW		3
+#define MP_IRQPOL_DEFAULT	0x0
+#define MP_IRQPOL_ACTIVE_HIGH	0x1
+#define MP_IRQPOL_RESERVED	0x2
+#define MP_IRQPOL_ACTIVE_LOW	0x3
+#define MP_IRQPOL_MASK		0x3
+
+#define MP_IRQTRIG_DEFAULT	0x0
+#define MP_IRQTRIG_EDGE		0x4
+#define MP_IRQTRIG_RESERVED	0x8
+#define MP_IRQTRIG_LEVEL	0xc
+#define MP_IRQTRIG_MASK		0xc
 
 #define MP_APIC_ALL	0xFF
 
diff --git a/arch/x86/include/asm/mshyperv.h b/arch/x86/include/asm/mshyperv.h
index 5400add..8bf450b 100644
--- a/arch/x86/include/asm/mshyperv.h
+++ b/arch/x86/include/asm/mshyperv.h
@@ -7,6 +7,7 @@
 #include <linux/nmi.h>
 #include <asm/io.h>
 #include <asm/hyperv.h>
+#include <asm/nospec-branch.h>
 
 /*
  * The below CPUID leaves are present if VersionAndFeatures.HypervisorPresent
@@ -186,10 +187,11 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 		return U64_MAX;
 
 	__asm__ __volatile__("mov %4, %%r8\n"
-			     "call *%5"
+			     CALL_NOSPEC
 			     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
 			       "+c" (control), "+d" (input_address)
-			     :  "r" (output_address), "m" (hv_hypercall_pg)
+			     :  "r" (output_address),
+				THUNK_TARGET(hv_hypercall_pg)
 			     : "cc", "memory", "r8", "r9", "r10", "r11");
 #else
 	u32 input_address_hi = upper_32_bits(input_address);
@@ -200,13 +202,13 @@ static inline u64 hv_do_hypercall(u64 control, void *input, void *output)
 	if (!hv_hypercall_pg)
 		return U64_MAX;
 
-	__asm__ __volatile__("call *%7"
+	__asm__ __volatile__(CALL_NOSPEC
 			     : "=A" (hv_status),
 			       "+c" (input_address_lo), ASM_CALL_CONSTRAINT
 			     : "A" (control),
 			       "b" (input_address_hi),
 			       "D"(output_address_hi), "S"(output_address_lo),
-			       "m" (hv_hypercall_pg)
+			       THUNK_TARGET(hv_hypercall_pg)
 			     : "cc", "memory");
 #endif /* !x86_64 */
 	return hv_status;
@@ -227,10 +229,10 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
 
 #ifdef CONFIG_X86_64
 	{
-		__asm__ __volatile__("call *%4"
+		__asm__ __volatile__(CALL_NOSPEC
 				     : "=a" (hv_status), ASM_CALL_CONSTRAINT,
 				       "+c" (control), "+d" (input1)
-				     : "m" (hv_hypercall_pg)
+				     : THUNK_TARGET(hv_hypercall_pg)
 				     : "cc", "r8", "r9", "r10", "r11");
 	}
 #else
@@ -238,13 +240,13 @@ static inline u64 hv_do_fast_hypercall8(u16 code, u64 input1)
 		u32 input1_hi = upper_32_bits(input1);
 		u32 input1_lo = lower_32_bits(input1);
 
-		__asm__ __volatile__ ("call *%5"
+		__asm__ __volatile__ (CALL_NOSPEC
 				      : "=A"(hv_status),
 					"+c"(input1_lo),
 					ASM_CALL_CONSTRAINT
 				      :	"A" (control),
 					"b" (input1_hi),
-					"m" (hv_hypercall_pg)
+					THUNK_TARGET(hv_hypercall_pg)
 				      : "cc", "edi", "esi");
 	}
 #endif
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 34c4922..e520a1e 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -39,6 +39,13 @@
 
 /* Intel MSRs. Some also available on other CPUs */
 
+#define MSR_IA32_SPEC_CTRL		0x00000048 /* Speculation Control */
+#define SPEC_CTRL_IBRS			(1 << 0)   /* Indirect Branch Restricted Speculation */
+#define SPEC_CTRL_STIBP			(1 << 1)   /* Single Thread Indirect Branch Predictors */
+
+#define MSR_IA32_PRED_CMD		0x00000049 /* Prediction Command */
+#define PRED_CMD_IBPB			(1 << 0)   /* Indirect Branch Prediction Barrier */
+
 #define MSR_PPIN_CTL			0x0000004e
 #define MSR_PPIN			0x0000004f
 
@@ -57,6 +64,11 @@
 #define SNB_C3_AUTO_UNDEMOTE		(1UL << 28)
 
 #define MSR_MTRRcap			0x000000fe
+
+#define MSR_IA32_ARCH_CAPABILITIES	0x0000010a
+#define ARCH_CAP_RDCL_NO		(1 << 0)   /* Not susceptible to Meltdown */
+#define ARCH_CAP_IBRS_ALL		(1 << 1)   /* Enhanced IBRS support */
+
 #define MSR_IA32_BBL_CR_CTL		0x00000119
 #define MSR_IA32_BBL_CR_CTL3		0x0000011e
 
@@ -355,6 +367,9 @@
 #define FAM10H_MMIO_CONF_BASE_MASK	0xfffffffULL
 #define FAM10H_MMIO_CONF_BASE_SHIFT	20
 #define MSR_FAM10H_NODE_ID		0xc001100c
+#define MSR_F10H_DECFG			0xc0011029
+#define MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT	1
+#define MSR_F10H_DECFG_LFENCE_SERIALIZE		BIT_ULL(MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT)
 
 /* K8 MSRs */
 #define MSR_K8_TOP_MEM1			0xc001001a
diff --git a/arch/x86/include/asm/nospec-branch.h b/arch/x86/include/asm/nospec-branch.h
new file mode 100644
index 0000000..d15d471
--- /dev/null
+++ b/arch/x86/include/asm/nospec-branch.h
@@ -0,0 +1,174 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef _ASM_X86_NOSPEC_BRANCH_H_
+#define _ASM_X86_NOSPEC_BRANCH_H_
+
+#include <asm/alternative.h>
+#include <asm/alternative-asm.h>
+#include <asm/cpufeatures.h>
+
+#ifdef __ASSEMBLY__
+
+/*
+ * This should be used immediately before a retpoline alternative.  It tells
+ * objtool where the retpolines are so that it can make sense of the control
+ * flow by just reading the original instruction(s) and ignoring the
+ * alternatives.
+ */
+.macro ANNOTATE_NOSPEC_ALTERNATIVE
+	.Lannotate_\@:
+	.pushsection .discard.nospec
+	.long .Lannotate_\@ - .
+	.popsection
+.endm
+
+/*
+ * These are the bare retpoline primitives for indirect jmp and call.
+ * Do not use these directly; they only exist to make the ALTERNATIVE
+ * invocation below less ugly.
+ */
+.macro RETPOLINE_JMP reg:req
+	call	.Ldo_rop_\@
+.Lspec_trap_\@:
+	pause
+	lfence
+	jmp	.Lspec_trap_\@
+.Ldo_rop_\@:
+	mov	\reg, (%_ASM_SP)
+	ret
+.endm
+
+/*
+ * This is a wrapper around RETPOLINE_JMP so the called function in reg
+ * returns to the instruction after the macro.
+ */
+.macro RETPOLINE_CALL reg:req
+	jmp	.Ldo_call_\@
+.Ldo_retpoline_jmp_\@:
+	RETPOLINE_JMP \reg
+.Ldo_call_\@:
+	call	.Ldo_retpoline_jmp_\@
+.endm
+
+/*
+ * JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
+ * indirect jmp/call which may be susceptible to the Spectre variant 2
+ * attack.
+ */
+.macro JMP_NOSPEC reg:req
+#ifdef CONFIG_RETPOLINE
+	ANNOTATE_NOSPEC_ALTERNATIVE
+	ALTERNATIVE_2 __stringify(jmp *\reg),				\
+		__stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE,	\
+		__stringify(lfence; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
+#else
+	jmp	*\reg
+#endif
+.endm
+
+.macro CALL_NOSPEC reg:req
+#ifdef CONFIG_RETPOLINE
+	ANNOTATE_NOSPEC_ALTERNATIVE
+	ALTERNATIVE_2 __stringify(call *\reg),				\
+		__stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
+		__stringify(lfence; call *\reg), X86_FEATURE_RETPOLINE_AMD
+#else
+	call	*\reg
+#endif
+.endm
+
+/* This clobbers the BX register */
+.macro FILL_RETURN_BUFFER nr:req ftr:req
+#ifdef CONFIG_RETPOLINE
+	ALTERNATIVE "", "call __clear_rsb", \ftr
+#endif
+.endm
+
+#else /* __ASSEMBLY__ */
+
+#define ANNOTATE_NOSPEC_ALTERNATIVE				\
+	"999:\n\t"						\
+	".pushsection .discard.nospec\n\t"			\
+	".long 999b - .\n\t"					\
+	".popsection\n\t"
+
+#if defined(CONFIG_X86_64) && defined(RETPOLINE)
+
+/*
+ * Since the inline asm uses the %V modifier which is only in newer GCC,
+ * the 64-bit one is dependent on RETPOLINE not CONFIG_RETPOLINE.
+ */
+# define CALL_NOSPEC						\
+	ANNOTATE_NOSPEC_ALTERNATIVE				\
+	ALTERNATIVE(						\
+	"call *%[thunk_target]\n",				\
+	"call __x86_indirect_thunk_%V[thunk_target]\n",		\
+	X86_FEATURE_RETPOLINE)
+# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
+
+#elif defined(CONFIG_X86_32) && defined(CONFIG_RETPOLINE)
+/*
+ * For i386 we use the original ret-equivalent retpoline, because
+ * otherwise we'll run out of registers. We don't care about CET
+ * here, anyway.
+ */
+# define CALL_NOSPEC ALTERNATIVE("call *%[thunk_target]\n",	\
+	"       jmp    904f;\n"					\
+	"       .align 16\n"					\
+	"901:	call   903f;\n"					\
+	"902:	pause;\n"					\
+	"    	lfence;\n"					\
+	"       jmp    902b;\n"					\
+	"       .align 16\n"					\
+	"903:	addl   $4, %%esp;\n"				\
+	"       pushl  %[thunk_target];\n"			\
+	"       ret;\n"						\
+	"       .align 16\n"					\
+	"904:	call   901b;\n",				\
+	X86_FEATURE_RETPOLINE)
+
+# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
+#else /* No retpoline for C / inline asm */
+# define CALL_NOSPEC "call *%[thunk_target]\n"
+# define THUNK_TARGET(addr) [thunk_target] "rm" (addr)
+#endif
+
+/* The Spectre V2 mitigation variants */
+enum spectre_v2_mitigation {
+	SPECTRE_V2_NONE,
+	SPECTRE_V2_RETPOLINE_MINIMAL,
+	SPECTRE_V2_RETPOLINE_MINIMAL_AMD,
+	SPECTRE_V2_RETPOLINE_GENERIC,
+	SPECTRE_V2_RETPOLINE_AMD,
+	SPECTRE_V2_IBRS,
+};
+
+extern char __indirect_thunk_start[];
+extern char __indirect_thunk_end[];
+
+/*
+ * On VMEXIT we must ensure that no RSB predictions learned in the guest
+ * can be followed in the host, by overwriting the RSB completely. Both
+ * retpoline and IBRS mitigations for Spectre v2 need this; only on future
+ * CPUs with IBRS_ATT *might* it be avoided.
+ */
+static inline void vmexit_fill_RSB(void)
+{
+#ifdef CONFIG_RETPOLINE
+	alternative_input("",
+			  "call __fill_rsb",
+			  X86_FEATURE_RETPOLINE,
+			  ASM_NO_INPUT_CLOBBER(_ASM_BX, "memory"));
+#endif
+}
+
+static inline void indirect_branch_prediction_barrier(void)
+{
+	alternative_input("",
+			  "call __ibp_barrier",
+			  X86_FEATURE_USE_IBPB,
+			  ASM_NO_INPUT_CLOBBER("eax", "ecx", "edx", "memory"));
+}
+
+#endif /* __ASSEMBLY__ */
+#endif /* _ASM_X86_NOSPEC_BRANCH_H_ */
diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index 7a5d669..eb66fa9 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -38,6 +38,7 @@ do {						\
 #define PCI_NOASSIGN_ROMS	0x80000
 #define PCI_ROOT_NO_CRS		0x100000
 #define PCI_NOASSIGN_BARS	0x200000
+#define PCI_BIG_ROOT_WINDOW	0x400000
 
 extern unsigned int pci_probe;
 extern unsigned long pirq_table_addr;
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index b97a539..6b8f73d 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -75,7 +75,13 @@ typedef struct { pteval_t pte; } pte_t;
 #define PGDIR_SIZE	(_AC(1, UL) << PGDIR_SHIFT)
 #define PGDIR_MASK	(~(PGDIR_SIZE - 1))
 
-/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. */
+/*
+ * See Documentation/x86/x86_64/mm.txt for a description of the memory map.
+ *
+ * Be very careful vs. KASLR when changing anything here. The KASLR address
+ * range must not overlap with anything except the KASAN shadow area, which
+ * is correct as KASAN disables KASLR.
+ */
 #define MAXMEM			_AC(__AC(1, UL) << MAX_PHYSMEM_BITS, UL)
 
 #ifdef CONFIG_X86_5LEVEL
@@ -88,7 +94,7 @@ typedef struct { pteval_t pte; } pte_t;
 # define VMALLOC_SIZE_TB	_AC(32, UL)
 # define __VMALLOC_BASE		_AC(0xffffc90000000000, UL)
 # define __VMEMMAP_BASE		_AC(0xffffea0000000000, UL)
-# define LDT_PGD_ENTRY		_AC(-4, UL)
+# define LDT_PGD_ENTRY		_AC(-3, UL)
 # define LDT_BASE_ADDR		(LDT_PGD_ENTRY << PGDIR_SHIFT)
 #endif
 
@@ -104,13 +110,13 @@ typedef struct { pteval_t pte; } pte_t;
 
 #define MODULES_VADDR		(__START_KERNEL_map + KERNEL_IMAGE_SIZE)
 /* The module sections ends with the start of the fixmap */
-#define MODULES_END		__fix_to_virt(__end_of_fixed_addresses + 1)
+#define MODULES_END		_AC(0xffffffffff000000, UL)
 #define MODULES_LEN		(MODULES_END - MODULES_VADDR)
 
 #define ESPFIX_PGD_ENTRY	_AC(-2, UL)
 #define ESPFIX_BASE_ADDR	(ESPFIX_PGD_ENTRY << P4D_SHIFT)
 
-#define CPU_ENTRY_AREA_PGD	_AC(-3, UL)
+#define CPU_ENTRY_AREA_PGD	_AC(-4, UL)
 #define CPU_ENTRY_AREA_BASE	(CPU_ENTRY_AREA_PGD << P4D_SHIFT)
 
 #define EFI_VA_START		( -4 * (_AC(1, UL) << 30))
diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
index 6a60fea..625a52a 100644
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -40,7 +40,7 @@
 #define CR3_NOFLUSH	BIT_ULL(63)
 
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
-# define X86_CR3_PTI_SWITCH_BIT	11
+# define X86_CR3_PTI_PCID_USER_BIT	11
 #endif
 
 #else
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index d3a67fb..efbde08 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -971,4 +971,7 @@ bool xen_set_default_idle(void);
 
 void stop_this_cpu(void *dummy);
 void df_debug(struct pt_regs *regs, long error_code);
+
+void __ibp_barrier(void);
+
 #endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/include/asm/required-features.h b/arch/x86/include/asm/required-features.h
index d91ba04..fb3a6de 100644
--- a/arch/x86/include/asm/required-features.h
+++ b/arch/x86/include/asm/required-features.h
@@ -106,6 +106,7 @@
 #define REQUIRED_MASK15	0
 #define REQUIRED_MASK16	(NEED_LA57)
 #define REQUIRED_MASK17	0
-#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 18)
+#define REQUIRED_MASK18	0
+#define REQUIRED_MASK_CHECK BUILD_BUG_ON_ZERO(NCAPINTS != 19)
 
 #endif /* _ASM_X86_REQUIRED_FEATURES_H */
diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
index 0022333..d25a638 100644
--- a/arch/x86/include/asm/thread_info.h
+++ b/arch/x86/include/asm/thread_info.h
@@ -62,8 +62,6 @@ struct thread_info {
 	.flags		= 0,			\
 }
 
-#define init_stack		(init_thread_union.stack)
-
 #else /* !__ASSEMBLY__ */
 
 #include <asm/asm-offsets.h>
diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 4a08dd2..d33e4a2 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -81,13 +81,13 @@ static inline u16 kern_pcid(u16 asid)
 	 * Make sure that the dynamic ASID space does not confict with the
 	 * bit we are using to switch between user and kernel ASIDs.
 	 */
-	BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_SWITCH_BIT));
+	BUILD_BUG_ON(TLB_NR_DYN_ASIDS >= (1 << X86_CR3_PTI_PCID_USER_BIT));
 
 	/*
 	 * The ASID being passed in here should have respected the
 	 * MAX_ASID_AVAILABLE and thus never have the switch bit set.
 	 */
-	VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_SWITCH_BIT));
+	VM_WARN_ON_ONCE(asid & (1 << X86_CR3_PTI_PCID_USER_BIT));
 #endif
 	/*
 	 * The dynamically-assigned ASIDs that get passed in are small
@@ -112,7 +112,7 @@ static inline u16 user_pcid(u16 asid)
 {
 	u16 ret = kern_pcid(asid);
 #ifdef CONFIG_PAGE_TABLE_ISOLATION
-	ret |= 1 << X86_CR3_PTI_SWITCH_BIT;
+	ret |= 1 << X86_CR3_PTI_PCID_USER_BIT;
 #endif
 	return ret;
 }
diff --git a/arch/x86/include/asm/traps.h b/arch/x86/include/asm/traps.h
index 31051f3..3de6933 100644
--- a/arch/x86/include/asm/traps.h
+++ b/arch/x86/include/asm/traps.h
@@ -88,6 +88,7 @@ dotraplinkage void do_simd_coprocessor_error(struct pt_regs *, long);
 #ifdef CONFIG_X86_32
 dotraplinkage void do_iret_error(struct pt_regs *, long);
 #endif
+dotraplinkage void do_mce(struct pt_regs *, long);
 
 static inline int get_si_code(unsigned long condition)
 {
diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index c1688c2..1f86e1b 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -56,18 +56,27 @@ void unwind_start(struct unwind_state *state, struct task_struct *task,
 
 #if defined(CONFIG_UNWINDER_ORC) || defined(CONFIG_UNWINDER_FRAME_POINTER)
 /*
- * WARNING: The entire pt_regs may not be safe to dereference.  In some cases,
- * only the iret frame registers are accessible.  Use with caution!
+ * If 'partial' returns true, only the iret frame registers are valid.
  */
-static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
+static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state,
+						    bool *partial)
 {
 	if (unwind_done(state))
 		return NULL;
 
+	if (partial) {
+#ifdef CONFIG_UNWINDER_ORC
+		*partial = !state->full_regs;
+#else
+		*partial = false;
+#endif
+	}
+
 	return state->regs;
 }
 #else
-static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state)
+static inline struct pt_regs *unwind_get_entry_regs(struct unwind_state *state,
+						    bool *partial)
 {
 	return NULL;
 }
diff --git a/arch/x86/include/asm/uprobes.h b/arch/x86/include/asm/uprobes.h
index 74f4c2f..d8bfa98 100644
--- a/arch/x86/include/asm/uprobes.h
+++ b/arch/x86/include/asm/uprobes.h
@@ -53,6 +53,10 @@ struct arch_uprobe {
 			u8	fixups;
 			u8	ilen;
 		} 			defparam;
+		struct {
+			u8	reg_offset;	/* to the start of pt_regs */
+			u8	ilen;
+		}			push;
 	};
 };
 
diff --git a/arch/x86/include/asm/uv/uv_bau.h b/arch/x86/include/asm/uv/uv_bau.h
index 7cac798..7803114 100644
--- a/arch/x86/include/asm/uv/uv_bau.h
+++ b/arch/x86/include/asm/uv/uv_bau.h
@@ -48,7 +48,6 @@
 #define UV2_NET_ENDPOINT_INTD		0x28
 #define UV_NET_ENDPOINT_INTD		(is_uv1_hub() ?			\
 			UV1_NET_ENDPOINT_INTD : UV2_NET_ENDPOINT_INTD)
-#define UV_DESC_PSHIFT			49
 #define UV_PAYLOADQ_GNODE_SHIFT		49
 #define UV_PTC_BASENAME			"sgi_uv/ptc_statistics"
 #define UV_BAU_BASENAME			"sgi_uv/bau_tunables"
diff --git a/arch/x86/include/asm/uv/uv_hub.h b/arch/x86/include/asm/uv/uv_hub.h
index 036e26d..44cf6d6 100644
--- a/arch/x86/include/asm/uv/uv_hub.h
+++ b/arch/x86/include/asm/uv/uv_hub.h
@@ -241,6 +241,7 @@ static inline int uv_hub_info_check(int version)
 #define UV2_HUB_REVISION_BASE		3
 #define UV3_HUB_REVISION_BASE		5
 #define UV4_HUB_REVISION_BASE		7
+#define UV4A_HUB_REVISION_BASE		8	/* UV4 (fixed) rev 2 */
 
 #ifdef	UV1_HUB_IS_SUPPORTED
 static inline int is_uv1_hub(void)
@@ -280,6 +281,19 @@ static inline int is_uv3_hub(void)
 }
 #endif
 
+/* First test "is UV4A", then "is UV4" */
+#ifdef	UV4A_HUB_IS_SUPPORTED
+static inline int is_uv4a_hub(void)
+{
+	return (uv_hub_info->hub_revision >= UV4A_HUB_REVISION_BASE);
+}
+#else
+static inline int is_uv4a_hub(void)
+{
+	return 0;
+}
+#endif
+
 #ifdef	UV4_HUB_IS_SUPPORTED
 static inline int is_uv4_hub(void)
 {
diff --git a/arch/x86/include/asm/uv/uv_mmrs.h b/arch/x86/include/asm/uv/uv_mmrs.h
index 548d684..ecb9dde 100644
--- a/arch/x86/include/asm/uv/uv_mmrs.h
+++ b/arch/x86/include/asm/uv/uv_mmrs.h
@@ -39,9 +39,11 @@
  *	#define UV2Hxxx	b
  *	#define UV3Hxxx	c
  *	#define UV4Hxxx	d
+ *	#define UV4AHxxx e
  *	#define UVHxxx	(is_uv1_hub() ? UV1Hxxx :
  *			(is_uv2_hub() ? UV2Hxxx :
  *			(is_uv3_hub() ? UV3Hxxx :
+ *			(is_uv4a_hub() ? UV4AHxxx :
  *					UV4Hxxx))
  *
  * If the MMR exists on all hub types > 1 but have different addresses, the
@@ -49,8 +51,10 @@
  *	#define UV2Hxxx	b
  *	#define UV3Hxxx	c
  *	#define UV4Hxxx	d
+ *	#define UV4AHxxx e
  *	#define UVHxxx	(is_uv2_hub() ? UV2Hxxx :
  *			(is_uv3_hub() ? UV3Hxxx :
+ *			(is_uv4a_hub() ? UV4AHxxx :
  *					UV4Hxxx))
  *
  *	union uvh_xxx {
@@ -63,6 +67,7 @@
  *		} s2;
  *		struct uv3h_xxx_s {	 # Full UV3 definition (*)
  *		} s3;
+ *		(NOTE: No struct uv4ah_xxx_s members exist)
  *		struct uv4h_xxx_s {	 # Full UV4 definition (*)
  *		} s4;
  *	};
@@ -99,6 +104,7 @@
 #define UV2_HUB_IS_SUPPORTED	1
 #define UV3_HUB_IS_SUPPORTED	1
 #define UV4_HUB_IS_SUPPORTED	1
+#define UV4A_HUB_IS_SUPPORTED	1
 
 /* Error function to catch undefined references */
 extern unsigned long uv_undefined(char *str);
@@ -2779,35 +2785,47 @@ union uvh_lb_bau_sb_activation_status_1_u {
 	/*is_uv4_hub*/ UV4H_LB_BAU_SB_DESCRIPTOR_BASE_32)
 
 #define UVH_LB_BAU_SB_DESCRIPTOR_BASE_PAGE_ADDRESS_SHFT	12
-#define UVH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT	49
-#define UVH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK	0x7ffe000000000000UL
 
+#define UV1H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT	49
 #define UV1H_LB_BAU_SB_DESCRIPTOR_BASE_PAGE_ADDRESS_MASK 0x000007fffffff000UL
+#define UV1H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK	0x7ffe000000000000UL
 
-
+#define UV2H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT	49
 #define UV2H_LB_BAU_SB_DESCRIPTOR_BASE_PAGE_ADDRESS_MASK 0x000007fffffff000UL
+#define UV2H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK	0x7ffe000000000000UL
 
+#define UV3H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT	49
 #define UV3H_LB_BAU_SB_DESCRIPTOR_BASE_PAGE_ADDRESS_MASK 0x000007fffffff000UL
+#define UV3H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK	0x7ffe000000000000UL
 
+#define UV4H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT	49
 #define UV4H_LB_BAU_SB_DESCRIPTOR_BASE_PAGE_ADDRESS_MASK 0x00003ffffffff000UL
+#define UV4H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK	0x7ffe000000000000UL
 
+#define UV4AH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT	53
+#define UV4AH_LB_BAU_SB_DESCRIPTOR_BASE_PAGE_ADDRESS_MASK 0x000ffffffffff000UL
+#define UV4AH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK	0xffe0000000000000UL
 
-union uvh_lb_bau_sb_descriptor_base_u {
-	unsigned long	v;
-	struct uvh_lb_bau_sb_descriptor_base_s {
-		unsigned long	rsvd_0_11:12;
-		unsigned long	rsvd_12_48:37;
-		unsigned long	node_id:14;			/* RW */
-		unsigned long	rsvd_63:1;
-	} s;
-	struct uv4h_lb_bau_sb_descriptor_base_s {
-		unsigned long	rsvd_0_11:12;
-		unsigned long	page_address:34;		/* RW */
-		unsigned long	rsvd_46_48:3;
-		unsigned long	node_id:14;			/* RW */
-		unsigned long	rsvd_63:1;
-	} s4;
-};
+#define UVH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT (			\
+	is_uv1_hub() ? UV1H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT :	\
+	is_uv2_hub() ? UV2H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT :	\
+	is_uv3_hub() ? UV3H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT :	\
+	is_uv4a_hub() ? UV4AH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT :	\
+	/*is_uv4_hub*/ UV4H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT)
+
+#define UVH_LB_BAU_SB_DESCRIPTOR_PAGE_ADDRESS_MASK (			\
+	is_uv1_hub() ? UV1H_LB_BAU_SB_DESCRIPTOR_PAGE_ADDRESS_MASK :	\
+	is_uv2_hub() ? UV2H_LB_BAU_SB_DESCRIPTOR_PAGE_ADDRESS_MASK :	\
+	is_uv3_hub() ? UV3H_LB_BAU_SB_DESCRIPTOR_PAGE_ADDRESS_MASK :	\
+	is_uv4a_hub() ? UV4AH_LB_BAU_SB_DESCRIPTOR_PAGE_ADDRESS_MASK :	\
+	/*is_uv4_hub*/ UV4H_LB_BAU_SB_DESCRIPTOR_PAGE_ADDRESS_MASK)
+
+#define UVH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK (			\
+	is_uv1_hub() ? UV1H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK :	\
+	is_uv2_hub() ? UV2H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK :	\
+	is_uv3_hub() ? UV3H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK :	\
+	is_uv4a_hub() ? UV4AH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK :	\
+	/*is_uv4_hub*/ UV4H_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_MASK)
 
 /* ========================================================================= */
 /*                               UVH_NODE_ID                                 */
@@ -3031,6 +3049,41 @@ union uvh_node_present_table_u {
 #define UVH_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_MASK 0x001f000000000000UL
 #define UVH_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_MASK 0x8000000000000000UL
 
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_SHFT 24
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_SHFT 48
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_SHFT 63
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_SHFT 24
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_SHFT 48
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_SHFT 63
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_MASK 0x00000000ff000000UL
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_SHFT 24
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_SHFT 48
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_SHFT 63
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_SHFT 24
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_SHFT 48
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_SHFT 63
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_SHFT 24
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_SHFT 48
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_SHFT 63
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_0_MMR_ENABLE_MASK 0x8000000000000000UL
+
 
 union uvh_rh_gam_alias210_overlay_config_0_mmr_u {
 	unsigned long	v;
@@ -3042,6 +3095,46 @@ union uvh_rh_gam_alias210_overlay_config_0_mmr_u {
 		unsigned long	rsvd_53_62:10;
 		unsigned long	enable:1;			/* RW */
 	} s;
+	struct uv1h_rh_gam_alias210_overlay_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s1;
+	struct uvxh_rh_gam_alias210_overlay_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} sx;
+	struct uv2h_rh_gam_alias210_overlay_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s2;
+	struct uv3h_rh_gam_alias210_overlay_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s3;
+	struct uv4h_rh_gam_alias210_overlay_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s4;
 };
 
 /* ========================================================================= */
@@ -3064,6 +3157,41 @@ union uvh_rh_gam_alias210_overlay_config_0_mmr_u {
 #define UVH_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_MASK 0x001f000000000000UL
 #define UVH_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_MASK 0x8000000000000000UL
 
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_SHFT 24
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_SHFT 48
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_SHFT 63
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_SHFT 24
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_SHFT 48
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_SHFT 63
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_MASK 0x00000000ff000000UL
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_SHFT 24
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_SHFT 48
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_SHFT 63
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_SHFT 24
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_SHFT 48
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_SHFT 63
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_SHFT 24
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_SHFT 48
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_SHFT 63
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_1_MMR_ENABLE_MASK 0x8000000000000000UL
+
 
 union uvh_rh_gam_alias210_overlay_config_1_mmr_u {
 	unsigned long	v;
@@ -3075,6 +3203,46 @@ union uvh_rh_gam_alias210_overlay_config_1_mmr_u {
 		unsigned long	rsvd_53_62:10;
 		unsigned long	enable:1;			/* RW */
 	} s;
+	struct uv1h_rh_gam_alias210_overlay_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s1;
+	struct uvxh_rh_gam_alias210_overlay_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} sx;
+	struct uv2h_rh_gam_alias210_overlay_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s2;
+	struct uv3h_rh_gam_alias210_overlay_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s3;
+	struct uv4h_rh_gam_alias210_overlay_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s4;
 };
 
 /* ========================================================================= */
@@ -3097,6 +3265,41 @@ union uvh_rh_gam_alias210_overlay_config_1_mmr_u {
 #define UVH_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_MASK 0x001f000000000000UL
 #define UVH_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_MASK 0x8000000000000000UL
 
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_SHFT 24
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_SHFT 48
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_SHFT 63
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV1H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_SHFT 24
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_SHFT 48
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_SHFT 63
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_MASK 0x00000000ff000000UL
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UVXH_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_SHFT 24
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_SHFT 48
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_SHFT 63
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV2H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_SHFT 24
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_SHFT 48
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_SHFT 63
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV3H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_SHFT 24
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_SHFT 48
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_SHFT 63
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_BASE_MASK 0x00000000ff000000UL
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_M_ALIAS_MASK 0x001f000000000000UL
+#define UV4H_RH_GAM_ALIAS210_OVERLAY_CONFIG_2_MMR_ENABLE_MASK 0x8000000000000000UL
+
 
 union uvh_rh_gam_alias210_overlay_config_2_mmr_u {
 	unsigned long	v;
@@ -3108,6 +3311,46 @@ union uvh_rh_gam_alias210_overlay_config_2_mmr_u {
 		unsigned long	rsvd_53_62:10;
 		unsigned long	enable:1;			/* RW */
 	} s;
+	struct uv1h_rh_gam_alias210_overlay_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s1;
+	struct uvxh_rh_gam_alias210_overlay_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} sx;
+	struct uv2h_rh_gam_alias210_overlay_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s2;
+	struct uv3h_rh_gam_alias210_overlay_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s3;
+	struct uv4h_rh_gam_alias210_overlay_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	base:8;				/* RW */
+		unsigned long	rsvd_32_47:16;
+		unsigned long	m_alias:5;			/* RW */
+		unsigned long	rsvd_53_62:10;
+		unsigned long	enable:1;			/* RW */
+	} s4;
 };
 
 /* ========================================================================= */
@@ -3126,6 +3369,21 @@ union uvh_rh_gam_alias210_overlay_config_2_mmr_u {
 #define UVH_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_SHFT 24
 #define UVH_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_MASK 0x00003fffff000000UL
 
+#define UV1H_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_SHFT 24
+#define UV1H_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UVXH_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_SHFT 24
+#define UVXH_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV2H_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_SHFT 24
+#define UV2H_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV3H_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_SHFT 24
+#define UV3H_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV4H_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_SHFT 24
+#define UV4H_RH_GAM_ALIAS210_REDIRECT_CONFIG_0_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
 
 union uvh_rh_gam_alias210_redirect_config_0_mmr_u {
 	unsigned long	v;
@@ -3134,6 +3392,31 @@ union uvh_rh_gam_alias210_redirect_config_0_mmr_u {
 		unsigned long	dest_base:22;			/* RW */
 		unsigned long	rsvd_46_63:18;
 	} s;
+	struct uv1h_rh_gam_alias210_redirect_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s1;
+	struct uvxh_rh_gam_alias210_redirect_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} sx;
+	struct uv2h_rh_gam_alias210_redirect_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s2;
+	struct uv3h_rh_gam_alias210_redirect_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s3;
+	struct uv4h_rh_gam_alias210_redirect_config_0_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s4;
 };
 
 /* ========================================================================= */
@@ -3152,6 +3435,21 @@ union uvh_rh_gam_alias210_redirect_config_0_mmr_u {
 #define UVH_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_SHFT 24
 #define UVH_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_MASK 0x00003fffff000000UL
 
+#define UV1H_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_SHFT 24
+#define UV1H_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UVXH_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_SHFT 24
+#define UVXH_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV2H_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_SHFT 24
+#define UV2H_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV3H_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_SHFT 24
+#define UV3H_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV4H_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_SHFT 24
+#define UV4H_RH_GAM_ALIAS210_REDIRECT_CONFIG_1_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
 
 union uvh_rh_gam_alias210_redirect_config_1_mmr_u {
 	unsigned long	v;
@@ -3160,6 +3458,31 @@ union uvh_rh_gam_alias210_redirect_config_1_mmr_u {
 		unsigned long	dest_base:22;			/* RW */
 		unsigned long	rsvd_46_63:18;
 	} s;
+	struct uv1h_rh_gam_alias210_redirect_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s1;
+	struct uvxh_rh_gam_alias210_redirect_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} sx;
+	struct uv2h_rh_gam_alias210_redirect_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s2;
+	struct uv3h_rh_gam_alias210_redirect_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s3;
+	struct uv4h_rh_gam_alias210_redirect_config_1_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s4;
 };
 
 /* ========================================================================= */
@@ -3178,6 +3501,21 @@ union uvh_rh_gam_alias210_redirect_config_1_mmr_u {
 #define UVH_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_SHFT 24
 #define UVH_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_MASK 0x00003fffff000000UL
 
+#define UV1H_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_SHFT 24
+#define UV1H_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UVXH_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_SHFT 24
+#define UVXH_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV2H_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_SHFT 24
+#define UV2H_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV3H_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_SHFT 24
+#define UV3H_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
+#define UV4H_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_SHFT 24
+#define UV4H_RH_GAM_ALIAS210_REDIRECT_CONFIG_2_MMR_DEST_BASE_MASK 0x00003fffff000000UL
+
 
 union uvh_rh_gam_alias210_redirect_config_2_mmr_u {
 	unsigned long	v;
@@ -3186,6 +3524,31 @@ union uvh_rh_gam_alias210_redirect_config_2_mmr_u {
 		unsigned long	dest_base:22;			/* RW */
 		unsigned long	rsvd_46_63:18;
 	} s;
+	struct uv1h_rh_gam_alias210_redirect_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s1;
+	struct uvxh_rh_gam_alias210_redirect_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} sx;
+	struct uv2h_rh_gam_alias210_redirect_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s2;
+	struct uv3h_rh_gam_alias210_redirect_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s3;
+	struct uv4h_rh_gam_alias210_redirect_config_2_mmr_s {
+		unsigned long	rsvd_0_23:24;
+		unsigned long	dest_base:22;			/* RW */
+		unsigned long	rsvd_46_63:18;
+	} s4;
 };
 
 /* ========================================================================= */
@@ -3384,6 +3747,162 @@ union uvh_rh_gam_gru_overlay_config_mmr_u {
 };
 
 /* ========================================================================= */
+/*                   UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR                    */
+/* ========================================================================= */
+#define UV1H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR uv_undefined("UV1H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR")
+#define UV2H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR uv_undefined("UV2H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR")
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR 0x1603000UL
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR 0x483000UL
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR (				\
+	is_uv1_hub() ? UV1H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR :		\
+	is_uv2_hub() ? UV2H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR :		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR :		\
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR)
+
+
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_SHFT	26
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT	46
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_SHFT 63
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK	0x00003ffffc000000UL
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK	0x000fc00000000000UL
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_SHFT	26
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT	46
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_SHFT 63
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK	0x00003ffffc000000UL
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK	0x000fc00000000000UL
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT 52
+#define UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK 0x000ffffffc000000UL
+#define UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK 0x03f0000000000000UL
+#define UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT)
+
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK)
+
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK)
+
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK)
+
+union uvh_rh_gam_mmioh_overlay_config0_mmr_u {
+	unsigned long	v;
+	struct uv3h_rh_gam_mmioh_overlay_config0_mmr_s {
+		unsigned long	rsvd_0_25:26;
+		unsigned long	base:20;			/* RW */
+		unsigned long	m_io:6;				/* RW */
+		unsigned long	n_io:4;
+		unsigned long	rsvd_56_62:7;
+		unsigned long	enable:1;			/* RW */
+	} s3;
+	struct uv4h_rh_gam_mmioh_overlay_config0_mmr_s {
+		unsigned long	rsvd_0_25:26;
+		unsigned long	base:20;			/* RW */
+		unsigned long	m_io:6;				/* RW */
+		unsigned long	n_io:4;
+		unsigned long	rsvd_56_62:7;
+		unsigned long	enable:1;			/* RW */
+	} s4;
+	struct uv4ah_rh_gam_mmioh_overlay_config0_mmr_s {
+		unsigned long	rsvd_0_25:26;
+		unsigned long	base:26;			/* RW */
+		unsigned long	m_io:6;				/* RW */
+		unsigned long	n_io:4;
+		unsigned long	undef_62:1;			/* Undefined */
+		unsigned long	enable:1;			/* RW */
+	} s4a;
+};
+
+/* ========================================================================= */
+/*                   UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR                    */
+/* ========================================================================= */
+#define UV1H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR uv_undefined("UV1H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR")
+#define UV2H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR uv_undefined("UV2H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR")
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR 0x1603000UL
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR 0x483000UL
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR (				\
+	is_uv1_hub() ? UV1H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR :		\
+	is_uv2_hub() ? UV2H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR :		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR :		\
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR)
+
+
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_SHFT	26
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT	46
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_ENABLE_SHFT 63
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK	0x00003ffffc000000UL
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK	0x000fc00000000000UL
+#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_SHFT	26
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT	46
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_ENABLE_SHFT 63
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK	0x00003ffffc000000UL
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK	0x000fc00000000000UL
+#define UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_ENABLE_MASK 0x8000000000000000UL
+
+#define UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT 52
+#define UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK 0x000ffffffc000000UL
+#define UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK 0x03f0000000000000UL
+
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT)
+
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK)
+
+#define UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK)
+
+union uvh_rh_gam_mmioh_overlay_config1_mmr_u {
+	unsigned long	v;
+	struct uv3h_rh_gam_mmioh_overlay_config1_mmr_s {
+		unsigned long	rsvd_0_25:26;
+		unsigned long	base:20;			/* RW */
+		unsigned long	m_io:6;				/* RW */
+		unsigned long	n_io:4;
+		unsigned long	rsvd_56_62:7;
+		unsigned long	enable:1;			/* RW */
+	} s3;
+	struct uv4h_rh_gam_mmioh_overlay_config1_mmr_s {
+		unsigned long	rsvd_0_25:26;
+		unsigned long	base:20;			/* RW */
+		unsigned long	m_io:6;				/* RW */
+		unsigned long	n_io:4;
+		unsigned long	rsvd_56_62:7;
+		unsigned long	enable:1;			/* RW */
+	} s4;
+	struct uv4ah_rh_gam_mmioh_overlay_config1_mmr_s {
+		unsigned long	rsvd_0_25:26;
+		unsigned long	base:26;			/* RW */
+		unsigned long	m_io:6;				/* RW */
+		unsigned long	n_io:4;
+		unsigned long	undef_62:1;			/* Undefined */
+		unsigned long	enable:1;			/* RW */
+	} s4a;
+};
+
+/* ========================================================================= */
 /*                   UVH_RH_GAM_MMIOH_OVERLAY_CONFIG_MMR                     */
 /* ========================================================================= */
 #define UV1H_RH_GAM_MMIOH_OVERLAY_CONFIG_MMR 0x1600030UL
@@ -3438,6 +3957,112 @@ union uvh_rh_gam_mmioh_overlay_config_mmr_u {
 };
 
 /* ========================================================================= */
+/*                  UVH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR                    */
+/* ========================================================================= */
+#define UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR uv_undefined("UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR")
+#define UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR uv_undefined("UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR")
+#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR 0x1603800UL
+#define UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR 0x483800UL
+#define UVH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR (				\
+	is_uv1_hub() ? UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR :		\
+	is_uv2_hub() ? UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR :		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR :		\
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR)
+
+#define UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH uv_undefined("UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH")
+#define UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH uv_undefined("UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH")
+#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH 128
+#define UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH 128
+#define UVH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH (			\
+	is_uv1_hub() ? UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH :	\
+	is_uv2_hub() ? UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH :	\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH :	\
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH)
+
+
+#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_SHFT 0
+#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK 0x0000000000007fffUL
+
+#define UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_SHFT 0
+#define UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK 0x0000000000007fffUL
+
+#define UV4AH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK 0x0000000000000fffUL
+
+#define UVH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK)
+
+union uvh_rh_gam_mmioh_redirect_config0_mmr_u {
+	unsigned long	v;
+	struct uv3h_rh_gam_mmioh_redirect_config0_mmr_s {
+		unsigned long	nasid:15;			/* RW */
+		unsigned long	rsvd_15_63:49;
+	} s3;
+	struct uv4h_rh_gam_mmioh_redirect_config0_mmr_s {
+		unsigned long	nasid:15;			/* RW */
+		unsigned long	rsvd_15_63:49;
+	} s4;
+	struct uv4ah_rh_gam_mmioh_redirect_config0_mmr_s {
+		unsigned long	nasid:12;			/* RW */
+		unsigned long	rsvd_12_63:52;
+	} s4a;
+};
+
+/* ========================================================================= */
+/*                  UVH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR                    */
+/* ========================================================================= */
+#define UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR uv_undefined("UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR")
+#define UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR uv_undefined("UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR")
+#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR 0x1604800UL
+#define UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR 0x484800UL
+#define UVH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR (				\
+	is_uv1_hub() ? UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR :		\
+	is_uv2_hub() ? UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR :		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR :		\
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR)
+
+#define UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH uv_undefined("UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH")
+#define UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH uv_undefined("UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH")
+#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH 128
+#define UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH 128
+#define UVH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH (			\
+	is_uv1_hub() ? UV1H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH :	\
+	is_uv2_hub() ? UV2H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH :	\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH :	\
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH)
+
+
+#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_SHFT 0
+#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK 0x0000000000007fffUL
+
+#define UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_SHFT 0
+#define UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK 0x0000000000007fffUL
+
+#define UV4AH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK 0x0000000000000fffUL
+
+#define UVH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK (		\
+	is_uv3_hub() ? UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK : \
+	is_uv4a_hub() ? UV4AH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK : \
+	/*is_uv4_hub*/ UV4H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK)
+
+union uvh_rh_gam_mmioh_redirect_config1_mmr_u {
+	unsigned long	v;
+	struct uv3h_rh_gam_mmioh_redirect_config1_mmr_s {
+		unsigned long	nasid:15;			/* RW */
+		unsigned long	rsvd_15_63:49;
+	} s3;
+	struct uv4h_rh_gam_mmioh_redirect_config1_mmr_s {
+		unsigned long	nasid:15;			/* RW */
+		unsigned long	rsvd_15_63:49;
+	} s4;
+	struct uv4ah_rh_gam_mmioh_redirect_config1_mmr_s {
+		unsigned long	nasid:12;			/* RW */
+		unsigned long	rsvd_12_63:52;
+	} s4a;
+};
+
+/* ========================================================================= */
 /*                    UVH_RH_GAM_MMR_OVERLAY_CONFIG_MMR                      */
 /* ========================================================================= */
 #define UV1H_RH_GAM_MMR_OVERLAY_CONFIG_MMR 0x1600028UL
@@ -4138,88 +4763,6 @@ union uv3h_gr0_gam_gr_config_u {
 };
 
 /* ========================================================================= */
-/*                   UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR                   */
-/* ========================================================================= */
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR		0x1603000UL
-
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_SHFT	26
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT	46
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_SHFT 63
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK	0x00003ffffc000000UL
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK	0x000fc00000000000UL
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK 0x8000000000000000UL
-
-union uv3h_rh_gam_mmioh_overlay_config0_mmr_u {
-	unsigned long	v;
-	struct uv3h_rh_gam_mmioh_overlay_config0_mmr_s {
-		unsigned long	rsvd_0_25:26;
-		unsigned long	base:20;			/* RW */
-		unsigned long	m_io:6;				/* RW */
-		unsigned long	n_io:4;
-		unsigned long	rsvd_56_62:7;
-		unsigned long	enable:1;			/* RW */
-	} s3;
-};
-
-/* ========================================================================= */
-/*                   UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR                   */
-/* ========================================================================= */
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR		0x1604000UL
-
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_SHFT	26
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT	46
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_ENABLE_SHFT 63
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK	0x00003ffffc000000UL
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK	0x000fc00000000000UL
-#define UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_ENABLE_MASK 0x8000000000000000UL
-
-union uv3h_rh_gam_mmioh_overlay_config1_mmr_u {
-	unsigned long	v;
-	struct uv3h_rh_gam_mmioh_overlay_config1_mmr_s {
-		unsigned long	rsvd_0_25:26;
-		unsigned long	base:20;			/* RW */
-		unsigned long	m_io:6;				/* RW */
-		unsigned long	n_io:4;
-		unsigned long	rsvd_56_62:7;
-		unsigned long	enable:1;			/* RW */
-	} s3;
-};
-
-/* ========================================================================= */
-/*                  UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR                   */
-/* ========================================================================= */
-#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR		0x1603800UL
-#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH	128
-
-#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_SHFT 0
-#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK 0x0000000000007fffUL
-
-union uv3h_rh_gam_mmioh_redirect_config0_mmr_u {
-	unsigned long	v;
-	struct uv3h_rh_gam_mmioh_redirect_config0_mmr_s {
-		unsigned long	nasid:15;			/* RW */
-		unsigned long	rsvd_15_63:49;
-	} s3;
-};
-
-/* ========================================================================= */
-/*                  UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR                   */
-/* ========================================================================= */
-#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR		0x1604800UL
-#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH	128
-
-#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_SHFT 0
-#define UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK 0x0000000000007fffUL
-
-union uv3h_rh_gam_mmioh_redirect_config1_mmr_u {
-	unsigned long	v;
-	struct uv3h_rh_gam_mmioh_redirect_config1_mmr_s {
-		unsigned long	nasid:15;			/* RW */
-		unsigned long	rsvd_15_63:49;
-	} s3;
-};
-
-/* ========================================================================= */
 /*                       UV4H_LB_PROC_INTD_QUEUE_FIRST                       */
 /* ========================================================================= */
 #define UV4H_LB_PROC_INTD_QUEUE_FIRST			0xa4100UL
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index aa47475..fc2f082 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -212,6 +212,7 @@ enum x86_legacy_i8042_state {
 struct x86_legacy_features {
 	enum x86_legacy_i8042_state i8042;
 	int rtc;
+	int warm_reset;
 	int no_vga;
 	int reserve_bios_regions;
 	struct x86_legacy_devices devices;
diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
index 7cb282e..bfd8826 100644
--- a/arch/x86/include/asm/xen/hypercall.h
+++ b/arch/x86/include/asm/xen/hypercall.h
@@ -44,6 +44,7 @@
 #include <asm/page.h>
 #include <asm/pgtable.h>
 #include <asm/smap.h>
+#include <asm/nospec-branch.h>
 
 #include <xen/interface/xen.h>
 #include <xen/interface/sched.h>
@@ -217,9 +218,9 @@ privcmd_call(unsigned call,
 	__HYPERCALL_5ARG(a1, a2, a3, a4, a5);
 
 	stac();
-	asm volatile("call *%[call]"
+	asm volatile(CALL_NOSPEC
 		     : __HYPERCALL_5PARAM
-		     : [call] "a" (&hypercall_page[call])
+		     : [thunk_target] "a" (&hypercall_page[call])
 		     : __HYPERCALL_CLOBBER5);
 	clac();
 
diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h
index afdd5ae..aebf603 100644
--- a/arch/x86/include/uapi/asm/bootparam.h
+++ b/arch/x86/include/uapi/asm/bootparam.h
@@ -9,6 +9,7 @@
 #define SETUP_PCI			3
 #define SETUP_EFI			4
 #define SETUP_APPLE_PROPERTIES		5
+#define SETUP_JAILHOUSE			6
 
 /* ram_size flags */
 #define RAMDISK_IMAGE_START_MASK	0x07FF
@@ -126,6 +127,27 @@ struct boot_e820_entry {
 	__u32 type;
 } __attribute__((packed));
 
+/*
+ * Smallest compatible version of jailhouse_setup_data required by this kernel.
+ */
+#define JAILHOUSE_SETUP_REQUIRED_VERSION	1
+
+/*
+ * The boot loader is passing platform information via this Jailhouse-specific
+ * setup data structure.
+ */
+struct jailhouse_setup_data {
+	u16	version;
+	u16	compatible_version;
+	u16	pm_timer_address;
+	u16	num_cpus;
+	u64	pci_mmconfig_base;
+	u32	tsc_khz;
+	u32	apic_khz;
+	u8	standard_ioapic;
+	u8	cpu_ids[255];
+} __attribute__((packed));
+
 /* The so-called "zeropage" */
 struct boot_params {
 	struct screen_info screen_info;			/* 0x000 */
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 81bb565..29786c8 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -29,10 +29,13 @@
 KASAN_SANITIZE_paravirt.o				:= n
 
 OBJECT_FILES_NON_STANDARD_relocate_kernel_$(BITS).o	:= y
-OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o		:= y
 OBJECT_FILES_NON_STANDARD_test_nx.o			:= y
 OBJECT_FILES_NON_STANDARD_paravirt_patch_$(BITS).o	:= y
 
+ifdef CONFIG_FRAME_POINTER
+OBJECT_FILES_NON_STANDARD_ftrace_$(BITS).o		:= y
+endif
+
 # If instrumentation of this dir is enabled, boot hangs during first second.
 # Probably could be more selective here, but note that files related to irqs,
 # boot, dumpstack/stacktrace, etc are either non-interesting or can lead to
@@ -112,6 +115,8 @@
 obj-$(CONFIG_PARAVIRT_CLOCK)	+= pvclock.o
 obj-$(CONFIG_X86_PMEM_LEGACY_DEVICE) += pmem.o
 
+obj-$(CONFIG_JAILHOUSE_GUEST)	+= jailhouse.o
+
 obj-$(CONFIG_EISA)		+= eisa.o
 obj-$(CONFIG_PCSPKR_PLATFORM)	+= pcspeaker.o
 
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index f4c463d..ec3a286 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -68,8 +68,9 @@ int acpi_ioapic;
 int acpi_strict;
 int acpi_disable_cmcff;
 
+/* ACPI SCI override configuration */
 u8 acpi_sci_flags __initdata;
-int acpi_sci_override_gsi __initdata;
+u32 acpi_sci_override_gsi __initdata = INVALID_ACPI_IRQ;
 int acpi_skip_timer_override __initdata;
 int acpi_use_timer_override __initdata;
 int acpi_fix_pin2_polarity __initdata;
@@ -112,8 +113,6 @@ static u32 isa_irq_to_gsi[NR_IRQS_LEGACY] __read_mostly = {
 	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
 };
 
-#define	ACPI_INVALID_GSI		INT_MIN
-
 /*
  * This is just a simple wrapper around early_memremap(),
  * with sanity checks for phys == 0 and size == 0.
@@ -372,7 +371,7 @@ static void __init mp_override_legacy_irq(u8 bus_irq, u8 polarity, u8 trigger,
 	 * and acpi_isa_irq_to_gsi() may give wrong result.
 	 */
 	if (gsi < nr_legacy_irqs() && isa_irq_to_gsi[gsi] == gsi)
-		isa_irq_to_gsi[gsi] = ACPI_INVALID_GSI;
+		isa_irq_to_gsi[gsi] = INVALID_ACPI_IRQ;
 	isa_irq_to_gsi[bus_irq] = gsi;
 }
 
@@ -620,24 +619,24 @@ int acpi_gsi_to_irq(u32 gsi, unsigned int *irqp)
 	}
 
 	rc = acpi_get_override_irq(gsi, &trigger, &polarity);
-	if (rc == 0) {
-		trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
-		polarity = polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH;
-		irq = acpi_register_gsi(NULL, gsi, trigger, polarity);
-		if (irq >= 0) {
-			*irqp = irq;
-			return 0;
-		}
-	}
+	if (rc)
+		return rc;
 
-	return -1;
+	trigger = trigger ? ACPI_LEVEL_SENSITIVE : ACPI_EDGE_SENSITIVE;
+	polarity = polarity ? ACPI_ACTIVE_LOW : ACPI_ACTIVE_HIGH;
+	irq = acpi_register_gsi(NULL, gsi, trigger, polarity);
+	if (irq < 0)
+		return irq;
+
+	*irqp = irq;
+	return 0;
 }
 EXPORT_SYMBOL_GPL(acpi_gsi_to_irq);
 
 int acpi_isa_irq_to_gsi(unsigned isa_irq, u32 *gsi)
 {
 	if (isa_irq < nr_legacy_irqs() &&
-	    isa_irq_to_gsi[isa_irq] != ACPI_INVALID_GSI) {
+	    isa_irq_to_gsi[isa_irq] != INVALID_ACPI_IRQ) {
 		*gsi = isa_irq_to_gsi[isa_irq];
 		return 0;
 	}
@@ -676,8 +675,7 @@ static int acpi_register_gsi_ioapic(struct device *dev, u32 gsi,
 	mutex_lock(&acpi_ioapic_lock);
 	irq = mp_map_gsi_to_irq(gsi, IOAPIC_MAP_ALLOC, &info);
 	/* Don't set up the ACPI SCI because it's already set up */
-	if (irq >= 0 && enable_update_mptable &&
-	    acpi_gbl_FADT.sci_interrupt != gsi)
+	if (irq >= 0 && enable_update_mptable && gsi != acpi_gbl_FADT.sci_interrupt)
 		mp_config_acpi_gsi(dev, gsi, trigger, polarity);
 	mutex_unlock(&acpi_ioapic_lock);
 #endif
@@ -1211,8 +1209,9 @@ static int __init acpi_parse_madt_ioapic_entries(void)
 	/*
 	 * If BIOS did not supply an INT_SRC_OVR for the SCI
 	 * pretend we got one so we can set the SCI flags.
+	 * But ignore setting up SCI on hardware reduced platforms.
 	 */
-	if (!acpi_sci_override_gsi)
+	if (acpi_sci_override_gsi == INVALID_ACPI_IRQ && !acpi_gbl_reduced_hardware)
 		acpi_sci_ioapic_setup(acpi_gbl_FADT.sci_interrupt, 0, 0,
 				      acpi_gbl_FADT.sci_interrupt);
 
diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 7188aea..f1915b7 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -138,6 +138,8 @@ static int __init acpi_sleep_setup(char *str)
 			acpi_nvs_nosave_s3();
 		if (strncmp(str, "old_ordering", 12) == 0)
 			acpi_old_suspend_ordering();
+		if (strncmp(str, "nobl", 4) == 0)
+			acpi_sleep_no_blacklist();
 		str = strchr(str, ',');
 		if (str != NULL)
 			str += strspn(str, ", \t");
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index dbaf14d..30571fd 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -298,7 +298,7 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
 	tgt_rip  = next_rip + o_dspl;
 	n_dspl = tgt_rip - orig_insn;
 
-	DPRINTK("target RIP: %p, new_displ: 0x%x", tgt_rip, n_dspl);
+	DPRINTK("target RIP: %px, new_displ: 0x%x", tgt_rip, n_dspl);
 
 	if (tgt_rip - orig_insn >= 0) {
 		if (n_dspl - 2 <= 127)
@@ -344,15 +344,18 @@ recompute_jump(struct alt_instr *a, u8 *orig_insn, u8 *repl_insn, u8 *insnbuf)
 static void __init_or_module noinline optimize_nops(struct alt_instr *a, u8 *instr)
 {
 	unsigned long flags;
+	int i;
 
-	if (instr[0] != 0x90)
-		return;
+	for (i = 0; i < a->padlen; i++) {
+		if (instr[i] != 0x90)
+			return;
+	}
 
 	local_irq_save(flags);
 	add_nops(instr + (a->instrlen - a->padlen), a->padlen);
 	local_irq_restore(flags);
 
-	DUMP_BYTES(instr, a->instrlen, "%p: [%d:%d) optimized NOPs: ",
+	DUMP_BYTES(instr, a->instrlen, "%px: [%d:%d) optimized NOPs: ",
 		   instr, a->instrlen - a->padlen, a->padlen);
 }
 
@@ -373,7 +376,7 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 	u8 *instr, *replacement;
 	u8 insnbuf[MAX_PATCH_LEN];
 
-	DPRINTK("alt table %p -> %p", start, end);
+	DPRINTK("alt table %px, -> %px", start, end);
 	/*
 	 * The scan order should be from start to end. A later scanned
 	 * alternative code can overwrite previously scanned alternative code.
@@ -397,14 +400,14 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 			continue;
 		}
 
-		DPRINTK("feat: %d*32+%d, old: (%p, len: %d), repl: (%p, len: %d), pad: %d",
+		DPRINTK("feat: %d*32+%d, old: (%px len: %d), repl: (%px, len: %d), pad: %d",
 			a->cpuid >> 5,
 			a->cpuid & 0x1f,
 			instr, a->instrlen,
 			replacement, a->replacementlen, a->padlen);
 
-		DUMP_BYTES(instr, a->instrlen, "%p: old_insn: ", instr);
-		DUMP_BYTES(replacement, a->replacementlen, "%p: rpl_insn: ", replacement);
+		DUMP_BYTES(instr, a->instrlen, "%px: old_insn: ", instr);
+		DUMP_BYTES(replacement, a->replacementlen, "%px: rpl_insn: ", replacement);
 
 		memcpy(insnbuf, replacement, a->replacementlen);
 		insnbuf_sz = a->replacementlen;
@@ -430,7 +433,7 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 				 a->instrlen - a->replacementlen);
 			insnbuf_sz += a->instrlen - a->replacementlen;
 		}
-		DUMP_BYTES(insnbuf, insnbuf_sz, "%p: final_insn: ", instr);
+		DUMP_BYTES(insnbuf, insnbuf_sz, "%px: final_insn: ", instr);
 
 		text_poke_early(instr, insnbuf, insnbuf_sz);
 	}
diff --git a/arch/x86/kernel/aperture_64.c b/arch/x86/kernel/aperture_64.c
index f5d92bc..2c4d5ec 100644
--- a/arch/x86/kernel/aperture_64.c
+++ b/arch/x86/kernel/aperture_64.c
@@ -30,6 +30,7 @@
 #include <asm/dma.h>
 #include <asm/amd_nb.h>
 #include <asm/x86_init.h>
+#include <linux/crash_dump.h>
 
 /*
  * Using 512M as goal, in case kexec will load kernel_big
@@ -56,6 +57,33 @@ int fallback_aper_force __initdata;
 
 int fix_aperture __initdata = 1;
 
+#ifdef CONFIG_PROC_VMCORE
+/*
+ * If the first kernel maps the aperture over e820 RAM, the kdump kernel will
+ * use the same range because it will remain configured in the northbridge.
+ * Trying to dump this area via /proc/vmcore may crash the machine, so exclude
+ * it from vmcore.
+ */
+static unsigned long aperture_pfn_start, aperture_page_count;
+
+static int gart_oldmem_pfn_is_ram(unsigned long pfn)
+{
+	return likely((pfn < aperture_pfn_start) ||
+		      (pfn >= aperture_pfn_start + aperture_page_count));
+}
+
+static void exclude_from_vmcore(u64 aper_base, u32 aper_order)
+{
+	aperture_pfn_start = aper_base >> PAGE_SHIFT;
+	aperture_page_count = (32 * 1024 * 1024) << aper_order >> PAGE_SHIFT;
+	WARN_ON(register_oldmem_pfn_is_ram(&gart_oldmem_pfn_is_ram));
+}
+#else
+static void exclude_from_vmcore(u64 aper_base, u32 aper_order)
+{
+}
+#endif
+
 /* This code runs before the PCI subsystem is initialized, so just
    access the northbridge directly. */
 
@@ -435,8 +463,16 @@ int __init gart_iommu_hole_init(void)
 
 out:
 	if (!fix && !fallback_aper_force) {
-		if (last_aper_base)
+		if (last_aper_base) {
+			/*
+			 * If this is the kdump kernel, the first kernel
+			 * may have allocated the range over its e820 RAM
+			 * and fixed up the northbridge
+			 */
+			exclude_from_vmcore(last_aper_base, last_aper_order);
+
 			return 1;
+		}
 		return 0;
 	}
 
@@ -473,6 +509,14 @@ int __init gart_iommu_hole_init(void)
 		return 0;
 	}
 
+	/*
+	 * If this is the kdump kernel _and_ the first kernel did not
+	 * configure the aperture in the northbridge, this range may
+	 * overlap with the first kernel's memory. We can't access the
+	 * range through vmcore even though it should be part of the dump.
+	 */
+	exclude_from_vmcore(aper_alloc, aper_order);
+
 	/* Fix up the north bridges */
 	for (i = 0; i < amd_nb_bus_dev_ranges[i].dev_limit; i++) {
 		int bus, dev_base, dev_limit;
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 880441f..25ddf02 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -1286,6 +1286,55 @@ static int __init apic_intr_mode_select(void)
 	return APIC_SYMMETRIC_IO;
 }
 
+/*
+ * An initial setup of the virtual wire mode.
+ */
+void __init init_bsp_APIC(void)
+{
+	unsigned int value;
+
+	/*
+	 * Don't do the setup now if we have a SMP BIOS as the
+	 * through-I/O-APIC virtual wire mode might be active.
+	 */
+	if (smp_found_config || !boot_cpu_has(X86_FEATURE_APIC))
+		return;
+
+	/*
+	 * Do not trust the local APIC being empty at bootup.
+	 */
+	clear_local_APIC();
+
+	/*
+	 * Enable APIC.
+	 */
+	value = apic_read(APIC_SPIV);
+	value &= ~APIC_VECTOR_MASK;
+	value |= APIC_SPIV_APIC_ENABLED;
+
+#ifdef CONFIG_X86_32
+	/* This bit is reserved on P4/Xeon and should be cleared */
+	if ((boot_cpu_data.x86_vendor == X86_VENDOR_INTEL) &&
+	    (boot_cpu_data.x86 == 15))
+		value &= ~APIC_SPIV_FOCUS_DISABLED;
+	else
+#endif
+		value |= APIC_SPIV_FOCUS_DISABLED;
+	value |= SPURIOUS_APIC_VECTOR;
+	apic_write(APIC_SPIV, value);
+
+	/*
+	 * Set up the virtual wire mode.
+	 */
+	apic_write(APIC_LVT0, APIC_DM_EXTINT);
+	value = APIC_DM_NMI;
+	if (!lapic_is_integrated())		/* 82489DX */
+		value |= APIC_LVT_LEVEL_TRIGGER;
+	if (apic_extnmi == APIC_EXTNMI_NONE)
+		value |= APIC_LVT_MASKED;
+	apic_write(APIC_LVT1, value);
+}
+
 /* Init the interrupt delivery mode for the BSP */
 void __init apic_intr_mode_init(void)
 {
diff --git a/arch/x86/kernel/apic/apic_flat_64.c b/arch/x86/kernel/apic/apic_flat_64.c
index 25a8702..e84c9eb 100644
--- a/arch/x86/kernel/apic/apic_flat_64.c
+++ b/arch/x86/kernel/apic/apic_flat_64.c
@@ -19,6 +19,7 @@
 #include <asm/smp.h>
 #include <asm/apic.h>
 #include <asm/ipi.h>
+#include <asm/jailhouse_para.h>
 
 #include <linux/acpi.h>
 
@@ -84,12 +85,8 @@ flat_send_IPI_mask_allbutself(const struct cpumask *cpumask, int vector)
 static void flat_send_IPI_allbutself(int vector)
 {
 	int cpu = smp_processor_id();
-#ifdef	CONFIG_HOTPLUG_CPU
-	int hotplug = 1;
-#else
-	int hotplug = 0;
-#endif
-	if (hotplug || vector == NMI_VECTOR) {
+
+	if (IS_ENABLED(CONFIG_HOTPLUG_CPU) || vector == NMI_VECTOR) {
 		if (!cpumask_equal(cpu_online_mask, cpumask_of(cpu))) {
 			unsigned long mask = cpumask_bits(cpu_online_mask)[0];
 
@@ -218,6 +215,15 @@ static int physflat_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
 	return 0;
 }
 
+static void physflat_init_apic_ldr(void)
+{
+	/*
+	 * LDR and DFR are not involved in physflat mode, rather:
+	 * "In physical destination mode, the destination processor is
+	 * specified by its local APIC ID [...]." (Intel SDM, 10.6.2.1)
+	 */
+}
+
 static void physflat_send_IPI_allbutself(int vector)
 {
 	default_send_IPI_mask_allbutself_phys(cpu_online_mask, vector);
@@ -230,7 +236,8 @@ static void physflat_send_IPI_all(int vector)
 
 static int physflat_probe(void)
 {
-	if (apic == &apic_physflat || num_possible_cpus() > 8)
+	if (apic == &apic_physflat || num_possible_cpus() > 8 ||
+	    jailhouse_paravirt())
 		return 1;
 
 	return 0;
@@ -251,8 +258,7 @@ static struct apic apic_physflat __ro_after_init = {
 	.dest_logical			= 0,
 	.check_apicid_used		= NULL,
 
-	/* not needed, but shouldn't hurt: */
-	.init_apic_ldr			= flat_init_apic_ldr,
+	.init_apic_ldr			= physflat_init_apic_ldr,
 
 	.ioapic_phys_id_map		= NULL,
 	.setup_apic_routing		= NULL,
diff --git a/arch/x86/kernel/apic/io_apic.c b/arch/x86/kernel/apic/io_apic.c
index 8a79634..8ad2e41 100644
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -800,18 +800,18 @@ static int irq_polarity(int idx)
 	/*
 	 * Determine IRQ line polarity (high active or low active):
 	 */
-	switch (mp_irqs[idx].irqflag & 0x03) {
-	case 0:
+	switch (mp_irqs[idx].irqflag & MP_IRQPOL_MASK) {
+	case MP_IRQPOL_DEFAULT:
 		/* conforms to spec, ie. bus-type dependent polarity */
 		if (test_bit(bus, mp_bus_not_pci))
 			return default_ISA_polarity(idx);
 		else
 			return default_PCI_polarity(idx);
-	case 1:
+	case MP_IRQPOL_ACTIVE_HIGH:
 		return IOAPIC_POL_HIGH;
-	case 2:
+	case MP_IRQPOL_RESERVED:
 		pr_warn("IOAPIC: Invalid polarity: 2, defaulting to low\n");
-	case 3:
+	case MP_IRQPOL_ACTIVE_LOW:
 	default: /* Pointless default required due to do gcc stupidity */
 		return IOAPIC_POL_LOW;
 	}
@@ -845,8 +845,8 @@ static int irq_trigger(int idx)
 	/*
 	 * Determine IRQ trigger mode (edge or level sensitive):
 	 */
-	switch ((mp_irqs[idx].irqflag >> 2) & 0x03) {
-	case 0:
+	switch (mp_irqs[idx].irqflag & MP_IRQTRIG_MASK) {
+	case MP_IRQTRIG_DEFAULT:
 		/* conforms to spec, ie. bus-type dependent trigger mode */
 		if (test_bit(bus, mp_bus_not_pci))
 			trigger = default_ISA_trigger(idx);
@@ -854,11 +854,11 @@ static int irq_trigger(int idx)
 			trigger = default_PCI_trigger(idx);
 		/* Take EISA into account */
 		return eisa_irq_trigger(idx, bus, trigger);
-	case 1:
+	case MP_IRQTRIG_EDGE:
 		return IOAPIC_EDGE;
-	case 2:
+	case MP_IRQTRIG_RESERVED:
 		pr_warn("IOAPIC: Invalid trigger mode 2 defaulting to level\n");
-	case 3:
+	case MP_IRQTRIG_LEVEL:
 	default: /* Pointless default required due to do gcc stupidity */
 		return IOAPIC_LEVEL;
 	}
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index f8b03bb..3cc471b 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -542,14 +542,17 @@ static int x86_vector_alloc_irqs(struct irq_domain *domain, unsigned int virq,
 
 		err = assign_irq_vector_policy(irqd, info);
 		trace_vector_setup(virq + i, false, err);
-		if (err)
+		if (err) {
+			irqd->chip_data = NULL;
+			free_apic_chip_data(apicd);
 			goto error;
+		}
 	}
 
 	return 0;
 
 error:
-	x86_vector_free_irqs(domain, virq, i + 1);
+	x86_vector_free_irqs(domain, virq, i);
 	return err;
 }
 
diff --git a/arch/x86/kernel/apic/x2apic_uv_x.c b/arch/x86/kernel/apic/x2apic_uv_x.c
index e1b8e8b..46b675a 100644
--- a/arch/x86/kernel/apic/x2apic_uv_x.c
+++ b/arch/x86/kernel/apic/x2apic_uv_x.c
@@ -137,6 +137,8 @@ static int __init early_get_pnodeid(void)
 	case UV3_HUB_PART_NUMBER_X:
 		uv_min_hub_revision_id += UV3_HUB_REVISION_BASE;
 		break;
+
+	/* Update: UV4A has only a modified revision to indicate HUB fixes */
 	case UV4_HUB_PART_NUMBER:
 		uv_min_hub_revision_id += UV4_HUB_REVISION_BASE - 1;
 		uv_cpuid.gnode_shift = 2; /* min partition is 4 sockets */
@@ -316,6 +318,7 @@ static int __init uv_acpi_madt_oem_check(char *oem_id, char *oem_table_id)
 	} else if (!strcmp(oem_table_id, "UVH")) {
 		/* Only UV1 systems: */
 		uv_system_type = UV_NON_UNIQUE_APIC;
+		x86_platform.legacy.warm_reset = 0;
 		__this_cpu_write(x2apic_extra_bits, pnodeid << uvh_apicid.s.pnode_shift);
 		uv_set_apicid_hibit();
 		uv_apic = 1;
@@ -767,6 +770,7 @@ static __init void map_gru_high(int max_pnode)
 		return;
 	}
 
+	/* Only UV3 has distributed GRU mode */
 	if (is_uv3_hub() && gru.s3.mode) {
 		map_gru_distributed(gru.v);
 		return;
@@ -790,63 +794,61 @@ static __init void map_mmr_high(int max_pnode)
 		pr_info("UV: MMR disabled\n");
 }
 
-/*
- * This commonality works because both 0 & 1 versions of the MMIOH OVERLAY
- * and REDIRECT MMR regs are exactly the same on UV3.
- */
-struct mmioh_config {
-	unsigned long overlay;
-	unsigned long redirect;
-	char *id;
-};
-
-static __initdata struct mmioh_config mmiohs[] = {
-	{
-		UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR,
-		UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR,
-		"MMIOH0"
-	},
-	{
-		UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR,
-		UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR,
-		"MMIOH1"
-	},
-};
-
-/* UV3 & UV4 have identical MMIOH overlay configs */
-static __init void map_mmioh_high_uv3(int index, int min_pnode, int max_pnode)
+/* UV3/4 have identical MMIOH overlay configs, UV4A is slightly different */
+static __init void map_mmioh_high_uv34(int index, int min_pnode, int max_pnode)
 {
-	union uv3h_rh_gam_mmioh_overlay_config0_mmr_u overlay;
+	unsigned long overlay;
 	unsigned long mmr;
 	unsigned long base;
+	unsigned long nasid_mask;
+	unsigned long m_overlay;
 	int i, n, shift, m_io, max_io;
 	int nasid, lnasid, fi, li;
 	char *id;
 
-	id = mmiohs[index].id;
-	overlay.v = uv_read_local_mmr(mmiohs[index].overlay);
-
-	pr_info("UV: %s overlay 0x%lx base:0x%x m_io:%d\n", id, overlay.v, overlay.s3.base, overlay.s3.m_io);
-	if (!overlay.s3.enable) {
+	if (index == 0) {
+		id = "MMIOH0";
+		m_overlay = UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR;
+		overlay = uv_read_local_mmr(m_overlay);
+		base = overlay & UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_MASK;
+		mmr = UVH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR;
+		m_io = (overlay & UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_MASK)
+			>> UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT;
+		shift = UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_M_IO_SHFT;
+		n = UVH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH;
+		nasid_mask = UVH_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_NASID_MASK;
+	} else {
+		id = "MMIOH1";
+		m_overlay = UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR;
+		overlay = uv_read_local_mmr(m_overlay);
+		base = overlay & UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_BASE_MASK;
+		mmr = UVH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR;
+		m_io = (overlay & UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_MASK)
+			>> UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT;
+		shift = UVH_RH_GAM_MMIOH_OVERLAY_CONFIG1_MMR_M_IO_SHFT;
+		n = UVH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_DEPTH;
+		nasid_mask = UVH_RH_GAM_MMIOH_REDIRECT_CONFIG1_MMR_NASID_MASK;
+	}
+	pr_info("UV: %s overlay 0x%lx base:0x%lx m_io:%d\n", id, overlay, base, m_io);
+	if (!(overlay & UVH_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_ENABLE_MASK)) {
 		pr_info("UV: %s disabled\n", id);
 		return;
 	}
 
-	shift = UV3H_RH_GAM_MMIOH_OVERLAY_CONFIG0_MMR_BASE_SHFT;
-	base = (unsigned long)overlay.s3.base;
-	m_io = overlay.s3.m_io;
-	mmr = mmiohs[index].redirect;
-	n = UV3H_RH_GAM_MMIOH_REDIRECT_CONFIG0_MMR_DEPTH;
 	/* Convert to NASID: */
 	min_pnode *= 2;
 	max_pnode *= 2;
 	max_io = lnasid = fi = li = -1;
 
 	for (i = 0; i < n; i++) {
-		union uv3h_rh_gam_mmioh_redirect_config0_mmr_u redirect;
+		unsigned long m_redirect = mmr + i * 8;
+		unsigned long redirect = uv_read_local_mmr(m_redirect);
 
-		redirect.v = uv_read_local_mmr(mmr + i * 8);
-		nasid = redirect.s3.nasid;
+		nasid = redirect & nasid_mask;
+		if (i == 0)
+			pr_info("UV: %s redirect base 0x%lx(@0x%lx) 0x%04x\n",
+				id, redirect, m_redirect, nasid);
+
 		/* Invalid NASID: */
 		if (nasid < min_pnode || max_pnode < nasid)
 			nasid = -1;
@@ -894,8 +896,8 @@ static __init void map_mmioh_high(int min_pnode, int max_pnode)
 
 	if (is_uv3_hub() || is_uv4_hub()) {
 		/* Map both MMIOH regions: */
-		map_mmioh_high_uv3(0, min_pnode, max_pnode);
-		map_mmioh_high_uv3(1, min_pnode, max_pnode);
+		map_mmioh_high_uv34(0, min_pnode, max_pnode);
+		map_mmioh_high_uv34(1, min_pnode, max_pnode);
 		return;
 	}
 
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index bcb75dc..ea831c8 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -829,8 +829,32 @@ static void init_amd(struct cpuinfo_x86 *c)
 		set_cpu_cap(c, X86_FEATURE_K8);
 
 	if (cpu_has(c, X86_FEATURE_XMM2)) {
-		/* MFENCE stops RDTSC speculation */
-		set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
+		unsigned long long val;
+		int ret;
+
+		/*
+		 * A serializing LFENCE has less overhead than MFENCE, so
+		 * use it for execution serialization.  On families which
+		 * don't have that MSR, LFENCE is already serializing.
+		 * msr_set_bit() uses the safe accessors, too, even if the MSR
+		 * is not present.
+		 */
+		msr_set_bit(MSR_F10H_DECFG,
+			    MSR_F10H_DECFG_LFENCE_SERIALIZE_BIT);
+
+		/*
+		 * Verify that the MSR write was successful (could be running
+		 * under a hypervisor) and only then assume that LFENCE is
+		 * serializing.
+		 */
+		ret = rdmsrl_safe(MSR_F10H_DECFG, &val);
+		if (!ret && (val & MSR_F10H_DECFG_LFENCE_SERIALIZE)) {
+			/* A serializing LFENCE stops RDTSC speculation */
+			set_cpu_cap(c, X86_FEATURE_LFENCE_RDTSC);
+		} else {
+			/* MFENCE stops RDTSC speculation */
+			set_cpu_cap(c, X86_FEATURE_MFENCE_RDTSC);
+		}
 	}
 
 	/*
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index ba0b242..3bfb2b2 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -10,6 +10,11 @@
  */
 #include <linux/init.h>
 #include <linux/utsname.h>
+#include <linux/cpu.h>
+#include <linux/module.h>
+
+#include <asm/nospec-branch.h>
+#include <asm/cmdline.h>
 #include <asm/bugs.h>
 #include <asm/processor.h>
 #include <asm/processor-flags.h>
@@ -19,6 +24,9 @@
 #include <asm/alternative.h>
 #include <asm/pgtable.h>
 #include <asm/set_memory.h>
+#include <asm/intel-family.h>
+
+static void __init spectre_v2_select_mitigation(void);
 
 void __init check_bugs(void)
 {
@@ -29,6 +37,9 @@ void __init check_bugs(void)
 		print_cpu_info(&boot_cpu_data);
 	}
 
+	/* Select the proper spectre mitigation before patching alternatives */
+	spectre_v2_select_mitigation();
+
 #ifdef CONFIG_X86_32
 	/*
 	 * Check whether we are able to run this kernel safely on SMP.
@@ -60,3 +71,249 @@ void __init check_bugs(void)
 		set_memory_4k((unsigned long)__va(0), 1);
 #endif
 }
+
+/* The kernel command line selection */
+enum spectre_v2_mitigation_cmd {
+	SPECTRE_V2_CMD_NONE,
+	SPECTRE_V2_CMD_AUTO,
+	SPECTRE_V2_CMD_FORCE,
+	SPECTRE_V2_CMD_RETPOLINE,
+	SPECTRE_V2_CMD_RETPOLINE_GENERIC,
+	SPECTRE_V2_CMD_RETPOLINE_AMD,
+};
+
+static const char *spectre_v2_strings[] = {
+	[SPECTRE_V2_NONE]			= "Vulnerable",
+	[SPECTRE_V2_RETPOLINE_MINIMAL]		= "Vulnerable: Minimal generic ASM retpoline",
+	[SPECTRE_V2_RETPOLINE_MINIMAL_AMD]	= "Vulnerable: Minimal AMD ASM retpoline",
+	[SPECTRE_V2_RETPOLINE_GENERIC]		= "Mitigation: Full generic retpoline",
+	[SPECTRE_V2_RETPOLINE_AMD]		= "Mitigation: Full AMD retpoline",
+};
+
+#undef pr_fmt
+#define pr_fmt(fmt)     "Spectre V2 : " fmt
+
+static enum spectre_v2_mitigation spectre_v2_enabled = SPECTRE_V2_NONE;
+
+#ifdef RETPOLINE
+static bool spectre_v2_bad_module;
+
+bool retpoline_module_ok(bool has_retpoline)
+{
+	if (spectre_v2_enabled == SPECTRE_V2_NONE || has_retpoline)
+		return true;
+
+	pr_err("System may be vunerable to spectre v2\n");
+	spectre_v2_bad_module = true;
+	return false;
+}
+
+static inline const char *spectre_v2_module_string(void)
+{
+	return spectre_v2_bad_module ? " - vulnerable module loaded" : "";
+}
+#else
+static inline const char *spectre_v2_module_string(void) { return ""; }
+#endif
+
+static void __init spec2_print_if_insecure(const char *reason)
+{
+	if (boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+		pr_info("%s\n", reason);
+}
+
+static void __init spec2_print_if_secure(const char *reason)
+{
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+		pr_info("%s\n", reason);
+}
+
+static inline bool retp_compiler(void)
+{
+	return __is_defined(RETPOLINE);
+}
+
+static inline bool match_option(const char *arg, int arglen, const char *opt)
+{
+	int len = strlen(opt);
+
+	return len == arglen && !strncmp(arg, opt, len);
+}
+
+static enum spectre_v2_mitigation_cmd __init spectre_v2_parse_cmdline(void)
+{
+	char arg[20];
+	int ret;
+
+	ret = cmdline_find_option(boot_command_line, "spectre_v2", arg,
+				  sizeof(arg));
+	if (ret > 0)  {
+		if (match_option(arg, ret, "off")) {
+			goto disable;
+		} else if (match_option(arg, ret, "on")) {
+			spec2_print_if_secure("force enabled on command line.");
+			return SPECTRE_V2_CMD_FORCE;
+		} else if (match_option(arg, ret, "retpoline")) {
+			spec2_print_if_insecure("retpoline selected on command line.");
+			return SPECTRE_V2_CMD_RETPOLINE;
+		} else if (match_option(arg, ret, "retpoline,amd")) {
+			if (boot_cpu_data.x86_vendor != X86_VENDOR_AMD) {
+				pr_err("retpoline,amd selected but CPU is not AMD. Switching to AUTO select\n");
+				return SPECTRE_V2_CMD_AUTO;
+			}
+			spec2_print_if_insecure("AMD retpoline selected on command line.");
+			return SPECTRE_V2_CMD_RETPOLINE_AMD;
+		} else if (match_option(arg, ret, "retpoline,generic")) {
+			spec2_print_if_insecure("generic retpoline selected on command line.");
+			return SPECTRE_V2_CMD_RETPOLINE_GENERIC;
+		} else if (match_option(arg, ret, "auto")) {
+			return SPECTRE_V2_CMD_AUTO;
+		}
+	}
+
+	if (!cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
+		return SPECTRE_V2_CMD_AUTO;
+disable:
+	spec2_print_if_insecure("disabled on command line.");
+	return SPECTRE_V2_CMD_NONE;
+}
+
+/* Check for Skylake-like CPUs (for RSB handling) */
+static bool __init is_skylake_era(void)
+{
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_INTEL &&
+	    boot_cpu_data.x86 == 6) {
+		switch (boot_cpu_data.x86_model) {
+		case INTEL_FAM6_SKYLAKE_MOBILE:
+		case INTEL_FAM6_SKYLAKE_DESKTOP:
+		case INTEL_FAM6_SKYLAKE_X:
+		case INTEL_FAM6_KABYLAKE_MOBILE:
+		case INTEL_FAM6_KABYLAKE_DESKTOP:
+			return true;
+		}
+	}
+	return false;
+}
+
+static void __init spectre_v2_select_mitigation(void)
+{
+	enum spectre_v2_mitigation_cmd cmd = spectre_v2_parse_cmdline();
+	enum spectre_v2_mitigation mode = SPECTRE_V2_NONE;
+
+	/*
+	 * If the CPU is not affected and the command line mode is NONE or AUTO
+	 * then nothing to do.
+	 */
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2) &&
+	    (cmd == SPECTRE_V2_CMD_NONE || cmd == SPECTRE_V2_CMD_AUTO))
+		return;
+
+	switch (cmd) {
+	case SPECTRE_V2_CMD_NONE:
+		return;
+
+	case SPECTRE_V2_CMD_FORCE:
+		/* FALLTRHU */
+	case SPECTRE_V2_CMD_AUTO:
+		goto retpoline_auto;
+
+	case SPECTRE_V2_CMD_RETPOLINE_AMD:
+		if (IS_ENABLED(CONFIG_RETPOLINE))
+			goto retpoline_amd;
+		break;
+	case SPECTRE_V2_CMD_RETPOLINE_GENERIC:
+		if (IS_ENABLED(CONFIG_RETPOLINE))
+			goto retpoline_generic;
+		break;
+	case SPECTRE_V2_CMD_RETPOLINE:
+		if (IS_ENABLED(CONFIG_RETPOLINE))
+			goto retpoline_auto;
+		break;
+	}
+	pr_err("kernel not compiled with retpoline; no mitigation available!");
+	return;
+
+retpoline_auto:
+	if (boot_cpu_data.x86_vendor == X86_VENDOR_AMD) {
+	retpoline_amd:
+		if (!boot_cpu_has(X86_FEATURE_LFENCE_RDTSC)) {
+			pr_err("LFENCE not serializing. Switching to generic retpoline\n");
+			goto retpoline_generic;
+		}
+		mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_AMD :
+					 SPECTRE_V2_RETPOLINE_MINIMAL_AMD;
+		setup_force_cpu_cap(X86_FEATURE_RETPOLINE_AMD);
+		setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
+	} else {
+	retpoline_generic:
+		mode = retp_compiler() ? SPECTRE_V2_RETPOLINE_GENERIC :
+					 SPECTRE_V2_RETPOLINE_MINIMAL;
+		setup_force_cpu_cap(X86_FEATURE_RETPOLINE);
+	}
+
+	spectre_v2_enabled = mode;
+	pr_info("%s\n", spectre_v2_strings[mode]);
+
+	/*
+	 * If neither SMEP or KPTI are available, there is a risk of
+	 * hitting userspace addresses in the RSB after a context switch
+	 * from a shallow call stack to a deeper one. To prevent this fill
+	 * the entire RSB, even when using IBRS.
+	 *
+	 * Skylake era CPUs have a separate issue with *underflow* of the
+	 * RSB, when they will predict 'ret' targets from the generic BTB.
+	 * The proper mitigation for this is IBRS. If IBRS is not supported
+	 * or deactivated in favour of retpolines the RSB fill on context
+	 * switch is required.
+	 */
+	if ((!boot_cpu_has(X86_FEATURE_PTI) &&
+	     !boot_cpu_has(X86_FEATURE_SMEP)) || is_skylake_era()) {
+		setup_force_cpu_cap(X86_FEATURE_RSB_CTXSW);
+		pr_info("Filling RSB on context switch\n");
+	}
+
+	/* Initialize Indirect Branch Prediction Barrier if supported */
+	if (boot_cpu_has(X86_FEATURE_IBPB)) {
+		setup_force_cpu_cap(X86_FEATURE_USE_IBPB);
+		pr_info("Enabling Indirect Branch Prediction Barrier\n");
+	}
+}
+
+#undef pr_fmt
+
+#ifdef CONFIG_SYSFS
+ssize_t cpu_show_meltdown(struct device *dev,
+			  struct device_attribute *attr, char *buf)
+{
+	if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
+		return sprintf(buf, "Not affected\n");
+	if (boot_cpu_has(X86_FEATURE_PTI))
+		return sprintf(buf, "Mitigation: PTI\n");
+	return sprintf(buf, "Vulnerable\n");
+}
+
+ssize_t cpu_show_spectre_v1(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V1))
+		return sprintf(buf, "Not affected\n");
+	return sprintf(buf, "Vulnerable\n");
+}
+
+ssize_t cpu_show_spectre_v2(struct device *dev,
+			    struct device_attribute *attr, char *buf)
+{
+	if (!boot_cpu_has_bug(X86_BUG_SPECTRE_V2))
+		return sprintf(buf, "Not affected\n");
+
+	return sprintf(buf, "%s%s%s\n", spectre_v2_strings[spectre_v2_enabled],
+		       boot_cpu_has(X86_FEATURE_USE_IBPB) ? ", IBPB" : "",
+		       spectre_v2_module_string());
+}
+#endif
+
+void __ibp_barrier(void)
+{
+	__wrmsr(MSR_IA32_PRED_CMD, PRED_CMD_IBPB, 0);
+}
+EXPORT_SYMBOL_GPL(__ibp_barrier);
diff --git a/arch/x86/kernel/cpu/centaur.c b/arch/x86/kernel/cpu/centaur.c
index 68bc6d9..c578cd2 100644
--- a/arch/x86/kernel/cpu/centaur.c
+++ b/arch/x86/kernel/cpu/centaur.c
@@ -106,6 +106,10 @@ static void early_init_centaur(struct cpuinfo_x86 *c)
 #ifdef CONFIG_X86_64
 	set_cpu_cap(c, X86_FEATURE_SYSENTER32);
 #endif
+	if (c->x86_power & (1 << 8)) {
+		set_cpu_cap(c, X86_FEATURE_CONSTANT_TSC);
+		set_cpu_cap(c, X86_FEATURE_NONSTOP_TSC);
+	}
 }
 
 static void init_centaur(struct cpuinfo_x86 *c)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index c47de4e..c7c996a 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -47,6 +47,8 @@
 #include <asm/pat.h>
 #include <asm/microcode.h>
 #include <asm/microcode_intel.h>
+#include <asm/intel-family.h>
+#include <asm/cpu_device_id.h>
 
 #ifdef CONFIG_X86_LOCAL_APIC
 #include <asm/uv/uv.h>
@@ -769,6 +771,7 @@ void get_cpu_cap(struct cpuinfo_x86 *c)
 		cpuid_count(0x00000007, 0, &eax, &ebx, &ecx, &edx);
 		c->x86_capability[CPUID_7_0_EBX] = ebx;
 		c->x86_capability[CPUID_7_ECX] = ecx;
+		c->x86_capability[CPUID_7_EDX] = edx;
 	}
 
 	/* Extended state features: level 0x0000000d */
@@ -876,6 +879,41 @@ static void identify_cpu_without_cpuid(struct cpuinfo_x86 *c)
 #endif
 }
 
+static const __initdata struct x86_cpu_id cpu_no_speculation[] = {
+	{ X86_VENDOR_INTEL,	6, INTEL_FAM6_ATOM_CEDARVIEW,	X86_FEATURE_ANY },
+	{ X86_VENDOR_INTEL,	6, INTEL_FAM6_ATOM_CLOVERVIEW,	X86_FEATURE_ANY },
+	{ X86_VENDOR_INTEL,	6, INTEL_FAM6_ATOM_LINCROFT,	X86_FEATURE_ANY },
+	{ X86_VENDOR_INTEL,	6, INTEL_FAM6_ATOM_PENWELL,	X86_FEATURE_ANY },
+	{ X86_VENDOR_INTEL,	6, INTEL_FAM6_ATOM_PINEVIEW,	X86_FEATURE_ANY },
+	{ X86_VENDOR_CENTAUR,	5 },
+	{ X86_VENDOR_INTEL,	5 },
+	{ X86_VENDOR_NSC,	5 },
+	{ X86_VENDOR_ANY,	4 },
+	{}
+};
+
+static const __initdata struct x86_cpu_id cpu_no_meltdown[] = {
+	{ X86_VENDOR_AMD },
+	{}
+};
+
+static bool __init cpu_vulnerable_to_meltdown(struct cpuinfo_x86 *c)
+{
+	u64 ia32_cap = 0;
+
+	if (x86_match_cpu(cpu_no_meltdown))
+		return false;
+
+	if (cpu_has(c, X86_FEATURE_ARCH_CAPABILITIES))
+		rdmsrl(MSR_IA32_ARCH_CAPABILITIES, ia32_cap);
+
+	/* Rogue Data Cache Load? No! */
+	if (ia32_cap & ARCH_CAP_RDCL_NO)
+		return false;
+
+	return true;
+}
+
 /*
  * Do minimum CPU detection early.
  * Fields really needed: vendor, cpuid_level, family, model, mask,
@@ -923,8 +961,12 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 
 	setup_force_cpu_cap(X86_FEATURE_ALWAYS);
 
-	/* Assume for now that ALL x86 CPUs are insecure */
-	setup_force_cpu_bug(X86_BUG_CPU_INSECURE);
+	if (!x86_match_cpu(cpu_no_speculation)) {
+		if (cpu_vulnerable_to_meltdown(c))
+			setup_force_cpu_bug(X86_BUG_CPU_MELTDOWN);
+		setup_force_cpu_bug(X86_BUG_SPECTRE_V1);
+		setup_force_cpu_bug(X86_BUG_SPECTRE_V2);
+	}
 
 	fpu__init_system(c);
 
diff --git a/arch/x86/kernel/cpu/hypervisor.c b/arch/x86/kernel/cpu/hypervisor.c
index bea8d3e..479ca47 100644
--- a/arch/x86/kernel/cpu/hypervisor.c
+++ b/arch/x86/kernel/cpu/hypervisor.c
@@ -31,6 +31,7 @@ extern const struct hypervisor_x86 x86_hyper_ms_hyperv;
 extern const struct hypervisor_x86 x86_hyper_xen_pv;
 extern const struct hypervisor_x86 x86_hyper_xen_hvm;
 extern const struct hypervisor_x86 x86_hyper_kvm;
+extern const struct hypervisor_x86 x86_hyper_jailhouse;
 
 static const __initconst struct hypervisor_x86 * const hypervisors[] =
 {
@@ -45,6 +46,9 @@ static const __initconst struct hypervisor_x86 * const hypervisors[] =
 #ifdef CONFIG_KVM_GUEST
 	&x86_hyper_kvm,
 #endif
+#ifdef CONFIG_JAILHOUSE_GUEST
+	&x86_hyper_jailhouse,
+#endif
 };
 
 enum x86_hypervisor_type x86_hyper_type;
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index b1af220..6936d14 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -102,6 +102,59 @@ static void probe_xeon_phi_r3mwait(struct cpuinfo_x86 *c)
 		ELF_HWCAP2 |= HWCAP2_RING3MWAIT;
 }
 
+/*
+ * Early microcode releases for the Spectre v2 mitigation were broken.
+ * Information taken from;
+ * - https://newsroom.intel.com/wp-content/uploads/sites/11/2018/01/microcode-update-guidance.pdf
+ * - https://kb.vmware.com/s/article/52345
+ * - Microcode revisions observed in the wild
+ * - Release note from 20180108 microcode release
+ */
+struct sku_microcode {
+	u8 model;
+	u8 stepping;
+	u32 microcode;
+};
+static const struct sku_microcode spectre_bad_microcodes[] = {
+	{ INTEL_FAM6_KABYLAKE_DESKTOP,	0x0B,	0x84 },
+	{ INTEL_FAM6_KABYLAKE_DESKTOP,	0x0A,	0x84 },
+	{ INTEL_FAM6_KABYLAKE_DESKTOP,	0x09,	0x84 },
+	{ INTEL_FAM6_KABYLAKE_MOBILE,	0x0A,	0x84 },
+	{ INTEL_FAM6_KABYLAKE_MOBILE,	0x09,	0x84 },
+	{ INTEL_FAM6_SKYLAKE_X,		0x03,	0x0100013e },
+	{ INTEL_FAM6_SKYLAKE_X,		0x04,	0x0200003c },
+	{ INTEL_FAM6_SKYLAKE_MOBILE,	0x03,	0xc2 },
+	{ INTEL_FAM6_SKYLAKE_DESKTOP,	0x03,	0xc2 },
+	{ INTEL_FAM6_BROADWELL_CORE,	0x04,	0x28 },
+	{ INTEL_FAM6_BROADWELL_GT3E,	0x01,	0x1b },
+	{ INTEL_FAM6_BROADWELL_XEON_D,	0x02,	0x14 },
+	{ INTEL_FAM6_BROADWELL_XEON_D,	0x03,	0x07000011 },
+	{ INTEL_FAM6_BROADWELL_X,	0x01,	0x0b000025 },
+	{ INTEL_FAM6_HASWELL_ULT,	0x01,	0x21 },
+	{ INTEL_FAM6_HASWELL_GT3E,	0x01,	0x18 },
+	{ INTEL_FAM6_HASWELL_CORE,	0x03,	0x23 },
+	{ INTEL_FAM6_HASWELL_X,		0x02,	0x3b },
+	{ INTEL_FAM6_HASWELL_X,		0x04,	0x10 },
+	{ INTEL_FAM6_IVYBRIDGE_X,	0x04,	0x42a },
+	/* Updated in the 20180108 release; blacklist until we know otherwise */
+	{ INTEL_FAM6_ATOM_GEMINI_LAKE,	0x01,	0x22 },
+	/* Observed in the wild */
+	{ INTEL_FAM6_SANDYBRIDGE_X,	0x06,	0x61b },
+	{ INTEL_FAM6_SANDYBRIDGE_X,	0x07,	0x712 },
+};
+
+static bool bad_spectre_microcode(struct cpuinfo_x86 *c)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(spectre_bad_microcodes); i++) {
+		if (c->x86_model == spectre_bad_microcodes[i].model &&
+		    c->x86_mask == spectre_bad_microcodes[i].stepping)
+			return (c->microcode <= spectre_bad_microcodes[i].microcode);
+	}
+	return false;
+}
+
 static void early_init_intel(struct cpuinfo_x86 *c)
 {
 	u64 misc_enable;
@@ -123,6 +176,30 @@ static void early_init_intel(struct cpuinfo_x86 *c)
 		c->microcode = intel_get_microcode_revision();
 
 	/*
+	 * The Intel SPEC_CTRL CPUID bit implies IBRS and IBPB support,
+	 * and they also have a different bit for STIBP support. Also,
+	 * a hypervisor might have set the individual AMD bits even on
+	 * Intel CPUs, for finer-grained selection of what's available.
+	 */
+	if (cpu_has(c, X86_FEATURE_SPEC_CTRL)) {
+		set_cpu_cap(c, X86_FEATURE_IBRS);
+		set_cpu_cap(c, X86_FEATURE_IBPB);
+	}
+	if (cpu_has(c, X86_FEATURE_INTEL_STIBP))
+		set_cpu_cap(c, X86_FEATURE_STIBP);
+
+	/* Now if any of them are set, check the blacklist and clear the lot */
+	if ((cpu_has(c, X86_FEATURE_IBRS) || cpu_has(c, X86_FEATURE_IBPB) ||
+	     cpu_has(c, X86_FEATURE_STIBP)) && bad_spectre_microcode(c)) {
+		pr_warn("Intel Spectre v2 broken microcode detected; disabling Speculation Control\n");
+		clear_cpu_cap(c, X86_FEATURE_IBRS);
+		clear_cpu_cap(c, X86_FEATURE_IBPB);
+		clear_cpu_cap(c, X86_FEATURE_STIBP);
+		clear_cpu_cap(c, X86_FEATURE_SPEC_CTRL);
+		clear_cpu_cap(c, X86_FEATURE_INTEL_STIBP);
+	}
+
+	/*
 	 * Atom erratum AAE44/AAF40/AAG38/AAH41:
 	 *
 	 * A race condition between speculative fetches and invalidating
diff --git a/arch/x86/kernel/cpu/intel_rdt.c b/arch/x86/kernel/cpu/intel_rdt.c
index 88dcf84..410629f 100644
--- a/arch/x86/kernel/cpu/intel_rdt.c
+++ b/arch/x86/kernel/cpu/intel_rdt.c
@@ -135,6 +135,40 @@ struct rdt_resource rdt_resources_all[] = {
 		.format_str		= "%d=%0*x",
 		.fflags			= RFTYPE_RES_CACHE,
 	},
+	[RDT_RESOURCE_L2DATA] =
+	{
+		.rid			= RDT_RESOURCE_L2DATA,
+		.name			= "L2DATA",
+		.domains		= domain_init(RDT_RESOURCE_L2DATA),
+		.msr_base		= IA32_L2_CBM_BASE,
+		.msr_update		= cat_wrmsr,
+		.cache_level		= 2,
+		.cache = {
+			.min_cbm_bits	= 1,
+			.cbm_idx_mult	= 2,
+			.cbm_idx_offset	= 0,
+		},
+		.parse_ctrlval		= parse_cbm,
+		.format_str		= "%d=%0*x",
+		.fflags			= RFTYPE_RES_CACHE,
+	},
+	[RDT_RESOURCE_L2CODE] =
+	{
+		.rid			= RDT_RESOURCE_L2CODE,
+		.name			= "L2CODE",
+		.domains		= domain_init(RDT_RESOURCE_L2CODE),
+		.msr_base		= IA32_L2_CBM_BASE,
+		.msr_update		= cat_wrmsr,
+		.cache_level		= 2,
+		.cache = {
+			.min_cbm_bits	= 1,
+			.cbm_idx_mult	= 2,
+			.cbm_idx_offset	= 1,
+		},
+		.parse_ctrlval		= parse_cbm,
+		.format_str		= "%d=%0*x",
+		.fflags			= RFTYPE_RES_CACHE,
+	},
 	[RDT_RESOURCE_MBA] =
 	{
 		.rid			= RDT_RESOURCE_MBA,
@@ -259,15 +293,15 @@ static void rdt_get_cache_alloc_cfg(int idx, struct rdt_resource *r)
 	r->alloc_enabled = true;
 }
 
-static void rdt_get_cdp_l3_config(int type)
+static void rdt_get_cdp_config(int level, int type)
 {
-	struct rdt_resource *r_l3 = &rdt_resources_all[RDT_RESOURCE_L3];
+	struct rdt_resource *r_l = &rdt_resources_all[level];
 	struct rdt_resource *r = &rdt_resources_all[type];
 
-	r->num_closid = r_l3->num_closid / 2;
-	r->cache.cbm_len = r_l3->cache.cbm_len;
-	r->default_ctrl = r_l3->default_ctrl;
-	r->cache.shareable_bits = r_l3->cache.shareable_bits;
+	r->num_closid = r_l->num_closid / 2;
+	r->cache.cbm_len = r_l->cache.cbm_len;
+	r->default_ctrl = r_l->default_ctrl;
+	r->cache.shareable_bits = r_l->cache.shareable_bits;
 	r->data_width = (r->cache.cbm_len + 3) / 4;
 	r->alloc_capable = true;
 	/*
@@ -277,6 +311,18 @@ static void rdt_get_cdp_l3_config(int type)
 	r->alloc_enabled = false;
 }
 
+static void rdt_get_cdp_l3_config(void)
+{
+	rdt_get_cdp_config(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA);
+	rdt_get_cdp_config(RDT_RESOURCE_L3, RDT_RESOURCE_L3CODE);
+}
+
+static void rdt_get_cdp_l2_config(void)
+{
+	rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA);
+	rdt_get_cdp_config(RDT_RESOURCE_L2, RDT_RESOURCE_L2CODE);
+}
+
 static int get_cache_id(int cpu, int level)
 {
 	struct cpu_cacheinfo *ci = get_cpu_cacheinfo(cpu);
@@ -525,10 +571,6 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 		 */
 		if (static_branch_unlikely(&rdt_mon_enable_key))
 			rmdir_mondata_subdir_allrdtgrp(r, d->id);
-		kfree(d->ctrl_val);
-		kfree(d->rmid_busy_llc);
-		kfree(d->mbm_total);
-		kfree(d->mbm_local);
 		list_del(&d->list);
 		if (is_mbm_enabled())
 			cancel_delayed_work(&d->mbm_over);
@@ -545,6 +587,10 @@ static void domain_remove_cpu(int cpu, struct rdt_resource *r)
 			cancel_delayed_work(&d->cqm_limbo);
 		}
 
+		kfree(d->ctrl_val);
+		kfree(d->rmid_busy_llc);
+		kfree(d->mbm_total);
+		kfree(d->mbm_local);
 		kfree(d);
 		return;
 	}
@@ -645,6 +691,7 @@ enum {
 	RDT_FLAG_L3_CAT,
 	RDT_FLAG_L3_CDP,
 	RDT_FLAG_L2_CAT,
+	RDT_FLAG_L2_CDP,
 	RDT_FLAG_MBA,
 };
 
@@ -667,6 +714,7 @@ static struct rdt_options rdt_options[]  __initdata = {
 	RDT_OPT(RDT_FLAG_L3_CAT,    "l3cat",	X86_FEATURE_CAT_L3),
 	RDT_OPT(RDT_FLAG_L3_CDP,    "l3cdp",	X86_FEATURE_CDP_L3),
 	RDT_OPT(RDT_FLAG_L2_CAT,    "l2cat",	X86_FEATURE_CAT_L2),
+	RDT_OPT(RDT_FLAG_L2_CDP,    "l2cdp",	X86_FEATURE_CDP_L2),
 	RDT_OPT(RDT_FLAG_MBA,	    "mba",	X86_FEATURE_MBA),
 };
 #define NUM_RDT_OPTIONS ARRAY_SIZE(rdt_options)
@@ -729,15 +777,15 @@ static __init bool get_rdt_alloc_resources(void)
 
 	if (rdt_cpu_has(X86_FEATURE_CAT_L3)) {
 		rdt_get_cache_alloc_cfg(1, &rdt_resources_all[RDT_RESOURCE_L3]);
-		if (rdt_cpu_has(X86_FEATURE_CDP_L3)) {
-			rdt_get_cdp_l3_config(RDT_RESOURCE_L3DATA);
-			rdt_get_cdp_l3_config(RDT_RESOURCE_L3CODE);
-		}
+		if (rdt_cpu_has(X86_FEATURE_CDP_L3))
+			rdt_get_cdp_l3_config();
 		ret = true;
 	}
 	if (rdt_cpu_has(X86_FEATURE_CAT_L2)) {
 		/* CPUID 0x10.2 fields are same format at 0x10.1 */
 		rdt_get_cache_alloc_cfg(2, &rdt_resources_all[RDT_RESOURCE_L2]);
+		if (rdt_cpu_has(X86_FEATURE_CDP_L2))
+			rdt_get_cdp_l2_config();
 		ret = true;
 	}
 
diff --git a/arch/x86/kernel/cpu/intel_rdt.h b/arch/x86/kernel/cpu/intel_rdt.h
index 3397244..3fd7a70 100644
--- a/arch/x86/kernel/cpu/intel_rdt.h
+++ b/arch/x86/kernel/cpu/intel_rdt.h
@@ -7,12 +7,15 @@
 #include <linux/jump_label.h>
 
 #define IA32_L3_QOS_CFG		0xc81
+#define IA32_L2_QOS_CFG		0xc82
 #define IA32_L3_CBM_BASE	0xc90
 #define IA32_L2_CBM_BASE	0xd10
 #define IA32_MBA_THRTL_BASE	0xd50
 
 #define L3_QOS_CDP_ENABLE	0x01ULL
 
+#define L2_QOS_CDP_ENABLE	0x01ULL
+
 /*
  * Event IDs are used to program IA32_QM_EVTSEL before reading event
  * counter from IA32_QM_CTR
@@ -357,6 +360,8 @@ enum {
 	RDT_RESOURCE_L3DATA,
 	RDT_RESOURCE_L3CODE,
 	RDT_RESOURCE_L2,
+	RDT_RESOURCE_L2DATA,
+	RDT_RESOURCE_L2CODE,
 	RDT_RESOURCE_MBA,
 
 	/* Must be the last */
diff --git a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
index 64c5ff9..bdab7d2 100644
--- a/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
+++ b/arch/x86/kernel/cpu/intel_rdt_rdtgroup.c
@@ -990,6 +990,7 @@ mongroup_create_dir(struct kernfs_node *parent_kn, struct rdtgroup *prgrp,
 	kernfs_remove(kn);
 	return ret;
 }
+
 static void l3_qos_cfg_update(void *arg)
 {
 	bool *enable = arg;
@@ -997,8 +998,17 @@ static void l3_qos_cfg_update(void *arg)
 	wrmsrl(IA32_L3_QOS_CFG, *enable ? L3_QOS_CDP_ENABLE : 0ULL);
 }
 
-static int set_l3_qos_cfg(struct rdt_resource *r, bool enable)
+static void l2_qos_cfg_update(void *arg)
 {
+	bool *enable = arg;
+
+	wrmsrl(IA32_L2_QOS_CFG, *enable ? L2_QOS_CDP_ENABLE : 0ULL);
+}
+
+static int set_cache_qos_cfg(int level, bool enable)
+{
+	void (*update)(void *arg);
+	struct rdt_resource *r_l;
 	cpumask_var_t cpu_mask;
 	struct rdt_domain *d;
 	int cpu;
@@ -1006,16 +1016,24 @@ static int set_l3_qos_cfg(struct rdt_resource *r, bool enable)
 	if (!zalloc_cpumask_var(&cpu_mask, GFP_KERNEL))
 		return -ENOMEM;
 
-	list_for_each_entry(d, &r->domains, list) {
+	if (level == RDT_RESOURCE_L3)
+		update = l3_qos_cfg_update;
+	else if (level == RDT_RESOURCE_L2)
+		update = l2_qos_cfg_update;
+	else
+		return -EINVAL;
+
+	r_l = &rdt_resources_all[level];
+	list_for_each_entry(d, &r_l->domains, list) {
 		/* Pick one CPU from each domain instance to update MSR */
 		cpumask_set_cpu(cpumask_any(&d->cpu_mask), cpu_mask);
 	}
 	cpu = get_cpu();
 	/* Update QOS_CFG MSR on this cpu if it's in cpu_mask. */
 	if (cpumask_test_cpu(cpu, cpu_mask))
-		l3_qos_cfg_update(&enable);
+		update(&enable);
 	/* Update QOS_CFG MSR on all other cpus in cpu_mask. */
-	smp_call_function_many(cpu_mask, l3_qos_cfg_update, &enable, 1);
+	smp_call_function_many(cpu_mask, update, &enable, 1);
 	put_cpu();
 
 	free_cpumask_var(cpu_mask);
@@ -1023,52 +1041,99 @@ static int set_l3_qos_cfg(struct rdt_resource *r, bool enable)
 	return 0;
 }
 
-static int cdp_enable(void)
+static int cdp_enable(int level, int data_type, int code_type)
 {
-	struct rdt_resource *r_l3data = &rdt_resources_all[RDT_RESOURCE_L3DATA];
-	struct rdt_resource *r_l3code = &rdt_resources_all[RDT_RESOURCE_L3CODE];
-	struct rdt_resource *r_l3 = &rdt_resources_all[RDT_RESOURCE_L3];
+	struct rdt_resource *r_ldata = &rdt_resources_all[data_type];
+	struct rdt_resource *r_lcode = &rdt_resources_all[code_type];
+	struct rdt_resource *r_l = &rdt_resources_all[level];
 	int ret;
 
-	if (!r_l3->alloc_capable || !r_l3data->alloc_capable ||
-	    !r_l3code->alloc_capable)
+	if (!r_l->alloc_capable || !r_ldata->alloc_capable ||
+	    !r_lcode->alloc_capable)
 		return -EINVAL;
 
-	ret = set_l3_qos_cfg(r_l3, true);
+	ret = set_cache_qos_cfg(level, true);
 	if (!ret) {
-		r_l3->alloc_enabled = false;
-		r_l3data->alloc_enabled = true;
-		r_l3code->alloc_enabled = true;
+		r_l->alloc_enabled = false;
+		r_ldata->alloc_enabled = true;
+		r_lcode->alloc_enabled = true;
 	}
 	return ret;
 }
 
-static void cdp_disable(void)
+static int cdpl3_enable(void)
 {
-	struct rdt_resource *r = &rdt_resources_all[RDT_RESOURCE_L3];
+	return cdp_enable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA,
+			  RDT_RESOURCE_L3CODE);
+}
+
+static int cdpl2_enable(void)
+{
+	return cdp_enable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA,
+			  RDT_RESOURCE_L2CODE);
+}
+
+static void cdp_disable(int level, int data_type, int code_type)
+{
+	struct rdt_resource *r = &rdt_resources_all[level];
 
 	r->alloc_enabled = r->alloc_capable;
 
-	if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled) {
-		rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled = false;
-		rdt_resources_all[RDT_RESOURCE_L3CODE].alloc_enabled = false;
-		set_l3_qos_cfg(r, false);
+	if (rdt_resources_all[data_type].alloc_enabled) {
+		rdt_resources_all[data_type].alloc_enabled = false;
+		rdt_resources_all[code_type].alloc_enabled = false;
+		set_cache_qos_cfg(level, false);
 	}
 }
 
+static void cdpl3_disable(void)
+{
+	cdp_disable(RDT_RESOURCE_L3, RDT_RESOURCE_L3DATA, RDT_RESOURCE_L3CODE);
+}
+
+static void cdpl2_disable(void)
+{
+	cdp_disable(RDT_RESOURCE_L2, RDT_RESOURCE_L2DATA, RDT_RESOURCE_L2CODE);
+}
+
+static void cdp_disable_all(void)
+{
+	if (rdt_resources_all[RDT_RESOURCE_L3DATA].alloc_enabled)
+		cdpl3_disable();
+	if (rdt_resources_all[RDT_RESOURCE_L2DATA].alloc_enabled)
+		cdpl2_disable();
+}
+
 static int parse_rdtgroupfs_options(char *data)
 {
 	char *token, *o = data;
 	int ret = 0;
 
 	while ((token = strsep(&o, ",")) != NULL) {
-		if (!*token)
-			return -EINVAL;
+		if (!*token) {
+			ret = -EINVAL;
+			goto out;
+		}
 
-		if (!strcmp(token, "cdp"))
-			ret = cdp_enable();
+		if (!strcmp(token, "cdp")) {
+			ret = cdpl3_enable();
+			if (ret)
+				goto out;
+		} else if (!strcmp(token, "cdpl2")) {
+			ret = cdpl2_enable();
+			if (ret)
+				goto out;
+		} else {
+			ret = -EINVAL;
+			goto out;
+		}
 	}
 
+	return 0;
+
+out:
+	pr_err("Invalid mount option \"%s\"\n", token);
+
 	return ret;
 }
 
@@ -1223,7 +1288,7 @@ static struct dentry *rdt_mount(struct file_system_type *fs_type,
 out_info:
 	kernfs_remove(kn_info);
 out_cdp:
-	cdp_disable();
+	cdp_disable_all();
 out:
 	rdt_last_cmd_clear();
 	mutex_unlock(&rdtgroup_mutex);
@@ -1383,7 +1448,7 @@ static void rdt_kill_sb(struct super_block *sb)
 	/*Put everything back to default values. */
 	for_each_alloc_enabled_rdt_resource(r)
 		reset_all_ctrls(r);
-	cdp_disable();
+	cdp_disable_all();
 	rmdir_all_sub();
 	static_branch_disable_cpuslocked(&rdt_alloc_enable_key);
 	static_branch_disable_cpuslocked(&rdt_mon_enable_key);
diff --git a/arch/x86/kernel/cpu/mcheck/mce-severity.c b/arch/x86/kernel/cpu/mcheck/mce-severity.c
index 4ca632a..5bbd06f 100644
--- a/arch/x86/kernel/cpu/mcheck/mce-severity.c
+++ b/arch/x86/kernel/cpu/mcheck/mce-severity.c
@@ -59,6 +59,7 @@ static struct severity {
 #define  MCGMASK(x, y)	.mcgmask = x, .mcgres = y
 #define  MASK(x, y)	.mask = x, .result = y
 #define MCI_UC_S (MCI_STATUS_UC|MCI_STATUS_S)
+#define MCI_UC_AR (MCI_STATUS_UC|MCI_STATUS_AR)
 #define MCI_UC_SAR (MCI_STATUS_UC|MCI_STATUS_S|MCI_STATUS_AR)
 #define	MCI_ADDR (MCI_STATUS_ADDRV|MCI_STATUS_MISCV)
 
@@ -101,6 +102,22 @@ static struct severity {
 		NOSER, BITCLR(MCI_STATUS_UC)
 		),
 
+	/*
+	 * known AO MCACODs reported via MCE or CMC:
+	 *
+	 * SRAO could be signaled either via a machine check exception or
+	 * CMCI with the corresponding bit S 1 or 0. So we don't need to
+	 * check bit S for SRAO.
+	 */
+	MCESEV(
+		AO, "Action optional: memory scrubbing error",
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_AR|MCACOD_SCRUBMSK, MCI_STATUS_UC|MCACOD_SCRUB)
+		),
+	MCESEV(
+		AO, "Action optional: last level cache writeback error",
+		SER, MASK(MCI_STATUS_OVER|MCI_UC_AR|MCACOD, MCI_STATUS_UC|MCACOD_L3WB)
+		),
+
 	/* ignore OVER for UCNA */
 	MCESEV(
 		UCNA, "Uncorrected no action required",
@@ -149,15 +166,6 @@ static struct severity {
 		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR, MCI_UC_SAR)
 		),
 
-	/* known AO MCACODs: */
-	MCESEV(
-		AO, "Action optional: memory scrubbing error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD_SCRUBMSK, MCI_UC_S|MCACOD_SCRUB)
-		),
-	MCESEV(
-		AO, "Action optional: last level cache writeback error",
-		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR|MCACOD, MCI_UC_S|MCACOD_L3WB)
-		),
 	MCESEV(
 		SOME, "Action optional: unknown MCACOD",
 		SER, MASK(MCI_STATUS_OVER|MCI_UC_SAR, MCI_UC_S)
diff --git a/arch/x86/kernel/cpu/mcheck/mce.c b/arch/x86/kernel/cpu/mcheck/mce.c
index 3b7319e..ba1f955 100644
--- a/arch/x86/kernel/cpu/mcheck/mce.c
+++ b/arch/x86/kernel/cpu/mcheck/mce.c
@@ -503,10 +503,8 @@ static int mce_usable_address(struct mce *m)
 bool mce_is_memory_error(struct mce *m)
 {
 	if (m->cpuvendor == X86_VENDOR_AMD) {
-		/* ErrCodeExt[20:16] */
-		u8 xec = (m->status >> 16) & 0x1f;
+		return amd_mce_is_memory_error(m);
 
-		return (xec == 0x0 || xec == 0x8);
 	} else if (m->cpuvendor == X86_VENDOR_INTEL) {
 		/*
 		 * Intel SDM Volume 3B - 15.9.2 Compound Error Codes
@@ -530,6 +528,17 @@ bool mce_is_memory_error(struct mce *m)
 }
 EXPORT_SYMBOL_GPL(mce_is_memory_error);
 
+static bool mce_is_correctable(struct mce *m)
+{
+	if (m->cpuvendor == X86_VENDOR_AMD && m->status & MCI_STATUS_DEFERRED)
+		return false;
+
+	if (m->status & MCI_STATUS_UC)
+		return false;
+
+	return true;
+}
+
 static bool cec_add_mce(struct mce *m)
 {
 	if (!m)
@@ -537,7 +546,7 @@ static bool cec_add_mce(struct mce *m)
 
 	/* We eat only correctable DRAM errors with usable addresses. */
 	if (mce_is_memory_error(m) &&
-	    !(m->status & MCI_STATUS_UC) &&
+	    mce_is_correctable(m)  &&
 	    mce_usable_address(m))
 		if (!cec_add_elem(m->addr >> PAGE_SHIFT))
 			return true;
@@ -1785,6 +1794,11 @@ static void unexpected_machine_check(struct pt_regs *regs, long error_code)
 void (*machine_check_vector)(struct pt_regs *, long error_code) =
 						unexpected_machine_check;
 
+dotraplinkage void do_mce(struct pt_regs *regs, long error_code)
+{
+	machine_check_vector(regs, error_code);
+}
+
 /*
  * Called for each booted CPU to set up machine checks.
  * Must be called with preempt off:
diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
index 486f640..0f32ad2 100644
--- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
+++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
@@ -110,6 +110,20 @@ const char *smca_get_long_name(enum smca_bank_types t)
 }
 EXPORT_SYMBOL_GPL(smca_get_long_name);
 
+static enum smca_bank_types smca_get_bank_type(struct mce *m)
+{
+	struct smca_bank *b;
+
+	if (m->bank >= N_SMCA_BANK_TYPES)
+		return N_SMCA_BANK_TYPES;
+
+	b = &smca_banks[m->bank];
+	if (!b->hwid)
+		return N_SMCA_BANK_TYPES;
+
+	return b->hwid->bank_type;
+}
+
 static struct smca_hwid smca_hwid_mcatypes[] = {
 	/* { bank_type, hwid_mcatype, xec_bitmap } */
 
@@ -407,7 +421,9 @@ static void deferred_error_interrupt_enable(struct cpuinfo_x86 *c)
 	    (deferred_error_int_vector != amd_deferred_error_interrupt))
 		deferred_error_int_vector = amd_deferred_error_interrupt;
 
-	low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
+	if (!mce_flags.smca)
+		low = (low & ~MASK_DEF_INT_TYPE) | DEF_INT_TYPE_APIC;
+
 	wrmsr(MSR_CU_DEF_ERR, low, high);
 }
 
@@ -738,6 +754,17 @@ int umc_normaddr_to_sysaddr(u64 norm_addr, u16 nid, u8 umc, u64 *sys_addr)
 }
 EXPORT_SYMBOL_GPL(umc_normaddr_to_sysaddr);
 
+bool amd_mce_is_memory_error(struct mce *m)
+{
+	/* ErrCodeExt[20:16] */
+	u8 xec = (m->status >> 16) & 0x1f;
+
+	if (mce_flags.smca)
+		return smca_get_bank_type(m) == SMCA_UMC && xec == 0x0;
+
+	return m->bank == 4 && xec == 0x8;
+}
+
 static void __log_error(unsigned int bank, u64 status, u64 addr, u64 misc)
 {
 	struct mce m;
diff --git a/arch/x86/kernel/cpu/microcode/core.c b/arch/x86/kernel/cpu/microcode/core.c
index c4fa4a8..e4fc595 100644
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -239,7 +239,7 @@ static int __init save_microcode_in_initrd(void)
 		break;
 	case X86_VENDOR_AMD:
 		if (c->x86 >= 0x10)
-			return save_microcode_in_initrd_amd(cpuid_eax(1));
+			ret = save_microcode_in_initrd_amd(cpuid_eax(1));
 		break;
 	default:
 		break;
diff --git a/arch/x86/kernel/cpu/microcode/intel.c b/arch/x86/kernel/cpu/microcode/intel.c
index 8ccdca6..f7c55b0 100644
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -45,6 +45,9 @@ static const char ucode_path[] = "kernel/x86/microcode/GenuineIntel.bin";
 /* Current microcode patch used in early patching on the APs. */
 static struct microcode_intel *intel_ucode_patch;
 
+/* last level cache size per core */
+static int llc_size_per_core;
+
 static inline bool cpu_signatures_match(unsigned int s1, unsigned int p1,
 					unsigned int s2, unsigned int p2)
 {
@@ -910,8 +913,19 @@ static bool is_blacklisted(unsigned int cpu)
 {
 	struct cpuinfo_x86 *c = &cpu_data(cpu);
 
-	if (c->x86 == 6 && c->x86_model == INTEL_FAM6_BROADWELL_X) {
-		pr_err_once("late loading on model 79 is disabled.\n");
+	/*
+	 * Late loading on model 79 with microcode revision less than 0x0b000021
+	 * and LLC size per core bigger than 2.5MB may result in a system hang.
+	 * This behavior is documented in item BDF90, #334165 (Intel Xeon
+	 * Processor E7-8800/4800 v4 Product Family).
+	 */
+	if (c->x86 == 6 &&
+	    c->x86_model == INTEL_FAM6_BROADWELL_X &&
+	    c->x86_mask == 0x01 &&
+	    llc_size_per_core > 2621440 &&
+	    c->microcode < 0x0b000021) {
+		pr_err_once("Erratum BDF90: late loading with revision < 0x0b000021 (0x%x) disabled.\n", c->microcode);
+		pr_err_once("Please consider either early loading through initrd/built-in or a potential BIOS update.\n");
 		return true;
 	}
 
@@ -966,6 +980,15 @@ static struct microcode_ops microcode_intel_ops = {
 	.apply_microcode                  = apply_microcode_intel,
 };
 
+static int __init calc_llc_size_per_core(struct cpuinfo_x86 *c)
+{
+	u64 llc_size = c->x86_cache_size * 1024;
+
+	do_div(llc_size, c->x86_max_cores);
+
+	return (int)llc_size;
+}
+
 struct microcode_ops * __init init_intel_microcode(void)
 {
 	struct cpuinfo_x86 *c = &boot_cpu_data;
@@ -976,5 +999,7 @@ struct microcode_ops * __init init_intel_microcode(void)
 		return NULL;
 	}
 
+	llc_size_per_core = calc_llc_size_per_core(c);
+
 	return &microcode_intel_ops;
 }
diff --git a/arch/x86/kernel/cpu/scattered.c b/arch/x86/kernel/cpu/scattered.c
index 05459ad..4075d2b 100644
--- a/arch/x86/kernel/cpu/scattered.c
+++ b/arch/x86/kernel/cpu/scattered.c
@@ -21,12 +21,10 @@ struct cpuid_bit {
 static const struct cpuid_bit cpuid_bits[] = {
 	{ X86_FEATURE_APERFMPERF,       CPUID_ECX,  0, 0x00000006, 0 },
 	{ X86_FEATURE_EPB,		CPUID_ECX,  3, 0x00000006, 0 },
-	{ X86_FEATURE_INTEL_PT,		CPUID_EBX, 25, 0x00000007, 0 },
-	{ X86_FEATURE_AVX512_4VNNIW,    CPUID_EDX,  2, 0x00000007, 0 },
-	{ X86_FEATURE_AVX512_4FMAPS,    CPUID_EDX,  3, 0x00000007, 0 },
 	{ X86_FEATURE_CAT_L3,		CPUID_EBX,  1, 0x00000010, 0 },
 	{ X86_FEATURE_CAT_L2,		CPUID_EBX,  2, 0x00000010, 0 },
 	{ X86_FEATURE_CDP_L3,		CPUID_ECX,  2, 0x00000010, 1 },
+	{ X86_FEATURE_CDP_L2,		CPUID_ECX,  2, 0x00000010, 2 },
 	{ X86_FEATURE_MBA,		CPUID_EBX,  3, 0x00000010, 0 },
 	{ X86_FEATURE_HW_PSTATE,	CPUID_EDX,  7, 0x80000007, 0 },
 	{ X86_FEATURE_CPB,		CPUID_EDX,  9, 0x80000007, 0 },
diff --git a/arch/x86/kernel/dumpstack.c b/arch/x86/kernel/dumpstack.c
index 5fa1106..afbecff 100644
--- a/arch/x86/kernel/dumpstack.c
+++ b/arch/x86/kernel/dumpstack.c
@@ -76,12 +76,23 @@ void show_iret_regs(struct pt_regs *regs)
 		regs->sp, regs->flags);
 }
 
-static void show_regs_safe(struct stack_info *info, struct pt_regs *regs)
+static void show_regs_if_on_stack(struct stack_info *info, struct pt_regs *regs,
+				  bool partial)
 {
-	if (on_stack(info, regs, sizeof(*regs)))
+	/*
+	 * These on_stack() checks aren't strictly necessary: the unwind code
+	 * has already validated the 'regs' pointer.  The checks are done for
+	 * ordering reasons: if the registers are on the next stack, we don't
+	 * want to print them out yet.  Otherwise they'll be shown as part of
+	 * the wrong stack.  Later, when show_trace_log_lvl() switches to the
+	 * next stack, this function will be called again with the same regs so
+	 * they can be printed in the right context.
+	 */
+	if (!partial && on_stack(info, regs, sizeof(*regs))) {
 		__show_regs(regs, 0);
-	else if (on_stack(info, (void *)regs + IRET_FRAME_OFFSET,
-			  IRET_FRAME_SIZE)) {
+
+	} else if (partial && on_stack(info, (void *)regs + IRET_FRAME_OFFSET,
+				       IRET_FRAME_SIZE)) {
 		/*
 		 * When an interrupt or exception occurs in entry code, the
 		 * full pt_regs might not have been saved yet.  In that case
@@ -98,11 +109,13 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 	struct stack_info stack_info = {0};
 	unsigned long visit_mask = 0;
 	int graph_idx = 0;
+	bool partial;
 
 	printk("%sCall Trace:\n", log_lvl);
 
 	unwind_start(&state, task, regs, stack);
 	stack = stack ? : get_stack_pointer(task, regs);
+	regs = unwind_get_entry_regs(&state, &partial);
 
 	/*
 	 * Iterate through the stacks, starting with the current stack pointer.
@@ -120,7 +133,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 	 * - hardirq stack
 	 * - entry stack
 	 */
-	for (regs = NULL; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
+	for ( ; stack; stack = PTR_ALIGN(stack_info.next_sp, sizeof(long))) {
 		const char *stack_name;
 
 		if (get_stack_info(stack, task, &stack_info, &visit_mask)) {
@@ -140,7 +153,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 			printk("%s <%s>\n", log_lvl, stack_name);
 
 		if (regs)
-			show_regs_safe(&stack_info, regs);
+			show_regs_if_on_stack(&stack_info, regs, partial);
 
 		/*
 		 * Scan the stack, printing any text addresses we find.  At the
@@ -164,7 +177,7 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 
 			/*
 			 * Don't print regs->ip again if it was already printed
-			 * by show_regs_safe() below.
+			 * by show_regs_if_on_stack().
 			 */
 			if (regs && stack == &regs->ip)
 				goto next;
@@ -199,9 +212,9 @@ void show_trace_log_lvl(struct task_struct *task, struct pt_regs *regs,
 			unwind_next_frame(&state);
 
 			/* if the frame has entry regs, print them */
-			regs = unwind_get_entry_regs(&state);
+			regs = unwind_get_entry_regs(&state, &partial);
 			if (regs)
-				show_regs_safe(&stack_info, regs);
+				show_regs_if_on_stack(&stack_info, regs, partial);
 		}
 
 		if (stack_name)
diff --git a/arch/x86/kernel/ftrace_32.S b/arch/x86/kernel/ftrace_32.S
index b6c6468e..4c8440d 100644
--- a/arch/x86/kernel/ftrace_32.S
+++ b/arch/x86/kernel/ftrace_32.S
@@ -8,6 +8,7 @@
 #include <asm/segment.h>
 #include <asm/export.h>
 #include <asm/ftrace.h>
+#include <asm/nospec-branch.h>
 
 #ifdef CC_USING_FENTRY
 # define function_hook	__fentry__
@@ -197,7 +198,8 @@
 	movl	0x4(%ebp), %edx
 	subl	$MCOUNT_INSN_SIZE, %eax
 
-	call	*ftrace_trace_function
+	movl	ftrace_trace_function, %ecx
+	CALL_NOSPEC %ecx
 
 	popl	%edx
 	popl	%ecx
@@ -241,5 +243,5 @@
 	movl	%eax, %ecx
 	popl	%edx
 	popl	%eax
-	jmp	*%ecx
+	JMP_NOSPEC %ecx
 #endif
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index c832291..91b2cff 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -7,7 +7,8 @@
 #include <asm/ptrace.h>
 #include <asm/ftrace.h>
 #include <asm/export.h>
-
+#include <asm/nospec-branch.h>
+#include <asm/unwind_hints.h>
 
 	.code64
 	.section .entry.text, "ax"
@@ -20,7 +21,6 @@
 EXPORT_SYMBOL(mcount)
 #endif
 
-/* All cases save the original rbp (8 bytes) */
 #ifdef CONFIG_FRAME_POINTER
 # ifdef CC_USING_FENTRY
 /* Save parent and function stack frames (rip and rbp) */
@@ -31,7 +31,7 @@
 # endif
 #else
 /* No need to save a stack frame */
-# define MCOUNT_FRAME_SIZE	8
+# define MCOUNT_FRAME_SIZE	0
 #endif /* CONFIG_FRAME_POINTER */
 
 /* Size of stack used to save mcount regs in save_mcount_regs */
@@ -64,10 +64,10 @@
  */
 .macro save_mcount_regs added=0
 
-	/* Always save the original rbp */
+#ifdef CONFIG_FRAME_POINTER
+	/* Save the original rbp */
 	pushq %rbp
 
-#ifdef CONFIG_FRAME_POINTER
 	/*
 	 * Stack traces will stop at the ftrace trampoline if the frame pointer
 	 * is not set up properly. If fentry is used, we need to save a frame
@@ -105,7 +105,11 @@
 	 * Save the original RBP. Even though the mcount ABI does not
 	 * require this, it helps out callers.
 	 */
+#ifdef CONFIG_FRAME_POINTER
 	movq MCOUNT_REG_SIZE-8(%rsp), %rdx
+#else
+	movq %rbp, %rdx
+#endif
 	movq %rdx, RBP(%rsp)
 
 	/* Copy the parent address into %rsi (second parameter) */
@@ -148,7 +152,7 @@
 
 ENTRY(function_hook)
 	retq
-END(function_hook)
+ENDPROC(function_hook)
 
 ENTRY(ftrace_caller)
 	/* save_mcount_regs fills in first two parameters */
@@ -184,7 +188,7 @@
 /* This is weak to keep gas from relaxing the jumps */
 WEAK(ftrace_stub)
 	retq
-END(ftrace_caller)
+ENDPROC(ftrace_caller)
 
 ENTRY(ftrace_regs_caller)
 	/* Save the current flags before any operations that can change them */
@@ -255,7 +259,7 @@
 
 	jmp ftrace_epilogue
 
-END(ftrace_regs_caller)
+ENDPROC(ftrace_regs_caller)
 
 
 #else /* ! CONFIG_DYNAMIC_FTRACE */
@@ -286,12 +290,12 @@
 	 * ip and parent ip are used and the list function is called when
 	 * function tracing is enabled.
 	 */
-	call   *ftrace_trace_function
-
+	movq ftrace_trace_function, %r8
+	CALL_NOSPEC %r8
 	restore_mcount_regs
 
 	jmp fgraph_trace
-END(function_hook)
+ENDPROC(function_hook)
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
@@ -313,9 +317,10 @@
 	restore_mcount_regs
 
 	retq
-END(ftrace_graph_caller)
+ENDPROC(ftrace_graph_caller)
 
-GLOBAL(return_to_handler)
+ENTRY(return_to_handler)
+	UNWIND_HINT_EMPTY
 	subq  $24, %rsp
 
 	/* Save the return values */
@@ -329,5 +334,6 @@
 	movq 8(%rsp), %rdx
 	movq (%rsp), %rax
 	addq $24, %rsp
-	jmp *%rdi
+	JMP_NOSPEC %rdi
+END(return_to_handler)
 #endif
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index 6a5d757..7ba5d81 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -157,8 +157,8 @@ unsigned long __head __startup_64(unsigned long physaddr,
 	p = fixup_pointer(&phys_base, physaddr);
 	*p += load_delta - sme_get_me_mask();
 
-	/* Encrypt the kernel (if SME is active) */
-	sme_encrypt_kernel();
+	/* Encrypt the kernel and related (if SME is active) */
+	sme_encrypt_kernel(bp);
 
 	/*
 	 * Return the SME encryption mask (if SME is active) to be used as a
diff --git a/arch/x86/kernel/idt.c b/arch/x86/kernel/idt.c
index d985cef..56d99be 100644
--- a/arch/x86/kernel/idt.c
+++ b/arch/x86/kernel/idt.c
@@ -56,7 +56,7 @@ struct idt_data {
  * Early traps running on the DEFAULT_STACK because the other interrupt
  * stacks work only after cpu_init().
  */
-static const __initdata struct idt_data early_idts[] = {
+static const __initconst struct idt_data early_idts[] = {
 	INTG(X86_TRAP_DB,		debug),
 	SYSG(X86_TRAP_BP,		int3),
 #ifdef CONFIG_X86_32
@@ -70,7 +70,7 @@ static const __initdata struct idt_data early_idts[] = {
  * the traps which use them are reinitialized with IST after cpu_init() has
  * set up TSS.
  */
-static const __initdata struct idt_data def_idts[] = {
+static const __initconst struct idt_data def_idts[] = {
 	INTG(X86_TRAP_DE,		divide_error),
 	INTG(X86_TRAP_NMI,		nmi),
 	INTG(X86_TRAP_BR,		bounds),
@@ -108,7 +108,7 @@ static const __initdata struct idt_data def_idts[] = {
 /*
  * The APIC and SMP idt entries
  */
-static const __initdata struct idt_data apic_idts[] = {
+static const __initconst struct idt_data apic_idts[] = {
 #ifdef CONFIG_SMP
 	INTG(RESCHEDULE_VECTOR,		reschedule_interrupt),
 	INTG(CALL_FUNCTION_VECTOR,	call_function_interrupt),
@@ -150,7 +150,7 @@ static const __initdata struct idt_data apic_idts[] = {
  * Early traps running on the DEFAULT_STACK because the other interrupt
  * stacks work only after cpu_init().
  */
-static const __initdata struct idt_data early_pf_idts[] = {
+static const __initconst struct idt_data early_pf_idts[] = {
 	INTG(X86_TRAP_PF,		page_fault),
 };
 
@@ -158,7 +158,7 @@ static const __initdata struct idt_data early_pf_idts[] = {
  * Override for the debug_idt. Same as the default, but with interrupt
  * stack set to DEFAULT_STACK (0). Required for NMI trap handling.
  */
-static const __initdata struct idt_data dbg_idts[] = {
+static const __initconst struct idt_data dbg_idts[] = {
 	INTG(X86_TRAP_DB,	debug),
 	INTG(X86_TRAP_BP,	int3),
 };
@@ -180,7 +180,7 @@ gate_desc debug_idt_table[IDT_ENTRIES] __page_aligned_bss;
  * The exceptions which use Interrupt stacks. They are setup after
  * cpu_init() when the TSS has been initialized.
  */
-static const __initdata struct idt_data ist_idts[] = {
+static const __initconst struct idt_data ist_idts[] = {
 	ISTG(X86_TRAP_DB,	debug,		DEBUG_STACK),
 	ISTG(X86_TRAP_NMI,	nmi,		NMI_STACK),
 	SISTG(X86_TRAP_BP,	int3,		DEBUG_STACK),
diff --git a/arch/x86/kernel/irq_32.c b/arch/x86/kernel/irq_32.c
index a83b334..c1bdbd3 100644
--- a/arch/x86/kernel/irq_32.c
+++ b/arch/x86/kernel/irq_32.c
@@ -20,6 +20,7 @@
 #include <linux/mm.h>
 
 #include <asm/apic.h>
+#include <asm/nospec-branch.h>
 
 #ifdef CONFIG_DEBUG_STACKOVERFLOW
 
@@ -55,11 +56,11 @@ DEFINE_PER_CPU(struct irq_stack *, softirq_stack);
 static void call_on_stack(void *func, void *stack)
 {
 	asm volatile("xchgl	%%ebx,%%esp	\n"
-		     "call	*%%edi		\n"
+		     CALL_NOSPEC
 		     "movl	%%ebx,%%esp	\n"
 		     : "=b" (stack)
 		     : "0" (stack),
-		       "D"(func)
+		       [thunk_target] "D"(func)
 		     : "memory", "cc", "edx", "ecx", "eax");
 }
 
@@ -95,11 +96,11 @@ static inline int execute_on_irq_stack(int overflow, struct irq_desc *desc)
 		call_on_stack(print_stack_overflow, isp);
 
 	asm volatile("xchgl	%%ebx,%%esp	\n"
-		     "call	*%%edi		\n"
+		     CALL_NOSPEC
 		     "movl	%%ebx,%%esp	\n"
 		     : "=a" (arg1), "=b" (isp)
 		     :  "0" (desc),   "1" (isp),
-			"D" (desc->handle_irq)
+			[thunk_target] "D" (desc->handle_irq)
 		     : "memory", "cc", "ecx");
 	return 1;
 }
diff --git a/arch/x86/kernel/irqinit.c b/arch/x86/kernel/irqinit.c
index 8da3e90..a539410 100644
--- a/arch/x86/kernel/irqinit.c
+++ b/arch/x86/kernel/irqinit.c
@@ -61,6 +61,9 @@ void __init init_ISA_irqs(void)
 	struct irq_chip *chip = legacy_pic->chip;
 	int i;
 
+#if defined(CONFIG_X86_64) || defined(CONFIG_X86_LOCAL_APIC)
+	init_bsp_APIC();
+#endif
 	legacy_pic->init(0);
 
 	for (i = 0; i < nr_legacy_irqs(); i++)
diff --git a/arch/x86/kernel/itmt.c b/arch/x86/kernel/itmt.c
index f73f475..d177940 100644
--- a/arch/x86/kernel/itmt.c
+++ b/arch/x86/kernel/itmt.c
@@ -24,7 +24,6 @@
 #include <linux/cpumask.h>
 #include <linux/cpuset.h>
 #include <linux/mutex.h>
-#include <linux/sched.h>
 #include <linux/sysctl.h>
 #include <linux/nodemask.h>
 
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
new file mode 100644
index 0000000..b68fd89
--- /dev/null
+++ b/arch/x86/kernel/jailhouse.c
@@ -0,0 +1,211 @@
+// SPDX-License-Identifier: GPL2.0
+/*
+ * Jailhouse paravirt_ops implementation
+ *
+ * Copyright (c) Siemens AG, 2015-2017
+ *
+ * Authors:
+ *  Jan Kiszka <jan.kiszka@siemens.com>
+ */
+
+#include <linux/acpi_pmtmr.h>
+#include <linux/kernel.h>
+#include <linux/reboot.h>
+#include <asm/apic.h>
+#include <asm/cpu.h>
+#include <asm/hypervisor.h>
+#include <asm/i8259.h>
+#include <asm/irqdomain.h>
+#include <asm/pci_x86.h>
+#include <asm/reboot.h>
+#include <asm/setup.h>
+
+static __initdata struct jailhouse_setup_data setup_data;
+static unsigned int precalibrated_tsc_khz;
+
+static uint32_t jailhouse_cpuid_base(void)
+{
+	if (boot_cpu_data.cpuid_level < 0 ||
+	    !boot_cpu_has(X86_FEATURE_HYPERVISOR))
+		return 0;
+
+	return hypervisor_cpuid_base("Jailhouse\0\0\0", 0);
+}
+
+static uint32_t __init jailhouse_detect(void)
+{
+	return jailhouse_cpuid_base();
+}
+
+static void jailhouse_get_wallclock(struct timespec *now)
+{
+	memset(now, 0, sizeof(*now));
+}
+
+static void __init jailhouse_timer_init(void)
+{
+	lapic_timer_frequency = setup_data.apic_khz * (1000 / HZ);
+}
+
+static unsigned long jailhouse_get_tsc(void)
+{
+	return precalibrated_tsc_khz;
+}
+
+static void __init jailhouse_x2apic_init(void)
+{
+#ifdef CONFIG_X86_X2APIC
+	if (!x2apic_enabled())
+		return;
+	/*
+	 * We do not have access to IR inside Jailhouse non-root cells.  So
+	 * we have to run in physical mode.
+	 */
+	x2apic_phys = 1;
+	/*
+	 * This will trigger the switch to apic_x2apic_phys.  Empty OEM IDs
+	 * ensure that only this APIC driver picks up the call.
+	 */
+	default_acpi_madt_oem_check("", "");
+#endif
+}
+
+static void __init jailhouse_get_smp_config(unsigned int early)
+{
+	struct ioapic_domain_cfg ioapic_cfg = {
+		.type = IOAPIC_DOMAIN_STRICT,
+		.ops = &mp_ioapic_irqdomain_ops,
+	};
+	struct mpc_intsrc mp_irq = {
+		.type = MP_INTSRC,
+		.irqtype = mp_INT,
+		.irqflag = MP_IRQPOL_ACTIVE_HIGH | MP_IRQTRIG_EDGE,
+	};
+	unsigned int cpu;
+
+	jailhouse_x2apic_init();
+
+	register_lapic_address(0xfee00000);
+
+	for (cpu = 0; cpu < setup_data.num_cpus; cpu++) {
+		generic_processor_info(setup_data.cpu_ids[cpu],
+				       boot_cpu_apic_version);
+	}
+
+	smp_found_config = 1;
+
+	if (setup_data.standard_ioapic) {
+		mp_register_ioapic(0, 0xfec00000, gsi_top, &ioapic_cfg);
+
+		/* Register 1:1 mapping for legacy UART IRQs 3 and 4 */
+		mp_irq.srcbusirq = mp_irq.dstirq = 3;
+		mp_save_irq(&mp_irq);
+
+		mp_irq.srcbusirq = mp_irq.dstirq = 4;
+		mp_save_irq(&mp_irq);
+	}
+}
+
+static void jailhouse_no_restart(void)
+{
+	pr_notice("Jailhouse: Restart not supported, halting\n");
+	machine_halt();
+}
+
+static int __init jailhouse_pci_arch_init(void)
+{
+	pci_direct_init(1);
+
+	/*
+	 * There are no bridges on the virtual PCI root bus under Jailhouse,
+	 * thus no other way to discover all devices than a full scan.
+	 * Respect any overrides via the command line, though.
+	 */
+	if (pcibios_last_bus < 0)
+		pcibios_last_bus = 0xff;
+
+	return 0;
+}
+
+static void __init jailhouse_init_platform(void)
+{
+	u64 pa_data = boot_params.hdr.setup_data;
+	struct setup_data header;
+	void *mapping;
+
+	x86_init.irqs.pre_vector_init	= x86_init_noop;
+	x86_init.timers.timer_init	= jailhouse_timer_init;
+	x86_init.mpparse.get_smp_config	= jailhouse_get_smp_config;
+	x86_init.pci.arch_init		= jailhouse_pci_arch_init;
+
+	x86_platform.calibrate_cpu	= jailhouse_get_tsc;
+	x86_platform.calibrate_tsc	= jailhouse_get_tsc;
+	x86_platform.get_wallclock	= jailhouse_get_wallclock;
+	x86_platform.legacy.rtc		= 0;
+	x86_platform.legacy.warm_reset	= 0;
+	x86_platform.legacy.i8042	= X86_LEGACY_I8042_PLATFORM_ABSENT;
+
+	legacy_pic			= &null_legacy_pic;
+
+	machine_ops.emergency_restart	= jailhouse_no_restart;
+
+	while (pa_data) {
+		mapping = early_memremap(pa_data, sizeof(header));
+		memcpy(&header, mapping, sizeof(header));
+		early_memunmap(mapping, sizeof(header));
+
+		if (header.type == SETUP_JAILHOUSE &&
+		    header.len >= sizeof(setup_data)) {
+			pa_data += offsetof(struct setup_data, data);
+
+			mapping = early_memremap(pa_data, sizeof(setup_data));
+			memcpy(&setup_data, mapping, sizeof(setup_data));
+			early_memunmap(mapping, sizeof(setup_data));
+
+			break;
+		}
+
+		pa_data = header.next;
+	}
+
+	if (!pa_data)
+		panic("Jailhouse: No valid setup data found");
+
+	if (setup_data.compatible_version > JAILHOUSE_SETUP_REQUIRED_VERSION)
+		panic("Jailhouse: Unsupported setup data structure");
+
+	pmtmr_ioport = setup_data.pm_timer_address;
+	pr_debug("Jailhouse: PM-Timer IO Port: %#x\n", pmtmr_ioport);
+
+	precalibrated_tsc_khz = setup_data.tsc_khz;
+	setup_force_cpu_cap(X86_FEATURE_TSC_KNOWN_FREQ);
+
+	pci_probe = 0;
+
+	/*
+	 * Avoid that the kernel complains about missing ACPI tables - there
+	 * are none in a non-root cell.
+	 */
+	disable_acpi();
+}
+
+bool jailhouse_paravirt(void)
+{
+	return jailhouse_cpuid_base() != 0;
+}
+
+static bool jailhouse_x2apic_available(void)
+{
+	/*
+	 * The x2APIC is only available if the root cell enabled it. Jailhouse
+	 * does not support switching between xAPIC and x2APIC.
+	 */
+	return x2apic_enabled();
+}
+
+const struct hypervisor_x86 x86_hyper_jailhouse __refconst = {
+	.name			= "Jailhouse",
+	.detect			= jailhouse_detect,
+	.init.init_platform	= jailhouse_init_platform,
+	.init.x2apic_available	= jailhouse_x2apic_available,
+};
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index e941136..203d3988 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -40,6 +40,7 @@
 #include <asm/debugreg.h>
 #include <asm/set_memory.h>
 #include <asm/sections.h>
+#include <asm/nospec-branch.h>
 
 #include "common.h"
 
@@ -203,7 +204,7 @@ static int copy_optimized_instructions(u8 *dest, u8 *src, u8 *real)
 }
 
 /* Check whether insn is indirect jump */
-static int insn_is_indirect_jump(struct insn *insn)
+static int __insn_is_indirect_jump(struct insn *insn)
 {
 	return ((insn->opcode.bytes[0] == 0xff &&
 		(X86_MODRM_REG(insn->modrm.value) & 6) == 4) || /* Jump */
@@ -237,6 +238,26 @@ static int insn_jump_into_range(struct insn *insn, unsigned long start, int len)
 	return (start <= target && target <= start + len);
 }
 
+static int insn_is_indirect_jump(struct insn *insn)
+{
+	int ret = __insn_is_indirect_jump(insn);
+
+#ifdef CONFIG_RETPOLINE
+	/*
+	 * Jump to x86_indirect_thunk_* is treated as an indirect jump.
+	 * Note that even with CONFIG_RETPOLINE=y, the kernel compiled with
+	 * older gcc may use indirect jump. So we add this check instead of
+	 * replace indirect-jump check.
+	 */
+	if (!ret)
+		ret = insn_jump_into_range(insn,
+				(unsigned long)__indirect_thunk_start,
+				(unsigned long)__indirect_thunk_end -
+				(unsigned long)__indirect_thunk_start);
+#endif
+	return ret;
+}
+
 /* Decode whole function to ensure any instructions don't jump into target */
 static int can_optimize(unsigned long paddr)
 {
diff --git a/arch/x86/kernel/mpparse.c b/arch/x86/kernel/mpparse.c
index 3a4b128..27d0a17 100644
--- a/arch/x86/kernel/mpparse.c
+++ b/arch/x86/kernel/mpparse.c
@@ -281,7 +281,7 @@ static void __init construct_default_ioirq_mptable(int mpc_default_type)
 	int ELCR_fallback = 0;
 
 	intsrc.type = MP_INTSRC;
-	intsrc.irqflag = 0;	/* conforming */
+	intsrc.irqflag = MP_IRQTRIG_DEFAULT | MP_IRQPOL_DEFAULT;
 	intsrc.srcbus = 0;
 	intsrc.dstapic = mpc_ioapic_id(0);
 
@@ -324,10 +324,13 @@ static void __init construct_default_ioirq_mptable(int mpc_default_type)
 			 *  copy that information over to the MP table in the
 			 *  irqflag field (level sensitive, active high polarity).
 			 */
-			if (ELCR_trigger(i))
-				intsrc.irqflag = 13;
-			else
-				intsrc.irqflag = 0;
+			if (ELCR_trigger(i)) {
+				intsrc.irqflag = MP_IRQTRIG_LEVEL |
+						 MP_IRQPOL_ACTIVE_HIGH;
+			} else {
+				intsrc.irqflag = MP_IRQTRIG_DEFAULT |
+						 MP_IRQPOL_DEFAULT;
+			}
 		}
 
 		intsrc.srcbusirq = i;
@@ -419,7 +422,7 @@ static inline void __init construct_default_ISA_mptable(int mpc_default_type)
 	construct_ioapic_table(mpc_default_type);
 
 	lintsrc.type = MP_LINTSRC;
-	lintsrc.irqflag = 0;		/* conforming */
+	lintsrc.irqflag = MP_IRQTRIG_DEFAULT | MP_IRQPOL_DEFAULT;
 	lintsrc.srcbusid = 0;
 	lintsrc.srcbusirq = 0;
 	lintsrc.destapic = MP_APIC_ALL;
@@ -664,7 +667,7 @@ static int  __init get_MP_intsrc_index(struct mpc_intsrc *m)
 	if (m->irqtype != mp_INT)
 		return 0;
 
-	if (m->irqflag != 0x0f)
+	if (m->irqflag != (MP_IRQTRIG_LEVEL | MP_IRQPOL_ACTIVE_LOW))
 		return 0;
 
 	/* not legacy */
@@ -673,7 +676,8 @@ static int  __init get_MP_intsrc_index(struct mpc_intsrc *m)
 		if (mp_irqs[i].irqtype != mp_INT)
 			continue;
 
-		if (mp_irqs[i].irqflag != 0x0f)
+		if (mp_irqs[i].irqflag != (MP_IRQTRIG_LEVEL |
+					   MP_IRQPOL_ACTIVE_LOW))
 			continue;
 
 		if (mp_irqs[i].srcbus != m->srcbus)
@@ -784,7 +788,8 @@ static int  __init replace_intsrc_all(struct mpc_table *mpc,
 		if (mp_irqs[i].irqtype != mp_INT)
 			continue;
 
-		if (mp_irqs[i].irqflag != 0x0f)
+		if (mp_irqs[i].irqflag != (MP_IRQTRIG_LEVEL |
+					   MP_IRQPOL_ACTIVE_LOW))
 			continue;
 
 		if (nr_m_spare > 0) {
diff --git a/arch/x86/kernel/platform-quirks.c b/arch/x86/kernel/platform-quirks.c
index 39a5929..235fe60 100644
--- a/arch/x86/kernel/platform-quirks.c
+++ b/arch/x86/kernel/platform-quirks.c
@@ -9,6 +9,7 @@ void __init x86_early_init_platform_quirks(void)
 {
 	x86_platform.legacy.i8042 = X86_LEGACY_I8042_EXPECTED_PRESENT;
 	x86_platform.legacy.rtc = 1;
+	x86_platform.legacy.warm_reset = 1;
 	x86_platform.legacy.reserve_bios_regions = 0;
 	x86_platform.legacy.devices.pnpbios = 1;
 
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index aed9d94..03408b9 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -21,7 +21,6 @@
 #include <linux/dmi.h>
 #include <linux/utsname.h>
 #include <linux/stackprotector.h>
-#include <linux/tick.h>
 #include <linux/cpuidle.h>
 #include <trace/events/power.h>
 #include <linux/hw_breakpoint.h>
@@ -47,7 +46,7 @@
  * section. Since TSS's are completely CPU-local, we want them
  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
  */
-__visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tss_struct, cpu_tss_rw) = {
+__visible DEFINE_PER_CPU_PAGE_ALIGNED(struct tss_struct, cpu_tss_rw) = {
 	.x86_tss = {
 		/*
 		 * .sp0 is only used when entering ring 0 from a lower
@@ -380,19 +379,24 @@ void stop_this_cpu(void *dummy)
 	disable_local_APIC();
 	mcheck_cpu_clear(this_cpu_ptr(&cpu_info));
 
+	/*
+	 * Use wbinvd on processors that support SME. This provides support
+	 * for performing a successful kexec when going from SME inactive
+	 * to SME active (or vice-versa). The cache must be cleared so that
+	 * if there are entries with the same physical address, both with and
+	 * without the encryption bit, they don't race each other when flushed
+	 * and potentially end up with the wrong entry being committed to
+	 * memory.
+	 */
+	if (boot_cpu_has(X86_FEATURE_SME))
+		native_wbinvd();
 	for (;;) {
 		/*
-		 * Use wbinvd followed by hlt to stop the processor. This
-		 * provides support for kexec on a processor that supports
-		 * SME. With kexec, going from SME inactive to SME active
-		 * requires clearing cache entries so that addresses without
-		 * the encryption bit set don't corrupt the same physical
-		 * address that has the encryption bit set when caches are
-		 * flushed. To achieve this a wbinvd is performed followed by
-		 * a hlt. Even if the processor is not in the kexec/SME
-		 * scenario this only adds a wbinvd to a halting processor.
+		 * Use native_halt() so that memory contents don't change
+		 * (stack usage and variables) after possibly issuing the
+		 * native_wbinvd() above.
 		 */
-		asm volatile("wbinvd; hlt" : : : "memory");
+		native_halt();
 	}
 }
 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 8af2e8d..1ae67e9 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -114,7 +114,6 @@
 #include <asm/alternative.h>
 #include <asm/prom.h>
 #include <asm/microcode.h>
-#include <asm/mmu_context.h>
 #include <asm/kaslr.h>
 #include <asm/unwind.h>
 
@@ -364,16 +363,6 @@ static void __init reserve_initrd(void)
 	    !ramdisk_image || !ramdisk_size)
 		return;		/* No initrd provided by bootloader */
 
-	/*
-	 * If SME is active, this memory will be marked encrypted by the
-	 * kernel when it is accessed (including relocation). However, the
-	 * ramdisk image was loaded decrypted by the bootloader, so make
-	 * sure that it is encrypted before accessing it. For SEV the
-	 * ramdisk will already be encrypted, so only do this for SME.
-	 */
-	if (sme_active())
-		sme_early_encrypt(ramdisk_image, ramdisk_end - ramdisk_image);
-
 	initrd_start = 0;
 
 	mapped_size = memblock_mem_size(max_pfn_mapped);
@@ -906,9 +895,6 @@ void __init setup_arch(char **cmdline_p)
 		set_bit(EFI_BOOT, &efi.flags);
 		set_bit(EFI_64BIT, &efi.flags);
 	}
-
-	if (efi_enabled(EFI_BOOT))
-		efi_memblock_x86_reserve_range();
 #endif
 
 	x86_init.oem.arch_setup();
@@ -962,6 +948,8 @@ void __init setup_arch(char **cmdline_p)
 
 	parse_early_param();
 
+	if (efi_enabled(EFI_BOOT))
+		efi_memblock_x86_reserve_range();
 #ifdef CONFIG_MEMORY_HOTPLUG
 	/*
 	 * Memory used by the kernel cannot be hot-removed because Linux
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index ed556d5..6f27fac 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -75,7 +75,6 @@
 #include <asm/uv/uv.h>
 #include <linux/mc146818rtc.h>
 #include <asm/i8259.h>
-#include <asm/realmode.h>
 #include <asm/misc.h>
 #include <asm/qspinlock.h>
 
@@ -934,7 +933,7 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
 	 * the targeted processor.
 	 */
 
-	if (get_uv_system_type() != UV_NON_UNIQUE_APIC) {
+	if (x86_platform.legacy.warm_reset) {
 
 		pr_debug("Setting warm reset code and vector.\n");
 
@@ -1006,7 +1005,7 @@ static int do_boot_cpu(int apicid, int cpu, struct task_struct *idle,
 	/* mark "stuck" area as not stuck */
 	*trampoline_status = 0;
 
-	if (get_uv_system_type() != UV_NON_UNIQUE_APIC) {
+	if (x86_platform.legacy.warm_reset) {
 		/*
 		 * Cleanup possible dangling ends...
 		 */
diff --git a/arch/x86/kernel/stacktrace.c b/arch/x86/kernel/stacktrace.c
index 20161ef..093f2ea 100644
--- a/arch/x86/kernel/stacktrace.c
+++ b/arch/x86/kernel/stacktrace.c
@@ -102,7 +102,7 @@ __save_stack_trace_reliable(struct stack_trace *trace,
 	for (unwind_start(&state, task, NULL, NULL); !unwind_done(&state);
 	     unwind_next_frame(&state)) {
 
-		regs = unwind_get_entry_regs(&state);
+		regs = unwind_get_entry_regs(&state, NULL);
 		if (regs) {
 			/*
 			 * Kernel mode registers on the stack indicate an
diff --git a/arch/x86/kernel/tboot.c b/arch/x86/kernel/tboot.c
index a4eb279..a2486f4 100644
--- a/arch/x86/kernel/tboot.c
+++ b/arch/x86/kernel/tboot.c
@@ -138,6 +138,17 @@ static int map_tboot_page(unsigned long vaddr, unsigned long pfn,
 		return -1;
 	set_pte_at(&tboot_mm, vaddr, pte, pfn_pte(pfn, prot));
 	pte_unmap(pte);
+
+	/*
+	 * PTI poisons low addresses in the kernel page tables in the
+	 * name of making them unusable for userspace.  To execute
+	 * code at such a low address, the poison must be cleared.
+	 *
+	 * Note: 'pgd' actually gets set in p4d_alloc() _or_
+	 * pud_alloc() depending on 4/5-level paging.
+	 */
+	pgd->pgd &= ~_PAGE_NX;
+
 	return 0;
 }
 
diff --git a/arch/x86/kernel/time.c b/arch/x86/kernel/time.c
index 749d189..774ebaf 100644
--- a/arch/x86/kernel/time.c
+++ b/arch/x86/kernel/time.c
@@ -69,9 +69,12 @@ static struct irqaction irq0  = {
 
 static void __init setup_default_timer_irq(void)
 {
-	if (!nr_legacy_irqs())
-		return;
-	setup_irq(0, &irq0);
+	/*
+	 * Unconditionally register the legacy timer; even without legacy
+	 * PIC/PIT we need this for the HPET0 in legacy replacement mode.
+	 */
+	if (setup_irq(0, &irq0))
+		pr_info("Failed to register legacy timer interrupt\n");
 }
 
 /* Default timer init function */
diff --git a/arch/x86/kernel/tsc.c b/arch/x86/kernel/tsc.c
index 8ea117f..fb43027 100644
--- a/arch/x86/kernel/tsc.c
+++ b/arch/x86/kernel/tsc.c
@@ -25,6 +25,7 @@
 #include <asm/geode.h>
 #include <asm/apic.h>
 #include <asm/intel-family.h>
+#include <asm/i8259.h>
 
 unsigned int __read_mostly cpu_khz;	/* TSC clocks / usec, not used here */
 EXPORT_SYMBOL(cpu_khz);
@@ -363,6 +364,20 @@ static unsigned long pit_calibrate_tsc(u32 latch, unsigned long ms, int loopmin)
 	unsigned long tscmin, tscmax;
 	int pitcnt;
 
+	if (!has_legacy_pic()) {
+		/*
+		 * Relies on tsc_early_delay_calibrate() to have given us semi
+		 * usable udelay(), wait for the same 50ms we would have with
+		 * the PIT loop below.
+		 */
+		udelay(10 * USEC_PER_MSEC);
+		udelay(10 * USEC_PER_MSEC);
+		udelay(10 * USEC_PER_MSEC);
+		udelay(10 * USEC_PER_MSEC);
+		udelay(10 * USEC_PER_MSEC);
+		return ULONG_MAX;
+	}
+
 	/* Set the Gate high, disable speaker */
 	outb((inb(0x61) & ~0x02) | 0x01, 0x61);
 
@@ -487,6 +502,9 @@ static unsigned long quick_pit_calibrate(void)
 	u64 tsc, delta;
 	unsigned long d1, d2;
 
+	if (!has_legacy_pic())
+		return 0;
+
 	/* Set the Gate high, disable speaker */
 	outb((inb(0x61) & ~0x02) | 0x01, 0x61);
 
@@ -602,7 +620,6 @@ unsigned long native_calibrate_tsc(void)
 		case INTEL_FAM6_KABYLAKE_DESKTOP:
 			crystal_khz = 24000;	/* 24.0 MHz */
 			break;
-		case INTEL_FAM6_SKYLAKE_X:
 		case INTEL_FAM6_ATOM_DENVERTON:
 			crystal_khz = 25000;	/* 25.0 MHz */
 			break;
@@ -612,6 +629,8 @@ unsigned long native_calibrate_tsc(void)
 		}
 	}
 
+	if (crystal_khz == 0)
+		return 0;
 	/*
 	 * TSC frequency determined by CPUID is a "hardware reported"
 	 * frequency and is the most accurate one so far we have. This
@@ -987,8 +1006,6 @@ static void __init detect_art(void)
 
 /* clocksource code */
 
-static struct clocksource clocksource_tsc;
-
 static void tsc_resume(struct clocksource *cs)
 {
 	tsc_verify_tsc_adjust(true);
@@ -1039,12 +1056,31 @@ static void tsc_cs_tick_stable(struct clocksource *cs)
 /*
  * .mask MUST be CLOCKSOURCE_MASK(64). See comment above read_tsc()
  */
+static struct clocksource clocksource_tsc_early = {
+	.name                   = "tsc-early",
+	.rating                 = 299,
+	.read                   = read_tsc,
+	.mask                   = CLOCKSOURCE_MASK(64),
+	.flags                  = CLOCK_SOURCE_IS_CONTINUOUS |
+				  CLOCK_SOURCE_MUST_VERIFY,
+	.archdata               = { .vclock_mode = VCLOCK_TSC },
+	.resume			= tsc_resume,
+	.mark_unstable		= tsc_cs_mark_unstable,
+	.tick_stable		= tsc_cs_tick_stable,
+};
+
+/*
+ * Must mark VALID_FOR_HRES early such that when we unregister tsc_early
+ * this one will immediately take over. We will only register if TSC has
+ * been found good.
+ */
 static struct clocksource clocksource_tsc = {
 	.name                   = "tsc",
 	.rating                 = 300,
 	.read                   = read_tsc,
 	.mask                   = CLOCKSOURCE_MASK(64),
 	.flags                  = CLOCK_SOURCE_IS_CONTINUOUS |
+				  CLOCK_SOURCE_VALID_FOR_HRES |
 				  CLOCK_SOURCE_MUST_VERIFY,
 	.archdata               = { .vclock_mode = VCLOCK_TSC },
 	.resume			= tsc_resume,
@@ -1168,8 +1204,8 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 	int cpu;
 
 	/* Don't bother refining TSC on unstable systems */
-	if (check_tsc_unstable())
-		goto out;
+	if (tsc_unstable)
+		return;
 
 	/*
 	 * Since the work is started early in boot, we may be
@@ -1221,9 +1257,13 @@ static void tsc_refine_calibration_work(struct work_struct *work)
 		set_cyc2ns_scale(tsc_khz, cpu, tsc_stop);
 
 out:
+	if (tsc_unstable)
+		return;
+
 	if (boot_cpu_has(X86_FEATURE_ART))
 		art_related_clocksource = &clocksource_tsc;
 	clocksource_register_khz(&clocksource_tsc, tsc_khz);
+	clocksource_unregister(&clocksource_tsc_early);
 }
 
 
@@ -1232,13 +1272,11 @@ static int __init init_tsc_clocksource(void)
 	if (!boot_cpu_has(X86_FEATURE_TSC) || tsc_disabled > 0 || !tsc_khz)
 		return 0;
 
+	if (check_tsc_unstable())
+		return 0;
+
 	if (tsc_clocksource_reliable)
 		clocksource_tsc.flags &= ~CLOCK_SOURCE_MUST_VERIFY;
-	/* lower the rating if we already know its unstable: */
-	if (check_tsc_unstable()) {
-		clocksource_tsc.rating = 0;
-		clocksource_tsc.flags &= ~CLOCK_SOURCE_IS_CONTINUOUS;
-	}
 
 	if (boot_cpu_has(X86_FEATURE_NONSTOP_TSC_S3))
 		clocksource_tsc.flags |= CLOCK_SOURCE_SUSPEND_NONSTOP;
@@ -1251,6 +1289,7 @@ static int __init init_tsc_clocksource(void)
 		if (boot_cpu_has(X86_FEATURE_ART))
 			art_related_clocksource = &clocksource_tsc;
 		clocksource_register_khz(&clocksource_tsc, tsc_khz);
+		clocksource_unregister(&clocksource_tsc_early);
 		return 0;
 	}
 
@@ -1315,6 +1354,12 @@ void __init tsc_init(void)
 		(unsigned long)cpu_khz / 1000,
 		(unsigned long)cpu_khz % 1000);
 
+	if (cpu_khz != tsc_khz) {
+		pr_info("Detected %lu.%03lu MHz TSC",
+			(unsigned long)tsc_khz / 1000,
+			(unsigned long)tsc_khz % 1000);
+	}
+
 	/* Sanitize TSC ADJUST before cyc2ns gets initialized */
 	tsc_store_and_check_tsc_adjust(true);
 
@@ -1349,9 +1394,12 @@ void __init tsc_init(void)
 
 	check_system_tsc_reliable();
 
-	if (unsynchronized_tsc())
+	if (unsynchronized_tsc()) {
 		mark_tsc_unstable("TSCs unsynchronized");
+		return;
+	}
 
+	clocksource_register_khz(&clocksource_tsc_early, tsc_khz);
 	detect_art();
 }
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index be86a86..1f9188f 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -74,8 +74,50 @@ static struct orc_entry *orc_module_find(unsigned long ip)
 }
 #endif
 
+#ifdef CONFIG_DYNAMIC_FTRACE
+static struct orc_entry *orc_find(unsigned long ip);
+
+/*
+ * Ftrace dynamic trampolines do not have orc entries of their own.
+ * But they are copies of the ftrace entries that are static and
+ * defined in ftrace_*.S, which do have orc entries.
+ *
+ * If the undwinder comes across a ftrace trampoline, then find the
+ * ftrace function that was used to create it, and use that ftrace
+ * function's orc entrie, as the placement of the return code in
+ * the stack will be identical.
+ */
+static struct orc_entry *orc_ftrace_find(unsigned long ip)
+{
+	struct ftrace_ops *ops;
+	unsigned long caller;
+
+	ops = ftrace_ops_trampoline(ip);
+	if (!ops)
+		return NULL;
+
+	if (ops->flags & FTRACE_OPS_FL_SAVE_REGS)
+		caller = (unsigned long)ftrace_regs_call;
+	else
+		caller = (unsigned long)ftrace_call;
+
+	/* Prevent unlikely recursion */
+	if (ip == caller)
+		return NULL;
+
+	return orc_find(caller);
+}
+#else
+static struct orc_entry *orc_ftrace_find(unsigned long ip)
+{
+	return NULL;
+}
+#endif
+
 static struct orc_entry *orc_find(unsigned long ip)
 {
+	static struct orc_entry *orc;
+
 	if (!orc_init)
 		return NULL;
 
@@ -111,7 +153,11 @@ static struct orc_entry *orc_find(unsigned long ip)
 				  __stop_orc_unwind_ip - __start_orc_unwind_ip, ip);
 
 	/* Module lookup: */
-	return orc_module_find(ip);
+	orc = orc_module_find(ip);
+	if (orc)
+		return orc;
+
+	return orc_ftrace_find(ip);
 }
 
 static void orc_sort_swap(void *_a, void *_b, int size)
diff --git a/arch/x86/kernel/uprobes.c b/arch/x86/kernel/uprobes.c
index a3755d2..85c7ef2 100644
--- a/arch/x86/kernel/uprobes.c
+++ b/arch/x86/kernel/uprobes.c
@@ -528,11 +528,11 @@ static int default_pre_xol_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
 	return 0;
 }
 
-static int push_ret_address(struct pt_regs *regs, unsigned long ip)
+static int emulate_push_stack(struct pt_regs *regs, unsigned long val)
 {
 	unsigned long new_sp = regs->sp - sizeof_long();
 
-	if (copy_to_user((void __user *)new_sp, &ip, sizeof_long()))
+	if (copy_to_user((void __user *)new_sp, &val, sizeof_long()))
 		return -EFAULT;
 
 	regs->sp = new_sp;
@@ -566,7 +566,7 @@ static int default_post_xol_op(struct arch_uprobe *auprobe, struct pt_regs *regs
 		regs->ip += correction;
 	} else if (auprobe->defparam.fixups & UPROBE_FIX_CALL) {
 		regs->sp += sizeof_long(); /* Pop incorrect return address */
-		if (push_ret_address(regs, utask->vaddr + auprobe->defparam.ilen))
+		if (emulate_push_stack(regs, utask->vaddr + auprobe->defparam.ilen))
 			return -ERESTART;
 	}
 	/* popf; tell the caller to not touch TF */
@@ -655,7 +655,7 @@ static bool branch_emulate_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
 		 *
 		 * But there is corner case, see the comment in ->post_xol().
 		 */
-		if (push_ret_address(regs, new_ip))
+		if (emulate_push_stack(regs, new_ip))
 			return false;
 	} else if (!check_jmp_cond(auprobe, regs)) {
 		offs = 0;
@@ -665,6 +665,16 @@ static bool branch_emulate_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
 	return true;
 }
 
+static bool push_emulate_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
+{
+	unsigned long *src_ptr = (void *)regs + auprobe->push.reg_offset;
+
+	if (emulate_push_stack(regs, *src_ptr))
+		return false;
+	regs->ip += auprobe->push.ilen;
+	return true;
+}
+
 static int branch_post_xol_op(struct arch_uprobe *auprobe, struct pt_regs *regs)
 {
 	BUG_ON(!branch_is_call(auprobe));
@@ -703,6 +713,10 @@ static const struct uprobe_xol_ops branch_xol_ops = {
 	.post_xol = branch_post_xol_op,
 };
 
+static const struct uprobe_xol_ops push_xol_ops = {
+	.emulate  = push_emulate_op,
+};
+
 /* Returns -ENOSYS if branch_xol_ops doesn't handle this insn */
 static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 {
@@ -750,6 +764,87 @@ static int branch_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
 	return 0;
 }
 
+/* Returns -ENOSYS if push_xol_ops doesn't handle this insn */
+static int push_setup_xol_ops(struct arch_uprobe *auprobe, struct insn *insn)
+{
+	u8 opc1 = OPCODE1(insn), reg_offset = 0;
+
+	if (opc1 < 0x50 || opc1 > 0x57)
+		return -ENOSYS;
+
+	if (insn->length > 2)
+		return -ENOSYS;
+	if (insn->length == 2) {
+		/* only support rex_prefix 0x41 (x64 only) */
+#ifdef CONFIG_X86_64
+		if (insn->rex_prefix.nbytes != 1 ||
+		    insn->rex_prefix.bytes[0] != 0x41)
+			return -ENOSYS;
+
+		switch (opc1) {
+		case 0x50:
+			reg_offset = offsetof(struct pt_regs, r8);
+			break;
+		case 0x51:
+			reg_offset = offsetof(struct pt_regs, r9);
+			break;
+		case 0x52:
+			reg_offset = offsetof(struct pt_regs, r10);
+			break;
+		case 0x53:
+			reg_offset = offsetof(struct pt_regs, r11);
+			break;
+		case 0x54:
+			reg_offset = offsetof(struct pt_regs, r12);
+			break;
+		case 0x55:
+			reg_offset = offsetof(struct pt_regs, r13);
+			break;
+		case 0x56:
+			reg_offset = offsetof(struct pt_regs, r14);
+			break;
+		case 0x57:
+			reg_offset = offsetof(struct pt_regs, r15);
+			break;
+		}
+#else
+		return -ENOSYS;
+#endif
+	} else {
+		switch (opc1) {
+		case 0x50:
+			reg_offset = offsetof(struct pt_regs, ax);
+			break;
+		case 0x51:
+			reg_offset = offsetof(struct pt_regs, cx);
+			break;
+		case 0x52:
+			reg_offset = offsetof(struct pt_regs, dx);
+			break;
+		case 0x53:
+			reg_offset = offsetof(struct pt_regs, bx);
+			break;
+		case 0x54:
+			reg_offset = offsetof(struct pt_regs, sp);
+			break;
+		case 0x55:
+			reg_offset = offsetof(struct pt_regs, bp);
+			break;
+		case 0x56:
+			reg_offset = offsetof(struct pt_regs, si);
+			break;
+		case 0x57:
+			reg_offset = offsetof(struct pt_regs, di);
+			break;
+		}
+	}
+
+	auprobe->push.reg_offset = reg_offset;
+	auprobe->push.ilen = insn->length;
+	auprobe->ops = &push_xol_ops;
+	return 0;
+}
+
 /**
  * arch_uprobe_analyze_insn - instruction analysis including validity and fixups.
  * @mm: the probed address space.
@@ -771,6 +866,10 @@ int arch_uprobe_analyze_insn(struct arch_uprobe *auprobe, struct mm_struct *mm,
 	if (ret != -ENOSYS)
 		return ret;
 
+	ret = push_setup_xol_ops(auprobe, &insn);
+	if (ret != -ENOSYS)
+		return ret;
+
 	/*
 	 * Figure out which fixups default_post_xol_op() will need to perform,
 	 * and annotate defparam->fixups accordingly.
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 1e413a93..9b138a0 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -124,6 +124,12 @@
 		ASSERT(. - _entry_trampoline == PAGE_SIZE, "entry trampoline is too big");
 #endif
 
+#ifdef CONFIG_RETPOLINE
+		__indirect_thunk_start = .;
+		*(.text.__x86.indirect_thunk)
+		__indirect_thunk_end = .;
+#endif
+
 		/* End of text section */
 		_etext = .;
 	} :text = 0x9090
diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c
index b514b2b..290ecf7 100644
--- a/arch/x86/kvm/emulate.c
+++ b/arch/x86/kvm/emulate.c
@@ -25,6 +25,7 @@
 #include <asm/kvm_emulate.h>
 #include <linux/stringify.h>
 #include <asm/debugreg.h>
+#include <asm/nospec-branch.h>
 
 #include "x86.h"
 #include "tss.h"
@@ -1021,8 +1022,8 @@ static __always_inline u8 test_cc(unsigned int condition, unsigned long flags)
 	void (*fop)(void) = (void *)em_setcc + 4 * (condition & 0xf);
 
 	flags = (flags & EFLAGS_MASK) | X86_EFLAGS_IF;
-	asm("push %[flags]; popf; call *%[fastop]"
-	    : "=a"(rc) : [fastop]"r"(fop), [flags]"r"(flags));
+	asm("push %[flags]; popf; " CALL_NOSPEC
+	    : "=a"(rc) : [thunk_target]"r"(fop), [flags]"r"(flags));
 	return rc;
 }
 
@@ -5335,9 +5336,9 @@ static int fastop(struct x86_emulate_ctxt *ctxt, void (*fop)(struct fastop *))
 	if (!(ctxt->d & ByteOp))
 		fop += __ffs(ctxt->dst.bytes) * FASTOP_SIZE;
 
-	asm("push %[flags]; popf; call *%[fastop]; pushf; pop %[flags]\n"
+	asm("push %[flags]; popf; " CALL_NOSPEC " ; pushf; pop %[flags]\n"
 	    : "+a"(ctxt->dst.val), "+d"(ctxt->src.val), [flags]"+D"(flags),
-	      [fastop]"+S"(fop), ASM_CALL_CONSTRAINT
+	      [thunk_target]"+S"(fop), ASM_CALL_CONSTRAINT
 	    : "c"(ctxt->src2.val));
 
 	ctxt->eflags = (ctxt->eflags & ~EFLAGS_MASK) | (flags & EFLAGS_MASK);
diff --git a/arch/x86/kvm/mmu.c b/arch/x86/kvm/mmu.c
index c4deb1f..2b8eb4d 100644
--- a/arch/x86/kvm/mmu.c
+++ b/arch/x86/kvm/mmu.c
@@ -3781,7 +3781,8 @@ static int kvm_arch_setup_async_pf(struct kvm_vcpu *vcpu, gva_t gva, gfn_t gfn)
 bool kvm_can_do_async_pf(struct kvm_vcpu *vcpu)
 {
 	if (unlikely(!lapic_in_kernel(vcpu) ||
-		     kvm_event_needs_reinjection(vcpu)))
+		     kvm_event_needs_reinjection(vcpu) ||
+		     vcpu->arch.exception.pending))
 		return false;
 
 	if (!vcpu->arch.apf.delivery_as_pf_vmexit && is_guest_mode(vcpu))
@@ -5465,30 +5466,34 @@ static void mmu_destroy_caches(void)
 
 int kvm_mmu_module_init(void)
 {
+	int ret = -ENOMEM;
+
 	kvm_mmu_clear_all_pte_masks();
 
 	pte_list_desc_cache = kmem_cache_create("pte_list_desc",
 					    sizeof(struct pte_list_desc),
 					    0, SLAB_ACCOUNT, NULL);
 	if (!pte_list_desc_cache)
-		goto nomem;
+		goto out;
 
 	mmu_page_header_cache = kmem_cache_create("kvm_mmu_page_header",
 						  sizeof(struct kvm_mmu_page),
 						  0, SLAB_ACCOUNT, NULL);
 	if (!mmu_page_header_cache)
-		goto nomem;
+		goto out;
 
 	if (percpu_counter_init(&kvm_total_used_mmu_pages, 0, GFP_KERNEL))
-		goto nomem;
+		goto out;
 
-	register_shrinker(&mmu_shrinker);
+	ret = register_shrinker(&mmu_shrinker);
+	if (ret)
+		goto out;
 
 	return 0;
 
-nomem:
+out:
 	mmu_destroy_caches();
-	return -ENOMEM;
+	return ret;
 }
 
 /*
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index eb714f1..f40d0da 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -45,6 +45,7 @@
 #include <asm/debugreg.h>
 #include <asm/kvm_para.h>
 #include <asm/irq_remapping.h>
+#include <asm/nospec-branch.h>
 
 #include <asm/virtext.h>
 #include "trace.h"
@@ -361,7 +362,6 @@ static void recalc_intercepts(struct vcpu_svm *svm)
 {
 	struct vmcb_control_area *c, *h;
 	struct nested_state *g;
-	u32 h_intercept_exceptions;
 
 	mark_dirty(svm->vmcb, VMCB_INTERCEPTS);
 
@@ -372,14 +372,9 @@ static void recalc_intercepts(struct vcpu_svm *svm)
 	h = &svm->nested.hsave->control;
 	g = &svm->nested;
 
-	/* No need to intercept #UD if L1 doesn't intercept it */
-	h_intercept_exceptions =
-		h->intercept_exceptions & ~(1U << UD_VECTOR);
-
 	c->intercept_cr = h->intercept_cr | g->intercept_cr;
 	c->intercept_dr = h->intercept_dr | g->intercept_dr;
-	c->intercept_exceptions =
-		h_intercept_exceptions | g->intercept_exceptions;
+	c->intercept_exceptions = h->intercept_exceptions | g->intercept_exceptions;
 	c->intercept = h->intercept | g->intercept;
 }
 
@@ -2202,7 +2197,6 @@ static int ud_interception(struct vcpu_svm *svm)
 {
 	int er;
 
-	WARN_ON_ONCE(is_guest_mode(&svm->vcpu));
 	er = emulate_instruction(&svm->vcpu, EMULTYPE_TRAP_UD);
 	if (er == EMULATE_USER_EXIT)
 		return 0;
@@ -4986,6 +4980,25 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 		"mov %%r14, %c[r14](%[svm]) \n\t"
 		"mov %%r15, %c[r15](%[svm]) \n\t"
 #endif
+		/*
+		* Clear host registers marked as clobbered to prevent
+		* speculative use.
+		*/
+		"xor %%" _ASM_BX ", %%" _ASM_BX " \n\t"
+		"xor %%" _ASM_CX ", %%" _ASM_CX " \n\t"
+		"xor %%" _ASM_DX ", %%" _ASM_DX " \n\t"
+		"xor %%" _ASM_SI ", %%" _ASM_SI " \n\t"
+		"xor %%" _ASM_DI ", %%" _ASM_DI " \n\t"
+#ifdef CONFIG_X86_64
+		"xor %%r8, %%r8 \n\t"
+		"xor %%r9, %%r9 \n\t"
+		"xor %%r10, %%r10 \n\t"
+		"xor %%r11, %%r11 \n\t"
+		"xor %%r12, %%r12 \n\t"
+		"xor %%r13, %%r13 \n\t"
+		"xor %%r14, %%r14 \n\t"
+		"xor %%r15, %%r15 \n\t"
+#endif
 		"pop %%" _ASM_BP
 		:
 		: [svm]"a"(svm),
@@ -5015,6 +5028,9 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 #endif
 		);
 
+	/* Eliminate branch target predictions from guest mode */
+	vmexit_fill_RSB();
+
 #ifdef CONFIG_X86_64
 	wrmsrl(MSR_GS_BASE, svm->host.gs_base);
 #else
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 023afa0..a8b96dc 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -50,6 +50,7 @@
 #include <asm/apic.h>
 #include <asm/irq_remapping.h>
 #include <asm/mmu_context.h>
+#include <asm/nospec-branch.h>
 
 #include "trace.h"
 #include "pmu.h"
@@ -899,8 +900,16 @@ static inline short vmcs_field_to_offset(unsigned long field)
 {
 	BUILD_BUG_ON(ARRAY_SIZE(vmcs_field_to_offset_table) > SHRT_MAX);
 
-	if (field >= ARRAY_SIZE(vmcs_field_to_offset_table) ||
-	    vmcs_field_to_offset_table[field] == 0)
+	if (field >= ARRAY_SIZE(vmcs_field_to_offset_table))
+		return -ENOENT;
+
+	/*
+	 * FIXME: Mitigation for CVE-2017-5753.  To be replaced with a
+	 * generic mechanism.
+	 */
+	asm("lfence");
+
+	if (vmcs_field_to_offset_table[field] == 0)
 		return -ENOENT;
 
 	return vmcs_field_to_offset_table[field];
@@ -1887,7 +1896,7 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu)
 {
 	u32 eb;
 
-	eb = (1u << PF_VECTOR) | (1u << MC_VECTOR) |
+	eb = (1u << PF_VECTOR) | (1u << UD_VECTOR) | (1u << MC_VECTOR) |
 	     (1u << DB_VECTOR) | (1u << AC_VECTOR);
 	if ((vcpu->guest_debug &
 	     (KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP)) ==
@@ -1905,8 +1914,6 @@ static void update_exception_bitmap(struct kvm_vcpu *vcpu)
 	 */
 	if (is_guest_mode(vcpu))
 		eb |= get_vmcs12(vcpu)->exception_bitmap;
-	else
-		eb |= 1u << UD_VECTOR;
 
 	vmcs_write32(EXCEPTION_BITMAP, eb);
 }
@@ -5917,7 +5924,6 @@ static int handle_exception(struct kvm_vcpu *vcpu)
 		return 1;  /* already handled by vmx_vcpu_run() */
 
 	if (is_invalid_opcode(intr_info)) {
-		WARN_ON_ONCE(is_guest_mode(vcpu));
 		er = emulate_instruction(vcpu, EMULTYPE_TRAP_UD);
 		if (er == EMULATE_USER_EXIT)
 			return 0;
@@ -9123,14 +9129,14 @@ static void vmx_handle_external_intr(struct kvm_vcpu *vcpu)
 #endif
 			"pushf\n\t"
 			__ASM_SIZE(push) " $%c[cs]\n\t"
-			"call *%[entry]\n\t"
+			CALL_NOSPEC
 			:
 #ifdef CONFIG_X86_64
 			[sp]"=&r"(tmp),
 #endif
 			ASM_CALL_CONSTRAINT
 			:
-			[entry]"r"(entry),
+			THUNK_TARGET(entry),
 			[ss]"i"(__KERNEL_DS),
 			[cs]"i"(__KERNEL_CS)
 			);
@@ -9415,6 +9421,7 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 		/* Save guest registers, load host registers, keep flags */
 		"mov %0, %c[wordsize](%%" _ASM_SP ") \n\t"
 		"pop %0 \n\t"
+		"setbe %c[fail](%0)\n\t"
 		"mov %%" _ASM_AX ", %c[rax](%0) \n\t"
 		"mov %%" _ASM_BX ", %c[rbx](%0) \n\t"
 		__ASM_SIZE(pop) " %c[rcx](%0) \n\t"
@@ -9431,12 +9438,23 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 		"mov %%r13, %c[r13](%0) \n\t"
 		"mov %%r14, %c[r14](%0) \n\t"
 		"mov %%r15, %c[r15](%0) \n\t"
+		"xor %%r8d,  %%r8d \n\t"
+		"xor %%r9d,  %%r9d \n\t"
+		"xor %%r10d, %%r10d \n\t"
+		"xor %%r11d, %%r11d \n\t"
+		"xor %%r12d, %%r12d \n\t"
+		"xor %%r13d, %%r13d \n\t"
+		"xor %%r14d, %%r14d \n\t"
+		"xor %%r15d, %%r15d \n\t"
 #endif
 		"mov %%cr2, %%" _ASM_AX "   \n\t"
 		"mov %%" _ASM_AX ", %c[cr2](%0) \n\t"
 
+		"xor %%eax, %%eax \n\t"
+		"xor %%ebx, %%ebx \n\t"
+		"xor %%esi, %%esi \n\t"
+		"xor %%edi, %%edi \n\t"
 		"pop  %%" _ASM_BP "; pop  %%" _ASM_DX " \n\t"
-		"setbe %c[fail](%0) \n\t"
 		".pushsection .rodata \n\t"
 		".global vmx_return \n\t"
 		"vmx_return: " _ASM_PTR " 2b \n\t"
@@ -9473,6 +9491,9 @@ static void __noclone vmx_vcpu_run(struct kvm_vcpu *vcpu)
 #endif
 	      );
 
+	/* Eliminate branch target predictions from guest mode */
+	vmexit_fill_RSB();
+
 	/* MSR_IA32_DEBUGCTLMSR is zeroed on vmexit. Restore it if needed */
 	if (debugctlmsr)
 		update_debugctlmsr(debugctlmsr);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 1cec2c6..c53298d 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7496,13 +7496,13 @@ EXPORT_SYMBOL_GPL(kvm_task_switch);
 
 int kvm_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
 {
-	if ((sregs->efer & EFER_LME) && (sregs->cr0 & X86_CR0_PG_BIT)) {
+	if ((sregs->efer & EFER_LME) && (sregs->cr0 & X86_CR0_PG)) {
 		/*
 		 * When EFER.LME and CR0.PG are set, the processor is in
 		 * 64-bit mode (though maybe in a 32-bit code segment).
 		 * CR4.PAE and EFER.LMA must be set.
 		 */
-		if (!(sregs->cr4 & X86_CR4_PAE_BIT)
+		if (!(sregs->cr4 & X86_CR4_PAE)
 		    || !(sregs->efer & EFER_LMA))
 			return -EINVAL;
 	} else {
diff --git a/arch/x86/lib/Makefile b/arch/x86/lib/Makefile
index 7b181b6..69a4739 100644
--- a/arch/x86/lib/Makefile
+++ b/arch/x86/lib/Makefile
@@ -26,6 +26,8 @@
 lib-$(CONFIG_RWSEM_XCHGADD_ALGORITHM) += rwsem.o
 lib-$(CONFIG_INSTRUCTION_DECODER) += insn.o inat.o insn-eval.o
 lib-$(CONFIG_RANDOMIZE_BASE) += kaslr.o
+lib-$(CONFIG_RETPOLINE) += retpoline.o
+OBJECT_FILES_NON_STANDARD_retpoline.o :=y
 
 obj-y += msr.o msr-reg.o msr-reg-export.o hweight.o
 
diff --git a/arch/x86/lib/checksum_32.S b/arch/x86/lib/checksum_32.S
index 4d34bb5..46e71a7 100644
--- a/arch/x86/lib/checksum_32.S
+++ b/arch/x86/lib/checksum_32.S
@@ -29,7 +29,8 @@
 #include <asm/errno.h>
 #include <asm/asm.h>
 #include <asm/export.h>
-				
+#include <asm/nospec-branch.h>
+
 /*
  * computes a partial checksum, e.g. for TCP/UDP fragments
  */
@@ -156,7 +157,7 @@
 	negl %ebx
 	lea 45f(%ebx,%ebx,2), %ebx
 	testl %esi, %esi
-	jmp *%ebx
+	JMP_NOSPEC %ebx
 
 	# Handle 2-byte-aligned regions
 20:	addw (%esi), %ax
@@ -439,7 +440,7 @@
 	andl $-32,%edx
 	lea 3f(%ebx,%ebx), %ebx
 	testl %esi, %esi 
-	jmp *%ebx
+	JMP_NOSPEC %ebx
 1:	addl $64,%esi
 	addl $64,%edi 
 	SRC(movb -32(%edx),%bl)	; SRC(movb (%edx),%bl)
diff --git a/arch/x86/lib/delay.c b/arch/x86/lib/delay.c
index 4846eff..f5b7f1b 100644
--- a/arch/x86/lib/delay.c
+++ b/arch/x86/lib/delay.c
@@ -162,7 +162,7 @@ void __delay(unsigned long loops)
 }
 EXPORT_SYMBOL(__delay);
 
-inline void __const_udelay(unsigned long xloops)
+void __const_udelay(unsigned long xloops)
 {
 	unsigned long lpj = this_cpu_read(cpu_info.loops_per_jiffy) ? : loops_per_jiffy;
 	int d0;
diff --git a/arch/x86/lib/retpoline.S b/arch/x86/lib/retpoline.S
new file mode 100644
index 0000000..480edc3
--- /dev/null
+++ b/arch/x86/lib/retpoline.S
@@ -0,0 +1,104 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <linux/stringify.h>
+#include <linux/linkage.h>
+#include <asm/dwarf2.h>
+#include <asm/cpufeatures.h>
+#include <asm/alternative-asm.h>
+#include <asm/export.h>
+#include <asm/nospec-branch.h>
+#include <asm/bitsperlong.h>
+
+.macro THUNK reg
+	.section .text.__x86.indirect_thunk
+
+ENTRY(__x86_indirect_thunk_\reg)
+	CFI_STARTPROC
+	JMP_NOSPEC %\reg
+	CFI_ENDPROC
+ENDPROC(__x86_indirect_thunk_\reg)
+.endm
+
+/*
+ * Despite being an assembler file we can't just use .irp here
+ * because __KSYM_DEPS__ only uses the C preprocessor and would
+ * only see one instance of "__x86_indirect_thunk_\reg" rather
+ * than one per register with the correct names. So we do it
+ * the simple and nasty way...
+ */
+#define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
+#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
+#define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
+
+GENERATE_THUNK(_ASM_AX)
+GENERATE_THUNK(_ASM_BX)
+GENERATE_THUNK(_ASM_CX)
+GENERATE_THUNK(_ASM_DX)
+GENERATE_THUNK(_ASM_SI)
+GENERATE_THUNK(_ASM_DI)
+GENERATE_THUNK(_ASM_BP)
+#ifdef CONFIG_64BIT
+GENERATE_THUNK(r8)
+GENERATE_THUNK(r9)
+GENERATE_THUNK(r10)
+GENERATE_THUNK(r11)
+GENERATE_THUNK(r12)
+GENERATE_THUNK(r13)
+GENERATE_THUNK(r14)
+GENERATE_THUNK(r15)
+#endif
+
+/*
+ * Fill the CPU return stack buffer.
+ *
+ * Each entry in the RSB, if used for a speculative 'ret', contains an
+ * infinite 'pause; lfence; jmp' loop to capture speculative execution.
+ *
+ * This is required in various cases for retpoline and IBRS-based
+ * mitigations for the Spectre variant 2 vulnerability. Sometimes to
+ * eliminate potentially bogus entries from the RSB, and sometimes
+ * purely to ensure that it doesn't get empty, which on some CPUs would
+ * allow predictions from other (unwanted!) sources to be used.
+ *
+ * Google experimented with loop-unrolling and this turned out to be
+ * the optimal version - two calls, each with their own speculation
+ * trap should their return address end up getting used, in a loop.
+ */
+.macro STUFF_RSB nr:req sp:req
+	mov	$(\nr / 2), %_ASM_BX
+	.align 16
+771:
+	call	772f
+773:						/* speculation trap */
+	pause
+	lfence
+	jmp	773b
+	.align 16
+772:
+	call	774f
+775:						/* speculation trap */
+	pause
+	lfence
+	jmp	775b
+	.align 16
+774:
+	dec	%_ASM_BX
+	jnz	771b
+	add	$((BITS_PER_LONG/8) * \nr), \sp
+.endm
+
+#define RSB_FILL_LOOPS		16	/* To avoid underflow */
+
+ENTRY(__fill_rsb)
+	STUFF_RSB RSB_FILL_LOOPS, %_ASM_SP
+	ret
+END(__fill_rsb)
+EXPORT_SYMBOL_GPL(__fill_rsb)
+
+#define RSB_CLEAR_LOOPS		32	/* To forcibly overwrite all entries */
+
+ENTRY(__clear_rsb)
+	STUFF_RSB RSB_CLEAR_LOOPS, %_ASM_SP
+	ret
+END(__clear_rsb)
+EXPORT_SYMBOL_GPL(__clear_rsb)
diff --git a/arch/x86/mm/dump_pagetables.c b/arch/x86/mm/dump_pagetables.c
index f56902c..2a4849e 100644
--- a/arch/x86/mm/dump_pagetables.c
+++ b/arch/x86/mm/dump_pagetables.c
@@ -61,10 +61,10 @@ enum address_markers_idx {
 	KASAN_SHADOW_START_NR,
 	KASAN_SHADOW_END_NR,
 #endif
+	CPU_ENTRY_AREA_NR,
 #if defined(CONFIG_MODIFY_LDT_SYSCALL) && !defined(CONFIG_X86_5LEVEL)
 	LDT_NR,
 #endif
-	CPU_ENTRY_AREA_NR,
 #ifdef CONFIG_X86_ESPFIX64
 	ESPFIX_START_NR,
 #endif
diff --git a/arch/x86/mm/extable.c b/arch/x86/mm/extable.c
index 9fe656c..45f5d6c 100644
--- a/arch/x86/mm/extable.c
+++ b/arch/x86/mm/extable.c
@@ -21,16 +21,16 @@ ex_fixup_handler(const struct exception_table_entry *x)
 	return (ex_handler_t)((unsigned long)&x->handler + x->handler);
 }
 
-bool ex_handler_default(const struct exception_table_entry *fixup,
-		       struct pt_regs *regs, int trapnr)
+__visible bool ex_handler_default(const struct exception_table_entry *fixup,
+				  struct pt_regs *regs, int trapnr)
 {
 	regs->ip = ex_fixup_addr(fixup);
 	return true;
 }
 EXPORT_SYMBOL(ex_handler_default);
 
-bool ex_handler_fault(const struct exception_table_entry *fixup,
-		     struct pt_regs *regs, int trapnr)
+__visible bool ex_handler_fault(const struct exception_table_entry *fixup,
+				struct pt_regs *regs, int trapnr)
 {
 	regs->ip = ex_fixup_addr(fixup);
 	regs->ax = trapnr;
@@ -42,8 +42,8 @@ EXPORT_SYMBOL_GPL(ex_handler_fault);
  * Handler for UD0 exception following a failed test against the
  * result of a refcount inc/dec/add/sub.
  */
-bool ex_handler_refcount(const struct exception_table_entry *fixup,
-			 struct pt_regs *regs, int trapnr)
+__visible bool ex_handler_refcount(const struct exception_table_entry *fixup,
+				   struct pt_regs *regs, int trapnr)
 {
 	/* First unconditionally saturate the refcount. */
 	*(int *)regs->cx = INT_MIN / 2;
@@ -95,8 +95,8 @@ EXPORT_SYMBOL(ex_handler_refcount);
  * of vulnerability by restoring from the initial state (essentially, zeroing
  * out all the FPU registers) if we can't restore from the task's FPU state.
  */
-bool ex_handler_fprestore(const struct exception_table_entry *fixup,
-			  struct pt_regs *regs, int trapnr)
+__visible bool ex_handler_fprestore(const struct exception_table_entry *fixup,
+				    struct pt_regs *regs, int trapnr)
 {
 	regs->ip = ex_fixup_addr(fixup);
 
@@ -108,8 +108,8 @@ bool ex_handler_fprestore(const struct exception_table_entry *fixup,
 }
 EXPORT_SYMBOL_GPL(ex_handler_fprestore);
 
-bool ex_handler_ext(const struct exception_table_entry *fixup,
-		   struct pt_regs *regs, int trapnr)
+__visible bool ex_handler_ext(const struct exception_table_entry *fixup,
+			      struct pt_regs *regs, int trapnr)
 {
 	/* Special hack for uaccess_err */
 	current->thread.uaccess_err = 1;
@@ -118,8 +118,8 @@ bool ex_handler_ext(const struct exception_table_entry *fixup,
 }
 EXPORT_SYMBOL(ex_handler_ext);
 
-bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup,
-			     struct pt_regs *regs, int trapnr)
+__visible bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup,
+				       struct pt_regs *regs, int trapnr)
 {
 	if (pr_warn_once("unchecked MSR access error: RDMSR from 0x%x at rIP: 0x%lx (%pF)\n",
 			 (unsigned int)regs->cx, regs->ip, (void *)regs->ip))
@@ -133,8 +133,8 @@ bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup,
 }
 EXPORT_SYMBOL(ex_handler_rdmsr_unsafe);
 
-bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup,
-			     struct pt_regs *regs, int trapnr)
+__visible bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup,
+				       struct pt_regs *regs, int trapnr)
 {
 	if (pr_warn_once("unchecked MSR access error: WRMSR to 0x%x (tried to write 0x%08x%08x) at rIP: 0x%lx (%pF)\n",
 			 (unsigned int)regs->cx, (unsigned int)regs->dx,
@@ -147,8 +147,8 @@ bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup,
 }
 EXPORT_SYMBOL(ex_handler_wrmsr_unsafe);
 
-bool ex_handler_clear_fs(const struct exception_table_entry *fixup,
-			 struct pt_regs *regs, int trapnr)
+__visible bool ex_handler_clear_fs(const struct exception_table_entry *fixup,
+				   struct pt_regs *regs, int trapnr)
 {
 	if (static_cpu_has(X86_BUG_NULL_SEG))
 		asm volatile ("mov %0, %%fs" : : "rm" (__USER_DS));
@@ -157,7 +157,7 @@ bool ex_handler_clear_fs(const struct exception_table_entry *fixup,
 }
 EXPORT_SYMBOL(ex_handler_clear_fs);
 
-bool ex_has_fault_handler(unsigned long ip)
+__visible bool ex_has_fault_handler(unsigned long ip)
 {
 	const struct exception_table_entry *e;
 	ex_handler_t handler;
diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index b3e4077..800de81 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -439,18 +439,13 @@ static noinline int vmalloc_fault(unsigned long address)
 	if (pgd_none(*pgd_ref))
 		return -1;
 
-	if (pgd_none(*pgd)) {
-		set_pgd(pgd, *pgd_ref);
-		arch_flush_lazy_mmu_mode();
-	} else if (CONFIG_PGTABLE_LEVELS > 4) {
-		/*
-		 * With folded p4d, pgd_none() is always false, so the pgd may
-		 * point to an empty page table entry and pgd_page_vaddr()
-		 * will return garbage.
-		 *
-		 * We will do the correct sanity check on the p4d level.
-		 */
-		BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
+	if (CONFIG_PGTABLE_LEVELS > 4) {
+		if (pgd_none(*pgd)) {
+			set_pgd(pgd, *pgd_ref);
+			arch_flush_lazy_mmu_mode();
+		} else {
+			BUG_ON(pgd_page_vaddr(*pgd) != pgd_page_vaddr(*pgd_ref));
+		}
 	}
 
 	/* With 4-level paging, copying happens on the p4d level. */
@@ -459,7 +454,7 @@ static noinline int vmalloc_fault(unsigned long address)
 	if (p4d_none(*p4d_ref))
 		return -1;
 
-	if (p4d_none(*p4d)) {
+	if (p4d_none(*p4d) && CONFIG_PGTABLE_LEVELS == 4) {
 		set_p4d(p4d, *p4d_ref);
 		arch_flush_lazy_mmu_mode();
 	} else {
@@ -470,6 +465,7 @@ static noinline int vmalloc_fault(unsigned long address)
 	 * Below here mismatches are bugs because these lower tables
 	 * are shared:
 	 */
+	BUILD_BUG_ON(CONFIG_PGTABLE_LEVELS < 4);
 
 	pud = pud_offset(p4d, address);
 	pud_ref = pud_offset(p4d_ref, address);
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index 8ca324d..82f5252 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -868,7 +868,7 @@ __visible DEFINE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate) = {
 	.next_asid = 1,
 	.cr4 = ~0UL,	/* fail hard if we screw up cr4 shadow initialization */
 };
-EXPORT_SYMBOL_GPL(cpu_tlbstate);
+EXPORT_PER_CPU_SYMBOL(cpu_tlbstate);
 
 void update_cache_mode_entry(unsigned entry, enum page_cache_mode cache)
 {
diff --git a/arch/x86/mm/kasan_init_64.c b/arch/x86/mm/kasan_init_64.c
index 47388f0..af6f2f9 100644
--- a/arch/x86/mm/kasan_init_64.c
+++ b/arch/x86/mm/kasan_init_64.c
@@ -21,10 +21,14 @@ extern struct range pfn_mapped[E820_MAX_ENTRIES];
 
 static p4d_t tmp_p4d_table[PTRS_PER_P4D] __initdata __aligned(PAGE_SIZE);
 
-static __init void *early_alloc(size_t size, int nid)
+static __init void *early_alloc(size_t size, int nid, bool panic)
 {
-	return memblock_virt_alloc_try_nid_nopanic(size, size,
-		__pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid);
+	if (panic)
+		return memblock_virt_alloc_try_nid(size, size,
+			__pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid);
+	else
+		return memblock_virt_alloc_try_nid_nopanic(size, size,
+			__pa(MAX_DMA_ADDRESS), BOOTMEM_ALLOC_ACCESSIBLE, nid);
 }
 
 static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
@@ -38,14 +42,14 @@ static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
 		if (boot_cpu_has(X86_FEATURE_PSE) &&
 		    ((end - addr) == PMD_SIZE) &&
 		    IS_ALIGNED(addr, PMD_SIZE)) {
-			p = early_alloc(PMD_SIZE, nid);
+			p = early_alloc(PMD_SIZE, nid, false);
 			if (p && pmd_set_huge(pmd, __pa(p), PAGE_KERNEL))
 				return;
 			else if (p)
 				memblock_free(__pa(p), PMD_SIZE);
 		}
 
-		p = early_alloc(PAGE_SIZE, nid);
+		p = early_alloc(PAGE_SIZE, nid, true);
 		pmd_populate_kernel(&init_mm, pmd, p);
 	}
 
@@ -57,7 +61,7 @@ static void __init kasan_populate_pmd(pmd_t *pmd, unsigned long addr,
 		if (!pte_none(*pte))
 			continue;
 
-		p = early_alloc(PAGE_SIZE, nid);
+		p = early_alloc(PAGE_SIZE, nid, true);
 		entry = pfn_pte(PFN_DOWN(__pa(p)), PAGE_KERNEL);
 		set_pte_at(&init_mm, addr, pte, entry);
 	} while (pte++, addr += PAGE_SIZE, addr != end);
@@ -75,14 +79,14 @@ static void __init kasan_populate_pud(pud_t *pud, unsigned long addr,
 		if (boot_cpu_has(X86_FEATURE_GBPAGES) &&
 		    ((end - addr) == PUD_SIZE) &&
 		    IS_ALIGNED(addr, PUD_SIZE)) {
-			p = early_alloc(PUD_SIZE, nid);
+			p = early_alloc(PUD_SIZE, nid, false);
 			if (p && pud_set_huge(pud, __pa(p), PAGE_KERNEL))
 				return;
 			else if (p)
 				memblock_free(__pa(p), PUD_SIZE);
 		}
 
-		p = early_alloc(PAGE_SIZE, nid);
+		p = early_alloc(PAGE_SIZE, nid, true);
 		pud_populate(&init_mm, pud, p);
 	}
 
@@ -101,7 +105,7 @@ static void __init kasan_populate_p4d(p4d_t *p4d, unsigned long addr,
 	unsigned long next;
 
 	if (p4d_none(*p4d)) {
-		void *p = early_alloc(PAGE_SIZE, nid);
+		void *p = early_alloc(PAGE_SIZE, nid, true);
 
 		p4d_populate(&init_mm, p4d, p);
 	}
@@ -122,7 +126,7 @@ static void __init kasan_populate_pgd(pgd_t *pgd, unsigned long addr,
 	unsigned long next;
 
 	if (pgd_none(*pgd)) {
-		p = early_alloc(PAGE_SIZE, nid);
+		p = early_alloc(PAGE_SIZE, nid, true);
 		pgd_populate(&init_mm, pgd, p);
 	}
 
diff --git a/arch/x86/mm/kaslr.c b/arch/x86/mm/kaslr.c
index 879ef93..aedebd2 100644
--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -34,25 +34,14 @@
 #define TB_SHIFT 40
 
 /*
- * Virtual address start and end range for randomization. The end changes base
- * on configuration to have the highest amount of space for randomization.
- * It increases the possible random position for each randomized region.
+ * Virtual address start and end range for randomization.
  *
- * You need to add an if/def entry if you introduce a new memory region
- * compatible with KASLR. Your entry must be in logical order with memory
- * layout. For example, ESPFIX is before EFI because its virtual address is
- * before. You also need to add a BUILD_BUG_ON() in kernel_randomize_memory() to
- * ensure that this order is correct and won't be changed.
+ * The end address could depend on more configuration options to make the
+ * highest amount of space for randomization available, but that's too hard
+ * to keep straight and caused issues already.
  */
 static const unsigned long vaddr_start = __PAGE_OFFSET_BASE;
-
-#if defined(CONFIG_X86_ESPFIX64)
-static const unsigned long vaddr_end = ESPFIX_BASE_ADDR;
-#elif defined(CONFIG_EFI)
-static const unsigned long vaddr_end = EFI_VA_END;
-#else
-static const unsigned long vaddr_end = __START_KERNEL_map;
-#endif
+static const unsigned long vaddr_end = CPU_ENTRY_AREA_BASE;
 
 /* Default values */
 unsigned long page_offset_base = __PAGE_OFFSET_BASE;
@@ -101,15 +90,12 @@ void __init kernel_randomize_memory(void)
 	unsigned long remain_entropy;
 
 	/*
-	 * All these BUILD_BUG_ON checks ensures the memory layout is
-	 * consistent with the vaddr_start/vaddr_end variables.
+	 * These BUILD_BUG_ON checks ensure the memory layout is consistent
+	 * with the vaddr_start/vaddr_end variables. These checks are very
+	 * limited....
 	 */
 	BUILD_BUG_ON(vaddr_start >= vaddr_end);
-	BUILD_BUG_ON(IS_ENABLED(CONFIG_X86_ESPFIX64) &&
-		     vaddr_end >= EFI_VA_END);
-	BUILD_BUG_ON((IS_ENABLED(CONFIG_X86_ESPFIX64) ||
-		      IS_ENABLED(CONFIG_EFI)) &&
-		     vaddr_end >= __START_KERNEL_map);
+	BUILD_BUG_ON(vaddr_end != CPU_ENTRY_AREA_BASE);
 	BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);
 
 	if (!kaslr_memory_enabled())
diff --git a/arch/x86/mm/mem_encrypt.c b/arch/x86/mm/mem_encrypt.c
index 391b134..e1d61e8 100644
--- a/arch/x86/mm/mem_encrypt.c
+++ b/arch/x86/mm/mem_encrypt.c
@@ -464,37 +464,62 @@ void swiotlb_set_mem_attributes(void *vaddr, unsigned long size)
 	set_memory_decrypted((unsigned long)vaddr, size >> PAGE_SHIFT);
 }
 
-static void __init sme_clear_pgd(pgd_t *pgd_base, unsigned long start,
-				 unsigned long end)
+struct sme_populate_pgd_data {
+	void	*pgtable_area;
+	pgd_t	*pgd;
+
+	pmdval_t pmd_flags;
+	pteval_t pte_flags;
+	unsigned long paddr;
+
+	unsigned long vaddr;
+	unsigned long vaddr_end;
+};
+
+static void __init sme_clear_pgd(struct sme_populate_pgd_data *ppd)
 {
 	unsigned long pgd_start, pgd_end, pgd_size;
 	pgd_t *pgd_p;
 
-	pgd_start = start & PGDIR_MASK;
-	pgd_end = end & PGDIR_MASK;
+	pgd_start = ppd->vaddr & PGDIR_MASK;
+	pgd_end = ppd->vaddr_end & PGDIR_MASK;
 
-	pgd_size = (((pgd_end - pgd_start) / PGDIR_SIZE) + 1);
-	pgd_size *= sizeof(pgd_t);
+	pgd_size = (((pgd_end - pgd_start) / PGDIR_SIZE) + 1) * sizeof(pgd_t);
 
-	pgd_p = pgd_base + pgd_index(start);
+	pgd_p = ppd->pgd + pgd_index(ppd->vaddr);
 
 	memset(pgd_p, 0, pgd_size);
 }
 
-#define PGD_FLAGS	_KERNPG_TABLE_NOENC
-#define P4D_FLAGS	_KERNPG_TABLE_NOENC
-#define PUD_FLAGS	_KERNPG_TABLE_NOENC
-#define PMD_FLAGS	(__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL)
+#define PGD_FLAGS		_KERNPG_TABLE_NOENC
+#define P4D_FLAGS		_KERNPG_TABLE_NOENC
+#define PUD_FLAGS		_KERNPG_TABLE_NOENC
+#define PMD_FLAGS		_KERNPG_TABLE_NOENC
 
-static void __init *sme_populate_pgd(pgd_t *pgd_base, void *pgtable_area,
-				     unsigned long vaddr, pmdval_t pmd_val)
+#define PMD_FLAGS_LARGE		(__PAGE_KERNEL_LARGE_EXEC & ~_PAGE_GLOBAL)
+
+#define PMD_FLAGS_DEC		PMD_FLAGS_LARGE
+#define PMD_FLAGS_DEC_WP	((PMD_FLAGS_DEC & ~_PAGE_CACHE_MASK) | \
+				 (_PAGE_PAT | _PAGE_PWT))
+
+#define PMD_FLAGS_ENC		(PMD_FLAGS_LARGE | _PAGE_ENC)
+
+#define PTE_FLAGS		(__PAGE_KERNEL_EXEC & ~_PAGE_GLOBAL)
+
+#define PTE_FLAGS_DEC		PTE_FLAGS
+#define PTE_FLAGS_DEC_WP	((PTE_FLAGS_DEC & ~_PAGE_CACHE_MASK) | \
+				 (_PAGE_PAT | _PAGE_PWT))
+
+#define PTE_FLAGS_ENC		(PTE_FLAGS | _PAGE_ENC)
+
+static pmd_t __init *sme_prepare_pgd(struct sme_populate_pgd_data *ppd)
 {
 	pgd_t *pgd_p;
 	p4d_t *p4d_p;
 	pud_t *pud_p;
 	pmd_t *pmd_p;
 
-	pgd_p = pgd_base + pgd_index(vaddr);
+	pgd_p = ppd->pgd + pgd_index(ppd->vaddr);
 	if (native_pgd_val(*pgd_p)) {
 		if (IS_ENABLED(CONFIG_X86_5LEVEL))
 			p4d_p = (p4d_t *)(native_pgd_val(*pgd_p) & ~PTE_FLAGS_MASK);
@@ -504,15 +529,15 @@ static void __init *sme_populate_pgd(pgd_t *pgd_base, void *pgtable_area,
 		pgd_t pgd;
 
 		if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
-			p4d_p = pgtable_area;
+			p4d_p = ppd->pgtable_area;
 			memset(p4d_p, 0, sizeof(*p4d_p) * PTRS_PER_P4D);
-			pgtable_area += sizeof(*p4d_p) * PTRS_PER_P4D;
+			ppd->pgtable_area += sizeof(*p4d_p) * PTRS_PER_P4D;
 
 			pgd = native_make_pgd((pgdval_t)p4d_p + PGD_FLAGS);
 		} else {
-			pud_p = pgtable_area;
+			pud_p = ppd->pgtable_area;
 			memset(pud_p, 0, sizeof(*pud_p) * PTRS_PER_PUD);
-			pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD;
+			ppd->pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD;
 
 			pgd = native_make_pgd((pgdval_t)pud_p + PGD_FLAGS);
 		}
@@ -520,58 +545,160 @@ static void __init *sme_populate_pgd(pgd_t *pgd_base, void *pgtable_area,
 	}
 
 	if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
-		p4d_p += p4d_index(vaddr);
+		p4d_p += p4d_index(ppd->vaddr);
 		if (native_p4d_val(*p4d_p)) {
 			pud_p = (pud_t *)(native_p4d_val(*p4d_p) & ~PTE_FLAGS_MASK);
 		} else {
 			p4d_t p4d;
 
-			pud_p = pgtable_area;
+			pud_p = ppd->pgtable_area;
 			memset(pud_p, 0, sizeof(*pud_p) * PTRS_PER_PUD);
-			pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD;
+			ppd->pgtable_area += sizeof(*pud_p) * PTRS_PER_PUD;
 
 			p4d = native_make_p4d((pudval_t)pud_p + P4D_FLAGS);
 			native_set_p4d(p4d_p, p4d);
 		}
 	}
 
-	pud_p += pud_index(vaddr);
+	pud_p += pud_index(ppd->vaddr);
 	if (native_pud_val(*pud_p)) {
 		if (native_pud_val(*pud_p) & _PAGE_PSE)
-			goto out;
+			return NULL;
 
 		pmd_p = (pmd_t *)(native_pud_val(*pud_p) & ~PTE_FLAGS_MASK);
 	} else {
 		pud_t pud;
 
-		pmd_p = pgtable_area;
+		pmd_p = ppd->pgtable_area;
 		memset(pmd_p, 0, sizeof(*pmd_p) * PTRS_PER_PMD);
-		pgtable_area += sizeof(*pmd_p) * PTRS_PER_PMD;
+		ppd->pgtable_area += sizeof(*pmd_p) * PTRS_PER_PMD;
 
 		pud = native_make_pud((pmdval_t)pmd_p + PUD_FLAGS);
 		native_set_pud(pud_p, pud);
 	}
 
-	pmd_p += pmd_index(vaddr);
-	if (!native_pmd_val(*pmd_p) || !(native_pmd_val(*pmd_p) & _PAGE_PSE))
-		native_set_pmd(pmd_p, native_make_pmd(pmd_val));
+	return pmd_p;
+}
 
-out:
-	return pgtable_area;
+static void __init sme_populate_pgd_large(struct sme_populate_pgd_data *ppd)
+{
+	pmd_t *pmd_p;
+
+	pmd_p = sme_prepare_pgd(ppd);
+	if (!pmd_p)
+		return;
+
+	pmd_p += pmd_index(ppd->vaddr);
+	if (!native_pmd_val(*pmd_p) || !(native_pmd_val(*pmd_p) & _PAGE_PSE))
+		native_set_pmd(pmd_p, native_make_pmd(ppd->paddr | ppd->pmd_flags));
+}
+
+static void __init sme_populate_pgd(struct sme_populate_pgd_data *ppd)
+{
+	pmd_t *pmd_p;
+	pte_t *pte_p;
+
+	pmd_p = sme_prepare_pgd(ppd);
+	if (!pmd_p)
+		return;
+
+	pmd_p += pmd_index(ppd->vaddr);
+	if (native_pmd_val(*pmd_p)) {
+		if (native_pmd_val(*pmd_p) & _PAGE_PSE)
+			return;
+
+		pte_p = (pte_t *)(native_pmd_val(*pmd_p) & ~PTE_FLAGS_MASK);
+	} else {
+		pmd_t pmd;
+
+		pte_p = ppd->pgtable_area;
+		memset(pte_p, 0, sizeof(*pte_p) * PTRS_PER_PTE);
+		ppd->pgtable_area += sizeof(*pte_p) * PTRS_PER_PTE;
+
+		pmd = native_make_pmd((pteval_t)pte_p + PMD_FLAGS);
+		native_set_pmd(pmd_p, pmd);
+	}
+
+	pte_p += pte_index(ppd->vaddr);
+	if (!native_pte_val(*pte_p))
+		native_set_pte(pte_p, native_make_pte(ppd->paddr | ppd->pte_flags));
+}
+
+static void __init __sme_map_range_pmd(struct sme_populate_pgd_data *ppd)
+{
+	while (ppd->vaddr < ppd->vaddr_end) {
+		sme_populate_pgd_large(ppd);
+
+		ppd->vaddr += PMD_PAGE_SIZE;
+		ppd->paddr += PMD_PAGE_SIZE;
+	}
+}
+
+static void __init __sme_map_range_pte(struct sme_populate_pgd_data *ppd)
+{
+	while (ppd->vaddr < ppd->vaddr_end) {
+		sme_populate_pgd(ppd);
+
+		ppd->vaddr += PAGE_SIZE;
+		ppd->paddr += PAGE_SIZE;
+	}
+}
+
+static void __init __sme_map_range(struct sme_populate_pgd_data *ppd,
+				   pmdval_t pmd_flags, pteval_t pte_flags)
+{
+	unsigned long vaddr_end;
+
+	ppd->pmd_flags = pmd_flags;
+	ppd->pte_flags = pte_flags;
+
+	/* Save original end value since we modify the struct value */
+	vaddr_end = ppd->vaddr_end;
+
+	/* If start is not 2MB aligned, create PTE entries */
+	ppd->vaddr_end = ALIGN(ppd->vaddr, PMD_PAGE_SIZE);
+	__sme_map_range_pte(ppd);
+
+	/* Create PMD entries */
+	ppd->vaddr_end = vaddr_end & PMD_PAGE_MASK;
+	__sme_map_range_pmd(ppd);
+
+	/* If end is not 2MB aligned, create PTE entries */
+	ppd->vaddr_end = vaddr_end;
+	__sme_map_range_pte(ppd);
+}
+
+static void __init sme_map_range_encrypted(struct sme_populate_pgd_data *ppd)
+{
+	__sme_map_range(ppd, PMD_FLAGS_ENC, PTE_FLAGS_ENC);
+}
+
+static void __init sme_map_range_decrypted(struct sme_populate_pgd_data *ppd)
+{
+	__sme_map_range(ppd, PMD_FLAGS_DEC, PTE_FLAGS_DEC);
+}
+
+static void __init sme_map_range_decrypted_wp(struct sme_populate_pgd_data *ppd)
+{
+	__sme_map_range(ppd, PMD_FLAGS_DEC_WP, PTE_FLAGS_DEC_WP);
 }
 
 static unsigned long __init sme_pgtable_calc(unsigned long len)
 {
-	unsigned long p4d_size, pud_size, pmd_size;
+	unsigned long p4d_size, pud_size, pmd_size, pte_size;
 	unsigned long total;
 
 	/*
 	 * Perform a relatively simplistic calculation of the pagetable
-	 * entries that are needed. That mappings will be covered by 2MB
-	 * PMD entries so we can conservatively calculate the required
+	 * entries that are needed. Those mappings will be covered mostly
+	 * by 2MB PMD entries so we can conservatively calculate the required
 	 * number of P4D, PUD and PMD structures needed to perform the
-	 * mappings. Incrementing the count for each covers the case where
-	 * the addresses cross entries.
+	 * mappings.  For mappings that are not 2MB aligned, PTE mappings
+	 * would be needed for the start and end portion of the address range
+	 * that fall outside of the 2MB alignment.  This results in, at most,
+	 * two extra pages to hold PTE entries for each range that is mapped.
+	 * Incrementing the count for each covers the case where the addresses
+	 * cross entries.
 	 */
 	if (IS_ENABLED(CONFIG_X86_5LEVEL)) {
 		p4d_size = (ALIGN(len, PGDIR_SIZE) / PGDIR_SIZE) + 1;
@@ -585,8 +712,9 @@ static unsigned long __init sme_pgtable_calc(unsigned long len)
 	}
 	pmd_size = (ALIGN(len, PUD_SIZE) / PUD_SIZE) + 1;
 	pmd_size *= sizeof(pmd_t) * PTRS_PER_PMD;
+	pte_size = 2 * sizeof(pte_t) * PTRS_PER_PTE;
 
-	total = p4d_size + pud_size + pmd_size;
+	total = p4d_size + pud_size + pmd_size + pte_size;
 
 	/*
 	 * Now calculate the added pagetable structures needed to populate
@@ -610,29 +738,29 @@ static unsigned long __init sme_pgtable_calc(unsigned long len)
 	return total;
 }
 
-void __init sme_encrypt_kernel(void)
+void __init __nostackprotector sme_encrypt_kernel(struct boot_params *bp)
 {
 	unsigned long workarea_start, workarea_end, workarea_len;
 	unsigned long execute_start, execute_end, execute_len;
 	unsigned long kernel_start, kernel_end, kernel_len;
+	unsigned long initrd_start, initrd_end, initrd_len;
+	struct sme_populate_pgd_data ppd;
 	unsigned long pgtable_area_len;
-	unsigned long paddr, pmd_flags;
 	unsigned long decrypted_base;
-	void *pgtable_area;
-	pgd_t *pgd;
 
 	if (!sme_active())
 		return;
 
 	/*
-	 * Prepare for encrypting the kernel by building new pagetables with
-	 * the necessary attributes needed to encrypt the kernel in place.
+	 * Prepare for encrypting the kernel and initrd by building new
+	 * pagetables with the necessary attributes needed to encrypt the
+	 * kernel in place.
 	 *
 	 *   One range of virtual addresses will map the memory occupied
-	 *   by the kernel as encrypted.
+	 *   by the kernel and initrd as encrypted.
 	 *
 	 *   Another range of virtual addresses will map the memory occupied
-	 *   by the kernel as decrypted and write-protected.
+	 *   by the kernel and initrd as decrypted and write-protected.
 	 *
 	 *     The use of write-protect attribute will prevent any of the
 	 *     memory from being cached.
@@ -643,6 +771,20 @@ void __init sme_encrypt_kernel(void)
 	kernel_end = ALIGN(__pa_symbol(_end), PMD_PAGE_SIZE);
 	kernel_len = kernel_end - kernel_start;
 
+	initrd_start = 0;
+	initrd_end = 0;
+	initrd_len = 0;
+#ifdef CONFIG_BLK_DEV_INITRD
+	initrd_len = (unsigned long)bp->hdr.ramdisk_size |
+		     ((unsigned long)bp->ext_ramdisk_size << 32);
+	if (initrd_len) {
+		initrd_start = (unsigned long)bp->hdr.ramdisk_image |
+			       ((unsigned long)bp->ext_ramdisk_image << 32);
+		initrd_end = PAGE_ALIGN(initrd_start + initrd_len);
+		initrd_len = initrd_end - initrd_start;
+	}
+#endif
+
 	/* Set the encryption workarea to be immediately after the kernel */
 	workarea_start = kernel_end;
 
@@ -665,16 +807,21 @@ void __init sme_encrypt_kernel(void)
 	 */
 	pgtable_area_len = sizeof(pgd_t) * PTRS_PER_PGD;
 	pgtable_area_len += sme_pgtable_calc(execute_end - kernel_start) * 2;
+	if (initrd_len)
+		pgtable_area_len += sme_pgtable_calc(initrd_len) * 2;
 
 	/* PUDs and PMDs needed in the current pagetables for the workarea */
 	pgtable_area_len += sme_pgtable_calc(execute_len + pgtable_area_len);
 
 	/*
 	 * The total workarea includes the executable encryption area and
-	 * the pagetable area.
+	 * the pagetable area. The start of the workarea is already 2MB
+	 * aligned, align the end of the workarea on a 2MB boundary so that
+	 * we don't try to create/allocate PTE entries from the workarea
+	 * before it is mapped.
 	 */
 	workarea_len = execute_len + pgtable_area_len;
-	workarea_end = workarea_start + workarea_len;
+	workarea_end = ALIGN(workarea_start + workarea_len, PMD_PAGE_SIZE);
 
 	/*
 	 * Set the address to the start of where newly created pagetable
@@ -683,45 +830,30 @@ void __init sme_encrypt_kernel(void)
 	 * pagetables and when the new encrypted and decrypted kernel
 	 * mappings are populated.
 	 */
-	pgtable_area = (void *)execute_end;
+	ppd.pgtable_area = (void *)execute_end;
 
 	/*
 	 * Make sure the current pagetable structure has entries for
 	 * addressing the workarea.
 	 */
-	pgd = (pgd_t *)native_read_cr3_pa();
-	paddr = workarea_start;
-	while (paddr < workarea_end) {
-		pgtable_area = sme_populate_pgd(pgd, pgtable_area,
-						paddr,
-						paddr + PMD_FLAGS);
-
-		paddr += PMD_PAGE_SIZE;
-	}
+	ppd.pgd = (pgd_t *)native_read_cr3_pa();
+	ppd.paddr = workarea_start;
+	ppd.vaddr = workarea_start;
+	ppd.vaddr_end = workarea_end;
+	sme_map_range_decrypted(&ppd);
 
 	/* Flush the TLB - no globals so cr3 is enough */
 	native_write_cr3(__native_read_cr3());
 
 	/*
 	 * A new pagetable structure is being built to allow for the kernel
-	 * to be encrypted. It starts with an empty PGD that will then be
-	 * populated with new PUDs and PMDs as the encrypted and decrypted
-	 * kernel mappings are created.
+	 * and initrd to be encrypted. It starts with an empty PGD that will
+	 * then be populated with new PUDs and PMDs as the encrypted and
+	 * decrypted kernel mappings are created.
 	 */
-	pgd = pgtable_area;
-	memset(pgd, 0, sizeof(*pgd) * PTRS_PER_PGD);
-	pgtable_area += sizeof(*pgd) * PTRS_PER_PGD;
-
-	/* Add encrypted kernel (identity) mappings */
-	pmd_flags = PMD_FLAGS | _PAGE_ENC;
-	paddr = kernel_start;
-	while (paddr < kernel_end) {
-		pgtable_area = sme_populate_pgd(pgd, pgtable_area,
-						paddr,
-						paddr + pmd_flags);
-
-		paddr += PMD_PAGE_SIZE;
-	}
+	ppd.pgd = ppd.pgtable_area;
+	memset(ppd.pgd, 0, sizeof(pgd_t) * PTRS_PER_PGD);
+	ppd.pgtable_area += sizeof(pgd_t) * PTRS_PER_PGD;
 
 	/*
 	 * A different PGD index/entry must be used to get different
@@ -730,47 +862,79 @@ void __init sme_encrypt_kernel(void)
 	 * the base of the mapping.
 	 */
 	decrypted_base = (pgd_index(workarea_end) + 1) & (PTRS_PER_PGD - 1);
+	if (initrd_len) {
+		unsigned long check_base;
+
+		check_base = (pgd_index(initrd_end) + 1) & (PTRS_PER_PGD - 1);
+		decrypted_base = max(decrypted_base, check_base);
+	}
 	decrypted_base <<= PGDIR_SHIFT;
 
-	/* Add decrypted, write-protected kernel (non-identity) mappings */
-	pmd_flags = (PMD_FLAGS & ~_PAGE_CACHE_MASK) | (_PAGE_PAT | _PAGE_PWT);
-	paddr = kernel_start;
-	while (paddr < kernel_end) {
-		pgtable_area = sme_populate_pgd(pgd, pgtable_area,
-						paddr + decrypted_base,
-						paddr + pmd_flags);
+	/* Add encrypted kernel (identity) mappings */
+	ppd.paddr = kernel_start;
+	ppd.vaddr = kernel_start;
+	ppd.vaddr_end = kernel_end;
+	sme_map_range_encrypted(&ppd);
 
-		paddr += PMD_PAGE_SIZE;
+	/* Add decrypted, write-protected kernel (non-identity) mappings */
+	ppd.paddr = kernel_start;
+	ppd.vaddr = kernel_start + decrypted_base;
+	ppd.vaddr_end = kernel_end + decrypted_base;
+	sme_map_range_decrypted_wp(&ppd);
+
+	if (initrd_len) {
+		/* Add encrypted initrd (identity) mappings */
+		ppd.paddr = initrd_start;
+		ppd.vaddr = initrd_start;
+		ppd.vaddr_end = initrd_end;
+		sme_map_range_encrypted(&ppd);
+		/*
+		 * Add decrypted, write-protected initrd (non-identity) mappings
+		 */
+		ppd.paddr = initrd_start;
+		ppd.vaddr = initrd_start + decrypted_base;
+		ppd.vaddr_end = initrd_end + decrypted_base;
+		sme_map_range_decrypted_wp(&ppd);
 	}
 
 	/* Add decrypted workarea mappings to both kernel mappings */
-	paddr = workarea_start;
-	while (paddr < workarea_end) {
-		pgtable_area = sme_populate_pgd(pgd, pgtable_area,
-						paddr,
-						paddr + PMD_FLAGS);
+	ppd.paddr = workarea_start;
+	ppd.vaddr = workarea_start;
+	ppd.vaddr_end = workarea_end;
+	sme_map_range_decrypted(&ppd);
 
-		pgtable_area = sme_populate_pgd(pgd, pgtable_area,
-						paddr + decrypted_base,
-						paddr + PMD_FLAGS);
-
-		paddr += PMD_PAGE_SIZE;
-	}
+	ppd.paddr = workarea_start;
+	ppd.vaddr = workarea_start + decrypted_base;
+	ppd.vaddr_end = workarea_end + decrypted_base;
+	sme_map_range_decrypted(&ppd);
 
 	/* Perform the encryption */
 	sme_encrypt_execute(kernel_start, kernel_start + decrypted_base,
-			    kernel_len, workarea_start, (unsigned long)pgd);
+			    kernel_len, workarea_start, (unsigned long)ppd.pgd);
+
+	if (initrd_len)
+		sme_encrypt_execute(initrd_start, initrd_start + decrypted_base,
+				    initrd_len, workarea_start,
+				    (unsigned long)ppd.pgd);
 
 	/*
 	 * At this point we are running encrypted.  Remove the mappings for
 	 * the decrypted areas - all that is needed for this is to remove
 	 * the PGD entry/entries.
 	 */
-	sme_clear_pgd(pgd, kernel_start + decrypted_base,
-		      kernel_end + decrypted_base);
+	ppd.vaddr = kernel_start + decrypted_base;
+	ppd.vaddr_end = kernel_end + decrypted_base;
+	sme_clear_pgd(&ppd);
 
-	sme_clear_pgd(pgd, workarea_start + decrypted_base,
-		      workarea_end + decrypted_base);
+	if (initrd_len) {
+		ppd.vaddr = initrd_start + decrypted_base;
+		ppd.vaddr_end = initrd_end + decrypted_base;
+		sme_clear_pgd(&ppd);
+	}
+
+	ppd.vaddr = workarea_start + decrypted_base;
+	ppd.vaddr_end = workarea_end + decrypted_base;
+	sme_clear_pgd(&ppd);
 
 	/* Flush the TLB - no globals so cr3 is enough */
 	native_write_cr3(__native_read_cr3());
diff --git a/arch/x86/mm/mem_encrypt_boot.S b/arch/x86/mm/mem_encrypt_boot.S
index 730e6d5..01f682c 100644
--- a/arch/x86/mm/mem_encrypt_boot.S
+++ b/arch/x86/mm/mem_encrypt_boot.S
@@ -22,9 +22,9 @@
 
 	/*
 	 * Entry parameters:
-	 *   RDI - virtual address for the encrypted kernel mapping
-	 *   RSI - virtual address for the decrypted kernel mapping
-	 *   RDX - length of kernel
+	 *   RDI - virtual address for the encrypted mapping
+	 *   RSI - virtual address for the decrypted mapping
+	 *   RDX - length to encrypt
 	 *   RCX - virtual address of the encryption workarea, including:
 	 *     - stack page (PAGE_SIZE)
 	 *     - encryption routine page (PAGE_SIZE)
@@ -41,9 +41,9 @@
 	addq	$PAGE_SIZE, %rax	/* Workarea encryption routine */
 
 	push	%r12
-	movq	%rdi, %r10		/* Encrypted kernel */
-	movq	%rsi, %r11		/* Decrypted kernel */
-	movq	%rdx, %r12		/* Kernel length */
+	movq	%rdi, %r10		/* Encrypted area */
+	movq	%rsi, %r11		/* Decrypted area */
+	movq	%rdx, %r12		/* Area length */
 
 	/* Copy encryption routine into the workarea */
 	movq	%rax, %rdi				/* Workarea encryption routine */
@@ -52,10 +52,10 @@
 	rep	movsb
 
 	/* Setup registers for call */
-	movq	%r10, %rdi		/* Encrypted kernel */
-	movq	%r11, %rsi		/* Decrypted kernel */
+	movq	%r10, %rdi		/* Encrypted area */
+	movq	%r11, %rsi		/* Decrypted area */
 	movq	%r8, %rdx		/* Pagetables used for encryption */
-	movq	%r12, %rcx		/* Kernel length */
+	movq	%r12, %rcx		/* Area length */
 	movq	%rax, %r8		/* Workarea encryption routine */
 	addq	$PAGE_SIZE, %r8		/* Workarea intermediate copy buffer */
 
@@ -71,7 +71,7 @@
 
 ENTRY(__enc_copy)
 /*
- * Routine used to encrypt kernel.
+ * Routine used to encrypt memory in place.
  *   This routine must be run outside of the kernel proper since
  *   the kernel will be encrypted during the process. So this
  *   routine is defined here and then copied to an area outside
@@ -79,19 +79,19 @@
  *   during execution.
  *
  *   On entry the registers must be:
- *     RDI - virtual address for the encrypted kernel mapping
- *     RSI - virtual address for the decrypted kernel mapping
+ *     RDI - virtual address for the encrypted mapping
+ *     RSI - virtual address for the decrypted mapping
  *     RDX - address of the pagetables to use for encryption
- *     RCX - length of kernel
+ *     RCX - length of area
  *      R8 - intermediate copy buffer
  *
  *     RAX - points to this routine
  *
- * The kernel will be encrypted by copying from the non-encrypted
- * kernel space to an intermediate buffer and then copying from the
- * intermediate buffer back to the encrypted kernel space. The physical
- * addresses of the two kernel space mappings are the same which
- * results in the kernel being encrypted "in place".
+ * The area will be encrypted by copying from the non-encrypted
+ * memory space to an intermediate buffer and then copying from the
+ * intermediate buffer back to the encrypted memory space. The physical
+ * addresses of the two mappings are the same which results in the area
+ * being encrypted "in place".
  */
 	/* Enable the new page tables */
 	mov	%rdx, %cr3
@@ -103,47 +103,55 @@
 	orq	$X86_CR4_PGE, %rdx
 	mov	%rdx, %cr4
 
+	push	%r15
+	push	%r12
+
+	movq	%rcx, %r9		/* Save area length */
+	movq	%rdi, %r10		/* Save encrypted area address */
+	movq	%rsi, %r11		/* Save decrypted area address */
+
 	/* Set the PAT register PA5 entry to write-protect */
-	push	%rcx
 	movl	$MSR_IA32_CR_PAT, %ecx
 	rdmsr
-	push	%rdx			/* Save original PAT value */
+	mov	%rdx, %r15		/* Save original PAT value */
 	andl	$0xffff00ff, %edx	/* Clear PA5 */
 	orl	$0x00000500, %edx	/* Set PA5 to WP */
 	wrmsr
-	pop	%rdx			/* RDX contains original PAT value */
-	pop	%rcx
-
-	movq	%rcx, %r9		/* Save kernel length */
-	movq	%rdi, %r10		/* Save encrypted kernel address */
-	movq	%rsi, %r11		/* Save decrypted kernel address */
 
 	wbinvd				/* Invalidate any cache entries */
 
-	/* Copy/encrypt 2MB at a time */
+	/* Copy/encrypt up to 2MB at a time */
+	movq	$PMD_PAGE_SIZE, %r12
 1:
-	movq	%r11, %rsi		/* Source - decrypted kernel */
+	cmpq	%r12, %r9
+	jnb	2f
+	movq	%r9, %r12
+
+2:
+	movq	%r11, %rsi		/* Source - decrypted area */
 	movq	%r8, %rdi		/* Dest   - intermediate copy buffer */
-	movq	$PMD_PAGE_SIZE, %rcx	/* 2MB length */
+	movq	%r12, %rcx
 	rep	movsb
 
 	movq	%r8, %rsi		/* Source - intermediate copy buffer */
-	movq	%r10, %rdi		/* Dest   - encrypted kernel */
-	movq	$PMD_PAGE_SIZE, %rcx	/* 2MB length */
+	movq	%r10, %rdi		/* Dest   - encrypted area */
+	movq	%r12, %rcx
 	rep	movsb
 
-	addq	$PMD_PAGE_SIZE, %r11
-	addq	$PMD_PAGE_SIZE, %r10
-	subq	$PMD_PAGE_SIZE, %r9	/* Kernel length decrement */
+	addq	%r12, %r11
+	addq	%r12, %r10
+	subq	%r12, %r9		/* Kernel length decrement */
 	jnz	1b			/* Kernel length not zero? */
 
 	/* Restore PAT register */
-	push	%rdx			/* Save original PAT value */
 	movl	$MSR_IA32_CR_PAT, %ecx
 	rdmsr
-	pop	%rdx			/* Restore original PAT value */
+	mov	%r15, %rdx		/* Restore original PAT value */
 	wrmsr
 
+	pop	%r12
+	pop	%r15
+
 	ret
 .L__enc_copy_end:
 ENDPROC(__enc_copy)
diff --git a/arch/x86/mm/pti.c b/arch/x86/mm/pti.c
index bce8aea..ce38f16 100644
--- a/arch/x86/mm/pti.c
+++ b/arch/x86/mm/pti.c
@@ -56,13 +56,13 @@
 
 static void __init pti_print_if_insecure(const char *reason)
 {
-	if (boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+	if (boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
 		pr_info("%s\n", reason);
 }
 
 static void __init pti_print_if_secure(const char *reason)
 {
-	if (!boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+	if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
 		pr_info("%s\n", reason);
 }
 
@@ -96,7 +96,7 @@ void __init pti_check_boottime_disable(void)
 	}
 
 autosel:
-	if (!boot_cpu_has_bug(X86_BUG_CPU_INSECURE))
+	if (!boot_cpu_has_bug(X86_BUG_CPU_MELTDOWN))
 		return;
 enable:
 	setup_force_cpu_cap(X86_FEATURE_PTI);
@@ -149,7 +149,7 @@ pgd_t __pti_set_user_pgd(pgd_t *pgdp, pgd_t pgd)
  *
  * Returns a pointer to a P4D on success, or NULL on failure.
  */
-static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
+static __init p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
 {
 	pgd_t *pgd = kernel_to_user_pgdp(pgd_offset_k(address));
 	gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
@@ -164,12 +164,7 @@ static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
 		if (!new_p4d_page)
 			return NULL;
 
-		if (pgd_none(*pgd)) {
-			set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page)));
-			new_p4d_page = 0;
-		}
-		if (new_p4d_page)
-			free_page(new_p4d_page);
+		set_pgd(pgd, __pgd(_KERNPG_TABLE | __pa(new_p4d_page)));
 	}
 	BUILD_BUG_ON(pgd_large(*pgd) != 0);
 
@@ -182,7 +177,7 @@ static p4d_t *pti_user_pagetable_walk_p4d(unsigned long address)
  *
  * Returns a pointer to a PMD on success, or NULL on failure.
  */
-static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
+static __init pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
 {
 	gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO);
 	p4d_t *p4d = pti_user_pagetable_walk_p4d(address);
@@ -194,12 +189,7 @@ static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
 		if (!new_pud_page)
 			return NULL;
 
-		if (p4d_none(*p4d)) {
-			set_p4d(p4d, __p4d(_KERNPG_TABLE | __pa(new_pud_page)));
-			new_pud_page = 0;
-		}
-		if (new_pud_page)
-			free_page(new_pud_page);
+		set_p4d(p4d, __p4d(_KERNPG_TABLE | __pa(new_pud_page)));
 	}
 
 	pud = pud_offset(p4d, address);
@@ -213,12 +203,7 @@ static pmd_t *pti_user_pagetable_walk_pmd(unsigned long address)
 		if (!new_pmd_page)
 			return NULL;
 
-		if (pud_none(*pud)) {
-			set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
-			new_pmd_page = 0;
-		}
-		if (new_pmd_page)
-			free_page(new_pmd_page);
+		set_pud(pud, __pud(_KERNPG_TABLE | __pa(new_pmd_page)));
 	}
 
 	return pmd_offset(pud, address);
@@ -251,12 +236,7 @@ static __init pte_t *pti_user_pagetable_walk_pte(unsigned long address)
 		if (!new_pte_page)
 			return NULL;
 
-		if (pmd_none(*pmd)) {
-			set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
-			new_pte_page = 0;
-		}
-		if (new_pte_page)
-			free_page(new_pte_page);
+		set_pmd(pmd, __pmd(_KERNPG_TABLE | __pa(new_pte_page)));
 	}
 
 	pte = pte_offset_kernel(pmd, address);
@@ -367,7 +347,8 @@ static void __init pti_setup_espfix64(void)
 static void __init pti_clone_entry_text(void)
 {
 	pti_clone_pmds((unsigned long) __entry_text_start,
-			(unsigned long) __irqentry_text_end, _PAGE_RW);
+			(unsigned long) __irqentry_text_end,
+		       _PAGE_RW | _PAGE_GLOBAL);
 }
 
 /*
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a156195..5bfe61a 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -151,6 +151,34 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
 	local_irq_restore(flags);
 }
 
+static void sync_current_stack_to_mm(struct mm_struct *mm)
+{
+	unsigned long sp = current_stack_pointer;
+	pgd_t *pgd = pgd_offset(mm, sp);
+
+	if (CONFIG_PGTABLE_LEVELS > 4) {
+		if (unlikely(pgd_none(*pgd))) {
+			pgd_t *pgd_ref = pgd_offset_k(sp);
+
+			set_pgd(pgd, *pgd_ref);
+		}
+	} else {
+		/*
+		 * "pgd" is faked.  The top level entries are "p4d"s, so sync
+		 * the p4d.  This compiles to approximately the same code as
+		 * the 5-level case.
+		 */
+		p4d_t *p4d = p4d_offset(pgd, sp);
+
+		if (unlikely(p4d_none(*p4d))) {
+			pgd_t *pgd_ref = pgd_offset_k(sp);
+			p4d_t *p4d_ref = p4d_offset(pgd_ref, sp);
+
+			set_p4d(p4d, *p4d_ref);
+		}
+	}
+}
+
 void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			struct task_struct *tsk)
 {
@@ -226,11 +254,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 			 * mapped in the new pgd, we'll double-fault.  Forcibly
 			 * map it.
 			 */
-			unsigned int index = pgd_index(current_stack_pointer);
-			pgd_t *pgd = next->pgd + index;
-
-			if (unlikely(pgd_none(*pgd)))
-				set_pgd(pgd, init_mm.pgd[index]);
+			sync_current_stack_to_mm(next);
 		}
 
 		/* Stop remote flushes for the previous mm */
diff --git a/arch/x86/pci/common.c b/arch/x86/pci/common.c
index 7a5350d..563049c 100644
--- a/arch/x86/pci/common.c
+++ b/arch/x86/pci/common.c
@@ -594,6 +594,11 @@ char *__init pcibios_setup(char *str)
 	} else if (!strcmp(str, "nocrs")) {
 		pci_probe |= PCI_ROOT_NO_CRS;
 		return NULL;
+#ifdef CONFIG_PHYS_ADDR_T_64BIT
+	} else if (!strcmp(str, "big_root_window")) {
+		pci_probe |= PCI_BIG_ROOT_WINDOW;
+		return NULL;
+#endif
 	} else if (!strcmp(str, "earlydump")) {
 		pci_early_dump_regs = 1;
 		return NULL;
diff --git a/arch/x86/pci/fixup.c b/arch/x86/pci/fixup.c
index e663d6b..54ef19e 100644
--- a/arch/x86/pci/fixup.c
+++ b/arch/x86/pci/fixup.c
@@ -662,10 +662,14 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x2033, quirk_no_aersid);
  */
 static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
 {
-	unsigned i;
-	u32 base, limit, high;
+	static const char *name = "PCI Bus 0000:00";
 	struct resource *res, *conflict;
+	u32 base, limit, high;
 	struct pci_dev *other;
+	unsigned i;
+
+	if (!(pci_probe & PCI_BIG_ROOT_WINDOW))
+		return;
 
 	/* Check that we are the only device of that type */
 	other = pci_get_device(dev->vendor, dev->device, NULL);
@@ -699,22 +703,30 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
 	if (!res)
 		return;
 
-	res->name = "PCI Bus 0000:00";
+	/*
+	 * Allocate a 256GB window directly below the 0xfd00000000 hardware
+	 * limit (see AMD Family 15h Models 30h-3Fh BKDG, sec 2.4.6).
+	 */
+	res->name = name;
 	res->flags = IORESOURCE_PREFETCH | IORESOURCE_MEM |
 		IORESOURCE_MEM_64 | IORESOURCE_WINDOW;
-	res->start = 0x100000000ull;
+	res->start = 0xbd00000000ull;
 	res->end = 0xfd00000000ull - 1;
 
-	/* Just grab the free area behind system memory for this */
-	while ((conflict = request_resource_conflict(&iomem_resource, res))) {
-		if (conflict->end >= res->end) {
-			kfree(res);
+	conflict = request_resource_conflict(&iomem_resource, res);
+	if (conflict) {
+		kfree(res);
+		if (conflict->name != name)
 			return;
-		}
-		res->start = conflict->end + 1;
-	}
 
-	dev_info(&dev->dev, "adding root bus resource %pR\n", res);
+		/* We are resuming from suspend; just reenable the window */
+		res = conflict;
+	} else {
+		dev_info(&dev->dev, "adding root bus resource %pR (tainting kernel)\n",
+			 res);
+		add_taint(TAINT_FIRMWARE_WORKAROUND, LOCKDEP_STILL_OK);
+		pci_bus_add_resource(dev->bus, res, 0);
+	}
 
 	base = ((res->start >> 8) & AMD_141b_MMIO_BASE_MMIOBASE_MASK) |
 		AMD_141b_MMIO_BASE_RE_MASK | AMD_141b_MMIO_BASE_WE_MASK;
@@ -726,13 +738,16 @@ static void pci_amd_enable_64bit_bar(struct pci_dev *dev)
 	pci_write_config_dword(dev, AMD_141b_MMIO_HIGH(i), high);
 	pci_write_config_dword(dev, AMD_141b_MMIO_LIMIT(i), limit);
 	pci_write_config_dword(dev, AMD_141b_MMIO_BASE(i), base);
-
-	pci_bus_add_resource(dev->bus, res, 0);
 }
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1401, pci_amd_enable_64bit_bar);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1571, pci_amd_enable_64bit_bar);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x15b1, pci_amd_enable_64bit_bar);
 DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_AMD, 0x1601, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1401, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x141b, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1571, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x15b1, pci_amd_enable_64bit_bar);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_AMD, 0x1601, pci_amd_enable_64bit_bar);
 
 #endif
diff --git a/arch/x86/pci/intel_mid_pci.c b/arch/x86/pci/intel_mid_pci.c
index 5119210..43867bc 100644
--- a/arch/x86/pci/intel_mid_pci.c
+++ b/arch/x86/pci/intel_mid_pci.c
@@ -300,6 +300,7 @@ int __init intel_mid_pci_init(void)
 	pci_root_ops = intel_mid_pci_ops;
 	pci_soc_mode = 1;
 	/* Continue with standard init */
+	acpi_noirq_set();
 	return 1;
 }
 
diff --git a/arch/x86/platform/efi/efi_64.c b/arch/x86/platform/efi/efi_64.c
index d87ac96..c310a82 100644
--- a/arch/x86/platform/efi/efi_64.c
+++ b/arch/x86/platform/efi/efi_64.c
@@ -25,7 +25,6 @@
 #include <linux/spinlock.h>
 #include <linux/bootmem.h>
 #include <linux/ioport.h>
-#include <linux/init.h>
 #include <linux/mc146818rtc.h>
 #include <linux/efi.h>
 #include <linux/uaccess.h>
@@ -135,7 +134,9 @@ pgd_t * __init efi_call_phys_prolog(void)
 				pud[j] = *pud_offset(p4d_k, vaddr);
 			}
 		}
+		pgd_offset_k(pgd * PGDIR_SIZE)->pgd &= ~_PAGE_NX;
 	}
+
 out:
 	__flush_tlb_all();
 
diff --git a/arch/x86/platform/efi/quirks.c b/arch/x86/platform/efi/quirks.c
index 8a99a2e..5b513cc 100644
--- a/arch/x86/platform/efi/quirks.c
+++ b/arch/x86/platform/efi/quirks.c
@@ -592,7 +592,18 @@ static int qrk_capsule_setup_info(struct capsule_info *cap_info, void **pkbuff,
 	/*
 	 * Update the first page pointer to skip over the CSH header.
 	 */
-	cap_info->pages[0] += csh->headersize;
+	cap_info->phys[0] += csh->headersize;
+
+	/*
+	 * cap_info->capsule should point at a virtual mapping of the entire
+	 * capsule, starting at the capsule header. Our image has the Quark
+	 * security header prepended, so we cannot rely on the default vmap()
+	 * mapping created by the generic capsule code.
+	 * Given that the Quark firmware does not appear to care about the
+	 * virtual mapping, let's just point cap_info->capsule at our copy
+	 * of the capsule header.
+	 */
+	cap_info->capsule = &cap_info->header;
 
 	return 1;
 }
diff --git a/arch/x86/platform/intel-mid/device_libs/platform_bt.c b/arch/x86/platform/intel-mid/device_libs/platform_bt.c
index dc036e5..5a0483e 100644
--- a/arch/x86/platform/intel-mid/device_libs/platform_bt.c
+++ b/arch/x86/platform/intel-mid/device_libs/platform_bt.c
@@ -60,7 +60,7 @@ static int __init tng_bt_sfi_setup(struct bt_sfi_data *ddata)
 	return 0;
 }
 
-static const struct bt_sfi_data tng_bt_sfi_data __initdata = {
+static struct bt_sfi_data tng_bt_sfi_data __initdata = {
 	.setup	= tng_bt_sfi_setup,
 };
 
diff --git a/arch/x86/platform/intel-mid/intel-mid.c b/arch/x86/platform/intel-mid/intel-mid.c
index 86676ce..2c67bae 100644
--- a/arch/x86/platform/intel-mid/intel-mid.c
+++ b/arch/x86/platform/intel-mid/intel-mid.c
@@ -194,7 +194,7 @@ void __init x86_intel_mid_early_setup(void)
 	x86_platform.calibrate_tsc = intel_mid_calibrate_tsc;
 	x86_platform.get_nmi_reason = intel_mid_get_nmi_reason;
 
-	x86_init.pci.init = intel_mid_pci_init;
+	x86_init.pci.arch_init = intel_mid_pci_init;
 	x86_init.pci.fixup_irqs = x86_init_noop;
 
 	legacy_pic = &null_legacy_pic;
diff --git a/arch/x86/platform/intel-mid/sfi.c b/arch/x86/platform/intel-mid/sfi.c
index 19b43e3..7be1e1f 100644
--- a/arch/x86/platform/intel-mid/sfi.c
+++ b/arch/x86/platform/intel-mid/sfi.c
@@ -96,8 +96,7 @@ int __init sfi_parse_mtmr(struct sfi_table_header *table)
 			pentry->freq_hz, pentry->irq);
 		mp_irq.type = MP_INTSRC;
 		mp_irq.irqtype = mp_INT;
-		/* triggering mode edge bit 2-3, active high polarity bit 0-1 */
-		mp_irq.irqflag = 5;
+		mp_irq.irqflag = MP_IRQTRIG_EDGE | MP_IRQPOL_ACTIVE_HIGH;
 		mp_irq.srcbus = MP_BUS_ISA;
 		mp_irq.srcbusirq = pentry->irq;	/* IRQ */
 		mp_irq.dstapic = MP_APIC_ALL;
@@ -168,7 +167,7 @@ int __init sfi_parse_mrtc(struct sfi_table_header *table)
 			totallen, (u32)pentry->phys_addr, pentry->irq);
 		mp_irq.type = MP_INTSRC;
 		mp_irq.irqtype = mp_INT;
-		mp_irq.irqflag = 0xf;	/* level trigger and active low */
+		mp_irq.irqflag = MP_IRQTRIG_LEVEL | MP_IRQPOL_ACTIVE_LOW;
 		mp_irq.srcbus = MP_BUS_ISA;
 		mp_irq.srcbusirq = pentry->irq;	/* IRQ */
 		mp_irq.dstapic = MP_APIC_ALL;
diff --git a/arch/x86/platform/uv/tlb_uv.c b/arch/x86/platform/uv/tlb_uv.c
index 8538a67..c2e9285 100644
--- a/arch/x86/platform/uv/tlb_uv.c
+++ b/arch/x86/platform/uv/tlb_uv.c
@@ -1751,7 +1751,8 @@ static void activation_descriptor_init(int node, int pnode, int base_pnode)
 		uv1 = 1;
 
 	/* the 14-bit pnode */
-	write_mmr_descriptor_base(pnode, (n << UV_DESC_PSHIFT | m));
+	write_mmr_descriptor_base(pnode,
+		(n << UVH_LB_BAU_SB_DESCRIPTOR_BASE_NODE_ID_SHFT | m));
 	/*
 	 * Initializing all 8 (ITEMS_PER_DESC) descriptors for each
 	 * cpu even though we only use the first one; one descriptor can
diff --git a/arch/x86/tools/Makefile b/arch/x86/tools/Makefile
index 972b8e8..09af7ff 100644
--- a/arch/x86/tools/Makefile
+++ b/arch/x86/tools/Makefile
@@ -13,28 +13,28 @@
   posttest_64bit = -n
 endif
 
-distill_awk = $(srctree)/arch/x86/tools/distill.awk
+reformatter = $(srctree)/arch/x86/tools/objdump_reformat.awk
 chkobjdump = $(srctree)/arch/x86/tools/chkobjdump.awk
 
 quiet_cmd_posttest = TEST    $@
-      cmd_posttest = ($(OBJDUMP) -v | $(AWK) -f $(chkobjdump)) || $(OBJDUMP) -d -j .text $(objtree)/vmlinux | $(AWK) -f $(distill_awk) | $(obj)/test_get_len $(posttest_64bit) $(posttest_verbose)
+      cmd_posttest = ($(OBJDUMP) -v | $(AWK) -f $(chkobjdump)) || $(OBJDUMP) -d -j .text $(objtree)/vmlinux | $(AWK) -f $(reformatter) | $(obj)/insn_decoder_test $(posttest_64bit) $(posttest_verbose)
 
 quiet_cmd_sanitytest = TEST    $@
       cmd_sanitytest = $(obj)/insn_sanity $(posttest_64bit) -m 1000000
 
-posttest: $(obj)/test_get_len vmlinux $(obj)/insn_sanity
+posttest: $(obj)/insn_decoder_test vmlinux $(obj)/insn_sanity
 	$(call cmd,posttest)
 	$(call cmd,sanitytest)
 
-hostprogs-y	+= test_get_len insn_sanity
+hostprogs-y	+= insn_decoder_test insn_sanity
 
 # -I needed for generated C source and C source which in the kernel tree.
-HOSTCFLAGS_test_get_len.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/uapi/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/uapi/
+HOSTCFLAGS_insn_decoder_test.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/uapi/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/uapi/
 
 HOSTCFLAGS_insn_sanity.o := -Wall -I$(objtree)/arch/x86/lib/ -I$(srctree)/arch/x86/include/ -I$(srctree)/arch/x86/lib/ -I$(srctree)/include/
 
 # Dependencies are also needed.
-$(obj)/test_get_len.o: $(srctree)/arch/x86/lib/insn.c $(srctree)/arch/x86/lib/inat.c $(srctree)/arch/x86/include/asm/inat_types.h $(srctree)/arch/x86/include/asm/inat.h $(srctree)/arch/x86/include/asm/insn.h $(objtree)/arch/x86/lib/inat-tables.c
+$(obj)/insn_decoder_test.o: $(srctree)/arch/x86/lib/insn.c $(srctree)/arch/x86/lib/inat.c $(srctree)/arch/x86/include/asm/inat_types.h $(srctree)/arch/x86/include/asm/inat.h $(srctree)/arch/x86/include/asm/insn.h $(objtree)/arch/x86/lib/inat-tables.c
 
 $(obj)/insn_sanity.o: $(srctree)/arch/x86/lib/insn.c $(srctree)/arch/x86/lib/inat.c $(srctree)/arch/x86/include/asm/inat_types.h $(srctree)/arch/x86/include/asm/inat.h $(srctree)/arch/x86/include/asm/insn.h $(objtree)/arch/x86/lib/inat-tables.c
 
diff --git a/arch/x86/tools/test_get_len.c b/arch/x86/tools/insn_decoder_test.c
similarity index 81%
rename from arch/x86/tools/test_get_len.c
rename to arch/x86/tools/insn_decoder_test.c
index ecf31e0..a3b4fd9 100644
--- a/arch/x86/tools/test_get_len.c
+++ b/arch/x86/tools/insn_decoder_test.c
@@ -9,10 +9,6 @@
  * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  * GNU General Public License for more details.
  *
- * You should have received a copy of the GNU General Public License
- * along with this program; if not, write to the Free Software
- * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
- *
  * Copyright (C) IBM Corporation, 2009
  */
 
@@ -21,6 +17,7 @@
 #include <string.h>
 #include <assert.h>
 #include <unistd.h>
+#include <stdarg.h>
 
 #define unlikely(cond) (cond)
 
@@ -33,7 +30,7 @@
  * particular.  See if insn_get_length() and the disassembler agree
  * on the length of each instruction in an elf disassembly.
  *
- * Usage: objdump -d a.out | awk -f distill.awk | ./test_get_len
+ * Usage: objdump -d a.out | awk -f objdump_reformat.awk | ./insn_decoder_test
  */
 
 const char *prog;
@@ -42,8 +39,8 @@ static int x86_64;
 
 static void usage(void)
 {
-	fprintf(stderr, "Usage: objdump -d a.out | awk -f distill.awk |"
-		" %s [-y|-n] [-v]\n", prog);
+	fprintf(stderr, "Usage: objdump -d a.out | awk -f objdump_reformat.awk"
+		" | %s [-y|-n] [-v]\n", prog);
 	fprintf(stderr, "\t-y	64bit mode\n");
 	fprintf(stderr, "\t-n	32bit mode\n");
 	fprintf(stderr, "\t-v	verbose mode\n");
@@ -52,10 +49,21 @@ static void usage(void)
 
 static void malformed_line(const char *line, int line_nr)
 {
-	fprintf(stderr, "%s: malformed line %d:\n%s", prog, line_nr, line);
+	fprintf(stderr, "%s: error: malformed line %d:\n%s",
+		prog, line_nr, line);
 	exit(3);
 }
 
+static void pr_warn(const char *fmt, ...)
+{
+	va_list ap;
+
+	fprintf(stderr, "%s: warning: ", prog);
+	va_start(ap, fmt);
+	vfprintf(stderr, fmt, ap);
+	va_end(ap);
+}
+
 static void dump_field(FILE *fp, const char *name, const char *indent,
 		       struct insn_field *field)
 {
@@ -153,21 +161,20 @@ int main(int argc, char **argv)
 		insn_get_length(&insn);
 		if (insn.length != nb) {
 			warnings++;
-			fprintf(stderr, "Warning: %s found difference at %s\n",
-				prog, sym);
-			fprintf(stderr, "Warning: %s", line);
-			fprintf(stderr, "Warning: objdump says %d bytes, but "
-				"insn_get_length() says %d\n", nb,
-				insn.length);
+			pr_warn("Found an x86 instruction decoder bug, "
+				"please report this.\n", sym);
+			pr_warn("%s", line);
+			pr_warn("objdump says %d bytes, but insn_get_length() "
+				"says %d\n", nb, insn.length);
 			if (verbose)
 				dump_insn(stderr, &insn);
 		}
 	}
 	if (warnings)
-		fprintf(stderr, "Warning: decoded and checked %d"
-			" instructions with %d warnings\n", insns, warnings);
+		pr_warn("Decoded and checked %d instructions with %d "
+			"failures\n", insns, warnings);
 	else
-		fprintf(stdout, "Success: decoded and checked %d"
-			" instructions\n", insns);
+		fprintf(stdout, "%s: success: Decoded and checked %d"
+			" instructions\n", prog, insns);
 	return 0;
 }
diff --git a/arch/x86/tools/distill.awk b/arch/x86/tools/objdump_reformat.awk
similarity index 90%
rename from arch/x86/tools/distill.awk
rename to arch/x86/tools/objdump_reformat.awk
index e0edecc..f418c91 100644
--- a/arch/x86/tools/distill.awk
+++ b/arch/x86/tools/objdump_reformat.awk
@@ -1,7 +1,7 @@
 #!/bin/awk -f
 # SPDX-License-Identifier: GPL-2.0
-# Usage: objdump -d a.out | awk -f distill.awk | ./test_get_len
-# Distills the disassembly as follows:
+# Usage: objdump -d a.out | awk -f objdump_reformat.awk | ./insn_decoder_test
+# Reformats the disassembly as follows:
 # - Removes all lines except the disassembled instructions.
 # - For instructions that exceed 1 line (7 bytes), crams all the hex bytes
 # into a single line.
diff --git a/arch/x86/xen/mmu_hvm.c b/arch/x86/xen/mmu_hvm.c
index 2cfcfe4..dd2ad82 100644
--- a/arch/x86/xen/mmu_hvm.c
+++ b/arch/x86/xen/mmu_hvm.c
@@ -75,6 +75,6 @@ void __init xen_hvm_init_mmu_ops(void)
 	if (is_pagetable_dying_supported())
 		pv_mmu_ops.exit_mmap = xen_hvm_exit_mmap;
 #ifdef CONFIG_PROC_VMCORE
-	register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram);
+	WARN_ON(register_oldmem_pfn_is_ram(&xen_oldmem_pfn_is_ram));
 #endif
 }
diff --git a/arch/x86/xen/mmu_pv.c b/arch/x86/xen/mmu_pv.c
index 4d62c07..d8507622 100644
--- a/arch/x86/xen/mmu_pv.c
+++ b/arch/x86/xen/mmu_pv.c
@@ -1325,20 +1325,18 @@ static void xen_flush_tlb_others(const struct cpumask *cpus,
 {
 	struct {
 		struct mmuext_op op;
-#ifdef CONFIG_SMP
-		DECLARE_BITMAP(mask, num_processors);
-#else
 		DECLARE_BITMAP(mask, NR_CPUS);
-#endif
 	} *args;
 	struct multicall_space mcs;
+	const size_t mc_entry_size = sizeof(args->op) +
+		sizeof(args->mask[0]) * BITS_TO_LONGS(num_possible_cpus());
 
 	trace_xen_mmu_flush_tlb_others(cpus, info->mm, info->start, info->end);
 
 	if (cpumask_empty(cpus))
 		return;		/* nothing to do */
 
-	mcs = xen_mc_entry(sizeof(*args));
+	mcs = xen_mc_entry(mc_entry_size);
 	args = mcs.args;
 	args->op.arg2.vcpumask = to_cpumask(args->mask);
 
diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c
index 02f3445..cd97a62 100644
--- a/arch/x86/xen/spinlock.c
+++ b/arch/x86/xen/spinlock.c
@@ -23,8 +23,6 @@ static DEFINE_PER_CPU(int, lock_kicker_irq) = -1;
 static DEFINE_PER_CPU(char *, irq_name);
 static bool xen_pvspin = true;
 
-#include <asm/qspinlock.h>
-
 static void xen_qlock_kick(int cpu)
 {
 	int irq = per_cpu(lock_kicker_irq, cpu);
diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h
index 75011b8..3b34745 100644
--- a/arch/x86/xen/xen-ops.h
+++ b/arch/x86/xen/xen-ops.h
@@ -72,7 +72,7 @@ u64 xen_clocksource_read(void);
 void xen_setup_cpu_clockevents(void);
 void xen_save_time_memory_area(void);
 void xen_restore_time_memory_area(void);
-void __init xen_init_time_ops(void);
+void __ref xen_init_time_ops(void);
 void __init xen_hvm_init_time_ops(void);
 
 irqreturn_t xen_debug_interrupt(int irq, void *dev_id);
diff --git a/arch/xtensa/Kconfig b/arch/xtensa/Kconfig
index 8bc52f7..c921e8b 100644
--- a/arch/xtensa/Kconfig
+++ b/arch/xtensa/Kconfig
@@ -15,6 +15,9 @@
 	select GENERIC_IRQ_SHOW
 	select GENERIC_PCI_IOMAP
 	select GENERIC_SCHED_CLOCK
+	select GENERIC_STRNCPY_FROM_USER if KASAN
+	select HAVE_ARCH_KASAN if MMU
+	select HAVE_CC_STACKPROTECTOR
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_API_DEBUG
 	select HAVE_DMA_CONTIGUOUS
@@ -79,6 +82,10 @@
 config HAVE_XTENSA_GPIO32
 	def_bool n
 
+config KASAN_SHADOW_OFFSET
+	hex
+	default 0x6e400000
+
 menu "Processor type and features"
 
 choice
diff --git a/arch/xtensa/Makefile b/arch/xtensa/Makefile
index 7ee02fe..3a934b7 100644
--- a/arch/xtensa/Makefile
+++ b/arch/xtensa/Makefile
@@ -42,10 +42,11 @@
 
 # temporarily until string.h is fixed
 KBUILD_CFLAGS += -ffreestanding -D__linux__
-
-KBUILD_CFLAGS += -pipe -mlongcalls
-
+KBUILD_CFLAGS += -pipe -mlongcalls -mtext-section-literals
 KBUILD_CFLAGS += $(call cc-option,-mforce-no-pic,)
+KBUILD_CFLAGS += $(call cc-option,-mno-serialize-volatile,)
+
+KBUILD_AFLAGS += -mlongcalls -mtext-section-literals
 
 ifneq ($(CONFIG_LD_NO_RELAX),)
 LDFLAGS := --no-relax
diff --git a/arch/xtensa/boot/boot-redboot/bootstrap.S b/arch/xtensa/boot/boot-redboot/bootstrap.S
index bf7fabe..bbf3b4b 100644
--- a/arch/xtensa/boot/boot-redboot/bootstrap.S
+++ b/arch/xtensa/boot/boot-redboot/bootstrap.S
@@ -42,6 +42,7 @@
 	.align 4
 
 	.section .text, "ax"
+	.literal_position
 	.begin literal_prefix .text
 
 	/* put literals in here! */
diff --git a/arch/xtensa/boot/lib/Makefile b/arch/xtensa/boot/lib/Makefile
index d2a7f48..355127f 100644
--- a/arch/xtensa/boot/lib/Makefile
+++ b/arch/xtensa/boot/lib/Makefile
@@ -15,6 +15,12 @@
 CFLAGS_REMOVE_inffast.o = -pg
 endif
 
+KASAN_SANITIZE := n
+
+CFLAGS_REMOVE_inflate.o += -fstack-protector -fstack-protector-strong
+CFLAGS_REMOVE_zmem.o += -fstack-protector -fstack-protector-strong
+CFLAGS_REMOVE_inftrees.o += -fstack-protector -fstack-protector-strong
+CFLAGS_REMOVE_inffast.o += -fstack-protector -fstack-protector-strong
 
 quiet_cmd_copy_zlib = COPY    $@
       cmd_copy_zlib = cat $< > $@
diff --git a/arch/xtensa/configs/audio_kc705_defconfig b/arch/xtensa/configs/audio_kc705_defconfig
index 8d16925..2bf964d 100644
--- a/arch/xtensa/configs/audio_kc705_defconfig
+++ b/arch/xtensa/configs/audio_kc705_defconfig
@@ -1,7 +1,6 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_FHANDLE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ_IDLE=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_IRQ_TIME_ACCOUNTING=y
diff --git a/arch/xtensa/configs/cadence_csp_defconfig b/arch/xtensa/configs/cadence_csp_defconfig
index f2d3094..3221b70 100644
--- a/arch/xtensa/configs/cadence_csp_defconfig
+++ b/arch/xtensa/configs/cadence_csp_defconfig
@@ -1,7 +1,6 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_USELIB=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ_IDLE=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_IRQ_TIME_ACCOUNTING=y
diff --git a/arch/xtensa/configs/generic_kc705_defconfig b/arch/xtensa/configs/generic_kc705_defconfig
index 744adea..985fa85 100644
--- a/arch/xtensa/configs/generic_kc705_defconfig
+++ b/arch/xtensa/configs/generic_kc705_defconfig
@@ -1,7 +1,6 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_FHANDLE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ_IDLE=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_IRQ_TIME_ACCOUNTING=y
diff --git a/arch/xtensa/configs/nommu_kc705_defconfig b/arch/xtensa/configs/nommu_kc705_defconfig
index 78c2529..624f9b3 100644
--- a/arch/xtensa/configs/nommu_kc705_defconfig
+++ b/arch/xtensa/configs/nommu_kc705_defconfig
@@ -1,7 +1,6 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_FHANDLE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ_IDLE=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_IRQ_TIME_ACCOUNTING=y
diff --git a/arch/xtensa/configs/smp_lx200_defconfig b/arch/xtensa/configs/smp_lx200_defconfig
index 14e3ca3..11fed6c 100644
--- a/arch/xtensa/configs/smp_lx200_defconfig
+++ b/arch/xtensa/configs/smp_lx200_defconfig
@@ -1,7 +1,6 @@
 CONFIG_SYSVIPC=y
 CONFIG_POSIX_MQUEUE=y
 CONFIG_FHANDLE=y
-CONFIG_IRQ_DOMAIN_DEBUG=y
 CONFIG_NO_HZ_IDLE=y
 CONFIG_HIGH_RES_TIMERS=y
 CONFIG_IRQ_TIME_ACCOUNTING=y
diff --git a/arch/xtensa/include/asm/asmmacro.h b/arch/xtensa/include/asm/asmmacro.h
index 746dcc8..7f2ae58 100644
--- a/arch/xtensa/include/asm/asmmacro.h
+++ b/arch/xtensa/include/asm/asmmacro.h
@@ -150,5 +150,45 @@
 		__endl	\ar \as
 	.endm
 
+/* Load or store instructions that may cause exceptions use the EX macro. */
+
+#define EX(handler)				\
+	.section __ex_table, "a";		\
+	.word	97f, handler;			\
+	.previous				\
+97:
+
+
+/*
+ * Extract unaligned word that is split between two registers w0 and w1
+ * into r regardless of machine endianness. SAR must be loaded with the
+ * starting bit of the word (see __ssa8).
+ */
+
+	.macro __src_b	r, w0, w1
+#ifdef __XTENSA_EB__
+		src	\r, \w0, \w1
+#else
+		src	\r, \w1, \w0
+#endif
+	.endm
+
+/*
+ * Load 2 lowest address bits of r into SAR for __src_b to extract unaligned
+ * word starting at r from two registers loaded from consecutive aligned
+ * addresses covering r regardless of machine endianness.
+ *
+ *      r   0   1   2   3
+ * LE SAR   0   8  16  24
+ * BE SAR  32  24  16   8
+ */
+
+	.macro __ssa8	r
+#ifdef __XTENSA_EB__
+		ssa8b	\r
+#else
+		ssa8l	\r
+#endif
+	.endm
 
 #endif /* _XTENSA_ASMMACRO_H */
diff --git a/arch/xtensa/include/asm/current.h b/arch/xtensa/include/asm/current.h
index 47e46dc..5d98a7a 100644
--- a/arch/xtensa/include/asm/current.h
+++ b/arch/xtensa/include/asm/current.h
@@ -11,6 +11,8 @@
 #ifndef _XTENSA_CURRENT_H
 #define _XTENSA_CURRENT_H
 
+#include <asm/thread_info.h>
+
 #ifndef __ASSEMBLY__
 
 #include <linux/thread_info.h>
@@ -26,8 +28,6 @@ static inline struct task_struct *get_current(void)
 
 #else
 
-#define CURRENT_SHIFT 13
-
 #define GET_CURRENT(reg,sp)		\
 	GET_THREAD_INFO(reg,sp);	\
 	l32i reg, reg, TI_TASK		\
diff --git a/arch/xtensa/include/asm/fixmap.h b/arch/xtensa/include/asm/fixmap.h
index 0d30403..7e25c1b 100644
--- a/arch/xtensa/include/asm/fixmap.h
+++ b/arch/xtensa/include/asm/fixmap.h
@@ -44,7 +44,7 @@ enum fixed_addresses {
 	__end_of_fixed_addresses
 };
 
-#define FIXADDR_TOP     (VMALLOC_START - PAGE_SIZE)
+#define FIXADDR_TOP     (XCHAL_KSEG_CACHED_VADDR - PAGE_SIZE)
 #define FIXADDR_SIZE	(__end_of_fixed_addresses << PAGE_SHIFT)
 #define FIXADDR_START	((FIXADDR_TOP - FIXADDR_SIZE) & PMD_MASK)
 
@@ -63,7 +63,7 @@ static __always_inline unsigned long fix_to_virt(const unsigned int idx)
 	 * table.
 	 */
 	BUILD_BUG_ON(FIXADDR_START <
-		     XCHAL_PAGE_TABLE_VADDR + XCHAL_PAGE_TABLE_SIZE);
+		     TLBTEMP_BASE_1 + TLBTEMP_SIZE);
 	BUILD_BUG_ON(idx >= __end_of_fixed_addresses);
 	return __fix_to_virt(idx);
 }
diff --git a/arch/xtensa/include/asm/futex.h b/arch/xtensa/include/asm/futex.h
index eaaf1eb..5bfbc1c 100644
--- a/arch/xtensa/include/asm/futex.h
+++ b/arch/xtensa/include/asm/futex.h
@@ -92,7 +92,6 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 			      u32 oldval, u32 newval)
 {
 	int ret = 0;
-	u32 prev;
 
 	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
 		return -EFAULT;
@@ -103,26 +102,24 @@ futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
 
 	__asm__ __volatile__ (
 	"	# futex_atomic_cmpxchg_inatomic\n"
-	"1:	l32i	%1, %3, 0\n"
-	"	mov	%0, %5\n"
-	"	wsr	%1, scompare1\n"
-	"2:	s32c1i	%0, %3, 0\n"
-	"3:\n"
+	"	wsr	%5, scompare1\n"
+	"1:	s32c1i	%1, %4, 0\n"
+	"	s32i	%1, %6, 0\n"
+	"2:\n"
 	"	.section .fixup,\"ax\"\n"
 	"	.align 4\n"
-	"4:	.long	3b\n"
-	"5:	l32r	%1, 4b\n"
-	"	movi	%0, %6\n"
+	"3:	.long	2b\n"
+	"4:	l32r	%1, 3b\n"
+	"	movi	%0, %7\n"
 	"	jx	%1\n"
 	"	.previous\n"
 	"	.section __ex_table,\"a\"\n"
-	"	.long 1b,5b,2b,5b\n"
+	"	.long 1b,4b\n"
 	"	.previous\n"
-	: "+r" (ret), "=&r" (prev), "+m" (*uaddr)
-	: "r" (uaddr), "r" (oldval), "r" (newval), "I" (-EFAULT)
+	: "+r" (ret), "+r" (newval), "+m" (*uaddr), "+m" (*uval)
+	: "r" (uaddr), "r" (oldval), "r" (uval), "I" (-EFAULT)
 	: "memory");
 
-	*uval = prev;
 	return ret;
 }
 
diff --git a/arch/xtensa/include/asm/highmem.h b/arch/xtensa/include/asm/highmem.h
index 6e070db..04e9340 100644
--- a/arch/xtensa/include/asm/highmem.h
+++ b/arch/xtensa/include/asm/highmem.h
@@ -72,7 +72,7 @@ static inline void *kmap(struct page *page)
 	 * page table.
 	 */
 	BUILD_BUG_ON(PKMAP_BASE <
-		     XCHAL_PAGE_TABLE_VADDR + XCHAL_PAGE_TABLE_SIZE);
+		     TLBTEMP_BASE_1 + TLBTEMP_SIZE);
 	BUG_ON(in_interrupt());
 	if (!PageHighMem(page))
 		return page_address(page);
diff --git a/arch/xtensa/include/asm/kasan.h b/arch/xtensa/include/asm/kasan.h
new file mode 100644
index 0000000..54be808
--- /dev/null
+++ b/arch/xtensa/include/asm/kasan.h
@@ -0,0 +1,37 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_KASAN_H
+#define __ASM_KASAN_H
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_KASAN
+
+#include <linux/kernel.h>
+#include <linux/sizes.h>
+#include <asm/kmem_layout.h>
+
+/* Start of area covered by KASAN */
+#define KASAN_START_VADDR __XTENSA_UL_CONST(0x90000000)
+/* Start of the shadow map */
+#define KASAN_SHADOW_START (XCHAL_PAGE_TABLE_VADDR + XCHAL_PAGE_TABLE_SIZE)
+/* Size of the shadow map */
+#define KASAN_SHADOW_SIZE (-KASAN_START_VADDR >> KASAN_SHADOW_SCALE_SHIFT)
+/* Offset for mem to shadow address transformation */
+#define KASAN_SHADOW_OFFSET __XTENSA_UL_CONST(CONFIG_KASAN_SHADOW_OFFSET)
+
+void __init kasan_early_init(void);
+void __init kasan_init(void);
+
+#else
+
+static inline void kasan_early_init(void)
+{
+}
+
+static inline void kasan_init(void)
+{
+}
+
+#endif
+#endif
+#endif
diff --git a/arch/xtensa/include/asm/kmem_layout.h b/arch/xtensa/include/asm/kmem_layout.h
index 561f872..2317c83 100644
--- a/arch/xtensa/include/asm/kmem_layout.h
+++ b/arch/xtensa/include/asm/kmem_layout.h
@@ -71,4 +71,11 @@
 
 #endif
 
+#ifndef CONFIG_KASAN
+#define KERNEL_STACK_SHIFT	13
+#else
+#define KERNEL_STACK_SHIFT	15
+#endif
+#define KERNEL_STACK_SIZE	(1 << KERNEL_STACK_SHIFT)
+
 #endif
diff --git a/arch/xtensa/include/asm/linkage.h b/arch/xtensa/include/asm/linkage.h
new file mode 100644
index 0000000..0ba9973
--- /dev/null
+++ b/arch/xtensa/include/asm/linkage.h
@@ -0,0 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __ASM_LINKAGE_H
+#define __ASM_LINKAGE_H
+
+#define __ALIGN		.align 4
+#define __ALIGN_STR	".align 4"
+
+#endif
diff --git a/arch/xtensa/include/asm/mmu_context.h b/arch/xtensa/include/asm/mmu_context.h
index f7e186d..de5e6cb 100644
--- a/arch/xtensa/include/asm/mmu_context.h
+++ b/arch/xtensa/include/asm/mmu_context.h
@@ -52,6 +52,7 @@ DECLARE_PER_CPU(unsigned long, asid_cache);
 #define ASID_INSERT(x)	(0x03020001 | (((x) & ASID_MASK) << 8))
 
 void init_mmu(void);
+void init_kio(void);
 
 static inline void set_rasid_register (unsigned long val)
 {
diff --git a/arch/xtensa/include/asm/nommu_context.h b/arch/xtensa/include/asm/nommu_context.h
index 2cebdbb..37251b2 100644
--- a/arch/xtensa/include/asm/nommu_context.h
+++ b/arch/xtensa/include/asm/nommu_context.h
@@ -3,6 +3,10 @@ static inline void init_mmu(void)
 {
 }
 
+static inline void init_kio(void)
+{
+}
+
 static inline void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
 {
 }
diff --git a/arch/xtensa/include/asm/page.h b/arch/xtensa/include/asm/page.h
index 4ddbfd5..5d69c11 100644
--- a/arch/xtensa/include/asm/page.h
+++ b/arch/xtensa/include/asm/page.h
@@ -36,8 +36,6 @@
 #define MAX_LOW_PFN	PHYS_PFN(0xfffffffful)
 #endif
 
-#define PGTABLE_START	0x80000000
-
 /*
  * Cache aliasing:
  *
diff --git a/arch/xtensa/include/asm/pgtable.h b/arch/xtensa/include/asm/pgtable.h
index 30dd5b2..3880225 100644
--- a/arch/xtensa/include/asm/pgtable.h
+++ b/arch/xtensa/include/asm/pgtable.h
@@ -12,9 +12,9 @@
 #define _XTENSA_PGTABLE_H
 
 #define __ARCH_USE_5LEVEL_HACK
-#include <asm-generic/pgtable-nopmd.h>
 #include <asm/page.h>
 #include <asm/kmem_layout.h>
+#include <asm-generic/pgtable-nopmd.h>
 
 /*
  * We only use two ring levels, user and kernel space.
@@ -170,6 +170,7 @@
 #define PAGE_SHARED_EXEC \
 	__pgprot(_PAGE_PRESENT | _PAGE_USER | _PAGE_WRITABLE | _PAGE_HW_EXEC)
 #define PAGE_KERNEL	   __pgprot(_PAGE_PRESENT | _PAGE_HW_WRITE)
+#define PAGE_KERNEL_RO	   __pgprot(_PAGE_PRESENT)
 #define PAGE_KERNEL_EXEC   __pgprot(_PAGE_PRESENT|_PAGE_HW_WRITE|_PAGE_HW_EXEC)
 
 #if (DCACHE_WAY_SIZE > PAGE_SIZE)
diff --git a/arch/xtensa/include/asm/ptrace.h b/arch/xtensa/include/asm/ptrace.h
index e2d9c5e..3a5c591 100644
--- a/arch/xtensa/include/asm/ptrace.h
+++ b/arch/xtensa/include/asm/ptrace.h
@@ -10,6 +10,7 @@
 #ifndef _XTENSA_PTRACE_H
 #define _XTENSA_PTRACE_H
 
+#include <asm/kmem_layout.h>
 #include <uapi/asm/ptrace.h>
 
 /*
@@ -38,20 +39,6 @@
  *		+-----------------------+ --------
  */
 
-#define KERNEL_STACK_SIZE (2 * PAGE_SIZE)
-
-/*  Offsets for exception_handlers[] (3 x 64-entries x 4-byte tables). */
-
-#define EXC_TABLE_KSTK		0x004	/* Kernel Stack */
-#define EXC_TABLE_DOUBLE_SAVE	0x008	/* Double exception save area for a0 */
-#define EXC_TABLE_FIXUP		0x00c	/* Fixup handler */
-#define EXC_TABLE_PARAM		0x010	/* For passing a parameter to fixup */
-#define EXC_TABLE_SYSCALL_SAVE	0x014	/* For fast syscall handler */
-#define EXC_TABLE_FAST_USER	0x100	/* Fast user exception handler */
-#define EXC_TABLE_FAST_KERNEL	0x200	/* Fast kernel exception handler */
-#define EXC_TABLE_DEFAULT	0x300	/* Default C-Handler */
-#define EXC_TABLE_SIZE		0x400
-
 #ifndef __ASSEMBLY__
 
 #include <asm/coprocessor.h>
diff --git a/arch/xtensa/include/asm/regs.h b/arch/xtensa/include/asm/regs.h
index 881a113..477594e 100644
--- a/arch/xtensa/include/asm/regs.h
+++ b/arch/xtensa/include/asm/regs.h
@@ -76,6 +76,7 @@
 #define EXCCAUSE_COPROCESSOR5_DISABLED		37
 #define EXCCAUSE_COPROCESSOR6_DISABLED		38
 #define EXCCAUSE_COPROCESSOR7_DISABLED		39
+#define EXCCAUSE_N				64
 
 /*  PS register fields.  */
 
diff --git a/arch/xtensa/include/asm/stackprotector.h b/arch/xtensa/include/asm/stackprotector.h
new file mode 100644
index 0000000..e368f94
--- /dev/null
+++ b/arch/xtensa/include/asm/stackprotector.h
@@ -0,0 +1,40 @@
+/*
+ * GCC stack protector support.
+ *
+ * (This is directly adopted from the ARM implementation)
+ *
+ * Stack protector works by putting predefined pattern at the start of
+ * the stack frame and verifying that it hasn't been overwritten when
+ * returning from the function.  The pattern is called stack canary
+ * and gcc expects it to be defined by a global variable called
+ * "__stack_chk_guard" on Xtensa.  This unfortunately means that on SMP
+ * we cannot have a different canary value per task.
+ */
+
+#ifndef _ASM_STACKPROTECTOR_H
+#define _ASM_STACKPROTECTOR_H 1
+
+#include <linux/random.h>
+#include <linux/version.h>
+
+extern unsigned long __stack_chk_guard;
+
+/*
+ * Initialize the stackprotector canary value.
+ *
+ * NOTE: this must only be called from functions that never return,
+ * and it must always be inlined.
+ */
+static __always_inline void boot_init_stack_canary(void)
+{
+	unsigned long canary;
+
+	/* Try to get a semi random initial value. */
+	get_random_bytes(&canary, sizeof(canary));
+	canary ^= LINUX_VERSION_CODE;
+
+	current->stack_canary = canary;
+	__stack_chk_guard = current->stack_canary;
+}
+
+#endif	/* _ASM_STACKPROTECTOR_H */
diff --git a/arch/xtensa/include/asm/string.h b/arch/xtensa/include/asm/string.h
index 8d5d9df..89b51a0 100644
--- a/arch/xtensa/include/asm/string.h
+++ b/arch/xtensa/include/asm/string.h
@@ -53,7 +53,7 @@ static inline char *strncpy(char *__dest, const char *__src, size_t __n)
 		"bne	%1, %5, 1b\n"
 		"2:"
 		: "=r" (__dest), "=r" (__src), "=&r" (__dummy)
-		: "0" (__dest), "1" (__src), "r" (__src+__n)
+		: "0" (__dest), "1" (__src), "r" ((uintptr_t)__src+__n)
 		: "memory");
 
 	return __xdest;
@@ -101,21 +101,40 @@ static inline int strncmp(const char *__cs, const char *__ct, size_t __n)
 		"2:\n\t"
 		"sub	%2, %2, %3"
 		: "=r" (__cs), "=r" (__ct), "=&r" (__res), "=&r" (__dummy)
-		: "0" (__cs), "1" (__ct), "r" (__cs+__n));
+		: "0" (__cs), "1" (__ct), "r" ((uintptr_t)__cs+__n));
 
 	return __res;
 }
 
 #define __HAVE_ARCH_MEMSET
 extern void *memset(void *__s, int __c, size_t __count);
+extern void *__memset(void *__s, int __c, size_t __count);
 
 #define __HAVE_ARCH_MEMCPY
 extern void *memcpy(void *__to, __const__ void *__from, size_t __n);
+extern void *__memcpy(void *__to, __const__ void *__from, size_t __n);
 
 #define __HAVE_ARCH_MEMMOVE
 extern void *memmove(void *__dest, __const__ void *__src, size_t __n);
+extern void *__memmove(void *__dest, __const__ void *__src, size_t __n);
 
 /* Don't build bcopy at all ...  */
 #define __HAVE_ARCH_BCOPY
 
+#if defined(CONFIG_KASAN) && !defined(__SANITIZE_ADDRESS__)
+
+/*
+ * For files that are not instrumented (e.g. mm/slub.c) we
+ * should use not instrumented version of mem* functions.
+ */
+
+#define memcpy(dst, src, len) __memcpy(dst, src, len)
+#define memmove(dst, src, len) __memmove(dst, src, len)
+#define memset(s, c, n) __memset(s, c, n)
+
+#ifndef __NO_FORTIFY
+#define __NO_FORTIFY /* FORTIFY_SOURCE uses __builtin_memcpy, etc. */
+#endif
+#endif
+
 #endif	/* _XTENSA_STRING_H */
diff --git a/arch/xtensa/include/asm/thread_info.h b/arch/xtensa/include/asm/thread_info.h
index 7be2400..2bd19ae 100644
--- a/arch/xtensa/include/asm/thread_info.h
+++ b/arch/xtensa/include/asm/thread_info.h
@@ -11,7 +11,9 @@
 #ifndef _XTENSA_THREAD_INFO_H
 #define _XTENSA_THREAD_INFO_H
 
-#ifdef __KERNEL__
+#include <asm/kmem_layout.h>
+
+#define CURRENT_SHIFT KERNEL_STACK_SHIFT
 
 #ifndef __ASSEMBLY__
 # include <asm/processor.h>
@@ -77,14 +79,11 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,		\
 }
 
-#define init_thread_info	(init_thread_union.thread_info)
-#define init_stack		(init_thread_union.stack)
-
 /* how to get the thread information struct from C */
 static inline struct thread_info *current_thread_info(void)
 {
 	struct thread_info *ti;
-	 __asm__("extui %0,a1,0,13\n\t"
+	 __asm__("extui %0, a1, 0, "__stringify(CURRENT_SHIFT)"\n\t"
 	         "xor %0, a1, %0" : "=&r" (ti) : );
 	return ti;
 }
@@ -93,7 +92,7 @@ static inline struct thread_info *current_thread_info(void)
 
 /* how to get the thread information struct from ASM */
 #define GET_THREAD_INFO(reg,sp) \
-	extui reg, sp, 0, 13; \
+	extui reg, sp, 0, CURRENT_SHIFT; \
 	xor   reg, sp, reg
 #endif
 
@@ -130,8 +129,7 @@ static inline struct thread_info *current_thread_info(void)
  */
 #define TS_USEDFPU		0x0001	/* FPU was used by this task this quantum (SMP) */
 
-#define THREAD_SIZE 8192	//(2*PAGE_SIZE)
-#define THREAD_SIZE_ORDER 1
+#define THREAD_SIZE KERNEL_STACK_SIZE
+#define THREAD_SIZE_ORDER (KERNEL_STACK_SHIFT - PAGE_SHIFT)
 
-#endif	/* __KERNEL__ */
 #endif	/* _XTENSA_THREAD_INFO */
diff --git a/arch/xtensa/include/asm/traps.h b/arch/xtensa/include/asm/traps.h
index 2e69aa4..f5cd7a7 100644
--- a/arch/xtensa/include/asm/traps.h
+++ b/arch/xtensa/include/asm/traps.h
@@ -13,12 +13,47 @@
 #include <asm/ptrace.h>
 
 /*
+ * Per-CPU exception handling data structure.
+ * EXCSAVE1 points to it.
+ */
+struct exc_table {
+	/* Kernel Stack */
+	void *kstk;
+	/* Double exception save area for a0 */
+	unsigned long double_save;
+	/* Fixup handler */
+	void *fixup;
+	/* For passing a parameter to fixup */
+	void *fixup_param;
+	/* For fast syscall handler */
+	unsigned long syscall_save;
+	/* Fast user exception handlers */
+	void *fast_user_handler[EXCCAUSE_N];
+	/* Fast kernel exception handlers */
+	void *fast_kernel_handler[EXCCAUSE_N];
+	/* Default C-Handlers */
+	void *default_handler[EXCCAUSE_N];
+};
+
+/*
  * handler must be either of the following:
  *  void (*)(struct pt_regs *regs);
  *  void (*)(struct pt_regs *regs, unsigned long exccause);
  */
 extern void * __init trap_set_handler(int cause, void *handler);
 extern void do_unhandled(struct pt_regs *regs, unsigned long exccause);
+void fast_second_level_miss(void);
+
+/* Initialize minimal exc_table structure sufficient for basic paging */
+static inline void __init early_trap_init(void)
+{
+	static struct exc_table exc_table __initdata = {
+		.fast_kernel_handler[EXCCAUSE_DTLB_MISS] =
+			fast_second_level_miss,
+	};
+	__asm__ __volatile__("wsr  %0, excsave1\n" : : "a" (&exc_table));
+}
+
 void secondary_trap_init(void);
 
 static inline void spill_registers(void)
diff --git a/arch/xtensa/include/asm/uaccess.h b/arch/xtensa/include/asm/uaccess.h
index b8f152b..f1158b4 100644
--- a/arch/xtensa/include/asm/uaccess.h
+++ b/arch/xtensa/include/asm/uaccess.h
@@ -44,6 +44,8 @@
 #define __access_ok(addr, size) (__kernel_ok || __user_ok((addr), (size)))
 #define access_ok(type, addr, size) __access_ok((unsigned long)(addr), (size))
 
+#define user_addr_max() (uaccess_kernel() ? ~0UL : TASK_SIZE)
+
 /*
  * These are the main single-value transfer routines.  They
  * automatically use the right size if we just have the right pointer
@@ -261,7 +263,7 @@ raw_copy_to_user(void __user *to, const void *from, unsigned long n)
 static inline unsigned long
 __xtensa_clear_user(void *addr, unsigned long size)
 {
-	if ( ! memset(addr, 0, size) )
+	if (!__memset(addr, 0, size))
 		return size;
 	return 0;
 }
@@ -277,6 +279,8 @@ clear_user(void *addr, unsigned long size)
 #define __clear_user  __xtensa_clear_user
 
 
+#ifndef CONFIG_GENERIC_STRNCPY_FROM_USER
+
 extern long __strncpy_user(char *, const char *, long);
 
 static inline long
@@ -286,6 +290,9 @@ strncpy_from_user(char *dst, const char *src, long count)
 		return __strncpy_user(dst, src, count);
 	return -EFAULT;
 }
+#else
+long strncpy_from_user(char *dst, const char *src, long count);
+#endif
 
 /*
  * Return the size of a string (including the ending 0!)
diff --git a/arch/xtensa/kernel/Makefile b/arch/xtensa/kernel/Makefile
index bb8d557..9190759 100644
--- a/arch/xtensa/kernel/Makefile
+++ b/arch/xtensa/kernel/Makefile
@@ -17,9 +17,6 @@
 obj-$(CONFIG_HAVE_HW_BREAKPOINT) += hw_breakpoint.o
 obj-$(CONFIG_S32C1I_SELFTEST) += s32c1i_selftest.o
 
-AFLAGS_head.o += -mtext-section-literals
-AFLAGS_mxhead.o += -mtext-section-literals
-
 # In the Xtensa architecture, assembly generates literals which must always
 # precede the L32R instruction with a relative offset less than 256 kB.
 # Therefore, the .text and .literal section must be combined in parenthesis
diff --git a/arch/xtensa/kernel/align.S b/arch/xtensa/kernel/align.S
index 890004a..9301452e 100644
--- a/arch/xtensa/kernel/align.S
+++ b/arch/xtensa/kernel/align.S
@@ -19,6 +19,7 @@
 #include <linux/linkage.h>
 #include <asm/current.h>
 #include <asm/asm-offsets.h>
+#include <asm/asmmacro.h>
 #include <asm/processor.h>
 
 #if XCHAL_UNALIGNED_LOAD_EXCEPTION || XCHAL_UNALIGNED_STORE_EXCEPTION
@@ -66,8 +67,6 @@
 #define	INSN_T		24
 #define	INSN_OP1	16
 
-.macro __src_b	r, w0, w1;	src	\r, \w0, \w1;	.endm
-.macro __ssa8	r;		ssa8b	\r;		.endm
 .macro __ssa8r	r;		ssa8l	\r;		.endm
 .macro __sh	r, s;		srl	\r, \s;		.endm
 .macro __sl	r, s;		sll	\r, \s;		.endm
@@ -81,8 +80,6 @@
 #define	INSN_T		4
 #define	INSN_OP1	12
 
-.macro __src_b	r, w0, w1;	src	\r, \w1, \w0;	.endm
-.macro __ssa8	r;		ssa8l	\r;		.endm
 .macro __ssa8r	r;		ssa8b	\r;		.endm
 .macro __sh	r, s;		sll	\r, \s;		.endm
 .macro __sl	r, s;		srl	\r, \s;		.endm
@@ -155,7 +152,7 @@
  *	     <  VALID_DOUBLE_EXCEPTION_ADDRESS: regular exception
  */
 
-
+	.literal_position
 ENTRY(fast_unaligned)
 
 	/* Note: We don't expect the address to be aligned on a word
diff --git a/arch/xtensa/kernel/asm-offsets.c b/arch/xtensa/kernel/asm-offsets.c
index bcb5beb..022cf91 100644
--- a/arch/xtensa/kernel/asm-offsets.c
+++ b/arch/xtensa/kernel/asm-offsets.c
@@ -76,6 +76,9 @@ int main(void)
 	DEFINE(TASK_PID, offsetof (struct task_struct, pid));
 	DEFINE(TASK_THREAD, offsetof (struct task_struct, thread));
 	DEFINE(TASK_THREAD_INFO, offsetof (struct task_struct, stack));
+#ifdef CONFIG_CC_STACKPROTECTOR
+	DEFINE(TASK_STACK_CANARY, offsetof(struct task_struct, stack_canary));
+#endif
 	DEFINE(TASK_STRUCT_SIZE, sizeof (struct task_struct));
 
 	/* offsets in thread_info struct */
@@ -129,5 +132,18 @@ int main(void)
 	       offsetof(struct debug_table, icount_level_save));
 #endif
 
+	/* struct exc_table */
+	DEFINE(EXC_TABLE_KSTK, offsetof(struct exc_table, kstk));
+	DEFINE(EXC_TABLE_DOUBLE_SAVE, offsetof(struct exc_table, double_save));
+	DEFINE(EXC_TABLE_FIXUP, offsetof(struct exc_table, fixup));
+	DEFINE(EXC_TABLE_PARAM, offsetof(struct exc_table, fixup_param));
+	DEFINE(EXC_TABLE_SYSCALL_SAVE,
+	       offsetof(struct exc_table, syscall_save));
+	DEFINE(EXC_TABLE_FAST_USER,
+	       offsetof(struct exc_table, fast_user_handler));
+	DEFINE(EXC_TABLE_FAST_KERNEL,
+	       offsetof(struct exc_table, fast_kernel_handler));
+	DEFINE(EXC_TABLE_DEFAULT, offsetof(struct exc_table, default_handler));
+
 	return 0;
 }
diff --git a/arch/xtensa/kernel/coprocessor.S b/arch/xtensa/kernel/coprocessor.S
index 3a98503..4f8b52d 100644
--- a/arch/xtensa/kernel/coprocessor.S
+++ b/arch/xtensa/kernel/coprocessor.S
@@ -212,8 +212,7 @@
 ENTRY(fast_coprocessor_double)
 
 	wsr	a0, excsave1
-	movi	a0, unrecoverable_exception
-	callx0	a0
+	call0	unrecoverable_exception
 
 ENDPROC(fast_coprocessor_double)
 
diff --git a/arch/xtensa/kernel/entry.S b/arch/xtensa/kernel/entry.S
index 37a2395..5caff07 100644
--- a/arch/xtensa/kernel/entry.S
+++ b/arch/xtensa/kernel/entry.S
@@ -14,6 +14,7 @@
 
 #include <linux/linkage.h>
 #include <asm/asm-offsets.h>
+#include <asm/asmmacro.h>
 #include <asm/processor.h>
 #include <asm/coprocessor.h>
 #include <asm/thread_info.h>
@@ -125,6 +126,7 @@
  *
  * Note: _user_exception might be at an odd address. Don't use call0..call12
  */
+	.literal_position
 
 ENTRY(user_exception)
 
@@ -475,8 +477,7 @@
 1:
 	irq_save a2, a3
 #ifdef CONFIG_TRACE_IRQFLAGS
-	movi	a4, trace_hardirqs_off
-	callx4	a4
+	call4	trace_hardirqs_off
 #endif
 
 	/* Jump if we are returning from kernel exceptions. */
@@ -503,24 +504,20 @@
 	/* Call do_signal() */
 
 #ifdef CONFIG_TRACE_IRQFLAGS
-	movi	a4, trace_hardirqs_on
-	callx4	a4
+	call4	trace_hardirqs_on
 #endif
 	rsil	a2, 0
-	movi	a4, do_notify_resume	# int do_notify_resume(struct pt_regs*)
 	mov	a6, a1
-	callx4	a4
+	call4	do_notify_resume	# int do_notify_resume(struct pt_regs*)
 	j	1b
 
 3:	/* Reschedule */
 
 #ifdef CONFIG_TRACE_IRQFLAGS
-	movi	a4, trace_hardirqs_on
-	callx4	a4
+	call4	trace_hardirqs_on
 #endif
 	rsil	a2, 0
-	movi	a4, schedule	# void schedule (void)
-	callx4	a4
+	call4	schedule	# void schedule (void)
 	j	1b
 
 #ifdef CONFIG_PREEMPT
@@ -531,8 +528,7 @@
 
 	l32i	a4, a2, TI_PRE_COUNT
 	bnez	a4, 4f
-	movi	a4, preempt_schedule_irq
-	callx4	a4
+	call4	preempt_schedule_irq
 	j	1b
 #endif
 
@@ -545,23 +541,20 @@
 5:
 #ifdef CONFIG_HAVE_HW_BREAKPOINT
 	_bbci.l	a4, TIF_DB_DISABLED, 7f
-	movi	a4, restore_dbreak
-	callx4	a4
+	call4	restore_dbreak
 7:
 #endif
 #ifdef CONFIG_DEBUG_TLB_SANITY
 	l32i	a4, a1, PT_DEPC
 	bgeui	a4, VALID_DOUBLE_EXCEPTION_ADDRESS, 4f
-	movi	a4, check_tlb_sanity
-	callx4	a4
+	call4	check_tlb_sanity
 #endif
 6:
 4:
 #ifdef CONFIG_TRACE_IRQFLAGS
 	extui	a4, a3, PS_INTLEVEL_SHIFT, PS_INTLEVEL_WIDTH
 	bgei	a4, LOCKLEVEL, 1f
-	movi	a4, trace_hardirqs_on
-	callx4	a4
+	call4	trace_hardirqs_on
 1:
 #endif
 	/* Restore optional registers. */
@@ -777,6 +770,8 @@
  * When we get here,  a0 is trashed and saved to excsave[debuglevel]
  */
 
+	.literal_position
+
 ENTRY(debug_exception)
 
 	rsr	a0, SREG_EPS + XCHAL_DEBUGLEVEL
@@ -916,6 +911,8 @@
 unrecoverable_text:
 	.ascii "Unrecoverable error in exception handler\0"
 
+	.literal_position
+
 ENTRY(unrecoverable_exception)
 
 	movi	a0, 1
@@ -933,10 +930,8 @@
 	movi	a0, 0
 	addi	a1, a1, PT_REGS_OFFSET
 
-	movi	a4, panic
 	movi	a6, unrecoverable_text
-
-	callx4	a4
+	call4	panic
 
 1:	j	1b
 
@@ -1073,8 +1068,7 @@
 	xsr     a2, depc                # restore a2, depc
 
 	wsr     a0, excsave1
-	movi    a0, unrecoverable_exception
-	callx0  a0
+	call0	unrecoverable_exception
 
 ENDPROC(fast_syscall_unrecoverable)
 
@@ -1101,33 +1095,12 @@
  *	     <  VALID_DOUBLE_EXCEPTION_ADDRESS: regular exception
  *
  * Note: we don't have to save a2; a2 holds the return value
- *
- * We use the two macros TRY and CATCH:
- *
- * TRY	 adds an entry to the __ex_table fixup table for the immediately
- *	 following instruction.
- *
- * CATCH catches any exception that occurred at one of the preceding TRY
- *       statements and continues from there
- *
- * Usage TRY	l32i	a0, a1, 0
- *		<other code>
- *	 done:	rfe
- *	 CATCH	<set return code>
- *		j done
  */
 
+	.literal_position
+
 #ifdef CONFIG_FAST_SYSCALL_XTENSA
 
-#define TRY								\
-	.section __ex_table, "a";					\
-	.word	66f, 67f;						\
-	.text;								\
-66:
-
-#define CATCH								\
-67:
-
 ENTRY(fast_syscall_xtensa)
 
 	s32i	a7, a2, PT_AREG7	# we need an additional register
@@ -1141,9 +1114,9 @@
 
 .Lswp:	/* Atomic compare and swap */
 
-TRY	l32i	a0, a3, 0		# read old value
+EX(.Leac) l32i	a0, a3, 0		# read old value
 	bne	a0, a4, 1f		# same as old value? jump
-TRY	s32i	a5, a3, 0		# different, modify value
+EX(.Leac) s32i	a5, a3, 0		# different, modify value
 	l32i	a7, a2, PT_AREG7	# restore a7
 	l32i	a0, a2, PT_AREG0	# restore a0
 	movi	a2, 1			# and return 1
@@ -1156,12 +1129,12 @@
 
 .Lnswp:	/* Atomic set, add, and exg_add. */
 
-TRY	l32i	a7, a3, 0		# orig
+EX(.Leac) l32i	a7, a3, 0		# orig
 	addi	a6, a6, -SYS_XTENSA_ATOMIC_SET
 	add	a0, a4, a7		# + arg
 	moveqz	a0, a4, a6		# set
 	addi	a6, a6, SYS_XTENSA_ATOMIC_SET
-TRY	s32i	a0, a3, 0		# write new value
+EX(.Leac) s32i	a0, a3, 0		# write new value
 
 	mov	a0, a2
 	mov	a2, a7
@@ -1169,7 +1142,6 @@
 	l32i	a0, a0, PT_AREG0	# restore a0
 	rfe
 
-CATCH
 .Leac:	l32i	a7, a2, PT_AREG7	# restore a7
 	l32i	a0, a2, PT_AREG0	# restore a0
 	movi	a2, -EFAULT
@@ -1411,14 +1383,12 @@
 	rsync
 
 	movi	a6, SIGSEGV
-	movi	a4, do_exit
-	callx4	a4
+	call4	do_exit
 
 	/* shouldn't return, so panic */
 
 	wsr	a0, excsave1
-	movi	a0, unrecoverable_exception
-	callx0	a0		# should not return
+	call0	unrecoverable_exception		# should not return
 1:	j	1b
 
 
@@ -1564,8 +1534,8 @@
 
 ENTRY(fast_second_level_miss_double_kernel)
 
-1:	movi	a0, unrecoverable_exception
-	callx0	a0		# should not return
+1:
+	call0	unrecoverable_exception		# should not return
 1:	j	1b
 
 ENDPROC(fast_second_level_miss_double_kernel)
@@ -1887,6 +1857,7 @@
  * void system_call (struct pt_regs* regs, int exccause)
  *                            a2                 a3
  */
+	.literal_position
 
 ENTRY(system_call)
 
@@ -1896,9 +1867,8 @@
 
 	l32i	a3, a2, PT_AREG2
 	mov	a6, a2
-	movi	a4, do_syscall_trace_enter
 	s32i	a3, a2, PT_SYSCALL
-	callx4	a4
+	call4	do_syscall_trace_enter
 	mov	a3, a6
 
 	/* syscall = sys_call_table[syscall_nr] */
@@ -1930,9 +1900,8 @@
 1:	/* regs->areg[2] = return_value */
 
 	s32i	a6, a2, PT_AREG2
-	movi	a4, do_syscall_trace_leave
 	mov	a6, a2
-	callx4	a4
+	call4	do_syscall_trace_leave
 	retw
 
 ENDPROC(system_call)
@@ -2002,6 +1971,12 @@
 	s32i	a1, a2, THREAD_SP	# save stack pointer
 #endif
 
+#if defined(CONFIG_CC_STACKPROTECTOR) && !defined(CONFIG_SMP)
+	movi	a6, __stack_chk_guard
+	l32i	a8, a3, TASK_STACK_CANARY
+	s32i	a8, a6, 0
+#endif
+
 	/* Disable ints while we manipulate the stack pointer. */
 
 	irq_save a14, a3
@@ -2048,12 +2023,10 @@
 	/* void schedule_tail (struct task_struct *prev)
 	 * Note: prev is still in a6 (return value from fake call4 frame)
 	 */
-	movi	a4, schedule_tail
-	callx4	a4
+	call4	schedule_tail
 
-	movi	a4, do_syscall_trace_leave
 	mov	a6, a1
-	callx4	a4
+	call4	do_syscall_trace_leave
 
 	j	common_exception_return
 
diff --git a/arch/xtensa/kernel/head.S b/arch/xtensa/kernel/head.S
index 23ce62e..9c4e943 100644
--- a/arch/xtensa/kernel/head.S
+++ b/arch/xtensa/kernel/head.S
@@ -264,11 +264,8 @@
 
 	/* init_arch kick-starts the linux kernel */
 
-	movi	a4, init_arch
-	callx4	a4
-
-	movi	a4, start_kernel
-	callx4	a4
+	call4	init_arch
+	call4	start_kernel
 
 should_never_return:
 	j	should_never_return
@@ -294,8 +291,7 @@
 	movi	a6, 0
 	wsr	a6, excsave1
 
-	movi	a4, secondary_start_kernel
-	callx4	a4
+	call4	secondary_start_kernel
 	j	should_never_return
 
 #endif  /* CONFIG_SMP */
diff --git a/arch/xtensa/kernel/module.c b/arch/xtensa/kernel/module.c
index b715237..902845d 100644
--- a/arch/xtensa/kernel/module.c
+++ b/arch/xtensa/kernel/module.c
@@ -22,8 +22,6 @@
 #include <linux/kernel.h>
 #include <linux/cache.h>
 
-#undef DEBUG_RELOCATE
-
 static int
 decode_calln_opcode (unsigned char *location)
 {
@@ -58,10 +56,9 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
 	unsigned char *location;
 	uint32_t value;
 
-#ifdef DEBUG_RELOCATE
-	printk("Applying relocate section %u to %u\n", relsec,
-	       sechdrs[relsec].sh_info);
-#endif
+	pr_debug("Applying relocate section %u to %u\n", relsec,
+		 sechdrs[relsec].sh_info);
+
 	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rela); i++) {
 		location = (char *)sechdrs[sechdrs[relsec].sh_info].sh_addr
 			+ rela[i].r_offset;
@@ -87,7 +84,7 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
 				value -= ((unsigned long)location & -4) + 4;
 				if ((value & 3) != 0 ||
 				    ((value + (1 << 19)) >> 20) != 0) {
-					printk("%s: relocation out of range, "
+					pr_err("%s: relocation out of range, "
 					       "section %d reloc %d "
 					       "sym '%s'\n",
 					       mod->name, relsec, i,
@@ -111,7 +108,7 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
 				value -= (((unsigned long)location + 3) & -4);
 				if ((value & 3) != 0 ||
 				    (signed int)value >> 18 != -1) {
-					printk("%s: relocation out of range, "
+					pr_err("%s: relocation out of range, "
 					       "section %d reloc %d "
 					       "sym '%s'\n",
 					       mod->name, relsec, i,
@@ -156,7 +153,7 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
 		case R_XTENSA_SLOT12_OP:
 		case R_XTENSA_SLOT13_OP:
 		case R_XTENSA_SLOT14_OP:
-			printk("%s: unexpected FLIX relocation: %u\n",
+			pr_err("%s: unexpected FLIX relocation: %u\n",
 			       mod->name,
 			       ELF32_R_TYPE(rela[i].r_info));
 			return -ENOEXEC;
@@ -176,13 +173,13 @@ int apply_relocate_add(Elf32_Shdr *sechdrs,
 		case R_XTENSA_SLOT12_ALT:
 		case R_XTENSA_SLOT13_ALT:
 		case R_XTENSA_SLOT14_ALT:
-			printk("%s: unexpected ALT relocation: %u\n",
+			pr_err("%s: unexpected ALT relocation: %u\n",
 			       mod->name,
 			       ELF32_R_TYPE(rela[i].r_info));
 			return -ENOEXEC;
 
 		default:
-			printk("%s: unexpected relocation: %u\n",
+			pr_err("%s: unexpected relocation: %u\n",
 			       mod->name,
 			       ELF32_R_TYPE(rela[i].r_info));
 			return -ENOEXEC;
diff --git a/arch/xtensa/kernel/pci.c b/arch/xtensa/kernel/pci.c
index 903963e..d981f01 100644
--- a/arch/xtensa/kernel/pci.c
+++ b/arch/xtensa/kernel/pci.c
@@ -29,14 +29,6 @@
 #include <asm/pci-bridge.h>
 #include <asm/platform.h>
 
-#undef DEBUG
-
-#ifdef DEBUG
-#define DBG(x...) printk(x)
-#else
-#define DBG(x...)
-#endif
-
 /* PCI Controller */
 
 
@@ -101,8 +93,8 @@ pcibios_enable_resources(struct pci_dev *dev, int mask)
 	for(idx=0; idx<6; idx++) {
 		r = &dev->resource[idx];
 		if (!r->start && r->end) {
-			printk (KERN_ERR "PCI: Device %s not available because "
-				"of resource collisions\n", pci_name(dev));
+			pr_err("PCI: Device %s not available because "
+			       "of resource collisions\n", pci_name(dev));
 			return -EINVAL;
 		}
 		if (r->flags & IORESOURCE_IO)
@@ -113,7 +105,7 @@ pcibios_enable_resources(struct pci_dev *dev, int mask)
 	if (dev->resource[PCI_ROM_RESOURCE].start)
 		cmd |= PCI_COMMAND_MEMORY;
 	if (cmd != old_cmd) {
-		printk("PCI: Enabling device %s (%04x -> %04x)\n",
+		pr_info("PCI: Enabling device %s (%04x -> %04x)\n",
 			pci_name(dev), old_cmd, cmd);
 		pci_write_config_word(dev, PCI_COMMAND, cmd);
 	}
@@ -144,8 +136,8 @@ static void __init pci_controller_apertures(struct pci_controller *pci_ctrl,
 	res = &pci_ctrl->io_resource;
 	if (!res->flags) {
 		if (io_offset)
-			printk (KERN_ERR "I/O resource not set for host"
-				" bridge %d\n", pci_ctrl->index);
+			pr_err("I/O resource not set for host bridge %d\n",
+			       pci_ctrl->index);
 		res->start = 0;
 		res->end = IO_SPACE_LIMIT;
 		res->flags = IORESOURCE_IO;
@@ -159,8 +151,8 @@ static void __init pci_controller_apertures(struct pci_controller *pci_ctrl,
 		if (!res->flags) {
 			if (i > 0)
 				continue;
-			printk(KERN_ERR "Memory resource not set for "
-			       "host bridge %d\n", pci_ctrl->index);
+			pr_err("Memory resource not set for host bridge %d\n",
+			       pci_ctrl->index);
 			res->start = 0;
 			res->end = ~0U;
 			res->flags = IORESOURCE_MEM;
@@ -176,7 +168,7 @@ static int __init pcibios_init(void)
 	struct pci_bus *bus;
 	int next_busno = 0, ret;
 
-	printk("PCI: Probing PCI hardware\n");
+	pr_info("PCI: Probing PCI hardware\n");
 
 	/* Scan all of the recorded PCI controllers.  */
 	for (pci_ctrl = pci_ctrl_head; pci_ctrl; pci_ctrl = pci_ctrl->next) {
@@ -232,7 +224,7 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
 	for (idx=0; idx<6; idx++) {
 		r = &dev->resource[idx];
 		if (!r->start && r->end) {
-			printk(KERN_ERR "PCI: Device %s not available because "
+			pr_err("PCI: Device %s not available because "
 			       "of resource collisions\n", pci_name(dev));
 			return -EINVAL;
 		}
@@ -242,8 +234,8 @@ int pcibios_enable_device(struct pci_dev *dev, int mask)
 			cmd |= PCI_COMMAND_MEMORY;
 	}
 	if (cmd != old_cmd) {
-		printk("PCI: Enabling device %s (%04x -> %04x)\n",
-		       pci_name(dev), old_cmd, cmd);
+		pr_info("PCI: Enabling device %s (%04x -> %04x)\n",
+			pci_name(dev), old_cmd, cmd);
 		pci_write_config_word(dev, PCI_COMMAND, cmd);
 	}
 
diff --git a/arch/xtensa/kernel/process.c b/arch/xtensa/kernel/process.c
index ff4f0ec..8dd0593 100644
--- a/arch/xtensa/kernel/process.c
+++ b/arch/xtensa/kernel/process.c
@@ -58,6 +58,12 @@ void (*pm_power_off)(void) = NULL;
 EXPORT_SYMBOL(pm_power_off);
 
 
+#ifdef CONFIG_CC_STACKPROTECTOR
+#include <linux/stackprotector.h>
+unsigned long __stack_chk_guard __read_mostly;
+EXPORT_SYMBOL(__stack_chk_guard);
+#endif
+
 #if XTENSA_HAVE_COPROCESSORS
 
 void coprocessor_release_all(struct thread_info *ti)
diff --git a/arch/xtensa/kernel/setup.c b/arch/xtensa/kernel/setup.c
index 08175df..a931af9 100644
--- a/arch/xtensa/kernel/setup.c
+++ b/arch/xtensa/kernel/setup.c
@@ -36,6 +36,7 @@
 #endif
 
 #include <asm/bootparam.h>
+#include <asm/kasan.h>
 #include <asm/mmu_context.h>
 #include <asm/pgtable.h>
 #include <asm/processor.h>
@@ -156,7 +157,7 @@ static int __init parse_bootparam(const bp_tag_t* tag)
 	/* Boot parameters must start with a BP_TAG_FIRST tag. */
 
 	if (tag->id != BP_TAG_FIRST) {
-		printk(KERN_WARNING "Invalid boot parameters!\n");
+		pr_warn("Invalid boot parameters!\n");
 		return 0;
 	}
 
@@ -165,15 +166,14 @@ static int __init parse_bootparam(const bp_tag_t* tag)
 	/* Parse all tags. */
 
 	while (tag != NULL && tag->id != BP_TAG_LAST) {
-	 	for (t = &__tagtable_begin; t < &__tagtable_end; t++) {
+		for (t = &__tagtable_begin; t < &__tagtable_end; t++) {
 			if (tag->id == t->tag) {
 				t->parse(tag);
 				break;
 			}
 		}
 		if (t == &__tagtable_end)
-			printk(KERN_WARNING "Ignoring tag "
-			       "0x%08x\n", tag->id);
+			pr_warn("Ignoring tag 0x%08x\n", tag->id);
 		tag = (bp_tag_t*)((unsigned long)(tag + 1) + tag->size);
 	}
 
@@ -208,6 +208,8 @@ static int __init xtensa_dt_io_area(unsigned long node, const char *uname,
 	/* round down to nearest 256MB boundary */
 	xtensa_kio_paddr &= 0xf0000000;
 
+	init_kio();
+
 	return 1;
 }
 #else
@@ -246,6 +248,14 @@ void __init early_init_devtree(void *params)
 
 void __init init_arch(bp_tag_t *bp_start)
 {
+	/* Initialize MMU. */
+
+	init_mmu();
+
+	/* Initialize initial KASAN shadow map */
+
+	kasan_early_init();
+
 	/* Parse boot parameters */
 
 	if (bp_start)
@@ -263,10 +273,6 @@ void __init init_arch(bp_tag_t *bp_start)
 	/* Early hook for platforms */
 
 	platform_init(bp_start);
-
-	/* Initialize MMU. */
-
-	init_mmu();
 }
 
 /*
@@ -277,13 +283,13 @@ extern char _end[];
 extern char _stext[];
 extern char _WindowVectors_text_start;
 extern char _WindowVectors_text_end;
-extern char _DebugInterruptVector_literal_start;
+extern char _DebugInterruptVector_text_start;
 extern char _DebugInterruptVector_text_end;
-extern char _KernelExceptionVector_literal_start;
+extern char _KernelExceptionVector_text_start;
 extern char _KernelExceptionVector_text_end;
-extern char _UserExceptionVector_literal_start;
+extern char _UserExceptionVector_text_start;
 extern char _UserExceptionVector_text_end;
-extern char _DoubleExceptionVector_literal_start;
+extern char _DoubleExceptionVector_text_start;
 extern char _DoubleExceptionVector_text_end;
 #if XCHAL_EXCM_LEVEL >= 2
 extern char _Level2InterruptVector_text_start;
@@ -317,6 +323,13 @@ static inline int mem_reserve(unsigned long start, unsigned long end)
 
 void __init setup_arch(char **cmdline_p)
 {
+	pr_info("config ID: %08x:%08x\n",
+		get_sr(SREG_EPC), get_sr(SREG_EXCSAVE));
+	if (get_sr(SREG_EPC) != XCHAL_HW_CONFIGID0 ||
+	    get_sr(SREG_EXCSAVE) != XCHAL_HW_CONFIGID1)
+		pr_info("built for config ID: %08x:%08x\n",
+			XCHAL_HW_CONFIGID0, XCHAL_HW_CONFIGID1);
+
 	*cmdline_p = command_line;
 	platform_setup(cmdline_p);
 	strlcpy(boot_command_line, *cmdline_p, COMMAND_LINE_SIZE);
@@ -339,16 +352,16 @@ void __init setup_arch(char **cmdline_p)
 	mem_reserve(__pa(&_WindowVectors_text_start),
 		    __pa(&_WindowVectors_text_end));
 
-	mem_reserve(__pa(&_DebugInterruptVector_literal_start),
+	mem_reserve(__pa(&_DebugInterruptVector_text_start),
 		    __pa(&_DebugInterruptVector_text_end));
 
-	mem_reserve(__pa(&_KernelExceptionVector_literal_start),
+	mem_reserve(__pa(&_KernelExceptionVector_text_start),
 		    __pa(&_KernelExceptionVector_text_end));
 
-	mem_reserve(__pa(&_UserExceptionVector_literal_start),
+	mem_reserve(__pa(&_UserExceptionVector_text_start),
 		    __pa(&_UserExceptionVector_text_end));
 
-	mem_reserve(__pa(&_DoubleExceptionVector_literal_start),
+	mem_reserve(__pa(&_DoubleExceptionVector_text_start),
 		    __pa(&_DoubleExceptionVector_text_end));
 
 #if XCHAL_EXCM_LEVEL >= 2
@@ -380,7 +393,7 @@ void __init setup_arch(char **cmdline_p)
 #endif
 	parse_early_param();
 	bootmem_init();
-
+	kasan_init();
 	unflatten_and_copy_device_tree();
 
 #ifdef CONFIG_SMP
@@ -582,12 +595,14 @@ c_show(struct seq_file *f, void *slot)
 		      "model\t\t: Xtensa " XCHAL_HW_VERSION_NAME "\n"
 		      "core ID\t\t: " XCHAL_CORE_ID "\n"
 		      "build ID\t: 0x%x\n"
+		      "config ID\t: %08x:%08x\n"
 		      "byte order\t: %s\n"
 		      "cpu MHz\t\t: %lu.%02lu\n"
 		      "bogomips\t: %lu.%02lu\n",
 		      num_online_cpus(),
 		      cpumask_pr_args(cpu_online_mask),
 		      XCHAL_BUILD_UNIQUE_ID,
+		      get_sr(SREG_EPC), get_sr(SREG_EXCSAVE),
 		      XCHAL_HAVE_BE ?  "big" : "little",
 		      ccount_freq/1000000,
 		      (ccount_freq/10000) % 100,
diff --git a/arch/xtensa/kernel/signal.c b/arch/xtensa/kernel/signal.c
index d427e78..f88e7a0 100644
--- a/arch/xtensa/kernel/signal.c
+++ b/arch/xtensa/kernel/signal.c
@@ -28,8 +28,6 @@
 #include <asm/coprocessor.h>
 #include <asm/unistd.h>
 
-#define DEBUG_SIG  0
-
 extern struct task_struct *coproc_owners[];
 
 struct rt_sigframe
@@ -399,10 +397,8 @@ static int setup_frame(struct ksignal *ksig, sigset_t *set,
 	regs->areg[8] = (unsigned long) &frame->uc;
 	regs->threadptr = tp;
 
-#if DEBUG_SIG
-	printk("SIG rt deliver (%s:%d): signal=%d sp=%p pc=%08x\n",
-		current->comm, current->pid, sig, frame, regs->pc);
-#endif
+	pr_debug("SIG rt deliver (%s:%d): signal=%d sp=%p pc=%08lx\n",
+		 current->comm, current->pid, sig, frame, regs->pc);
 
 	return 0;
 }
diff --git a/arch/xtensa/kernel/traps.c b/arch/xtensa/kernel/traps.c
index bae697a..32c5207 100644
--- a/arch/xtensa/kernel/traps.c
+++ b/arch/xtensa/kernel/traps.c
@@ -33,6 +33,7 @@
 #include <linux/kallsyms.h>
 #include <linux/delay.h>
 #include <linux/hardirq.h>
+#include <linux/ratelimit.h>
 
 #include <asm/stacktrace.h>
 #include <asm/ptrace.h>
@@ -158,8 +159,7 @@ COPROCESSOR(7),
  * 2. it is a temporary memory buffer for the exception handlers.
  */
 
-DEFINE_PER_CPU(unsigned long, exc_table[EXC_TABLE_SIZE/4]);
-
+DEFINE_PER_CPU(struct exc_table, exc_table);
 DEFINE_PER_CPU(struct debug_table, debug_table);
 
 void die(const char*, struct pt_regs*, long);
@@ -178,13 +178,14 @@ __die_if_kernel(const char *str, struct pt_regs *regs, long err)
 void do_unhandled(struct pt_regs *regs, unsigned long exccause)
 {
 	__die_if_kernel("Caught unhandled exception - should not happen",
-	    		regs, SIGKILL);
+			regs, SIGKILL);
 
 	/* If in user mode, send SIGILL signal to current process */
-	printk("Caught unhandled exception in '%s' "
-	       "(pid = %d, pc = %#010lx) - should not happen\n"
-	       "\tEXCCAUSE is %ld\n",
-	       current->comm, task_pid_nr(current), regs->pc, exccause);
+	pr_info_ratelimited("Caught unhandled exception in '%s' "
+			    "(pid = %d, pc = %#010lx) - should not happen\n"
+			    "\tEXCCAUSE is %ld\n",
+			    current->comm, task_pid_nr(current), regs->pc,
+			    exccause);
 	force_sig(SIGILL, current);
 }
 
@@ -305,8 +306,8 @@ do_illegal_instruction(struct pt_regs *regs)
 
 	/* If in user mode, send SIGILL signal to current process. */
 
-	printk("Illegal Instruction in '%s' (pid = %d, pc = %#010lx)\n",
-	    current->comm, task_pid_nr(current), regs->pc);
+	pr_info_ratelimited("Illegal Instruction in '%s' (pid = %d, pc = %#010lx)\n",
+			    current->comm, task_pid_nr(current), regs->pc);
 	force_sig(SIGILL, current);
 }
 
@@ -325,13 +326,14 @@ do_unaligned_user (struct pt_regs *regs)
 	siginfo_t info;
 
 	__die_if_kernel("Unhandled unaligned exception in kernel",
-	    		regs, SIGKILL);
+			regs, SIGKILL);
 
 	current->thread.bad_vaddr = regs->excvaddr;
 	current->thread.error_code = -3;
-	printk("Unaligned memory access to %08lx in '%s' "
-	       "(pid = %d, pc = %#010lx)\n",
-	       regs->excvaddr, current->comm, task_pid_nr(current), regs->pc);
+	pr_info_ratelimited("Unaligned memory access to %08lx in '%s' "
+			    "(pid = %d, pc = %#010lx)\n",
+			    regs->excvaddr, current->comm,
+			    task_pid_nr(current), regs->pc);
 	info.si_signo = SIGBUS;
 	info.si_errno = 0;
 	info.si_code = BUS_ADRALN;
@@ -365,28 +367,28 @@ do_debug(struct pt_regs *regs)
 }
 
 
-static void set_handler(int idx, void *handler)
-{
-	unsigned int cpu;
-
-	for_each_possible_cpu(cpu)
-		per_cpu(exc_table, cpu)[idx] = (unsigned long)handler;
-}
+#define set_handler(type, cause, handler)				\
+	do {								\
+		unsigned int cpu;					\
+									\
+		for_each_possible_cpu(cpu)				\
+			per_cpu(exc_table, cpu).type[cause] = (handler);\
+	} while (0)
 
 /* Set exception C handler - for temporary use when probing exceptions */
 
 void * __init trap_set_handler(int cause, void *handler)
 {
-	void *previous = (void *)per_cpu(exc_table, 0)[
-		EXC_TABLE_DEFAULT / 4 + cause];
-	set_handler(EXC_TABLE_DEFAULT / 4 + cause, handler);
+	void *previous = per_cpu(exc_table, 0).default_handler[cause];
+
+	set_handler(default_handler, cause, handler);
 	return previous;
 }
 
 
 static void trap_init_excsave(void)
 {
-	unsigned long excsave1 = (unsigned long)this_cpu_ptr(exc_table);
+	unsigned long excsave1 = (unsigned long)this_cpu_ptr(&exc_table);
 	__asm__ __volatile__("wsr  %0, excsave1\n" : : "a" (excsave1));
 }
 
@@ -418,10 +420,10 @@ void __init trap_init(void)
 
 	/* Setup default vectors. */
 
-	for(i = 0; i < 64; i++) {
-		set_handler(EXC_TABLE_FAST_USER/4   + i, user_exception);
-		set_handler(EXC_TABLE_FAST_KERNEL/4 + i, kernel_exception);
-		set_handler(EXC_TABLE_DEFAULT/4 + i, do_unhandled);
+	for (i = 0; i < EXCCAUSE_N; i++) {
+		set_handler(fast_user_handler, i, user_exception);
+		set_handler(fast_kernel_handler, i, kernel_exception);
+		set_handler(default_handler, i, do_unhandled);
 	}
 
 	/* Setup specific handlers. */
@@ -433,11 +435,11 @@ void __init trap_init(void)
 		void *handler = dispatch_init_table[i].handler;
 
 		if (fast == 0)
-			set_handler (EXC_TABLE_DEFAULT/4 + cause, handler);
+			set_handler(default_handler, cause, handler);
 		if (fast && fast & USER)
-			set_handler (EXC_TABLE_FAST_USER/4 + cause, handler);
+			set_handler(fast_user_handler, cause, handler);
 		if (fast && fast & KRNL)
-			set_handler (EXC_TABLE_FAST_KERNEL/4 + cause, handler);
+			set_handler(fast_kernel_handler, cause, handler);
 	}
 
 	/* Initialize EXCSAVE_1 to hold the address of the exception table. */
diff --git a/arch/xtensa/kernel/vectors.S b/arch/xtensa/kernel/vectors.S
index 332e9d6..841503d 100644
--- a/arch/xtensa/kernel/vectors.S
+++ b/arch/xtensa/kernel/vectors.S
@@ -205,9 +205,6 @@
  */
 
 	.section .DoubleExceptionVector.text, "ax"
-	.begin literal_prefix .DoubleExceptionVector
-	.globl _DoubleExceptionVector_WindowUnderflow
-	.globl _DoubleExceptionVector_WindowOverflow
 
 ENTRY(_DoubleExceptionVector)
 
@@ -217,8 +214,12 @@
 	/* Check for kernel double exception (usually fatal). */
 
 	rsr	a2, ps
-	_bbci.l	a2, PS_UM_BIT, .Lksp
+	_bbsi.l	a2, PS_UM_BIT, 1f
+	j	.Lksp
 
+	.align	4
+	.literal_position
+1:
 	/* Check if we are currently handling a window exception. */
 	/* Note: We don't need to indicate that we enter a critical section. */
 
@@ -304,8 +305,7 @@
 .Lunrecoverable:
 	rsr	a3, excsave1
 	wsr	a0, excsave1
-	movi	a0, unrecoverable_exception
-	callx0	a0
+	call0	unrecoverable_exception
 
 .Lfixup:/* Check for a fixup handler or if we were in a critical section. */
 
@@ -475,11 +475,8 @@
 	rotw	-3
 	j	1b
 
-
 ENDPROC(_DoubleExceptionVector)
 
-	.end literal_prefix
-
 	.text
 /*
  * Fixup handler for TLB miss in double exception handler for window owerflow.
@@ -508,6 +505,8 @@
  * a3: exctable, original value in excsave1
  */
 
+	.literal_position
+
 ENTRY(window_overflow_restore_a0_fixup)
 
 	rsr	a0, ps
diff --git a/arch/xtensa/kernel/vmlinux.lds.S b/arch/xtensa/kernel/vmlinux.lds.S
index 162c77e..70b731e 100644
--- a/arch/xtensa/kernel/vmlinux.lds.S
+++ b/arch/xtensa/kernel/vmlinux.lds.S
@@ -45,24 +45,16 @@
 	LONG(sym ## _end);			\
 	LONG(LOADADDR(section))
 
-/* Macro to define a section for a vector.
- *
- * Use of the MIN function catches the types of errors illustrated in
- * the following example:
- *
- * Assume the section .DoubleExceptionVector.literal is completely
- * full.  Then a programmer adds code to .DoubleExceptionVector.text
- * that produces another literal.  The final literal position will
- * overlay onto the first word of the adjacent code section
- * .DoubleExceptionVector.text.  (In practice, the literals will
- * overwrite the code, and the first few instructions will be
- * garbage.)
+/*
+ * Macro to define a section for a vector. When CONFIG_VECTORS_OFFSET is
+ * defined code for every vector is located with other init data. At startup
+ * time head.S copies code for every vector to its final position according
+ * to description recorded in the corresponding RELOCATE_ENTRY.
  */
 
 #ifdef CONFIG_VECTORS_OFFSET
-#define SECTION_VECTOR(sym, section, addr, max_prevsec_size, prevsec)       \
-  section addr : AT((MIN(LOADADDR(prevsec) + max_prevsec_size,		    \
-		         LOADADDR(prevsec) + SIZEOF(prevsec)) + 3) & ~ 3)   \
+#define SECTION_VECTOR(sym, section, addr, prevsec)                         \
+  section addr : AT(((LOADADDR(prevsec) + SIZEOF(prevsec)) + 3) & ~ 3)      \
   {									    \
     . = ALIGN(4);							    \
     sym ## _start = ABSOLUTE(.);		 			    \
@@ -112,26 +104,19 @@
 #if XCHAL_EXCM_LEVEL >= 6
   SECTION_VECTOR (.Level6InterruptVector.text, INTLEVEL6_VECTOR_VADDR)
 #endif
-  SECTION_VECTOR (.DebugInterruptVector.literal, DEBUG_VECTOR_VADDR - 4)
   SECTION_VECTOR (.DebugInterruptVector.text, DEBUG_VECTOR_VADDR)
-  SECTION_VECTOR (.KernelExceptionVector.literal, KERNEL_VECTOR_VADDR - 4)
   SECTION_VECTOR (.KernelExceptionVector.text, KERNEL_VECTOR_VADDR)
-  SECTION_VECTOR (.UserExceptionVector.literal, USER_VECTOR_VADDR - 4)
   SECTION_VECTOR (.UserExceptionVector.text, USER_VECTOR_VADDR)
-  SECTION_VECTOR (.DoubleExceptionVector.literal, DOUBLEEXC_VECTOR_VADDR - 20)
   SECTION_VECTOR (.DoubleExceptionVector.text, DOUBLEEXC_VECTOR_VADDR)
 #endif
 
+    IRQENTRY_TEXT
+    SOFTIRQENTRY_TEXT
+    ENTRY_TEXT
     TEXT_TEXT
-    VMLINUX_SYMBOL(__sched_text_start) = .;
-    *(.sched.literal .sched.text)
-    VMLINUX_SYMBOL(__sched_text_end) = .;
-    VMLINUX_SYMBOL(__cpuidle_text_start) = .;
-    *(.cpuidle.literal .cpuidle.text)
-    VMLINUX_SYMBOL(__cpuidle_text_end) = .;
-    VMLINUX_SYMBOL(__lock_text_start) = .;
-    *(.spinlock.literal .spinlock.text)
-    VMLINUX_SYMBOL(__lock_text_end) = .;
+    SCHED_TEXT
+    CPUIDLE_TEXT
+    LOCK_TEXT
 
   }
   _etext = .;
@@ -196,8 +181,6 @@
 		   .KernelExceptionVector.text);
     RELOCATE_ENTRY(_UserExceptionVector_text,
 		   .UserExceptionVector.text);
-    RELOCATE_ENTRY(_DoubleExceptionVector_literal,
-		   .DoubleExceptionVector.literal);
     RELOCATE_ENTRY(_DoubleExceptionVector_text,
 		   .DoubleExceptionVector.text);
     RELOCATE_ENTRY(_DebugInterruptVector_text,
@@ -230,25 +213,19 @@
 
   SECTION_VECTOR (_WindowVectors_text,
 		  .WindowVectors.text,
-		  WINDOW_VECTORS_VADDR, 4,
+		  WINDOW_VECTORS_VADDR,
 		  .dummy)
-  SECTION_VECTOR (_DebugInterruptVector_literal,
-		  .DebugInterruptVector.literal,
-		  DEBUG_VECTOR_VADDR - 4,
-		  SIZEOF(.WindowVectors.text),
-		  .WindowVectors.text)
   SECTION_VECTOR (_DebugInterruptVector_text,
 		  .DebugInterruptVector.text,
 		  DEBUG_VECTOR_VADDR,
-		  4,
-		  .DebugInterruptVector.literal)
+		  .WindowVectors.text)
 #undef LAST
 #define LAST	.DebugInterruptVector.text
 #if XCHAL_EXCM_LEVEL >= 2
   SECTION_VECTOR (_Level2InterruptVector_text,
 		  .Level2InterruptVector.text,
 		  INTLEVEL2_VECTOR_VADDR,
-		  SIZEOF(LAST), LAST)
+		  LAST)
 # undef LAST
 # define LAST	.Level2InterruptVector.text
 #endif
@@ -256,7 +233,7 @@
   SECTION_VECTOR (_Level3InterruptVector_text,
 		  .Level3InterruptVector.text,
 		  INTLEVEL3_VECTOR_VADDR,
-		  SIZEOF(LAST), LAST)
+		  LAST)
 # undef LAST
 # define LAST	.Level3InterruptVector.text
 #endif
@@ -264,7 +241,7 @@
   SECTION_VECTOR (_Level4InterruptVector_text,
 		  .Level4InterruptVector.text,
 		  INTLEVEL4_VECTOR_VADDR,
-		  SIZEOF(LAST), LAST)
+		  LAST)
 # undef LAST
 # define LAST	.Level4InterruptVector.text
 #endif
@@ -272,7 +249,7 @@
   SECTION_VECTOR (_Level5InterruptVector_text,
 		  .Level5InterruptVector.text,
 		  INTLEVEL5_VECTOR_VADDR,
-		  SIZEOF(LAST), LAST)
+		  LAST)
 # undef LAST
 # define LAST	.Level5InterruptVector.text
 #endif
@@ -280,40 +257,23 @@
   SECTION_VECTOR (_Level6InterruptVector_text,
 		  .Level6InterruptVector.text,
 		  INTLEVEL6_VECTOR_VADDR,
-		  SIZEOF(LAST), LAST)
+		  LAST)
 # undef LAST
 # define LAST	.Level6InterruptVector.text
 #endif
-  SECTION_VECTOR (_KernelExceptionVector_literal,
-		  .KernelExceptionVector.literal,
-		  KERNEL_VECTOR_VADDR - 4,
-		  SIZEOF(LAST), LAST)
-#undef LAST
   SECTION_VECTOR (_KernelExceptionVector_text,
 		  .KernelExceptionVector.text,
 		  KERNEL_VECTOR_VADDR,
-		  4,
-		  .KernelExceptionVector.literal)
-  SECTION_VECTOR (_UserExceptionVector_literal,
-		  .UserExceptionVector.literal,
-		  USER_VECTOR_VADDR - 4,
-		  SIZEOF(.KernelExceptionVector.text),
-		  .KernelExceptionVector.text)
+		  LAST)
+#undef LAST
   SECTION_VECTOR (_UserExceptionVector_text,
 		  .UserExceptionVector.text,
 		  USER_VECTOR_VADDR,
-		  4,
-		  .UserExceptionVector.literal)
-  SECTION_VECTOR (_DoubleExceptionVector_literal,
-		  .DoubleExceptionVector.literal,
-		  DOUBLEEXC_VECTOR_VADDR - 20,
-		  SIZEOF(.UserExceptionVector.text),
-		  .UserExceptionVector.text)
+		  .KernelExceptionVector.text)
   SECTION_VECTOR (_DoubleExceptionVector_text,
 		  .DoubleExceptionVector.text,
 		  DOUBLEEXC_VECTOR_VADDR,
-		  20,
-		  .DoubleExceptionVector.literal)
+		  .UserExceptionVector.text)
 
   . = (LOADADDR( .DoubleExceptionVector.text ) + SIZEOF( .DoubleExceptionVector.text ) + 3) & ~ 3;
 
@@ -323,7 +283,6 @@
   SECTION_VECTOR (_SecondaryResetVector_text,
 		  .SecondaryResetVector.text,
 		  RESET_VECTOR1_VADDR,
-		  SIZEOF(.DoubleExceptionVector.text),
 		  .DoubleExceptionVector.text)
 
   . = LOADADDR(.SecondaryResetVector.text)+SIZEOF(.SecondaryResetVector.text);
@@ -373,5 +332,4 @@
 
   /* Sections to be discarded */
   DISCARDS
-  /DISCARD/ : { *(.exit.literal) }
 }
diff --git a/arch/xtensa/kernel/xtensa_ksyms.c b/arch/xtensa/kernel/xtensa_ksyms.c
index 6723910..04f19de 100644
--- a/arch/xtensa/kernel/xtensa_ksyms.c
+++ b/arch/xtensa/kernel/xtensa_ksyms.c
@@ -41,7 +41,12 @@
 EXPORT_SYMBOL(memset);
 EXPORT_SYMBOL(memcpy);
 EXPORT_SYMBOL(memmove);
+EXPORT_SYMBOL(__memset);
+EXPORT_SYMBOL(__memcpy);
+EXPORT_SYMBOL(__memmove);
+#ifndef CONFIG_GENERIC_STRNCPY_FROM_USER
 EXPORT_SYMBOL(__strncpy_user);
+#endif
 EXPORT_SYMBOL(clear_page);
 EXPORT_SYMBOL(copy_page);
 
diff --git a/arch/xtensa/lib/checksum.S b/arch/xtensa/lib/checksum.S
index 4eb573d..528fe0d 100644
--- a/arch/xtensa/lib/checksum.S
+++ b/arch/xtensa/lib/checksum.S
@@ -14,9 +14,10 @@
  *		2 of the License, or (at your option) any later version.
  */
 
-#include <asm/errno.h>
+#include <linux/errno.h>
 #include <linux/linkage.h>
 #include <variant/core.h>
+#include <asm/asmmacro.h>
 
 /*
  * computes a partial checksum, e.g. for TCP/UDP fragments
@@ -175,23 +176,8 @@
 
 /*
  * Copy from ds while checksumming, otherwise like csum_partial
- *
- * The macros SRC and DST specify the type of access for the instruction.
- * thus we can call a custom exception handler for each access type.
  */
 
-#define SRC(y...)			\
-	9999: y;			\
-	.section __ex_table, "a";	\
-	.long 9999b, 6001f	;	\
-	.previous
-
-#define DST(y...)			\
-	9999: y;			\
-	.section __ex_table, "a";	\
-	.long 9999b, 6002f	;	\
-	.previous
-
 /*
 unsigned int csum_partial_copy_generic (const char *src, char *dst, int len,
 					int sum, int *src_err_ptr, int *dst_err_ptr)
@@ -244,28 +230,28 @@
 	add	a10, a10, a2	/* a10 = end of last 32-byte src chunk */
 .Loop5:
 #endif
-SRC(	l32i	a9, a2, 0	)
-SRC(	l32i	a8, a2, 4	)
-DST(	s32i	a9, a3, 0	)
-DST(	s32i	a8, a3, 4	)
+EX(10f)	l32i	a9, a2, 0
+EX(10f)	l32i	a8, a2, 4
+EX(11f)	s32i	a9, a3, 0
+EX(11f)	s32i	a8, a3, 4
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
-SRC(	l32i	a9, a2, 8	)
-SRC(	l32i	a8, a2, 12	)
-DST(	s32i	a9, a3, 8	)
-DST(	s32i	a8, a3, 12	)
+EX(10f)	l32i	a9, a2, 8
+EX(10f)	l32i	a8, a2, 12
+EX(11f)	s32i	a9, a3, 8
+EX(11f)	s32i	a8, a3, 12
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
-SRC(	l32i	a9, a2, 16	)
-SRC(	l32i	a8, a2, 20	)
-DST(	s32i	a9, a3, 16	)
-DST(	s32i	a8, a3, 20	)
+EX(10f)	l32i	a9, a2, 16
+EX(10f)	l32i	a8, a2, 20
+EX(11f)	s32i	a9, a3, 16
+EX(11f)	s32i	a8, a3, 20
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
-SRC(	l32i	a9, a2, 24	)
-SRC(	l32i	a8, a2, 28	)
-DST(	s32i	a9, a3, 24	)
-DST(	s32i	a8, a3, 28	)
+EX(10f)	l32i	a9, a2, 24
+EX(10f)	l32i	a8, a2, 28
+EX(11f)	s32i	a9, a3, 24
+EX(11f)	s32i	a8, a3, 28
 	ONES_ADD(a5, a9)
 	ONES_ADD(a5, a8)
 	addi	a2, a2, 32
@@ -284,8 +270,8 @@
 	add	a10, a10, a2	/* a10 = end of last 4-byte src chunk */
 .Loop6:
 #endif
-SRC(	l32i	a9, a2, 0	)
-DST(	s32i	a9, a3, 0	)
+EX(10f)	l32i	a9, a2, 0
+EX(11f)	s32i	a9, a3, 0
 	ONES_ADD(a5, a9)
 	addi	a2, a2, 4
 	addi	a3, a3, 4
@@ -315,8 +301,8 @@
 	add	a10, a10, a2	/* a10 = end of last 2-byte src chunk */
 .Loop7:
 #endif
-SRC(	l16ui	a9, a2, 0	)
-DST(	s16i	a9, a3, 0	)
+EX(10f)	l16ui	a9, a2, 0
+EX(11f)	s16i	a9, a3, 0
 	ONES_ADD(a5, a9)
 	addi	a2, a2, 2
 	addi	a3, a3, 2
@@ -326,8 +312,8 @@
 4:
 	/* This section processes a possible trailing odd byte. */
 	_bbci.l	a4, 0, 8f	/* 1-byte chunk */
-SRC(	l8ui	a9, a2, 0	)
-DST(	s8i	a9, a3, 0	)
+EX(10f)	l8ui	a9, a2, 0
+EX(11f)	s8i	a9, a3, 0
 #ifdef __XTENSA_EB__
 	slli	a9, a9, 8	/* shift byte to bits 8..15 */
 #endif
@@ -350,10 +336,10 @@
 	add	a10, a10, a2	/* a10 = end of last odd-aligned, 2-byte src chunk */
 .Loop8:
 #endif
-SRC(	l8ui	a9, a2, 0	)
-SRC(	l8ui	a8, a2, 1	)
-DST(	s8i	a9, a3, 0	)
-DST(	s8i	a8, a3, 1	)
+EX(10f)	l8ui	a9, a2, 0
+EX(10f)	l8ui	a8, a2, 1
+EX(11f)	s8i	a9, a3, 0
+EX(11f)	s8i	a8, a3, 1
 #ifdef __XTENSA_EB__
 	slli	a9, a9, 8	/* combine into a single 16-bit value */
 #else				/* for checksum computation */
@@ -381,7 +367,7 @@
 	a12 = original dst for exception handling
 */
 
-6001:
+10:
 	_movi	a2, -EFAULT
 	s32i	a2, a6, 0	/* src_err_ptr */
 
@@ -403,7 +389,7 @@
 2:
 	retw
 
-6002:
+11:
 	movi	a2, -EFAULT
 	s32i	a2, a7, 0	/* dst_err_ptr */
 	movi	a2, 0
diff --git a/arch/xtensa/lib/memcopy.S b/arch/xtensa/lib/memcopy.S
index b1c219a..c0f6981 100644
--- a/arch/xtensa/lib/memcopy.S
+++ b/arch/xtensa/lib/memcopy.S
@@ -9,23 +9,9 @@
  * Copyright (C) 2002 - 2012 Tensilica Inc.
  */
 
+#include <linux/linkage.h>
 #include <variant/core.h>
-
-	.macro	src_b	r, w0, w1
-#ifdef __XTENSA_EB__
-	src	\r, \w0, \w1
-#else
-	src	\r, \w1, \w0
-#endif
-	.endm
-
-	.macro	ssa8	r
-#ifdef __XTENSA_EB__
-	ssa8b	\r
-#else
-	ssa8l	\r
-#endif
-	.endm
+#include <asm/asmmacro.h>
 
 /*
  * void *memcpy(void *dst, const void *src, size_t len);
@@ -123,10 +109,8 @@
 	addi	a5, a5,  2
 	j	.Ldstaligned	# dst is now aligned, return to main algorithm
 
-	.align	4
-	.global	memcpy
-	.type   memcpy,@function
-memcpy:
+ENTRY(__memcpy)
+WEAK(memcpy)
 
 	entry	sp, 16		# minimal stack frame
 	# a2/ dst, a3/ src, a4/ len
@@ -209,7 +193,7 @@
 .Lsrcunaligned:
 	_beqz	a4, .Ldone	# avoid loading anything for zero-length copies
 	# copy 16 bytes per iteration for word-aligned dst and unaligned src
-	ssa8	a3		# set shift amount from byte offset
+	__ssa8	a3		# set shift amount from byte offset
 
 /* set to 1 when running on ISS (simulator) with the
    lint or ferret client, or 0 to save a few cycles */
@@ -229,16 +213,16 @@
 .Loop2:
 	l32i	a7, a3,  4
 	l32i	a8, a3,  8
-	src_b	a6, a6, a7
+	__src_b	a6, a6, a7
 	s32i	a6, a5,  0
 	l32i	a9, a3, 12
-	src_b	a7, a7, a8
+	__src_b	a7, a7, a8
 	s32i	a7, a5,  4
 	l32i	a6, a3, 16
-	src_b	a8, a8, a9
+	__src_b	a8, a8, a9
 	s32i	a8, a5,  8
 	addi	a3, a3, 16
-	src_b	a9, a9, a6
+	__src_b	a9, a9, a6
 	s32i	a9, a5, 12
 	addi	a5, a5, 16
 #if !XCHAL_HAVE_LOOPS
@@ -249,10 +233,10 @@
 	# copy 8 bytes
 	l32i	a7, a3,  4
 	l32i	a8, a3,  8
-	src_b	a6, a6, a7
+	__src_b	a6, a6, a7
 	s32i	a6, a5,  0
 	addi	a3, a3,  8
-	src_b	a7, a7, a8
+	__src_b	a7, a7, a8
 	s32i	a7, a5,  4
 	addi	a5, a5,  8
 	mov	a6, a8
@@ -261,7 +245,7 @@
 	# copy 4 bytes
 	l32i	a7, a3,  4
 	addi	a3, a3,  4
-	src_b	a6, a6, a7
+	__src_b	a6, a6, a7
 	s32i	a6, a5,  0
 	addi	a5, a5,  4
 	mov	a6, a7
@@ -288,14 +272,14 @@
 	s8i	a6, a5,  0
 	retw
 
+ENDPROC(__memcpy)
 
 /*
  * void bcopy(const void *src, void *dest, size_t n);
  */
-	.align	4
-	.global	bcopy
-	.type   bcopy,@function
-bcopy:
+
+ENTRY(bcopy)
+
 	entry	sp, 16		# minimal stack frame
 	# a2=src, a3=dst, a4=len
 	mov	a5, a3
@@ -303,6 +287,8 @@
 	mov	a2, a5
 	j	.Lmovecommon	# go to common code for memmove+bcopy
 
+ENDPROC(bcopy)
+
 /*
  * void *memmove(void *dst, const void *src, size_t len);
  *
@@ -391,10 +377,8 @@
 	j	.Lbackdstaligned	# dst is now aligned,
 					# return to main algorithm
 
-	.align	4
-	.global	memmove
-	.type   memmove,@function
-memmove:
+ENTRY(__memmove)
+WEAK(memmove)
 
 	entry	sp, 16		# minimal stack frame
 	# a2/ dst, a3/ src, a4/ len
@@ -485,7 +469,7 @@
 .Lbacksrcunaligned:
 	_beqz	a4, .Lbackdone	# avoid loading anything for zero-length copies
 	# copy 16 bytes per iteration for word-aligned dst and unaligned src
-	ssa8	a3		# set shift amount from byte offset
+	__ssa8	a3		# set shift amount from byte offset
 #define SIM_CHECKS_ALIGNMENT	1	/* set to 1 when running on ISS with
 					 * the lint or ferret client, or 0
 					 * to save a few cycles */
@@ -506,15 +490,15 @@
 	l32i	a7, a3, 12
 	l32i	a8, a3,  8
 	addi	a5, a5, -16
-	src_b	a6, a7, a6
+	__src_b	a6, a7, a6
 	s32i	a6, a5, 12
 	l32i	a9, a3,  4
-	src_b	a7, a8, a7
+	__src_b	a7, a8, a7
 	s32i	a7, a5,  8
 	l32i	a6, a3,  0
-	src_b	a8, a9, a8
+	__src_b	a8, a9, a8
 	s32i	a8, a5,  4
-	src_b	a9, a6, a9
+	__src_b	a9, a6, a9
 	s32i	a9, a5,  0
 #if !XCHAL_HAVE_LOOPS
 	bne	a3, a10, .backLoop2 # continue loop if a3:src != a10:src_start
@@ -526,9 +510,9 @@
 	l32i	a7, a3,  4
 	l32i	a8, a3,  0
 	addi	a5, a5, -8
-	src_b	a6, a7, a6
+	__src_b	a6, a7, a6
 	s32i	a6, a5,  4
-	src_b	a7, a8, a7
+	__src_b	a7, a8, a7
 	s32i	a7, a5,  0
 	mov	a6, a8
 .Lback12:
@@ -537,7 +521,7 @@
 	addi	a3, a3, -4
 	l32i	a7, a3,  0
 	addi	a5, a5, -4
-	src_b	a6, a7, a6
+	__src_b	a6, a7, a6
 	s32i	a6, a5,  0
 	mov	a6, a7
 .Lback13:
@@ -566,11 +550,4 @@
 	s8i	a6, a5,  0
 	retw
 
-
-/*
- * Local Variables:
- * mode:fundamental
- * comment-start: "# "
- * comment-start-skip: "# *"
- * End:
- */
+ENDPROC(__memmove)
diff --git a/arch/xtensa/lib/memset.S b/arch/xtensa/lib/memset.S
index 10b8c40..276747d 100644
--- a/arch/xtensa/lib/memset.S
+++ b/arch/xtensa/lib/memset.S
@@ -11,7 +11,9 @@
  *  Copyright (C) 2002 Tensilica Inc.
  */
 
+#include <linux/linkage.h>
 #include <variant/core.h>
+#include <asm/asmmacro.h>
 
 /*
  * void *memset(void *dst, int c, size_t length)
@@ -28,20 +30,10 @@
  *     the alignment labels).
  */
 
-/* Load or store instructions that may cause exceptions use the EX macro. */
-
-#define EX(insn,reg1,reg2,offset,handler)	\
-9:	insn	reg1, reg2, offset;		\
-	.section __ex_table, "a";		\
-	.word	9b, handler;			\
-	.previous
-
-
 .text
-.align	4
-.global	memset
-.type	memset,@function
-memset:
+ENTRY(__memset)
+WEAK(memset)
+
 	entry	sp, 16		# minimal stack frame
 	# a2/ dst, a3/ c, a4/ length
 	extui	a3, a3, 0, 8	# mask to just 8 bits
@@ -73,10 +65,10 @@
 	add	a6, a6, a5	# a6 = end of last 16B chunk
 #endif /* !XCHAL_HAVE_LOOPS */
 .Loop1:
-	EX(s32i, a3, a5,  0, memset_fixup)
-	EX(s32i, a3, a5,  4, memset_fixup)
-	EX(s32i, a3, a5,  8, memset_fixup)
-	EX(s32i, a3, a5, 12, memset_fixup)
+EX(10f) s32i	a3, a5,  0
+EX(10f) s32i	a3, a5,  4
+EX(10f) s32i	a3, a5,  8
+EX(10f) s32i	a3, a5, 12
 	addi	a5, a5, 16
 #if !XCHAL_HAVE_LOOPS
 	blt	a5, a6, .Loop1
@@ -84,23 +76,23 @@
 .Loop1done:
 	bbci.l	a4, 3, .L2
 	# set 8 bytes
-	EX(s32i, a3, a5,  0, memset_fixup)
-	EX(s32i, a3, a5,  4, memset_fixup)
+EX(10f) s32i	a3, a5,  0
+EX(10f) s32i	a3, a5,  4
 	addi	a5, a5,  8
 .L2:
 	bbci.l	a4, 2, .L3
 	# set 4 bytes
-	EX(s32i, a3, a5,  0, memset_fixup)
+EX(10f) s32i	a3, a5,  0
 	addi	a5, a5,  4
 .L3:
 	bbci.l	a4, 1, .L4
 	# set 2 bytes
-	EX(s16i, a3, a5,  0, memset_fixup)
+EX(10f) s16i	a3, a5,  0
 	addi	a5, a5,  2
 .L4:
 	bbci.l	a4, 0, .L5
 	# set 1 byte
-	EX(s8i, a3, a5,  0, memset_fixup)
+EX(10f) s8i	a3, a5,  0
 .L5:
 .Lret1:
 	retw
@@ -114,7 +106,7 @@
 	bbci.l	a5, 0, .L20		# branch if dst alignment half-aligned
 	# dst is only byte aligned
 	# set 1 byte
-	EX(s8i, a3, a5,  0, memset_fixup)
+EX(10f) s8i	a3, a5,  0
 	addi	a5, a5,  1
 	addi	a4, a4, -1
 	# now retest if dst aligned
@@ -122,7 +114,7 @@
 .L20:
 	# dst half-aligned
 	# set 2 bytes
-	EX(s16i, a3, a5,  0, memset_fixup)
+EX(10f) s16i	a3, a5,  0
 	addi	a5, a5,  2
 	addi	a4, a4, -2
 	j	.L0		# dst is now aligned, return to main algorithm
@@ -141,7 +133,7 @@
 	add	a6, a5, a4	# a6 = ending address
 #endif /* !XCHAL_HAVE_LOOPS */
 .Lbyteloop:
-	EX(s8i, a3, a5, 0, memset_fixup)
+EX(10f) s8i	a3, a5, 0
 	addi	a5, a5, 1
 #if !XCHAL_HAVE_LOOPS
 	blt	a5, a6, .Lbyteloop
@@ -149,12 +141,13 @@
 .Lbytesetdone:
 	retw
 
+ENDPROC(__memset)
 
 	.section .fixup, "ax"
 	.align	4
 
 /* We return zero if a failure occurred. */
 
-memset_fixup:
+10:
 	movi	a2, 0
 	retw
diff --git a/arch/xtensa/lib/pci-auto.c b/arch/xtensa/lib/pci-auto.c
index 34d05ab..a2b5581 100644
--- a/arch/xtensa/lib/pci-auto.c
+++ b/arch/xtensa/lib/pci-auto.c
@@ -49,17 +49,6 @@
  *
  */
 
-
-/* define DEBUG to print some debugging messages. */
-
-#undef DEBUG
-
-#ifdef DEBUG
-# define DBG(x...) printk(x)
-#else
-# define DBG(x...)
-#endif
-
 static int pciauto_upper_iospc;
 static int pciauto_upper_memspc;
 
@@ -97,7 +86,7 @@ pciauto_setup_bars(struct pci_dev *dev, int bar_limit)
 		{
 			bar_size &= PCI_BASE_ADDRESS_IO_MASK;
 			upper_limit = &pciauto_upper_iospc;
-			DBG("PCI Autoconfig: BAR %d, I/O, ", bar_nr);
+			pr_debug("PCI Autoconfig: BAR %d, I/O, ", bar_nr);
 		}
 		else
 		{
@@ -107,7 +96,7 @@ pciauto_setup_bars(struct pci_dev *dev, int bar_limit)
 
 			bar_size &= PCI_BASE_ADDRESS_MEM_MASK;
 			upper_limit = &pciauto_upper_memspc;
-			DBG("PCI Autoconfig: BAR %d, Mem, ", bar_nr);
+			pr_debug("PCI Autoconfig: BAR %d, Mem, ", bar_nr);
 		}
 
 		/* Allocate a base address (bar_size is negative!) */
@@ -125,7 +114,8 @@ pciauto_setup_bars(struct pci_dev *dev, int bar_limit)
 		if (found_mem64)
 			pci_write_config_dword(dev, (bar+=4), 0x00000000);
 
-		DBG("size=0x%x, address=0x%x\n", ~bar_size + 1, *upper_limit);
+		pr_debug("size=0x%x, address=0x%x\n",
+			 ~bar_size + 1, *upper_limit);
 	}
 }
 
@@ -150,7 +140,7 @@ pciauto_setup_irq(struct pci_controller* pci_ctrl,struct pci_dev *dev,int devfn)
 	if (irq == -1)
 		irq = 0;
 
-	DBG("PCI Autoconfig: Interrupt %d, pin %d\n", irq, pin);
+	pr_debug("PCI Autoconfig: Interrupt %d, pin %d\n", irq, pin);
 
 	pci_write_config_byte(dev, PCI_INTERRUPT_LINE, irq);
 }
@@ -289,8 +279,8 @@ int __init pciauto_bus_scan(struct pci_controller *pci_ctrl, int current_bus)
 
 			int iosave, memsave;
 
-			DBG("PCI Autoconfig: Found P2P bridge, device %d\n",
-			    PCI_SLOT(pci_devfn));
+			pr_debug("PCI Autoconfig: Found P2P bridge, device %d\n",
+				 PCI_SLOT(pci_devfn));
 
 			/* Allocate PCI I/O and/or memory space */
 			pciauto_setup_bars(dev, PCI_BASE_ADDRESS_1);
@@ -306,23 +296,6 @@ int __init pciauto_bus_scan(struct pci_controller *pci_ctrl, int current_bus)
 
 		}
 
-
-#if 0
-		/* Skip legacy mode IDE controller */
-
-		if ((pci_class >> 16) == PCI_CLASS_STORAGE_IDE) {
-
-			unsigned char prg_iface;
-			pci_read_config_byte(dev, PCI_CLASS_PROG, &prg_iface);
-
-			if (!(prg_iface & PCIAUTO_IDE_MODE_MASK)) {
-				DBG("PCI Autoconfig: Skipping legacy mode "
-				    "IDE controller\n");
-				continue;
-			}
-		}
-#endif
-
 		/*
 		 * Found a peripheral, enable some standard
 		 * settings
@@ -337,8 +310,8 @@ int __init pciauto_bus_scan(struct pci_controller *pci_ctrl, int current_bus)
 		pci_write_config_byte(dev, PCI_LATENCY_TIMER, 0x80);
 
 		/* Allocate PCI I/O and/or memory space */
-		DBG("PCI Autoconfig: Found Bus %d, Device %d, Function %d\n",
-		    current_bus, PCI_SLOT(pci_devfn), PCI_FUNC(pci_devfn) );
+		pr_debug("PCI Autoconfig: Found Bus %d, Device %d, Function %d\n",
+			 current_bus, PCI_SLOT(pci_devfn), PCI_FUNC(pci_devfn));
 
 		pciauto_setup_bars(dev, PCI_BASE_ADDRESS_5);
 		pciauto_setup_irq(pci_ctrl, dev, pci_devfn);
diff --git a/arch/xtensa/lib/strncpy_user.S b/arch/xtensa/lib/strncpy_user.S
index 1ad0ecf..5fce16b 100644
--- a/arch/xtensa/lib/strncpy_user.S
+++ b/arch/xtensa/lib/strncpy_user.S
@@ -11,16 +11,10 @@
  *  Copyright (C) 2002 Tensilica Inc.
  */
 
-#include <variant/core.h>
 #include <linux/errno.h>
-
-/* Load or store instructions that may cause exceptions use the EX macro. */
-
-#define EX(insn,reg1,reg2,offset,handler)	\
-9:	insn	reg1, reg2, offset;		\
-	.section __ex_table, "a";		\
-	.word	9b, handler;			\
-	.previous
+#include <linux/linkage.h>
+#include <variant/core.h>
+#include <asm/asmmacro.h>
 
 /*
  * char *__strncpy_user(char *dst, const char *src, size_t len)
@@ -54,10 +48,8 @@
 #   a12/ tmp
 
 .text
-.align	4
-.global	__strncpy_user
-.type	__strncpy_user,@function
-__strncpy_user:
+ENTRY(__strncpy_user)
+
 	entry	sp, 16		# minimal stack frame
 	# a2/ dst, a3/ src, a4/ len
 	mov	a11, a2		# leave dst in return value register
@@ -75,9 +67,9 @@
 	j	.Ldstunaligned
 
 .Lsrc1mod2:	# src address is odd
-	EX(l8ui, a9, a3, 0, fixup_l)	# get byte 0
+EX(11f)	l8ui	a9, a3, 0		# get byte 0
 	addi	a3, a3, 1		# advance src pointer
-	EX(s8i, a9, a11, 0, fixup_s)	# store byte 0
+EX(10f)	s8i	a9, a11, 0		# store byte 0
 	beqz	a9, .Lret		# if byte 0 is zero
 	addi	a11, a11, 1		# advance dst pointer
 	addi	a4, a4, -1		# decrement len
@@ -85,16 +77,16 @@
 	bbci.l	a3, 1, .Lsrcaligned	# if src is now word-aligned
 
 .Lsrc2mod4:	# src address is 2 mod 4
-	EX(l8ui, a9, a3, 0, fixup_l)	# get byte 0
+EX(11f)	l8ui	a9, a3, 0		# get byte 0
 	/* 1-cycle interlock */
-	EX(s8i, a9, a11, 0, fixup_s)	# store byte 0
+EX(10f)	s8i	a9, a11, 0		# store byte 0
 	beqz	a9, .Lret		# if byte 0 is zero
 	addi	a11, a11, 1		# advance dst pointer
 	addi	a4, a4, -1		# decrement len
 	beqz	a4, .Lret		# if len is zero
-	EX(l8ui, a9, a3, 1, fixup_l)	# get byte 0
+EX(11f)	l8ui	a9, a3, 1		# get byte 0
 	addi	a3, a3, 2		# advance src pointer
-	EX(s8i, a9, a11, 0, fixup_s)	# store byte 0
+EX(10f)	s8i	a9, a11, 0		# store byte 0
 	beqz	a9, .Lret		# if byte 0 is zero
 	addi	a11, a11, 1		# advance dst pointer
 	addi	a4, a4, -1		# decrement len
@@ -117,12 +109,12 @@
 	add	a12, a12, a11	# a12 = end of last 4B chunck
 #endif
 .Loop1:
-	EX(l32i, a9, a3, 0, fixup_l)	# get word from src
+EX(11f)	l32i	a9, a3, 0		# get word from src
 	addi	a3, a3, 4		# advance src pointer
 	bnone	a9, a5, .Lz0		# if byte 0 is zero
 	bnone	a9, a6, .Lz1		# if byte 1 is zero
 	bnone	a9, a7, .Lz2		# if byte 2 is zero
-	EX(s32i, a9, a11, 0, fixup_s)	# store word to dst
+EX(10f)	s32i	a9, a11, 0		# store word to dst
 	bnone	a9, a8, .Lz3		# if byte 3 is zero
 	addi	a11, a11, 4		# advance dst pointer
 #if !XCHAL_HAVE_LOOPS
@@ -132,7 +124,7 @@
 .Loop1done:
 	bbci.l	a4, 1, .L100
 	# copy 2 bytes
-	EX(l16ui, a9, a3, 0, fixup_l)
+EX(11f)	l16ui	a9, a3, 0
 	addi	a3, a3, 2		# advance src pointer
 #ifdef __XTENSA_EB__
 	bnone	a9, a7, .Lz0		# if byte 2 is zero
@@ -141,13 +133,13 @@
 	bnone	a9, a5, .Lz0		# if byte 0 is zero
 	bnone	a9, a6, .Lz1		# if byte 1 is zero
 #endif
-	EX(s16i, a9, a11, 0, fixup_s)
+EX(10f)	s16i	a9, a11, 0
 	addi	a11, a11, 2		# advance dst pointer
 .L100:
 	bbci.l	a4, 0, .Lret
-	EX(l8ui, a9, a3, 0, fixup_l)
+EX(11f)	l8ui	a9, a3, 0
 	/* slot */
-	EX(s8i, a9, a11, 0, fixup_s)
+EX(10f)	s8i	a9, a11, 0
 	beqz	a9, .Lret		# if byte is zero
 	addi	a11, a11, 1-3		# advance dst ptr 1, but also cancel
 					# the effect of adding 3 in .Lz3 code
@@ -161,14 +153,14 @@
 #ifdef __XTENSA_EB__
 	movi	a9, 0
 #endif /* __XTENSA_EB__ */
-	EX(s8i, a9, a11, 0, fixup_s)
+EX(10f)	s8i	a9, a11, 0
 	sub	a2, a11, a2		# compute strlen
 	retw
 .Lz1:	# byte 1 is zero
 #ifdef __XTENSA_EB__
 	extui   a9, a9, 16, 16
 #endif /* __XTENSA_EB__ */
-	EX(s16i, a9, a11, 0, fixup_s)
+EX(10f)	s16i	a9, a11, 0
 	addi	a11, a11, 1		# advance dst pointer
 	sub	a2, a11, a2		# compute strlen
 	retw
@@ -176,9 +168,9 @@
 #ifdef __XTENSA_EB__
 	extui   a9, a9, 16, 16
 #endif /* __XTENSA_EB__ */
-	EX(s16i, a9, a11, 0, fixup_s)
+EX(10f)	s16i	a9, a11, 0
 	movi	a9, 0
-	EX(s8i, a9, a11, 2, fixup_s)
+EX(10f)	s8i	a9, a11, 2
 	addi	a11, a11, 2		# advance dst pointer
 	sub	a2, a11, a2		# compute strlen
 	retw
@@ -196,9 +188,9 @@
 	add	a12, a11, a4		# a12 = ending address
 #endif /* XCHAL_HAVE_LOOPS */
 .Lnextbyte:
-	EX(l8ui, a9, a3, 0, fixup_l)
+EX(11f)	l8ui	a9, a3, 0
 	addi	a3, a3, 1
-	EX(s8i, a9, a11, 0, fixup_s)
+EX(10f)	s8i	a9, a11, 0
 	beqz	a9, .Lunalignedend
 	addi	a11, a11, 1
 #if !XCHAL_HAVE_LOOPS
@@ -209,6 +201,7 @@
 	sub	a2, a11, a2		# compute strlen
 	retw
 
+ENDPROC(__strncpy_user)
 
 	.section .fixup, "ax"
 	.align	4
@@ -218,8 +211,7 @@
 	 * implementation in memset().  Thus, we differentiate between
 	 * load/store fixups. */
 
-fixup_s:
-fixup_l:
+10:
+11:
 	movi	a2, -EFAULT
 	retw
-
diff --git a/arch/xtensa/lib/strnlen_user.S b/arch/xtensa/lib/strnlen_user.S
index 4c03b1e..0b956ce 100644
--- a/arch/xtensa/lib/strnlen_user.S
+++ b/arch/xtensa/lib/strnlen_user.S
@@ -11,15 +11,9 @@
  *  Copyright (C) 2002 Tensilica Inc.
  */
 
+#include <linux/linkage.h>
 #include <variant/core.h>
-
-/* Load or store instructions that may cause exceptions use the EX macro. */
-
-#define EX(insn,reg1,reg2,offset,handler)	\
-9:	insn	reg1, reg2, offset;		\
-	.section __ex_table, "a";		\
-	.word	9b, handler;			\
-	.previous
+#include <asm/asmmacro.h>
 
 /*
  * size_t __strnlen_user(const char *s, size_t len)
@@ -49,10 +43,8 @@
 #   a10/ tmp
 
 .text
-.align	4
-.global	__strnlen_user
-.type	__strnlen_user,@function
-__strnlen_user:
+ENTRY(__strnlen_user)
+
 	entry	sp, 16		# minimal stack frame
 	# a2/ s, a3/ len
 	addi	a4, a2, -4	# because we overincrement at the end;
@@ -77,7 +69,7 @@
 	add	a10, a10, a4	# a10 = end of last 4B chunk
 #endif /* XCHAL_HAVE_LOOPS */
 .Loop:
-	EX(l32i, a9, a4, 4, lenfixup)	# get next word of string
+EX(10f)	l32i	a9, a4, 4		# get next word of string
 	addi	a4, a4, 4		# advance string pointer
 	bnone	a9, a5, .Lz0		# if byte 0 is zero
 	bnone	a9, a6, .Lz1		# if byte 1 is zero
@@ -88,7 +80,7 @@
 #endif
 
 .Ldone:
-	EX(l32i, a9, a4, 4, lenfixup)	# load 4 bytes for remaining checks
+EX(10f)	l32i	a9, a4, 4	# load 4 bytes for remaining checks
 
 	bbci.l	a3, 1, .L100
 	# check two more bytes (bytes 0, 1 of word)
@@ -125,14 +117,14 @@
 	retw
 
 .L1mod2:	# address is odd
-	EX(l8ui, a9, a4, 4, lenfixup)	# get byte 0
+EX(10f)	l8ui	a9, a4, 4		# get byte 0
 	addi	a4, a4, 1		# advance string pointer
 	beqz	a9, .Lz3		# if byte 0 is zero
 	bbci.l	a4, 1, .Laligned	# if string pointer is now word-aligned
 
 .L2mod4:	# address is 2 mod 4
 	addi	a4, a4, 2	# advance ptr for aligned access
-	EX(l32i, a9, a4, 0, lenfixup)	# get word with first two bytes of string
+EX(10f)	l32i	a9, a4, 0	# get word with first two bytes of string
 	bnone	a9, a7, .Lz2	# if byte 2 (of word, not string) is zero
 	bany	a9, a8, .Laligned # if byte 3 (of word, not string) is nonzero
 	# byte 3 is zero
@@ -140,8 +132,10 @@
 	sub	a2, a4, a2	# subtract to get length
 	retw
 
+ENDPROC(__strnlen_user)
+
 	.section .fixup, "ax"
 	.align	4
-lenfixup:
+10:
 	movi	a2, 0
 	retw
diff --git a/arch/xtensa/lib/usercopy.S b/arch/xtensa/lib/usercopy.S
index d9cd766..64ab197 100644
--- a/arch/xtensa/lib/usercopy.S
+++ b/arch/xtensa/lib/usercopy.S
@@ -53,30 +53,13 @@
  *	a11/ original length
  */
 
+#include <linux/linkage.h>
 #include <variant/core.h>
-
-#ifdef __XTENSA_EB__
-#define ALIGN(R, W0, W1) src	R, W0, W1
-#define SSA8(R)	ssa8b R
-#else
-#define ALIGN(R, W0, W1) src	R, W1, W0
-#define SSA8(R)	ssa8l R
-#endif
-
-/* Load or store instructions that may cause exceptions use the EX macro. */
-
-#define EX(insn,reg1,reg2,offset,handler)	\
-9:	insn	reg1, reg2, offset;		\
-	.section __ex_table, "a";		\
-	.word	9b, handler;			\
-	.previous
-
+#include <asm/asmmacro.h>
 
 	.text
-	.align	4
-	.global	__xtensa_copy_user
-	.type	__xtensa_copy_user,@function
-__xtensa_copy_user:
+ENTRY(__xtensa_copy_user)
+
 	entry	sp, 16		# minimal stack frame
 	# a2/ dst, a3/ src, a4/ len
 	mov	a5, a2		# copy dst so that a2 is return value
@@ -89,7 +72,7 @@
 				# per iteration
 	movi	a8, 3		  # if source is also aligned,
 	bnone	a3, a8, .Laligned # then use word copy
-	SSA8(	a3)		# set shift amount from byte offset
+	__ssa8	a3		# set shift amount from byte offset
 	bnez	a4, .Lsrcunaligned
 	movi	a2, 0		# return success for len==0
 	retw
@@ -102,9 +85,9 @@
 	bltui	a4, 7, .Lbytecopy	# do short copies byte by byte
 
 	# copy 1 byte
-	EX(l8ui, a6, a3, 0, fixup)
+EX(10f)	l8ui	a6, a3, 0
 	addi	a3, a3,  1
-	EX(s8i, a6, a5,  0, fixup)
+EX(10f)	s8i	a6, a5,  0
 	addi	a5, a5,  1
 	addi	a4, a4, -1
 	bbci.l	a5, 1, .Ldstaligned	# if dst is now aligned, then
@@ -112,11 +95,11 @@
 .Ldst2mod4:	# dst 16-bit aligned
 	# copy 2 bytes
 	bltui	a4, 6, .Lbytecopy	# do short copies byte by byte
-	EX(l8ui, a6, a3, 0, fixup)
-	EX(l8ui, a7, a3, 1, fixup)
+EX(10f)	l8ui	a6, a3, 0
+EX(10f)	l8ui	a7, a3, 1
 	addi	a3, a3,  2
-	EX(s8i, a6, a5,  0, fixup)
-	EX(s8i, a7, a5,  1, fixup)
+EX(10f)	s8i	a6, a5,  0
+EX(10f)	s8i	a7, a5,  1
 	addi	a5, a5,  2
 	addi	a4, a4, -2
 	j	.Ldstaligned	# dst is now aligned, return to main algorithm
@@ -135,9 +118,9 @@
 	add	a7, a3, a4	# a7 = end address for source
 #endif /* !XCHAL_HAVE_LOOPS */
 .Lnextbyte:
-	EX(l8ui, a6, a3, 0, fixup)
+EX(10f)	l8ui	a6, a3, 0
 	addi	a3, a3, 1
-	EX(s8i, a6, a5, 0, fixup)
+EX(10f)	s8i	a6, a5, 0
 	addi	a5, a5, 1
 #if !XCHAL_HAVE_LOOPS
 	blt	a3, a7, .Lnextbyte
@@ -161,15 +144,15 @@
 	add	a8, a8, a3	# a8 = end of last 16B source chunk
 #endif /* !XCHAL_HAVE_LOOPS */
 .Loop1:
-	EX(l32i, a6, a3,  0, fixup)
-	EX(l32i, a7, a3,  4, fixup)
-	EX(s32i, a6, a5,  0, fixup)
-	EX(l32i, a6, a3,  8, fixup)
-	EX(s32i, a7, a5,  4, fixup)
-	EX(l32i, a7, a3, 12, fixup)
-	EX(s32i, a6, a5,  8, fixup)
+EX(10f)	l32i	a6, a3,  0
+EX(10f)	l32i	a7, a3,  4
+EX(10f)	s32i	a6, a5,  0
+EX(10f)	l32i	a6, a3,  8
+EX(10f)	s32i	a7, a5,  4
+EX(10f)	l32i	a7, a3, 12
+EX(10f)	s32i	a6, a5,  8
 	addi	a3, a3, 16
-	EX(s32i, a7, a5, 12, fixup)
+EX(10f)	s32i	a7, a5, 12
 	addi	a5, a5, 16
 #if !XCHAL_HAVE_LOOPS
 	blt	a3, a8, .Loop1
@@ -177,31 +160,31 @@
 .Loop1done:
 	bbci.l	a4, 3, .L2
 	# copy 8 bytes
-	EX(l32i, a6, a3,  0, fixup)
-	EX(l32i, a7, a3,  4, fixup)
+EX(10f)	l32i	a6, a3,  0
+EX(10f)	l32i	a7, a3,  4
 	addi	a3, a3,  8
-	EX(s32i, a6, a5,  0, fixup)
-	EX(s32i, a7, a5,  4, fixup)
+EX(10f)	s32i	a6, a5,  0
+EX(10f)	s32i	a7, a5,  4
 	addi	a5, a5,  8
 .L2:
 	bbci.l	a4, 2, .L3
 	# copy 4 bytes
-	EX(l32i, a6, a3,  0, fixup)
+EX(10f)	l32i	a6, a3,  0
 	addi	a3, a3,  4
-	EX(s32i, a6, a5,  0, fixup)
+EX(10f)	s32i	a6, a5,  0
 	addi	a5, a5,  4
 .L3:
 	bbci.l	a4, 1, .L4
 	# copy 2 bytes
-	EX(l16ui, a6, a3,  0, fixup)
+EX(10f)	l16ui	a6, a3,  0
 	addi	a3, a3,  2
-	EX(s16i,  a6, a5,  0, fixup)
+EX(10f)	s16i	a6, a5,  0
 	addi	a5, a5,  2
 .L4:
 	bbci.l	a4, 0, .L5
 	# copy 1 byte
-	EX(l8ui, a6, a3,  0, fixup)
-	EX(s8i,  a6, a5,  0, fixup)
+EX(10f)	l8ui	a6, a3,  0
+EX(10f)	s8i	a6, a5,  0
 .L5:
 	movi	a2, 0		# return success for len bytes copied
 	retw
@@ -217,7 +200,7 @@
 	# copy 16 bytes per iteration for word-aligned dst and unaligned src
 	and	a10, a3, a8	# save unalignment offset for below
 	sub	a3, a3, a10	# align a3 (to avoid sim warnings only; not needed for hardware)
-	EX(l32i, a6, a3, 0, fixup)	# load first word
+EX(10f)	l32i	a6, a3, 0	# load first word
 #if XCHAL_HAVE_LOOPS
 	loopnez	a7, .Loop2done
 #else /* !XCHAL_HAVE_LOOPS */
@@ -226,19 +209,19 @@
 	add	a12, a12, a3	# a12 = end of last 16B source chunk
 #endif /* !XCHAL_HAVE_LOOPS */
 .Loop2:
-	EX(l32i, a7, a3,  4, fixup)
-	EX(l32i, a8, a3,  8, fixup)
-	ALIGN(	a6, a6, a7)
-	EX(s32i, a6, a5,  0, fixup)
-	EX(l32i, a9, a3, 12, fixup)
-	ALIGN(	a7, a7, a8)
-	EX(s32i, a7, a5,  4, fixup)
-	EX(l32i, a6, a3, 16, fixup)
-	ALIGN(	a8, a8, a9)
-	EX(s32i, a8, a5,  8, fixup)
+EX(10f)	l32i	a7, a3,  4
+EX(10f)	l32i	a8, a3,  8
+	__src_b	a6, a6, a7
+EX(10f)	s32i	a6, a5,  0
+EX(10f)	l32i	a9, a3, 12
+	__src_b	a7, a7, a8
+EX(10f)	s32i	a7, a5,  4
+EX(10f)	l32i	a6, a3, 16
+	__src_b	a8, a8, a9
+EX(10f)	s32i	a8, a5,  8
 	addi	a3, a3, 16
-	ALIGN(	a9, a9, a6)
-	EX(s32i, a9, a5, 12, fixup)
+	__src_b	a9, a9, a6
+EX(10f)	s32i	a9, a5, 12
 	addi	a5, a5, 16
 #if !XCHAL_HAVE_LOOPS
 	blt	a3, a12, .Loop2
@@ -246,43 +229,44 @@
 .Loop2done:
 	bbci.l	a4, 3, .L12
 	# copy 8 bytes
-	EX(l32i, a7, a3,  4, fixup)
-	EX(l32i, a8, a3,  8, fixup)
-	ALIGN(	a6, a6, a7)
-	EX(s32i, a6, a5,  0, fixup)
+EX(10f)	l32i	a7, a3,  4
+EX(10f)	l32i	a8, a3,  8
+	__src_b	a6, a6, a7
+EX(10f)	s32i	a6, a5,  0
 	addi	a3, a3,  8
-	ALIGN(	a7, a7, a8)
-	EX(s32i, a7, a5,  4, fixup)
+	__src_b	a7, a7, a8
+EX(10f)	s32i	a7, a5,  4
 	addi	a5, a5,  8
 	mov	a6, a8
 .L12:
 	bbci.l	a4, 2, .L13
 	# copy 4 bytes
-	EX(l32i, a7, a3,  4, fixup)
+EX(10f)	l32i	a7, a3,  4
 	addi	a3, a3,  4
-	ALIGN(	a6, a6, a7)
-	EX(s32i, a6, a5,  0, fixup)
+	__src_b	a6, a6, a7
+EX(10f)	s32i	a6, a5,  0
 	addi	a5, a5,  4
 	mov	a6, a7
 .L13:
 	add	a3, a3, a10	# readjust a3 with correct misalignment
 	bbci.l	a4, 1, .L14
 	# copy 2 bytes
-	EX(l8ui, a6, a3,  0, fixup)
-	EX(l8ui, a7, a3,  1, fixup)
+EX(10f)	l8ui	a6, a3,  0
+EX(10f)	l8ui	a7, a3,  1
 	addi	a3, a3,  2
-	EX(s8i, a6, a5,  0, fixup)
-	EX(s8i, a7, a5,  1, fixup)
+EX(10f)	s8i	a6, a5,  0
+EX(10f)	s8i	a7, a5,  1
 	addi	a5, a5,  2
 .L14:
 	bbci.l	a4, 0, .L15
 	# copy 1 byte
-	EX(l8ui, a6, a3,  0, fixup)
-	EX(s8i,  a6, a5,  0, fixup)
+EX(10f)	l8ui	a6, a3,  0
+EX(10f)	s8i	a6, a5,  0
 .L15:
 	movi	a2, 0		# return success for len bytes copied
 	retw
 
+ENDPROC(__xtensa_copy_user)
 
 	.section .fixup, "ax"
 	.align	4
@@ -294,7 +278,7 @@
  */
 
 
-fixup:
+10:
 	sub	a2, a5, a2	/* a2 <-- bytes copied */
 	sub	a2, a11, a2	/* a2 <-- bytes not copied */
 	retw
diff --git a/arch/xtensa/mm/Makefile b/arch/xtensa/mm/Makefile
index 0b3d296..734888a 100644
--- a/arch/xtensa/mm/Makefile
+++ b/arch/xtensa/mm/Makefile
@@ -5,3 +5,8 @@
 obj-y			:= init.o misc.o
 obj-$(CONFIG_MMU)	+= cache.o fault.o ioremap.o mmu.o tlb.o
 obj-$(CONFIG_HIGHMEM)	+= highmem.o
+obj-$(CONFIG_KASAN)	+= kasan_init.o
+
+KASAN_SANITIZE_fault.o := n
+KASAN_SANITIZE_kasan_init.o := n
+KASAN_SANITIZE_mmu.o := n
diff --git a/arch/xtensa/mm/cache.c b/arch/xtensa/mm/cache.c
index 3c75c4e..57dc231 100644
--- a/arch/xtensa/mm/cache.c
+++ b/arch/xtensa/mm/cache.c
@@ -33,9 +33,6 @@
 #include <asm/pgalloc.h>
 #include <asm/pgtable.h>
 
-//#define printd(x...) printk(x)
-#define printd(x...) do { } while(0)
-
 /* 
  * Note:
  * The kernel provides one architecture bit PG_arch_1 in the page flags that 
diff --git a/arch/xtensa/mm/fault.c b/arch/xtensa/mm/fault.c
index a14df5a..8b9b6f4 100644
--- a/arch/xtensa/mm/fault.c
+++ b/arch/xtensa/mm/fault.c
@@ -25,8 +25,6 @@
 DEFINE_PER_CPU(unsigned long, asid_cache) = ASID_USER_FIRST;
 void bad_page_fault(struct pt_regs*, unsigned long, int);
 
-#undef DEBUG_PAGE_FAULT
-
 /*
  * This routine handles page faults.  It determines the address,
  * and the problem, and then passes it off to one of the appropriate
@@ -68,10 +66,10 @@ void do_page_fault(struct pt_regs *regs)
 		    exccause == EXCCAUSE_ITLB_MISS ||
 		    exccause == EXCCAUSE_FETCH_CACHE_ATTRIBUTE) ? 1 : 0;
 
-#ifdef DEBUG_PAGE_FAULT
-	printk("[%s:%d:%08x:%d:%08x:%s%s]\n", current->comm, current->pid,
-	       address, exccause, regs->pc, is_write? "w":"", is_exec? "x":"");
-#endif
+	pr_debug("[%s:%d:%08x:%d:%08lx:%s%s]\n",
+		 current->comm, current->pid,
+		 address, exccause, regs->pc,
+		 is_write ? "w" : "", is_exec ? "x" : "");
 
 	if (user_mode(regs))
 		flags |= FAULT_FLAG_USER;
@@ -247,10 +245,8 @@ bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 
 	/* Are we prepared to handle this kernel fault?  */
 	if ((entry = search_exception_tables(regs->pc)) != NULL) {
-#ifdef DEBUG_PAGE_FAULT
-		printk(KERN_DEBUG "%s: Exception at pc=%#010lx (%lx)\n",
-				current->comm, regs->pc, entry->fixup);
-#endif
+		pr_debug("%s: Exception at pc=%#010lx (%lx)\n",
+			 current->comm, regs->pc, entry->fixup);
 		current->thread.bad_uaddr = address;
 		regs->pc = entry->fixup;
 		return;
@@ -259,9 +255,9 @@ bad_page_fault(struct pt_regs *regs, unsigned long address, int sig)
 	/* Oops. The kernel tried to access some bad page. We'll have to
 	 * terminate things with extreme prejudice.
 	 */
-	printk(KERN_ALERT "Unable to handle kernel paging request at virtual "
-	       "address %08lx\n pc = %08lx, ra = %08lx\n",
-	       address, regs->pc, regs->areg[0]);
+	pr_alert("Unable to handle kernel paging request at virtual "
+		 "address %08lx\n pc = %08lx, ra = %08lx\n",
+		 address, regs->pc, regs->areg[0]);
 	die("Oops", regs, sig);
 	do_exit(sig);
 }
diff --git a/arch/xtensa/mm/init.c b/arch/xtensa/mm/init.c
index 720fe4e..d776ec0 100644
--- a/arch/xtensa/mm/init.c
+++ b/arch/xtensa/mm/init.c
@@ -100,29 +100,51 @@ void __init mem_init(void)
 
 	mem_init_print_info(NULL);
 	pr_info("virtual kernel memory layout:\n"
-#ifdef CONFIG_HIGHMEM
-		"    pkmap   : 0x%08lx - 0x%08lx  (%5lu kB)\n"
-		"    fixmap  : 0x%08lx - 0x%08lx  (%5lu kB)\n"
+#ifdef CONFIG_KASAN
+		"    kasan   : 0x%08lx - 0x%08lx  (%5lu MB)\n"
 #endif
 #ifdef CONFIG_MMU
 		"    vmalloc : 0x%08lx - 0x%08lx  (%5lu MB)\n"
 #endif
-		"    lowmem  : 0x%08lx - 0x%08lx  (%5lu MB)\n",
+#ifdef CONFIG_HIGHMEM
+		"    pkmap   : 0x%08lx - 0x%08lx  (%5lu kB)\n"
+		"    fixmap  : 0x%08lx - 0x%08lx  (%5lu kB)\n"
+#endif
+		"    lowmem  : 0x%08lx - 0x%08lx  (%5lu MB)\n"
+		"    .text   : 0x%08lx - 0x%08lx  (%5lu kB)\n"
+		"    .rodata : 0x%08lx - 0x%08lx  (%5lu kB)\n"
+		"    .data   : 0x%08lx - 0x%08lx  (%5lu kB)\n"
+		"    .init   : 0x%08lx - 0x%08lx  (%5lu kB)\n"
+		"    .bss    : 0x%08lx - 0x%08lx  (%5lu kB)\n",
+#ifdef CONFIG_KASAN
+		KASAN_SHADOW_START, KASAN_SHADOW_START + KASAN_SHADOW_SIZE,
+		KASAN_SHADOW_SIZE >> 20,
+#endif
+#ifdef CONFIG_MMU
+		VMALLOC_START, VMALLOC_END,
+		(VMALLOC_END - VMALLOC_START) >> 20,
 #ifdef CONFIG_HIGHMEM
 		PKMAP_BASE, PKMAP_BASE + LAST_PKMAP * PAGE_SIZE,
 		(LAST_PKMAP*PAGE_SIZE) >> 10,
 		FIXADDR_START, FIXADDR_TOP,
 		(FIXADDR_TOP - FIXADDR_START) >> 10,
 #endif
-#ifdef CONFIG_MMU
-		VMALLOC_START, VMALLOC_END,
-		(VMALLOC_END - VMALLOC_START) >> 20,
 		PAGE_OFFSET, PAGE_OFFSET +
 		(max_low_pfn - min_low_pfn) * PAGE_SIZE,
 #else
 		min_low_pfn * PAGE_SIZE, max_low_pfn * PAGE_SIZE,
 #endif
-		((max_low_pfn - min_low_pfn) * PAGE_SIZE) >> 20);
+		((max_low_pfn - min_low_pfn) * PAGE_SIZE) >> 20,
+		(unsigned long)_text, (unsigned long)_etext,
+		(unsigned long)(_etext - _text) >> 10,
+		(unsigned long)__start_rodata, (unsigned long)_sdata,
+		(unsigned long)(_sdata - __start_rodata) >> 10,
+		(unsigned long)_sdata, (unsigned long)_edata,
+		(unsigned long)(_edata - _sdata) >> 10,
+		(unsigned long)__init_begin, (unsigned long)__init_end,
+		(unsigned long)(__init_end - __init_begin) >> 10,
+		(unsigned long)__bss_start, (unsigned long)__bss_stop,
+		(unsigned long)(__bss_stop - __bss_start) >> 10);
 }
 
 #ifdef CONFIG_BLK_DEV_INITRD
diff --git a/arch/xtensa/mm/kasan_init.c b/arch/xtensa/mm/kasan_init.c
new file mode 100644
index 0000000..6b532b6
--- /dev/null
+++ b/arch/xtensa/mm/kasan_init.c
@@ -0,0 +1,95 @@
+/*
+ * Xtensa KASAN shadow map initialization
+ *
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 2017 Cadence Design Systems Inc.
+ */
+
+#include <linux/bootmem.h>
+#include <linux/init_task.h>
+#include <linux/kasan.h>
+#include <linux/kernel.h>
+#include <linux/memblock.h>
+#include <asm/initialize_mmu.h>
+#include <asm/tlbflush.h>
+#include <asm/traps.h>
+
+void __init kasan_early_init(void)
+{
+	unsigned long vaddr = KASAN_SHADOW_START;
+	pgd_t *pgd = pgd_offset_k(vaddr);
+	pmd_t *pmd = pmd_offset(pgd, vaddr);
+	int i;
+
+	for (i = 0; i < PTRS_PER_PTE; ++i)
+		set_pte(kasan_zero_pte + i,
+			mk_pte(virt_to_page(kasan_zero_page), PAGE_KERNEL));
+
+	for (vaddr = 0; vaddr < KASAN_SHADOW_SIZE; vaddr += PMD_SIZE, ++pmd) {
+		BUG_ON(!pmd_none(*pmd));
+		set_pmd(pmd, __pmd((unsigned long)kasan_zero_pte));
+	}
+	early_trap_init();
+}
+
+static void __init populate(void *start, void *end)
+{
+	unsigned long n_pages = (end - start) / PAGE_SIZE;
+	unsigned long n_pmds = n_pages / PTRS_PER_PTE;
+	unsigned long i, j;
+	unsigned long vaddr = (unsigned long)start;
+	pgd_t *pgd = pgd_offset_k(vaddr);
+	pmd_t *pmd = pmd_offset(pgd, vaddr);
+	pte_t *pte = memblock_virt_alloc(n_pages * sizeof(pte_t), PAGE_SIZE);
+
+	pr_debug("%s: %p - %p\n", __func__, start, end);
+
+	for (i = j = 0; i < n_pmds; ++i) {
+		int k;
+
+		for (k = 0; k < PTRS_PER_PTE; ++k, ++j) {
+			phys_addr_t phys =
+				memblock_alloc_base(PAGE_SIZE, PAGE_SIZE,
+						    MEMBLOCK_ALLOC_ANYWHERE);
+
+			set_pte(pte + j, pfn_pte(PHYS_PFN(phys), PAGE_KERNEL));
+		}
+	}
+
+	for (i = 0; i < n_pmds ; ++i, pte += PTRS_PER_PTE)
+		set_pmd(pmd + i, __pmd((unsigned long)pte));
+
+	local_flush_tlb_all();
+	memset(start, 0, end - start);
+}
+
+void __init kasan_init(void)
+{
+	int i;
+
+	BUILD_BUG_ON(KASAN_SHADOW_OFFSET != KASAN_SHADOW_START -
+		     (KASAN_START_VADDR >> KASAN_SHADOW_SCALE_SHIFT));
+	BUILD_BUG_ON(VMALLOC_START < KASAN_START_VADDR);
+
+	/*
+	 * Replace shadow map pages that cover addresses from VMALLOC area
+	 * start to the end of KSEG with clean writable pages.
+	 */
+	populate(kasan_mem_to_shadow((void *)VMALLOC_START),
+		 kasan_mem_to_shadow((void *)XCHAL_KSEG_BYPASS_VADDR));
+
+	/* Write protect kasan_zero_page and zero-initialize it again. */
+	for (i = 0; i < PTRS_PER_PTE; ++i)
+		set_pte(kasan_zero_pte + i,
+			mk_pte(virt_to_page(kasan_zero_page), PAGE_KERNEL_RO));
+
+	local_flush_tlb_all();
+	memset(kasan_zero_page, 0, PAGE_SIZE);
+
+	/* At this point kasan is fully initialized. Enable error messages. */
+	current->kasan_depth = 0;
+	pr_info("KernelAddressSanitizer initialized\n");
+}
diff --git a/arch/xtensa/mm/mmu.c b/arch/xtensa/mm/mmu.c
index 358d748..9d1ecfc5 100644
--- a/arch/xtensa/mm/mmu.c
+++ b/arch/xtensa/mm/mmu.c
@@ -56,7 +56,6 @@ static void __init fixedrange_init(void)
 
 void __init paging_init(void)
 {
-	memset(swapper_pg_dir, 0, PAGE_SIZE);
 #ifdef CONFIG_HIGHMEM
 	fixedrange_init();
 	pkmap_page_table = init_pmd(PKMAP_BASE, LAST_PKMAP);
@@ -82,6 +81,23 @@ void init_mmu(void)
 	set_itlbcfg_register(0);
 	set_dtlbcfg_register(0);
 #endif
+	init_kio();
+	local_flush_tlb_all();
+
+	/* Set rasid register to a known value. */
+
+	set_rasid_register(ASID_INSERT(ASID_USER_FIRST));
+
+	/* Set PTEVADDR special register to the start of the page
+	 * table, which is in kernel mappable space (ie. not
+	 * statically mapped).  This register's value is undefined on
+	 * reset.
+	 */
+	set_ptevaddr_register(XCHAL_PAGE_TABLE_VADDR);
+}
+
+void init_kio(void)
+{
 #if XCHAL_HAVE_PTP_MMU && XCHAL_HAVE_SPANNING_WAY && defined(CONFIG_OF)
 	/*
 	 * Update the IO area mapping in case xtensa_kio_paddr has changed
@@ -95,17 +111,4 @@ void init_mmu(void)
 	write_itlb_entry(__pte(xtensa_kio_paddr + CA_BYPASS),
 			XCHAL_KIO_BYPASS_VADDR + 6);
 #endif
-
-	local_flush_tlb_all();
-
-	/* Set rasid register to a known value. */
-
-	set_rasid_register(ASID_INSERT(ASID_USER_FIRST));
-
-	/* Set PTEVADDR special register to the start of the page
-	 * table, which is in kernel mappable space (ie. not
-	 * statically mapped).  This register's value is undefined on
-	 * reset.
-	 */
-	set_ptevaddr_register(PGTABLE_START);
 }
diff --git a/arch/xtensa/mm/tlb.c b/arch/xtensa/mm/tlb.c
index 35c8222..59153d0 100644
--- a/arch/xtensa/mm/tlb.c
+++ b/arch/xtensa/mm/tlb.c
@@ -95,10 +95,8 @@ void local_flush_tlb_range(struct vm_area_struct *vma,
 	if (mm->context.asid[cpu] == NO_CONTEXT)
 		return;
 
-#if 0
-	printk("[tlbrange<%02lx,%08lx,%08lx>]\n",
-			(unsigned long)mm->context.asid[cpu], start, end);
-#endif
+	pr_debug("[tlbrange<%02lx,%08lx,%08lx>]\n",
+		 (unsigned long)mm->context.asid[cpu], start, end);
 	local_irq_save(flags);
 
 	if (end-start + (PAGE_SIZE-1) <= _TLB_ENTRIES << PAGE_SHIFT) {
diff --git a/arch/xtensa/platforms/iss/console.c b/arch/xtensa/platforms/iss/console.c
index 464c268..92f567f 100644
--- a/arch/xtensa/platforms/iss/console.c
+++ b/arch/xtensa/platforms/iss/console.c
@@ -185,7 +185,7 @@ int __init rs_init(void)
 
 	serial_driver = alloc_tty_driver(SERIAL_MAX_NUM_LINES);
 
-	printk ("%s %s\n", serial_name, serial_version);
+	pr_info("%s %s\n", serial_name, serial_version);
 
 	/* Initialize the tty_driver structure */
 
@@ -214,7 +214,7 @@ static __exit void rs_exit(void)
 	int error;
 
 	if ((error = tty_unregister_driver(serial_driver)))
-		printk("ISS_SERIAL: failed to unregister serial driver (%d)\n",
+		pr_err("ISS_SERIAL: failed to unregister serial driver (%d)\n",
 		       error);
 	put_tty_driver(serial_driver);
 	tty_port_destroy(&serial_port);
diff --git a/arch/xtensa/platforms/iss/network.c b/arch/xtensa/platforms/iss/network.c
index 6363b18..d027ddd 100644
--- a/arch/xtensa/platforms/iss/network.c
+++ b/arch/xtensa/platforms/iss/network.c
@@ -16,6 +16,8 @@
  *
  */
 
+#define pr_fmt(fmt) "%s: " fmt, __func__
+
 #include <linux/list.h>
 #include <linux/irq.h>
 #include <linux/spinlock.h>
@@ -606,8 +608,6 @@ struct iss_net_init {
  * those fields. They will be later initialized in iss_net_init.
  */
 
-#define ERR KERN_ERR "iss_net_setup: "
-
 static int __init iss_net_setup(char *str)
 {
 	struct iss_net_private *device = NULL;
@@ -619,14 +619,14 @@ static int __init iss_net_setup(char *str)
 
 	end = strchr(str, '=');
 	if (!end) {
-		printk(ERR "Expected '=' after device number\n");
+		pr_err("Expected '=' after device number\n");
 		return 1;
 	}
 	*end = 0;
 	rc = kstrtouint(str, 0, &n);
 	*end = '=';
 	if (rc < 0) {
-		printk(ERR "Failed to parse '%s'\n", str);
+		pr_err("Failed to parse '%s'\n", str);
 		return 1;
 	}
 	str = end;
@@ -642,13 +642,13 @@ static int __init iss_net_setup(char *str)
 	spin_unlock(&devices_lock);
 
 	if (device && device->index == n) {
-		printk(ERR "Device %u already configured\n", n);
+		pr_err("Device %u already configured\n", n);
 		return 1;
 	}
 
 	new = alloc_bootmem(sizeof(*new));
 	if (new == NULL) {
-		printk(ERR "Alloc_bootmem failed\n");
+		pr_err("Alloc_bootmem failed\n");
 		return 1;
 	}
 
@@ -660,8 +660,6 @@ static int __init iss_net_setup(char *str)
 	return 1;
 }
 
-#undef ERR
-
 __setup("eth", iss_net_setup);
 
 /*
diff --git a/block/bfq-cgroup.c b/block/bfq-cgroup.c
index da1525e..d819dc7 100644
--- a/block/bfq-cgroup.c
+++ b/block/bfq-cgroup.c
@@ -775,10 +775,11 @@ static void bfq_pd_offline(struct blkg_policy_data *pd)
 	unsigned long flags;
 	int i;
 
-	if (!entity) /* root group */
-		return;
-
 	spin_lock_irqsave(&bfqd->lock, flags);
+
+	if (!entity) /* root group */
+		goto put_async_queues;
+
 	/*
 	 * Empty all service_trees belonging to this group before
 	 * deactivating the group itself.
@@ -809,6 +810,8 @@ static void bfq_pd_offline(struct blkg_policy_data *pd)
 	}
 
 	__bfq_deactivate_entity(entity, false);
+
+put_async_queues:
 	bfq_put_async_queues(bfqd, bfqg);
 
 	spin_unlock_irqrestore(&bfqd->lock, flags);
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index bcb6d21..47e6ec7 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -166,6 +166,20 @@ static const int bfq_async_charge_factor = 10;
 /* Default timeout values, in jiffies, approximating CFQ defaults. */
 const int bfq_timeout = HZ / 8;
 
+/*
+ * Time limit for merging (see comments in bfq_setup_cooperator). Set
+ * to the slowest value that, in our tests, proved to be effective in
+ * removing false positives, while not causing true positives to miss
+ * queue merging.
+ *
+ * As can be deduced from the low time limit below, queue merging, if
+ * successful, happens at the very beggining of the I/O of the involved
+ * cooperating processes, as a consequence of the arrival of the very
+ * first requests from each cooperator.  After that, there is very
+ * little chance to find cooperators.
+ */
+static const unsigned long bfq_merge_time_limit = HZ/10;
+
 static struct kmem_cache *bfq_pool;
 
 /* Below this threshold (in ns), we consider thinktime immediate. */
@@ -178,7 +192,7 @@ static struct kmem_cache *bfq_pool;
 #define BFQQ_SEEK_THR		(sector_t)(8 * 100)
 #define BFQQ_SECT_THR_NONROT	(sector_t)(2 * 32)
 #define BFQQ_CLOSE_THR		(sector_t)(8 * 1024)
-#define BFQQ_SEEKY(bfqq)	(hweight32(bfqq->seek_history) > 32/8)
+#define BFQQ_SEEKY(bfqq)	(hweight32(bfqq->seek_history) > 19)
 
 /* Min number of samples required to perform peak-rate update */
 #define BFQ_RATE_MIN_SAMPLES	32
@@ -195,15 +209,17 @@ static struct kmem_cache *bfq_pool;
  * interactive applications automatically, using the following formula:
  * duration = (R / r) * T, where r is the peak rate of the device, and
  * R and T are two reference parameters.
- * In particular, R is the peak rate of the reference device (see below),
- * and T is a reference time: given the systems that are likely to be
- * installed on the reference device according to its speed class, T is
- * about the maximum time needed, under BFQ and while reading two files in
- * parallel, to load typical large applications on these systems.
- * In practice, the slower/faster the device at hand is, the more/less it
- * takes to load applications with respect to the reference device.
- * Accordingly, the longer/shorter BFQ grants weight raising to interactive
- * applications.
+ * In particular, R is the peak rate of the reference device (see
+ * below), and T is a reference time: given the systems that are
+ * likely to be installed on the reference device according to its
+ * speed class, T is about the maximum time needed, under BFQ and
+ * while reading two files in parallel, to load typical large
+ * applications on these systems (see the comments on
+ * max_service_from_wr below, for more details on how T is obtained).
+ * In practice, the slower/faster the device at hand is, the more/less
+ * it takes to load applications with respect to the reference device.
+ * Accordingly, the longer/shorter BFQ grants weight raising to
+ * interactive applications.
  *
  * BFQ uses four different reference pairs (R, T), depending on:
  * . whether the device is rotational or non-rotational;
@@ -240,6 +256,60 @@ static int T_slow[2];
 static int T_fast[2];
 static int device_speed_thresh[2];
 
+/*
+ * BFQ uses the above-detailed, time-based weight-raising mechanism to
+ * privilege interactive tasks. This mechanism is vulnerable to the
+ * following false positives: I/O-bound applications that will go on
+ * doing I/O for much longer than the duration of weight
+ * raising. These applications have basically no benefit from being
+ * weight-raised at the beginning of their I/O. On the opposite end,
+ * while being weight-raised, these applications
+ * a) unjustly steal throughput to applications that may actually need
+ * low latency;
+ * b) make BFQ uselessly perform device idling; device idling results
+ * in loss of device throughput with most flash-based storage, and may
+ * increase latencies when used purposelessly.
+ *
+ * BFQ tries to reduce these problems, by adopting the following
+ * countermeasure. To introduce this countermeasure, we need first to
+ * finish explaining how the duration of weight-raising for
+ * interactive tasks is computed.
+ *
+ * For a bfq_queue deemed as interactive, the duration of weight
+ * raising is dynamically adjusted, as a function of the estimated
+ * peak rate of the device, so as to be equal to the time needed to
+ * execute the 'largest' interactive task we benchmarked so far. By
+ * largest task, we mean the task for which each involved process has
+ * to do more I/O than for any of the other tasks we benchmarked. This
+ * reference interactive task is the start-up of LibreOffice Writer,
+ * and in this task each process/bfq_queue needs to have at most ~110K
+ * sectors transferred.
+ *
+ * This last piece of information enables BFQ to reduce the actual
+ * duration of weight-raising for at least one class of I/O-bound
+ * applications: those doing sequential or quasi-sequential I/O. An
+ * example is file copy. In fact, once started, the main I/O-bound
+ * processes of these applications usually consume the above 110K
+ * sectors in much less time than the processes of an application that
+ * is starting, because these I/O-bound processes will greedily devote
+ * almost all their CPU cycles only to their target,
+ * throughput-friendly I/O operations. This is even more true if BFQ
+ * happens to be underestimating the device peak rate, and thus
+ * overestimating the duration of weight raising. But, according to
+ * our measurements, once transferred 110K sectors, these processes
+ * have no right to be weight-raised any longer.
+ *
+ * Basing on the last consideration, BFQ ends weight-raising for a
+ * bfq_queue if the latter happens to have received an amount of
+ * service at least equal to the following constant. The constant is
+ * set to slightly more than 110K, to have a minimum safety margin.
+ *
+ * This early ending of weight-raising reduces the amount of time
+ * during which interactive false positives cause the two problems
+ * described at the beginning of these comments.
+ */
+static const unsigned long max_service_from_wr = 120000;
+
 #define RQ_BIC(rq)		icq_to_bic((rq)->elv.priv[0])
 #define RQ_BFQQ(rq)		((rq)->elv.priv[1])
 
@@ -403,6 +473,82 @@ static struct request *bfq_choose_req(struct bfq_data *bfqd,
 	}
 }
 
+/*
+ * See the comments on bfq_limit_depth for the purpose of
+ * the depths set in the function.
+ */
+static void bfq_update_depths(struct bfq_data *bfqd, struct sbitmap_queue *bt)
+{
+	bfqd->sb_shift = bt->sb.shift;
+
+	/*
+	 * In-word depths if no bfq_queue is being weight-raised:
+	 * leaving 25% of tags only for sync reads.
+	 *
+	 * In next formulas, right-shift the value
+	 * (1U<<bfqd->sb_shift), instead of computing directly
+	 * (1U<<(bfqd->sb_shift - something)), to be robust against
+	 * any possible value of bfqd->sb_shift, without having to
+	 * limit 'something'.
+	 */
+	/* no more than 50% of tags for async I/O */
+	bfqd->word_depths[0][0] = max((1U<<bfqd->sb_shift)>>1, 1U);
+	/*
+	 * no more than 75% of tags for sync writes (25% extra tags
+	 * w.r.t. async I/O, to prevent async I/O from starving sync
+	 * writes)
+	 */
+	bfqd->word_depths[0][1] = max(((1U<<bfqd->sb_shift) * 3)>>2, 1U);
+
+	/*
+	 * In-word depths in case some bfq_queue is being weight-
+	 * raised: leaving ~63% of tags for sync reads. This is the
+	 * highest percentage for which, in our tests, application
+	 * start-up times didn't suffer from any regression due to tag
+	 * shortage.
+	 */
+	/* no more than ~18% of tags for async I/O */
+	bfqd->word_depths[1][0] = max(((1U<<bfqd->sb_shift) * 3)>>4, 1U);
+	/* no more than ~37% of tags for sync writes (~20% extra tags) */
+	bfqd->word_depths[1][1] = max(((1U<<bfqd->sb_shift) * 6)>>4, 1U);
+}
+
+/*
+ * Async I/O can easily starve sync I/O (both sync reads and sync
+ * writes), by consuming all tags. Similarly, storms of sync writes,
+ * such as those that sync(2) may trigger, can starve sync reads.
+ * Limit depths of async I/O and sync writes so as to counter both
+ * problems.
+ */
+static void bfq_limit_depth(unsigned int op, struct blk_mq_alloc_data *data)
+{
+	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
+	struct bfq_data *bfqd = data->q->elevator->elevator_data;
+	struct sbitmap_queue *bt;
+
+	if (op_is_sync(op) && !op_is_write(op))
+		return;
+
+	if (data->flags & BLK_MQ_REQ_RESERVED) {
+		if (unlikely(!tags->nr_reserved_tags)) {
+			WARN_ON_ONCE(1);
+			return;
+		}
+		bt = &tags->breserved_tags;
+	} else
+		bt = &tags->bitmap_tags;
+
+	if (unlikely(bfqd->sb_shift != bt->sb.shift))
+		bfq_update_depths(bfqd, bt);
+
+	data->shallow_depth =
+		bfqd->word_depths[!!bfqd->wr_busy_queues][op_is_sync(op)];
+
+	bfq_log(bfqd, "[%s] wr_busy %d sync %d depth %u",
+			__func__, bfqd->wr_busy_queues, op_is_sync(op),
+			data->shallow_depth);
+}
+
 static struct bfq_queue *
 bfq_rq_pos_tree_lookup(struct bfq_data *bfqd, struct rb_root *root,
 		     sector_t sector, struct rb_node **ret_parent,
@@ -444,6 +590,13 @@ bfq_rq_pos_tree_lookup(struct bfq_data *bfqd, struct rb_root *root,
 	return bfqq;
 }
 
+static bool bfq_too_late_for_merging(struct bfq_queue *bfqq)
+{
+	return bfqq->service_from_backlogged > 0 &&
+		time_is_before_jiffies(bfqq->first_IO_time +
+				       bfq_merge_time_limit);
+}
+
 void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 {
 	struct rb_node **p, *parent;
@@ -454,6 +607,14 @@ void bfq_pos_tree_add_move(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 		bfqq->pos_root = NULL;
 	}
 
+	/*
+	 * bfqq cannot be merged any longer (see comments in
+	 * bfq_setup_cooperator): no point in adding bfqq into the
+	 * position tree.
+	 */
+	if (bfq_too_late_for_merging(bfqq))
+		return;
+
 	if (bfq_class_idle(bfqq))
 		return;
 	if (!bfqq->next_rq)
@@ -1247,6 +1408,7 @@ static void bfq_update_bfqq_wr_on_rq_arrival(struct bfq_data *bfqd,
 	if (old_wr_coeff == 1 && wr_or_deserves_wr) {
 		/* start a weight-raising period */
 		if (interactive) {
+			bfqq->service_from_wr = 0;
 			bfqq->wr_coeff = bfqd->bfq_wr_coeff;
 			bfqq->wr_cur_max_time = bfq_wr_duration(bfqd);
 		} else {
@@ -1627,6 +1789,8 @@ static void bfq_remove_request(struct request_queue *q,
 			rb_erase(&bfqq->pos_node, bfqq->pos_root);
 			bfqq->pos_root = NULL;
 		}
+	} else {
+		bfq_pos_tree_add_move(bfqd, bfqq);
 	}
 
 	if (rq->cmd_flags & REQ_META)
@@ -1933,6 +2097,9 @@ bfq_setup_merge(struct bfq_queue *bfqq, struct bfq_queue *new_bfqq)
 static bool bfq_may_be_close_cooperator(struct bfq_queue *bfqq,
 					struct bfq_queue *new_bfqq)
 {
+	if (bfq_too_late_for_merging(new_bfqq))
+		return false;
+
 	if (bfq_class_idle(bfqq) || bfq_class_idle(new_bfqq) ||
 	    (bfqq->ioprio_class != new_bfqq->ioprio_class))
 		return false;
@@ -1957,20 +2124,6 @@ static bool bfq_may_be_close_cooperator(struct bfq_queue *bfqq,
 }
 
 /*
- * If this function returns true, then bfqq cannot be merged. The idea
- * is that true cooperation happens very early after processes start
- * to do I/O. Usually, late cooperations are just accidental false
- * positives. In case bfqq is weight-raised, such false positives
- * would evidently degrade latency guarantees for bfqq.
- */
-static bool wr_from_too_long(struct bfq_queue *bfqq)
-{
-	return bfqq->wr_coeff > 1 &&
-		time_is_before_jiffies(bfqq->last_wr_start_finish +
-				       msecs_to_jiffies(100));
-}
-
-/*
  * Attempt to schedule a merge of bfqq with the currently in-service
  * queue or with a close queue among the scheduled queues.  Return
  * NULL if no merge was scheduled, a pointer to the shared bfq_queue
@@ -1983,11 +2136,6 @@ static bool wr_from_too_long(struct bfq_queue *bfqq)
  * to maintain. Besides, in such a critical condition as an out of memory,
  * the benefits of queue merging may be little relevant, or even negligible.
  *
- * Weight-raised queues can be merged only if their weight-raising
- * period has just started. In fact cooperating processes are usually
- * started together. Thus, with this filter we avoid false positives
- * that would jeopardize low-latency guarantees.
- *
  * WARNING: queue merging may impair fairness among non-weight raised
  * queues, for at least two reasons: 1) the original weight of a
  * merged queue may change during the merged state, 2) even being the
@@ -2001,12 +2149,24 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 {
 	struct bfq_queue *in_service_bfqq, *new_bfqq;
 
+	/*
+	 * Prevent bfqq from being merged if it has been created too
+	 * long ago. The idea is that true cooperating processes, and
+	 * thus their associated bfq_queues, are supposed to be
+	 * created shortly after each other. This is the case, e.g.,
+	 * for KVM/QEMU and dump I/O threads. Basing on this
+	 * assumption, the following filtering greatly reduces the
+	 * probability that two non-cooperating processes, which just
+	 * happen to do close I/O for some short time interval, have
+	 * their queues merged by mistake.
+	 */
+	if (bfq_too_late_for_merging(bfqq))
+		return NULL;
+
 	if (bfqq->new_bfqq)
 		return bfqq->new_bfqq;
 
-	if (!io_struct ||
-	    wr_from_too_long(bfqq) ||
-	    unlikely(bfqq == &bfqd->oom_bfqq))
+	if (!io_struct || unlikely(bfqq == &bfqd->oom_bfqq))
 		return NULL;
 
 	/* If there is only one backlogged queue, don't search. */
@@ -2015,12 +2175,9 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 
 	in_service_bfqq = bfqd->in_service_queue;
 
-	if (!in_service_bfqq || in_service_bfqq == bfqq
-	    || wr_from_too_long(in_service_bfqq) ||
-	    unlikely(in_service_bfqq == &bfqd->oom_bfqq))
-		goto check_scheduled;
-
-	if (bfq_rq_close_to_sector(io_struct, request, bfqd->last_position) &&
+	if (in_service_bfqq && in_service_bfqq != bfqq &&
+	    likely(in_service_bfqq != &bfqd->oom_bfqq) &&
+	    bfq_rq_close_to_sector(io_struct, request, bfqd->last_position) &&
 	    bfqq->entity.parent == in_service_bfqq->entity.parent &&
 	    bfq_may_be_close_cooperator(bfqq, in_service_bfqq)) {
 		new_bfqq = bfq_setup_merge(bfqq, in_service_bfqq);
@@ -2032,12 +2189,10 @@ bfq_setup_cooperator(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	 * queues. The only thing we need is that the bio/request is not
 	 * NULL, as we need it to establish whether a cooperator exists.
 	 */
-check_scheduled:
 	new_bfqq = bfq_find_close_cooperator(bfqd, bfqq,
 			bfq_io_struct_pos(io_struct, request));
 
-	if (new_bfqq && !wr_from_too_long(new_bfqq) &&
-	    likely(new_bfqq != &bfqd->oom_bfqq) &&
+	if (new_bfqq && likely(new_bfqq != &bfqd->oom_bfqq) &&
 	    bfq_may_be_close_cooperator(bfqq, new_bfqq))
 		return bfq_setup_merge(bfqq, new_bfqq);
 
@@ -2062,7 +2217,8 @@ static void bfq_bfqq_save_state(struct bfq_queue *bfqq)
 	bic->saved_in_large_burst = bfq_bfqq_in_large_burst(bfqq);
 	bic->was_in_burst_list = !hlist_unhashed(&bfqq->burst_list_node);
 	if (unlikely(bfq_bfqq_just_created(bfqq) &&
-		     !bfq_bfqq_in_large_burst(bfqq))) {
+		     !bfq_bfqq_in_large_burst(bfqq) &&
+		     bfqq->bfqd->low_latency)) {
 		/*
 		 * bfqq being merged right after being created: bfqq
 		 * would have deserved interactive weight raising, but
@@ -2917,45 +3073,87 @@ static bool bfq_bfqq_is_slow(struct bfq_data *bfqd, struct bfq_queue *bfqq,
  * whereas soft_rt_next_start is set to infinity for applications that do
  * not.
  *
- * Unfortunately, even a greedy application may happen to behave in an
- * isochronous way if the CPU load is high. In fact, the application may
- * stop issuing requests while the CPUs are busy serving other processes,
- * then restart, then stop again for a while, and so on. In addition, if
- * the disk achieves a low enough throughput with the request pattern
- * issued by the application (e.g., because the request pattern is random
- * and/or the device is slow), then the application may meet the above
- * bandwidth requirement too. To prevent such a greedy application to be
- * deemed as soft real-time, a further rule is used in the computation of
- * soft_rt_next_start: soft_rt_next_start must be higher than the current
- * time plus the maximum time for which the arrival of a request is waited
- * for when a sync queue becomes idle, namely bfqd->bfq_slice_idle.
- * This filters out greedy applications, as the latter issue instead their
- * next request as soon as possible after the last one has been completed
- * (in contrast, when a batch of requests is completed, a soft real-time
- * application spends some time processing data).
+ * Unfortunately, even a greedy (i.e., I/O-bound) application may
+ * happen to meet, occasionally or systematically, both the above
+ * bandwidth and isochrony requirements. This may happen at least in
+ * the following circumstances. First, if the CPU load is high. The
+ * application may stop issuing requests while the CPUs are busy
+ * serving other processes, then restart, then stop again for a while,
+ * and so on. The other circumstances are related to the storage
+ * device: the storage device is highly loaded or reaches a low-enough
+ * throughput with the I/O of the application (e.g., because the I/O
+ * is random and/or the device is slow). In all these cases, the
+ * I/O of the application may be simply slowed down enough to meet
+ * the bandwidth and isochrony requirements. To reduce the probability
+ * that greedy applications are deemed as soft real-time in these
+ * corner cases, a further rule is used in the computation of
+ * soft_rt_next_start: the return value of this function is forced to
+ * be higher than the maximum between the following two quantities.
  *
- * Unfortunately, the last filter may easily generate false positives if
- * only bfqd->bfq_slice_idle is used as a reference time interval and one
- * or both the following cases occur:
- * 1) HZ is so low that the duration of a jiffy is comparable to or higher
- *    than bfqd->bfq_slice_idle. This happens, e.g., on slow devices with
- *    HZ=100.
+ * (a) Current time plus: (1) the maximum time for which the arrival
+ *     of a request is waited for when a sync queue becomes idle,
+ *     namely bfqd->bfq_slice_idle, and (2) a few extra jiffies. We
+ *     postpone for a moment the reason for adding a few extra
+ *     jiffies; we get back to it after next item (b).  Lower-bounding
+ *     the return value of this function with the current time plus
+ *     bfqd->bfq_slice_idle tends to filter out greedy applications,
+ *     because the latter issue their next request as soon as possible
+ *     after the last one has been completed. In contrast, a soft
+ *     real-time application spends some time processing data, after a
+ *     batch of its requests has been completed.
+ *
+ * (b) Current value of bfqq->soft_rt_next_start. As pointed out
+ *     above, greedy applications may happen to meet both the
+ *     bandwidth and isochrony requirements under heavy CPU or
+ *     storage-device load. In more detail, in these scenarios, these
+ *     applications happen, only for limited time periods, to do I/O
+ *     slowly enough to meet all the requirements described so far,
+ *     including the filtering in above item (a). These slow-speed
+ *     time intervals are usually interspersed between other time
+ *     intervals during which these applications do I/O at a very high
+ *     speed. Fortunately, exactly because of the high speed of the
+ *     I/O in the high-speed intervals, the values returned by this
+ *     function happen to be so high, near the end of any such
+ *     high-speed interval, to be likely to fall *after* the end of
+ *     the low-speed time interval that follows. These high values are
+ *     stored in bfqq->soft_rt_next_start after each invocation of
+ *     this function. As a consequence, if the last value of
+ *     bfqq->soft_rt_next_start is constantly used to lower-bound the
+ *     next value that this function may return, then, from the very
+ *     beginning of a low-speed interval, bfqq->soft_rt_next_start is
+ *     likely to be constantly kept so high that any I/O request
+ *     issued during the low-speed interval is considered as arriving
+ *     to soon for the application to be deemed as soft
+ *     real-time. Then, in the high-speed interval that follows, the
+ *     application will not be deemed as soft real-time, just because
+ *     it will do I/O at a high speed. And so on.
+ *
+ * Getting back to the filtering in item (a), in the following two
+ * cases this filtering might be easily passed by a greedy
+ * application, if the reference quantity was just
+ * bfqd->bfq_slice_idle:
+ * 1) HZ is so low that the duration of a jiffy is comparable to or
+ *    higher than bfqd->bfq_slice_idle. This happens, e.g., on slow
+ *    devices with HZ=100. The time granularity may be so coarse
+ *    that the approximation, in jiffies, of bfqd->bfq_slice_idle
+ *    is rather lower than the exact value.
  * 2) jiffies, instead of increasing at a constant rate, may stop increasing
  *    for a while, then suddenly 'jump' by several units to recover the lost
  *    increments. This seems to happen, e.g., inside virtual machines.
- * To address this issue, we do not use as a reference time interval just
- * bfqd->bfq_slice_idle, but bfqd->bfq_slice_idle plus a few jiffies. In
- * particular we add the minimum number of jiffies for which the filter
- * seems to be quite precise also in embedded systems and KVM/QEMU virtual
- * machines.
+ * To address this issue, in the filtering in (a) we do not use as a
+ * reference time interval just bfqd->bfq_slice_idle, but
+ * bfqd->bfq_slice_idle plus a few jiffies. In particular, we add the
+ * minimum number of jiffies for which the filter seems to be quite
+ * precise also in embedded systems and KVM/QEMU virtual machines.
  */
 static unsigned long bfq_bfqq_softrt_next_start(struct bfq_data *bfqd,
 						struct bfq_queue *bfqq)
 {
-	return max(bfqq->last_idle_bklogged +
-		   HZ * bfqq->service_from_backlogged /
-		   bfqd->bfq_wr_max_softrt_rate,
-		   jiffies + nsecs_to_jiffies(bfqq->bfqd->bfq_slice_idle) + 4);
+	return max3(bfqq->soft_rt_next_start,
+		    bfqq->last_idle_bklogged +
+		    HZ * bfqq->service_from_backlogged /
+		    bfqd->bfq_wr_max_softrt_rate,
+		    jiffies + nsecs_to_jiffies(bfqq->bfqd->bfq_slice_idle) + 4);
 }
 
 /**
@@ -3000,17 +3198,6 @@ void bfq_bfqq_expire(struct bfq_data *bfqd,
 	slow = bfq_bfqq_is_slow(bfqd, bfqq, compensate, reason, &delta);
 
 	/*
-	 * Increase service_from_backlogged before next statement,
-	 * because the possible next invocation of
-	 * bfq_bfqq_charge_time would likely inflate
-	 * entity->service. In contrast, service_from_backlogged must
-	 * contain real service, to enable the soft real-time
-	 * heuristic to correctly compute the bandwidth consumed by
-	 * bfqq.
-	 */
-	bfqq->service_from_backlogged += entity->service;
-
-	/*
 	 * As above explained, charge slow (typically seeky) and
 	 * timed-out queues with the time and not the service
 	 * received, to favor sequential workloads.
@@ -3535,6 +3722,12 @@ static void bfq_update_wr_data(struct bfq_data *bfqd, struct bfq_queue *bfqq)
 				bfqq->entity.prio_changed = 1;
 			}
 		}
+		if (bfqq->wr_coeff > 1 &&
+		    bfqq->wr_cur_max_time != bfqd->bfq_wr_rt_max_time &&
+		    bfqq->service_from_wr > max_service_from_wr) {
+			/* see comments on max_service_from_wr */
+			bfq_bfqq_end_wr(bfqq);
+		}
 	}
 	/*
 	 * To improve latency (for this or other queues), immediately
@@ -3630,8 +3823,8 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 		}
 
 		/*
-		 * We exploit the put_rq_private hook to decrement
-		 * rq_in_driver, but put_rq_private will not be
+		 * We exploit the bfq_finish_request hook to decrement
+		 * rq_in_driver, but bfq_finish_request will not be
 		 * invoked on this request. So, to avoid unbalance,
 		 * just start this request, without incrementing
 		 * rq_in_driver. As a negative consequence,
@@ -3640,14 +3833,14 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 		 * bfq_schedule_dispatch to be invoked uselessly.
 		 *
 		 * As for implementing an exact solution, the
-		 * put_request hook, if defined, is probably invoked
-		 * also on this request. So, by exploiting this hook,
-		 * we could 1) increment rq_in_driver here, and 2)
-		 * decrement it in put_request. Such a solution would
-		 * let the value of the counter be always accurate,
-		 * but it would entail using an extra interface
-		 * function. This cost seems higher than the benefit,
-		 * being the frequency of non-elevator-private
+		 * bfq_finish_request hook, if defined, is probably
+		 * invoked also on this request. So, by exploiting
+		 * this hook, we could 1) increment rq_in_driver here,
+		 * and 2) decrement it in bfq_finish_request. Such a
+		 * solution would let the value of the counter be
+		 * always accurate, but it would entail using an extra
+		 * interface function. This cost seems higher than the
+		 * benefit, being the frequency of non-elevator-private
 		 * requests very low.
 		 */
 		goto start_rq;
@@ -3689,35 +3882,16 @@ static struct request *__bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	return rq;
 }
 
-static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
+#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
+static void bfq_update_dispatch_stats(struct request_queue *q,
+				      struct request *rq,
+				      struct bfq_queue *in_serv_queue,
+				      bool idle_timer_disabled)
 {
-	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
-	struct request *rq;
-#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
-	struct bfq_queue *in_serv_queue, *bfqq;
-	bool waiting_rq, idle_timer_disabled;
-#endif
+	struct bfq_queue *bfqq = rq ? RQ_BFQQ(rq) : NULL;
 
-	spin_lock_irq(&bfqd->lock);
-
-#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
-	in_serv_queue = bfqd->in_service_queue;
-	waiting_rq = in_serv_queue && bfq_bfqq_wait_request(in_serv_queue);
-
-	rq = __bfq_dispatch_request(hctx);
-
-	idle_timer_disabled =
-		waiting_rq && !bfq_bfqq_wait_request(in_serv_queue);
-
-#else
-	rq = __bfq_dispatch_request(hctx);
-#endif
-	spin_unlock_irq(&bfqd->lock);
-
-#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
-	bfqq = rq ? RQ_BFQQ(rq) : NULL;
 	if (!idle_timer_disabled && !bfqq)
-		return rq;
+		return;
 
 	/*
 	 * rq and bfqq are guaranteed to exist until this function
@@ -3732,7 +3906,7 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	 * In addition, the following queue lock guarantees that
 	 * bfqq_group(bfqq) exists as well.
 	 */
-	spin_lock_irq(hctx->queue->queue_lock);
+	spin_lock_irq(q->queue_lock);
 	if (idle_timer_disabled)
 		/*
 		 * Since the idle timer has been disabled,
@@ -3751,9 +3925,37 @@ static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
 		bfqg_stats_set_start_empty_time(bfqg);
 		bfqg_stats_update_io_remove(bfqg, rq->cmd_flags);
 	}
-	spin_unlock_irq(hctx->queue->queue_lock);
+	spin_unlock_irq(q->queue_lock);
+}
+#else
+static inline void bfq_update_dispatch_stats(struct request_queue *q,
+					     struct request *rq,
+					     struct bfq_queue *in_serv_queue,
+					     bool idle_timer_disabled) {}
 #endif
 
+static struct request *bfq_dispatch_request(struct blk_mq_hw_ctx *hctx)
+{
+	struct bfq_data *bfqd = hctx->queue->elevator->elevator_data;
+	struct request *rq;
+	struct bfq_queue *in_serv_queue;
+	bool waiting_rq, idle_timer_disabled;
+
+	spin_lock_irq(&bfqd->lock);
+
+	in_serv_queue = bfqd->in_service_queue;
+	waiting_rq = in_serv_queue && bfq_bfqq_wait_request(in_serv_queue);
+
+	rq = __bfq_dispatch_request(hctx);
+
+	idle_timer_disabled =
+		waiting_rq && !bfq_bfqq_wait_request(in_serv_queue);
+
+	spin_unlock_irq(&bfqd->lock);
+
+	bfq_update_dispatch_stats(hctx->queue, rq, in_serv_queue,
+				  idle_timer_disabled);
+
 	return rq;
 }
 
@@ -4002,10 +4204,15 @@ static void bfq_init_bfqq(struct bfq_data *bfqd, struct bfq_queue *bfqq,
 	bfqq->split_time = bfq_smallest_from_now();
 
 	/*
-	 * Set to the value for which bfqq will not be deemed as
-	 * soft rt when it becomes backlogged.
+	 * To not forget the possibly high bandwidth consumed by a
+	 * process/queue in the recent past,
+	 * bfq_bfqq_softrt_next_start() returns a value at least equal
+	 * to the current value of bfqq->soft_rt_next_start (see
+	 * comments on bfq_bfqq_softrt_next_start).  Set
+	 * soft_rt_next_start to now, to mean that bfqq has consumed
+	 * no bandwidth so far.
 	 */
-	bfqq->soft_rt_next_start = bfq_greatest_from_now();
+	bfqq->soft_rt_next_start = jiffies;
 
 	/* first request is almost certainly seeky */
 	bfqq->seek_history = 1;
@@ -4276,16 +4483,46 @@ static bool __bfq_insert_request(struct bfq_data *bfqd, struct request *rq)
 	return idle_timer_disabled;
 }
 
+#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
+static void bfq_update_insert_stats(struct request_queue *q,
+				    struct bfq_queue *bfqq,
+				    bool idle_timer_disabled,
+				    unsigned int cmd_flags)
+{
+	if (!bfqq)
+		return;
+
+	/*
+	 * bfqq still exists, because it can disappear only after
+	 * either it is merged with another queue, or the process it
+	 * is associated with exits. But both actions must be taken by
+	 * the same process currently executing this flow of
+	 * instructions.
+	 *
+	 * In addition, the following queue lock guarantees that
+	 * bfqq_group(bfqq) exists as well.
+	 */
+	spin_lock_irq(q->queue_lock);
+	bfqg_stats_update_io_add(bfqq_group(bfqq), bfqq, cmd_flags);
+	if (idle_timer_disabled)
+		bfqg_stats_update_idle_time(bfqq_group(bfqq));
+	spin_unlock_irq(q->queue_lock);
+}
+#else
+static inline void bfq_update_insert_stats(struct request_queue *q,
+					   struct bfq_queue *bfqq,
+					   bool idle_timer_disabled,
+					   unsigned int cmd_flags) {}
+#endif
+
 static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 			       bool at_head)
 {
 	struct request_queue *q = hctx->queue;
 	struct bfq_data *bfqd = q->elevator->elevator_data;
-#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
 	struct bfq_queue *bfqq = RQ_BFQQ(rq);
 	bool idle_timer_disabled = false;
 	unsigned int cmd_flags;
-#endif
 
 	spin_lock_irq(&bfqd->lock);
 	if (blk_mq_sched_try_insert_merge(q, rq)) {
@@ -4304,7 +4541,6 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		else
 			list_add_tail(&rq->queuelist, &bfqd->dispatch);
 	} else {
-#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
 		idle_timer_disabled = __bfq_insert_request(bfqd, rq);
 		/*
 		 * Update bfqq, because, if a queue merge has occurred
@@ -4312,9 +4548,6 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		 * redirected into a new queue.
 		 */
 		bfqq = RQ_BFQQ(rq);
-#else
-		__bfq_insert_request(bfqd, rq);
-#endif
 
 		if (rq_mergeable(rq)) {
 			elv_rqhash_add(q, rq);
@@ -4323,35 +4556,17 @@ static void bfq_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		}
 	}
 
-#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
 	/*
 	 * Cache cmd_flags before releasing scheduler lock, because rq
 	 * may disappear afterwards (for example, because of a request
 	 * merge).
 	 */
 	cmd_flags = rq->cmd_flags;
-#endif
+
 	spin_unlock_irq(&bfqd->lock);
 
-#if defined(CONFIG_BFQ_GROUP_IOSCHED) && defined(CONFIG_DEBUG_BLK_CGROUP)
-	if (!bfqq)
-		return;
-	/*
-	 * bfqq still exists, because it can disappear only after
-	 * either it is merged with another queue, or the process it
-	 * is associated with exits. But both actions must be taken by
-	 * the same process currently executing this flow of
-	 * instruction.
-	 *
-	 * In addition, the following queue lock guarantees that
-	 * bfqq_group(bfqq) exists as well.
-	 */
-	spin_lock_irq(q->queue_lock);
-	bfqg_stats_update_io_add(bfqq_group(bfqq), bfqq, cmd_flags);
-	if (idle_timer_disabled)
-		bfqg_stats_update_idle_time(bfqq_group(bfqq));
-	spin_unlock_irq(q->queue_lock);
-#endif
+	bfq_update_insert_stats(q, bfqq, idle_timer_disabled,
+				cmd_flags);
 }
 
 static void bfq_insert_requests(struct blk_mq_hw_ctx *hctx,
@@ -4482,7 +4697,7 @@ static void bfq_completed_request(struct bfq_queue *bfqq, struct bfq_data *bfqd)
 		bfq_schedule_dispatch(bfqd);
 }
 
-static void bfq_put_rq_priv_body(struct bfq_queue *bfqq)
+static void bfq_finish_request_body(struct bfq_queue *bfqq)
 {
 	bfqq->allocated--;
 
@@ -4512,7 +4727,7 @@ static void bfq_finish_request(struct request *rq)
 		spin_lock_irqsave(&bfqd->lock, flags);
 
 		bfq_completed_request(bfqq, bfqd);
-		bfq_put_rq_priv_body(bfqq);
+		bfq_finish_request_body(bfqq);
 
 		spin_unlock_irqrestore(&bfqd->lock, flags);
 	} else {
@@ -4533,7 +4748,7 @@ static void bfq_finish_request(struct request *rq)
 			bfqg_stats_update_io_remove(bfqq_group(bfqq),
 						    rq->cmd_flags);
 		}
-		bfq_put_rq_priv_body(bfqq);
+		bfq_finish_request_body(bfqq);
 	}
 
 	rq->elv.priv[0] = NULL;
@@ -4818,6 +5033,9 @@ static void bfq_exit_queue(struct elevator_queue *e)
 	hrtimer_cancel(&bfqd->idle_slice_timer);
 
 #ifdef CONFIG_BFQ_GROUP_IOSCHED
+	/* release oom-queue reference to root group */
+	bfqg_and_blkg_put(bfqd->root_group);
+
 	blkcg_deactivate_policy(bfqd->queue, &blkcg_policy_bfq);
 #else
 	spin_lock_irq(&bfqd->lock);
@@ -5206,6 +5424,7 @@ static struct elv_fs_entry bfq_attrs[] = {
 
 static struct elevator_type iosched_bfq_mq = {
 	.ops.mq = {
+		.limit_depth		= bfq_limit_depth,
 		.prepare_request	= bfq_prepare_request,
 		.finish_request		= bfq_finish_request,
 		.exit_icq		= bfq_exit_icq,
diff --git a/block/bfq-iosched.h b/block/bfq-iosched.h
index 91c4390..350c39a 100644
--- a/block/bfq-iosched.h
+++ b/block/bfq-iosched.h
@@ -337,6 +337,11 @@ struct bfq_queue {
 	 * last transition from idle to backlogged.
 	 */
 	unsigned long service_from_backlogged;
+	/*
+	 * Cumulative service received from the @bfq_queue since its
+	 * last transition to weight-raised state.
+	 */
+	unsigned long service_from_wr;
 
 	/*
 	 * Value of wr start time when switching to soft rt
@@ -344,6 +349,8 @@ struct bfq_queue {
 	unsigned long wr_start_at_switch_to_srt;
 
 	unsigned long split_time; /* time of last split */
+
+	unsigned long first_IO_time; /* time of first I/O for this queue */
 };
 
 /**
@@ -627,6 +634,18 @@ struct bfq_data {
 	struct bfq_io_cq *bio_bic;
 	/* bfqq associated with the task issuing current bio for merging */
 	struct bfq_queue *bio_bfqq;
+
+	/*
+	 * Cached sbitmap shift, used to compute depth limits in
+	 * bfq_update_depths.
+	 */
+	unsigned int sb_shift;
+
+	/*
+	 * Depth limits used in bfq_limit_depth (see comments on the
+	 * function)
+	 */
+	unsigned int word_depths[2][2];
 };
 
 enum bfqq_state_flags {
diff --git a/block/bfq-wf2q.c b/block/bfq-wf2q.c
index e495d3f..4498c43 100644
--- a/block/bfq-wf2q.c
+++ b/block/bfq-wf2q.c
@@ -835,6 +835,13 @@ void bfq_bfqq_served(struct bfq_queue *bfqq, int served)
 	struct bfq_entity *entity = &bfqq->entity;
 	struct bfq_service_tree *st;
 
+	if (!bfqq->service_from_backlogged)
+		bfqq->first_IO_time = jiffies;
+
+	if (bfqq->wr_coeff > 1)
+		bfqq->service_from_wr += served;
+
+	bfqq->service_from_backlogged += served;
 	for_each_entity(entity) {
 		st = bfq_entity_service_tree(entity);
 
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 23b42e8..9cfdd6c 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -374,7 +374,6 @@ static void bio_integrity_verify_fn(struct work_struct *work)
 /**
  * __bio_integrity_endio - Integrity I/O completion function
  * @bio:	Protected bio
- * @error:	Pointer to errno
  *
  * Description: Completion for integrity I/O
  *
diff --git a/block/bio.c b/block/bio.c
index 9ef6cf3..e1708db 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -971,34 +971,6 @@ void bio_advance(struct bio *bio, unsigned bytes)
 EXPORT_SYMBOL(bio_advance);
 
 /**
- * bio_alloc_pages - allocates a single page for each bvec in a bio
- * @bio: bio to allocate pages for
- * @gfp_mask: flags for allocation
- *
- * Allocates pages up to @bio->bi_vcnt.
- *
- * Returns 0 on success, -ENOMEM on failure. On failure, any allocated pages are
- * freed.
- */
-int bio_alloc_pages(struct bio *bio, gfp_t gfp_mask)
-{
-	int i;
-	struct bio_vec *bv;
-
-	bio_for_each_segment_all(bv, bio, i) {
-		bv->bv_page = alloc_page(gfp_mask);
-		if (!bv->bv_page) {
-			while (--bv >= bio->bi_io_vec)
-				__free_page(bv->bv_page);
-			return -ENOMEM;
-		}
-	}
-
-	return 0;
-}
-EXPORT_SYMBOL(bio_alloc_pages);
-
-/**
  * bio_copy_data - copy contents of data buffers from one chain of bios to
  * another
  * @src: source bio list
@@ -1838,7 +1810,7 @@ struct bio *bio_split(struct bio *bio, int sectors,
 	bio_advance(bio, split->bi_iter.bi_size);
 
 	if (bio_flagged(bio, BIO_TRACE_COMPLETION))
-		bio_set_flag(bio, BIO_TRACE_COMPLETION);
+		bio_set_flag(split, BIO_TRACE_COMPLETION);
 
 	return split;
 }
diff --git a/block/blk-core.c b/block/blk-core.c
index b888175..a2005a4 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -126,6 +126,8 @@ void blk_rq_init(struct request_queue *q, struct request *rq)
 	rq->start_time = jiffies;
 	set_start_time_ns(rq);
 	rq->part = NULL;
+	seqcount_init(&rq->gstate_seq);
+	u64_stats_init(&rq->aborted_gstate_sync);
 }
 EXPORT_SYMBOL(blk_rq_init);
 
@@ -562,6 +564,13 @@ static void __blk_drain_queue(struct request_queue *q, bool drain_all)
 	}
 }
 
+void blk_drain_queue(struct request_queue *q)
+{
+	spin_lock_irq(q->queue_lock);
+	__blk_drain_queue(q, true);
+	spin_unlock_irq(q->queue_lock);
+}
+
 /**
  * blk_queue_bypass_start - enter queue bypass mode
  * @q: queue of interest
@@ -689,11 +698,18 @@ void blk_cleanup_queue(struct request_queue *q)
 	 */
 	blk_freeze_queue(q);
 	spin_lock_irq(lock);
-	if (!q->mq_ops)
-		__blk_drain_queue(q, true);
 	queue_flag_set(QUEUE_FLAG_DEAD, q);
 	spin_unlock_irq(lock);
 
+	/*
+	 * make sure all in-progress dispatch are completed because
+	 * blk_freeze_queue() can only complete all requests, and
+	 * dispatch may still be in-progress since we dispatch requests
+	 * from more than one contexts
+	 */
+	if (q->mq_ops)
+		blk_mq_quiesce_queue(q);
+
 	/* for synchronous bio-based driver finish in-flight integrity i/o */
 	blk_flush_integrity();
 
@@ -1641,6 +1657,7 @@ void __blk_put_request(struct request_queue *q, struct request *req)
 
 	lockdep_assert_held(q->queue_lock);
 
+	blk_req_zone_write_unlock(req);
 	blk_pm_put_request(req);
 
 	elv_completed_request(q, req);
@@ -2050,6 +2067,21 @@ static inline bool should_fail_request(struct hd_struct *part,
 
 #endif /* CONFIG_FAIL_MAKE_REQUEST */
 
+static inline bool bio_check_ro(struct bio *bio, struct hd_struct *part)
+{
+	if (part->policy && op_is_write(bio_op(bio))) {
+		char b[BDEVNAME_SIZE];
+
+		printk(KERN_ERR
+		       "generic_make_request: Trying to write "
+			"to read-only block-device %s (partno %d)\n",
+			bio_devname(bio, b), part->partno);
+		return true;
+	}
+
+	return false;
+}
+
 /*
  * Remap block n of partition p to block n+start(p) of the disk.
  */
@@ -2058,27 +2090,28 @@ static inline int blk_partition_remap(struct bio *bio)
 	struct hd_struct *p;
 	int ret = 0;
 
+	rcu_read_lock();
+	p = __disk_get_part(bio->bi_disk, bio->bi_partno);
+	if (unlikely(!p || should_fail_request(p, bio->bi_iter.bi_size) ||
+		     bio_check_ro(bio, p))) {
+		ret = -EIO;
+		goto out;
+	}
+
 	/*
 	 * Zone reset does not include bi_size so bio_sectors() is always 0.
 	 * Include a test for the reset op code and perform the remap if needed.
 	 */
-	if (!bio->bi_partno ||
-	    (!bio_sectors(bio) && bio_op(bio) != REQ_OP_ZONE_RESET))
-		return 0;
+	if (!bio_sectors(bio) && bio_op(bio) != REQ_OP_ZONE_RESET)
+		goto out;
 
-	rcu_read_lock();
-	p = __disk_get_part(bio->bi_disk, bio->bi_partno);
-	if (likely(p && !should_fail_request(p, bio->bi_iter.bi_size))) {
-		bio->bi_iter.bi_sector += p->start_sect;
-		bio->bi_partno = 0;
-		trace_block_bio_remap(bio->bi_disk->queue, bio, part_devt(p),
-				bio->bi_iter.bi_sector - p->start_sect);
-	} else {
-		printk("%s: fail for partition %d\n", __func__, bio->bi_partno);
-		ret = -EIO;
-	}
+	bio->bi_iter.bi_sector += p->start_sect;
+	bio->bi_partno = 0;
+	trace_block_bio_remap(bio->bi_disk->queue, bio, part_devt(p),
+			      bio->bi_iter.bi_sector - p->start_sect);
+
+out:
 	rcu_read_unlock();
-
 	return ret;
 }
 
@@ -2137,15 +2170,19 @@ generic_make_request_checks(struct bio *bio)
 	 * For a REQ_NOWAIT based request, return -EOPNOTSUPP
 	 * if queue is not a request based queue.
 	 */
-
 	if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_rq_based(q))
 		goto not_supported;
 
 	if (should_fail_request(&bio->bi_disk->part0, bio->bi_iter.bi_size))
 		goto end_io;
 
-	if (blk_partition_remap(bio))
-		goto end_io;
+	if (!bio->bi_partno) {
+		if (unlikely(bio_check_ro(bio, &bio->bi_disk->part0)))
+			goto end_io;
+	} else {
+		if (blk_partition_remap(bio))
+			goto end_io;
+	}
 
 	if (bio_check_eod(bio, nr_sectors))
 		goto end_io;
@@ -2488,8 +2525,7 @@ blk_status_t blk_insert_cloned_request(struct request_queue *q, struct request *
 		 * bypass a potential scheduler on the bottom device for
 		 * insert.
 		 */
-		blk_mq_request_bypass_insert(rq, true);
-		return BLK_STS_OK;
+		return blk_mq_request_issue_directly(rq);
 	}
 
 	spin_lock_irqsave(q->queue_lock, flags);
@@ -2841,7 +2877,7 @@ void blk_start_request(struct request *req)
 		wbt_issue(req->q->rq_wb, &req->issue_stat);
 	}
 
-	BUG_ON(test_bit(REQ_ATOM_COMPLETE, &req->atomic_flags));
+	BUG_ON(blk_rq_is_complete(req));
 	blk_add_timer(req);
 }
 EXPORT_SYMBOL(blk_start_request);
@@ -3410,20 +3446,6 @@ int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork,
 }
 EXPORT_SYMBOL(kblockd_mod_delayed_work_on);
 
-int kblockd_schedule_delayed_work(struct delayed_work *dwork,
-				  unsigned long delay)
-{
-	return queue_delayed_work(kblockd_workqueue, dwork, delay);
-}
-EXPORT_SYMBOL(kblockd_schedule_delayed_work);
-
-int kblockd_schedule_delayed_work_on(int cpu, struct delayed_work *dwork,
-				     unsigned long delay)
-{
-	return queue_delayed_work_on(cpu, kblockd_workqueue, dwork, delay);
-}
-EXPORT_SYMBOL(kblockd_schedule_delayed_work_on);
-
 /**
  * blk_start_plug - initialize blk_plug and track it inside the task_struct
  * @plug:	The &struct blk_plug that needs to be initialized
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 5c0f3dc..f7b292f 100644
--- a/block/blk-exec.c
+++ b/block/blk-exec.c
@@ -61,7 +61,7 @@ void blk_execute_rq_nowait(struct request_queue *q, struct gendisk *bd_disk,
 	 * be reused after dying flag is set
 	 */
 	if (q->mq_ops) {
-		blk_mq_sched_insert_request(rq, at_head, true, false, false);
+		blk_mq_sched_insert_request(rq, at_head, true, false);
 		return;
 	}
 
diff --git a/block/blk-lib.c b/block/blk-lib.c
index 2bc544c..a676084d 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -37,6 +37,9 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 	if (!q)
 		return -ENXIO;
 
+	if (bdev_read_only(bdev))
+		return -EPERM;
+
 	if (flags & BLKDEV_DISCARD_SECURE) {
 		if (!blk_queue_secure_erase(q))
 			return -EOPNOTSUPP;
@@ -156,6 +159,9 @@ static int __blkdev_issue_write_same(struct block_device *bdev, sector_t sector,
 	if (!q)
 		return -ENXIO;
 
+	if (bdev_read_only(bdev))
+		return -EPERM;
+
 	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
 	if ((sector | nr_sects) & bs_mask)
 		return -EINVAL;
@@ -233,6 +239,9 @@ static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 	if (!q)
 		return -ENXIO;
 
+	if (bdev_read_only(bdev))
+		return -EPERM;
+
 	/* Ensure that max_write_zeroes_sectors doesn't overflow bi_size */
 	max_write_zeroes_sectors = bdev_write_zeroes_sectors(bdev);
 
@@ -287,6 +296,9 @@ static int __blkdev_issue_zero_pages(struct block_device *bdev,
 	if (!q)
 		return -ENXIO;
 
+	if (bdev_read_only(bdev))
+		return -EPERM;
+
 	while (nr_sects != 0) {
 		bio = next_bio(bio, __blkdev_sectors_to_bio_pages(nr_sects),
 			       gfp_mask);
diff --git a/block/blk-map.c b/block/blk-map.c
index d3a9471..db9373b 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -119,7 +119,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 	unsigned long align = q->dma_pad_mask | queue_dma_alignment(q);
 	struct bio *bio = NULL;
 	struct iov_iter i;
-	int ret;
+	int ret = -EINVAL;
 
 	if (!iter_is_iovec(iter))
 		goto fail;
@@ -148,7 +148,7 @@ int blk_rq_map_user_iov(struct request_queue *q, struct request *rq,
 	__blk_rq_unmap_user(bio);
 fail:
 	rq->bio = NULL;
-	return -EINVAL;
+	return ret;
 }
 EXPORT_SYMBOL(blk_rq_map_user_iov);
 
diff --git a/block/blk-merge.c b/block/blk-merge.c
index f5dedd5..8452fc7 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -128,9 +128,7 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 				nsegs++;
 				sectors = max_sectors;
 			}
-			if (sectors)
-				goto split;
-			/* Make this single bvec as the 1st segment */
+			goto split;
 		}
 
 		if (bvprvp && blk_queue_cluster(q)) {
@@ -146,22 +144,21 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 			bvprvp = &bvprv;
 			sectors += bv.bv_len >> 9;
 
-			if (nsegs == 1 && seg_size > front_seg_size)
-				front_seg_size = seg_size;
 			continue;
 		}
 new_segment:
 		if (nsegs == queue_max_segments(q))
 			goto split;
 
+		if (nsegs == 1 && seg_size > front_seg_size)
+			front_seg_size = seg_size;
+
 		nsegs++;
 		bvprv = bv;
 		bvprvp = &bvprv;
 		seg_size = bv.bv_len;
 		sectors += bv.bv_len >> 9;
 
-		if (nsegs == 1 && seg_size > front_seg_size)
-			front_seg_size = seg_size;
 	}
 
 	do_split = false;
@@ -174,6 +171,8 @@ static struct bio *blk_bio_segment_split(struct request_queue *q,
 			bio = new;
 	}
 
+	if (nsegs == 1 && seg_size > front_seg_size)
+		front_seg_size = seg_size;
 	bio->bi_seg_front_size = front_seg_size;
 	if (seg_size > bio->bi_seg_back_size)
 		bio->bi_seg_back_size = seg_size;
diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c
index b56a4f3..21cbc1f 100644
--- a/block/blk-mq-debugfs.c
+++ b/block/blk-mq-debugfs.c
@@ -289,17 +289,12 @@ static const char *const rqf_name[] = {
 	RQF_NAME(HASHED),
 	RQF_NAME(STATS),
 	RQF_NAME(SPECIAL_PAYLOAD),
+	RQF_NAME(ZONE_WRITE_LOCKED),
+	RQF_NAME(MQ_TIMEOUT_EXPIRED),
+	RQF_NAME(MQ_POLL_SLEPT),
 };
 #undef RQF_NAME
 
-#define RQAF_NAME(name) [REQ_ATOM_##name] = #name
-static const char *const rqaf_name[] = {
-	RQAF_NAME(COMPLETE),
-	RQAF_NAME(STARTED),
-	RQAF_NAME(POLL_SLEPT),
-};
-#undef RQAF_NAME
-
 int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
 {
 	const struct blk_mq_ops *const mq_ops = rq->q->mq_ops;
@@ -316,8 +311,7 @@ int __blk_mq_debugfs_rq_show(struct seq_file *m, struct request *rq)
 	seq_puts(m, ", .rq_flags=");
 	blk_flags_show(m, (__force unsigned int)rq->rq_flags, rqf_name,
 		       ARRAY_SIZE(rqf_name));
-	seq_puts(m, ", .atomic_flags=");
-	blk_flags_show(m, rq->atomic_flags, rqaf_name, ARRAY_SIZE(rqaf_name));
+	seq_printf(m, ", complete=%d", blk_rq_is_complete(rq));
 	seq_printf(m, ", .tag=%d, .internal_tag=%d", rq->tag,
 		   rq->internal_tag);
 	if (mq_ops->show_rq)
@@ -409,7 +403,7 @@ static void hctx_show_busy_rq(struct request *rq, void *data, bool reserved)
 	const struct show_busy_params *params = data;
 
 	if (blk_mq_map_queue(rq->q, rq->mq_ctx->cpu) == params->hctx &&
-	    test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
+	    blk_mq_rq_state(rq) != MQ_RQ_IDLE)
 		__blk_mq_debugfs_rq_show(params->m,
 					 list_entry_rq(&rq->queuelist));
 }
@@ -703,7 +697,11 @@ static ssize_t blk_mq_debugfs_write(struct file *file, const char __user *buf,
 	const struct blk_mq_debugfs_attr *attr = m->private;
 	void *data = d_inode(file->f_path.dentry->d_parent)->i_private;
 
-	if (!attr->write)
+	/*
+	 * Attributes that only implement .seq_ops are read-only and 'attr' is
+	 * the same with 'data' in this case.
+	 */
+	if (attr == data || !attr->write)
 		return -EPERM;
 
 	return attr->write(data, buf, count, ppos);
diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index c117bd8..55c0a74 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -172,7 +172,6 @@ static void blk_mq_do_dispatch_ctx(struct blk_mq_hw_ctx *hctx)
 	WRITE_ONCE(hctx->dispatch_from, ctx);
 }
 
-/* return true if hw queue need to be run again */
 void blk_mq_sched_dispatch_requests(struct blk_mq_hw_ctx *hctx)
 {
 	struct request_queue *q = hctx->queue;
@@ -428,7 +427,7 @@ void blk_mq_sched_restart(struct blk_mq_hw_ctx *const hctx)
 }
 
 void blk_mq_sched_insert_request(struct request *rq, bool at_head,
-				 bool run_queue, bool async, bool can_block)
+				 bool run_queue, bool async)
 {
 	struct request_queue *q = rq->q;
 	struct elevator_queue *e = q->elevator;
diff --git a/block/blk-mq-sched.h b/block/blk-mq-sched.h
index ba1d141..1e9c901 100644
--- a/block/blk-mq-sched.h
+++ b/block/blk-mq-sched.h
@@ -18,7 +18,7 @@ bool blk_mq_sched_try_insert_merge(struct request_queue *q, struct request *rq);
 void blk_mq_sched_restart(struct blk_mq_hw_ctx *hctx);
 
 void blk_mq_sched_insert_request(struct request *rq, bool at_head,
-				 bool run_queue, bool async, bool can_block);
+				 bool run_queue, bool async);
 void blk_mq_sched_insert_requests(struct request_queue *q,
 				  struct blk_mq_ctx *ctx,
 				  struct list_head *list, bool run_queue_async);
diff --git a/block/blk-mq-sysfs.c b/block/blk-mq-sysfs.c
index 79969c3..a54b4b0 100644
--- a/block/blk-mq-sysfs.c
+++ b/block/blk-mq-sysfs.c
@@ -248,7 +248,7 @@ static int blk_mq_register_hctx(struct blk_mq_hw_ctx *hctx)
 	return ret;
 }
 
-static void __blk_mq_unregister_dev(struct device *dev, struct request_queue *q)
+void blk_mq_unregister_dev(struct device *dev, struct request_queue *q)
 {
 	struct blk_mq_hw_ctx *hctx;
 	int i;
@@ -265,13 +265,6 @@ static void __blk_mq_unregister_dev(struct device *dev, struct request_queue *q)
 	q->mq_sysfs_init_done = false;
 }
 
-void blk_mq_unregister_dev(struct device *dev, struct request_queue *q)
-{
-	mutex_lock(&q->sysfs_lock);
-	__blk_mq_unregister_dev(dev, q);
-	mutex_unlock(&q->sysfs_lock);
-}
-
 void blk_mq_hctx_kobj_init(struct blk_mq_hw_ctx *hctx)
 {
 	kobject_init(&hctx->kobj, &blk_mq_hw_ktype);
diff --git a/block/blk-mq-tag.c b/block/blk-mq-tag.c
index c81b40ec..336dde0 100644
--- a/block/blk-mq-tag.c
+++ b/block/blk-mq-tag.c
@@ -134,12 +134,6 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 	ws = bt_wait_ptr(bt, data->hctx);
 	drop_ctx = data->ctx == NULL;
 	do {
-		prepare_to_wait(&ws->wait, &wait, TASK_UNINTERRUPTIBLE);
-
-		tag = __blk_mq_get_tag(data, bt);
-		if (tag != -1)
-			break;
-
 		/*
 		 * We're out of tags on this hardware queue, kick any
 		 * pending IO submits before going to sleep waiting for
@@ -155,6 +149,13 @@ unsigned int blk_mq_get_tag(struct blk_mq_alloc_data *data)
 		if (tag != -1)
 			break;
 
+		prepare_to_wait_exclusive(&ws->wait, &wait,
+						TASK_UNINTERRUPTIBLE);
+
+		tag = __blk_mq_get_tag(data, bt);
+		if (tag != -1)
+			break;
+
 		if (data->ctx)
 			blk_mq_put_ctx(data->ctx);
 
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1109747..01f271d 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -95,8 +95,7 @@ static void blk_mq_check_inflight(struct blk_mq_hw_ctx *hctx,
 {
 	struct mq_inflight *mi = priv;
 
-	if (test_bit(REQ_ATOM_STARTED, &rq->atomic_flags) &&
-	    !test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags)) {
+	if (blk_mq_rq_state(rq) == MQ_RQ_IN_FLIGHT) {
 		/*
 		 * index[0] counts the specific partition that was asked
 		 * for. index[1] counts the ones that are active on the
@@ -161,6 +160,8 @@ void blk_freeze_queue(struct request_queue *q)
 	 * exported to drivers as the only user for unfreeze is blk_mq.
 	 */
 	blk_freeze_queue_start(q);
+	if (!q->mq_ops)
+		blk_drain_queue(q);
 	blk_mq_freeze_queue_wait(q);
 }
 
@@ -220,7 +221,7 @@ void blk_mq_quiesce_queue(struct request_queue *q)
 
 	queue_for_each_hw_ctx(q, hctx, i) {
 		if (hctx->flags & BLK_MQ_F_BLOCKING)
-			synchronize_srcu(hctx->queue_rq_srcu);
+			synchronize_srcu(hctx->srcu);
 		else
 			rcu = true;
 	}
@@ -270,15 +271,14 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 {
 	struct blk_mq_tags *tags = blk_mq_tags_from_data(data);
 	struct request *rq = tags->static_rqs[tag];
-
-	rq->rq_flags = 0;
+	req_flags_t rq_flags = 0;
 
 	if (data->flags & BLK_MQ_REQ_INTERNAL) {
 		rq->tag = -1;
 		rq->internal_tag = tag;
 	} else {
 		if (blk_mq_tag_busy(data->hctx)) {
-			rq->rq_flags = RQF_MQ_INFLIGHT;
+			rq_flags = RQF_MQ_INFLIGHT;
 			atomic_inc(&data->hctx->nr_active);
 		}
 		rq->tag = tag;
@@ -286,27 +286,22 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 		data->hctx->tags->rqs[rq->tag] = rq;
 	}
 
-	INIT_LIST_HEAD(&rq->queuelist);
 	/* csd/requeue_work/fifo_time is initialized before use */
 	rq->q = data->q;
 	rq->mq_ctx = data->ctx;
+	rq->rq_flags = rq_flags;
+	rq->cpu = -1;
 	rq->cmd_flags = op;
 	if (data->flags & BLK_MQ_REQ_PREEMPT)
 		rq->rq_flags |= RQF_PREEMPT;
 	if (blk_queue_io_stat(data->q))
 		rq->rq_flags |= RQF_IO_STAT;
-	/* do not touch atomic flags, it needs atomic ops against the timer */
-	rq->cpu = -1;
+	INIT_LIST_HEAD(&rq->queuelist);
 	INIT_HLIST_NODE(&rq->hash);
 	RB_CLEAR_NODE(&rq->rb_node);
 	rq->rq_disk = NULL;
 	rq->part = NULL;
 	rq->start_time = jiffies;
-#ifdef CONFIG_BLK_CGROUP
-	rq->rl = NULL;
-	set_start_time_ns(rq);
-	rq->io_start_time_ns = 0;
-#endif
 	rq->nr_phys_segments = 0;
 #if defined(CONFIG_BLK_DEV_INTEGRITY)
 	rq->nr_integrity_segments = 0;
@@ -314,6 +309,7 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	rq->special = NULL;
 	/* tag was already set */
 	rq->extra_len = 0;
+	rq->__deadline = 0;
 
 	INIT_LIST_HEAD(&rq->timeout_list);
 	rq->timeout = 0;
@@ -322,6 +318,12 @@ static struct request *blk_mq_rq_ctx_init(struct blk_mq_alloc_data *data,
 	rq->end_io_data = NULL;
 	rq->next_rq = NULL;
 
+#ifdef CONFIG_BLK_CGROUP
+	rq->rl = NULL;
+	set_start_time_ns(rq);
+	rq->io_start_time_ns = 0;
+#endif
+
 	data->ctx->rq_dispatched[op_is_sync(op)]++;
 	return rq;
 }
@@ -441,7 +443,7 @@ struct request *blk_mq_alloc_request_hctx(struct request_queue *q,
 		blk_queue_exit(q);
 		return ERR_PTR(-EXDEV);
 	}
-	cpu = cpumask_first(alloc_data.hctx->cpumask);
+	cpu = cpumask_first_and(alloc_data.hctx->cpumask, cpu_online_mask);
 	alloc_data.ctx = __blk_mq_get_ctx(q, cpu);
 
 	rq = blk_mq_get_request(q, NULL, op, &alloc_data);
@@ -483,8 +485,7 @@ void blk_mq_free_request(struct request *rq)
 	if (blk_rq_rl(rq))
 		blk_put_rl(blk_rq_rl(rq));
 
-	clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
-	clear_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
+	blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
 	if (rq->tag != -1)
 		blk_mq_put_tag(hctx, hctx->tags, ctx, rq->tag);
 	if (sched_tag != -1)
@@ -530,6 +531,9 @@ static void __blk_mq_complete_request(struct request *rq)
 	bool shared = false;
 	int cpu;
 
+	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IN_FLIGHT);
+	blk_mq_rq_update_state(rq, MQ_RQ_COMPLETE);
+
 	if (rq->internal_tag != -1)
 		blk_mq_sched_completed_request(rq);
 	if (rq->rq_flags & RQF_STATS) {
@@ -557,6 +561,56 @@ static void __blk_mq_complete_request(struct request *rq)
 	put_cpu();
 }
 
+static void hctx_unlock(struct blk_mq_hw_ctx *hctx, int srcu_idx)
+	__releases(hctx->srcu)
+{
+	if (!(hctx->flags & BLK_MQ_F_BLOCKING))
+		rcu_read_unlock();
+	else
+		srcu_read_unlock(hctx->srcu, srcu_idx);
+}
+
+static void hctx_lock(struct blk_mq_hw_ctx *hctx, int *srcu_idx)
+	__acquires(hctx->srcu)
+{
+	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
+		/* shut up gcc false positive */
+		*srcu_idx = 0;
+		rcu_read_lock();
+	} else
+		*srcu_idx = srcu_read_lock(hctx->srcu);
+}
+
+static void blk_mq_rq_update_aborted_gstate(struct request *rq, u64 gstate)
+{
+	unsigned long flags;
+
+	/*
+	 * blk_mq_rq_aborted_gstate() is used from the completion path and
+	 * can thus be called from irq context.  u64_stats_fetch in the
+	 * middle of update on the same CPU leads to lockup.  Disable irq
+	 * while updating.
+	 */
+	local_irq_save(flags);
+	u64_stats_update_begin(&rq->aborted_gstate_sync);
+	rq->aborted_gstate = gstate;
+	u64_stats_update_end(&rq->aborted_gstate_sync);
+	local_irq_restore(flags);
+}
+
+static u64 blk_mq_rq_aborted_gstate(struct request *rq)
+{
+	unsigned int start;
+	u64 aborted_gstate;
+
+	do {
+		start = u64_stats_fetch_begin(&rq->aborted_gstate_sync);
+		aborted_gstate = rq->aborted_gstate;
+	} while (u64_stats_fetch_retry(&rq->aborted_gstate_sync, start));
+
+	return aborted_gstate;
+}
+
 /**
  * blk_mq_complete_request - end I/O on a request
  * @rq:		the request being processed
@@ -568,17 +622,33 @@ static void __blk_mq_complete_request(struct request *rq)
 void blk_mq_complete_request(struct request *rq)
 {
 	struct request_queue *q = rq->q;
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(q, rq->mq_ctx->cpu);
+	int srcu_idx;
 
 	if (unlikely(blk_should_fake_timeout(q)))
 		return;
-	if (!blk_mark_rq_complete(rq))
+
+	/*
+	 * If @rq->aborted_gstate equals the current instance, timeout is
+	 * claiming @rq and we lost.  This is synchronized through
+	 * hctx_lock().  See blk_mq_timeout_work() for details.
+	 *
+	 * Completion path never blocks and we can directly use RCU here
+	 * instead of hctx_lock() which can be either RCU or SRCU.
+	 * However, that would complicate paths which want to synchronize
+	 * against us.  Let stay in sync with the issue path so that
+	 * hctx_lock() covers both issue and completion paths.
+	 */
+	hctx_lock(hctx, &srcu_idx);
+	if (blk_mq_rq_aborted_gstate(rq) != rq->gstate)
 		__blk_mq_complete_request(rq);
+	hctx_unlock(hctx, srcu_idx);
 }
 EXPORT_SYMBOL(blk_mq_complete_request);
 
 int blk_mq_request_started(struct request *rq)
 {
-	return test_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
+	return blk_mq_rq_state(rq) != MQ_RQ_IDLE;
 }
 EXPORT_SYMBOL_GPL(blk_mq_request_started);
 
@@ -596,34 +666,27 @@ void blk_mq_start_request(struct request *rq)
 		wbt_issue(q->rq_wb, &rq->issue_stat);
 	}
 
-	blk_add_timer(rq);
-
-	WARN_ON_ONCE(test_bit(REQ_ATOM_STARTED, &rq->atomic_flags));
+	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
 
 	/*
-	 * Mark us as started and clear complete. Complete might have been
-	 * set if requeue raced with timeout, which then marked it as
-	 * complete. So be sure to clear complete again when we start
-	 * the request, otherwise we'll ignore the completion event.
+	 * Mark @rq in-flight which also advances the generation number,
+	 * and register for timeout.  Protect with a seqcount to allow the
+	 * timeout path to read both @rq->gstate and @rq->deadline
+	 * coherently.
 	 *
-	 * Ensure that ->deadline is visible before we set STARTED, such that
-	 * blk_mq_check_expired() is guaranteed to observe our ->deadline when
-	 * it observes STARTED.
+	 * This is the only place where a request is marked in-flight.  If
+	 * the timeout path reads an in-flight @rq->gstate, the
+	 * @rq->deadline it reads together under @rq->gstate_seq is
+	 * guaranteed to be the matching one.
 	 */
-	smp_wmb();
-	set_bit(REQ_ATOM_STARTED, &rq->atomic_flags);
-	if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags)) {
-		/*
-		 * Coherence order guarantees these consecutive stores to a
-		 * single variable propagate in the specified order. Thus the
-		 * clear_bit() is ordered _after_ the set bit. See
-		 * blk_mq_check_expired().
-		 *
-		 * (the bits must be part of the same byte for this to be
-		 * true).
-		 */
-		clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
-	}
+	preempt_disable();
+	write_seqcount_begin(&rq->gstate_seq);
+
+	blk_mq_rq_update_state(rq, MQ_RQ_IN_FLIGHT);
+	blk_add_timer(rq);
+
+	write_seqcount_end(&rq->gstate_seq);
+	preempt_enable();
 
 	if (q->dma_drain_size && blk_rq_bytes(rq)) {
 		/*
@@ -637,13 +700,9 @@ void blk_mq_start_request(struct request *rq)
 EXPORT_SYMBOL(blk_mq_start_request);
 
 /*
- * When we reach here because queue is busy, REQ_ATOM_COMPLETE
- * flag isn't set yet, so there may be race with timeout handler,
- * but given rq->deadline is just set in .queue_rq() under
- * this situation, the race won't be possible in reality because
- * rq->timeout should be set as big enough to cover the window
- * between blk_mq_start_request() called from .queue_rq() and
- * clearing REQ_ATOM_STARTED here.
+ * When we reach here because queue is busy, it's safe to change the state
+ * to IDLE without checking @rq->aborted_gstate because we should still be
+ * holding the RCU read lock and thus protected against timeout.
  */
 static void __blk_mq_requeue_request(struct request *rq)
 {
@@ -655,7 +714,8 @@ static void __blk_mq_requeue_request(struct request *rq)
 	wbt_requeue(q->rq_wb, &rq->issue_stat);
 	blk_mq_sched_requeue_request(rq);
 
-	if (test_and_clear_bit(REQ_ATOM_STARTED, &rq->atomic_flags)) {
+	if (blk_mq_rq_state(rq) != MQ_RQ_IDLE) {
+		blk_mq_rq_update_state(rq, MQ_RQ_IDLE);
 		if (q->dma_drain_size && blk_rq_bytes(rq))
 			rq->nr_phys_segments--;
 	}
@@ -687,13 +747,13 @@ static void blk_mq_requeue_work(struct work_struct *work)
 
 		rq->rq_flags &= ~RQF_SOFTBARRIER;
 		list_del_init(&rq->queuelist);
-		blk_mq_sched_insert_request(rq, true, false, false, true);
+		blk_mq_sched_insert_request(rq, true, false, false);
 	}
 
 	while (!list_empty(&rq_list)) {
 		rq = list_entry(rq_list.next, struct request, queuelist);
 		list_del_init(&rq->queuelist);
-		blk_mq_sched_insert_request(rq, false, false, false, true);
+		blk_mq_sched_insert_request(rq, false, false, false);
 	}
 
 	blk_mq_run_hw_queues(q, false);
@@ -727,7 +787,7 @@ EXPORT_SYMBOL(blk_mq_add_to_requeue_list);
 
 void blk_mq_kick_requeue_list(struct request_queue *q)
 {
-	kblockd_schedule_delayed_work(&q->requeue_work, 0);
+	kblockd_mod_delayed_work_on(WORK_CPU_UNBOUND, &q->requeue_work, 0);
 }
 EXPORT_SYMBOL(blk_mq_kick_requeue_list);
 
@@ -753,24 +813,15 @@ EXPORT_SYMBOL(blk_mq_tag_to_rq);
 struct blk_mq_timeout_data {
 	unsigned long next;
 	unsigned int next_set;
+	unsigned int nr_expired;
 };
 
-void blk_mq_rq_timed_out(struct request *req, bool reserved)
+static void blk_mq_rq_timed_out(struct request *req, bool reserved)
 {
 	const struct blk_mq_ops *ops = req->q->mq_ops;
 	enum blk_eh_timer_return ret = BLK_EH_RESET_TIMER;
 
-	/*
-	 * We know that complete is set at this point. If STARTED isn't set
-	 * anymore, then the request isn't active and the "timeout" should
-	 * just be ignored. This can happen due to the bitflag ordering.
-	 * Timeout first checks if STARTED is set, and if it is, assumes
-	 * the request is active. But if we race with completion, then
-	 * both flags will get cleared. So check here again, and ignore
-	 * a timeout event with a request that isn't active.
-	 */
-	if (!test_bit(REQ_ATOM_STARTED, &req->atomic_flags))
-		return;
+	req->rq_flags |= RQF_MQ_TIMEOUT_EXPIRED;
 
 	if (ops->timeout)
 		ret = ops->timeout(req, reserved);
@@ -780,8 +831,13 @@ void blk_mq_rq_timed_out(struct request *req, bool reserved)
 		__blk_mq_complete_request(req);
 		break;
 	case BLK_EH_RESET_TIMER:
+		/*
+		 * As nothing prevents from completion happening while
+		 * ->aborted_gstate is set, this may lead to ignored
+		 * completions and further spurious timeouts.
+		 */
+		blk_mq_rq_update_aborted_gstate(req, 0);
 		blk_add_timer(req);
-		blk_clear_rq_complete(req);
 		break;
 	case BLK_EH_NOT_HANDLED:
 		break;
@@ -795,50 +851,51 @@ static void blk_mq_check_expired(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, void *priv, bool reserved)
 {
 	struct blk_mq_timeout_data *data = priv;
-	unsigned long deadline;
+	unsigned long gstate, deadline;
+	int start;
 
-	if (!test_bit(REQ_ATOM_STARTED, &rq->atomic_flags))
+	might_sleep();
+
+	if (rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED)
 		return;
 
-	/*
-	 * Ensures that if we see STARTED we must also see our
-	 * up-to-date deadline, see blk_mq_start_request().
-	 */
-	smp_rmb();
+	/* read coherent snapshots of @rq->state_gen and @rq->deadline */
+	while (true) {
+		start = read_seqcount_begin(&rq->gstate_seq);
+		gstate = READ_ONCE(rq->gstate);
+		deadline = blk_rq_deadline(rq);
+		if (!read_seqcount_retry(&rq->gstate_seq, start))
+			break;
+		cond_resched();
+	}
 
-	deadline = READ_ONCE(rq->deadline);
-
-	/*
-	 * The rq being checked may have been freed and reallocated
-	 * out already here, we avoid this race by checking rq->deadline
-	 * and REQ_ATOM_COMPLETE flag together:
-	 *
-	 * - if rq->deadline is observed as new value because of
-	 *   reusing, the rq won't be timed out because of timing.
-	 * - if rq->deadline is observed as previous value,
-	 *   REQ_ATOM_COMPLETE flag won't be cleared in reuse path
-	 *   because we put a barrier between setting rq->deadline
-	 *   and clearing the flag in blk_mq_start_request(), so
-	 *   this rq won't be timed out too.
-	 */
-	if (time_after_eq(jiffies, deadline)) {
-		if (!blk_mark_rq_complete(rq)) {
-			/*
-			 * Again coherence order ensures that consecutive reads
-			 * from the same variable must be in that order. This
-			 * ensures that if we see COMPLETE clear, we must then
-			 * see STARTED set and we'll ignore this timeout.
-			 *
-			 * (There's also the MB implied by the test_and_clear())
-			 */
-			blk_mq_rq_timed_out(rq, reserved);
-		}
+	/* if in-flight && overdue, mark for abortion */
+	if ((gstate & MQ_RQ_STATE_MASK) == MQ_RQ_IN_FLIGHT &&
+	    time_after_eq(jiffies, deadline)) {
+		blk_mq_rq_update_aborted_gstate(rq, gstate);
+		data->nr_expired++;
+		hctx->nr_expired++;
 	} else if (!data->next_set || time_after(data->next, deadline)) {
 		data->next = deadline;
 		data->next_set = 1;
 	}
 }
 
+static void blk_mq_terminate_expired(struct blk_mq_hw_ctx *hctx,
+		struct request *rq, void *priv, bool reserved)
+{
+	/*
+	 * We marked @rq->aborted_gstate and waited for RCU.  If there were
+	 * completions that we lost to, they would have finished and
+	 * updated @rq->gstate by now; otherwise, the completion path is
+	 * now guaranteed to see @rq->aborted_gstate and yield.  If
+	 * @rq->aborted_gstate still matches @rq->gstate, @rq is ours.
+	 */
+	if (!(rq->rq_flags & RQF_MQ_TIMEOUT_EXPIRED) &&
+	    READ_ONCE(rq->gstate) == rq->aborted_gstate)
+		blk_mq_rq_timed_out(rq, reserved);
+}
+
 static void blk_mq_timeout_work(struct work_struct *work)
 {
 	struct request_queue *q =
@@ -846,7 +903,9 @@ static void blk_mq_timeout_work(struct work_struct *work)
 	struct blk_mq_timeout_data data = {
 		.next		= 0,
 		.next_set	= 0,
+		.nr_expired	= 0,
 	};
+	struct blk_mq_hw_ctx *hctx;
 	int i;
 
 	/* A deadlock might occur if a request is stuck requiring a
@@ -865,14 +924,46 @@ static void blk_mq_timeout_work(struct work_struct *work)
 	if (!percpu_ref_tryget(&q->q_usage_counter))
 		return;
 
+	/* scan for the expired ones and set their ->aborted_gstate */
 	blk_mq_queue_tag_busy_iter(q, blk_mq_check_expired, &data);
 
+	if (data.nr_expired) {
+		bool has_rcu = false;
+
+		/*
+		 * Wait till everyone sees ->aborted_gstate.  The
+		 * sequential waits for SRCUs aren't ideal.  If this ever
+		 * becomes a problem, we can add per-hw_ctx rcu_head and
+		 * wait in parallel.
+		 */
+		queue_for_each_hw_ctx(q, hctx, i) {
+			if (!hctx->nr_expired)
+				continue;
+
+			if (!(hctx->flags & BLK_MQ_F_BLOCKING))
+				has_rcu = true;
+			else
+				synchronize_srcu(hctx->srcu);
+
+			hctx->nr_expired = 0;
+		}
+		if (has_rcu)
+			synchronize_rcu();
+
+		/* terminate the ones we won */
+		blk_mq_queue_tag_busy_iter(q, blk_mq_terminate_expired, NULL);
+	}
+
 	if (data.next_set) {
 		data.next = blk_rq_timeout(round_jiffies_up(data.next));
 		mod_timer(&q->timeout, data.next);
 	} else {
-		struct blk_mq_hw_ctx *hctx;
-
+		/*
+		 * Request timeouts are handled as a forward rolling timer. If
+		 * we end up here it means that no requests are pending and
+		 * also that no request has been pending for a while. Mark
+		 * each hctx as idle.
+		 */
 		queue_for_each_hw_ctx(q, hctx, i) {
 			/* the hctx may be unmapped, so check it here */
 			if (blk_mq_hw_queue_mapped(hctx))
@@ -1008,66 +1099,67 @@ static int blk_mq_dispatch_wake(wait_queue_entry_t *wait, unsigned mode,
 
 /*
  * Mark us waiting for a tag. For shared tags, this involves hooking us into
- * the tag wakeups. For non-shared tags, we can simply mark us nedeing a
- * restart. For both caes, take care to check the condition again after
+ * the tag wakeups. For non-shared tags, we can simply mark us needing a
+ * restart. For both cases, take care to check the condition again after
  * marking us as waiting.
  */
 static bool blk_mq_mark_tag_wait(struct blk_mq_hw_ctx **hctx,
 				 struct request *rq)
 {
 	struct blk_mq_hw_ctx *this_hctx = *hctx;
-	bool shared_tags = (this_hctx->flags & BLK_MQ_F_TAG_SHARED) != 0;
 	struct sbq_wait_state *ws;
 	wait_queue_entry_t *wait;
 	bool ret;
 
-	if (!shared_tags) {
+	if (!(this_hctx->flags & BLK_MQ_F_TAG_SHARED)) {
 		if (!test_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state))
 			set_bit(BLK_MQ_S_SCHED_RESTART, &this_hctx->state);
-	} else {
-		wait = &this_hctx->dispatch_wait;
-		if (!list_empty_careful(&wait->entry))
-			return false;
 
-		spin_lock(&this_hctx->lock);
-		if (!list_empty(&wait->entry)) {
-			spin_unlock(&this_hctx->lock);
-			return false;
-		}
-
-		ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
-		add_wait_queue(&ws->wait, wait);
+		/*
+		 * It's possible that a tag was freed in the window between the
+		 * allocation failure and adding the hardware queue to the wait
+		 * queue.
+		 *
+		 * Don't clear RESTART here, someone else could have set it.
+		 * At most this will cost an extra queue run.
+		 */
+		return blk_mq_get_driver_tag(rq, hctx, false);
 	}
 
+	wait = &this_hctx->dispatch_wait;
+	if (!list_empty_careful(&wait->entry))
+		return false;
+
+	spin_lock(&this_hctx->lock);
+	if (!list_empty(&wait->entry)) {
+		spin_unlock(&this_hctx->lock);
+		return false;
+	}
+
+	ws = bt_wait_ptr(&this_hctx->tags->bitmap_tags, this_hctx);
+	add_wait_queue(&ws->wait, wait);
+
 	/*
 	 * It's possible that a tag was freed in the window between the
 	 * allocation failure and adding the hardware queue to the wait
 	 * queue.
 	 */
 	ret = blk_mq_get_driver_tag(rq, hctx, false);
-
-	if (!shared_tags) {
-		/*
-		 * Don't clear RESTART here, someone else could have set it.
-		 * At most this will cost an extra queue run.
-		 */
-		return ret;
-	} else {
-		if (!ret) {
-			spin_unlock(&this_hctx->lock);
-			return false;
-		}
-
-		/*
-		 * We got a tag, remove ourselves from the wait queue to ensure
-		 * someone else gets the wakeup.
-		 */
-		spin_lock_irq(&ws->wait.lock);
-		list_del_init(&wait->entry);
-		spin_unlock_irq(&ws->wait.lock);
+	if (!ret) {
 		spin_unlock(&this_hctx->lock);
-		return true;
+		return false;
 	}
+
+	/*
+	 * We got a tag, remove ourselves from the wait queue to ensure
+	 * someone else gets the wakeup.
+	 */
+	spin_lock_irq(&ws->wait.lock);
+	list_del_init(&wait->entry);
+	spin_unlock_irq(&ws->wait.lock);
+	spin_unlock(&this_hctx->lock);
+
+	return true;
 }
 
 bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
@@ -1204,9 +1296,27 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 	/*
 	 * We should be running this queue from one of the CPUs that
 	 * are mapped to it.
+	 *
+	 * There are at least two related races now between setting
+	 * hctx->next_cpu from blk_mq_hctx_next_cpu() and running
+	 * __blk_mq_run_hw_queue():
+	 *
+	 * - hctx->next_cpu is found offline in blk_mq_hctx_next_cpu(),
+	 *   but later it becomes online, then this warning is harmless
+	 *   at all
+	 *
+	 * - hctx->next_cpu is found online in blk_mq_hctx_next_cpu(),
+	 *   but later it becomes offline, then the warning can't be
+	 *   triggered, and we depend on blk-mq timeout handler to
+	 *   handle dispatched requests to this hctx
 	 */
-	WARN_ON(!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
-		cpu_online(hctx->next_cpu));
+	if (!cpumask_test_cpu(raw_smp_processor_id(), hctx->cpumask) &&
+		cpu_online(hctx->next_cpu)) {
+		printk(KERN_WARNING "run queue from wrong CPU %d, hctx %s\n",
+			raw_smp_processor_id(),
+			cpumask_empty(hctx->cpumask) ? "inactive": "active");
+		dump_stack();
+	}
 
 	/*
 	 * We can't run the queue inline with ints disabled. Ensure that
@@ -1214,17 +1324,11 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
 	 */
 	WARN_ON_ONCE(in_interrupt());
 
-	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
-		rcu_read_lock();
-		blk_mq_sched_dispatch_requests(hctx);
-		rcu_read_unlock();
-	} else {
-		might_sleep();
+	might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
 
-		srcu_idx = srcu_read_lock(hctx->queue_rq_srcu);
-		blk_mq_sched_dispatch_requests(hctx);
-		srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx);
-	}
+	hctx_lock(hctx, &srcu_idx);
+	blk_mq_sched_dispatch_requests(hctx);
+	hctx_unlock(hctx, srcu_idx);
 }
 
 /*
@@ -1235,20 +1339,47 @@ static void __blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx)
  */
 static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 {
+	bool tried = false;
+
 	if (hctx->queue->nr_hw_queues == 1)
 		return WORK_CPU_UNBOUND;
 
 	if (--hctx->next_cpu_batch <= 0) {
 		int next_cpu;
-
-		next_cpu = cpumask_next(hctx->next_cpu, hctx->cpumask);
+select_cpu:
+		next_cpu = cpumask_next_and(hctx->next_cpu, hctx->cpumask,
+				cpu_online_mask);
 		if (next_cpu >= nr_cpu_ids)
-			next_cpu = cpumask_first(hctx->cpumask);
+			next_cpu = cpumask_first_and(hctx->cpumask,cpu_online_mask);
 
-		hctx->next_cpu = next_cpu;
+		/*
+		 * No online CPU is found, so have to make sure hctx->next_cpu
+		 * is set correctly for not breaking workqueue.
+		 */
+		if (next_cpu >= nr_cpu_ids)
+			hctx->next_cpu = cpumask_first(hctx->cpumask);
+		else
+			hctx->next_cpu = next_cpu;
 		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
 	}
 
+	/*
+	 * Do unbound schedule if we can't find a online CPU for this hctx,
+	 * and it should only happen in the path of handling CPU DEAD.
+	 */
+	if (!cpu_online(hctx->next_cpu)) {
+		if (!tried) {
+			tried = true;
+			goto select_cpu;
+		}
+
+		/*
+		 * Make sure to re-select CPU next time once after CPUs
+		 * in hctx->cpumask become online again.
+		 */
+		hctx->next_cpu_batch = 1;
+		return WORK_CPU_UNBOUND;
+	}
 	return hctx->next_cpu;
 }
 
@@ -1272,9 +1403,8 @@ static void __blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async,
 		put_cpu();
 	}
 
-	kblockd_schedule_delayed_work_on(blk_mq_hctx_next_cpu(hctx),
-					 &hctx->run_work,
-					 msecs_to_jiffies(msecs));
+	kblockd_mod_delayed_work_on(blk_mq_hctx_next_cpu(hctx), &hctx->run_work,
+				    msecs_to_jiffies(msecs));
 }
 
 void blk_mq_delay_run_hw_queue(struct blk_mq_hw_ctx *hctx, unsigned long msecs)
@@ -1285,7 +1415,23 @@ EXPORT_SYMBOL(blk_mq_delay_run_hw_queue);
 
 bool blk_mq_run_hw_queue(struct blk_mq_hw_ctx *hctx, bool async)
 {
-	if (blk_mq_hctx_has_pending(hctx)) {
+	int srcu_idx;
+	bool need_run;
+
+	/*
+	 * When queue is quiesced, we may be switching io scheduler, or
+	 * updating nr_hw_queues, or other things, and we can't run queue
+	 * any more, even __blk_mq_hctx_has_pending() can't be called safely.
+	 *
+	 * And queue will be rerun in blk_mq_unquiesce_queue() if it is
+	 * quiesced.
+	 */
+	hctx_lock(hctx, &srcu_idx);
+	need_run = !blk_queue_quiesced(hctx->queue) &&
+		blk_mq_hctx_has_pending(hctx);
+	hctx_unlock(hctx, srcu_idx);
+
+	if (need_run) {
 		__blk_mq_delay_run_hw_queue(hctx, async, 0);
 		return true;
 	}
@@ -1593,9 +1739,9 @@ static blk_qc_t request_to_qc_t(struct blk_mq_hw_ctx *hctx, struct request *rq)
 	return blk_tag_to_qc_t(rq->internal_tag, hctx->queue_num, true);
 }
 
-static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
-					struct request *rq,
-					blk_qc_t *cookie, bool may_sleep)
+static blk_status_t __blk_mq_issue_directly(struct blk_mq_hw_ctx *hctx,
+					    struct request *rq,
+					    blk_qc_t *cookie)
 {
 	struct request_queue *q = rq->q;
 	struct blk_mq_queue_data bd = {
@@ -1604,15 +1750,52 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 	};
 	blk_qc_t new_cookie;
 	blk_status_t ret;
+
+	new_cookie = request_to_qc_t(hctx, rq);
+
+	/*
+	 * For OK queue, we are done. For error, caller may kill it.
+	 * Any other error (busy), just add it to our list as we
+	 * previously would have done.
+	 */
+	ret = q->mq_ops->queue_rq(hctx, &bd);
+	switch (ret) {
+	case BLK_STS_OK:
+		*cookie = new_cookie;
+		break;
+	case BLK_STS_RESOURCE:
+		__blk_mq_requeue_request(rq);
+		break;
+	default:
+		*cookie = BLK_QC_T_NONE;
+		break;
+	}
+
+	return ret;
+}
+
+static blk_status_t __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
+						struct request *rq,
+						blk_qc_t *cookie,
+						bool bypass_insert)
+{
+	struct request_queue *q = rq->q;
 	bool run_queue = true;
 
-	/* RCU or SRCU read lock is needed before checking quiesced flag */
+	/*
+	 * RCU or SRCU read lock is needed before checking quiesced flag.
+	 *
+	 * When queue is stopped or quiesced, ignore 'bypass_insert' from
+	 * blk_mq_request_issue_directly(), and return BLK_STS_OK to caller,
+	 * and avoid driver to try to dispatch again.
+	 */
 	if (blk_mq_hctx_stopped(hctx) || blk_queue_quiesced(q)) {
 		run_queue = false;
+		bypass_insert = false;
 		goto insert;
 	}
 
-	if (q->elevator)
+	if (q->elevator && !bypass_insert)
 		goto insert;
 
 	if (!blk_mq_get_driver_tag(rq, NULL, false))
@@ -1623,47 +1806,47 @@ static void __blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		goto insert;
 	}
 
-	new_cookie = request_to_qc_t(hctx, rq);
-
-	/*
-	 * For OK queue, we are done. For error, kill it. Any other
-	 * error (busy), just add it to our list as we previously
-	 * would have done
-	 */
-	ret = q->mq_ops->queue_rq(hctx, &bd);
-	switch (ret) {
-	case BLK_STS_OK:
-		*cookie = new_cookie;
-		return;
-	case BLK_STS_RESOURCE:
-		__blk_mq_requeue_request(rq);
-		goto insert;
-	default:
-		*cookie = BLK_QC_T_NONE;
-		blk_mq_end_request(rq, ret);
-		return;
-	}
-
+	return __blk_mq_issue_directly(hctx, rq, cookie);
 insert:
-	blk_mq_sched_insert_request(rq, false, run_queue, false, may_sleep);
+	if (bypass_insert)
+		return BLK_STS_RESOURCE;
+
+	blk_mq_sched_insert_request(rq, false, run_queue, false);
+	return BLK_STS_OK;
 }
 
 static void blk_mq_try_issue_directly(struct blk_mq_hw_ctx *hctx,
 		struct request *rq, blk_qc_t *cookie)
 {
-	if (!(hctx->flags & BLK_MQ_F_BLOCKING)) {
-		rcu_read_lock();
-		__blk_mq_try_issue_directly(hctx, rq, cookie, false);
-		rcu_read_unlock();
-	} else {
-		unsigned int srcu_idx;
+	blk_status_t ret;
+	int srcu_idx;
 
-		might_sleep();
+	might_sleep_if(hctx->flags & BLK_MQ_F_BLOCKING);
 
-		srcu_idx = srcu_read_lock(hctx->queue_rq_srcu);
-		__blk_mq_try_issue_directly(hctx, rq, cookie, true);
-		srcu_read_unlock(hctx->queue_rq_srcu, srcu_idx);
-	}
+	hctx_lock(hctx, &srcu_idx);
+
+	ret = __blk_mq_try_issue_directly(hctx, rq, cookie, false);
+	if (ret == BLK_STS_RESOURCE)
+		blk_mq_sched_insert_request(rq, false, true, false);
+	else if (ret != BLK_STS_OK)
+		blk_mq_end_request(rq, ret);
+
+	hctx_unlock(hctx, srcu_idx);
+}
+
+blk_status_t blk_mq_request_issue_directly(struct request *rq)
+{
+	blk_status_t ret;
+	int srcu_idx;
+	blk_qc_t unused_cookie;
+	struct blk_mq_ctx *ctx = rq->mq_ctx;
+	struct blk_mq_hw_ctx *hctx = blk_mq_map_queue(rq->q, ctx->cpu);
+
+	hctx_lock(hctx, &srcu_idx);
+	ret = __blk_mq_try_issue_directly(hctx, rq, &unused_cookie, true);
+	hctx_unlock(hctx, srcu_idx);
+
+	return ret;
 }
 
 static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
@@ -1774,7 +1957,7 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
 	} else if (q->elevator) {
 		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
-		blk_mq_sched_insert_request(rq, false, true, true, true);
+		blk_mq_sched_insert_request(rq, false, true, true);
 	} else {
 		blk_mq_put_ctx(data.ctx);
 		blk_mq_bio_to_request(rq, bio);
@@ -1867,6 +2050,22 @@ static size_t order_to_size(unsigned int order)
 	return (size_t)PAGE_SIZE << order;
 }
 
+static int blk_mq_init_request(struct blk_mq_tag_set *set, struct request *rq,
+			       unsigned int hctx_idx, int node)
+{
+	int ret;
+
+	if (set->ops->init_request) {
+		ret = set->ops->init_request(set, rq, hctx_idx, node);
+		if (ret)
+			return ret;
+	}
+
+	seqcount_init(&rq->gstate_seq);
+	u64_stats_init(&rq->aborted_gstate_sync);
+	return 0;
+}
+
 int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 		     unsigned int hctx_idx, unsigned int depth)
 {
@@ -1928,12 +2127,9 @@ int blk_mq_alloc_rqs(struct blk_mq_tag_set *set, struct blk_mq_tags *tags,
 			struct request *rq = p;
 
 			tags->static_rqs[i] = rq;
-			if (set->ops->init_request) {
-				if (set->ops->init_request(set, rq, hctx_idx,
-						node)) {
-					tags->static_rqs[i] = NULL;
-					goto fail;
-				}
+			if (blk_mq_init_request(set, rq, hctx_idx, node)) {
+				tags->static_rqs[i] = NULL;
+				goto fail;
 			}
 
 			p += rq_size;
@@ -1992,7 +2188,8 @@ static void blk_mq_exit_hctx(struct request_queue *q,
 {
 	blk_mq_debugfs_unregister_hctx(hctx);
 
-	blk_mq_tag_idle(hctx);
+	if (blk_mq_hw_queue_mapped(hctx))
+		blk_mq_tag_idle(hctx);
 
 	if (set->ops->exit_request)
 		set->ops->exit_request(set, hctx->fq->flush_rq, hctx_idx);
@@ -2003,7 +2200,7 @@ static void blk_mq_exit_hctx(struct request_queue *q,
 		set->ops->exit_hctx(hctx, hctx_idx);
 
 	if (hctx->flags & BLK_MQ_F_BLOCKING)
-		cleanup_srcu_struct(hctx->queue_rq_srcu);
+		cleanup_srcu_struct(hctx->srcu);
 
 	blk_mq_remove_cpuhp(hctx);
 	blk_free_flush_queue(hctx->fq);
@@ -2072,13 +2269,11 @@ static int blk_mq_init_hctx(struct request_queue *q,
 	if (!hctx->fq)
 		goto sched_exit_hctx;
 
-	if (set->ops->init_request &&
-	    set->ops->init_request(set, hctx->fq->flush_rq, hctx_idx,
-				   node))
+	if (blk_mq_init_request(set, hctx->fq->flush_rq, hctx_idx, node))
 		goto free_fq;
 
 	if (hctx->flags & BLK_MQ_F_BLOCKING)
-		init_srcu_struct(hctx->queue_rq_srcu);
+		init_srcu_struct(hctx->srcu);
 
 	blk_mq_debugfs_register_hctx(q, hctx);
 
@@ -2114,16 +2309,11 @@ static void blk_mq_init_cpu_queues(struct request_queue *q,
 		INIT_LIST_HEAD(&__ctx->rq_list);
 		__ctx->queue = q;
 
-		/* If the cpu isn't present, the cpu is mapped to first hctx */
-		if (!cpu_present(i))
-			continue;
-
-		hctx = blk_mq_map_queue(q, i);
-
 		/*
 		 * Set local node, IFF we have more than one hw queue. If
 		 * not, we remain on the home node of the device
 		 */
+		hctx = blk_mq_map_queue(q, i);
 		if (nr_hw_queues > 1 && hctx->numa_node == NUMA_NO_NODE)
 			hctx->numa_node = local_memory_node(cpu_to_node(i));
 	}
@@ -2180,7 +2370,7 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 	 *
 	 * If the cpu isn't present, the cpu is mapped to first hctx.
 	 */
-	for_each_present_cpu(i) {
+	for_each_possible_cpu(i) {
 		hctx_idx = q->mq_map[i];
 		/* unmapped hw queue can be remapped after CPU topo changed */
 		if (!set->tags[hctx_idx] &&
@@ -2234,7 +2424,8 @@ static void blk_mq_map_swqueue(struct request_queue *q)
 		/*
 		 * Initialize batch roundrobin counts
 		 */
-		hctx->next_cpu = cpumask_first(hctx->cpumask);
+		hctx->next_cpu = cpumask_first_and(hctx->cpumask,
+				cpu_online_mask);
 		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
 	}
 }
@@ -2367,7 +2558,7 @@ static int blk_mq_hw_ctx_size(struct blk_mq_tag_set *tag_set)
 {
 	int hw_ctx_size = sizeof(struct blk_mq_hw_ctx);
 
-	BUILD_BUG_ON(ALIGN(offsetof(struct blk_mq_hw_ctx, queue_rq_srcu),
+	BUILD_BUG_ON(ALIGN(offsetof(struct blk_mq_hw_ctx, srcu),
 			   __alignof__(struct blk_mq_hw_ctx)) !=
 		     sizeof(struct blk_mq_hw_ctx));
 
@@ -2384,6 +2575,9 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 	struct blk_mq_hw_ctx **hctxs = q->queue_hw_ctx;
 
 	blk_mq_sysfs_unregister(q);
+
+	/* protect against switching io scheduler  */
+	mutex_lock(&q->sysfs_lock);
 	for (i = 0; i < set->nr_hw_queues; i++) {
 		int node;
 
@@ -2428,6 +2622,7 @@ static void blk_mq_realloc_hw_ctxs(struct blk_mq_tag_set *set,
 		}
 	}
 	q->nr_hw_queues = i;
+	mutex_unlock(&q->sysfs_lock);
 	blk_mq_sysfs_register(q);
 }
 
@@ -2599,9 +2794,27 @@ static int blk_mq_alloc_rq_maps(struct blk_mq_tag_set *set)
 
 static int blk_mq_update_queue_map(struct blk_mq_tag_set *set)
 {
-	if (set->ops->map_queues)
+	if (set->ops->map_queues) {
+		int cpu;
+		/*
+		 * transport .map_queues is usually done in the following
+		 * way:
+		 *
+		 * for (queue = 0; queue < set->nr_hw_queues; queue++) {
+		 * 	mask = get_cpu_mask(queue)
+		 * 	for_each_cpu(cpu, mask)
+		 * 		set->mq_map[cpu] = queue;
+		 * }
+		 *
+		 * When we need to remap, the table has to be cleared for
+		 * killing stale mapping since one CPU may not be mapped
+		 * to any hw queue.
+		 */
+		for_each_possible_cpu(cpu)
+			set->mq_map[cpu] = 0;
+
 		return set->ops->map_queues(set);
-	else
+	} else
 		return blk_mq_map_queues(set);
 }
 
@@ -2710,6 +2923,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 		return -EINVAL;
 
 	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	ret = 0;
 	queue_for_each_hw_ctx(q, hctx, i) {
@@ -2733,6 +2947,7 @@ int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr)
 	if (!ret)
 		q->nr_requests = nr;
 
+	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 
 	return ret;
@@ -2848,7 +3063,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
 	unsigned int nsecs;
 	ktime_t kt;
 
-	if (test_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags))
+	if (rq->rq_flags & RQF_MQ_POLL_SLEPT)
 		return false;
 
 	/*
@@ -2868,7 +3083,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
 	if (!nsecs)
 		return false;
 
-	set_bit(REQ_ATOM_POLL_SLEPT, &rq->atomic_flags);
+	rq->rq_flags |= RQF_MQ_POLL_SLEPT;
 
 	/*
 	 * This will be replaced with the stats tracking code, using
@@ -2882,7 +3097,7 @@ static bool blk_mq_poll_hybrid_sleep(struct request_queue *q,
 
 	hrtimer_init_sleeper(&hs, current);
 	do {
-		if (test_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags))
+		if (blk_mq_rq_state(rq) == MQ_RQ_COMPLETE)
 			break;
 		set_current_state(TASK_UNINTERRUPTIBLE);
 		hrtimer_start_expires(&hs.timer, mode);
@@ -2968,12 +3183,6 @@ static bool blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
 
 static int __init blk_mq_init(void)
 {
-	/*
-	 * See comment in block/blk.h rq_atomic_flags enum
-	 */
-	BUILD_BUG_ON((REQ_ATOM_STARTED / BITS_PER_BYTE) !=
-			(REQ_ATOM_COMPLETE / BITS_PER_BYTE));
-
 	cpuhp_setup_state_multi(CPUHP_BLK_MQ_DEAD, "block/mq:dead", NULL,
 				blk_mq_hctx_notify_dead);
 	return 0;
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 6c7c3ff..88c558f 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -27,6 +27,20 @@ struct blk_mq_ctx {
 	struct kobject		kobj;
 } ____cacheline_aligned_in_smp;
 
+/*
+ * Bits for request->gstate.  The lower two bits carry MQ_RQ_* state value
+ * and the upper bits the generation number.
+ */
+enum mq_rq_state {
+	MQ_RQ_IDLE		= 0,
+	MQ_RQ_IN_FLIGHT		= 1,
+	MQ_RQ_COMPLETE		= 2,
+
+	MQ_RQ_STATE_BITS	= 2,
+	MQ_RQ_STATE_MASK	= (1 << MQ_RQ_STATE_BITS) - 1,
+	MQ_RQ_GEN_INC		= 1 << MQ_RQ_STATE_BITS,
+};
+
 void blk_mq_freeze_queue(struct request_queue *q);
 void blk_mq_free_queue(struct request_queue *q);
 int blk_mq_update_nr_requests(struct request_queue *q, unsigned int nr);
@@ -60,6 +74,9 @@ void blk_mq_request_bypass_insert(struct request *rq, bool run_queue);
 void blk_mq_insert_requests(struct blk_mq_hw_ctx *hctx, struct blk_mq_ctx *ctx,
 				struct list_head *list);
 
+/* Used by blk_insert_cloned_request() to issue request directly */
+blk_status_t blk_mq_request_issue_directly(struct request *rq);
+
 /*
  * CPU -> queue mappings
  */
@@ -81,10 +98,41 @@ extern int blk_mq_sysfs_register(struct request_queue *q);
 extern void blk_mq_sysfs_unregister(struct request_queue *q);
 extern void blk_mq_hctx_kobj_init(struct blk_mq_hw_ctx *hctx);
 
-extern void blk_mq_rq_timed_out(struct request *req, bool reserved);
-
 void blk_mq_release(struct request_queue *q);
 
+/**
+ * blk_mq_rq_state() - read the current MQ_RQ_* state of a request
+ * @rq: target request.
+ */
+static inline int blk_mq_rq_state(struct request *rq)
+{
+	return READ_ONCE(rq->gstate) & MQ_RQ_STATE_MASK;
+}
+
+/**
+ * blk_mq_rq_update_state() - set the current MQ_RQ_* state of a request
+ * @rq: target request.
+ * @state: new state to set.
+ *
+ * Set @rq's state to @state.  The caller is responsible for ensuring that
+ * there are no other updaters.  A request can transition into IN_FLIGHT
+ * only from IDLE and doing so increments the generation number.
+ */
+static inline void blk_mq_rq_update_state(struct request *rq,
+					  enum mq_rq_state state)
+{
+	u64 old_val = READ_ONCE(rq->gstate);
+	u64 new_val = (old_val & ~MQ_RQ_STATE_MASK) | state;
+
+	if (state == MQ_RQ_IN_FLIGHT) {
+		WARN_ON_ONCE((old_val & MQ_RQ_STATE_MASK) != MQ_RQ_IDLE);
+		new_val += MQ_RQ_GEN_INC;
+	}
+
+	/* avoid exposing interim values */
+	WRITE_ONCE(rq->gstate, new_val);
+}
+
 static inline struct blk_mq_ctx *__blk_mq_get_ctx(struct request_queue *q,
 					   unsigned int cpu)
 {
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 870484e..cbea895 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -853,6 +853,10 @@ struct kobj_type blk_queue_ktype = {
 	.release	= blk_release_queue,
 };
 
+/**
+ * blk_register_queue - register a block layer queue with sysfs
+ * @disk: Disk of which the request queue should be registered with sysfs.
+ */
 int blk_register_queue(struct gendisk *disk)
 {
 	int ret;
@@ -909,11 +913,12 @@ int blk_register_queue(struct gendisk *disk)
 	if (q->request_fn || (q->mq_ops && q->elevator)) {
 		ret = elv_register_queue(q);
 		if (ret) {
+			mutex_unlock(&q->sysfs_lock);
 			kobject_uevent(&q->kobj, KOBJ_REMOVE);
 			kobject_del(&q->kobj);
 			blk_trace_remove_sysfs(dev);
 			kobject_put(&dev->kobj);
-			goto unlock;
+			return ret;
 		}
 	}
 	ret = 0;
@@ -921,7 +926,15 @@ int blk_register_queue(struct gendisk *disk)
 	mutex_unlock(&q->sysfs_lock);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(blk_register_queue);
 
+/**
+ * blk_unregister_queue - counterpart of blk_register_queue()
+ * @disk: Disk of which the request queue should be unregistered from sysfs.
+ *
+ * Note: the caller is responsible for guaranteeing that this function is called
+ * after blk_register_queue() has finished.
+ */
 void blk_unregister_queue(struct gendisk *disk)
 {
 	struct request_queue *q = disk->queue;
@@ -929,21 +942,39 @@ void blk_unregister_queue(struct gendisk *disk)
 	if (WARN_ON(!q))
 		return;
 
+	/* Return early if disk->queue was never registered. */
+	if (!test_bit(QUEUE_FLAG_REGISTERED, &q->queue_flags))
+		return;
+
+	/*
+	 * Since sysfs_remove_dir() prevents adding new directory entries
+	 * before removal of existing entries starts, protect against
+	 * concurrent elv_iosched_store() calls.
+	 */
 	mutex_lock(&q->sysfs_lock);
-	queue_flag_clear_unlocked(QUEUE_FLAG_REGISTERED, q);
-	mutex_unlock(&q->sysfs_lock);
 
-	wbt_exit(q);
+	spin_lock_irq(q->queue_lock);
+	queue_flag_clear(QUEUE_FLAG_REGISTERED, q);
+	spin_unlock_irq(q->queue_lock);
 
-
+	/*
+	 * Remove the sysfs attributes before unregistering the queue data
+	 * structures that can be modified through sysfs.
+	 */
 	if (q->mq_ops)
 		blk_mq_unregister_dev(disk_to_dev(disk), q);
-
-	if (q->request_fn || (q->mq_ops && q->elevator))
-		elv_unregister_queue(q);
+	mutex_unlock(&q->sysfs_lock);
 
 	kobject_uevent(&q->kobj, KOBJ_REMOVE);
 	kobject_del(&q->kobj);
 	blk_trace_remove_sysfs(disk_to_dev(disk));
+
+	wbt_exit(q);
+
+	mutex_lock(&q->sysfs_lock);
+	if (q->request_fn || (q->mq_ops && q->elevator))
+		elv_unregister_queue(q);
+	mutex_unlock(&q->sysfs_lock);
+
 	kobject_put(&disk_to_dev(disk)->kobj);
 }
diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index d19f416..c5a1316 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -216,9 +216,9 @@ struct throtl_data
 
 	unsigned int scale;
 
-	struct latency_bucket tmp_buckets[LATENCY_BUCKET_SIZE];
-	struct avg_latency_bucket avg_buckets[LATENCY_BUCKET_SIZE];
-	struct latency_bucket __percpu *latency_buckets;
+	struct latency_bucket tmp_buckets[2][LATENCY_BUCKET_SIZE];
+	struct avg_latency_bucket avg_buckets[2][LATENCY_BUCKET_SIZE];
+	struct latency_bucket __percpu *latency_buckets[2];
 	unsigned long last_calculate_time;
 	unsigned long filtered_latency;
 
@@ -1511,10 +1511,20 @@ static struct cftype throtl_legacy_files[] = {
 		.seq_show = blkg_print_stat_bytes,
 	},
 	{
+		.name = "throttle.io_service_bytes_recursive",
+		.private = (unsigned long)&blkcg_policy_throtl,
+		.seq_show = blkg_print_stat_bytes_recursive,
+	},
+	{
 		.name = "throttle.io_serviced",
 		.private = (unsigned long)&blkcg_policy_throtl,
 		.seq_show = blkg_print_stat_ios,
 	},
+	{
+		.name = "throttle.io_serviced_recursive",
+		.private = (unsigned long)&blkcg_policy_throtl,
+		.seq_show = blkg_print_stat_ios_recursive,
+	},
 	{ }	/* terminate */
 };
 
@@ -2040,10 +2050,10 @@ static void blk_throtl_update_idletime(struct throtl_grp *tg)
 #ifdef CONFIG_BLK_DEV_THROTTLING_LOW
 static void throtl_update_latency_buckets(struct throtl_data *td)
 {
-	struct avg_latency_bucket avg_latency[LATENCY_BUCKET_SIZE];
-	int i, cpu;
-	unsigned long last_latency = 0;
-	unsigned long latency;
+	struct avg_latency_bucket avg_latency[2][LATENCY_BUCKET_SIZE];
+	int i, cpu, rw;
+	unsigned long last_latency[2] = { 0 };
+	unsigned long latency[2];
 
 	if (!blk_queue_nonrot(td->queue))
 		return;
@@ -2052,56 +2062,67 @@ static void throtl_update_latency_buckets(struct throtl_data *td)
 	td->last_calculate_time = jiffies;
 
 	memset(avg_latency, 0, sizeof(avg_latency));
-	for (i = 0; i < LATENCY_BUCKET_SIZE; i++) {
-		struct latency_bucket *tmp = &td->tmp_buckets[i];
+	for (rw = READ; rw <= WRITE; rw++) {
+		for (i = 0; i < LATENCY_BUCKET_SIZE; i++) {
+			struct latency_bucket *tmp = &td->tmp_buckets[rw][i];
 
-		for_each_possible_cpu(cpu) {
-			struct latency_bucket *bucket;
+			for_each_possible_cpu(cpu) {
+				struct latency_bucket *bucket;
 
-			/* this isn't race free, but ok in practice */
-			bucket = per_cpu_ptr(td->latency_buckets, cpu);
-			tmp->total_latency += bucket[i].total_latency;
-			tmp->samples += bucket[i].samples;
-			bucket[i].total_latency = 0;
-			bucket[i].samples = 0;
-		}
+				/* this isn't race free, but ok in practice */
+				bucket = per_cpu_ptr(td->latency_buckets[rw],
+					cpu);
+				tmp->total_latency += bucket[i].total_latency;
+				tmp->samples += bucket[i].samples;
+				bucket[i].total_latency = 0;
+				bucket[i].samples = 0;
+			}
 
-		if (tmp->samples >= 32) {
-			int samples = tmp->samples;
+			if (tmp->samples >= 32) {
+				int samples = tmp->samples;
 
-			latency = tmp->total_latency;
+				latency[rw] = tmp->total_latency;
 
-			tmp->total_latency = 0;
-			tmp->samples = 0;
-			latency /= samples;
-			if (latency == 0)
-				continue;
-			avg_latency[i].latency = latency;
+				tmp->total_latency = 0;
+				tmp->samples = 0;
+				latency[rw] /= samples;
+				if (latency[rw] == 0)
+					continue;
+				avg_latency[rw][i].latency = latency[rw];
+			}
 		}
 	}
 
-	for (i = 0; i < LATENCY_BUCKET_SIZE; i++) {
-		if (!avg_latency[i].latency) {
-			if (td->avg_buckets[i].latency < last_latency)
-				td->avg_buckets[i].latency = last_latency;
-			continue;
+	for (rw = READ; rw <= WRITE; rw++) {
+		for (i = 0; i < LATENCY_BUCKET_SIZE; i++) {
+			if (!avg_latency[rw][i].latency) {
+				if (td->avg_buckets[rw][i].latency < last_latency[rw])
+					td->avg_buckets[rw][i].latency =
+						last_latency[rw];
+				continue;
+			}
+
+			if (!td->avg_buckets[rw][i].valid)
+				latency[rw] = avg_latency[rw][i].latency;
+			else
+				latency[rw] = (td->avg_buckets[rw][i].latency * 7 +
+					avg_latency[rw][i].latency) >> 3;
+
+			td->avg_buckets[rw][i].latency = max(latency[rw],
+				last_latency[rw]);
+			td->avg_buckets[rw][i].valid = true;
+			last_latency[rw] = td->avg_buckets[rw][i].latency;
 		}
-
-		if (!td->avg_buckets[i].valid)
-			latency = avg_latency[i].latency;
-		else
-			latency = (td->avg_buckets[i].latency * 7 +
-				avg_latency[i].latency) >> 3;
-
-		td->avg_buckets[i].latency = max(latency, last_latency);
-		td->avg_buckets[i].valid = true;
-		last_latency = td->avg_buckets[i].latency;
 	}
 
 	for (i = 0; i < LATENCY_BUCKET_SIZE; i++)
 		throtl_log(&td->service_queue,
-			"Latency bucket %d: latency=%ld, valid=%d", i,
-			td->avg_buckets[i].latency, td->avg_buckets[i].valid);
+			"Latency bucket %d: read latency=%ld, read valid=%d, "
+			"write latency=%ld, write valid=%d", i,
+			td->avg_buckets[READ][i].latency,
+			td->avg_buckets[READ][i].valid,
+			td->avg_buckets[WRITE][i].latency,
+			td->avg_buckets[WRITE][i].valid);
 }
 #else
 static inline void throtl_update_latency_buckets(struct throtl_data *td)
@@ -2242,16 +2263,17 @@ static void throtl_track_latency(struct throtl_data *td, sector_t size,
 	struct latency_bucket *latency;
 	int index;
 
-	if (!td || td->limit_index != LIMIT_LOW || op != REQ_OP_READ ||
+	if (!td || td->limit_index != LIMIT_LOW ||
+	    !(op == REQ_OP_READ || op == REQ_OP_WRITE) ||
 	    !blk_queue_nonrot(td->queue))
 		return;
 
 	index = request_bucket_index(size);
 
-	latency = get_cpu_ptr(td->latency_buckets);
+	latency = get_cpu_ptr(td->latency_buckets[op]);
 	latency[index].total_latency += time;
 	latency[index].samples++;
-	put_cpu_ptr(td->latency_buckets);
+	put_cpu_ptr(td->latency_buckets[op]);
 }
 
 void blk_throtl_stat_add(struct request *rq, u64 time_ns)
@@ -2270,6 +2292,7 @@ void blk_throtl_bio_endio(struct bio *bio)
 	unsigned long finish_time;
 	unsigned long start_time;
 	unsigned long lat;
+	int rw = bio_data_dir(bio);
 
 	tg = bio->bi_cg_private;
 	if (!tg)
@@ -2298,7 +2321,7 @@ void blk_throtl_bio_endio(struct bio *bio)
 
 		bucket = request_bucket_index(
 			blk_stat_size(&bio->bi_issue_stat));
-		threshold = tg->td->avg_buckets[bucket].latency +
+		threshold = tg->td->avg_buckets[rw][bucket].latency +
 			tg->latency_target;
 		if (lat > threshold)
 			tg->bad_bio_cnt++;
@@ -2391,9 +2414,16 @@ int blk_throtl_init(struct request_queue *q)
 	td = kzalloc_node(sizeof(*td), GFP_KERNEL, q->node);
 	if (!td)
 		return -ENOMEM;
-	td->latency_buckets = __alloc_percpu(sizeof(struct latency_bucket) *
+	td->latency_buckets[READ] = __alloc_percpu(sizeof(struct latency_bucket) *
 		LATENCY_BUCKET_SIZE, __alignof__(u64));
-	if (!td->latency_buckets) {
+	if (!td->latency_buckets[READ]) {
+		kfree(td);
+		return -ENOMEM;
+	}
+	td->latency_buckets[WRITE] = __alloc_percpu(sizeof(struct latency_bucket) *
+		LATENCY_BUCKET_SIZE, __alignof__(u64));
+	if (!td->latency_buckets[WRITE]) {
+		free_percpu(td->latency_buckets[READ]);
 		kfree(td);
 		return -ENOMEM;
 	}
@@ -2412,7 +2442,8 @@ int blk_throtl_init(struct request_queue *q)
 	/* activate policy */
 	ret = blkcg_activate_policy(q, &blkcg_policy_throtl);
 	if (ret) {
-		free_percpu(td->latency_buckets);
+		free_percpu(td->latency_buckets[READ]);
+		free_percpu(td->latency_buckets[WRITE]);
 		kfree(td);
 	}
 	return ret;
@@ -2423,7 +2454,8 @@ void blk_throtl_exit(struct request_queue *q)
 	BUG_ON(!q->td);
 	throtl_shutdown_wq(q);
 	blkcg_deactivate_policy(q, &blkcg_policy_throtl);
-	free_percpu(q->td->latency_buckets);
+	free_percpu(q->td->latency_buckets[READ]);
+	free_percpu(q->td->latency_buckets[WRITE]);
 	kfree(q->td);
 }
 
@@ -2441,15 +2473,17 @@ void blk_throtl_register_queue(struct request_queue *q)
 	} else {
 		td->throtl_slice = DFL_THROTL_SLICE_HD;
 		td->filtered_latency = LATENCY_FILTERED_HD;
-		for (i = 0; i < LATENCY_BUCKET_SIZE; i++)
-			td->avg_buckets[i].latency = DFL_HD_BASELINE_LATENCY;
+		for (i = 0; i < LATENCY_BUCKET_SIZE; i++) {
+			td->avg_buckets[READ][i].latency = DFL_HD_BASELINE_LATENCY;
+			td->avg_buckets[WRITE][i].latency = DFL_HD_BASELINE_LATENCY;
+		}
 	}
 #ifndef CONFIG_BLK_DEV_THROTTLING_LOW
 	/* if no low limit, use previous default */
 	td->throtl_slice = DFL_THROTL_SLICE_HD;
 #endif
 
-	td->track_bio_latency = !q->mq_ops && !q->request_fn;
+	td->track_bio_latency = !queue_is_rq_based(q);
 	if (!td->track_bio_latency)
 		blk_stat_enable_accounting(q);
 }
diff --git a/block/blk-timeout.c b/block/blk-timeout.c
index 764ecf9..a05e367 100644
--- a/block/blk-timeout.c
+++ b/block/blk-timeout.c
@@ -112,7 +112,9 @@ static void blk_rq_timed_out(struct request *req)
 static void blk_rq_check_expired(struct request *rq, unsigned long *next_timeout,
 			  unsigned int *next_set)
 {
-	if (time_after_eq(jiffies, rq->deadline)) {
+	const unsigned long deadline = blk_rq_deadline(rq);
+
+	if (time_after_eq(jiffies, deadline)) {
 		list_del_init(&rq->timeout_list);
 
 		/*
@@ -120,8 +122,8 @@ static void blk_rq_check_expired(struct request *rq, unsigned long *next_timeout
 		 */
 		if (!blk_mark_rq_complete(rq))
 			blk_rq_timed_out(rq);
-	} else if (!*next_set || time_after(*next_timeout, rq->deadline)) {
-		*next_timeout = rq->deadline;
+	} else if (!*next_set || time_after(*next_timeout, deadline)) {
+		*next_timeout = deadline;
 		*next_set = 1;
 	}
 }
@@ -156,12 +158,17 @@ void blk_timeout_work(struct work_struct *work)
  */
 void blk_abort_request(struct request *req)
 {
-	if (blk_mark_rq_complete(req))
-		return;
-
 	if (req->q->mq_ops) {
-		blk_mq_rq_timed_out(req, false);
+		/*
+		 * All we need to ensure is that timeout scan takes place
+		 * immediately and that scan sees the new timeout value.
+		 * No need for fancy synchronizations.
+		 */
+		blk_rq_set_deadline(req, jiffies);
+		mod_timer(&req->q->timeout, 0);
 	} else {
+		if (blk_mark_rq_complete(req))
+			return;
 		blk_delete_timer(req);
 		blk_rq_timed_out(req);
 	}
@@ -208,7 +215,8 @@ void blk_add_timer(struct request *req)
 	if (!req->timeout)
 		req->timeout = q->rq_timeout;
 
-	WRITE_ONCE(req->deadline, jiffies + req->timeout);
+	blk_rq_set_deadline(req, jiffies + req->timeout);
+	req->rq_flags &= ~RQF_MQ_TIMEOUT_EXPIRED;
 
 	/*
 	 * Only the non-mq case needs to add the request to a protected list.
@@ -222,7 +230,7 @@ void blk_add_timer(struct request *req)
 	 * than an existing one, modify the timer. Round up to next nearest
 	 * second.
 	 */
-	expiry = blk_rq_timeout(round_jiffies_up(req->deadline));
+	expiry = blk_rq_timeout(round_jiffies_up(blk_rq_deadline(req)));
 
 	if (!timer_pending(&q->timeout) ||
 	    time_before(expiry, q->timeout.expires)) {
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index ff57fb5..acb7252 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -22,6 +22,48 @@ static inline sector_t blk_zone_start(struct request_queue *q,
 }
 
 /*
+ * Return true if a request is a write requests that needs zone write locking.
+ */
+bool blk_req_needs_zone_write_lock(struct request *rq)
+{
+	if (!rq->q->seq_zones_wlock)
+		return false;
+
+	if (blk_rq_is_passthrough(rq))
+		return false;
+
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE_ZEROES:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_WRITE:
+		return blk_rq_zone_is_seq(rq);
+	default:
+		return false;
+	}
+}
+EXPORT_SYMBOL_GPL(blk_req_needs_zone_write_lock);
+
+void __blk_req_zone_write_lock(struct request *rq)
+{
+	if (WARN_ON_ONCE(test_and_set_bit(blk_rq_zone_no(rq),
+					  rq->q->seq_zones_wlock)))
+		return;
+
+	WARN_ON_ONCE(rq->rq_flags & RQF_ZONE_WRITE_LOCKED);
+	rq->rq_flags |= RQF_ZONE_WRITE_LOCKED;
+}
+EXPORT_SYMBOL_GPL(__blk_req_zone_write_lock);
+
+void __blk_req_zone_write_unlock(struct request *rq)
+{
+	rq->rq_flags &= ~RQF_ZONE_WRITE_LOCKED;
+	if (rq->q->seq_zones_wlock)
+		WARN_ON_ONCE(!test_and_clear_bit(blk_rq_zone_no(rq),
+						 rq->q->seq_zones_wlock));
+}
+EXPORT_SYMBOL_GPL(__blk_req_zone_write_unlock);
+
+/*
  * Check that a zone report belongs to the partition.
  * If yes, fix its start sector and write pointer, copy it in the
  * zone information array and return true. Return false otherwise.
diff --git a/block/blk.h b/block/blk.h
index 3f14469..46db5dc 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -120,33 +120,23 @@ void blk_account_io_completion(struct request *req, unsigned int bytes);
 void blk_account_io_done(struct request *req);
 
 /*
- * Internal atomic flags for request handling
- */
-enum rq_atomic_flags {
-	/*
-	 * Keep these two bits first - not because we depend on the
-	 * value of them, but we do depend on them being in the same
-	 * byte of storage to ensure ordering on writes. Keeping them
-	 * first will achieve that nicely.
-	 */
-	REQ_ATOM_COMPLETE = 0,
-	REQ_ATOM_STARTED,
-
-	REQ_ATOM_POLL_SLEPT,
-};
-
-/*
  * EH timer and IO completion will both attempt to 'grab' the request, make
- * sure that only one of them succeeds
+ * sure that only one of them succeeds. Steal the bottom bit of the
+ * __deadline field for this.
  */
 static inline int blk_mark_rq_complete(struct request *rq)
 {
-	return test_and_set_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
+	return test_and_set_bit(0, &rq->__deadline);
 }
 
 static inline void blk_clear_rq_complete(struct request *rq)
 {
-	clear_bit(REQ_ATOM_COMPLETE, &rq->atomic_flags);
+	clear_bit(0, &rq->__deadline);
+}
+
+static inline bool blk_rq_is_complete(struct request *rq)
+{
+	return test_bit(0, &rq->__deadline);
 }
 
 /*
@@ -172,6 +162,9 @@ static inline void elv_deactivate_rq(struct request_queue *q, struct request *rq
 		e->type->ops.sq.elevator_deactivate_req_fn(q, rq);
 }
 
+int elv_register_queue(struct request_queue *q);
+void elv_unregister_queue(struct request_queue *q);
+
 struct hd_struct *__disk_get_part(struct gendisk *disk, int partno);
 
 #ifdef CONFIG_FAIL_IO_TIMEOUT
@@ -246,6 +239,21 @@ static inline void req_set_nomerge(struct request_queue *q, struct request *req)
 }
 
 /*
+ * Steal a bit from this field for legacy IO path atomic IO marking. Note that
+ * setting the deadline clears the bottom bit, potentially clearing the
+ * completed bit. The user has to be OK with this (current ones are fine).
+ */
+static inline void blk_rq_set_deadline(struct request *rq, unsigned long time)
+{
+	rq->__deadline = time & ~0x1UL;
+}
+
+static inline unsigned long blk_rq_deadline(struct request *rq)
+{
+	return rq->__deadline & ~0x1UL;
+}
+
+/*
  * Internal io_context interface
  */
 void get_io_context(struct io_context *ioc);
@@ -330,4 +338,6 @@ static inline void blk_queue_bounce(struct request_queue *q, struct bio **bio)
 }
 #endif /* CONFIG_BOUNCE */
 
+extern void blk_drain_queue(struct request_queue *q);
+
 #endif /* BLK_INTERNAL_H */
diff --git a/block/bounce.c b/block/bounce.c
index 1d05c42..6a3e682 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -113,45 +113,50 @@ int init_emergency_isa_pool(void)
 static void copy_to_high_bio_irq(struct bio *to, struct bio *from)
 {
 	unsigned char *vfrom;
-	struct bio_vec tovec, *fromvec = from->bi_io_vec;
+	struct bio_vec tovec, fromvec;
 	struct bvec_iter iter;
+	/*
+	 * The bio of @from is created by bounce, so we can iterate
+	 * its bvec from start to end, but the @from->bi_iter can't be
+	 * trusted because it might be changed by splitting.
+	 */
+	struct bvec_iter from_iter = BVEC_ITER_ALL_INIT;
 
 	bio_for_each_segment(tovec, to, iter) {
-		if (tovec.bv_page != fromvec->bv_page) {
+		fromvec = bio_iter_iovec(from, from_iter);
+		if (tovec.bv_page != fromvec.bv_page) {
 			/*
 			 * fromvec->bv_offset and fromvec->bv_len might have
 			 * been modified by the block layer, so use the original
 			 * copy, bounce_copy_vec already uses tovec->bv_len
 			 */
-			vfrom = page_address(fromvec->bv_page) +
+			vfrom = page_address(fromvec.bv_page) +
 				tovec.bv_offset;
 
 			bounce_copy_vec(&tovec, vfrom);
 			flush_dcache_page(tovec.bv_page);
 		}
-
-		fromvec++;
+		bio_advance_iter(from, &from_iter, tovec.bv_len);
 	}
 }
 
 static void bounce_end_io(struct bio *bio, mempool_t *pool)
 {
 	struct bio *bio_orig = bio->bi_private;
-	struct bio_vec *bvec, *org_vec;
+	struct bio_vec *bvec, orig_vec;
 	int i;
-	int start = bio_orig->bi_iter.bi_idx;
+	struct bvec_iter orig_iter = bio_orig->bi_iter;
 
 	/*
 	 * free up bounce indirect pages used
 	 */
 	bio_for_each_segment_all(bvec, bio, i) {
-		org_vec = bio_orig->bi_io_vec + i + start;
-
-		if (bvec->bv_page == org_vec->bv_page)
-			continue;
-
-		dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
-		mempool_free(bvec->bv_page, pool);
+		orig_vec = bio_iter_iovec(bio_orig, orig_iter);
+		if (bvec->bv_page != orig_vec.bv_page) {
+			dec_zone_page_state(bvec->bv_page, NR_BOUNCE);
+			mempool_free(bvec->bv_page, pool);
+		}
+		bio_advance_iter(bio_orig, &orig_iter, orig_vec.bv_len);
 	}
 
 	bio_orig->bi_status = bio->bi_status;
diff --git a/block/bsg-lib.c b/block/bsg-lib.c
index 15d25cc..1474153 100644
--- a/block/bsg-lib.c
+++ b/block/bsg-lib.c
@@ -30,7 +30,7 @@
 
 /**
  * bsg_teardown_job - routine to teardown a bsg job
- * @job: bsg_job that is to be torn down
+ * @kref: kref inside bsg_job that is to be torn down
  */
 static void bsg_teardown_job(struct kref *kref)
 {
@@ -251,6 +251,7 @@ static void bsg_exit_rq(struct request_queue *q, struct request *req)
  * @name: device to give bsg device
  * @job_fn: bsg job handler
  * @dd_job_size: size of LLD data needed for each job
+ * @release: @dev release function
  */
 struct request_queue *bsg_setup_queue(struct device *dev, const char *name,
 		bsg_job_fn *job_fn, int dd_job_size,
diff --git a/block/bsg.c b/block/bsg.c
index 452f94f..a1bcbb6 100644
--- a/block/bsg.c
+++ b/block/bsg.c
@@ -32,6 +32,9 @@
 #define BSG_DESCRIPTION	"Block layer SCSI generic (bsg) driver"
 #define BSG_VERSION	"0.4"
 
+#define bsg_dbg(bd, fmt, ...) \
+	pr_debug("%s: " fmt, (bd)->name, ##__VA_ARGS__)
+
 struct bsg_device {
 	struct request_queue *queue;
 	spinlock_t lock;
@@ -55,14 +58,6 @@ enum {
 #define BSG_DEFAULT_CMDS	64
 #define BSG_MAX_DEVS		32768
 
-#undef BSG_DEBUG
-
-#ifdef BSG_DEBUG
-#define dprintk(fmt, args...) printk(KERN_ERR "%s: " fmt, __func__, ##args)
-#else
-#define dprintk(fmt, args...)
-#endif
-
 static DEFINE_MUTEX(bsg_mutex);
 static DEFINE_IDR(bsg_minor_idr);
 
@@ -123,7 +118,7 @@ static struct bsg_command *bsg_alloc_command(struct bsg_device *bd)
 
 	bc->bd = bd;
 	INIT_LIST_HEAD(&bc->list);
-	dprintk("%s: returning free cmd %p\n", bd->name, bc);
+	bsg_dbg(bd, "returning free cmd %p\n", bc);
 	return bc;
 out:
 	spin_unlock_irq(&bd->lock);
@@ -222,7 +217,8 @@ bsg_map_hdr(struct bsg_device *bd, struct sg_io_v4 *hdr, fmode_t mode)
 	if (!bcd->class_dev)
 		return ERR_PTR(-ENXIO);
 
-	dprintk("map hdr %llx/%u %llx/%u\n", (unsigned long long) hdr->dout_xferp,
+	bsg_dbg(bd, "map hdr %llx/%u %llx/%u\n",
+		(unsigned long long) hdr->dout_xferp,
 		hdr->dout_xfer_len, (unsigned long long) hdr->din_xferp,
 		hdr->din_xfer_len);
 
@@ -299,8 +295,8 @@ static void bsg_rq_end_io(struct request *rq, blk_status_t status)
 	struct bsg_device *bd = bc->bd;
 	unsigned long flags;
 
-	dprintk("%s: finished rq %p bc %p, bio %p\n",
-		bd->name, rq, bc, bc->bio);
+	bsg_dbg(bd, "finished rq %p bc %p, bio %p\n",
+		rq, bc, bc->bio);
 
 	bc->hdr.duration = jiffies_to_msecs(jiffies - bc->hdr.duration);
 
@@ -333,7 +329,7 @@ static void bsg_add_command(struct bsg_device *bd, struct request_queue *q,
 	list_add_tail(&bc->list, &bd->busy_list);
 	spin_unlock_irq(&bd->lock);
 
-	dprintk("%s: queueing rq %p, bc %p\n", bd->name, rq, bc);
+	bsg_dbg(bd, "queueing rq %p, bc %p\n", rq, bc);
 
 	rq->end_io_data = bc;
 	blk_execute_rq_nowait(q, NULL, rq, at_head, bsg_rq_end_io);
@@ -379,7 +375,7 @@ static struct bsg_command *bsg_get_done_cmd(struct bsg_device *bd)
 		}
 	} while (1);
 
-	dprintk("%s: returning done %p\n", bd->name, bc);
+	bsg_dbg(bd, "returning done %p\n", bc);
 
 	return bc;
 }
@@ -390,7 +386,7 @@ static int blk_complete_sgv4_hdr_rq(struct request *rq, struct sg_io_v4 *hdr,
 	struct scsi_request *req = scsi_req(rq);
 	int ret = 0;
 
-	dprintk("rq %p bio %p 0x%x\n", rq, bio, req->result);
+	pr_debug("rq %p bio %p 0x%x\n", rq, bio, req->result);
 	/*
 	 * fill in all the output members
 	 */
@@ -469,7 +465,7 @@ static int bsg_complete_all_commands(struct bsg_device *bd)
 	struct bsg_command *bc;
 	int ret, tret;
 
-	dprintk("%s: entered\n", bd->name);
+	bsg_dbg(bd, "entered\n");
 
 	/*
 	 * wait for all commands to complete
@@ -572,7 +568,7 @@ bsg_read(struct file *file, char __user *buf, size_t count, loff_t *ppos)
 	int ret;
 	ssize_t bytes_read;
 
-	dprintk("%s: read %zd bytes\n", bd->name, count);
+	bsg_dbg(bd, "read %zd bytes\n", count);
 
 	bsg_set_block(bd, file);
 
@@ -646,7 +642,7 @@ bsg_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
 	ssize_t bytes_written;
 	int ret;
 
-	dprintk("%s: write %zd bytes\n", bd->name, count);
+	bsg_dbg(bd, "write %zd bytes\n", count);
 
 	if (unlikely(uaccess_kernel()))
 		return -EINVAL;
@@ -664,7 +660,7 @@ bsg_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos)
 	if (!bytes_written || err_block_err(ret))
 		bytes_written = ret;
 
-	dprintk("%s: returning %zd\n", bd->name, bytes_written);
+	bsg_dbg(bd, "returning %zd\n", bytes_written);
 	return bytes_written;
 }
 
@@ -717,7 +713,7 @@ static int bsg_put_device(struct bsg_device *bd)
 	hlist_del(&bd->dev_list);
 	mutex_unlock(&bsg_mutex);
 
-	dprintk("%s: tearing down\n", bd->name);
+	bsg_dbg(bd, "tearing down\n");
 
 	/*
 	 * close can always block
@@ -744,9 +740,7 @@ static struct bsg_device *bsg_add_device(struct inode *inode,
 					 struct file *file)
 {
 	struct bsg_device *bd;
-#ifdef BSG_DEBUG
 	unsigned char buf[32];
-#endif
 
 	if (!blk_queue_scsi_passthrough(rq)) {
 		WARN_ONCE(true, "Attempt to register a non-SCSI queue\n");
@@ -771,7 +765,7 @@ static struct bsg_device *bsg_add_device(struct inode *inode,
 	hlist_add_head(&bd->dev_list, bsg_dev_idx_hash(iminor(inode)));
 
 	strncpy(bd->name, dev_name(rq->bsg_dev.class_dev), sizeof(bd->name) - 1);
-	dprintk("bound to <%s>, max queue %d\n",
+	bsg_dbg(bd, "bound to <%s>, max queue %d\n",
 		format_dev_t(buf, inode->i_rdev), bd->max_queue);
 
 	mutex_unlock(&bsg_mutex);
diff --git a/block/deadline-iosched.c b/block/deadline-iosched.c
index b83f774..9de9f15 100644
--- a/block/deadline-iosched.c
+++ b/block/deadline-iosched.c
@@ -50,8 +50,6 @@ struct deadline_data {
 	int front_merges;
 };
 
-static void deadline_move_request(struct deadline_data *, struct request *);
-
 static inline struct rb_root *
 deadline_rb_root(struct deadline_data *dd, struct request *rq)
 {
@@ -100,6 +98,12 @@ deadline_add_request(struct request_queue *q, struct request *rq)
 	struct deadline_data *dd = q->elevator->elevator_data;
 	const int data_dir = rq_data_dir(rq);
 
+	/*
+	 * This may be a requeue of a write request that has locked its
+	 * target zone. If it is the case, this releases the zone lock.
+	 */
+	blk_req_zone_write_unlock(rq);
+
 	deadline_add_rq_rb(dd, rq);
 
 	/*
@@ -190,6 +194,12 @@ deadline_move_to_dispatch(struct deadline_data *dd, struct request *rq)
 {
 	struct request_queue *q = rq->q;
 
+	/*
+	 * For a zoned block device, write requests must write lock their
+	 * target zone.
+	 */
+	blk_req_zone_write_lock(rq);
+
 	deadline_remove_request(q, rq);
 	elv_dispatch_add_tail(q, rq);
 }
@@ -231,6 +241,69 @@ static inline int deadline_check_fifo(struct deadline_data *dd, int ddir)
 }
 
 /*
+ * For the specified data direction, return the next request to dispatch using
+ * arrival ordered lists.
+ */
+static struct request *
+deadline_fifo_request(struct deadline_data *dd, int data_dir)
+{
+	struct request *rq;
+
+	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
+		return NULL;
+
+	if (list_empty(&dd->fifo_list[data_dir]))
+		return NULL;
+
+	rq = rq_entry_fifo(dd->fifo_list[data_dir].next);
+	if (data_dir == READ || !blk_queue_is_zoned(rq->q))
+		return rq;
+
+	/*
+	 * Look for a write request that can be dispatched, that is one with
+	 * an unlocked target zone.
+	 */
+	list_for_each_entry(rq, &dd->fifo_list[WRITE], queuelist) {
+		if (blk_req_can_dispatch_to_zone(rq))
+			return rq;
+	}
+
+	return NULL;
+}
+
+/*
+ * For the specified data direction, return the next request to dispatch using
+ * sector position sorted lists.
+ */
+static struct request *
+deadline_next_request(struct deadline_data *dd, int data_dir)
+{
+	struct request *rq;
+
+	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
+		return NULL;
+
+	rq = dd->next_rq[data_dir];
+	if (!rq)
+		return NULL;
+
+	if (data_dir == READ || !blk_queue_is_zoned(rq->q))
+		return rq;
+
+	/*
+	 * Look for a write request that can be dispatched, that is one with
+	 * an unlocked target zone.
+	 */
+	while (rq) {
+		if (blk_req_can_dispatch_to_zone(rq))
+			return rq;
+		rq = deadline_latter_request(rq);
+	}
+
+	return NULL;
+}
+
+/*
  * deadline_dispatch_requests selects the best request according to
  * read/write expire, fifo_batch, etc
  */
@@ -239,16 +312,15 @@ static int deadline_dispatch_requests(struct request_queue *q, int force)
 	struct deadline_data *dd = q->elevator->elevator_data;
 	const int reads = !list_empty(&dd->fifo_list[READ]);
 	const int writes = !list_empty(&dd->fifo_list[WRITE]);
-	struct request *rq;
+	struct request *rq, *next_rq;
 	int data_dir;
 
 	/*
 	 * batches are currently reads XOR writes
 	 */
-	if (dd->next_rq[WRITE])
-		rq = dd->next_rq[WRITE];
-	else
-		rq = dd->next_rq[READ];
+	rq = deadline_next_request(dd, WRITE);
+	if (!rq)
+		rq = deadline_next_request(dd, READ);
 
 	if (rq && dd->batching < dd->fifo_batch)
 		/* we have a next request are still entitled to batch */
@@ -262,7 +334,8 @@ static int deadline_dispatch_requests(struct request_queue *q, int force)
 	if (reads) {
 		BUG_ON(RB_EMPTY_ROOT(&dd->sort_list[READ]));
 
-		if (writes && (dd->starved++ >= dd->writes_starved))
+		if (deadline_fifo_request(dd, WRITE) &&
+		    (dd->starved++ >= dd->writes_starved))
 			goto dispatch_writes;
 
 		data_dir = READ;
@@ -291,21 +364,29 @@ static int deadline_dispatch_requests(struct request_queue *q, int force)
 	/*
 	 * we are not running a batch, find best request for selected data_dir
 	 */
-	if (deadline_check_fifo(dd, data_dir) || !dd->next_rq[data_dir]) {
+	next_rq = deadline_next_request(dd, data_dir);
+	if (deadline_check_fifo(dd, data_dir) || !next_rq) {
 		/*
 		 * A deadline has expired, the last request was in the other
 		 * direction, or we have run out of higher-sectored requests.
 		 * Start again from the request with the earliest expiry time.
 		 */
-		rq = rq_entry_fifo(dd->fifo_list[data_dir].next);
+		rq = deadline_fifo_request(dd, data_dir);
 	} else {
 		/*
 		 * The last req was the same dir and we have a next request in
 		 * sort order. No expired requests so continue on from here.
 		 */
-		rq = dd->next_rq[data_dir];
+		rq = next_rq;
 	}
 
+	/*
+	 * For a zoned block device, if we only have writes queued and none of
+	 * them can be dispatched, rq will be NULL.
+	 */
+	if (!rq)
+		return 0;
+
 	dd->batching = 0;
 
 dispatch_request:
@@ -318,6 +399,16 @@ static int deadline_dispatch_requests(struct request_queue *q, int force)
 	return 1;
 }
 
+/*
+ * For zoned block devices, write unlock the target zone of completed
+ * write requests.
+ */
+static void
+deadline_completed_request(struct request_queue *q, struct request *rq)
+{
+	blk_req_zone_write_unlock(rq);
+}
+
 static void deadline_exit_queue(struct elevator_queue *e)
 {
 	struct deadline_data *dd = e->elevator_data;
@@ -439,6 +530,7 @@ static struct elevator_type iosched_deadline = {
 		.elevator_merged_fn =		deadline_merged_request,
 		.elevator_merge_req_fn =	deadline_merged_requests,
 		.elevator_dispatch_fn =		deadline_dispatch_requests,
+		.elevator_completed_req_fn =	deadline_completed_request,
 		.elevator_add_req_fn =		deadline_add_request,
 		.elevator_former_req_fn =	elv_rb_former_request,
 		.elevator_latter_req_fn =	elv_rb_latter_request,
diff --git a/block/elevator.c b/block/elevator.c
index 7bda083..e87e9b43 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -869,6 +869,8 @@ int elv_register_queue(struct request_queue *q)
 	struct elevator_queue *e = q->elevator;
 	int error;
 
+	lockdep_assert_held(&q->sysfs_lock);
+
 	error = kobject_add(&e->kobj, &q->kobj, "%s", "iosched");
 	if (!error) {
 		struct elv_fs_entry *attr = e->type->elevator_attrs;
@@ -886,10 +888,11 @@ int elv_register_queue(struct request_queue *q)
 	}
 	return error;
 }
-EXPORT_SYMBOL(elv_register_queue);
 
 void elv_unregister_queue(struct request_queue *q)
 {
+	lockdep_assert_held(&q->sysfs_lock);
+
 	if (q) {
 		struct elevator_queue *e = q->elevator;
 
@@ -900,7 +903,6 @@ void elv_unregister_queue(struct request_queue *q)
 		wbt_enable_default(q);
 	}
 }
-EXPORT_SYMBOL(elv_unregister_queue);
 
 int elv_register(struct elevator_type *e)
 {
@@ -967,7 +969,10 @@ static int elevator_switch_mq(struct request_queue *q,
 {
 	int ret;
 
+	lockdep_assert_held(&q->sysfs_lock);
+
 	blk_mq_freeze_queue(q);
+	blk_mq_quiesce_queue(q);
 
 	if (q->elevator) {
 		if (q->elevator->registered)
@@ -994,6 +999,7 @@ static int elevator_switch_mq(struct request_queue *q,
 		blk_add_trace_msg(q, "elv switch: none");
 
 out:
+	blk_mq_unquiesce_queue(q);
 	blk_mq_unfreeze_queue(q);
 	return ret;
 }
@@ -1010,6 +1016,8 @@ static int elevator_switch(struct request_queue *q, struct elevator_type *new_e)
 	bool old_registered = false;
 	int err;
 
+	lockdep_assert_held(&q->sysfs_lock);
+
 	if (q->mq_ops)
 		return elevator_switch_mq(q, new_e);
 
diff --git a/block/genhd.c b/block/genhd.c
index 96a66f6..88a53c1 100644
--- a/block/genhd.c
+++ b/block/genhd.c
@@ -629,16 +629,18 @@ static void register_disk(struct device *parent, struct gendisk *disk)
 }
 
 /**
- * device_add_disk - add partitioning information to kernel list
+ * __device_add_disk - add disk information to kernel list
  * @parent: parent device for the disk
  * @disk: per-device partitioning information
+ * @register_queue: register the queue if set to true
  *
  * This function registers the partitioning information in @disk
  * with the kernel.
  *
  * FIXME: error handling
  */
-void device_add_disk(struct device *parent, struct gendisk *disk)
+static void __device_add_disk(struct device *parent, struct gendisk *disk,
+			      bool register_queue)
 {
 	dev_t devt;
 	int retval;
@@ -682,7 +684,8 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 				    exact_match, exact_lock, disk);
 	}
 	register_disk(parent, disk);
-	blk_register_queue(disk);
+	if (register_queue)
+		blk_register_queue(disk);
 
 	/*
 	 * Take an extra ref on queue which will be put on disk_release()
@@ -693,8 +696,19 @@ void device_add_disk(struct device *parent, struct gendisk *disk)
 	disk_add_events(disk);
 	blk_integrity_add(disk);
 }
+
+void device_add_disk(struct device *parent, struct gendisk *disk)
+{
+	__device_add_disk(parent, disk, true);
+}
 EXPORT_SYMBOL(device_add_disk);
 
+void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk)
+{
+	__device_add_disk(parent, disk, false);
+}
+EXPORT_SYMBOL(device_add_disk_no_queue_reg);
+
 void del_gendisk(struct gendisk *disk)
 {
 	struct disk_part_iter piter;
@@ -725,7 +739,8 @@ void del_gendisk(struct gendisk *disk)
 		 * Unregister bdi before releasing device numbers (as they can
 		 * get reused and we'd get clashes in sysfs).
 		 */
-		bdi_unregister(disk->queue->backing_dev_info);
+		if (!(disk->flags & GENHD_FL_HIDDEN))
+			bdi_unregister(disk->queue->backing_dev_info);
 		blk_unregister_queue(disk);
 	} else {
 		WARN_ON(1);
diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 0179e48..c56f211 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -59,6 +59,7 @@ struct deadline_data {
 	int front_merges;
 
 	spinlock_t lock;
+	spinlock_t zone_lock;
 	struct list_head dispatch;
 };
 
@@ -192,13 +193,83 @@ static inline int deadline_check_fifo(struct deadline_data *dd, int ddir)
 }
 
 /*
+ * For the specified data direction, return the next request to
+ * dispatch using arrival ordered lists.
+ */
+static struct request *
+deadline_fifo_request(struct deadline_data *dd, int data_dir)
+{
+	struct request *rq;
+	unsigned long flags;
+
+	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
+		return NULL;
+
+	if (list_empty(&dd->fifo_list[data_dir]))
+		return NULL;
+
+	rq = rq_entry_fifo(dd->fifo_list[data_dir].next);
+	if (data_dir == READ || !blk_queue_is_zoned(rq->q))
+		return rq;
+
+	/*
+	 * Look for a write request that can be dispatched, that is one with
+	 * an unlocked target zone.
+	 */
+	spin_lock_irqsave(&dd->zone_lock, flags);
+	list_for_each_entry(rq, &dd->fifo_list[WRITE], queuelist) {
+		if (blk_req_can_dispatch_to_zone(rq))
+			goto out;
+	}
+	rq = NULL;
+out:
+	spin_unlock_irqrestore(&dd->zone_lock, flags);
+
+	return rq;
+}
+
+/*
+ * For the specified data direction, return the next request to
+ * dispatch using sector position sorted lists.
+ */
+static struct request *
+deadline_next_request(struct deadline_data *dd, int data_dir)
+{
+	struct request *rq;
+	unsigned long flags;
+
+	if (WARN_ON_ONCE(data_dir != READ && data_dir != WRITE))
+		return NULL;
+
+	rq = dd->next_rq[data_dir];
+	if (!rq)
+		return NULL;
+
+	if (data_dir == READ || !blk_queue_is_zoned(rq->q))
+		return rq;
+
+	/*
+	 * Look for a write request that can be dispatched, that is one with
+	 * an unlocked target zone.
+	 */
+	spin_lock_irqsave(&dd->zone_lock, flags);
+	while (rq) {
+		if (blk_req_can_dispatch_to_zone(rq))
+			break;
+		rq = deadline_latter_request(rq);
+	}
+	spin_unlock_irqrestore(&dd->zone_lock, flags);
+
+	return rq;
+}
+
+/*
  * deadline_dispatch_requests selects the best request according to
  * read/write expire, fifo_batch, etc
  */
-static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
+static struct request *__dd_dispatch_request(struct deadline_data *dd)
 {
-	struct deadline_data *dd = hctx->queue->elevator->elevator_data;
-	struct request *rq;
+	struct request *rq, *next_rq;
 	bool reads, writes;
 	int data_dir;
 
@@ -214,10 +285,9 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	/*
 	 * batches are currently reads XOR writes
 	 */
-	if (dd->next_rq[WRITE])
-		rq = dd->next_rq[WRITE];
-	else
-		rq = dd->next_rq[READ];
+	rq = deadline_next_request(dd, WRITE);
+	if (!rq)
+		rq = deadline_next_request(dd, READ);
 
 	if (rq && dd->batching < dd->fifo_batch)
 		/* we have a next request are still entitled to batch */
@@ -231,7 +301,8 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	if (reads) {
 		BUG_ON(RB_EMPTY_ROOT(&dd->sort_list[READ]));
 
-		if (writes && (dd->starved++ >= dd->writes_starved))
+		if (deadline_fifo_request(dd, WRITE) &&
+		    (dd->starved++ >= dd->writes_starved))
 			goto dispatch_writes;
 
 		data_dir = READ;
@@ -260,21 +331,29 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	/*
 	 * we are not running a batch, find best request for selected data_dir
 	 */
-	if (deadline_check_fifo(dd, data_dir) || !dd->next_rq[data_dir]) {
+	next_rq = deadline_next_request(dd, data_dir);
+	if (deadline_check_fifo(dd, data_dir) || !next_rq) {
 		/*
 		 * A deadline has expired, the last request was in the other
 		 * direction, or we have run out of higher-sectored requests.
 		 * Start again from the request with the earliest expiry time.
 		 */
-		rq = rq_entry_fifo(dd->fifo_list[data_dir].next);
+		rq = deadline_fifo_request(dd, data_dir);
 	} else {
 		/*
 		 * The last req was the same dir and we have a next request in
 		 * sort order. No expired requests so continue on from here.
 		 */
-		rq = dd->next_rq[data_dir];
+		rq = next_rq;
 	}
 
+	/*
+	 * For a zoned block device, if we only have writes queued and none of
+	 * them can be dispatched, rq will be NULL.
+	 */
+	if (!rq)
+		return NULL;
+
 	dd->batching = 0;
 
 dispatch_request:
@@ -284,17 +363,27 @@ static struct request *__dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 	dd->batching++;
 	deadline_move_request(dd, rq);
 done:
+	/*
+	 * If the request needs its target zone locked, do it.
+	 */
+	blk_req_zone_write_lock(rq);
 	rq->rq_flags |= RQF_STARTED;
 	return rq;
 }
 
+/*
+ * One confusing aspect here is that we get called for a specific
+ * hardware queue, but we return a request that may not be for a
+ * different hardware queue. This is because mq-deadline has shared
+ * state for all hardware queues, in terms of sorting, FIFOs, etc.
+ */
 static struct request *dd_dispatch_request(struct blk_mq_hw_ctx *hctx)
 {
 	struct deadline_data *dd = hctx->queue->elevator->elevator_data;
 	struct request *rq;
 
 	spin_lock(&dd->lock);
-	rq = __dd_dispatch_request(hctx);
+	rq = __dd_dispatch_request(dd);
 	spin_unlock(&dd->lock);
 
 	return rq;
@@ -339,6 +428,7 @@ static int dd_init_queue(struct request_queue *q, struct elevator_type *e)
 	dd->front_merges = 1;
 	dd->fifo_batch = fifo_batch;
 	spin_lock_init(&dd->lock);
+	spin_lock_init(&dd->zone_lock);
 	INIT_LIST_HEAD(&dd->dispatch);
 
 	q->elevator = eq;
@@ -395,6 +485,12 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 	struct deadline_data *dd = q->elevator->elevator_data;
 	const int data_dir = rq_data_dir(rq);
 
+	/*
+	 * This may be a requeue of a write request that has locked its
+	 * target zone. If it is the case, this releases the zone lock.
+	 */
+	blk_req_zone_write_unlock(rq);
+
 	if (blk_mq_sched_try_insert_merge(q, rq))
 		return;
 
@@ -439,6 +535,26 @@ static void dd_insert_requests(struct blk_mq_hw_ctx *hctx,
 	spin_unlock(&dd->lock);
 }
 
+/*
+ * For zoned block devices, write unlock the target zone of
+ * completed write requests. Do this while holding the zone lock
+ * spinlock so that the zone is never unlocked while deadline_fifo_request()
+ * while deadline_next_request() are executing.
+ */
+static void dd_completed_request(struct request *rq)
+{
+	struct request_queue *q = rq->q;
+
+	if (blk_queue_is_zoned(q)) {
+		struct deadline_data *dd = q->elevator->elevator_data;
+		unsigned long flags;
+
+		spin_lock_irqsave(&dd->zone_lock, flags);
+		blk_req_zone_write_unlock(rq);
+		spin_unlock_irqrestore(&dd->zone_lock, flags);
+	}
+}
+
 static bool dd_has_work(struct blk_mq_hw_ctx *hctx)
 {
 	struct deadline_data *dd = hctx->queue->elevator->elevator_data;
@@ -640,6 +756,7 @@ static struct elevator_type mq_deadline = {
 	.ops.mq = {
 		.insert_requests	= dd_insert_requests,
 		.dispatch_request	= dd_dispatch_request,
+		.completed_request	= dd_completed_request,
 		.next_request		= elv_rb_latter_request,
 		.former_request		= elv_rb_former_request,
 		.bio_merge		= dd_bio_merge,
diff --git a/block/partitions/msdos.c b/block/partitions/msdos.c
index 0af3a3d..82c44f7 100644
--- a/block/partitions/msdos.c
+++ b/block/partitions/msdos.c
@@ -301,7 +301,9 @@ static void parse_bsd(struct parsed_partitions *state,
 			continue;
 		bsd_start = le32_to_cpu(p->p_offset);
 		bsd_size = le32_to_cpu(p->p_size);
-		if (memcmp(flavour, "bsd\0", 4) == 0)
+		/* FreeBSD has relative offset if C partition offset is zero */
+		if (memcmp(flavour, "bsd\0", 4) == 0 &&
+		    le32_to_cpu(l->d_partitions[2].p_offset) == 0)
 			bsd_start += offset;
 		if (offset == bsd_start && size == bsd_size)
 			/* full parent partition, we have it already */
diff --git a/block/scsi_ioctl.c b/block/scsi_ioctl.c
index edcfff9..60b471f 100644
--- a/block/scsi_ioctl.c
+++ b/block/scsi_ioctl.c
@@ -384,9 +384,10 @@ static int sg_io(struct request_queue *q, struct gendisk *bd_disk,
 
 /**
  * sg_scsi_ioctl  --  handle deprecated SCSI_IOCTL_SEND_COMMAND ioctl
- * @file:	file this ioctl operates on (optional)
  * @q:		request queue to send scsi commands down
  * @disk:	gendisk to operate on (option)
+ * @mode:	mode used to open the file through which the ioctl has been
+ *		submitted
  * @sic:	userspace structure describing the command to perform
  *
  * Send down the scsi command described by @sic to the device below
@@ -415,10 +416,10 @@ static int sg_io(struct request_queue *q, struct gendisk *bd_disk,
  *      Positive numbers returned are the compacted SCSI error codes (4
  *      bytes in one int) where the lowest byte is the SCSI status.
  */
-#define OMAX_SB_LEN 16          /* For backward compatibility */
 int sg_scsi_ioctl(struct request_queue *q, struct gendisk *disk, fmode_t mode,
 		struct scsi_ioctl_command __user *sic)
 {
+	enum { OMAX_SB_LEN = 16 };	/* For backward compatibility */
 	struct request *rq;
 	struct scsi_request *req;
 	int err;
@@ -692,38 +693,9 @@ int scsi_verify_blk_ioctl(struct block_device *bd, unsigned int cmd)
 	if (bd && bd == bd->bd_contains)
 		return 0;
 
-	/* Actually none of these is particularly useful on a partition,
-	 * but they are safe.
-	 */
-	switch (cmd) {
-	case SCSI_IOCTL_GET_IDLUN:
-	case SCSI_IOCTL_GET_BUS_NUMBER:
-	case SCSI_IOCTL_GET_PCI:
-	case SCSI_IOCTL_PROBE_HOST:
-	case SG_GET_VERSION_NUM:
-	case SG_SET_TIMEOUT:
-	case SG_GET_TIMEOUT:
-	case SG_GET_RESERVED_SIZE:
-	case SG_SET_RESERVED_SIZE:
-	case SG_EMULATED_HOST:
-		return 0;
-	case CDROM_GET_CAPABILITY:
-		/* Keep this until we remove the printk below.  udev sends it
-		 * and we do not want to spam dmesg about it.   CD-ROMs do
-		 * not have partitions, so we get here only for disks.
-		 */
-		return -ENOIOCTLCMD;
-	default:
-		break;
-	}
-
 	if (capable(CAP_SYS_RAWIO))
 		return 0;
 
-	/* In particular, rule out all resets and host-specific ioctls.  */
-	printk_ratelimited(KERN_WARNING
-			   "%s: sending ioctl %x to a partition!\n", current->comm, cmd);
-
 	return -ENOIOCTLCMD;
 }
 EXPORT_SYMBOL(scsi_verify_blk_ioctl);
diff --git a/crypto/Kconfig b/crypto/Kconfig
index f7911963..20360e0 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -106,6 +106,7 @@
 config CRYPTO_ACOMP2
 	tristate
 	select CRYPTO_ALGAPI2
+	select SGL_ALLOC
 
 config CRYPTO_ACOMP
 	tristate
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index 444a387..35d4dce 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -664,7 +664,7 @@ void af_alg_free_areq_sgls(struct af_alg_async_req *areq)
 	unsigned int i;
 
 	list_for_each_entry_safe(rsgl, tmp, &areq->rsgl_list, list) {
-		ctx->rcvused -= rsgl->sg_num_bytes;
+		atomic_sub(rsgl->sg_num_bytes, &ctx->rcvused);
 		af_alg_free_sg(&rsgl->sgl);
 		list_del(&rsgl->list);
 		if (rsgl != &areq->first_rsgl)
@@ -1163,7 +1163,7 @@ int af_alg_get_rsgl(struct sock *sk, struct msghdr *msg, int flags,
 
 		areq->last_rsgl = rsgl;
 		len += err;
-		ctx->rcvused += err;
+		atomic_add(err, &ctx->rcvused);
 		rsgl->sg_num_bytes = err;
 		iov_iter_advance(&msg->msg_iter, err);
 	}
diff --git a/crypto/algapi.c b/crypto/algapi.c
index 60d7366..9a636f9 100644
--- a/crypto/algapi.c
+++ b/crypto/algapi.c
@@ -167,6 +167,18 @@ void crypto_remove_spawns(struct crypto_alg *alg, struct list_head *list,
 
 			spawn->alg = NULL;
 			spawns = &inst->alg.cra_users;
+
+			/*
+			 * We may encounter an unregistered instance here, since
+			 * an instance's spawns are set up prior to the instance
+			 * being registered.  An unregistered instance will have
+			 * NULL ->cra_users.next, since ->cra_users isn't
+			 * properly initialized until registration.  But an
+			 * unregistered instance cannot have any users, so treat
+			 * it the same as ->cra_users being empty.
+			 */
+			if (spawns->next == NULL)
+				break;
 		}
 	} while ((spawns = crypto_more_spawns(alg, &stack, &top,
 					      &secondary_spawns)));
diff --git a/crypto/algif_aead.c b/crypto/algif_aead.c
index ddcc45f..e9885a3 100644
--- a/crypto/algif_aead.c
+++ b/crypto/algif_aead.c
@@ -571,7 +571,7 @@ static int aead_accept_parent_nokey(void *private, struct sock *sk)
 	INIT_LIST_HEAD(&ctx->tsgl_list);
 	ctx->len = len;
 	ctx->used = 0;
-	ctx->rcvused = 0;
+	atomic_set(&ctx->rcvused, 0);
 	ctx->more = 0;
 	ctx->merge = 0;
 	ctx->enc = 0;
diff --git a/crypto/algif_skcipher.c b/crypto/algif_skcipher.c
index baef9bf..c5c47b6 100644
--- a/crypto/algif_skcipher.c
+++ b/crypto/algif_skcipher.c
@@ -390,7 +390,7 @@ static int skcipher_accept_parent_nokey(void *private, struct sock *sk)
 	INIT_LIST_HEAD(&ctx->tsgl_list);
 	ctx->len = len;
 	ctx->used = 0;
-	ctx->rcvused = 0;
+	atomic_set(&ctx->rcvused, 0);
 	ctx->more = 0;
 	ctx->merge = 0;
 	ctx->enc = 0;
diff --git a/crypto/chacha20poly1305.c b/crypto/chacha20poly1305.c
index db1bc31..600afa9 100644
--- a/crypto/chacha20poly1305.c
+++ b/crypto/chacha20poly1305.c
@@ -610,6 +610,11 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb,
 						    algt->mask));
 	if (IS_ERR(poly))
 		return PTR_ERR(poly);
+	poly_hash = __crypto_hash_alg_common(poly);
+
+	err = -EINVAL;
+	if (poly_hash->digestsize != POLY1305_DIGEST_SIZE)
+		goto out_put_poly;
 
 	err = -ENOMEM;
 	inst = kzalloc(sizeof(*inst) + sizeof(*ctx), GFP_KERNEL);
@@ -618,7 +623,6 @@ static int chachapoly_create(struct crypto_template *tmpl, struct rtattr **tb,
 
 	ctx = aead_instance_ctx(inst);
 	ctx->saltlen = CHACHAPOLY_IV_SIZE - ivsize;
-	poly_hash = __crypto_hash_alg_common(poly);
 	err = crypto_init_ahash_spawn(&ctx->poly, poly_hash,
 				      aead_crypto_instance(inst));
 	if (err)
diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c
index ee9cfb9..f8ec3d4 100644
--- a/crypto/pcrypt.c
+++ b/crypto/pcrypt.c
@@ -254,6 +254,14 @@ static void pcrypt_aead_exit_tfm(struct crypto_aead *tfm)
 	crypto_free_aead(ctx->child);
 }
 
+static void pcrypt_free(struct aead_instance *inst)
+{
+	struct pcrypt_instance_ctx *ctx = aead_instance_ctx(inst);
+
+	crypto_drop_aead(&ctx->spawn);
+	kfree(inst);
+}
+
 static int pcrypt_init_instance(struct crypto_instance *inst,
 				struct crypto_alg *alg)
 {
@@ -319,6 +327,8 @@ static int pcrypt_create_aead(struct crypto_template *tmpl, struct rtattr **tb,
 	inst->alg.encrypt = pcrypt_aead_encrypt;
 	inst->alg.decrypt = pcrypt_aead_decrypt;
 
+	inst->free = pcrypt_free;
+
 	err = aead_register_instance(tmpl, inst);
 	if (err)
 		goto out_drop_aead;
@@ -349,14 +359,6 @@ static int pcrypt_create(struct crypto_template *tmpl, struct rtattr **tb)
 	return -EINVAL;
 }
 
-static void pcrypt_free(struct crypto_instance *inst)
-{
-	struct pcrypt_instance_ctx *ctx = crypto_instance_ctx(inst);
-
-	crypto_drop_aead(&ctx->spawn);
-	kfree(inst);
-}
-
 static int pcrypt_cpumask_change_notify(struct notifier_block *self,
 					unsigned long val, void *data)
 {
@@ -469,7 +471,6 @@ static void pcrypt_fini_padata(struct padata_pcrypt *pcrypt)
 static struct crypto_template pcrypt_tmpl = {
 	.name = "pcrypt",
 	.create = pcrypt_create,
-	.free = pcrypt_free,
 	.module = THIS_MODULE,
 };
 
diff --git a/crypto/scompress.c b/crypto/scompress.c
index 2075e2c..968bbcf 100644
--- a/crypto/scompress.c
+++ b/crypto/scompress.c
@@ -140,53 +140,6 @@ static int crypto_scomp_init_tfm(struct crypto_tfm *tfm)
 	return ret;
 }
 
-static void crypto_scomp_sg_free(struct scatterlist *sgl)
-{
-	int i, n;
-	struct page *page;
-
-	if (!sgl)
-		return;
-
-	n = sg_nents(sgl);
-	for_each_sg(sgl, sgl, n, i) {
-		page = sg_page(sgl);
-		if (page)
-			__free_page(page);
-	}
-
-	kfree(sgl);
-}
-
-static struct scatterlist *crypto_scomp_sg_alloc(size_t size, gfp_t gfp)
-{
-	struct scatterlist *sgl;
-	struct page *page;
-	int i, n;
-
-	n = ((size - 1) >> PAGE_SHIFT) + 1;
-
-	sgl = kmalloc_array(n, sizeof(struct scatterlist), gfp);
-	if (!sgl)
-		return NULL;
-
-	sg_init_table(sgl, n);
-
-	for (i = 0; i < n; i++) {
-		page = alloc_page(gfp);
-		if (!page)
-			goto err;
-		sg_set_page(sgl + i, page, PAGE_SIZE, 0);
-	}
-
-	return sgl;
-
-err:
-	sg_mark_end(sgl + i);
-	crypto_scomp_sg_free(sgl);
-	return NULL;
-}
-
 static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 {
 	struct crypto_acomp *tfm = crypto_acomp_reqtfm(req);
@@ -220,7 +173,7 @@ static int scomp_acomp_comp_decomp(struct acomp_req *req, int dir)
 					      scratch_dst, &req->dlen, *ctx);
 	if (!ret) {
 		if (!req->dst) {
-			req->dst = crypto_scomp_sg_alloc(req->dlen, GFP_ATOMIC);
+			req->dst = sgl_alloc(req->dlen, GFP_ATOMIC, NULL);
 			if (!req->dst)
 				goto out;
 		}
@@ -274,7 +227,7 @@ int crypto_init_scomp_ops_async(struct crypto_tfm *tfm)
 
 	crt->compress = scomp_acomp_compress;
 	crt->decompress = scomp_acomp_decompress;
-	crt->dst_free = crypto_scomp_sg_free;
+	crt->dst_free = sgl_free;
 	crt->reqsize = sizeof(void *);
 
 	return 0;
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig
index 4650539..d650c5b 100644
--- a/drivers/acpi/Kconfig
+++ b/drivers/acpi/Kconfig
@@ -361,22 +361,6 @@
 	  i.e., segment/bus/device/function tuples, with physical slots in
 	  the system.  If you are unsure, say N.
 
-config X86_PM_TIMER
-	bool "Power Management Timer Support" if EXPERT
-	depends on X86
-	default y
-	help
-	  The Power Management Timer is available on all ACPI-capable,
-	  in most cases even if ACPI is unusable or blacklisted.
-
-	  This timing source is not affected by power management features
-	  like aggressive processor idling, throttling, frequency and/or
-	  voltage scaling, unlike the commonly used Time Stamp Counter
-	  (TSC) timing source.
-
-	  You should nearly always say Y here because many modern
-	  systems require this timer. 
-
 config ACPI_CONTAINER
 	bool "Container and Module Devices"
 	default (ACPI_HOTPLUG_MEMORY || ACPI_HOTPLUG_CPU)
@@ -564,3 +548,19 @@
 	  using this, are probed.
 
 endif	# ACPI
+
+config X86_PM_TIMER
+	bool "Power Management Timer Support" if EXPERT
+	depends on X86 && (ACPI || JAILHOUSE_GUEST)
+	default y
+	help
+	  The Power Management Timer is available on all ACPI-capable,
+	  in most cases even if ACPI is unusable or blacklisted.
+
+	  This timing source is not affected by power management features
+	  like aggressive processor idling, throttling, frequency and/or
+	  voltage scaling, unlike the commonly used Time Stamp Counter
+	  (TSC) timing source.
+
+	  You should nearly always say Y here because many modern
+	  systems require this timer.
diff --git a/drivers/acpi/acpi_lpss.c b/drivers/acpi/acpi_lpss.c
index 7f2b02c..2bcffec 100644
--- a/drivers/acpi/acpi_lpss.c
+++ b/drivers/acpi/acpi_lpss.c
@@ -427,6 +427,142 @@ static int register_device_clock(struct acpi_device *adev,
 	return 0;
 }
 
+struct lpss_device_links {
+	const char *supplier_hid;
+	const char *supplier_uid;
+	const char *consumer_hid;
+	const char *consumer_uid;
+	u32 flags;
+};
+
+/*
+ * The _DEP method is used to identify dependencies but instead of creating
+ * device links for every handle in _DEP, only links in the following list are
+ * created. That is necessary because, in the general case, _DEP can refer to
+ * devices that might not have drivers, or that are on different buses, or where
+ * the supplier is not enumerated until after the consumer is probed.
+ */
+static const struct lpss_device_links lpss_device_links[] = {
+	{"808622C1", "7", "80860F14", "3", DL_FLAG_PM_RUNTIME},
+};
+
+static bool hid_uid_match(const char *hid1, const char *uid1,
+			  const char *hid2, const char *uid2)
+{
+	return !strcmp(hid1, hid2) && uid1 && uid2 && !strcmp(uid1, uid2);
+}
+
+static bool acpi_lpss_is_supplier(struct acpi_device *adev,
+				  const struct lpss_device_links *link)
+{
+	return hid_uid_match(acpi_device_hid(adev), acpi_device_uid(adev),
+			     link->supplier_hid, link->supplier_uid);
+}
+
+static bool acpi_lpss_is_consumer(struct acpi_device *adev,
+				  const struct lpss_device_links *link)
+{
+	return hid_uid_match(acpi_device_hid(adev), acpi_device_uid(adev),
+			     link->consumer_hid, link->consumer_uid);
+}
+
+struct hid_uid {
+	const char *hid;
+	const char *uid;
+};
+
+static int match_hid_uid(struct device *dev, void *data)
+{
+	struct acpi_device *adev = ACPI_COMPANION(dev);
+	struct hid_uid *id = data;
+
+	if (!adev)
+		return 0;
+
+	return hid_uid_match(acpi_device_hid(adev), acpi_device_uid(adev),
+			     id->hid, id->uid);
+}
+
+static struct device *acpi_lpss_find_device(const char *hid, const char *uid)
+{
+	struct hid_uid data = {
+		.hid = hid,
+		.uid = uid,
+	};
+
+	return bus_find_device(&platform_bus_type, NULL, &data, match_hid_uid);
+}
+
+static bool acpi_lpss_dep(struct acpi_device *adev, acpi_handle handle)
+{
+	struct acpi_handle_list dep_devices;
+	acpi_status status;
+	int i;
+
+	if (!acpi_has_method(adev->handle, "_DEP"))
+		return false;
+
+	status = acpi_evaluate_reference(adev->handle, "_DEP", NULL,
+					 &dep_devices);
+	if (ACPI_FAILURE(status)) {
+		dev_dbg(&adev->dev, "Failed to evaluate _DEP.\n");
+		return false;
+	}
+
+	for (i = 0; i < dep_devices.count; i++) {
+		if (dep_devices.handles[i] == handle)
+			return true;
+	}
+
+	return false;
+}
+
+static void acpi_lpss_link_consumer(struct device *dev1,
+				    const struct lpss_device_links *link)
+{
+	struct device *dev2;
+
+	dev2 = acpi_lpss_find_device(link->consumer_hid, link->consumer_uid);
+	if (!dev2)
+		return;
+
+	if (acpi_lpss_dep(ACPI_COMPANION(dev2), ACPI_HANDLE(dev1)))
+		device_link_add(dev2, dev1, link->flags);
+
+	put_device(dev2);
+}
+
+static void acpi_lpss_link_supplier(struct device *dev1,
+				    const struct lpss_device_links *link)
+{
+	struct device *dev2;
+
+	dev2 = acpi_lpss_find_device(link->supplier_hid, link->supplier_uid);
+	if (!dev2)
+		return;
+
+	if (acpi_lpss_dep(ACPI_COMPANION(dev1), ACPI_HANDLE(dev2)))
+		device_link_add(dev1, dev2, link->flags);
+
+	put_device(dev2);
+}
+
+static void acpi_lpss_create_device_links(struct acpi_device *adev,
+					  struct platform_device *pdev)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(lpss_device_links); i++) {
+		const struct lpss_device_links *link = &lpss_device_links[i];
+
+		if (acpi_lpss_is_supplier(adev, link))
+			acpi_lpss_link_consumer(&pdev->dev, link);
+
+		if (acpi_lpss_is_consumer(adev, link))
+			acpi_lpss_link_supplier(&pdev->dev, link);
+	}
+}
+
 static int acpi_lpss_create_device(struct acpi_device *adev,
 				   const struct acpi_device_id *id)
 {
@@ -465,6 +601,8 @@ static int acpi_lpss_create_device(struct acpi_device *adev,
 	acpi_dev_free_resource_list(&resource_list);
 
 	if (!pdata->mmio_base) {
+		/* Avoid acpi_bus_attach() instantiating a pdev for this dev. */
+		adev->pnp.type.platform_id = 0;
 		/* Skip the device, but continue the namespace scan. */
 		ret = 0;
 		goto err_out;
@@ -500,6 +638,7 @@ static int acpi_lpss_create_device(struct acpi_device *adev,
 	adev->driver_data = pdata;
 	pdev = acpi_create_platform_device(adev, dev_desc->properties);
 	if (!IS_ERR_OR_NULL(pdev)) {
+		acpi_lpss_create_device_links(adev, pdev);
 		return 1;
 	}
 
diff --git a/drivers/acpi/acpi_video.c b/drivers/acpi/acpi_video.c
index 0972ec0..f53ccc6 100644
--- a/drivers/acpi/acpi_video.c
+++ b/drivers/acpi/acpi_video.c
@@ -80,8 +80,8 @@ MODULE_PARM_DESC(report_key_events,
 static bool device_id_scheme = false;
 module_param(device_id_scheme, bool, 0444);
 
-static bool only_lcd = false;
-module_param(only_lcd, bool, 0444);
+static int only_lcd = -1;
+module_param(only_lcd, int, 0444);
 
 static int register_count;
 static DEFINE_MUTEX(register_count_mutex);
@@ -2136,6 +2136,16 @@ int acpi_video_register(void)
 		goto leave;
 	}
 
+	/*
+	 * We're seeing a lot of bogus backlight interfaces on newer machines
+	 * without a LCD such as desktops, servers and HDMI sticks. Checking
+	 * the lcd flag fixes this, so enable this on any machines which are
+	 * win8 ready (where we also prefer the native backlight driver, so
+	 * normally the acpi_video code should not register there anyways).
+	 */
+	if (only_lcd == -1)
+		only_lcd = acpi_osi_is_win8();
+
 	dmi_check_system(video_dmi_table);
 
 	ret = acpi_bus_register_driver(&acpi_video_bus);
diff --git a/drivers/acpi/acpica/acapps.h b/drivers/acpi/acpica/acapps.h
index 7a1a68b..2243c81 100644
--- a/drivers/acpi/acpica/acapps.h
+++ b/drivers/acpi/acpica/acapps.h
@@ -80,6 +80,9 @@
 	prefix, ACPICA_COPYRIGHT, \
 	prefix
 
+#define ACPI_COMMON_BUILD_TIME \
+	"Build date/time: %s %s\n", __DATE__, __TIME__
+
 /* Macros for usage messages */
 
 #define ACPI_USAGE_HEADER(usage) \
diff --git a/drivers/acpi/acpica/acdebug.h b/drivers/acpi/acpica/acdebug.h
index 71743e5..54b8d9d 100644
--- a/drivers/acpi/acpica/acdebug.h
+++ b/drivers/acpi/acpica/acdebug.h
@@ -223,6 +223,10 @@ void
 acpi_db_execute(char *name, char **args, acpi_object_type *types, u32 flags);
 
 void
+acpi_db_create_execution_thread(char *method_name_arg,
+				char **arguments, acpi_object_type *types);
+
+void
 acpi_db_create_execution_threads(char *num_threads_arg,
 				 char *num_loops_arg, char *method_name_arg);
 
diff --git a/drivers/acpi/acpica/acglobal.h b/drivers/acpi/acpica/acglobal.h
index 95eed44..45ef3f5 100644
--- a/drivers/acpi/acpica/acglobal.h
+++ b/drivers/acpi/acpica/acglobal.h
@@ -46,7 +46,7 @@
 
 /*****************************************************************************
  *
- * Globals related to the ACPI tables
+ * Globals related to the incoming ACPI tables
  *
  ****************************************************************************/
 
@@ -87,7 +87,7 @@ ACPI_GLOBAL(u8, acpi_gbl_integer_nybble_width);
 
 /*****************************************************************************
  *
- * Mutual exclusion within ACPICA subsystem
+ * Mutual exclusion within the ACPICA subsystem
  *
  ****************************************************************************/
 
@@ -167,7 +167,7 @@ ACPI_GLOBAL(u8, acpi_gbl_next_owner_id_offset);
 
 ACPI_INIT_GLOBAL(u8, acpi_gbl_namespace_initialized, FALSE);
 
-/* Misc */
+/* Miscellaneous */
 
 ACPI_GLOBAL(u32, acpi_gbl_original_mode);
 ACPI_GLOBAL(u32, acpi_gbl_ns_lookup_count);
@@ -191,10 +191,9 @@ extern const char acpi_gbl_lower_hex_digits[];
 extern const char acpi_gbl_upper_hex_digits[];
 extern const struct acpi_opcode_info acpi_gbl_aml_op_info[AML_NUM_OPCODES];
 
-#ifdef ACPI_DBG_TRACK_ALLOCATIONS
-
 /* Lists for tracking memory allocations (debug only) */
 
+#ifdef ACPI_DBG_TRACK_ALLOCATIONS
 ACPI_GLOBAL(struct acpi_memory_list *, acpi_gbl_global_list);
 ACPI_GLOBAL(struct acpi_memory_list *, acpi_gbl_ns_node_list);
 ACPI_GLOBAL(u8, acpi_gbl_display_final_mem_stats);
@@ -203,7 +202,7 @@ ACPI_GLOBAL(u8, acpi_gbl_disable_mem_tracking);
 
 /*****************************************************************************
  *
- * Namespace globals
+ * ACPI Namespace
  *
  ****************************************************************************/
 
@@ -234,15 +233,20 @@ ACPI_INIT_GLOBAL(u32, acpi_gbl_nesting_level, 0);
 
 /*****************************************************************************
  *
- * Interpreter globals
+ * Interpreter/Parser globals
  *
  ****************************************************************************/
 
-ACPI_GLOBAL(struct acpi_thread_state *, acpi_gbl_current_walk_list);
-
 /* Control method single step flag */
 
 ACPI_GLOBAL(u8, acpi_gbl_cm_single_step);
+ACPI_GLOBAL(struct acpi_thread_state *, acpi_gbl_current_walk_list);
+ACPI_INIT_GLOBAL(union acpi_parse_object, *acpi_gbl_current_scope, NULL);
+
+/* ASL/ASL+ converter */
+
+ACPI_INIT_GLOBAL(u8, acpi_gbl_capture_comments, FALSE);
+ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_last_list_head, NULL);
 
 /*****************************************************************************
  *
@@ -252,7 +256,6 @@ ACPI_GLOBAL(u8, acpi_gbl_cm_single_step);
 
 extern struct acpi_bit_register_info
     acpi_gbl_bit_register_info[ACPI_NUM_BITREG];
-
 ACPI_GLOBAL(u8, acpi_gbl_sleep_type_a);
 ACPI_GLOBAL(u8, acpi_gbl_sleep_type_b);
 
@@ -263,7 +266,6 @@ ACPI_GLOBAL(u8, acpi_gbl_sleep_type_b);
  ****************************************************************************/
 
 #if (!ACPI_REDUCED_HARDWARE)
-
 ACPI_GLOBAL(u8, acpi_gbl_all_gpes_initialized);
 ACPI_GLOBAL(struct acpi_gpe_xrupt_info *, acpi_gbl_gpe_xrupt_list_head);
 ACPI_GLOBAL(struct acpi_gpe_block_info *,
@@ -272,10 +274,8 @@ ACPI_GLOBAL(acpi_gbl_event_handler, acpi_gbl_global_event_handler);
 ACPI_GLOBAL(void *, acpi_gbl_global_event_handler_context);
 ACPI_GLOBAL(struct acpi_fixed_event_handler,
 	    acpi_gbl_fixed_event_handlers[ACPI_NUM_FIXED_EVENTS]);
-
 extern struct acpi_fixed_event_info
     acpi_gbl_fixed_event_info[ACPI_NUM_FIXED_EVENTS];
-
 #endif				/* !ACPI_REDUCED_HARDWARE */
 
 /*****************************************************************************
@@ -291,14 +291,14 @@ ACPI_GLOBAL(u32, acpi_gpe_count);
 ACPI_GLOBAL(u32, acpi_sci_count);
 ACPI_GLOBAL(u32, acpi_fixed_event_count[ACPI_NUM_FIXED_EVENTS]);
 
-/* Support for dynamic control method tracing mechanism */
+/* Dynamic control method tracing mechanism */
 
 ACPI_GLOBAL(u32, acpi_gbl_original_dbg_level);
 ACPI_GLOBAL(u32, acpi_gbl_original_dbg_layer);
 
 /*****************************************************************************
  *
- * Debugger and Disassembler globals
+ * Debugger and Disassembler
  *
  ****************************************************************************/
 
@@ -326,7 +326,6 @@ ACPI_GLOBAL(struct acpi_external_file *, acpi_gbl_external_file_list);
 #endif
 
 #ifdef ACPI_DEBUGGER
-
 ACPI_INIT_GLOBAL(u8, acpi_gbl_abort_method, FALSE);
 ACPI_INIT_GLOBAL(acpi_thread_id, acpi_gbl_db_thread_id, ACPI_INVALID_THREAD_ID);
 
@@ -340,7 +339,6 @@ ACPI_GLOBAL(u32, acpi_gbl_db_console_debug_level);
 ACPI_GLOBAL(struct acpi_namespace_node *, acpi_gbl_db_scope_node);
 ACPI_GLOBAL(u8, acpi_gbl_db_terminate_loop);
 ACPI_GLOBAL(u8, acpi_gbl_db_threads_terminated);
-
 ACPI_GLOBAL(char *, acpi_gbl_db_args[ACPI_DEBUGGER_MAX_ARGS]);
 ACPI_GLOBAL(acpi_object_type, acpi_gbl_db_arg_types[ACPI_DEBUGGER_MAX_ARGS]);
 
@@ -350,32 +348,33 @@ ACPI_GLOBAL(char, acpi_gbl_db_parsed_buf[ACPI_DB_LINE_BUFFER_SIZE]);
 ACPI_GLOBAL(char, acpi_gbl_db_scope_buf[ACPI_DB_LINE_BUFFER_SIZE]);
 ACPI_GLOBAL(char, acpi_gbl_db_debug_filename[ACPI_DB_LINE_BUFFER_SIZE]);
 
-/*
- * Statistic globals
- */
+/* Statistics globals */
+
 ACPI_GLOBAL(u16, acpi_gbl_obj_type_count[ACPI_TOTAL_TYPES]);
 ACPI_GLOBAL(u16, acpi_gbl_node_type_count[ACPI_TOTAL_TYPES]);
 ACPI_GLOBAL(u16, acpi_gbl_obj_type_count_misc);
 ACPI_GLOBAL(u16, acpi_gbl_node_type_count_misc);
 ACPI_GLOBAL(u32, acpi_gbl_num_nodes);
 ACPI_GLOBAL(u32, acpi_gbl_num_objects);
-
 #endif				/* ACPI_DEBUGGER */
 
 #if defined (ACPI_DISASSEMBLER) || defined (ACPI_ASL_COMPILER)
-
 ACPI_GLOBAL(const char, *acpi_gbl_pld_panel_list[]);
 ACPI_GLOBAL(const char, *acpi_gbl_pld_vertical_position_list[]);
 ACPI_GLOBAL(const char, *acpi_gbl_pld_horizontal_position_list[]);
 ACPI_GLOBAL(const char, *acpi_gbl_pld_shape_list[]);
-
 ACPI_INIT_GLOBAL(u8, acpi_gbl_disasm_flag, FALSE);
-
 #endif
 
-/*
- * Meant for the -ca option.
- */
+/*****************************************************************************
+ *
+ * ACPICA application-specific globals
+ *
+ ****************************************************************************/
+
+/* ASL-to-ASL+ conversion utility (implemented within the iASL compiler) */
+
+#ifdef ACPI_ASL_COMPILER
 ACPI_INIT_GLOBAL(char *, acpi_gbl_current_inline_comment, NULL);
 ACPI_INIT_GLOBAL(char *, acpi_gbl_current_end_node_comment, NULL);
 ACPI_INIT_GLOBAL(char *, acpi_gbl_current_open_brace_comment, NULL);
@@ -386,23 +385,18 @@ ACPI_INIT_GLOBAL(char *, acpi_gbl_current_filename, NULL);
 ACPI_INIT_GLOBAL(char *, acpi_gbl_current_parent_filename, NULL);
 ACPI_INIT_GLOBAL(char *, acpi_gbl_current_include_filename, NULL);
 
-ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_last_list_head, NULL);
-
 ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_def_blk_comment_list_head,
 		 NULL);
 ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_def_blk_comment_list_tail,
 		 NULL);
-
 ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_reg_comment_list_head,
 		 NULL);
 ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_reg_comment_list_tail,
 		 NULL);
-
 ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_inc_comment_list_head,
 		 NULL);
 ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_inc_comment_list_tail,
 		 NULL);
-
 ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_end_blk_comment_list_head,
 		 NULL);
 ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_end_blk_comment_list_tail,
@@ -410,30 +404,18 @@ ACPI_INIT_GLOBAL(struct acpi_comment_node, *acpi_gbl_end_blk_comment_list_tail,
 
 ACPI_INIT_GLOBAL(struct acpi_comment_addr_node,
 		 *acpi_gbl_comment_addr_list_head, NULL);
-
-ACPI_INIT_GLOBAL(union acpi_parse_object, *acpi_gbl_current_scope, NULL);
-
 ACPI_INIT_GLOBAL(struct acpi_file_node, *acpi_gbl_file_tree_root, NULL);
 
 ACPI_GLOBAL(acpi_cache_t *, acpi_gbl_reg_comment_cache);
 ACPI_GLOBAL(acpi_cache_t *, acpi_gbl_comment_addr_cache);
 ACPI_GLOBAL(acpi_cache_t *, acpi_gbl_file_cache);
 
-ACPI_INIT_GLOBAL(u8, gbl_capture_comments, FALSE);
-
 ACPI_INIT_GLOBAL(u8, acpi_gbl_debug_asl_conversion, FALSE);
 ACPI_INIT_GLOBAL(ACPI_FILE, acpi_gbl_conv_debug_file, NULL);
-
 ACPI_GLOBAL(char, acpi_gbl_table_sig[4]);
-
-/*****************************************************************************
- *
- * Application globals
- *
- ****************************************************************************/
+#endif
 
 #ifdef ACPI_APPLICATION
-
 ACPI_INIT_GLOBAL(ACPI_FILE, acpi_gbl_debug_file, NULL);
 ACPI_INIT_GLOBAL(ACPI_FILE, acpi_gbl_output_file, NULL);
 ACPI_INIT_GLOBAL(u8, acpi_gbl_debug_timeout, FALSE);
@@ -442,16 +424,6 @@ ACPI_INIT_GLOBAL(u8, acpi_gbl_debug_timeout, FALSE);
 
 ACPI_GLOBAL(acpi_spinlock, acpi_gbl_print_lock);	/* For print buffer */
 ACPI_GLOBAL(char, acpi_gbl_print_buffer[1024]);
-
 #endif				/* ACPI_APPLICATION */
 
-/*****************************************************************************
- *
- * Info/help support
- *
- ****************************************************************************/
-
-extern const struct ah_predefined_name asl_predefined_info[];
-extern const struct ah_device_id asl_device_ids[];
-
 #endif				/* __ACGLOBAL_H__ */
diff --git a/drivers/acpi/acpica/aclocal.h b/drivers/acpi/acpica/aclocal.h
index 0d45b8b..a56675f 100644
--- a/drivers/acpi/acpica/aclocal.h
+++ b/drivers/acpi/acpica/aclocal.h
@@ -622,7 +622,7 @@ struct acpi_control_state {
 	union acpi_parse_object *predicate_op;
 	u8 *aml_predicate_start;	/* Start of if/while predicate */
 	u8 *package_end;	/* End of if/while block */
-	u32 loop_count;		/* While() loop counter */
+	u64 loop_timeout;	/* While() loop timeout */
 };
 
 /*
@@ -1218,16 +1218,17 @@ struct acpi_db_method_info {
 	acpi_object_type *types;
 
 	/*
-	 * Arguments to be passed to method for the command
-	 * Threads -
-	 *   the Number of threads, ID of current thread and
-	 *   Index of current thread inside all them created.
+	 * Arguments to be passed to method for the commands Threads and
+	 * Background. Note, ACPI specifies a maximum of 7 arguments (0 - 6).
+	 *
+	 * For the Threads command, the Number of threads, ID of current
+	 * thread and Index of current thread inside all them created.
 	 */
 	char init_args;
 #ifdef ACPI_DEBUGGER
-	acpi_object_type arg_types[4];
+	acpi_object_type arg_types[ACPI_METHOD_NUM_ARGS];
 #endif
-	char *arguments[4];
+	char *arguments[ACPI_METHOD_NUM_ARGS];
 	char num_threads_str[11];
 	char id_of_thread_str[11];
 	char index_of_thread_str[11];
diff --git a/drivers/acpi/acpica/acmacros.h b/drivers/acpi/acpica/acmacros.h
index c7f0c96..128a3d7 100644
--- a/drivers/acpi/acpica/acmacros.h
+++ b/drivers/acpi/acpica/acmacros.h
@@ -455,7 +455,7 @@
  * the plist contains a set of parens to allow variable-length lists.
  * These macros are used for both the debug and non-debug versions of the code.
  */
-#define ACPI_ERROR_NAMESPACE(s, e)          acpi_ut_namespace_error (AE_INFO, s, e);
+#define ACPI_ERROR_NAMESPACE(s, p, e)       acpi_ut_prefixed_namespace_error (AE_INFO, s, p, e);
 #define ACPI_ERROR_METHOD(s, n, p, e)       acpi_ut_method_error (AE_INFO, s, n, p, e);
 #define ACPI_WARN_PREDEFINED(plist)         acpi_ut_predefined_warning plist
 #define ACPI_INFO_PREDEFINED(plist)         acpi_ut_predefined_info plist
diff --git a/drivers/acpi/acpica/acnamesp.h b/drivers/acpi/acpica/acnamesp.h
index 54a0c51..2fb1bb7 100644
--- a/drivers/acpi/acpica/acnamesp.h
+++ b/drivers/acpi/acpica/acnamesp.h
@@ -289,6 +289,9 @@ acpi_ns_build_normalized_path(struct acpi_namespace_node *node,
 char *acpi_ns_get_normalized_pathname(struct acpi_namespace_node *node,
 				      u8 no_trailing);
 
+char *acpi_ns_build_prefixed_pathname(union acpi_generic_state *prefix_scope,
+				      const char *internal_path);
+
 char *acpi_ns_name_of_current_scope(struct acpi_walk_state *walk_state);
 
 acpi_status
diff --git a/drivers/acpi/acpica/acutils.h b/drivers/acpi/acpica/acutils.h
index 83b75e9..b6b29d7 100644
--- a/drivers/acpi/acpica/acutils.h
+++ b/drivers/acpi/acpica/acutils.h
@@ -118,9 +118,6 @@ extern const char *acpi_gbl_ptyp_decode[];
 #ifndef ACPI_MSG_ERROR
 #define ACPI_MSG_ERROR          "ACPI Error: "
 #endif
-#ifndef ACPI_MSG_EXCEPTION
-#define ACPI_MSG_EXCEPTION      "ACPI Exception: "
-#endif
 #ifndef ACPI_MSG_WARNING
 #define ACPI_MSG_WARNING        "ACPI Warning: "
 #endif
@@ -129,10 +126,10 @@ extern const char *acpi_gbl_ptyp_decode[];
 #endif
 
 #ifndef ACPI_MSG_BIOS_ERROR
-#define ACPI_MSG_BIOS_ERROR     "ACPI BIOS Error (bug): "
+#define ACPI_MSG_BIOS_ERROR     "Firmware Error (ACPI): "
 #endif
 #ifndef ACPI_MSG_BIOS_WARNING
-#define ACPI_MSG_BIOS_WARNING   "ACPI BIOS Warning (bug): "
+#define ACPI_MSG_BIOS_WARNING   "Firmware Warning (ACPI): "
 #endif
 
 /*
@@ -233,10 +230,10 @@ u64 acpi_ut_implicit_strtoul64(char *string);
  */
 acpi_status acpi_ut_init_globals(void);
 
-#if defined(ACPI_DEBUG_OUTPUT) || defined(ACPI_DEBUGGER)
-
 const char *acpi_ut_get_mutex_name(u32 mutex_id);
 
+#if defined(ACPI_DEBUG_OUTPUT) || defined(ACPI_DEBUGGER)
+
 const char *acpi_ut_get_notify_name(u32 notify_value, acpi_object_type type);
 #endif
 
@@ -641,9 +638,11 @@ void ut_convert_backslashes(char *pathname);
 
 void acpi_ut_repair_name(char *name);
 
-#if defined (ACPI_DEBUGGER) || defined (ACPI_APPLICATION)
+#if defined (ACPI_DEBUGGER) || defined (ACPI_APPLICATION) || defined (ACPI_DEBUG_OUTPUT)
 u8 acpi_ut_safe_strcpy(char *dest, acpi_size dest_size, char *source);
 
+void acpi_ut_safe_strncpy(char *dest, char *source, acpi_size dest_size);
+
 u8 acpi_ut_safe_strcat(char *dest, acpi_size dest_size, char *source);
 
 u8
@@ -737,9 +736,11 @@ acpi_ut_predefined_bios_error(const char *module_name,
 			      u8 node_flags, const char *format, ...);
 
 void
-acpi_ut_namespace_error(const char *module_name,
-			u32 line_number,
-			const char *internal_name, acpi_status lookup_status);
+acpi_ut_prefixed_namespace_error(const char *module_name,
+				 u32 line_number,
+				 union acpi_generic_state *prefix_scope,
+				 const char *internal_name,
+				 acpi_status lookup_status);
 
 void
 acpi_ut_method_error(const char *module_name,
diff --git a/drivers/acpi/acpica/dbexec.c b/drivers/acpi/acpica/dbexec.c
index 3b30319..ed088fc 100644
--- a/drivers/acpi/acpica/dbexec.c
+++ b/drivers/acpi/acpica/dbexec.c
@@ -67,6 +67,8 @@ static acpi_status
 acpi_db_execution_walk(acpi_handle obj_handle,
 		       u32 nesting_level, void *context, void **return_value);
 
+static void ACPI_SYSTEM_XFACE acpi_db_single_execution_thread(void *context);
+
 /*******************************************************************************
  *
  * FUNCTION:    acpi_db_delete_objects
@@ -229,7 +231,7 @@ static acpi_status acpi_db_execute_setup(struct acpi_db_method_info *info)
 
 	ACPI_FUNCTION_NAME(db_execute_setup);
 
-	/* Catenate the current scope to the supplied name */
+	/* Concatenate the current scope to the supplied name */
 
 	info->pathname[0] = 0;
 	if ((info->name[0] != '\\') && (info->name[0] != '/')) {
@@ -611,6 +613,112 @@ static void ACPI_SYSTEM_XFACE acpi_db_method_thread(void *context)
 
 /*******************************************************************************
  *
+ * FUNCTION:    acpi_db_single_execution_thread
+ *
+ * PARAMETERS:  context                 - Method info struct
+ *
+ * RETURN:      None
+ *
+ * DESCRIPTION: Create one thread and execute a method
+ *
+ ******************************************************************************/
+
+static void ACPI_SYSTEM_XFACE acpi_db_single_execution_thread(void *context)
+{
+	struct acpi_db_method_info *info = context;
+	acpi_status status;
+	struct acpi_buffer return_obj;
+
+	acpi_os_printf("\n");
+
+	status = acpi_db_execute_method(info, &return_obj);
+	if (ACPI_FAILURE(status)) {
+		acpi_os_printf("%s During evaluation of %s\n",
+			       acpi_format_exception(status), info->pathname);
+		return;
+	}
+
+	/* Display a return object, if any */
+
+	if (return_obj.length) {
+		acpi_os_printf("Evaluation of %s returned object %p, "
+			       "external buffer length %X\n",
+			       acpi_gbl_db_method_info.pathname,
+			       return_obj.pointer, (u32)return_obj.length);
+
+		acpi_db_dump_external_object(return_obj.pointer, 1);
+	}
+
+	acpi_os_printf("\nBackground thread completed\n%c ",
+		       ACPI_DEBUGGER_COMMAND_PROMPT);
+}
+
+/*******************************************************************************
+ *
+ * FUNCTION:    acpi_db_create_execution_thread
+ *
+ * PARAMETERS:  method_name_arg         - Control method to execute
+ *              arguments               - Array of arguments to the method
+ *              types                   - Corresponding array of object types
+ *
+ * RETURN:      None
+ *
+ * DESCRIPTION: Create a single thread to evaluate a namespace object. Handles
+ *              arguments passed on command line for control methods.
+ *
+ ******************************************************************************/
+
+void
+acpi_db_create_execution_thread(char *method_name_arg,
+				char **arguments, acpi_object_type *types)
+{
+	acpi_status status;
+	u32 i;
+
+	memset(&acpi_gbl_db_method_info, 0, sizeof(struct acpi_db_method_info));
+	acpi_gbl_db_method_info.name = method_name_arg;
+	acpi_gbl_db_method_info.init_args = 1;
+	acpi_gbl_db_method_info.args = acpi_gbl_db_method_info.arguments;
+	acpi_gbl_db_method_info.types = acpi_gbl_db_method_info.arg_types;
+
+	/* Setup method arguments, up to 7 (0-6) */
+
+	for (i = 0; (i < ACPI_METHOD_NUM_ARGS) && *arguments; i++) {
+		acpi_gbl_db_method_info.arguments[i] = *arguments;
+		arguments++;
+
+		acpi_gbl_db_method_info.arg_types[i] = *types;
+		types++;
+	}
+
+	status = acpi_db_execute_setup(&acpi_gbl_db_method_info);
+	if (ACPI_FAILURE(status)) {
+		return;
+	}
+
+	/* Get the NS node, determines existence also */
+
+	status = acpi_get_handle(NULL, acpi_gbl_db_method_info.pathname,
+				 &acpi_gbl_db_method_info.method);
+	if (ACPI_FAILURE(status)) {
+		acpi_os_printf("%s Could not get handle for %s\n",
+			       acpi_format_exception(status),
+			       acpi_gbl_db_method_info.pathname);
+		return;
+	}
+
+	status = acpi_os_execute(OSL_DEBUGGER_EXEC_THREAD,
+				 acpi_db_single_execution_thread,
+				 &acpi_gbl_db_method_info);
+	if (ACPI_FAILURE(status)) {
+		return;
+	}
+
+	acpi_os_printf("\nBackground thread started\n");
+}
+
+/*******************************************************************************
+ *
  * FUNCTION:    acpi_db_create_execution_threads
  *
  * PARAMETERS:  num_threads_arg         - Number of threads to create
diff --git a/drivers/acpi/acpica/dbfileio.c b/drivers/acpi/acpica/dbfileio.c
index 4d81ea2..cf96079 100644
--- a/drivers/acpi/acpica/dbfileio.c
+++ b/drivers/acpi/acpica/dbfileio.c
@@ -99,8 +99,8 @@ void acpi_db_open_debug_file(char *name)
 	}
 
 	acpi_os_printf("Debug output file %s opened\n", name);
-	strncpy(acpi_gbl_db_debug_filename, name,
-		sizeof(acpi_gbl_db_debug_filename));
+	acpi_ut_safe_strncpy(acpi_gbl_db_debug_filename, name,
+			     sizeof(acpi_gbl_db_debug_filename));
 	acpi_gbl_db_output_to_file = TRUE;
 }
 #endif
diff --git a/drivers/acpi/acpica/dbinput.c b/drivers/acpi/acpica/dbinput.c
index 2626d79..954ca3b 100644
--- a/drivers/acpi/acpica/dbinput.c
+++ b/drivers/acpi/acpica/dbinput.c
@@ -136,6 +136,7 @@ enum acpi_ex_debugger_commands {
 	CMD_UNLOAD,
 
 	CMD_TERMINATE,
+	CMD_BACKGROUND,
 	CMD_THREADS,
 
 	CMD_TEST,
@@ -212,6 +213,7 @@ static const struct acpi_db_command_info acpi_gbl_db_commands[] = {
 	{"UNLOAD", 1},
 
 	{"TERMINATE", 0},
+	{"BACKGROUND", 1},
 	{"THREADS", 3},
 
 	{"TEST", 1},
@@ -222,36 +224,12 @@ static const struct acpi_db_command_info acpi_gbl_db_commands[] = {
 /*
  * Help for all debugger commands. First argument is the number of lines
  * of help to output for the command.
+ *
+ * Note: Some commands are not supported by the kernel-level version of
+ * the debugger.
  */
 static const struct acpi_db_command_help acpi_gbl_db_command_help[] = {
-	{0, "\nGeneral-Purpose Commands:", "\n"},
-	{1, "  Allocations", "Display list of current memory allocations\n"},
-	{2, "  Dump <Address>|<Namepath>", "\n"},
-	{0, "       [Byte|Word|Dword|Qword]",
-	 "Display ACPI objects or memory\n"},
-	{1, "  Handlers", "Info about global handlers\n"},
-	{1, "  Help [Command]", "This help screen or individual command\n"},
-	{1, "  History", "Display command history buffer\n"},
-	{1, "  Level <DebugLevel>] [console]",
-	 "Get/Set debug level for file or console\n"},
-	{1, "  Locks", "Current status of internal mutexes\n"},
-	{1, "  Osi [Install|Remove <name>]",
-	 "Display or modify global _OSI list\n"},
-	{1, "  Quit or Exit", "Exit this command\n"},
-	{8, "  Stats <SubCommand>",
-	 "Display namespace and memory statistics\n"},
-	{1, "     Allocations", "Display list of current memory allocations\n"},
-	{1, "     Memory", "Dump internal memory lists\n"},
-	{1, "     Misc", "Namespace search and mutex stats\n"},
-	{1, "     Objects", "Summary of namespace objects\n"},
-	{1, "     Sizes", "Sizes for each of the internal objects\n"},
-	{1, "     Stack", "Display CPU stack usage\n"},
-	{1, "     Tables", "Info about current ACPI table(s)\n"},
-	{1, "  Tables", "Display info about loaded ACPI tables\n"},
-	{1, "  ! <CommandNumber>", "Execute command from history buffer\n"},
-	{1, "  !!", "Execute last command again\n"},
-
-	{0, "\nNamespace Access Commands:", "\n"},
+	{0, "\nNamespace Access:", "\n"},
 	{1, "  Businfo", "Display system bus info\n"},
 	{1, "  Disassemble <Method>", "Disassemble a control method\n"},
 	{1, "  Find <AcpiName> (? is wildcard)",
@@ -275,19 +253,74 @@ static const struct acpi_db_command_help acpi_gbl_db_command_help[] = {
 	{1, "  Template <Object>", "Format/dump a Buffer/ResourceTemplate\n"},
 	{1, "  Type <Object>", "Display object type\n"},
 
-	{0, "\nControl Method Execution Commands:", "\n"},
+	{0, "\nControl Method Execution:", "\n"},
+	{1, "  Evaluate <Namepath> [Arguments]",
+	 "Evaluate object or control method\n"},
+	{1, "  Execute <Namepath> [Arguments]", "Synonym for Evaluate\n"},
+#ifdef ACPI_APPLICATION
+	{1, "  Background <Namepath> [Arguments]",
+	 "Evaluate object/method in a separate thread\n"},
+	{1, "  Thread <Threads><Loops><NamePath>",
+	 "Spawn threads to execute method(s)\n"},
+#endif
+	{1, "  Debug <Namepath> [Arguments]", "Single-Step a control method\n"},
+	{7, "  [Arguments] formats:", "Control method argument formats\n"},
+	{1, "     Hex Integer", "Integer\n"},
+	{1, "     \"Ascii String\"", "String\n"},
+	{1, "     (Hex Byte List)", "Buffer\n"},
+	{1, "         (01 42 7A BF)", "Buffer example (4 bytes)\n"},
+	{1, "     [Package Element List]", "Package\n"},
+	{1, "         [0x01 0x1234 \"string\"]",
+	 "Package example (3 elements)\n"},
+
+	{0, "\nMiscellaneous:", "\n"},
+	{1, "  Allocations", "Display list of current memory allocations\n"},
+	{2, "  Dump <Address>|<Namepath>", "\n"},
+	{0, "       [Byte|Word|Dword|Qword]",
+	 "Display ACPI objects or memory\n"},
+	{1, "  Handlers", "Info about global handlers\n"},
+	{1, "  Help [Command]", "This help screen or individual command\n"},
+	{1, "  History", "Display command history buffer\n"},
+	{1, "  Level <DebugLevel>] [console]",
+	 "Get/Set debug level for file or console\n"},
+	{1, "  Locks", "Current status of internal mutexes\n"},
+	{1, "  Osi [Install|Remove <name>]",
+	 "Display or modify global _OSI list\n"},
+	{1, "  Quit or Exit", "Exit this command\n"},
+	{8, "  Stats <SubCommand>",
+	 "Display namespace and memory statistics\n"},
+	{1, "     Allocations", "Display list of current memory allocations\n"},
+	{1, "     Memory", "Dump internal memory lists\n"},
+	{1, "     Misc", "Namespace search and mutex stats\n"},
+	{1, "     Objects", "Summary of namespace objects\n"},
+	{1, "     Sizes", "Sizes for each of the internal objects\n"},
+	{1, "     Stack", "Display CPU stack usage\n"},
+	{1, "     Tables", "Info about current ACPI table(s)\n"},
+	{1, "  Tables", "Display info about loaded ACPI tables\n"},
+#ifdef ACPI_APPLICATION
+	{1, "  Terminate", "Delete namespace and all internal objects\n"},
+#endif
+	{1, "  ! <CommandNumber>", "Execute command from history buffer\n"},
+	{1, "  !!", "Execute last command again\n"},
+
+	{0, "\nMethod and Namespace Debugging:", "\n"},
+	{5, "  Trace <State> [<Namepath>] [Once]",
+	 "Trace control method execution\n"},
+	{1, "     Enable", "Enable all messages\n"},
+	{1, "     Disable", "Disable tracing\n"},
+	{1, "     Method", "Enable method execution messages\n"},
+	{1, "     Opcode", "Enable opcode execution messages\n"},
+	{3, "  Test <TestName>", "Invoke a debug test\n"},
+	{1, "     Objects", "Read/write/compare all namespace data objects\n"},
+	{1, "     Predefined",
+	 "Validate all ACPI predefined names (_STA, etc.)\n"},
+	{1, "  Execute predefined",
+	 "Execute all predefined (public) methods\n"},
+
+	{0, "\nControl Method Single-Step Execution:", "\n"},
 	{1, "  Arguments (or Args)", "Display method arguments\n"},
 	{1, "  Breakpoint <AmlOffset>", "Set an AML execution breakpoint\n"},
 	{1, "  Call", "Run to next control method invocation\n"},
-	{1, "  Debug <Namepath> [Arguments]", "Single Step a control method\n"},
-	{6, "  Evaluate", "Synonym for Execute\n"},
-	{5, "  Execute <Namepath> [Arguments]", "Execute control method\n"},
-	{1, "     Hex Integer", "Integer method argument\n"},
-	{1, "     \"Ascii String\"", "String method argument\n"},
-	{1, "     (Hex Byte List)", "Buffer method argument\n"},
-	{1, "     [Package Element List]", "Package method argument\n"},
-	{5, "  Execute predefined",
-	 "Execute all predefined (public) methods\n"},
 	{1, "  Go", "Allow method to run to completion\n"},
 	{1, "  Information", "Display info about the current method\n"},
 	{1, "  Into", "Step into (not over) a method call\n"},
@@ -296,41 +329,24 @@ static const struct acpi_db_command_help acpi_gbl_db_command_help[] = {
 	{1, "  Results", "Display method result stack\n"},
 	{1, "  Set <A|L> <#> <Value>", "Set method data (Arguments/Locals)\n"},
 	{1, "  Stop", "Terminate control method\n"},
-	{5, "  Trace <State> [<Namepath>] [Once]",
-	 "Trace control method execution\n"},
-	{1, "     Enable", "Enable all messages\n"},
-	{1, "     Disable", "Disable tracing\n"},
-	{1, "     Method", "Enable method execution messages\n"},
-	{1, "     Opcode", "Enable opcode execution messages\n"},
 	{1, "  Tree", "Display control method calling tree\n"},
 	{1, "  <Enter>", "Single step next AML opcode (over calls)\n"},
 
 #ifdef ACPI_APPLICATION
-	{0, "\nHardware Simulation Commands:", "\n"},
-	{1, "  EnableAcpi", "Enable ACPI (hardware) mode\n"},
-	{1, "  Event <F|G> <Value>", "Generate AcpiEvent (Fixed/GPE)\n"},
-	{1, "  Gpe <GpeNum> [GpeBlockDevice]", "Simulate a GPE\n"},
-	{1, "  Gpes", "Display info on all GPE devices\n"},
-	{1, "  Sci", "Generate an SCI\n"},
-	{1, "  Sleep [SleepState]", "Simulate sleep/wake sequence(s) (0-5)\n"},
-
-	{0, "\nFile I/O Commands:", "\n"},
+	{0, "\nFile Operations:", "\n"},
 	{1, "  Close", "Close debug output file\n"},
 	{1, "  Load <Input Filename>", "Load ACPI table from a file\n"},
 	{1, "  Open <Output Filename>", "Open a file for debug output\n"},
 	{1, "  Unload <Namepath>",
 	 "Unload an ACPI table via namespace object\n"},
 
-	{0, "\nUser Space Commands:", "\n"},
-	{1, "  Terminate", "Delete namespace and all internal objects\n"},
-	{1, "  Thread <Threads><Loops><NamePath>",
-	 "Spawn threads to execute method(s)\n"},
-
-	{0, "\nDebug Test Commands:", "\n"},
-	{3, "  Test <TestName>", "Invoke a debug test\n"},
-	{1, "     Objects", "Read/write/compare all namespace data objects\n"},
-	{1, "     Predefined",
-	 "Execute all ACPI predefined names (_STA, etc.)\n"},
+	{0, "\nHardware Simulation:", "\n"},
+	{1, "  EnableAcpi", "Enable ACPI (hardware) mode\n"},
+	{1, "  Event <F|G> <Value>", "Generate AcpiEvent (Fixed/GPE)\n"},
+	{1, "  Gpe <GpeNum> [GpeBlockDevice]", "Simulate a GPE\n"},
+	{1, "  Gpes", "Display info on all GPE devices\n"},
+	{1, "  Sci", "Generate an SCI\n"},
+	{1, "  Sleep [SleepState]", "Simulate sleep/wake sequence(s) (0-5)\n"},
 #endif
 	{0, NULL, NULL}
 };
@@ -442,11 +458,15 @@ static void acpi_db_display_help(char *command)
 
 		/* No argument to help, display help for all commands */
 
+		acpi_os_printf("\nSummary of AML Debugger Commands\n\n");
+
 		while (next->invocation) {
 			acpi_os_printf("%-38s%s", next->invocation,
 				       next->description);
 			next++;
 		}
+		acpi_os_printf("\n");
+
 	} else {
 		/* Display help for all commands that match the subtring */
 
@@ -1087,6 +1107,13 @@ acpi_db_command_dispatch(char *input_buffer,
 		/*  acpi_initialize (NULL); */
 		break;
 
+	case CMD_BACKGROUND:
+
+		acpi_db_create_execution_thread(acpi_gbl_db_args[1],
+						&acpi_gbl_db_args[2],
+						&acpi_gbl_db_arg_types[2]);
+		break;
+
 	case CMD_THREADS:
 
 		acpi_db_create_execution_threads(acpi_gbl_db_args[1],
diff --git a/drivers/acpi/acpica/dscontrol.c b/drivers/acpi/acpica/dscontrol.c
index f470e81..4b6ebc2 100644
--- a/drivers/acpi/acpica/dscontrol.c
+++ b/drivers/acpi/acpica/dscontrol.c
@@ -118,6 +118,8 @@ acpi_ds_exec_begin_control_op(struct acpi_walk_state *walk_state,
 		control_state->control.package_end =
 		    walk_state->parser_state.pkg_end;
 		control_state->control.opcode = op->common.aml_opcode;
+		control_state->control.loop_timeout = acpi_os_get_timer() +
+		    (u64)(acpi_gbl_max_loop_iterations * ACPI_100NSEC_PER_SEC);
 
 		/* Push the control state on this walk's control stack */
 
@@ -206,15 +208,15 @@ acpi_ds_exec_end_control_op(struct acpi_walk_state *walk_state,
 			/* Predicate was true, the body of the loop was just executed */
 
 			/*
-			 * This loop counter mechanism allows the interpreter to escape
-			 * possibly infinite loops. This can occur in poorly written AML
-			 * when the hardware does not respond within a while loop and the
-			 * loop does not implement a timeout.
+			 * This infinite loop detection mechanism allows the interpreter
+			 * to escape possibly infinite loops. This can occur in poorly
+			 * written AML when the hardware does not respond within a while
+			 * loop and the loop does not implement a timeout.
 			 */
-			control_state->control.loop_count++;
-			if (control_state->control.loop_count >
-			    acpi_gbl_max_loop_iterations) {
-				status = AE_AML_INFINITE_LOOP;
+			if (ACPI_TIME_AFTER(acpi_os_get_timer(),
+					    control_state->control.
+					    loop_timeout)) {
+				status = AE_AML_LOOP_TIMEOUT;
 				break;
 			}
 
diff --git a/drivers/acpi/acpica/dsfield.c b/drivers/acpi/acpica/dsfield.c
index 7bcf5f5e..0cab34a 100644
--- a/drivers/acpi/acpica/dsfield.c
+++ b/drivers/acpi/acpica/dsfield.c
@@ -209,7 +209,8 @@ acpi_ds_create_buffer_field(union acpi_parse_object *op,
 					ACPI_IMODE_LOAD_PASS1, flags,
 					walk_state, &node);
 		if (ACPI_FAILURE(status)) {
-			ACPI_ERROR_NAMESPACE(arg->common.value.string, status);
+			ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+					     arg->common.value.string, status);
 			return_ACPI_STATUS(status);
 		}
 	}
@@ -383,7 +384,9 @@ acpi_ds_get_field_names(struct acpi_create_field_info *info,
 							walk_state,
 							&info->connection_node);
 				if (ACPI_FAILURE(status)) {
-					ACPI_ERROR_NAMESPACE(child->common.
+					ACPI_ERROR_NAMESPACE(walk_state->
+							     scope_info,
+							     child->common.
 							     value.name,
 							     status);
 					return_ACPI_STATUS(status);
@@ -402,7 +405,8 @@ acpi_ds_get_field_names(struct acpi_create_field_info *info,
 						ACPI_NS_DONT_OPEN_SCOPE,
 						walk_state, &info->field_node);
 			if (ACPI_FAILURE(status)) {
-				ACPI_ERROR_NAMESPACE((char *)&arg->named.name,
+				ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+						     (char *)&arg->named.name,
 						     status);
 				return_ACPI_STATUS(status);
 			} else {
@@ -498,7 +502,8 @@ acpi_ds_create_field(union acpi_parse_object *op,
 							&region_node);
 #endif
 		if (ACPI_FAILURE(status)) {
-			ACPI_ERROR_NAMESPACE(arg->common.value.name, status);
+			ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+					     arg->common.value.name, status);
 			return_ACPI_STATUS(status);
 		}
 	}
@@ -618,7 +623,8 @@ acpi_ds_init_field_objects(union acpi_parse_object *op,
 						ACPI_IMODE_LOAD_PASS1, flags,
 						walk_state, &node);
 			if (ACPI_FAILURE(status)) {
-				ACPI_ERROR_NAMESPACE((char *)&arg->named.name,
+				ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+						     (char *)&arg->named.name,
 						     status);
 				if (status != AE_ALREADY_EXISTS) {
 					return_ACPI_STATUS(status);
@@ -681,7 +687,8 @@ acpi_ds_create_bank_field(union acpi_parse_object *op,
 							&region_node);
 #endif
 		if (ACPI_FAILURE(status)) {
-			ACPI_ERROR_NAMESPACE(arg->common.value.name, status);
+			ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+					     arg->common.value.name, status);
 			return_ACPI_STATUS(status);
 		}
 	}
@@ -695,7 +702,8 @@ acpi_ds_create_bank_field(union acpi_parse_object *op,
 			   ACPI_NS_SEARCH_PARENT, walk_state,
 			   &info.register_node);
 	if (ACPI_FAILURE(status)) {
-		ACPI_ERROR_NAMESPACE(arg->common.value.string, status);
+		ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+				     arg->common.value.string, status);
 		return_ACPI_STATUS(status);
 	}
 
@@ -765,7 +773,8 @@ acpi_ds_create_index_field(union acpi_parse_object *op,
 			   ACPI_NS_SEARCH_PARENT, walk_state,
 			   &info.register_node);
 	if (ACPI_FAILURE(status)) {
-		ACPI_ERROR_NAMESPACE(arg->common.value.string, status);
+		ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+				     arg->common.value.string, status);
 		return_ACPI_STATUS(status);
 	}
 
@@ -778,7 +787,8 @@ acpi_ds_create_index_field(union acpi_parse_object *op,
 			   ACPI_NS_SEARCH_PARENT, walk_state,
 			   &info.data_register_node);
 	if (ACPI_FAILURE(status)) {
-		ACPI_ERROR_NAMESPACE(arg->common.value.string, status);
+		ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+				     arg->common.value.string, status);
 		return_ACPI_STATUS(status);
 	}
 
diff --git a/drivers/acpi/acpica/dsobject.c b/drivers/acpi/acpica/dsobject.c
index 8244855..b21fe08 100644
--- a/drivers/acpi/acpica/dsobject.c
+++ b/drivers/acpi/acpica/dsobject.c
@@ -112,7 +112,9 @@ acpi_ds_build_internal_object(struct acpi_walk_state *walk_state,
 							 acpi_namespace_node,
 							 &(op->common.node)));
 				if (ACPI_FAILURE(status)) {
-					ACPI_ERROR_NAMESPACE(op->common.value.
+					ACPI_ERROR_NAMESPACE(walk_state->
+							     scope_info,
+							     op->common.value.
 							     string, status);
 					return_ACPI_STATUS(status);
 				}
diff --git a/drivers/acpi/acpica/dspkginit.c b/drivers/acpi/acpica/dspkginit.c
index 6d487ed..5a602b7 100644
--- a/drivers/acpi/acpica/dspkginit.c
+++ b/drivers/acpi/acpica/dspkginit.c
@@ -297,8 +297,10 @@ acpi_ds_init_package_element(u8 object_type,
 {
 	union acpi_operand_object **element_ptr;
 
+	ACPI_FUNCTION_TRACE(ds_init_package_element);
+
 	if (!source_object) {
-		return (AE_OK);
+		return_ACPI_STATUS(AE_OK);
 	}
 
 	/*
@@ -329,7 +331,7 @@ acpi_ds_init_package_element(u8 object_type,
 		source_object->package.flags |= AOPOBJ_DATA_VALID;
 	}
 
-	return (AE_OK);
+	return_ACPI_STATUS(AE_OK);
 }
 
 /*******************************************************************************
@@ -352,6 +354,7 @@ acpi_ds_resolve_package_element(union acpi_operand_object **element_ptr)
 	union acpi_generic_state scope_info;
 	union acpi_operand_object *element = *element_ptr;
 	struct acpi_namespace_node *resolved_node;
+	struct acpi_namespace_node *original_node;
 	char *external_path = NULL;
 	acpi_object_type type;
 
@@ -441,6 +444,7 @@ acpi_ds_resolve_package_element(union acpi_operand_object **element_ptr)
 	 * will remain as named references. This behavior is not described
 	 * in the ACPI spec, but it appears to be an oversight.
 	 */
+	original_node = resolved_node;
 	status = acpi_ex_resolve_node_to_value(&resolved_node, NULL);
 	if (ACPI_FAILURE(status)) {
 		return_VOID;
@@ -468,26 +472,27 @@ acpi_ds_resolve_package_element(union acpi_operand_object **element_ptr)
 		 */
 	case ACPI_TYPE_DEVICE:
 	case ACPI_TYPE_THERMAL:
-
-		/* TBD: This may not be necesssary */
-
-		acpi_ut_add_reference(resolved_node->object);
+	case ACPI_TYPE_METHOD:
 		break;
 
 	case ACPI_TYPE_MUTEX:
-	case ACPI_TYPE_METHOD:
 	case ACPI_TYPE_POWER:
 	case ACPI_TYPE_PROCESSOR:
 	case ACPI_TYPE_EVENT:
 	case ACPI_TYPE_REGION:
 
+		/* acpi_ex_resolve_node_to_value gave these an extra reference */
+
+		acpi_ut_remove_reference(original_node->object);
 		break;
 
 	default:
 		/*
 		 * For all other types - the node was resolved to an actual
-		 * operand object with a value, return the object
+		 * operand object with a value, return the object. Remove
+		 * a reference on the existing object.
 		 */
+		acpi_ut_remove_reference(element);
 		*element_ptr = (union acpi_operand_object *)resolved_node;
 		break;
 	}
diff --git a/drivers/acpi/acpica/dsutils.c b/drivers/acpi/acpica/dsutils.c
index 0dabd9b..4c5faf6 100644
--- a/drivers/acpi/acpica/dsutils.c
+++ b/drivers/acpi/acpica/dsutils.c
@@ -583,7 +583,8 @@ acpi_ds_create_operand(struct acpi_walk_state *walk_state,
 			}
 
 			if (ACPI_FAILURE(status)) {
-				ACPI_ERROR_NAMESPACE(name_string, status);
+				ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+						     name_string, status);
 			}
 		}
 
diff --git a/drivers/acpi/acpica/dswload.c b/drivers/acpi/acpica/dswload.c
index eaa859a..5771e4e4 100644
--- a/drivers/acpi/acpica/dswload.c
+++ b/drivers/acpi/acpica/dswload.c
@@ -207,7 +207,8 @@ acpi_ds_load1_begin_op(struct acpi_walk_state *walk_state,
 		}
 #endif
 		if (ACPI_FAILURE(status)) {
-			ACPI_ERROR_NAMESPACE(path, status);
+			ACPI_ERROR_NAMESPACE(walk_state->scope_info, path,
+					     status);
 			return_ACPI_STATUS(status);
 		}
 
@@ -375,7 +376,8 @@ acpi_ds_load1_begin_op(struct acpi_walk_state *walk_state,
 			}
 
 			if (ACPI_FAILURE(status)) {
-				ACPI_ERROR_NAMESPACE(path, status);
+				ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+						     path, status);
 				return_ACPI_STATUS(status);
 			}
 		}
diff --git a/drivers/acpi/acpica/dswload2.c b/drivers/acpi/acpica/dswload2.c
index aad83ef..b3d0aae 100644
--- a/drivers/acpi/acpica/dswload2.c
+++ b/drivers/acpi/acpica/dswload2.c
@@ -184,11 +184,14 @@ acpi_ds_load2_begin_op(struct acpi_walk_state *walk_state,
 				if (status == AE_NOT_FOUND) {
 					status = AE_OK;
 				} else {
-					ACPI_ERROR_NAMESPACE(buffer_ptr,
+					ACPI_ERROR_NAMESPACE(walk_state->
+							     scope_info,
+							     buffer_ptr,
 							     status);
 				}
 #else
-				ACPI_ERROR_NAMESPACE(buffer_ptr, status);
+				ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+						     buffer_ptr, status);
 #endif
 				return_ACPI_STATUS(status);
 			}
@@ -343,7 +346,8 @@ acpi_ds_load2_begin_op(struct acpi_walk_state *walk_state,
 	}
 
 	if (ACPI_FAILURE(status)) {
-		ACPI_ERROR_NAMESPACE(buffer_ptr, status);
+		ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+				     buffer_ptr, status);
 		return_ACPI_STATUS(status);
 	}
 
@@ -719,7 +723,8 @@ acpi_status acpi_ds_load2_end_op(struct acpi_walk_state *walk_state)
 			 */
 			op->common.node = new_node;
 		} else {
-			ACPI_ERROR_NAMESPACE(arg->common.value.string, status);
+			ACPI_ERROR_NAMESPACE(walk_state->scope_info,
+					     arg->common.value.string, status);
 		}
 		break;
 
diff --git a/drivers/acpi/acpica/evregion.c b/drivers/acpi/acpica/evregion.c
index 28b447f..bb58419 100644
--- a/drivers/acpi/acpica/evregion.c
+++ b/drivers/acpi/acpica/evregion.c
@@ -298,6 +298,16 @@ acpi_ev_address_space_dispatch(union acpi_operand_object *region_obj,
 		ACPI_EXCEPTION((AE_INFO, status, "Returned by Handler for [%s]",
 				acpi_ut_get_region_name(region_obj->region.
 							space_id)));
+
+		/*
+		 * Special case for an EC timeout. These are seen so frequently
+		 * that an additional error message is helpful
+		 */
+		if ((region_obj->region.space_id == ACPI_ADR_SPACE_EC) &&
+		    (status == AE_TIME)) {
+			ACPI_ERROR((AE_INFO,
+				    "Timeout from EC hardware or EC device driver"));
+		}
 	}
 
 	if (!(handler_desc->address_space.handler_flags &
diff --git a/drivers/acpi/acpica/exdump.c b/drivers/acpi/acpica/exdump.c
index 83398dc..b2ff61b 100644
--- a/drivers/acpi/acpica/exdump.c
+++ b/drivers/acpi/acpica/exdump.c
@@ -617,10 +617,11 @@ void acpi_ex_dump_operand(union acpi_operand_object *obj_desc, u32 depth)
 	u32 length;
 	u32 index;
 
-	ACPI_FUNCTION_NAME(ex_dump_operand)
+	ACPI_FUNCTION_NAME(ex_dump_operand);
 
-	    /* Check if debug output enabled */
-	    if (!ACPI_IS_DEBUG_ENABLED(ACPI_LV_EXEC, _COMPONENT)) {
+	/* Check if debug output enabled */
+
+	if (!ACPI_IS_DEBUG_ENABLED(ACPI_LV_EXEC, _COMPONENT)) {
 		return;
 	}
 
@@ -904,7 +905,7 @@ void
 acpi_ex_dump_operands(union acpi_operand_object **operands,
 		      const char *opcode_name, u32 num_operands)
 {
-	ACPI_FUNCTION_NAME(ex_dump_operands);
+	ACPI_FUNCTION_TRACE(ex_dump_operands);
 
 	if (!opcode_name) {
 		opcode_name = "UNKNOWN";
@@ -928,7 +929,7 @@ acpi_ex_dump_operands(union acpi_operand_object **operands,
 
 	ACPI_DEBUG_PRINT((ACPI_DB_EXEC,
 			  "**** End operand dump for [%s]\n", opcode_name));
-	return;
+	return_VOID;
 }
 
 /*******************************************************************************
diff --git a/drivers/acpi/acpica/hwtimer.c b/drivers/acpi/acpica/hwtimer.c
index a2f4e25..5b42829 100644
--- a/drivers/acpi/acpica/hwtimer.c
+++ b/drivers/acpi/acpica/hwtimer.c
@@ -150,10 +150,10 @@ ACPI_EXPORT_SYMBOL(acpi_get_timer)
  *
  ******************************************************************************/
 acpi_status
-acpi_get_timer_duration(u32 start_ticks, u32 end_ticks, u32 * time_elapsed)
+acpi_get_timer_duration(u32 start_ticks, u32 end_ticks, u32 *time_elapsed)
 {
 	acpi_status status;
-	u32 delta_ticks;
+	u64 delta_ticks;
 	u64 quotient;
 
 	ACPI_FUNCTION_TRACE(acpi_get_timer_duration);
@@ -168,30 +168,29 @@ acpi_get_timer_duration(u32 start_ticks, u32 end_ticks, u32 * time_elapsed)
 		return_ACPI_STATUS(AE_SUPPORT);
 	}
 
+	if (start_ticks == end_ticks) {
+		*time_elapsed = 0;
+		return_ACPI_STATUS(AE_OK);
+	}
+
 	/*
 	 * Compute Tick Delta:
 	 * Handle (max one) timer rollovers on 24-bit versus 32-bit timers.
 	 */
-	if (start_ticks < end_ticks) {
-		delta_ticks = end_ticks - start_ticks;
-	} else if (start_ticks > end_ticks) {
+	delta_ticks = end_ticks;
+	if (start_ticks > end_ticks) {
 		if ((acpi_gbl_FADT.flags & ACPI_FADT_32BIT_TIMER) == 0) {
 
 			/* 24-bit Timer */
 
-			delta_ticks =
-			    (((0x00FFFFFF - start_ticks) +
-			      end_ticks) & 0x00FFFFFF);
+			delta_ticks |= (u64)1 << 24;
 		} else {
 			/* 32-bit Timer */
 
-			delta_ticks = (0xFFFFFFFF - start_ticks) + end_ticks;
+			delta_ticks |= (u64)1 << 32;
 		}
-	} else {		/* start_ticks == end_ticks */
-
-		*time_elapsed = 0;
-		return_ACPI_STATUS(AE_OK);
 	}
+	delta_ticks -= start_ticks;
 
 	/*
 	 * Compute Duration (Requires a 64-bit multiply and divide):
@@ -199,10 +198,10 @@ acpi_get_timer_duration(u32 start_ticks, u32 end_ticks, u32 * time_elapsed)
 	 * time_elapsed (microseconds) =
 	 *  (delta_ticks * ACPI_USEC_PER_SEC) / ACPI_PM_TIMER_FREQUENCY;
 	 */
-	status = acpi_ut_short_divide(((u64)delta_ticks) * ACPI_USEC_PER_SEC,
+	status = acpi_ut_short_divide(delta_ticks * ACPI_USEC_PER_SEC,
 				      ACPI_PM_TIMER_FREQUENCY, &quotient, NULL);
 
-	*time_elapsed = (u32) quotient;
+	*time_elapsed = (u32)quotient;
 	return_ACPI_STATUS(status);
 }
 
diff --git a/drivers/acpi/acpica/hwvalid.c b/drivers/acpi/acpica/hwvalid.c
index 3094cec..d167903 100644
--- a/drivers/acpi/acpica/hwvalid.c
+++ b/drivers/acpi/acpica/hwvalid.c
@@ -128,14 +128,14 @@ acpi_hw_validate_io_request(acpi_io_address address, u32 bit_width)
 	acpi_io_address last_address;
 	const struct acpi_port_info *port_info;
 
-	ACPI_FUNCTION_NAME(hw_validate_io_request);
+	ACPI_FUNCTION_TRACE(hw_validate_io_request);
 
 	/* Supported widths are 8/16/32 */
 
 	if ((bit_width != 8) && (bit_width != 16) && (bit_width != 32)) {
 		ACPI_ERROR((AE_INFO,
 			    "Bad BitWidth parameter: %8.8X", bit_width));
-		return (AE_BAD_PARAMETER);
+		return_ACPI_STATUS(AE_BAD_PARAMETER);
 	}
 
 	port_info = acpi_protected_ports;
@@ -153,13 +153,13 @@ acpi_hw_validate_io_request(acpi_io_address address, u32 bit_width)
 		ACPI_ERROR((AE_INFO,
 			    "Illegal I/O port address/length above 64K: %8.8X%8.8X/0x%X",
 			    ACPI_FORMAT_UINT64(address), byte_width));
-		return (AE_LIMIT);
+		return_ACPI_STATUS(AE_LIMIT);
 	}
 
 	/* Exit if requested address is not within the protected port table */
 
 	if (address > acpi_protected_ports[ACPI_PORT_INFO_ENTRIES - 1].end) {
-		return (AE_OK);
+		return_ACPI_STATUS(AE_OK);
 	}
 
 	/* Check request against the list of protected I/O ports */
@@ -180,8 +180,8 @@ acpi_hw_validate_io_request(acpi_io_address address, u32 bit_width)
 			/* Port illegality may depend on the _OSI calls made by the BIOS */
 
 			if (acpi_gbl_osi_data >= port_info->osi_dependency) {
-				ACPI_DEBUG_PRINT((ACPI_DB_IO,
-						  "Denied AML access to port 0x%8.8X%8.8X/%X (%s 0x%.4X-0x%.4X)",
+				ACPI_DEBUG_PRINT((ACPI_DB_VALUES,
+						  "Denied AML access to port 0x%8.8X%8.8X/%X (%s 0x%.4X-0x%.4X)\n",
 						  ACPI_FORMAT_UINT64(address),
 						  byte_width, port_info->name,
 						  port_info->start,
@@ -198,7 +198,7 @@ acpi_hw_validate_io_request(acpi_io_address address, u32 bit_width)
 		}
 	}
 
-	return (AE_OK);
+	return_ACPI_STATUS(AE_OK);
 }
 
 /******************************************************************************
diff --git a/drivers/acpi/acpica/nsaccess.c b/drivers/acpi/acpica/nsaccess.c
index f2733f5..33e652a 100644
--- a/drivers/acpi/acpica/nsaccess.c
+++ b/drivers/acpi/acpica/nsaccess.c
@@ -644,17 +644,18 @@ acpi_ns_lookup(union acpi_generic_state *scope_info,
 					    this_node->object;
 				}
 			}
-#ifdef ACPI_ASL_COMPILER
-			if (!acpi_gbl_disasm_flag &&
-			    (this_node->flags & ANOBJ_IS_EXTERNAL)) {
-				this_node->flags |= IMPLICIT_EXTERNAL;
-			}
-#endif
 		}
 
 		/* Special handling for the last segment (num_segments == 0) */
 
 		else {
+#ifdef ACPI_ASL_COMPILER
+			if (!acpi_gbl_disasm_flag
+			    && (this_node->flags & ANOBJ_IS_EXTERNAL)) {
+				this_node->flags &= ~IMPLICIT_EXTERNAL;
+			}
+#endif
+
 			/*
 			 * Sanity typecheck of the target object:
 			 *
diff --git a/drivers/acpi/acpica/nsconvert.c b/drivers/acpi/acpica/nsconvert.c
index 539d775..d55dcc8 100644
--- a/drivers/acpi/acpica/nsconvert.c
+++ b/drivers/acpi/acpica/nsconvert.c
@@ -495,7 +495,8 @@ acpi_ns_convert_to_reference(struct acpi_namespace_node *scope,
 
 		/* Check if we are resolving a named reference within a package */
 
-		ACPI_ERROR_NAMESPACE(original_object->string.pointer, status);
+		ACPI_ERROR_NAMESPACE(&scope_info,
+				     original_object->string.pointer, status);
 		goto error_exit;
 	}
 
diff --git a/drivers/acpi/acpica/nsnames.c b/drivers/acpi/acpica/nsnames.c
index a410760..22c92d1 100644
--- a/drivers/acpi/acpica/nsnames.c
+++ b/drivers/acpi/acpica/nsnames.c
@@ -49,6 +49,9 @@
 #define _COMPONENT          ACPI_NAMESPACE
 ACPI_MODULE_NAME("nsnames")
 
+/* Local Prototypes */
+static void acpi_ns_normalize_pathname(char *original_path);
+
 /*******************************************************************************
  *
  * FUNCTION:    acpi_ns_get_external_pathname
@@ -63,6 +66,7 @@ ACPI_MODULE_NAME("nsnames")
  *              for error and debug statements.
  *
  ******************************************************************************/
+
 char *acpi_ns_get_external_pathname(struct acpi_namespace_node *node)
 {
 	char *name_buffer;
@@ -352,3 +356,148 @@ char *acpi_ns_get_normalized_pathname(struct acpi_namespace_node *node,
 
 	return_PTR(name_buffer);
 }
+
+/*******************************************************************************
+ *
+ * FUNCTION:    acpi_ns_build_prefixed_pathname
+ *
+ * PARAMETERS:  prefix_scope        - Scope/Path that prefixes the internal path
+ *              internal_path       - Name or path of the namespace node
+ *
+ * RETURN:      None
+ *
+ * DESCRIPTION: Construct a fully qualified pathname from a concatenation of:
+ *              1) Path associated with the prefix_scope namespace node
+ *              2) External path representation of the Internal path
+ *
+ ******************************************************************************/
+
+char *acpi_ns_build_prefixed_pathname(union acpi_generic_state *prefix_scope,
+				      const char *internal_path)
+{
+	acpi_status status;
+	char *full_path = NULL;
+	char *external_path = NULL;
+	char *prefix_path = NULL;
+	u32 prefix_path_length = 0;
+
+	/* If there is a prefix, get the pathname to it */
+
+	if (prefix_scope && prefix_scope->scope.node) {
+		prefix_path =
+		    acpi_ns_get_normalized_pathname(prefix_scope->scope.node,
+						    TRUE);
+		if (prefix_path) {
+			prefix_path_length = strlen(prefix_path);
+		}
+	}
+
+	status = acpi_ns_externalize_name(ACPI_UINT32_MAX, internal_path,
+					  NULL, &external_path);
+	if (ACPI_FAILURE(status)) {
+		goto cleanup;
+	}
+
+	/* Merge the prefix path and the path. 2 is for one dot and trailing null */
+
+	full_path =
+	    ACPI_ALLOCATE_ZEROED(prefix_path_length + strlen(external_path) +
+				 2);
+	if (!full_path) {
+		goto cleanup;
+	}
+
+	/* Don't merge if the External path is already fully qualified */
+
+	if (prefix_path && (*external_path != '\\') && (*external_path != '^')) {
+		strcat(full_path, prefix_path);
+		if (prefix_path[1]) {
+			strcat(full_path, ".");
+		}
+	}
+
+	acpi_ns_normalize_pathname(external_path);
+	strcat(full_path, external_path);
+
+cleanup:
+	if (prefix_path) {
+		ACPI_FREE(prefix_path);
+	}
+	if (external_path) {
+		ACPI_FREE(external_path);
+	}
+
+	return (full_path);
+}
+
+/*******************************************************************************
+ *
+ * FUNCTION:    acpi_ns_normalize_pathname
+ *
+ * PARAMETERS:  original_path       - Path to be normalized, in External format
+ *
+ * RETURN:      The original path is processed in-place
+ *
+ * DESCRIPTION: Remove trailing underscores from each element of a path.
+ *
+ *              For example:  \A___.B___.C___ becomes \A.B.C
+ *
+ ******************************************************************************/
+
+static void acpi_ns_normalize_pathname(char *original_path)
+{
+	char *input_path = original_path;
+	char *new_path_buffer;
+	char *new_path;
+	u32 i;
+
+	/* Allocate a temp buffer in which to construct the new path */
+
+	new_path_buffer = ACPI_ALLOCATE_ZEROED(strlen(input_path) + 1);
+	new_path = new_path_buffer;
+	if (!new_path_buffer) {
+		return;
+	}
+
+	/* Special characters may appear at the beginning of the path */
+
+	if (*input_path == '\\') {
+		*new_path = *input_path;
+		new_path++;
+		input_path++;
+	}
+
+	while (*input_path == '^') {
+		*new_path = *input_path;
+		new_path++;
+		input_path++;
+	}
+
+	/* Remainder of the path */
+
+	while (*input_path) {
+
+		/* Do one nameseg at a time */
+
+		for (i = 0; (i < ACPI_NAME_SIZE) && *input_path; i++) {
+			if ((i == 0) || (*input_path != '_')) {	/* First char is allowed to be underscore */
+				*new_path = *input_path;
+				new_path++;
+			}
+
+			input_path++;
+		}
+
+		/* Dot means that there are more namesegs to come */
+
+		if (*input_path == '.') {
+			*new_path = *input_path;
+			new_path++;
+			input_path++;
+		}
+	}
+
+	*new_path = 0;
+	strcpy(original_path, new_path_buffer);
+	ACPI_FREE(new_path_buffer);
+}
diff --git a/drivers/acpi/acpica/nssearch.c b/drivers/acpi/acpica/nssearch.c
index 5de8957..e91dbee 100644
--- a/drivers/acpi/acpica/nssearch.c
+++ b/drivers/acpi/acpica/nssearch.c
@@ -417,6 +417,7 @@ acpi_ns_search_and_enter(u32 target_name,
 	if (flags & ACPI_NS_EXTERNAL ||
 	    (walk_state && walk_state->opcode == AML_SCOPE_OP)) {
 		new_node->flags |= ANOBJ_IS_EXTERNAL;
+		new_node->flags |= IMPLICIT_EXTERNAL;
 	}
 #endif
 
diff --git a/drivers/acpi/acpica/nsxfeval.c b/drivers/acpi/acpica/nsxfeval.c
index 783f4c8..9b51f65 100644
--- a/drivers/acpi/acpica/nsxfeval.c
+++ b/drivers/acpi/acpica/nsxfeval.c
@@ -61,10 +61,10 @@ static void acpi_ns_resolve_references(struct acpi_evaluate_info *info);
  *
  * PARAMETERS:  handle              - Object handle (optional)
  *              pathname            - Object pathname (optional)
- *              external_params     - List of parameters to pass to method,
+ *              external_params     - List of parameters to pass to a method,
  *                                    terminated by NULL. May be NULL
  *                                    if no parameters are being passed.
- *              return_buffer       - Where to put method's return value (if
+ *              return_buffer       - Where to put the object's return value (if
  *                                    any). If NULL, no value is returned.
  *              return_type         - Expected type of return object
  *
@@ -100,13 +100,14 @@ acpi_evaluate_object_typed(acpi_handle handle,
 		free_buffer_on_error = TRUE;
 	}
 
+	/* Get a handle here, in order to build an error message if needed */
+
+	target_handle = handle;
 	if (pathname) {
 		status = acpi_get_handle(handle, pathname, &target_handle);
 		if (ACPI_FAILURE(status)) {
 			return_ACPI_STATUS(status);
 		}
-	} else {
-		target_handle = handle;
 	}
 
 	full_pathname = acpi_ns_get_external_pathname(target_handle);
diff --git a/drivers/acpi/acpica/psargs.c b/drivers/acpi/acpica/psargs.c
index eb9dfac..171e2fa 100644
--- a/drivers/acpi/acpica/psargs.c
+++ b/drivers/acpi/acpica/psargs.c
@@ -361,7 +361,7 @@ acpi_ps_get_next_namepath(struct acpi_walk_state *walk_state,
 	/* Final exception check (may have been changed from code above) */
 
 	if (ACPI_FAILURE(status)) {
-		ACPI_ERROR_NAMESPACE(path, status);
+		ACPI_ERROR_NAMESPACE(walk_state->scope_info, path, status);
 
 		if ((walk_state->parse_flags & ACPI_PARSE_MODE_MASK) ==
 		    ACPI_PARSE_EXECUTE) {
diff --git a/drivers/acpi/acpica/psobject.c b/drivers/acpi/acpica/psobject.c
index 0bef6df..c0b1798 100644
--- a/drivers/acpi/acpica/psobject.c
+++ b/drivers/acpi/acpica/psobject.c
@@ -372,16 +372,10 @@ acpi_ps_create_op(struct acpi_walk_state *walk_state,
 			 * external declaration opcode. Setting walk_state->Aml to
 			 * walk_state->parser_state.Aml + 2 moves increments the
 			 * walk_state->Aml past the object type and the paramcount of the
-			 * external opcode. For the error message, only print the AML
-			 * offset. We could attempt to print the name but this may cause
-			 * a segmentation fault when printing the namepath because the
-			 * AML may be incorrect.
+			 * external opcode.
 			 */
-			acpi_os_printf
-			    ("// Invalid external declaration at AML offset 0x%x.\n",
-			     walk_state->aml -
-			     walk_state->parser_state.aml_start);
 			walk_state->aml = walk_state->parser_state.aml + 2;
+			walk_state->parser_state.aml = walk_state->aml;
 			return_ACPI_STATUS(AE_CTRL_PARSE_CONTINUE);
 		}
 #endif
diff --git a/drivers/acpi/acpica/psutils.c b/drivers/acpi/acpica/psutils.c
index 0264276..cd59dfe 100644
--- a/drivers/acpi/acpica/psutils.c
+++ b/drivers/acpi/acpica/psutils.c
@@ -94,9 +94,11 @@ void acpi_ps_init_op(union acpi_parse_object *op, u16 opcode)
 	op->common.descriptor_type = ACPI_DESC_TYPE_PARSER;
 	op->common.aml_opcode = opcode;
 
-	ACPI_DISASM_ONLY_MEMBERS(strncpy(op->common.aml_op_name,
-					 (acpi_ps_get_opcode_info(opcode))->
-					 name, sizeof(op->common.aml_op_name)));
+	ACPI_DISASM_ONLY_MEMBERS(acpi_ut_safe_strncpy(op->common.aml_op_name,
+						      (acpi_ps_get_opcode_info
+						       (opcode))->name,
+						      sizeof(op->common.
+							     aml_op_name)));
 }
 
 /*******************************************************************************
@@ -158,10 +160,10 @@ union acpi_parse_object *acpi_ps_alloc_op(u16 opcode, u8 *aml)
 		if (opcode == AML_SCOPE_OP) {
 			acpi_gbl_current_scope = op;
 		}
-	}
 
-	if (gbl_capture_comments) {
-		ASL_CV_TRANSFER_COMMENTS(op);
+		if (acpi_gbl_capture_comments) {
+			ASL_CV_TRANSFER_COMMENTS(op);
+		}
 	}
 
 	return (op);
diff --git a/drivers/acpi/acpica/utdebug.c b/drivers/acpi/acpica/utdebug.c
index 615a885..cff7154 100644
--- a/drivers/acpi/acpica/utdebug.c
+++ b/drivers/acpi/acpica/utdebug.c
@@ -163,6 +163,9 @@ acpi_debug_print(u32 requested_debug_level,
 {
 	acpi_thread_id thread_id;
 	va_list args;
+#ifdef ACPI_APPLICATION
+	int fill_count;
+#endif
 
 	/* Check if debug output enabled */
 
@@ -202,10 +205,21 @@ acpi_debug_print(u32 requested_debug_level,
 		acpi_os_printf("[%u] ", (u32)thread_id);
 	}
 
-	acpi_os_printf("[%02ld] ", acpi_gbl_nesting_level);
-#endif
+	fill_count = 48 - acpi_gbl_nesting_level -
+	    strlen(acpi_ut_trim_function_name(function_name));
+	if (fill_count < 0) {
+		fill_count = 0;
+	}
 
+	acpi_os_printf("[%02ld] %*s",
+		       acpi_gbl_nesting_level, acpi_gbl_nesting_level + 1, " ");
+	acpi_os_printf("%s%*s: ",
+		       acpi_ut_trim_function_name(function_name), fill_count,
+		       " ");
+
+#else
 	acpi_os_printf("%-22.22s: ", acpi_ut_trim_function_name(function_name));
+#endif
 
 	va_start(args, format);
 	acpi_os_vprintf(format, args);
diff --git a/drivers/acpi/acpica/utdecode.c b/drivers/acpi/acpica/utdecode.c
index 02cd2c2..55debba 100644
--- a/drivers/acpi/acpica/utdecode.c
+++ b/drivers/acpi/acpica/utdecode.c
@@ -395,11 +395,6 @@ const char *acpi_ut_get_reference_name(union acpi_operand_object *object)
 	return (acpi_gbl_ref_class_names[object->reference.class]);
 }
 
-#if defined(ACPI_DEBUG_OUTPUT) || defined(ACPI_DEBUGGER)
-/*
- * Strings and procedures used for debug only
- */
-
 /*******************************************************************************
  *
  * FUNCTION:    acpi_ut_get_mutex_name
@@ -433,6 +428,12 @@ const char *acpi_ut_get_mutex_name(u32 mutex_id)
 	return (acpi_gbl_mutex_names[mutex_id]);
 }
 
+#if defined(ACPI_DEBUG_OUTPUT) || defined(ACPI_DEBUGGER)
+
+/*
+ * Strings and procedures used for debug only
+ */
+
 /*******************************************************************************
  *
  * FUNCTION:    acpi_ut_get_notify_name
diff --git a/drivers/acpi/acpica/uterror.c b/drivers/acpi/acpica/uterror.c
index e336818..42388dc 100644
--- a/drivers/acpi/acpica/uterror.c
+++ b/drivers/acpi/acpica/uterror.c
@@ -182,6 +182,78 @@ acpi_ut_predefined_bios_error(const char *module_name,
 
 /*******************************************************************************
  *
+ * FUNCTION:    acpi_ut_prefixed_namespace_error
+ *
+ * PARAMETERS:  module_name         - Caller's module name (for error output)
+ *              line_number         - Caller's line number (for error output)
+ *              prefix_scope        - Scope/Path that prefixes the internal path
+ *              internal_path       - Name or path of the namespace node
+ *              lookup_status       - Exception code from NS lookup
+ *
+ * RETURN:      None
+ *
+ * DESCRIPTION: Print error message with the full pathname constructed this way:
+ *
+ *                  prefix_scope_node_full_path.externalized_internal_path
+ *
+ * NOTE:        10/2017: Treat the major ns_lookup errors as firmware errors
+ *
+ ******************************************************************************/
+
+void
+acpi_ut_prefixed_namespace_error(const char *module_name,
+				 u32 line_number,
+				 union acpi_generic_state *prefix_scope,
+				 const char *internal_path,
+				 acpi_status lookup_status)
+{
+	char *full_path;
+	const char *message;
+
+	/*
+	 * Main cases:
+	 * 1) Object creation, object must not already exist
+	 * 2) Object lookup, object must exist
+	 */
+	switch (lookup_status) {
+	case AE_ALREADY_EXISTS:
+
+		acpi_os_printf(ACPI_MSG_BIOS_ERROR);
+		message = "Failure creating";
+		break;
+
+	case AE_NOT_FOUND:
+
+		acpi_os_printf(ACPI_MSG_BIOS_ERROR);
+		message = "Failure looking up";
+		break;
+
+	default:
+
+		acpi_os_printf(ACPI_MSG_ERROR);
+		message = "Failure looking up";
+		break;
+	}
+
+	/* Concatenate the prefix path and the internal path */
+
+	full_path =
+	    acpi_ns_build_prefixed_pathname(prefix_scope, internal_path);
+
+	acpi_os_printf("%s [%s], %s", message,
+		       full_path ? full_path : "Could not get pathname",
+		       acpi_format_exception(lookup_status));
+
+	if (full_path) {
+		ACPI_FREE(full_path);
+	}
+
+	ACPI_MSG_SUFFIX;
+}
+
+#ifdef __OBSOLETE_FUNCTION
+/*******************************************************************************
+ *
  * FUNCTION:    acpi_ut_namespace_error
  *
  * PARAMETERS:  module_name         - Caller's module name (for error output)
@@ -240,6 +312,7 @@ acpi_ut_namespace_error(const char *module_name,
 	ACPI_MSG_SUFFIX;
 	ACPI_MSG_REDIRECT_END;
 }
+#endif
 
 /*******************************************************************************
  *
diff --git a/drivers/acpi/acpica/utinit.c b/drivers/acpi/acpica/utinit.c
index 23e766d..45eeb0d 100644
--- a/drivers/acpi/acpica/utinit.c
+++ b/drivers/acpi/acpica/utinit.c
@@ -206,7 +206,6 @@ acpi_status acpi_ut_init_globals(void)
 	acpi_gbl_next_owner_id_offset = 0;
 	acpi_gbl_debugger_configuration = DEBUGGER_THREADING;
 	acpi_gbl_osi_mutex = NULL;
-	acpi_gbl_max_loop_iterations = ACPI_MAX_LOOP_COUNT;
 
 	/* Hardware oriented */
 
diff --git a/drivers/acpi/acpica/utmath.c b/drivers/acpi/acpica/utmath.c
index 5f9c680..2055a85 100644
--- a/drivers/acpi/acpica/utmath.c
+++ b/drivers/acpi/acpica/utmath.c
@@ -134,7 +134,7 @@ acpi_status acpi_ut_short_shift_left(u64 operand, u32 count, u64 *out_result)
 
 	if ((count & 63) >= 32) {
 		operand_ovl.part.hi = operand_ovl.part.lo;
-		operand_ovl.part.lo ^= operand_ovl.part.lo;
+		operand_ovl.part.lo = 0;
 		count = (count & 63) - 32;
 	}
 	ACPI_SHIFT_LEFT_64_BY_32(operand_ovl.part.hi,
@@ -171,7 +171,7 @@ acpi_status acpi_ut_short_shift_right(u64 operand, u32 count, u64 *out_result)
 
 	if ((count & 63) >= 32) {
 		operand_ovl.part.lo = operand_ovl.part.hi;
-		operand_ovl.part.hi ^= operand_ovl.part.hi;
+		operand_ovl.part.hi = 0;
 		count = (count & 63) - 32;
 	}
 	ACPI_SHIFT_RIGHT_64_BY_32(operand_ovl.part.hi,
diff --git a/drivers/acpi/acpica/utmutex.c b/drivers/acpi/acpica/utmutex.c
index 5863547..524ba93 100644
--- a/drivers/acpi/acpica/utmutex.c
+++ b/drivers/acpi/acpica/utmutex.c
@@ -286,8 +286,9 @@ acpi_status acpi_ut_acquire_mutex(acpi_mutex_handle mutex_id)
 		acpi_gbl_mutex_info[mutex_id].thread_id = this_thread_id;
 	} else {
 		ACPI_EXCEPTION((AE_INFO, status,
-				"Thread %u could not acquire Mutex [0x%X]",
-				(u32)this_thread_id, mutex_id));
+				"Thread %u could not acquire Mutex [%s] (0x%X)",
+				(u32)this_thread_id,
+				acpi_ut_get_mutex_name(mutex_id), mutex_id));
 	}
 
 	return (status);
@@ -322,8 +323,8 @@ acpi_status acpi_ut_release_mutex(acpi_mutex_handle mutex_id)
 	 */
 	if (acpi_gbl_mutex_info[mutex_id].thread_id == ACPI_MUTEX_NOT_ACQUIRED) {
 		ACPI_ERROR((AE_INFO,
-			    "Mutex [0x%X] is not acquired, cannot release",
-			    mutex_id));
+			    "Mutex [%s] (0x%X) is not acquired, cannot release",
+			    acpi_ut_get_mutex_name(mutex_id), mutex_id));
 
 		return (AE_NOT_ACQUIRED);
 	}
diff --git a/drivers/acpi/acpica/utnonansi.c b/drivers/acpi/acpica/utnonansi.c
index 7926649..33a0970 100644
--- a/drivers/acpi/acpica/utnonansi.c
+++ b/drivers/acpi/acpica/utnonansi.c
@@ -140,7 +140,7 @@ int acpi_ut_stricmp(char *string1, char *string2)
 	return (c1 - c2);
 }
 
-#if defined (ACPI_DEBUGGER) || defined (ACPI_APPLICATION)
+#if defined (ACPI_DEBUGGER) || defined (ACPI_APPLICATION) || defined (ACPI_DEBUG_OUTPUT)
 /*******************************************************************************
  *
  * FUNCTION:    acpi_ut_safe_strcpy, acpi_ut_safe_strcat, acpi_ut_safe_strncat
@@ -199,4 +199,13 @@ acpi_ut_safe_strncat(char *dest,
 	strncat(dest, source, max_transfer_length);
 	return (FALSE);
 }
+
+void acpi_ut_safe_strncpy(char *dest, char *source, acpi_size dest_size)
+{
+	/* Always terminate destination string */
+
+	strncpy(dest, source, dest_size);
+	dest[dest_size - 1] = 0;
+}
+
 #endif
diff --git a/drivers/acpi/acpica/utosi.c b/drivers/acpi/acpica/utosi.c
index 3175b13..f6b8dd2 100644
--- a/drivers/acpi/acpica/utosi.c
+++ b/drivers/acpi/acpica/utosi.c
@@ -101,6 +101,8 @@ static struct acpi_interface_info acpi_default_supported_interfaces[] = {
 	{"Windows 2012", NULL, 0, ACPI_OSI_WIN_8},	/* Windows 8 and Server 2012 - Added 08/2012 */
 	{"Windows 2013", NULL, 0, ACPI_OSI_WIN_8},	/* Windows 8.1 and Server 2012 R2 - Added 01/2014 */
 	{"Windows 2015", NULL, 0, ACPI_OSI_WIN_10},	/* Windows 10 - Added 03/2015 */
+	{"Windows 2016", NULL, 0, ACPI_OSI_WIN_10_RS1},	/* Windows 10 version 1607 - Added 12/2017 */
+	{"Windows 2017", NULL, 0, ACPI_OSI_WIN_10_RS2},	/* Windows 10 version 1703 - Added 12/2017 */
 
 	/* Feature Group Strings */
 
diff --git a/drivers/acpi/acpica/utstrsuppt.c b/drivers/acpi/acpica/utstrsuppt.c
index 965fb5c..97f48d71 100644
--- a/drivers/acpi/acpica/utstrsuppt.c
+++ b/drivers/acpi/acpica/utstrsuppt.c
@@ -52,10 +52,9 @@ static acpi_status
 acpi_ut_insert_digit(u64 *accumulated_value, u32 base, int ascii_digit);
 
 static acpi_status
-acpi_ut_strtoul_multiply64(u64 multiplicand, u64 multiplier, u64 *out_product);
+acpi_ut_strtoul_multiply64(u64 multiplicand, u32 base, u64 *out_product);
 
-static acpi_status
-acpi_ut_strtoul_add64(u64 addend1, u64 addend2, u64 *out_sum);
+static acpi_status acpi_ut_strtoul_add64(u64 addend1, u32 digit, u64 *out_sum);
 
 /*******************************************************************************
  *
@@ -357,7 +356,7 @@ acpi_ut_insert_digit(u64 *accumulated_value, u32 base, int ascii_digit)
  * FUNCTION:    acpi_ut_strtoul_multiply64
  *
  * PARAMETERS:  multiplicand            - Current accumulated converted integer
- *              multiplier              - Base/Radix
+ *              base                    - Base/Radix
  *              out_product             - Where the product is returned
  *
  * RETURN:      Status and 64-bit product
@@ -369,33 +368,40 @@ acpi_ut_insert_digit(u64 *accumulated_value, u32 base, int ascii_digit)
  ******************************************************************************/
 
 static acpi_status
-acpi_ut_strtoul_multiply64(u64 multiplicand, u64 multiplier, u64 *out_product)
+acpi_ut_strtoul_multiply64(u64 multiplicand, u32 base, u64 *out_product)
 {
-	u64 val;
+	u64 product;
+	u64 quotient;
 
 	/* Exit if either operand is zero */
 
 	*out_product = 0;
-	if (!multiplicand || !multiplier) {
+	if (!multiplicand || !base) {
 		return (AE_OK);
 	}
 
-	/* Check for 64-bit overflow before the actual multiplication */
-
-	acpi_ut_short_divide(ACPI_UINT64_MAX, (u32)multiplier, &val, NULL);
-	if (multiplicand > val) {
+	/*
+	 * Check for 64-bit overflow before the actual multiplication.
+	 *
+	 * Notes: 64-bit division is often not supported on 32-bit platforms
+	 * (it requires a library function), Therefore ACPICA has a local
+	 * 64-bit divide function. Also, Multiplier is currently only used
+	 * as the radix (8/10/16), to the 64/32 divide will always work.
+	 */
+	acpi_ut_short_divide(ACPI_UINT64_MAX, base, &quotient, NULL);
+	if (multiplicand > quotient) {
 		return (AE_NUMERIC_OVERFLOW);
 	}
 
-	val = multiplicand * multiplier;
+	product = multiplicand * base;
 
 	/* Check for 32-bit overflow if necessary */
 
-	if ((acpi_gbl_integer_bit_width == 32) && (val > ACPI_UINT32_MAX)) {
+	if ((acpi_gbl_integer_bit_width == 32) && (product > ACPI_UINT32_MAX)) {
 		return (AE_NUMERIC_OVERFLOW);
 	}
 
-	*out_product = val;
+	*out_product = product;
 	return (AE_OK);
 }
 
@@ -404,7 +410,7 @@ acpi_ut_strtoul_multiply64(u64 multiplicand, u64 multiplier, u64 *out_product)
  * FUNCTION:    acpi_ut_strtoul_add64
  *
  * PARAMETERS:  addend1                 - Current accumulated converted integer
- *              addend2                 - New hex value/char
+ *              digit                   - New hex value/char
  *              out_sum                 - Where sum is returned (Accumulator)
  *
  * RETURN:      Status and 64-bit sum
@@ -415,17 +421,17 @@ acpi_ut_strtoul_multiply64(u64 multiplicand, u64 multiplier, u64 *out_product)
  *
  ******************************************************************************/
 
-static acpi_status acpi_ut_strtoul_add64(u64 addend1, u64 addend2, u64 *out_sum)
+static acpi_status acpi_ut_strtoul_add64(u64 addend1, u32 digit, u64 *out_sum)
 {
 	u64 sum;
 
 	/* Check for 64-bit overflow before the actual addition */
 
-	if ((addend1 > 0) && (addend2 > (ACPI_UINT64_MAX - addend1))) {
+	if ((addend1 > 0) && (digit > (ACPI_UINT64_MAX - addend1))) {
 		return (AE_NUMERIC_OVERFLOW);
 	}
 
-	sum = addend1 + addend2;
+	sum = addend1 + digit;
 
 	/* Check for 32-bit overflow if necessary */
 
diff --git a/drivers/acpi/acpica/uttrack.c b/drivers/acpi/acpica/uttrack.c
index 3c8de88..633b4e2 100644
--- a/drivers/acpi/acpica/uttrack.c
+++ b/drivers/acpi/acpica/uttrack.c
@@ -402,8 +402,8 @@ acpi_ut_track_allocation(struct acpi_debug_mem_block *allocation,
 	allocation->component = component;
 	allocation->line = line;
 
-	strncpy(allocation->module, module, ACPI_MAX_MODULE_NAME);
-	allocation->module[ACPI_MAX_MODULE_NAME - 1] = 0;
+	acpi_ut_safe_strncpy(allocation->module, (char *)module,
+			     ACPI_MAX_MODULE_NAME);
 
 	if (!element) {
 
@@ -717,7 +717,7 @@ void acpi_ut_dump_allocations(u32 component, const char *module)
 	if (!num_outstanding) {
 		ACPI_INFO(("No outstanding allocations"));
 	} else {
-		ACPI_ERROR((AE_INFO, "%u(0x%X) Outstanding allocations",
+		ACPI_ERROR((AE_INFO, "%u (0x%X) Outstanding cache allocations",
 			    num_outstanding, num_outstanding));
 	}
 
diff --git a/drivers/acpi/acpica/utxferror.c b/drivers/acpi/acpica/utxferror.c
index 950a1e5..9da4f8e 100644
--- a/drivers/acpi/acpica/utxferror.c
+++ b/drivers/acpi/acpica/utxferror.c
@@ -96,8 +96,8 @@ ACPI_EXPORT_SYMBOL(acpi_error)
  *
  * RETURN:      None
  *
- * DESCRIPTION: Print "ACPI Exception" message with module/line/version info
- *              and decoded acpi_status.
+ * DESCRIPTION: Print an "ACPI Error" message with module/line/version
+ *              info as well as decoded acpi_status.
  *
  ******************************************************************************/
 void ACPI_INTERNAL_VAR_XFACE
@@ -111,10 +111,10 @@ acpi_exception(const char *module_name,
 	/* For AE_OK, just print the message */
 
 	if (ACPI_SUCCESS(status)) {
-		acpi_os_printf(ACPI_MSG_EXCEPTION);
+		acpi_os_printf(ACPI_MSG_ERROR);
 
 	} else {
-		acpi_os_printf(ACPI_MSG_EXCEPTION "%s, ",
+		acpi_os_printf(ACPI_MSG_ERROR "%s, ",
 			       acpi_format_exception(status));
 	}
 
diff --git a/drivers/acpi/apei/ghes.c b/drivers/acpi/apei/ghes.c
index bb5f9c6..1efefe9 100644
--- a/drivers/acpi/apei/ghes.c
+++ b/drivers/acpi/apei/ghes.c
@@ -414,6 +414,51 @@ static void ghes_handle_memory_failure(struct acpi_hest_generic_data *gdata, int
 #endif
 }
 
+/*
+ * PCIe AER errors need to be sent to the AER driver for reporting and
+ * recovery. The GHES severities map to the following AER severities and
+ * require the following handling:
+ *
+ * GHES_SEV_CORRECTABLE -> AER_CORRECTABLE
+ *     These need to be reported by the AER driver but no recovery is
+ *     necessary.
+ * GHES_SEV_RECOVERABLE -> AER_NONFATAL
+ * GHES_SEV_RECOVERABLE && CPER_SEC_RESET -> AER_FATAL
+ *     These both need to be reported and recovered from by the AER driver.
+ * GHES_SEV_PANIC does not make it to this handling since the kernel must
+ *     panic.
+ */
+static void ghes_handle_aer(struct acpi_hest_generic_data *gdata)
+{
+#ifdef CONFIG_ACPI_APEI_PCIEAER
+	struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
+
+	if (pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
+	    pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
+		unsigned int devfn;
+		int aer_severity;
+
+		devfn = PCI_DEVFN(pcie_err->device_id.device,
+				  pcie_err->device_id.function);
+		aer_severity = cper_severity_to_aer(gdata->error_severity);
+
+		/*
+		 * If firmware reset the component to contain
+		 * the error, we must reinitialize it before
+		 * use, so treat it as a fatal AER error.
+		 */
+		if (gdata->flags & CPER_SEC_RESET)
+			aer_severity = AER_FATAL;
+
+		aer_recover_queue(pcie_err->device_id.segment,
+				  pcie_err->device_id.bus,
+				  devfn, aer_severity,
+				  (struct aer_capability_regs *)
+				  pcie_err->aer_info);
+	}
+#endif
+}
+
 static void ghes_do_proc(struct ghes *ghes,
 			 const struct acpi_hest_generic_status *estatus)
 {
@@ -441,38 +486,9 @@ static void ghes_do_proc(struct ghes *ghes,
 			arch_apei_report_mem_error(sev, mem_err);
 			ghes_handle_memory_failure(gdata, sev);
 		}
-#ifdef CONFIG_ACPI_APEI_PCIEAER
 		else if (guid_equal(sec_type, &CPER_SEC_PCIE)) {
-			struct cper_sec_pcie *pcie_err = acpi_hest_get_payload(gdata);
-
-			if (sev == GHES_SEV_RECOVERABLE &&
-			    sec_sev == GHES_SEV_RECOVERABLE &&
-			    pcie_err->validation_bits & CPER_PCIE_VALID_DEVICE_ID &&
-			    pcie_err->validation_bits & CPER_PCIE_VALID_AER_INFO) {
-				unsigned int devfn;
-				int aer_severity;
-
-				devfn = PCI_DEVFN(pcie_err->device_id.device,
-						  pcie_err->device_id.function);
-				aer_severity = cper_severity_to_aer(gdata->error_severity);
-
-				/*
-				 * If firmware reset the component to contain
-				 * the error, we must reinitialize it before
-				 * use, so treat it as a fatal AER error.
-				 */
-				if (gdata->flags & CPER_SEC_RESET)
-					aer_severity = AER_FATAL;
-
-				aer_recover_queue(pcie_err->device_id.segment,
-						  pcie_err->device_id.bus,
-						  devfn, aer_severity,
-						  (struct aer_capability_regs *)
-						  pcie_err->aer_info);
-			}
-
+			ghes_handle_aer(gdata);
 		}
-#endif
 		else if (guid_equal(sec_type, &CPER_SEC_PROC_ARM)) {
 			struct cper_sec_proc_arm *err = acpi_hest_get_payload(gdata);
 
@@ -870,7 +886,6 @@ static void ghes_print_queued_estatus(void)
 	struct ghes_estatus_node *estatus_node;
 	struct acpi_hest_generic *generic;
 	struct acpi_hest_generic_status *estatus;
-	u32 len, node_len;
 
 	llnode = llist_del_all(&ghes_estatus_llist);
 	/*
@@ -882,8 +897,6 @@ static void ghes_print_queued_estatus(void)
 		estatus_node = llist_entry(llnode, struct ghes_estatus_node,
 					   llnode);
 		estatus = GHES_ESTATUS_FROM_NODE(estatus_node);
-		len = cper_estatus_len(estatus);
-		node_len = GHES_ESTATUS_NODE_LEN(len);
 		generic = estatus_node->generic;
 		ghes_print_estatus(NULL, generic, estatus);
 		llnode = llnode->next;
diff --git a/drivers/acpi/battery.c b/drivers/acpi/battery.c
index 13e7b56..19bc440 100644
--- a/drivers/acpi/battery.c
+++ b/drivers/acpi/battery.c
@@ -70,6 +70,7 @@ static async_cookie_t async_cookie;
 static bool battery_driver_registered;
 static int battery_bix_broken_package;
 static int battery_notification_delay_ms;
+static int battery_full_discharging;
 static unsigned int cache_time = 1000;
 module_param(cache_time, uint, 0644);
 MODULE_PARM_DESC(cache_time, "cache time in milliseconds");
@@ -214,9 +215,12 @@ static int acpi_battery_get_property(struct power_supply *psy,
 		return -ENODEV;
 	switch (psp) {
 	case POWER_SUPPLY_PROP_STATUS:
-		if (battery->state & ACPI_BATTERY_STATE_DISCHARGING)
-			val->intval = POWER_SUPPLY_STATUS_DISCHARGING;
-		else if (battery->state & ACPI_BATTERY_STATE_CHARGING)
+		if (battery->state & ACPI_BATTERY_STATE_DISCHARGING) {
+			if (battery_full_discharging && battery->rate_now == 0)
+				val->intval = POWER_SUPPLY_STATUS_FULL;
+			else
+				val->intval = POWER_SUPPLY_STATUS_DISCHARGING;
+		} else if (battery->state & ACPI_BATTERY_STATE_CHARGING)
 			val->intval = POWER_SUPPLY_STATUS_CHARGING;
 		else if (acpi_battery_is_charged(battery))
 			val->intval = POWER_SUPPLY_STATUS_FULL;
@@ -1166,6 +1170,12 @@ battery_notification_delay_quirk(const struct dmi_system_id *d)
 	return 0;
 }
 
+static int __init battery_full_discharging_quirk(const struct dmi_system_id *d)
+{
+	battery_full_discharging = 1;
+	return 0;
+}
+
 static const struct dmi_system_id bat_dmi_table[] __initconst = {
 	{
 		.callback = battery_bix_broken_package_quirk,
@@ -1183,6 +1193,22 @@ static const struct dmi_system_id bat_dmi_table[] __initconst = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "Aspire V5-573G"),
 		},
 	},
+	{
+		.callback = battery_full_discharging_quirk,
+		.ident = "ASUS GL502VSK",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "GL502VSK"),
+		},
+	},
+	{
+		.callback = battery_full_discharging_quirk,
+		.ident = "ASUS UX305LA",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
+			DMI_MATCH(DMI_PRODUCT_NAME, "UX305LA"),
+		},
+	},
 	{},
 };
 
@@ -1237,13 +1263,11 @@ static int acpi_battery_add(struct acpi_device *device)
 
 #ifdef CONFIG_ACPI_PROCFS_POWER
 	result = acpi_battery_add_fs(device);
-#endif
 	if (result) {
-#ifdef CONFIG_ACPI_PROCFS_POWER
 		acpi_battery_remove_fs(device);
-#endif
 		goto fail;
 	}
+#endif
 
 	printk(KERN_INFO PREFIX "%s Slot [%s] (battery %s)\n",
 		ACPI_BATTERY_DEVICE_NAME, acpi_device_bid(device),
diff --git a/drivers/acpi/button.c b/drivers/acpi/button.c
index bf8e4d3..e1eee7a 100644
--- a/drivers/acpi/button.c
+++ b/drivers/acpi/button.c
@@ -30,6 +30,7 @@
 #include <linux/input.h>
 #include <linux/slab.h>
 #include <linux/acpi.h>
+#include <linux/dmi.h>
 #include <acpi/button.h>
 
 #define PREFIX "ACPI: "
@@ -76,6 +77,22 @@ static const struct acpi_device_id button_device_ids[] = {
 };
 MODULE_DEVICE_TABLE(acpi, button_device_ids);
 
+/*
+ * Some devices which don't even have a lid in anyway have a broken _LID
+ * method (e.g. pointing to a floating gpio pin) causing spurious LID events.
+ */
+static const struct dmi_system_id lid_blacklst[] = {
+	{
+		/* GP-electronic T701 */
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Insyde"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "T701"),
+			DMI_MATCH(DMI_BIOS_VERSION, "BYT70A.YNCHENG.WIN.007"),
+		},
+	},
+	{}
+};
+
 static int acpi_button_add(struct acpi_device *device);
 static int acpi_button_remove(struct acpi_device *device);
 static void acpi_button_notify(struct acpi_device *device, u32 event);
@@ -210,6 +227,8 @@ static int acpi_lid_notify_state(struct acpi_device *device, int state)
 	}
 	/* Send the platform triggered reliable event */
 	if (do_update) {
+		acpi_handle_debug(device->handle, "ACPI LID %s\n",
+				  state ? "open" : "closed");
 		input_report_switch(button->input, SW_LID, !state);
 		input_sync(button->input);
 		button->last_state = !!state;
@@ -473,6 +492,9 @@ static int acpi_button_add(struct acpi_device *device)
 	char *name, *class;
 	int error;
 
+	if (!strcmp(hid, ACPI_BUTTON_HID_LID) && dmi_check_system(lid_blacklst))
+		return -ENODEV;
+
 	button = kzalloc(sizeof(struct acpi_button), GFP_KERNEL);
 	if (!button)
 		return -ENOMEM;
diff --git a/drivers/acpi/device_pm.c b/drivers/acpi/device_pm.c
index a4c8ad9..c4d0a1c 100644
--- a/drivers/acpi/device_pm.c
+++ b/drivers/acpi/device_pm.c
@@ -990,7 +990,7 @@ void acpi_subsys_complete(struct device *dev)
 	 * the sleep state it is going out of and it has never been resumed till
 	 * now, resume it in case the firmware powered it up.
 	 */
-	if (dev->power.direct_complete && pm_resume_via_firmware())
+	if (pm_runtime_suspended(dev) && pm_resume_via_firmware())
 		pm_request_resume(dev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_complete);
@@ -1039,10 +1039,28 @@ EXPORT_SYMBOL_GPL(acpi_subsys_suspend_late);
  */
 int acpi_subsys_suspend_noirq(struct device *dev)
 {
-	if (dev_pm_smart_suspend_and_suspended(dev))
-		return 0;
+	int ret;
 
-	return pm_generic_suspend_noirq(dev);
+	if (dev_pm_smart_suspend_and_suspended(dev)) {
+		dev->power.may_skip_resume = true;
+		return 0;
+	}
+
+	ret = pm_generic_suspend_noirq(dev);
+	if (ret)
+		return ret;
+
+	/*
+	 * If the target system sleep state is suspend-to-idle, it is sufficient
+	 * to check whether or not the device's wakeup settings are good for
+	 * runtime PM.  Otherwise, the pm_resume_via_firmware() check will cause
+	 * acpi_subsys_complete() to take care of fixing up the device's state
+	 * anyway, if need be.
+	 */
+	dev->power.may_skip_resume = device_may_wakeup(dev) ||
+					!device_can_wakeup(dev);
+
+	return 0;
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_suspend_noirq);
 
@@ -1052,6 +1070,9 @@ EXPORT_SYMBOL_GPL(acpi_subsys_suspend_noirq);
  */
 int acpi_subsys_resume_noirq(struct device *dev)
 {
+	if (dev_pm_may_skip_resume(dev))
+		return 0;
+
 	/*
 	 * Devices with DPM_FLAG_SMART_SUSPEND may be left in runtime suspend
 	 * during system suspend, so update their runtime PM status to "active"
diff --git a/drivers/acpi/ec.c b/drivers/acpi/ec.c
index 0252c9b..d9f38c6 100644
--- a/drivers/acpi/ec.c
+++ b/drivers/acpi/ec.c
@@ -1516,7 +1516,7 @@ static int acpi_ec_setup(struct acpi_ec *ec, bool handle_events)
 	}
 
 	acpi_handle_info(ec->handle,
-			 "GPE=0x%lx, EC_CMD/EC_SC=0x%lx, EC_DATA=0x%lx\n",
+			 "GPE=0x%x, EC_CMD/EC_SC=0x%lx, EC_DATA=0x%lx\n",
 			 ec->gpe, ec->command_addr, ec->data_addr);
 	return ret;
 }
diff --git a/drivers/acpi/ec_sys.c b/drivers/acpi/ec_sys.c
index 6c7dd7a..dd70d6c 100644
--- a/drivers/acpi/ec_sys.c
+++ b/drivers/acpi/ec_sys.c
@@ -128,7 +128,7 @@ static int acpi_ec_add_debugfs(struct acpi_ec *ec, unsigned int ec_device_count)
 		return -ENOMEM;
 	}
 
-	if (!debugfs_create_x32("gpe", 0444, dev_dir, (u32 *)&first_ec->gpe))
+	if (!debugfs_create_x32("gpe", 0444, dev_dir, &first_ec->gpe))
 		goto error;
 	if (!debugfs_create_bool("use_global_lock", 0444, dev_dir,
 				 &first_ec->global_lock))
diff --git a/drivers/acpi/evged.c b/drivers/acpi/evged.c
index 46f0603..f13ba2c 100644
--- a/drivers/acpi/evged.c
+++ b/drivers/acpi/evged.c
@@ -49,6 +49,11 @@
 
 #define MODULE_NAME	"acpi-ged"
 
+struct acpi_ged_device {
+	struct device *dev;
+	struct list_head event_list;
+};
+
 struct acpi_ged_event {
 	struct list_head node;
 	struct device *dev;
@@ -76,7 +81,8 @@ static acpi_status acpi_ged_request_interrupt(struct acpi_resource *ares,
 	unsigned int irq;
 	unsigned int gsi;
 	unsigned int irqflags = IRQF_ONESHOT;
-	struct device *dev = context;
+	struct acpi_ged_device *geddev = context;
+	struct device *dev = geddev->dev;
 	acpi_handle handle = ACPI_HANDLE(dev);
 	acpi_handle evt_handle;
 	struct resource r;
@@ -102,8 +108,6 @@ static acpi_status acpi_ged_request_interrupt(struct acpi_resource *ares,
 		return AE_ERROR;
 	}
 
-	dev_info(dev, "GED listening GSI %u @ IRQ %u\n", gsi, irq);
-
 	event = devm_kzalloc(dev, sizeof(*event), GFP_KERNEL);
 	if (!event)
 		return AE_ERROR;
@@ -116,29 +120,58 @@ static acpi_status acpi_ged_request_interrupt(struct acpi_resource *ares,
 	if (r.flags & IORESOURCE_IRQ_SHAREABLE)
 		irqflags |= IRQF_SHARED;
 
-	if (devm_request_threaded_irq(dev, irq, NULL, acpi_ged_irq_handler,
-				      irqflags, "ACPI:Ged", event)) {
+	if (request_threaded_irq(irq, NULL, acpi_ged_irq_handler,
+				 irqflags, "ACPI:Ged", event)) {
 		dev_err(dev, "failed to setup event handler for irq %u\n", irq);
 		return AE_ERROR;
 	}
 
+	dev_dbg(dev, "GED listening GSI %u @ IRQ %u\n", gsi, irq);
+	list_add_tail(&event->node, &geddev->event_list);
 	return AE_OK;
 }
 
 static int ged_probe(struct platform_device *pdev)
 {
+	struct acpi_ged_device *geddev;
 	acpi_status acpi_ret;
 
+	geddev = devm_kzalloc(&pdev->dev, sizeof(*geddev), GFP_KERNEL);
+	if (!geddev)
+		return -ENOMEM;
+
+	geddev->dev = &pdev->dev;
+	INIT_LIST_HEAD(&geddev->event_list);
 	acpi_ret = acpi_walk_resources(ACPI_HANDLE(&pdev->dev), "_CRS",
-				       acpi_ged_request_interrupt, &pdev->dev);
+				       acpi_ged_request_interrupt, geddev);
 	if (ACPI_FAILURE(acpi_ret)) {
 		dev_err(&pdev->dev, "unable to parse the _CRS record\n");
 		return -EINVAL;
 	}
+	platform_set_drvdata(pdev, geddev);
 
 	return 0;
 }
 
+static void ged_shutdown(struct platform_device *pdev)
+{
+	struct acpi_ged_device *geddev = platform_get_drvdata(pdev);
+	struct acpi_ged_event *event, *next;
+
+	list_for_each_entry_safe(event, next, &geddev->event_list, node) {
+		free_irq(event->irq, event);
+		list_del(&event->node);
+		dev_dbg(geddev->dev, "GED releasing GSI %u @ IRQ %u\n",
+			 event->gsi, event->irq);
+	}
+}
+
+static int ged_remove(struct platform_device *pdev)
+{
+	ged_shutdown(pdev);
+	return 0;
+}
+
 static const struct acpi_device_id ged_acpi_ids[] = {
 	{"ACPI0013"},
 	{},
@@ -146,6 +179,8 @@ static const struct acpi_device_id ged_acpi_ids[] = {
 
 static struct platform_driver ged_driver = {
 	.probe = ged_probe,
+	.remove = ged_remove,
+	.shutdown = ged_shutdown,
 	.driver = {
 		.name = MODULE_NAME,
 		.acpi_match_table = ACPI_PTR(ged_acpi_ids),
diff --git a/drivers/acpi/internal.h b/drivers/acpi/internal.h
index 7f43423..1d0a501 100644
--- a/drivers/acpi/internal.h
+++ b/drivers/acpi/internal.h
@@ -159,7 +159,7 @@ static inline void acpi_early_processor_osc(void) {}
    -------------------------------------------------------------------------- */
 struct acpi_ec {
 	acpi_handle handle;
-	unsigned long gpe;
+	u32 gpe;
 	unsigned long command_addr;
 	unsigned long data_addr;
 	bool global_lock;
diff --git a/drivers/acpi/numa.c b/drivers/acpi/numa.c
index 917f1cc..8ccaae3 100644
--- a/drivers/acpi/numa.c
+++ b/drivers/acpi/numa.c
@@ -460,8 +460,7 @@ int __init acpi_numa_init(void)
 					srat_proc, ARRAY_SIZE(srat_proc), 0);
 
 		cnt = acpi_table_parse_srat(ACPI_SRAT_TYPE_MEMORY_AFFINITY,
-					    acpi_parse_memory_affinity,
-					    NR_NODE_MEMBLKS);
+					    acpi_parse_memory_affinity, 0);
 	}
 
 	/* SLIT: System Locality Information Table */
diff --git a/drivers/acpi/pci_link.c b/drivers/acpi/pci_link.c
index bc3d914..85ad679 100644
--- a/drivers/acpi/pci_link.c
+++ b/drivers/acpi/pci_link.c
@@ -612,7 +612,7 @@ static int acpi_pci_link_allocate(struct acpi_pci_link *link)
 			acpi_isa_irq_penalty[link->irq.active] +=
 				PIRQ_PENALTY_PCI_USING;
 
-		printk(KERN_WARNING PREFIX "%s [%s] enabled at IRQ %d\n",
+		pr_info("%s [%s] enabled at IRQ %d\n",
 		       acpi_device_name(link->device),
 		       acpi_device_bid(link->device), link->irq.active);
 	}
diff --git a/drivers/acpi/pmic/intel_pmic_bxtwc.c b/drivers/acpi/pmic/intel_pmic_bxtwc.c
index 90011aad..886ac8b 100644
--- a/drivers/acpi/pmic/intel_pmic_bxtwc.c
+++ b/drivers/acpi/pmic/intel_pmic_bxtwc.c
@@ -400,7 +400,7 @@ static int intel_bxtwc_pmic_opregion_probe(struct platform_device *pdev)
 			&intel_bxtwc_pmic_opregion_data);
 }
 
-static struct platform_device_id bxt_wc_opregion_id_table[] = {
+static const struct platform_device_id bxt_wc_opregion_id_table[] = {
 	{ .name = "bxt_wcove_region" },
 	{},
 };
@@ -412,9 +412,4 @@ static struct platform_driver intel_bxtwc_pmic_opregion_driver = {
 	},
 	.id_table = bxt_wc_opregion_id_table,
 };
-
-static int __init intel_bxtwc_pmic_opregion_driver_init(void)
-{
-	return platform_driver_register(&intel_bxtwc_pmic_opregion_driver);
-}
-device_initcall(intel_bxtwc_pmic_opregion_driver_init);
+builtin_platform_driver(intel_bxtwc_pmic_opregion_driver);
diff --git a/drivers/acpi/pmic/intel_pmic_chtdc_ti.c b/drivers/acpi/pmic/intel_pmic_chtdc_ti.c
index 109c1e9..f6d73a2 100644
--- a/drivers/acpi/pmic/intel_pmic_chtdc_ti.c
+++ b/drivers/acpi/pmic/intel_pmic_chtdc_ti.c
@@ -131,7 +131,4 @@ static struct platform_driver chtdc_ti_pmic_opregion_driver = {
 	},
 	.id_table = chtdc_ti_pmic_opregion_id_table,
 };
-module_platform_driver(chtdc_ti_pmic_opregion_driver);
-
-MODULE_DESCRIPTION("Dollar Cove TI PMIC opregion driver");
-MODULE_LICENSE("GPL v2");
+builtin_platform_driver(chtdc_ti_pmic_opregion_driver);
diff --git a/drivers/acpi/pmic/intel_pmic_chtwc.c b/drivers/acpi/pmic/intel_pmic_chtwc.c
index 85636d7..9912422 100644
--- a/drivers/acpi/pmic/intel_pmic_chtwc.c
+++ b/drivers/acpi/pmic/intel_pmic_chtwc.c
@@ -260,11 +260,10 @@ static int intel_cht_wc_pmic_opregion_probe(struct platform_device *pdev)
 			&intel_cht_wc_pmic_opregion_data);
 }
 
-static struct platform_device_id cht_wc_opregion_id_table[] = {
+static const struct platform_device_id cht_wc_opregion_id_table[] = {
 	{ .name = "cht_wcove_region" },
 	{},
 };
-MODULE_DEVICE_TABLE(platform, cht_wc_opregion_id_table);
 
 static struct platform_driver intel_cht_wc_pmic_opregion_driver = {
 	.probe = intel_cht_wc_pmic_opregion_probe,
@@ -273,8 +272,4 @@ static struct platform_driver intel_cht_wc_pmic_opregion_driver = {
 	},
 	.id_table = cht_wc_opregion_id_table,
 };
-module_platform_driver(intel_cht_wc_pmic_opregion_driver);
-
-MODULE_DESCRIPTION("Intel CHT Whiskey Cove PMIC operation region driver");
-MODULE_AUTHOR("Hans de Goede <hdegoede@redhat.com>");
-MODULE_LICENSE("GPL");
+builtin_platform_driver(intel_cht_wc_pmic_opregion_driver);
diff --git a/drivers/acpi/pmic/intel_pmic_crc.c b/drivers/acpi/pmic/intel_pmic_crc.c
index d7f1761..7ffa740 100644
--- a/drivers/acpi/pmic/intel_pmic_crc.c
+++ b/drivers/acpi/pmic/intel_pmic_crc.c
@@ -201,9 +201,4 @@ static struct platform_driver intel_crc_pmic_opregion_driver = {
 		.name = "crystal_cove_pmic",
 	},
 };
-
-static int __init intel_crc_pmic_opregion_driver_init(void)
-{
-	return platform_driver_register(&intel_crc_pmic_opregion_driver);
-}
-device_initcall(intel_crc_pmic_opregion_driver_init);
+builtin_platform_driver(intel_crc_pmic_opregion_driver);
diff --git a/drivers/acpi/pmic/intel_pmic_xpower.c b/drivers/acpi/pmic/intel_pmic_xpower.c
index 6c99d3f..316e551 100644
--- a/drivers/acpi/pmic/intel_pmic_xpower.c
+++ b/drivers/acpi/pmic/intel_pmic_xpower.c
@@ -278,9 +278,4 @@ static struct platform_driver intel_xpower_pmic_opregion_driver = {
 		.name = "axp288_pmic_acpi",
 	},
 };
-
-static int __init intel_xpower_pmic_opregion_driver_init(void)
-{
-	return platform_driver_register(&intel_xpower_pmic_opregion_driver);
-}
-device_initcall(intel_xpower_pmic_opregion_driver_init);
+builtin_platform_driver(intel_xpower_pmic_opregion_driver);
diff --git a/drivers/acpi/processor_idle.c b/drivers/acpi/processor_idle.c
index d50a7b6..5f0071c 100644
--- a/drivers/acpi/processor_idle.c
+++ b/drivers/acpi/processor_idle.c
@@ -207,6 +207,7 @@ static void tsc_check_state(int state)
 	switch (boot_cpu_data.x86_vendor) {
 	case X86_VENDOR_AMD:
 	case X86_VENDOR_INTEL:
+	case X86_VENDOR_CENTAUR:
 		/*
 		 * AMD Fam10h TSC will tick in all
 		 * C/P/S0/S1 states when this bit is set.
diff --git a/drivers/acpi/sleep.c b/drivers/acpi/sleep.c
index 8082871..46cde091 100644
--- a/drivers/acpi/sleep.c
+++ b/drivers/acpi/sleep.c
@@ -367,10 +367,20 @@ static const struct dmi_system_id acpisleep_dmi_table[] __initconst = {
 	{},
 };
 
+static bool ignore_blacklist;
+
+void __init acpi_sleep_no_blacklist(void)
+{
+	ignore_blacklist = true;
+}
+
 static void __init acpi_sleep_dmi_check(void)
 {
 	int year;
 
+	if (ignore_blacklist)
+		return;
+
 	if (dmi_get_date(DMI_BIOS_DATE, &year, NULL, NULL) && year >= 2012)
 		acpi_nvs_nosave_s3();
 
@@ -697,7 +707,8 @@ static const struct acpi_device_id lps0_device_ids[] = {
 #define ACPI_LPS0_ENTRY		5
 #define ACPI_LPS0_EXIT		6
 
-#define ACPI_S2IDLE_FUNC_MASK	((1 << ACPI_LPS0_ENTRY) | (1 << ACPI_LPS0_EXIT))
+#define ACPI_LPS0_SCREEN_MASK	((1 << ACPI_LPS0_SCREEN_OFF) | (1 << ACPI_LPS0_SCREEN_ON))
+#define ACPI_LPS0_PLATFORM_MASK	((1 << ACPI_LPS0_ENTRY) | (1 << ACPI_LPS0_EXIT))
 
 static acpi_handle lps0_device_handle;
 static guid_t lps0_dsm_guid;
@@ -900,7 +911,8 @@ static int lps0_device_attach(struct acpi_device *adev,
 	if (out_obj && out_obj->type == ACPI_TYPE_BUFFER) {
 		char bitmask = *(char *)out_obj->buffer.pointer;
 
-		if ((bitmask & ACPI_S2IDLE_FUNC_MASK) == ACPI_S2IDLE_FUNC_MASK) {
+		if ((bitmask & ACPI_LPS0_PLATFORM_MASK) == ACPI_LPS0_PLATFORM_MASK ||
+		    (bitmask & ACPI_LPS0_SCREEN_MASK) == ACPI_LPS0_SCREEN_MASK) {
 			lps0_dsm_func_mask = bitmask;
 			lps0_device_handle = adev->handle;
 			/*
diff --git a/drivers/acpi/sysfs.c b/drivers/acpi/sysfs.c
index 06a150b..4fc59c3 100644
--- a/drivers/acpi/sysfs.c
+++ b/drivers/acpi/sysfs.c
@@ -816,14 +816,8 @@ static ssize_t counter_set(struct kobject *kobj,
  * interface:
  *   echo unmask > /sys/firmware/acpi/interrupts/gpe00
  */
-
-/*
- * Currently, the GPE flooding prevention only supports to mask the GPEs
- * numbered from 00 to 7f.
- */
-#define ACPI_MASKABLE_GPE_MAX	0x80
-
-static u64 __initdata acpi_masked_gpes;
+#define ACPI_MASKABLE_GPE_MAX	0xFF
+static DECLARE_BITMAP(acpi_masked_gpes_map, ACPI_MASKABLE_GPE_MAX) __initdata;
 
 static int __init acpi_gpe_set_masked_gpes(char *val)
 {
@@ -831,7 +825,7 @@ static int __init acpi_gpe_set_masked_gpes(char *val)
 
 	if (kstrtou8(val, 0, &gpe) || gpe > ACPI_MASKABLE_GPE_MAX)
 		return -EINVAL;
-	acpi_masked_gpes |= ((u64)1<<gpe);
+	set_bit(gpe, acpi_masked_gpes_map);
 
 	return 1;
 }
@@ -843,15 +837,11 @@ void __init acpi_gpe_apply_masked_gpes(void)
 	acpi_status status;
 	u8 gpe;
 
-	for (gpe = 0;
-	     gpe < min_t(u8, ACPI_MASKABLE_GPE_MAX, acpi_current_gpe_count);
-	     gpe++) {
-		if (acpi_masked_gpes & ((u64)1<<gpe)) {
-			status = acpi_get_gpe_device(gpe, &handle);
-			if (ACPI_SUCCESS(status)) {
-				pr_info("Masking GPE 0x%x.\n", gpe);
-				(void)acpi_mask_gpe(handle, gpe, TRUE);
-			}
+	for_each_set_bit(gpe, acpi_masked_gpes_map, ACPI_MASKABLE_GPE_MAX) {
+		status = acpi_get_gpe_device(gpe, &handle);
+		if (ACPI_SUCCESS(status)) {
+			pr_info("Masking GPE 0x%x.\n", gpe);
+			(void)acpi_mask_gpe(handle, gpe, TRUE);
 		}
 	}
 }
diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c
index 9d49a1a..78db976 100644
--- a/drivers/acpi/utils.c
+++ b/drivers/acpi/utils.c
@@ -737,16 +737,17 @@ bool acpi_dev_found(const char *hid)
 }
 EXPORT_SYMBOL(acpi_dev_found);
 
-struct acpi_dev_present_info {
+struct acpi_dev_match_info {
+	const char *dev_name;
 	struct acpi_device_id hid[2];
 	const char *uid;
 	s64 hrv;
 };
 
-static int acpi_dev_present_cb(struct device *dev, void *data)
+static int acpi_dev_match_cb(struct device *dev, void *data)
 {
 	struct acpi_device *adev = to_acpi_device(dev);
-	struct acpi_dev_present_info *match = data;
+	struct acpi_dev_match_info *match = data;
 	unsigned long long hrv;
 	acpi_status status;
 
@@ -757,6 +758,8 @@ static int acpi_dev_present_cb(struct device *dev, void *data)
 	    strcmp(adev->pnp.unique_id, match->uid)))
 		return 0;
 
+	match->dev_name = acpi_dev_name(adev);
+
 	if (match->hrv == -1)
 		return 1;
 
@@ -789,20 +792,44 @@ static int acpi_dev_present_cb(struct device *dev, void *data)
  */
 bool acpi_dev_present(const char *hid, const char *uid, s64 hrv)
 {
-	struct acpi_dev_present_info match = {};
+	struct acpi_dev_match_info match = {};
 	struct device *dev;
 
 	strlcpy(match.hid[0].id, hid, sizeof(match.hid[0].id));
 	match.uid = uid;
 	match.hrv = hrv;
 
-	dev = bus_find_device(&acpi_bus_type, NULL, &match,
-			      acpi_dev_present_cb);
-
+	dev = bus_find_device(&acpi_bus_type, NULL, &match, acpi_dev_match_cb);
 	return !!dev;
 }
 EXPORT_SYMBOL(acpi_dev_present);
 
+/**
+ * acpi_dev_get_first_match_name - Return name of first match of ACPI device
+ * @hid: Hardware ID of the device.
+ * @uid: Unique ID of the device, pass NULL to not check _UID
+ * @hrv: Hardware Revision of the device, pass -1 to not check _HRV
+ *
+ * Return device name if a matching device was present
+ * at the moment of invocation, or NULL otherwise.
+ *
+ * See additional information in acpi_dev_present() as well.
+ */
+const char *
+acpi_dev_get_first_match_name(const char *hid, const char *uid, s64 hrv)
+{
+	struct acpi_dev_match_info match = {};
+	struct device *dev;
+
+	strlcpy(match.hid[0].id, hid, sizeof(match.hid[0].id));
+	match.uid = uid;
+	match.hrv = hrv;
+
+	dev = bus_find_device(&acpi_bus_type, NULL, &match, acpi_dev_match_cb);
+	return dev ? match.dev_name : NULL;
+}
+EXPORT_SYMBOL(acpi_dev_get_first_match_name);
+
 /*
  * acpi_backlight= handling, this is done here rather then in video_detect.c
  * because __setup cannot be used in modules.
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 8193b38..3c09122 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4449,6 +4449,7 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
 	 * https://bugzilla.kernel.org/show_bug.cgi?id=121671
 	 */
 	{ "LITEON CX1-JB*-HP",	NULL,		ATA_HORKAGE_MAX_SEC_1024 },
+	{ "LITEON EP1-*",	NULL,		ATA_HORKAGE_MAX_SEC_1024 },
 
 	/* Devices we expect to fail diagnostics */
 
diff --git a/drivers/base/Kconfig b/drivers/base/Kconfig
index bdc8790..2415ad9 100644
--- a/drivers/base/Kconfig
+++ b/drivers/base/Kconfig
@@ -236,6 +236,9 @@
 config GENERIC_CPU_AUTOPROBE
 	bool
 
+config GENERIC_CPU_VULNERABILITIES
+	bool
+
 config SOC_BUS
 	bool
 	select GLOB
diff --git a/drivers/base/cpu.c b/drivers/base/cpu.c
index 58a9b60..d990384 100644
--- a/drivers/base/cpu.c
+++ b/drivers/base/cpu.c
@@ -511,10 +511,58 @@ static void __init cpu_dev_register_generic(void)
 #endif
 }
 
+#ifdef CONFIG_GENERIC_CPU_VULNERABILITIES
+
+ssize_t __weak cpu_show_meltdown(struct device *dev,
+				 struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "Not affected\n");
+}
+
+ssize_t __weak cpu_show_spectre_v1(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "Not affected\n");
+}
+
+ssize_t __weak cpu_show_spectre_v2(struct device *dev,
+				   struct device_attribute *attr, char *buf)
+{
+	return sprintf(buf, "Not affected\n");
+}
+
+static DEVICE_ATTR(meltdown, 0444, cpu_show_meltdown, NULL);
+static DEVICE_ATTR(spectre_v1, 0444, cpu_show_spectre_v1, NULL);
+static DEVICE_ATTR(spectre_v2, 0444, cpu_show_spectre_v2, NULL);
+
+static struct attribute *cpu_root_vulnerabilities_attrs[] = {
+	&dev_attr_meltdown.attr,
+	&dev_attr_spectre_v1.attr,
+	&dev_attr_spectre_v2.attr,
+	NULL
+};
+
+static const struct attribute_group cpu_root_vulnerabilities_group = {
+	.name  = "vulnerabilities",
+	.attrs = cpu_root_vulnerabilities_attrs,
+};
+
+static void __init cpu_register_vulnerabilities(void)
+{
+	if (sysfs_create_group(&cpu_subsys.dev_root->kobj,
+			       &cpu_root_vulnerabilities_group))
+		pr_err("Unable to register CPU vulnerabilities\n");
+}
+
+#else
+static inline void cpu_register_vulnerabilities(void) { }
+#endif
+
 void __init cpu_dev_init(void)
 {
 	if (subsys_system_register(&cpu_subsys, cpu_root_attr_groups))
 		panic("Failed to register CPU subsystem");
 
 	cpu_dev_register_generic();
+	cpu_register_vulnerabilities();
 }
diff --git a/drivers/base/power/domain.c b/drivers/base/power/domain.c
index 0c80bea..528b241 100644
--- a/drivers/base/power/domain.c
+++ b/drivers/base/power/domain.c
@@ -1032,15 +1032,12 @@ static int genpd_prepare(struct device *dev)
 static int genpd_finish_suspend(struct device *dev, bool poweroff)
 {
 	struct generic_pm_domain *genpd;
-	int ret;
+	int ret = 0;
 
 	genpd = dev_to_genpd(dev);
 	if (IS_ERR(genpd))
 		return -EINVAL;
 
-	if (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))
-		return 0;
-
 	if (poweroff)
 		ret = pm_generic_poweroff_noirq(dev);
 	else
@@ -1048,10 +1045,19 @@ static int genpd_finish_suspend(struct device *dev, bool poweroff)
 	if (ret)
 		return ret;
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start) {
-		ret = pm_runtime_force_suspend(dev);
-		if (ret)
+	if (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))
+		return 0;
+
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev)) {
+		ret = genpd_stop_dev(genpd, dev);
+		if (ret) {
+			if (poweroff)
+				pm_generic_restore_noirq(dev);
+			else
+				pm_generic_resume_noirq(dev);
 			return ret;
+		}
 	}
 
 	genpd_lock(genpd);
@@ -1085,7 +1091,7 @@ static int genpd_suspend_noirq(struct device *dev)
 static int genpd_resume_noirq(struct device *dev)
 {
 	struct generic_pm_domain *genpd;
-	int ret = 0;
+	int ret;
 
 	dev_dbg(dev, "%s()\n", __func__);
 
@@ -1094,21 +1100,21 @@ static int genpd_resume_noirq(struct device *dev)
 		return -EINVAL;
 
 	if (dev->power.wakeup_path && genpd_is_active_wakeup(genpd))
-		return 0;
+		return pm_generic_resume_noirq(dev);
 
 	genpd_lock(genpd);
 	genpd_sync_power_on(genpd, true, 0);
 	genpd->suspended_count--;
 	genpd_unlock(genpd);
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start)
-		ret = pm_runtime_force_resume(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev)) {
+		ret = genpd_start_dev(genpd, dev);
+		if (ret)
+			return ret;
+	}
 
-	ret = pm_generic_resume_noirq(dev);
-	if (ret)
-		return ret;
-
-	return ret;
+	return pm_generic_resume_noirq(dev);
 }
 
 /**
@@ -1135,8 +1141,9 @@ static int genpd_freeze_noirq(struct device *dev)
 	if (ret)
 		return ret;
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start)
-		ret = pm_runtime_force_suspend(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev))
+		ret = genpd_stop_dev(genpd, dev);
 
 	return ret;
 }
@@ -1159,8 +1166,9 @@ static int genpd_thaw_noirq(struct device *dev)
 	if (IS_ERR(genpd))
 		return -EINVAL;
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start) {
-		ret = pm_runtime_force_resume(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev)) {
+		ret = genpd_start_dev(genpd, dev);
 		if (ret)
 			return ret;
 	}
@@ -1217,8 +1225,9 @@ static int genpd_restore_noirq(struct device *dev)
 	genpd_sync_power_on(genpd, true, 0);
 	genpd_unlock(genpd);
 
-	if (genpd->dev_ops.stop && genpd->dev_ops.start) {
-		ret = pm_runtime_force_resume(dev);
+	if (genpd->dev_ops.stop && genpd->dev_ops.start &&
+	    !pm_runtime_status_suspended(dev)) {
+		ret = genpd_start_dev(genpd, dev);
 		if (ret)
 			return ret;
 	}
@@ -2199,20 +2208,8 @@ int genpd_dev_pm_attach(struct device *dev)
 
 	ret = of_parse_phandle_with_args(dev->of_node, "power-domains",
 					"#power-domain-cells", 0, &pd_args);
-	if (ret < 0) {
-		if (ret != -ENOENT)
-			return ret;
-
-		/*
-		 * Try legacy Samsung-specific bindings
-		 * (for backwards compatibility of DT ABI)
-		 */
-		pd_args.args_count = 0;
-		pd_args.np = of_parse_phandle(dev->of_node,
-						"samsung,power-domain", 0);
-		if (!pd_args.np)
-			return -ENOENT;
-	}
+	if (ret < 0)
+		return ret;
 
 	mutex_lock(&gpd_list_lock);
 	pd = genpd_get_from_provider(&pd_args);
diff --git a/drivers/base/power/main.c b/drivers/base/power/main.c
index 08744b5..02a497e 100644
--- a/drivers/base/power/main.c
+++ b/drivers/base/power/main.c
@@ -18,7 +18,6 @@
  */
 
 #include <linux/device.h>
-#include <linux/kallsyms.h>
 #include <linux/export.h>
 #include <linux/mutex.h>
 #include <linux/pm.h>
@@ -541,30 +540,41 @@ void dev_pm_skip_next_resume_phases(struct device *dev)
 }
 
 /**
- * device_resume_noirq - Execute a "noirq resume" callback for given device.
- * @dev: Device to handle.
- * @state: PM transition of the system being carried out.
- * @async: If true, the device is being resumed asynchronously.
- *
- * The driver of @dev will not receive interrupts while this function is being
- * executed.
+ * suspend_event - Return a "suspend" message for given "resume" one.
+ * @resume_msg: PM message representing a system-wide resume transition.
  */
-static int device_resume_noirq(struct device *dev, pm_message_t state, bool async)
+static pm_message_t suspend_event(pm_message_t resume_msg)
 {
-	pm_callback_t callback = NULL;
-	const char *info = NULL;
-	int error = 0;
+	switch (resume_msg.event) {
+	case PM_EVENT_RESUME:
+		return PMSG_SUSPEND;
+	case PM_EVENT_THAW:
+	case PM_EVENT_RESTORE:
+		return PMSG_FREEZE;
+	case PM_EVENT_RECOVER:
+		return PMSG_HIBERNATE;
+	}
+	return PMSG_ON;
+}
 
-	TRACE_DEVICE(dev);
-	TRACE_RESUME(0);
+/**
+ * dev_pm_may_skip_resume - System-wide device resume optimization check.
+ * @dev: Target device.
+ *
+ * Checks whether or not the device may be left in suspend after a system-wide
+ * transition to the working state.
+ */
+bool dev_pm_may_skip_resume(struct device *dev)
+{
+	return !dev->power.must_resume && pm_transition.event != PM_EVENT_RESTORE;
+}
 
-	if (dev->power.syscore || dev->power.direct_complete)
-		goto Out;
-
-	if (!dev->power.is_noirq_suspended)
-		goto Out;
-
-	dpm_wait_for_superior(dev, async);
+static pm_callback_t dpm_subsys_resume_noirq_cb(struct device *dev,
+						pm_message_t state,
+						const char **info_p)
+{
+	pm_callback_t callback;
+	const char *info;
 
 	if (dev->pm_domain) {
 		info = "noirq power domain ";
@@ -578,17 +588,106 @@ static int device_resume_noirq(struct device *dev, pm_message_t state, bool asyn
 	} else if (dev->bus && dev->bus->pm) {
 		info = "noirq bus ";
 		callback = pm_noirq_op(dev->bus->pm, state);
+	} else {
+		return NULL;
 	}
 
-	if (!callback && dev->driver && dev->driver->pm) {
+	if (info_p)
+		*info_p = info;
+
+	return callback;
+}
+
+static pm_callback_t dpm_subsys_suspend_noirq_cb(struct device *dev,
+						 pm_message_t state,
+						 const char **info_p);
+
+static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev,
+						pm_message_t state,
+						const char **info_p);
+
+/**
+ * device_resume_noirq - Execute a "noirq resume" callback for given device.
+ * @dev: Device to handle.
+ * @state: PM transition of the system being carried out.
+ * @async: If true, the device is being resumed asynchronously.
+ *
+ * The driver of @dev will not receive interrupts while this function is being
+ * executed.
+ */
+static int device_resume_noirq(struct device *dev, pm_message_t state, bool async)
+{
+	pm_callback_t callback;
+	const char *info;
+	bool skip_resume;
+	int error = 0;
+
+	TRACE_DEVICE(dev);
+	TRACE_RESUME(0);
+
+	if (dev->power.syscore || dev->power.direct_complete)
+		goto Out;
+
+	if (!dev->power.is_noirq_suspended)
+		goto Out;
+
+	dpm_wait_for_superior(dev, async);
+
+	skip_resume = dev_pm_may_skip_resume(dev);
+
+	callback = dpm_subsys_resume_noirq_cb(dev, state, &info);
+	if (callback)
+		goto Run;
+
+	if (skip_resume)
+		goto Skip;
+
+	if (dev_pm_smart_suspend_and_suspended(dev)) {
+		pm_message_t suspend_msg = suspend_event(state);
+
+		/*
+		 * If "freeze" callbacks have been skipped during a transition
+		 * related to hibernation, the subsequent "thaw" callbacks must
+		 * be skipped too or bad things may happen.  Otherwise, resume
+		 * callbacks are going to be run for the device, so its runtime
+		 * PM status must be changed to reflect the new state after the
+		 * transition under way.
+		 */
+		if (!dpm_subsys_suspend_late_cb(dev, suspend_msg, NULL) &&
+		    !dpm_subsys_suspend_noirq_cb(dev, suspend_msg, NULL)) {
+			if (state.event == PM_EVENT_THAW) {
+				skip_resume = true;
+				goto Skip;
+			} else {
+				pm_runtime_set_active(dev);
+			}
+		}
+	}
+
+	if (dev->driver && dev->driver->pm) {
 		info = "noirq driver ";
 		callback = pm_noirq_op(dev->driver->pm, state);
 	}
 
+Run:
 	error = dpm_run_callback(callback, dev, state, info);
+
+Skip:
 	dev->power.is_noirq_suspended = false;
 
- Out:
+	if (skip_resume) {
+		/*
+		 * The device is going to be left in suspend, but it might not
+		 * have been in runtime suspend before the system suspended, so
+		 * its runtime PM status needs to be updated to avoid confusing
+		 * the runtime PM framework when runtime PM is enabled for the
+		 * device again.
+		 */
+		pm_runtime_set_suspended(dev);
+		dev_pm_skip_next_resume_phases(dev);
+	}
+
+Out:
 	complete_all(&dev->power.completion);
 	TRACE_RESUME(error);
 	return error;
@@ -681,30 +780,12 @@ void dpm_resume_noirq(pm_message_t state)
 	dpm_noirq_end();
 }
 
-/**
- * device_resume_early - Execute an "early resume" callback for given device.
- * @dev: Device to handle.
- * @state: PM transition of the system being carried out.
- * @async: If true, the device is being resumed asynchronously.
- *
- * Runtime PM is disabled for @dev while this function is being executed.
- */
-static int device_resume_early(struct device *dev, pm_message_t state, bool async)
+static pm_callback_t dpm_subsys_resume_early_cb(struct device *dev,
+						pm_message_t state,
+						const char **info_p)
 {
-	pm_callback_t callback = NULL;
-	const char *info = NULL;
-	int error = 0;
-
-	TRACE_DEVICE(dev);
-	TRACE_RESUME(0);
-
-	if (dev->power.syscore || dev->power.direct_complete)
-		goto Out;
-
-	if (!dev->power.is_late_suspended)
-		goto Out;
-
-	dpm_wait_for_superior(dev, async);
+	pm_callback_t callback;
+	const char *info;
 
 	if (dev->pm_domain) {
 		info = "early power domain ";
@@ -718,8 +799,43 @@ static int device_resume_early(struct device *dev, pm_message_t state, bool asyn
 	} else if (dev->bus && dev->bus->pm) {
 		info = "early bus ";
 		callback = pm_late_early_op(dev->bus->pm, state);
+	} else {
+		return NULL;
 	}
 
+	if (info_p)
+		*info_p = info;
+
+	return callback;
+}
+
+/**
+ * device_resume_early - Execute an "early resume" callback for given device.
+ * @dev: Device to handle.
+ * @state: PM transition of the system being carried out.
+ * @async: If true, the device is being resumed asynchronously.
+ *
+ * Runtime PM is disabled for @dev while this function is being executed.
+ */
+static int device_resume_early(struct device *dev, pm_message_t state, bool async)
+{
+	pm_callback_t callback;
+	const char *info;
+	int error = 0;
+
+	TRACE_DEVICE(dev);
+	TRACE_RESUME(0);
+
+	if (dev->power.syscore || dev->power.direct_complete)
+		goto Out;
+
+	if (!dev->power.is_late_suspended)
+		goto Out;
+
+	dpm_wait_for_superior(dev, async);
+
+	callback = dpm_subsys_resume_early_cb(dev, state, &info);
+
 	if (!callback && dev->driver && dev->driver->pm) {
 		info = "early driver ";
 		callback = pm_late_early_op(dev->driver->pm, state);
@@ -1089,6 +1205,77 @@ static pm_message_t resume_event(pm_message_t sleep_state)
 	return PMSG_ON;
 }
 
+static void dpm_superior_set_must_resume(struct device *dev)
+{
+	struct device_link *link;
+	int idx;
+
+	if (dev->parent)
+		dev->parent->power.must_resume = true;
+
+	idx = device_links_read_lock();
+
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+		link->supplier->power.must_resume = true;
+
+	device_links_read_unlock(idx);
+}
+
+static pm_callback_t dpm_subsys_suspend_noirq_cb(struct device *dev,
+						 pm_message_t state,
+						 const char **info_p)
+{
+	pm_callback_t callback;
+	const char *info;
+
+	if (dev->pm_domain) {
+		info = "noirq power domain ";
+		callback = pm_noirq_op(&dev->pm_domain->ops, state);
+	} else if (dev->type && dev->type->pm) {
+		info = "noirq type ";
+		callback = pm_noirq_op(dev->type->pm, state);
+	} else if (dev->class && dev->class->pm) {
+		info = "noirq class ";
+		callback = pm_noirq_op(dev->class->pm, state);
+	} else if (dev->bus && dev->bus->pm) {
+		info = "noirq bus ";
+		callback = pm_noirq_op(dev->bus->pm, state);
+	} else {
+		return NULL;
+	}
+
+	if (info_p)
+		*info_p = info;
+
+	return callback;
+}
+
+static bool device_must_resume(struct device *dev, pm_message_t state,
+			       bool no_subsys_suspend_noirq)
+{
+	pm_message_t resume_msg = resume_event(state);
+
+	/*
+	 * If all of the device driver's "noirq", "late" and "early" callbacks
+	 * are invoked directly by the core, the decision to allow the device to
+	 * stay in suspend can be based on its current runtime PM status and its
+	 * wakeup settings.
+	 */
+	if (no_subsys_suspend_noirq &&
+	    !dpm_subsys_suspend_late_cb(dev, state, NULL) &&
+	    !dpm_subsys_resume_early_cb(dev, resume_msg, NULL) &&
+	    !dpm_subsys_resume_noirq_cb(dev, resume_msg, NULL))
+		return !pm_runtime_status_suspended(dev) &&
+			(resume_msg.event != PM_EVENT_RESUME ||
+			 (device_can_wakeup(dev) && !device_may_wakeup(dev)));
+
+	/*
+	 * The only safe strategy here is to require that if the device may not
+	 * be left in suspend, resume callbacks must be invoked for it.
+	 */
+	return !dev->power.may_skip_resume;
+}
+
 /**
  * __device_suspend_noirq - Execute a "noirq suspend" callback for given device.
  * @dev: Device to handle.
@@ -1100,8 +1287,9 @@ static pm_message_t resume_event(pm_message_t sleep_state)
  */
 static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool async)
 {
-	pm_callback_t callback = NULL;
-	const char *info = NULL;
+	pm_callback_t callback;
+	const char *info;
+	bool no_subsys_cb = false;
 	int error = 0;
 
 	TRACE_DEVICE(dev);
@@ -1120,30 +1308,40 @@ static int __device_suspend_noirq(struct device *dev, pm_message_t state, bool a
 	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
-	if (dev->pm_domain) {
-		info = "noirq power domain ";
-		callback = pm_noirq_op(&dev->pm_domain->ops, state);
-	} else if (dev->type && dev->type->pm) {
-		info = "noirq type ";
-		callback = pm_noirq_op(dev->type->pm, state);
-	} else if (dev->class && dev->class->pm) {
-		info = "noirq class ";
-		callback = pm_noirq_op(dev->class->pm, state);
-	} else if (dev->bus && dev->bus->pm) {
-		info = "noirq bus ";
-		callback = pm_noirq_op(dev->bus->pm, state);
-	}
+	callback = dpm_subsys_suspend_noirq_cb(dev, state, &info);
+	if (callback)
+		goto Run;
 
-	if (!callback && dev->driver && dev->driver->pm) {
+	no_subsys_cb = !dpm_subsys_suspend_late_cb(dev, state, NULL);
+
+	if (dev_pm_smart_suspend_and_suspended(dev) && no_subsys_cb)
+		goto Skip;
+
+	if (dev->driver && dev->driver->pm) {
 		info = "noirq driver ";
 		callback = pm_noirq_op(dev->driver->pm, state);
 	}
 
+Run:
 	error = dpm_run_callback(callback, dev, state, info);
-	if (!error)
-		dev->power.is_noirq_suspended = true;
-	else
+	if (error) {
 		async_error = error;
+		goto Complete;
+	}
+
+Skip:
+	dev->power.is_noirq_suspended = true;
+
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED)) {
+		dev->power.must_resume = dev->power.must_resume ||
+				atomic_read(&dev->power.usage_count) > 1 ||
+				device_must_resume(dev, state, no_subsys_cb);
+	} else {
+		dev->power.must_resume = true;
+	}
+
+	if (dev->power.must_resume)
+		dpm_superior_set_must_resume(dev);
 
 Complete:
 	complete_all(&dev->power.completion);
@@ -1249,6 +1447,50 @@ int dpm_suspend_noirq(pm_message_t state)
 	return ret;
 }
 
+static void dpm_propagate_wakeup_to_parent(struct device *dev)
+{
+	struct device *parent = dev->parent;
+
+	if (!parent)
+		return;
+
+	spin_lock_irq(&parent->power.lock);
+
+	if (dev->power.wakeup_path && !parent->power.ignore_children)
+		parent->power.wakeup_path = true;
+
+	spin_unlock_irq(&parent->power.lock);
+}
+
+static pm_callback_t dpm_subsys_suspend_late_cb(struct device *dev,
+						pm_message_t state,
+						const char **info_p)
+{
+	pm_callback_t callback;
+	const char *info;
+
+	if (dev->pm_domain) {
+		info = "late power domain ";
+		callback = pm_late_early_op(&dev->pm_domain->ops, state);
+	} else if (dev->type && dev->type->pm) {
+		info = "late type ";
+		callback = pm_late_early_op(dev->type->pm, state);
+	} else if (dev->class && dev->class->pm) {
+		info = "late class ";
+		callback = pm_late_early_op(dev->class->pm, state);
+	} else if (dev->bus && dev->bus->pm) {
+		info = "late bus ";
+		callback = pm_late_early_op(dev->bus->pm, state);
+	} else {
+		return NULL;
+	}
+
+	if (info_p)
+		*info_p = info;
+
+	return callback;
+}
+
 /**
  * __device_suspend_late - Execute a "late suspend" callback for given device.
  * @dev: Device to handle.
@@ -1259,8 +1501,8 @@ int dpm_suspend_noirq(pm_message_t state)
  */
 static int __device_suspend_late(struct device *dev, pm_message_t state, bool async)
 {
-	pm_callback_t callback = NULL;
-	const char *info = NULL;
+	pm_callback_t callback;
+	const char *info;
 	int error = 0;
 
 	TRACE_DEVICE(dev);
@@ -1281,30 +1523,29 @@ static int __device_suspend_late(struct device *dev, pm_message_t state, bool as
 	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
-	if (dev->pm_domain) {
-		info = "late power domain ";
-		callback = pm_late_early_op(&dev->pm_domain->ops, state);
-	} else if (dev->type && dev->type->pm) {
-		info = "late type ";
-		callback = pm_late_early_op(dev->type->pm, state);
-	} else if (dev->class && dev->class->pm) {
-		info = "late class ";
-		callback = pm_late_early_op(dev->class->pm, state);
-	} else if (dev->bus && dev->bus->pm) {
-		info = "late bus ";
-		callback = pm_late_early_op(dev->bus->pm, state);
-	}
+	callback = dpm_subsys_suspend_late_cb(dev, state, &info);
+	if (callback)
+		goto Run;
 
-	if (!callback && dev->driver && dev->driver->pm) {
+	if (dev_pm_smart_suspend_and_suspended(dev) &&
+	    !dpm_subsys_suspend_noirq_cb(dev, state, NULL))
+		goto Skip;
+
+	if (dev->driver && dev->driver->pm) {
 		info = "late driver ";
 		callback = pm_late_early_op(dev->driver->pm, state);
 	}
 
+Run:
 	error = dpm_run_callback(callback, dev, state, info);
-	if (!error)
-		dev->power.is_late_suspended = true;
-	else
+	if (error) {
 		async_error = error;
+		goto Complete;
+	}
+	dpm_propagate_wakeup_to_parent(dev);
+
+Skip:
+	dev->power.is_late_suspended = true;
 
 Complete:
 	TRACE_SUSPEND(error);
@@ -1435,11 +1676,17 @@ static int legacy_suspend(struct device *dev, pm_message_t state,
 	return error;
 }
 
-static void dpm_clear_suppliers_direct_complete(struct device *dev)
+static void dpm_clear_superiors_direct_complete(struct device *dev)
 {
 	struct device_link *link;
 	int idx;
 
+	if (dev->parent) {
+		spin_lock_irq(&dev->parent->power.lock);
+		dev->parent->power.direct_complete = false;
+		spin_unlock_irq(&dev->parent->power.lock);
+	}
+
 	idx = device_links_read_lock();
 
 	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node) {
@@ -1500,6 +1747,9 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
 		dev->power.direct_complete = false;
 	}
 
+	dev->power.may_skip_resume = false;
+	dev->power.must_resume = false;
+
 	dpm_watchdog_set(&wd, dev);
 	device_lock(dev);
 
@@ -1543,20 +1793,12 @@ static int __device_suspend(struct device *dev, pm_message_t state, bool async)
 
  End:
 	if (!error) {
-		struct device *parent = dev->parent;
-
 		dev->power.is_suspended = true;
-		if (parent) {
-			spin_lock_irq(&parent->power.lock);
+		if (device_may_wakeup(dev))
+			dev->power.wakeup_path = true;
 
-			dev->parent->power.direct_complete = false;
-			if (dev->power.wakeup_path
-			    && !dev->parent->power.ignore_children)
-				dev->parent->power.wakeup_path = true;
-
-			spin_unlock_irq(&parent->power.lock);
-		}
-		dpm_clear_suppliers_direct_complete(dev);
+		dpm_propagate_wakeup_to_parent(dev);
+		dpm_clear_superiors_direct_complete(dev);
 	}
 
 	device_unlock(dev);
@@ -1665,8 +1907,9 @@ static int device_prepare(struct device *dev, pm_message_t state)
 	if (dev->power.syscore)
 		return 0;
 
-	WARN_ON(dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
-		!pm_runtime_enabled(dev));
+	WARN_ON(!pm_runtime_enabled(dev) &&
+		dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND |
+					      DPM_FLAG_LEAVE_SUSPENDED));
 
 	/*
 	 * If a device's parent goes into runtime suspend at the wrong time,
@@ -1678,7 +1921,7 @@ static int device_prepare(struct device *dev, pm_message_t state)
 
 	device_lock(dev);
 
-	dev->power.wakeup_path = device_may_wakeup(dev);
+	dev->power.wakeup_path = false;
 
 	if (dev->power.no_pm_callbacks) {
 		ret = 1;	/* Let device go direct_complete */
diff --git a/drivers/base/power/power.h b/drivers/base/power/power.h
index 7beee75..21244c5 100644
--- a/drivers/base/power/power.h
+++ b/drivers/base/power/power.h
@@ -41,20 +41,15 @@ extern void dev_pm_disable_wake_irq_check(struct device *dev);
 
 #ifdef CONFIG_PM_SLEEP
 
-extern int device_wakeup_attach_irq(struct device *dev,
-				    struct wake_irq *wakeirq);
+extern void device_wakeup_attach_irq(struct device *dev, struct wake_irq *wakeirq);
 extern void device_wakeup_detach_irq(struct device *dev);
 extern void device_wakeup_arm_wake_irqs(void);
 extern void device_wakeup_disarm_wake_irqs(void);
 
 #else
 
-static inline int
-device_wakeup_attach_irq(struct device *dev,
-			 struct wake_irq *wakeirq)
-{
-	return 0;
-}
+static inline void device_wakeup_attach_irq(struct device *dev,
+					    struct wake_irq *wakeirq) {}
 
 static inline void device_wakeup_detach_irq(struct device *dev)
 {
diff --git a/drivers/base/power/runtime.c b/drivers/base/power/runtime.c
index 6e89b51..8bef3cb 100644
--- a/drivers/base/power/runtime.c
+++ b/drivers/base/power/runtime.c
@@ -1613,22 +1613,34 @@ void pm_runtime_drop_link(struct device *dev)
 	spin_unlock_irq(&dev->power.lock);
 }
 
+static bool pm_runtime_need_not_resume(struct device *dev)
+{
+	return atomic_read(&dev->power.usage_count) <= 1 &&
+		(atomic_read(&dev->power.child_count) == 0 ||
+		 dev->power.ignore_children);
+}
+
 /**
  * pm_runtime_force_suspend - Force a device into suspend state if needed.
  * @dev: Device to suspend.
  *
  * Disable runtime PM so we safely can check the device's runtime PM status and
- * if it is active, invoke it's .runtime_suspend callback to bring it into
- * suspend state. Keep runtime PM disabled to preserve the state unless we
- * encounter errors.
+ * if it is active, invoke its ->runtime_suspend callback to suspend it and
+ * change its runtime PM status field to RPM_SUSPENDED.  Also, if the device's
+ * usage and children counters don't indicate that the device was in use before
+ * the system-wide transition under way, decrement its parent's children counter
+ * (if there is a parent).  Keep runtime PM disabled to preserve the state
+ * unless we encounter errors.
  *
  * Typically this function may be invoked from a system suspend callback to make
- * sure the device is put into low power state.
+ * sure the device is put into low power state and it should only be used during
+ * system-wide PM transitions to sleep states.  It assumes that the analogous
+ * pm_runtime_force_resume() will be used to resume the device.
  */
 int pm_runtime_force_suspend(struct device *dev)
 {
 	int (*callback)(struct device *);
-	int ret = 0;
+	int ret;
 
 	pm_runtime_disable(dev);
 	if (pm_runtime_status_suspended(dev))
@@ -1636,27 +1648,23 @@ int pm_runtime_force_suspend(struct device *dev)
 
 	callback = RPM_GET_CALLBACK(dev, runtime_suspend);
 
-	if (!callback) {
-		ret = -ENOSYS;
-		goto err;
-	}
-
-	ret = callback(dev);
+	ret = callback ? callback(dev) : 0;
 	if (ret)
 		goto err;
 
 	/*
-	 * Increase the runtime PM usage count for the device's parent, in case
-	 * when we find the device being used when system suspend was invoked.
-	 * This informs pm_runtime_force_resume() to resume the parent
-	 * immediately, which is needed to be able to resume its children,
-	 * when not deferring the resume to be managed via runtime PM.
+	 * If the device can stay in suspend after the system-wide transition
+	 * to the working state that will follow, drop the children counter of
+	 * its parent, but set its status to RPM_SUSPENDED anyway in case this
+	 * function will be called again for it in the meantime.
 	 */
-	if (dev->parent && atomic_read(&dev->power.usage_count) > 1)
-		pm_runtime_get_noresume(dev->parent);
+	if (pm_runtime_need_not_resume(dev))
+		pm_runtime_set_suspended(dev);
+	else
+		__update_runtime_status(dev, RPM_SUSPENDED);
 
-	pm_runtime_set_suspended(dev);
 	return 0;
+
 err:
 	pm_runtime_enable(dev);
 	return ret;
@@ -1669,13 +1677,9 @@ EXPORT_SYMBOL_GPL(pm_runtime_force_suspend);
  *
  * Prior invoking this function we expect the user to have brought the device
  * into low power state by a call to pm_runtime_force_suspend(). Here we reverse
- * those actions and brings the device into full power, if it is expected to be
- * used on system resume. To distinguish that, we check whether the runtime PM
- * usage count is greater than 1 (the PM core increases the usage count in the
- * system PM prepare phase), as that indicates a real user (such as a subsystem,
- * driver, userspace, etc.) is using it. If that is the case, the device is
- * expected to be used on system resume as well, so then we resume it. In the
- * other case, we defer the resume to be managed via runtime PM.
+ * those actions and bring the device into full power, if it is expected to be
+ * used on system resume.  In the other case, we defer the resume to be managed
+ * via runtime PM.
  *
  * Typically this function may be invoked from a system resume callback.
  */
@@ -1684,32 +1688,18 @@ int pm_runtime_force_resume(struct device *dev)
 	int (*callback)(struct device *);
 	int ret = 0;
 
-	callback = RPM_GET_CALLBACK(dev, runtime_resume);
-
-	if (!callback) {
-		ret = -ENOSYS;
-		goto out;
-	}
-
-	if (!pm_runtime_status_suspended(dev))
+	if (!pm_runtime_status_suspended(dev) || pm_runtime_need_not_resume(dev))
 		goto out;
 
 	/*
-	 * Decrease the parent's runtime PM usage count, if we increased it
-	 * during system suspend in pm_runtime_force_suspend().
-	*/
-	if (atomic_read(&dev->power.usage_count) > 1) {
-		if (dev->parent)
-			pm_runtime_put_noidle(dev->parent);
-	} else {
-		goto out;
-	}
+	 * The value of the parent's children counter is correct already, so
+	 * just update the status of the device.
+	 */
+	__update_runtime_status(dev, RPM_ACTIVE);
 
-	ret = pm_runtime_set_active(dev);
-	if (ret)
-		goto out;
+	callback = RPM_GET_CALLBACK(dev, runtime_resume);
 
-	ret = callback(dev);
+	ret = callback ? callback(dev) : 0;
 	if (ret) {
 		pm_runtime_set_suspended(dev);
 		goto out;
diff --git a/drivers/base/power/sysfs.c b/drivers/base/power/sysfs.c
index e153e28..0f651ef 100644
--- a/drivers/base/power/sysfs.c
+++ b/drivers/base/power/sysfs.c
@@ -108,16 +108,10 @@ static ssize_t control_show(struct device *dev, struct device_attribute *attr,
 static ssize_t control_store(struct device * dev, struct device_attribute *attr,
 			     const char * buf, size_t n)
 {
-	char *cp;
-	int len = n;
-
-	cp = memchr(buf, '\n', n);
-	if (cp)
-		len = cp - buf;
 	device_lock(dev);
-	if (len == sizeof ctrl_auto - 1 && strncmp(buf, ctrl_auto, len) == 0)
+	if (sysfs_streq(buf, ctrl_auto))
 		pm_runtime_allow(dev);
-	else if (len == sizeof ctrl_on - 1 && strncmp(buf, ctrl_on, len) == 0)
+	else if (sysfs_streq(buf, ctrl_on))
 		pm_runtime_forbid(dev);
 	else
 		n = -EINVAL;
@@ -125,9 +119,9 @@ static ssize_t control_store(struct device * dev, struct device_attribute *attr,
 	return n;
 }
 
-static DEVICE_ATTR(control, 0644, control_show, control_store);
+static DEVICE_ATTR_RW(control);
 
-static ssize_t rtpm_active_time_show(struct device *dev,
+static ssize_t runtime_active_time_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
 	int ret;
@@ -138,9 +132,9 @@ static ssize_t rtpm_active_time_show(struct device *dev,
 	return ret;
 }
 
-static DEVICE_ATTR(runtime_active_time, 0444, rtpm_active_time_show, NULL);
+static DEVICE_ATTR_RO(runtime_active_time);
 
-static ssize_t rtpm_suspended_time_show(struct device *dev,
+static ssize_t runtime_suspended_time_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
 	int ret;
@@ -152,9 +146,9 @@ static ssize_t rtpm_suspended_time_show(struct device *dev,
 	return ret;
 }
 
-static DEVICE_ATTR(runtime_suspended_time, 0444, rtpm_suspended_time_show, NULL);
+static DEVICE_ATTR_RO(runtime_suspended_time);
 
-static ssize_t rtpm_status_show(struct device *dev,
+static ssize_t runtime_status_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
 	const char *p;
@@ -184,7 +178,7 @@ static ssize_t rtpm_status_show(struct device *dev,
 	return sprintf(buf, p);
 }
 
-static DEVICE_ATTR(runtime_status, 0444, rtpm_status_show, NULL);
+static DEVICE_ATTR_RO(runtime_status);
 
 static ssize_t autosuspend_delay_ms_show(struct device *dev,
 		struct device_attribute *attr, char *buf)
@@ -211,26 +205,25 @@ static ssize_t autosuspend_delay_ms_store(struct device *dev,
 	return n;
 }
 
-static DEVICE_ATTR(autosuspend_delay_ms, 0644, autosuspend_delay_ms_show,
-		autosuspend_delay_ms_store);
+static DEVICE_ATTR_RW(autosuspend_delay_ms);
 
-static ssize_t pm_qos_resume_latency_show(struct device *dev,
-					  struct device_attribute *attr,
-					  char *buf)
+static ssize_t pm_qos_resume_latency_us_show(struct device *dev,
+					     struct device_attribute *attr,
+					     char *buf)
 {
 	s32 value = dev_pm_qos_requested_resume_latency(dev);
 
 	if (value == 0)
 		return sprintf(buf, "n/a\n");
-	else if (value == PM_QOS_RESUME_LATENCY_NO_CONSTRAINT)
+	if (value == PM_QOS_RESUME_LATENCY_NO_CONSTRAINT)
 		value = 0;
 
 	return sprintf(buf, "%d\n", value);
 }
 
-static ssize_t pm_qos_resume_latency_store(struct device *dev,
-					   struct device_attribute *attr,
-					   const char *buf, size_t n)
+static ssize_t pm_qos_resume_latency_us_store(struct device *dev,
+					      struct device_attribute *attr,
+					      const char *buf, size_t n)
 {
 	s32 value;
 	int ret;
@@ -245,7 +238,7 @@ static ssize_t pm_qos_resume_latency_store(struct device *dev,
 
 		if (value == 0)
 			value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT;
-	} else if (!strcmp(buf, "n/a") || !strcmp(buf, "n/a\n")) {
+	} else if (sysfs_streq(buf, "n/a")) {
 		value = 0;
 	} else {
 		return -EINVAL;
@@ -256,26 +249,25 @@ static ssize_t pm_qos_resume_latency_store(struct device *dev,
 	return ret < 0 ? ret : n;
 }
 
-static DEVICE_ATTR(pm_qos_resume_latency_us, 0644,
-		   pm_qos_resume_latency_show, pm_qos_resume_latency_store);
+static DEVICE_ATTR_RW(pm_qos_resume_latency_us);
 
-static ssize_t pm_qos_latency_tolerance_show(struct device *dev,
-					     struct device_attribute *attr,
-					     char *buf)
+static ssize_t pm_qos_latency_tolerance_us_show(struct device *dev,
+						struct device_attribute *attr,
+						char *buf)
 {
 	s32 value = dev_pm_qos_get_user_latency_tolerance(dev);
 
 	if (value < 0)
 		return sprintf(buf, "auto\n");
-	else if (value == PM_QOS_LATENCY_ANY)
+	if (value == PM_QOS_LATENCY_ANY)
 		return sprintf(buf, "any\n");
 
 	return sprintf(buf, "%d\n", value);
 }
 
-static ssize_t pm_qos_latency_tolerance_store(struct device *dev,
-					      struct device_attribute *attr,
-					      const char *buf, size_t n)
+static ssize_t pm_qos_latency_tolerance_us_store(struct device *dev,
+						 struct device_attribute *attr,
+						 const char *buf, size_t n)
 {
 	s32 value;
 	int ret;
@@ -285,9 +277,9 @@ static ssize_t pm_qos_latency_tolerance_store(struct device *dev,
 		if (value < 0)
 			return -EINVAL;
 	} else {
-		if (!strcmp(buf, "auto") || !strcmp(buf, "auto\n"))
+		if (sysfs_streq(buf, "auto"))
 			value = PM_QOS_LATENCY_TOLERANCE_NO_CONSTRAINT;
-		else if (!strcmp(buf, "any") || !strcmp(buf, "any\n"))
+		else if (sysfs_streq(buf, "any"))
 			value = PM_QOS_LATENCY_ANY;
 		else
 			return -EINVAL;
@@ -296,8 +288,7 @@ static ssize_t pm_qos_latency_tolerance_store(struct device *dev,
 	return ret < 0 ? ret : n;
 }
 
-static DEVICE_ATTR(pm_qos_latency_tolerance_us, 0644,
-		   pm_qos_latency_tolerance_show, pm_qos_latency_tolerance_store);
+static DEVICE_ATTR_RW(pm_qos_latency_tolerance_us);
 
 static ssize_t pm_qos_no_power_off_show(struct device *dev,
 					struct device_attribute *attr,
@@ -323,49 +314,39 @@ static ssize_t pm_qos_no_power_off_store(struct device *dev,
 	return ret < 0 ? ret : n;
 }
 
-static DEVICE_ATTR(pm_qos_no_power_off, 0644,
-		   pm_qos_no_power_off_show, pm_qos_no_power_off_store);
+static DEVICE_ATTR_RW(pm_qos_no_power_off);
 
 #ifdef CONFIG_PM_SLEEP
 static const char _enabled[] = "enabled";
 static const char _disabled[] = "disabled";
 
-static ssize_t
-wake_show(struct device * dev, struct device_attribute *attr, char * buf)
+static ssize_t wakeup_show(struct device *dev, struct device_attribute *attr,
+			   char *buf)
 {
 	return sprintf(buf, "%s\n", device_can_wakeup(dev)
 		? (device_may_wakeup(dev) ? _enabled : _disabled)
 		: "");
 }
 
-static ssize_t
-wake_store(struct device * dev, struct device_attribute *attr,
-	const char * buf, size_t n)
+static ssize_t wakeup_store(struct device *dev, struct device_attribute *attr,
+			    const char *buf, size_t n)
 {
-	char *cp;
-	int len = n;
-
 	if (!device_can_wakeup(dev))
 		return -EINVAL;
 
-	cp = memchr(buf, '\n', n);
-	if (cp)
-		len = cp - buf;
-	if (len == sizeof _enabled - 1
-			&& strncmp(buf, _enabled, sizeof _enabled - 1) == 0)
+	if (sysfs_streq(buf, _enabled))
 		device_set_wakeup_enable(dev, 1);
-	else if (len == sizeof _disabled - 1
-			&& strncmp(buf, _disabled, sizeof _disabled - 1) == 0)
+	else if (sysfs_streq(buf, _disabled))
 		device_set_wakeup_enable(dev, 0);
 	else
 		return -EINVAL;
 	return n;
 }
 
-static DEVICE_ATTR(wakeup, 0644, wake_show, wake_store);
+static DEVICE_ATTR_RW(wakeup);
 
 static ssize_t wakeup_count_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+				 struct device_attribute *attr, char *buf)
 {
 	unsigned long count = 0;
 	bool enabled = false;
@@ -379,10 +360,11 @@ static ssize_t wakeup_count_show(struct device *dev,
 	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_count, 0444, wakeup_count_show, NULL);
+static DEVICE_ATTR_RO(wakeup_count);
 
 static ssize_t wakeup_active_count_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+					struct device_attribute *attr,
+					char *buf)
 {
 	unsigned long count = 0;
 	bool enabled = false;
@@ -396,11 +378,11 @@ static ssize_t wakeup_active_count_show(struct device *dev,
 	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_active_count, 0444, wakeup_active_count_show, NULL);
+static DEVICE_ATTR_RO(wakeup_active_count);
 
 static ssize_t wakeup_abort_count_show(struct device *dev,
-					struct device_attribute *attr,
-					char *buf)
+				       struct device_attribute *attr,
+				       char *buf)
 {
 	unsigned long count = 0;
 	bool enabled = false;
@@ -414,7 +396,7 @@ static ssize_t wakeup_abort_count_show(struct device *dev,
 	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_abort_count, 0444, wakeup_abort_count_show, NULL);
+static DEVICE_ATTR_RO(wakeup_abort_count);
 
 static ssize_t wakeup_expire_count_show(struct device *dev,
 					struct device_attribute *attr,
@@ -432,10 +414,10 @@ static ssize_t wakeup_expire_count_show(struct device *dev,
 	return enabled ? sprintf(buf, "%lu\n", count) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_expire_count, 0444, wakeup_expire_count_show, NULL);
+static DEVICE_ATTR_RO(wakeup_expire_count);
 
 static ssize_t wakeup_active_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+				  struct device_attribute *attr, char *buf)
 {
 	unsigned int active = 0;
 	bool enabled = false;
@@ -449,10 +431,11 @@ static ssize_t wakeup_active_show(struct device *dev,
 	return enabled ? sprintf(buf, "%u\n", active) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_active, 0444, wakeup_active_show, NULL);
+static DEVICE_ATTR_RO(wakeup_active);
 
-static ssize_t wakeup_total_time_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+static ssize_t wakeup_total_time_ms_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
 {
 	s64 msec = 0;
 	bool enabled = false;
@@ -466,10 +449,10 @@ static ssize_t wakeup_total_time_show(struct device *dev,
 	return enabled ? sprintf(buf, "%lld\n", msec) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_total_time_ms, 0444, wakeup_total_time_show, NULL);
+static DEVICE_ATTR_RO(wakeup_total_time_ms);
 
-static ssize_t wakeup_max_time_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+static ssize_t wakeup_max_time_ms_show(struct device *dev,
+				       struct device_attribute *attr, char *buf)
 {
 	s64 msec = 0;
 	bool enabled = false;
@@ -483,10 +466,11 @@ static ssize_t wakeup_max_time_show(struct device *dev,
 	return enabled ? sprintf(buf, "%lld\n", msec) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_max_time_ms, 0444, wakeup_max_time_show, NULL);
+static DEVICE_ATTR_RO(wakeup_max_time_ms);
 
-static ssize_t wakeup_last_time_show(struct device *dev,
-				struct device_attribute *attr, char *buf)
+static ssize_t wakeup_last_time_ms_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
 {
 	s64 msec = 0;
 	bool enabled = false;
@@ -500,12 +484,12 @@ static ssize_t wakeup_last_time_show(struct device *dev,
 	return enabled ? sprintf(buf, "%lld\n", msec) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_last_time_ms, 0444, wakeup_last_time_show, NULL);
+static DEVICE_ATTR_RO(wakeup_last_time_ms);
 
 #ifdef CONFIG_PM_AUTOSLEEP
-static ssize_t wakeup_prevent_sleep_time_show(struct device *dev,
-					      struct device_attribute *attr,
-					      char *buf)
+static ssize_t wakeup_prevent_sleep_time_ms_show(struct device *dev,
+						 struct device_attribute *attr,
+						 char *buf)
 {
 	s64 msec = 0;
 	bool enabled = false;
@@ -519,40 +503,39 @@ static ssize_t wakeup_prevent_sleep_time_show(struct device *dev,
 	return enabled ? sprintf(buf, "%lld\n", msec) : sprintf(buf, "\n");
 }
 
-static DEVICE_ATTR(wakeup_prevent_sleep_time_ms, 0444,
-		   wakeup_prevent_sleep_time_show, NULL);
+static DEVICE_ATTR_RO(wakeup_prevent_sleep_time_ms);
 #endif /* CONFIG_PM_AUTOSLEEP */
 #endif /* CONFIG_PM_SLEEP */
 
 #ifdef CONFIG_PM_ADVANCED_DEBUG
-static ssize_t rtpm_usagecount_show(struct device *dev,
-				    struct device_attribute *attr, char *buf)
+static ssize_t runtime_usage_show(struct device *dev,
+				  struct device_attribute *attr, char *buf)
 {
 	return sprintf(buf, "%d\n", atomic_read(&dev->power.usage_count));
 }
+static DEVICE_ATTR_RO(runtime_usage);
 
-static ssize_t rtpm_children_show(struct device *dev,
-				  struct device_attribute *attr, char *buf)
+static ssize_t runtime_active_kids_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
 {
 	return sprintf(buf, "%d\n", dev->power.ignore_children ?
 		0 : atomic_read(&dev->power.child_count));
 }
+static DEVICE_ATTR_RO(runtime_active_kids);
 
-static ssize_t rtpm_enabled_show(struct device *dev,
-				 struct device_attribute *attr, char *buf)
+static ssize_t runtime_enabled_show(struct device *dev,
+				    struct device_attribute *attr, char *buf)
 {
-	if ((dev->power.disable_depth) && (dev->power.runtime_auto == false))
+	if (dev->power.disable_depth && (dev->power.runtime_auto == false))
 		return sprintf(buf, "disabled & forbidden\n");
-	else if (dev->power.disable_depth)
+	if (dev->power.disable_depth)
 		return sprintf(buf, "disabled\n");
-	else if (dev->power.runtime_auto == false)
+	if (dev->power.runtime_auto == false)
 		return sprintf(buf, "forbidden\n");
 	return sprintf(buf, "enabled\n");
 }
-
-static DEVICE_ATTR(runtime_usage, 0444, rtpm_usagecount_show, NULL);
-static DEVICE_ATTR(runtime_active_kids, 0444, rtpm_children_show, NULL);
-static DEVICE_ATTR(runtime_enabled, 0444, rtpm_enabled_show, NULL);
+static DEVICE_ATTR_RO(runtime_enabled);
 
 #ifdef CONFIG_PM_SLEEP
 static ssize_t async_show(struct device *dev, struct device_attribute *attr,
@@ -566,23 +549,16 @@ static ssize_t async_show(struct device *dev, struct device_attribute *attr,
 static ssize_t async_store(struct device *dev, struct device_attribute *attr,
 			   const char *buf, size_t n)
 {
-	char *cp;
-	int len = n;
-
-	cp = memchr(buf, '\n', n);
-	if (cp)
-		len = cp - buf;
-	if (len == sizeof _enabled - 1 && strncmp(buf, _enabled, len) == 0)
+	if (sysfs_streq(buf, _enabled))
 		device_enable_async_suspend(dev);
-	else if (len == sizeof _disabled - 1 &&
-		 strncmp(buf, _disabled, len) == 0)
+	else if (sysfs_streq(buf, _disabled))
 		device_disable_async_suspend(dev);
 	else
 		return -EINVAL;
 	return n;
 }
 
-static DEVICE_ATTR(async, 0644, async_show, async_store);
+static DEVICE_ATTR_RW(async);
 
 #endif /* CONFIG_PM_SLEEP */
 #endif /* CONFIG_PM_ADVANCED_DEBUG */
diff --git a/drivers/base/power/wakeirq.c b/drivers/base/power/wakeirq.c
index ae04298..a8ac86e 100644
--- a/drivers/base/power/wakeirq.c
+++ b/drivers/base/power/wakeirq.c
@@ -33,7 +33,6 @@ static int dev_pm_attach_wake_irq(struct device *dev, int irq,
 				  struct wake_irq *wirq)
 {
 	unsigned long flags;
-	int err;
 
 	if (!dev || !wirq)
 		return -EINVAL;
@@ -45,12 +44,11 @@ static int dev_pm_attach_wake_irq(struct device *dev, int irq,
 		return -EEXIST;
 	}
 
-	err = device_wakeup_attach_irq(dev, wirq);
-	if (!err)
-		dev->power.wakeirq = wirq;
+	dev->power.wakeirq = wirq;
+	device_wakeup_attach_irq(dev, wirq);
 
 	spin_unlock_irqrestore(&dev->power.lock, flags);
-	return err;
+	return 0;
 }
 
 /**
diff --git a/drivers/base/power/wakeup.c b/drivers/base/power/wakeup.c
index 38559f0..ea01621 100644
--- a/drivers/base/power/wakeup.c
+++ b/drivers/base/power/wakeup.c
@@ -19,6 +19,11 @@
 
 #include "power.h"
 
+#ifndef CONFIG_SUSPEND
+suspend_state_t pm_suspend_target_state;
+#define pm_suspend_target_state	(PM_SUSPEND_ON)
+#endif
+
 /*
  * If set, the suspend/hibernate code will abort transitions to a sleep state
  * if wakeup events are registered during or immediately before the transition.
@@ -268,6 +273,9 @@ int device_wakeup_enable(struct device *dev)
 	if (!dev || !dev->power.can_wakeup)
 		return -EINVAL;
 
+	if (pm_suspend_target_state != PM_SUSPEND_ON)
+		dev_dbg(dev, "Suspicious %s() during system transition!\n", __func__);
+
 	ws = wakeup_source_register(dev_name(dev));
 	if (!ws)
 		return -ENOMEM;
@@ -291,22 +299,19 @@ EXPORT_SYMBOL_GPL(device_wakeup_enable);
  *
  * Call under the device's power.lock lock.
  */
-int device_wakeup_attach_irq(struct device *dev,
+void device_wakeup_attach_irq(struct device *dev,
 			     struct wake_irq *wakeirq)
 {
 	struct wakeup_source *ws;
 
 	ws = dev->power.wakeup;
-	if (!ws) {
-		dev_err(dev, "forgot to call call device_init_wakeup?\n");
-		return -EINVAL;
-	}
+	if (!ws)
+		return;
 
 	if (ws->wakeirq)
-		return -EEXIST;
+		dev_err(dev, "Leftover wakeup IRQ found, overriding\n");
 
 	ws->wakeirq = wakeirq;
-	return 0;
 }
 
 /**
@@ -448,9 +453,7 @@ int device_init_wakeup(struct device *dev, bool enable)
 		device_set_wakeup_capable(dev, true);
 		ret = device_wakeup_enable(dev);
 	} else {
-		if (dev->power.can_wakeup)
-			device_wakeup_disable(dev);
-
+		device_wakeup_disable(dev);
 		device_set_wakeup_capable(dev, false);
 	}
 
@@ -464,9 +467,6 @@ EXPORT_SYMBOL_GPL(device_init_wakeup);
  */
 int device_set_wakeup_enable(struct device *dev, bool enable)
 {
-	if (!dev || !dev->power.can_wakeup)
-		return -EINVAL;
-
 	return enable ? device_wakeup_enable(dev) : device_wakeup_disable(dev);
 }
 EXPORT_SYMBOL_GPL(device_set_wakeup_enable);
diff --git a/drivers/base/regmap/Kconfig b/drivers/base/regmap/Kconfig
index 3a1535d..067073e 100644
--- a/drivers/base/regmap/Kconfig
+++ b/drivers/base/regmap/Kconfig
@@ -6,7 +6,6 @@
 config REGMAP
 	default y if (REGMAP_I2C || REGMAP_SPI || REGMAP_SPMI || REGMAP_W1 || REGMAP_AC97 || REGMAP_MMIO || REGMAP_IRQ)
 	select IRQ_DOMAIN if REGMAP_IRQ
-	select REGMAP_HWSPINLOCK if HWSPINLOCK=y
 	bool
 
 config REGCACHE_COMPRESSED
@@ -39,5 +38,6 @@
 config REGMAP_IRQ
 	bool
 
-config REGMAP_HWSPINLOCK
-	bool
+config REGMAP_SOUNDWIRE
+	tristate
+	depends on SOUNDWIRE_BUS
diff --git a/drivers/base/regmap/Makefile b/drivers/base/regmap/Makefile
index 0d298c4..22d263c 100644
--- a/drivers/base/regmap/Makefile
+++ b/drivers/base/regmap/Makefile
@@ -13,3 +13,4 @@
 obj-$(CONFIG_REGMAP_MMIO) += regmap-mmio.o
 obj-$(CONFIG_REGMAP_IRQ) += regmap-irq.o
 obj-$(CONFIG_REGMAP_W1) += regmap-w1.o
+obj-$(CONFIG_REGMAP_SOUNDWIRE) += regmap-sdw.o
diff --git a/drivers/base/regmap/internal.h b/drivers/base/regmap/internal.h
index 8641183..53785e0 100644
--- a/drivers/base/regmap/internal.h
+++ b/drivers/base/regmap/internal.h
@@ -77,6 +77,7 @@ struct regmap {
 	int async_ret;
 
 #ifdef CONFIG_DEBUG_FS
+	bool debugfs_disable;
 	struct dentry *debugfs;
 	const char *debugfs_name;
 
@@ -215,10 +216,17 @@ struct regmap_field {
 extern void regmap_debugfs_initcall(void);
 extern void regmap_debugfs_init(struct regmap *map, const char *name);
 extern void regmap_debugfs_exit(struct regmap *map);
+
+static inline void regmap_debugfs_disable(struct regmap *map)
+{
+	map->debugfs_disable = true;
+}
+
 #else
 static inline void regmap_debugfs_initcall(void) { }
 static inline void regmap_debugfs_init(struct regmap *map, const char *name) { }
 static inline void regmap_debugfs_exit(struct regmap *map) { }
+static inline void regmap_debugfs_disable(struct regmap *map) { }
 #endif
 
 /* regcache core declarations */
diff --git a/drivers/base/regmap/regcache-flat.c b/drivers/base/regmap/regcache-flat.c
index 4d2e50b..bc6cd88 100644
--- a/drivers/base/regmap/regcache-flat.c
+++ b/drivers/base/regmap/regcache-flat.c
@@ -37,9 +37,12 @@ static int regcache_flat_init(struct regmap *map)
 
 	cache = map->cache;
 
-	for (i = 0; i < map->num_reg_defaults; i++)
-		cache[regcache_flat_get_index(map, map->reg_defaults[i].reg)] =
-				map->reg_defaults[i].def;
+	for (i = 0; i < map->num_reg_defaults; i++) {
+		unsigned int reg = map->reg_defaults[i].reg;
+		unsigned int index = regcache_flat_get_index(map, reg);
+
+		cache[index] = map->reg_defaults[i].def;
+	}
 
 	return 0;
 }
@@ -56,8 +59,9 @@ static int regcache_flat_read(struct regmap *map,
 			      unsigned int reg, unsigned int *value)
 {
 	unsigned int *cache = map->cache;
+	unsigned int index = regcache_flat_get_index(map, reg);
 
-	*value = cache[regcache_flat_get_index(map, reg)];
+	*value = cache[index];
 
 	return 0;
 }
@@ -66,8 +70,9 @@ static int regcache_flat_write(struct regmap *map, unsigned int reg,
 			       unsigned int value)
 {
 	unsigned int *cache = map->cache;
+	unsigned int index = regcache_flat_get_index(map, reg);
 
-	cache[regcache_flat_get_index(map, reg)] = value;
+	cache[index] = value;
 
 	return 0;
 }
diff --git a/drivers/base/regmap/regmap-debugfs.c b/drivers/base/regmap/regmap-debugfs.c
index 36ce351..f326633 100644
--- a/drivers/base/regmap/regmap-debugfs.c
+++ b/drivers/base/regmap/regmap-debugfs.c
@@ -529,6 +529,18 @@ void regmap_debugfs_init(struct regmap *map, const char *name)
 	struct regmap_range_node *range_node;
 	const char *devname = "dummy";
 
+	/*
+	 * Userspace can initiate reads from the hardware over debugfs.
+	 * Normally internal regmap structures and buffers are protected with
+	 * a mutex or a spinlock, but if the regmap owner decided to disable
+	 * all locking mechanisms, this is no longer the case. For safety:
+	 * don't create the debugfs entries if locking is disabled.
+	 */
+	if (map->debugfs_disable) {
+		dev_dbg(map->dev, "regmap locking disabled - not creating debugfs entries\n");
+		return;
+	}
+
 	/* If we don't have the debugfs root yet, postpone init */
 	if (!regmap_debugfs_root) {
 		struct regmap_debugfs_node *node;
diff --git a/drivers/base/regmap/regmap-sdw.c b/drivers/base/regmap/regmap-sdw.c
new file mode 100644
index 0000000..50a6638
--- /dev/null
+++ b/drivers/base/regmap/regmap-sdw.c
@@ -0,0 +1,88 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright(c) 2015-17 Intel Corporation.
+
+#include <linux/device.h>
+#include <linux/mod_devicetable.h>
+#include <linux/module.h>
+#include <linux/soundwire/sdw.h>
+#include "internal.h"
+
+static int regmap_sdw_write(void *context, unsigned int reg, unsigned int val)
+{
+	struct device *dev = context;
+	struct sdw_slave *slave = dev_to_sdw_dev(dev);
+
+	return sdw_write(slave, reg, val);
+}
+
+static int regmap_sdw_read(void *context, unsigned int reg, unsigned int *val)
+{
+	struct device *dev = context;
+	struct sdw_slave *slave = dev_to_sdw_dev(dev);
+	int read;
+
+	read = sdw_read(slave, reg);
+	if (read < 0)
+		return read;
+
+	*val = read;
+	return 0;
+}
+
+static struct regmap_bus regmap_sdw = {
+	.reg_read = regmap_sdw_read,
+	.reg_write = regmap_sdw_write,
+	.reg_format_endian_default = REGMAP_ENDIAN_LITTLE,
+	.val_format_endian_default = REGMAP_ENDIAN_LITTLE,
+};
+
+static int regmap_sdw_config_check(const struct regmap_config *config)
+{
+	/* All register are 8-bits wide as per MIPI Soundwire 1.0 Spec */
+	if (config->val_bits != 8)
+		return -ENOTSUPP;
+
+	/* Registers are 32 bits wide */
+	if (config->reg_bits != 32)
+		return -ENOTSUPP;
+
+	if (config->pad_bits != 0)
+		return -ENOTSUPP;
+
+	return 0;
+}
+
+struct regmap *__regmap_init_sdw(struct sdw_slave *sdw,
+				 const struct regmap_config *config,
+				 struct lock_class_key *lock_key,
+				 const char *lock_name)
+{
+	int ret;
+
+	ret = regmap_sdw_config_check(config);
+	if (ret)
+		return ERR_PTR(ret);
+
+	return __regmap_init(&sdw->dev, &regmap_sdw,
+			&sdw->dev, config, lock_key, lock_name);
+}
+EXPORT_SYMBOL_GPL(__regmap_init_sdw);
+
+struct regmap *__devm_regmap_init_sdw(struct sdw_slave *sdw,
+				      const struct regmap_config *config,
+				      struct lock_class_key *lock_key,
+				      const char *lock_name)
+{
+	int ret;
+
+	ret = regmap_sdw_config_check(config);
+	if (ret)
+		return ERR_PTR(ret);
+
+	return __devm_regmap_init(&sdw->dev, &regmap_sdw,
+			&sdw->dev, config, lock_key, lock_name);
+}
+EXPORT_SYMBOL_GPL(__devm_regmap_init_sdw);
+
+MODULE_DESCRIPTION("Regmap SoundWire Module");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 8d516a9..ee302cc 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -414,7 +414,6 @@ static unsigned int regmap_parse_64_native(const void *buf)
 }
 #endif
 
-#ifdef REGMAP_HWSPINLOCK
 static void regmap_lock_hwlock(void *__map)
 {
 	struct regmap *map = __map;
@@ -457,7 +456,11 @@ static void regmap_unlock_hwlock_irqrestore(void *__map)
 
 	hwspin_unlock_irqrestore(map->hwlock, &map->spinlock_flags);
 }
-#endif
+
+static void regmap_lock_unlock_none(void *__map)
+{
+
+}
 
 static void regmap_lock_mutex(void *__map)
 {
@@ -669,16 +672,26 @@ struct regmap *__regmap_init(struct device *dev,
 		goto err;
 	}
 
-	if (config->lock && config->unlock) {
+	if (config->name) {
+		map->name = kstrdup_const(config->name, GFP_KERNEL);
+		if (!map->name) {
+			ret = -ENOMEM;
+			goto err_map;
+		}
+	}
+
+	if (config->disable_locking) {
+		map->lock = map->unlock = regmap_lock_unlock_none;
+		regmap_debugfs_disable(map);
+	} else if (config->lock && config->unlock) {
 		map->lock = config->lock;
 		map->unlock = config->unlock;
 		map->lock_arg = config->lock_arg;
-	} else if (config->hwlock_id) {
-#ifdef REGMAP_HWSPINLOCK
+	} else if (config->use_hwlock) {
 		map->hwlock = hwspin_lock_request_specific(config->hwlock_id);
 		if (!map->hwlock) {
 			ret = -ENXIO;
-			goto err_map;
+			goto err_name;
 		}
 
 		switch (config->hwlock_mode) {
@@ -697,10 +710,6 @@ struct regmap *__regmap_init(struct device *dev,
 		}
 
 		map->lock_arg = map;
-#else
-		ret = -EINVAL;
-		goto err_map;
-#endif
 	} else {
 		if ((bus && bus->fast_io) ||
 		    config->fast_io) {
@@ -762,14 +771,15 @@ struct regmap *__regmap_init(struct device *dev,
 	map->volatile_reg = config->volatile_reg;
 	map->precious_reg = config->precious_reg;
 	map->cache_type = config->cache_type;
-	map->name = config->name;
 
 	spin_lock_init(&map->async_lock);
 	INIT_LIST_HEAD(&map->async_list);
 	INIT_LIST_HEAD(&map->async_free);
 	init_waitqueue_head(&map->async_waitq);
 
-	if (config->read_flag_mask || config->write_flag_mask) {
+	if (config->read_flag_mask ||
+	    config->write_flag_mask ||
+	    config->zero_flag_mask) {
 		map->read_flag_mask = config->read_flag_mask;
 		map->write_flag_mask = config->write_flag_mask;
 	} else if (bus) {
@@ -1116,8 +1126,10 @@ struct regmap *__regmap_init(struct device *dev,
 	regmap_range_exit(map);
 	kfree(map->work_buf);
 err_hwlock:
-	if (IS_ENABLED(REGMAP_HWSPINLOCK) && map->hwlock)
+	if (map->hwlock)
 		hwspin_lock_free(map->hwlock);
+err_name:
+	kfree_const(map->name);
 err_map:
 	kfree(map);
 err:
@@ -1305,8 +1317,9 @@ void regmap_exit(struct regmap *map)
 		kfree(async->work_buf);
 		kfree(async);
 	}
-	if (IS_ENABLED(REGMAP_HWSPINLOCK) && map->hwlock)
+	if (map->hwlock)
 		hwspin_lock_free(map->hwlock);
+	kfree_const(map->name);
 	kfree(map);
 }
 EXPORT_SYMBOL_GPL(regmap_exit);
@@ -2423,13 +2436,15 @@ static int _regmap_bus_read(void *context, unsigned int reg,
 {
 	int ret;
 	struct regmap *map = context;
+	void *work_val = map->work_buf + map->format.reg_bytes +
+		map->format.pad_bytes;
 
 	if (!map->format.parse_val)
 		return -EINVAL;
 
-	ret = _regmap_raw_read(map, reg, map->work_buf, map->format.val_bytes);
+	ret = _regmap_raw_read(map, reg, work_val, map->format.val_bytes);
 	if (ret == 0)
-		*val = map->format.parse_val(map->work_buf);
+		*val = map->format.parse_val(work_val);
 
 	return ret;
 }
diff --git a/drivers/bcma/Kconfig b/drivers/bcma/Kconfig
index 02d78f6..ba8acca 100644
--- a/drivers/bcma/Kconfig
+++ b/drivers/bcma/Kconfig
@@ -55,7 +55,7 @@
 
 config BCMA_DRIVER_PCI_HOSTMODE
 	bool "Driver for PCI core working in hostmode"
-	depends on MIPS && BCMA_DRIVER_PCI
+	depends on MIPS && BCMA_DRIVER_PCI && PCI_DRIVERS_LEGACY
 	help
 	  PCI core hostmode operation (external PCI bus).
 
diff --git a/drivers/block/DAC960.c b/drivers/block/DAC960.c
index 442e777..7280752 100644
--- a/drivers/block/DAC960.c
+++ b/drivers/block/DAC960.c
@@ -6619,43 +6619,27 @@ static void DAC960_DestroyProcEntries(DAC960_Controller_T *Controller)
 
 #ifdef DAC960_GAM_MINOR
 
-/*
- * DAC960_gam_ioctl is the ioctl function for performing RAID operations.
-*/
-
-static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
-						unsigned long Argument)
+static long DAC960_gam_get_controller_info(DAC960_ControllerInfo_T __user *UserSpaceControllerInfo)
 {
-  long ErrorCode = 0;
-  if (!capable(CAP_SYS_ADMIN)) return -EACCES;
-
-  mutex_lock(&DAC960_mutex);
-  switch (Request)
-    {
-    case DAC960_IOCTL_GET_CONTROLLER_COUNT:
-      ErrorCode = DAC960_ControllerCount;
-      break;
-    case DAC960_IOCTL_GET_CONTROLLER_INFO:
-      {
-	DAC960_ControllerInfo_T __user *UserSpaceControllerInfo =
-	  (DAC960_ControllerInfo_T __user *) Argument;
 	DAC960_ControllerInfo_T ControllerInfo;
 	DAC960_Controller_T *Controller;
 	int ControllerNumber;
+	long ErrorCode;
+
 	if (UserSpaceControllerInfo == NULL)
 		ErrorCode = -EINVAL;
 	else ErrorCode = get_user(ControllerNumber,
 			     &UserSpaceControllerInfo->ControllerNumber);
 	if (ErrorCode != 0)
-		break;
+		goto out;
 	ErrorCode = -ENXIO;
 	if (ControllerNumber < 0 ||
 	    ControllerNumber > DAC960_ControllerCount - 1) {
-	  break;
+		goto out;
 	}
 	Controller = DAC960_Controllers[ControllerNumber];
 	if (Controller == NULL)
-		break;
+		goto out;
 	memset(&ControllerInfo, 0, sizeof(DAC960_ControllerInfo_T));
 	ControllerInfo.ControllerNumber = ControllerNumber;
 	ControllerInfo.FirmwareType = Controller->FirmwareType;
@@ -6670,12 +6654,12 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
 	strcpy(ControllerInfo.FirmwareVersion, Controller->FirmwareVersion);
 	ErrorCode = (copy_to_user(UserSpaceControllerInfo, &ControllerInfo,
 			     sizeof(DAC960_ControllerInfo_T)) ? -EFAULT : 0);
-	break;
-      }
-    case DAC960_IOCTL_V1_EXECUTE_COMMAND:
-      {
-	DAC960_V1_UserCommand_T __user *UserSpaceUserCommand =
-	  (DAC960_V1_UserCommand_T __user *) Argument;
+out:
+	return ErrorCode;
+}
+
+static long DAC960_gam_v1_execute_command(DAC960_V1_UserCommand_T __user *UserSpaceUserCommand)
+{
 	DAC960_V1_UserCommand_T UserCommand;
 	DAC960_Controller_T *Controller;
 	DAC960_Command_T *Command = NULL;
@@ -6688,39 +6672,41 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
 	int ControllerNumber, DataTransferLength;
 	unsigned char *DataTransferBuffer = NULL;
 	dma_addr_t DataTransferBufferDMA;
+        long ErrorCode;
+
 	if (UserSpaceUserCommand == NULL) {
 		ErrorCode = -EINVAL;
-		break;
+		goto out;
 	}
 	if (copy_from_user(&UserCommand, UserSpaceUserCommand,
 				   sizeof(DAC960_V1_UserCommand_T))) {
 		ErrorCode = -EFAULT;
-		break;
+		goto out;
 	}
 	ControllerNumber = UserCommand.ControllerNumber;
     	ErrorCode = -ENXIO;
 	if (ControllerNumber < 0 ||
 	    ControllerNumber > DAC960_ControllerCount - 1)
-	    	break;
+		goto out;
 	Controller = DAC960_Controllers[ControllerNumber];
 	if (Controller == NULL)
-		break;
+		goto out;
 	ErrorCode = -EINVAL;
 	if (Controller->FirmwareType != DAC960_V1_Controller)
-		break;
+		goto out;
 	CommandOpcode = UserCommand.CommandMailbox.Common.CommandOpcode;
 	DataTransferLength = UserCommand.DataTransferLength;
 	if (CommandOpcode & 0x80)
-		break;
+		goto out;
 	if (CommandOpcode == DAC960_V1_DCDB)
 	  {
 	    if (copy_from_user(&DCDB, UserCommand.DCDB,
 			       sizeof(DAC960_V1_DCDB_T))) {
 		ErrorCode = -EFAULT;
-		break;
+		goto out;
 	    }
 	    if (DCDB.Channel >= DAC960_V1_MaxChannels)
-	    		break;
+		goto out;
 	    if (!((DataTransferLength == 0 &&
 		   DCDB.Direction
 		   == DAC960_V1_DCDB_NoDataTransfer) ||
@@ -6730,15 +6716,15 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
 		  (DataTransferLength < 0 &&
 		   DCDB.Direction
 		   == DAC960_V1_DCDB_DataTransferSystemToDevice)))
-		   	break;
+			goto out;
 	    if (((DCDB.TransferLengthHigh4 << 16) | DCDB.TransferLength)
 		!= abs(DataTransferLength))
-			break;
+			goto out;
 	    DCDB_IOBUF = pci_alloc_consistent(Controller->PCIDevice,
 			sizeof(DAC960_V1_DCDB_T), &DCDB_IOBUFDMA);
 	    if (DCDB_IOBUF == NULL) {
 	    		ErrorCode = -ENOMEM;
-			break;
+			goto out;
 		}
 	  }
 	ErrorCode = -ENOMEM;
@@ -6748,19 +6734,19 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
                                                        DataTransferLength,
                                                        &DataTransferBufferDMA);
 	    if (DataTransferBuffer == NULL)
-	    	break;
+		goto out;
 	  }
 	else if (DataTransferLength < 0)
 	  {
 	    DataTransferBuffer = pci_alloc_consistent(Controller->PCIDevice,
 				-DataTransferLength, &DataTransferBufferDMA);
 	    if (DataTransferBuffer == NULL)
-	    	break;
+		goto out;
 	    if (copy_from_user(DataTransferBuffer,
 			       UserCommand.DataTransferBuffer,
 			       -DataTransferLength)) {
 		ErrorCode = -EFAULT;
-		break;
+		goto out;
 	    }
 	  }
 	if (CommandOpcode == DAC960_V1_DCDB)
@@ -6837,12 +6823,12 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
 	if (DCDB_IOBUF != NULL)
 	  pci_free_consistent(Controller->PCIDevice, sizeof(DAC960_V1_DCDB_T),
 			DCDB_IOBUF, DCDB_IOBUFDMA);
-      	break;
-      }
-    case DAC960_IOCTL_V2_EXECUTE_COMMAND:
-      {
-	DAC960_V2_UserCommand_T __user *UserSpaceUserCommand =
-	  (DAC960_V2_UserCommand_T __user *) Argument;
+	out:
+	return ErrorCode;
+}
+
+static long DAC960_gam_v2_execute_command(DAC960_V2_UserCommand_T __user *UserSpaceUserCommand)
+{
 	DAC960_V2_UserCommand_T UserCommand;
 	DAC960_Controller_T *Controller;
 	DAC960_Command_T *Command = NULL;
@@ -6855,26 +6841,26 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
 	dma_addr_t DataTransferBufferDMA;
 	unsigned char *RequestSenseBuffer = NULL;
 	dma_addr_t RequestSenseBufferDMA;
+	long ErrorCode = -EINVAL;
 
-	ErrorCode = -EINVAL;
 	if (UserSpaceUserCommand == NULL)
-		break;
+		goto out;
 	if (copy_from_user(&UserCommand, UserSpaceUserCommand,
 			   sizeof(DAC960_V2_UserCommand_T))) {
 		ErrorCode = -EFAULT;
-		break;
+		goto out;
 	}
 	ErrorCode = -ENXIO;
 	ControllerNumber = UserCommand.ControllerNumber;
 	if (ControllerNumber < 0 ||
 	    ControllerNumber > DAC960_ControllerCount - 1)
-	    	break;
+		goto out;
 	Controller = DAC960_Controllers[ControllerNumber];
 	if (Controller == NULL)
-		break;
+		goto out;
 	if (Controller->FirmwareType != DAC960_V2_Controller){
 		ErrorCode = -EINVAL;
-		break;
+		goto out;
 	}
 	DataTransferLength = UserCommand.DataTransferLength;
     	ErrorCode = -ENOMEM;
@@ -6884,14 +6870,14 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
                                                        DataTransferLength,
                                                        &DataTransferBufferDMA);
 	    if (DataTransferBuffer == NULL)
-	    	break;
+		goto out;
 	  }
 	else if (DataTransferLength < 0)
 	  {
 	    DataTransferBuffer = pci_alloc_consistent(Controller->PCIDevice,
 				-DataTransferLength, &DataTransferBufferDMA);
 	    if (DataTransferBuffer == NULL)
-	    	break;
+		goto out;
 	    if (copy_from_user(DataTransferBuffer,
 			       UserCommand.DataTransferBuffer,
 			       -DataTransferLength)) {
@@ -7001,42 +6987,44 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
 	if (RequestSenseBuffer != NULL)
 	  pci_free_consistent(Controller->PCIDevice, RequestSenseLength,
 		RequestSenseBuffer, RequestSenseBufferDMA);
-        break;
-      }
-    case DAC960_IOCTL_V2_GET_HEALTH_STATUS:
-      {
-	DAC960_V2_GetHealthStatus_T __user *UserSpaceGetHealthStatus =
-	  (DAC960_V2_GetHealthStatus_T __user *) Argument;
+out:
+        return ErrorCode;
+}
+
+static long DAC960_gam_v2_get_health_status(DAC960_V2_GetHealthStatus_T __user *UserSpaceGetHealthStatus)
+{
 	DAC960_V2_GetHealthStatus_T GetHealthStatus;
 	DAC960_V2_HealthStatusBuffer_T HealthStatusBuffer;
 	DAC960_Controller_T *Controller;
 	int ControllerNumber;
+	long ErrorCode;
+
 	if (UserSpaceGetHealthStatus == NULL) {
 		ErrorCode = -EINVAL;
-		break;
+		goto out;
 	}
 	if (copy_from_user(&GetHealthStatus, UserSpaceGetHealthStatus,
 			   sizeof(DAC960_V2_GetHealthStatus_T))) {
 		ErrorCode = -EFAULT;
-		break;
+		goto out;
 	}
 	ErrorCode = -ENXIO;
 	ControllerNumber = GetHealthStatus.ControllerNumber;
 	if (ControllerNumber < 0 ||
 	    ControllerNumber > DAC960_ControllerCount - 1)
-		    break;
+		goto out;
 	Controller = DAC960_Controllers[ControllerNumber];
 	if (Controller == NULL)
-		break;
+		goto out;
 	if (Controller->FirmwareType != DAC960_V2_Controller) {
 		ErrorCode = -EINVAL;
-		break;
+		goto out;
 	}
 	if (copy_from_user(&HealthStatusBuffer,
 			   GetHealthStatus.HealthStatusBuffer,
 			   sizeof(DAC960_V2_HealthStatusBuffer_T))) {
 		ErrorCode = -EFAULT;
-		break;
+		goto out;
 	}
 	ErrorCode = wait_event_interruptible_timeout(Controller->HealthStatusWaitQueue,
 			!(Controller->V2.HealthStatusBuffer->StatusChangeCounter
@@ -7046,7 +7034,7 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
 			DAC960_MonitoringTimerInterval);
 	if (ErrorCode == -ERESTARTSYS) {
 		ErrorCode = -EINTR;
-		break;
+		goto out;
 	}
 	if (copy_to_user(GetHealthStatus.HealthStatusBuffer,
 			 Controller->V2.HealthStatusBuffer,
@@ -7054,7 +7042,39 @@ static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
 		ErrorCode = -EFAULT;
 	else
 		ErrorCode =  0;
-      }
+
+out:
+	return ErrorCode;
+}
+
+/*
+ * DAC960_gam_ioctl is the ioctl function for performing RAID operations.
+*/
+
+static long DAC960_gam_ioctl(struct file *file, unsigned int Request,
+						unsigned long Argument)
+{
+  long ErrorCode = 0;
+  void __user *argp = (void __user *)Argument;
+  if (!capable(CAP_SYS_ADMIN)) return -EACCES;
+
+  mutex_lock(&DAC960_mutex);
+  switch (Request)
+    {
+    case DAC960_IOCTL_GET_CONTROLLER_COUNT:
+      ErrorCode = DAC960_ControllerCount;
+      break;
+    case DAC960_IOCTL_GET_CONTROLLER_INFO:
+      ErrorCode = DAC960_gam_get_controller_info(argp);
+      break;
+    case DAC960_IOCTL_V1_EXECUTE_COMMAND:
+      ErrorCode = DAC960_gam_v1_execute_command(argp);
+      break;
+    case DAC960_IOCTL_V2_EXECUTE_COMMAND:
+      ErrorCode = DAC960_gam_v2_execute_command(argp);
+      break;
+    case DAC960_IOCTL_V2_GET_HEALTH_STATUS:
+      ErrorCode = DAC960_gam_v2_get_health_status(argp);
       break;
       default:
 	ErrorCode = -ENOTTY;
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
index 40579d0..ad9b687 100644
--- a/drivers/block/Kconfig
+++ b/drivers/block/Kconfig
@@ -20,6 +20,10 @@
 	tristate "Null test block driver"
 	select CONFIGFS_FS
 
+config BLK_DEV_NULL_BLK_FAULT_INJECTION
+	bool "Support fault injection for Null test block driver"
+	depends on BLK_DEV_NULL_BLK && FAULT_INJECTION
+
 config BLK_DEV_FD
 	tristate "Normal floppy disk support"
 	depends on ARCH_MAY_HAVE_PC_FDC
diff --git a/drivers/block/aoe/aoe.h b/drivers/block/aoe/aoe.h
index 9220f8e..c0ebda1 100644
--- a/drivers/block/aoe/aoe.h
+++ b/drivers/block/aoe/aoe.h
@@ -112,8 +112,7 @@ enum frame_flags {
 struct frame {
 	struct list_head head;
 	u32 tag;
-	struct timeval sent;	/* high-res time packet was sent */
-	u32 sent_jiffs;		/* low-res jiffies-based sent time */
+	ktime_t sent;			/* high-res time packet was sent */
 	ulong waited;
 	ulong waited_total;
 	struct aoetgt *t;		/* parent target I belong to */
diff --git a/drivers/block/aoe/aoecmd.c b/drivers/block/aoe/aoecmd.c
index 812fed0..540bb60 100644
--- a/drivers/block/aoe/aoecmd.c
+++ b/drivers/block/aoe/aoecmd.c
@@ -398,8 +398,7 @@ aoecmd_ata_rw(struct aoedev *d)
 
 	skb = skb_clone(f->skb, GFP_ATOMIC);
 	if (skb) {
-		do_gettimeofday(&f->sent);
-		f->sent_jiffs = (u32) jiffies;
+		f->sent = ktime_get();
 		__skb_queue_head_init(&queue);
 		__skb_queue_tail(&queue, skb);
 		aoenet_xmit(&queue);
@@ -489,8 +488,7 @@ resend(struct aoedev *d, struct frame *f)
 	skb = skb_clone(skb, GFP_ATOMIC);
 	if (skb == NULL)
 		return;
-	do_gettimeofday(&f->sent);
-	f->sent_jiffs = (u32) jiffies;
+	f->sent = ktime_get();
 	__skb_queue_head_init(&queue);
 	__skb_queue_tail(&queue, skb);
 	aoenet_xmit(&queue);
@@ -499,33 +497,17 @@ resend(struct aoedev *d, struct frame *f)
 static int
 tsince_hr(struct frame *f)
 {
-	struct timeval now;
-	int n;
+	u64 delta = ktime_to_ns(ktime_sub(ktime_get(), f->sent));
 
-	do_gettimeofday(&now);
-	n = now.tv_usec - f->sent.tv_usec;
-	n += (now.tv_sec - f->sent.tv_sec) * USEC_PER_SEC;
+	/* delta is normally under 4.2 seconds, avoid 64-bit division */
+	if (likely(delta <= UINT_MAX))
+		return (u32)delta / NSEC_PER_USEC;
 
-	if (n < 0)
-		n = -n;
+	/* avoid overflow after 71 minutes */
+	if (delta > ((u64)INT_MAX * NSEC_PER_USEC))
+		return INT_MAX;
 
-	/* For relatively long periods, use jiffies to avoid
-	 * discrepancies caused by updates to the system time.
-	 *
-	 * On system with HZ of 1000, 32-bits is over 49 days
-	 * worth of jiffies, or over 71 minutes worth of usecs.
-	 *
-	 * Jiffies overflow is handled by subtraction of unsigned ints:
-	 * (gdb) print (unsigned) 2 - (unsigned) 0xfffffffe
-	 * $3 = 4
-	 * (gdb)
-	 */
-	if (n > USEC_PER_SEC / 4) {
-		n = ((u32) jiffies) - f->sent_jiffs;
-		n *= USEC_PER_SEC / HZ;
-	}
-
-	return n;
+	return div_u64(delta, NSEC_PER_USEC);
 }
 
 static int
@@ -589,7 +571,6 @@ reassign_frame(struct frame *f)
 	nf->waited = 0;
 	nf->waited_total = f->waited_total;
 	nf->sent = f->sent;
-	nf->sent_jiffs = f->sent_jiffs;
 	f->skb = skb;
 
 	return nf;
@@ -633,8 +614,7 @@ probe(struct aoetgt *t)
 
 	skb = skb_clone(f->skb, GFP_ATOMIC);
 	if (skb) {
-		do_gettimeofday(&f->sent);
-		f->sent_jiffs = (u32) jiffies;
+		f->sent = ktime_get();
 		__skb_queue_head_init(&queue);
 		__skb_queue_tail(&queue, skb);
 		aoenet_xmit(&queue);
@@ -1432,10 +1412,8 @@ aoecmd_ata_id(struct aoedev *d)
 	d->timer.function = rexmit_timer;
 
 	skb = skb_clone(skb, GFP_ATOMIC);
-	if (skb) {
-		do_gettimeofday(&f->sent);
-		f->sent_jiffs = (u32) jiffies;
-	}
+	if (skb)
+		f->sent = ktime_get();
 
 	return skb;
 }
diff --git a/drivers/block/drbd/drbd_bitmap.c b/drivers/block/drbd/drbd_bitmap.c
index bd97908..9f4e6f5 100644
--- a/drivers/block/drbd/drbd_bitmap.c
+++ b/drivers/block/drbd/drbd_bitmap.c
@@ -953,7 +953,7 @@ static void drbd_bm_endio(struct bio *bio)
 	struct drbd_bm_aio_ctx *ctx = bio->bi_private;
 	struct drbd_device *device = ctx->device;
 	struct drbd_bitmap *b = device->bitmap;
-	unsigned int idx = bm_page_to_idx(bio->bi_io_vec[0].bv_page);
+	unsigned int idx = bm_page_to_idx(bio_first_page_all(bio));
 
 	if ((ctx->flags & BM_AIO_COPY_PAGES) == 0 &&
 	    !bm_test_page_unchanged(b->bm_pages[idx]))
diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index bc8e615..d5fe720 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1581,9 +1581,8 @@ static int lo_open(struct block_device *bdev, fmode_t mode)
 	return err;
 }
 
-static void lo_release(struct gendisk *disk, fmode_t mode)
+static void __lo_release(struct loop_device *lo)
 {
-	struct loop_device *lo = disk->private_data;
 	int err;
 
 	if (atomic_dec_return(&lo->lo_refcnt))
@@ -1610,6 +1609,13 @@ static void lo_release(struct gendisk *disk, fmode_t mode)
 	mutex_unlock(&lo->lo_ctl_mutex);
 }
 
+static void lo_release(struct gendisk *disk, fmode_t mode)
+{
+	mutex_lock(&loop_index_mutex);
+	__lo_release(disk->private_data);
+	mutex_unlock(&loop_index_mutex);
+}
+
 static const struct block_device_operations lo_fops = {
 	.owner =	THIS_MODULE,
 	.open =		lo_open,
diff --git a/drivers/block/null_blk.c b/drivers/block/null_blk.c
index ad0477a..6655893 100644
--- a/drivers/block/null_blk.c
+++ b/drivers/block/null_blk.c
@@ -12,9 +12,9 @@
 #include <linux/slab.h>
 #include <linux/blk-mq.h>
 #include <linux/hrtimer.h>
-#include <linux/lightnvm.h>
 #include <linux/configfs.h>
 #include <linux/badblocks.h>
+#include <linux/fault-inject.h>
 
 #define SECTOR_SHIFT		9
 #define PAGE_SECTORS_SHIFT	(PAGE_SHIFT - SECTOR_SHIFT)
@@ -27,6 +27,10 @@
 #define TICKS_PER_SEC		50ULL
 #define TIMER_INTERVAL		(NSEC_PER_SEC / TICKS_PER_SEC)
 
+#ifdef CONFIG_BLK_DEV_NULL_BLK_FAULT_INJECTION
+static DECLARE_FAULT_ATTR(null_timeout_attr);
+#endif
+
 static inline u64 mb_per_tick(int mbps)
 {
 	return (1 << 20) / TICKS_PER_SEC * ((u64) mbps);
@@ -107,7 +111,6 @@ struct nullb_device {
 	unsigned int hw_queue_depth; /* queue depth */
 	unsigned int index; /* index of the disk, only valid with a disk */
 	unsigned int mbps; /* Bandwidth throttle cap (in MB/s) */
-	bool use_lightnvm; /* register as a LightNVM device */
 	bool blocking; /* blocking blk-mq device */
 	bool use_per_node_hctx; /* use per-node allocation for hardware context */
 	bool power; /* power on/off the device */
@@ -121,7 +124,6 @@ struct nullb {
 	unsigned int index;
 	struct request_queue *q;
 	struct gendisk *disk;
-	struct nvm_dev *ndev;
 	struct blk_mq_tag_set *tag_set;
 	struct blk_mq_tag_set __tag_set;
 	unsigned int queue_depth;
@@ -139,7 +141,6 @@ static LIST_HEAD(nullb_list);
 static struct mutex lock;
 static int null_major;
 static DEFINE_IDA(nullb_indexes);
-static struct kmem_cache *ppa_cache;
 static struct blk_mq_tag_set tag_set;
 
 enum {
@@ -166,6 +167,11 @@ static int g_home_node = NUMA_NO_NODE;
 module_param_named(home_node, g_home_node, int, S_IRUGO);
 MODULE_PARM_DESC(home_node, "Home node for the device");
 
+#ifdef CONFIG_BLK_DEV_NULL_BLK_FAULT_INJECTION
+static char g_timeout_str[80];
+module_param_string(timeout, g_timeout_str, sizeof(g_timeout_str), S_IRUGO);
+#endif
+
 static int g_queue_mode = NULL_Q_MQ;
 
 static int null_param_store_val(const char *str, int *val, int min, int max)
@@ -208,10 +214,6 @@ static int nr_devices = 1;
 module_param(nr_devices, int, S_IRUGO);
 MODULE_PARM_DESC(nr_devices, "Number of devices to register");
 
-static bool g_use_lightnvm;
-module_param_named(use_lightnvm, g_use_lightnvm, bool, S_IRUGO);
-MODULE_PARM_DESC(use_lightnvm, "Register as a LightNVM device");
-
 static bool g_blocking;
 module_param_named(blocking, g_blocking, bool, S_IRUGO);
 MODULE_PARM_DESC(blocking, "Register as a blocking blk-mq driver device");
@@ -345,7 +347,6 @@ NULLB_DEVICE_ATTR(blocksize, uint);
 NULLB_DEVICE_ATTR(irqmode, uint);
 NULLB_DEVICE_ATTR(hw_queue_depth, uint);
 NULLB_DEVICE_ATTR(index, uint);
-NULLB_DEVICE_ATTR(use_lightnvm, bool);
 NULLB_DEVICE_ATTR(blocking, bool);
 NULLB_DEVICE_ATTR(use_per_node_hctx, bool);
 NULLB_DEVICE_ATTR(memory_backed, bool);
@@ -455,7 +456,6 @@ static struct configfs_attribute *nullb_device_attrs[] = {
 	&nullb_device_attr_irqmode,
 	&nullb_device_attr_hw_queue_depth,
 	&nullb_device_attr_index,
-	&nullb_device_attr_use_lightnvm,
 	&nullb_device_attr_blocking,
 	&nullb_device_attr_use_per_node_hctx,
 	&nullb_device_attr_power,
@@ -573,7 +573,6 @@ static struct nullb_device *null_alloc_dev(void)
 	dev->blocksize = g_bs;
 	dev->irqmode = g_irqmode;
 	dev->hw_queue_depth = g_hw_queue_depth;
-	dev->use_lightnvm = g_use_lightnvm;
 	dev->blocking = g_blocking;
 	dev->use_per_node_hctx = g_use_per_node_hctx;
 	return dev;
@@ -1352,6 +1351,12 @@ static blk_qc_t null_queue_bio(struct request_queue *q, struct bio *bio)
 	return BLK_QC_T_NONE;
 }
 
+static enum blk_eh_timer_return null_rq_timed_out_fn(struct request *rq)
+{
+	pr_info("null: rq %p timed out\n", rq);
+	return BLK_EH_HANDLED;
+}
+
 static int null_rq_prep_fn(struct request_queue *q, struct request *req)
 {
 	struct nullb *nullb = q->queuedata;
@@ -1369,6 +1374,16 @@ static int null_rq_prep_fn(struct request_queue *q, struct request *req)
 	return BLKPREP_DEFER;
 }
 
+static bool should_timeout_request(struct request *rq)
+{
+#ifdef CONFIG_BLK_DEV_NULL_BLK_FAULT_INJECTION
+	if (g_timeout_str[0])
+		return should_fail(&null_timeout_attr, 1);
+#endif
+
+	return false;
+}
+
 static void null_request_fn(struct request_queue *q)
 {
 	struct request *rq;
@@ -1376,12 +1391,20 @@ static void null_request_fn(struct request_queue *q)
 	while ((rq = blk_fetch_request(q)) != NULL) {
 		struct nullb_cmd *cmd = rq->special;
 
-		spin_unlock_irq(q->queue_lock);
-		null_handle_cmd(cmd);
-		spin_lock_irq(q->queue_lock);
+		if (!should_timeout_request(rq)) {
+			spin_unlock_irq(q->queue_lock);
+			null_handle_cmd(cmd);
+			spin_lock_irq(q->queue_lock);
+		}
 	}
 }
 
+static enum blk_eh_timer_return null_timeout_rq(struct request *rq, bool res)
+{
+	pr_info("null: rq %p timed out\n", rq);
+	return BLK_EH_HANDLED;
+}
+
 static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 			 const struct blk_mq_queue_data *bd)
 {
@@ -1399,12 +1422,16 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	blk_mq_start_request(bd->rq);
 
-	return null_handle_cmd(cmd);
+	if (!should_timeout_request(bd->rq))
+		return null_handle_cmd(cmd);
+
+	return BLK_STS_OK;
 }
 
 static const struct blk_mq_ops null_mq_ops = {
 	.queue_rq       = null_queue_rq,
 	.complete	= null_softirq_done_fn,
+	.timeout	= null_timeout_rq,
 };
 
 static void cleanup_queue(struct nullb_queue *nq)
@@ -1423,170 +1450,6 @@ static void cleanup_queues(struct nullb *nullb)
 	kfree(nullb->queues);
 }
 
-#ifdef CONFIG_NVM
-
-static void null_lnvm_end_io(struct request *rq, blk_status_t status)
-{
-	struct nvm_rq *rqd = rq->end_io_data;
-
-	/* XXX: lighnvm core seems to expect NVM_RSP_* values here.. */
-	rqd->error = status ? -EIO : 0;
-	nvm_end_io(rqd);
-
-	blk_put_request(rq);
-}
-
-static int null_lnvm_submit_io(struct nvm_dev *dev, struct nvm_rq *rqd)
-{
-	struct request_queue *q = dev->q;
-	struct request *rq;
-	struct bio *bio = rqd->bio;
-
-	rq = blk_mq_alloc_request(q,
-		op_is_write(bio_op(bio)) ? REQ_OP_DRV_OUT : REQ_OP_DRV_IN, 0);
-	if (IS_ERR(rq))
-		return -ENOMEM;
-
-	blk_init_request_from_bio(rq, bio);
-
-	rq->end_io_data = rqd;
-
-	blk_execute_rq_nowait(q, NULL, rq, 0, null_lnvm_end_io);
-
-	return 0;
-}
-
-static int null_lnvm_id(struct nvm_dev *dev, struct nvm_id *id)
-{
-	struct nullb *nullb = dev->q->queuedata;
-	sector_t size = (sector_t)nullb->dev->size * 1024 * 1024ULL;
-	sector_t blksize;
-	struct nvm_id_group *grp;
-
-	id->ver_id = 0x1;
-	id->vmnt = 0;
-	id->cap = 0x2;
-	id->dom = 0x1;
-
-	id->ppaf.blk_offset = 0;
-	id->ppaf.blk_len = 16;
-	id->ppaf.pg_offset = 16;
-	id->ppaf.pg_len = 16;
-	id->ppaf.sect_offset = 32;
-	id->ppaf.sect_len = 8;
-	id->ppaf.pln_offset = 40;
-	id->ppaf.pln_len = 8;
-	id->ppaf.lun_offset = 48;
-	id->ppaf.lun_len = 8;
-	id->ppaf.ch_offset = 56;
-	id->ppaf.ch_len = 8;
-
-	sector_div(size, nullb->dev->blocksize); /* convert size to pages */
-	size >>= 8; /* concert size to pgs pr blk */
-	grp = &id->grp;
-	grp->mtype = 0;
-	grp->fmtype = 0;
-	grp->num_ch = 1;
-	grp->num_pg = 256;
-	blksize = size;
-	size >>= 16;
-	grp->num_lun = size + 1;
-	sector_div(blksize, grp->num_lun);
-	grp->num_blk = blksize;
-	grp->num_pln = 1;
-
-	grp->fpg_sz = nullb->dev->blocksize;
-	grp->csecs = nullb->dev->blocksize;
-	grp->trdt = 25000;
-	grp->trdm = 25000;
-	grp->tprt = 500000;
-	grp->tprm = 500000;
-	grp->tbet = 1500000;
-	grp->tbem = 1500000;
-	grp->mpos = 0x010101; /* single plane rwe */
-	grp->cpar = nullb->dev->hw_queue_depth;
-
-	return 0;
-}
-
-static void *null_lnvm_create_dma_pool(struct nvm_dev *dev, char *name)
-{
-	mempool_t *virtmem_pool;
-
-	virtmem_pool = mempool_create_slab_pool(64, ppa_cache);
-	if (!virtmem_pool) {
-		pr_err("null_blk: Unable to create virtual memory pool\n");
-		return NULL;
-	}
-
-	return virtmem_pool;
-}
-
-static void null_lnvm_destroy_dma_pool(void *pool)
-{
-	mempool_destroy(pool);
-}
-
-static void *null_lnvm_dev_dma_alloc(struct nvm_dev *dev, void *pool,
-				gfp_t mem_flags, dma_addr_t *dma_handler)
-{
-	return mempool_alloc(pool, mem_flags);
-}
-
-static void null_lnvm_dev_dma_free(void *pool, void *entry,
-							dma_addr_t dma_handler)
-{
-	mempool_free(entry, pool);
-}
-
-static struct nvm_dev_ops null_lnvm_dev_ops = {
-	.identity		= null_lnvm_id,
-	.submit_io		= null_lnvm_submit_io,
-
-	.create_dma_pool	= null_lnvm_create_dma_pool,
-	.destroy_dma_pool	= null_lnvm_destroy_dma_pool,
-	.dev_dma_alloc		= null_lnvm_dev_dma_alloc,
-	.dev_dma_free		= null_lnvm_dev_dma_free,
-
-	/* Simulate nvme protocol restriction */
-	.max_phys_sect		= 64,
-};
-
-static int null_nvm_register(struct nullb *nullb)
-{
-	struct nvm_dev *dev;
-	int rv;
-
-	dev = nvm_alloc_dev(0);
-	if (!dev)
-		return -ENOMEM;
-
-	dev->q = nullb->q;
-	memcpy(dev->name, nullb->disk_name, DISK_NAME_LEN);
-	dev->ops = &null_lnvm_dev_ops;
-
-	rv = nvm_register(dev);
-	if (rv) {
-		kfree(dev);
-		return rv;
-	}
-	nullb->ndev = dev;
-	return 0;
-}
-
-static void null_nvm_unregister(struct nullb *nullb)
-{
-	nvm_unregister(nullb->ndev);
-}
-#else
-static int null_nvm_register(struct nullb *nullb)
-{
-	pr_err("null_blk: CONFIG_NVM needs to be enabled for LightNVM\n");
-	return -EINVAL;
-}
-static void null_nvm_unregister(struct nullb *nullb) {}
-#endif /* CONFIG_NVM */
-
 static void null_del_dev(struct nullb *nullb)
 {
 	struct nullb_device *dev = nullb->dev;
@@ -1595,10 +1458,7 @@ static void null_del_dev(struct nullb *nullb)
 
 	list_del_init(&nullb->list);
 
-	if (dev->use_lightnvm)
-		null_nvm_unregister(nullb);
-	else
-		del_gendisk(nullb->disk);
+	del_gendisk(nullb->disk);
 
 	if (test_bit(NULLB_DEV_FL_THROTTLED, &nullb->dev->flags)) {
 		hrtimer_cancel(&nullb->bw_timer);
@@ -1610,8 +1470,7 @@ static void null_del_dev(struct nullb *nullb)
 	if (dev->queue_mode == NULL_Q_MQ &&
 	    nullb->tag_set == &nullb->__tag_set)
 		blk_mq_free_tag_set(nullb->tag_set);
-	if (!dev->use_lightnvm)
-		put_disk(nullb->disk);
+	put_disk(nullb->disk);
 	cleanup_queues(nullb);
 	if (null_cache_active(nullb))
 		null_free_device_storage(nullb->dev, true);
@@ -1775,11 +1634,6 @@ static void null_validate_conf(struct nullb_device *dev)
 {
 	dev->blocksize = round_down(dev->blocksize, 512);
 	dev->blocksize = clamp_t(unsigned int, dev->blocksize, 512, 4096);
-	if (dev->use_lightnvm && dev->blocksize != 4096)
-		dev->blocksize = 4096;
-
-	if (dev->use_lightnvm && dev->queue_mode != NULL_Q_MQ)
-		dev->queue_mode = NULL_Q_MQ;
 
 	if (dev->queue_mode == NULL_Q_MQ && dev->use_per_node_hctx) {
 		if (dev->submit_queues != nr_online_nodes)
@@ -1805,6 +1659,20 @@ static void null_validate_conf(struct nullb_device *dev)
 		dev->mbps = 0;
 }
 
+static bool null_setup_fault(void)
+{
+#ifdef CONFIG_BLK_DEV_NULL_BLK_FAULT_INJECTION
+	if (!g_timeout_str[0])
+		return true;
+
+	if (!setup_fault_attr(&null_timeout_attr, g_timeout_str))
+		return false;
+
+	null_timeout_attr.verbose = 0;
+#endif
+	return true;
+}
+
 static int null_add_dev(struct nullb_device *dev)
 {
 	struct nullb *nullb;
@@ -1838,6 +1706,10 @@ static int null_add_dev(struct nullb_device *dev)
 		if (rv)
 			goto out_cleanup_queues;
 
+		if (!null_setup_fault())
+			goto out_cleanup_queues;
+
+		nullb->tag_set->timeout = 5 * HZ;
 		nullb->q = blk_mq_init_queue(nullb->tag_set);
 		if (IS_ERR(nullb->q)) {
 			rv = -ENOMEM;
@@ -1861,8 +1733,14 @@ static int null_add_dev(struct nullb_device *dev)
 			rv = -ENOMEM;
 			goto out_cleanup_queues;
 		}
+
+		if (!null_setup_fault())
+			goto out_cleanup_blk_queue;
+
 		blk_queue_prep_rq(nullb->q, null_rq_prep_fn);
 		blk_queue_softirq_done(nullb->q, null_softirq_done_fn);
+		blk_queue_rq_timed_out(nullb->q, null_rq_timed_out_fn);
+		nullb->q->rq_timeout = 5 * HZ;
 		rv = init_driver_queues(nullb);
 		if (rv)
 			goto out_cleanup_blk_queue;
@@ -1895,11 +1773,7 @@ static int null_add_dev(struct nullb_device *dev)
 
 	sprintf(nullb->disk_name, "nullb%d", nullb->index);
 
-	if (dev->use_lightnvm)
-		rv = null_nvm_register(nullb);
-	else
-		rv = null_gendisk_register(nullb);
-
+	rv = null_gendisk_register(nullb);
 	if (rv)
 		goto out_cleanup_blk_queue;
 
@@ -1938,18 +1812,6 @@ static int __init null_init(void)
 		g_bs = PAGE_SIZE;
 	}
 
-	if (g_use_lightnvm && g_bs != 4096) {
-		pr_warn("null_blk: LightNVM only supports 4k block size\n");
-		pr_warn("null_blk: defaults block size to 4k\n");
-		g_bs = 4096;
-	}
-
-	if (g_use_lightnvm && g_queue_mode != NULL_Q_MQ) {
-		pr_warn("null_blk: LightNVM only supported for blk-mq\n");
-		pr_warn("null_blk: defaults queue mode to blk-mq\n");
-		g_queue_mode = NULL_Q_MQ;
-	}
-
 	if (g_queue_mode == NULL_Q_MQ && g_use_per_node_hctx) {
 		if (g_submit_queues != nr_online_nodes) {
 			pr_warn("null_blk: submit_queues param is set to %u.\n",
@@ -1982,16 +1844,6 @@ static int __init null_init(void)
 		goto err_conf;
 	}
 
-	if (g_use_lightnvm) {
-		ppa_cache = kmem_cache_create("ppa_cache", 64 * sizeof(u64),
-								0, 0, NULL);
-		if (!ppa_cache) {
-			pr_err("null_blk: unable to create ppa cache\n");
-			ret = -ENOMEM;
-			goto err_ppa;
-		}
-	}
-
 	for (i = 0; i < nr_devices; i++) {
 		dev = null_alloc_dev();
 		if (!dev) {
@@ -2015,8 +1867,6 @@ static int __init null_init(void)
 		null_del_dev(nullb);
 		null_free_dev(dev);
 	}
-	kmem_cache_destroy(ppa_cache);
-err_ppa:
 	unregister_blkdev(null_major, "nullb");
 err_conf:
 	configfs_unregister_subsystem(&nullb_subsys);
@@ -2047,8 +1897,6 @@ static void __exit null_exit(void)
 
 	if (g_queue_mode == NULL_Q_MQ && shared_tags)
 		blk_mq_free_tag_set(&tag_set);
-
-	kmem_cache_destroy(ppa_cache);
 }
 
 module_init(null_init);
diff --git a/drivers/block/pktcdvd.c b/drivers/block/pktcdvd.c
index 6797479..531a091 100644
--- a/drivers/block/pktcdvd.c
+++ b/drivers/block/pktcdvd.c
@@ -2579,14 +2579,14 @@ static int pkt_new_dev(struct pktcdvd_device *pd, dev_t dev)
 	bdev = bdget(dev);
 	if (!bdev)
 		return -ENOMEM;
-	if (!blk_queue_scsi_passthrough(bdev_get_queue(bdev))) {
-		WARN_ONCE(true, "Attempt to register a non-SCSI queue\n");
-		bdput(bdev);
-		return -EINVAL;
-	}
 	ret = blkdev_get(bdev, FMODE_READ | FMODE_NDELAY, NULL);
 	if (ret)
 		return ret;
+	if (!blk_queue_scsi_passthrough(bdev_get_queue(bdev))) {
+		WARN_ONCE(true, "Attempt to register a non-SCSI queue\n");
+		blkdev_put(bdev, FMODE_READ | FMODE_NDELAY);
+		return -EINVAL;
+	}
 
 	/* This is safe, since we have a reference from open(). */
 	__module_get(THIS_MODULE);
@@ -2745,7 +2745,7 @@ static int pkt_setup_dev(dev_t dev, dev_t* pkt_dev)
 	pd->pkt_dev = MKDEV(pktdev_major, idx);
 	ret = pkt_new_dev(pd, dev);
 	if (ret)
-		goto out_new_dev;
+		goto out_mem2;
 
 	/* inherit events of the host device */
 	disk->events = pd->bdev->bd_disk->events;
@@ -2763,8 +2763,6 @@ static int pkt_setup_dev(dev_t dev, dev_t* pkt_dev)
 	mutex_unlock(&ctl_mutex);
 	return 0;
 
-out_new_dev:
-	blk_cleanup_queue(disk->queue);
 out_mem2:
 	put_disk(disk);
 out_mem:
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 38fc5f3..cc93522 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3047,13 +3047,21 @@ static void format_lock_cookie(struct rbd_device *rbd_dev, char *buf)
 	mutex_unlock(&rbd_dev->watch_mutex);
 }
 
+static void __rbd_lock(struct rbd_device *rbd_dev, const char *cookie)
+{
+	struct rbd_client_id cid = rbd_get_cid(rbd_dev);
+
+	strcpy(rbd_dev->lock_cookie, cookie);
+	rbd_set_owner_cid(rbd_dev, &cid);
+	queue_work(rbd_dev->task_wq, &rbd_dev->acquired_lock_work);
+}
+
 /*
  * lock_rwsem must be held for write
  */
 static int rbd_lock(struct rbd_device *rbd_dev)
 {
 	struct ceph_osd_client *osdc = &rbd_dev->rbd_client->client->osdc;
-	struct rbd_client_id cid = rbd_get_cid(rbd_dev);
 	char cookie[32];
 	int ret;
 
@@ -3068,9 +3076,7 @@ static int rbd_lock(struct rbd_device *rbd_dev)
 		return ret;
 
 	rbd_dev->lock_state = RBD_LOCK_STATE_LOCKED;
-	strcpy(rbd_dev->lock_cookie, cookie);
-	rbd_set_owner_cid(rbd_dev, &cid);
-	queue_work(rbd_dev->task_wq, &rbd_dev->acquired_lock_work);
+	__rbd_lock(rbd_dev, cookie);
 	return 0;
 }
 
@@ -3856,7 +3862,7 @@ static void rbd_reacquire_lock(struct rbd_device *rbd_dev)
 			queue_delayed_work(rbd_dev->task_wq,
 					   &rbd_dev->lock_dwork, 0);
 	} else {
-		strcpy(rbd_dev->lock_cookie, cookie);
+		__rbd_lock(rbd_dev, cookie);
 	}
 }
 
@@ -4381,7 +4387,7 @@ static int rbd_init_disk(struct rbd_device *rbd_dev)
 	segment_size = rbd_obj_bytes(&rbd_dev->header);
 	blk_queue_max_hw_sectors(q, segment_size / SECTOR_SIZE);
 	q->limits.max_sectors = queue_max_hw_sectors(q);
-	blk_queue_max_segments(q, segment_size / SECTOR_SIZE);
+	blk_queue_max_segments(q, USHRT_MAX);
 	blk_queue_max_segment_size(q, segment_size);
 	blk_queue_io_min(q, segment_size);
 	blk_queue_io_opt(q, segment_size);
diff --git a/drivers/block/smart1,2.h b/drivers/block/smart1,2.h
deleted file mode 100644
index e5565fb..0000000
--- a/drivers/block/smart1,2.h
+++ /dev/null
@@ -1,278 +0,0 @@
-/*
- *    Disk Array driver for Compaq SMART2 Controllers
- *    Copyright 1998 Compaq Computer Corporation
- *
- *    This program is free software; you can redistribute it and/or modify
- *    it under the terms of the GNU General Public License as published by
- *    the Free Software Foundation; either version 2 of the License, or
- *    (at your option) any later version.
- *
- *    This program is distributed in the hope that it will be useful,
- *    but WITHOUT ANY WARRANTY; without even the implied warranty of
- *    MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE, GOOD TITLE or
- *    NON INFRINGEMENT.  See the GNU General Public License for more details.
- *
- *    You should have received a copy of the GNU General Public License
- *    along with this program; if not, write to the Free Software
- *    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
- *
- *    Questions/Comments/Bugfixes to iss_storagedev@hp.com
- *
- *    If you want to make changes, improve or add functionality to this
- *    driver, you'll probably need the Compaq Array Controller Interface
- *    Specificiation (Document number ECG086/1198)
- */
-
-/*
- * This file contains the controller communication implementation for
- * Compaq SMART-1 and SMART-2 controllers.  To the best of my knowledge,
- * this should support:
- *
- *  PCI:
- *  SMART-2/P, SMART-2DH, SMART-2SL, SMART-221, SMART-3100ES, SMART-3200
- *  Integerated SMART Array Controller, SMART-4200, SMART-4250ES
- *
- *  EISA:
- *  SMART-2/E, SMART, IAES, IDA-2, IDA
- */
-
-/*
- * Memory mapped FIFO interface (SMART 42xx cards)
- */
-static void smart4_submit_command(ctlr_info_t *h, cmdlist_t *c)
-{
-        writel(c->busaddr, h->vaddr + S42XX_REQUEST_PORT_OFFSET);
-}
-
-/*  
- *  This card is the opposite of the other cards.  
- *   0 turns interrupts on... 
- *   0x08 turns them off... 
- */
-static void smart4_intr_mask(ctlr_info_t *h, unsigned long val)
-{
-	if (val) 
-	{ /* Turn interrupts on */
-		writel(0, h->vaddr + S42XX_REPLY_INTR_MASK_OFFSET);
-	} else /* Turn them off */
-	{
-        	writel( S42XX_INTR_OFF, 
-			h->vaddr + S42XX_REPLY_INTR_MASK_OFFSET);
-	}
-}
-
-/*
- *  For older cards FIFO Full = 0. 
- *  On this card 0 means there is room, anything else FIFO Full. 
- * 
- */ 
-static unsigned long smart4_fifo_full(ctlr_info_t *h)
-{
-	
-        return (!readl(h->vaddr + S42XX_REQUEST_PORT_OFFSET));
-}
-
-/* This type of controller returns -1 if the fifo is empty, 
- *    Not 0 like the others.
- *    And we need to let it know we read a value out 
- */ 
-static unsigned long smart4_completed(ctlr_info_t *h)
-{
-	long register_value 
-		= readl(h->vaddr + S42XX_REPLY_PORT_OFFSET);
-
-	/* Fifo is empty */
-	if( register_value == 0xffffffff)
-		return 0; 	
-
-	/* Need to let it know we got the reply */
-	/* We do this by writing a 0 to the port we just read from */
-	writel(0, h->vaddr + S42XX_REPLY_PORT_OFFSET);
-
-	return ((unsigned long) register_value); 
-}
-
- /*
- *  This hardware returns interrupt pending at a different place and 
- *  it does not tell us if the fifo is empty, we will have check  
- *  that by getting a 0 back from the command_completed call. 
- */
-static unsigned long smart4_intr_pending(ctlr_info_t *h)
-{
-	unsigned long register_value  = 
-		readl(h->vaddr + S42XX_INTR_STATUS);
-
-	if( register_value &  S42XX_INTR_PENDING) 
-		return  FIFO_NOT_EMPTY;	
-	return 0 ;
-}
-
-static struct access_method smart4_access = {
-	smart4_submit_command,
-	smart4_intr_mask,
-	smart4_fifo_full,
-	smart4_intr_pending,
-	smart4_completed,
-};
-
-/*
- * Memory mapped FIFO interface (PCI SMART2 and SMART 3xxx cards)
- */
-static void smart2_submit_command(ctlr_info_t *h, cmdlist_t *c)
-{
-	writel(c->busaddr, h->vaddr + COMMAND_FIFO);
-}
-
-static void smart2_intr_mask(ctlr_info_t *h, unsigned long val)
-{
-	writel(val, h->vaddr + INTR_MASK);
-}
-
-static unsigned long smart2_fifo_full(ctlr_info_t *h)
-{
-	return readl(h->vaddr + COMMAND_FIFO);
-}
-
-static unsigned long smart2_completed(ctlr_info_t *h)
-{
-	return readl(h->vaddr + COMMAND_COMPLETE_FIFO);
-}
-
-static unsigned long smart2_intr_pending(ctlr_info_t *h)
-{
-	return readl(h->vaddr + INTR_PENDING);
-}
-
-static struct access_method smart2_access = {
-	smart2_submit_command,
-	smart2_intr_mask,
-	smart2_fifo_full,
-	smart2_intr_pending,
-	smart2_completed,
-};
-
-/*
- *  IO access for SMART-2/E cards
- */
-static void smart2e_submit_command(ctlr_info_t *h, cmdlist_t *c)
-{
-	outl(c->busaddr, h->io_mem_addr + COMMAND_FIFO);
-}
-
-static void smart2e_intr_mask(ctlr_info_t *h, unsigned long val)
-{
-	outl(val, h->io_mem_addr + INTR_MASK);
-}
-
-static unsigned long smart2e_fifo_full(ctlr_info_t *h)
-{
-	return inl(h->io_mem_addr + COMMAND_FIFO);
-}
-
-static unsigned long smart2e_completed(ctlr_info_t *h)
-{
-	return inl(h->io_mem_addr + COMMAND_COMPLETE_FIFO);
-}
-
-static unsigned long smart2e_intr_pending(ctlr_info_t *h)
-{
-	return inl(h->io_mem_addr + INTR_PENDING);
-}
-
-static struct access_method smart2e_access = {
-	smart2e_submit_command,
-	smart2e_intr_mask,
-	smart2e_fifo_full,
-	smart2e_intr_pending,
-	smart2e_completed,
-};
-
-/*
- *  IO access for older SMART-1 type cards
- */
-#define SMART1_SYSTEM_MASK		0xC8E
-#define SMART1_SYSTEM_DOORBELL		0xC8F
-#define SMART1_LOCAL_MASK		0xC8C
-#define SMART1_LOCAL_DOORBELL		0xC8D
-#define SMART1_INTR_MASK		0xC89
-#define SMART1_LISTADDR			0xC90
-#define SMART1_LISTLEN			0xC94
-#define SMART1_TAG			0xC97
-#define SMART1_COMPLETE_ADDR		0xC98
-#define SMART1_LISTSTATUS		0xC9E
-
-#define CHANNEL_BUSY			0x01
-#define CHANNEL_CLEAR			0x02
-
-static void smart1_submit_command(ctlr_info_t *h, cmdlist_t *c)
-{
-	/*
-	 * This __u16 is actually a bunch of control flags on SMART
-	 * and below.  We want them all to be zero.
-	 */
-	c->hdr.size = 0;
-
-	outb(CHANNEL_CLEAR, h->io_mem_addr + SMART1_SYSTEM_DOORBELL);
-
-	outl(c->busaddr, h->io_mem_addr + SMART1_LISTADDR);
-	outw(c->size, h->io_mem_addr + SMART1_LISTLEN);
-
-	outb(CHANNEL_BUSY, h->io_mem_addr + SMART1_LOCAL_DOORBELL);
-}
-
-static void smart1_intr_mask(ctlr_info_t *h, unsigned long val)
-{
-	if (val == 1) {
-		outb(0xFD, h->io_mem_addr + SMART1_SYSTEM_DOORBELL);
-		outb(CHANNEL_BUSY, h->io_mem_addr + SMART1_LOCAL_DOORBELL);
-		outb(0x01, h->io_mem_addr + SMART1_INTR_MASK);
-		outb(0x01, h->io_mem_addr + SMART1_SYSTEM_MASK);
-	} else {
-		outb(0, h->io_mem_addr + 0xC8E);
-	}
-}
-
-static unsigned long smart1_fifo_full(ctlr_info_t *h)
-{
-	unsigned char chan;
-	chan = inb(h->io_mem_addr + SMART1_SYSTEM_DOORBELL) & CHANNEL_CLEAR;
-	return chan;
-}
-
-static unsigned long smart1_completed(ctlr_info_t *h)
-{
-	unsigned char status;
-	unsigned long cmd;
-
-	if (inb(h->io_mem_addr + SMART1_SYSTEM_DOORBELL) & CHANNEL_BUSY) {
-		outb(CHANNEL_BUSY, h->io_mem_addr + SMART1_SYSTEM_DOORBELL);
-
-		cmd = inl(h->io_mem_addr + SMART1_COMPLETE_ADDR);
-		status = inb(h->io_mem_addr + SMART1_LISTSTATUS);
-
-		outb(CHANNEL_CLEAR, h->io_mem_addr + SMART1_LOCAL_DOORBELL);
-
-		/*
-		 * this is x86 (actually compaq x86) only, so it's ok
-		 */
-		if (cmd) ((cmdlist_t*)bus_to_virt(cmd))->req.hdr.rcode = status;
-	} else {
-		cmd = 0;
-	}
-	return cmd;
-}
-
-static unsigned long smart1_intr_pending(ctlr_info_t *h)
-{
-	unsigned char chan;
-	chan = inb(h->io_mem_addr + SMART1_SYSTEM_DOORBELL) & CHANNEL_BUSY;
-	return chan;
-}
-
-static struct access_method smart1_access = {
-	smart1_submit_command,
-	smart1_intr_mask,
-	smart1_fifo_full,
-	smart1_intr_pending,
-	smart1_completed,
-};
diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index d70eba3..0afa6c8 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -430,7 +430,7 @@ static void put_entry_bdev(struct zram *zram, unsigned long entry)
 
 static void zram_page_end_io(struct bio *bio)
 {
-	struct page *page = bio->bi_io_vec[0].bv_page;
+	struct page *page = bio_first_page_all(bio);
 
 	page_endio(page, op_is_write(bio_op(bio)),
 			blk_status_to_errno(bio->bi_status));
diff --git a/drivers/bus/Kconfig b/drivers/bus/Kconfig
index dc7b3c7..57e011d 100644
--- a/drivers/bus/Kconfig
+++ b/drivers/bus/Kconfig
@@ -120,7 +120,7 @@
 	  SRAM, ethernet adapters, FPGAs and LCD displays.
 
 config SIMPLE_PM_BUS
-	bool "Simple Power-Managed Bus Driver"
+	tristate "Simple Power-Managed Bus Driver"
 	depends on OF && PM
 	help
 	  Driver for transparent busses that don't need a real driver, but
diff --git a/drivers/bus/sunxi-rsb.c b/drivers/bus/sunxi-rsb.c
index 328ca93..1b76d95 100644
--- a/drivers/bus/sunxi-rsb.c
+++ b/drivers/bus/sunxi-rsb.c
@@ -178,6 +178,7 @@ static struct bus_type sunxi_rsb_bus = {
 	.match		= sunxi_rsb_device_match,
 	.probe		= sunxi_rsb_device_probe,
 	.remove		= sunxi_rsb_device_remove,
+	.uevent		= of_device_uevent_modalias,
 };
 
 static void sunxi_rsb_dev_release(struct device *dev)
diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index c729a88..b3b4ed9 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -269,6 +269,7 @@
 	bool "Clocksource for STM32 SoCs" if !ARCH_STM32
 	depends on OF && ARM && (ARCH_STM32 || COMPILE_TEST)
 	select CLKSRC_MMIO
+	select TIMER_OF
 
 config CLKSRC_MPS2
 	bool "Clocksource for MPS2 SoCs" if COMPILE_TEST
@@ -441,6 +442,13 @@
 	help
 	  Support for Mediatek timer driver.
 
+config SPRD_TIMER
+	bool "Spreadtrum timer driver" if COMPILE_TEST
+	depends on HAS_IOMEM
+	select TIMER_OF
+	help
+	  Enables support for the Spreadtrum timer driver.
+
 config SYS_SUPPORTS_SH_MTU2
         bool
 
diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile
index 72711f1..d6dec44 100644
--- a/drivers/clocksource/Makefile
+++ b/drivers/clocksource/Makefile
@@ -54,6 +54,7 @@
 obj-$(CONFIG_CLKSRC_NPS)	+= timer-nps.o
 obj-$(CONFIG_OXNAS_RPS_TIMER)	+= timer-oxnas-rps.o
 obj-$(CONFIG_OWL_TIMER)		+= owl-timer.o
+obj-$(CONFIG_SPRD_TIMER)	+= timer-sprd.o
 
 obj-$(CONFIG_ARC_TIMERS)		+= arc_timer.o
 obj-$(CONFIG_ARM_ARCH_TIMER)		+= arm_arch_timer.o
diff --git a/drivers/clocksource/owl-timer.c b/drivers/clocksource/owl-timer.c
index c686305..ea00a5e 100644
--- a/drivers/clocksource/owl-timer.c
+++ b/drivers/clocksource/owl-timer.c
@@ -168,5 +168,6 @@ static int __init owl_timer_init(struct device_node *node)
 
 	return 0;
 }
-CLOCKSOURCE_OF_DECLARE(owl_s500, "actions,s500-timer", owl_timer_init);
-CLOCKSOURCE_OF_DECLARE(owl_s900, "actions,s900-timer", owl_timer_init);
+TIMER_OF_DECLARE(owl_s500, "actions,s500-timer", owl_timer_init);
+TIMER_OF_DECLARE(owl_s700, "actions,s700-timer", owl_timer_init);
+TIMER_OF_DECLARE(owl_s900, "actions,s900-timer", owl_timer_init);
diff --git a/drivers/clocksource/tcb_clksrc.c b/drivers/clocksource/tcb_clksrc.c
index 9de47d4..43f4d5c 100644
--- a/drivers/clocksource/tcb_clksrc.c
+++ b/drivers/clocksource/tcb_clksrc.c
@@ -384,7 +384,7 @@ static int __init tcb_clksrc_init(void)
 
 	printk(bootinfo, clksrc.name, CONFIG_ATMEL_TCB_CLKSRC_BLOCK,
 			divided_rate / 1000000,
-			((divided_rate + 500000) % 1000000) / 1000);
+			((divided_rate % 1000000) + 500) / 1000);
 
 	if (tc->tcb_config && tc->tcb_config->counter_width == 32) {
 		/* use apropriate function to read 32 bit counter */
diff --git a/drivers/clocksource/timer-of.c b/drivers/clocksource/timer-of.c
index a319904..06ed88a 100644
--- a/drivers/clocksource/timer-of.c
+++ b/drivers/clocksource/timer-of.c
@@ -24,7 +24,13 @@
 
 #include "timer-of.h"
 
-static __init void timer_irq_exit(struct of_timer_irq *of_irq)
+/**
+ * timer_of_irq_exit - Release the interrupt
+ * @of_irq: an of_timer_irq structure pointer
+ *
+ * Free the irq resource
+ */
+static __init void timer_of_irq_exit(struct of_timer_irq *of_irq)
 {
 	struct timer_of *to = container_of(of_irq, struct timer_of, of_irq);
 
@@ -34,8 +40,24 @@ static __init void timer_irq_exit(struct of_timer_irq *of_irq)
 		free_irq(of_irq->irq, clkevt);
 }
 
-static __init int timer_irq_init(struct device_node *np,
-				 struct of_timer_irq *of_irq)
+/**
+ * timer_of_irq_init - Request the interrupt
+ * @np: a device tree node pointer
+ * @of_irq: an of_timer_irq structure pointer
+ *
+ * Get the interrupt number from the DT from its definition and
+ * request it. The interrupt is gotten by falling back the following way:
+ *
+ * - Get interrupt number by name
+ * - Get interrupt number by index
+ *
+ * When the interrupt is per CPU, 'request_percpu_irq()' is called,
+ * otherwise 'request_irq()' is used.
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static __init int timer_of_irq_init(struct device_node *np,
+				    struct of_timer_irq *of_irq)
 {
 	int ret;
 	struct timer_of *to = container_of(of_irq, struct timer_of, of_irq);
@@ -72,15 +94,30 @@ static __init int timer_irq_init(struct device_node *np,
 	return 0;
 }
 
-static __init void timer_clk_exit(struct of_timer_clk *of_clk)
+/**
+ * timer_of_clk_exit - Release the clock resources
+ * @of_clk: a of_timer_clk structure pointer
+ *
+ * Disables and releases the refcount on the clk
+ */
+static __init void timer_of_clk_exit(struct of_timer_clk *of_clk)
 {
 	of_clk->rate = 0;
 	clk_disable_unprepare(of_clk->clk);
 	clk_put(of_clk->clk);
 }
 
-static __init int timer_clk_init(struct device_node *np,
-				 struct of_timer_clk *of_clk)
+/**
+ * timer_of_clk_init - Initialize the clock resources
+ * @np: a device tree node pointer
+ * @of_clk: a of_timer_clk structure pointer
+ *
+ * Get the clock by name or by index, enable it and get the rate
+ *
+ * Returns 0 on success, < 0 otherwise
+ */
+static __init int timer_of_clk_init(struct device_node *np,
+				    struct of_timer_clk *of_clk)
 {
 	int ret;
 
@@ -116,19 +153,19 @@ static __init int timer_clk_init(struct device_node *np,
 	goto out;
 }
 
-static __init void timer_base_exit(struct of_timer_base *of_base)
+static __init void timer_of_base_exit(struct of_timer_base *of_base)
 {
 	iounmap(of_base->base);
 }
 
-static __init int timer_base_init(struct device_node *np,
-				  struct of_timer_base *of_base)
+static __init int timer_of_base_init(struct device_node *np,
+				     struct of_timer_base *of_base)
 {
-	const char *name = of_base->name ? of_base->name : np->full_name;
-
-	of_base->base = of_io_request_and_map(np, of_base->index, name);
+	of_base->base = of_base->name ?
+		of_io_request_and_map(np, of_base->index, of_base->name) :
+		of_iomap(np, of_base->index);
 	if (IS_ERR(of_base->base)) {
-		pr_err("Failed to iomap (%s)\n", name);
+		pr_err("Failed to iomap (%s)\n", of_base->name);
 		return PTR_ERR(of_base->base);
 	}
 
@@ -141,21 +178,21 @@ int __init timer_of_init(struct device_node *np, struct timer_of *to)
 	int flags = 0;
 
 	if (to->flags & TIMER_OF_BASE) {
-		ret = timer_base_init(np, &to->of_base);
+		ret = timer_of_base_init(np, &to->of_base);
 		if (ret)
 			goto out_fail;
 		flags |= TIMER_OF_BASE;
 	}
 
 	if (to->flags & TIMER_OF_CLOCK) {
-		ret = timer_clk_init(np, &to->of_clk);
+		ret = timer_of_clk_init(np, &to->of_clk);
 		if (ret)
 			goto out_fail;
 		flags |= TIMER_OF_CLOCK;
 	}
 
 	if (to->flags & TIMER_OF_IRQ) {
-		ret = timer_irq_init(np, &to->of_irq);
+		ret = timer_of_irq_init(np, &to->of_irq);
 		if (ret)
 			goto out_fail;
 		flags |= TIMER_OF_IRQ;
@@ -163,17 +200,20 @@ int __init timer_of_init(struct device_node *np, struct timer_of *to)
 
 	if (!to->clkevt.name)
 		to->clkevt.name = np->name;
+
+	to->np = np;
+
 	return ret;
 
 out_fail:
 	if (flags & TIMER_OF_IRQ)
-		timer_irq_exit(&to->of_irq);
+		timer_of_irq_exit(&to->of_irq);
 
 	if (flags & TIMER_OF_CLOCK)
-		timer_clk_exit(&to->of_clk);
+		timer_of_clk_exit(&to->of_clk);
 
 	if (flags & TIMER_OF_BASE)
-		timer_base_exit(&to->of_base);
+		timer_of_base_exit(&to->of_base);
 	return ret;
 }
 
@@ -187,11 +227,11 @@ int __init timer_of_init(struct device_node *np, struct timer_of *to)
 void __init timer_of_cleanup(struct timer_of *to)
 {
 	if (to->flags & TIMER_OF_IRQ)
-		timer_irq_exit(&to->of_irq);
+		timer_of_irq_exit(&to->of_irq);
 
 	if (to->flags & TIMER_OF_CLOCK)
-		timer_clk_exit(&to->of_clk);
+		timer_of_clk_exit(&to->of_clk);
 
 	if (to->flags & TIMER_OF_BASE)
-		timer_base_exit(&to->of_base);
+		timer_of_base_exit(&to->of_base);
 }
diff --git a/drivers/clocksource/timer-of.h b/drivers/clocksource/timer-of.h
index 3f708f1..a5478f3 100644
--- a/drivers/clocksource/timer-of.h
+++ b/drivers/clocksource/timer-of.h
@@ -33,6 +33,7 @@ struct of_timer_clk {
 
 struct timer_of {
 	unsigned int flags;
+	struct device_node *np;
 	struct clock_event_device clkevt;
 	struct of_timer_base of_base;
 	struct of_timer_irq  of_irq;
diff --git a/drivers/clocksource/timer-sprd.c b/drivers/clocksource/timer-sprd.c
new file mode 100644
index 0000000..ef9ebea
--- /dev/null
+++ b/drivers/clocksource/timer-sprd.c
@@ -0,0 +1,159 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2017 Spreadtrum Communications Inc.
+ */
+
+#include <linux/init.h>
+#include <linux/interrupt.h>
+
+#include "timer-of.h"
+
+#define TIMER_NAME		"sprd_timer"
+
+#define TIMER_LOAD_LO		0x0
+#define TIMER_LOAD_HI		0x4
+#define TIMER_VALUE_LO		0x8
+#define TIMER_VALUE_HI		0xc
+
+#define TIMER_CTL		0x10
+#define TIMER_CTL_PERIOD_MODE	BIT(0)
+#define TIMER_CTL_ENABLE	BIT(1)
+#define TIMER_CTL_64BIT_WIDTH	BIT(16)
+
+#define TIMER_INT		0x14
+#define TIMER_INT_EN		BIT(0)
+#define TIMER_INT_RAW_STS	BIT(1)
+#define TIMER_INT_MASK_STS	BIT(2)
+#define TIMER_INT_CLR		BIT(3)
+
+#define TIMER_VALUE_SHDW_LO	0x18
+#define TIMER_VALUE_SHDW_HI	0x1c
+
+#define TIMER_VALUE_LO_MASK	GENMASK(31, 0)
+
+static void sprd_timer_enable(void __iomem *base, u32 flag)
+{
+	u32 val = readl_relaxed(base + TIMER_CTL);
+
+	val |= TIMER_CTL_ENABLE;
+	if (flag & TIMER_CTL_64BIT_WIDTH)
+		val |= TIMER_CTL_64BIT_WIDTH;
+	else
+		val &= ~TIMER_CTL_64BIT_WIDTH;
+
+	if (flag & TIMER_CTL_PERIOD_MODE)
+		val |= TIMER_CTL_PERIOD_MODE;
+	else
+		val &= ~TIMER_CTL_PERIOD_MODE;
+
+	writel_relaxed(val, base + TIMER_CTL);
+}
+
+static void sprd_timer_disable(void __iomem *base)
+{
+	u32 val = readl_relaxed(base + TIMER_CTL);
+
+	val &= ~TIMER_CTL_ENABLE;
+	writel_relaxed(val, base + TIMER_CTL);
+}
+
+static void sprd_timer_update_counter(void __iomem *base, unsigned long cycles)
+{
+	writel_relaxed(cycles & TIMER_VALUE_LO_MASK, base + TIMER_LOAD_LO);
+	writel_relaxed(0, base + TIMER_LOAD_HI);
+}
+
+static void sprd_timer_enable_interrupt(void __iomem *base)
+{
+	writel_relaxed(TIMER_INT_EN, base + TIMER_INT);
+}
+
+static void sprd_timer_clear_interrupt(void __iomem *base)
+{
+	u32 val = readl_relaxed(base + TIMER_INT);
+
+	val |= TIMER_INT_CLR;
+	writel_relaxed(val, base + TIMER_INT);
+}
+
+static int sprd_timer_set_next_event(unsigned long cycles,
+				     struct clock_event_device *ce)
+{
+	struct timer_of *to = to_timer_of(ce);
+
+	sprd_timer_disable(timer_of_base(to));
+	sprd_timer_update_counter(timer_of_base(to), cycles);
+	sprd_timer_enable(timer_of_base(to), 0);
+
+	return 0;
+}
+
+static int sprd_timer_set_periodic(struct clock_event_device *ce)
+{
+	struct timer_of *to = to_timer_of(ce);
+
+	sprd_timer_disable(timer_of_base(to));
+	sprd_timer_update_counter(timer_of_base(to), timer_of_period(to));
+	sprd_timer_enable(timer_of_base(to), TIMER_CTL_PERIOD_MODE);
+
+	return 0;
+}
+
+static int sprd_timer_shutdown(struct clock_event_device *ce)
+{
+	struct timer_of *to = to_timer_of(ce);
+
+	sprd_timer_disable(timer_of_base(to));
+	return 0;
+}
+
+static irqreturn_t sprd_timer_interrupt(int irq, void *dev_id)
+{
+	struct clock_event_device *ce = (struct clock_event_device *)dev_id;
+	struct timer_of *to = to_timer_of(ce);
+
+	sprd_timer_clear_interrupt(timer_of_base(to));
+
+	if (clockevent_state_oneshot(ce))
+		sprd_timer_disable(timer_of_base(to));
+
+	ce->event_handler(ce);
+	return IRQ_HANDLED;
+}
+
+static struct timer_of to = {
+	.flags = TIMER_OF_IRQ | TIMER_OF_BASE | TIMER_OF_CLOCK,
+
+	.clkevt = {
+		.name = TIMER_NAME,
+		.rating = 300,
+		.features = CLOCK_EVT_FEAT_DYNIRQ | CLOCK_EVT_FEAT_PERIODIC |
+			CLOCK_EVT_FEAT_ONESHOT,
+		.set_state_shutdown = sprd_timer_shutdown,
+		.set_state_periodic = sprd_timer_set_periodic,
+		.set_next_event = sprd_timer_set_next_event,
+		.cpumask = cpu_possible_mask,
+	},
+
+	.of_irq = {
+		.handler = sprd_timer_interrupt,
+		.flags = IRQF_TIMER | IRQF_IRQPOLL,
+	},
+};
+
+static int __init sprd_timer_init(struct device_node *np)
+{
+	int ret;
+
+	ret = timer_of_init(np, &to);
+	if (ret)
+		return ret;
+
+	sprd_timer_enable_interrupt(timer_of_base(&to));
+	clockevents_config_and_register(&to.clkevt, timer_of_rate(&to),
+					1, UINT_MAX);
+
+	return 0;
+}
+
+TIMER_OF_DECLARE(sc9860_timer, "sprd,sc9860-timer", sprd_timer_init);
diff --git a/drivers/clocksource/timer-stm32.c b/drivers/clocksource/timer-stm32.c
index 8f24237..e5cdc3a 100644
--- a/drivers/clocksource/timer-stm32.c
+++ b/drivers/clocksource/timer-stm32.c
@@ -9,6 +9,7 @@
 #include <linux/kernel.h>
 #include <linux/clocksource.h>
 #include <linux/clockchips.h>
+#include <linux/delay.h>
 #include <linux/irq.h>
 #include <linux/interrupt.h>
 #include <linux/of.h>
@@ -16,175 +17,318 @@
 #include <linux/of_irq.h>
 #include <linux/clk.h>
 #include <linux/reset.h>
+#include <linux/sched_clock.h>
+#include <linux/slab.h>
+
+#include "timer-of.h"
 
 #define TIM_CR1		0x00
 #define TIM_DIER	0x0c
 #define TIM_SR		0x10
 #define TIM_EGR		0x14
+#define TIM_CNT		0x24
 #define TIM_PSC		0x28
 #define TIM_ARR		0x2c
+#define TIM_CCR1	0x34
 
 #define TIM_CR1_CEN	BIT(0)
+#define TIM_CR1_UDIS	BIT(1)
 #define TIM_CR1_OPM	BIT(3)
 #define TIM_CR1_ARPE	BIT(7)
 
 #define TIM_DIER_UIE	BIT(0)
+#define TIM_DIER_CC1IE	BIT(1)
 
 #define TIM_SR_UIF	BIT(0)
 
 #define TIM_EGR_UG	BIT(0)
 
-struct stm32_clock_event_ddata {
-	struct clock_event_device evtdev;
-	unsigned periodic_top;
-	void __iomem *base;
+#define TIM_PSC_MAX	USHRT_MAX
+#define TIM_PSC_CLKRATE	10000
+
+struct stm32_timer_private {
+	int bits;
 };
 
-static int stm32_clock_event_shutdown(struct clock_event_device *evtdev)
+/**
+ * stm32_timer_of_bits_set - set accessor helper
+ * @to: a timer_of structure pointer
+ * @bits: the number of bits (16 or 32)
+ *
+ * Accessor helper to set the number of bits in the timer-of private
+ * structure.
+ *
+ */
+static void stm32_timer_of_bits_set(struct timer_of *to, int bits)
 {
-	struct stm32_clock_event_ddata *data =
-		container_of(evtdev, struct stm32_clock_event_ddata, evtdev);
-	void *base = data->base;
+	struct stm32_timer_private *pd = to->private_data;
 
-	writel_relaxed(0, base + TIM_CR1);
-	return 0;
+	pd->bits = bits;
 }
 
-static int stm32_clock_event_set_periodic(struct clock_event_device *evtdev)
+/**
+ * stm32_timer_of_bits_get - get accessor helper
+ * @to: a timer_of structure pointer
+ *
+ * Accessor helper to get the number of bits in the timer-of private
+ * structure.
+ *
+ * Returns an integer corresponding to the number of bits.
+ */
+static int stm32_timer_of_bits_get(struct timer_of *to)
 {
-	struct stm32_clock_event_ddata *data =
-		container_of(evtdev, struct stm32_clock_event_ddata, evtdev);
-	void *base = data->base;
+	struct stm32_timer_private *pd = to->private_data;
 
-	writel_relaxed(data->periodic_top, base + TIM_ARR);
-	writel_relaxed(TIM_CR1_ARPE | TIM_CR1_CEN, base + TIM_CR1);
+	return pd->bits;
+}
+
+static void __iomem *stm32_timer_cnt __read_mostly;
+
+static u64 notrace stm32_read_sched_clock(void)
+{
+	return readl_relaxed(stm32_timer_cnt);
+}
+
+static struct delay_timer stm32_timer_delay;
+
+static unsigned long stm32_read_delay(void)
+{
+	return readl_relaxed(stm32_timer_cnt);
+}
+
+static void stm32_clock_event_disable(struct timer_of *to)
+{
+	writel_relaxed(0, timer_of_base(to) + TIM_DIER);
+}
+
+/**
+ * stm32_timer_start - Start the counter without event
+ * @to: a timer_of structure pointer
+ *
+ * Start the timer in order to have the counter reset and start
+ * incrementing but disable interrupt event when there is a counter
+ * overflow. By default, the counter direction is used as upcounter.
+ */
+static void stm32_timer_start(struct timer_of *to)
+{
+	writel_relaxed(TIM_CR1_UDIS | TIM_CR1_CEN, timer_of_base(to) + TIM_CR1);
+}
+
+static int stm32_clock_event_shutdown(struct clock_event_device *clkevt)
+{
+	struct timer_of *to = to_timer_of(clkevt);
+
+	stm32_clock_event_disable(to);
+
 	return 0;
 }
 
 static int stm32_clock_event_set_next_event(unsigned long evt,
-					    struct clock_event_device *evtdev)
+					    struct clock_event_device *clkevt)
 {
-	struct stm32_clock_event_ddata *data =
-		container_of(evtdev, struct stm32_clock_event_ddata, evtdev);
+	struct timer_of *to = to_timer_of(clkevt);
+	unsigned long now, next;
 
-	writel_relaxed(evt, data->base + TIM_ARR);
-	writel_relaxed(TIM_CR1_ARPE | TIM_CR1_OPM | TIM_CR1_CEN,
-		       data->base + TIM_CR1);
+	next = readl_relaxed(timer_of_base(to) + TIM_CNT) + evt;
+	writel_relaxed(next, timer_of_base(to) + TIM_CCR1);
+	now = readl_relaxed(timer_of_base(to) + TIM_CNT);
+
+	if ((next - now) > evt)
+		return -ETIME;
+
+	writel_relaxed(TIM_DIER_CC1IE, timer_of_base(to) + TIM_DIER);
+
+	return 0;
+}
+
+static int stm32_clock_event_set_periodic(struct clock_event_device *clkevt)
+{
+	struct timer_of *to = to_timer_of(clkevt);
+
+	stm32_timer_start(to);
+
+	return stm32_clock_event_set_next_event(timer_of_period(to), clkevt);
+}
+
+static int stm32_clock_event_set_oneshot(struct clock_event_device *clkevt)
+{
+	struct timer_of *to = to_timer_of(clkevt);
+
+	stm32_timer_start(to);
 
 	return 0;
 }
 
 static irqreturn_t stm32_clock_event_handler(int irq, void *dev_id)
 {
-	struct stm32_clock_event_ddata *data = dev_id;
+	struct clock_event_device *clkevt = (struct clock_event_device *)dev_id;
+	struct timer_of *to = to_timer_of(clkevt);
 
-	writel_relaxed(0, data->base + TIM_SR);
+	writel_relaxed(0, timer_of_base(to) + TIM_SR);
 
-	data->evtdev.event_handler(&data->evtdev);
+	if (clockevent_state_periodic(clkevt))
+		stm32_clock_event_set_periodic(clkevt);
+	else
+		stm32_clock_event_shutdown(clkevt);
+
+	clkevt->event_handler(clkevt);
 
 	return IRQ_HANDLED;
 }
 
-static struct stm32_clock_event_ddata clock_event_ddata = {
-	.evtdev = {
-		.name = "stm32 clockevent",
-		.features = CLOCK_EVT_FEAT_ONESHOT | CLOCK_EVT_FEAT_PERIODIC,
-		.set_state_shutdown = stm32_clock_event_shutdown,
-		.set_state_periodic = stm32_clock_event_set_periodic,
-		.set_state_oneshot = stm32_clock_event_shutdown,
-		.tick_resume = stm32_clock_event_shutdown,
-		.set_next_event = stm32_clock_event_set_next_event,
-		.rating = 200,
-	},
-};
-
-static int __init stm32_clockevent_init(struct device_node *np)
+/**
+ * stm32_timer_width - Sort out the timer width (32/16)
+ * @to: a pointer to a timer-of structure
+ *
+ * Write the 32-bit max value and read/return the result. If the timer
+ * is 32 bits wide, the result will be UINT_MAX, otherwise it will
+ * be truncated by the 16-bit register to USHRT_MAX.
+ *
+ */
+static void __init stm32_timer_set_width(struct timer_of *to)
 {
-	struct stm32_clock_event_ddata *data = &clock_event_ddata;
-	struct clk *clk;
+	u32 width;
+
+	writel_relaxed(UINT_MAX, timer_of_base(to) + TIM_ARR);
+
+	width = readl_relaxed(timer_of_base(to) + TIM_ARR);
+
+	stm32_timer_of_bits_set(to, width == UINT_MAX ? 32 : 16);
+}
+
+/**
+ * stm32_timer_set_prescaler - Compute and set the prescaler register
+ * @to: a pointer to a timer-of structure
+ *
+ * Depending on the timer width, compute the prescaler to always
+ * target a 10MHz timer rate for 16 bits. 32-bit timers are
+ * considered precise and long enough to not use the prescaler.
+ */
+static void __init stm32_timer_set_prescaler(struct timer_of *to)
+{
+	int prescaler = 1;
+
+	if (stm32_timer_of_bits_get(to) != 32) {
+		prescaler = DIV_ROUND_CLOSEST(timer_of_rate(to),
+					      TIM_PSC_CLKRATE);
+		/*
+		 * The prescaler register is an u16, the variable
+		 * can't be greater than TIM_PSC_MAX, let's cap it in
+		 * this case.
+		 */
+		prescaler = prescaler < TIM_PSC_MAX ? prescaler : TIM_PSC_MAX;
+	}
+
+	writel_relaxed(prescaler - 1, timer_of_base(to) + TIM_PSC);
+	writel_relaxed(TIM_EGR_UG, timer_of_base(to) + TIM_EGR);
+	writel_relaxed(0, timer_of_base(to) + TIM_SR);
+
+	/* Adjust rate and period given the prescaler value */
+	to->of_clk.rate = DIV_ROUND_CLOSEST(to->of_clk.rate, prescaler);
+	to->of_clk.period = DIV_ROUND_UP(to->of_clk.rate, HZ);
+}
+
+static int __init stm32_clocksource_init(struct timer_of *to)
+{
+        u32 bits = stm32_timer_of_bits_get(to);
+	const char *name = to->np->full_name;
+
+	/*
+	 * This driver allows to register several timers and relies on
+	 * the generic time framework to select the right one.
+	 * However, nothing allows to do the same for the
+	 * sched_clock. We are not interested in a sched_clock for the
+	 * 16-bit timers but only for the 32-bit one, so if no 32-bit
+	 * timer is registered yet, we select this 32-bit timer as a
+	 * sched_clock.
+	 */
+	if (bits == 32 && !stm32_timer_cnt) {
+
+		/*
+		 * Start immediately the counter as we will be using
+		 * it right after.
+		 */
+		stm32_timer_start(to);
+
+		stm32_timer_cnt = timer_of_base(to) + TIM_CNT;
+		sched_clock_register(stm32_read_sched_clock, bits, timer_of_rate(to));
+		pr_info("%s: STM32 sched_clock registered\n", name);
+
+		stm32_timer_delay.read_current_timer = stm32_read_delay;
+		stm32_timer_delay.freq = timer_of_rate(to);
+		register_current_timer_delay(&stm32_timer_delay);
+		pr_info("%s: STM32 delay timer registered\n", name);
+	}
+
+	return clocksource_mmio_init(timer_of_base(to) + TIM_CNT, name,
+				     timer_of_rate(to), bits == 32 ? 250 : 100,
+				     bits, clocksource_mmio_readl_up);
+}
+
+static void __init stm32_clockevent_init(struct timer_of *to)
+{
+	u32 bits = stm32_timer_of_bits_get(to);
+
+	to->clkevt.name = to->np->full_name;
+	to->clkevt.features = CLOCK_EVT_FEAT_PERIODIC | CLOCK_EVT_FEAT_ONESHOT;
+	to->clkevt.set_state_shutdown = stm32_clock_event_shutdown;
+	to->clkevt.set_state_periodic = stm32_clock_event_set_periodic;
+	to->clkevt.set_state_oneshot = stm32_clock_event_set_oneshot;
+	to->clkevt.tick_resume = stm32_clock_event_shutdown;
+	to->clkevt.set_next_event = stm32_clock_event_set_next_event;
+	to->clkevt.rating = bits == 32 ? 250 : 100;
+
+	clockevents_config_and_register(&to->clkevt, timer_of_rate(to), 0x1,
+					(1 <<  bits) - 1);
+
+	pr_info("%pOF: STM32 clockevent driver initialized (%d bits)\n",
+		to->np, bits);
+}
+
+static int __init stm32_timer_init(struct device_node *node)
+{
 	struct reset_control *rstc;
-	unsigned long rate, max_delta;
-	int irq, ret, bits, prescaler = 1;
+	struct timer_of *to;
+	int ret;
 
-	clk = of_clk_get(np, 0);
-	if (IS_ERR(clk)) {
-		ret = PTR_ERR(clk);
-		pr_err("failed to get clock for clockevent (%d)\n", ret);
-		goto err_clk_get;
-	}
+	to = kzalloc(sizeof(*to), GFP_KERNEL);
+	if (!to)
+		return -ENOMEM;
 
-	ret = clk_prepare_enable(clk);
-	if (ret) {
-		pr_err("failed to enable timer clock for clockevent (%d)\n",
-		       ret);
-		goto err_clk_enable;
-	}
+	to->flags = TIMER_OF_IRQ | TIMER_OF_CLOCK | TIMER_OF_BASE;
+	to->of_irq.handler = stm32_clock_event_handler;
 
-	rate = clk_get_rate(clk);
+	ret = timer_of_init(node, to);
+	if (ret)
+		goto err;
 
-	rstc = of_reset_control_get(np, NULL);
+	to->private_data = kzalloc(sizeof(struct stm32_timer_private),
+				   GFP_KERNEL);
+	if (!to->private_data)
+		goto deinit;
+
+	rstc = of_reset_control_get(node, NULL);
 	if (!IS_ERR(rstc)) {
 		reset_control_assert(rstc);
 		reset_control_deassert(rstc);
 	}
 
-	data->base = of_iomap(np, 0);
-	if (!data->base) {
-		ret = -ENXIO;
-		pr_err("failed to map registers for clockevent\n");
-		goto err_iomap;
-	}
+	stm32_timer_set_width(to);
 
-	irq = irq_of_parse_and_map(np, 0);
-	if (!irq) {
-		ret = -EINVAL;
-		pr_err("%pOF: failed to get irq.\n", np);
-		goto err_get_irq;
-	}
+	stm32_timer_set_prescaler(to);
 
-	/* Detect whether the timer is 16 or 32 bits */
-	writel_relaxed(~0U, data->base + TIM_ARR);
-	max_delta = readl_relaxed(data->base + TIM_ARR);
-	if (max_delta == ~0U) {
-		prescaler = 1;
-		bits = 32;
-	} else {
-		prescaler = 1024;
-		bits = 16;
-	}
-	writel_relaxed(0, data->base + TIM_ARR);
+	ret = stm32_clocksource_init(to);
+	if (ret)
+		goto deinit;
 
-	writel_relaxed(prescaler - 1, data->base + TIM_PSC);
-	writel_relaxed(TIM_EGR_UG, data->base + TIM_EGR);
-	writel_relaxed(TIM_DIER_UIE, data->base + TIM_DIER);
-	writel_relaxed(0, data->base + TIM_SR);
+	stm32_clockevent_init(to);
+	return 0;
 
-	data->periodic_top = DIV_ROUND_CLOSEST(rate, prescaler * HZ);
-
-	clockevents_config_and_register(&data->evtdev,
-					DIV_ROUND_CLOSEST(rate, prescaler),
-					0x1, max_delta);
-
-	ret = request_irq(irq, stm32_clock_event_handler, IRQF_TIMER,
-			"stm32 clockevent", data);
-	if (ret) {
-		pr_err("%pOF: failed to request irq.\n", np);
-		goto err_get_irq;
-	}
-
-	pr_info("%pOF: STM32 clockevent driver initialized (%d bits)\n",
-			np, bits);
-
-	return ret;
-
-err_get_irq:
-	iounmap(data->base);
-err_iomap:
-	clk_disable_unprepare(clk);
-err_clk_enable:
-	clk_put(clk);
-err_clk_get:
+deinit:
+	timer_of_cleanup(to);
+err:
+	kfree(to);
 	return ret;
 }
 
-TIMER_OF_DECLARE(stm32, "st,stm32-timer", stm32_clockevent_init);
+TIMER_OF_DECLARE(stm32, "st,stm32-timer", stm32_timer_init);
diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index bdce448..3a88e33 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -2,6 +2,29 @@
 # ARM CPU Frequency scaling drivers
 #
 
+config ACPI_CPPC_CPUFREQ
+	tristate "CPUFreq driver based on the ACPI CPPC spec"
+	depends on ACPI_PROCESSOR
+	select ACPI_CPPC_LIB
+	help
+	  This adds a CPUFreq driver which uses CPPC methods
+	  as described in the ACPIv5.1 spec. CPPC stands for
+	  Collaborative Processor Performance Controls. It
+	  is based on an abstract continuous scale of CPU
+	  performance values which allows the remote power
+	  processor to flexibly optimize for power and
+	  performance. CPPC relies on power management firmware
+	  support for its operation.
+
+	  If in doubt, say N.
+
+config ARM_ARMADA_37XX_CPUFREQ
+	tristate "Armada 37xx CPUFreq support"
+	depends on ARCH_MVEBU
+	help
+	  This adds the CPUFreq driver support for Marvell Armada 37xx SoCs.
+	  The Armada 37xx PMU supports 4 frequency and VDD levels.
+
 # big LITTLE core layer and glue drivers
 config ARM_BIG_LITTLE_CPUFREQ
 	tristate "Generic ARM big LITTLE CPUfreq driver"
@@ -12,6 +35,30 @@
 	help
 	  This enables the Generic CPUfreq driver for ARM big.LITTLE platforms.
 
+config ARM_DT_BL_CPUFREQ
+	tristate "Generic probing via DT for ARM big LITTLE CPUfreq driver"
+	depends on ARM_BIG_LITTLE_CPUFREQ && OF
+	help
+	  This enables probing via DT for Generic CPUfreq driver for ARM
+	  big.LITTLE platform. This gets frequency tables from DT.
+
+config ARM_SCPI_CPUFREQ
+	tristate "SCPI based CPUfreq driver"
+	depends on ARM_BIG_LITTLE_CPUFREQ && ARM_SCPI_PROTOCOL && COMMON_CLK_SCPI
+	help
+	  This adds the CPUfreq driver support for ARM big.LITTLE platforms
+	  using SCPI protocol for CPU power management.
+
+	  This driver uses SCPI Message Protocol driver to interact with the
+	  firmware providing the CPU DVFS functionality.
+
+config ARM_VEXPRESS_SPC_CPUFREQ
+	tristate "Versatile Express SPC based CPUfreq driver"
+	depends on ARM_BIG_LITTLE_CPUFREQ && ARCH_VEXPRESS_SPC
+	help
+	  This add the CPUfreq driver support for Versatile Express
+	  big.LITTLE platforms using SPC for power management.
+
 config ARM_BRCMSTB_AVS_CPUFREQ
 	tristate "Broadcom STB AVS CPUfreq driver"
 	depends on ARCH_BRCMSTB || COMPILE_TEST
@@ -33,20 +80,6 @@
 
 	  If in doubt, say N.
 
-config ARM_DT_BL_CPUFREQ
-	tristate "Generic probing via DT for ARM big LITTLE CPUfreq driver"
-	depends on ARM_BIG_LITTLE_CPUFREQ && OF
-	help
-	  This enables probing via DT for Generic CPUfreq driver for ARM
-	  big.LITTLE platform. This gets frequency tables from DT.
-
-config ARM_VEXPRESS_SPC_CPUFREQ
-        tristate "Versatile Express SPC based CPUfreq driver"
-	depends on ARM_BIG_LITTLE_CPUFREQ && ARCH_VEXPRESS_SPC
-        help
-          This add the CPUfreq driver support for Versatile Express
-	  big.LITTLE platforms using SPC for power management.
-
 config ARM_EXYNOS5440_CPUFREQ
 	tristate "SAMSUNG EXYNOS5440"
 	depends on SOC_EXYNOS5440
@@ -205,16 +238,6 @@
 config ARM_SA1110_CPUFREQ
 	bool
 
-config ARM_SCPI_CPUFREQ
-        tristate "SCPI based CPUfreq driver"
-	depends on ARM_BIG_LITTLE_CPUFREQ && ARM_SCPI_PROTOCOL && COMMON_CLK_SCPI
-        help
-	  This adds the CPUfreq driver support for ARM big.LITTLE platforms
-	  using SCPI protocol for CPU power management.
-
-	  This driver uses SCPI Message Protocol driver to interact with the
-	  firmware providing the CPU DVFS functionality.
-
 config ARM_SPEAR_CPUFREQ
 	bool "SPEAr CPUFreq support"
 	depends on PLAT_SPEAR
@@ -275,20 +298,3 @@
 	  This add the CPUFreq driver support for Intel PXA2xx SOCs.
 
 	  If in doubt, say N.
-
-config ACPI_CPPC_CPUFREQ
-	tristate "CPUFreq driver based on the ACPI CPPC spec"
-	depends on ACPI_PROCESSOR
-	select ACPI_CPPC_LIB
-	default n
-	help
-	  This adds a CPUFreq driver which uses CPPC methods
-	  as described in the ACPIv5.1 spec. CPPC stands for
-	  Collaborative Processor Performance Controls. It
-	  is based on an abstract continuous scale of CPU
-	  performance values which allows the remote power
-	  processor to flexibly optimize for power and
-	  performance. CPPC relies on power management firmware
-	  support for its operation.
-
-	  If in doubt, say N.
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index 812f9e0..e07715c 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -52,23 +52,26 @@
 # LITTLE drivers, so that it is probed last.
 obj-$(CONFIG_ARM_DT_BL_CPUFREQ)		+= arm_big_little_dt.o
 
+obj-$(CONFIG_ARM_ARMADA_37XX_CPUFREQ)	+= armada-37xx-cpufreq.o
 obj-$(CONFIG_ARM_BRCMSTB_AVS_CPUFREQ)	+= brcmstb-avs-cpufreq.o
+obj-$(CONFIG_ACPI_CPPC_CPUFREQ)		+= cppc_cpufreq.o
 obj-$(CONFIG_ARCH_DAVINCI)		+= davinci-cpufreq.o
 obj-$(CONFIG_ARM_EXYNOS5440_CPUFREQ)	+= exynos5440-cpufreq.o
 obj-$(CONFIG_ARM_HIGHBANK_CPUFREQ)	+= highbank-cpufreq.o
 obj-$(CONFIG_ARM_IMX6Q_CPUFREQ)		+= imx6q-cpufreq.o
 obj-$(CONFIG_ARM_KIRKWOOD_CPUFREQ)	+= kirkwood-cpufreq.o
 obj-$(CONFIG_ARM_MEDIATEK_CPUFREQ)	+= mediatek-cpufreq.o
+obj-$(CONFIG_MACH_MVEBU_V7)		+= mvebu-cpufreq.o
 obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ)	+= omap-cpufreq.o
 obj-$(CONFIG_ARM_PXA2xx_CPUFREQ)	+= pxa2xx-cpufreq.o
 obj-$(CONFIG_PXA3xx)			+= pxa3xx-cpufreq.o
-obj-$(CONFIG_ARM_S3C24XX_CPUFREQ)	+= s3c24xx-cpufreq.o
-obj-$(CONFIG_ARM_S3C24XX_CPUFREQ_DEBUGFS) += s3c24xx-cpufreq-debugfs.o
 obj-$(CONFIG_ARM_S3C2410_CPUFREQ)	+= s3c2410-cpufreq.o
 obj-$(CONFIG_ARM_S3C2412_CPUFREQ)	+= s3c2412-cpufreq.o
 obj-$(CONFIG_ARM_S3C2416_CPUFREQ)	+= s3c2416-cpufreq.o
 obj-$(CONFIG_ARM_S3C2440_CPUFREQ)	+= s3c2440-cpufreq.o
 obj-$(CONFIG_ARM_S3C64XX_CPUFREQ)	+= s3c64xx-cpufreq.o
+obj-$(CONFIG_ARM_S3C24XX_CPUFREQ)	+= s3c24xx-cpufreq.o
+obj-$(CONFIG_ARM_S3C24XX_CPUFREQ_DEBUGFS) += s3c24xx-cpufreq-debugfs.o
 obj-$(CONFIG_ARM_S5PV210_CPUFREQ)	+= s5pv210-cpufreq.o
 obj-$(CONFIG_ARM_SA1100_CPUFREQ)	+= sa1100-cpufreq.o
 obj-$(CONFIG_ARM_SA1110_CPUFREQ)	+= sa1110-cpufreq.o
@@ -81,8 +84,6 @@
 obj-$(CONFIG_ARM_TEGRA186_CPUFREQ)	+= tegra186-cpufreq.o
 obj-$(CONFIG_ARM_TI_CPUFREQ)		+= ti-cpufreq.o
 obj-$(CONFIG_ARM_VEXPRESS_SPC_CPUFREQ)	+= vexpress-spc-cpufreq.o
-obj-$(CONFIG_ACPI_CPPC_CPUFREQ) += cppc_cpufreq.o
-obj-$(CONFIG_MACH_MVEBU_V7)		+= mvebu-cpufreq.o
 
 
 ##################################################################################
diff --git a/drivers/cpufreq/arm_big_little.c b/drivers/cpufreq/arm_big_little.c
index 65ec5f0..c56b57d 100644
--- a/drivers/cpufreq/arm_big_little.c
+++ b/drivers/cpufreq/arm_big_little.c
@@ -526,34 +526,13 @@ static int bL_cpufreq_exit(struct cpufreq_policy *policy)
 
 static void bL_cpufreq_ready(struct cpufreq_policy *policy)
 {
-	struct device *cpu_dev = get_cpu_device(policy->cpu);
 	int cur_cluster = cpu_to_cluster(policy->cpu);
-	struct device_node *np;
 
 	/* Do not register a cpu_cooling device if we are in IKS mode */
 	if (cur_cluster >= MAX_CLUSTERS)
 		return;
 
-	np = of_node_get(cpu_dev->of_node);
-	if (WARN_ON(!np))
-		return;
-
-	if (of_find_property(np, "#cooling-cells", NULL)) {
-		u32 power_coefficient = 0;
-
-		of_property_read_u32(np, "dynamic-power-coefficient",
-				     &power_coefficient);
-
-		cdev[cur_cluster] = of_cpufreq_power_cooling_register(np,
-				policy, power_coefficient, NULL);
-		if (IS_ERR(cdev[cur_cluster])) {
-			dev_err(cpu_dev,
-				"running cpufreq without cooling device: %ld\n",
-				PTR_ERR(cdev[cur_cluster]));
-			cdev[cur_cluster] = NULL;
-		}
-	}
-	of_node_put(np);
+	cdev[cur_cluster] = of_cpufreq_cooling_register(policy);
 }
 
 static struct cpufreq_driver bL_cpufreq_driver = {
diff --git a/drivers/cpufreq/armada-37xx-cpufreq.c b/drivers/cpufreq/armada-37xx-cpufreq.c
new file mode 100644
index 0000000..c6ebc88
--- /dev/null
+++ b/drivers/cpufreq/armada-37xx-cpufreq.c
@@ -0,0 +1,241 @@
+// SPDX-License-Identifier: GPL-2.0+
+/*
+ * CPU frequency scaling support for Armada 37xx platform.
+ *
+ * Copyright (C) 2017 Marvell
+ *
+ * Gregory CLEMENT <gregory.clement@free-electrons.com>
+ */
+
+#include <linux/clk.h>
+#include <linux/cpu.h>
+#include <linux/cpufreq.h>
+#include <linux/err.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/mfd/syscon.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/of_irq.h>
+#include <linux/platform_device.h>
+#include <linux/pm_opp.h>
+#include <linux/regmap.h>
+#include <linux/slab.h>
+
+/* Power management in North Bridge register set */
+#define ARMADA_37XX_NB_L0L1	0x18
+#define ARMADA_37XX_NB_L2L3	0x1C
+#define  ARMADA_37XX_NB_TBG_DIV_OFF	13
+#define  ARMADA_37XX_NB_TBG_DIV_MASK	0x7
+#define  ARMADA_37XX_NB_CLK_SEL_OFF	11
+#define  ARMADA_37XX_NB_CLK_SEL_MASK	0x1
+#define  ARMADA_37XX_NB_CLK_SEL_TBG	0x1
+#define  ARMADA_37XX_NB_TBG_SEL_OFF	9
+#define  ARMADA_37XX_NB_TBG_SEL_MASK	0x3
+#define  ARMADA_37XX_NB_VDD_SEL_OFF	6
+#define  ARMADA_37XX_NB_VDD_SEL_MASK	0x3
+#define  ARMADA_37XX_NB_CONFIG_SHIFT	16
+#define ARMADA_37XX_NB_DYN_MOD	0x24
+#define  ARMADA_37XX_NB_CLK_SEL_EN	BIT(26)
+#define  ARMADA_37XX_NB_TBG_EN		BIT(28)
+#define  ARMADA_37XX_NB_DIV_EN		BIT(29)
+#define  ARMADA_37XX_NB_VDD_EN		BIT(30)
+#define  ARMADA_37XX_NB_DFS_EN		BIT(31)
+#define ARMADA_37XX_NB_CPU_LOAD 0x30
+#define  ARMADA_37XX_NB_CPU_LOAD_MASK	0x3
+#define  ARMADA_37XX_DVFS_LOAD_0	0
+#define  ARMADA_37XX_DVFS_LOAD_1	1
+#define  ARMADA_37XX_DVFS_LOAD_2	2
+#define  ARMADA_37XX_DVFS_LOAD_3	3
+
+/*
+ * On Armada 37xx the Power management manages 4 level of CPU load,
+ * each level can be associated with a CPU clock source, a CPU
+ * divider, a VDD level, etc...
+ */
+#define LOAD_LEVEL_NR	4
+
+struct armada_37xx_dvfs {
+	u32 cpu_freq_max;
+	u8 divider[LOAD_LEVEL_NR];
+};
+
+static struct armada_37xx_dvfs armada_37xx_dvfs[] = {
+	{.cpu_freq_max = 1200*1000*1000, .divider = {1, 2, 4, 6} },
+	{.cpu_freq_max = 1000*1000*1000, .divider = {1, 2, 4, 5} },
+	{.cpu_freq_max = 800*1000*1000,  .divider = {1, 2, 3, 4} },
+	{.cpu_freq_max = 600*1000*1000,  .divider = {2, 4, 5, 6} },
+};
+
+static struct armada_37xx_dvfs *armada_37xx_cpu_freq_info_get(u32 freq)
+{
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(armada_37xx_dvfs); i++) {
+		if (freq == armada_37xx_dvfs[i].cpu_freq_max)
+			return &armada_37xx_dvfs[i];
+	}
+
+	pr_err("Unsupported CPU frequency %d MHz\n", freq/1000000);
+	return NULL;
+}
+
+/*
+ * Setup the four level managed by the hardware. Once the four level
+ * will be configured then the DVFS will be enabled.
+ */
+static void __init armada37xx_cpufreq_dvfs_setup(struct regmap *base,
+						 struct clk *clk, u8 *divider)
+{
+	int load_lvl;
+	struct clk *parent;
+
+	for (load_lvl = 0; load_lvl < LOAD_LEVEL_NR; load_lvl++) {
+		unsigned int reg, mask, val, offset = 0;
+
+		if (load_lvl <= ARMADA_37XX_DVFS_LOAD_1)
+			reg = ARMADA_37XX_NB_L0L1;
+		else
+			reg = ARMADA_37XX_NB_L2L3;
+
+		if (load_lvl == ARMADA_37XX_DVFS_LOAD_0 ||
+		    load_lvl == ARMADA_37XX_DVFS_LOAD_2)
+			offset += ARMADA_37XX_NB_CONFIG_SHIFT;
+
+		/* Set cpu clock source, for all the level we use TBG */
+		val = ARMADA_37XX_NB_CLK_SEL_TBG << ARMADA_37XX_NB_CLK_SEL_OFF;
+		mask = (ARMADA_37XX_NB_CLK_SEL_MASK
+			<< ARMADA_37XX_NB_CLK_SEL_OFF);
+
+		/*
+		 * Set cpu divider based on the pre-computed array in
+		 * order to have balanced step.
+		 */
+		val |= divider[load_lvl] << ARMADA_37XX_NB_TBG_DIV_OFF;
+		mask |= (ARMADA_37XX_NB_TBG_DIV_MASK
+			<< ARMADA_37XX_NB_TBG_DIV_OFF);
+
+		/* Set VDD divider which is actually the load level. */
+		val |= load_lvl << ARMADA_37XX_NB_VDD_SEL_OFF;
+		mask |= (ARMADA_37XX_NB_VDD_SEL_MASK
+			<< ARMADA_37XX_NB_VDD_SEL_OFF);
+
+		val <<= offset;
+		mask <<= offset;
+
+		regmap_update_bits(base, reg, mask, val);
+	}
+
+	/*
+	 * Set cpu clock source, for all the level we keep the same
+	 * clock source that the one already configured. For this one
+	 * we need to use the clock framework
+	 */
+	parent = clk_get_parent(clk);
+	clk_set_parent(clk, parent);
+}
+
+static void __init armada37xx_cpufreq_disable_dvfs(struct regmap *base)
+{
+	unsigned int reg = ARMADA_37XX_NB_DYN_MOD,
+		mask = ARMADA_37XX_NB_DFS_EN;
+
+	regmap_update_bits(base, reg, mask, 0);
+}
+
+static void __init armada37xx_cpufreq_enable_dvfs(struct regmap *base)
+{
+	unsigned int val, reg = ARMADA_37XX_NB_CPU_LOAD,
+		mask = ARMADA_37XX_NB_CPU_LOAD_MASK;
+
+	/* Start with the highest load (0) */
+	val = ARMADA_37XX_DVFS_LOAD_0;
+	regmap_update_bits(base, reg, mask, val);
+
+	/* Now enable DVFS for the CPUs */
+	reg = ARMADA_37XX_NB_DYN_MOD;
+	mask =	ARMADA_37XX_NB_CLK_SEL_EN | ARMADA_37XX_NB_TBG_EN |
+		ARMADA_37XX_NB_DIV_EN | ARMADA_37XX_NB_VDD_EN |
+		ARMADA_37XX_NB_DFS_EN;
+
+	regmap_update_bits(base, reg, mask, mask);
+}
+
+static int __init armada37xx_cpufreq_driver_init(void)
+{
+	struct armada_37xx_dvfs *dvfs;
+	struct platform_device *pdev;
+	unsigned int cur_frequency;
+	struct regmap *nb_pm_base;
+	struct device *cpu_dev;
+	int load_lvl, ret;
+	struct clk *clk;
+
+	nb_pm_base =
+		syscon_regmap_lookup_by_compatible("marvell,armada-3700-nb-pm");
+
+	if (IS_ERR(nb_pm_base))
+		return -ENODEV;
+
+	/* Before doing any configuration on the DVFS first, disable it */
+	armada37xx_cpufreq_disable_dvfs(nb_pm_base);
+
+	/*
+	 * On CPU 0 register the operating points supported (which are
+	 * the nominal CPU frequency and full integer divisions of
+	 * it).
+	 */
+	cpu_dev = get_cpu_device(0);
+	if (!cpu_dev) {
+		dev_err(cpu_dev, "Cannot get CPU\n");
+		return -ENODEV;
+	}
+
+	clk = clk_get(cpu_dev, 0);
+	if (IS_ERR(clk)) {
+		dev_err(cpu_dev, "Cannot get clock for CPU0\n");
+		return PTR_ERR(clk);
+	}
+
+	/* Get nominal (current) CPU frequency */
+	cur_frequency = clk_get_rate(clk);
+	if (!cur_frequency) {
+		dev_err(cpu_dev, "Failed to get clock rate for CPU\n");
+		return -EINVAL;
+	}
+
+	dvfs = armada_37xx_cpu_freq_info_get(cur_frequency);
+	if (!dvfs)
+		return -EINVAL;
+
+	armada37xx_cpufreq_dvfs_setup(nb_pm_base, clk, dvfs->divider);
+
+	for (load_lvl = ARMADA_37XX_DVFS_LOAD_0; load_lvl < LOAD_LEVEL_NR;
+	     load_lvl++) {
+		unsigned long freq = cur_frequency / dvfs->divider[load_lvl];
+
+		ret = dev_pm_opp_add(cpu_dev, freq, 0);
+		if (ret) {
+			/* clean-up the already added opp before leaving */
+			while (load_lvl-- > ARMADA_37XX_DVFS_LOAD_0) {
+				freq = cur_frequency / dvfs->divider[load_lvl];
+				dev_pm_opp_remove(cpu_dev, freq);
+			}
+			return ret;
+		}
+	}
+
+	/* Now that everything is setup, enable the DVFS at hardware level */
+	armada37xx_cpufreq_enable_dvfs(nb_pm_base);
+
+	pdev = platform_device_register_simple("cpufreq-dt", -1, NULL, 0);
+
+	return PTR_ERR_OR_ZERO(pdev);
+}
+/* late_initcall, to guarantee the driver is loaded after A37xx clock driver */
+late_initcall(armada37xx_cpufreq_driver_init);
+
+MODULE_AUTHOR("Gregory CLEMENT <gregory.clement@free-electrons.com>");
+MODULE_DESCRIPTION("Armada 37xx cpufreq driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/cpufreq/cpufreq-dt-platdev.c b/drivers/cpufreq/cpufreq-dt-platdev.c
index ecc56e26..3b585e4 100644
--- a/drivers/cpufreq/cpufreq-dt-platdev.c
+++ b/drivers/cpufreq/cpufreq-dt-platdev.c
@@ -108,6 +108,14 @@ static const struct of_device_id blacklist[] __initconst = {
 
 	{ .compatible = "marvell,armadaxp", },
 
+	{ .compatible = "mediatek,mt2701", },
+	{ .compatible = "mediatek,mt2712", },
+	{ .compatible = "mediatek,mt7622", },
+	{ .compatible = "mediatek,mt7623", },
+	{ .compatible = "mediatek,mt817x", },
+	{ .compatible = "mediatek,mt8173", },
+	{ .compatible = "mediatek,mt8176", },
+
 	{ .compatible = "nvidia,tegra124", },
 
 	{ .compatible = "st,stih407", },
diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index 545946a..de3d104 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -319,33 +319,8 @@ static int cpufreq_exit(struct cpufreq_policy *policy)
 static void cpufreq_ready(struct cpufreq_policy *policy)
 {
 	struct private_data *priv = policy->driver_data;
-	struct device_node *np = of_node_get(priv->cpu_dev->of_node);
 
-	if (WARN_ON(!np))
-		return;
-
-	/*
-	 * For now, just loading the cooling device;
-	 * thermal DT code takes care of matching them.
-	 */
-	if (of_find_property(np, "#cooling-cells", NULL)) {
-		u32 power_coefficient = 0;
-
-		of_property_read_u32(np, "dynamic-power-coefficient",
-				     &power_coefficient);
-
-		priv->cdev = of_cpufreq_power_cooling_register(np,
-				policy, power_coefficient, NULL);
-		if (IS_ERR(priv->cdev)) {
-			dev_err(priv->cpu_dev,
-				"running cpufreq without cooling device: %ld\n",
-				PTR_ERR(priv->cdev));
-
-			priv->cdev = NULL;
-		}
-	}
-
-	of_node_put(np);
+	priv->cdev = of_cpufreq_cooling_register(policy);
 }
 
 static struct cpufreq_driver dt_cpufreq_driver = {
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
index 41d148a..421f318 100644
--- a/drivers/cpufreq/cpufreq.c
+++ b/drivers/cpufreq/cpufreq.c
@@ -601,19 +601,18 @@ static struct cpufreq_governor *find_governor(const char *str_governor)
 /**
  * cpufreq_parse_governor - parse a governor string
  */
-static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
-				struct cpufreq_governor **governor)
+static int cpufreq_parse_governor(char *str_governor,
+				  struct cpufreq_policy *policy)
 {
-	int err = -EINVAL;
-
 	if (cpufreq_driver->setpolicy) {
 		if (!strncasecmp(str_governor, "performance", CPUFREQ_NAME_LEN)) {
-			*policy = CPUFREQ_POLICY_PERFORMANCE;
-			err = 0;
-		} else if (!strncasecmp(str_governor, "powersave",
-						CPUFREQ_NAME_LEN)) {
-			*policy = CPUFREQ_POLICY_POWERSAVE;
-			err = 0;
+			policy->policy = CPUFREQ_POLICY_PERFORMANCE;
+			return 0;
+		}
+
+		if (!strncasecmp(str_governor, "powersave", CPUFREQ_NAME_LEN)) {
+			policy->policy = CPUFREQ_POLICY_POWERSAVE;
+			return 0;
 		}
 	} else {
 		struct cpufreq_governor *t;
@@ -621,26 +620,31 @@ static int cpufreq_parse_governor(char *str_governor, unsigned int *policy,
 		mutex_lock(&cpufreq_governor_mutex);
 
 		t = find_governor(str_governor);
-
-		if (t == NULL) {
+		if (!t) {
 			int ret;
 
 			mutex_unlock(&cpufreq_governor_mutex);
+
 			ret = request_module("cpufreq_%s", str_governor);
+			if (ret)
+				return -EINVAL;
+
 			mutex_lock(&cpufreq_governor_mutex);
 
-			if (ret == 0)
-				t = find_governor(str_governor);
+			t = find_governor(str_governor);
 		}
-
-		if (t != NULL) {
-			*governor = t;
-			err = 0;
-		}
+		if (t && !try_module_get(t->owner))
+			t = NULL;
 
 		mutex_unlock(&cpufreq_governor_mutex);
+
+		if (t) {
+			policy->governor = t;
+			return 0;
+		}
 	}
-	return err;
+
+	return -EINVAL;
 }
 
 /**
@@ -760,11 +764,14 @@ static ssize_t store_scaling_governor(struct cpufreq_policy *policy,
 	if (ret != 1)
 		return -EINVAL;
 
-	if (cpufreq_parse_governor(str_governor, &new_policy.policy,
-						&new_policy.governor))
+	if (cpufreq_parse_governor(str_governor, &new_policy))
 		return -EINVAL;
 
 	ret = cpufreq_set_policy(policy, &new_policy);
+
+	if (new_policy.governor)
+		module_put(new_policy.governor->owner);
+
 	return ret ? ret : count;
 }
 
@@ -1044,8 +1051,7 @@ static int cpufreq_init_policy(struct cpufreq_policy *policy)
 		if (policy->last_policy)
 			new_policy.policy = policy->last_policy;
 		else
-			cpufreq_parse_governor(gov->name, &new_policy.policy,
-					       NULL);
+			cpufreq_parse_governor(gov->name, &new_policy);
 	}
 	/* set default policy */
 	return cpufreq_set_policy(policy, &new_policy);
@@ -2160,7 +2166,6 @@ void cpufreq_unregister_governor(struct cpufreq_governor *governor)
 	mutex_lock(&cpufreq_governor_mutex);
 	list_del(&governor->governor_list);
 	mutex_unlock(&cpufreq_governor_mutex);
-	return;
 }
 EXPORT_SYMBOL_GPL(cpufreq_unregister_governor);
 
diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c
index 1e55b57..1572129 100644
--- a/drivers/cpufreq/cpufreq_stats.c
+++ b/drivers/cpufreq/cpufreq_stats.c
@@ -27,7 +27,7 @@ struct cpufreq_stats {
 	unsigned int *trans_table;
 };
 
-static int cpufreq_stats_update(struct cpufreq_stats *stats)
+static void cpufreq_stats_update(struct cpufreq_stats *stats)
 {
 	unsigned long long cur_time = get_jiffies_64();
 
@@ -35,7 +35,6 @@ static int cpufreq_stats_update(struct cpufreq_stats *stats)
 	stats->time_in_state[stats->last_index] += cur_time - stats->last_time;
 	stats->last_time = cur_time;
 	spin_unlock(&cpufreq_stats_lock);
-	return 0;
 }
 
 static void cpufreq_stats_clear_table(struct cpufreq_stats *stats)
diff --git a/drivers/cpufreq/imx6q-cpufreq.c b/drivers/cpufreq/imx6q-cpufreq.c
index d9b2c2d..741f22e 100644
--- a/drivers/cpufreq/imx6q-cpufreq.c
+++ b/drivers/cpufreq/imx6q-cpufreq.c
@@ -25,15 +25,29 @@ static struct regulator *arm_reg;
 static struct regulator *pu_reg;
 static struct regulator *soc_reg;
 
-static struct clk *arm_clk;
-static struct clk *pll1_sys_clk;
-static struct clk *pll1_sw_clk;
-static struct clk *step_clk;
-static struct clk *pll2_pfd2_396m_clk;
+enum IMX6_CPUFREQ_CLKS {
+	ARM,
+	PLL1_SYS,
+	STEP,
+	PLL1_SW,
+	PLL2_PFD2_396M,
+	/* MX6UL requires two more clks */
+	PLL2_BUS,
+	SECONDARY_SEL,
+};
+#define IMX6Q_CPUFREQ_CLK_NUM		5
+#define IMX6UL_CPUFREQ_CLK_NUM		7
 
-/* clk used by i.MX6UL */
-static struct clk *pll2_bus_clk;
-static struct clk *secondary_sel_clk;
+static int num_clks;
+static struct clk_bulk_data clks[] = {
+	{ .id = "arm" },
+	{ .id = "pll1_sys" },
+	{ .id = "step" },
+	{ .id = "pll1_sw" },
+	{ .id = "pll2_pfd2_396m" },
+	{ .id = "pll2_bus" },
+	{ .id = "secondary_sel" },
+};
 
 static struct device *cpu_dev;
 static bool free_opp;
@@ -53,7 +67,7 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
 
 	new_freq = freq_table[index].frequency;
 	freq_hz = new_freq * 1000;
-	old_freq = clk_get_rate(arm_clk) / 1000;
+	old_freq = clk_get_rate(clks[ARM].clk) / 1000;
 
 	opp = dev_pm_opp_find_freq_ceil(cpu_dev, &freq_hz);
 	if (IS_ERR(opp)) {
@@ -112,29 +126,35 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
 		 * voltage of 528MHz, so lower the CPU frequency to one
 		 * half before changing CPU frequency.
 		 */
-		clk_set_rate(arm_clk, (old_freq >> 1) * 1000);
-		clk_set_parent(pll1_sw_clk, pll1_sys_clk);
-		if (freq_hz > clk_get_rate(pll2_pfd2_396m_clk))
-			clk_set_parent(secondary_sel_clk, pll2_bus_clk);
+		clk_set_rate(clks[ARM].clk, (old_freq >> 1) * 1000);
+		clk_set_parent(clks[PLL1_SW].clk, clks[PLL1_SYS].clk);
+		if (freq_hz > clk_get_rate(clks[PLL2_PFD2_396M].clk))
+			clk_set_parent(clks[SECONDARY_SEL].clk,
+				       clks[PLL2_BUS].clk);
 		else
-			clk_set_parent(secondary_sel_clk, pll2_pfd2_396m_clk);
-		clk_set_parent(step_clk, secondary_sel_clk);
-		clk_set_parent(pll1_sw_clk, step_clk);
+			clk_set_parent(clks[SECONDARY_SEL].clk,
+				       clks[PLL2_PFD2_396M].clk);
+		clk_set_parent(clks[STEP].clk, clks[SECONDARY_SEL].clk);
+		clk_set_parent(clks[PLL1_SW].clk, clks[STEP].clk);
+		if (freq_hz > clk_get_rate(clks[PLL2_BUS].clk)) {
+			clk_set_rate(clks[PLL1_SYS].clk, new_freq * 1000);
+			clk_set_parent(clks[PLL1_SW].clk, clks[PLL1_SYS].clk);
+		}
 	} else {
-		clk_set_parent(step_clk, pll2_pfd2_396m_clk);
-		clk_set_parent(pll1_sw_clk, step_clk);
-		if (freq_hz > clk_get_rate(pll2_pfd2_396m_clk)) {
-			clk_set_rate(pll1_sys_clk, new_freq * 1000);
-			clk_set_parent(pll1_sw_clk, pll1_sys_clk);
+		clk_set_parent(clks[STEP].clk, clks[PLL2_PFD2_396M].clk);
+		clk_set_parent(clks[PLL1_SW].clk, clks[STEP].clk);
+		if (freq_hz > clk_get_rate(clks[PLL2_PFD2_396M].clk)) {
+			clk_set_rate(clks[PLL1_SYS].clk, new_freq * 1000);
+			clk_set_parent(clks[PLL1_SW].clk, clks[PLL1_SYS].clk);
 		} else {
 			/* pll1_sys needs to be enabled for divider rate change to work. */
 			pll1_sys_temp_enabled = true;
-			clk_prepare_enable(pll1_sys_clk);
+			clk_prepare_enable(clks[PLL1_SYS].clk);
 		}
 	}
 
 	/* Ensure the arm clock divider is what we expect */
-	ret = clk_set_rate(arm_clk, new_freq * 1000);
+	ret = clk_set_rate(clks[ARM].clk, new_freq * 1000);
 	if (ret) {
 		dev_err(cpu_dev, "failed to set clock rate: %d\n", ret);
 		regulator_set_voltage_tol(arm_reg, volt_old, 0);
@@ -143,7 +163,7 @@ static int imx6q_set_target(struct cpufreq_policy *policy, unsigned int index)
 
 	/* PLL1 is only needed until after ARM-PODF is set. */
 	if (pll1_sys_temp_enabled)
-		clk_disable_unprepare(pll1_sys_clk);
+		clk_disable_unprepare(clks[PLL1_SYS].clk);
 
 	/* scaling down?  scale voltage after frequency */
 	if (new_freq < old_freq) {
@@ -174,7 +194,7 @@ static int imx6q_cpufreq_init(struct cpufreq_policy *policy)
 {
 	int ret;
 
-	policy->clk = arm_clk;
+	policy->clk = clks[ARM].clk;
 	ret = cpufreq_generic_init(policy, freq_table, transition_latency);
 	policy->suspend_freq = policy->max;
 
@@ -244,6 +264,43 @@ static void imx6q_opp_check_speed_grading(struct device *dev)
 	of_node_put(np);
 }
 
+#define OCOTP_CFG3_6UL_SPEED_696MHZ	0x2
+
+static void imx6ul_opp_check_speed_grading(struct device *dev)
+{
+	struct device_node *np;
+	void __iomem *base;
+	u32 val;
+
+	np = of_find_compatible_node(NULL, NULL, "fsl,imx6ul-ocotp");
+	if (!np)
+		return;
+
+	base = of_iomap(np, 0);
+	if (!base) {
+		dev_err(dev, "failed to map ocotp\n");
+		goto put_node;
+	}
+
+	/*
+	 * Speed GRADING[1:0] defines the max speed of ARM:
+	 * 2b'00: Reserved;
+	 * 2b'01: 528000000Hz;
+	 * 2b'10: 696000000Hz;
+	 * 2b'11: Reserved;
+	 * We need to set the max speed of ARM according to fuse map.
+	 */
+	val = readl_relaxed(base + OCOTP_CFG3);
+	val >>= OCOTP_CFG3_SPEED_SHIFT;
+	val &= 0x3;
+	if (val != OCOTP_CFG3_6UL_SPEED_696MHZ)
+		if (dev_pm_opp_disable(dev, 696000000))
+			dev_warn(dev, "failed to disable 696MHz OPP\n");
+	iounmap(base);
+put_node:
+	of_node_put(np);
+}
+
 static int imx6q_cpufreq_probe(struct platform_device *pdev)
 {
 	struct device_node *np;
@@ -266,28 +323,15 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
 		return -ENOENT;
 	}
 
-	arm_clk = clk_get(cpu_dev, "arm");
-	pll1_sys_clk = clk_get(cpu_dev, "pll1_sys");
-	pll1_sw_clk = clk_get(cpu_dev, "pll1_sw");
-	step_clk = clk_get(cpu_dev, "step");
-	pll2_pfd2_396m_clk = clk_get(cpu_dev, "pll2_pfd2_396m");
-	if (IS_ERR(arm_clk) || IS_ERR(pll1_sys_clk) || IS_ERR(pll1_sw_clk) ||
-	    IS_ERR(step_clk) || IS_ERR(pll2_pfd2_396m_clk)) {
-		dev_err(cpu_dev, "failed to get clocks\n");
-		ret = -ENOENT;
-		goto put_clk;
-	}
-
 	if (of_machine_is_compatible("fsl,imx6ul") ||
-	    of_machine_is_compatible("fsl,imx6ull")) {
-		pll2_bus_clk = clk_get(cpu_dev, "pll2_bus");
-		secondary_sel_clk = clk_get(cpu_dev, "secondary_sel");
-		if (IS_ERR(pll2_bus_clk) || IS_ERR(secondary_sel_clk)) {
-			dev_err(cpu_dev, "failed to get clocks specific to imx6ul\n");
-			ret = -ENOENT;
-			goto put_clk;
-		}
-	}
+	    of_machine_is_compatible("fsl,imx6ull"))
+		num_clks = IMX6UL_CPUFREQ_CLK_NUM;
+	else
+		num_clks = IMX6Q_CPUFREQ_CLK_NUM;
+
+	ret = clk_bulk_get(cpu_dev, num_clks, clks);
+	if (ret)
+		goto put_node;
 
 	arm_reg = regulator_get(cpu_dev, "arm");
 	pu_reg = regulator_get_optional(cpu_dev, "pu");
@@ -311,7 +355,10 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
 		goto put_reg;
 	}
 
-	imx6q_opp_check_speed_grading(cpu_dev);
+	if (of_machine_is_compatible("fsl,imx6ul"))
+		imx6ul_opp_check_speed_grading(cpu_dev);
+	else
+		imx6q_opp_check_speed_grading(cpu_dev);
 
 	/* Because we have added the OPPs here, we must free them */
 	free_opp = true;
@@ -424,22 +471,11 @@ static int imx6q_cpufreq_probe(struct platform_device *pdev)
 		regulator_put(pu_reg);
 	if (!IS_ERR(soc_reg))
 		regulator_put(soc_reg);
-put_clk:
-	if (!IS_ERR(arm_clk))
-		clk_put(arm_clk);
-	if (!IS_ERR(pll1_sys_clk))
-		clk_put(pll1_sys_clk);
-	if (!IS_ERR(pll1_sw_clk))
-		clk_put(pll1_sw_clk);
-	if (!IS_ERR(step_clk))
-		clk_put(step_clk);
-	if (!IS_ERR(pll2_pfd2_396m_clk))
-		clk_put(pll2_pfd2_396m_clk);
-	if (!IS_ERR(pll2_bus_clk))
-		clk_put(pll2_bus_clk);
-	if (!IS_ERR(secondary_sel_clk))
-		clk_put(secondary_sel_clk);
+
+	clk_bulk_put(num_clks, clks);
+put_node:
 	of_node_put(np);
+
 	return ret;
 }
 
@@ -453,13 +489,8 @@ static int imx6q_cpufreq_remove(struct platform_device *pdev)
 	if (!IS_ERR(pu_reg))
 		regulator_put(pu_reg);
 	regulator_put(soc_reg);
-	clk_put(arm_clk);
-	clk_put(pll1_sys_clk);
-	clk_put(pll1_sw_clk);
-	clk_put(step_clk);
-	clk_put(pll2_pfd2_396m_clk);
-	clk_put(pll2_bus_clk);
-	clk_put(secondary_sel_clk);
+
+	clk_bulk_put(num_clks, clks);
 
 	return 0;
 }
diff --git a/drivers/cpufreq/intel_pstate.c b/drivers/cpufreq/intel_pstate.c
index 93a0e88..7edf7a0 100644
--- a/drivers/cpufreq/intel_pstate.c
+++ b/drivers/cpufreq/intel_pstate.c
@@ -1595,15 +1595,6 @@ static const struct pstate_funcs knl_funcs = {
 	.get_val = core_get_val,
 };
 
-static const struct pstate_funcs bxt_funcs = {
-	.get_max = core_get_max_pstate,
-	.get_max_physical = core_get_max_pstate_physical,
-	.get_min = core_get_min_pstate,
-	.get_turbo = core_get_turbo_pstate,
-	.get_scaling = core_get_scaling,
-	.get_val = core_get_val,
-};
-
 #define ICPU(model, policy) \
 	{ X86_VENDOR_INTEL, 6, model, X86_FEATURE_APERFMPERF,\
 			(unsigned long)&policy }
@@ -1627,8 +1618,9 @@ static const struct x86_cpu_id intel_pstate_cpu_ids[] = {
 	ICPU(INTEL_FAM6_BROADWELL_XEON_D,	core_funcs),
 	ICPU(INTEL_FAM6_XEON_PHI_KNL,		knl_funcs),
 	ICPU(INTEL_FAM6_XEON_PHI_KNM,		knl_funcs),
-	ICPU(INTEL_FAM6_ATOM_GOLDMONT,		bxt_funcs),
-	ICPU(INTEL_FAM6_ATOM_GEMINI_LAKE,       bxt_funcs),
+	ICPU(INTEL_FAM6_ATOM_GOLDMONT,		core_funcs),
+	ICPU(INTEL_FAM6_ATOM_GEMINI_LAKE,       core_funcs),
+	ICPU(INTEL_FAM6_SKYLAKE_X,		core_funcs),
 	{}
 };
 MODULE_DEVICE_TABLE(x86cpu, intel_pstate_cpu_ids);
diff --git a/drivers/cpufreq/longhaul.c b/drivers/cpufreq/longhaul.c
index c46a12d..5faa37c 100644
--- a/drivers/cpufreq/longhaul.c
+++ b/drivers/cpufreq/longhaul.c
@@ -894,7 +894,7 @@ static int longhaul_cpu_init(struct cpufreq_policy *policy)
 	if ((longhaul_version != TYPE_LONGHAUL_V1) && (scale_voltage != 0))
 		longhaul_setup_voltagescaling();
 
-	policy->cpuinfo.transition_latency = 200000;	/* nsec */
+	policy->transition_delay_us = 200000;	/* usec */
 
 	return cpufreq_table_validate_and_show(policy, longhaul_table);
 }
diff --git a/drivers/cpufreq/mediatek-cpufreq.c b/drivers/cpufreq/mediatek-cpufreq.c
index e0d5090..8c04ddd 100644
--- a/drivers/cpufreq/mediatek-cpufreq.c
+++ b/drivers/cpufreq/mediatek-cpufreq.c
@@ -310,28 +310,8 @@ static int mtk_cpufreq_set_target(struct cpufreq_policy *policy,
 static void mtk_cpufreq_ready(struct cpufreq_policy *policy)
 {
 	struct mtk_cpu_dvfs_info *info = policy->driver_data;
-	struct device_node *np = of_node_get(info->cpu_dev->of_node);
-	u32 capacitance = 0;
 
-	if (WARN_ON(!np))
-		return;
-
-	if (of_find_property(np, "#cooling-cells", NULL)) {
-		of_property_read_u32(np, DYNAMIC_POWER, &capacitance);
-
-		info->cdev = of_cpufreq_power_cooling_register(np,
-						policy, capacitance, NULL);
-
-		if (IS_ERR(info->cdev)) {
-			dev_err(info->cpu_dev,
-				"running cpufreq without cooling device: %ld\n",
-				PTR_ERR(info->cdev));
-
-			info->cdev = NULL;
-		}
-	}
-
-	of_node_put(np);
+	info->cdev = of_cpufreq_cooling_register(policy);
 }
 
 static int mtk_cpu_dvfs_info_init(struct mtk_cpu_dvfs_info *info, int cpu)
@@ -574,6 +554,7 @@ static struct platform_driver mtk_cpufreq_platdrv = {
 /* List of machines supported by this driver */
 static const struct of_device_id mtk_cpufreq_machines[] __initconst = {
 	{ .compatible = "mediatek,mt2701", },
+	{ .compatible = "mediatek,mt2712", },
 	{ .compatible = "mediatek,mt7622", },
 	{ .compatible = "mediatek,mt7623", },
 	{ .compatible = "mediatek,mt817x", },
diff --git a/drivers/cpufreq/mvebu-cpufreq.c b/drivers/cpufreq/mvebu-cpufreq.c
index ed915ee..31513bd 100644
--- a/drivers/cpufreq/mvebu-cpufreq.c
+++ b/drivers/cpufreq/mvebu-cpufreq.c
@@ -76,12 +76,6 @@ static int __init armada_xp_pmsu_cpufreq_init(void)
 			return PTR_ERR(clk);
 		}
 
-		/*
-		 * In case of a failure of dev_pm_opp_add(), we don't
-		 * bother with cleaning up the registered OPP (there's
-		 * no function to do so), and simply cancel the
-		 * registration of the cpufreq device.
-		 */
 		ret = dev_pm_opp_add(cpu_dev, clk_get_rate(clk), 0);
 		if (ret) {
 			clk_put(clk);
@@ -91,7 +85,8 @@ static int __init armada_xp_pmsu_cpufreq_init(void)
 		ret = dev_pm_opp_add(cpu_dev, clk_get_rate(clk) / 2, 0);
 		if (ret) {
 			clk_put(clk);
-			return ret;
+			dev_err(cpu_dev, "Failed to register OPPs\n");
+			goto opp_register_failed;
 		}
 
 		ret = dev_pm_opp_set_sharing_cpus(cpu_dev,
@@ -99,9 +94,16 @@ static int __init armada_xp_pmsu_cpufreq_init(void)
 		if (ret)
 			dev_err(cpu_dev, "%s: failed to mark OPPs as shared: %d\n",
 				__func__, ret);
+		clk_put(clk);
 	}
 
 	platform_device_register_simple("cpufreq-dt", -1, NULL, 0);
 	return 0;
+
+opp_register_failed:
+	/* As registering has failed remove all the opp for all cpus */
+	dev_pm_opp_cpumask_remove_table(cpu_possible_mask);
+
+	return ret;
 }
 device_initcall(armada_xp_pmsu_cpufreq_init);
diff --git a/drivers/cpufreq/powernv-cpufreq.c b/drivers/cpufreq/powernv-cpufreq.c
index b6d7c4c..29cdec1 100644
--- a/drivers/cpufreq/powernv-cpufreq.c
+++ b/drivers/cpufreq/powernv-cpufreq.c
@@ -29,6 +29,7 @@
 #include <linux/reboot.h>
 #include <linux/slab.h>
 #include <linux/cpu.h>
+#include <linux/hashtable.h>
 #include <trace/events/power.h>
 
 #include <asm/cputhreads.h>
@@ -38,14 +39,13 @@
 #include <asm/opal.h>
 #include <linux/timer.h>
 
-#define POWERNV_MAX_PSTATES	256
+#define POWERNV_MAX_PSTATES_ORDER  8
+#define POWERNV_MAX_PSTATES	(1UL << (POWERNV_MAX_PSTATES_ORDER))
 #define PMSR_PSAFE_ENABLE	(1UL << 30)
 #define PMSR_SPR_EM_DISABLE	(1UL << 31)
-#define PMSR_MAX(x)		((x >> 32) & 0xFF)
+#define MAX_PSTATE_SHIFT	32
 #define LPSTATE_SHIFT		48
 #define GPSTATE_SHIFT		56
-#define GET_LPSTATE(x)		(((x) >> LPSTATE_SHIFT) & 0xFF)
-#define GET_GPSTATE(x)		(((x) >> GPSTATE_SHIFT) & 0xFF)
 
 #define MAX_RAMP_DOWN_TIME				5120
 /*
@@ -94,6 +94,27 @@ struct global_pstate_info {
 };
 
 static struct cpufreq_frequency_table powernv_freqs[POWERNV_MAX_PSTATES+1];
+
+DEFINE_HASHTABLE(pstate_revmap, POWERNV_MAX_PSTATES_ORDER);
+/**
+ * struct pstate_idx_revmap_data: Entry in the hashmap pstate_revmap
+ *				  indexed by a function of pstate id.
+ *
+ * @pstate_id: pstate id for this entry.
+ *
+ * @cpufreq_table_idx: Index into the powernv_freqs
+ *		       cpufreq_frequency_table for frequency
+ *		       corresponding to pstate_id.
+ *
+ * @hentry: hlist_node that hooks this entry into the pstate_revmap
+ *	    hashtable
+ */
+struct pstate_idx_revmap_data {
+	u8 pstate_id;
+	unsigned int cpufreq_table_idx;
+	struct hlist_node hentry;
+};
+
 static bool rebooting, throttled, occ_reset;
 
 static const char * const throttle_reason[] = {
@@ -148,39 +169,56 @@ static struct powernv_pstate_info {
 	bool wof_enabled;
 } powernv_pstate_info;
 
-/* Use following macros for conversions between pstate_id and index */
-static inline int idx_to_pstate(unsigned int i)
+static inline u8 extract_pstate(u64 pmsr_val, unsigned int shift)
+{
+	return ((pmsr_val >> shift) & 0xFF);
+}
+
+#define extract_local_pstate(x) extract_pstate(x, LPSTATE_SHIFT)
+#define extract_global_pstate(x) extract_pstate(x, GPSTATE_SHIFT)
+#define extract_max_pstate(x)  extract_pstate(x, MAX_PSTATE_SHIFT)
+
+/* Use following functions for conversions between pstate_id and index */
+
+/**
+ * idx_to_pstate : Returns the pstate id corresponding to the
+ *		   frequency in the cpufreq frequency table
+ *		   powernv_freqs indexed by @i.
+ *
+ *		   If @i is out of bound, this will return the pstate
+ *		   corresponding to the nominal frequency.
+ */
+static inline u8 idx_to_pstate(unsigned int i)
 {
 	if (unlikely(i >= powernv_pstate_info.nr_pstates)) {
-		pr_warn_once("index %u is out of bound\n", i);
+		pr_warn_once("idx_to_pstate: index %u is out of bound\n", i);
 		return powernv_freqs[powernv_pstate_info.nominal].driver_data;
 	}
 
 	return powernv_freqs[i].driver_data;
 }
 
-static inline unsigned int pstate_to_idx(int pstate)
+/**
+ * pstate_to_idx : Returns the index in the cpufreq frequencytable
+ *		   powernv_freqs for the frequency whose corresponding
+ *		   pstate id is @pstate.
+ *
+ *		   If no frequency corresponding to @pstate is found,
+ *		   this will return the index of the nominal
+ *		   frequency.
+ */
+static unsigned int pstate_to_idx(u8 pstate)
 {
-	int min = powernv_freqs[powernv_pstate_info.min].driver_data;
-	int max = powernv_freqs[powernv_pstate_info.max].driver_data;
+	unsigned int key = pstate % POWERNV_MAX_PSTATES;
+	struct pstate_idx_revmap_data *revmap_data;
 
-	if (min > 0) {
-		if (unlikely((pstate < max) || (pstate > min))) {
-			pr_warn_once("pstate %d is out of bound\n", pstate);
-			return powernv_pstate_info.nominal;
-		}
-	} else {
-		if (unlikely((pstate > max) || (pstate < min))) {
-			pr_warn_once("pstate %d is out of bound\n", pstate);
-			return powernv_pstate_info.nominal;
-		}
+	hash_for_each_possible(pstate_revmap, revmap_data, hentry, key) {
+		if (revmap_data->pstate_id == pstate)
+			return revmap_data->cpufreq_table_idx;
 	}
-	/*
-	 * abs() is deliberately used so that is works with
-	 * both monotonically increasing and decreasing
-	 * pstate values
-	 */
-	return abs(pstate - idx_to_pstate(powernv_pstate_info.max));
+
+	pr_warn_once("pstate_to_idx: pstate 0x%x not found\n", pstate);
+	return powernv_pstate_info.nominal;
 }
 
 static inline void reset_gpstates(struct cpufreq_policy *policy)
@@ -247,7 +285,7 @@ static int init_powernv_pstates(void)
 		powernv_pstate_info.wof_enabled = true;
 
 next:
-	pr_info("cpufreq pstate min %d nominal %d max %d\n", pstate_min,
+	pr_info("cpufreq pstate min 0x%x nominal 0x%x max 0x%x\n", pstate_min,
 		pstate_nominal, pstate_max);
 	pr_info("Workload Optimized Frequency is %s in the platform\n",
 		(powernv_pstate_info.wof_enabled) ? "enabled" : "disabled");
@@ -278,19 +316,30 @@ static int init_powernv_pstates(void)
 
 	powernv_pstate_info.nr_pstates = nr_pstates;
 	pr_debug("NR PStates %d\n", nr_pstates);
+
 	for (i = 0; i < nr_pstates; i++) {
 		u32 id = be32_to_cpu(pstate_ids[i]);
 		u32 freq = be32_to_cpu(pstate_freqs[i]);
+		struct pstate_idx_revmap_data *revmap_data;
+		unsigned int key;
 
 		pr_debug("PState id %d freq %d MHz\n", id, freq);
 		powernv_freqs[i].frequency = freq * 1000; /* kHz */
-		powernv_freqs[i].driver_data = id;
+		powernv_freqs[i].driver_data = id & 0xFF;
+
+		revmap_data = (struct pstate_idx_revmap_data *)
+			      kmalloc(sizeof(*revmap_data), GFP_KERNEL);
+
+		revmap_data->pstate_id = id & 0xFF;
+		revmap_data->cpufreq_table_idx = i;
+		key = (revmap_data->pstate_id) % POWERNV_MAX_PSTATES;
+		hash_add(pstate_revmap, &revmap_data->hentry, key);
 
 		if (id == pstate_max)
 			powernv_pstate_info.max = i;
-		else if (id == pstate_nominal)
+		if (id == pstate_nominal)
 			powernv_pstate_info.nominal = i;
-		else if (id == pstate_min)
+		if (id == pstate_min)
 			powernv_pstate_info.min = i;
 
 		if (powernv_pstate_info.wof_enabled && id == pstate_turbo) {
@@ -307,14 +356,13 @@ static int init_powernv_pstates(void)
 }
 
 /* Returns the CPU frequency corresponding to the pstate_id. */
-static unsigned int pstate_id_to_freq(int pstate_id)
+static unsigned int pstate_id_to_freq(u8 pstate_id)
 {
 	int i;
 
 	i = pstate_to_idx(pstate_id);
 	if (i >= powernv_pstate_info.nr_pstates || i < 0) {
-		pr_warn("PState id %d outside of PState table, "
-			"reporting nominal id %d instead\n",
+		pr_warn("PState id 0x%x outside of PState table, reporting nominal id 0x%x instead\n",
 			pstate_id, idx_to_pstate(powernv_pstate_info.nominal));
 		i = powernv_pstate_info.nominal;
 	}
@@ -420,8 +468,8 @@ static inline void set_pmspr(unsigned long sprn, unsigned long val)
  */
 struct powernv_smp_call_data {
 	unsigned int freq;
-	int pstate_id;
-	int gpstate_id;
+	u8 pstate_id;
+	u8 gpstate_id;
 };
 
 /*
@@ -438,22 +486,15 @@ struct powernv_smp_call_data {
 static void powernv_read_cpu_freq(void *arg)
 {
 	unsigned long pmspr_val;
-	s8 local_pstate_id;
 	struct powernv_smp_call_data *freq_data = arg;
 
 	pmspr_val = get_pmspr(SPRN_PMSR);
-
-	/*
-	 * The local pstate id corresponds bits 48..55 in the PMSR.
-	 * Note: Watch out for the sign!
-	 */
-	local_pstate_id = (pmspr_val >> 48) & 0xFF;
-	freq_data->pstate_id = local_pstate_id;
+	freq_data->pstate_id = extract_local_pstate(pmspr_val);
 	freq_data->freq = pstate_id_to_freq(freq_data->pstate_id);
 
-	pr_debug("cpu %d pmsr %016lX pstate_id %d frequency %d kHz\n",
-		raw_smp_processor_id(), pmspr_val, freq_data->pstate_id,
-		freq_data->freq);
+	pr_debug("cpu %d pmsr %016lX pstate_id 0x%x frequency %d kHz\n",
+		 raw_smp_processor_id(), pmspr_val, freq_data->pstate_id,
+		 freq_data->freq);
 }
 
 /*
@@ -515,21 +556,21 @@ static void powernv_cpufreq_throttle_check(void *data)
 	struct chip *chip;
 	unsigned int cpu = smp_processor_id();
 	unsigned long pmsr;
-	int pmsr_pmax;
+	u8 pmsr_pmax;
 	unsigned int pmsr_pmax_idx;
 
 	pmsr = get_pmspr(SPRN_PMSR);
 	chip = this_cpu_read(chip_info);
 
 	/* Check for Pmax Capping */
-	pmsr_pmax = (s8)PMSR_MAX(pmsr);
+	pmsr_pmax = extract_max_pstate(pmsr);
 	pmsr_pmax_idx = pstate_to_idx(pmsr_pmax);
 	if (pmsr_pmax_idx != powernv_pstate_info.max) {
 		if (chip->throttled)
 			goto next;
 		chip->throttled = true;
 		if (pmsr_pmax_idx > powernv_pstate_info.nominal) {
-			pr_warn_once("CPU %d on Chip %u has Pmax(%d) reduced below nominal frequency(%d)\n",
+			pr_warn_once("CPU %d on Chip %u has Pmax(0x%x) reduced below that of nominal frequency(0x%x)\n",
 				     cpu, chip->id, pmsr_pmax,
 				     idx_to_pstate(powernv_pstate_info.nominal));
 			chip->throttle_sub_turbo++;
@@ -645,8 +686,8 @@ void gpstate_timer_handler(struct timer_list *t)
 	 * value. Hence, read from PMCR to get correct data.
 	 */
 	val = get_pmspr(SPRN_PMCR);
-	freq_data.gpstate_id = (s8)GET_GPSTATE(val);
-	freq_data.pstate_id = (s8)GET_LPSTATE(val);
+	freq_data.gpstate_id = extract_global_pstate(val);
+	freq_data.pstate_id = extract_local_pstate(val);
 	if (freq_data.gpstate_id  == freq_data.pstate_id) {
 		reset_gpstates(policy);
 		spin_unlock(&gpstates->gpstate_lock);
diff --git a/drivers/cpufreq/qoriq-cpufreq.c b/drivers/cpufreq/qoriq-cpufreq.c
index 4ada55b..0562761a 100644
--- a/drivers/cpufreq/qoriq-cpufreq.c
+++ b/drivers/cpufreq/qoriq-cpufreq.c
@@ -275,20 +275,8 @@ static int qoriq_cpufreq_target(struct cpufreq_policy *policy,
 static void qoriq_cpufreq_ready(struct cpufreq_policy *policy)
 {
 	struct cpu_data *cpud = policy->driver_data;
-	struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
 
-	if (of_find_property(np, "#cooling-cells", NULL)) {
-		cpud->cdev = of_cpufreq_cooling_register(np, policy);
-
-		if (IS_ERR(cpud->cdev) && PTR_ERR(cpud->cdev) != -ENOSYS) {
-			pr_err("cpu%d is not running as cooling device: %ld\n",
-					policy->cpu, PTR_ERR(cpud->cdev));
-
-			cpud->cdev = NULL;
-		}
-	}
-
-	of_node_put(np);
+	cpud->cdev = of_cpufreq_cooling_register(policy);
 }
 
 static struct cpufreq_driver qoriq_cpufreq_driver = {
diff --git a/drivers/cpufreq/scpi-cpufreq.c b/drivers/cpufreq/scpi-cpufreq.c
index 05d2990..247fcbf 100644
--- a/drivers/cpufreq/scpi-cpufreq.c
+++ b/drivers/cpufreq/scpi-cpufreq.c
@@ -18,27 +18,89 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include <linux/clk.h>
 #include <linux/cpu.h>
 #include <linux/cpufreq.h>
+#include <linux/cpumask.h>
+#include <linux/cpu_cooling.h>
+#include <linux/export.h>
 #include <linux/module.h>
-#include <linux/platform_device.h>
+#include <linux/of_platform.h>
 #include <linux/pm_opp.h>
 #include <linux/scpi_protocol.h>
+#include <linux/slab.h>
 #include <linux/types.h>
 
-#include "arm_big_little.h"
+struct scpi_data {
+	struct clk *clk;
+	struct device *cpu_dev;
+	struct thermal_cooling_device *cdev;
+};
 
 static struct scpi_ops *scpi_ops;
 
-static int scpi_get_transition_latency(struct device *cpu_dev)
+static unsigned int scpi_cpufreq_get_rate(unsigned int cpu)
 {
-	return scpi_ops->get_transition_latency(cpu_dev);
+	struct cpufreq_policy *policy = cpufreq_cpu_get_raw(cpu);
+	struct scpi_data *priv = policy->driver_data;
+	unsigned long rate = clk_get_rate(priv->clk);
+
+	return rate / 1000;
 }
 
-static int scpi_init_opp_table(const struct cpumask *cpumask)
+static int
+scpi_cpufreq_set_target(struct cpufreq_policy *policy, unsigned int index)
+{
+	struct scpi_data *priv = policy->driver_data;
+	u64 rate = policy->freq_table[index].frequency * 1000;
+	int ret;
+
+	ret = clk_set_rate(priv->clk, rate);
+	if (!ret && (clk_get_rate(priv->clk) != rate))
+		ret = -EIO;
+
+	return ret;
+}
+
+static int
+scpi_get_sharing_cpus(struct device *cpu_dev, struct cpumask *cpumask)
+{
+	int cpu, domain, tdomain;
+	struct device *tcpu_dev;
+
+	domain = scpi_ops->device_domain_id(cpu_dev);
+	if (domain < 0)
+		return domain;
+
+	for_each_possible_cpu(cpu) {
+		if (cpu == cpu_dev->id)
+			continue;
+
+		tcpu_dev = get_cpu_device(cpu);
+		if (!tcpu_dev)
+			continue;
+
+		tdomain = scpi_ops->device_domain_id(tcpu_dev);
+		if (tdomain == domain)
+			cpumask_set_cpu(cpu, cpumask);
+	}
+
+	return 0;
+}
+
+static int scpi_cpufreq_init(struct cpufreq_policy *policy)
 {
 	int ret;
-	struct device *cpu_dev = get_cpu_device(cpumask_first(cpumask));
+	unsigned int latency;
+	struct device *cpu_dev;
+	struct scpi_data *priv;
+	struct cpufreq_frequency_table *freq_table;
+
+	cpu_dev = get_cpu_device(policy->cpu);
+	if (!cpu_dev) {
+		pr_err("failed to get cpu%d device\n", policy->cpu);
+		return -ENODEV;
+	}
 
 	ret = scpi_ops->add_opps_to_device(cpu_dev);
 	if (ret) {
@@ -46,32 +108,133 @@ static int scpi_init_opp_table(const struct cpumask *cpumask)
 		return ret;
 	}
 
-	ret = dev_pm_opp_set_sharing_cpus(cpu_dev, cpumask);
-	if (ret)
+	ret = scpi_get_sharing_cpus(cpu_dev, policy->cpus);
+	if (ret) {
+		dev_warn(cpu_dev, "failed to get sharing cpumask\n");
+		return ret;
+	}
+
+	ret = dev_pm_opp_set_sharing_cpus(cpu_dev, policy->cpus);
+	if (ret) {
 		dev_err(cpu_dev, "%s: failed to mark OPPs as shared: %d\n",
 			__func__, ret);
+		return ret;
+	}
+
+	ret = dev_pm_opp_get_opp_count(cpu_dev);
+	if (ret <= 0) {
+		dev_dbg(cpu_dev, "OPP table is not ready, deferring probe\n");
+		ret = -EPROBE_DEFER;
+		goto out_free_opp;
+	}
+
+	priv = kzalloc(sizeof(*priv), GFP_KERNEL);
+	if (!priv) {
+		ret = -ENOMEM;
+		goto out_free_opp;
+	}
+
+	ret = dev_pm_opp_init_cpufreq_table(cpu_dev, &freq_table);
+	if (ret) {
+		dev_err(cpu_dev, "failed to init cpufreq table: %d\n", ret);
+		goto out_free_priv;
+	}
+
+	priv->cpu_dev = cpu_dev;
+	priv->clk = clk_get(cpu_dev, NULL);
+	if (IS_ERR(priv->clk)) {
+		dev_err(cpu_dev, "%s: Failed to get clk for cpu: %d\n",
+			__func__, cpu_dev->id);
+		goto out_free_cpufreq_table;
+	}
+
+	policy->driver_data = priv;
+
+	ret = cpufreq_table_validate_and_show(policy, freq_table);
+	if (ret) {
+		dev_err(cpu_dev, "%s: invalid frequency table: %d\n", __func__,
+			ret);
+		goto out_put_clk;
+	}
+
+	/* scpi allows DVFS request for any domain from any CPU */
+	policy->dvfs_possible_from_any_cpu = true;
+
+	latency = scpi_ops->get_transition_latency(cpu_dev);
+	if (!latency)
+		latency = CPUFREQ_ETERNAL;
+
+	policy->cpuinfo.transition_latency = latency;
+
+	policy->fast_switch_possible = false;
+	return 0;
+
+out_put_clk:
+	clk_put(priv->clk);
+out_free_cpufreq_table:
+	dev_pm_opp_free_cpufreq_table(cpu_dev, &freq_table);
+out_free_priv:
+	kfree(priv);
+out_free_opp:
+	dev_pm_opp_cpumask_remove_table(policy->cpus);
+
 	return ret;
 }
 
-static const struct cpufreq_arm_bL_ops scpi_cpufreq_ops = {
-	.name	= "scpi",
-	.get_transition_latency = scpi_get_transition_latency,
-	.init_opp_table = scpi_init_opp_table,
-	.free_opp_table = dev_pm_opp_cpumask_remove_table,
+static int scpi_cpufreq_exit(struct cpufreq_policy *policy)
+{
+	struct scpi_data *priv = policy->driver_data;
+
+	cpufreq_cooling_unregister(priv->cdev);
+	clk_put(priv->clk);
+	dev_pm_opp_free_cpufreq_table(priv->cpu_dev, &policy->freq_table);
+	kfree(priv);
+	dev_pm_opp_cpumask_remove_table(policy->related_cpus);
+
+	return 0;
+}
+
+static void scpi_cpufreq_ready(struct cpufreq_policy *policy)
+{
+	struct scpi_data *priv = policy->driver_data;
+	struct thermal_cooling_device *cdev;
+
+	cdev = of_cpufreq_cooling_register(policy);
+	if (!IS_ERR(cdev))
+		priv->cdev = cdev;
+}
+
+static struct cpufreq_driver scpi_cpufreq_driver = {
+	.name	= "scpi-cpufreq",
+	.flags	= CPUFREQ_STICKY | CPUFREQ_HAVE_GOVERNOR_PER_POLICY |
+		  CPUFREQ_NEED_INITIAL_FREQ_CHECK,
+	.verify	= cpufreq_generic_frequency_table_verify,
+	.attr	= cpufreq_generic_attr,
+	.get	= scpi_cpufreq_get_rate,
+	.init	= scpi_cpufreq_init,
+	.exit	= scpi_cpufreq_exit,
+	.ready	= scpi_cpufreq_ready,
+	.target_index	= scpi_cpufreq_set_target,
 };
 
 static int scpi_cpufreq_probe(struct platform_device *pdev)
 {
+	int ret;
+
 	scpi_ops = get_scpi_ops();
 	if (!scpi_ops)
 		return -EIO;
 
-	return bL_cpufreq_register(&scpi_cpufreq_ops);
+	ret = cpufreq_register_driver(&scpi_cpufreq_driver);
+	if (ret)
+		dev_err(&pdev->dev, "%s: registering cpufreq failed, err: %d\n",
+			__func__, ret);
+	return ret;
 }
 
 static int scpi_cpufreq_remove(struct platform_device *pdev)
 {
-	bL_cpufreq_unregister(&scpi_cpufreq_ops);
+	cpufreq_unregister_driver(&scpi_cpufreq_driver);
 	scpi_ops = NULL;
 	return 0;
 }
diff --git a/drivers/cpufreq/ti-cpufreq.c b/drivers/cpufreq/ti-cpufreq.c
index 923317f..a099b7bf 100644
--- a/drivers/cpufreq/ti-cpufreq.c
+++ b/drivers/cpufreq/ti-cpufreq.c
@@ -17,6 +17,7 @@
 #include <linux/cpu.h>
 #include <linux/io.h>
 #include <linux/mfd/syscon.h>
+#include <linux/module.h>
 #include <linux/init.h>
 #include <linux/of.h>
 #include <linux/of_platform.h>
@@ -50,6 +51,7 @@ struct ti_cpufreq_soc_data {
 	unsigned long efuse_mask;
 	unsigned long efuse_shift;
 	unsigned long rev_offset;
+	bool multi_regulator;
 };
 
 struct ti_cpufreq_data {
@@ -57,6 +59,7 @@ struct ti_cpufreq_data {
 	struct device_node *opp_node;
 	struct regmap *syscon;
 	const struct ti_cpufreq_soc_data *soc_data;
+	struct opp_table *opp_table;
 };
 
 static unsigned long amx3_efuse_xlate(struct ti_cpufreq_data *opp_data,
@@ -95,6 +98,7 @@ static struct ti_cpufreq_soc_data am3x_soc_data = {
 	.efuse_offset = 0x07fc,
 	.efuse_mask = 0x1fff,
 	.rev_offset = 0x600,
+	.multi_regulator = false,
 };
 
 static struct ti_cpufreq_soc_data am4x_soc_data = {
@@ -103,6 +107,7 @@ static struct ti_cpufreq_soc_data am4x_soc_data = {
 	.efuse_offset = 0x0610,
 	.efuse_mask = 0x3f,
 	.rev_offset = 0x600,
+	.multi_regulator = false,
 };
 
 static struct ti_cpufreq_soc_data dra7_soc_data = {
@@ -111,6 +116,7 @@ static struct ti_cpufreq_soc_data dra7_soc_data = {
 	.efuse_mask = 0xf80000,
 	.efuse_shift = 19,
 	.rev_offset = 0x204,
+	.multi_regulator = true,
 };
 
 /**
@@ -195,12 +201,14 @@ static const struct of_device_id ti_cpufreq_of_match[] = {
 	{},
 };
 
-static int ti_cpufreq_init(void)
+static int ti_cpufreq_probe(struct platform_device *pdev)
 {
 	u32 version[VERSION_COUNT];
 	struct device_node *np;
 	const struct of_device_id *match;
+	struct opp_table *ti_opp_table;
 	struct ti_cpufreq_data *opp_data;
+	const char * const reg_names[] = {"vdd", "vbb"};
 	int ret;
 
 	np = of_find_node_by_path("/");
@@ -247,16 +255,29 @@ static int ti_cpufreq_init(void)
 	if (ret)
 		goto fail_put_node;
 
-	ret = PTR_ERR_OR_ZERO(dev_pm_opp_set_supported_hw(opp_data->cpu_dev,
-							  version, VERSION_COUNT));
-	if (ret) {
+	ti_opp_table = dev_pm_opp_set_supported_hw(opp_data->cpu_dev,
+						   version, VERSION_COUNT);
+	if (IS_ERR(ti_opp_table)) {
 		dev_err(opp_data->cpu_dev,
 			"Failed to set supported hardware\n");
+		ret = PTR_ERR(ti_opp_table);
 		goto fail_put_node;
 	}
 
-	of_node_put(opp_data->opp_node);
+	opp_data->opp_table = ti_opp_table;
 
+	if (opp_data->soc_data->multi_regulator) {
+		ti_opp_table = dev_pm_opp_set_regulators(opp_data->cpu_dev,
+							 reg_names,
+							 ARRAY_SIZE(reg_names));
+		if (IS_ERR(ti_opp_table)) {
+			dev_pm_opp_put_supported_hw(opp_data->opp_table);
+			ret =  PTR_ERR(ti_opp_table);
+			goto fail_put_node;
+		}
+	}
+
+	of_node_put(opp_data->opp_node);
 register_cpufreq_dt:
 	platform_device_register_simple("cpufreq-dt", -1, NULL, 0);
 
@@ -269,4 +290,22 @@ static int ti_cpufreq_init(void)
 
 	return ret;
 }
-device_initcall(ti_cpufreq_init);
+
+static int ti_cpufreq_init(void)
+{
+	platform_device_register_simple("ti-cpufreq", -1, NULL, 0);
+	return 0;
+}
+module_init(ti_cpufreq_init);
+
+static struct platform_driver ti_cpufreq_driver = {
+	.probe = ti_cpufreq_probe,
+	.driver = {
+		.name = "ti-cpufreq",
+	},
+};
+module_platform_driver(ti_cpufreq_driver);
+
+MODULE_DESCRIPTION("TI CPUFreq/OPP hw-supported driver");
+MODULE_AUTHOR("Dave Gerlach <d-gerlach@ti.com>");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/cpuidle/governor.c b/drivers/cpuidle/governor.c
index 4e78263..5d359af 100644
--- a/drivers/cpuidle/governor.c
+++ b/drivers/cpuidle/governor.c
@@ -36,14 +36,15 @@ static struct cpuidle_governor * __cpuidle_find_governor(const char *str)
 /**
  * cpuidle_switch_governor - changes the governor
  * @gov: the new target governor
- *
- * NOTE: "gov" can be NULL to specify disabled
  * Must be called with cpuidle_lock acquired.
  */
 int cpuidle_switch_governor(struct cpuidle_governor *gov)
 {
 	struct cpuidle_device *dev;
 
+	if (!gov)
+		return -EINVAL;
+
 	if (gov == cpuidle_curr_governor)
 		return 0;
 
diff --git a/drivers/crypto/chelsio/Kconfig b/drivers/crypto/chelsio/Kconfig
index 3e104f5..b56b3f7 100644
--- a/drivers/crypto/chelsio/Kconfig
+++ b/drivers/crypto/chelsio/Kconfig
@@ -5,6 +5,7 @@
 	select CRYPTO_SHA256
 	select CRYPTO_SHA512
 	select CRYPTO_AUTHENC
+	select CRYPTO_GF128MUL
 	---help---
 	  The Chelsio Crypto Co-processor driver for T6 adapters.
 
diff --git a/drivers/crypto/inside-secure/safexcel.c b/drivers/crypto/inside-secure/safexcel.c
index 89ba9e8..4bcef78 100644
--- a/drivers/crypto/inside-secure/safexcel.c
+++ b/drivers/crypto/inside-secure/safexcel.c
@@ -607,6 +607,7 @@ static inline void safexcel_handle_result_descriptor(struct safexcel_crypto_priv
 		ndesc = ctx->handle_result(priv, ring, sreq->req,
 					   &should_complete, &ret);
 		if (ndesc < 0) {
+			kfree(sreq);
 			dev_err(priv->dev, "failed to handle result (%d)", ndesc);
 			return;
 		}
diff --git a/drivers/crypto/inside-secure/safexcel_cipher.c b/drivers/crypto/inside-secure/safexcel_cipher.c
index 5438552..fcc0a60 100644
--- a/drivers/crypto/inside-secure/safexcel_cipher.c
+++ b/drivers/crypto/inside-secure/safexcel_cipher.c
@@ -14,6 +14,7 @@
 
 #include <crypto/aes.h>
 #include <crypto/skcipher.h>
+#include <crypto/internal/skcipher.h>
 
 #include "safexcel.h"
 
@@ -33,6 +34,10 @@ struct safexcel_cipher_ctx {
 	unsigned int key_len;
 };
 
+struct safexcel_cipher_req {
+	bool needs_inv;
+};
+
 static void safexcel_cipher_token(struct safexcel_cipher_ctx *ctx,
 				  struct crypto_async_request *async,
 				  struct safexcel_command_desc *cdesc,
@@ -126,9 +131,9 @@ static int safexcel_context_control(struct safexcel_cipher_ctx *ctx,
 	return 0;
 }
 
-static int safexcel_handle_result(struct safexcel_crypto_priv *priv, int ring,
-				  struct crypto_async_request *async,
-				  bool *should_complete, int *ret)
+static int safexcel_handle_req_result(struct safexcel_crypto_priv *priv, int ring,
+				      struct crypto_async_request *async,
+				      bool *should_complete, int *ret)
 {
 	struct skcipher_request *req = skcipher_request_cast(async);
 	struct safexcel_result_desc *rdesc;
@@ -265,7 +270,6 @@ static int safexcel_aes_send(struct crypto_async_request *async,
 	spin_unlock_bh(&priv->ring[ring].egress_lock);
 
 	request->req = &req->base;
-	ctx->base.handle_result = safexcel_handle_result;
 
 	*commands = n_cdesc;
 	*results = n_rdesc;
@@ -341,8 +345,6 @@ static int safexcel_handle_inv_result(struct safexcel_crypto_priv *priv,
 
 	ring = safexcel_select_ring(priv);
 	ctx->base.ring = ring;
-	ctx->base.needs_inv = false;
-	ctx->base.send = safexcel_aes_send;
 
 	spin_lock_bh(&priv->ring[ring].queue_lock);
 	enq_ret = crypto_enqueue_request(&priv->ring[ring].queue, async);
@@ -359,6 +361,26 @@ static int safexcel_handle_inv_result(struct safexcel_crypto_priv *priv,
 	return ndesc;
 }
 
+static int safexcel_handle_result(struct safexcel_crypto_priv *priv, int ring,
+				  struct crypto_async_request *async,
+				  bool *should_complete, int *ret)
+{
+	struct skcipher_request *req = skcipher_request_cast(async);
+	struct safexcel_cipher_req *sreq = skcipher_request_ctx(req);
+	int err;
+
+	if (sreq->needs_inv) {
+		sreq->needs_inv = false;
+		err = safexcel_handle_inv_result(priv, ring, async,
+						 should_complete, ret);
+	} else {
+		err = safexcel_handle_req_result(priv, ring, async,
+						 should_complete, ret);
+	}
+
+	return err;
+}
+
 static int safexcel_cipher_send_inv(struct crypto_async_request *async,
 				    int ring, struct safexcel_request *request,
 				    int *commands, int *results)
@@ -368,8 +390,6 @@ static int safexcel_cipher_send_inv(struct crypto_async_request *async,
 	struct safexcel_crypto_priv *priv = ctx->priv;
 	int ret;
 
-	ctx->base.handle_result = safexcel_handle_inv_result;
-
 	ret = safexcel_invalidate_cache(async, &ctx->base, priv,
 					ctx->base.ctxr_dma, ring, request);
 	if (unlikely(ret))
@@ -381,28 +401,46 @@ static int safexcel_cipher_send_inv(struct crypto_async_request *async,
 	return 0;
 }
 
+static int safexcel_send(struct crypto_async_request *async,
+			 int ring, struct safexcel_request *request,
+			 int *commands, int *results)
+{
+	struct skcipher_request *req = skcipher_request_cast(async);
+	struct safexcel_cipher_req *sreq = skcipher_request_ctx(req);
+	int ret;
+
+	if (sreq->needs_inv)
+		ret = safexcel_cipher_send_inv(async, ring, request,
+					       commands, results);
+	else
+		ret = safexcel_aes_send(async, ring, request,
+					commands, results);
+	return ret;
+}
+
 static int safexcel_cipher_exit_inv(struct crypto_tfm *tfm)
 {
 	struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct safexcel_crypto_priv *priv = ctx->priv;
-	struct skcipher_request req;
+	SKCIPHER_REQUEST_ON_STACK(req, __crypto_skcipher_cast(tfm));
+	struct safexcel_cipher_req *sreq = skcipher_request_ctx(req);
 	struct safexcel_inv_result result = {};
 	int ring = ctx->base.ring;
 
-	memset(&req, 0, sizeof(struct skcipher_request));
+	memset(req, 0, sizeof(struct skcipher_request));
 
 	/* create invalidation request */
 	init_completion(&result.completion);
-	skcipher_request_set_callback(&req, CRYPTO_TFM_REQ_MAY_BACKLOG,
-					safexcel_inv_complete, &result);
+	skcipher_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+				      safexcel_inv_complete, &result);
 
-	skcipher_request_set_tfm(&req, __crypto_skcipher_cast(tfm));
-	ctx = crypto_tfm_ctx(req.base.tfm);
+	skcipher_request_set_tfm(req, __crypto_skcipher_cast(tfm));
+	ctx = crypto_tfm_ctx(req->base.tfm);
 	ctx->base.exit_inv = true;
-	ctx->base.send = safexcel_cipher_send_inv;
+	sreq->needs_inv = true;
 
 	spin_lock_bh(&priv->ring[ring].queue_lock);
-	crypto_enqueue_request(&priv->ring[ring].queue, &req.base);
+	crypto_enqueue_request(&priv->ring[ring].queue, &req->base);
 	spin_unlock_bh(&priv->ring[ring].queue_lock);
 
 	if (!priv->ring[ring].need_dequeue)
@@ -424,19 +462,21 @@ static int safexcel_aes(struct skcipher_request *req,
 			enum safexcel_cipher_direction dir, u32 mode)
 {
 	struct safexcel_cipher_ctx *ctx = crypto_tfm_ctx(req->base.tfm);
+	struct safexcel_cipher_req *sreq = skcipher_request_ctx(req);
 	struct safexcel_crypto_priv *priv = ctx->priv;
 	int ret, ring;
 
+	sreq->needs_inv = false;
 	ctx->direction = dir;
 	ctx->mode = mode;
 
 	if (ctx->base.ctxr) {
-		if (ctx->base.needs_inv)
-			ctx->base.send = safexcel_cipher_send_inv;
+		if (ctx->base.needs_inv) {
+			sreq->needs_inv = true;
+			ctx->base.needs_inv = false;
+		}
 	} else {
 		ctx->base.ring = safexcel_select_ring(priv);
-		ctx->base.send = safexcel_aes_send;
-
 		ctx->base.ctxr = dma_pool_zalloc(priv->context_pool,
 						 EIP197_GFP_FLAGS(req->base),
 						 &ctx->base.ctxr_dma);
@@ -476,6 +516,11 @@ static int safexcel_skcipher_cra_init(struct crypto_tfm *tfm)
 			     alg.skcipher.base);
 
 	ctx->priv = tmpl->priv;
+	ctx->base.send = safexcel_send;
+	ctx->base.handle_result = safexcel_handle_result;
+
+	crypto_skcipher_set_reqsize(__crypto_skcipher_cast(tfm),
+				    sizeof(struct safexcel_cipher_req));
 
 	return 0;
 }
diff --git a/drivers/crypto/inside-secure/safexcel_hash.c b/drivers/crypto/inside-secure/safexcel_hash.c
index 74feb62..0c5a582 100644
--- a/drivers/crypto/inside-secure/safexcel_hash.c
+++ b/drivers/crypto/inside-secure/safexcel_hash.c
@@ -32,9 +32,10 @@ struct safexcel_ahash_req {
 	bool last_req;
 	bool finish;
 	bool hmac;
+	bool needs_inv;
 
 	u8 state_sz;    /* expected sate size, only set once */
-	u32 state[SHA256_DIGEST_SIZE / sizeof(u32)];
+	u32 state[SHA256_DIGEST_SIZE / sizeof(u32)] __aligned(sizeof(u32));
 
 	u64 len;
 	u64 processed;
@@ -119,15 +120,15 @@ static void safexcel_context_control(struct safexcel_ahash_ctx *ctx,
 	}
 }
 
-static int safexcel_handle_result(struct safexcel_crypto_priv *priv, int ring,
-				  struct crypto_async_request *async,
-				  bool *should_complete, int *ret)
+static int safexcel_handle_req_result(struct safexcel_crypto_priv *priv, int ring,
+				      struct crypto_async_request *async,
+				      bool *should_complete, int *ret)
 {
 	struct safexcel_result_desc *rdesc;
 	struct ahash_request *areq = ahash_request_cast(async);
 	struct crypto_ahash *ahash = crypto_ahash_reqtfm(areq);
 	struct safexcel_ahash_req *sreq = ahash_request_ctx(areq);
-	int cache_len, result_sz = sreq->state_sz;
+	int cache_len;
 
 	*ret = 0;
 
@@ -148,8 +149,8 @@ static int safexcel_handle_result(struct safexcel_crypto_priv *priv, int ring,
 	spin_unlock_bh(&priv->ring[ring].egress_lock);
 
 	if (sreq->finish)
-		result_sz = crypto_ahash_digestsize(ahash);
-	memcpy(sreq->state, areq->result, result_sz);
+		memcpy(areq->result, sreq->state,
+		       crypto_ahash_digestsize(ahash));
 
 	dma_unmap_sg(priv->dev, areq->src,
 		     sg_nents_for_len(areq->src, areq->nbytes), DMA_TO_DEVICE);
@@ -165,9 +166,9 @@ static int safexcel_handle_result(struct safexcel_crypto_priv *priv, int ring,
 	return 1;
 }
 
-static int safexcel_ahash_send(struct crypto_async_request *async, int ring,
-			       struct safexcel_request *request, int *commands,
-			       int *results)
+static int safexcel_ahash_send_req(struct crypto_async_request *async, int ring,
+				   struct safexcel_request *request,
+				   int *commands, int *results)
 {
 	struct ahash_request *areq = ahash_request_cast(async);
 	struct crypto_ahash *ahash = crypto_ahash_reqtfm(areq);
@@ -273,7 +274,7 @@ static int safexcel_ahash_send(struct crypto_async_request *async, int ring,
 	/* Add the token */
 	safexcel_hash_token(first_cdesc, len, req->state_sz);
 
-	ctx->base.result_dma = dma_map_single(priv->dev, areq->result,
+	ctx->base.result_dma = dma_map_single(priv->dev, req->state,
 					      req->state_sz, DMA_FROM_DEVICE);
 	if (dma_mapping_error(priv->dev, ctx->base.result_dma)) {
 		ret = -EINVAL;
@@ -292,7 +293,6 @@ static int safexcel_ahash_send(struct crypto_async_request *async, int ring,
 
 	req->processed += len;
 	request->req = &areq->base;
-	ctx->base.handle_result = safexcel_handle_result;
 
 	*commands = n_cdesc;
 	*results = 1;
@@ -374,8 +374,6 @@ static int safexcel_handle_inv_result(struct safexcel_crypto_priv *priv,
 
 	ring = safexcel_select_ring(priv);
 	ctx->base.ring = ring;
-	ctx->base.needs_inv = false;
-	ctx->base.send = safexcel_ahash_send;
 
 	spin_lock_bh(&priv->ring[ring].queue_lock);
 	enq_ret = crypto_enqueue_request(&priv->ring[ring].queue, async);
@@ -392,6 +390,26 @@ static int safexcel_handle_inv_result(struct safexcel_crypto_priv *priv,
 	return 1;
 }
 
+static int safexcel_handle_result(struct safexcel_crypto_priv *priv, int ring,
+				  struct crypto_async_request *async,
+				  bool *should_complete, int *ret)
+{
+	struct ahash_request *areq = ahash_request_cast(async);
+	struct safexcel_ahash_req *req = ahash_request_ctx(areq);
+	int err;
+
+	if (req->needs_inv) {
+		req->needs_inv = false;
+		err = safexcel_handle_inv_result(priv, ring, async,
+						 should_complete, ret);
+	} else {
+		err = safexcel_handle_req_result(priv, ring, async,
+						 should_complete, ret);
+	}
+
+	return err;
+}
+
 static int safexcel_ahash_send_inv(struct crypto_async_request *async,
 				   int ring, struct safexcel_request *request,
 				   int *commands, int *results)
@@ -400,7 +418,6 @@ static int safexcel_ahash_send_inv(struct crypto_async_request *async,
 	struct safexcel_ahash_ctx *ctx = crypto_ahash_ctx(crypto_ahash_reqtfm(areq));
 	int ret;
 
-	ctx->base.handle_result = safexcel_handle_inv_result;
 	ret = safexcel_invalidate_cache(async, &ctx->base, ctx->priv,
 					ctx->base.ctxr_dma, ring, request);
 	if (unlikely(ret))
@@ -412,28 +429,46 @@ static int safexcel_ahash_send_inv(struct crypto_async_request *async,
 	return 0;
 }
 
+static int safexcel_ahash_send(struct crypto_async_request *async,
+			       int ring, struct safexcel_request *request,
+			       int *commands, int *results)
+{
+	struct ahash_request *areq = ahash_request_cast(async);
+	struct safexcel_ahash_req *req = ahash_request_ctx(areq);
+	int ret;
+
+	if (req->needs_inv)
+		ret = safexcel_ahash_send_inv(async, ring, request,
+					      commands, results);
+	else
+		ret = safexcel_ahash_send_req(async, ring, request,
+					      commands, results);
+	return ret;
+}
+
 static int safexcel_ahash_exit_inv(struct crypto_tfm *tfm)
 {
 	struct safexcel_ahash_ctx *ctx = crypto_tfm_ctx(tfm);
 	struct safexcel_crypto_priv *priv = ctx->priv;
-	struct ahash_request req;
+	AHASH_REQUEST_ON_STACK(req, __crypto_ahash_cast(tfm));
+	struct safexcel_ahash_req *rctx = ahash_request_ctx(req);
 	struct safexcel_inv_result result = {};
 	int ring = ctx->base.ring;
 
-	memset(&req, 0, sizeof(struct ahash_request));
+	memset(req, 0, sizeof(struct ahash_request));
 
 	/* create invalidation request */
 	init_completion(&result.completion);
-	ahash_request_set_callback(&req, CRYPTO_TFM_REQ_MAY_BACKLOG,
+	ahash_request_set_callback(req, CRYPTO_TFM_REQ_MAY_BACKLOG,
 				   safexcel_inv_complete, &result);
 
-	ahash_request_set_tfm(&req, __crypto_ahash_cast(tfm));
-	ctx = crypto_tfm_ctx(req.base.tfm);
+	ahash_request_set_tfm(req, __crypto_ahash_cast(tfm));
+	ctx = crypto_tfm_ctx(req->base.tfm);
 	ctx->base.exit_inv = true;
-	ctx->base.send = safexcel_ahash_send_inv;
+	rctx->needs_inv = true;
 
 	spin_lock_bh(&priv->ring[ring].queue_lock);
-	crypto_enqueue_request(&priv->ring[ring].queue, &req.base);
+	crypto_enqueue_request(&priv->ring[ring].queue, &req->base);
 	spin_unlock_bh(&priv->ring[ring].queue_lock);
 
 	if (!priv->ring[ring].need_dequeue)
@@ -481,14 +516,16 @@ static int safexcel_ahash_enqueue(struct ahash_request *areq)
 	struct safexcel_crypto_priv *priv = ctx->priv;
 	int ret, ring;
 
-	ctx->base.send = safexcel_ahash_send;
+	req->needs_inv = false;
 
 	if (req->processed && ctx->digest == CONTEXT_CONTROL_DIGEST_PRECOMPUTED)
 		ctx->base.needs_inv = safexcel_ahash_needs_inv_get(areq);
 
 	if (ctx->base.ctxr) {
-		if (ctx->base.needs_inv)
-			ctx->base.send = safexcel_ahash_send_inv;
+		if (ctx->base.needs_inv) {
+			ctx->base.needs_inv = false;
+			req->needs_inv = true;
+		}
 	} else {
 		ctx->base.ring = safexcel_select_ring(priv);
 		ctx->base.ctxr = dma_pool_zalloc(priv->context_pool,
@@ -622,6 +659,8 @@ static int safexcel_ahash_cra_init(struct crypto_tfm *tfm)
 			     struct safexcel_alg_template, alg.ahash);
 
 	ctx->priv = tmpl->priv;
+	ctx->base.send = safexcel_ahash_send;
+	ctx->base.handle_result = safexcel_handle_result;
 
 	crypto_ahash_set_reqsize(__crypto_ahash_cast(tfm),
 				 sizeof(struct safexcel_ahash_req));
diff --git a/drivers/crypto/n2_core.c b/drivers/crypto/n2_core.c
index 48de52c..662e709 100644
--- a/drivers/crypto/n2_core.c
+++ b/drivers/crypto/n2_core.c
@@ -1625,6 +1625,7 @@ static int queue_cache_init(void)
 					  CWQ_ENTRY_SIZE, 0, NULL);
 	if (!queue_cache[HV_NCS_QTYPE_CWQ - 1]) {
 		kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_MAU - 1]);
+		queue_cache[HV_NCS_QTYPE_MAU - 1] = NULL;
 		return -ENOMEM;
 	}
 	return 0;
@@ -1634,6 +1635,8 @@ static void queue_cache_destroy(void)
 {
 	kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_MAU - 1]);
 	kmem_cache_destroy(queue_cache[HV_NCS_QTYPE_CWQ - 1]);
+	queue_cache[HV_NCS_QTYPE_MAU - 1] = NULL;
+	queue_cache[HV_NCS_QTYPE_CWQ - 1] = NULL;
 }
 
 static long spu_queue_register_workfn(void *arg)
diff --git a/drivers/devfreq/devfreq.c b/drivers/devfreq/devfreq.c
index 78fb496..fe2af6a 100644
--- a/drivers/devfreq/devfreq.c
+++ b/drivers/devfreq/devfreq.c
@@ -737,7 +737,7 @@ struct devfreq *devm_devfreq_add_device(struct device *dev,
 	devfreq = devfreq_add_device(dev, profile, governor_name, data);
 	if (IS_ERR(devfreq)) {
 		devres_free(ptr);
-		return ERR_PTR(-ENOMEM);
+		return devfreq;
 	}
 
 	*ptr = devfreq;
@@ -996,7 +996,8 @@ static ssize_t governor_store(struct device *dev, struct device_attribute *attr,
 	if (df->governor == governor) {
 		ret = 0;
 		goto out;
-	} else if (df->governor->immutable || governor->immutable) {
+	} else if ((df->governor && df->governor->immutable) ||
+					governor->immutable) {
 		ret = -EINVAL;
 		goto out;
 	}
diff --git a/drivers/dma/ioat/dma.c b/drivers/dma/ioat/dma.c
index 58d4ccd..8b5b23a 100644
--- a/drivers/dma/ioat/dma.c
+++ b/drivers/dma/ioat/dma.c
@@ -597,7 +597,6 @@ static void __cleanup(struct ioatdma_chan *ioat_chan, dma_addr_t phys_complete)
 	for (i = 0; i < active && !seen_current; i++) {
 		struct dma_async_tx_descriptor *tx;
 
-		smp_read_barrier_depends();
 		prefetch(ioat_get_ring_ent(ioat_chan, idx + i + 1));
 		desc = ioat_get_ring_ent(ioat_chan, idx + i);
 		dump_desc_dbg(ioat_chan, desc);
@@ -715,7 +714,6 @@ static void ioat_abort_descs(struct ioatdma_chan *ioat_chan)
 	for (i = 1; i < active; i++) {
 		struct dma_async_tx_descriptor *tx;
 
-		smp_read_barrier_depends();
 		prefetch(ioat_get_ring_ent(ioat_chan, idx + i + 1));
 		desc = ioat_get_ring_ent(ioat_chan, idx + i);
 
diff --git a/drivers/dma/sh/rcar-dmac.c b/drivers/dma/sh/rcar-dmac.c
index 2b2c7db..35c3936 100644
--- a/drivers/dma/sh/rcar-dmac.c
+++ b/drivers/dma/sh/rcar-dmac.c
@@ -1615,22 +1615,6 @@ static struct dma_chan *rcar_dmac_of_xlate(struct of_phandle_args *dma_spec,
  * Power management
  */
 
-#ifdef CONFIG_PM_SLEEP
-static int rcar_dmac_sleep_suspend(struct device *dev)
-{
-	/*
-	 * TODO: Wait for the current transfer to complete and stop the device.
-	 */
-	return 0;
-}
-
-static int rcar_dmac_sleep_resume(struct device *dev)
-{
-	/* TODO: Resume transfers, if any. */
-	return 0;
-}
-#endif
-
 #ifdef CONFIG_PM
 static int rcar_dmac_runtime_suspend(struct device *dev)
 {
@@ -1646,7 +1630,13 @@ static int rcar_dmac_runtime_resume(struct device *dev)
 #endif
 
 static const struct dev_pm_ops rcar_dmac_pm = {
-	SET_SYSTEM_SLEEP_PM_OPS(rcar_dmac_sleep_suspend, rcar_dmac_sleep_resume)
+	/*
+	 * TODO for system sleep/resume:
+	 *   - Wait for the current transfer to complete and stop the device,
+	 *   - Resume transfers, if any.
+	 */
+	SET_LATE_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
+				     pm_runtime_force_resume)
 	SET_RUNTIME_PM_OPS(rcar_dmac_runtime_suspend, rcar_dmac_runtime_resume,
 			   NULL)
 };
diff --git a/drivers/edac/Kconfig b/drivers/edac/Kconfig
index 96afb2a..3c40170 100644
--- a/drivers/edac/Kconfig
+++ b/drivers/edac/Kconfig
@@ -457,4 +457,11 @@
 	  Support for error detection and correction on the
 	  APM X-Gene family of SOCs.
 
+config EDAC_TI
+	tristate "Texas Instruments DDR3 ECC Controller"
+	depends on ARCH_KEYSTONE || SOC_DRA7XX
+	help
+	  Support for error detection and correction on the
+          TI SoCs.
+
 endif # EDAC
diff --git a/drivers/edac/Makefile b/drivers/edac/Makefile
index 0fd9ffa..b54912e 100644
--- a/drivers/edac/Makefile
+++ b/drivers/edac/Makefile
@@ -78,3 +78,4 @@
 obj-$(CONFIG_EDAC_ALTERA)		+= altera_edac.o
 obj-$(CONFIG_EDAC_SYNOPSYS)		+= synopsys_edac.o
 obj-$(CONFIG_EDAC_XGENE)		+= xgene_edac.o
+obj-$(CONFIG_EDAC_TI)			+= ti_edac.o
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index ec5d695..3c68bb5 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -758,7 +758,7 @@ static int mv64x60_mc_err_probe(struct platform_device *pdev)
 		/* Non-ECC RAM? */
 		printk(KERN_WARNING "%s: No ECC DIMMs discovered\n", __func__);
 		res = -ENODEV;
-		goto err2;
+		goto err;
 	}
 
 	edac_dbg(3, "init mci\n");
diff --git a/drivers/edac/octeon_edac-lmc.c b/drivers/edac/octeon_edac-lmc.c
index 9c1ffe3..aeb222c 100644
--- a/drivers/edac/octeon_edac-lmc.c
+++ b/drivers/edac/octeon_edac-lmc.c
@@ -78,6 +78,7 @@ static void octeon_lmc_edac_poll_o2(struct mem_ctl_info *mci)
 	if (!pvt->inject)
 		int_reg.u64 = cvmx_read_csr(CVMX_LMCX_INT(mci->mc_idx));
 	else {
+		int_reg.u64 = 0;
 		if (pvt->error_type == 1)
 			int_reg.s.sec_err = 1;
 		if (pvt->error_type == 2)
diff --git a/drivers/edac/ti_edac.c b/drivers/edac/ti_edac.c
new file mode 100644
index 0000000..6ac26d1
--- /dev/null
+++ b/drivers/edac/ti_edac.c
@@ -0,0 +1,341 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2017 Texas Instruments Incorporated - http://www.ti.com/
+ *
+ * Texas Instruments DDR3 ECC error correction and detection driver
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/init.h>
+#include <linux/edac.h>
+#include <linux/io.h>
+#include <linux/interrupt.h>
+#include <linux/of_address.h>
+#include <linux/of_device.h>
+#include <linux/module.h>
+
+#include "edac_module.h"
+
+/* EMIF controller registers */
+#define EMIF_SDRAM_CONFIG		0x008
+#define EMIF_IRQ_STATUS			0x0ac
+#define EMIF_IRQ_ENABLE_SET		0x0b4
+#define EMIF_ECC_CTRL			0x110
+#define EMIF_1B_ECC_ERR_CNT		0x130
+#define EMIF_1B_ECC_ERR_THRSH		0x134
+#define EMIF_1B_ECC_ERR_ADDR_LOG	0x13c
+#define EMIF_2B_ECC_ERR_ADDR_LOG	0x140
+
+/* Bit definitions for EMIF_SDRAM_CONFIG */
+#define SDRAM_TYPE_SHIFT		29
+#define SDRAM_TYPE_MASK			GENMASK(31, 29)
+#define SDRAM_TYPE_DDR3			(3 << SDRAM_TYPE_SHIFT)
+#define SDRAM_TYPE_DDR2			(2 << SDRAM_TYPE_SHIFT)
+#define SDRAM_NARROW_MODE_MASK		GENMASK(15, 14)
+#define SDRAM_K2_NARROW_MODE_SHIFT	12
+#define SDRAM_K2_NARROW_MODE_MASK	GENMASK(13, 12)
+#define SDRAM_ROWSIZE_SHIFT		7
+#define SDRAM_ROWSIZE_MASK		GENMASK(9, 7)
+#define SDRAM_IBANK_SHIFT		4
+#define SDRAM_IBANK_MASK		GENMASK(6, 4)
+#define SDRAM_K2_IBANK_SHIFT		5
+#define SDRAM_K2_IBANK_MASK		GENMASK(6, 5)
+#define SDRAM_K2_EBANK_SHIFT		3
+#define SDRAM_K2_EBANK_MASK		BIT(SDRAM_K2_EBANK_SHIFT)
+#define SDRAM_PAGESIZE_SHIFT		0
+#define SDRAM_PAGESIZE_MASK		GENMASK(2, 0)
+#define SDRAM_K2_PAGESIZE_SHIFT		0
+#define SDRAM_K2_PAGESIZE_MASK		GENMASK(1, 0)
+
+#define EMIF_1B_ECC_ERR_THRSH_SHIFT	24
+
+/* IRQ bit definitions */
+#define EMIF_1B_ECC_ERR			BIT(5)
+#define EMIF_2B_ECC_ERR			BIT(4)
+#define EMIF_WR_ECC_ERR			BIT(3)
+#define EMIF_SYS_ERR			BIT(0)
+/* Bit 31 enables ECC and 28 enables RMW */
+#define ECC_ENABLED			(BIT(31) | BIT(28))
+
+#define EDAC_MOD_NAME			"ti-emif-edac"
+
+enum {
+	EMIF_TYPE_DRA7,
+	EMIF_TYPE_K2
+};
+
+struct ti_edac {
+	void __iomem *reg;
+};
+
+static u32 ti_edac_readl(struct ti_edac *edac, u16 offset)
+{
+	return readl_relaxed(edac->reg + offset);
+}
+
+static void ti_edac_writel(struct ti_edac *edac, u32 val, u16 offset)
+{
+	writel_relaxed(val, edac->reg + offset);
+}
+
+static irqreturn_t ti_edac_isr(int irq, void *data)
+{
+	struct mem_ctl_info *mci = data;
+	struct ti_edac *edac = mci->pvt_info;
+	u32 irq_status;
+	u32 err_addr;
+	int err_count;
+
+	irq_status = ti_edac_readl(edac, EMIF_IRQ_STATUS);
+
+	if (irq_status & EMIF_1B_ECC_ERR) {
+		err_addr = ti_edac_readl(edac, EMIF_1B_ECC_ERR_ADDR_LOG);
+		err_count = ti_edac_readl(edac, EMIF_1B_ECC_ERR_CNT);
+		ti_edac_writel(edac, err_count, EMIF_1B_ECC_ERR_CNT);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, err_count,
+				     err_addr >> PAGE_SHIFT,
+				     err_addr & ~PAGE_MASK, -1, 0, 0, 0,
+				     mci->ctl_name, "1B");
+	}
+
+	if (irq_status & EMIF_2B_ECC_ERR) {
+		err_addr = ti_edac_readl(edac, EMIF_2B_ECC_ERR_ADDR_LOG);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1,
+				     err_addr >> PAGE_SHIFT,
+				     err_addr & ~PAGE_MASK, -1, 0, 0, 0,
+				     mci->ctl_name, "2B");
+	}
+
+	if (irq_status & EMIF_WR_ECC_ERR)
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 1,
+				     0, 0, -1, 0, 0, 0,
+				     mci->ctl_name, "WR");
+
+	ti_edac_writel(edac, irq_status, EMIF_IRQ_STATUS);
+
+	return IRQ_HANDLED;
+}
+
+static void ti_edac_setup_dimm(struct mem_ctl_info *mci, u32 type)
+{
+	struct dimm_info *dimm;
+	struct ti_edac *edac = mci->pvt_info;
+	int bits;
+	u32 val;
+	u32 memsize;
+
+	dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms, mci->n_layers, 0, 0, 0);
+
+	val = ti_edac_readl(edac, EMIF_SDRAM_CONFIG);
+
+	if (type == EMIF_TYPE_DRA7) {
+		bits = ((val & SDRAM_PAGESIZE_MASK) >> SDRAM_PAGESIZE_SHIFT) + 8;
+		bits += ((val & SDRAM_ROWSIZE_MASK) >> SDRAM_ROWSIZE_SHIFT) + 9;
+		bits += (val & SDRAM_IBANK_MASK) >> SDRAM_IBANK_SHIFT;
+
+		if (val & SDRAM_NARROW_MODE_MASK) {
+			bits++;
+			dimm->dtype = DEV_X16;
+		} else {
+			bits += 2;
+			dimm->dtype = DEV_X32;
+		}
+	} else {
+		bits = 16;
+		bits += ((val & SDRAM_K2_PAGESIZE_MASK) >>
+			SDRAM_K2_PAGESIZE_SHIFT) + 8;
+		bits += (val & SDRAM_K2_IBANK_MASK) >> SDRAM_K2_IBANK_SHIFT;
+		bits += (val & SDRAM_K2_EBANK_MASK) >> SDRAM_K2_EBANK_SHIFT;
+
+		val = (val & SDRAM_K2_NARROW_MODE_MASK) >>
+			SDRAM_K2_NARROW_MODE_SHIFT;
+		switch (val) {
+		case 0:
+			bits += 3;
+			dimm->dtype = DEV_X64;
+			break;
+		case 1:
+			bits += 2;
+			dimm->dtype = DEV_X32;
+			break;
+		case 2:
+			bits++;
+			dimm->dtype = DEV_X16;
+			break;
+		}
+	}
+
+	memsize = 1 << bits;
+
+	dimm->nr_pages = memsize >> PAGE_SHIFT;
+	dimm->grain = 4;
+	if ((val & SDRAM_TYPE_MASK) == SDRAM_TYPE_DDR2)
+		dimm->mtype = MEM_DDR2;
+	else
+		dimm->mtype = MEM_DDR3;
+
+	val = ti_edac_readl(edac, EMIF_ECC_CTRL);
+	if (val & ECC_ENABLED)
+		dimm->edac_mode = EDAC_SECDED;
+	else
+		dimm->edac_mode = EDAC_NONE;
+}
+
+static const struct of_device_id ti_edac_of_match[] = {
+	{ .compatible = "ti,emif-keystone", .data = (void *)EMIF_TYPE_K2 },
+	{ .compatible = "ti,emif-dra7xx", .data = (void *)EMIF_TYPE_DRA7 },
+	{},
+};
+
+static int _emif_get_id(struct device_node *node)
+{
+	struct device_node *np;
+	const __be32 *addrp;
+	u32 addr, my_addr;
+	int my_id = 0;
+
+	addrp = of_get_address(node, 0, NULL, NULL);
+	my_addr = (u32)of_translate_address(node, addrp);
+
+	for_each_matching_node(np, ti_edac_of_match) {
+		if (np == node)
+			continue;
+
+		addrp = of_get_address(np, 0, NULL, NULL);
+		addr = (u32)of_translate_address(np, addrp);
+
+		edac_printk(KERN_INFO, EDAC_MOD_NAME,
+			    "addr=%x, my_addr=%x\n",
+			    addr, my_addr);
+
+		if (addr < my_addr)
+			my_id++;
+	}
+
+	return my_id;
+}
+
+static int ti_edac_probe(struct platform_device *pdev)
+{
+	int error_irq = 0, ret = -ENODEV;
+	struct device *dev = &pdev->dev;
+	struct resource *res;
+	void __iomem *reg;
+	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[1];
+	const struct of_device_id *id;
+	struct ti_edac *edac;
+	int emif_id;
+
+	id = of_match_device(ti_edac_of_match, &pdev->dev);
+	if (!id)
+		return -ENODEV;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	reg = devm_ioremap_resource(dev, res);
+	if (IS_ERR(reg)) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "EMIF controller regs not defined\n");
+		return PTR_ERR(reg);
+	}
+
+	layers[0].type = EDAC_MC_LAYER_ALL_MEM;
+	layers[0].size = 1;
+
+	/* Allocate ID number for our EMIF controller */
+	emif_id = _emif_get_id(pdev->dev.of_node);
+	if (emif_id < 0)
+		return -EINVAL;
+
+	mci = edac_mc_alloc(emif_id, 1, layers, sizeof(*edac));
+	if (!mci)
+		return -ENOMEM;
+
+	mci->pdev = &pdev->dev;
+	edac = mci->pvt_info;
+	edac->reg = reg;
+	platform_set_drvdata(pdev, mci);
+
+	mci->mtype_cap = MEM_FLAG_DDR3 | MEM_FLAG_DDR2;
+	mci->edac_ctl_cap = EDAC_FLAG_SECDED | EDAC_FLAG_NONE;
+	mci->mod_name = EDAC_MOD_NAME;
+	mci->ctl_name = id->compatible;
+	mci->dev_name = dev_name(&pdev->dev);
+
+	/* Setup memory layout */
+	ti_edac_setup_dimm(mci, (u32)(id->data));
+
+	/* add EMIF ECC error handler */
+	error_irq = platform_get_irq(pdev, 0);
+	if (!error_irq) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "EMIF irq number not defined.\n");
+		goto err;
+	}
+
+	ret = devm_request_irq(dev, error_irq, ti_edac_isr, 0,
+			       "emif-edac-irq", mci);
+	if (ret) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "request_irq fail for EMIF EDAC irq\n");
+		goto err;
+	}
+
+	ret = edac_mc_add_mc(mci);
+	if (ret) {
+		edac_printk(KERN_ERR, EDAC_MOD_NAME,
+			    "Failed to register mci: %d.\n", ret);
+		goto err;
+	}
+
+	/* Generate an interrupt with each 1b error */
+	ti_edac_writel(edac, 1 << EMIF_1B_ECC_ERR_THRSH_SHIFT,
+		       EMIF_1B_ECC_ERR_THRSH);
+
+	/* Enable interrupts */
+	ti_edac_writel(edac,
+		       EMIF_1B_ECC_ERR | EMIF_2B_ECC_ERR | EMIF_WR_ECC_ERR,
+		       EMIF_IRQ_ENABLE_SET);
+
+	return 0;
+
+err:
+	edac_mc_free(mci);
+	return ret;
+}
+
+static int ti_edac_remove(struct platform_device *pdev)
+{
+	struct mem_ctl_info *mci = platform_get_drvdata(pdev);
+
+	edac_mc_del_mc(&pdev->dev);
+	edac_mc_free(mci);
+
+	return 0;
+}
+
+static struct platform_driver ti_edac_driver = {
+	.probe = ti_edac_probe,
+	.remove = ti_edac_remove,
+	.driver = {
+		   .name = EDAC_MOD_NAME,
+		   .of_match_table = ti_edac_of_match,
+	},
+};
+
+module_platform_driver(ti_edac_driver);
+
+MODULE_AUTHOR("Texas Instruments Inc.");
+MODULE_DESCRIPTION("EDAC Driver for Texas Instruments DDR3 MC");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/extcon/extcon-axp288.c b/drivers/extcon/extcon-axp288.c
index 981fba5..1621f2f 100644
--- a/drivers/extcon/extcon-axp288.c
+++ b/drivers/extcon/extcon-axp288.c
@@ -24,8 +24,6 @@
 #include <linux/notifier.h>
 #include <linux/extcon-provider.h>
 #include <linux/regmap.h>
-#include <linux/gpio.h>
-#include <linux/gpio/consumer.h>
 #include <linux/mfd/axp20x.h>
 
 /* Power source status register */
@@ -79,11 +77,6 @@ enum axp288_extcon_reg {
 	AXP288_BC_DET_STAT_REG		= 0x2f,
 };
 
-enum axp288_mux_select {
-	EXTCON_GPIO_MUX_SEL_PMIC = 0,
-	EXTCON_GPIO_MUX_SEL_SOC,
-};
-
 enum axp288_extcon_irq {
 	VBUS_FALLING_IRQ = 0,
 	VBUS_RISING_IRQ,
@@ -104,10 +97,8 @@ struct axp288_extcon_info {
 	struct device *dev;
 	struct regmap *regmap;
 	struct regmap_irq_chip_data *regmap_irqc;
-	struct gpio_desc *gpio_mux_cntl;
 	int irq[EXTCON_IRQ_END];
 	struct extcon_dev *edev;
-	struct notifier_block extcon_nb;
 	unsigned int previous_cable;
 };
 
@@ -197,15 +188,6 @@ static int axp288_handle_chrg_det_event(struct axp288_extcon_info *info)
 	}
 
 no_vbus:
-	/*
-	 * If VBUS is absent Connect D+/D- lines to PMIC for BC
-	 * detection. Else connect them to SOC for USB communication.
-	 */
-	if (info->gpio_mux_cntl)
-		gpiod_set_value(info->gpio_mux_cntl,
-			vbus_attach ? EXTCON_GPIO_MUX_SEL_SOC
-					: EXTCON_GPIO_MUX_SEL_PMIC);
-
 	extcon_set_state_sync(info->edev, info->previous_cable, false);
 	if (info->previous_cable == EXTCON_CHG_USB_SDP)
 		extcon_set_state_sync(info->edev, EXTCON_USB, false);
@@ -253,8 +235,7 @@ static int axp288_extcon_probe(struct platform_device *pdev)
 {
 	struct axp288_extcon_info *info;
 	struct axp20x_dev *axp20x = dev_get_drvdata(pdev->dev.parent);
-	struct axp288_extcon_pdata *pdata = pdev->dev.platform_data;
-	int ret, i, pirq, gpio;
+	int ret, i, pirq;
 
 	info = devm_kzalloc(&pdev->dev, sizeof(*info), GFP_KERNEL);
 	if (!info)
@@ -264,8 +245,6 @@ static int axp288_extcon_probe(struct platform_device *pdev)
 	info->regmap = axp20x->regmap;
 	info->regmap_irqc = axp20x->regmap_irqc;
 	info->previous_cable = EXTCON_NONE;
-	if (pdata)
-		info->gpio_mux_cntl = pdata->gpio_mux_cntl;
 
 	platform_set_drvdata(pdev, info);
 
@@ -286,21 +265,11 @@ static int axp288_extcon_probe(struct platform_device *pdev)
 		return ret;
 	}
 
-	/* Set up gpio control for USB Mux */
-	if (info->gpio_mux_cntl) {
-		gpio = desc_to_gpio(info->gpio_mux_cntl);
-		ret = devm_gpio_request(&pdev->dev, gpio, "USB_MUX");
-		if (ret < 0) {
-			dev_err(&pdev->dev,
-				"failed to request the gpio=%d\n", gpio);
-			return ret;
-		}
-		gpiod_direction_output(info->gpio_mux_cntl,
-						EXTCON_GPIO_MUX_SEL_PMIC);
-	}
-
 	for (i = 0; i < EXTCON_IRQ_END; i++) {
 		pirq = platform_get_irq(pdev, i);
+		if (pirq < 0)
+			return pirq;
+
 		info->irq[i] = regmap_irq_get_virq(info->regmap_irqc, pirq);
 		if (info->irq[i] < 0) {
 			dev_err(&pdev->dev,
diff --git a/drivers/extcon/extcon-usbc-cros-ec.c b/drivers/extcon/extcon-usbc-cros-ec.c
index 6187f73..6721ab0 100644
--- a/drivers/extcon/extcon-usbc-cros-ec.c
+++ b/drivers/extcon/extcon-usbc-cros-ec.c
@@ -34,16 +34,26 @@ struct cros_ec_extcon_info {
 
 	struct notifier_block notifier;
 
+	unsigned int dr; /* data role */
+	bool pr; /* power role (true if VBUS enabled) */
 	bool dp; /* DisplayPort enabled */
 	bool mux; /* SuperSpeed (usb3) enabled */
 	unsigned int power_type;
 };
 
 static const unsigned int usb_type_c_cable[] = {
+	EXTCON_USB,
+	EXTCON_USB_HOST,
 	EXTCON_DISP_DP,
 	EXTCON_NONE,
 };
 
+enum usb_data_roles {
+	DR_NONE,
+	DR_HOST,
+	DR_DEVICE,
+};
+
 /**
  * cros_ec_pd_command() - Send a command to the EC.
  * @info: pointer to struct cros_ec_extcon_info
@@ -150,6 +160,7 @@ static int cros_ec_usb_get_role(struct cros_ec_extcon_info *info,
 	pd_control.port = info->port_id;
 	pd_control.role = USB_PD_CTRL_ROLE_NO_CHANGE;
 	pd_control.mux = USB_PD_CTRL_MUX_NO_CHANGE;
+	pd_control.swap = USB_PD_CTRL_SWAP_NONE;
 	ret = cros_ec_pd_command(info, EC_CMD_USB_PD_CONTROL, 1,
 				 &pd_control, sizeof(pd_control),
 				 &resp, sizeof(resp));
@@ -183,11 +194,72 @@ static int cros_ec_pd_get_num_ports(struct cros_ec_extcon_info *info)
 	return resp.num_ports;
 }
 
+static const char *cros_ec_usb_role_string(unsigned int role)
+{
+	return role == DR_NONE ? "DISCONNECTED" :
+		(role == DR_HOST ? "DFP" : "UFP");
+}
+
+static const char *cros_ec_usb_power_type_string(unsigned int type)
+{
+	switch (type) {
+	case USB_CHG_TYPE_NONE:
+		return "USB_CHG_TYPE_NONE";
+	case USB_CHG_TYPE_PD:
+		return "USB_CHG_TYPE_PD";
+	case USB_CHG_TYPE_PROPRIETARY:
+		return "USB_CHG_TYPE_PROPRIETARY";
+	case USB_CHG_TYPE_C:
+		return "USB_CHG_TYPE_C";
+	case USB_CHG_TYPE_BC12_DCP:
+		return "USB_CHG_TYPE_BC12_DCP";
+	case USB_CHG_TYPE_BC12_CDP:
+		return "USB_CHG_TYPE_BC12_CDP";
+	case USB_CHG_TYPE_BC12_SDP:
+		return "USB_CHG_TYPE_BC12_SDP";
+	case USB_CHG_TYPE_OTHER:
+		return "USB_CHG_TYPE_OTHER";
+	case USB_CHG_TYPE_VBUS:
+		return "USB_CHG_TYPE_VBUS";
+	case USB_CHG_TYPE_UNKNOWN:
+		return "USB_CHG_TYPE_UNKNOWN";
+	default:
+		return "USB_CHG_TYPE_UNKNOWN";
+	}
+}
+
+static bool cros_ec_usb_power_type_is_wall_wart(unsigned int type,
+						unsigned int role)
+{
+	switch (type) {
+	/* FIXME : Guppy, Donnettes, and other chargers will be miscategorized
+	 * because they identify with USB_CHG_TYPE_C, but we can't return true
+	 * here from that code because that breaks Suzy-Q and other kinds of
+	 * USB Type-C cables and peripherals.
+	 */
+	case USB_CHG_TYPE_PROPRIETARY:
+	case USB_CHG_TYPE_BC12_DCP:
+		return true;
+	case USB_CHG_TYPE_PD:
+	case USB_CHG_TYPE_C:
+	case USB_CHG_TYPE_BC12_CDP:
+	case USB_CHG_TYPE_BC12_SDP:
+	case USB_CHG_TYPE_OTHER:
+	case USB_CHG_TYPE_VBUS:
+	case USB_CHG_TYPE_UNKNOWN:
+	case USB_CHG_TYPE_NONE:
+	default:
+		return false;
+	}
+}
+
 static int extcon_cros_ec_detect_cable(struct cros_ec_extcon_info *info,
 				       bool force)
 {
 	struct device *dev = info->dev;
 	int role, power_type;
+	unsigned int dr = DR_NONE;
+	bool pr = false;
 	bool polarity = false;
 	bool dp = false;
 	bool mux = false;
@@ -206,9 +278,12 @@ static int extcon_cros_ec_detect_cable(struct cros_ec_extcon_info *info,
 			dev_err(dev, "failed getting role err = %d\n", role);
 			return role;
 		}
+		dev_dbg(dev, "disconnected\n");
 	} else {
 		int pd_mux_state;
 
+		dr = (role & PD_CTRL_RESP_ROLE_DATA) ? DR_HOST : DR_DEVICE;
+		pr = (role & PD_CTRL_RESP_ROLE_POWER);
 		pd_mux_state = cros_ec_usb_get_pd_mux_state(info);
 		if (pd_mux_state < 0)
 			pd_mux_state = USB_PD_MUX_USB_ENABLED;
@@ -216,20 +291,62 @@ static int extcon_cros_ec_detect_cable(struct cros_ec_extcon_info *info,
 		dp = pd_mux_state & USB_PD_MUX_DP_ENABLED;
 		mux = pd_mux_state & USB_PD_MUX_USB_ENABLED;
 		hpd = pd_mux_state & USB_PD_MUX_HPD_IRQ;
+
+		dev_dbg(dev,
+			"connected role 0x%x pwr type %d dr %d pr %d pol %d mux %d dp %d hpd %d\n",
+			role, power_type, dr, pr, polarity, mux, dp, hpd);
 	}
 
-	if (force || info->dp != dp || info->mux != mux ||
-		info->power_type != power_type) {
+	/*
+	 * When there is no USB host (e.g. USB PD charger),
+	 * we are not really a UFP for the AP.
+	 */
+	if (dr == DR_DEVICE &&
+	    cros_ec_usb_power_type_is_wall_wart(power_type, role))
+		dr = DR_NONE;
 
+	if (force || info->dr != dr || info->pr != pr || info->dp != dp ||
+	    info->mux != mux || info->power_type != power_type) {
+		bool host_connected = false, device_connected = false;
+
+		dev_dbg(dev, "Type/Role switch! type = %s role = %s\n",
+			cros_ec_usb_power_type_string(power_type),
+			cros_ec_usb_role_string(dr));
+		info->dr = dr;
+		info->pr = pr;
 		info->dp = dp;
 		info->mux = mux;
 		info->power_type = power_type;
 
-		extcon_set_state(info->edev, EXTCON_DISP_DP, dp);
+		if (dr == DR_DEVICE)
+			device_connected = true;
+		else if (dr == DR_HOST)
+			host_connected = true;
 
+		extcon_set_state(info->edev, EXTCON_USB, device_connected);
+		extcon_set_state(info->edev, EXTCON_USB_HOST, host_connected);
+		extcon_set_state(info->edev, EXTCON_DISP_DP, dp);
+		extcon_set_property(info->edev, EXTCON_USB,
+				    EXTCON_PROP_USB_VBUS,
+				    (union extcon_property_value)(int)pr);
+		extcon_set_property(info->edev, EXTCON_USB_HOST,
+				    EXTCON_PROP_USB_VBUS,
+				    (union extcon_property_value)(int)pr);
+		extcon_set_property(info->edev, EXTCON_USB,
+				    EXTCON_PROP_USB_TYPEC_POLARITY,
+				    (union extcon_property_value)(int)polarity);
+		extcon_set_property(info->edev, EXTCON_USB_HOST,
+				    EXTCON_PROP_USB_TYPEC_POLARITY,
+				    (union extcon_property_value)(int)polarity);
 		extcon_set_property(info->edev, EXTCON_DISP_DP,
 				    EXTCON_PROP_USB_TYPEC_POLARITY,
 				    (union extcon_property_value)(int)polarity);
+		extcon_set_property(info->edev, EXTCON_USB,
+				    EXTCON_PROP_USB_SS,
+				    (union extcon_property_value)(int)mux);
+		extcon_set_property(info->edev, EXTCON_USB_HOST,
+				    EXTCON_PROP_USB_SS,
+				    (union extcon_property_value)(int)mux);
 		extcon_set_property(info->edev, EXTCON_DISP_DP,
 				    EXTCON_PROP_USB_SS,
 				    (union extcon_property_value)(int)mux);
@@ -237,6 +354,8 @@ static int extcon_cros_ec_detect_cable(struct cros_ec_extcon_info *info,
 				    EXTCON_PROP_DISP_HPD,
 				    (union extcon_property_value)(int)hpd);
 
+		extcon_sync(info->edev, EXTCON_USB);
+		extcon_sync(info->edev, EXTCON_USB_HOST);
 		extcon_sync(info->edev, EXTCON_DISP_DP);
 
 	} else if (hpd) {
@@ -322,13 +441,28 @@ static int extcon_cros_ec_probe(struct platform_device *pdev)
 		return ret;
 	}
 
+	extcon_set_property_capability(info->edev, EXTCON_USB,
+				       EXTCON_PROP_USB_VBUS);
+	extcon_set_property_capability(info->edev, EXTCON_USB_HOST,
+				       EXTCON_PROP_USB_VBUS);
+	extcon_set_property_capability(info->edev, EXTCON_USB,
+				       EXTCON_PROP_USB_TYPEC_POLARITY);
+	extcon_set_property_capability(info->edev, EXTCON_USB_HOST,
+				       EXTCON_PROP_USB_TYPEC_POLARITY);
 	extcon_set_property_capability(info->edev, EXTCON_DISP_DP,
 				       EXTCON_PROP_USB_TYPEC_POLARITY);
+	extcon_set_property_capability(info->edev, EXTCON_USB,
+				       EXTCON_PROP_USB_SS);
+	extcon_set_property_capability(info->edev, EXTCON_USB_HOST,
+				       EXTCON_PROP_USB_SS);
 	extcon_set_property_capability(info->edev, EXTCON_DISP_DP,
 				       EXTCON_PROP_USB_SS);
 	extcon_set_property_capability(info->edev, EXTCON_DISP_DP,
 				       EXTCON_PROP_DISP_HPD);
 
+	info->dr = DR_NONE;
+	info->pr = false;
+
 	platform_set_drvdata(pdev, info);
 
 	/* Get PD events from the EC */
diff --git a/drivers/firmware/Kconfig b/drivers/firmware/Kconfig
index fa87a055..e77f77c 100644
--- a/drivers/firmware/Kconfig
+++ b/drivers/firmware/Kconfig
@@ -48,6 +48,14 @@
 	  This enables support for the SCPI power domains which can be
 	  enabled or disabled via the SCP firmware
 
+config ARM_SDE_INTERFACE
+	bool "ARM Software Delegated Exception Interface (SDEI)"
+	depends on ARM64
+	help
+	  The Software Delegated Exception Interface (SDEI) is an ARM
+	  standard for registering callbacks from the platform firmware
+	  into the OS. This is typically used to implement RAS notifications.
+
 config EDD
 	tristate "BIOS Enhanced Disk Drive calls determine boot disk"
 	depends on X86
diff --git a/drivers/firmware/Makefile b/drivers/firmware/Makefile
index feaa890..b248238 100644
--- a/drivers/firmware/Makefile
+++ b/drivers/firmware/Makefile
@@ -6,6 +6,7 @@
 obj-$(CONFIG_ARM_PSCI_CHECKER)	+= psci_checker.o
 obj-$(CONFIG_ARM_SCPI_PROTOCOL)	+= arm_scpi.o
 obj-$(CONFIG_ARM_SCPI_POWER_DOMAIN) += scpi_pm_domain.o
+obj-$(CONFIG_ARM_SDE_INTERFACE)	+= arm_sdei.o
 obj-$(CONFIG_DMI)		+= dmi_scan.o
 obj-$(CONFIG_DMI_SYSFS)		+= dmi-sysfs.o
 obj-$(CONFIG_EDD)		+= edd.o
diff --git a/drivers/firmware/arm_sdei.c b/drivers/firmware/arm_sdei.c
new file mode 100644
index 0000000..1ea7164
--- /dev/null
+++ b/drivers/firmware/arm_sdei.c
@@ -0,0 +1,1092 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2017 Arm Ltd.
+#define pr_fmt(fmt) "sdei: " fmt
+
+#include <linux/acpi.h>
+#include <linux/arm_sdei.h>
+#include <linux/arm-smccc.h>
+#include <linux/atomic.h>
+#include <linux/bitops.h>
+#include <linux/compiler.h>
+#include <linux/cpuhotplug.h>
+#include <linux/cpu.h>
+#include <linux/cpu_pm.h>
+#include <linux/errno.h>
+#include <linux/hardirq.h>
+#include <linux/kernel.h>
+#include <linux/kprobes.h>
+#include <linux/kvm_host.h>
+#include <linux/list.h>
+#include <linux/mutex.h>
+#include <linux/notifier.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/percpu.h>
+#include <linux/platform_device.h>
+#include <linux/pm.h>
+#include <linux/ptrace.h>
+#include <linux/preempt.h>
+#include <linux/reboot.h>
+#include <linux/slab.h>
+#include <linux/smp.h>
+#include <linux/spinlock.h>
+#include <linux/uaccess.h>
+
+/*
+ * The call to use to reach the firmware.
+ */
+static asmlinkage void (*sdei_firmware_call)(unsigned long function_id,
+		      unsigned long arg0, unsigned long arg1,
+		      unsigned long arg2, unsigned long arg3,
+		      unsigned long arg4, struct arm_smccc_res *res);
+
+/* entry point from firmware to arch asm code */
+static unsigned long sdei_entry_point;
+
+struct sdei_event {
+	/* These three are protected by the sdei_list_lock */
+	struct list_head	list;
+	bool			reregister;
+	bool			reenable;
+
+	u32			event_num;
+	u8			type;
+	u8			priority;
+
+	/* This pointer is handed to firmware as the event argument. */
+	union {
+		/* Shared events */
+		struct sdei_registered_event *registered;
+
+		/* CPU private events */
+		struct sdei_registered_event __percpu *private_registered;
+	};
+};
+
+/* Take the mutex for any API call or modification. Take the mutex first. */
+static DEFINE_MUTEX(sdei_events_lock);
+
+/* and then hold this when modifying the list */
+static DEFINE_SPINLOCK(sdei_list_lock);
+static LIST_HEAD(sdei_list);
+
+/* Private events are registered/enabled via IPI passing one of these */
+struct sdei_crosscall_args {
+	struct sdei_event *event;
+	atomic_t errors;
+	int first_error;
+};
+
+#define CROSSCALL_INIT(arg, event)	(arg.event = event, \
+					 arg.first_error = 0, \
+					 atomic_set(&arg.errors, 0))
+
+static inline int sdei_do_cross_call(void *fn, struct sdei_event * event)
+{
+	struct sdei_crosscall_args arg;
+
+	CROSSCALL_INIT(arg, event);
+	on_each_cpu(fn, &arg, true);
+
+	return arg.first_error;
+}
+
+static inline void
+sdei_cross_call_return(struct sdei_crosscall_args *arg, int err)
+{
+	if (err && (atomic_inc_return(&arg->errors) == 1))
+		arg->first_error = err;
+}
+
+static int sdei_to_linux_errno(unsigned long sdei_err)
+{
+	switch (sdei_err) {
+	case SDEI_NOT_SUPPORTED:
+		return -EOPNOTSUPP;
+	case SDEI_INVALID_PARAMETERS:
+		return -EINVAL;
+	case SDEI_DENIED:
+		return -EPERM;
+	case SDEI_PENDING:
+		return -EINPROGRESS;
+	case SDEI_OUT_OF_RESOURCE:
+		return -ENOMEM;
+	}
+
+	/* Not an error value ... */
+	return sdei_err;
+}
+
+/*
+ * If x0 is any of these values, then the call failed, use sdei_to_linux_errno()
+ * to translate.
+ */
+static int sdei_is_err(struct arm_smccc_res *res)
+{
+	switch (res->a0) {
+	case SDEI_NOT_SUPPORTED:
+	case SDEI_INVALID_PARAMETERS:
+	case SDEI_DENIED:
+	case SDEI_PENDING:
+	case SDEI_OUT_OF_RESOURCE:
+		return true;
+	}
+
+	return false;
+}
+
+static int invoke_sdei_fn(unsigned long function_id, unsigned long arg0,
+			  unsigned long arg1, unsigned long arg2,
+			  unsigned long arg3, unsigned long arg4,
+			  u64 *result)
+{
+	int err = 0;
+	struct arm_smccc_res res;
+
+	if (sdei_firmware_call) {
+		sdei_firmware_call(function_id, arg0, arg1, arg2, arg3, arg4,
+				   &res);
+		if (sdei_is_err(&res))
+			err = sdei_to_linux_errno(res.a0);
+	} else {
+		/*
+		 * !sdei_firmware_call means we failed to probe or called
+		 * sdei_mark_interface_broken(). -EIO is not an error returned
+		 * by sdei_to_linux_errno() and is used to suppress messages
+		 * from this driver.
+		 */
+		err = -EIO;
+		res.a0 = SDEI_NOT_SUPPORTED;
+	}
+
+	if (result)
+		*result = res.a0;
+
+	return err;
+}
+
+static struct sdei_event *sdei_event_find(u32 event_num)
+{
+	struct sdei_event *e, *found = NULL;
+
+	lockdep_assert_held(&sdei_events_lock);
+
+	spin_lock(&sdei_list_lock);
+	list_for_each_entry(e, &sdei_list, list) {
+		if (e->event_num == event_num) {
+			found = e;
+			break;
+		}
+	}
+	spin_unlock(&sdei_list_lock);
+
+	return found;
+}
+
+int sdei_api_event_context(u32 query, u64 *result)
+{
+	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_EVENT_CONTEXT, query, 0, 0, 0, 0,
+			      result);
+}
+NOKPROBE_SYMBOL(sdei_api_event_context);
+
+static int sdei_api_event_get_info(u32 event, u32 info, u64 *result)
+{
+	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_EVENT_GET_INFO, event, info, 0,
+			      0, 0, result);
+}
+
+static struct sdei_event *sdei_event_create(u32 event_num,
+					    sdei_event_callback *cb,
+					    void *cb_arg)
+{
+	int err;
+	u64 result;
+	struct sdei_event *event;
+	struct sdei_registered_event *reg;
+
+	lockdep_assert_held(&sdei_events_lock);
+
+	event = kzalloc(sizeof(*event), GFP_KERNEL);
+	if (!event)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&event->list);
+	event->event_num = event_num;
+
+	err = sdei_api_event_get_info(event_num, SDEI_EVENT_INFO_EV_PRIORITY,
+				      &result);
+	if (err) {
+		kfree(event);
+		return ERR_PTR(err);
+	}
+	event->priority = result;
+
+	err = sdei_api_event_get_info(event_num, SDEI_EVENT_INFO_EV_TYPE,
+				      &result);
+	if (err) {
+		kfree(event);
+		return ERR_PTR(err);
+	}
+	event->type = result;
+
+	if (event->type == SDEI_EVENT_TYPE_SHARED) {
+		reg = kzalloc(sizeof(*reg), GFP_KERNEL);
+		if (!reg) {
+			kfree(event);
+			return ERR_PTR(-ENOMEM);
+		}
+
+		reg->event_num = event_num;
+		reg->priority = event->priority;
+
+		reg->callback = cb;
+		reg->callback_arg = cb_arg;
+		event->registered = reg;
+	} else {
+		int cpu;
+		struct sdei_registered_event __percpu *regs;
+
+		regs = alloc_percpu(struct sdei_registered_event);
+		if (!regs) {
+			kfree(event);
+			return ERR_PTR(-ENOMEM);
+		}
+
+		for_each_possible_cpu(cpu) {
+			reg = per_cpu_ptr(regs, cpu);
+
+			reg->event_num = event->event_num;
+			reg->priority = event->priority;
+			reg->callback = cb;
+			reg->callback_arg = cb_arg;
+		}
+
+		event->private_registered = regs;
+	}
+
+	if (sdei_event_find(event_num)) {
+		kfree(event->registered);
+		kfree(event);
+		event = ERR_PTR(-EBUSY);
+	} else {
+		spin_lock(&sdei_list_lock);
+		list_add(&event->list, &sdei_list);
+		spin_unlock(&sdei_list_lock);
+	}
+
+	return event;
+}
+
+static void sdei_event_destroy(struct sdei_event *event)
+{
+	lockdep_assert_held(&sdei_events_lock);
+
+	spin_lock(&sdei_list_lock);
+	list_del(&event->list);
+	spin_unlock(&sdei_list_lock);
+
+	if (event->type == SDEI_EVENT_TYPE_SHARED)
+		kfree(event->registered);
+	else
+		free_percpu(event->private_registered);
+
+	kfree(event);
+}
+
+static int sdei_api_get_version(u64 *version)
+{
+	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_VERSION, 0, 0, 0, 0, 0, version);
+}
+
+int sdei_mask_local_cpu(void)
+{
+	int err;
+
+	WARN_ON_ONCE(preemptible());
+
+	err = invoke_sdei_fn(SDEI_1_0_FN_SDEI_PE_MASK, 0, 0, 0, 0, 0, NULL);
+	if (err && err != -EIO) {
+		pr_warn_once("failed to mask CPU[%u]: %d\n",
+			      smp_processor_id(), err);
+		return err;
+	}
+
+	return 0;
+}
+
+static void _ipi_mask_cpu(void *ignored)
+{
+	sdei_mask_local_cpu();
+}
+
+int sdei_unmask_local_cpu(void)
+{
+	int err;
+
+	WARN_ON_ONCE(preemptible());
+
+	err = invoke_sdei_fn(SDEI_1_0_FN_SDEI_PE_UNMASK, 0, 0, 0, 0, 0, NULL);
+	if (err && err != -EIO) {
+		pr_warn_once("failed to unmask CPU[%u]: %d\n",
+			     smp_processor_id(), err);
+		return err;
+	}
+
+	return 0;
+}
+
+static void _ipi_unmask_cpu(void *ignored)
+{
+	sdei_unmask_local_cpu();
+}
+
+static void _ipi_private_reset(void *ignored)
+{
+	int err;
+
+	err = invoke_sdei_fn(SDEI_1_0_FN_SDEI_PRIVATE_RESET, 0, 0, 0, 0, 0,
+			     NULL);
+	if (err && err != -EIO)
+		pr_warn_once("failed to reset CPU[%u]: %d\n",
+			     smp_processor_id(), err);
+}
+
+static int sdei_api_shared_reset(void)
+{
+	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_SHARED_RESET, 0, 0, 0, 0, 0,
+			      NULL);
+}
+
+static void sdei_mark_interface_broken(void)
+{
+	pr_err("disabling SDEI firmware interface\n");
+	on_each_cpu(&_ipi_mask_cpu, NULL, true);
+	sdei_firmware_call = NULL;
+}
+
+static int sdei_platform_reset(void)
+{
+	int err;
+
+	on_each_cpu(&_ipi_private_reset, NULL, true);
+	err = sdei_api_shared_reset();
+	if (err) {
+		pr_err("Failed to reset platform: %d\n", err);
+		sdei_mark_interface_broken();
+	}
+
+	return err;
+}
+
+static int sdei_api_event_enable(u32 event_num)
+{
+	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_EVENT_ENABLE, event_num, 0, 0, 0,
+			      0, NULL);
+}
+
+/* Called directly by the hotplug callbacks */
+static void _local_event_enable(void *data)
+{
+	int err;
+	struct sdei_crosscall_args *arg = data;
+
+	WARN_ON_ONCE(preemptible());
+
+	err = sdei_api_event_enable(arg->event->event_num);
+
+	sdei_cross_call_return(arg, err);
+}
+
+int sdei_event_enable(u32 event_num)
+{
+	int err = -EINVAL;
+	struct sdei_event *event;
+
+	mutex_lock(&sdei_events_lock);
+	event = sdei_event_find(event_num);
+	if (!event) {
+		mutex_unlock(&sdei_events_lock);
+		return -ENOENT;
+	}
+
+	spin_lock(&sdei_list_lock);
+	event->reenable = true;
+	spin_unlock(&sdei_list_lock);
+
+	if (event->type == SDEI_EVENT_TYPE_SHARED)
+		err = sdei_api_event_enable(event->event_num);
+	else
+		err = sdei_do_cross_call(_local_event_enable, event);
+	mutex_unlock(&sdei_events_lock);
+
+	return err;
+}
+EXPORT_SYMBOL(sdei_event_enable);
+
+static int sdei_api_event_disable(u32 event_num)
+{
+	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_EVENT_DISABLE, event_num, 0, 0,
+			      0, 0, NULL);
+}
+
+static void _ipi_event_disable(void *data)
+{
+	int err;
+	struct sdei_crosscall_args *arg = data;
+
+	err = sdei_api_event_disable(arg->event->event_num);
+
+	sdei_cross_call_return(arg, err);
+}
+
+int sdei_event_disable(u32 event_num)
+{
+	int err = -EINVAL;
+	struct sdei_event *event;
+
+	mutex_lock(&sdei_events_lock);
+	event = sdei_event_find(event_num);
+	if (!event) {
+		mutex_unlock(&sdei_events_lock);
+		return -ENOENT;
+	}
+
+	spin_lock(&sdei_list_lock);
+	event->reenable = false;
+	spin_unlock(&sdei_list_lock);
+
+	if (event->type == SDEI_EVENT_TYPE_SHARED)
+		err = sdei_api_event_disable(event->event_num);
+	else
+		err = sdei_do_cross_call(_ipi_event_disable, event);
+	mutex_unlock(&sdei_events_lock);
+
+	return err;
+}
+EXPORT_SYMBOL(sdei_event_disable);
+
+static int sdei_api_event_unregister(u32 event_num)
+{
+	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_EVENT_UNREGISTER, event_num, 0,
+			      0, 0, 0, NULL);
+}
+
+/* Called directly by the hotplug callbacks */
+static void _local_event_unregister(void *data)
+{
+	int err;
+	struct sdei_crosscall_args *arg = data;
+
+	WARN_ON_ONCE(preemptible());
+
+	err = sdei_api_event_unregister(arg->event->event_num);
+
+	sdei_cross_call_return(arg, err);
+}
+
+static int _sdei_event_unregister(struct sdei_event *event)
+{
+	lockdep_assert_held(&sdei_events_lock);
+
+	spin_lock(&sdei_list_lock);
+	event->reregister = false;
+	event->reenable = false;
+	spin_unlock(&sdei_list_lock);
+
+	if (event->type == SDEI_EVENT_TYPE_SHARED)
+		return sdei_api_event_unregister(event->event_num);
+
+	return sdei_do_cross_call(_local_event_unregister, event);
+}
+
+int sdei_event_unregister(u32 event_num)
+{
+	int err;
+	struct sdei_event *event;
+
+	WARN_ON(in_nmi());
+
+	mutex_lock(&sdei_events_lock);
+	event = sdei_event_find(event_num);
+	do {
+		if (!event) {
+			pr_warn("Event %u not registered\n", event_num);
+			err = -ENOENT;
+			break;
+		}
+
+		err = _sdei_event_unregister(event);
+		if (err)
+			break;
+
+		sdei_event_destroy(event);
+	} while (0);
+	mutex_unlock(&sdei_events_lock);
+
+	return err;
+}
+EXPORT_SYMBOL(sdei_event_unregister);
+
+/*
+ * unregister events, but don't destroy them as they are re-registered by
+ * sdei_reregister_shared().
+ */
+static int sdei_unregister_shared(void)
+{
+	int err = 0;
+	struct sdei_event *event;
+
+	mutex_lock(&sdei_events_lock);
+	spin_lock(&sdei_list_lock);
+	list_for_each_entry(event, &sdei_list, list) {
+		if (event->type != SDEI_EVENT_TYPE_SHARED)
+			continue;
+
+		err = _sdei_event_unregister(event);
+		if (err)
+			break;
+	}
+	spin_unlock(&sdei_list_lock);
+	mutex_unlock(&sdei_events_lock);
+
+	return err;
+}
+
+static int sdei_api_event_register(u32 event_num, unsigned long entry_point,
+				   void *arg, u64 flags, u64 affinity)
+{
+	return invoke_sdei_fn(SDEI_1_0_FN_SDEI_EVENT_REGISTER, event_num,
+			      (unsigned long)entry_point, (unsigned long)arg,
+			      flags, affinity, NULL);
+}
+
+/* Called directly by the hotplug callbacks */
+static void _local_event_register(void *data)
+{
+	int err;
+	struct sdei_registered_event *reg;
+	struct sdei_crosscall_args *arg = data;
+
+	WARN_ON(preemptible());
+
+	reg = per_cpu_ptr(arg->event->private_registered, smp_processor_id());
+	err = sdei_api_event_register(arg->event->event_num, sdei_entry_point,
+				      reg, 0, 0);
+
+	sdei_cross_call_return(arg, err);
+}
+
+static int _sdei_event_register(struct sdei_event *event)
+{
+	int err;
+
+	lockdep_assert_held(&sdei_events_lock);
+
+	spin_lock(&sdei_list_lock);
+	event->reregister = true;
+	spin_unlock(&sdei_list_lock);
+
+	if (event->type == SDEI_EVENT_TYPE_SHARED)
+		return sdei_api_event_register(event->event_num,
+					       sdei_entry_point,
+					       event->registered,
+					       SDEI_EVENT_REGISTER_RM_ANY, 0);
+
+
+	err = sdei_do_cross_call(_local_event_register, event);
+	if (err) {
+		spin_lock(&sdei_list_lock);
+		event->reregister = false;
+		event->reenable = false;
+		spin_unlock(&sdei_list_lock);
+
+		sdei_do_cross_call(_local_event_unregister, event);
+	}
+
+	return err;
+}
+
+int sdei_event_register(u32 event_num, sdei_event_callback *cb, void *arg)
+{
+	int err;
+	struct sdei_event *event;
+
+	WARN_ON(in_nmi());
+
+	mutex_lock(&sdei_events_lock);
+	do {
+		if (sdei_event_find(event_num)) {
+			pr_warn("Event %u already registered\n", event_num);
+			err = -EBUSY;
+			break;
+		}
+
+		event = sdei_event_create(event_num, cb, arg);
+		if (IS_ERR(event)) {
+			err = PTR_ERR(event);
+			pr_warn("Failed to create event %u: %d\n", event_num,
+				err);
+			break;
+		}
+
+		err = _sdei_event_register(event);
+		if (err) {
+			sdei_event_destroy(event);
+			pr_warn("Failed to register event %u: %d\n", event_num,
+				err);
+		}
+	} while (0);
+	mutex_unlock(&sdei_events_lock);
+
+	return err;
+}
+EXPORT_SYMBOL(sdei_event_register);
+
+static int sdei_reregister_event(struct sdei_event *event)
+{
+	int err;
+
+	lockdep_assert_held(&sdei_events_lock);
+
+	err = _sdei_event_register(event);
+	if (err) {
+		pr_err("Failed to re-register event %u\n", event->event_num);
+		sdei_event_destroy(event);
+		return err;
+	}
+
+	if (event->reenable) {
+		if (event->type == SDEI_EVENT_TYPE_SHARED)
+			err = sdei_api_event_enable(event->event_num);
+		else
+			err = sdei_do_cross_call(_local_event_enable, event);
+	}
+
+	if (err)
+		pr_err("Failed to re-enable event %u\n", event->event_num);
+
+	return err;
+}
+
+static int sdei_reregister_shared(void)
+{
+	int err = 0;
+	struct sdei_event *event;
+
+	mutex_lock(&sdei_events_lock);
+	spin_lock(&sdei_list_lock);
+	list_for_each_entry(event, &sdei_list, list) {
+		if (event->type != SDEI_EVENT_TYPE_SHARED)
+			continue;
+
+		if (event->reregister) {
+			err = sdei_reregister_event(event);
+			if (err)
+				break;
+		}
+	}
+	spin_unlock(&sdei_list_lock);
+	mutex_unlock(&sdei_events_lock);
+
+	return err;
+}
+
+static int sdei_cpuhp_down(unsigned int cpu)
+{
+	struct sdei_event *event;
+	struct sdei_crosscall_args arg;
+
+	/* un-register private events */
+	spin_lock(&sdei_list_lock);
+	list_for_each_entry(event, &sdei_list, list) {
+		if (event->type == SDEI_EVENT_TYPE_SHARED)
+			continue;
+
+		CROSSCALL_INIT(arg, event);
+		/* call the cross-call function locally... */
+		_local_event_unregister(&arg);
+		if (arg.first_error)
+			pr_err("Failed to unregister event %u: %d\n",
+			       event->event_num, arg.first_error);
+	}
+	spin_unlock(&sdei_list_lock);
+
+	return sdei_mask_local_cpu();
+}
+
+static int sdei_cpuhp_up(unsigned int cpu)
+{
+	struct sdei_event *event;
+	struct sdei_crosscall_args arg;
+
+	/* re-register/enable private events */
+	spin_lock(&sdei_list_lock);
+	list_for_each_entry(event, &sdei_list, list) {
+		if (event->type == SDEI_EVENT_TYPE_SHARED)
+			continue;
+
+		if (event->reregister) {
+			CROSSCALL_INIT(arg, event);
+			/* call the cross-call function locally... */
+			_local_event_register(&arg);
+			if (arg.first_error)
+				pr_err("Failed to re-register event %u: %d\n",
+				       event->event_num, arg.first_error);
+		}
+
+		if (event->reenable) {
+			CROSSCALL_INIT(arg, event);
+			_local_event_enable(&arg);
+			if (arg.first_error)
+				pr_err("Failed to re-enable event %u: %d\n",
+				       event->event_num, arg.first_error);
+		}
+	}
+	spin_unlock(&sdei_list_lock);
+
+	return sdei_unmask_local_cpu();
+}
+
+/* When entering idle, mask/unmask events for this cpu */
+static int sdei_pm_notifier(struct notifier_block *nb, unsigned long action,
+			    void *data)
+{
+	int rv;
+
+	switch (action) {
+	case CPU_PM_ENTER:
+		rv = sdei_mask_local_cpu();
+		break;
+	case CPU_PM_EXIT:
+	case CPU_PM_ENTER_FAILED:
+		rv = sdei_unmask_local_cpu();
+		break;
+	default:
+		return NOTIFY_DONE;
+	}
+
+	if (rv)
+		return notifier_from_errno(rv);
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block sdei_pm_nb = {
+	.notifier_call = sdei_pm_notifier,
+};
+
+static int sdei_device_suspend(struct device *dev)
+{
+	on_each_cpu(_ipi_mask_cpu, NULL, true);
+
+	return 0;
+}
+
+static int sdei_device_resume(struct device *dev)
+{
+	on_each_cpu(_ipi_unmask_cpu, NULL, true);
+
+	return 0;
+}
+
+/*
+ * We need all events to be reregistered when we resume from hibernate.
+ *
+ * The sequence is freeze->thaw. Reboot. freeze->restore. We unregister
+ * events during freeze, then re-register and re-enable them during thaw
+ * and restore.
+ */
+static int sdei_device_freeze(struct device *dev)
+{
+	int err;
+
+	/* unregister private events */
+	cpuhp_remove_state(CPUHP_AP_ARM_SDEI_STARTING);
+
+	err = sdei_unregister_shared();
+	if (err)
+		return err;
+
+	return 0;
+}
+
+static int sdei_device_thaw(struct device *dev)
+{
+	int err;
+
+	/* re-register shared events */
+	err = sdei_reregister_shared();
+	if (err) {
+		pr_warn("Failed to re-register shared events...\n");
+		sdei_mark_interface_broken();
+		return err;
+	}
+
+	err = cpuhp_setup_state(CPUHP_AP_ARM_SDEI_STARTING, "SDEI",
+				&sdei_cpuhp_up, &sdei_cpuhp_down);
+	if (err)
+		pr_warn("Failed to re-register CPU hotplug notifier...\n");
+
+	return err;
+}
+
+static int sdei_device_restore(struct device *dev)
+{
+	int err;
+
+	err = sdei_platform_reset();
+	if (err)
+		return err;
+
+	return sdei_device_thaw(dev);
+}
+
+static const struct dev_pm_ops sdei_pm_ops = {
+	.suspend = sdei_device_suspend,
+	.resume = sdei_device_resume,
+	.freeze = sdei_device_freeze,
+	.thaw = sdei_device_thaw,
+	.restore = sdei_device_restore,
+};
+
+/*
+ * Mask all CPUs and unregister all events on panic, reboot or kexec.
+ */
+static int sdei_reboot_notifier(struct notifier_block *nb, unsigned long action,
+				void *data)
+{
+	/*
+	 * We are going to reset the interface, after this there is no point
+	 * doing work when we take CPUs offline.
+	 */
+	cpuhp_remove_state(CPUHP_AP_ARM_SDEI_STARTING);
+
+	sdei_platform_reset();
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block sdei_reboot_nb = {
+	.notifier_call = sdei_reboot_notifier,
+};
+
+static void sdei_smccc_smc(unsigned long function_id,
+			   unsigned long arg0, unsigned long arg1,
+			   unsigned long arg2, unsigned long arg3,
+			   unsigned long arg4, struct arm_smccc_res *res)
+{
+	arm_smccc_smc(function_id, arg0, arg1, arg2, arg3, arg4, 0, 0, res);
+}
+
+static void sdei_smccc_hvc(unsigned long function_id,
+			   unsigned long arg0, unsigned long arg1,
+			   unsigned long arg2, unsigned long arg3,
+			   unsigned long arg4, struct arm_smccc_res *res)
+{
+	arm_smccc_hvc(function_id, arg0, arg1, arg2, arg3, arg4, 0, 0, res);
+}
+
+static int sdei_get_conduit(struct platform_device *pdev)
+{
+	const char *method;
+	struct device_node *np = pdev->dev.of_node;
+
+	sdei_firmware_call = NULL;
+	if (np) {
+		if (of_property_read_string(np, "method", &method)) {
+			pr_warn("missing \"method\" property\n");
+			return CONDUIT_INVALID;
+		}
+
+		if (!strcmp("hvc", method)) {
+			sdei_firmware_call = &sdei_smccc_hvc;
+			return CONDUIT_HVC;
+		} else if (!strcmp("smc", method)) {
+			sdei_firmware_call = &sdei_smccc_smc;
+			return CONDUIT_SMC;
+		}
+
+		pr_warn("invalid \"method\" property: %s\n", method);
+	} else if (IS_ENABLED(CONFIG_ACPI) && !acpi_disabled) {
+		if (acpi_psci_use_hvc()) {
+			sdei_firmware_call = &sdei_smccc_hvc;
+			return CONDUIT_HVC;
+		} else {
+			sdei_firmware_call = &sdei_smccc_smc;
+			return CONDUIT_SMC;
+		}
+	}
+
+	return CONDUIT_INVALID;
+}
+
+static int sdei_probe(struct platform_device *pdev)
+{
+	int err;
+	u64 ver = 0;
+	int conduit;
+
+	conduit = sdei_get_conduit(pdev);
+	if (!sdei_firmware_call)
+		return 0;
+
+	err = sdei_api_get_version(&ver);
+	if (err == -EOPNOTSUPP)
+		pr_err("advertised but not implemented in platform firmware\n");
+	if (err) {
+		pr_err("Failed to get SDEI version: %d\n", err);
+		sdei_mark_interface_broken();
+		return err;
+	}
+
+	pr_info("SDEIv%d.%d (0x%x) detected in firmware.\n",
+		(int)SDEI_VERSION_MAJOR(ver), (int)SDEI_VERSION_MINOR(ver),
+		(int)SDEI_VERSION_VENDOR(ver));
+
+	if (SDEI_VERSION_MAJOR(ver) != 1) {
+		pr_warn("Conflicting SDEI version detected.\n");
+		sdei_mark_interface_broken();
+		return -EINVAL;
+	}
+
+	err = sdei_platform_reset();
+	if (err)
+		return err;
+
+	sdei_entry_point = sdei_arch_get_entry_point(conduit);
+	if (!sdei_entry_point) {
+		/* Not supported due to hardware or boot configuration */
+		sdei_mark_interface_broken();
+		return 0;
+	}
+
+	err = cpu_pm_register_notifier(&sdei_pm_nb);
+	if (err) {
+		pr_warn("Failed to register CPU PM notifier...\n");
+		goto error;
+	}
+
+	err = register_reboot_notifier(&sdei_reboot_nb);
+	if (err) {
+		pr_warn("Failed to register reboot notifier...\n");
+		goto remove_cpupm;
+	}
+
+	err = cpuhp_setup_state(CPUHP_AP_ARM_SDEI_STARTING, "SDEI",
+				&sdei_cpuhp_up, &sdei_cpuhp_down);
+	if (err) {
+		pr_warn("Failed to register CPU hotplug notifier...\n");
+		goto remove_reboot;
+	}
+
+	return 0;
+
+remove_reboot:
+	unregister_reboot_notifier(&sdei_reboot_nb);
+
+remove_cpupm:
+	cpu_pm_unregister_notifier(&sdei_pm_nb);
+
+error:
+	sdei_mark_interface_broken();
+	return err;
+}
+
+static const struct of_device_id sdei_of_match[] = {
+	{ .compatible = "arm,sdei-1.0" },
+	{}
+};
+
+static struct platform_driver sdei_driver = {
+	.driver		= {
+		.name			= "sdei",
+		.pm			= &sdei_pm_ops,
+		.of_match_table		= sdei_of_match,
+	},
+	.probe		= sdei_probe,
+};
+
+static bool __init sdei_present_dt(void)
+{
+	struct platform_device *pdev;
+	struct device_node *np, *fw_np;
+
+	fw_np = of_find_node_by_name(NULL, "firmware");
+	if (!fw_np)
+		return false;
+
+	np = of_find_matching_node(fw_np, sdei_of_match);
+	of_node_put(fw_np);
+	if (!np)
+		return false;
+
+	pdev = of_platform_device_create(np, sdei_driver.driver.name, NULL);
+	of_node_put(np);
+	if (!pdev)
+		return false;
+
+	return true;
+}
+
+static bool __init sdei_present_acpi(void)
+{
+	acpi_status status;
+	struct platform_device *pdev;
+	struct acpi_table_header *sdei_table_header;
+
+	if (acpi_disabled)
+		return false;
+
+	status = acpi_get_table(ACPI_SIG_SDEI, 0, &sdei_table_header);
+	if (ACPI_FAILURE(status) && status != AE_NOT_FOUND) {
+		const char *msg = acpi_format_exception(status);
+
+		pr_info("Failed to get ACPI:SDEI table, %s\n", msg);
+	}
+	if (ACPI_FAILURE(status))
+		return false;
+
+	pdev = platform_device_register_simple(sdei_driver.driver.name, 0, NULL,
+					       0);
+	if (IS_ERR(pdev))
+		return false;
+
+	return true;
+}
+
+static int __init sdei_init(void)
+{
+	if (sdei_present_dt() || sdei_present_acpi())
+		platform_driver_register(&sdei_driver);
+
+	return 0;
+}
+
+/*
+ * On an ACPI system SDEI needs to be ready before HEST:GHES tries to register
+ * its events. ACPI is initialised from a subsys_initcall(), GHES is initialised
+ * by device_initcall(). We want to be called in the middle.
+ */
+subsys_initcall_sync(sdei_init);
+
+int sdei_event_handler(struct pt_regs *regs,
+		       struct sdei_registered_event *arg)
+{
+	int err;
+	mm_segment_t orig_addr_limit;
+	u32 event_num = arg->event_num;
+
+	orig_addr_limit = get_fs();
+	set_fs(USER_DS);
+
+	err = arg->callback(event_num, regs, arg->callback_arg);
+	if (err)
+		pr_err_ratelimited("event %u on CPU %u failed with error: %d\n",
+				   event_num, smp_processor_id(), err);
+
+	set_fs(orig_addr_limit);
+
+	return err;
+}
+NOKPROBE_SYMBOL(sdei_event_handler);
diff --git a/drivers/firmware/efi/Kconfig b/drivers/firmware/efi/Kconfig
index 2b4c39f..6047ed4 100644
--- a/drivers/firmware/efi/Kconfig
+++ b/drivers/firmware/efi/Kconfig
@@ -159,13 +159,21 @@
 	  using the TCG Platform Reset Attack Mitigation specification. This
 	  protects against an attacker forcibly rebooting the system while it
 	  still contains secrets in RAM, booting another OS and extracting the
-	  secrets.
+	  secrets. This should only be enabled when userland is configured to
+	  clear the MemoryOverwriteRequest flag on clean shutdown after secrets
+	  have been evicted, since otherwise it will trigger even on clean
+	  reboots.
 
 endmenu
 
 config UEFI_CPER
 	bool
 
+config UEFI_CPER_ARM
+	bool
+	depends on UEFI_CPER && ( ARM || ARM64 )
+	default y
+
 config EFI_DEV_PATH_PARSER
 	bool
 	depends on ACPI
diff --git a/drivers/firmware/efi/Makefile b/drivers/firmware/efi/Makefile
index 269501d..a3e73d6 100644
--- a/drivers/firmware/efi/Makefile
+++ b/drivers/firmware/efi/Makefile
@@ -30,3 +30,4 @@
 obj-$(CONFIG_ARM)			+= $(arm-obj-y)
 obj-$(CONFIG_ARM64)			+= $(arm-obj-y)
 obj-$(CONFIG_EFI_CAPSULE_LOADER)	+= capsule-loader.o
+obj-$(CONFIG_UEFI_CPER_ARM)		+= cper-arm.o
diff --git a/drivers/firmware/efi/capsule-loader.c b/drivers/firmware/efi/capsule-loader.c
index ec8ac5c..e456f46 100644
--- a/drivers/firmware/efi/capsule-loader.c
+++ b/drivers/firmware/efi/capsule-loader.c
@@ -20,10 +20,6 @@
 
 #define NO_FURTHER_WRITE_ACTION -1
 
-#ifndef phys_to_page
-#define phys_to_page(x)		pfn_to_page((x) >> PAGE_SHIFT)
-#endif
-
 /**
  * efi_free_all_buff_pages - free all previous allocated buffer pages
  * @cap_info: pointer to current instance of capsule_info structure
@@ -35,7 +31,7 @@
 static void efi_free_all_buff_pages(struct capsule_info *cap_info)
 {
 	while (cap_info->index > 0)
-		__free_page(phys_to_page(cap_info->pages[--cap_info->index]));
+		__free_page(cap_info->pages[--cap_info->index]);
 
 	cap_info->index = NO_FURTHER_WRITE_ACTION;
 }
@@ -49,7 +45,7 @@ int __efi_capsule_setup_info(struct capsule_info *cap_info)
 	pages_needed = ALIGN(cap_info->total_size, PAGE_SIZE) / PAGE_SIZE;
 
 	if (pages_needed == 0) {
-		pr_err("invalid capsule size");
+		pr_err("invalid capsule size\n");
 		return -EINVAL;
 	}
 
@@ -71,6 +67,14 @@ int __efi_capsule_setup_info(struct capsule_info *cap_info)
 
 	cap_info->pages = temp_page;
 
+	temp_page = krealloc(cap_info->phys,
+			     pages_needed * sizeof(phys_addr_t *),
+			     GFP_KERNEL | __GFP_ZERO);
+	if (!temp_page)
+		return -ENOMEM;
+
+	cap_info->phys = temp_page;
+
 	return 0;
 }
 
@@ -105,9 +109,24 @@ int __weak efi_capsule_setup_info(struct capsule_info *cap_info, void *kbuff,
  **/
 static ssize_t efi_capsule_submit_update(struct capsule_info *cap_info)
 {
+	bool do_vunmap = false;
 	int ret;
 
-	ret = efi_capsule_update(&cap_info->header, cap_info->pages);
+	/*
+	 * cap_info->capsule may have been assigned already by a quirk
+	 * handler, so only overwrite it if it is NULL
+	 */
+	if (!cap_info->capsule) {
+		cap_info->capsule = vmap(cap_info->pages, cap_info->index,
+					 VM_MAP, PAGE_KERNEL);
+		if (!cap_info->capsule)
+			return -ENOMEM;
+		do_vunmap = true;
+	}
+
+	ret = efi_capsule_update(cap_info->capsule, cap_info->phys);
+	if (do_vunmap)
+		vunmap(cap_info->capsule);
 	if (ret) {
 		pr_err("capsule update failed\n");
 		return ret;
@@ -165,10 +184,12 @@ static ssize_t efi_capsule_write(struct file *file, const char __user *buff,
 			goto failed;
 		}
 
-		cap_info->pages[cap_info->index++] = page_to_phys(page);
+		cap_info->pages[cap_info->index] = page;
+		cap_info->phys[cap_info->index] = page_to_phys(page);
 		cap_info->page_bytes_remain = PAGE_SIZE;
+		cap_info->index++;
 	} else {
-		page = phys_to_page(cap_info->pages[cap_info->index - 1]);
+		page = cap_info->pages[cap_info->index - 1];
 	}
 
 	kbuff = kmap(page);
@@ -252,6 +273,7 @@ static int efi_capsule_release(struct inode *inode, struct file *file)
 	struct capsule_info *cap_info = file->private_data;
 
 	kfree(cap_info->pages);
+	kfree(cap_info->phys);
 	kfree(file->private_data);
 	file->private_data = NULL;
 	return 0;
@@ -281,6 +303,13 @@ static int efi_capsule_open(struct inode *inode, struct file *file)
 		return -ENOMEM;
 	}
 
+	cap_info->phys = kzalloc(sizeof(void *), GFP_KERNEL);
+	if (!cap_info->phys) {
+		kfree(cap_info->pages);
+		kfree(cap_info);
+		return -ENOMEM;
+	}
+
 	file->private_data = cap_info;
 
 	return 0;
diff --git a/drivers/firmware/efi/cper-arm.c b/drivers/firmware/efi/cper-arm.c
new file mode 100644
index 0000000..698e5c8
--- /dev/null
+++ b/drivers/firmware/efi/cper-arm.c
@@ -0,0 +1,356 @@
+/*
+ * UEFI Common Platform Error Record (CPER) support
+ *
+ * Copyright (C) 2017, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License version
+ * 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/time.h>
+#include <linux/cper.h>
+#include <linux/dmi.h>
+#include <linux/acpi.h>
+#include <linux/pci.h>
+#include <linux/aer.h>
+#include <linux/printk.h>
+#include <linux/bcd.h>
+#include <acpi/ghes.h>
+#include <ras/ras_event.h>
+
+#define INDENT_SP	" "
+
+static const char * const arm_reg_ctx_strs[] = {
+	"AArch32 general purpose registers",
+	"AArch32 EL1 context registers",
+	"AArch32 EL2 context registers",
+	"AArch32 secure context registers",
+	"AArch64 general purpose registers",
+	"AArch64 EL1 context registers",
+	"AArch64 EL2 context registers",
+	"AArch64 EL3 context registers",
+	"Misc. system register structure",
+};
+
+static const char * const arm_err_trans_type_strs[] = {
+	"Instruction",
+	"Data Access",
+	"Generic",
+};
+
+static const char * const arm_bus_err_op_strs[] = {
+	"Generic error (type cannot be determined)",
+	"Generic read (type of instruction or data request cannot be determined)",
+	"Generic write (type of instruction of data request cannot be determined)",
+	"Data read",
+	"Data write",
+	"Instruction fetch",
+	"Prefetch",
+};
+
+static const char * const arm_cache_err_op_strs[] = {
+	"Generic error (type cannot be determined)",
+	"Generic read (type of instruction or data request cannot be determined)",
+	"Generic write (type of instruction of data request cannot be determined)",
+	"Data read",
+	"Data write",
+	"Instruction fetch",
+	"Prefetch",
+	"Eviction",
+	"Snooping (processor initiated a cache snoop that resulted in an error)",
+	"Snooped (processor raised a cache error caused by another processor or device snooping its cache)",
+	"Management",
+};
+
+static const char * const arm_tlb_err_op_strs[] = {
+	"Generic error (type cannot be determined)",
+	"Generic read (type of instruction or data request cannot be determined)",
+	"Generic write (type of instruction of data request cannot be determined)",
+	"Data read",
+	"Data write",
+	"Instruction fetch",
+	"Prefetch",
+	"Local management operation (processor initiated a TLB management operation that resulted in an error)",
+	"External management operation (processor raised a TLB error caused by another processor or device broadcasting TLB operations)",
+};
+
+static const char * const arm_bus_err_part_type_strs[] = {
+	"Local processor originated request",
+	"Local processor responded to request",
+	"Local processor observed",
+	"Generic",
+};
+
+static const char * const arm_bus_err_addr_space_strs[] = {
+	"External Memory Access",
+	"Internal Memory Access",
+	"Unknown",
+	"Device Memory Access",
+};
+
+static void cper_print_arm_err_info(const char *pfx, u32 type,
+				    u64 error_info)
+{
+	u8 trans_type, op_type, level, participation_type, address_space;
+	u16 mem_attributes;
+	bool proc_context_corrupt, corrected, precise_pc, restartable_pc;
+	bool time_out, access_mode;
+
+	/* If the type is unknown, bail. */
+	if (type > CPER_ARM_MAX_TYPE)
+		return;
+
+	/*
+	 * Vendor type errors have error information values that are vendor
+	 * specific.
+	 */
+	if (type == CPER_ARM_VENDOR_ERROR)
+		return;
+
+	if (error_info & CPER_ARM_ERR_VALID_TRANSACTION_TYPE) {
+		trans_type = ((error_info >> CPER_ARM_ERR_TRANSACTION_SHIFT)
+			      & CPER_ARM_ERR_TRANSACTION_MASK);
+		if (trans_type < ARRAY_SIZE(arm_err_trans_type_strs)) {
+			printk("%stransaction type: %s\n", pfx,
+			       arm_err_trans_type_strs[trans_type]);
+		}
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_OPERATION_TYPE) {
+		op_type = ((error_info >> CPER_ARM_ERR_OPERATION_SHIFT)
+			   & CPER_ARM_ERR_OPERATION_MASK);
+		switch (type) {
+		case CPER_ARM_CACHE_ERROR:
+			if (op_type < ARRAY_SIZE(arm_cache_err_op_strs)) {
+				printk("%soperation type: %s\n", pfx,
+				       arm_cache_err_op_strs[op_type]);
+			}
+			break;
+		case CPER_ARM_TLB_ERROR:
+			if (op_type < ARRAY_SIZE(arm_tlb_err_op_strs)) {
+				printk("%soperation type: %s\n", pfx,
+				       arm_tlb_err_op_strs[op_type]);
+			}
+			break;
+		case CPER_ARM_BUS_ERROR:
+			if (op_type < ARRAY_SIZE(arm_bus_err_op_strs)) {
+				printk("%soperation type: %s\n", pfx,
+				       arm_bus_err_op_strs[op_type]);
+			}
+			break;
+		}
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_LEVEL) {
+		level = ((error_info >> CPER_ARM_ERR_LEVEL_SHIFT)
+			 & CPER_ARM_ERR_LEVEL_MASK);
+		switch (type) {
+		case CPER_ARM_CACHE_ERROR:
+			printk("%scache level: %d\n", pfx, level);
+			break;
+		case CPER_ARM_TLB_ERROR:
+			printk("%sTLB level: %d\n", pfx, level);
+			break;
+		case CPER_ARM_BUS_ERROR:
+			printk("%saffinity level at which the bus error occurred: %d\n",
+			       pfx, level);
+			break;
+		}
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_PROC_CONTEXT_CORRUPT) {
+		proc_context_corrupt = ((error_info >> CPER_ARM_ERR_PC_CORRUPT_SHIFT)
+					& CPER_ARM_ERR_PC_CORRUPT_MASK);
+		if (proc_context_corrupt)
+			printk("%sprocessor context corrupted\n", pfx);
+		else
+			printk("%sprocessor context not corrupted\n", pfx);
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_CORRECTED) {
+		corrected = ((error_info >> CPER_ARM_ERR_CORRECTED_SHIFT)
+			     & CPER_ARM_ERR_CORRECTED_MASK);
+		if (corrected)
+			printk("%sthe error has been corrected\n", pfx);
+		else
+			printk("%sthe error has not been corrected\n", pfx);
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_PRECISE_PC) {
+		precise_pc = ((error_info >> CPER_ARM_ERR_PRECISE_PC_SHIFT)
+			      & CPER_ARM_ERR_PRECISE_PC_MASK);
+		if (precise_pc)
+			printk("%sPC is precise\n", pfx);
+		else
+			printk("%sPC is imprecise\n", pfx);
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_RESTARTABLE_PC) {
+		restartable_pc = ((error_info >> CPER_ARM_ERR_RESTARTABLE_PC_SHIFT)
+				  & CPER_ARM_ERR_RESTARTABLE_PC_MASK);
+		if (restartable_pc)
+			printk("%sProgram execution can be restarted reliably at the PC associated with the error.\n", pfx);
+	}
+
+	/* The rest of the fields are specific to bus errors */
+	if (type != CPER_ARM_BUS_ERROR)
+		return;
+
+	if (error_info & CPER_ARM_ERR_VALID_PARTICIPATION_TYPE) {
+		participation_type = ((error_info >> CPER_ARM_ERR_PARTICIPATION_TYPE_SHIFT)
+				      & CPER_ARM_ERR_PARTICIPATION_TYPE_MASK);
+		if (participation_type < ARRAY_SIZE(arm_bus_err_part_type_strs)) {
+			printk("%sparticipation type: %s\n", pfx,
+			       arm_bus_err_part_type_strs[participation_type]);
+		}
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_TIME_OUT) {
+		time_out = ((error_info >> CPER_ARM_ERR_TIME_OUT_SHIFT)
+			    & CPER_ARM_ERR_TIME_OUT_MASK);
+		if (time_out)
+			printk("%srequest timed out\n", pfx);
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_ADDRESS_SPACE) {
+		address_space = ((error_info >> CPER_ARM_ERR_ADDRESS_SPACE_SHIFT)
+				 & CPER_ARM_ERR_ADDRESS_SPACE_MASK);
+		if (address_space < ARRAY_SIZE(arm_bus_err_addr_space_strs)) {
+			printk("%saddress space: %s\n", pfx,
+			       arm_bus_err_addr_space_strs[address_space]);
+		}
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_MEM_ATTRIBUTES) {
+		mem_attributes = ((error_info >> CPER_ARM_ERR_MEM_ATTRIBUTES_SHIFT)
+				  & CPER_ARM_ERR_MEM_ATTRIBUTES_MASK);
+		printk("%smemory access attributes:0x%x\n", pfx, mem_attributes);
+	}
+
+	if (error_info & CPER_ARM_ERR_VALID_ACCESS_MODE) {
+		access_mode = ((error_info >> CPER_ARM_ERR_ACCESS_MODE_SHIFT)
+			       & CPER_ARM_ERR_ACCESS_MODE_MASK);
+		if (access_mode)
+			printk("%saccess mode: normal\n", pfx);
+		else
+			printk("%saccess mode: secure\n", pfx);
+	}
+}
+
+void cper_print_proc_arm(const char *pfx,
+			 const struct cper_sec_proc_arm *proc)
+{
+	int i, len, max_ctx_type;
+	struct cper_arm_err_info *err_info;
+	struct cper_arm_ctx_info *ctx_info;
+	char newpfx[64], infopfx[64];
+
+	printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
+
+	len = proc->section_length - (sizeof(*proc) +
+		proc->err_info_num * (sizeof(*err_info)));
+	if (len < 0) {
+		printk("%ssection length: %d\n", pfx, proc->section_length);
+		printk("%ssection length is too small\n", pfx);
+		printk("%sfirmware-generated error record is incorrect\n", pfx);
+		printk("%sERR_INFO_NUM is %d\n", pfx, proc->err_info_num);
+		return;
+	}
+
+	if (proc->validation_bits & CPER_ARM_VALID_MPIDR)
+		printk("%sMultiprocessor Affinity Register (MPIDR): 0x%016llx\n",
+			pfx, proc->mpidr);
+
+	if (proc->validation_bits & CPER_ARM_VALID_AFFINITY_LEVEL)
+		printk("%serror affinity level: %d\n", pfx,
+			proc->affinity_level);
+
+	if (proc->validation_bits & CPER_ARM_VALID_RUNNING_STATE) {
+		printk("%srunning state: 0x%x\n", pfx, proc->running_state);
+		printk("%sPower State Coordination Interface state: %d\n",
+			pfx, proc->psci_state);
+	}
+
+	snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
+
+	err_info = (struct cper_arm_err_info *)(proc + 1);
+	for (i = 0; i < proc->err_info_num; i++) {
+		printk("%sError info structure %d:\n", pfx, i);
+
+		printk("%snum errors: %d\n", pfx, err_info->multiple_error + 1);
+
+		if (err_info->validation_bits & CPER_ARM_INFO_VALID_FLAGS) {
+			if (err_info->flags & CPER_ARM_INFO_FLAGS_FIRST)
+				printk("%sfirst error captured\n", newpfx);
+			if (err_info->flags & CPER_ARM_INFO_FLAGS_LAST)
+				printk("%slast error captured\n", newpfx);
+			if (err_info->flags & CPER_ARM_INFO_FLAGS_PROPAGATED)
+				printk("%spropagated error captured\n",
+				       newpfx);
+			if (err_info->flags & CPER_ARM_INFO_FLAGS_OVERFLOW)
+				printk("%soverflow occurred, error info is incomplete\n",
+				       newpfx);
+		}
+
+		printk("%serror_type: %d, %s\n", newpfx, err_info->type,
+			err_info->type < ARRAY_SIZE(cper_proc_error_type_strs) ?
+			cper_proc_error_type_strs[err_info->type] : "unknown");
+		if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO) {
+			printk("%serror_info: 0x%016llx\n", newpfx,
+			       err_info->error_info);
+			snprintf(infopfx, sizeof(infopfx), "%s%s", newpfx, INDENT_SP);
+			cper_print_arm_err_info(infopfx, err_info->type,
+						err_info->error_info);
+		}
+		if (err_info->validation_bits & CPER_ARM_INFO_VALID_VIRT_ADDR)
+			printk("%svirtual fault address: 0x%016llx\n",
+				newpfx, err_info->virt_fault_addr);
+		if (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR)
+			printk("%sphysical fault address: 0x%016llx\n",
+				newpfx, err_info->physical_fault_addr);
+		err_info += 1;
+	}
+
+	ctx_info = (struct cper_arm_ctx_info *)err_info;
+	max_ctx_type = ARRAY_SIZE(arm_reg_ctx_strs) - 1;
+	for (i = 0; i < proc->context_info_num; i++) {
+		int size = sizeof(*ctx_info) + ctx_info->size;
+
+		printk("%sContext info structure %d:\n", pfx, i);
+		if (len < size) {
+			printk("%ssection length is too small\n", newpfx);
+			printk("%sfirmware-generated error record is incorrect\n", pfx);
+			return;
+		}
+		if (ctx_info->type > max_ctx_type) {
+			printk("%sInvalid context type: %d (max: %d)\n",
+				newpfx, ctx_info->type, max_ctx_type);
+			return;
+		}
+		printk("%sregister context type: %s\n", newpfx,
+			arm_reg_ctx_strs[ctx_info->type]);
+		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
+				(ctx_info + 1), ctx_info->size, 0);
+		len -= size;
+		ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + size);
+	}
+
+	if (len > 0) {
+		printk("%sVendor specific error info has %u bytes:\n", pfx,
+		       len);
+		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4, ctx_info,
+				len, true);
+	}
+}
diff --git a/drivers/firmware/efi/cper.c b/drivers/firmware/efi/cper.c
index d2fcafc..c165933 100644
--- a/drivers/firmware/efi/cper.c
+++ b/drivers/firmware/efi/cper.c
@@ -122,7 +122,7 @@ static const char * const proc_isa_strs[] = {
 	"ARM A64",
 };
 
-static const char * const proc_error_type_strs[] = {
+const char * const cper_proc_error_type_strs[] = {
 	"cache error",
 	"TLB error",
 	"bus error",
@@ -157,8 +157,8 @@ static void cper_print_proc_generic(const char *pfx,
 	if (proc->validation_bits & CPER_PROC_VALID_ERROR_TYPE) {
 		printk("%s""error_type: 0x%02x\n", pfx, proc->proc_error_type);
 		cper_print_bits(pfx, proc->proc_error_type,
-				proc_error_type_strs,
-				ARRAY_SIZE(proc_error_type_strs));
+				cper_proc_error_type_strs,
+				ARRAY_SIZE(cper_proc_error_type_strs));
 	}
 	if (proc->validation_bits & CPER_PROC_VALID_OPERATION)
 		printk("%s""operation: %d, %s\n", pfx, proc->operation,
@@ -188,122 +188,6 @@ static void cper_print_proc_generic(const char *pfx,
 		printk("%s""IP: 0x%016llx\n", pfx, proc->ip);
 }
 
-#if defined(CONFIG_ARM64) || defined(CONFIG_ARM)
-static const char * const arm_reg_ctx_strs[] = {
-	"AArch32 general purpose registers",
-	"AArch32 EL1 context registers",
-	"AArch32 EL2 context registers",
-	"AArch32 secure context registers",
-	"AArch64 general purpose registers",
-	"AArch64 EL1 context registers",
-	"AArch64 EL2 context registers",
-	"AArch64 EL3 context registers",
-	"Misc. system register structure",
-};
-
-static void cper_print_proc_arm(const char *pfx,
-				const struct cper_sec_proc_arm *proc)
-{
-	int i, len, max_ctx_type;
-	struct cper_arm_err_info *err_info;
-	struct cper_arm_ctx_info *ctx_info;
-	char newpfx[64];
-
-	printk("%sMIDR: 0x%016llx\n", pfx, proc->midr);
-
-	len = proc->section_length - (sizeof(*proc) +
-		proc->err_info_num * (sizeof(*err_info)));
-	if (len < 0) {
-		printk("%ssection length: %d\n", pfx, proc->section_length);
-		printk("%ssection length is too small\n", pfx);
-		printk("%sfirmware-generated error record is incorrect\n", pfx);
-		printk("%sERR_INFO_NUM is %d\n", pfx, proc->err_info_num);
-		return;
-	}
-
-	if (proc->validation_bits & CPER_ARM_VALID_MPIDR)
-		printk("%sMultiprocessor Affinity Register (MPIDR): 0x%016llx\n",
-			pfx, proc->mpidr);
-
-	if (proc->validation_bits & CPER_ARM_VALID_AFFINITY_LEVEL)
-		printk("%serror affinity level: %d\n", pfx,
-			proc->affinity_level);
-
-	if (proc->validation_bits & CPER_ARM_VALID_RUNNING_STATE) {
-		printk("%srunning state: 0x%x\n", pfx, proc->running_state);
-		printk("%sPower State Coordination Interface state: %d\n",
-			pfx, proc->psci_state);
-	}
-
-	snprintf(newpfx, sizeof(newpfx), "%s%s", pfx, INDENT_SP);
-
-	err_info = (struct cper_arm_err_info *)(proc + 1);
-	for (i = 0; i < proc->err_info_num; i++) {
-		printk("%sError info structure %d:\n", pfx, i);
-
-		printk("%snum errors: %d\n", pfx, err_info->multiple_error + 1);
-
-		if (err_info->validation_bits & CPER_ARM_INFO_VALID_FLAGS) {
-			if (err_info->flags & CPER_ARM_INFO_FLAGS_FIRST)
-				printk("%sfirst error captured\n", newpfx);
-			if (err_info->flags & CPER_ARM_INFO_FLAGS_LAST)
-				printk("%slast error captured\n", newpfx);
-			if (err_info->flags & CPER_ARM_INFO_FLAGS_PROPAGATED)
-				printk("%spropagated error captured\n",
-				       newpfx);
-			if (err_info->flags & CPER_ARM_INFO_FLAGS_OVERFLOW)
-				printk("%soverflow occurred, error info is incomplete\n",
-				       newpfx);
-		}
-
-		printk("%serror_type: %d, %s\n", newpfx, err_info->type,
-			err_info->type < ARRAY_SIZE(proc_error_type_strs) ?
-			proc_error_type_strs[err_info->type] : "unknown");
-		if (err_info->validation_bits & CPER_ARM_INFO_VALID_ERR_INFO)
-			printk("%serror_info: 0x%016llx\n", newpfx,
-			       err_info->error_info);
-		if (err_info->validation_bits & CPER_ARM_INFO_VALID_VIRT_ADDR)
-			printk("%svirtual fault address: 0x%016llx\n",
-				newpfx, err_info->virt_fault_addr);
-		if (err_info->validation_bits & CPER_ARM_INFO_VALID_PHYSICAL_ADDR)
-			printk("%sphysical fault address: 0x%016llx\n",
-				newpfx, err_info->physical_fault_addr);
-		err_info += 1;
-	}
-
-	ctx_info = (struct cper_arm_ctx_info *)err_info;
-	max_ctx_type = ARRAY_SIZE(arm_reg_ctx_strs) - 1;
-	for (i = 0; i < proc->context_info_num; i++) {
-		int size = sizeof(*ctx_info) + ctx_info->size;
-
-		printk("%sContext info structure %d:\n", pfx, i);
-		if (len < size) {
-			printk("%ssection length is too small\n", newpfx);
-			printk("%sfirmware-generated error record is incorrect\n", pfx);
-			return;
-		}
-		if (ctx_info->type > max_ctx_type) {
-			printk("%sInvalid context type: %d (max: %d)\n",
-				newpfx, ctx_info->type, max_ctx_type);
-			return;
-		}
-		printk("%sregister context type: %s\n", newpfx,
-			arm_reg_ctx_strs[ctx_info->type]);
-		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4,
-				(ctx_info + 1), ctx_info->size, 0);
-		len -= size;
-		ctx_info = (struct cper_arm_ctx_info *)((long)ctx_info + size);
-	}
-
-	if (len > 0) {
-		printk("%sVendor specific error info has %u bytes:\n", pfx,
-		       len);
-		print_hex_dump(newpfx, "", DUMP_PREFIX_OFFSET, 16, 4, ctx_info,
-				len, true);
-	}
-}
-#endif
-
 static const char * const mem_err_type_strs[] = {
 	"unknown",
 	"no error",
diff --git a/drivers/firmware/efi/efi.c b/drivers/firmware/efi/efi.c
index 557a478..8ce70c2 100644
--- a/drivers/firmware/efi/efi.c
+++ b/drivers/firmware/efi/efi.c
@@ -608,7 +608,7 @@ static int __init efi_load_efivars(void)
 		return 0;
 
 	pdev = platform_device_register_simple("efivars", 0, NULL, 0);
-	return IS_ERR(pdev) ? PTR_ERR(pdev) : 0;
+	return PTR_ERR_OR_ZERO(pdev);
 }
 device_initcall(efi_load_efivars);
 #endif
diff --git a/drivers/firmware/psci.c b/drivers/firmware/psci.c
index d687ca3d..8b25d31 100644
--- a/drivers/firmware/psci.c
+++ b/drivers/firmware/psci.c
@@ -496,6 +496,8 @@ static void __init psci_init_migrate(void)
 static void __init psci_0_2_set_functions(void)
 {
 	pr_info("Using standard PSCI v0.2 function IDs\n");
+	psci_ops.get_version = psci_get_version;
+
 	psci_function_id[PSCI_FN_CPU_SUSPEND] =
 					PSCI_FN_NATIVE(0_2, CPU_SUSPEND);
 	psci_ops.cpu_suspend = psci_cpu_suspend;
diff --git a/drivers/firmware/psci_checker.c b/drivers/firmware/psci_checker.c
index f3f4f81..bb1c068 100644
--- a/drivers/firmware/psci_checker.c
+++ b/drivers/firmware/psci_checker.c
@@ -77,8 +77,8 @@ static int psci_ops_check(void)
 	return 0;
 }
 
-static int find_clusters(const struct cpumask *cpus,
-			 const struct cpumask **clusters)
+static int find_cpu_groups(const struct cpumask *cpus,
+			   const struct cpumask **cpu_groups)
 {
 	unsigned int nb = 0;
 	cpumask_var_t tmp;
@@ -88,11 +88,11 @@ static int find_clusters(const struct cpumask *cpus,
 	cpumask_copy(tmp, cpus);
 
 	while (!cpumask_empty(tmp)) {
-		const struct cpumask *cluster =
+		const struct cpumask *cpu_group =
 			topology_core_cpumask(cpumask_any(tmp));
 
-		clusters[nb++] = cluster;
-		cpumask_andnot(tmp, tmp, cluster);
+		cpu_groups[nb++] = cpu_group;
+		cpumask_andnot(tmp, tmp, cpu_group);
 	}
 
 	free_cpumask_var(tmp);
@@ -170,24 +170,24 @@ static int hotplug_tests(void)
 {
 	int err;
 	cpumask_var_t offlined_cpus;
-	int i, nb_cluster;
-	const struct cpumask **clusters;
+	int i, nb_cpu_group;
+	const struct cpumask **cpu_groups;
 	char *page_buf;
 
 	err = -ENOMEM;
 	if (!alloc_cpumask_var(&offlined_cpus, GFP_KERNEL))
 		return err;
-	/* We may have up to nb_available_cpus clusters. */
-	clusters = kmalloc_array(nb_available_cpus, sizeof(*clusters),
-				 GFP_KERNEL);
-	if (!clusters)
+	/* We may have up to nb_available_cpus cpu_groups. */
+	cpu_groups = kmalloc_array(nb_available_cpus, sizeof(*cpu_groups),
+				   GFP_KERNEL);
+	if (!cpu_groups)
 		goto out_free_cpus;
 	page_buf = (char *)__get_free_page(GFP_KERNEL);
 	if (!page_buf)
-		goto out_free_clusters;
+		goto out_free_cpu_groups;
 
 	err = 0;
-	nb_cluster = find_clusters(cpu_online_mask, clusters);
+	nb_cpu_group = find_cpu_groups(cpu_online_mask, cpu_groups);
 
 	/*
 	 * Of course the last CPU cannot be powered down and cpu_down() should
@@ -197,24 +197,22 @@ static int hotplug_tests(void)
 	err += down_and_up_cpus(cpu_online_mask, offlined_cpus);
 
 	/*
-	 * Take down CPUs by cluster this time. When the last CPU is turned
-	 * off, the cluster itself should shut down.
+	 * Take down CPUs by cpu group this time. When the last CPU is turned
+	 * off, the cpu group itself should shut down.
 	 */
-	for (i = 0; i < nb_cluster; ++i) {
-		int cluster_id =
-			topology_physical_package_id(cpumask_any(clusters[i]));
+	for (i = 0; i < nb_cpu_group; ++i) {
 		ssize_t len = cpumap_print_to_pagebuf(true, page_buf,
-						      clusters[i]);
+						      cpu_groups[i]);
 		/* Remove trailing newline. */
 		page_buf[len - 1] = '\0';
-		pr_info("Trying to turn off and on again cluster %d "
-			"(CPUs %s)\n", cluster_id, page_buf);
-		err += down_and_up_cpus(clusters[i], offlined_cpus);
+		pr_info("Trying to turn off and on again group %d (CPUs %s)\n",
+			i, page_buf);
+		err += down_and_up_cpus(cpu_groups[i], offlined_cpus);
 	}
 
 	free_page((unsigned long)page_buf);
-out_free_clusters:
-	kfree(clusters);
+out_free_cpu_groups:
+	kfree(cpu_groups);
 out_free_cpus:
 	free_cpumask_var(offlined_cpus);
 	return err;
diff --git a/drivers/gpio/gpio-merrifield.c b/drivers/gpio/gpio-merrifield.c
index dd67a31..c38624e 100644
--- a/drivers/gpio/gpio-merrifield.c
+++ b/drivers/gpio/gpio-merrifield.c
@@ -9,6 +9,7 @@
  * published by the Free Software Foundation.
  */
 
+#include <linux/acpi.h>
 #include <linux/bitops.h>
 #include <linux/gpio/driver.h>
 #include <linux/init.h>
@@ -380,9 +381,16 @@ static void mrfld_irq_init_hw(struct mrfld_gpio *priv)
 	}
 }
 
+static const char *mrfld_gpio_get_pinctrl_dev_name(void)
+{
+	const char *dev_name = acpi_dev_get_first_match_name("INTC1002", NULL, -1);
+	return dev_name ? dev_name : "pinctrl-merrifield";
+}
+
 static int mrfld_gpio_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 {
 	const struct mrfld_gpio_pinrange *range;
+	const char *pinctrl_dev_name;
 	struct mrfld_gpio *priv;
 	u32 gpio_base, irq_base;
 	void __iomem *base;
@@ -439,10 +447,11 @@ static int mrfld_gpio_probe(struct pci_dev *pdev, const struct pci_device_id *id
 		return retval;
 	}
 
+	pinctrl_dev_name = mrfld_gpio_get_pinctrl_dev_name();
 	for (i = 0; i < ARRAY_SIZE(mrfld_gpio_ranges); i++) {
 		range = &mrfld_gpio_ranges[i];
 		retval = gpiochip_add_pin_range(&priv->chip,
-						"pinctrl-merrifield",
+						pinctrl_dev_name,
 						range->gpio_base,
 						range->pin_base,
 						range->npins);
diff --git a/drivers/gpio/gpio-mmio.c b/drivers/gpio/gpio-mmio.c
index f9042bc..7b14d62 100644
--- a/drivers/gpio/gpio-mmio.c
+++ b/drivers/gpio/gpio-mmio.c
@@ -152,14 +152,13 @@ static int bgpio_get_set_multiple(struct gpio_chip *gc, unsigned long *mask,
 {
 	unsigned long get_mask = 0;
 	unsigned long set_mask = 0;
-	int bit = 0;
 
-	while ((bit = find_next_bit(mask, gc->ngpio, bit)) != gc->ngpio) {
-		if (gc->bgpio_dir & BIT(bit))
-			set_mask |= BIT(bit);
-		else
-			get_mask |= BIT(bit);
-	}
+	/* Make sure we first clear any bits that are zero when we read the register */
+	*bits &= ~*mask;
+
+	/* Exploit the fact that we know which directions are set */
+	set_mask = *mask & gc->bgpio_dir;
+	get_mask = *mask & ~gc->bgpio_dir;
 
 	if (set_mask)
 		*bits |= gc->read_reg(gc->reg_set) & set_mask;
@@ -176,13 +175,13 @@ static int bgpio_get(struct gpio_chip *gc, unsigned int gpio)
 
 /*
  * This only works if the bits in the GPIO register are in native endianness.
- * It is dirt simple and fast in this case. (Also the most common case.)
  */
 static int bgpio_get_multiple(struct gpio_chip *gc, unsigned long *mask,
 			      unsigned long *bits)
 {
-
-	*bits = gc->read_reg(gc->reg_dat) & *mask;
+	/* Make sure we first clear any bits that are zero when we read the register */
+	*bits &= ~*mask;
+	*bits |= gc->read_reg(gc->reg_dat) & *mask;
 	return 0;
 }
 
@@ -196,9 +195,12 @@ static int bgpio_get_multiple_be(struct gpio_chip *gc, unsigned long *mask,
 	unsigned long val;
 	int bit;
 
+	/* Make sure we first clear any bits that are zero when we read the register */
+	*bits &= ~*mask;
+
 	/* Create a mirrored mask */
-	bit = 0;
-	while ((bit = find_next_bit(mask, gc->ngpio, bit)) != gc->ngpio)
+	bit = -1;
+	while ((bit = find_next_bit(mask, gc->ngpio, bit + 1)) < gc->ngpio)
 		readmask |= bgpio_line2mask(gc, bit);
 
 	/* Read the register */
@@ -208,8 +210,8 @@ static int bgpio_get_multiple_be(struct gpio_chip *gc, unsigned long *mask,
 	 * Mirror the result into the "bits" result, this will give line 0
 	 * in bit 0 ... line 31 in bit 31 for a 32bit register.
 	 */
-	bit = 0;
-	while ((bit = find_next_bit(&val, gc->ngpio, bit)) != gc->ngpio)
+	bit = -1;
+	while ((bit = find_next_bit(&val, gc->ngpio, bit + 1)) < gc->ngpio)
 		*bits |= bgpio_line2mask(gc, bit);
 
 	return 0;
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c
index 44332b7..14532d9 100644
--- a/drivers/gpio/gpiolib.c
+++ b/drivers/gpio/gpiolib.c
@@ -2893,6 +2893,27 @@ void gpiod_set_raw_value(struct gpio_desc *desc, int value)
 EXPORT_SYMBOL_GPL(gpiod_set_raw_value);
 
 /**
+ * gpiod_set_value_nocheck() - set a GPIO line value without checking
+ * @desc: the descriptor to set the value on
+ * @value: value to set
+ *
+ * This sets the value of a GPIO line backing a descriptor, applying
+ * different semantic quirks like active low and open drain/source
+ * handling.
+ */
+static void gpiod_set_value_nocheck(struct gpio_desc *desc, int value)
+{
+	if (test_bit(FLAG_ACTIVE_LOW, &desc->flags))
+		value = !value;
+	if (test_bit(FLAG_OPEN_DRAIN, &desc->flags))
+		gpio_set_open_drain_value_commit(desc, value);
+	else if (test_bit(FLAG_OPEN_SOURCE, &desc->flags))
+		gpio_set_open_source_value_commit(desc, value);
+	else
+		gpiod_set_raw_value_commit(desc, value);
+}
+
+/**
  * gpiod_set_value() - assign a gpio's value
  * @desc: gpio whose value will be assigned
  * @value: value to assign
@@ -2906,16 +2927,8 @@ EXPORT_SYMBOL_GPL(gpiod_set_raw_value);
 void gpiod_set_value(struct gpio_desc *desc, int value)
 {
 	VALIDATE_DESC_VOID(desc);
-	/* Should be using gpiod_set_value_cansleep() */
 	WARN_ON(desc->gdev->chip->can_sleep);
-	if (test_bit(FLAG_ACTIVE_LOW, &desc->flags))
-		value = !value;
-	if (test_bit(FLAG_OPEN_DRAIN, &desc->flags))
-		gpio_set_open_drain_value_commit(desc, value);
-	else if (test_bit(FLAG_OPEN_SOURCE, &desc->flags))
-		gpio_set_open_source_value_commit(desc, value);
-	else
-		gpiod_set_raw_value_commit(desc, value);
+	gpiod_set_value_nocheck(desc, value);
 }
 EXPORT_SYMBOL_GPL(gpiod_set_value);
 
@@ -3243,9 +3256,7 @@ void gpiod_set_value_cansleep(struct gpio_desc *desc, int value)
 {
 	might_sleep_if(extra_checks);
 	VALIDATE_DESC_VOID(desc);
-	if (test_bit(FLAG_ACTIVE_LOW, &desc->flags))
-		value = !value;
-	gpiod_set_raw_value_commit(desc, value);
+	gpiod_set_value_nocheck(desc, value);
 }
 EXPORT_SYMBOL_GPL(gpiod_set_value_cansleep);
 
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.h b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.h
index a9782b1..34daf89 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.h
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp.h
@@ -1360,7 +1360,7 @@ void dpp1_cm_set_output_csc_adjustment(
 
 void dpp1_cm_set_output_csc_default(
 		struct dpp *dpp_base,
-		const struct default_adjustment *default_adjust);
+		enum dc_color_space colorspace);
 
 void dpp1_cm_set_gamut_remap(
 	struct dpp *dpp,
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
index 40627c2..ed1216b 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_dpp_cm.c
@@ -225,14 +225,13 @@ void dpp1_cm_set_gamut_remap(
 
 void dpp1_cm_set_output_csc_default(
 		struct dpp *dpp_base,
-		const struct default_adjustment *default_adjust)
+		enum dc_color_space colorspace)
 {
 
 	struct dcn10_dpp *dpp = TO_DCN10_DPP(dpp_base);
 	uint32_t ocsc_mode = 0;
 
-	if (default_adjust != NULL) {
-		switch (default_adjust->out_color_space) {
+	switch (colorspace) {
 		case COLOR_SPACE_SRGB:
 		case COLOR_SPACE_2020_RGB_FULLRANGE:
 			ocsc_mode = 0;
@@ -253,7 +252,6 @@ void dpp1_cm_set_output_csc_default(
 		case COLOR_SPACE_UNKNOWN:
 		default:
 			break;
-		}
 	}
 
 	REG_SET(CM_OCSC_CONTROL, 0, CM_OCSC_MODE, ocsc_mode);
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
index 961ad5c..05dc01e 100644
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
+++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c
@@ -2097,6 +2097,8 @@ static void program_csc_matrix(struct pipe_ctx *pipe_ctx,
 			tbl_entry.color_space = color_space;
 			//tbl_entry.regval = matrix;
 			pipe_ctx->plane_res.dpp->funcs->opp_set_csc_adjustment(pipe_ctx->plane_res.dpp, &tbl_entry);
+	} else {
+		pipe_ctx->plane_res.dpp->funcs->opp_set_csc_default(pipe_ctx->plane_res.dpp, colorspace);
 	}
 }
 static bool is_lower_pipe_tree_visible(struct pipe_ctx *pipe_ctx)
diff --git a/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h b/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
index 83a6846..9420dfb 100644
--- a/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
+++ b/drivers/gpu/drm/amd/display/dc/inc/hw/dpp.h
@@ -64,7 +64,7 @@ struct dpp_funcs {
 
 	void (*opp_set_csc_default)(
 		struct dpp *dpp,
-		const struct default_adjustment *default_adjust);
+		enum dc_color_space colorspace);
 
 	void (*opp_set_csc_adjustment)(
 		struct dpp *dpp,
diff --git a/drivers/gpu/drm/armada/armada_crtc.c b/drivers/gpu/drm/armada/armada_crtc.c
index 2e065fa..a0f4d2a 100644
--- a/drivers/gpu/drm/armada/armada_crtc.c
+++ b/drivers/gpu/drm/armada/armada_crtc.c
@@ -168,16 +168,23 @@ static void armada_drm_crtc_update(struct armada_crtc *dcrtc)
 void armada_drm_plane_calc_addrs(u32 *addrs, struct drm_framebuffer *fb,
 	int x, int y)
 {
+	const struct drm_format_info *format = fb->format;
+	unsigned int num_planes = format->num_planes;
 	u32 addr = drm_fb_obj(fb)->dev_addr;
-	int num_planes = fb->format->num_planes;
 	int i;
 
 	if (num_planes > 3)
 		num_planes = 3;
 
-	for (i = 0; i < num_planes; i++)
+	addrs[0] = addr + fb->offsets[0] + y * fb->pitches[0] +
+		   x * format->cpp[0];
+
+	y /= format->vsub;
+	x /= format->hsub;
+
+	for (i = 1; i < num_planes; i++)
 		addrs[i] = addr + fb->offsets[i] + y * fb->pitches[i] +
-			     x * fb->format->cpp[i];
+			     x * format->cpp[i];
 	for (; i < 3; i++)
 		addrs[i] = 0;
 }
@@ -744,15 +751,14 @@ void armada_drm_crtc_plane_disable(struct armada_crtc *dcrtc,
 	if (plane->fb)
 		drm_framebuffer_put(plane->fb);
 
-	/* Power down the Y/U/V FIFOs */
-	sram_para1 = CFG_PDWN16x66 | CFG_PDWN32x66;
-
 	/* Power down most RAMs and FIFOs if this is the primary plane */
 	if (plane->type == DRM_PLANE_TYPE_PRIMARY) {
-		sram_para1 |= CFG_PDWN256x32 | CFG_PDWN256x24 | CFG_PDWN256x8 |
-			      CFG_PDWN32x32 | CFG_PDWN64x66;
+		sram_para1 = CFG_PDWN256x32 | CFG_PDWN256x24 | CFG_PDWN256x8 |
+			     CFG_PDWN32x32 | CFG_PDWN64x66;
 		dma_ctrl0_mask = CFG_GRA_ENA;
 	} else {
+		/* Power down the Y/U/V FIFOs */
+		sram_para1 = CFG_PDWN16x66 | CFG_PDWN32x66;
 		dma_ctrl0_mask = CFG_DMA_ENA;
 	}
 
@@ -1225,17 +1231,13 @@ static int armada_drm_crtc_create(struct drm_device *drm, struct device *dev,
 
 	ret = devm_request_irq(dev, irq, armada_drm_irq, 0, "armada_drm_crtc",
 			       dcrtc);
-	if (ret < 0) {
-		kfree(dcrtc);
-		return ret;
-	}
+	if (ret < 0)
+		goto err_crtc;
 
 	if (dcrtc->variant->init) {
 		ret = dcrtc->variant->init(dcrtc, dev);
-		if (ret) {
-			kfree(dcrtc);
-			return ret;
-		}
+		if (ret)
+			goto err_crtc;
 	}
 
 	/* Ensure AXI pipeline is enabled */
@@ -1246,13 +1248,15 @@ static int armada_drm_crtc_create(struct drm_device *drm, struct device *dev,
 	dcrtc->crtc.port = port;
 
 	primary = kzalloc(sizeof(*primary), GFP_KERNEL);
-	if (!primary)
-		return -ENOMEM;
+	if (!primary) {
+		ret = -ENOMEM;
+		goto err_crtc;
+	}
 
 	ret = armada_drm_plane_init(primary);
 	if (ret) {
 		kfree(primary);
-		return ret;
+		goto err_crtc;
 	}
 
 	ret = drm_universal_plane_init(drm, &primary->base, 0,
@@ -1263,7 +1267,7 @@ static int armada_drm_crtc_create(struct drm_device *drm, struct device *dev,
 				       DRM_PLANE_TYPE_PRIMARY, NULL);
 	if (ret) {
 		kfree(primary);
-		return ret;
+		goto err_crtc;
 	}
 
 	ret = drm_crtc_init_with_planes(drm, &dcrtc->crtc, &primary->base, NULL,
@@ -1282,6 +1286,9 @@ static int armada_drm_crtc_create(struct drm_device *drm, struct device *dev,
 
 err_crtc_init:
 	primary->base.funcs->destroy(&primary->base);
+err_crtc:
+	kfree(dcrtc);
+
 	return ret;
 }
 
diff --git a/drivers/gpu/drm/armada/armada_crtc.h b/drivers/gpu/drm/armada/armada_crtc.h
index bab11f4..bfd3514 100644
--- a/drivers/gpu/drm/armada/armada_crtc.h
+++ b/drivers/gpu/drm/armada/armada_crtc.h
@@ -42,6 +42,8 @@ struct armada_plane_work {
 };
 
 struct armada_plane_state {
+	u16 src_x;
+	u16 src_y;
 	u32 src_hw;
 	u32 dst_hw;
 	u32 dst_yx;
diff --git a/drivers/gpu/drm/armada/armada_overlay.c b/drivers/gpu/drm/armada/armada_overlay.c
index b411b60..aba9476 100644
--- a/drivers/gpu/drm/armada/armada_overlay.c
+++ b/drivers/gpu/drm/armada/armada_overlay.c
@@ -99,6 +99,7 @@ armada_ovl_plane_update(struct drm_plane *plane, struct drm_crtc *crtc,
 {
 	struct armada_ovl_plane *dplane = drm_to_armada_ovl_plane(plane);
 	struct armada_crtc *dcrtc = drm_to_armada_crtc(crtc);
+	const struct drm_format_info *format;
 	struct drm_rect src = {
 		.x1 = src_x,
 		.y1 = src_y,
@@ -117,7 +118,7 @@ armada_ovl_plane_update(struct drm_plane *plane, struct drm_crtc *crtc,
 	};
 	uint32_t val, ctrl0;
 	unsigned idx = 0;
-	bool visible;
+	bool visible, fb_changed;
 	int ret;
 
 	trace_armada_ovl_plane_update(plane, crtc, fb,
@@ -138,6 +139,18 @@ armada_ovl_plane_update(struct drm_plane *plane, struct drm_crtc *crtc,
 	if (!visible)
 		ctrl0 &= ~CFG_DMA_ENA;
 
+	/*
+	 * Shifting a YUV packed format image by one pixel causes the U/V
+	 * planes to swap.  Compensate for it by also toggling the UV swap.
+	 */
+	format = fb->format;
+	if (format->num_planes == 1 && src.x1 >> 16 & (format->hsub - 1))
+		ctrl0 ^= CFG_DMA_MOD(CFG_SWAPUV);
+
+	fb_changed = plane->fb != fb ||
+		     dplane->base.state.src_x != src.x1 >> 16 ||
+	             dplane->base.state.src_y != src.y1 >> 16;
+
 	if (!dcrtc->plane) {
 		dcrtc->plane = plane;
 		armada_ovl_update_attr(&dplane->prop, dcrtc);
@@ -145,7 +158,7 @@ armada_ovl_plane_update(struct drm_plane *plane, struct drm_crtc *crtc,
 
 	/* FIXME: overlay on an interlaced display */
 	/* Just updating the position/size? */
-	if (plane->fb == fb && dplane->base.state.ctrl0 == ctrl0) {
+	if (!fb_changed && dplane->base.state.ctrl0 == ctrl0) {
 		val = (drm_rect_height(&src) & 0xffff0000) |
 		      drm_rect_width(&src) >> 16;
 		dplane->base.state.src_hw = val;
@@ -169,9 +182,8 @@ armada_ovl_plane_update(struct drm_plane *plane, struct drm_crtc *crtc,
 	if (armada_drm_plane_work_wait(&dplane->base, HZ / 25) == 0)
 		armada_drm_plane_work_cancel(dcrtc, &dplane->base);
 
-	if (plane->fb != fb) {
-		u32 addrs[3], pixel_format;
-		int num_planes, hsub;
+	if (fb_changed) {
+		u32 addrs[3];
 
 		/*
 		 * Take a reference on the new framebuffer - we want to
@@ -182,23 +194,11 @@ armada_ovl_plane_update(struct drm_plane *plane, struct drm_crtc *crtc,
 		if (plane->fb)
 			armada_ovl_retire_fb(dplane, plane->fb);
 
-		src_y = src.y1 >> 16;
-		src_x = src.x1 >> 16;
+		dplane->base.state.src_y = src_y = src.y1 >> 16;
+		dplane->base.state.src_x = src_x = src.x1 >> 16;
 
 		armada_drm_plane_calc_addrs(addrs, fb, src_x, src_y);
 
-		pixel_format = fb->format->format;
-		hsub = drm_format_horz_chroma_subsampling(pixel_format);
-		num_planes = fb->format->num_planes;
-
-		/*
-		 * Annoyingly, shifting a YUYV-format image by one pixel
-		 * causes the U/V planes to toggle.  Toggle the UV swap.
-		 * (Unfortunately, this causes momentary colour flickering.)
-		 */
-		if (src_x & (hsub - 1) && num_planes == 1)
-			ctrl0 ^= CFG_DMA_MOD(CFG_SWAPUV);
-
 		armada_reg_queue_set(dplane->vbl.regs, idx, addrs[0],
 				     LCD_SPU_DMA_START_ADDR_Y0);
 		armada_reg_queue_set(dplane->vbl.regs, idx, addrs[1],
diff --git a/drivers/gpu/drm/i915/gvt/cmd_parser.c b/drivers/gpu/drm/i915/gvt/cmd_parser.c
index 85d4c57..49af946 100644
--- a/drivers/gpu/drm/i915/gvt/cmd_parser.c
+++ b/drivers/gpu/drm/i915/gvt/cmd_parser.c
@@ -2777,12 +2777,12 @@ int intel_gvt_scan_and_shadow_wa_ctx(struct intel_shadow_wa_ctx *wa_ctx)
 }
 
 static struct cmd_info *find_cmd_entry_any_ring(struct intel_gvt *gvt,
-		unsigned int opcode, int rings)
+		unsigned int opcode, unsigned long rings)
 {
 	struct cmd_info *info = NULL;
 	unsigned int ring;
 
-	for_each_set_bit(ring, (unsigned long *)&rings, I915_NUM_ENGINES) {
+	for_each_set_bit(ring, &rings, I915_NUM_ENGINES) {
 		info = find_cmd_entry(gvt, opcode, ring);
 		if (info)
 			break;
diff --git a/drivers/gpu/drm/i915/gvt/gtt.c b/drivers/gpu/drm/i915/gvt/gtt.c
index 8e33114..64d67ff 100644
--- a/drivers/gpu/drm/i915/gvt/gtt.c
+++ b/drivers/gpu/drm/i915/gvt/gtt.c
@@ -1359,12 +1359,15 @@ static int ppgtt_handle_guest_write_page_table_bytes(void *gp,
 			return ret;
 	} else {
 		if (!test_bit(index, spt->post_shadow_bitmap)) {
+			int type = spt->shadow_page.type;
+
 			ppgtt_get_shadow_entry(spt, &se, index);
 			ret = ppgtt_handle_guest_entry_removal(gpt, &se, index);
 			if (ret)
 				return ret;
+			ops->set_pfn(&se, vgpu->gtt.scratch_pt[type].page_mfn);
+			ppgtt_set_shadow_entry(spt, &se, index);
 		}
-
 		ppgtt_set_post_shadow(spt, index);
 	}
 
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 54b5d4c..e143004 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -2368,6 +2368,9 @@ struct drm_i915_private {
 	 */
 	struct workqueue_struct *wq;
 
+	/* ordered wq for modesets */
+	struct workqueue_struct *modeset_wq;
+
 	/* Display functions */
 	struct drm_i915_display_funcs display;
 
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index 18de656..5cfba89 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -467,7 +467,7 @@ static void __fence_set_priority(struct dma_fence *fence, int prio)
 	struct drm_i915_gem_request *rq;
 	struct intel_engine_cs *engine;
 
-	if (!dma_fence_is_i915(fence))
+	if (dma_fence_is_signaled(fence) || !dma_fence_is_i915(fence))
 		return;
 
 	rq = to_request(fence);
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index 3866c49..7923dfd 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -6977,6 +6977,7 @@ enum {
 #define  RESET_PCH_HANDSHAKE_ENABLE	(1<<4)
 
 #define GEN8_CHICKEN_DCPR_1		_MMIO(0x46430)
+#define   SKL_SELECT_ALTERNATE_DC_EXIT	(1<<30)
 #define   MASK_WAKEMEM			(1<<13)
 
 #define SKL_DFSM			_MMIO(0x51000)
@@ -7026,6 +7027,8 @@ enum {
 #define GEN9_SLICE_COMMON_ECO_CHICKEN0		_MMIO(0x7308)
 #define  DISABLE_PIXEL_MASK_CAMMING		(1<<14)
 
+#define GEN9_SLICE_COMMON_ECO_CHICKEN1		_MMIO(0x731c)
+
 #define GEN7_L3SQCREG1				_MMIO(0xB010)
 #define  VLV_B0_WA_L3SQCREG1_VALUE		0x00D30000
 
@@ -8522,6 +8525,7 @@ enum skl_power_gate {
 #define  BXT_CDCLK_CD2X_DIV_SEL_2	(2<<22)
 #define  BXT_CDCLK_CD2X_DIV_SEL_4	(3<<22)
 #define  BXT_CDCLK_CD2X_PIPE(pipe)	((pipe)<<20)
+#define  CDCLK_DIVMUX_CD_OVERRIDE	(1<<19)
 #define  BXT_CDCLK_CD2X_PIPE_NONE	BXT_CDCLK_CD2X_PIPE(3)
 #define  BXT_CDCLK_SSA_PRECHARGE_ENABLE	(1<<16)
 #define  CDCLK_FREQ_DECIMAL_MASK	(0x7ff)
diff --git a/drivers/gpu/drm/i915/intel_cdclk.c b/drivers/gpu/drm/i915/intel_cdclk.c
index b2a6d62..60cf4e5 100644
--- a/drivers/gpu/drm/i915/intel_cdclk.c
+++ b/drivers/gpu/drm/i915/intel_cdclk.c
@@ -860,16 +860,10 @@ static void skl_set_preferred_cdclk_vco(struct drm_i915_private *dev_priv,
 
 static void skl_dpll0_enable(struct drm_i915_private *dev_priv, int vco)
 {
-	int min_cdclk = skl_calc_cdclk(0, vco);
 	u32 val;
 
 	WARN_ON(vco != 8100000 && vco != 8640000);
 
-	/* select the minimum CDCLK before enabling DPLL 0 */
-	val = CDCLK_FREQ_337_308 | skl_cdclk_decimal(min_cdclk);
-	I915_WRITE(CDCLK_CTL, val);
-	POSTING_READ(CDCLK_CTL);
-
 	/*
 	 * We always enable DPLL0 with the lowest link rate possible, but still
 	 * taking into account the VCO required to operate the eDP panel at the
@@ -923,7 +917,7 @@ static void skl_set_cdclk(struct drm_i915_private *dev_priv,
 {
 	int cdclk = cdclk_state->cdclk;
 	int vco = cdclk_state->vco;
-	u32 freq_select, pcu_ack;
+	u32 freq_select, pcu_ack, cdclk_ctl;
 	int ret;
 
 	WARN_ON((cdclk == 24000) != (vco == 0));
@@ -940,7 +934,7 @@ static void skl_set_cdclk(struct drm_i915_private *dev_priv,
 		return;
 	}
 
-	/* set CDCLK_CTL */
+	/* Choose frequency for this cdclk */
 	switch (cdclk) {
 	case 450000:
 	case 432000:
@@ -968,10 +962,33 @@ static void skl_set_cdclk(struct drm_i915_private *dev_priv,
 	    dev_priv->cdclk.hw.vco != vco)
 		skl_dpll0_disable(dev_priv);
 
+	cdclk_ctl = I915_READ(CDCLK_CTL);
+
+	if (dev_priv->cdclk.hw.vco != vco) {
+		/* Wa Display #1183: skl,kbl,cfl */
+		cdclk_ctl &= ~(CDCLK_FREQ_SEL_MASK | CDCLK_FREQ_DECIMAL_MASK);
+		cdclk_ctl |= freq_select | skl_cdclk_decimal(cdclk);
+		I915_WRITE(CDCLK_CTL, cdclk_ctl);
+	}
+
+	/* Wa Display #1183: skl,kbl,cfl */
+	cdclk_ctl |= CDCLK_DIVMUX_CD_OVERRIDE;
+	I915_WRITE(CDCLK_CTL, cdclk_ctl);
+	POSTING_READ(CDCLK_CTL);
+
 	if (dev_priv->cdclk.hw.vco != vco)
 		skl_dpll0_enable(dev_priv, vco);
 
-	I915_WRITE(CDCLK_CTL, freq_select | skl_cdclk_decimal(cdclk));
+	/* Wa Display #1183: skl,kbl,cfl */
+	cdclk_ctl &= ~(CDCLK_FREQ_SEL_MASK | CDCLK_FREQ_DECIMAL_MASK);
+	I915_WRITE(CDCLK_CTL, cdclk_ctl);
+
+	cdclk_ctl |= freq_select | skl_cdclk_decimal(cdclk);
+	I915_WRITE(CDCLK_CTL, cdclk_ctl);
+
+	/* Wa Display #1183: skl,kbl,cfl */
+	cdclk_ctl &= ~CDCLK_DIVMUX_CD_OVERRIDE;
+	I915_WRITE(CDCLK_CTL, cdclk_ctl);
 	POSTING_READ(CDCLK_CTL);
 
 	/* inform PCU of the change */
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c
index 30cf273..50f8443 100644
--- a/drivers/gpu/drm/i915/intel_display.c
+++ b/drivers/gpu/drm/i915/intel_display.c
@@ -1211,23 +1211,6 @@ void assert_panel_unlocked(struct drm_i915_private *dev_priv, enum pipe pipe)
 	     pipe_name(pipe));
 }
 
-static void assert_cursor(struct drm_i915_private *dev_priv,
-			  enum pipe pipe, bool state)
-{
-	bool cur_state;
-
-	if (IS_I845G(dev_priv) || IS_I865G(dev_priv))
-		cur_state = I915_READ(CURCNTR(PIPE_A)) & CURSOR_ENABLE;
-	else
-		cur_state = I915_READ(CURCNTR(pipe)) & CURSOR_MODE;
-
-	I915_STATE_WARN(cur_state != state,
-	     "cursor on pipe %c assertion failure (expected %s, current %s)\n",
-			pipe_name(pipe), onoff(state), onoff(cur_state));
-}
-#define assert_cursor_enabled(d, p) assert_cursor(d, p, true)
-#define assert_cursor_disabled(d, p) assert_cursor(d, p, false)
-
 void assert_pipe(struct drm_i915_private *dev_priv,
 		 enum pipe pipe, bool state)
 {
@@ -1255,77 +1238,25 @@ void assert_pipe(struct drm_i915_private *dev_priv,
 			pipe_name(pipe), onoff(state), onoff(cur_state));
 }
 
-static void assert_plane(struct drm_i915_private *dev_priv,
-			 enum plane plane, bool state)
+static void assert_plane(struct intel_plane *plane, bool state)
 {
-	u32 val;
-	bool cur_state;
+	bool cur_state = plane->get_hw_state(plane);
 
-	val = I915_READ(DSPCNTR(plane));
-	cur_state = !!(val & DISPLAY_PLANE_ENABLE);
 	I915_STATE_WARN(cur_state != state,
-	     "plane %c assertion failure (expected %s, current %s)\n",
-			plane_name(plane), onoff(state), onoff(cur_state));
+			"%s assertion failure (expected %s, current %s)\n",
+			plane->base.name, onoff(state), onoff(cur_state));
 }
 
-#define assert_plane_enabled(d, p) assert_plane(d, p, true)
-#define assert_plane_disabled(d, p) assert_plane(d, p, false)
+#define assert_plane_enabled(p) assert_plane(p, true)
+#define assert_plane_disabled(p) assert_plane(p, false)
 
-static void assert_planes_disabled(struct drm_i915_private *dev_priv,
-				   enum pipe pipe)
+static void assert_planes_disabled(struct intel_crtc *crtc)
 {
-	int i;
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	struct intel_plane *plane;
 
-	/* Primary planes are fixed to pipes on gen4+ */
-	if (INTEL_GEN(dev_priv) >= 4) {
-		u32 val = I915_READ(DSPCNTR(pipe));
-		I915_STATE_WARN(val & DISPLAY_PLANE_ENABLE,
-		     "plane %c assertion failure, should be disabled but not\n",
-		     plane_name(pipe));
-		return;
-	}
-
-	/* Need to check both planes against the pipe */
-	for_each_pipe(dev_priv, i) {
-		u32 val = I915_READ(DSPCNTR(i));
-		enum pipe cur_pipe = (val & DISPPLANE_SEL_PIPE_MASK) >>
-			DISPPLANE_SEL_PIPE_SHIFT;
-		I915_STATE_WARN((val & DISPLAY_PLANE_ENABLE) && pipe == cur_pipe,
-		     "plane %c assertion failure, should be off on pipe %c but is still active\n",
-		     plane_name(i), pipe_name(pipe));
-	}
-}
-
-static void assert_sprites_disabled(struct drm_i915_private *dev_priv,
-				    enum pipe pipe)
-{
-	int sprite;
-
-	if (INTEL_GEN(dev_priv) >= 9) {
-		for_each_sprite(dev_priv, pipe, sprite) {
-			u32 val = I915_READ(PLANE_CTL(pipe, sprite));
-			I915_STATE_WARN(val & PLANE_CTL_ENABLE,
-			     "plane %d assertion failure, should be off on pipe %c but is still active\n",
-			     sprite, pipe_name(pipe));
-		}
-	} else if (IS_VALLEYVIEW(dev_priv) || IS_CHERRYVIEW(dev_priv)) {
-		for_each_sprite(dev_priv, pipe, sprite) {
-			u32 val = I915_READ(SPCNTR(pipe, PLANE_SPRITE0 + sprite));
-			I915_STATE_WARN(val & SP_ENABLE,
-			     "sprite %c assertion failure, should be off on pipe %c but is still active\n",
-			     sprite_name(pipe, sprite), pipe_name(pipe));
-		}
-	} else if (INTEL_GEN(dev_priv) >= 7) {
-		u32 val = I915_READ(SPRCTL(pipe));
-		I915_STATE_WARN(val & SPRITE_ENABLE,
-		     "sprite %c assertion failure, should be off on pipe %c but is still active\n",
-		     plane_name(pipe), pipe_name(pipe));
-	} else if (INTEL_GEN(dev_priv) >= 5 || IS_G4X(dev_priv)) {
-		u32 val = I915_READ(DVSCNTR(pipe));
-		I915_STATE_WARN(val & DVS_ENABLE,
-		     "sprite %c assertion failure, should be off on pipe %c but is still active\n",
-		     plane_name(pipe), pipe_name(pipe));
-	}
+	for_each_intel_plane_on_crtc(&dev_priv->drm, crtc, plane)
+		assert_plane_disabled(plane);
 }
 
 static void assert_vblank_disabled(struct drm_crtc *crtc)
@@ -1918,9 +1849,7 @@ static void intel_enable_pipe(struct intel_crtc *crtc)
 
 	DRM_DEBUG_KMS("enabling pipe %c\n", pipe_name(pipe));
 
-	assert_planes_disabled(dev_priv, pipe);
-	assert_cursor_disabled(dev_priv, pipe);
-	assert_sprites_disabled(dev_priv, pipe);
+	assert_planes_disabled(crtc);
 
 	/*
 	 * A pipe without a PLL won't actually be able to drive bits from
@@ -1989,9 +1918,7 @@ static void intel_disable_pipe(struct intel_crtc *crtc)
 	 * Make sure planes won't keep trying to pump pixels to us,
 	 * or we might hang the display.
 	 */
-	assert_planes_disabled(dev_priv, pipe);
-	assert_cursor_disabled(dev_priv, pipe);
-	assert_sprites_disabled(dev_priv, pipe);
+	assert_planes_disabled(crtc);
 
 	reg = PIPECONF(cpu_transcoder);
 	val = I915_READ(reg);
@@ -2820,6 +2747,23 @@ intel_set_plane_visible(struct intel_crtc_state *crtc_state,
 		      crtc_state->active_planes);
 }
 
+static void intel_plane_disable_noatomic(struct intel_crtc *crtc,
+					 struct intel_plane *plane)
+{
+	struct intel_crtc_state *crtc_state =
+		to_intel_crtc_state(crtc->base.state);
+	struct intel_plane_state *plane_state =
+		to_intel_plane_state(plane->base.state);
+
+	intel_set_plane_visible(crtc_state, plane_state, false);
+
+	if (plane->id == PLANE_PRIMARY)
+		intel_pre_disable_primary_noatomic(&crtc->base);
+
+	trace_intel_disable_plane(&plane->base, crtc);
+	plane->disable_plane(plane, crtc);
+}
+
 static void
 intel_find_initial_plane_obj(struct intel_crtc *intel_crtc,
 			     struct intel_initial_plane_config *plane_config)
@@ -2877,12 +2821,7 @@ intel_find_initial_plane_obj(struct intel_crtc *intel_crtc,
 	 * simplest solution is to just disable the primary plane now and
 	 * pretend the BIOS never had it enabled.
 	 */
-	intel_set_plane_visible(to_intel_crtc_state(crtc_state),
-				to_intel_plane_state(plane_state),
-				false);
-	intel_pre_disable_primary_noatomic(&intel_crtc->base);
-	trace_intel_disable_plane(primary, intel_crtc);
-	intel_plane->disable_plane(intel_plane, intel_crtc);
+	intel_plane_disable_noatomic(intel_crtc, intel_plane);
 
 	return;
 
@@ -3385,6 +3324,31 @@ static void i9xx_disable_primary_plane(struct intel_plane *primary,
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
+static bool i9xx_plane_get_hw_state(struct intel_plane *primary)
+{
+
+	struct drm_i915_private *dev_priv = to_i915(primary->base.dev);
+	enum intel_display_power_domain power_domain;
+	enum plane plane = primary->plane;
+	enum pipe pipe = primary->pipe;
+	bool ret;
+
+	/*
+	 * Not 100% correct for planes that can move between pipes,
+	 * but that's only the case for gen2-4 which don't have any
+	 * display power wells.
+	 */
+	power_domain = POWER_DOMAIN_PIPE(pipe);
+	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+		return false;
+
+	ret = I915_READ(DSPCNTR(plane)) & DISPLAY_PLANE_ENABLE;
+
+	intel_display_power_put(dev_priv, power_domain);
+
+	return ret;
+}
+
 static u32
 intel_fb_stride_alignment(const struct drm_framebuffer *fb, int plane)
 {
@@ -4866,7 +4830,8 @@ void hsw_enable_ips(struct intel_crtc *crtc)
 	 * a vblank wait.
 	 */
 
-	assert_plane_enabled(dev_priv, crtc->plane);
+	assert_plane_enabled(to_intel_plane(crtc->base.primary));
+
 	if (IS_BROADWELL(dev_priv)) {
 		mutex_lock(&dev_priv->pcu_lock);
 		WARN_ON(sandybridge_pcode_write(dev_priv, DISPLAY_IPS_CONTROL,
@@ -4899,7 +4864,8 @@ void hsw_disable_ips(struct intel_crtc *crtc)
 	if (!crtc->config->ips_enabled)
 		return;
 
-	assert_plane_enabled(dev_priv, crtc->plane);
+	assert_plane_enabled(to_intel_plane(crtc->base.primary));
+
 	if (IS_BROADWELL(dev_priv)) {
 		mutex_lock(&dev_priv->pcu_lock);
 		WARN_ON(sandybridge_pcode_write(dev_priv, DISPLAY_IPS_CONTROL, 0));
@@ -5899,6 +5865,7 @@ static void intel_crtc_disable_noatomic(struct drm_crtc *crtc,
 	struct intel_crtc *intel_crtc = to_intel_crtc(crtc);
 	struct drm_i915_private *dev_priv = to_i915(crtc->dev);
 	enum intel_display_power_domain domain;
+	struct intel_plane *plane;
 	u64 domains;
 	struct drm_atomic_state *state;
 	struct intel_crtc_state *crtc_state;
@@ -5907,11 +5874,12 @@ static void intel_crtc_disable_noatomic(struct drm_crtc *crtc,
 	if (!intel_crtc->active)
 		return;
 
-	if (crtc->primary->state->visible) {
-		intel_pre_disable_primary_noatomic(crtc);
+	for_each_intel_plane_on_crtc(&dev_priv->drm, intel_crtc, plane) {
+		const struct intel_plane_state *plane_state =
+			to_intel_plane_state(plane->base.state);
 
-		intel_crtc_disable_planes(crtc, 1 << drm_plane_index(crtc->primary));
-		crtc->primary->state->visible = false;
+		if (plane_state->base.visible)
+			intel_plane_disable_noatomic(intel_crtc, plane);
 	}
 
 	state = drm_atomic_state_alloc(crtc->dev);
@@ -9477,6 +9445,23 @@ static void i845_disable_cursor(struct intel_plane *plane,
 	i845_update_cursor(plane, NULL, NULL);
 }
 
+static bool i845_cursor_get_hw_state(struct intel_plane *plane)
+{
+	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
+	enum intel_display_power_domain power_domain;
+	bool ret;
+
+	power_domain = POWER_DOMAIN_PIPE(PIPE_A);
+	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+		return false;
+
+	ret = I915_READ(CURCNTR(PIPE_A)) & CURSOR_ENABLE;
+
+	intel_display_power_put(dev_priv, power_domain);
+
+	return ret;
+}
+
 static u32 i9xx_cursor_ctl(const struct intel_crtc_state *crtc_state,
 			   const struct intel_plane_state *plane_state)
 {
@@ -9670,6 +9655,28 @@ static void i9xx_disable_cursor(struct intel_plane *plane,
 	i9xx_update_cursor(plane, NULL, NULL);
 }
 
+static bool i9xx_cursor_get_hw_state(struct intel_plane *plane)
+{
+	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
+	enum intel_display_power_domain power_domain;
+	enum pipe pipe = plane->pipe;
+	bool ret;
+
+	/*
+	 * Not 100% correct for planes that can move between pipes,
+	 * but that's only the case for gen2-3 which don't have any
+	 * display power wells.
+	 */
+	power_domain = POWER_DOMAIN_PIPE(pipe);
+	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+		return false;
+
+	ret = I915_READ(CURCNTR(pipe)) & CURSOR_MODE;
+
+	intel_display_power_put(dev_priv, power_domain);
+
+	return ret;
+}
 
 /* VESA 640x480x72Hz mode to set on the pipe */
 static const struct drm_display_mode load_detect_mode = {
@@ -12544,11 +12551,15 @@ static int intel_atomic_commit(struct drm_device *dev,
 	INIT_WORK(&state->commit_work, intel_atomic_commit_work);
 
 	i915_sw_fence_commit(&intel_state->commit_ready);
-	if (nonblock)
+	if (nonblock && intel_state->modeset) {
+		queue_work(dev_priv->modeset_wq, &state->commit_work);
+	} else if (nonblock) {
 		queue_work(system_unbound_wq, &state->commit_work);
-	else
+	} else {
+		if (intel_state->modeset)
+			flush_workqueue(dev_priv->modeset_wq);
 		intel_atomic_commit_tail(state);
-
+	}
 
 	return 0;
 }
@@ -13201,6 +13212,7 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
 
 		primary->update_plane = skl_update_plane;
 		primary->disable_plane = skl_disable_plane;
+		primary->get_hw_state = skl_plane_get_hw_state;
 	} else if (INTEL_GEN(dev_priv) >= 9) {
 		intel_primary_formats = skl_primary_formats;
 		num_formats = ARRAY_SIZE(skl_primary_formats);
@@ -13211,6 +13223,7 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
 
 		primary->update_plane = skl_update_plane;
 		primary->disable_plane = skl_disable_plane;
+		primary->get_hw_state = skl_plane_get_hw_state;
 	} else if (INTEL_GEN(dev_priv) >= 4) {
 		intel_primary_formats = i965_primary_formats;
 		num_formats = ARRAY_SIZE(i965_primary_formats);
@@ -13218,6 +13231,7 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
 
 		primary->update_plane = i9xx_update_primary_plane;
 		primary->disable_plane = i9xx_disable_primary_plane;
+		primary->get_hw_state = i9xx_plane_get_hw_state;
 	} else {
 		intel_primary_formats = i8xx_primary_formats;
 		num_formats = ARRAY_SIZE(i8xx_primary_formats);
@@ -13225,6 +13239,7 @@ intel_primary_plane_create(struct drm_i915_private *dev_priv, enum pipe pipe)
 
 		primary->update_plane = i9xx_update_primary_plane;
 		primary->disable_plane = i9xx_disable_primary_plane;
+		primary->get_hw_state = i9xx_plane_get_hw_state;
 	}
 
 	if (INTEL_GEN(dev_priv) >= 9)
@@ -13314,10 +13329,12 @@ intel_cursor_plane_create(struct drm_i915_private *dev_priv,
 	if (IS_I845G(dev_priv) || IS_I865G(dev_priv)) {
 		cursor->update_plane = i845_update_cursor;
 		cursor->disable_plane = i845_disable_cursor;
+		cursor->get_hw_state = i845_cursor_get_hw_state;
 		cursor->check_plane = i845_check_cursor;
 	} else {
 		cursor->update_plane = i9xx_update_cursor;
 		cursor->disable_plane = i9xx_disable_cursor;
+		cursor->get_hw_state = i9xx_cursor_get_hw_state;
 		cursor->check_plane = i9xx_check_cursor;
 	}
 
@@ -14462,6 +14479,8 @@ int intel_modeset_init(struct drm_device *dev)
 	enum pipe pipe;
 	struct intel_crtc *crtc;
 
+	dev_priv->modeset_wq = alloc_ordered_workqueue("i915_modeset", 0);
+
 	drm_mode_config_init(dev);
 
 	dev->mode_config.min_width = 0;
@@ -14665,8 +14684,11 @@ void i830_disable_pipe(struct drm_i915_private *dev_priv, enum pipe pipe)
 	DRM_DEBUG_KMS("disabling pipe %c due to force quirk\n",
 		      pipe_name(pipe));
 
-	assert_plane_disabled(dev_priv, PLANE_A);
-	assert_plane_disabled(dev_priv, PLANE_B);
+	WARN_ON(I915_READ(DSPCNTR(PLANE_A)) & DISPLAY_PLANE_ENABLE);
+	WARN_ON(I915_READ(DSPCNTR(PLANE_B)) & DISPLAY_PLANE_ENABLE);
+	WARN_ON(I915_READ(DSPCNTR(PLANE_C)) & DISPLAY_PLANE_ENABLE);
+	WARN_ON(I915_READ(CURCNTR(PIPE_A)) & CURSOR_MODE);
+	WARN_ON(I915_READ(CURCNTR(PIPE_B)) & CURSOR_MODE);
 
 	I915_WRITE(PIPECONF(pipe), 0);
 	POSTING_READ(PIPECONF(pipe));
@@ -14677,22 +14699,36 @@ void i830_disable_pipe(struct drm_i915_private *dev_priv, enum pipe pipe)
 	POSTING_READ(DPLL(pipe));
 }
 
-static bool
-intel_check_plane_mapping(struct intel_crtc *crtc)
+static bool intel_plane_mapping_ok(struct intel_crtc *crtc,
+				   struct intel_plane *primary)
 {
 	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
-	u32 val;
+	enum plane plane = primary->plane;
+	u32 val = I915_READ(DSPCNTR(plane));
 
-	if (INTEL_INFO(dev_priv)->num_pipes == 1)
-		return true;
+	return (val & DISPLAY_PLANE_ENABLE) == 0 ||
+		(val & DISPPLANE_SEL_PIPE_MASK) == DISPPLANE_SEL_PIPE(crtc->pipe);
+}
 
-	val = I915_READ(DSPCNTR(!crtc->plane));
+static void
+intel_sanitize_plane_mapping(struct drm_i915_private *dev_priv)
+{
+	struct intel_crtc *crtc;
 
-	if ((val & DISPLAY_PLANE_ENABLE) &&
-	    (!!(val & DISPPLANE_SEL_PIPE_MASK) == crtc->pipe))
-		return false;
+	if (INTEL_GEN(dev_priv) >= 4)
+		return;
 
-	return true;
+	for_each_intel_crtc(&dev_priv->drm, crtc) {
+		struct intel_plane *plane =
+			to_intel_plane(crtc->base.primary);
+
+		if (intel_plane_mapping_ok(crtc, plane))
+			continue;
+
+		DRM_DEBUG_KMS("%s attached to the wrong pipe, disabling plane\n",
+			      plane->base.name);
+		intel_plane_disable_noatomic(crtc, plane);
+	}
 }
 
 static bool intel_crtc_has_encoders(struct intel_crtc *crtc)
@@ -14748,33 +14784,15 @@ static void intel_sanitize_crtc(struct intel_crtc *crtc,
 
 		/* Disable everything but the primary plane */
 		for_each_intel_plane_on_crtc(dev, crtc, plane) {
-			if (plane->base.type == DRM_PLANE_TYPE_PRIMARY)
-				continue;
+			const struct intel_plane_state *plane_state =
+				to_intel_plane_state(plane->base.state);
 
-			trace_intel_disable_plane(&plane->base, crtc);
-			plane->disable_plane(plane, crtc);
+			if (plane_state->base.visible &&
+			    plane->base.type != DRM_PLANE_TYPE_PRIMARY)
+				intel_plane_disable_noatomic(crtc, plane);
 		}
 	}
 
-	/* We need to sanitize the plane -> pipe mapping first because this will
-	 * disable the crtc (and hence change the state) if it is wrong. Note
-	 * that gen4+ has a fixed plane -> pipe mapping.  */
-	if (INTEL_GEN(dev_priv) < 4 && !intel_check_plane_mapping(crtc)) {
-		bool plane;
-
-		DRM_DEBUG_KMS("[CRTC:%d:%s] wrong plane connection detected!\n",
-			      crtc->base.base.id, crtc->base.name);
-
-		/* Pipe has the wrong plane attached and the plane is active.
-		 * Temporarily change the plane mapping and disable everything
-		 * ...  */
-		plane = crtc->plane;
-		crtc->base.primary->state->visible = true;
-		crtc->plane = !plane;
-		intel_crtc_disable_noatomic(&crtc->base, ctx);
-		crtc->plane = plane;
-	}
-
 	/* Adjust the state of the output pipe according to whether we
 	 * have active connectors/encoders. */
 	if (crtc->active && !intel_crtc_has_encoders(crtc))
@@ -14879,24 +14897,21 @@ void i915_redisable_vga(struct drm_i915_private *dev_priv)
 	intel_display_power_put(dev_priv, POWER_DOMAIN_VGA);
 }
 
-static bool primary_get_hw_state(struct intel_plane *plane)
-{
-	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
-
-	return I915_READ(DSPCNTR(plane->plane)) & DISPLAY_PLANE_ENABLE;
-}
-
 /* FIXME read out full plane state for all planes */
 static void readout_plane_state(struct intel_crtc *crtc)
 {
-	struct intel_plane *primary = to_intel_plane(crtc->base.primary);
-	bool visible;
+	struct drm_i915_private *dev_priv = to_i915(crtc->base.dev);
+	struct intel_crtc_state *crtc_state =
+		to_intel_crtc_state(crtc->base.state);
+	struct intel_plane *plane;
 
-	visible = crtc->active && primary_get_hw_state(primary);
+	for_each_intel_plane_on_crtc(&dev_priv->drm, crtc, plane) {
+		struct intel_plane_state *plane_state =
+			to_intel_plane_state(plane->base.state);
+		bool visible = plane->get_hw_state(plane);
 
-	intel_set_plane_visible(to_intel_crtc_state(crtc->base.state),
-				to_intel_plane_state(primary->base.state),
-				visible);
+		intel_set_plane_visible(crtc_state, plane_state, visible);
+	}
 }
 
 static void intel_modeset_readout_hw_state(struct drm_device *dev)
@@ -15094,6 +15109,8 @@ intel_modeset_setup_hw_state(struct drm_device *dev,
 	/* HW state is read out, now we need to sanitize this mess. */
 	get_encoder_power_domains(dev_priv);
 
+	intel_sanitize_plane_mapping(dev_priv);
+
 	for_each_intel_encoder(dev, encoder) {
 		intel_sanitize_encoder(encoder);
 	}
@@ -15270,6 +15287,8 @@ void intel_modeset_cleanup(struct drm_device *dev)
 	intel_cleanup_gt_powersave(dev_priv);
 
 	intel_teardown_gmbus(dev_priv);
+
+	destroy_workqueue(dev_priv->modeset_wq);
 }
 
 void intel_connector_attach_encoder(struct intel_connector *connector,
diff --git a/drivers/gpu/drm/i915/intel_drv.h b/drivers/gpu/drm/i915/intel_drv.h
index 6c7f8bc..5d77f75 100644
--- a/drivers/gpu/drm/i915/intel_drv.h
+++ b/drivers/gpu/drm/i915/intel_drv.h
@@ -862,6 +862,7 @@ struct intel_plane {
 			     const struct intel_plane_state *plane_state);
 	void (*disable_plane)(struct intel_plane *plane,
 			      struct intel_crtc *crtc);
+	bool (*get_hw_state)(struct intel_plane *plane);
 	int (*check_plane)(struct intel_plane *plane,
 			   struct intel_crtc_state *crtc_state,
 			   struct intel_plane_state *state);
@@ -1924,6 +1925,7 @@ void skl_update_plane(struct intel_plane *plane,
 		      const struct intel_crtc_state *crtc_state,
 		      const struct intel_plane_state *plane_state);
 void skl_disable_plane(struct intel_plane *plane, struct intel_crtc *crtc);
+bool skl_plane_get_hw_state(struct intel_plane *plane);
 
 /* intel_tv.c */
 void intel_tv_init(struct drm_i915_private *dev_priv);
diff --git a/drivers/gpu/drm/i915/intel_engine_cs.c b/drivers/gpu/drm/i915/intel_engine_cs.c
index ab5bf4e..6074e04d 100644
--- a/drivers/gpu/drm/i915/intel_engine_cs.c
+++ b/drivers/gpu/drm/i915/intel_engine_cs.c
@@ -1390,6 +1390,11 @@ static int glk_init_workarounds(struct intel_engine_cs *engine)
 	if (ret)
 		return ret;
 
+	/* WA #0862: Userspace has to set "Barrier Mode" to avoid hangs. */
+	ret = wa_ring_whitelist_reg(engine, GEN9_SLICE_COMMON_ECO_CHICKEN1);
+	if (ret)
+		return ret;
+
 	/* WaToEnableHwFixForPushConstHWBug:glk */
 	WA_SET_BIT_MASKED(COMMON_SLICE_CHICKEN2,
 			  GEN8_SBE_DISABLE_REPLAY_BUF_OPTIMIZATION);
diff --git a/drivers/gpu/drm/i915/intel_lrc.c b/drivers/gpu/drm/i915/intel_lrc.c
index d36e256..e71a8cd 100644
--- a/drivers/gpu/drm/i915/intel_lrc.c
+++ b/drivers/gpu/drm/i915/intel_lrc.c
@@ -974,6 +974,9 @@ static void execlists_schedule(struct drm_i915_gem_request *request, int prio)
 
 	GEM_BUG_ON(prio == I915_PRIORITY_INVALID);
 
+	if (i915_gem_request_completed(request))
+		return;
+
 	if (prio <= READ_ONCE(request->priotree.priority))
 		return;
 
diff --git a/drivers/gpu/drm/i915/intel_psr.c b/drivers/gpu/drm/i915/intel_psr.c
index 6e3b430..55ea5eb 100644
--- a/drivers/gpu/drm/i915/intel_psr.c
+++ b/drivers/gpu/drm/i915/intel_psr.c
@@ -590,7 +590,7 @@ static void hsw_psr_disable(struct intel_dp *intel_dp,
 	struct drm_i915_private *dev_priv = to_i915(dev);
 
 	if (dev_priv->psr.active) {
-		i915_reg_t psr_ctl;
+		i915_reg_t psr_status;
 		u32 psr_status_mask;
 
 		if (dev_priv->psr.aux_frame_sync)
@@ -599,24 +599,24 @@ static void hsw_psr_disable(struct intel_dp *intel_dp,
 					0);
 
 		if (dev_priv->psr.psr2_support) {
-			psr_ctl = EDP_PSR2_CTL;
+			psr_status = EDP_PSR2_STATUS_CTL;
 			psr_status_mask = EDP_PSR2_STATUS_STATE_MASK;
 
-			I915_WRITE(psr_ctl,
-				   I915_READ(psr_ctl) &
+			I915_WRITE(EDP_PSR2_CTL,
+				   I915_READ(EDP_PSR2_CTL) &
 				   ~(EDP_PSR2_ENABLE | EDP_SU_TRACK_ENABLE));
 
 		} else {
-			psr_ctl = EDP_PSR_STATUS_CTL;
+			psr_status = EDP_PSR_STATUS_CTL;
 			psr_status_mask = EDP_PSR_STATUS_STATE_MASK;
 
-			I915_WRITE(psr_ctl,
-				   I915_READ(psr_ctl) & ~EDP_PSR_ENABLE);
+			I915_WRITE(EDP_PSR_CTL,
+				   I915_READ(EDP_PSR_CTL) & ~EDP_PSR_ENABLE);
 		}
 
 		/* Wait till PSR is idle */
 		if (intel_wait_for_register(dev_priv,
-					    psr_ctl, psr_status_mask, 0,
+					    psr_status, psr_status_mask, 0,
 					    2000))
 			DRM_ERROR("Timed out waiting for PSR Idle State\n");
 
diff --git a/drivers/gpu/drm/i915/intel_runtime_pm.c b/drivers/gpu/drm/i915/intel_runtime_pm.c
index 8af286c..7e115f3 100644
--- a/drivers/gpu/drm/i915/intel_runtime_pm.c
+++ b/drivers/gpu/drm/i915/intel_runtime_pm.c
@@ -598,6 +598,11 @@ void gen9_enable_dc5(struct drm_i915_private *dev_priv)
 
 	DRM_DEBUG_KMS("Enabling DC5\n");
 
+	/* Wa Display #1183: skl,kbl,cfl */
+	if (IS_GEN9_BC(dev_priv))
+		I915_WRITE(GEN8_CHICKEN_DCPR_1, I915_READ(GEN8_CHICKEN_DCPR_1) |
+			   SKL_SELECT_ALTERNATE_DC_EXIT);
+
 	gen9_set_dc_state(dev_priv, DC_STATE_EN_UPTO_DC5);
 }
 
@@ -625,6 +630,11 @@ void skl_disable_dc6(struct drm_i915_private *dev_priv)
 {
 	DRM_DEBUG_KMS("Disabling DC6\n");
 
+	/* Wa Display #1183: skl,kbl,cfl */
+	if (IS_GEN9_BC(dev_priv))
+		I915_WRITE(GEN8_CHICKEN_DCPR_1, I915_READ(GEN8_CHICKEN_DCPR_1) |
+			   SKL_SELECT_ALTERNATE_DC_EXIT);
+
 	gen9_set_dc_state(dev_priv, DC_STATE_DISABLE);
 }
 
@@ -1786,6 +1796,7 @@ void intel_display_power_put(struct drm_i915_private *dev_priv,
 	GLK_DISPLAY_POWERWELL_2_POWER_DOMAINS |		\
 	BIT_ULL(POWER_DOMAIN_MODESET) |			\
 	BIT_ULL(POWER_DOMAIN_AUX_A) |			\
+	BIT_ULL(POWER_DOMAIN_GMBUS) |			\
 	BIT_ULL(POWER_DOMAIN_INIT))
 
 #define CNL_DISPLAY_POWERWELL_2_POWER_DOMAINS (		\
diff --git a/drivers/gpu/drm/i915/intel_sprite.c b/drivers/gpu/drm/i915/intel_sprite.c
index 4fcf80c..4a8a5d9 100644
--- a/drivers/gpu/drm/i915/intel_sprite.c
+++ b/drivers/gpu/drm/i915/intel_sprite.c
@@ -329,6 +329,26 @@ skl_disable_plane(struct intel_plane *plane, struct intel_crtc *crtc)
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
+bool
+skl_plane_get_hw_state(struct intel_plane *plane)
+{
+	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
+	enum intel_display_power_domain power_domain;
+	enum plane_id plane_id = plane->id;
+	enum pipe pipe = plane->pipe;
+	bool ret;
+
+	power_domain = POWER_DOMAIN_PIPE(pipe);
+	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+		return false;
+
+	ret = I915_READ(PLANE_CTL(pipe, plane_id)) & PLANE_CTL_ENABLE;
+
+	intel_display_power_put(dev_priv, power_domain);
+
+	return ret;
+}
+
 static void
 chv_update_csc(struct intel_plane *plane, uint32_t format)
 {
@@ -506,6 +526,26 @@ vlv_disable_plane(struct intel_plane *plane, struct intel_crtc *crtc)
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
+static bool
+vlv_plane_get_hw_state(struct intel_plane *plane)
+{
+	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
+	enum intel_display_power_domain power_domain;
+	enum plane_id plane_id = plane->id;
+	enum pipe pipe = plane->pipe;
+	bool ret;
+
+	power_domain = POWER_DOMAIN_PIPE(pipe);
+	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+		return false;
+
+	ret = I915_READ(SPCNTR(pipe, plane_id)) & SP_ENABLE;
+
+	intel_display_power_put(dev_priv, power_domain);
+
+	return ret;
+}
+
 static u32 ivb_sprite_ctl(const struct intel_crtc_state *crtc_state,
 			  const struct intel_plane_state *plane_state)
 {
@@ -646,6 +686,25 @@ ivb_disable_plane(struct intel_plane *plane, struct intel_crtc *crtc)
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
+static bool
+ivb_plane_get_hw_state(struct intel_plane *plane)
+{
+	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
+	enum intel_display_power_domain power_domain;
+	enum pipe pipe = plane->pipe;
+	bool ret;
+
+	power_domain = POWER_DOMAIN_PIPE(pipe);
+	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+		return false;
+
+	ret =  I915_READ(SPRCTL(pipe)) & SPRITE_ENABLE;
+
+	intel_display_power_put(dev_priv, power_domain);
+
+	return ret;
+}
+
 static u32 g4x_sprite_ctl(const struct intel_crtc_state *crtc_state,
 			  const struct intel_plane_state *plane_state)
 {
@@ -777,6 +836,25 @@ g4x_disable_plane(struct intel_plane *plane, struct intel_crtc *crtc)
 	spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags);
 }
 
+static bool
+g4x_plane_get_hw_state(struct intel_plane *plane)
+{
+	struct drm_i915_private *dev_priv = to_i915(plane->base.dev);
+	enum intel_display_power_domain power_domain;
+	enum pipe pipe = plane->pipe;
+	bool ret;
+
+	power_domain = POWER_DOMAIN_PIPE(pipe);
+	if (!intel_display_power_get_if_enabled(dev_priv, power_domain))
+		return false;
+
+	ret = I915_READ(DVSCNTR(pipe)) & DVS_ENABLE;
+
+	intel_display_power_put(dev_priv, power_domain);
+
+	return ret;
+}
+
 static int
 intel_check_sprite_plane(struct intel_plane *plane,
 			 struct intel_crtc_state *crtc_state,
@@ -1232,6 +1310,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 		intel_plane->update_plane = skl_update_plane;
 		intel_plane->disable_plane = skl_disable_plane;
+		intel_plane->get_hw_state = skl_plane_get_hw_state;
 
 		plane_formats = skl_plane_formats;
 		num_plane_formats = ARRAY_SIZE(skl_plane_formats);
@@ -1242,6 +1321,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 		intel_plane->update_plane = skl_update_plane;
 		intel_plane->disable_plane = skl_disable_plane;
+		intel_plane->get_hw_state = skl_plane_get_hw_state;
 
 		plane_formats = skl_plane_formats;
 		num_plane_formats = ARRAY_SIZE(skl_plane_formats);
@@ -1252,6 +1332,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 		intel_plane->update_plane = vlv_update_plane;
 		intel_plane->disable_plane = vlv_disable_plane;
+		intel_plane->get_hw_state = vlv_plane_get_hw_state;
 
 		plane_formats = vlv_plane_formats;
 		num_plane_formats = ARRAY_SIZE(vlv_plane_formats);
@@ -1267,6 +1348,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 		intel_plane->update_plane = ivb_update_plane;
 		intel_plane->disable_plane = ivb_disable_plane;
+		intel_plane->get_hw_state = ivb_plane_get_hw_state;
 
 		plane_formats = snb_plane_formats;
 		num_plane_formats = ARRAY_SIZE(snb_plane_formats);
@@ -1277,6 +1359,7 @@ intel_sprite_plane_create(struct drm_i915_private *dev_priv,
 
 		intel_plane->update_plane = g4x_update_plane;
 		intel_plane->disable_plane = g4x_disable_plane;
+		intel_plane->get_hw_state = g4x_plane_get_hw_state;
 
 		modifiers = i9xx_plane_format_modifiers;
 		if (IS_GEN6(dev_priv)) {
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
index 0760b93..baab933 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/mmu.h
@@ -121,6 +121,7 @@ int nv41_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int nv44_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int nv50_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int g84_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
+int mcp77_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int gf100_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int gk104_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
 int gk20a_mmu_new(struct nvkm_device *, int, struct nvkm_mmu **);
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index 435ff86..ef68741 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -1447,11 +1447,13 @@ nouveau_ttm_io_mem_reserve(struct ttm_bo_device *bdev, struct ttm_mem_reg *reg)
 				args.nv50.ro = 0;
 				args.nv50.kind = mem->kind;
 				args.nv50.comp = mem->comp;
+				argc = sizeof(args.nv50);
 				break;
 			case NVIF_CLASS_MEM_GF100:
 				args.gf100.version = 0;
 				args.gf100.ro = 0;
 				args.gf100.kind = mem->kind;
+				argc = sizeof(args.gf100);
 				break;
 			default:
 				WARN_ON(1);
@@ -1459,7 +1461,7 @@ nouveau_ttm_io_mem_reserve(struct ttm_bo_device *bdev, struct ttm_mem_reg *reg)
 			}
 
 			ret = nvif_object_map_handle(&mem->mem.object,
-						     &argc, argc,
+						     &args, argc,
 						     &handle, &length);
 			if (ret != 1)
 				return ret ? ret : -EINVAL;
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
index 00eeaaf..08e77cd 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/base.c
@@ -1251,7 +1251,7 @@ nvaa_chipset = {
 	.i2c = g94_i2c_new,
 	.imem = nv50_instmem_new,
 	.mc = g98_mc_new,
-	.mmu = g84_mmu_new,
+	.mmu = mcp77_mmu_new,
 	.mxm = nv50_mxm_new,
 	.pci = g94_pci_new,
 	.therm = g84_therm_new,
@@ -1283,7 +1283,7 @@ nvac_chipset = {
 	.i2c = g94_i2c_new,
 	.imem = nv50_instmem_new,
 	.mc = g98_mc_new,
-	.mmu = g84_mmu_new,
+	.mmu = mcp77_mmu_new,
 	.mxm = nv50_mxm_new,
 	.pci = g94_pci_new,
 	.therm = g84_therm_new,
diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c b/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
index a2978a3..700fc75 100644
--- a/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
+++ b/drivers/gpu/drm/nouveau/nvkm/engine/disp/sorgf119.c
@@ -174,6 +174,7 @@ gf119_sor = {
 		.links = gf119_sor_dp_links,
 		.power = g94_sor_dp_power,
 		.pattern = gf119_sor_dp_pattern,
+		.drive = gf119_sor_dp_drive,
 		.vcpi = gf119_sor_dp_vcpi,
 		.audio = gf119_sor_dp_audio,
 		.audio_sym = gf119_sor_dp_audio_sym,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c
index 9646ade..243f0a5 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/base.c
@@ -73,7 +73,8 @@ static int
 nvkm_bar_fini(struct nvkm_subdev *subdev, bool suspend)
 {
 	struct nvkm_bar *bar = nvkm_bar(subdev);
-	bar->func->bar1.fini(bar);
+	if (bar->func->bar1.fini)
+		bar->func->bar1.fini(bar);
 	return 0;
 }
 
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c
index b10077d..35878fb 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/bar/gk20a.c
@@ -26,7 +26,6 @@ gk20a_bar_func = {
 	.dtor = gf100_bar_dtor,
 	.oneinit = gf100_bar_oneinit,
 	.bar1.init = gf100_bar_bar1_init,
-	.bar1.fini = gf100_bar_bar1_fini,
 	.bar1.wait = gf100_bar_bar1_wait,
 	.bar1.vmm = gf100_bar_bar1_vmm,
 	.flush = g84_bar_flush,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild
index 352a65f..67ee983 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/Kbuild
@@ -4,6 +4,7 @@
 nvkm-y += nvkm/subdev/mmu/nv44.o
 nvkm-y += nvkm/subdev/mmu/nv50.o
 nvkm-y += nvkm/subdev/mmu/g84.o
+nvkm-y += nvkm/subdev/mmu/mcp77.o
 nvkm-y += nvkm/subdev/mmu/gf100.o
 nvkm-y += nvkm/subdev/mmu/gk104.o
 nvkm-y += nvkm/subdev/mmu/gk20a.o
@@ -22,6 +23,7 @@
 nvkm-y += nvkm/subdev/mmu/vmmnv41.o
 nvkm-y += nvkm/subdev/mmu/vmmnv44.o
 nvkm-y += nvkm/subdev/mmu/vmmnv50.o
+nvkm-y += nvkm/subdev/mmu/vmmmcp77.o
 nvkm-y += nvkm/subdev/mmu/vmmgf100.o
 nvkm-y += nvkm/subdev/mmu/vmmgk104.o
 nvkm-y += nvkm/subdev/mmu/vmmgk20a.o
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/mcp77.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/mcp77.c
new file mode 100644
index 0000000..0527b50
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/mcp77.c
@@ -0,0 +1,41 @@
+/*
+ * Copyright 2017 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+#include "mem.h"
+#include "vmm.h"
+
+#include <nvif/class.h>
+
+static const struct nvkm_mmu_func
+mcp77_mmu = {
+	.dma_bits = 40,
+	.mmu = {{ -1, -1, NVIF_CLASS_MMU_NV50}},
+	.mem = {{ -1,  0, NVIF_CLASS_MEM_NV50}, nv50_mem_new, nv50_mem_map },
+	.vmm = {{ -1, -1, NVIF_CLASS_VMM_NV50}, mcp77_vmm_new, false, 0x0200 },
+	.kind = nv50_mmu_kind,
+	.kind_sys = true,
+};
+
+int
+mcp77_mmu_new(struct nvkm_device *device, int index, struct nvkm_mmu **pmmu)
+{
+	return nvkm_mmu_new_(&mcp77_mmu, device, index, pmmu);
+}
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
index 6d8f61e..da06e64 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmm.h
@@ -95,6 +95,9 @@ struct nvkm_vmm_desc {
 	const struct nvkm_vmm_desc_func *func;
 };
 
+extern const struct nvkm_vmm_desc nv50_vmm_desc_12[];
+extern const struct nvkm_vmm_desc nv50_vmm_desc_16[];
+
 extern const struct nvkm_vmm_desc gk104_vmm_desc_16_12[];
 extern const struct nvkm_vmm_desc gk104_vmm_desc_16_16[];
 extern const struct nvkm_vmm_desc gk104_vmm_desc_17_12[];
@@ -169,6 +172,11 @@ int nv04_vmm_new_(const struct nvkm_vmm_func *, struct nvkm_mmu *, u32,
 		  const char *, struct nvkm_vmm **);
 int nv04_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *);
 
+int nv50_vmm_join(struct nvkm_vmm *, struct nvkm_memory *);
+void nv50_vmm_part(struct nvkm_vmm *, struct nvkm_memory *);
+int nv50_vmm_valid(struct nvkm_vmm *, void *, u32, struct nvkm_vmm_map *);
+void nv50_vmm_flush(struct nvkm_vmm *, int);
+
 int gf100_vmm_new_(const struct nvkm_vmm_func *, const struct nvkm_vmm_func *,
 		   struct nvkm_mmu *, u64, u64, void *, u32,
 		   struct lock_class_key *, const char *, struct nvkm_vmm **);
@@ -200,6 +208,8 @@ int nv44_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
 		 struct lock_class_key *, const char *, struct nvkm_vmm **);
 int nv50_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
 		 struct lock_class_key *, const char *, struct nvkm_vmm **);
+int mcp77_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
+		  struct lock_class_key *, const char *, struct nvkm_vmm **);
 int g84_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
 		struct lock_class_key *, const char *, struct nvkm_vmm **);
 int gf100_vmm_new(struct nvkm_mmu *, u64, u64, void *, u32,
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c
new file mode 100644
index 0000000..e63d984
--- /dev/null
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmmcp77.c
@@ -0,0 +1,45 @@
+/*
+ * Copyright 2017 Red Hat Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
+ * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
+ * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
+ * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
+ * OTHER DEALINGS IN THE SOFTWARE.
+ */
+#include "vmm.h"
+
+static const struct nvkm_vmm_func
+mcp77_vmm = {
+	.join = nv50_vmm_join,
+	.part = nv50_vmm_part,
+	.valid = nv50_vmm_valid,
+	.flush = nv50_vmm_flush,
+	.page_block = 1 << 29,
+	.page = {
+		{ 16, &nv50_vmm_desc_16[0], NVKM_VMM_PAGE_xVxx },
+		{ 12, &nv50_vmm_desc_12[0], NVKM_VMM_PAGE_xVHx },
+		{}
+	}
+};
+
+int
+mcp77_vmm_new(struct nvkm_mmu *mmu, u64 addr, u64 size, void *argv, u32 argc,
+	      struct lock_class_key *key, const char *name,
+	      struct nvkm_vmm **pvmm)
+{
+	return nv04_vmm_new_(&mcp77_vmm, mmu, 0, addr, size,
+			     argv, argc, key, name, pvmm);
+}
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c
index 863a2ed..64f75d9 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/mmu/vmmnv50.c
@@ -32,7 +32,7 @@ static inline void
 nv50_vmm_pgt_pte(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt,
 		 u32 ptei, u32 ptes, struct nvkm_vmm_map *map, u64 addr)
 {
-	u64 next = addr | map->type, data;
+	u64 next = addr + map->type, data;
 	u32 pten;
 	int log2blk;
 
@@ -69,7 +69,7 @@ nv50_vmm_pgt_dma(struct nvkm_vmm *vmm, struct nvkm_mmu_pt *pt,
 		VMM_SPAM(vmm, "DMAA %08x %08x PTE(s)", ptei, ptes);
 		nvkm_kmap(pt->memory);
 		while (ptes--) {
-			const u64 data = *map->dma++ | map->type;
+			const u64 data = *map->dma++ + map->type;
 			VMM_WO064(pt, vmm, ptei++ * 8, data);
 			map->type += map->ctag;
 		}
@@ -163,21 +163,21 @@ nv50_vmm_pgd = {
 	.pde = nv50_vmm_pgd_pde,
 };
 
-static const struct nvkm_vmm_desc
+const struct nvkm_vmm_desc
 nv50_vmm_desc_12[] = {
 	{ PGT, 17, 8, 0x1000, &nv50_vmm_pgt },
 	{ PGD, 11, 0, 0x0000, &nv50_vmm_pgd },
 	{}
 };
 
-static const struct nvkm_vmm_desc
+const struct nvkm_vmm_desc
 nv50_vmm_desc_16[] = {
 	{ PGT, 13, 8, 0x1000, &nv50_vmm_pgt },
 	{ PGD, 11, 0, 0x0000, &nv50_vmm_pgd },
 	{}
 };
 
-static void
+void
 nv50_vmm_flush(struct nvkm_vmm *vmm, int level)
 {
 	struct nvkm_subdev *subdev = &vmm->mmu->subdev;
@@ -223,7 +223,7 @@ nv50_vmm_flush(struct nvkm_vmm *vmm, int level)
 	mutex_unlock(&subdev->mutex);
 }
 
-static int
+int
 nv50_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc,
 	       struct nvkm_vmm_map *map)
 {
@@ -321,7 +321,7 @@ nv50_vmm_valid(struct nvkm_vmm *vmm, void *argv, u32 argc,
 	return 0;
 }
 
-static void
+void
 nv50_vmm_part(struct nvkm_vmm *vmm, struct nvkm_memory *inst)
 {
 	struct nvkm_vmm_join *join;
@@ -335,7 +335,7 @@ nv50_vmm_part(struct nvkm_vmm *vmm, struct nvkm_memory *inst)
 	}
 }
 
-static int
+int
 nv50_vmm_join(struct nvkm_vmm *vmm, struct nvkm_memory *inst)
 {
 	const u32 pd_offset = vmm->mmu->func->vmm.pd_offset;
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
index deb96de..ee2431a 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
@@ -71,6 +71,10 @@ nvkm_pci_intr(int irq, void *arg)
 	struct nvkm_pci *pci = arg;
 	struct nvkm_device *device = pci->subdev.device;
 	bool handled = false;
+
+	if (pci->irq < 0)
+		return IRQ_HANDLED;
+
 	nvkm_mc_intr_unarm(device);
 	if (pci->msi)
 		pci->func->msi_rearm(pci);
@@ -84,11 +88,6 @@ nvkm_pci_fini(struct nvkm_subdev *subdev, bool suspend)
 {
 	struct nvkm_pci *pci = nvkm_pci(subdev);
 
-	if (pci->irq >= 0) {
-		free_irq(pci->irq, pci);
-		pci->irq = -1;
-	}
-
 	if (pci->agp.bridge)
 		nvkm_agp_fini(pci);
 
@@ -108,8 +107,20 @@ static int
 nvkm_pci_oneinit(struct nvkm_subdev *subdev)
 {
 	struct nvkm_pci *pci = nvkm_pci(subdev);
-	if (pci_is_pcie(pci->pdev))
-		return nvkm_pcie_oneinit(pci);
+	struct pci_dev *pdev = pci->pdev;
+	int ret;
+
+	if (pci_is_pcie(pci->pdev)) {
+		ret = nvkm_pcie_oneinit(pci);
+		if (ret)
+			return ret;
+	}
+
+	ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
+	if (ret)
+		return ret;
+
+	pci->irq = pdev->irq;
 	return 0;
 }
 
@@ -117,7 +128,6 @@ static int
 nvkm_pci_init(struct nvkm_subdev *subdev)
 {
 	struct nvkm_pci *pci = nvkm_pci(subdev);
-	struct pci_dev *pdev = pci->pdev;
 	int ret;
 
 	if (pci->agp.bridge) {
@@ -131,28 +141,34 @@ nvkm_pci_init(struct nvkm_subdev *subdev)
 	if (pci->func->init)
 		pci->func->init(pci);
 
-	ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
-	if (ret)
-		return ret;
-
-	pci->irq = pdev->irq;
-
 	/* Ensure MSI interrupts are armed, for the case where there are
 	 * already interrupts pending (for whatever reason) at load time.
 	 */
 	if (pci->msi)
 		pci->func->msi_rearm(pci);
 
-	return ret;
+	return 0;
 }
 
 static void *
 nvkm_pci_dtor(struct nvkm_subdev *subdev)
 {
 	struct nvkm_pci *pci = nvkm_pci(subdev);
+
 	nvkm_agp_dtor(pci);
+
+	if (pci->irq >= 0) {
+		/* freq_irq() will call the handler, we use pci->irq == -1
+		 * to signal that it's been torn down and should be a noop.
+		 */
+		int irq = pci->irq;
+		pci->irq = -1;
+		free_irq(irq, pci);
+	}
+
 	if (pci->msi)
 		pci_disable_msi(pci->pdev);
+
 	return nvkm_pci(subdev);
 }
 
diff --git a/drivers/gpu/drm/omapdrm/dss/hdmi4_cec.c b/drivers/gpu/drm/omapdrm/dss/hdmi4_cec.c
index e626edd..23db74a 100644
--- a/drivers/gpu/drm/omapdrm/dss/hdmi4_cec.c
+++ b/drivers/gpu/drm/omapdrm/dss/hdmi4_cec.c
@@ -78,6 +78,8 @@ static void hdmi_cec_received_msg(struct hdmi_core_data *core)
 
 			/* then read the message */
 			msg.len = cnt & 0xf;
+			if (msg.len > CEC_MAX_MSG_SIZE - 2)
+				msg.len = CEC_MAX_MSG_SIZE - 2;
 			msg.msg[0] = hdmi_read_reg(core->base,
 						   HDMI_CEC_RX_CMD_HEADER);
 			msg.msg[1] = hdmi_read_reg(core->base,
@@ -104,26 +106,6 @@ static void hdmi_cec_received_msg(struct hdmi_core_data *core)
 	}
 }
 
-static void hdmi_cec_transmit_fifo_empty(struct hdmi_core_data *core, u32 stat1)
-{
-	if (stat1 & 2) {
-		u32 dbg3 = hdmi_read_reg(core->base, HDMI_CEC_DBG_3);
-
-		cec_transmit_done(core->adap,
-				  CEC_TX_STATUS_NACK |
-				  CEC_TX_STATUS_MAX_RETRIES,
-				  0, (dbg3 >> 4) & 7, 0, 0);
-	} else if (stat1 & 1) {
-		cec_transmit_done(core->adap,
-				  CEC_TX_STATUS_ARB_LOST |
-				  CEC_TX_STATUS_MAX_RETRIES,
-				  0, 0, 0, 0);
-	} else if (stat1 == 0) {
-		cec_transmit_done(core->adap, CEC_TX_STATUS_OK,
-				  0, 0, 0, 0);
-	}
-}
-
 void hdmi4_cec_irq(struct hdmi_core_data *core)
 {
 	u32 stat0 = hdmi_read_reg(core->base, HDMI_CEC_INT_STATUS_0);
@@ -132,27 +114,21 @@ void hdmi4_cec_irq(struct hdmi_core_data *core)
 	hdmi_write_reg(core->base, HDMI_CEC_INT_STATUS_0, stat0);
 	hdmi_write_reg(core->base, HDMI_CEC_INT_STATUS_1, stat1);
 
-	if (stat0 & 0x40)
+	if (stat0 & 0x20) {
+		cec_transmit_done(core->adap, CEC_TX_STATUS_OK,
+				  0, 0, 0, 0);
 		REG_FLD_MOD(core->base, HDMI_CEC_DBG_3, 0x1, 7, 7);
-	else if (stat0 & 0x24)
-		hdmi_cec_transmit_fifo_empty(core, stat1);
-	if (stat1 & 2) {
+	} else if (stat1 & 0x02) {
 		u32 dbg3 = hdmi_read_reg(core->base, HDMI_CEC_DBG_3);
 
 		cec_transmit_done(core->adap,
 				  CEC_TX_STATUS_NACK |
 				  CEC_TX_STATUS_MAX_RETRIES,
 				  0, (dbg3 >> 4) & 7, 0, 0);
-	} else if (stat1 & 1) {
-		cec_transmit_done(core->adap,
-				  CEC_TX_STATUS_ARB_LOST |
-				  CEC_TX_STATUS_MAX_RETRIES,
-				  0, 0, 0, 0);
+		REG_FLD_MOD(core->base, HDMI_CEC_DBG_3, 0x1, 7, 7);
 	}
 	if (stat0 & 0x02)
 		hdmi_cec_received_msg(core);
-	if (stat1 & 0x3)
-		REG_FLD_MOD(core->base, HDMI_CEC_DBG_3, 0x1, 7, 7);
 }
 
 static bool hdmi_cec_clear_tx_fifo(struct cec_adapter *adap)
@@ -231,18 +207,14 @@ static int hdmi_cec_adap_enable(struct cec_adapter *adap, bool enable)
 	/*
 	 * Enable CEC interrupts:
 	 * Transmit Buffer Full/Empty Change event
-	 * Transmitter FIFO Empty event
 	 * Receiver FIFO Not Empty event
 	 */
-	hdmi_write_reg(core->base, HDMI_CEC_INT_ENABLE_0, 0x26);
+	hdmi_write_reg(core->base, HDMI_CEC_INT_ENABLE_0, 0x22);
 	/*
 	 * Enable CEC interrupts:
-	 * RX FIFO Overrun Error event
-	 * Short Pulse Detected event
 	 * Frame Retransmit Count Exceeded event
-	 * Start Bit Irregularity event
 	 */
-	hdmi_write_reg(core->base, HDMI_CEC_INT_ENABLE_1, 0x0f);
+	hdmi_write_reg(core->base, HDMI_CEC_INT_ENABLE_1, 0x02);
 
 	/* cec calibration enable (self clearing) */
 	hdmi_write_reg(core->base, HDMI_CEC_SETUP, 0x03);
diff --git a/drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c b/drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c
index dc332ea..3ecffa5 100644
--- a/drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c
+++ b/drivers/gpu/drm/sun4i/sun4i_hdmi_tmds_clk.c
@@ -102,10 +102,13 @@ static int sun4i_tmds_determine_rate(struct clk_hw *hw,
 					goto out;
 				}
 
-				if (abs(rate - rounded / i) <
-				    abs(rate - best_parent / best_div)) {
+				if (!best_parent ||
+				    abs(rate - rounded / i / j) <
+				    abs(rate - best_parent / best_half /
+					best_div)) {
 					best_parent = rounded;
-					best_div = i;
+					best_half = i;
+					best_div = j;
 				}
 			}
 		}
diff --git a/drivers/gpu/drm/tegra/sor.c b/drivers/gpu/drm/tegra/sor.c
index b0a1ded..476079f 100644
--- a/drivers/gpu/drm/tegra/sor.c
+++ b/drivers/gpu/drm/tegra/sor.c
@@ -2656,6 +2656,9 @@ static int tegra_sor_probe(struct platform_device *pdev)
 				name, err);
 			goto remove;
 		}
+	} else {
+		/* fall back to the module clock on SOR0 (eDP/LVDS only) */
+		sor->clk_out = sor->clk;
 	}
 
 	sor->clk_parent = devm_clk_get(&pdev->dev, "parent");
diff --git a/drivers/gpu/drm/ttm/ttm_page_alloc.c b/drivers/gpu/drm/ttm/ttm_page_alloc.c
index b5ba644..5d252fb 100644
--- a/drivers/gpu/drm/ttm/ttm_page_alloc.c
+++ b/drivers/gpu/drm/ttm/ttm_page_alloc.c
@@ -1007,6 +1007,8 @@ int ttm_page_alloc_init(struct ttm_mem_global *glob, unsigned max_pages)
 	pr_info("Initializing pool allocator\n");
 
 	_manager = kzalloc(sizeof(*_manager), GFP_KERNEL);
+	if (!_manager)
+		return -ENOMEM;
 
 	ttm_page_pool_init_locked(&_manager->wc_pool, GFP_HIGHUSER, "wc", 0);
 
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index 6385409..c94cce9 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -146,7 +146,7 @@ vc4_save_hang_state(struct drm_device *dev)
 	struct vc4_exec_info *exec[2];
 	struct vc4_bo *bo;
 	unsigned long irqflags;
-	unsigned int i, j, unref_list_count, prev_idx;
+	unsigned int i, j, k, unref_list_count;
 
 	kernel_state = kcalloc(1, sizeof(*kernel_state), GFP_KERNEL);
 	if (!kernel_state)
@@ -182,7 +182,7 @@ vc4_save_hang_state(struct drm_device *dev)
 		return;
 	}
 
-	prev_idx = 0;
+	k = 0;
 	for (i = 0; i < 2; i++) {
 		if (!exec[i])
 			continue;
@@ -197,7 +197,7 @@ vc4_save_hang_state(struct drm_device *dev)
 			WARN_ON(!refcount_read(&bo->usecnt));
 			refcount_inc(&bo->usecnt);
 			drm_gem_object_get(&exec[i]->bo[j]->base);
-			kernel_state->bo[j + prev_idx] = &exec[i]->bo[j]->base;
+			kernel_state->bo[k++] = &exec[i]->bo[j]->base;
 		}
 
 		list_for_each_entry(bo, &exec[i]->unref_list, unref_head) {
@@ -205,12 +205,12 @@ vc4_save_hang_state(struct drm_device *dev)
 			 * because they are naturally unpurgeable.
 			 */
 			drm_gem_object_get(&bo->base.base);
-			kernel_state->bo[j + prev_idx] = &bo->base.base;
-			j++;
+			kernel_state->bo[k++] = &bo->base.base;
 		}
-		prev_idx = j + 1;
 	}
 
+	WARN_ON_ONCE(k != state->bo_count);
+
 	if (exec[0])
 		state->start_bin = exec[0]->ct0ca;
 	if (exec[1])
@@ -436,6 +436,19 @@ vc4_flush_caches(struct drm_device *dev)
 		  VC4_SET_FIELD(0xf, V3D_SLCACTL_ICC));
 }
 
+static void
+vc4_flush_texture_caches(struct drm_device *dev)
+{
+	struct vc4_dev *vc4 = to_vc4_dev(dev);
+
+	V3D_WRITE(V3D_L2CACTL,
+		  V3D_L2CACTL_L2CCLR);
+
+	V3D_WRITE(V3D_SLCACTL,
+		  VC4_SET_FIELD(0xf, V3D_SLCACTL_T1CC) |
+		  VC4_SET_FIELD(0xf, V3D_SLCACTL_T0CC));
+}
+
 /* Sets the registers for the next job to be actually be executed in
  * the hardware.
  *
@@ -474,6 +487,14 @@ vc4_submit_next_render_job(struct drm_device *dev)
 	if (!exec)
 		return;
 
+	/* A previous RCL may have written to one of our textures, and
+	 * our full cache flush at bin time may have occurred before
+	 * that RCL completed.  Flush the texture cache now, but not
+	 * the instructions or uniforms (since we don't write those
+	 * from an RCL).
+	 */
+	vc4_flush_texture_caches(dev);
+
 	submit_cl(dev, 1, exec->ct1ca, exec->ct1ea);
 }
 
diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index 26eddbb..3dd62d7 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -209,9 +209,6 @@ vc4_irq_postinstall(struct drm_device *dev)
 {
 	struct vc4_dev *vc4 = to_vc4_dev(dev);
 
-	/* Undo the effects of a previous vc4_irq_uninstall. */
-	enable_irq(dev->irq);
-
 	/* Enable both the render done and out of memory interrupts. */
 	V3D_WRITE(V3D_INTENA, V3D_DRIVER_IRQS);
 
diff --git a/drivers/gpu/drm/vc4/vc4_v3d.c b/drivers/gpu/drm/vc4/vc4_v3d.c
index 622cd43..493f392b 100644
--- a/drivers/gpu/drm/vc4/vc4_v3d.c
+++ b/drivers/gpu/drm/vc4/vc4_v3d.c
@@ -327,6 +327,9 @@ static int vc4_v3d_runtime_resume(struct device *dev)
 		return ret;
 
 	vc4_v3d_init_hw(vc4->dev);
+
+	/* We disabled the IRQ as part of vc4_irq_uninstall in suspend. */
+	enable_irq(vc4->dev->irq);
 	vc4_irq_postinstall(vc4->dev);
 
 	return 0;
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
index 21c62a3..87e8af5 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c
@@ -2731,6 +2731,8 @@ static int vmw_cmd_dx_view_define(struct vmw_private *dev_priv,
 	}
 
 	view_type = vmw_view_cmd_to_type(header->id);
+	if (view_type == vmw_view_max)
+		return -EINVAL;
 	cmd = container_of(header, typeof(*cmd), header);
 	ret = vmw_cmd_res_check(dev_priv, sw_context, vmw_res_surface,
 				user_surface_converter,
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
index 0545740..fcd5814 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.c
@@ -697,7 +697,6 @@ vmw_du_plane_duplicate_state(struct drm_plane *plane)
 	vps->pinned = 0;
 
 	/* Mapping is managed by prepare_fb/cleanup_fb */
-	memset(&vps->guest_map, 0, sizeof(vps->guest_map));
 	memset(&vps->host_map, 0, sizeof(vps->host_map));
 	vps->cpp = 0;
 
@@ -760,11 +759,6 @@ vmw_du_plane_destroy_state(struct drm_plane *plane,
 
 
 	/* Should have been freed by cleanup_fb */
-	if (vps->guest_map.virtual) {
-		DRM_ERROR("Guest mapping not freed\n");
-		ttm_bo_kunmap(&vps->guest_map);
-	}
-
 	if (vps->host_map.virtual) {
 		DRM_ERROR("Host mapping not freed\n");
 		ttm_bo_kunmap(&vps->host_map);
@@ -1869,7 +1863,7 @@ u32 vmw_get_vblank_counter(struct drm_device *dev, unsigned int pipe)
  */
 int vmw_enable_vblank(struct drm_device *dev, unsigned int pipe)
 {
-	return -ENOSYS;
+	return -EINVAL;
 }
 
 /**
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
index ff9c838..cd9da2d 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_kms.h
@@ -175,7 +175,7 @@ struct vmw_plane_state {
 	int pinned;
 
 	/* For CPU Blit */
-	struct ttm_bo_kmap_obj host_map, guest_map;
+	struct ttm_bo_kmap_obj host_map;
 	unsigned int cpp;
 };
 
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
index b8a0980..3824595 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_ldu.c
@@ -266,8 +266,8 @@ static const struct drm_connector_funcs vmw_legacy_connector_funcs = {
 	.set_property = vmw_du_connector_set_property,
 	.destroy = vmw_ldu_connector_destroy,
 	.reset = vmw_du_connector_reset,
-	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
-	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+	.atomic_duplicate_state = vmw_du_connector_duplicate_state,
+	.atomic_destroy_state = vmw_du_connector_destroy_state,
 	.atomic_set_property = vmw_du_connector_atomic_set_property,
 	.atomic_get_property = vmw_du_connector_atomic_get_property,
 };
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
index bc5f602..63a4cd7 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_scrn.c
@@ -420,8 +420,8 @@ static const struct drm_connector_funcs vmw_sou_connector_funcs = {
 	.set_property = vmw_du_connector_set_property,
 	.destroy = vmw_sou_connector_destroy,
 	.reset = vmw_du_connector_reset,
-	.atomic_duplicate_state = drm_atomic_helper_connector_duplicate_state,
-	.atomic_destroy_state = drm_atomic_helper_connector_destroy_state,
+	.atomic_duplicate_state = vmw_du_connector_duplicate_state,
+	.atomic_destroy_state = vmw_du_connector_destroy_state,
 	.atomic_set_property = vmw_du_connector_atomic_set_property,
 	.atomic_get_property = vmw_du_connector_atomic_get_property,
 };
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
index 90b5437..b68d748 100644
--- a/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
+++ b/drivers/gpu/drm/vmwgfx/vmwgfx_stdu.c
@@ -114,7 +114,7 @@ struct vmw_screen_target_display_unit {
 	bool defined;
 
 	/* For CPU Blit */
-	struct ttm_bo_kmap_obj host_map, guest_map;
+	struct ttm_bo_kmap_obj host_map;
 	unsigned int cpp;
 };
 
@@ -695,7 +695,8 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 	s32 src_pitch, dst_pitch;
 	u8 *src, *dst;
 	bool not_used;
-
+	struct ttm_bo_kmap_obj guest_map;
+	int ret;
 
 	if (!dirty->num_hits)
 		return;
@@ -706,6 +707,13 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 	if (width == 0 || height == 0)
 		return;
 
+	ret = ttm_bo_kmap(&ddirty->buf->base, 0, ddirty->buf->base.num_pages,
+			  &guest_map);
+	if (ret) {
+		DRM_ERROR("Failed mapping framebuffer for blit: %d\n",
+			  ret);
+		goto out_cleanup;
+	}
 
 	/* Assume we are blitting from Host (display_srf) to Guest (dmabuf) */
 	src_pitch = stdu->display_srf->base_size.width * stdu->cpp;
@@ -713,7 +721,7 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 	src += ddirty->top * src_pitch + ddirty->left * stdu->cpp;
 
 	dst_pitch = ddirty->pitch;
-	dst = ttm_kmap_obj_virtual(&stdu->guest_map, &not_used);
+	dst = ttm_kmap_obj_virtual(&guest_map, &not_used);
 	dst += ddirty->fb_top * dst_pitch + ddirty->fb_left * stdu->cpp;
 
 
@@ -772,6 +780,7 @@ static void vmw_stdu_dmabuf_cpu_commit(struct vmw_kms_dirty *dirty)
 		vmw_fifo_commit(dev_priv, sizeof(*cmd));
 	}
 
+	ttm_bo_kunmap(&guest_map);
 out_cleanup:
 	ddirty->left = ddirty->top = ddirty->fb_left = ddirty->fb_top = S32_MAX;
 	ddirty->right = ddirty->bottom = S32_MIN;
@@ -1109,9 +1118,6 @@ vmw_stdu_primary_plane_cleanup_fb(struct drm_plane *plane,
 {
 	struct vmw_plane_state *vps = vmw_plane_state_to_vps(old_state);
 
-	if (vps->guest_map.virtual)
-		ttm_bo_kunmap(&vps->guest_map);
-
 	if (vps->host_map.virtual)
 		ttm_bo_kunmap(&vps->host_map);
 
@@ -1277,33 +1283,11 @@ vmw_stdu_primary_plane_prepare_fb(struct drm_plane *plane,
 	 */
 	if (vps->content_fb_type == SEPARATE_DMA &&
 	    !(dev_priv->capabilities & SVGA_CAP_3D)) {
-
-		struct vmw_framebuffer_dmabuf *new_vfbd;
-
-		new_vfbd = vmw_framebuffer_to_vfbd(new_fb);
-
-		ret = ttm_bo_reserve(&new_vfbd->buffer->base, false, false,
-				     NULL);
-		if (ret)
-			goto out_srf_unpin;
-
-		ret = ttm_bo_kmap(&new_vfbd->buffer->base, 0,
-				  new_vfbd->buffer->base.num_pages,
-				  &vps->guest_map);
-
-		ttm_bo_unreserve(&new_vfbd->buffer->base);
-
-		if (ret) {
-			DRM_ERROR("Failed to map content buffer to CPU\n");
-			goto out_srf_unpin;
-		}
-
 		ret = ttm_bo_kmap(&vps->surf->res.backup->base, 0,
 				  vps->surf->res.backup->base.num_pages,
 				  &vps->host_map);
 		if (ret) {
 			DRM_ERROR("Failed to map display buffer to CPU\n");
-			ttm_bo_kunmap(&vps->guest_map);
 			goto out_srf_unpin;
 		}
 
@@ -1350,7 +1334,6 @@ vmw_stdu_primary_plane_atomic_update(struct drm_plane *plane,
 	stdu->display_srf = vps->surf;
 	stdu->content_fb_type = vps->content_fb_type;
 	stdu->cpp = vps->cpp;
-	memcpy(&stdu->guest_map, &vps->guest_map, sizeof(vps->guest_map));
 	memcpy(&stdu->host_map, &vps->host_map, sizeof(vps->host_map));
 
 	if (!stdu->defined)
diff --git a/drivers/hwmon/Kconfig b/drivers/hwmon/Kconfig
index 7ad0176..ef23553 100644
--- a/drivers/hwmon/Kconfig
+++ b/drivers/hwmon/Kconfig
@@ -26,11 +26,9 @@
 
 config HWMON_VID
 	tristate
-	default n
 
 config HWMON_DEBUG_CHIP
 	bool "Hardware Monitoring Chip debugging messages"
-	default n
 	help
 	  Say Y here if you want the I2C chip drivers to produce a bunch of
 	  debug messages to the system log.  Select this if you are having
@@ -42,7 +40,6 @@
 config SENSORS_AB8500
 	tristate "AB8500 thermal monitoring"
 	depends on AB8500_GPADC && AB8500_BM
-	default n
 	help
 	  If you say yes here you get support for the thermal sensor part
 	  of the AB8500 chip. The driver includes thermal management for
@@ -302,7 +299,6 @@
 	select NEW_LEDS
 	select LEDS_CLASS
 	select INPUT_POLLDEV
-	default n
 	help
 	  This driver provides support for the Apple System Management
 	  Controller, which provides an accelerometer (Apple Sudden Motion
@@ -678,7 +674,6 @@
 config SENSORS_POWR1220
 	tristate "Lattice POWR1220 Power Monitoring"
 	depends on I2C
-	default n
 	help
 	  If you say yes here you get access to the hardware monitoring
 	  functions of the Lattice POWR1220 isp Power Supply Monitoring,
@@ -702,7 +697,6 @@
 	tristate "Linear Technology LTC2945"
 	depends on I2C
 	select REGMAP_I2C
-	default n
 	help
 	  If you say yes here you get support for Linear Technology LTC2945
 	  I2C System Monitor.
@@ -727,7 +721,6 @@
 config SENSORS_LTC4151
 	tristate "Linear Technology LTC4151"
 	depends on I2C
-	default n
 	help
 	  If you say yes here you get support for Linear Technology LTC4151
 	  High Voltage I2C Current and Voltage Monitor interface.
@@ -738,7 +731,6 @@
 config SENSORS_LTC4215
 	tristate "Linear Technology LTC4215"
 	depends on I2C
-	default n
 	help
 	  If you say yes here you get support for Linear Technology LTC4215
 	  Hot Swap Controller I2C interface.
@@ -750,7 +742,6 @@
 	tristate "Linear Technology LTC4222"
 	depends on I2C
 	select REGMAP_I2C
-	default n
 	help
 	  If you say yes here you get support for Linear Technology LTC4222
 	  Dual Hot Swap Controller I2C interface.
@@ -761,7 +752,6 @@
 config SENSORS_LTC4245
 	tristate "Linear Technology LTC4245"
 	depends on I2C
-	default n
 	help
 	  If you say yes here you get support for Linear Technology LTC4245
 	  Multiple Supply Hot Swap Controller I2C interface.
@@ -773,7 +763,6 @@
 	tristate "Linear Technology LTC4260"
 	depends on I2C
 	select REGMAP_I2C
-	default n
 	help
 	  If you say yes here you get support for Linear Technology LTC4260
 	  Positive Voltage Hot Swap Controller I2C interface.
@@ -784,7 +773,6 @@
 config SENSORS_LTC4261
 	tristate "Linear Technology LTC4261"
 	depends on I2C
-	default n
 	help
 	  If you say yes here you get support for Linear Technology LTC4261
 	  Negative Voltage Hot Swap Controller I2C interface.
@@ -1276,7 +1264,6 @@
 config SENSORS_PCF8591
 	tristate "Philips PCF8591 ADC/DAC"
 	depends on I2C
-	default n
 	help
 	  If you say yes here you get support for Philips PCF8591 4-channel
 	  ADC, 1-channel DAC chips.
@@ -1459,7 +1446,6 @@
 
 config SENSORS_SCH56XX_COMMON
 	tristate
-	default n
 
 config SENSORS_SCH5627
 	tristate "SMSC SCH5627"
@@ -1505,7 +1491,6 @@
 config SENSORS_SMM665
 	tristate "Summit Microelectronics SMM665"
 	depends on I2C
-	default n
 	help
 	  If you say yes here you get support for the hardware monitoring
 	  features of the Summit Microelectronics SMM665/SMM665B Six-Channel
@@ -1725,6 +1710,16 @@
 	  This driver can also be built as a module.  If so, the module
 	  will be called vt8231.
 
+config SENSORS_W83773G
+	tristate "Nuvoton W83773G"
+	depends on I2C
+	help
+	  If you say yes here you get support for the Nuvoton W83773G hardware
+	  monitoring chip.
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called w83773g.
+
 config SENSORS_W83781D
 	tristate "Winbond W83781D, W83782D, W83783S, Asus AS99127F"
 	depends on I2C
@@ -1782,7 +1777,6 @@
 config SENSORS_W83795_FANCTRL
 	bool "Include automatic fan control support (DANGEROUS)"
 	depends on SENSORS_W83795
-	default n
 	help
 	  If you say yes here, support for automatic fan speed control
 	  will be included in the driver.
diff --git a/drivers/hwmon/Makefile b/drivers/hwmon/Makefile
index 0fe489f..f814b4a 100644
--- a/drivers/hwmon/Makefile
+++ b/drivers/hwmon/Makefile
@@ -14,6 +14,7 @@
 # asb100, then w83781d go first, as they can override other drivers' addresses.
 obj-$(CONFIG_SENSORS_ASB100)	+= asb100.o
 obj-$(CONFIG_SENSORS_W83627HF)	+= w83627hf.o
+obj-$(CONFIG_SENSORS_W83773G)	+= w83773g.o
 obj-$(CONFIG_SENSORS_W83792D)	+= w83792d.o
 obj-$(CONFIG_SENSORS_W83793)	+= w83793.o
 obj-$(CONFIG_SENSORS_W83795)	+= w83795.o
diff --git a/drivers/hwmon/aspeed-pwm-tacho.c b/drivers/hwmon/aspeed-pwm-tacho.c
index 63a95e2..693a3d5 100644
--- a/drivers/hwmon/aspeed-pwm-tacho.c
+++ b/drivers/hwmon/aspeed-pwm-tacho.c
@@ -19,6 +19,7 @@
 #include <linux/of_platform.h>
 #include <linux/platform_device.h>
 #include <linux/regmap.h>
+#include <linux/reset.h>
 #include <linux/sysfs.h>
 #include <linux/thermal.h>
 
@@ -181,6 +182,7 @@ struct aspeed_cooling_device {
 
 struct aspeed_pwm_tacho_data {
 	struct regmap *regmap;
+	struct reset_control *rst;
 	unsigned long clk_freq;
 	bool pwm_present[8];
 	bool fan_tach_present[16];
@@ -905,6 +907,13 @@ static int aspeed_create_fan(struct device *dev,
 	return 0;
 }
 
+static void aspeed_pwm_tacho_remove(void *data)
+{
+	struct aspeed_pwm_tacho_data *priv = data;
+
+	reset_control_assert(priv->rst);
+}
+
 static int aspeed_pwm_tacho_probe(struct platform_device *pdev)
 {
 	struct device *dev = &pdev->dev;
@@ -931,6 +940,19 @@ static int aspeed_pwm_tacho_probe(struct platform_device *pdev)
 			&aspeed_pwm_tacho_regmap_config);
 	if (IS_ERR(priv->regmap))
 		return PTR_ERR(priv->regmap);
+
+	priv->rst = devm_reset_control_get_exclusive(dev, NULL);
+	if (IS_ERR(priv->rst)) {
+		dev_err(dev,
+			"missing or invalid reset controller device tree entry");
+		return PTR_ERR(priv->rst);
+	}
+	reset_control_deassert(priv->rst);
+
+	ret = devm_add_action_or_reset(dev, aspeed_pwm_tacho_remove, priv);
+	if (ret)
+		return ret;
+
 	regmap_write(priv->regmap, ASPEED_PTCR_TACH_SOURCE, 0);
 	regmap_write(priv->regmap, ASPEED_PTCR_TACH_SOURCE_EXT, 0);
 
diff --git a/drivers/hwmon/coretemp.c b/drivers/hwmon/coretemp.c
index c13a4fd..4bdbf77 100644
--- a/drivers/hwmon/coretemp.c
+++ b/drivers/hwmon/coretemp.c
@@ -246,7 +246,8 @@ static int adjust_tjmax(struct cpuinfo_x86 *c, u32 id, struct device *dev)
 	int err;
 	u32 eax, edx;
 	int i;
-	struct pci_dev *host_bridge = pci_get_bus_and_slot(0, PCI_DEVFN(0, 0));
+	u16 devfn = PCI_DEVFN(0, 0);
+	struct pci_dev *host_bridge = pci_get_domain_bus_and_slot(0, 0, devfn);
 
 	/*
 	 * Explicit tjmax table entries override heuristics.
diff --git a/drivers/hwmon/dell-smm-hwmon.c b/drivers/hwmon/dell-smm-hwmon.c
index c7c9e95..bf3bb7e 100644
--- a/drivers/hwmon/dell-smm-hwmon.c
+++ b/drivers/hwmon/dell-smm-hwmon.c
@@ -76,6 +76,7 @@ static uint i8k_fan_mult = I8K_FAN_MULT;
 static uint i8k_pwm_mult;
 static uint i8k_fan_max = I8K_FAN_HIGH;
 static bool disallow_fan_type_call;
+static bool disallow_fan_support;
 
 #define I8K_HWMON_HAVE_TEMP1	(1 << 0)
 #define I8K_HWMON_HAVE_TEMP2	(1 << 1)
@@ -242,6 +243,9 @@ static int i8k_get_fan_status(int fan)
 {
 	struct smm_regs regs = { .eax = I8K_SMM_GET_FAN, };
 
+	if (disallow_fan_support)
+		return -EINVAL;
+
 	regs.ebx = fan & 0xff;
 	return i8k_smm(&regs) ? : regs.eax & 0xff;
 }
@@ -253,6 +257,9 @@ static int i8k_get_fan_speed(int fan)
 {
 	struct smm_regs regs = { .eax = I8K_SMM_GET_SPEED, };
 
+	if (disallow_fan_support)
+		return -EINVAL;
+
 	regs.ebx = fan & 0xff;
 	return i8k_smm(&regs) ? : (regs.eax & 0xffff) * i8k_fan_mult;
 }
@@ -264,7 +271,7 @@ static int _i8k_get_fan_type(int fan)
 {
 	struct smm_regs regs = { .eax = I8K_SMM_GET_FAN_TYPE, };
 
-	if (disallow_fan_type_call)
+	if (disallow_fan_support || disallow_fan_type_call)
 		return -EINVAL;
 
 	regs.ebx = fan & 0xff;
@@ -289,6 +296,9 @@ static int i8k_get_fan_nominal_speed(int fan, int speed)
 {
 	struct smm_regs regs = { .eax = I8K_SMM_GET_NOM_SPEED, };
 
+	if (disallow_fan_support)
+		return -EINVAL;
+
 	regs.ebx = (fan & 0xff) | (speed << 8);
 	return i8k_smm(&regs) ? : (regs.eax & 0xffff) * i8k_fan_mult;
 }
@@ -300,6 +310,9 @@ static int i8k_set_fan(int fan, int speed)
 {
 	struct smm_regs regs = { .eax = I8K_SMM_SET_FAN, };
 
+	if (disallow_fan_support)
+		return -EINVAL;
+
 	speed = (speed < 0) ? 0 : ((speed > i8k_fan_max) ? i8k_fan_max : speed);
 	regs.ebx = (fan & 0xff) | (speed << 8);
 
@@ -772,6 +785,8 @@ static struct attribute *i8k_attrs[] = {
 static umode_t i8k_is_visible(struct kobject *kobj, struct attribute *attr,
 			      int index)
 {
+	if (disallow_fan_support && index >= 8)
+		return 0;
 	if (disallow_fan_type_call &&
 	    (index == 9 || index == 12 || index == 15))
 		return 0;
@@ -1039,6 +1054,30 @@ static const struct dmi_system_id i8k_blacklist_fan_type_dmi_table[] __initconst
 };
 
 /*
+ * On some machines all fan related SMM functions implemented by Dell BIOS
+ * firmware freeze kernel for about 500ms. Until Dell fixes these problems fan
+ * support for affected blacklisted Dell machines stay disabled.
+ * See bug: https://bugzilla.kernel.org/show_bug.cgi?id=195751
+ */
+static struct dmi_system_id i8k_blacklist_fan_support_dmi_table[] __initdata = {
+	{
+		.ident = "Dell Inspiron 7720",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Inspiron 7720"),
+		},
+	},
+	{
+		.ident = "Dell Vostro 3360",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
+			DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "Vostro 3360"),
+		},
+	},
+	{ }
+};
+
+/*
  * Probe for the presence of a supported laptop.
  */
 static int __init i8k_probe(void)
@@ -1060,8 +1099,17 @@ static int __init i8k_probe(void)
 			i8k_get_dmi_data(DMI_BIOS_VERSION));
 	}
 
-	if (dmi_check_system(i8k_blacklist_fan_type_dmi_table))
-		disallow_fan_type_call = true;
+	if (dmi_check_system(i8k_blacklist_fan_support_dmi_table)) {
+		pr_warn("broken Dell BIOS detected, disallow fan support\n");
+		if (!force)
+			disallow_fan_support = true;
+	}
+
+	if (dmi_check_system(i8k_blacklist_fan_type_dmi_table)) {
+		pr_warn("broken Dell BIOS detected, disallow fan type call\n");
+		if (!force)
+			disallow_fan_type_call = true;
+	}
 
 	strlcpy(bios_version, i8k_get_dmi_data(DMI_BIOS_VERSION),
 		sizeof(bios_version));
diff --git a/drivers/hwmon/hih6130.c b/drivers/hwmon/hih6130.c
index 7b73d20..0ae1ee1 100644
--- a/drivers/hwmon/hih6130.c
+++ b/drivers/hwmon/hih6130.c
@@ -37,7 +37,7 @@
 
 /**
  * struct hih6130 - HIH-6130 device specific data
- * @hwmon_dev: device registered with hwmon
+ * @client: pointer to I2C client device
  * @lock: mutex to protect measurement values
  * @valid: only false before first measurement is taken
  * @last_update: time of last update (jiffies)
diff --git a/drivers/hwmon/hwmon.c b/drivers/hwmon/hwmon.c
index af51230..32083e4 100644
--- a/drivers/hwmon/hwmon.c
+++ b/drivers/hwmon/hwmon.c
@@ -678,7 +678,7 @@ EXPORT_SYMBOL_GPL(hwmon_device_register_with_groups);
  * @dev: the parent device
  * @name: hwmon name attribute
  * @drvdata: driver data to attach to created device
- * @info: pointer to hwmon chip information
+ * @chip: pointer to hwmon chip information
  * @extra_groups: pointer to list of additional non-standard attribute groups
  *
  * hwmon_device_unregister() must be called when the device is no
@@ -785,11 +785,11 @@ EXPORT_SYMBOL_GPL(devm_hwmon_device_register_with_groups);
 
 /**
  * devm_hwmon_device_register_with_info - register w/ hwmon
- * @dev: the parent device
- * @name: hwmon name attribute
- * @drvdata: driver data to attach to created device
- * @info: Pointer to hwmon chip information
- * @groups - pointer to list of driver specific attribute groups
+ * @dev:	the parent device
+ * @name:	hwmon name attribute
+ * @drvdata:	driver data to attach to created device
+ * @chip:	pointer to hwmon chip information
+ * @groups:	pointer to list of driver specific attribute groups
  *
  * Returns the pointer to the new device. The new device is automatically
  * unregistered with the parent device.
diff --git a/drivers/hwmon/iio_hwmon.c b/drivers/hwmon/iio_hwmon.c
index f6a7667..5e5b32a 100644
--- a/drivers/hwmon/iio_hwmon.c
+++ b/drivers/hwmon/iio_hwmon.c
@@ -23,7 +23,8 @@
  * @channels:		filled with array of channels from iio
  * @num_channels:	number of channels in channels (saves counting twice)
  * @hwmon_dev:		associated hwmon device
- * @attr_group:	the group of attributes
+ * @attr_group:		the group of attributes
+ * @groups:		null terminated array of attribute groups
  * @attrs:		null terminated array of attribute pointers.
  */
 struct iio_hwmon_state {
diff --git a/drivers/hwmon/ina2xx.c b/drivers/hwmon/ina2xx.c
index 62e38fa..e9e6aea 100644
--- a/drivers/hwmon/ina2xx.c
+++ b/drivers/hwmon/ina2xx.c
@@ -95,18 +95,20 @@ enum ina2xx_ids { ina219, ina226 };
 
 struct ina2xx_config {
 	u16 config_default;
-	int calibration_factor;
+	int calibration_value;
 	int registers;
 	int shunt_div;
 	int bus_voltage_shift;
 	int bus_voltage_lsb;	/* uV */
-	int power_lsb;		/* uW */
+	int power_lsb_factor;
 };
 
 struct ina2xx_data {
 	const struct ina2xx_config *config;
 
 	long rshunt;
+	long current_lsb_uA;
+	long power_lsb_uW;
 	struct mutex config_lock;
 	struct regmap *regmap;
 
@@ -116,21 +118,21 @@ struct ina2xx_data {
 static const struct ina2xx_config ina2xx_config[] = {
 	[ina219] = {
 		.config_default = INA219_CONFIG_DEFAULT,
-		.calibration_factor = 40960000,
+		.calibration_value = 4096,
 		.registers = INA219_REGISTERS,
 		.shunt_div = 100,
 		.bus_voltage_shift = 3,
 		.bus_voltage_lsb = 4000,
-		.power_lsb = 20000,
+		.power_lsb_factor = 20,
 	},
 	[ina226] = {
 		.config_default = INA226_CONFIG_DEFAULT,
-		.calibration_factor = 5120000,
+		.calibration_value = 2048,
 		.registers = INA226_REGISTERS,
 		.shunt_div = 400,
 		.bus_voltage_shift = 0,
 		.bus_voltage_lsb = 1250,
-		.power_lsb = 25000,
+		.power_lsb_factor = 25,
 	},
 };
 
@@ -169,12 +171,16 @@ static u16 ina226_interval_to_reg(int interval)
 	return INA226_SHIFT_AVG(avg_bits);
 }
 
+/*
+ * Calibration register is set to the best value, which eliminates
+ * truncation errors on calculating current register in hardware.
+ * According to datasheet (eq. 3) the best values are 2048 for
+ * ina226 and 4096 for ina219. They are hardcoded as calibration_value.
+ */
 static int ina2xx_calibrate(struct ina2xx_data *data)
 {
-	u16 val = DIV_ROUND_CLOSEST(data->config->calibration_factor,
-				    data->rshunt);
-
-	return regmap_write(data->regmap, INA2XX_CALIBRATION, val);
+	return regmap_write(data->regmap, INA2XX_CALIBRATION,
+			    data->config->calibration_value);
 }
 
 /*
@@ -187,10 +193,6 @@ static int ina2xx_init(struct ina2xx_data *data)
 	if (ret < 0)
 		return ret;
 
-	/*
-	 * Set current LSB to 1mA, shunt is in uOhms
-	 * (equation 13 in datasheet).
-	 */
 	return ina2xx_calibrate(data);
 }
 
@@ -268,15 +270,15 @@ static int ina2xx_get_value(struct ina2xx_data *data, u8 reg,
 		val = DIV_ROUND_CLOSEST(val, 1000);
 		break;
 	case INA2XX_POWER:
-		val = regval * data->config->power_lsb;
+		val = regval * data->power_lsb_uW;
 		break;
 	case INA2XX_CURRENT:
-		/* signed register, LSB=1mA (selected), in mA */
-		val = (s16)regval;
+		/* signed register, result in mA */
+		val = regval * data->current_lsb_uA;
+		val = DIV_ROUND_CLOSEST(val, 1000);
 		break;
 	case INA2XX_CALIBRATION:
-		val = DIV_ROUND_CLOSEST(data->config->calibration_factor,
-					regval);
+		val = regval;
 		break;
 	default:
 		/* programmer goofed */
@@ -304,9 +306,32 @@ static ssize_t ina2xx_show_value(struct device *dev,
 			ina2xx_get_value(data, attr->index, regval));
 }
 
-static ssize_t ina2xx_set_shunt(struct device *dev,
-				struct device_attribute *da,
-				const char *buf, size_t count)
+/*
+ * In order to keep calibration register value fixed, the product
+ * of current_lsb and shunt_resistor should also be fixed and equal
+ * to shunt_voltage_lsb = 1 / shunt_div multiplied by 10^9 in order
+ * to keep the scale.
+ */
+static int ina2xx_set_shunt(struct ina2xx_data *data, long val)
+{
+	unsigned int dividend = DIV_ROUND_CLOSEST(1000000000,
+						  data->config->shunt_div);
+	if (val <= 0 || val > dividend)
+		return -EINVAL;
+
+	mutex_lock(&data->config_lock);
+	data->rshunt = val;
+	data->current_lsb_uA = DIV_ROUND_CLOSEST(dividend, val);
+	data->power_lsb_uW = data->config->power_lsb_factor *
+			     data->current_lsb_uA;
+	mutex_unlock(&data->config_lock);
+
+	return 0;
+}
+
+static ssize_t ina2xx_store_shunt(struct device *dev,
+				  struct device_attribute *da,
+				  const char *buf, size_t count)
 {
 	unsigned long val;
 	int status;
@@ -316,18 +341,9 @@ static ssize_t ina2xx_set_shunt(struct device *dev,
 	if (status < 0)
 		return status;
 
-	if (val == 0 ||
-	    /* Values greater than the calibration factor make no sense. */
-	    val > data->config->calibration_factor)
-		return -EINVAL;
-
-	mutex_lock(&data->config_lock);
-	data->rshunt = val;
-	status = ina2xx_calibrate(data);
-	mutex_unlock(&data->config_lock);
+	status = ina2xx_set_shunt(data, val);
 	if (status < 0)
 		return status;
-
 	return count;
 }
 
@@ -387,7 +403,7 @@ static SENSOR_DEVICE_ATTR(power1_input, S_IRUGO, ina2xx_show_value, NULL,
 
 /* shunt resistance */
 static SENSOR_DEVICE_ATTR(shunt_resistor, S_IRUGO | S_IWUSR,
-			  ina2xx_show_value, ina2xx_set_shunt,
+			  ina2xx_show_value, ina2xx_store_shunt,
 			  INA2XX_CALIBRATION);
 
 /* update interval (ina226 only) */
@@ -438,6 +454,7 @@ static int ina2xx_probe(struct i2c_client *client,
 
 	/* set the device type */
 	data->config = &ina2xx_config[chip];
+	mutex_init(&data->config_lock);
 
 	if (of_property_read_u32(dev->of_node, "shunt-resistor", &val) < 0) {
 		struct ina2xx_platform_data *pdata = dev_get_platdata(dev);
@@ -448,10 +465,7 @@ static int ina2xx_probe(struct i2c_client *client,
 			val = INA2XX_RSHUNT_DEFAULT;
 	}
 
-	if (val <= 0 || val > data->config->calibration_factor)
-		return -ENODEV;
-
-	data->rshunt = val;
+	ina2xx_set_shunt(data, val);
 
 	ina2xx_regmap_config.max_register = data->config->registers;
 
@@ -467,8 +481,6 @@ static int ina2xx_probe(struct i2c_client *client,
 		return -ENODEV;
 	}
 
-	mutex_init(&data->config_lock);
-
 	data->groups[group++] = &ina2xx_group;
 	if (id->driver_data == ina226)
 		data->groups[group++] = &ina226_group;
diff --git a/drivers/hwmon/k10temp.c b/drivers/hwmon/k10temp.c
index 0721e17..06b4e1c 100644
--- a/drivers/hwmon/k10temp.c
+++ b/drivers/hwmon/k10temp.c
@@ -86,6 +86,7 @@ static const struct tctl_offset tctl_offset_table[] = {
 	{ 0x17, "AMD Ryzen 7 1800X", 20000 },
 	{ 0x17, "AMD Ryzen Threadripper 1950X", 27000 },
 	{ 0x17, "AMD Ryzen Threadripper 1920X", 27000 },
+	{ 0x17, "AMD Ryzen Threadripper 1900X", 27000 },
 	{ 0x17, "AMD Ryzen Threadripper 1950", 10000 },
 	{ 0x17, "AMD Ryzen Threadripper 1920", 10000 },
 	{ 0x17, "AMD Ryzen Threadripper 1910", 10000 },
diff --git a/drivers/hwmon/lm75.c b/drivers/hwmon/lm75.c
index 005ffb5..49f4b33 100644
--- a/drivers/hwmon/lm75.c
+++ b/drivers/hwmon/lm75.c
@@ -100,7 +100,7 @@ static int lm75_read(struct device *dev, enum hwmon_sensor_types type,
 		switch (attr) {
 		case hwmon_chip_update_interval:
 			*val = data->sample_time;
-			break;;
+			break;
 		default:
 			return -EINVAL;
 		}
diff --git a/drivers/hwmon/pmbus/Kconfig b/drivers/hwmon/pmbus/Kconfig
index 0847900..6e4298e 100644
--- a/drivers/hwmon/pmbus/Kconfig
+++ b/drivers/hwmon/pmbus/Kconfig
@@ -39,6 +39,7 @@
 
 config SENSORS_IBM_CFFPS
 	tristate "IBM Common Form Factor Power Supply"
+	depends on LEDS_CLASS
 	help
 	  If you say yes here you get hardware monitoring support for the IBM
 	  Common Form Factor power supply.
diff --git a/drivers/hwmon/pmbus/ibm-cffps.c b/drivers/hwmon/pmbus/ibm-cffps.c
index cb56da6..93d9a9e 100644
--- a/drivers/hwmon/pmbus/ibm-cffps.c
+++ b/drivers/hwmon/pmbus/ibm-cffps.c
@@ -8,12 +8,29 @@
  */
 
 #include <linux/bitops.h>
+#include <linux/debugfs.h>
 #include <linux/device.h>
+#include <linux/fs.h>
 #include <linux/i2c.h>
+#include <linux/jiffies.h>
+#include <linux/leds.h>
 #include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/pmbus.h>
 
 #include "pmbus.h"
 
+#define CFFPS_FRU_CMD				0x9A
+#define CFFPS_PN_CMD				0x9B
+#define CFFPS_SN_CMD				0x9E
+#define CFFPS_CCIN_CMD				0xBD
+#define CFFPS_FW_CMD_START			0xFA
+#define CFFPS_FW_NUM_BYTES			4
+#define CFFPS_SYS_CONFIG_CMD			0xDA
+
+#define CFFPS_INPUT_HISTORY_CMD			0xD6
+#define CFFPS_INPUT_HISTORY_SIZE		100
+
 /* STATUS_MFR_SPECIFIC bits */
 #define CFFPS_MFR_FAN_FAULT			BIT(0)
 #define CFFPS_MFR_THERMAL_FAULT			BIT(1)
@@ -24,6 +41,153 @@
 #define CFFPS_MFR_VAUX_FAULT			BIT(6)
 #define CFFPS_MFR_CURRENT_SHARE_WARNING		BIT(7)
 
+#define CFFPS_LED_BLINK				BIT(0)
+#define CFFPS_LED_ON				BIT(1)
+#define CFFPS_LED_OFF				BIT(2)
+#define CFFPS_BLINK_RATE_MS			250
+
+enum {
+	CFFPS_DEBUGFS_INPUT_HISTORY = 0,
+	CFFPS_DEBUGFS_FRU,
+	CFFPS_DEBUGFS_PN,
+	CFFPS_DEBUGFS_SN,
+	CFFPS_DEBUGFS_CCIN,
+	CFFPS_DEBUGFS_FW,
+	CFFPS_DEBUGFS_NUM_ENTRIES
+};
+
+struct ibm_cffps_input_history {
+	struct mutex update_lock;
+	unsigned long last_update;
+
+	u8 byte_count;
+	u8 data[CFFPS_INPUT_HISTORY_SIZE];
+};
+
+struct ibm_cffps {
+	struct i2c_client *client;
+
+	struct ibm_cffps_input_history input_history;
+
+	int debugfs_entries[CFFPS_DEBUGFS_NUM_ENTRIES];
+
+	char led_name[32];
+	u8 led_state;
+	struct led_classdev led;
+};
+
+#define to_psu(x, y) container_of((x), struct ibm_cffps, debugfs_entries[(y)])
+
+static ssize_t ibm_cffps_read_input_history(struct ibm_cffps *psu,
+					    char __user *buf, size_t count,
+					    loff_t *ppos)
+{
+	int rc;
+	u8 msgbuf0[1] = { CFFPS_INPUT_HISTORY_CMD };
+	u8 msgbuf1[CFFPS_INPUT_HISTORY_SIZE + 1] = { 0 };
+	struct i2c_msg msg[2] = {
+		{
+			.addr = psu->client->addr,
+			.flags = psu->client->flags,
+			.len = 1,
+			.buf = msgbuf0,
+		}, {
+			.addr = psu->client->addr,
+			.flags = psu->client->flags | I2C_M_RD,
+			.len = CFFPS_INPUT_HISTORY_SIZE + 1,
+			.buf = msgbuf1,
+		},
+	};
+
+	if (!*ppos) {
+		mutex_lock(&psu->input_history.update_lock);
+		if (time_after(jiffies, psu->input_history.last_update + HZ)) {
+			/*
+			 * Use a raw i2c transfer, since we need more bytes
+			 * than Linux I2C supports through smbus xfr (only 32).
+			 */
+			rc = i2c_transfer(psu->client->adapter, msg, 2);
+			if (rc < 0) {
+				mutex_unlock(&psu->input_history.update_lock);
+				return rc;
+			}
+
+			psu->input_history.byte_count = msgbuf1[0];
+			memcpy(psu->input_history.data, &msgbuf1[1],
+			       CFFPS_INPUT_HISTORY_SIZE);
+			psu->input_history.last_update = jiffies;
+		}
+
+		mutex_unlock(&psu->input_history.update_lock);
+	}
+
+	return simple_read_from_buffer(buf, count, ppos,
+				       psu->input_history.data,
+				       psu->input_history.byte_count);
+}
+
+static ssize_t ibm_cffps_debugfs_op(struct file *file, char __user *buf,
+				    size_t count, loff_t *ppos)
+{
+	u8 cmd;
+	int i, rc;
+	int *idxp = file->private_data;
+	int idx = *idxp;
+	struct ibm_cffps *psu = to_psu(idxp, idx);
+	char data[I2C_SMBUS_BLOCK_MAX] = { 0 };
+
+	switch (idx) {
+	case CFFPS_DEBUGFS_INPUT_HISTORY:
+		return ibm_cffps_read_input_history(psu, buf, count, ppos);
+	case CFFPS_DEBUGFS_FRU:
+		cmd = CFFPS_FRU_CMD;
+		break;
+	case CFFPS_DEBUGFS_PN:
+		cmd = CFFPS_PN_CMD;
+		break;
+	case CFFPS_DEBUGFS_SN:
+		cmd = CFFPS_SN_CMD;
+		break;
+	case CFFPS_DEBUGFS_CCIN:
+		rc = i2c_smbus_read_word_swapped(psu->client, CFFPS_CCIN_CMD);
+		if (rc < 0)
+			return rc;
+
+		rc = snprintf(data, 5, "%04X", rc);
+		goto done;
+	case CFFPS_DEBUGFS_FW:
+		for (i = 0; i < CFFPS_FW_NUM_BYTES; ++i) {
+			rc = i2c_smbus_read_byte_data(psu->client,
+						      CFFPS_FW_CMD_START + i);
+			if (rc < 0)
+				return rc;
+
+			snprintf(&data[i * 2], 3, "%02X", rc);
+		}
+
+		rc = i * 2;
+		goto done;
+	default:
+		return -EINVAL;
+	}
+
+	rc = i2c_smbus_read_block_data(psu->client, cmd, data);
+	if (rc < 0)
+		return rc;
+
+done:
+	data[rc] = '\n';
+	rc += 2;
+
+	return simple_read_from_buffer(buf, count, ppos, data, rc);
+}
+
+static const struct file_operations ibm_cffps_fops = {
+	.llseek = noop_llseek,
+	.read = ibm_cffps_debugfs_op,
+	.open = simple_open,
+};
+
 static int ibm_cffps_read_byte_data(struct i2c_client *client, int page,
 				    int reg)
 {
@@ -105,6 +269,69 @@ static int ibm_cffps_read_word_data(struct i2c_client *client, int page,
 	return rc;
 }
 
+static void ibm_cffps_led_brightness_set(struct led_classdev *led_cdev,
+					 enum led_brightness brightness)
+{
+	int rc;
+	struct ibm_cffps *psu = container_of(led_cdev, struct ibm_cffps, led);
+
+	if (brightness == LED_OFF) {
+		psu->led_state = CFFPS_LED_OFF;
+	} else {
+		brightness = LED_FULL;
+		if (psu->led_state != CFFPS_LED_BLINK)
+			psu->led_state = CFFPS_LED_ON;
+	}
+
+	rc = i2c_smbus_write_byte_data(psu->client, CFFPS_SYS_CONFIG_CMD,
+				       psu->led_state);
+	if (rc < 0)
+		return;
+
+	led_cdev->brightness = brightness;
+}
+
+static int ibm_cffps_led_blink_set(struct led_classdev *led_cdev,
+				   unsigned long *delay_on,
+				   unsigned long *delay_off)
+{
+	int rc;
+	struct ibm_cffps *psu = container_of(led_cdev, struct ibm_cffps, led);
+
+	psu->led_state = CFFPS_LED_BLINK;
+
+	if (led_cdev->brightness == LED_OFF)
+		return 0;
+
+	rc = i2c_smbus_write_byte_data(psu->client, CFFPS_SYS_CONFIG_CMD,
+				       CFFPS_LED_BLINK);
+	if (rc < 0)
+		return rc;
+
+	*delay_on = CFFPS_BLINK_RATE_MS;
+	*delay_off = CFFPS_BLINK_RATE_MS;
+
+	return 0;
+}
+
+static void ibm_cffps_create_led_class(struct ibm_cffps *psu)
+{
+	int rc;
+	struct i2c_client *client = psu->client;
+	struct device *dev = &client->dev;
+
+	snprintf(psu->led_name, sizeof(psu->led_name), "%s-%02x", client->name,
+		 client->addr);
+	psu->led.name = psu->led_name;
+	psu->led.max_brightness = LED_FULL;
+	psu->led.brightness_set = ibm_cffps_led_brightness_set;
+	psu->led.blink_set = ibm_cffps_led_blink_set;
+
+	rc = devm_led_classdev_register(dev, &psu->led);
+	if (rc)
+		dev_warn(dev, "failed to register led class: %d\n", rc);
+}
+
 static struct pmbus_driver_info ibm_cffps_info = {
 	.pages = 1,
 	.func[0] = PMBUS_HAVE_VIN | PMBUS_HAVE_VOUT | PMBUS_HAVE_IOUT |
@@ -116,10 +343,69 @@ static struct pmbus_driver_info ibm_cffps_info = {
 	.read_word_data = ibm_cffps_read_word_data,
 };
 
+static struct pmbus_platform_data ibm_cffps_pdata = {
+	.flags = PMBUS_SKIP_STATUS_CHECK,
+};
+
 static int ibm_cffps_probe(struct i2c_client *client,
 			   const struct i2c_device_id *id)
 {
-	return pmbus_do_probe(client, id, &ibm_cffps_info);
+	int i, rc;
+	struct dentry *debugfs;
+	struct dentry *ibm_cffps_dir;
+	struct ibm_cffps *psu;
+
+	client->dev.platform_data = &ibm_cffps_pdata;
+	rc = pmbus_do_probe(client, id, &ibm_cffps_info);
+	if (rc)
+		return rc;
+
+	/*
+	 * Don't fail the probe if there isn't enough memory for leds and
+	 * debugfs.
+	 */
+	psu = devm_kzalloc(&client->dev, sizeof(*psu), GFP_KERNEL);
+	if (!psu)
+		return 0;
+
+	psu->client = client;
+	mutex_init(&psu->input_history.update_lock);
+	psu->input_history.last_update = jiffies - HZ;
+
+	ibm_cffps_create_led_class(psu);
+
+	/* Don't fail the probe if we can't create debugfs */
+	debugfs = pmbus_get_debugfs_dir(client);
+	if (!debugfs)
+		return 0;
+
+	ibm_cffps_dir = debugfs_create_dir(client->name, debugfs);
+	if (!ibm_cffps_dir)
+		return 0;
+
+	for (i = 0; i < CFFPS_DEBUGFS_NUM_ENTRIES; ++i)
+		psu->debugfs_entries[i] = i;
+
+	debugfs_create_file("input_history", 0444, ibm_cffps_dir,
+			    &psu->debugfs_entries[CFFPS_DEBUGFS_INPUT_HISTORY],
+			    &ibm_cffps_fops);
+	debugfs_create_file("fru", 0444, ibm_cffps_dir,
+			    &psu->debugfs_entries[CFFPS_DEBUGFS_FRU],
+			    &ibm_cffps_fops);
+	debugfs_create_file("part_number", 0444, ibm_cffps_dir,
+			    &psu->debugfs_entries[CFFPS_DEBUGFS_PN],
+			    &ibm_cffps_fops);
+	debugfs_create_file("serial_number", 0444, ibm_cffps_dir,
+			    &psu->debugfs_entries[CFFPS_DEBUGFS_SN],
+			    &ibm_cffps_fops);
+	debugfs_create_file("ccin", 0444, ibm_cffps_dir,
+			    &psu->debugfs_entries[CFFPS_DEBUGFS_CCIN],
+			    &ibm_cffps_fops);
+	debugfs_create_file("fw_version", 0444, ibm_cffps_dir,
+			    &psu->debugfs_entries[CFFPS_DEBUGFS_FW],
+			    &ibm_cffps_fops);
+
+	return 0;
 }
 
 static const struct i2c_device_id ibm_cffps_id[] = {
diff --git a/drivers/hwmon/pmbus/ir35221.c b/drivers/hwmon/pmbus/ir35221.c
index 8b906b4..977315b 100644
--- a/drivers/hwmon/pmbus/ir35221.c
+++ b/drivers/hwmon/pmbus/ir35221.c
@@ -25,168 +25,19 @@
 #define IR35221_MFR_IOUT_VALLEY		0xcb
 #define IR35221_MFR_TEMP_VALLEY		0xcc
 
-static long ir35221_reg2data(int data, enum pmbus_sensor_classes class)
-{
-	s16 exponent;
-	s32 mantissa;
-	long val;
-
-	/* We only modify LINEAR11 formats */
-	exponent = ((s16)data) >> 11;
-	mantissa = ((s16)((data & 0x7ff) << 5)) >> 5;
-
-	val = mantissa * 1000L;
-
-	/* scale result to micro-units for power sensors */
-	if (class == PSC_POWER)
-		val = val * 1000L;
-
-	if (exponent >= 0)
-		val <<= exponent;
-	else
-		val >>= -exponent;
-
-	return val;
-}
-
-#define MAX_MANTISSA	(1023 * 1000)
-#define MIN_MANTISSA	(511 * 1000)
-
-static u16 ir35221_data2reg(long val, enum pmbus_sensor_classes class)
-{
-	s16 exponent = 0, mantissa;
-	bool negative = false;
-
-	if (val == 0)
-		return 0;
-
-	if (val < 0) {
-		negative = true;
-		val = -val;
-	}
-
-	/* Power is in uW. Convert to mW before converting. */
-	if (class == PSC_POWER)
-		val = DIV_ROUND_CLOSEST(val, 1000L);
-
-	/* Reduce large mantissa until it fits into 10 bit */
-	while (val >= MAX_MANTISSA && exponent < 15) {
-		exponent++;
-		val >>= 1;
-	}
-	/* Increase small mantissa to improve precision */
-	while (val < MIN_MANTISSA && exponent > -15) {
-		exponent--;
-		val <<= 1;
-	}
-
-	/* Convert mantissa from milli-units to units */
-	mantissa = DIV_ROUND_CLOSEST(val, 1000);
-
-	/* Ensure that resulting number is within range */
-	if (mantissa > 0x3ff)
-		mantissa = 0x3ff;
-
-	/* restore sign */
-	if (negative)
-		mantissa = -mantissa;
-
-	/* Convert to 5 bit exponent, 11 bit mantissa */
-	return (mantissa & 0x7ff) | ((exponent << 11) & 0xf800);
-}
-
-static u16 ir35221_scale_result(s16 data, int shift,
-				enum pmbus_sensor_classes class)
-{
-	long val;
-
-	val = ir35221_reg2data(data, class);
-
-	if (shift < 0)
-		val >>= -shift;
-	else
-		val <<= shift;
-
-	return ir35221_data2reg(val, class);
-}
-
 static int ir35221_read_word_data(struct i2c_client *client, int page, int reg)
 {
 	int ret;
 
 	switch (reg) {
-	case PMBUS_IOUT_OC_FAULT_LIMIT:
-	case PMBUS_IOUT_OC_WARN_LIMIT:
-		ret = pmbus_read_word_data(client, page, reg);
-		if (ret < 0)
-			break;
-		ret = ir35221_scale_result(ret, 1, PSC_CURRENT_OUT);
-		break;
-	case PMBUS_VIN_OV_FAULT_LIMIT:
-	case PMBUS_VIN_OV_WARN_LIMIT:
-	case PMBUS_VIN_UV_WARN_LIMIT:
-		ret = pmbus_read_word_data(client, page, reg);
-		ret = ir35221_scale_result(ret, -4, PSC_VOLTAGE_IN);
-		break;
-	case PMBUS_IIN_OC_WARN_LIMIT:
-		ret = pmbus_read_word_data(client, page, reg);
-		if (ret < 0)
-			break;
-		ret = ir35221_scale_result(ret, -1, PSC_CURRENT_IN);
-		break;
-	case PMBUS_READ_VIN:
-		ret = pmbus_read_word_data(client, page, PMBUS_READ_VIN);
-		if (ret < 0)
-			break;
-		ret = ir35221_scale_result(ret, -5, PSC_VOLTAGE_IN);
-		break;
-	case PMBUS_READ_IIN:
-		ret = pmbus_read_word_data(client, page, PMBUS_READ_IIN);
-		if (ret < 0)
-			break;
-		if (page == 0)
-			ret = ir35221_scale_result(ret, -4, PSC_CURRENT_IN);
-		else
-			ret = ir35221_scale_result(ret, -5, PSC_CURRENT_IN);
-		break;
-	case PMBUS_READ_POUT:
-		ret = pmbus_read_word_data(client, page, PMBUS_READ_POUT);
-		if (ret < 0)
-			break;
-		ret = ir35221_scale_result(ret, -1, PSC_POWER);
-		break;
-	case PMBUS_READ_PIN:
-		ret = pmbus_read_word_data(client, page, PMBUS_READ_PIN);
-		if (ret < 0)
-			break;
-		ret = ir35221_scale_result(ret, -1, PSC_POWER);
-		break;
-	case PMBUS_READ_IOUT:
-		ret = pmbus_read_word_data(client, page, PMBUS_READ_IOUT);
-		if (ret < 0)
-			break;
-		if (page == 0)
-			ret = ir35221_scale_result(ret, -1, PSC_CURRENT_OUT);
-		else
-			ret = ir35221_scale_result(ret, -2, PSC_CURRENT_OUT);
-		break;
 	case PMBUS_VIRT_READ_VIN_MAX:
 		ret = pmbus_read_word_data(client, page, IR35221_MFR_VIN_PEAK);
-		if (ret < 0)
-			break;
-		ret = ir35221_scale_result(ret, -5, PSC_VOLTAGE_IN);
 		break;
 	case PMBUS_VIRT_READ_VOUT_MAX:
 		ret = pmbus_read_word_data(client, page, IR35221_MFR_VOUT_PEAK);
 		break;
 	case PMBUS_VIRT_READ_IOUT_MAX:
 		ret = pmbus_read_word_data(client, page, IR35221_MFR_IOUT_PEAK);
-		if (ret < 0)
-			break;
-		if (page == 0)
-			ret = ir35221_scale_result(ret, -1, PSC_CURRENT_IN);
-		else
-			ret = ir35221_scale_result(ret, -2, PSC_CURRENT_IN);
 		break;
 	case PMBUS_VIRT_READ_TEMP_MAX:
 		ret = pmbus_read_word_data(client, page, IR35221_MFR_TEMP_PEAK);
@@ -194,9 +45,6 @@ static int ir35221_read_word_data(struct i2c_client *client, int page, int reg)
 	case PMBUS_VIRT_READ_VIN_MIN:
 		ret = pmbus_read_word_data(client, page,
 					   IR35221_MFR_VIN_VALLEY);
-		if (ret < 0)
-			break;
-		ret = ir35221_scale_result(ret, -5, PSC_VOLTAGE_IN);
 		break;
 	case PMBUS_VIRT_READ_VOUT_MIN:
 		ret = pmbus_read_word_data(client, page,
@@ -205,12 +53,6 @@ static int ir35221_read_word_data(struct i2c_client *client, int page, int reg)
 	case PMBUS_VIRT_READ_IOUT_MIN:
 		ret = pmbus_read_word_data(client, page,
 					   IR35221_MFR_IOUT_VALLEY);
-		if (ret < 0)
-			break;
-		if (page == 0)
-			ret = ir35221_scale_result(ret, -1, PSC_CURRENT_IN);
-		else
-			ret = ir35221_scale_result(ret, -2, PSC_CURRENT_IN);
 		break;
 	case PMBUS_VIRT_READ_TEMP_MIN:
 		ret = pmbus_read_word_data(client, page,
@@ -224,36 +66,6 @@ static int ir35221_read_word_data(struct i2c_client *client, int page, int reg)
 	return ret;
 }
 
-static int ir35221_write_word_data(struct i2c_client *client, int page, int reg,
-				   u16 word)
-{
-	int ret;
-	u16 val;
-
-	switch (reg) {
-	case PMBUS_IOUT_OC_FAULT_LIMIT:
-	case PMBUS_IOUT_OC_WARN_LIMIT:
-		val = ir35221_scale_result(word, -1, PSC_CURRENT_OUT);
-		ret = pmbus_write_word_data(client, page, reg, val);
-		break;
-	case PMBUS_VIN_OV_FAULT_LIMIT:
-	case PMBUS_VIN_OV_WARN_LIMIT:
-	case PMBUS_VIN_UV_WARN_LIMIT:
-		val = ir35221_scale_result(word, 4, PSC_VOLTAGE_IN);
-		ret = pmbus_write_word_data(client, page, reg, val);
-		break;
-	case PMBUS_IIN_OC_WARN_LIMIT:
-		val = ir35221_scale_result(word, 1, PSC_CURRENT_IN);
-		ret = pmbus_write_word_data(client, page, reg, val);
-		break;
-	default:
-		ret = -ENODATA;
-		break;
-	}
-
-	return ret;
-}
-
 static int ir35221_probe(struct i2c_client *client,
 			 const struct i2c_device_id *id)
 {
@@ -292,7 +104,6 @@ static int ir35221_probe(struct i2c_client *client,
 	if (!info)
 		return -ENOMEM;
 
-	info->write_word_data = ir35221_write_word_data;
 	info->read_word_data = ir35221_read_word_data;
 
 	info->pages = 2;
diff --git a/drivers/hwmon/pmbus/lm25066.c b/drivers/hwmon/pmbus/lm25066.c
index 10d17fb..53db787 100644
--- a/drivers/hwmon/pmbus/lm25066.c
+++ b/drivers/hwmon/pmbus/lm25066.c
@@ -1,5 +1,5 @@
 /*
- * Hardware monitoring driver for LM25056 / LM25063 / LM25066 / LM5064 / LM5066
+ * Hardware monitoring driver for LM25056 / LM25066 / LM5064 / LM5066
  *
  * Copyright (c) 2011 Ericsson AB.
  * Copyright (c) 2013 Guenter Roeck
@@ -28,7 +28,7 @@
 #include <linux/i2c.h>
 #include "pmbus.h"
 
-enum chips { lm25056, lm25063, lm25066, lm5064, lm5066, lm5066i };
+enum chips { lm25056, lm25066, lm5064, lm5066, lm5066i };
 
 #define LM25066_READ_VAUX		0xd0
 #define LM25066_MFR_READ_IIN		0xd1
@@ -53,11 +53,6 @@ enum chips { lm25056, lm25063, lm25066, lm5064, lm5066, lm5066i };
 #define LM25056_MFR_STS_VAUX_OV_WARN	BIT(1)
 #define LM25056_MFR_STS_VAUX_UV_WARN	BIT(0)
 
-/* LM25063 only */
-
-#define LM25063_READ_VOUT_MAX		0xe5
-#define LM25063_READ_VOUT_MIN		0xe6
-
 struct __coeff {
 	short m, b, R;
 };
@@ -122,36 +117,6 @@ static struct __coeff lm25066_coeff[6][PSC_NUM_CLASSES + 2] = {
 			.m = 16,
 		},
 	},
-	[lm25063] = {
-		[PSC_VOLTAGE_IN] = {
-			.m = 16000,
-			.R = -2,
-		},
-		[PSC_VOLTAGE_OUT] = {
-			.m = 16000,
-			.R = -2,
-		},
-		[PSC_CURRENT_IN] = {
-			.m = 10000,
-			.R = -2,
-		},
-		[PSC_CURRENT_IN_L] = {
-			.m = 10000,
-			.R = -2,
-		},
-		[PSC_POWER] = {
-			.m = 5000,
-			.R = -3,
-		},
-		[PSC_POWER_L] = {
-			.m = 5000,
-			.R = -3,
-		},
-		[PSC_TEMPERATURE] = {
-			.m = 15596,
-			.R = -3,
-		},
-	},
 	[lm5064] = {
 		[PSC_VOLTAGE_IN] = {
 			.m = 4611,
@@ -272,10 +237,6 @@ static int lm25066_read_word_data(struct i2c_client *client, int page, int reg)
 			/* VIN: 6.14 mV VAUX: 293 uV LSB */
 			ret = DIV_ROUND_CLOSEST(ret * 293, 6140);
 			break;
-		case lm25063:
-			/* VIN: 6.25 mV VAUX: 200.0 uV LSB */
-			ret = DIV_ROUND_CLOSEST(ret * 20, 625);
-			break;
 		case lm25066:
 			/* VIN: 4.54 mV VAUX: 283.2 uV LSB */
 			ret = DIV_ROUND_CLOSEST(ret * 2832, 45400);
@@ -330,24 +291,6 @@ static int lm25066_read_word_data(struct i2c_client *client, int page, int reg)
 	return ret;
 }
 
-static int lm25063_read_word_data(struct i2c_client *client, int page, int reg)
-{
-	int ret;
-
-	switch (reg) {
-	case PMBUS_VIRT_READ_VOUT_MAX:
-		ret = pmbus_read_word_data(client, 0, LM25063_READ_VOUT_MAX);
-		break;
-	case PMBUS_VIRT_READ_VOUT_MIN:
-		ret = pmbus_read_word_data(client, 0, LM25063_READ_VOUT_MIN);
-		break;
-	default:
-		ret = lm25066_read_word_data(client, page, reg);
-		break;
-	}
-	return ret;
-}
-
 static int lm25056_read_word_data(struct i2c_client *client, int page, int reg)
 {
 	int ret;
@@ -502,11 +445,6 @@ static int lm25066_probe(struct i2c_client *client,
 		info->read_word_data = lm25056_read_word_data;
 		info->read_byte_data = lm25056_read_byte_data;
 		data->rlimit = 0x0fff;
-	} else if (data->id == lm25063) {
-		info->func[0] |= PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT
-		  | PMBUS_HAVE_POUT;
-		info->read_word_data = lm25063_read_word_data;
-		data->rlimit = 0xffff;
 	} else {
 		info->func[0] |= PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT;
 		info->read_word_data = lm25066_read_word_data;
@@ -543,7 +481,6 @@ static int lm25066_probe(struct i2c_client *client,
 
 static const struct i2c_device_id lm25066_id[] = {
 	{"lm25056", lm25056},
-	{"lm25063", lm25063},
 	{"lm25066", lm25066},
 	{"lm5064", lm5064},
 	{"lm5066", lm5066},
diff --git a/drivers/hwmon/pmbus/max31785.c b/drivers/hwmon/pmbus/max31785.c
index 9313849..c9dc879 100644
--- a/drivers/hwmon/pmbus/max31785.c
+++ b/drivers/hwmon/pmbus/max31785.c
@@ -16,12 +16,231 @@
 
 enum max31785_regs {
 	MFR_REVISION		= 0x9b,
+	MFR_FAN_CONFIG		= 0xf1,
 };
 
+#define MAX31785			0x3030
+#define MAX31785A			0x3040
+
+#define MFR_FAN_CONFIG_DUAL_TACH	BIT(12)
+
 #define MAX31785_NR_PAGES		23
+#define MAX31785_NR_FAN_PAGES		6
+
+static int max31785_read_byte_data(struct i2c_client *client, int page,
+				   int reg)
+{
+	if (page < MAX31785_NR_PAGES)
+		return -ENODATA;
+
+	switch (reg) {
+	case PMBUS_VOUT_MODE:
+		return -ENOTSUPP;
+	case PMBUS_FAN_CONFIG_12:
+		return pmbus_read_byte_data(client, page - MAX31785_NR_PAGES,
+					    reg);
+	}
+
+	return -ENODATA;
+}
+
+static int max31785_write_byte(struct i2c_client *client, int page, u8 value)
+{
+	if (page < MAX31785_NR_PAGES)
+		return -ENODATA;
+
+	return -ENOTSUPP;
+}
+
+static int max31785_read_long_data(struct i2c_client *client, int page,
+				   int reg, u32 *data)
+{
+	unsigned char cmdbuf[1];
+	unsigned char rspbuf[4];
+	int rc;
+
+	struct i2c_msg msg[2] = {
+		{
+			.addr = client->addr,
+			.flags = 0,
+			.len = sizeof(cmdbuf),
+			.buf = cmdbuf,
+		},
+		{
+			.addr = client->addr,
+			.flags = I2C_M_RD,
+			.len = sizeof(rspbuf),
+			.buf = rspbuf,
+		},
+	};
+
+	cmdbuf[0] = reg;
+
+	rc = pmbus_set_page(client, page);
+	if (rc < 0)
+		return rc;
+
+	rc = i2c_transfer(client->adapter, msg, ARRAY_SIZE(msg));
+	if (rc < 0)
+		return rc;
+
+	*data = (rspbuf[0] << (0 * 8)) | (rspbuf[1] << (1 * 8)) |
+		(rspbuf[2] << (2 * 8)) | (rspbuf[3] << (3 * 8));
+
+	return rc;
+}
+
+static int max31785_get_pwm(struct i2c_client *client, int page)
+{
+	int rv;
+
+	rv = pmbus_get_fan_rate_device(client, page, 0, percent);
+	if (rv < 0)
+		return rv;
+	else if (rv >= 0x8000)
+		return 0;
+	else if (rv >= 0x2711)
+		return 0x2710;
+
+	return rv;
+}
+
+static int max31785_get_pwm_mode(struct i2c_client *client, int page)
+{
+	int config;
+	int command;
+
+	config = pmbus_read_byte_data(client, page, PMBUS_FAN_CONFIG_12);
+	if (config < 0)
+		return config;
+
+	command = pmbus_read_word_data(client, page, PMBUS_FAN_COMMAND_1);
+	if (command < 0)
+		return command;
+
+	if (config & PB_FAN_1_RPM)
+		return (command >= 0x8000) ? 3 : 2;
+
+	if (command >= 0x8000)
+		return 3;
+	else if (command >= 0x2711)
+		return 0;
+
+	return 1;
+}
+
+static int max31785_read_word_data(struct i2c_client *client, int page,
+				   int reg)
+{
+	u32 val;
+	int rv;
+
+	switch (reg) {
+	case PMBUS_READ_FAN_SPEED_1:
+		if (page < MAX31785_NR_PAGES)
+			return -ENODATA;
+
+		rv = max31785_read_long_data(client, page - MAX31785_NR_PAGES,
+					     reg, &val);
+		if (rv < 0)
+			return rv;
+
+		rv = (val >> 16) & 0xffff;
+		break;
+	case PMBUS_FAN_COMMAND_1:
+		/*
+		 * PMBUS_FAN_COMMAND_x is probed to judge whether or not to
+		 * expose fan control registers.
+		 *
+		 * Don't expose fan_target attribute for virtual pages.
+		 */
+		rv = (page >= MAX31785_NR_PAGES) ? -ENOTSUPP : -ENODATA;
+		break;
+	case PMBUS_VIRT_PWM_1:
+		rv = max31785_get_pwm(client, page);
+		break;
+	case PMBUS_VIRT_PWM_ENABLE_1:
+		rv = max31785_get_pwm_mode(client, page);
+		break;
+	default:
+		rv = -ENODATA;
+		break;
+	}
+
+	return rv;
+}
+
+static inline u32 max31785_scale_pwm(u32 sensor_val)
+{
+	/*
+	 * The datasheet describes the accepted value range for manual PWM as
+	 * [0, 0x2710], while the hwmon pwmX sysfs interface accepts values in
+	 * [0, 255]. The MAX31785 uses DIRECT mode to scale the FAN_COMMAND
+	 * registers and in PWM mode the coefficients are m=1, b=0, R=2. The
+	 * important observation here is that 0x2710 == 10000 == 100 * 100.
+	 *
+	 * R=2 (== 10^2 == 100) accounts for scaling the value provided at the
+	 * sysfs interface into the required hardware resolution, but it does
+	 * not yet yield a value that we can write to the device (this initial
+	 * scaling is handled by pmbus_data2reg()). Multiplying by 100 below
+	 * translates the parameter value into the percentage units required by
+	 * PMBus, and then we scale back by 255 as required by the hwmon pwmX
+	 * interface to yield the percentage value at the appropriate
+	 * resolution for hardware.
+	 */
+	return (sensor_val * 100) / 255;
+}
+
+static int max31785_pwm_enable(struct i2c_client *client, int page,
+				    u16 word)
+{
+	int config = 0;
+	int rate;
+
+	switch (word) {
+	case 0:
+		rate = 0x7fff;
+		break;
+	case 1:
+		rate = pmbus_get_fan_rate_cached(client, page, 0, percent);
+		if (rate < 0)
+			return rate;
+		rate = max31785_scale_pwm(rate);
+		break;
+	case 2:
+		config = PB_FAN_1_RPM;
+		rate = pmbus_get_fan_rate_cached(client, page, 0, rpm);
+		if (rate < 0)
+			return rate;
+		break;
+	case 3:
+		rate = 0xffff;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return pmbus_update_fan(client, page, 0, config, PB_FAN_1_RPM, rate);
+}
+
+static int max31785_write_word_data(struct i2c_client *client, int page,
+				    int reg, u16 word)
+{
+	switch (reg) {
+	case PMBUS_VIRT_PWM_1:
+		return pmbus_update_fan(client, page, 0, 0, PB_FAN_1_RPM,
+					max31785_scale_pwm(word));
+	case PMBUS_VIRT_PWM_ENABLE_1:
+		return max31785_pwm_enable(client, page, word);
+	default:
+		break;
+	}
+
+	return -ENODATA;
+}
 
 #define MAX31785_FAN_FUNCS \
-	(PMBUS_HAVE_FAN12 | PMBUS_HAVE_STATUS_FAN12)
+	(PMBUS_HAVE_FAN12 | PMBUS_HAVE_STATUS_FAN12 | PMBUS_HAVE_PWM12)
 
 #define MAX31785_TEMP_FUNCS \
 	(PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP)
@@ -29,14 +248,26 @@ enum max31785_regs {
 #define MAX31785_VOUT_FUNCS \
 	(PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT)
 
+#define MAX37185_NUM_FAN_PAGES 6
+
 static const struct pmbus_driver_info max31785_info = {
 	.pages = MAX31785_NR_PAGES,
 
+	.write_word_data = max31785_write_word_data,
+	.read_byte_data = max31785_read_byte_data,
+	.read_word_data = max31785_read_word_data,
+	.write_byte = max31785_write_byte,
+
 	/* RPM */
 	.format[PSC_FAN] = direct,
 	.m[PSC_FAN] = 1,
 	.b[PSC_FAN] = 0,
 	.R[PSC_FAN] = 0,
+	/* PWM */
+	.format[PSC_PWM] = direct,
+	.m[PSC_PWM] = 1,
+	.b[PSC_PWM] = 0,
+	.R[PSC_PWM] = 2,
 	.func[0] = MAX31785_FAN_FUNCS,
 	.func[1] = MAX31785_FAN_FUNCS,
 	.func[2] = MAX31785_FAN_FUNCS,
@@ -72,13 +303,46 @@ static const struct pmbus_driver_info max31785_info = {
 	.func[22] = MAX31785_VOUT_FUNCS,
 };
 
+static int max31785_configure_dual_tach(struct i2c_client *client,
+					struct pmbus_driver_info *info)
+{
+	int ret;
+	int i;
+
+	for (i = 0; i < MAX31785_NR_FAN_PAGES; i++) {
+		ret = i2c_smbus_write_byte_data(client, PMBUS_PAGE, i);
+		if (ret < 0)
+			return ret;
+
+		ret = i2c_smbus_read_word_data(client, MFR_FAN_CONFIG);
+		if (ret < 0)
+			return ret;
+
+		if (ret & MFR_FAN_CONFIG_DUAL_TACH) {
+			int virtual = MAX31785_NR_PAGES + i;
+
+			info->pages = virtual + 1;
+			info->func[virtual] |= PMBUS_HAVE_FAN12;
+			info->func[virtual] |= PMBUS_PAGE_VIRTUAL;
+		}
+	}
+
+	return 0;
+}
+
 static int max31785_probe(struct i2c_client *client,
 			  const struct i2c_device_id *id)
 {
 	struct device *dev = &client->dev;
 	struct pmbus_driver_info *info;
+	bool dual_tach = false;
 	s64 ret;
 
+	if (!i2c_check_functionality(client->adapter,
+				     I2C_FUNC_SMBUS_BYTE_DATA |
+				     I2C_FUNC_SMBUS_WORD_DATA))
+		return -ENODEV;
+
 	info = devm_kzalloc(dev, sizeof(struct pmbus_driver_info), GFP_KERNEL);
 	if (!info)
 		return -ENOMEM;
@@ -89,6 +353,25 @@ static int max31785_probe(struct i2c_client *client,
 	if (ret < 0)
 		return ret;
 
+	ret = i2c_smbus_read_word_data(client, MFR_REVISION);
+	if (ret < 0)
+		return ret;
+
+	if (ret == MAX31785A) {
+		dual_tach = true;
+	} else if (ret == MAX31785) {
+		if (!strcmp("max31785a", id->name))
+			dev_warn(dev, "Expected max3175a, found max31785: cannot provide secondary tachometer readings\n");
+	} else {
+		return -ENODEV;
+	}
+
+	if (dual_tach) {
+		ret = max31785_configure_dual_tach(client, info);
+		if (ret < 0)
+			return ret;
+	}
+
 	return pmbus_do_probe(client, id, info);
 }
 
@@ -100,9 +383,18 @@ static const struct i2c_device_id max31785_id[] = {
 
 MODULE_DEVICE_TABLE(i2c, max31785_id);
 
+static const struct of_device_id max31785_of_match[] = {
+	{ .compatible = "maxim,max31785" },
+	{ .compatible = "maxim,max31785a" },
+	{ },
+};
+
+MODULE_DEVICE_TABLE(of, max31785_of_match);
+
 static struct i2c_driver max31785_driver = {
 	.driver = {
 		.name = "max31785",
+		.of_match_table = max31785_of_match,
 	},
 	.probe = max31785_probe,
 	.remove = pmbus_do_remove,
diff --git a/drivers/hwmon/pmbus/pmbus.h b/drivers/hwmon/pmbus/pmbus.h
index fa613bd..1d24397 100644
--- a/drivers/hwmon/pmbus/pmbus.h
+++ b/drivers/hwmon/pmbus/pmbus.h
@@ -190,6 +190,33 @@ enum pmbus_regs {
 	PMBUS_VIRT_VMON_UV_FAULT_LIMIT,
 	PMBUS_VIRT_VMON_OV_FAULT_LIMIT,
 	PMBUS_VIRT_STATUS_VMON,
+
+	/*
+	 * RPM and PWM Fan control
+	 *
+	 * Drivers wanting to expose PWM control must define the behaviour of
+	 * PMBUS_VIRT_PWM_[1-4] and PMBUS_VIRT_PWM_ENABLE_[1-4] in the
+	 * {read,write}_word_data callback.
+	 *
+	 * pmbus core provides a default implementation for
+	 * PMBUS_VIRT_FAN_TARGET_[1-4].
+	 *
+	 * TARGET, PWM and PWM_ENABLE members must be defined sequentially;
+	 * pmbus core uses the difference between the provided register and
+	 * it's _1 counterpart to calculate the FAN/PWM ID.
+	 */
+	PMBUS_VIRT_FAN_TARGET_1,
+	PMBUS_VIRT_FAN_TARGET_2,
+	PMBUS_VIRT_FAN_TARGET_3,
+	PMBUS_VIRT_FAN_TARGET_4,
+	PMBUS_VIRT_PWM_1,
+	PMBUS_VIRT_PWM_2,
+	PMBUS_VIRT_PWM_3,
+	PMBUS_VIRT_PWM_4,
+	PMBUS_VIRT_PWM_ENABLE_1,
+	PMBUS_VIRT_PWM_ENABLE_2,
+	PMBUS_VIRT_PWM_ENABLE_3,
+	PMBUS_VIRT_PWM_ENABLE_4,
 };
 
 /*
@@ -223,6 +250,8 @@ enum pmbus_regs {
 #define PB_FAN_1_RPM			BIT(6)
 #define PB_FAN_1_INSTALLED		BIT(7)
 
+enum pmbus_fan_mode { percent = 0, rpm };
+
 /*
  * STATUS_BYTE, STATUS_WORD (lower)
  */
@@ -313,6 +342,7 @@ enum pmbus_sensor_classes {
 	PSC_POWER,
 	PSC_TEMPERATURE,
 	PSC_FAN,
+	PSC_PWM,
 	PSC_NUM_CLASSES		/* Number of power sensor classes */
 };
 
@@ -339,6 +369,10 @@ enum pmbus_sensor_classes {
 #define PMBUS_HAVE_STATUS_FAN34	BIT(17)
 #define PMBUS_HAVE_VMON		BIT(18)
 #define PMBUS_HAVE_STATUS_VMON	BIT(19)
+#define PMBUS_HAVE_PWM12	BIT(20)
+#define PMBUS_HAVE_PWM34	BIT(21)
+
+#define PMBUS_PAGE_VIRTUAL	BIT(31)
 
 enum pmbus_data_format { linear = 0, direct, vid };
 enum vrm_version { vr11 = 0, vr12, vr13 };
@@ -421,5 +455,12 @@ int pmbus_do_probe(struct i2c_client *client, const struct i2c_device_id *id,
 int pmbus_do_remove(struct i2c_client *client);
 const struct pmbus_driver_info *pmbus_get_driver_info(struct i2c_client
 						      *client);
+int pmbus_get_fan_rate_device(struct i2c_client *client, int page, int id,
+			      enum pmbus_fan_mode mode);
+int pmbus_get_fan_rate_cached(struct i2c_client *client, int page, int id,
+			      enum pmbus_fan_mode mode);
+int pmbus_update_fan(struct i2c_client *client, int page, int id,
+		     u8 config, u8 mask, u16 command);
+struct dentry *pmbus_get_debugfs_dir(struct i2c_client *client);
 
 #endif /* PMBUS_H */
diff --git a/drivers/hwmon/pmbus/pmbus_core.c b/drivers/hwmon/pmbus/pmbus_core.c
index a139940..f7c47d7 100644
--- a/drivers/hwmon/pmbus/pmbus_core.c
+++ b/drivers/hwmon/pmbus/pmbus_core.c
@@ -65,6 +65,7 @@ struct pmbus_sensor {
 	u16 reg;		/* register */
 	enum pmbus_sensor_classes class;	/* sensor class */
 	bool update;		/* runtime sensor update needed */
+	bool convert;		/* Whether or not to apply linear/vid/direct */
 	int data;		/* Sensor data.
 				   Negative if there was a read error */
 };
@@ -129,6 +130,27 @@ struct pmbus_debugfs_entry {
 	u8 reg;
 };
 
+static const int pmbus_fan_rpm_mask[] = {
+	PB_FAN_1_RPM,
+	PB_FAN_2_RPM,
+	PB_FAN_1_RPM,
+	PB_FAN_2_RPM,
+};
+
+static const int pmbus_fan_config_registers[] = {
+	PMBUS_FAN_CONFIG_12,
+	PMBUS_FAN_CONFIG_12,
+	PMBUS_FAN_CONFIG_34,
+	PMBUS_FAN_CONFIG_34
+};
+
+static const int pmbus_fan_command_registers[] = {
+	PMBUS_FAN_COMMAND_1,
+	PMBUS_FAN_COMMAND_2,
+	PMBUS_FAN_COMMAND_3,
+	PMBUS_FAN_COMMAND_4,
+};
+
 void pmbus_clear_cache(struct i2c_client *client)
 {
 	struct pmbus_data *data = i2c_get_clientdata(client);
@@ -140,18 +162,27 @@ EXPORT_SYMBOL_GPL(pmbus_clear_cache);
 int pmbus_set_page(struct i2c_client *client, int page)
 {
 	struct pmbus_data *data = i2c_get_clientdata(client);
-	int rv = 0;
-	int newpage;
+	int rv;
 
-	if (page >= 0 && page != data->currpage) {
+	if (page < 0 || page == data->currpage)
+		return 0;
+
+	if (!(data->info->func[page] & PMBUS_PAGE_VIRTUAL)) {
 		rv = i2c_smbus_write_byte_data(client, PMBUS_PAGE, page);
-		newpage = i2c_smbus_read_byte_data(client, PMBUS_PAGE);
-		if (newpage != page)
-			rv = -EIO;
-		else
-			data->currpage = page;
+		if (rv < 0)
+			return rv;
+
+		rv = i2c_smbus_read_byte_data(client, PMBUS_PAGE);
+		if (rv < 0)
+			return rv;
+
+		if (rv != page)
+			return -EIO;
 	}
-	return rv;
+
+	data->currpage = page;
+
+	return 0;
 }
 EXPORT_SYMBOL_GPL(pmbus_set_page);
 
@@ -198,6 +229,28 @@ int pmbus_write_word_data(struct i2c_client *client, int page, u8 reg,
 }
 EXPORT_SYMBOL_GPL(pmbus_write_word_data);
 
+
+static int pmbus_write_virt_reg(struct i2c_client *client, int page, int reg,
+				u16 word)
+{
+	int bit;
+	int id;
+	int rv;
+
+	switch (reg) {
+	case PMBUS_VIRT_FAN_TARGET_1 ... PMBUS_VIRT_FAN_TARGET_4:
+		id = reg - PMBUS_VIRT_FAN_TARGET_1;
+		bit = pmbus_fan_rpm_mask[id];
+		rv = pmbus_update_fan(client, page, id, bit, bit, word);
+		break;
+	default:
+		rv = -ENXIO;
+		break;
+	}
+
+	return rv;
+}
+
 /*
  * _pmbus_write_word_data() is similar to pmbus_write_word_data(), but checks if
  * a device specific mapping function exists and calls it if necessary.
@@ -214,11 +267,38 @@ static int _pmbus_write_word_data(struct i2c_client *client, int page, int reg,
 		if (status != -ENODATA)
 			return status;
 	}
+
 	if (reg >= PMBUS_VIRT_BASE)
-		return -ENXIO;
+		return pmbus_write_virt_reg(client, page, reg, word);
+
 	return pmbus_write_word_data(client, page, reg, word);
 }
 
+int pmbus_update_fan(struct i2c_client *client, int page, int id,
+		     u8 config, u8 mask, u16 command)
+{
+	int from;
+	int rv;
+	u8 to;
+
+	from = pmbus_read_byte_data(client, page,
+				    pmbus_fan_config_registers[id]);
+	if (from < 0)
+		return from;
+
+	to = (from & ~mask) | (config & mask);
+	if (to != from) {
+		rv = pmbus_write_byte_data(client, page,
+					   pmbus_fan_config_registers[id], to);
+		if (rv < 0)
+			return rv;
+	}
+
+	return _pmbus_write_word_data(client, page,
+				      pmbus_fan_command_registers[id], command);
+}
+EXPORT_SYMBOL_GPL(pmbus_update_fan);
+
 int pmbus_read_word_data(struct i2c_client *client, int page, u8 reg)
 {
 	int rv;
@@ -231,6 +311,24 @@ int pmbus_read_word_data(struct i2c_client *client, int page, u8 reg)
 }
 EXPORT_SYMBOL_GPL(pmbus_read_word_data);
 
+static int pmbus_read_virt_reg(struct i2c_client *client, int page, int reg)
+{
+	int rv;
+	int id;
+
+	switch (reg) {
+	case PMBUS_VIRT_FAN_TARGET_1 ... PMBUS_VIRT_FAN_TARGET_4:
+		id = reg - PMBUS_VIRT_FAN_TARGET_1;
+		rv = pmbus_get_fan_rate_device(client, page, id, rpm);
+		break;
+	default:
+		rv = -ENXIO;
+		break;
+	}
+
+	return rv;
+}
+
 /*
  * _pmbus_read_word_data() is similar to pmbus_read_word_data(), but checks if
  * a device specific mapping function exists and calls it if necessary.
@@ -246,8 +344,10 @@ static int _pmbus_read_word_data(struct i2c_client *client, int page, int reg)
 		if (status != -ENODATA)
 			return status;
 	}
+
 	if (reg >= PMBUS_VIRT_BASE)
-		return -ENXIO;
+		return pmbus_read_virt_reg(client, page, reg);
+
 	return pmbus_read_word_data(client, page, reg);
 }
 
@@ -312,6 +412,68 @@ static int _pmbus_read_byte_data(struct i2c_client *client, int page, int reg)
 	return pmbus_read_byte_data(client, page, reg);
 }
 
+static struct pmbus_sensor *pmbus_find_sensor(struct pmbus_data *data, int page,
+					      int reg)
+{
+	struct pmbus_sensor *sensor;
+
+	for (sensor = data->sensors; sensor; sensor = sensor->next) {
+		if (sensor->page == page && sensor->reg == reg)
+			return sensor;
+	}
+
+	return ERR_PTR(-EINVAL);
+}
+
+static int pmbus_get_fan_rate(struct i2c_client *client, int page, int id,
+			      enum pmbus_fan_mode mode,
+			      bool from_cache)
+{
+	struct pmbus_data *data = i2c_get_clientdata(client);
+	bool want_rpm, have_rpm;
+	struct pmbus_sensor *s;
+	int config;
+	int reg;
+
+	want_rpm = (mode == rpm);
+
+	if (from_cache) {
+		reg = want_rpm ? PMBUS_VIRT_FAN_TARGET_1 : PMBUS_VIRT_PWM_1;
+		s = pmbus_find_sensor(data, page, reg + id);
+		if (IS_ERR(s))
+			return PTR_ERR(s);
+
+		return s->data;
+	}
+
+	config = pmbus_read_byte_data(client, page,
+				      pmbus_fan_config_registers[id]);
+	if (config < 0)
+		return config;
+
+	have_rpm = !!(config & pmbus_fan_rpm_mask[id]);
+	if (want_rpm == have_rpm)
+		return pmbus_read_word_data(client, page,
+					    pmbus_fan_command_registers[id]);
+
+	/* Can't sensibly map between RPM and PWM, just return zero */
+	return 0;
+}
+
+int pmbus_get_fan_rate_device(struct i2c_client *client, int page, int id,
+			      enum pmbus_fan_mode mode)
+{
+	return pmbus_get_fan_rate(client, page, id, mode, false);
+}
+EXPORT_SYMBOL_GPL(pmbus_get_fan_rate_device);
+
+int pmbus_get_fan_rate_cached(struct i2c_client *client, int page, int id,
+			      enum pmbus_fan_mode mode)
+{
+	return pmbus_get_fan_rate(client, page, id, mode, true);
+}
+EXPORT_SYMBOL_GPL(pmbus_get_fan_rate_cached);
+
 static void pmbus_clear_fault_page(struct i2c_client *client, int page)
 {
 	_pmbus_write_byte(client, page, PMBUS_CLEAR_FAULTS);
@@ -513,7 +675,7 @@ static long pmbus_reg2data_direct(struct pmbus_data *data,
 	/* X = 1/m * (Y * 10^-R - b) */
 	R = -R;
 	/* scale result to milli-units for everything but fans */
-	if (sensor->class != PSC_FAN) {
+	if (!(sensor->class == PSC_FAN || sensor->class == PSC_PWM)) {
 		R += 3;
 		b *= 1000;
 	}
@@ -568,6 +730,9 @@ static long pmbus_reg2data(struct pmbus_data *data, struct pmbus_sensor *sensor)
 {
 	long val;
 
+	if (!sensor->convert)
+		return sensor->data;
+
 	switch (data->info->format[sensor->class]) {
 	case direct:
 		val = pmbus_reg2data_direct(data, sensor);
@@ -672,7 +837,7 @@ static u16 pmbus_data2reg_direct(struct pmbus_data *data,
 	}
 
 	/* Calculate Y = (m * X + b) * 10^R */
-	if (sensor->class != PSC_FAN) {
+	if (!(sensor->class == PSC_FAN || sensor->class == PSC_PWM)) {
 		R -= 3;		/* Adjust R and b for data in milli-units */
 		b *= 1000;
 	}
@@ -703,6 +868,9 @@ static u16 pmbus_data2reg(struct pmbus_data *data,
 {
 	u16 regval;
 
+	if (!sensor->convert)
+		return val;
+
 	switch (data->info->format[sensor->class]) {
 	case direct:
 		regval = pmbus_data2reg_direct(data, sensor, val);
@@ -915,7 +1083,8 @@ static struct pmbus_sensor *pmbus_add_sensor(struct pmbus_data *data,
 					     const char *name, const char *type,
 					     int seq, int page, int reg,
 					     enum pmbus_sensor_classes class,
-					     bool update, bool readonly)
+					     bool update, bool readonly,
+					     bool convert)
 {
 	struct pmbus_sensor *sensor;
 	struct device_attribute *a;
@@ -925,12 +1094,18 @@ static struct pmbus_sensor *pmbus_add_sensor(struct pmbus_data *data,
 		return NULL;
 	a = &sensor->attribute;
 
-	snprintf(sensor->name, sizeof(sensor->name), "%s%d_%s",
-		 name, seq, type);
+	if (type)
+		snprintf(sensor->name, sizeof(sensor->name), "%s%d_%s",
+			 name, seq, type);
+	else
+		snprintf(sensor->name, sizeof(sensor->name), "%s%d",
+			 name, seq);
+
 	sensor->page = page;
 	sensor->reg = reg;
 	sensor->class = class;
 	sensor->update = update;
+	sensor->convert = convert;
 	pmbus_dev_attr_init(a, sensor->name,
 			    readonly ? S_IRUGO : S_IRUGO | S_IWUSR,
 			    pmbus_show_sensor, pmbus_set_sensor);
@@ -1029,7 +1204,7 @@ static int pmbus_add_limit_attrs(struct i2c_client *client,
 			curr = pmbus_add_sensor(data, name, l->attr, index,
 						page, l->reg, attr->class,
 						attr->update || l->update,
-						false);
+						false, true);
 			if (!curr)
 				return -ENOMEM;
 			if (l->sbit && (info->func[page] & attr->sfunc)) {
@@ -1068,7 +1243,7 @@ static int pmbus_add_sensor_attrs_one(struct i2c_client *client,
 			return ret;
 	}
 	base = pmbus_add_sensor(data, name, "input", index, page, attr->reg,
-				attr->class, true, true);
+				attr->class, true, true, true);
 	if (!base)
 		return -ENOMEM;
 	if (attr->sfunc) {
@@ -1592,13 +1767,6 @@ static const int pmbus_fan_registers[] = {
 	PMBUS_READ_FAN_SPEED_4
 };
 
-static const int pmbus_fan_config_registers[] = {
-	PMBUS_FAN_CONFIG_12,
-	PMBUS_FAN_CONFIG_12,
-	PMBUS_FAN_CONFIG_34,
-	PMBUS_FAN_CONFIG_34
-};
-
 static const int pmbus_fan_status_registers[] = {
 	PMBUS_STATUS_FAN_12,
 	PMBUS_STATUS_FAN_12,
@@ -1621,6 +1789,42 @@ static const u32 pmbus_fan_status_flags[] = {
 };
 
 /* Fans */
+
+/* Precondition: FAN_CONFIG_x_y and FAN_COMMAND_x must exist for the fan ID */
+static int pmbus_add_fan_ctrl(struct i2c_client *client,
+		struct pmbus_data *data, int index, int page, int id,
+		u8 config)
+{
+	struct pmbus_sensor *sensor;
+
+	sensor = pmbus_add_sensor(data, "fan", "target", index, page,
+				  PMBUS_VIRT_FAN_TARGET_1 + id, PSC_FAN,
+				  false, false, true);
+
+	if (!sensor)
+		return -ENOMEM;
+
+	if (!((data->info->func[page] & PMBUS_HAVE_PWM12) ||
+			(data->info->func[page] & PMBUS_HAVE_PWM34)))
+		return 0;
+
+	sensor = pmbus_add_sensor(data, "pwm", NULL, index, page,
+				  PMBUS_VIRT_PWM_1 + id, PSC_PWM,
+				  false, false, true);
+
+	if (!sensor)
+		return -ENOMEM;
+
+	sensor = pmbus_add_sensor(data, "pwm", "enable", index, page,
+				  PMBUS_VIRT_PWM_ENABLE_1 + id, PSC_PWM,
+				  true, false, false);
+
+	if (!sensor)
+		return -ENOMEM;
+
+	return 0;
+}
+
 static int pmbus_add_fan_attributes(struct i2c_client *client,
 				    struct pmbus_data *data)
 {
@@ -1655,9 +1859,18 @@ static int pmbus_add_fan_attributes(struct i2c_client *client,
 
 			if (pmbus_add_sensor(data, "fan", "input", index,
 					     page, pmbus_fan_registers[f],
-					     PSC_FAN, true, true) == NULL)
+					     PSC_FAN, true, true, true) == NULL)
 				return -ENOMEM;
 
+			/* Fan control */
+			if (pmbus_check_word_register(client, page,
+					pmbus_fan_command_registers[f])) {
+				ret = pmbus_add_fan_ctrl(client, data, index,
+							 page, f, regval);
+				if (ret < 0)
+					return ret;
+			}
+
 			/*
 			 * Each fan status register covers multiple fans,
 			 * so we have to do some magic.
@@ -2168,6 +2381,14 @@ int pmbus_do_remove(struct i2c_client *client)
 }
 EXPORT_SYMBOL_GPL(pmbus_do_remove);
 
+struct dentry *pmbus_get_debugfs_dir(struct i2c_client *client)
+{
+	struct pmbus_data *data = i2c_get_clientdata(client);
+
+	return data->debugfs;
+}
+EXPORT_SYMBOL_GPL(pmbus_get_debugfs_dir);
+
 static int __init pmbus_core_init(void)
 {
 	pmbus_debugfs_dir = debugfs_create_dir("pmbus", NULL);
diff --git a/drivers/hwmon/sht15.c b/drivers/hwmon/sht15.c
index 25d2834..2be7775 100644
--- a/drivers/hwmon/sht15.c
+++ b/drivers/hwmon/sht15.c
@@ -179,6 +179,7 @@ struct sht15_data {
  * sht15_crc8() - compute crc8
  * @data:	sht15 specific data.
  * @value:	sht15 retrieved data.
+ * @len:	Length of retrieved data
  *
  * This implements section 2 of the CRC datasheet.
  */
diff --git a/drivers/hwmon/sht21.c b/drivers/hwmon/sht21.c
index 06706d2..190e7b3 100644
--- a/drivers/hwmon/sht21.c
+++ b/drivers/hwmon/sht21.c
@@ -41,7 +41,7 @@
 
 /**
  * struct sht21 - SHT21 device specific data
- * @hwmon_dev: device registered with hwmon
+ * @client: I2C client device
  * @lock: mutex to protect measurement values
  * @last_update: time of last update (jiffies)
  * @temperature: cached temperature measurement value
diff --git a/drivers/hwmon/sht3x.c b/drivers/hwmon/sht3x.c
index 6ea99cd..370b57d 100644
--- a/drivers/hwmon/sht3x.c
+++ b/drivers/hwmon/sht3x.c
@@ -732,6 +732,13 @@ static int sht3x_probe(struct i2c_client *client,
 	mutex_init(&data->i2c_lock);
 	mutex_init(&data->data_lock);
 
+	/*
+	 * An attempt to read limits register too early
+	 * causes a NACK response from the chip.
+	 * Waiting for an empirical delay of 500 us solves the issue.
+	 */
+	usleep_range(500, 600);
+
 	ret = limits_update(data);
 	if (ret)
 		return ret;
diff --git a/drivers/hwmon/w83773g.c b/drivers/hwmon/w83773g.c
new file mode 100644
index 0000000..e858093
--- /dev/null
+++ b/drivers/hwmon/w83773g.c
@@ -0,0 +1,329 @@
+/*
+ * Copyright (C) 2017 IBM Corp.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * Driver for the Nuvoton W83773G SMBus temperature sensor IC.
+ * Supported models: W83773G
+ */
+
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/i2c.h>
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/err.h>
+#include <linux/of_device.h>
+#include <linux/regmap.h>
+
+/* W83773 has 3 channels */
+#define W83773_CHANNELS				3
+
+/* The W83773 registers */
+#define W83773_CONVERSION_RATE_REG_READ		0x04
+#define W83773_CONVERSION_RATE_REG_WRITE	0x0A
+#define W83773_MANUFACTURER_ID_REG		0xFE
+#define W83773_LOCAL_TEMP			0x00
+
+static const u8 W83773_STATUS[2] = { 0x02, 0x17 };
+
+static const u8 W83773_TEMP_LSB[2] = { 0x10, 0x25 };
+static const u8 W83773_TEMP_MSB[2] = { 0x01, 0x24 };
+
+static const u8 W83773_OFFSET_LSB[2] = { 0x12, 0x16 };
+static const u8 W83773_OFFSET_MSB[2] = { 0x11, 0x15 };
+
+/* this is the number of sensors in the device */
+static const struct i2c_device_id w83773_id[] = {
+	{ "w83773g" },
+	{ }
+};
+
+MODULE_DEVICE_TABLE(i2c, w83773_id);
+
+static const struct of_device_id w83773_of_match[] = {
+	{
+		.compatible = "nuvoton,w83773g"
+	},
+	{ },
+};
+MODULE_DEVICE_TABLE(of, w83773_of_match);
+
+static inline long temp_of_local(s8 reg)
+{
+	return reg * 1000;
+}
+
+static inline long temp_of_remote(s8 hb, u8 lb)
+{
+	return (hb << 3 | lb >> 5) * 125;
+}
+
+static int get_local_temp(struct regmap *regmap, long *val)
+{
+	unsigned int regval;
+	int ret;
+
+	ret = regmap_read(regmap, W83773_LOCAL_TEMP, &regval);
+	if (ret < 0)
+		return ret;
+
+	*val = temp_of_local(regval);
+	return 0;
+}
+
+static int get_remote_temp(struct regmap *regmap, int index, long *val)
+{
+	unsigned int regval_high;
+	unsigned int regval_low;
+	int ret;
+
+	ret = regmap_read(regmap, W83773_TEMP_MSB[index], &regval_high);
+	if (ret < 0)
+		return ret;
+
+	ret = regmap_read(regmap, W83773_TEMP_LSB[index], &regval_low);
+	if (ret < 0)
+		return ret;
+
+	*val = temp_of_remote(regval_high, regval_low);
+	return 0;
+}
+
+static int get_fault(struct regmap *regmap, int index, long *val)
+{
+	unsigned int regval;
+	int ret;
+
+	ret = regmap_read(regmap, W83773_STATUS[index], &regval);
+	if (ret < 0)
+		return ret;
+
+	*val = (regval & 0x04) >> 2;
+	return 0;
+}
+
+static int get_offset(struct regmap *regmap, int index, long *val)
+{
+	unsigned int regval_high;
+	unsigned int regval_low;
+	int ret;
+
+	ret = regmap_read(regmap, W83773_OFFSET_MSB[index], &regval_high);
+	if (ret < 0)
+		return ret;
+
+	ret = regmap_read(regmap, W83773_OFFSET_LSB[index], &regval_low);
+	if (ret < 0)
+		return ret;
+
+	*val = temp_of_remote(regval_high, regval_low);
+	return 0;
+}
+
+static int set_offset(struct regmap *regmap, int index, long val)
+{
+	int ret;
+	u8 high_byte;
+	u8 low_byte;
+
+	val = clamp_val(val, -127825, 127825);
+	/* offset value equals to (high_byte << 3 | low_byte >> 5) * 125 */
+	val /= 125;
+	high_byte = val >> 3;
+	low_byte = (val & 0x07) << 5;
+
+	ret = regmap_write(regmap, W83773_OFFSET_MSB[index], high_byte);
+	if (ret < 0)
+		return ret;
+
+	return regmap_write(regmap, W83773_OFFSET_LSB[index], low_byte);
+}
+
+static int get_update_interval(struct regmap *regmap, long *val)
+{
+	unsigned int regval;
+	int ret;
+
+	ret = regmap_read(regmap, W83773_CONVERSION_RATE_REG_READ, &regval);
+	if (ret < 0)
+		return ret;
+
+	*val = 16000 >> regval;
+	return 0;
+}
+
+static int set_update_interval(struct regmap *regmap, long val)
+{
+	int rate;
+
+	/*
+	 * For valid rates, interval can be calculated as
+	 *	interval = (1 << (8 - rate)) * 62.5;
+	 * Rounded rate is therefore
+	 *	rate = 8 - __fls(interval * 8 / (62.5 * 7));
+	 * Use clamp_val() to avoid overflows, and to ensure valid input
+	 * for __fls.
+	 */
+	val = clamp_val(val, 62, 16000) * 10;
+	rate = 8 - __fls((val * 8 / (625 * 7)));
+	return regmap_write(regmap, W83773_CONVERSION_RATE_REG_WRITE, rate);
+}
+
+static int w83773_read(struct device *dev, enum hwmon_sensor_types type,
+		       u32 attr, int channel, long *val)
+{
+	struct regmap *regmap = dev_get_drvdata(dev);
+
+	if (type == hwmon_chip) {
+		if (attr == hwmon_chip_update_interval)
+			return get_update_interval(regmap, val);
+		return -EOPNOTSUPP;
+	}
+
+	switch (attr) {
+	case hwmon_temp_input:
+		if (channel == 0)
+			return get_local_temp(regmap, val);
+		return get_remote_temp(regmap, channel - 1, val);
+	case hwmon_temp_fault:
+		return get_fault(regmap, channel - 1, val);
+	case hwmon_temp_offset:
+		return get_offset(regmap, channel - 1, val);
+	default:
+		return -EOPNOTSUPP;
+	}
+}
+
+static int w83773_write(struct device *dev, enum hwmon_sensor_types type,
+			u32 attr, int channel, long val)
+{
+	struct regmap *regmap = dev_get_drvdata(dev);
+
+	if (type == hwmon_chip && attr == hwmon_chip_update_interval)
+		return set_update_interval(regmap, val);
+
+	if (type == hwmon_temp && attr == hwmon_temp_offset)
+		return set_offset(regmap, channel - 1, val);
+
+	return -EOPNOTSUPP;
+}
+
+static umode_t w83773_is_visible(const void *data, enum hwmon_sensor_types type,
+				 u32 attr, int channel)
+{
+	switch (type) {
+	case hwmon_chip:
+		switch (attr) {
+		case hwmon_chip_update_interval:
+			return 0644;
+		}
+		break;
+	case hwmon_temp:
+		switch (attr) {
+		case hwmon_temp_input:
+		case hwmon_temp_fault:
+			return 0444;
+		case hwmon_temp_offset:
+			return 0644;
+		}
+		break;
+	default:
+		break;
+	}
+	return 0;
+}
+
+static const u32 w83773_chip_config[] = {
+	HWMON_C_REGISTER_TZ | HWMON_C_UPDATE_INTERVAL,
+	0
+};
+
+static const struct hwmon_channel_info w83773_chip = {
+	.type = hwmon_chip,
+	.config = w83773_chip_config,
+};
+
+static const u32 w83773_temp_config[] = {
+	HWMON_T_INPUT,
+	HWMON_T_INPUT | HWMON_T_FAULT | HWMON_T_OFFSET,
+	HWMON_T_INPUT | HWMON_T_FAULT | HWMON_T_OFFSET,
+	0
+};
+
+static const struct hwmon_channel_info w83773_temp = {
+	.type = hwmon_temp,
+	.config = w83773_temp_config,
+};
+
+static const struct hwmon_channel_info *w83773_info[] = {
+	&w83773_chip,
+	&w83773_temp,
+	NULL
+};
+
+static const struct hwmon_ops w83773_ops = {
+	.is_visible = w83773_is_visible,
+	.read = w83773_read,
+	.write = w83773_write,
+};
+
+static const struct hwmon_chip_info w83773_chip_info = {
+	.ops = &w83773_ops,
+	.info = w83773_info,
+};
+
+static const struct regmap_config w83773_regmap_config = {
+	.reg_bits = 8,
+	.val_bits = 8,
+};
+
+static int w83773_probe(struct i2c_client *client,
+			const struct i2c_device_id *id)
+{
+	struct device *dev = &client->dev;
+	struct device *hwmon_dev;
+	struct regmap *regmap;
+	int ret;
+
+	regmap = devm_regmap_init_i2c(client, &w83773_regmap_config);
+	if (IS_ERR(regmap)) {
+		dev_err(dev, "failed to allocate register map\n");
+		return PTR_ERR(regmap);
+	}
+
+	/* Set the conversion rate to 2 Hz */
+	ret = regmap_write(regmap, W83773_CONVERSION_RATE_REG_WRITE, 0x05);
+	if (ret < 0) {
+		dev_err(&client->dev, "error writing config rate register\n");
+		return ret;
+	}
+
+	i2c_set_clientdata(client, regmap);
+
+	hwmon_dev = devm_hwmon_device_register_with_info(dev,
+							 client->name,
+							 regmap,
+							 &w83773_chip_info,
+							 NULL);
+	return PTR_ERR_OR_ZERO(hwmon_dev);
+}
+
+static struct i2c_driver w83773_driver = {
+	.class = I2C_CLASS_HWMON,
+	.driver = {
+		.name	= "w83773g",
+		.of_match_table = of_match_ptr(w83773_of_match),
+	},
+	.probe = w83773_probe,
+	.id_table = w83773_id,
+};
+
+module_i2c_driver(w83773_driver);
+
+MODULE_AUTHOR("Lei YU <mine260309@gmail.com>");
+MODULE_DESCRIPTION("W83773G temperature sensor driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/hwtracing/coresight/of_coresight.c b/drivers/hwtracing/coresight/of_coresight.c
index a187941..7c37544 100644
--- a/drivers/hwtracing/coresight/of_coresight.c
+++ b/drivers/hwtracing/coresight/of_coresight.c
@@ -104,26 +104,17 @@ static int of_coresight_alloc_memory(struct device *dev,
 int of_coresight_get_cpu(const struct device_node *node)
 {
 	int cpu;
-	bool found;
-	struct device_node *dn, *np;
+	struct device_node *dn;
 
 	dn = of_parse_phandle(node, "cpu", 0);
-
 	/* Affinity defaults to CPU0 */
 	if (!dn)
 		return 0;
-
-	for_each_possible_cpu(cpu) {
-		np = of_cpu_device_node_get(cpu);
-		found = (dn == np);
-		of_node_put(np);
-		if (found)
-			break;
-	}
+	cpu = of_cpu_node_to_id(dn);
 	of_node_put(dn);
 
 	/* Affinity to CPU0 if no cpu nodes are found */
-	return found ? cpu : 0;
+	return (cpu < 0) ? 0 : cpu;
 }
 EXPORT_SYMBOL_GPL(of_coresight_get_cpu);
 
diff --git a/drivers/i2c/busses/i2c-designware-core.h b/drivers/i2c/busses/i2c-designware-core.h
index 21bf619..9fee4c0 100644
--- a/drivers/i2c/busses/i2c-designware-core.h
+++ b/drivers/i2c/busses/i2c-designware-core.h
@@ -280,8 +280,6 @@ struct dw_i2c_dev {
 	int			(*acquire_lock)(struct dw_i2c_dev *dev);
 	void			(*release_lock)(struct dw_i2c_dev *dev);
 	bool			pm_disabled;
-	bool			suspended;
-	bool			skip_resume;
 	void			(*disable)(struct dw_i2c_dev *dev);
 	void			(*disable_int)(struct dw_i2c_dev *dev);
 	int			(*init)(struct dw_i2c_dev *dev);
diff --git a/drivers/i2c/busses/i2c-designware-platdrv.c b/drivers/i2c/busses/i2c-designware-platdrv.c
index 58add69..153b947 100644
--- a/drivers/i2c/busses/i2c-designware-platdrv.c
+++ b/drivers/i2c/busses/i2c-designware-platdrv.c
@@ -42,6 +42,7 @@
 #include <linux/reset.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/suspend.h>
 
 #include "i2c-designware-core.h"
 
@@ -372,6 +373,11 @@ static int dw_i2c_plat_probe(struct platform_device *pdev)
 	ACPI_COMPANION_SET(&adap->dev, ACPI_COMPANION(&pdev->dev));
 	adap->dev.of_node = pdev->dev.of_node;
 
+	dev_pm_set_driver_flags(&pdev->dev,
+				DPM_FLAG_SMART_PREPARE |
+				DPM_FLAG_SMART_SUSPEND |
+				DPM_FLAG_LEAVE_SUSPENDED);
+
 	/* The code below assumes runtime PM to be disabled. */
 	WARN_ON(pm_runtime_enabled(&pdev->dev));
 
@@ -435,12 +441,24 @@ MODULE_DEVICE_TABLE(of, dw_i2c_of_match);
 #ifdef CONFIG_PM_SLEEP
 static int dw_i2c_plat_prepare(struct device *dev)
 {
-	return pm_runtime_suspended(dev);
+	/*
+	 * If the ACPI companion device object is present for this device, it
+	 * may be accessed during suspend and resume of other devices via I2C
+	 * operation regions, so tell the PM core and middle layers to avoid
+	 * skipping system suspend/resume callbacks for it in that case.
+	 */
+	return !has_acpi_companion(dev);
 }
 
 static void dw_i2c_plat_complete(struct device *dev)
 {
-	if (dev->power.direct_complete)
+	/*
+	 * The device can only be in runtime suspend at this point if it has not
+	 * been resumed throughout the ending system suspend/resume cycle, so if
+	 * the platform firmware might mess up with it, request the runtime PM
+	 * framework to resume it.
+	 */
+	if (pm_runtime_suspended(dev) && pm_resume_via_firmware())
 		pm_request_resume(dev);
 }
 #else
@@ -453,16 +471,9 @@ static int dw_i2c_plat_suspend(struct device *dev)
 {
 	struct dw_i2c_dev *i_dev = dev_get_drvdata(dev);
 
-	if (i_dev->suspended) {
-		i_dev->skip_resume = true;
-		return 0;
-	}
-
 	i_dev->disable(i_dev);
 	i2c_dw_plat_prepare_clk(i_dev, false);
 
-	i_dev->suspended = true;
-
 	return 0;
 }
 
@@ -470,19 +481,9 @@ static int dw_i2c_plat_resume(struct device *dev)
 {
 	struct dw_i2c_dev *i_dev = dev_get_drvdata(dev);
 
-	if (!i_dev->suspended)
-		return 0;
-
-	if (i_dev->skip_resume) {
-		i_dev->skip_resume = false;
-		return 0;
-	}
-
 	i2c_dw_plat_prepare_clk(i_dev, true);
 	i_dev->init(i_dev);
 
-	i_dev->suspended = false;
-
 	return 0;
 }
 
diff --git a/drivers/i2c/i2c-core-base.c b/drivers/i2c/i2c-core-base.c
index 706164b..f7829a7 100644
--- a/drivers/i2c/i2c-core-base.c
+++ b/drivers/i2c/i2c-core-base.c
@@ -821,8 +821,12 @@ void i2c_unregister_device(struct i2c_client *client)
 {
 	if (!client)
 		return;
-	if (client->dev.of_node)
+
+	if (client->dev.of_node) {
 		of_node_clear_flag(client->dev.of_node, OF_POPULATED);
+		of_node_put(client->dev.of_node);
+	}
+
 	if (ACPI_COMPANION(&client->dev))
 		acpi_device_clear_enumerated(ACPI_COMPANION(&client->dev));
 	device_unregister(&client->dev);
diff --git a/drivers/i2c/i2c-core-smbus.c b/drivers/i2c/i2c-core-smbus.c
index 4bb9927..a1082c0 100644
--- a/drivers/i2c/i2c-core-smbus.c
+++ b/drivers/i2c/i2c-core-smbus.c
@@ -397,16 +397,17 @@ static s32 i2c_smbus_xfer_emulated(struct i2c_adapter *adapter, u16 addr,
 				   the underlying bus driver */
 		break;
 	case I2C_SMBUS_I2C_BLOCK_DATA:
+		if (data->block[0] > I2C_SMBUS_BLOCK_MAX) {
+			dev_err(&adapter->dev, "Invalid block %s size %d\n",
+				read_write == I2C_SMBUS_READ ? "read" : "write",
+				data->block[0]);
+			return -EINVAL;
+		}
+
 		if (read_write == I2C_SMBUS_READ) {
 			msg[1].len = data->block[0];
 		} else {
 			msg[0].len = data->block[0] + 1;
-			if (msg[0].len > I2C_SMBUS_BLOCK_MAX + 1) {
-				dev_err(&adapter->dev,
-					"Invalid block write size %d\n",
-					data->block[0]);
-				return -EINVAL;
-			}
 			for (i = 1; i <= data->block[0]; i++)
 				msgbuf0[i] = data->block[i];
 		}
diff --git a/drivers/iio/adc/Kconfig b/drivers/iio/adc/Kconfig
index ef86296..39e3b34 100644
--- a/drivers/iio/adc/Kconfig
+++ b/drivers/iio/adc/Kconfig
@@ -629,6 +629,18 @@
 	  To compile this driver as a module, choose M here: the
 	  module will be called spear_adc.
 
+config SD_ADC_MODULATOR
+	tristate "Generic sigma delta modulator"
+	depends on OF
+	select IIO_BUFFER
+	select IIO_TRIGGERED_BUFFER
+	help
+	  Select this option to enables sigma delta modulator. This driver can
+	  support generic sigma delta modulators.
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called sd_adc_modulator.
+
 config STM32_ADC_CORE
 	tristate "STMicroelectronics STM32 adc core"
 	depends on ARCH_STM32 || COMPILE_TEST
@@ -656,6 +668,31 @@
 	  This driver can also be built as a module.  If so, the module
 	  will be called stm32-adc.
 
+config STM32_DFSDM_CORE
+	tristate "STMicroelectronics STM32 DFSDM core"
+	depends on (ARCH_STM32 && OF) || COMPILE_TEST
+	select REGMAP
+	select REGMAP_MMIO
+	help
+	  Select this option to enable the  driver for STMicroelectronics
+	  STM32 digital filter for sigma delta converter.
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called stm32-dfsdm-core.
+
+config STM32_DFSDM_ADC
+	tristate "STMicroelectronics STM32 dfsdm adc"
+	depends on (ARCH_STM32 && OF) || COMPILE_TEST
+	select STM32_DFSDM_CORE
+	select REGMAP_MMIO
+	select IIO_BUFFER_HW_CONSUMER
+	help
+	  Select this option to support ADCSigma delta modulator for
+	  STMicroelectronics STM32 digital filter for sigma delta converter.
+
+	  This driver can also be built as a module.  If so, the module
+	  will be called stm32-dfsdm-adc.
+
 config STX104
 	tristate "Apex Embedded Systems STX104 driver"
 	depends on PC104 && X86 && ISA_BUS_API
diff --git a/drivers/iio/adc/Makefile b/drivers/iio/adc/Makefile
index 9572c10..28a9423 100644
--- a/drivers/iio/adc/Makefile
+++ b/drivers/iio/adc/Makefile
@@ -64,6 +64,8 @@
 obj-$(CONFIG_SUN4I_GPADC) += sun4i-gpadc-iio.o
 obj-$(CONFIG_STM32_ADC_CORE) += stm32-adc-core.o
 obj-$(CONFIG_STM32_ADC) += stm32-adc.o
+obj-$(CONFIG_STM32_DFSDM_CORE) += stm32-dfsdm-core.o
+obj-$(CONFIG_STM32_DFSDM_ADC) += stm32-dfsdm-adc.o
 obj-$(CONFIG_TI_ADC081C) += ti-adc081c.o
 obj-$(CONFIG_TI_ADC0832) += ti-adc0832.o
 obj-$(CONFIG_TI_ADC084S021) += ti-adc084s021.o
@@ -82,3 +84,4 @@
 obj-$(CONFIG_VIPERBOARD_ADC) += viperboard_adc.o
 xilinx-xadc-y := xilinx-xadc-core.o xilinx-xadc-events.o
 obj-$(CONFIG_XILINX_XADC) += xilinx-xadc.o
+obj-$(CONFIG_SD_ADC_MODULATOR) += sd_adc_modulator.o
diff --git a/drivers/iio/adc/sd_adc_modulator.c b/drivers/iio/adc/sd_adc_modulator.c
new file mode 100644
index 0000000..560d8c7
--- /dev/null
+++ b/drivers/iio/adc/sd_adc_modulator.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Generic sigma delta modulator driver
+ *
+ * Copyright (C) 2017, STMicroelectronics - All Rights Reserved
+ * Author: Arnaud Pouliquen <arnaud.pouliquen@st.com>.
+ */
+
+#include <linux/iio/iio.h>
+#include <linux/iio/triggered_buffer.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+
+static const struct iio_info iio_sd_mod_iio_info;
+
+static const struct iio_chan_spec iio_sd_mod_ch = {
+	.type = IIO_VOLTAGE,
+	.indexed = 1,
+	.scan_type = {
+		.sign = 'u',
+		.realbits = 1,
+		.shift = 0,
+	},
+};
+
+static int iio_sd_mod_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct iio_dev *iio;
+
+	iio = devm_iio_device_alloc(dev, 0);
+	if (!iio)
+		return -ENOMEM;
+
+	iio->dev.parent = dev;
+	iio->dev.of_node = dev->of_node;
+	iio->name = dev_name(dev);
+	iio->info = &iio_sd_mod_iio_info;
+	iio->modes = INDIO_BUFFER_HARDWARE;
+
+	iio->num_channels = 1;
+	iio->channels = &iio_sd_mod_ch;
+
+	platform_set_drvdata(pdev, iio);
+
+	return devm_iio_device_register(&pdev->dev, iio);
+}
+
+static const struct of_device_id sd_adc_of_match[] = {
+	{ .compatible = "sd-modulator" },
+	{ .compatible = "ads1201" },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, sd_adc_of_match);
+
+static struct platform_driver iio_sd_mod_adc = {
+	.driver = {
+		.name = "iio_sd_adc_mod",
+		.of_match_table = of_match_ptr(sd_adc_of_match),
+	},
+	.probe = iio_sd_mod_probe,
+};
+
+module_platform_driver(iio_sd_mod_adc);
+
+MODULE_DESCRIPTION("Basic sigma delta modulator");
+MODULE_AUTHOR("Arnaud Pouliquen <arnaud.pouliquen@st.com>");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iio/adc/stm32-dfsdm-adc.c b/drivers/iio/adc/stm32-dfsdm-adc.c
new file mode 100644
index 0000000..daa026d
--- /dev/null
+++ b/drivers/iio/adc/stm32-dfsdm-adc.c
@@ -0,0 +1,1205 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file is the ADC part of the STM32 DFSDM driver
+ *
+ * Copyright (C) 2017, STMicroelectronics - All Rights Reserved
+ * Author: Arnaud Pouliquen <arnaud.pouliquen@st.com>.
+ */
+
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
+#include <linux/interrupt.h>
+#include <linux/iio/buffer.h>
+#include <linux/iio/hw-consumer.h>
+#include <linux/iio/iio.h>
+#include <linux/iio/sysfs.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+#include <linux/slab.h>
+
+#include "stm32-dfsdm.h"
+
+#define DFSDM_DMA_BUFFER_SIZE (4 * PAGE_SIZE)
+
+/* Conversion timeout */
+#define DFSDM_TIMEOUT_US 100000
+#define DFSDM_TIMEOUT (msecs_to_jiffies(DFSDM_TIMEOUT_US / 1000))
+
+/* Oversampling attribute default */
+#define DFSDM_DEFAULT_OVERSAMPLING  100
+
+/* Oversampling max values */
+#define DFSDM_MAX_INT_OVERSAMPLING 256
+#define DFSDM_MAX_FL_OVERSAMPLING 1024
+
+/* Max sample resolutions */
+#define DFSDM_MAX_RES BIT(31)
+#define DFSDM_DATA_RES BIT(23)
+
+enum sd_converter_type {
+	DFSDM_AUDIO,
+	DFSDM_IIO,
+};
+
+struct stm32_dfsdm_dev_data {
+	int type;
+	int (*init)(struct iio_dev *indio_dev);
+	unsigned int num_channels;
+	const struct regmap_config *regmap_cfg;
+};
+
+struct stm32_dfsdm_adc {
+	struct stm32_dfsdm *dfsdm;
+	const struct stm32_dfsdm_dev_data *dev_data;
+	unsigned int fl_id;
+	unsigned int ch_id;
+
+	/* ADC specific */
+	unsigned int oversamp;
+	struct iio_hw_consumer *hwc;
+	struct completion completion;
+	u32 *buffer;
+
+	/* Audio specific */
+	unsigned int spi_freq;  /* SPI bus clock frequency */
+	unsigned int sample_freq; /* Sample frequency after filter decimation */
+	int (*cb)(const void *data, size_t size, void *cb_priv);
+	void *cb_priv;
+
+	/* DMA */
+	u8 *rx_buf;
+	unsigned int bufi; /* Buffer current position */
+	unsigned int buf_sz; /* Buffer size */
+	struct dma_chan	*dma_chan;
+	dma_addr_t dma_buf;
+};
+
+struct stm32_dfsdm_str2field {
+	const char	*name;
+	unsigned int	val;
+};
+
+/* DFSDM channel serial interface type */
+static const struct stm32_dfsdm_str2field stm32_dfsdm_chan_type[] = {
+	{ "SPI_R", 0 }, /* SPI with data on rising edge */
+	{ "SPI_F", 1 }, /* SPI with data on falling edge */
+	{ "MANCH_R", 2 }, /* Manchester codec, rising edge = logic 0 */
+	{ "MANCH_F", 3 }, /* Manchester codec, falling edge = logic 1 */
+	{},
+};
+
+/* DFSDM channel clock source */
+static const struct stm32_dfsdm_str2field stm32_dfsdm_chan_src[] = {
+	/* External SPI clock (CLKIN x) */
+	{ "CLKIN", DFSDM_CHANNEL_SPI_CLOCK_EXTERNAL },
+	/* Internal SPI clock (CLKOUT) */
+	{ "CLKOUT", DFSDM_CHANNEL_SPI_CLOCK_INTERNAL },
+	/* Internal SPI clock divided by 2 (falling edge) */
+	{ "CLKOUT_F", DFSDM_CHANNEL_SPI_CLOCK_INTERNAL_DIV2_FALLING },
+	/* Internal SPI clock divided by 2 (falling edge) */
+	{ "CLKOUT_R", DFSDM_CHANNEL_SPI_CLOCK_INTERNAL_DIV2_RISING },
+	{},
+};
+
+static int stm32_dfsdm_str2val(const char *str,
+			       const struct stm32_dfsdm_str2field *list)
+{
+	const struct stm32_dfsdm_str2field *p = list;
+
+	for (p = list; p && p->name; p++)
+		if (!strcmp(p->name, str))
+			return p->val;
+
+	return -EINVAL;
+}
+
+static int stm32_dfsdm_set_osrs(struct stm32_dfsdm_filter *fl,
+				unsigned int fast, unsigned int oversamp)
+{
+	unsigned int i, d, fosr, iosr;
+	u64 res;
+	s64 delta;
+	unsigned int m = 1;	/* multiplication factor */
+	unsigned int p = fl->ford;	/* filter order (ford) */
+
+	pr_debug("%s: Requested oversampling: %d\n",  __func__, oversamp);
+	/*
+	 * This function tries to compute filter oversampling and integrator
+	 * oversampling, base on oversampling ratio requested by user.
+	 *
+	 * Decimation d depends on the filter order and the oversampling ratios.
+	 * ford: filter order
+	 * fosr: filter over sampling ratio
+	 * iosr: integrator over sampling ratio
+	 */
+	if (fl->ford == DFSDM_FASTSINC_ORDER) {
+		m = 2;
+		p = 2;
+	}
+
+	/*
+	 * Look for filter and integrator oversampling ratios which allows
+	 * to reach 24 bits data output resolution.
+	 * Leave as soon as if exact resolution if reached.
+	 * Otherwise the higher resolution below 32 bits is kept.
+	 */
+	for (fosr = 1; fosr <= DFSDM_MAX_FL_OVERSAMPLING; fosr++) {
+		for (iosr = 1; iosr <= DFSDM_MAX_INT_OVERSAMPLING; iosr++) {
+			if (fast)
+				d = fosr * iosr;
+			else if (fl->ford == DFSDM_FASTSINC_ORDER)
+				d = fosr * (iosr + 3) + 2;
+			else
+				d = fosr * (iosr - 1 + p) + p;
+
+			if (d > oversamp)
+				break;
+			else if (d != oversamp)
+				continue;
+			/*
+			 * Check resolution (limited to signed 32 bits)
+			 *   res <= 2^31
+			 * Sincx filters:
+			 *   res = m * fosr^p x iosr (with m=1, p=ford)
+			 * FastSinc filter
+			 *   res = m * fosr^p x iosr (with m=2, p=2)
+			 */
+			res = fosr;
+			for (i = p - 1; i > 0; i--) {
+				res = res * (u64)fosr;
+				if (res > DFSDM_MAX_RES)
+					break;
+			}
+			if (res > DFSDM_MAX_RES)
+				continue;
+			res = res * (u64)m * (u64)iosr;
+			if (res > DFSDM_MAX_RES)
+				continue;
+
+			delta = res - DFSDM_DATA_RES;
+
+			if (res >= fl->res) {
+				fl->res = res;
+				fl->fosr = fosr;
+				fl->iosr = iosr;
+				fl->fast = fast;
+				pr_debug("%s: fosr = %d, iosr = %d\n",
+					 __func__, fl->fosr, fl->iosr);
+			}
+
+			if (!delta)
+				return 0;
+		}
+	}
+
+	if (!fl->fosr)
+		return -EINVAL;
+
+	return 0;
+}
+
+static int stm32_dfsdm_start_channel(struct stm32_dfsdm *dfsdm,
+				     unsigned int ch_id)
+{
+	return regmap_update_bits(dfsdm->regmap, DFSDM_CHCFGR1(ch_id),
+				  DFSDM_CHCFGR1_CHEN_MASK,
+				  DFSDM_CHCFGR1_CHEN(1));
+}
+
+static void stm32_dfsdm_stop_channel(struct stm32_dfsdm *dfsdm,
+				     unsigned int ch_id)
+{
+	regmap_update_bits(dfsdm->regmap, DFSDM_CHCFGR1(ch_id),
+			   DFSDM_CHCFGR1_CHEN_MASK, DFSDM_CHCFGR1_CHEN(0));
+}
+
+static int stm32_dfsdm_chan_configure(struct stm32_dfsdm *dfsdm,
+				      struct stm32_dfsdm_channel *ch)
+{
+	unsigned int id = ch->id;
+	struct regmap *regmap = dfsdm->regmap;
+	int ret;
+
+	ret = regmap_update_bits(regmap, DFSDM_CHCFGR1(id),
+				 DFSDM_CHCFGR1_SITP_MASK,
+				 DFSDM_CHCFGR1_SITP(ch->type));
+	if (ret < 0)
+		return ret;
+	ret = regmap_update_bits(regmap, DFSDM_CHCFGR1(id),
+				 DFSDM_CHCFGR1_SPICKSEL_MASK,
+				 DFSDM_CHCFGR1_SPICKSEL(ch->src));
+	if (ret < 0)
+		return ret;
+	return regmap_update_bits(regmap, DFSDM_CHCFGR1(id),
+				  DFSDM_CHCFGR1_CHINSEL_MASK,
+				  DFSDM_CHCFGR1_CHINSEL(ch->alt_si));
+}
+
+static int stm32_dfsdm_start_filter(struct stm32_dfsdm *dfsdm,
+				    unsigned int fl_id)
+{
+	int ret;
+
+	/* Enable filter */
+	ret = regmap_update_bits(dfsdm->regmap, DFSDM_CR1(fl_id),
+				 DFSDM_CR1_DFEN_MASK, DFSDM_CR1_DFEN(1));
+	if (ret < 0)
+		return ret;
+
+	/* Start conversion */
+	return regmap_update_bits(dfsdm->regmap, DFSDM_CR1(fl_id),
+				  DFSDM_CR1_RSWSTART_MASK,
+				  DFSDM_CR1_RSWSTART(1));
+}
+
+static void stm32_dfsdm_stop_filter(struct stm32_dfsdm *dfsdm, unsigned int fl_id)
+{
+	/* Disable conversion */
+	regmap_update_bits(dfsdm->regmap, DFSDM_CR1(fl_id),
+			   DFSDM_CR1_DFEN_MASK, DFSDM_CR1_DFEN(0));
+}
+
+static int stm32_dfsdm_filter_configure(struct stm32_dfsdm *dfsdm,
+					unsigned int fl_id, unsigned int ch_id)
+{
+	struct regmap *regmap = dfsdm->regmap;
+	struct stm32_dfsdm_filter *fl = &dfsdm->fl_list[fl_id];
+	int ret;
+
+	/* Average integrator oversampling */
+	ret = regmap_update_bits(regmap, DFSDM_FCR(fl_id), DFSDM_FCR_IOSR_MASK,
+				 DFSDM_FCR_IOSR(fl->iosr - 1));
+	if (ret)
+		return ret;
+
+	/* Filter order and Oversampling */
+	ret = regmap_update_bits(regmap, DFSDM_FCR(fl_id), DFSDM_FCR_FOSR_MASK,
+				 DFSDM_FCR_FOSR(fl->fosr - 1));
+	if (ret)
+		return ret;
+
+	ret = regmap_update_bits(regmap, DFSDM_FCR(fl_id), DFSDM_FCR_FORD_MASK,
+				 DFSDM_FCR_FORD(fl->ford));
+	if (ret)
+		return ret;
+
+	/* No scan mode supported for the moment */
+	ret = regmap_update_bits(regmap, DFSDM_CR1(fl_id), DFSDM_CR1_RCH_MASK,
+				 DFSDM_CR1_RCH(ch_id));
+	if (ret)
+		return ret;
+
+	return regmap_update_bits(regmap, DFSDM_CR1(fl_id),
+				  DFSDM_CR1_RSYNC_MASK,
+				  DFSDM_CR1_RSYNC(fl->sync_mode));
+}
+
+static int stm32_dfsdm_channel_parse_of(struct stm32_dfsdm *dfsdm,
+					struct iio_dev *indio_dev,
+					struct iio_chan_spec *ch)
+{
+	struct stm32_dfsdm_channel *df_ch;
+	const char *of_str;
+	int chan_idx = ch->scan_index;
+	int ret, val;
+
+	ret = of_property_read_u32_index(indio_dev->dev.of_node,
+					 "st,adc-channels", chan_idx,
+					 &ch->channel);
+	if (ret < 0) {
+		dev_err(&indio_dev->dev,
+			" Error parsing 'st,adc-channels' for idx %d\n",
+			chan_idx);
+		return ret;
+	}
+	if (ch->channel >= dfsdm->num_chs) {
+		dev_err(&indio_dev->dev,
+			" Error bad channel number %d (max = %d)\n",
+			ch->channel, dfsdm->num_chs);
+		return -EINVAL;
+	}
+
+	ret = of_property_read_string_index(indio_dev->dev.of_node,
+					    "st,adc-channel-names", chan_idx,
+					    &ch->datasheet_name);
+	if (ret < 0) {
+		dev_err(&indio_dev->dev,
+			" Error parsing 'st,adc-channel-names' for idx %d\n",
+			chan_idx);
+		return ret;
+	}
+
+	df_ch =  &dfsdm->ch_list[ch->channel];
+	df_ch->id = ch->channel;
+
+	ret = of_property_read_string_index(indio_dev->dev.of_node,
+					    "st,adc-channel-types", chan_idx,
+					    &of_str);
+	if (!ret) {
+		val  = stm32_dfsdm_str2val(of_str, stm32_dfsdm_chan_type);
+		if (val < 0)
+			return val;
+	} else {
+		val = 0;
+	}
+	df_ch->type = val;
+
+	ret = of_property_read_string_index(indio_dev->dev.of_node,
+					    "st,adc-channel-clk-src", chan_idx,
+					    &of_str);
+	if (!ret) {
+		val  = stm32_dfsdm_str2val(of_str, stm32_dfsdm_chan_src);
+		if (val < 0)
+			return val;
+	} else {
+		val = 0;
+	}
+	df_ch->src = val;
+
+	ret = of_property_read_u32_index(indio_dev->dev.of_node,
+					 "st,adc-alt-channel", chan_idx,
+					 &df_ch->alt_si);
+	if (ret < 0)
+		df_ch->alt_si = 0;
+
+	return 0;
+}
+
+static ssize_t dfsdm_adc_audio_get_spiclk(struct iio_dev *indio_dev,
+					  uintptr_t priv,
+					  const struct iio_chan_spec *chan,
+					  char *buf)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+
+	return snprintf(buf, PAGE_SIZE, "%d\n", adc->spi_freq);
+}
+
+static ssize_t dfsdm_adc_audio_set_spiclk(struct iio_dev *indio_dev,
+					  uintptr_t priv,
+					  const struct iio_chan_spec *chan,
+					  const char *buf, size_t len)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	struct stm32_dfsdm_filter *fl = &adc->dfsdm->fl_list[adc->fl_id];
+	struct stm32_dfsdm_channel *ch = &adc->dfsdm->ch_list[adc->ch_id];
+	unsigned int sample_freq = adc->sample_freq;
+	unsigned int spi_freq;
+	int ret;
+
+	dev_err(&indio_dev->dev, "enter %s\n", __func__);
+	/* If DFSDM is master on SPI, SPI freq can not be updated */
+	if (ch->src != DFSDM_CHANNEL_SPI_CLOCK_EXTERNAL)
+		return -EPERM;
+
+	ret = kstrtoint(buf, 0, &spi_freq);
+	if (ret)
+		return ret;
+
+	if (!spi_freq)
+		return -EINVAL;
+
+	if (sample_freq) {
+		if (spi_freq % sample_freq)
+			dev_warn(&indio_dev->dev,
+				 "Sampling rate not accurate (%d)\n",
+				 spi_freq / (spi_freq / sample_freq));
+
+		ret = stm32_dfsdm_set_osrs(fl, 0, (spi_freq / sample_freq));
+		if (ret < 0) {
+			dev_err(&indio_dev->dev,
+				"No filter parameters that match!\n");
+			return ret;
+		}
+	}
+	adc->spi_freq = spi_freq;
+
+	return len;
+}
+
+static int stm32_dfsdm_start_conv(struct stm32_dfsdm_adc *adc, bool dma)
+{
+	struct regmap *regmap = adc->dfsdm->regmap;
+	int ret;
+	unsigned int dma_en = 0, cont_en = 0;
+
+	ret = stm32_dfsdm_start_channel(adc->dfsdm, adc->ch_id);
+	if (ret < 0)
+		return ret;
+
+	ret = stm32_dfsdm_filter_configure(adc->dfsdm, adc->fl_id,
+					   adc->ch_id);
+	if (ret < 0)
+		goto stop_channels;
+
+	if (dma) {
+		/* Enable DMA transfer*/
+		dma_en =  DFSDM_CR1_RDMAEN(1);
+		/* Enable conversion triggered by SPI clock*/
+		cont_en = DFSDM_CR1_RCONT(1);
+	}
+	/* Enable DMA transfer*/
+	ret = regmap_update_bits(regmap, DFSDM_CR1(adc->fl_id),
+				 DFSDM_CR1_RDMAEN_MASK, dma_en);
+	if (ret < 0)
+		goto stop_channels;
+
+	/* Enable conversion triggered by SPI clock*/
+	ret = regmap_update_bits(regmap, DFSDM_CR1(adc->fl_id),
+				 DFSDM_CR1_RCONT_MASK, cont_en);
+	if (ret < 0)
+		goto stop_channels;
+
+	ret = stm32_dfsdm_start_filter(adc->dfsdm, adc->fl_id);
+	if (ret < 0)
+		goto stop_channels;
+
+	return 0;
+
+stop_channels:
+	regmap_update_bits(regmap, DFSDM_CR1(adc->fl_id),
+			   DFSDM_CR1_RDMAEN_MASK, 0);
+
+	regmap_update_bits(regmap, DFSDM_CR1(adc->fl_id),
+			   DFSDM_CR1_RCONT_MASK, 0);
+	stm32_dfsdm_stop_channel(adc->dfsdm, adc->fl_id);
+
+	return ret;
+}
+
+static void stm32_dfsdm_stop_conv(struct stm32_dfsdm_adc *adc)
+{
+	struct regmap *regmap = adc->dfsdm->regmap;
+
+	stm32_dfsdm_stop_filter(adc->dfsdm, adc->fl_id);
+
+	/* Clean conversion options */
+	regmap_update_bits(regmap, DFSDM_CR1(adc->fl_id),
+			   DFSDM_CR1_RDMAEN_MASK, 0);
+
+	regmap_update_bits(regmap, DFSDM_CR1(adc->fl_id),
+			   DFSDM_CR1_RCONT_MASK, 0);
+
+	stm32_dfsdm_stop_channel(adc->dfsdm, adc->ch_id);
+}
+
+static int stm32_dfsdm_set_watermark(struct iio_dev *indio_dev,
+				     unsigned int val)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	unsigned int watermark = DFSDM_DMA_BUFFER_SIZE / 2;
+
+	/*
+	 * DMA cyclic transfers are used, buffer is split into two periods.
+	 * There should be :
+	 * - always one buffer (period) DMA is working on
+	 * - one buffer (period) driver pushed to ASoC side.
+	 */
+	watermark = min(watermark, val * (unsigned int)(sizeof(u32)));
+	adc->buf_sz = watermark * 2;
+
+	return 0;
+}
+
+static unsigned int stm32_dfsdm_adc_dma_residue(struct stm32_dfsdm_adc *adc)
+{
+	struct dma_tx_state state;
+	enum dma_status status;
+
+	status = dmaengine_tx_status(adc->dma_chan,
+				     adc->dma_chan->cookie,
+				     &state);
+	if (status == DMA_IN_PROGRESS) {
+		/* Residue is size in bytes from end of buffer */
+		unsigned int i = adc->buf_sz - state.residue;
+		unsigned int size;
+
+		/* Return available bytes */
+		if (i >= adc->bufi)
+			size = i - adc->bufi;
+		else
+			size = adc->buf_sz + i - adc->bufi;
+
+		return size;
+	}
+
+	return 0;
+}
+
+static void stm32_dfsdm_audio_dma_buffer_done(void *data)
+{
+	struct iio_dev *indio_dev = data;
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	int available = stm32_dfsdm_adc_dma_residue(adc);
+	size_t old_pos;
+
+	/*
+	 * FIXME: In Kernel interface does not support cyclic DMA buffer,and
+	 * offers only an interface to push data samples per samples.
+	 * For this reason IIO buffer interface is not used and interface is
+	 * bypassed using a private callback registered by ASoC.
+	 * This should be a temporary solution waiting a cyclic DMA engine
+	 * support in IIO.
+	 */
+
+	dev_dbg(&indio_dev->dev, "%s: pos = %d, available = %d\n", __func__,
+		adc->bufi, available);
+	old_pos = adc->bufi;
+
+	while (available >= indio_dev->scan_bytes) {
+		u32 *buffer = (u32 *)&adc->rx_buf[adc->bufi];
+
+		/* Mask 8 LSB that contains the channel ID */
+		*buffer = (*buffer & 0xFFFFFF00) << 8;
+		available -= indio_dev->scan_bytes;
+		adc->bufi += indio_dev->scan_bytes;
+		if (adc->bufi >= adc->buf_sz) {
+			if (adc->cb)
+				adc->cb(&adc->rx_buf[old_pos],
+					 adc->buf_sz - old_pos, adc->cb_priv);
+			adc->bufi = 0;
+			old_pos = 0;
+		}
+	}
+	if (adc->cb)
+		adc->cb(&adc->rx_buf[old_pos], adc->bufi - old_pos,
+			adc->cb_priv);
+}
+
+static int stm32_dfsdm_adc_dma_start(struct iio_dev *indio_dev)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	struct dma_async_tx_descriptor *desc;
+	dma_cookie_t cookie;
+	int ret;
+
+	if (!adc->dma_chan)
+		return -EINVAL;
+
+	dev_dbg(&indio_dev->dev, "%s size=%d watermark=%d\n", __func__,
+		adc->buf_sz, adc->buf_sz / 2);
+
+	/* Prepare a DMA cyclic transaction */
+	desc = dmaengine_prep_dma_cyclic(adc->dma_chan,
+					 adc->dma_buf,
+					 adc->buf_sz, adc->buf_sz / 2,
+					 DMA_DEV_TO_MEM,
+					 DMA_PREP_INTERRUPT);
+	if (!desc)
+		return -EBUSY;
+
+	desc->callback = stm32_dfsdm_audio_dma_buffer_done;
+	desc->callback_param = indio_dev;
+
+	cookie = dmaengine_submit(desc);
+	ret = dma_submit_error(cookie);
+	if (ret) {
+		dmaengine_terminate_all(adc->dma_chan);
+		return ret;
+	}
+
+	/* Issue pending DMA requests */
+	dma_async_issue_pending(adc->dma_chan);
+
+	return 0;
+}
+
+static int stm32_dfsdm_postenable(struct iio_dev *indio_dev)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	int ret;
+
+	/* Reset adc buffer index */
+	adc->bufi = 0;
+
+	ret = stm32_dfsdm_start_dfsdm(adc->dfsdm);
+	if (ret < 0)
+		return ret;
+
+	ret = stm32_dfsdm_start_conv(adc, true);
+	if (ret) {
+		dev_err(&indio_dev->dev, "Can't start conversion\n");
+		goto stop_dfsdm;
+	}
+
+	if (adc->dma_chan) {
+		ret = stm32_dfsdm_adc_dma_start(indio_dev);
+		if (ret) {
+			dev_err(&indio_dev->dev, "Can't start DMA\n");
+			goto err_stop_conv;
+		}
+	}
+
+	return 0;
+
+err_stop_conv:
+	stm32_dfsdm_stop_conv(adc);
+stop_dfsdm:
+	stm32_dfsdm_stop_dfsdm(adc->dfsdm);
+
+	return ret;
+}
+
+static int stm32_dfsdm_predisable(struct iio_dev *indio_dev)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+
+	if (adc->dma_chan)
+		dmaengine_terminate_all(adc->dma_chan);
+
+	stm32_dfsdm_stop_conv(adc);
+
+	stm32_dfsdm_stop_dfsdm(adc->dfsdm);
+
+	return 0;
+}
+
+static const struct iio_buffer_setup_ops stm32_dfsdm_buffer_setup_ops = {
+	.postenable = &stm32_dfsdm_postenable,
+	.predisable = &stm32_dfsdm_predisable,
+};
+
+/**
+ * stm32_dfsdm_get_buff_cb() - register a callback that will be called when
+ *                             DMA transfer period is achieved.
+ *
+ * @iio_dev: Handle to IIO device.
+ * @cb: Pointer to callback function:
+ *      - data: pointer to data buffer
+ *      - size: size in byte of the data buffer
+ *      - private: pointer to consumer private structure.
+ * @private: Pointer to consumer private structure.
+ */
+int stm32_dfsdm_get_buff_cb(struct iio_dev *iio_dev,
+			    int (*cb)(const void *data, size_t size,
+				      void *private),
+			    void *private)
+{
+	struct stm32_dfsdm_adc *adc;
+
+	if (!iio_dev)
+		return -EINVAL;
+	adc = iio_priv(iio_dev);
+
+	adc->cb = cb;
+	adc->cb_priv = private;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(stm32_dfsdm_get_buff_cb);
+
+/**
+ * stm32_dfsdm_release_buff_cb - unregister buffer callback
+ *
+ * @iio_dev: Handle to IIO device.
+ */
+int stm32_dfsdm_release_buff_cb(struct iio_dev *iio_dev)
+{
+	struct stm32_dfsdm_adc *adc;
+
+	if (!iio_dev)
+		return -EINVAL;
+	adc = iio_priv(iio_dev);
+
+	adc->cb = NULL;
+	adc->cb_priv = NULL;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(stm32_dfsdm_release_buff_cb);
+
+static int stm32_dfsdm_single_conv(struct iio_dev *indio_dev,
+				   const struct iio_chan_spec *chan, int *res)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	long timeout;
+	int ret;
+
+	reinit_completion(&adc->completion);
+
+	adc->buffer = res;
+
+	ret = stm32_dfsdm_start_dfsdm(adc->dfsdm);
+	if (ret < 0)
+		return ret;
+
+	ret = regmap_update_bits(adc->dfsdm->regmap, DFSDM_CR2(adc->fl_id),
+				 DFSDM_CR2_REOCIE_MASK, DFSDM_CR2_REOCIE(1));
+	if (ret < 0)
+		goto stop_dfsdm;
+
+	ret = stm32_dfsdm_start_conv(adc, false);
+	if (ret < 0) {
+		regmap_update_bits(adc->dfsdm->regmap, DFSDM_CR2(adc->fl_id),
+				   DFSDM_CR2_REOCIE_MASK, DFSDM_CR2_REOCIE(0));
+		goto stop_dfsdm;
+	}
+
+	timeout = wait_for_completion_interruptible_timeout(&adc->completion,
+							    DFSDM_TIMEOUT);
+
+	/* Mask IRQ for regular conversion achievement*/
+	regmap_update_bits(adc->dfsdm->regmap, DFSDM_CR2(adc->fl_id),
+			   DFSDM_CR2_REOCIE_MASK, DFSDM_CR2_REOCIE(0));
+
+	if (timeout == 0)
+		ret = -ETIMEDOUT;
+	else if (timeout < 0)
+		ret = timeout;
+	else
+		ret = IIO_VAL_INT;
+
+	stm32_dfsdm_stop_conv(adc);
+
+stop_dfsdm:
+	stm32_dfsdm_stop_dfsdm(adc->dfsdm);
+
+	return ret;
+}
+
+static int stm32_dfsdm_write_raw(struct iio_dev *indio_dev,
+				 struct iio_chan_spec const *chan,
+				 int val, int val2, long mask)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	struct stm32_dfsdm_filter *fl = &adc->dfsdm->fl_list[adc->fl_id];
+	struct stm32_dfsdm_channel *ch = &adc->dfsdm->ch_list[adc->ch_id];
+	unsigned int spi_freq = adc->spi_freq;
+	int ret = -EINVAL;
+
+	switch (mask) {
+	case IIO_CHAN_INFO_OVERSAMPLING_RATIO:
+		ret = stm32_dfsdm_set_osrs(fl, 0, val);
+		if (!ret)
+			adc->oversamp = val;
+
+		return ret;
+
+	case IIO_CHAN_INFO_SAMP_FREQ:
+		if (!val)
+			return -EINVAL;
+		if (ch->src != DFSDM_CHANNEL_SPI_CLOCK_EXTERNAL)
+			spi_freq = adc->dfsdm->spi_master_freq;
+
+		if (spi_freq % val)
+			dev_warn(&indio_dev->dev,
+				 "Sampling rate not accurate (%d)\n",
+				 spi_freq / (spi_freq / val));
+
+		ret = stm32_dfsdm_set_osrs(fl, 0, (spi_freq / val));
+		if (ret < 0) {
+			dev_err(&indio_dev->dev,
+				"Not able to find parameter that match!\n");
+			return ret;
+		}
+		adc->sample_freq = val;
+
+		return 0;
+	}
+
+	return -EINVAL;
+}
+
+static int stm32_dfsdm_read_raw(struct iio_dev *indio_dev,
+				struct iio_chan_spec const *chan, int *val,
+				int *val2, long mask)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	int ret;
+
+	switch (mask) {
+	case IIO_CHAN_INFO_RAW:
+		ret = iio_hw_consumer_enable(adc->hwc);
+		if (ret < 0) {
+			dev_err(&indio_dev->dev,
+				"%s: IIO enable failed (channel %d)\n",
+				__func__, chan->channel);
+			return ret;
+		}
+		ret = stm32_dfsdm_single_conv(indio_dev, chan, val);
+		iio_hw_consumer_disable(adc->hwc);
+		if (ret < 0) {
+			dev_err(&indio_dev->dev,
+				"%s: Conversion failed (channel %d)\n",
+				__func__, chan->channel);
+			return ret;
+		}
+		return IIO_VAL_INT;
+
+	case IIO_CHAN_INFO_OVERSAMPLING_RATIO:
+		*val = adc->oversamp;
+
+		return IIO_VAL_INT;
+
+	case IIO_CHAN_INFO_SAMP_FREQ:
+		*val = adc->sample_freq;
+
+		return IIO_VAL_INT;
+	}
+
+	return -EINVAL;
+}
+
+static const struct iio_info stm32_dfsdm_info_audio = {
+	.hwfifo_set_watermark = stm32_dfsdm_set_watermark,
+	.read_raw = stm32_dfsdm_read_raw,
+	.write_raw = stm32_dfsdm_write_raw,
+};
+
+static const struct iio_info stm32_dfsdm_info_adc = {
+	.read_raw = stm32_dfsdm_read_raw,
+	.write_raw = stm32_dfsdm_write_raw,
+};
+
+static irqreturn_t stm32_dfsdm_irq(int irq, void *arg)
+{
+	struct stm32_dfsdm_adc *adc = arg;
+	struct iio_dev *indio_dev = iio_priv_to_dev(adc);
+	struct regmap *regmap = adc->dfsdm->regmap;
+	unsigned int status, int_en;
+
+	regmap_read(regmap, DFSDM_ISR(adc->fl_id), &status);
+	regmap_read(regmap, DFSDM_CR2(adc->fl_id), &int_en);
+
+	if (status & DFSDM_ISR_REOCF_MASK) {
+		/* Read the data register clean the IRQ status */
+		regmap_read(regmap, DFSDM_RDATAR(adc->fl_id), adc->buffer);
+		complete(&adc->completion);
+	}
+
+	if (status & DFSDM_ISR_ROVRF_MASK) {
+		if (int_en & DFSDM_CR2_ROVRIE_MASK)
+			dev_warn(&indio_dev->dev, "Overrun detected\n");
+		regmap_update_bits(regmap, DFSDM_ICR(adc->fl_id),
+				   DFSDM_ICR_CLRROVRF_MASK,
+				   DFSDM_ICR_CLRROVRF_MASK);
+	}
+
+	return IRQ_HANDLED;
+}
+
+/*
+ * Define external info for SPI Frequency and audio sampling rate that can be
+ * configured by ASoC driver through consumer.h API
+ */
+static const struct iio_chan_spec_ext_info dfsdm_adc_audio_ext_info[] = {
+	/* spi_clk_freq : clock freq on SPI/manchester bus used by channel */
+	{
+		.name = "spi_clk_freq",
+		.shared = IIO_SHARED_BY_TYPE,
+		.read = dfsdm_adc_audio_get_spiclk,
+		.write = dfsdm_adc_audio_set_spiclk,
+	},
+	{},
+};
+
+static void stm32_dfsdm_dma_release(struct iio_dev *indio_dev)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+
+	if (adc->dma_chan) {
+		dma_free_coherent(adc->dma_chan->device->dev,
+				  DFSDM_DMA_BUFFER_SIZE,
+				  adc->rx_buf, adc->dma_buf);
+		dma_release_channel(adc->dma_chan);
+	}
+}
+
+static int stm32_dfsdm_dma_request(struct iio_dev *indio_dev)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	struct dma_slave_config config = {
+		.src_addr = (dma_addr_t)adc->dfsdm->phys_base +
+			DFSDM_RDATAR(adc->fl_id),
+		.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES,
+	};
+	int ret;
+
+	adc->dma_chan = dma_request_slave_channel(&indio_dev->dev, "rx");
+	if (!adc->dma_chan)
+		return -EINVAL;
+
+	adc->rx_buf = dma_alloc_coherent(adc->dma_chan->device->dev,
+					 DFSDM_DMA_BUFFER_SIZE,
+					 &adc->dma_buf, GFP_KERNEL);
+	if (!adc->rx_buf) {
+		ret = -ENOMEM;
+		goto err_release;
+	}
+
+	ret = dmaengine_slave_config(adc->dma_chan, &config);
+	if (ret)
+		goto err_free;
+
+	return 0;
+
+err_free:
+	dma_free_coherent(adc->dma_chan->device->dev, DFSDM_DMA_BUFFER_SIZE,
+			  adc->rx_buf, adc->dma_buf);
+err_release:
+	dma_release_channel(adc->dma_chan);
+
+	return ret;
+}
+
+static int stm32_dfsdm_adc_chan_init_one(struct iio_dev *indio_dev,
+					 struct iio_chan_spec *ch)
+{
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	int ret;
+
+	ret = stm32_dfsdm_channel_parse_of(adc->dfsdm, indio_dev, ch);
+	if (ret < 0)
+		return ret;
+
+	ch->type = IIO_VOLTAGE;
+	ch->indexed = 1;
+
+	/*
+	 * IIO_CHAN_INFO_RAW: used to compute regular conversion
+	 * IIO_CHAN_INFO_OVERSAMPLING_RATIO: used to set oversampling
+	 */
+	ch->info_mask_separate = BIT(IIO_CHAN_INFO_RAW);
+	ch->info_mask_shared_by_all = BIT(IIO_CHAN_INFO_OVERSAMPLING_RATIO);
+
+	if (adc->dev_data->type == DFSDM_AUDIO) {
+		ch->scan_type.sign = 's';
+		ch->ext_info = dfsdm_adc_audio_ext_info;
+	} else {
+		ch->scan_type.sign = 'u';
+	}
+	ch->scan_type.realbits = 24;
+	ch->scan_type.storagebits = 32;
+	adc->ch_id = ch->channel;
+
+	return stm32_dfsdm_chan_configure(adc->dfsdm,
+					  &adc->dfsdm->ch_list[ch->channel]);
+}
+
+static int stm32_dfsdm_audio_init(struct iio_dev *indio_dev)
+{
+	struct iio_chan_spec *ch;
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	struct stm32_dfsdm_channel *d_ch;
+	int ret;
+
+	indio_dev->modes |= INDIO_BUFFER_SOFTWARE;
+	indio_dev->setup_ops = &stm32_dfsdm_buffer_setup_ops;
+
+	ch = devm_kzalloc(&indio_dev->dev, sizeof(*ch), GFP_KERNEL);
+	if (!ch)
+		return -ENOMEM;
+
+	ch->scan_index = 0;
+
+	ret = stm32_dfsdm_adc_chan_init_one(indio_dev, ch);
+	if (ret < 0) {
+		dev_err(&indio_dev->dev, "Channels init failed\n");
+		return ret;
+	}
+	ch->info_mask_separate = BIT(IIO_CHAN_INFO_SAMP_FREQ);
+
+	d_ch = &adc->dfsdm->ch_list[adc->ch_id];
+	if (d_ch->src != DFSDM_CHANNEL_SPI_CLOCK_EXTERNAL)
+		adc->spi_freq = adc->dfsdm->spi_master_freq;
+
+	indio_dev->num_channels = 1;
+	indio_dev->channels = ch;
+
+	return stm32_dfsdm_dma_request(indio_dev);
+}
+
+static int stm32_dfsdm_adc_init(struct iio_dev *indio_dev)
+{
+	struct iio_chan_spec *ch;
+	struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
+	int num_ch;
+	int ret, chan_idx;
+
+	adc->oversamp = DFSDM_DEFAULT_OVERSAMPLING;
+	ret = stm32_dfsdm_set_osrs(&adc->dfsdm->fl_list[adc->fl_id], 0,
+				   adc->oversamp);
+	if (ret < 0)
+		return ret;
+
+	num_ch = of_property_count_u32_elems(indio_dev->dev.of_node,
+					     "st,adc-channels");
+	if (num_ch < 0 || num_ch > adc->dfsdm->num_chs) {
+		dev_err(&indio_dev->dev, "Bad st,adc-channels\n");
+		return num_ch < 0 ? num_ch : -EINVAL;
+	}
+
+	/* Bind to SD modulator IIO device */
+	adc->hwc = devm_iio_hw_consumer_alloc(&indio_dev->dev);
+	if (IS_ERR(adc->hwc))
+		return -EPROBE_DEFER;
+
+	ch = devm_kcalloc(&indio_dev->dev, num_ch, sizeof(*ch),
+			  GFP_KERNEL);
+	if (!ch)
+		return -ENOMEM;
+
+	for (chan_idx = 0; chan_idx < num_ch; chan_idx++) {
+		ch->scan_index = chan_idx;
+		ret = stm32_dfsdm_adc_chan_init_one(indio_dev, ch);
+		if (ret < 0) {
+			dev_err(&indio_dev->dev, "Channels init failed\n");
+			return ret;
+		}
+	}
+
+	indio_dev->num_channels = num_ch;
+	indio_dev->channels = ch;
+
+	init_completion(&adc->completion);
+
+	return 0;
+}
+
+static const struct stm32_dfsdm_dev_data stm32h7_dfsdm_adc_data = {
+	.type = DFSDM_IIO,
+	.init = stm32_dfsdm_adc_init,
+};
+
+static const struct stm32_dfsdm_dev_data stm32h7_dfsdm_audio_data = {
+	.type = DFSDM_AUDIO,
+	.init = stm32_dfsdm_audio_init,
+};
+
+static const struct of_device_id stm32_dfsdm_adc_match[] = {
+	{
+		.compatible = "st,stm32-dfsdm-adc",
+		.data = &stm32h7_dfsdm_adc_data,
+	},
+	{
+		.compatible = "st,stm32-dfsdm-dmic",
+		.data = &stm32h7_dfsdm_audio_data,
+	},
+	{}
+};
+
+static int stm32_dfsdm_adc_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct stm32_dfsdm_adc *adc;
+	struct device_node *np = dev->of_node;
+	const struct stm32_dfsdm_dev_data *dev_data;
+	struct iio_dev *iio;
+	char *name;
+	int ret, irq, val;
+
+
+	dev_data = of_device_get_match_data(dev);
+	iio = devm_iio_device_alloc(dev, sizeof(*adc));
+	if (!iio) {
+		dev_err(dev, "%s: Failed to allocate IIO\n", __func__);
+		return -ENOMEM;
+	}
+
+	adc = iio_priv(iio);
+	adc->dfsdm = dev_get_drvdata(dev->parent);
+
+	iio->dev.parent = dev;
+	iio->dev.of_node = np;
+	iio->modes = INDIO_DIRECT_MODE | INDIO_BUFFER_SOFTWARE;
+
+	platform_set_drvdata(pdev, adc);
+
+	ret = of_property_read_u32(dev->of_node, "reg", &adc->fl_id);
+	if (ret != 0) {
+		dev_err(dev, "Missing reg property\n");
+		return -EINVAL;
+	}
+
+	name = devm_kzalloc(dev, sizeof("dfsdm-adc0"), GFP_KERNEL);
+	if (!name)
+		return -ENOMEM;
+	if (dev_data->type == DFSDM_AUDIO) {
+		iio->info = &stm32_dfsdm_info_audio;
+		snprintf(name, sizeof("dfsdm-pdm0"), "dfsdm-pdm%d", adc->fl_id);
+	} else {
+		iio->info = &stm32_dfsdm_info_adc;
+		snprintf(name, sizeof("dfsdm-adc0"), "dfsdm-adc%d", adc->fl_id);
+	}
+	iio->name = name;
+
+	/*
+	 * In a first step IRQs generated for channels are not treated.
+	 * So IRQ associated to filter instance 0 is dedicated to the Filter 0.
+	 */
+	irq = platform_get_irq(pdev, 0);
+	ret = devm_request_irq(dev, irq, stm32_dfsdm_irq,
+			       0, pdev->name, adc);
+	if (ret < 0) {
+		dev_err(dev, "Failed to request IRQ\n");
+		return ret;
+	}
+
+	ret = of_property_read_u32(dev->of_node, "st,filter-order", &val);
+	if (ret < 0) {
+		dev_err(dev, "Failed to set filter order\n");
+		return ret;
+	}
+
+	adc->dfsdm->fl_list[adc->fl_id].ford = val;
+
+	ret = of_property_read_u32(dev->of_node, "st,filter0-sync", &val);
+	if (!ret)
+		adc->dfsdm->fl_list[adc->fl_id].sync_mode = val;
+
+	adc->dev_data = dev_data;
+	ret = dev_data->init(iio);
+	if (ret < 0)
+		return ret;
+
+	ret = iio_device_register(iio);
+	if (ret < 0)
+		goto err_cleanup;
+
+	dev_err(dev, "of_platform_populate\n");
+	if (dev_data->type == DFSDM_AUDIO) {
+		ret = of_platform_populate(np, NULL, NULL, dev);
+		if (ret < 0) {
+			dev_err(dev, "Failed to find an audio DAI\n");
+			goto err_unregister;
+		}
+	}
+
+	return 0;
+
+err_unregister:
+	iio_device_unregister(iio);
+err_cleanup:
+	stm32_dfsdm_dma_release(iio);
+
+	return ret;
+}
+
+static int stm32_dfsdm_adc_remove(struct platform_device *pdev)
+{
+	struct stm32_dfsdm_adc *adc = platform_get_drvdata(pdev);
+	struct iio_dev *indio_dev = iio_priv_to_dev(adc);
+
+	if (adc->dev_data->type == DFSDM_AUDIO)
+		of_platform_depopulate(&pdev->dev);
+	iio_device_unregister(indio_dev);
+	stm32_dfsdm_dma_release(indio_dev);
+
+	return 0;
+}
+
+static struct platform_driver stm32_dfsdm_adc_driver = {
+	.driver = {
+		.name = "stm32-dfsdm-adc",
+		.of_match_table = stm32_dfsdm_adc_match,
+	},
+	.probe = stm32_dfsdm_adc_probe,
+	.remove = stm32_dfsdm_adc_remove,
+};
+module_platform_driver(stm32_dfsdm_adc_driver);
+
+MODULE_DESCRIPTION("STM32 sigma delta ADC");
+MODULE_AUTHOR("Arnaud Pouliquen <arnaud.pouliquen@st.com>");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iio/adc/stm32-dfsdm-core.c b/drivers/iio/adc/stm32-dfsdm-core.c
new file mode 100644
index 0000000..6290332
--- /dev/null
+++ b/drivers/iio/adc/stm32-dfsdm-core.c
@@ -0,0 +1,302 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file is part the core part STM32 DFSDM driver
+ *
+ * Copyright (C) 2017, STMicroelectronics - All Rights Reserved
+ * Author(s): Arnaud Pouliquen <arnaud.pouliquen@st.com> for STMicroelectronics.
+ */
+
+#include <linux/clk.h>
+#include <linux/iio/iio.h>
+#include <linux/iio/sysfs.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/regmap.h>
+#include <linux/slab.h>
+
+#include "stm32-dfsdm.h"
+
+struct stm32_dfsdm_dev_data {
+	unsigned int num_filters;
+	unsigned int num_channels;
+	const struct regmap_config *regmap_cfg;
+};
+
+#define STM32H7_DFSDM_NUM_FILTERS	4
+#define STM32H7_DFSDM_NUM_CHANNELS	8
+
+static bool stm32_dfsdm_volatile_reg(struct device *dev, unsigned int reg)
+{
+	if (reg < DFSDM_FILTER_BASE_ADR)
+		return false;
+
+	/*
+	 * Mask is done on register to avoid to list registers of all
+	 * filter instances.
+	 */
+	switch (reg & DFSDM_FILTER_REG_MASK) {
+	case DFSDM_CR1(0) & DFSDM_FILTER_REG_MASK:
+	case DFSDM_ISR(0) & DFSDM_FILTER_REG_MASK:
+	case DFSDM_JDATAR(0) & DFSDM_FILTER_REG_MASK:
+	case DFSDM_RDATAR(0) & DFSDM_FILTER_REG_MASK:
+		return true;
+	}
+
+	return false;
+}
+
+static const struct regmap_config stm32h7_dfsdm_regmap_cfg = {
+	.reg_bits = 32,
+	.val_bits = 32,
+	.reg_stride = sizeof(u32),
+	.max_register = 0x2B8,
+	.volatile_reg = stm32_dfsdm_volatile_reg,
+	.fast_io = true,
+};
+
+static const struct stm32_dfsdm_dev_data stm32h7_dfsdm_data = {
+	.num_filters = STM32H7_DFSDM_NUM_FILTERS,
+	.num_channels = STM32H7_DFSDM_NUM_CHANNELS,
+	.regmap_cfg = &stm32h7_dfsdm_regmap_cfg,
+};
+
+struct dfsdm_priv {
+	struct platform_device *pdev; /* platform device */
+
+	struct stm32_dfsdm dfsdm; /* common data exported for all instances */
+
+	unsigned int spi_clk_out_div; /* SPI clkout divider value */
+	atomic_t n_active_ch;	/* number of current active channels */
+
+	struct clk *clk; /* DFSDM clock */
+	struct clk *aclk; /* audio clock */
+};
+
+/**
+ * stm32_dfsdm_start_dfsdm - start global dfsdm interface.
+ *
+ * Enable interface if n_active_ch is not null.
+ * @dfsdm: Handle used to retrieve dfsdm context.
+ */
+int stm32_dfsdm_start_dfsdm(struct stm32_dfsdm *dfsdm)
+{
+	struct dfsdm_priv *priv = container_of(dfsdm, struct dfsdm_priv, dfsdm);
+	struct device *dev = &priv->pdev->dev;
+	unsigned int clk_div = priv->spi_clk_out_div;
+	int ret;
+
+	if (atomic_inc_return(&priv->n_active_ch) == 1) {
+		ret = clk_prepare_enable(priv->clk);
+		if (ret < 0) {
+			dev_err(dev, "Failed to start clock\n");
+			goto error_ret;
+		}
+		if (priv->aclk) {
+			ret = clk_prepare_enable(priv->aclk);
+			if (ret < 0) {
+				dev_err(dev, "Failed to start audio clock\n");
+				goto disable_clk;
+			}
+		}
+
+		/* Output the SPI CLKOUT (if clk_div == 0 clock if OFF) */
+		ret = regmap_update_bits(dfsdm->regmap, DFSDM_CHCFGR1(0),
+					 DFSDM_CHCFGR1_CKOUTDIV_MASK,
+					 DFSDM_CHCFGR1_CKOUTDIV(clk_div));
+		if (ret < 0)
+			goto disable_aclk;
+
+		/* Global enable of DFSDM interface */
+		ret = regmap_update_bits(dfsdm->regmap, DFSDM_CHCFGR1(0),
+					 DFSDM_CHCFGR1_DFSDMEN_MASK,
+					 DFSDM_CHCFGR1_DFSDMEN(1));
+		if (ret < 0)
+			goto disable_aclk;
+	}
+
+	dev_dbg(dev, "%s: n_active_ch %d\n", __func__,
+		atomic_read(&priv->n_active_ch));
+
+	return 0;
+
+disable_aclk:
+	clk_disable_unprepare(priv->aclk);
+disable_clk:
+	clk_disable_unprepare(priv->clk);
+
+error_ret:
+	atomic_dec(&priv->n_active_ch);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(stm32_dfsdm_start_dfsdm);
+
+/**
+ * stm32_dfsdm_stop_dfsdm - stop global DFSDM interface.
+ *
+ * Disable interface if n_active_ch is null
+ * @dfsdm: Handle used to retrieve dfsdm context.
+ */
+int stm32_dfsdm_stop_dfsdm(struct stm32_dfsdm *dfsdm)
+{
+	struct dfsdm_priv *priv = container_of(dfsdm, struct dfsdm_priv, dfsdm);
+	int ret;
+
+	if (atomic_dec_and_test(&priv->n_active_ch)) {
+		/* Global disable of DFSDM interface */
+		ret = regmap_update_bits(dfsdm->regmap, DFSDM_CHCFGR1(0),
+					 DFSDM_CHCFGR1_DFSDMEN_MASK,
+					 DFSDM_CHCFGR1_DFSDMEN(0));
+		if (ret < 0)
+			return ret;
+
+		/* Stop SPI CLKOUT */
+		ret = regmap_update_bits(dfsdm->regmap, DFSDM_CHCFGR1(0),
+					 DFSDM_CHCFGR1_CKOUTDIV_MASK,
+					 DFSDM_CHCFGR1_CKOUTDIV(0));
+		if (ret < 0)
+			return ret;
+
+		clk_disable_unprepare(priv->clk);
+		if (priv->aclk)
+			clk_disable_unprepare(priv->aclk);
+	}
+	dev_dbg(&priv->pdev->dev, "%s: n_active_ch %d\n", __func__,
+		atomic_read(&priv->n_active_ch));
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(stm32_dfsdm_stop_dfsdm);
+
+static int stm32_dfsdm_parse_of(struct platform_device *pdev,
+				struct dfsdm_priv *priv)
+{
+	struct device_node *node = pdev->dev.of_node;
+	struct resource *res;
+	unsigned long clk_freq;
+	unsigned int spi_freq, rem;
+	int ret;
+
+	if (!node)
+		return -EINVAL;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		dev_err(&pdev->dev, "Failed to get memory resource\n");
+		return -ENODEV;
+	}
+	priv->dfsdm.phys_base = res->start;
+	priv->dfsdm.base = devm_ioremap_resource(&pdev->dev, res);
+
+	/*
+	 * "dfsdm" clock is mandatory for DFSDM peripheral clocking.
+	 * "dfsdm" or "audio" clocks can be used as source clock for
+	 * the SPI clock out signal and internal processing, depending
+	 * on use case.
+	 */
+	priv->clk = devm_clk_get(&pdev->dev, "dfsdm");
+	if (IS_ERR(priv->clk)) {
+		dev_err(&pdev->dev, "No stm32_dfsdm_clk clock found\n");
+		return -EINVAL;
+	}
+
+	priv->aclk = devm_clk_get(&pdev->dev, "audio");
+	if (IS_ERR(priv->aclk))
+		priv->aclk = NULL;
+
+	if (priv->aclk)
+		clk_freq = clk_get_rate(priv->aclk);
+	else
+		clk_freq = clk_get_rate(priv->clk);
+
+	/* SPI clock out frequency */
+	ret = of_property_read_u32(pdev->dev.of_node, "spi-max-frequency",
+				   &spi_freq);
+	if (ret < 0) {
+		/* No SPI master mode */
+		return 0;
+	}
+
+	priv->spi_clk_out_div = div_u64_rem(clk_freq, spi_freq, &rem) - 1;
+	priv->dfsdm.spi_master_freq = spi_freq;
+
+	if (rem) {
+		dev_warn(&pdev->dev, "SPI clock not accurate\n");
+		dev_warn(&pdev->dev, "%ld = %d * %d + %d\n",
+			 clk_freq, spi_freq, priv->spi_clk_out_div + 1, rem);
+	}
+
+	return 0;
+};
+
+static const struct of_device_id stm32_dfsdm_of_match[] = {
+	{
+		.compatible = "st,stm32h7-dfsdm",
+		.data = &stm32h7_dfsdm_data,
+	},
+	{}
+};
+MODULE_DEVICE_TABLE(of, stm32_dfsdm_of_match);
+
+static int stm32_dfsdm_probe(struct platform_device *pdev)
+{
+	struct dfsdm_priv *priv;
+	const struct stm32_dfsdm_dev_data *dev_data;
+	struct stm32_dfsdm *dfsdm;
+	int ret;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->pdev = pdev;
+
+	dev_data = of_device_get_match_data(&pdev->dev);
+
+	dfsdm = &priv->dfsdm;
+	dfsdm->fl_list = devm_kcalloc(&pdev->dev, dev_data->num_filters,
+				      sizeof(*dfsdm->fl_list), GFP_KERNEL);
+	if (!dfsdm->fl_list)
+		return -ENOMEM;
+
+	dfsdm->num_fls = dev_data->num_filters;
+	dfsdm->ch_list = devm_kcalloc(&pdev->dev, dev_data->num_channels,
+				      sizeof(*dfsdm->ch_list),
+				      GFP_KERNEL);
+	if (!dfsdm->ch_list)
+		return -ENOMEM;
+	dfsdm->num_chs = dev_data->num_channels;
+
+	ret = stm32_dfsdm_parse_of(pdev, priv);
+	if (ret < 0)
+		return ret;
+
+	dfsdm->regmap = devm_regmap_init_mmio_clk(&pdev->dev, "dfsdm",
+						  dfsdm->base,
+						  &stm32h7_dfsdm_regmap_cfg);
+	if (IS_ERR(dfsdm->regmap)) {
+		ret = PTR_ERR(dfsdm->regmap);
+		dev_err(&pdev->dev, "%s: Failed to allocate regmap: %d\n",
+			__func__, ret);
+		return ret;
+	}
+
+	platform_set_drvdata(pdev, dfsdm);
+
+	return devm_of_platform_populate(&pdev->dev);
+}
+
+static struct platform_driver stm32_dfsdm_driver = {
+	.probe = stm32_dfsdm_probe,
+	.driver = {
+		.name = "stm32-dfsdm",
+		.of_match_table = stm32_dfsdm_of_match,
+	},
+};
+
+module_platform_driver(stm32_dfsdm_driver);
+
+MODULE_AUTHOR("Arnaud Pouliquen <arnaud.pouliquen@st.com>");
+MODULE_DESCRIPTION("STMicroelectronics STM32 dfsdm driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iio/adc/stm32-dfsdm.h b/drivers/iio/adc/stm32-dfsdm.h
new file mode 100644
index 0000000..8708394
--- /dev/null
+++ b/drivers/iio/adc/stm32-dfsdm.h
@@ -0,0 +1,310 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * This file is part of STM32 DFSDM driver
+ *
+ * Copyright (C) 2017, STMicroelectronics - All Rights Reserved
+ * Author(s): Arnaud Pouliquen <arnaud.pouliquen@st.com>.
+ */
+
+#ifndef MDF_STM32_DFSDM__H
+#define MDF_STM32_DFSDM__H
+
+#include <linux/bitfield.h>
+
+/*
+ * STM32 DFSDM - global register map
+ * ________________________________________________________
+ * | Offset |                 Registers block             |
+ * --------------------------------------------------------
+ * | 0x000  |      CHANNEL 0 + COMMON CHANNEL FIELDS      |
+ * --------------------------------------------------------
+ * | 0x020  |                CHANNEL 1                    |
+ * --------------------------------------------------------
+ * | ...    |                .....                        |
+ * --------------------------------------------------------
+ * | 0x0E0  |                CHANNEL 7                    |
+ * --------------------------------------------------------
+ * | 0x100  |      FILTER  0 + COMMON  FILTER FIELDs      |
+ * --------------------------------------------------------
+ * | 0x200  |                FILTER  1                    |
+ * --------------------------------------------------------
+ * | 0x300  |                FILTER  2                    |
+ * --------------------------------------------------------
+ * | 0x400  |                FILTER  3                    |
+ * --------------------------------------------------------
+ */
+
+/*
+ * Channels register definitions
+ */
+#define DFSDM_CHCFGR1(y)  ((y) * 0x20 + 0x00)
+#define DFSDM_CHCFGR2(y)  ((y) * 0x20 + 0x04)
+#define DFSDM_AWSCDR(y)   ((y) * 0x20 + 0x08)
+#define DFSDM_CHWDATR(y)  ((y) * 0x20 + 0x0C)
+#define DFSDM_CHDATINR(y) ((y) * 0x20 + 0x10)
+
+/* CHCFGR1: Channel configuration register 1 */
+#define DFSDM_CHCFGR1_SITP_MASK     GENMASK(1, 0)
+#define DFSDM_CHCFGR1_SITP(v)       FIELD_PREP(DFSDM_CHCFGR1_SITP_MASK, v)
+#define DFSDM_CHCFGR1_SPICKSEL_MASK GENMASK(3, 2)
+#define DFSDM_CHCFGR1_SPICKSEL(v)   FIELD_PREP(DFSDM_CHCFGR1_SPICKSEL_MASK, v)
+#define DFSDM_CHCFGR1_SCDEN_MASK    BIT(5)
+#define DFSDM_CHCFGR1_SCDEN(v)      FIELD_PREP(DFSDM_CHCFGR1_SCDEN_MASK, v)
+#define DFSDM_CHCFGR1_CKABEN_MASK   BIT(6)
+#define DFSDM_CHCFGR1_CKABEN(v)     FIELD_PREP(DFSDM_CHCFGR1_CKABEN_MASK, v)
+#define DFSDM_CHCFGR1_CHEN_MASK     BIT(7)
+#define DFSDM_CHCFGR1_CHEN(v)       FIELD_PREP(DFSDM_CHCFGR1_CHEN_MASK, v)
+#define DFSDM_CHCFGR1_CHINSEL_MASK  BIT(8)
+#define DFSDM_CHCFGR1_CHINSEL(v)    FIELD_PREP(DFSDM_CHCFGR1_CHINSEL_MASK, v)
+#define DFSDM_CHCFGR1_DATMPX_MASK   GENMASK(13, 12)
+#define DFSDM_CHCFGR1_DATMPX(v)     FIELD_PREP(DFSDM_CHCFGR1_DATMPX_MASK, v)
+#define DFSDM_CHCFGR1_DATPACK_MASK  GENMASK(15, 14)
+#define DFSDM_CHCFGR1_DATPACK(v)    FIELD_PREP(DFSDM_CHCFGR1_DATPACK_MASK, v)
+#define DFSDM_CHCFGR1_CKOUTDIV_MASK GENMASK(23, 16)
+#define DFSDM_CHCFGR1_CKOUTDIV(v)   FIELD_PREP(DFSDM_CHCFGR1_CKOUTDIV_MASK, v)
+#define DFSDM_CHCFGR1_CKOUTSRC_MASK BIT(30)
+#define DFSDM_CHCFGR1_CKOUTSRC(v)   FIELD_PREP(DFSDM_CHCFGR1_CKOUTSRC_MASK, v)
+#define DFSDM_CHCFGR1_DFSDMEN_MASK  BIT(31)
+#define DFSDM_CHCFGR1_DFSDMEN(v)    FIELD_PREP(DFSDM_CHCFGR1_DFSDMEN_MASK, v)
+
+/* CHCFGR2: Channel configuration register 2 */
+#define DFSDM_CHCFGR2_DTRBS_MASK    GENMASK(7, 3)
+#define DFSDM_CHCFGR2_DTRBS(v)      FIELD_PREP(DFSDM_CHCFGR2_DTRBS_MASK, v)
+#define DFSDM_CHCFGR2_OFFSET_MASK   GENMASK(31, 8)
+#define DFSDM_CHCFGR2_OFFSET(v)     FIELD_PREP(DFSDM_CHCFGR2_OFFSET_MASK, v)
+
+/* AWSCDR: Channel analog watchdog and short circuit detector */
+#define DFSDM_AWSCDR_SCDT_MASK    GENMASK(7, 0)
+#define DFSDM_AWSCDR_SCDT(v)      FIELD_PREP(DFSDM_AWSCDR_SCDT_MASK, v)
+#define DFSDM_AWSCDR_BKSCD_MASK   GENMASK(15, 12)
+#define DFSDM_AWSCDR_BKSCD(v)	  FIELD_PREP(DFSDM_AWSCDR_BKSCD_MASK, v)
+#define DFSDM_AWSCDR_AWFOSR_MASK  GENMASK(20, 16)
+#define DFSDM_AWSCDR_AWFOSR(v)    FIELD_PREP(DFSDM_AWSCDR_AWFOSR_MASK, v)
+#define DFSDM_AWSCDR_AWFORD_MASK  GENMASK(23, 22)
+#define DFSDM_AWSCDR_AWFORD(v)    FIELD_PREP(DFSDM_AWSCDR_AWFORD_MASK, v)
+
+/*
+ * Filters register definitions
+ */
+#define DFSDM_FILTER_BASE_ADR		0x100
+#define DFSDM_FILTER_REG_MASK		0x7F
+#define DFSDM_FILTER_X_BASE_ADR(x)	((x) * 0x80 + DFSDM_FILTER_BASE_ADR)
+
+#define DFSDM_CR1(x)     (DFSDM_FILTER_X_BASE_ADR(x)  + 0x00)
+#define DFSDM_CR2(x)     (DFSDM_FILTER_X_BASE_ADR(x)  + 0x04)
+#define DFSDM_ISR(x)     (DFSDM_FILTER_X_BASE_ADR(x)  + 0x08)
+#define DFSDM_ICR(x)     (DFSDM_FILTER_X_BASE_ADR(x)  + 0x0C)
+#define DFSDM_JCHGR(x)   (DFSDM_FILTER_X_BASE_ADR(x)  + 0x10)
+#define DFSDM_FCR(x)     (DFSDM_FILTER_X_BASE_ADR(x)  + 0x14)
+#define DFSDM_JDATAR(x)  (DFSDM_FILTER_X_BASE_ADR(x)  + 0x18)
+#define DFSDM_RDATAR(x)  (DFSDM_FILTER_X_BASE_ADR(x)  + 0x1C)
+#define DFSDM_AWHTR(x)   (DFSDM_FILTER_X_BASE_ADR(x)  + 0x20)
+#define DFSDM_AWLTR(x)   (DFSDM_FILTER_X_BASE_ADR(x)  + 0x24)
+#define DFSDM_AWSR(x)    (DFSDM_FILTER_X_BASE_ADR(x)  + 0x28)
+#define DFSDM_AWCFR(x)   (DFSDM_FILTER_X_BASE_ADR(x)  + 0x2C)
+#define DFSDM_EXMAX(x)   (DFSDM_FILTER_X_BASE_ADR(x)  + 0x30)
+#define DFSDM_EXMIN(x)   (DFSDM_FILTER_X_BASE_ADR(x)  + 0x34)
+#define DFSDM_CNVTIMR(x) (DFSDM_FILTER_X_BASE_ADR(x)  + 0x38)
+
+/* CR1 Control register 1 */
+#define DFSDM_CR1_DFEN_MASK	BIT(0)
+#define DFSDM_CR1_DFEN(v)	FIELD_PREP(DFSDM_CR1_DFEN_MASK, v)
+#define DFSDM_CR1_JSWSTART_MASK	BIT(1)
+#define DFSDM_CR1_JSWSTART(v)	FIELD_PREP(DFSDM_CR1_JSWSTART_MASK, v)
+#define DFSDM_CR1_JSYNC_MASK	BIT(3)
+#define DFSDM_CR1_JSYNC(v)	FIELD_PREP(DFSDM_CR1_JSYNC_MASK, v)
+#define DFSDM_CR1_JSCAN_MASK	BIT(4)
+#define DFSDM_CR1_JSCAN(v)	FIELD_PREP(DFSDM_CR1_JSCAN_MASK, v)
+#define DFSDM_CR1_JDMAEN_MASK	BIT(5)
+#define DFSDM_CR1_JDMAEN(v)	FIELD_PREP(DFSDM_CR1_JDMAEN_MASK, v)
+#define DFSDM_CR1_JEXTSEL_MASK	GENMASK(12, 8)
+#define DFSDM_CR1_JEXTSEL(v)	FIELD_PREP(DFSDM_CR1_JEXTSEL_MASK, v)
+#define DFSDM_CR1_JEXTEN_MASK	GENMASK(14, 13)
+#define DFSDM_CR1_JEXTEN(v)	FIELD_PREP(DFSDM_CR1_JEXTEN_MASK, v)
+#define DFSDM_CR1_RSWSTART_MASK	BIT(17)
+#define DFSDM_CR1_RSWSTART(v)	FIELD_PREP(DFSDM_CR1_RSWSTART_MASK, v)
+#define DFSDM_CR1_RCONT_MASK	BIT(18)
+#define DFSDM_CR1_RCONT(v)	FIELD_PREP(DFSDM_CR1_RCONT_MASK, v)
+#define DFSDM_CR1_RSYNC_MASK	BIT(19)
+#define DFSDM_CR1_RSYNC(v)	FIELD_PREP(DFSDM_CR1_RSYNC_MASK, v)
+#define DFSDM_CR1_RDMAEN_MASK	BIT(21)
+#define DFSDM_CR1_RDMAEN(v)	FIELD_PREP(DFSDM_CR1_RDMAEN_MASK, v)
+#define DFSDM_CR1_RCH_MASK	GENMASK(26, 24)
+#define DFSDM_CR1_RCH(v)	FIELD_PREP(DFSDM_CR1_RCH_MASK, v)
+#define DFSDM_CR1_FAST_MASK	BIT(29)
+#define DFSDM_CR1_FAST(v)	FIELD_PREP(DFSDM_CR1_FAST_MASK, v)
+#define DFSDM_CR1_AWFSEL_MASK	BIT(30)
+#define DFSDM_CR1_AWFSEL(v)	FIELD_PREP(DFSDM_CR1_AWFSEL_MASK, v)
+
+/* CR2: Control register 2 */
+#define DFSDM_CR2_IE_MASK	GENMASK(6, 0)
+#define DFSDM_CR2_IE(v)		FIELD_PREP(DFSDM_CR2_IE_MASK, v)
+#define DFSDM_CR2_JEOCIE_MASK	BIT(0)
+#define DFSDM_CR2_JEOCIE(v)	FIELD_PREP(DFSDM_CR2_JEOCIE_MASK, v)
+#define DFSDM_CR2_REOCIE_MASK	BIT(1)
+#define DFSDM_CR2_REOCIE(v)	FIELD_PREP(DFSDM_CR2_REOCIE_MASK, v)
+#define DFSDM_CR2_JOVRIE_MASK	BIT(2)
+#define DFSDM_CR2_JOVRIE(v)	FIELD_PREP(DFSDM_CR2_JOVRIE_MASK, v)
+#define DFSDM_CR2_ROVRIE_MASK	BIT(3)
+#define DFSDM_CR2_ROVRIE(v)	FIELD_PREP(DFSDM_CR2_ROVRIE_MASK, v)
+#define DFSDM_CR2_AWDIE_MASK	BIT(4)
+#define DFSDM_CR2_AWDIE(v)	FIELD_PREP(DFSDM_CR2_AWDIE_MASK, v)
+#define DFSDM_CR2_SCDIE_MASK	BIT(5)
+#define DFSDM_CR2_SCDIE(v)	FIELD_PREP(DFSDM_CR2_SCDIE_MASK, v)
+#define DFSDM_CR2_CKABIE_MASK	BIT(6)
+#define DFSDM_CR2_CKABIE(v)	FIELD_PREP(DFSDM_CR2_CKABIE_MASK, v)
+#define DFSDM_CR2_EXCH_MASK	GENMASK(15, 8)
+#define DFSDM_CR2_EXCH(v)	FIELD_PREP(DFSDM_CR2_EXCH_MASK, v)
+#define DFSDM_CR2_AWDCH_MASK	GENMASK(23, 16)
+#define DFSDM_CR2_AWDCH(v)	FIELD_PREP(DFSDM_CR2_AWDCH_MASK, v)
+
+/* ISR: Interrupt status register */
+#define DFSDM_ISR_JEOCF_MASK	BIT(0)
+#define DFSDM_ISR_JEOCF(v)	FIELD_PREP(DFSDM_ISR_JEOCF_MASK, v)
+#define DFSDM_ISR_REOCF_MASK	BIT(1)
+#define DFSDM_ISR_REOCF(v)	FIELD_PREP(DFSDM_ISR_REOCF_MASK, v)
+#define DFSDM_ISR_JOVRF_MASK	BIT(2)
+#define DFSDM_ISR_JOVRF(v)	FIELD_PREP(DFSDM_ISR_JOVRF_MASK, v)
+#define DFSDM_ISR_ROVRF_MASK	BIT(3)
+#define DFSDM_ISR_ROVRF(v)	FIELD_PREP(DFSDM_ISR_ROVRF_MASK, v)
+#define DFSDM_ISR_AWDF_MASK	BIT(4)
+#define DFSDM_ISR_AWDF(v)	FIELD_PREP(DFSDM_ISR_AWDF_MASK, v)
+#define DFSDM_ISR_JCIP_MASK	BIT(13)
+#define DFSDM_ISR_JCIP(v)	FIELD_PREP(DFSDM_ISR_JCIP_MASK, v)
+#define DFSDM_ISR_RCIP_MASK	BIT(14)
+#define DFSDM_ISR_RCIP(v)	FIELD_PREP(DFSDM_ISR_RCIP, v)
+#define DFSDM_ISR_CKABF_MASK	GENMASK(23, 16)
+#define DFSDM_ISR_CKABF(v)	FIELD_PREP(DFSDM_ISR_CKABF_MASK, v)
+#define DFSDM_ISR_SCDF_MASK	GENMASK(31, 24)
+#define DFSDM_ISR_SCDF(v)	FIELD_PREP(DFSDM_ISR_SCDF_MASK, v)
+
+/* ICR: Interrupt flag clear register */
+#define DFSDM_ICR_CLRJOVRF_MASK	      BIT(2)
+#define DFSDM_ICR_CLRJOVRF(v)	      FIELD_PREP(DFSDM_ICR_CLRJOVRF_MASK, v)
+#define DFSDM_ICR_CLRROVRF_MASK	      BIT(3)
+#define DFSDM_ICR_CLRROVRF(v)	      FIELD_PREP(DFSDM_ICR_CLRROVRF_MASK, v)
+#define DFSDM_ICR_CLRCKABF_MASK	      GENMASK(23, 16)
+#define DFSDM_ICR_CLRCKABF(v)	      FIELD_PREP(DFSDM_ICR_CLRCKABF_MASK, v)
+#define DFSDM_ICR_CLRCKABF_CH_MASK(y) BIT(16 + (y))
+#define DFSDM_ICR_CLRCKABF_CH(v, y)   \
+			   (((v) << (16 + (y))) & DFSDM_ICR_CLRCKABF_CH_MASK(y))
+#define DFSDM_ICR_CLRSCDF_MASK	      GENMASK(31, 24)
+#define DFSDM_ICR_CLRSCDF(v)	      FIELD_PREP(DFSDM_ICR_CLRSCDF_MASK, v)
+#define DFSDM_ICR_CLRSCDF_CH_MASK(y)  BIT(24 + (y))
+#define DFSDM_ICR_CLRSCDF_CH(v, y)    \
+			       (((v) << (24 + (y))) & DFSDM_ICR_CLRSCDF_MASK(y))
+
+/* FCR: Filter control register */
+#define DFSDM_FCR_IOSR_MASK	GENMASK(7, 0)
+#define DFSDM_FCR_IOSR(v)	FIELD_PREP(DFSDM_FCR_IOSR_MASK, v)
+#define DFSDM_FCR_FOSR_MASK	GENMASK(25, 16)
+#define DFSDM_FCR_FOSR(v)	FIELD_PREP(DFSDM_FCR_FOSR_MASK, v)
+#define DFSDM_FCR_FORD_MASK	GENMASK(31, 29)
+#define DFSDM_FCR_FORD(v)	FIELD_PREP(DFSDM_FCR_FORD_MASK, v)
+
+/* RDATAR: Filter data register for regular channel */
+#define DFSDM_DATAR_CH_MASK	GENMASK(2, 0)
+#define DFSDM_DATAR_DATA_OFFSET 8
+#define DFSDM_DATAR_DATA_MASK	GENMASK(31, DFSDM_DATAR_DATA_OFFSET)
+
+/* AWLTR: Filter analog watchdog low threshold register */
+#define DFSDM_AWLTR_BKAWL_MASK	GENMASK(3, 0)
+#define DFSDM_AWLTR_BKAWL(v)	FIELD_PREP(DFSDM_AWLTR_BKAWL_MASK, v)
+#define DFSDM_AWLTR_AWLT_MASK	GENMASK(31, 8)
+#define DFSDM_AWLTR_AWLT(v)	FIELD_PREP(DFSDM_AWLTR_AWLT_MASK, v)
+
+/* AWHTR: Filter analog watchdog low threshold register */
+#define DFSDM_AWHTR_BKAWH_MASK	GENMASK(3, 0)
+#define DFSDM_AWHTR_BKAWH(v)	FIELD_PREP(DFSDM_AWHTR_BKAWH_MASK, v)
+#define DFSDM_AWHTR_AWHT_MASK	GENMASK(31, 8)
+#define DFSDM_AWHTR_AWHT(v)	FIELD_PREP(DFSDM_AWHTR_AWHT_MASK, v)
+
+/* AWSR: Filter watchdog status register */
+#define DFSDM_AWSR_AWLTF_MASK	GENMASK(7, 0)
+#define DFSDM_AWSR_AWLTF(v)	FIELD_PREP(DFSDM_AWSR_AWLTF_MASK, v)
+#define DFSDM_AWSR_AWHTF_MASK	GENMASK(15, 8)
+#define DFSDM_AWSR_AWHTF(v)	FIELD_PREP(DFSDM_AWSR_AWHTF_MASK, v)
+
+/* AWCFR: Filter watchdog status register */
+#define DFSDM_AWCFR_AWLTF_MASK	GENMASK(7, 0)
+#define DFSDM_AWCFR_AWLTF(v)	FIELD_PREP(DFSDM_AWCFR_AWLTF_MASK, v)
+#define DFSDM_AWCFR_AWHTF_MASK	GENMASK(15, 8)
+#define DFSDM_AWCFR_AWHTF(v)	FIELD_PREP(DFSDM_AWCFR_AWHTF_MASK, v)
+
+/* DFSDM filter order  */
+enum stm32_dfsdm_sinc_order {
+	DFSDM_FASTSINC_ORDER, /* FastSinc filter type */
+	DFSDM_SINC1_ORDER,    /* Sinc 1 filter type */
+	DFSDM_SINC2_ORDER,    /* Sinc 2 filter type */
+	DFSDM_SINC3_ORDER,    /* Sinc 3 filter type */
+	DFSDM_SINC4_ORDER,    /* Sinc 4 filter type (N.A. for watchdog) */
+	DFSDM_SINC5_ORDER,    /* Sinc 5 filter type (N.A. for watchdog) */
+	DFSDM_NB_SINC_ORDER,
+};
+
+/**
+ * struct stm32_dfsdm_filter - structure relative to stm32 FDSDM filter
+ * @iosr: integrator oversampling
+ * @fosr: filter oversampling
+ * @ford: filter order
+ * @res: output sample resolution
+ * @sync_mode: filter synchronized with filter 0
+ * @fast: filter fast mode
+ */
+struct stm32_dfsdm_filter {
+	unsigned int iosr;
+	unsigned int fosr;
+	enum stm32_dfsdm_sinc_order ford;
+	u64 res;
+	unsigned int sync_mode;
+	unsigned int fast;
+};
+
+/**
+ * struct stm32_dfsdm_channel - structure relative to stm32 FDSDM channel
+ * @id: id of the channel
+ * @type: interface type linked to stm32_dfsdm_chan_type
+ * @src: interface type linked to stm32_dfsdm_chan_src
+ * @alt_si: alternative serial input interface
+ */
+struct stm32_dfsdm_channel {
+	unsigned int id;
+	unsigned int type;
+	unsigned int src;
+	unsigned int alt_si;
+};
+
+/**
+ * struct stm32_dfsdm - stm32 FDSDM driver common data (for all instances)
+ * @base:	control registers base cpu addr
+ * @phys_base:	DFSDM IP register physical address
+ * @regmap:	regmap for register read/write
+ * @fl_list:	filter resources list
+ * @num_fls:	number of filter resources available
+ * @ch_list:	channel resources list
+ * @num_chs:	number of channel resources available
+ * @spi_master_freq: SPI clock out frequency
+ */
+struct stm32_dfsdm {
+	void __iomem	*base;
+	phys_addr_t	phys_base;
+	struct regmap *regmap;
+	struct stm32_dfsdm_filter *fl_list;
+	unsigned int num_fls;
+	struct stm32_dfsdm_channel *ch_list;
+	unsigned int num_chs;
+	unsigned int spi_master_freq;
+};
+
+/* DFSDM channel serial spi clock source */
+enum stm32_dfsdm_spi_clk_src {
+	DFSDM_CHANNEL_SPI_CLOCK_EXTERNAL,
+	DFSDM_CHANNEL_SPI_CLOCK_INTERNAL,
+	DFSDM_CHANNEL_SPI_CLOCK_INTERNAL_DIV2_FALLING,
+	DFSDM_CHANNEL_SPI_CLOCK_INTERNAL_DIV2_RISING
+};
+
+int stm32_dfsdm_start_dfsdm(struct stm32_dfsdm *dfsdm);
+int stm32_dfsdm_stop_dfsdm(struct stm32_dfsdm *dfsdm);
+
+#endif
diff --git a/drivers/iio/buffer/Kconfig b/drivers/iio/buffer/Kconfig
index 4ffd3db..338774c 100644
--- a/drivers/iio/buffer/Kconfig
+++ b/drivers/iio/buffer/Kconfig
@@ -29,6 +29,16 @@
 
 	  Should be selected by drivers that want to use this functionality.
 
+config IIO_BUFFER_HW_CONSUMER
+	tristate "Industrial I/O HW buffering"
+	help
+	  Provides a way to bonding when an IIO device has a direct connection
+	  to another device in hardware. In this case buffers for data transfers
+	  are handled by hardware.
+
+	  Should be selected by drivers that want to use the generic Hw consumer
+	  interface.
+
 config IIO_KFIFO_BUF
 	tristate "Industrial I/O buffering based on kfifo"
 	help
diff --git a/drivers/iio/buffer/Makefile b/drivers/iio/buffer/Makefile
index 95f9f41..1403eb2 100644
--- a/drivers/iio/buffer/Makefile
+++ b/drivers/iio/buffer/Makefile
@@ -7,5 +7,6 @@
 obj-$(CONFIG_IIO_BUFFER_CB) += industrialio-buffer-cb.o
 obj-$(CONFIG_IIO_BUFFER_DMA) += industrialio-buffer-dma.o
 obj-$(CONFIG_IIO_BUFFER_DMAENGINE) += industrialio-buffer-dmaengine.o
+obj-$(CONFIG_IIO_BUFFER_HW_CONSUMER) += industrialio-hw-consumer.o
 obj-$(CONFIG_IIO_TRIGGERED_BUFFER) += industrialio-triggered-buffer.o
 obj-$(CONFIG_IIO_KFIFO_BUF) += kfifo_buf.o
diff --git a/drivers/iio/buffer/industrialio-buffer-cb.c b/drivers/iio/buffer/industrialio-buffer-cb.c
index 4847534..ea63c83 100644
--- a/drivers/iio/buffer/industrialio-buffer-cb.c
+++ b/drivers/iio/buffer/industrialio-buffer-cb.c
@@ -104,6 +104,17 @@ struct iio_cb_buffer *iio_channel_get_all_cb(struct device *dev,
 }
 EXPORT_SYMBOL_GPL(iio_channel_get_all_cb);
 
+int iio_channel_cb_set_buffer_watermark(struct iio_cb_buffer *cb_buff,
+					size_t watermark)
+{
+	if (!watermark)
+		return -EINVAL;
+	cb_buff->buffer.watermark = watermark;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(iio_channel_cb_set_buffer_watermark);
+
 int iio_channel_start_all_cb(struct iio_cb_buffer *cb_buff)
 {
 	return iio_update_buffers(cb_buff->indio_dev, &cb_buff->buffer,
diff --git a/drivers/iio/buffer/industrialio-hw-consumer.c b/drivers/iio/buffer/industrialio-hw-consumer.c
new file mode 100644
index 0000000..9516569
--- /dev/null
+++ b/drivers/iio/buffer/industrialio-hw-consumer.c
@@ -0,0 +1,247 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright 2017 Analog Devices Inc.
+ *  Author: Lars-Peter Clausen <lars@metafoo.de>
+ */
+
+#include <linux/err.h>
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/module.h>
+
+#include <linux/iio/iio.h>
+#include <linux/iio/consumer.h>
+#include <linux/iio/hw-consumer.h>
+#include <linux/iio/buffer_impl.h>
+
+/**
+ * struct iio_hw_consumer - IIO hw consumer block
+ * @buffers: hardware buffers list head.
+ * @channels: IIO provider channels.
+ */
+struct iio_hw_consumer {
+	struct list_head buffers;
+	struct iio_channel *channels;
+};
+
+struct hw_consumer_buffer {
+	struct list_head head;
+	struct iio_dev *indio_dev;
+	struct iio_buffer buffer;
+	long scan_mask[];
+};
+
+static struct hw_consumer_buffer *iio_buffer_to_hw_consumer_buffer(
+	struct iio_buffer *buffer)
+{
+	return container_of(buffer, struct hw_consumer_buffer, buffer);
+}
+
+static void iio_hw_buf_release(struct iio_buffer *buffer)
+{
+	struct hw_consumer_buffer *hw_buf =
+		iio_buffer_to_hw_consumer_buffer(buffer);
+	kfree(hw_buf);
+}
+
+static const struct iio_buffer_access_funcs iio_hw_buf_access = {
+	.release = &iio_hw_buf_release,
+	.modes = INDIO_BUFFER_HARDWARE,
+};
+
+static struct hw_consumer_buffer *iio_hw_consumer_get_buffer(
+	struct iio_hw_consumer *hwc, struct iio_dev *indio_dev)
+{
+	size_t mask_size = BITS_TO_LONGS(indio_dev->masklength) * sizeof(long);
+	struct hw_consumer_buffer *buf;
+
+	list_for_each_entry(buf, &hwc->buffers, head) {
+		if (buf->indio_dev == indio_dev)
+			return buf;
+	}
+
+	buf = kzalloc(sizeof(*buf) + mask_size, GFP_KERNEL);
+	if (!buf)
+		return NULL;
+
+	buf->buffer.access = &iio_hw_buf_access;
+	buf->indio_dev = indio_dev;
+	buf->buffer.scan_mask = buf->scan_mask;
+
+	iio_buffer_init(&buf->buffer);
+	list_add_tail(&buf->head, &hwc->buffers);
+
+	return buf;
+}
+
+/**
+ * iio_hw_consumer_alloc() - Allocate IIO hardware consumer
+ * @dev: Pointer to consumer device.
+ *
+ * Returns a valid iio_hw_consumer on success or a ERR_PTR() on failure.
+ */
+struct iio_hw_consumer *iio_hw_consumer_alloc(struct device *dev)
+{
+	struct hw_consumer_buffer *buf;
+	struct iio_hw_consumer *hwc;
+	struct iio_channel *chan;
+	int ret;
+
+	hwc = kzalloc(sizeof(*hwc), GFP_KERNEL);
+	if (!hwc)
+		return ERR_PTR(-ENOMEM);
+
+	INIT_LIST_HEAD(&hwc->buffers);
+
+	hwc->channels = iio_channel_get_all(dev);
+	if (IS_ERR(hwc->channels)) {
+		ret = PTR_ERR(hwc->channels);
+		goto err_free_hwc;
+	}
+
+	chan = &hwc->channels[0];
+	while (chan->indio_dev) {
+		buf = iio_hw_consumer_get_buffer(hwc, chan->indio_dev);
+		if (!buf) {
+			ret = -ENOMEM;
+			goto err_put_buffers;
+		}
+		set_bit(chan->channel->scan_index, buf->buffer.scan_mask);
+		chan++;
+	}
+
+	return hwc;
+
+err_put_buffers:
+	list_for_each_entry(buf, &hwc->buffers, head)
+		iio_buffer_put(&buf->buffer);
+	iio_channel_release_all(hwc->channels);
+err_free_hwc:
+	kfree(hwc);
+	return ERR_PTR(ret);
+}
+EXPORT_SYMBOL_GPL(iio_hw_consumer_alloc);
+
+/**
+ * iio_hw_consumer_free() - Free IIO hardware consumer
+ * @hwc: hw consumer to free.
+ */
+void iio_hw_consumer_free(struct iio_hw_consumer *hwc)
+{
+	struct hw_consumer_buffer *buf, *n;
+
+	iio_channel_release_all(hwc->channels);
+	list_for_each_entry_safe(buf, n, &hwc->buffers, head)
+		iio_buffer_put(&buf->buffer);
+	kfree(hwc);
+}
+EXPORT_SYMBOL_GPL(iio_hw_consumer_free);
+
+static void devm_iio_hw_consumer_release(struct device *dev, void *res)
+{
+	iio_hw_consumer_free(*(struct iio_hw_consumer **)res);
+}
+
+static int devm_iio_hw_consumer_match(struct device *dev, void *res, void *data)
+{
+	struct iio_hw_consumer **r = res;
+
+	if (!r || !*r) {
+		WARN_ON(!r || !*r);
+		return 0;
+	}
+	return *r == data;
+}
+
+/**
+ * devm_iio_hw_consumer_alloc - Resource-managed iio_hw_consumer_alloc()
+ * @dev: Pointer to consumer device.
+ *
+ * Managed iio_hw_consumer_alloc. iio_hw_consumer allocated with this function
+ * is automatically freed on driver detach.
+ *
+ * If an iio_hw_consumer allocated with this function needs to be freed
+ * separately, devm_iio_hw_consumer_free() must be used.
+ *
+ * returns pointer to allocated iio_hw_consumer on success, NULL on failure.
+ */
+struct iio_hw_consumer *devm_iio_hw_consumer_alloc(struct device *dev)
+{
+	struct iio_hw_consumer **ptr, *iio_hwc;
+
+	ptr = devres_alloc(devm_iio_hw_consumer_release, sizeof(*ptr),
+			   GFP_KERNEL);
+	if (!ptr)
+		return NULL;
+
+	iio_hwc = iio_hw_consumer_alloc(dev);
+	if (IS_ERR(iio_hwc)) {
+		devres_free(ptr);
+	} else {
+		*ptr = iio_hwc;
+		devres_add(dev, ptr);
+	}
+
+	return iio_hwc;
+}
+EXPORT_SYMBOL_GPL(devm_iio_hw_consumer_alloc);
+
+/**
+ * devm_iio_hw_consumer_free - Resource-managed iio_hw_consumer_free()
+ * @dev: Pointer to consumer device.
+ * @hwc: iio_hw_consumer to free.
+ *
+ * Free iio_hw_consumer allocated with devm_iio_hw_consumer_alloc().
+ */
+void devm_iio_hw_consumer_free(struct device *dev, struct iio_hw_consumer *hwc)
+{
+	int rc;
+
+	rc = devres_release(dev, devm_iio_hw_consumer_release,
+			    devm_iio_hw_consumer_match, hwc);
+	WARN_ON(rc);
+}
+EXPORT_SYMBOL_GPL(devm_iio_hw_consumer_free);
+
+/**
+ * iio_hw_consumer_enable() - Enable IIO hardware consumer
+ * @hwc: iio_hw_consumer to enable.
+ *
+ * Returns 0 on success.
+ */
+int iio_hw_consumer_enable(struct iio_hw_consumer *hwc)
+{
+	struct hw_consumer_buffer *buf;
+	int ret;
+
+	list_for_each_entry(buf, &hwc->buffers, head) {
+		ret = iio_update_buffers(buf->indio_dev, &buf->buffer, NULL);
+		if (ret)
+			goto err_disable_buffers;
+	}
+
+	return 0;
+
+err_disable_buffers:
+	list_for_each_entry_continue_reverse(buf, &hwc->buffers, head)
+		iio_update_buffers(buf->indio_dev, NULL, &buf->buffer);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(iio_hw_consumer_enable);
+
+/**
+ * iio_hw_consumer_disable() - Disable IIO hardware consumer
+ * @hwc: iio_hw_consumer to disable.
+ */
+void iio_hw_consumer_disable(struct iio_hw_consumer *hwc)
+{
+	struct hw_consumer_buffer *buf;
+
+	list_for_each_entry(buf, &hwc->buffers, head)
+		iio_update_buffers(buf->indio_dev, NULL, &buf->buffer);
+}
+EXPORT_SYMBOL_GPL(iio_hw_consumer_disable);
+
+MODULE_AUTHOR("Lars-Peter Clausen <lars@metafoo.de>");
+MODULE_DESCRIPTION("Hardware consumer buffer the IIO framework");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/iio/inkern.c b/drivers/iio/inkern.c
index 069defc..ec98790 100644
--- a/drivers/iio/inkern.c
+++ b/drivers/iio/inkern.c
@@ -664,9 +664,8 @@ int iio_convert_raw_to_processed(struct iio_channel *chan, int raw,
 }
 EXPORT_SYMBOL_GPL(iio_convert_raw_to_processed);
 
-static int iio_read_channel_attribute(struct iio_channel *chan,
-				      int *val, int *val2,
-				      enum iio_chan_info_enum attribute)
+int iio_read_channel_attribute(struct iio_channel *chan, int *val, int *val2,
+			       enum iio_chan_info_enum attribute)
 {
 	int ret;
 
@@ -682,6 +681,7 @@ static int iio_read_channel_attribute(struct iio_channel *chan,
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(iio_read_channel_attribute);
 
 int iio_read_channel_offset(struct iio_channel *chan, int *val, int *val2)
 {
@@ -850,7 +850,8 @@ static int iio_channel_write(struct iio_channel *chan, int val, int val2,
 						chan->channel, val, val2, info);
 }
 
-int iio_write_channel_raw(struct iio_channel *chan, int val)
+int iio_write_channel_attribute(struct iio_channel *chan, int val, int val2,
+				enum iio_chan_info_enum attribute)
 {
 	int ret;
 
@@ -860,12 +861,18 @@ int iio_write_channel_raw(struct iio_channel *chan, int val)
 		goto err_unlock;
 	}
 
-	ret = iio_channel_write(chan, val, 0, IIO_CHAN_INFO_RAW);
+	ret = iio_channel_write(chan, val, val2, attribute);
 err_unlock:
 	mutex_unlock(&chan->indio_dev->info_exist_lock);
 
 	return ret;
 }
+EXPORT_SYMBOL_GPL(iio_write_channel_attribute);
+
+int iio_write_channel_raw(struct iio_channel *chan, int val)
+{
+	return iio_write_channel_attribute(chan, val, 0, IIO_CHAN_INFO_RAW);
+}
 EXPORT_SYMBOL_GPL(iio_write_channel_raw);
 
 unsigned int iio_get_channel_ext_info_count(struct iio_channel *chan)
diff --git a/drivers/infiniband/Kconfig b/drivers/infiniband/Kconfig
index cbf1865..5cd7004 100644
--- a/drivers/infiniband/Kconfig
+++ b/drivers/infiniband/Kconfig
@@ -4,6 +4,7 @@
 	depends on NET
 	depends on INET
 	depends on m || IPV6 != m
+	depends on !ALPHA
 	select IRQ_POLL
 	---help---
 	  Core support for InfiniBand (IB).  Make sure to also select
diff --git a/drivers/infiniband/core/core_priv.h b/drivers/infiniband/core/core_priv.h
index a1d687a..66f0268 100644
--- a/drivers/infiniband/core/core_priv.h
+++ b/drivers/infiniband/core/core_priv.h
@@ -314,7 +314,7 @@ static inline int ib_mad_enforce_security(struct ib_mad_agent_private *map,
 }
 #endif
 
-struct ib_device *__ib_device_get_by_index(u32 ifindex);
+struct ib_device *ib_device_get_by_index(u32 ifindex);
 /* RDMA device netlink */
 void nldev_init(void);
 void nldev_exit(void);
diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
index 30914f3..4655206 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -134,7 +134,7 @@ static int ib_device_check_mandatory(struct ib_device *device)
 	return 0;
 }
 
-struct ib_device *__ib_device_get_by_index(u32 index)
+static struct ib_device *__ib_device_get_by_index(u32 index)
 {
 	struct ib_device *device;
 
@@ -145,6 +145,22 @@ struct ib_device *__ib_device_get_by_index(u32 index)
 	return NULL;
 }
 
+/*
+ * Caller is responsible to return refrerence count by calling put_device()
+ */
+struct ib_device *ib_device_get_by_index(u32 index)
+{
+	struct ib_device *device;
+
+	down_read(&lists_rwsem);
+	device = __ib_device_get_by_index(index);
+	if (device)
+		get_device(&device->dev);
+
+	up_read(&lists_rwsem);
+	return device;
+}
+
 static struct ib_device *__ib_device_get_by_name(const char *name)
 {
 	struct ib_device *device;
diff --git a/drivers/infiniband/core/nldev.c b/drivers/infiniband/core/nldev.c
index 9a05245..0dcd1aa 100644
--- a/drivers/infiniband/core/nldev.c
+++ b/drivers/infiniband/core/nldev.c
@@ -142,27 +142,34 @@ static int nldev_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
 
-	device = __ib_device_get_by_index(index);
+	device = ib_device_get_by_index(index);
 	if (!device)
 		return -EINVAL;
 
 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
-	if (!msg)
-		return -ENOMEM;
+	if (!msg) {
+		err = -ENOMEM;
+		goto err;
+	}
 
 	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
 			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
 			0, 0);
 
 	err = fill_dev_info(msg, device);
-	if (err) {
-		nlmsg_free(msg);
-		return err;
-	}
+	if (err)
+		goto err_free;
 
 	nlmsg_end(msg, nlh);
 
+	put_device(&device->dev);
 	return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
+
+err_free:
+	nlmsg_free(msg);
+err:
+	put_device(&device->dev);
+	return err;
 }
 
 static int _nldev_get_dumpit(struct ib_device *device,
@@ -220,31 +227,40 @@ static int nldev_port_get_doit(struct sk_buff *skb, struct nlmsghdr *nlh,
 		return -EINVAL;
 
 	index = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
-	device = __ib_device_get_by_index(index);
+	device = ib_device_get_by_index(index);
 	if (!device)
 		return -EINVAL;
 
 	port = nla_get_u32(tb[RDMA_NLDEV_ATTR_PORT_INDEX]);
-	if (!rdma_is_port_valid(device, port))
-		return -EINVAL;
+	if (!rdma_is_port_valid(device, port)) {
+		err = -EINVAL;
+		goto err;
+	}
 
 	msg = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
-	if (!msg)
-		return -ENOMEM;
+	if (!msg) {
+		err = -ENOMEM;
+		goto err;
+	}
 
 	nlh = nlmsg_put(msg, NETLINK_CB(skb).portid, nlh->nlmsg_seq,
 			RDMA_NL_GET_TYPE(RDMA_NL_NLDEV, RDMA_NLDEV_CMD_GET),
 			0, 0);
 
 	err = fill_port_info(msg, device, port);
-	if (err) {
-		nlmsg_free(msg);
-		return err;
-	}
+	if (err)
+		goto err_free;
 
 	nlmsg_end(msg, nlh);
+	put_device(&device->dev);
 
 	return rdma_nl_unicast(msg, NETLINK_CB(skb).portid);
+
+err_free:
+	nlmsg_free(msg);
+err:
+	put_device(&device->dev);
+	return err;
 }
 
 static int nldev_port_get_dumpit(struct sk_buff *skb,
@@ -265,7 +281,7 @@ static int nldev_port_get_dumpit(struct sk_buff *skb,
 		return -EINVAL;
 
 	ifindex = nla_get_u32(tb[RDMA_NLDEV_ATTR_DEV_INDEX]);
-	device = __ib_device_get_by_index(ifindex);
+	device = ib_device_get_by_index(ifindex);
 	if (!device)
 		return -EINVAL;
 
@@ -299,7 +315,9 @@ static int nldev_port_get_dumpit(struct sk_buff *skb,
 		nlmsg_end(skb, nlh);
 	}
 
-out:	cb->args[0] = idx;
+out:
+	put_device(&device->dev);
+	cb->args[0] = idx;
 	return skb->len;
 }
 
diff --git a/drivers/infiniband/hw/hfi1/file_ops.c b/drivers/infiniband/hw/hfi1/file_ops.c
index 7750a9c..1df7da4 100644
--- a/drivers/infiniband/hw/hfi1/file_ops.c
+++ b/drivers/infiniband/hw/hfi1/file_ops.c
@@ -763,11 +763,11 @@ static int complete_subctxt(struct hfi1_filedata *fd)
 	}
 
 	if (ret) {
-		hfi1_rcd_put(fd->uctxt);
-		fd->uctxt = NULL;
 		spin_lock_irqsave(&fd->dd->uctxt_lock, flags);
 		__clear_bit(fd->subctxt, fd->uctxt->in_use_ctxts);
 		spin_unlock_irqrestore(&fd->dd->uctxt_lock, flags);
+		hfi1_rcd_put(fd->uctxt);
+		fd->uctxt = NULL;
 	}
 
 	return ret;
diff --git a/drivers/infiniband/hw/hfi1/rc.c b/drivers/infiniband/hw/hfi1/rc.c
index af5f793..7eb5d50 100644
--- a/drivers/infiniband/hw/hfi1/rc.c
+++ b/drivers/infiniband/hw/hfi1/rc.c
@@ -302,7 +302,6 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 		if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
 			goto bail;
 		/* We are in the error state, flush the work request. */
-		smp_read_barrier_depends(); /* see post_one_send() */
 		if (qp->s_last == READ_ONCE(qp->s_head))
 			goto bail;
 		/* If DMAs are in progress, we can't flush immediately. */
@@ -346,7 +345,6 @@ int hfi1_make_rc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 		newreq = 0;
 		if (qp->s_cur == qp->s_tail) {
 			/* Check if send work queue is empty. */
-			smp_read_barrier_depends(); /* see post_one_send() */
 			if (qp->s_tail == READ_ONCE(qp->s_head)) {
 				clear_ahg(qp);
 				goto bail;
@@ -900,7 +898,6 @@ void hfi1_send_rc_ack(struct hfi1_ctxtdata *rcd,
 	}
 
 	/* Ensure s_rdma_ack_cnt changes are committed */
-	smp_read_barrier_depends();
 	if (qp->s_rdma_ack_cnt) {
 		hfi1_queue_rc_ack(qp, is_fecn);
 		return;
@@ -1562,7 +1559,6 @@ static void rc_rcv_resp(struct hfi1_packet *packet)
 	trace_hfi1_ack(qp, psn);
 
 	/* Ignore invalid responses. */
-	smp_read_barrier_depends(); /* see post_one_send */
 	if (cmp_psn(psn, READ_ONCE(qp->s_next_psn)) >= 0)
 		goto ack_done;
 
diff --git a/drivers/infiniband/hw/hfi1/ruc.c b/drivers/infiniband/hw/hfi1/ruc.c
index 2c7fc6e..13b9947 100644
--- a/drivers/infiniband/hw/hfi1/ruc.c
+++ b/drivers/infiniband/hw/hfi1/ruc.c
@@ -362,7 +362,6 @@ static void ruc_loopback(struct rvt_qp *sqp)
 	sqp->s_flags |= RVT_S_BUSY;
 
 again:
-	smp_read_barrier_depends(); /* see post_one_send() */
 	if (sqp->s_last == READ_ONCE(sqp->s_head))
 		goto clr_busy;
 	wqe = rvt_get_swqe_ptr(sqp, sqp->s_last);
diff --git a/drivers/infiniband/hw/hfi1/sdma.c b/drivers/infiniband/hw/hfi1/sdma.c
index 31c8f89..61c130d 100644
--- a/drivers/infiniband/hw/hfi1/sdma.c
+++ b/drivers/infiniband/hw/hfi1/sdma.c
@@ -553,7 +553,6 @@ static void sdma_hw_clean_up_task(unsigned long opaque)
 
 static inline struct sdma_txreq *get_txhead(struct sdma_engine *sde)
 {
-	smp_read_barrier_depends(); /* see sdma_update_tail() */
 	return sde->tx_ring[sde->tx_head & sde->sdma_mask];
 }
 
diff --git a/drivers/infiniband/hw/hfi1/uc.c b/drivers/infiniband/hw/hfi1/uc.c
index 991bbee..132b63e 100644
--- a/drivers/infiniband/hw/hfi1/uc.c
+++ b/drivers/infiniband/hw/hfi1/uc.c
@@ -79,7 +79,6 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 		if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
 			goto bail;
 		/* We are in the error state, flush the work request. */
-		smp_read_barrier_depends(); /* see post_one_send() */
 		if (qp->s_last == READ_ONCE(qp->s_head))
 			goto bail;
 		/* If DMAs are in progress, we can't flush immediately. */
@@ -119,7 +118,6 @@ int hfi1_make_uc_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 		    RVT_PROCESS_NEXT_SEND_OK))
 			goto bail;
 		/* Check if send work queue is empty. */
-		smp_read_barrier_depends(); /* see post_one_send() */
 		if (qp->s_cur == READ_ONCE(qp->s_head)) {
 			clear_ahg(qp);
 			goto bail;
diff --git a/drivers/infiniband/hw/hfi1/ud.c b/drivers/infiniband/hw/hfi1/ud.c
index beb5091..deb1845 100644
--- a/drivers/infiniband/hw/hfi1/ud.c
+++ b/drivers/infiniband/hw/hfi1/ud.c
@@ -486,7 +486,6 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 		if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
 			goto bail;
 		/* We are in the error state, flush the work request. */
-		smp_read_barrier_depends(); /* see post_one_send */
 		if (qp->s_last == READ_ONCE(qp->s_head))
 			goto bail;
 		/* If DMAs are in progress, we can't flush immediately. */
@@ -500,7 +499,6 @@ int hfi1_make_ud_req(struct rvt_qp *qp, struct hfi1_pkt_state *ps)
 	}
 
 	/* see post_one_send() */
-	smp_read_barrier_depends();
 	if (qp->s_cur == READ_ONCE(qp->s_head))
 		goto bail;
 
diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
index 313bfb9..4975f3e 100644
--- a/drivers/infiniband/hw/mlx4/mr.c
+++ b/drivers/infiniband/hw/mlx4/mr.c
@@ -642,7 +642,6 @@ struct ib_mr *mlx4_ib_alloc_mr(struct ib_pd *pd,
 		goto err_free_mr;
 
 	mr->max_pages = max_num_sg;
-
 	err = mlx4_mr_enable(dev->dev, &mr->mmr);
 	if (err)
 		goto err_free_pl;
@@ -653,6 +652,7 @@ struct ib_mr *mlx4_ib_alloc_mr(struct ib_pd *pd,
 	return &mr->ibmr;
 
 err_free_pl:
+	mr->ibmr.device = pd->device;
 	mlx4_free_priv_pages(mr);
 err_free_mr:
 	(void) mlx4_mr_free(dev->dev, &mr->mmr);
diff --git a/drivers/infiniband/hw/mlx5/main.c b/drivers/infiniband/hw/mlx5/main.c
index 8ac50de..262c1aa 100644
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1324,7 +1324,8 @@ static int mlx5_ib_alloc_transport_domain(struct mlx5_ib_dev *dev, u32 *tdn)
 		return err;
 
 	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
-	    !MLX5_CAP_GEN(dev->mdev, disable_local_lb))
+	    (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
+	     !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
 		return err;
 
 	mutex_lock(&dev->lb_mutex);
@@ -1342,7 +1343,8 @@ static void mlx5_ib_dealloc_transport_domain(struct mlx5_ib_dev *dev, u32 tdn)
 	mlx5_core_dealloc_transport_domain(dev->mdev, tdn);
 
 	if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
-	    !MLX5_CAP_GEN(dev->mdev, disable_local_lb))
+	    (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
+	     !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
 		return;
 
 	mutex_lock(&dev->lb_mutex);
@@ -4158,7 +4160,7 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 		goto err_cnt;
 
 	dev->mdev->priv.uar = mlx5_get_uars_page(dev->mdev);
-	if (!dev->mdev->priv.uar)
+	if (IS_ERR(dev->mdev->priv.uar))
 		goto err_cong;
 
 	err = mlx5_alloc_bfreg(dev->mdev, &dev->bfreg, false, false);
@@ -4187,7 +4189,8 @@ static void *mlx5_ib_add(struct mlx5_core_dev *mdev)
 	}
 
 	if ((MLX5_CAP_GEN(mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
-	    MLX5_CAP_GEN(mdev, disable_local_lb))
+	    (MLX5_CAP_GEN(mdev, disable_local_lb_uc) ||
+	     MLX5_CAP_GEN(mdev, disable_local_lb_mc)))
 		mutex_init(&dev->lb_mutex);
 
 	dev->ib_active = true;
diff --git a/drivers/infiniband/hw/mlx5/qp.c b/drivers/infiniband/hw/mlx5/qp.c
index 31ad2885..cffe596 100644
--- a/drivers/infiniband/hw/mlx5/qp.c
+++ b/drivers/infiniband/hw/mlx5/qp.c
@@ -4362,12 +4362,11 @@ static void to_rdma_ah_attr(struct mlx5_ib_dev *ibdev,
 
 	memset(ah_attr, 0, sizeof(*ah_attr));
 
-	ah_attr->type = rdma_ah_find_type(&ibdev->ib_dev, path->port);
-	rdma_ah_set_port_num(ah_attr, path->port);
-	if (rdma_ah_get_port_num(ah_attr) == 0 ||
-	    rdma_ah_get_port_num(ah_attr) > MLX5_CAP_GEN(dev, num_ports))
+	if (!path->port || path->port > MLX5_CAP_GEN(dev, num_ports))
 		return;
 
+	ah_attr->type = rdma_ah_find_type(&ibdev->ib_dev, path->port);
+
 	rdma_ah_set_port_num(ah_attr, path->port);
 	rdma_ah_set_sl(ah_attr, path->dci_cfi_prio_sl & 0xf);
 
diff --git a/drivers/infiniband/hw/qib/qib_rc.c b/drivers/infiniband/hw/qib/qib_rc.c
index 8f5754f..1a785c3 100644
--- a/drivers/infiniband/hw/qib/qib_rc.c
+++ b/drivers/infiniband/hw/qib/qib_rc.c
@@ -246,7 +246,6 @@ int qib_make_rc_req(struct rvt_qp *qp, unsigned long *flags)
 		if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
 			goto bail;
 		/* We are in the error state, flush the work request. */
-		smp_read_barrier_depends(); /* see post_one_send() */
 		if (qp->s_last == READ_ONCE(qp->s_head))
 			goto bail;
 		/* If DMAs are in progress, we can't flush immediately. */
@@ -293,7 +292,6 @@ int qib_make_rc_req(struct rvt_qp *qp, unsigned long *flags)
 		newreq = 0;
 		if (qp->s_cur == qp->s_tail) {
 			/* Check if send work queue is empty. */
-			smp_read_barrier_depends(); /* see post_one_send() */
 			if (qp->s_tail == READ_ONCE(qp->s_head))
 				goto bail;
 			/*
@@ -1340,7 +1338,6 @@ static void qib_rc_rcv_resp(struct qib_ibport *ibp,
 		goto ack_done;
 
 	/* Ignore invalid responses. */
-	smp_read_barrier_depends(); /* see post_one_send */
 	if (qib_cmp24(psn, READ_ONCE(qp->s_next_psn)) >= 0)
 		goto ack_done;
 
diff --git a/drivers/infiniband/hw/qib/qib_ruc.c b/drivers/infiniband/hw/qib/qib_ruc.c
index 9a37e84..4662cc7 100644
--- a/drivers/infiniband/hw/qib/qib_ruc.c
+++ b/drivers/infiniband/hw/qib/qib_ruc.c
@@ -367,7 +367,6 @@ static void qib_ruc_loopback(struct rvt_qp *sqp)
 	sqp->s_flags |= RVT_S_BUSY;
 
 again:
-	smp_read_barrier_depends(); /* see post_one_send() */
 	if (sqp->s_last == READ_ONCE(sqp->s_head))
 		goto clr_busy;
 	wqe = rvt_get_swqe_ptr(sqp, sqp->s_last);
diff --git a/drivers/infiniband/hw/qib/qib_uc.c b/drivers/infiniband/hw/qib/qib_uc.c
index bddcc37..70c58b8 100644
--- a/drivers/infiniband/hw/qib/qib_uc.c
+++ b/drivers/infiniband/hw/qib/qib_uc.c
@@ -60,7 +60,6 @@ int qib_make_uc_req(struct rvt_qp *qp, unsigned long *flags)
 		if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
 			goto bail;
 		/* We are in the error state, flush the work request. */
-		smp_read_barrier_depends(); /* see post_one_send() */
 		if (qp->s_last == READ_ONCE(qp->s_head))
 			goto bail;
 		/* If DMAs are in progress, we can't flush immediately. */
@@ -90,7 +89,6 @@ int qib_make_uc_req(struct rvt_qp *qp, unsigned long *flags)
 		    RVT_PROCESS_NEXT_SEND_OK))
 			goto bail;
 		/* Check if send work queue is empty. */
-		smp_read_barrier_depends(); /* see post_one_send() */
 		if (qp->s_cur == READ_ONCE(qp->s_head))
 			goto bail;
 		/*
diff --git a/drivers/infiniband/hw/qib/qib_ud.c b/drivers/infiniband/hw/qib/qib_ud.c
index 15962ed..386c3c4 100644
--- a/drivers/infiniband/hw/qib/qib_ud.c
+++ b/drivers/infiniband/hw/qib/qib_ud.c
@@ -252,7 +252,6 @@ int qib_make_ud_req(struct rvt_qp *qp, unsigned long *flags)
 		if (!(ib_rvt_state_ops[qp->state] & RVT_FLUSH_SEND))
 			goto bail;
 		/* We are in the error state, flush the work request. */
-		smp_read_barrier_depends(); /* see post_one_send */
 		if (qp->s_last == READ_ONCE(qp->s_head))
 			goto bail;
 		/* If DMAs are in progress, we can't flush immediately. */
@@ -266,7 +265,6 @@ int qib_make_ud_req(struct rvt_qp *qp, unsigned long *flags)
 	}
 
 	/* see post_one_send() */
-	smp_read_barrier_depends();
 	if (qp->s_cur == READ_ONCE(qp->s_head))
 		goto bail;
 
diff --git a/drivers/infiniband/sw/rdmavt/qp.c b/drivers/infiniband/sw/rdmavt/qp.c
index 9177df6..eae84c2 100644
--- a/drivers/infiniband/sw/rdmavt/qp.c
+++ b/drivers/infiniband/sw/rdmavt/qp.c
@@ -1684,7 +1684,6 @@ static inline int rvt_qp_is_avail(
 	/* non-reserved operations */
 	if (likely(qp->s_avail))
 		return 0;
-	smp_read_barrier_depends(); /* see rc.c */
 	slast = READ_ONCE(qp->s_last);
 	if (qp->s_head >= slast)
 		avail = qp->s_size - (qp->s_head - slast);
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_cm.c b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
index 2c13123..71ea9e2 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_cm.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_cm.c
@@ -1456,8 +1456,7 @@ void ipoib_cm_skb_too_long(struct net_device *dev, struct sk_buff *skb,
 	struct ipoib_dev_priv *priv = ipoib_priv(dev);
 	int e = skb_queue_empty(&priv->cm.skb_queue);
 
-	if (skb_dst(skb))
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+	skb_dst_update_pmtu(skb, mtu);
 
 	skb_queue_tail(&priv->cm.skb_queue, skb);
 	if (e)
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 12b7f91..8880351d 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -902,8 +902,8 @@ static int path_rec_start(struct net_device *dev,
 	return 0;
 }
 
-static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
-			   struct net_device *dev)
+static struct ipoib_neigh *neigh_add_path(struct sk_buff *skb, u8 *daddr,
+					  struct net_device *dev)
 {
 	struct ipoib_dev_priv *priv = ipoib_priv(dev);
 	struct rdma_netdev *rn = netdev_priv(dev);
@@ -917,7 +917,15 @@ static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
 		spin_unlock_irqrestore(&priv->lock, flags);
 		++dev->stats.tx_dropped;
 		dev_kfree_skb_any(skb);
-		return;
+		return NULL;
+	}
+
+	/* To avoid race condition, make sure that the
+	 * neigh will be added only once.
+	 */
+	if (unlikely(!list_empty(&neigh->list))) {
+		spin_unlock_irqrestore(&priv->lock, flags);
+		return neigh;
 	}
 
 	path = __path_find(dev, daddr + 4);
@@ -956,7 +964,7 @@ static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
 			path->ah->last_send = rn->send(dev, skb, path->ah->ah,
 						       IPOIB_QPN(daddr));
 			ipoib_neigh_put(neigh);
-			return;
+			return NULL;
 		}
 	} else {
 		neigh->ah  = NULL;
@@ -973,7 +981,7 @@ static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
 
 	spin_unlock_irqrestore(&priv->lock, flags);
 	ipoib_neigh_put(neigh);
-	return;
+	return NULL;
 
 err_path:
 	ipoib_neigh_free(neigh);
@@ -983,6 +991,8 @@ static void neigh_add_path(struct sk_buff *skb, u8 *daddr,
 
 	spin_unlock_irqrestore(&priv->lock, flags);
 	ipoib_neigh_put(neigh);
+
+	return NULL;
 }
 
 static void unicast_arp_send(struct sk_buff *skb, struct net_device *dev,
@@ -1091,8 +1101,9 @@ static int ipoib_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	case htons(ETH_P_TIPC):
 		neigh = ipoib_neigh_get(dev, phdr->hwaddr);
 		if (unlikely(!neigh)) {
-			neigh_add_path(skb, phdr->hwaddr, dev);
-			return NETDEV_TX_OK;
+			neigh = neigh_add_path(skb, phdr->hwaddr, dev);
+			if (likely(!neigh))
+				return NETDEV_TX_OK;
 		}
 		break;
 	case htons(ETH_P_ARP):
diff --git a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
index 93e149e..9b3f47a 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_multicast.c
@@ -816,7 +816,10 @@ void ipoib_mcast_send(struct net_device *dev, u8 *daddr, struct sk_buff *skb)
 		spin_lock_irqsave(&priv->lock, flags);
 		if (!neigh) {
 			neigh = ipoib_neigh_alloc(daddr, dev);
-			if (neigh) {
+			/* Make sure that the neigh will be added only
+			 * once to mcast list.
+			 */
+			if (neigh && list_empty(&neigh->list)) {
 				kref_get(&mcast->ah->ref);
 				neigh->ah	= mcast->ah;
 				list_add_tail(&neigh->list, &mcast->neigh_list);
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 720dfb3..1b02283 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -741,6 +741,7 @@ isert_connect_error(struct rdma_cm_id *cma_id)
 {
 	struct isert_conn *isert_conn = cma_id->qp->qp_context;
 
+	ib_drain_qp(isert_conn->qp);
 	list_del_init(&isert_conn->node);
 	isert_conn->cm_id = NULL;
 	isert_put_conn(isert_conn);
diff --git a/drivers/infiniband/ulp/srpt/ib_srpt.c b/drivers/infiniband/ulp/srpt/ib_srpt.c
index 8a1bd35..bfa576a 100644
--- a/drivers/infiniband/ulp/srpt/ib_srpt.c
+++ b/drivers/infiniband/ulp/srpt/ib_srpt.c
@@ -1013,8 +1013,7 @@ static int srpt_init_ch_qp(struct srpt_rdma_ch *ch, struct ib_qp *qp)
 		return -ENOMEM;
 
 	attr->qp_state = IB_QPS_INIT;
-	attr->qp_access_flags = IB_ACCESS_LOCAL_WRITE | IB_ACCESS_REMOTE_READ |
-	    IB_ACCESS_REMOTE_WRITE;
+	attr->qp_access_flags = IB_ACCESS_LOCAL_WRITE;
 	attr->port_num = ch->sport->port;
 	attr->pkey_index = 0;
 
@@ -2078,7 +2077,7 @@ static int srpt_cm_req_recv(struct ib_cm_id *cm_id,
 		goto destroy_ib;
 	}
 
-	guid = (__be16 *)&param->primary_path->sgid.global.interface_id;
+	guid = (__be16 *)&param->primary_path->dgid.global.interface_id;
 	snprintf(ch->ini_guid, sizeof(ch->ini_guid), "%04x:%04x:%04x:%04x",
 		 be16_to_cpu(guid[0]), be16_to_cpu(guid[1]),
 		 be16_to_cpu(guid[2]), be16_to_cpu(guid[3]));
diff --git a/drivers/input/joystick/analog.c b/drivers/input/joystick/analog.c
index 3d8ff09..c868a87 100644
--- a/drivers/input/joystick/analog.c
+++ b/drivers/input/joystick/analog.c
@@ -163,7 +163,7 @@ static unsigned int get_time_pit(void)
 #define GET_TIME(x)	do { x = (unsigned int)rdtsc(); } while (0)
 #define DELTA(x,y)	((y)-(x))
 #define TIME_NAME	"TSC"
-#elif defined(__alpha__) || defined(CONFIG_MN10300) || defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_TILE)
+#elif defined(__alpha__) || defined(CONFIG_MN10300) || defined(CONFIG_ARM) || defined(CONFIG_ARM64) || defined(CONFIG_RISCV) || defined(CONFIG_TILE)
 #define GET_TIME(x)	do { x = get_cycles(); } while (0)
 #define DELTA(x,y)	((y)-(x))
 #define TIME_NAME	"get_cycles"
diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c
index d86e595..d88d3e0 100644
--- a/drivers/input/joystick/xpad.c
+++ b/drivers/input/joystick/xpad.c
@@ -229,6 +229,7 @@ static const struct xpad_device {
 	{ 0x0e6f, 0x0213, "Afterglow Gamepad for Xbox 360", 0, XTYPE_XBOX360 },
 	{ 0x0e6f, 0x021f, "Rock Candy Gamepad for Xbox 360", 0, XTYPE_XBOX360 },
 	{ 0x0e6f, 0x0246, "Rock Candy Gamepad for Xbox One 2015", 0, XTYPE_XBOXONE },
+	{ 0x0e6f, 0x02ab, "PDP Controller for Xbox One", 0, XTYPE_XBOXONE },
 	{ 0x0e6f, 0x0301, "Logic3 Controller", 0, XTYPE_XBOX360 },
 	{ 0x0e6f, 0x0346, "Rock Candy Gamepad for Xbox One 2016", 0, XTYPE_XBOXONE },
 	{ 0x0e6f, 0x0401, "Logic3 Controller", 0, XTYPE_XBOX360 },
@@ -476,6 +477,22 @@ static const u8 xboxone_hori_init[] = {
 };
 
 /*
+ * This packet is required for some of the PDP pads to start
+ * sending input reports. One of those pads is (0x0e6f:0x02ab).
+ */
+static const u8 xboxone_pdp_init1[] = {
+	0x0a, 0x20, 0x00, 0x03, 0x00, 0x01, 0x14
+};
+
+/*
+ * This packet is required for some of the PDP pads to start
+ * sending input reports. One of those pads is (0x0e6f:0x02ab).
+ */
+static const u8 xboxone_pdp_init2[] = {
+	0x06, 0x20, 0x00, 0x02, 0x01, 0x00
+};
+
+/*
  * A specific rumble packet is required for some PowerA pads to start
  * sending input reports. One of those pads is (0x24c6:0x543a).
  */
@@ -505,6 +522,8 @@ static const struct xboxone_init_packet xboxone_init_packets[] = {
 	XBOXONE_INIT_PKT(0x0e6f, 0x0165, xboxone_hori_init),
 	XBOXONE_INIT_PKT(0x0f0d, 0x0067, xboxone_hori_init),
 	XBOXONE_INIT_PKT(0x0000, 0x0000, xboxone_fw2015_init),
+	XBOXONE_INIT_PKT(0x0e6f, 0x02ab, xboxone_pdp_init1),
+	XBOXONE_INIT_PKT(0x0e6f, 0x02ab, xboxone_pdp_init2),
 	XBOXONE_INIT_PKT(0x24c6, 0x541a, xboxone_rumblebegin_init),
 	XBOXONE_INIT_PKT(0x24c6, 0x542a, xboxone_rumblebegin_init),
 	XBOXONE_INIT_PKT(0x24c6, 0x543a, xboxone_rumblebegin_init),
diff --git a/drivers/input/misc/ims-pcu.c b/drivers/input/misc/ims-pcu.c
index ae47312..3d51175 100644
--- a/drivers/input/misc/ims-pcu.c
+++ b/drivers/input/misc/ims-pcu.c
@@ -1651,7 +1651,7 @@ ims_pcu_get_cdc_union_desc(struct usb_interface *intf)
 				return union_desc;
 
 			dev_err(&intf->dev,
-				"Union descriptor to short (%d vs %zd\n)",
+				"Union descriptor too short (%d vs %zd)\n",
 				union_desc->bLength, sizeof(*union_desc));
 			return NULL;
 		}
diff --git a/drivers/input/misc/twl4030-vibra.c b/drivers/input/misc/twl4030-vibra.c
index 6c51d40..c37aea9 100644
--- a/drivers/input/misc/twl4030-vibra.c
+++ b/drivers/input/misc/twl4030-vibra.c
@@ -178,12 +178,14 @@ static SIMPLE_DEV_PM_OPS(twl4030_vibra_pm_ops,
 			 twl4030_vibra_suspend, twl4030_vibra_resume);
 
 static bool twl4030_vibra_check_coexist(struct twl4030_vibra_data *pdata,
-			      struct device_node *node)
+			      struct device_node *parent)
 {
+	struct device_node *node;
+
 	if (pdata && pdata->coexist)
 		return true;
 
-	node = of_find_node_by_name(node, "codec");
+	node = of_get_child_by_name(parent, "codec");
 	if (node) {
 		of_node_put(node);
 		return true;
diff --git a/drivers/input/misc/twl6040-vibra.c b/drivers/input/misc/twl6040-vibra.c
index 5690eb7..15e0d35 100644
--- a/drivers/input/misc/twl6040-vibra.c
+++ b/drivers/input/misc/twl6040-vibra.c
@@ -248,8 +248,7 @@ static int twl6040_vibra_probe(struct platform_device *pdev)
 	int vddvibr_uV = 0;
 	int error;
 
-	of_node_get(twl6040_core_dev->of_node);
-	twl6040_core_node = of_find_node_by_name(twl6040_core_dev->of_node,
+	twl6040_core_node = of_get_child_by_name(twl6040_core_dev->of_node,
 						 "vibra");
 	if (!twl6040_core_node) {
 		dev_err(&pdev->dev, "parent of node is missing?\n");
diff --git a/drivers/input/misc/xen-kbdfront.c b/drivers/input/misc/xen-kbdfront.c
index 6bf56bb..d91f3b1 100644
--- a/drivers/input/misc/xen-kbdfront.c
+++ b/drivers/input/misc/xen-kbdfront.c
@@ -326,8 +326,6 @@ static int xenkbd_probe(struct xenbus_device *dev,
 				     0, width, 0, 0);
 		input_set_abs_params(mtouch, ABS_MT_POSITION_Y,
 				     0, height, 0, 0);
-		input_set_abs_params(mtouch, ABS_MT_PRESSURE,
-				     0, 255, 0, 0);
 
 		ret = input_mt_init_slots(mtouch, num_cont, INPUT_MT_DIRECT);
 		if (ret) {
diff --git a/drivers/input/mouse/alps.c b/drivers/input/mouse/alps.c
index 579b899..dbe57da 100644
--- a/drivers/input/mouse/alps.c
+++ b/drivers/input/mouse/alps.c
@@ -1250,29 +1250,32 @@ static int alps_decode_ss4_v2(struct alps_fields *f,
 	case SS4_PACKET_ID_MULTI:
 		if (priv->flags & ALPS_BUTTONPAD) {
 			if (IS_SS4PLUS_DEV(priv->dev_id)) {
-				f->mt[0].x = SS4_PLUS_BTL_MF_X_V2(p, 0);
-				f->mt[1].x = SS4_PLUS_BTL_MF_X_V2(p, 1);
+				f->mt[2].x = SS4_PLUS_BTL_MF_X_V2(p, 0);
+				f->mt[3].x = SS4_PLUS_BTL_MF_X_V2(p, 1);
+				no_data_x = SS4_PLUS_MFPACKET_NO_AX_BL;
 			} else {
 				f->mt[2].x = SS4_BTL_MF_X_V2(p, 0);
 				f->mt[3].x = SS4_BTL_MF_X_V2(p, 1);
+				no_data_x = SS4_MFPACKET_NO_AX_BL;
 			}
+			no_data_y = SS4_MFPACKET_NO_AY_BL;
 
 			f->mt[2].y = SS4_BTL_MF_Y_V2(p, 0);
 			f->mt[3].y = SS4_BTL_MF_Y_V2(p, 1);
-			no_data_x = SS4_MFPACKET_NO_AX_BL;
-			no_data_y = SS4_MFPACKET_NO_AY_BL;
 		} else {
 			if (IS_SS4PLUS_DEV(priv->dev_id)) {
-				f->mt[0].x = SS4_PLUS_STD_MF_X_V2(p, 0);
-				f->mt[1].x = SS4_PLUS_STD_MF_X_V2(p, 1);
+				f->mt[2].x = SS4_PLUS_STD_MF_X_V2(p, 0);
+				f->mt[3].x = SS4_PLUS_STD_MF_X_V2(p, 1);
+				no_data_x = SS4_PLUS_MFPACKET_NO_AX;
 			} else {
-				f->mt[0].x = SS4_STD_MF_X_V2(p, 0);
-				f->mt[1].x = SS4_STD_MF_X_V2(p, 1);
+				f->mt[2].x = SS4_STD_MF_X_V2(p, 0);
+				f->mt[3].x = SS4_STD_MF_X_V2(p, 1);
+				no_data_x = SS4_MFPACKET_NO_AX;
 			}
+			no_data_y = SS4_MFPACKET_NO_AY;
+
 			f->mt[2].y = SS4_STD_MF_Y_V2(p, 0);
 			f->mt[3].y = SS4_STD_MF_Y_V2(p, 1);
-			no_data_x = SS4_MFPACKET_NO_AX;
-			no_data_y = SS4_MFPACKET_NO_AY;
 		}
 
 		f->first_mp = 0;
diff --git a/drivers/input/mouse/alps.h b/drivers/input/mouse/alps.h
index c80a7c7..79b6d69 100644
--- a/drivers/input/mouse/alps.h
+++ b/drivers/input/mouse/alps.h
@@ -141,10 +141,12 @@ enum SS4_PACKET_ID {
 #define SS4_TS_Z_V2(_b)		(s8)(_b[4] & 0x7F)
 
 
-#define SS4_MFPACKET_NO_AX	8160	/* X-Coordinate value */
-#define SS4_MFPACKET_NO_AY	4080	/* Y-Coordinate value */
-#define SS4_MFPACKET_NO_AX_BL	8176	/* Buttonless X-Coordinate value */
-#define SS4_MFPACKET_NO_AY_BL	4088	/* Buttonless Y-Coordinate value */
+#define SS4_MFPACKET_NO_AX		8160	/* X-Coordinate value */
+#define SS4_MFPACKET_NO_AY		4080	/* Y-Coordinate value */
+#define SS4_MFPACKET_NO_AX_BL		8176	/* Buttonless X-Coord value */
+#define SS4_MFPACKET_NO_AY_BL		4088	/* Buttonless Y-Coord value */
+#define SS4_PLUS_MFPACKET_NO_AX		4080	/* SS4 PLUS, X */
+#define SS4_PLUS_MFPACKET_NO_AX_BL	4088	/* Buttonless SS4 PLUS, X */
 
 /*
  * enum V7_PACKET_ID - defines the packet type for V7
diff --git a/drivers/input/mouse/elantech.c b/drivers/input/mouse/elantech.c
index b84cd97..a4aaa74 100644
--- a/drivers/input/mouse/elantech.c
+++ b/drivers/input/mouse/elantech.c
@@ -1613,7 +1613,7 @@ static int elantech_set_properties(struct elantech_data *etd)
 		case 5:
 			etd->hw_version = 3;
 			break;
-		case 6 ... 14:
+		case 6 ... 15:
 			etd->hw_version = 4;
 			break;
 		default:
diff --git a/drivers/input/mouse/synaptics.c b/drivers/input/mouse/synaptics.c
index ee5466a..cd9f61c 100644
--- a/drivers/input/mouse/synaptics.c
+++ b/drivers/input/mouse/synaptics.c
@@ -173,6 +173,7 @@ static const char * const smbus_pnp_ids[] = {
 	"LEN0046", /* X250 */
 	"LEN004a", /* W541 */
 	"LEN200f", /* T450s */
+	"LEN2018", /* T460p */
 	NULL
 };
 
diff --git a/drivers/input/mouse/trackpoint.c b/drivers/input/mouse/trackpoint.c
index 0871010..bbd2922 100644
--- a/drivers/input/mouse/trackpoint.c
+++ b/drivers/input/mouse/trackpoint.c
@@ -19,6 +19,13 @@
 #include "psmouse.h"
 #include "trackpoint.h"
 
+static const char * const trackpoint_variants[] = {
+	[TP_VARIANT_IBM]	= "IBM",
+	[TP_VARIANT_ALPS]	= "ALPS",
+	[TP_VARIANT_ELAN]	= "Elan",
+	[TP_VARIANT_NXP]	= "NXP",
+};
+
 /*
  * Power-on Reset: Resets all trackpoint parameters, including RAM values,
  * to defaults.
@@ -26,7 +33,7 @@
  */
 static int trackpoint_power_on_reset(struct ps2dev *ps2dev)
 {
-	unsigned char results[2];
+	u8 results[2];
 	int tries = 0;
 
 	/* Issue POR command, and repeat up to once if 0xFC00 received */
@@ -38,7 +45,7 @@ static int trackpoint_power_on_reset(struct ps2dev *ps2dev)
 
 	/* Check for success response -- 0xAA00 */
 	if (results[0] != 0xAA || results[1] != 0x00)
-		return -1;
+		return -ENODEV;
 
 	return 0;
 }
@@ -46,8 +53,7 @@ static int trackpoint_power_on_reset(struct ps2dev *ps2dev)
 /*
  * Device IO: read, write and toggle bit
  */
-static int trackpoint_read(struct ps2dev *ps2dev,
-			   unsigned char loc, unsigned char *results)
+static int trackpoint_read(struct ps2dev *ps2dev, u8 loc, u8 *results)
 {
 	if (ps2_command(ps2dev, NULL, MAKE_PS2_CMD(0, 0, TP_COMMAND)) ||
 	    ps2_command(ps2dev, results, MAKE_PS2_CMD(0, 1, loc))) {
@@ -57,8 +63,7 @@ static int trackpoint_read(struct ps2dev *ps2dev,
 	return 0;
 }
 
-static int trackpoint_write(struct ps2dev *ps2dev,
-			    unsigned char loc, unsigned char val)
+static int trackpoint_write(struct ps2dev *ps2dev, u8 loc, u8 val)
 {
 	if (ps2_command(ps2dev, NULL, MAKE_PS2_CMD(0, 0, TP_COMMAND)) ||
 	    ps2_command(ps2dev, NULL, MAKE_PS2_CMD(0, 0, TP_WRITE_MEM)) ||
@@ -70,8 +75,7 @@ static int trackpoint_write(struct ps2dev *ps2dev,
 	return 0;
 }
 
-static int trackpoint_toggle_bit(struct ps2dev *ps2dev,
-				 unsigned char loc, unsigned char mask)
+static int trackpoint_toggle_bit(struct ps2dev *ps2dev, u8 loc, u8 mask)
 {
 	/* Bad things will happen if the loc param isn't in this range */
 	if (loc < 0x20 || loc >= 0x2F)
@@ -87,11 +91,11 @@ static int trackpoint_toggle_bit(struct ps2dev *ps2dev,
 	return 0;
 }
 
-static int trackpoint_update_bit(struct ps2dev *ps2dev, unsigned char loc,
-				 unsigned char mask, unsigned char value)
+static int trackpoint_update_bit(struct ps2dev *ps2dev,
+				 u8 loc, u8 mask, u8 value)
 {
 	int retval = 0;
-	unsigned char data;
+	u8 data;
 
 	trackpoint_read(ps2dev, loc, &data);
 	if (((data & mask) == mask) != !!value)
@@ -105,17 +109,18 @@ static int trackpoint_update_bit(struct ps2dev *ps2dev, unsigned char loc,
  */
 struct trackpoint_attr_data {
 	size_t field_offset;
-	unsigned char command;
-	unsigned char mask;
-	unsigned char inverted;
-	unsigned char power_on_default;
+	u8 command;
+	u8 mask;
+	bool inverted;
+	u8 power_on_default;
 };
 
-static ssize_t trackpoint_show_int_attr(struct psmouse *psmouse, void *data, char *buf)
+static ssize_t trackpoint_show_int_attr(struct psmouse *psmouse,
+					void *data, char *buf)
 {
 	struct trackpoint_data *tp = psmouse->private;
 	struct trackpoint_attr_data *attr = data;
-	unsigned char value = *(unsigned char *)((char *)tp + attr->field_offset);
+	u8 value = *(u8 *)((void *)tp + attr->field_offset);
 
 	if (attr->inverted)
 		value = !value;
@@ -128,8 +133,8 @@ static ssize_t trackpoint_set_int_attr(struct psmouse *psmouse, void *data,
 {
 	struct trackpoint_data *tp = psmouse->private;
 	struct trackpoint_attr_data *attr = data;
-	unsigned char *field = (unsigned char *)((char *)tp + attr->field_offset);
-	unsigned char value;
+	u8 *field = (void *)tp + attr->field_offset;
+	u8 value;
 	int err;
 
 	err = kstrtou8(buf, 10, &value);
@@ -157,17 +162,14 @@ static ssize_t trackpoint_set_bit_attr(struct psmouse *psmouse, void *data,
 {
 	struct trackpoint_data *tp = psmouse->private;
 	struct trackpoint_attr_data *attr = data;
-	unsigned char *field = (unsigned char *)((char *)tp + attr->field_offset);
-	unsigned int value;
+	bool *field = (void *)tp + attr->field_offset;
+	bool value;
 	int err;
 
-	err = kstrtouint(buf, 10, &value);
+	err = kstrtobool(buf, &value);
 	if (err)
 		return err;
 
-	if (value > 1)
-		return -EINVAL;
-
 	if (attr->inverted)
 		value = !value;
 
@@ -193,30 +195,6 @@ PSMOUSE_DEFINE_ATTR(_name, S_IWUSR | S_IRUGO,				\
 		    &trackpoint_attr_##_name,				\
 		    trackpoint_show_int_attr, trackpoint_set_bit_attr)
 
-#define TRACKPOINT_UPDATE_BIT(_psmouse, _tp, _name)			\
-do {									\
-	struct trackpoint_attr_data *_attr = &trackpoint_attr_##_name;	\
-									\
-	trackpoint_update_bit(&_psmouse->ps2dev,			\
-			_attr->command, _attr->mask, _tp->_name);	\
-} while (0)
-
-#define TRACKPOINT_UPDATE(_power_on, _psmouse, _tp, _name)		\
-do {									\
-	if (!_power_on ||						\
-	    _tp->_name != trackpoint_attr_##_name.power_on_default) {	\
-		if (!trackpoint_attr_##_name.mask)			\
-			trackpoint_write(&_psmouse->ps2dev,		\
-				 trackpoint_attr_##_name.command,	\
-				 _tp->_name);				\
-		else							\
-			TRACKPOINT_UPDATE_BIT(_psmouse, _tp, _name);	\
-	}								\
-} while (0)
-
-#define TRACKPOINT_SET_POWER_ON_DEFAULT(_tp, _name)				\
-	(_tp->_name = trackpoint_attr_##_name.power_on_default)
-
 TRACKPOINT_INT_ATTR(sensitivity, TP_SENS, TP_DEF_SENS);
 TRACKPOINT_INT_ATTR(speed, TP_SPEED, TP_DEF_SPEED);
 TRACKPOINT_INT_ATTR(inertia, TP_INERTIA, TP_DEF_INERTIA);
@@ -229,13 +207,33 @@ TRACKPOINT_INT_ATTR(ztime, TP_Z_TIME, TP_DEF_Z_TIME);
 TRACKPOINT_INT_ATTR(jenks, TP_JENKS_CURV, TP_DEF_JENKS_CURV);
 TRACKPOINT_INT_ATTR(drift_time, TP_DRIFT_TIME, TP_DEF_DRIFT_TIME);
 
-TRACKPOINT_BIT_ATTR(press_to_select, TP_TOGGLE_PTSON, TP_MASK_PTSON, 0,
+TRACKPOINT_BIT_ATTR(press_to_select, TP_TOGGLE_PTSON, TP_MASK_PTSON, false,
 		    TP_DEF_PTSON);
-TRACKPOINT_BIT_ATTR(skipback, TP_TOGGLE_SKIPBACK, TP_MASK_SKIPBACK, 0,
+TRACKPOINT_BIT_ATTR(skipback, TP_TOGGLE_SKIPBACK, TP_MASK_SKIPBACK, false,
 		    TP_DEF_SKIPBACK);
-TRACKPOINT_BIT_ATTR(ext_dev, TP_TOGGLE_EXT_DEV, TP_MASK_EXT_DEV, 1,
+TRACKPOINT_BIT_ATTR(ext_dev, TP_TOGGLE_EXT_DEV, TP_MASK_EXT_DEV, true,
 		    TP_DEF_EXT_DEV);
 
+static bool trackpoint_is_attr_available(struct psmouse *psmouse,
+					 struct attribute *attr)
+{
+	struct trackpoint_data *tp = psmouse->private;
+
+	return tp->variant_id == TP_VARIANT_IBM ||
+		attr == &psmouse_attr_sensitivity.dattr.attr ||
+		attr == &psmouse_attr_press_to_select.dattr.attr;
+}
+
+static umode_t trackpoint_is_attr_visible(struct kobject *kobj,
+					  struct attribute *attr, int n)
+{
+	struct device *dev = container_of(kobj, struct device, kobj);
+	struct serio *serio = to_serio_port(dev);
+	struct psmouse *psmouse = serio_get_drvdata(serio);
+
+	return trackpoint_is_attr_available(psmouse, attr) ? attr->mode : 0;
+}
+
 static struct attribute *trackpoint_attrs[] = {
 	&psmouse_attr_sensitivity.dattr.attr,
 	&psmouse_attr_speed.dattr.attr,
@@ -255,24 +253,56 @@ static struct attribute *trackpoint_attrs[] = {
 };
 
 static struct attribute_group trackpoint_attr_group = {
-	.attrs = trackpoint_attrs,
+	.is_visible	= trackpoint_is_attr_visible,
+	.attrs		= trackpoint_attrs,
 };
 
-static int trackpoint_start_protocol(struct psmouse *psmouse, unsigned char *firmware_id)
+#define TRACKPOINT_UPDATE(_power_on, _psmouse, _tp, _name)		\
+do {									\
+	struct trackpoint_attr_data *_attr = &trackpoint_attr_##_name;	\
+									\
+	if ((!_power_on || _tp->_name != _attr->power_on_default) &&	\
+	    trackpoint_is_attr_available(_psmouse,			\
+				&psmouse_attr_##_name.dattr.attr)) {	\
+		if (!_attr->mask)					\
+			trackpoint_write(&_psmouse->ps2dev,		\
+					 _attr->command, _tp->_name);	\
+		else							\
+			trackpoint_update_bit(&_psmouse->ps2dev,	\
+					_attr->command, _attr->mask,	\
+					_tp->_name);			\
+	}								\
+} while (0)
+
+#define TRACKPOINT_SET_POWER_ON_DEFAULT(_tp, _name)			\
+do {									\
+	_tp->_name = trackpoint_attr_##_name.power_on_default;		\
+} while (0)
+
+static int trackpoint_start_protocol(struct psmouse *psmouse,
+				     u8 *variant_id, u8 *firmware_id)
 {
-	unsigned char param[2] = { 0 };
+	u8 param[2] = { 0 };
+	int error;
 
-	if (ps2_command(&psmouse->ps2dev, param, MAKE_PS2_CMD(0, 2, TP_READ_ID)))
-		return -1;
+	error = ps2_command(&psmouse->ps2dev,
+			    param, MAKE_PS2_CMD(0, 2, TP_READ_ID));
+	if (error)
+		return error;
 
-	/* add new TP ID. */
-	if (!(param[0] & TP_MAGIC_IDENT))
-		return -1;
+	switch (param[0]) {
+	case TP_VARIANT_IBM:
+	case TP_VARIANT_ALPS:
+	case TP_VARIANT_ELAN:
+	case TP_VARIANT_NXP:
+		if (variant_id)
+			*variant_id = param[0];
+		if (firmware_id)
+			*firmware_id = param[1];
+		return 0;
+	}
 
-	if (firmware_id)
-		*firmware_id = param[1];
-
-	return 0;
+	return -ENODEV;
 }
 
 /*
@@ -285,7 +315,7 @@ static int trackpoint_sync(struct psmouse *psmouse, bool in_power_on_state)
 {
 	struct trackpoint_data *tp = psmouse->private;
 
-	if (!in_power_on_state) {
+	if (!in_power_on_state && tp->variant_id == TP_VARIANT_IBM) {
 		/*
 		 * Disable features that may make device unusable
 		 * with this driver.
@@ -347,7 +377,8 @@ static void trackpoint_defaults(struct trackpoint_data *tp)
 
 static void trackpoint_disconnect(struct psmouse *psmouse)
 {
-	sysfs_remove_group(&psmouse->ps2dev.serio->dev.kobj, &trackpoint_attr_group);
+	device_remove_group(&psmouse->ps2dev.serio->dev,
+			    &trackpoint_attr_group);
 
 	kfree(psmouse->private);
 	psmouse->private = NULL;
@@ -355,14 +386,20 @@ static void trackpoint_disconnect(struct psmouse *psmouse)
 
 static int trackpoint_reconnect(struct psmouse *psmouse)
 {
-	int reset_fail;
+	struct trackpoint_data *tp = psmouse->private;
+	int error;
+	bool was_reset;
 
-	if (trackpoint_start_protocol(psmouse, NULL))
-		return -1;
+	error = trackpoint_start_protocol(psmouse, NULL, NULL);
+	if (error)
+		return error;
 
-	reset_fail = trackpoint_power_on_reset(&psmouse->ps2dev);
-	if (trackpoint_sync(psmouse, !reset_fail))
-		return -1;
+	was_reset = tp->variant_id == TP_VARIANT_IBM &&
+		    trackpoint_power_on_reset(&psmouse->ps2dev) == 0;
+
+	error = trackpoint_sync(psmouse, was_reset);
+	if (error)
+		return error;
 
 	return 0;
 }
@@ -370,46 +407,66 @@ static int trackpoint_reconnect(struct psmouse *psmouse)
 int trackpoint_detect(struct psmouse *psmouse, bool set_properties)
 {
 	struct ps2dev *ps2dev = &psmouse->ps2dev;
-	unsigned char firmware_id;
-	unsigned char button_info;
+	struct trackpoint_data *tp;
+	u8 variant_id;
+	u8 firmware_id;
+	u8 button_info;
 	int error;
 
-	if (trackpoint_start_protocol(psmouse, &firmware_id))
-		return -1;
+	error = trackpoint_start_protocol(psmouse, &variant_id, &firmware_id);
+	if (error)
+		return error;
 
 	if (!set_properties)
 		return 0;
 
-	if (trackpoint_read(ps2dev, TP_EXT_BTN, &button_info)) {
-		psmouse_warn(psmouse, "failed to get extended button data, assuming 3 buttons\n");
-		button_info = 0x33;
-	}
-
-	psmouse->private = kzalloc(sizeof(struct trackpoint_data), GFP_KERNEL);
-	if (!psmouse->private)
+	tp = kzalloc(sizeof(*tp), GFP_KERNEL);
+	if (!tp)
 		return -ENOMEM;
 
-	psmouse->vendor = "IBM";
+	trackpoint_defaults(tp);
+	tp->variant_id = variant_id;
+	tp->firmware_id = firmware_id;
+
+	psmouse->private = tp;
+
+	psmouse->vendor = trackpoint_variants[variant_id];
 	psmouse->name = "TrackPoint";
 
 	psmouse->reconnect = trackpoint_reconnect;
 	psmouse->disconnect = trackpoint_disconnect;
 
+	if (variant_id != TP_VARIANT_IBM) {
+		/* Newer variants do not support extended button query. */
+		button_info = 0x33;
+	} else {
+		error = trackpoint_read(ps2dev, TP_EXT_BTN, &button_info);
+		if (error) {
+			psmouse_warn(psmouse,
+				     "failed to get extended button data, assuming 3 buttons\n");
+			button_info = 0x33;
+		} else if (!button_info) {
+			psmouse_warn(psmouse,
+				     "got 0 in extended button data, assuming 3 buttons\n");
+			button_info = 0x33;
+		}
+	}
+
 	if ((button_info & 0x0f) >= 3)
-		__set_bit(BTN_MIDDLE, psmouse->dev->keybit);
+		input_set_capability(psmouse->dev, EV_KEY, BTN_MIDDLE);
 
 	__set_bit(INPUT_PROP_POINTER, psmouse->dev->propbit);
 	__set_bit(INPUT_PROP_POINTING_STICK, psmouse->dev->propbit);
 
-	trackpoint_defaults(psmouse->private);
-
-	error = trackpoint_power_on_reset(ps2dev);
-
-	/* Write defaults to TP only if reset fails. */
-	if (error)
+	if (variant_id != TP_VARIANT_IBM ||
+	    trackpoint_power_on_reset(ps2dev) != 0) {
+		/*
+		 * Write defaults to TP if we did not reset the trackpoint.
+		 */
 		trackpoint_sync(psmouse, false);
+	}
 
-	error = sysfs_create_group(&ps2dev->serio->dev.kobj, &trackpoint_attr_group);
+	error = device_add_group(&ps2dev->serio->dev, &trackpoint_attr_group);
 	if (error) {
 		psmouse_err(psmouse,
 			    "failed to create sysfs attributes, error: %d\n",
@@ -420,8 +477,8 @@ int trackpoint_detect(struct psmouse *psmouse, bool set_properties)
 	}
 
 	psmouse_info(psmouse,
-		     "IBM TrackPoint firmware: 0x%02x, buttons: %d/%d\n",
-		     firmware_id,
+		     "%s TrackPoint firmware: 0x%02x, buttons: %d/%d\n",
+		     psmouse->vendor, firmware_id,
 		     (button_info & 0xf0) >> 4, button_info & 0x0f);
 
 	return 0;
diff --git a/drivers/input/mouse/trackpoint.h b/drivers/input/mouse/trackpoint.h
index 8805575..10a0391 100644
--- a/drivers/input/mouse/trackpoint.h
+++ b/drivers/input/mouse/trackpoint.h
@@ -21,10 +21,16 @@
 #define TP_COMMAND		0xE2	/* Commands start with this */
 
 #define TP_READ_ID		0xE1	/* Sent for device identification */
-#define TP_MAGIC_IDENT		0x03	/* Sent after a TP_READ_ID followed */
-					/* by the firmware ID */
-					/* Firmware ID includes 0x1, 0x2, 0x3 */
 
+/*
+ * Valid first byte responses to the "Read Secondary ID" (0xE1) command.
+ * 0x01 was the original IBM trackpoint, others implement very limited
+ * subset of trackpoint features.
+ */
+#define TP_VARIANT_IBM		0x01
+#define TP_VARIANT_ALPS		0x02
+#define TP_VARIANT_ELAN		0x03
+#define TP_VARIANT_NXP		0x04
 
 /*
  * Commands
@@ -136,18 +142,20 @@
 
 #define MAKE_PS2_CMD(params, results, cmd) ((params<<12) | (results<<8) | (cmd))
 
-struct trackpoint_data
-{
-	unsigned char sensitivity, speed, inertia, reach;
-	unsigned char draghys, mindrag;
-	unsigned char thresh, upthresh;
-	unsigned char ztime, jenks;
-	unsigned char drift_time;
+struct trackpoint_data {
+	u8 variant_id;
+	u8 firmware_id;
+
+	u8 sensitivity, speed, inertia, reach;
+	u8 draghys, mindrag;
+	u8 thresh, upthresh;
+	u8 ztime, jenks;
+	u8 drift_time;
 
 	/* toggles */
-	unsigned char press_to_select;
-	unsigned char skipback;
-	unsigned char ext_dev;
+	bool press_to_select;
+	bool skipback;
+	bool ext_dev;
 };
 
 #ifdef CONFIG_MOUSE_PS2_TRACKPOINT
diff --git a/drivers/input/rmi4/rmi_driver.c b/drivers/input/rmi4/rmi_driver.c
index 4f2bb59..141ea22 100644
--- a/drivers/input/rmi4/rmi_driver.c
+++ b/drivers/input/rmi4/rmi_driver.c
@@ -230,8 +230,10 @@ static irqreturn_t rmi_irq_fn(int irq, void *dev_id)
 		rmi_dbg(RMI_DEBUG_CORE, &rmi_dev->dev,
 			"Failed to process interrupt request: %d\n", ret);
 
-	if (count)
+	if (count) {
 		kfree(attn_data.data);
+		attn_data.data = NULL;
+	}
 
 	if (!kfifo_is_empty(&drvdata->attn_fifo))
 		return rmi_irq_fn(irq, dev_id);
diff --git a/drivers/input/rmi4/rmi_f01.c b/drivers/input/rmi4/rmi_f01.c
index ae966e3..8a07ae1 100644
--- a/drivers/input/rmi4/rmi_f01.c
+++ b/drivers/input/rmi4/rmi_f01.c
@@ -570,14 +570,19 @@ static int rmi_f01_probe(struct rmi_function *fn)
 
 	dev_set_drvdata(&fn->dev, f01);
 
-	error = devm_device_add_group(&fn->rmi_dev->dev, &rmi_f01_attr_group);
+	error = sysfs_create_group(&fn->rmi_dev->dev.kobj, &rmi_f01_attr_group);
 	if (error)
-		dev_warn(&fn->dev,
-			 "Failed to create attribute group: %d\n", error);
+		dev_warn(&fn->dev, "Failed to create sysfs group: %d\n", error);
 
 	return 0;
 }
 
+static void rmi_f01_remove(struct rmi_function *fn)
+{
+	/* Note that the bus device is used, not the F01 device */
+	sysfs_remove_group(&fn->rmi_dev->dev.kobj, &rmi_f01_attr_group);
+}
+
 static int rmi_f01_config(struct rmi_function *fn)
 {
 	struct f01_data *f01 = dev_get_drvdata(&fn->dev);
@@ -717,6 +722,7 @@ struct rmi_function_handler rmi_f01_handler = {
 	},
 	.func		= 0x01,
 	.probe		= rmi_f01_probe,
+	.remove		= rmi_f01_remove,
 	.config		= rmi_f01_config,
 	.attention	= rmi_f01_attention,
 	.suspend	= rmi_f01_suspend,
diff --git a/drivers/input/touchscreen/88pm860x-ts.c b/drivers/input/touchscreen/88pm860x-ts.c
index 7ed828a..3486d94 100644
--- a/drivers/input/touchscreen/88pm860x-ts.c
+++ b/drivers/input/touchscreen/88pm860x-ts.c
@@ -126,7 +126,7 @@ static int pm860x_touch_dt_init(struct platform_device *pdev,
 	int data, n, ret;
 	if (!np)
 		return -ENODEV;
-	np = of_find_node_by_name(np, "touch");
+	np = of_get_child_by_name(np, "touch");
 	if (!np) {
 		dev_err(&pdev->dev, "Can't find touch node\n");
 		return -EINVAL;
@@ -144,13 +144,13 @@ static int pm860x_touch_dt_init(struct platform_device *pdev,
 	if (data) {
 		ret = pm860x_reg_write(i2c, PM8607_GPADC_MISC1, data);
 		if (ret < 0)
-			return -EINVAL;
+			goto err_put_node;
 	}
 	/* set tsi prebias time */
 	if (!of_property_read_u32(np, "marvell,88pm860x-tsi-prebias", &data)) {
 		ret = pm860x_reg_write(i2c, PM8607_TSI_PREBIAS, data);
 		if (ret < 0)
-			return -EINVAL;
+			goto err_put_node;
 	}
 	/* set prebias & prechg time of pen detect */
 	data = 0;
@@ -161,10 +161,18 @@ static int pm860x_touch_dt_init(struct platform_device *pdev,
 	if (data) {
 		ret = pm860x_reg_write(i2c, PM8607_PD_PREBIAS, data);
 		if (ret < 0)
-			return -EINVAL;
+			goto err_put_node;
 	}
 	of_property_read_u32(np, "marvell,88pm860x-resistor-X", res_x);
+
+	of_node_put(np);
+
 	return 0;
+
+err_put_node:
+	of_node_put(np);
+
+	return -EINVAL;
 }
 #else
 #define pm860x_touch_dt_init(x, y, z)	(-1)
diff --git a/drivers/input/touchscreen/elants_i2c.c b/drivers/input/touchscreen/elants_i2c.c
index e102d776..a458e5e 100644
--- a/drivers/input/touchscreen/elants_i2c.c
+++ b/drivers/input/touchscreen/elants_i2c.c
@@ -27,6 +27,7 @@
 #include <linux/module.h>
 #include <linux/input.h>
 #include <linux/interrupt.h>
+#include <linux/irq.h>
 #include <linux/platform_device.h>
 #include <linux/async.h>
 #include <linux/i2c.h>
@@ -1261,10 +1262,13 @@ static int elants_i2c_probe(struct i2c_client *client,
 	}
 
 	/*
-	 * Systems using device tree should set up interrupt via DTS,
-	 * the rest will use the default falling edge interrupts.
+	 * Platform code (ACPI, DTS) should normally set up interrupt
+	 * for us, but in case it did not let's fall back to using falling
+	 * edge to be compatible with older Chromebooks.
 	 */
-	irqflags = client->dev.of_node ? 0 : IRQF_TRIGGER_FALLING;
+	irqflags = irq_get_trigger_type(client->irq);
+	if (!irqflags)
+		irqflags = IRQF_TRIGGER_FALLING;
 
 	error = devm_request_threaded_irq(&client->dev, client->irq,
 					  NULL, elants_i2c_irq,
diff --git a/drivers/input/touchscreen/hideep.c b/drivers/input/touchscreen/hideep.c
index fc080a7..f1cd4dd 100644
--- a/drivers/input/touchscreen/hideep.c
+++ b/drivers/input/touchscreen/hideep.c
@@ -10,8 +10,7 @@
 #include <linux/of.h>
 #include <linux/firmware.h>
 #include <linux/delay.h>
-#include <linux/gpio.h>
-#include <linux/gpio/machine.h>
+#include <linux/gpio/consumer.h>
 #include <linux/i2c.h>
 #include <linux/acpi.h>
 #include <linux/interrupt.h>
diff --git a/drivers/input/touchscreen/of_touchscreen.c b/drivers/input/touchscreen/of_touchscreen.c
index 8d7f9c8..9642f10 100644
--- a/drivers/input/touchscreen/of_touchscreen.c
+++ b/drivers/input/touchscreen/of_touchscreen.c
@@ -13,6 +13,7 @@
 #include <linux/input.h>
 #include <linux/input/mt.h>
 #include <linux/input/touchscreen.h>
+#include <linux/module.h>
 
 static bool touchscreen_get_prop_u32(struct device *dev,
 				     const char *property,
@@ -185,3 +186,6 @@ void touchscreen_report_pos(struct input_dev *input,
 	input_report_abs(input, multitouch ? ABS_MT_POSITION_Y : ABS_Y, y);
 }
 EXPORT_SYMBOL(touchscreen_report_pos);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("Device-tree helpers functions for touchscreen devices");
diff --git a/drivers/input/touchscreen/s6sy761.c b/drivers/input/touchscreen/s6sy761.c
index 26b1cb8..675efa9 100644
--- a/drivers/input/touchscreen/s6sy761.c
+++ b/drivers/input/touchscreen/s6sy761.c
@@ -1,13 +1,8 @@
-/*
- * Copyright (c) 2017 Samsung Electronics Co., Ltd.
- * Author: Andi Shyti <andi.shyti@samsung.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * Samsung S6SY761 Touchscreen device driver
- */
+// SPDX-License-Identifier: GPL-2.0
+// Samsung S6SY761 Touchscreen device driver
+//
+// Copyright (c) 2017 Samsung Electronics Co., Ltd.
+// Copyright (c) 2017 Andi Shyti <andi.shyti@samsung.com>
 
 #include <asm/unaligned.h>
 #include <linux/delay.h>
diff --git a/drivers/input/touchscreen/stmfts.c b/drivers/input/touchscreen/stmfts.c
index c12d018..2a123e2 100644
--- a/drivers/input/touchscreen/stmfts.c
+++ b/drivers/input/touchscreen/stmfts.c
@@ -1,13 +1,8 @@
-/*
- * Copyright (c) 2017 Samsung Electronics Co., Ltd.
- * Author: Andi Shyti <andi.shyti@samsung.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * STMicroelectronics FTS Touchscreen device driver
- */
+// SPDX-License-Identifier: GPL-2.0
+// STMicroelectronics FTS Touchscreen device driver
+//
+// Copyright (c) 2017 Samsung Electronics Co., Ltd.
+// Copyright (c) 2017 Andi Shyti <andi.shyti@samsung.com>
 
 #include <linux/delay.h>
 #include <linux/i2c.h>
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index f122071..744592d 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -1698,13 +1698,15 @@ static int arm_smmu_domain_finalise(struct iommu_domain *domain)
 	domain->pgsize_bitmap = pgtbl_cfg.pgsize_bitmap;
 	domain->geometry.aperture_end = (1UL << ias) - 1;
 	domain->geometry.force_aperture = true;
-	smmu_domain->pgtbl_ops = pgtbl_ops;
 
 	ret = finalise_stage_fn(smmu_domain, &pgtbl_cfg);
-	if (ret < 0)
+	if (ret < 0) {
 		free_io_pgtable_ops(pgtbl_ops);
+		return ret;
+	}
 
-	return ret;
+	smmu_domain->pgtbl_ops = pgtbl_ops;
+	return 0;
 }
 
 static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
@@ -1731,7 +1733,7 @@ static __le64 *arm_smmu_get_step_for_sid(struct arm_smmu_device *smmu, u32 sid)
 
 static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 {
-	int i;
+	int i, j;
 	struct arm_smmu_master_data *master = fwspec->iommu_priv;
 	struct arm_smmu_device *smmu = master->smmu;
 
@@ -1739,6 +1741,13 @@ static void arm_smmu_install_ste_for_dev(struct iommu_fwspec *fwspec)
 		u32 sid = fwspec->ids[i];
 		__le64 *step = arm_smmu_get_step_for_sid(smmu, sid);
 
+		/* Bridged PCI devices may end up with duplicated IDs */
+		for (j = 0; j < i; j++)
+			if (fwspec->ids[j] == sid)
+				break;
+		if (j < i)
+			continue;
+
 		arm_smmu_write_strtab_ent(smmu, sid, step, &master->ste);
 	}
 }
diff --git a/drivers/irqchip/Kconfig b/drivers/irqchip/Kconfig
index c70476b..d913aec 100644
--- a/drivers/irqchip/Kconfig
+++ b/drivers/irqchip/Kconfig
@@ -343,4 +343,12 @@
        help
          Support Meson SoC Family GPIO Interrupt Multiplexer
 
+config GOLDFISH_PIC
+       bool "Goldfish programmable interrupt controller"
+       depends on MIPS && (GOLDFISH || COMPILE_TEST)
+       select IRQ_DOMAIN
+       help
+         Say yes here to enable Goldfish interrupt controller driver used
+         for Goldfish based virtual platforms.
+
 endmenu
diff --git a/drivers/irqchip/Makefile b/drivers/irqchip/Makefile
index d2df34a..d27e3e3 100644
--- a/drivers/irqchip/Makefile
+++ b/drivers/irqchip/Makefile
@@ -84,3 +84,4 @@
 obj-$(CONFIG_IRQ_UNIPHIER_AIDET)	+= irq-uniphier-aidet.o
 obj-$(CONFIG_ARCH_SYNQUACER)		+= irq-sni-exiu.o
 obj-$(CONFIG_MESON_IRQ_GPIO)		+= irq-meson-gpio.o
+obj-$(CONFIG_GOLDFISH_PIC) 		+= irq-goldfish-pic.o
diff --git a/drivers/irqchip/irq-bcm2836.c b/drivers/irqchip/irq-bcm2836.c
index 667b9e1..dfe4a46 100644
--- a/drivers/irqchip/irq-bcm2836.c
+++ b/drivers/irqchip/irq-bcm2836.c
@@ -98,13 +98,35 @@ static struct irq_chip bcm2836_arm_irqchip_gpu = {
 	.irq_unmask	= bcm2836_arm_irqchip_unmask_gpu_irq,
 };
 
-static void bcm2836_arm_irqchip_register_irq(int hwirq, struct irq_chip *chip)
+static int bcm2836_map(struct irq_domain *d, unsigned int irq,
+		       irq_hw_number_t hw)
 {
-	int irq = irq_create_mapping(intc.domain, hwirq);
+	struct irq_chip *chip;
+
+	switch (hw) {
+	case LOCAL_IRQ_CNTPSIRQ:
+	case LOCAL_IRQ_CNTPNSIRQ:
+	case LOCAL_IRQ_CNTHPIRQ:
+	case LOCAL_IRQ_CNTVIRQ:
+		chip = &bcm2836_arm_irqchip_timer;
+		break;
+	case LOCAL_IRQ_GPU_FAST:
+		chip = &bcm2836_arm_irqchip_gpu;
+		break;
+	case LOCAL_IRQ_PMU_FAST:
+		chip = &bcm2836_arm_irqchip_pmu;
+		break;
+	default:
+		pr_warn_once("Unexpected hw irq: %lu\n", hw);
+		return -EINVAL;
+	}
 
 	irq_set_percpu_devid(irq);
-	irq_set_chip_and_handler(irq, chip, handle_percpu_devid_irq);
+	irq_domain_set_info(d, irq, hw, chip, d->host_data,
+			    handle_percpu_devid_irq, NULL, NULL);
 	irq_set_status_flags(irq, IRQ_NOAUTOEN);
+
+	return 0;
 }
 
 static void
@@ -165,7 +187,8 @@ static int bcm2836_cpu_dying(unsigned int cpu)
 #endif
 
 static const struct irq_domain_ops bcm2836_arm_irqchip_intc_ops = {
-	.xlate = irq_domain_xlate_onecell
+	.xlate = irq_domain_xlate_onetwocell,
+	.map = bcm2836_map,
 };
 
 static void
@@ -218,19 +241,6 @@ static int __init bcm2836_arm_irqchip_l1_intc_of_init(struct device_node *node,
 	if (!intc.domain)
 		panic("%pOF: unable to create IRQ domain\n", node);
 
-	bcm2836_arm_irqchip_register_irq(LOCAL_IRQ_CNTPSIRQ,
-					 &bcm2836_arm_irqchip_timer);
-	bcm2836_arm_irqchip_register_irq(LOCAL_IRQ_CNTPNSIRQ,
-					 &bcm2836_arm_irqchip_timer);
-	bcm2836_arm_irqchip_register_irq(LOCAL_IRQ_CNTHPIRQ,
-					 &bcm2836_arm_irqchip_timer);
-	bcm2836_arm_irqchip_register_irq(LOCAL_IRQ_CNTVIRQ,
-					 &bcm2836_arm_irqchip_timer);
-	bcm2836_arm_irqchip_register_irq(LOCAL_IRQ_GPU_FAST,
-					 &bcm2836_arm_irqchip_gpu);
-	bcm2836_arm_irqchip_register_irq(LOCAL_IRQ_PMU_FAST,
-					 &bcm2836_arm_irqchip_pmu);
-
 	bcm2836_arm_irqchip_smp_init();
 
 	set_handle_irq(bcm2836_arm_irqchip_handle_irq);
diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index b56c3e2..a57c0fb 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1070,31 +1070,6 @@ static int __init gic_validate_dist_version(void __iomem *dist_base)
 	return 0;
 }
 
-static int get_cpu_number(struct device_node *dn)
-{
-	const __be32 *cell;
-	u64 hwid;
-	int cpu;
-
-	cell = of_get_property(dn, "reg", NULL);
-	if (!cell)
-		return -1;
-
-	hwid = of_read_number(cell, of_n_addr_cells(dn));
-
-	/*
-	 * Non affinity bits must be set to 0 in the DT
-	 */
-	if (hwid & ~MPIDR_HWID_BITMASK)
-		return -1;
-
-	for_each_possible_cpu(cpu)
-		if (cpu_logical_map(cpu) == hwid)
-			return cpu;
-
-	return -1;
-}
-
 /* Create all possible partitions at boot time */
 static void __init gic_populate_ppi_partitions(struct device_node *gic_node)
 {
@@ -1145,8 +1120,8 @@ static void __init gic_populate_ppi_partitions(struct device_node *gic_node)
 			if (WARN_ON(!cpu_node))
 				continue;
 
-			cpu = get_cpu_number(cpu_node);
-			if (WARN_ON(cpu == -1))
+			cpu = of_cpu_node_to_id(cpu_node);
+			if (WARN_ON(cpu < 0))
 				continue;
 
 			pr_cont("%pOF[%d] ", cpu_node, cpu);
@@ -1331,6 +1306,10 @@ gic_acpi_parse_madt_gicc(struct acpi_subtable_header *header,
 	u32 size = reg == GIC_PIDR2_ARCH_GICv4 ? SZ_64K * 4 : SZ_64K * 2;
 	void __iomem *redist_base;
 
+	/* GICC entry which has !ACPI_MADT_ENABLED is not unusable so skip */
+	if (!(gicc->flags & ACPI_MADT_ENABLED))
+		return 0;
+
 	redist_base = ioremap(gicc->gicr_base_address, size);
 	if (!redist_base)
 		return -ENOMEM;
@@ -1380,6 +1359,13 @@ static int __init gic_acpi_match_gicc(struct acpi_subtable_header *header,
 	if ((gicc->flags & ACPI_MADT_ENABLED) && gicc->gicr_base_address)
 		return 0;
 
+	/*
+	 * It's perfectly valid firmware can pass disabled GICC entry, driver
+	 * should not treat as errors, skip the entry instead of probe fail.
+	 */
+	if (!(gicc->flags & ACPI_MADT_ENABLED))
+		return 0;
+
 	return -ENODEV;
 }
 
diff --git a/drivers/irqchip/irq-goldfish-pic.c b/drivers/irqchip/irq-goldfish-pic.c
new file mode 100644
index 0000000..2a92f03
--- /dev/null
+++ b/drivers/irqchip/irq-goldfish-pic.c
@@ -0,0 +1,139 @@
+/*
+ * Driver for MIPS Goldfish Programmable Interrupt Controller.
+ *
+ * Author: Miodrag Dinic <miodrag.dinic@mips.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2 of the License, or (at your
+ * option) any later version.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/irqchip.h>
+#include <linux/irqchip/chained_irq.h>
+#include <linux/irqdomain.h>
+#include <linux/of_address.h>
+#include <linux/of_irq.h>
+
+#define GFPIC_NR_IRQS			32
+
+/* 8..39 Cascaded Goldfish PIC interrupts */
+#define GFPIC_IRQ_BASE			8
+
+#define GFPIC_REG_IRQ_PENDING		0x04
+#define GFPIC_REG_IRQ_DISABLE_ALL	0x08
+#define GFPIC_REG_IRQ_DISABLE		0x0c
+#define GFPIC_REG_IRQ_ENABLE		0x10
+
+struct goldfish_pic_data {
+	void __iomem *base;
+	struct irq_domain *irq_domain;
+};
+
+static void goldfish_pic_cascade(struct irq_desc *desc)
+{
+	struct goldfish_pic_data *gfpic = irq_desc_get_handler_data(desc);
+	struct irq_chip *host_chip = irq_desc_get_chip(desc);
+	u32 pending, hwirq, virq;
+
+	chained_irq_enter(host_chip, desc);
+
+	pending = readl(gfpic->base + GFPIC_REG_IRQ_PENDING);
+	while (pending) {
+		hwirq = __fls(pending);
+		virq = irq_linear_revmap(gfpic->irq_domain, hwirq);
+		generic_handle_irq(virq);
+		pending &= ~(1 << hwirq);
+	}
+
+	chained_irq_exit(host_chip, desc);
+}
+
+static const struct irq_domain_ops goldfish_irq_domain_ops = {
+	.xlate = irq_domain_xlate_onecell,
+};
+
+static int __init goldfish_pic_of_init(struct device_node *of_node,
+				       struct device_node *parent)
+{
+	struct goldfish_pic_data *gfpic;
+	struct irq_chip_generic *gc;
+	struct irq_chip_type *ct;
+	unsigned int parent_irq;
+	int ret = 0;
+
+	gfpic = kzalloc(sizeof(*gfpic), GFP_KERNEL);
+	if (!gfpic) {
+		ret = -ENOMEM;
+		goto out_err;
+	}
+
+	parent_irq = irq_of_parse_and_map(of_node, 0);
+	if (!parent_irq) {
+		pr_err("Failed to map parent IRQ!\n");
+		ret = -EINVAL;
+		goto out_free;
+	}
+
+	gfpic->base = of_iomap(of_node, 0);
+	if (!gfpic->base) {
+		pr_err("Failed to map base address!\n");
+		ret = -ENOMEM;
+		goto out_unmap_irq;
+	}
+
+	/* Mask interrupts. */
+	writel(1, gfpic->base + GFPIC_REG_IRQ_DISABLE_ALL);
+
+	gc = irq_alloc_generic_chip("GFPIC", 1, GFPIC_IRQ_BASE, gfpic->base,
+				    handle_level_irq);
+	if (!gc) {
+		pr_err("Failed to allocate chip structures!\n");
+		ret = -ENOMEM;
+		goto out_iounmap;
+	}
+
+	ct = gc->chip_types;
+	ct->regs.enable = GFPIC_REG_IRQ_ENABLE;
+	ct->regs.disable = GFPIC_REG_IRQ_DISABLE;
+	ct->chip.irq_unmask = irq_gc_unmask_enable_reg;
+	ct->chip.irq_mask = irq_gc_mask_disable_reg;
+
+	irq_setup_generic_chip(gc, IRQ_MSK(GFPIC_NR_IRQS), 0,
+			       IRQ_NOPROBE | IRQ_LEVEL, 0);
+
+	gfpic->irq_domain = irq_domain_add_legacy(of_node, GFPIC_NR_IRQS,
+						  GFPIC_IRQ_BASE, 0,
+						  &goldfish_irq_domain_ops,
+						  NULL);
+	if (!gfpic->irq_domain) {
+		pr_err("Failed to add irqdomain!\n");
+		ret = -ENOMEM;
+		goto out_destroy_generic_chip;
+	}
+
+	irq_set_chained_handler_and_data(parent_irq,
+					 goldfish_pic_cascade, gfpic);
+
+	pr_info("Successfully registered.\n");
+	return 0;
+
+out_destroy_generic_chip:
+	irq_destroy_generic_chip(gc, IRQ_MSK(GFPIC_NR_IRQS),
+				 IRQ_NOPROBE | IRQ_LEVEL, 0);
+out_iounmap:
+	iounmap(gfpic->base);
+out_unmap_irq:
+	irq_dispose_mapping(parent_irq);
+out_free:
+	kfree(gfpic);
+out_err:
+	pr_err("Failed to initialize! (errno = %d)\n", ret);
+	return ret;
+}
+
+IRQCHIP_DECLARE(google_gf_pic, "google,goldfish-pic", goldfish_pic_of_init);
diff --git a/drivers/irqchip/irq-ompic.c b/drivers/irqchip/irq-ompic.c
index cf6d0c4..e66ef43 100644
--- a/drivers/irqchip/irq-ompic.c
+++ b/drivers/irqchip/irq-ompic.c
@@ -171,9 +171,9 @@ static int __init ompic_of_init(struct device_node *node,
 
 	/* Setup the device */
 	ompic_base = ioremap(res.start, resource_size(&res));
-	if (IS_ERR(ompic_base)) {
+	if (!ompic_base) {
 		pr_err("ompic: unable to map registers");
-		return PTR_ERR(ompic_base);
+		return -ENOMEM;
 	}
 
 	irq = irq_of_parse_and_map(node, 0);
diff --git a/drivers/leds/led-core.c b/drivers/leds/led-core.c
index f3654fd..ede4fa0 100644
--- a/drivers/leds/led-core.c
+++ b/drivers/leds/led-core.c
@@ -186,8 +186,9 @@ void led_blink_set(struct led_classdev *led_cdev,
 		   unsigned long *delay_on,
 		   unsigned long *delay_off)
 {
-	led_stop_software_blink(led_cdev);
+	del_timer_sync(&led_cdev->blink_timer);
 
+	clear_bit(LED_BLINK_SW, &led_cdev->work_flags);
 	clear_bit(LED_BLINK_ONESHOT, &led_cdev->work_flags);
 	clear_bit(LED_BLINK_ONESHOT_STOP, &led_cdev->work_flags);
 
diff --git a/drivers/leds/leds-pm8058.c b/drivers/leds/leds-pm8058.c
index a526743..8988ba3 100644
--- a/drivers/leds/leds-pm8058.c
+++ b/drivers/leds/leds-pm8058.c
@@ -106,7 +106,7 @@ static int pm8058_led_probe(struct platform_device *pdev)
 	if (!led)
 		return -ENOMEM;
 
-	led->ledtype = (u32)of_device_get_match_data(&pdev->dev);
+	led->ledtype = (u32)(unsigned long)of_device_get_match_data(&pdev->dev);
 
 	map = dev_get_regmap(pdev->dev.parent, NULL);
 	if (!map) {
diff --git a/drivers/lightnvm/Kconfig b/drivers/lightnvm/Kconfig
index 2a953efe..10c0898 100644
--- a/drivers/lightnvm/Kconfig
+++ b/drivers/lightnvm/Kconfig
@@ -27,13 +27,6 @@
 
 	It is required to create/remove targets without IOCTLs.
 
-config NVM_RRPC
-	tristate "Round-robin Hybrid Open-Channel SSD target"
-	---help---
-	Allows an open-channel SSD to be exposed as a block device to the
-	host. The target is implemented using a linear mapping table and
-	cost-based garbage collection. It is optimized for 4K IO sizes.
-
 config NVM_PBLK
 	tristate "Physical Block Device Open-Channel SSD target"
 	---help---
diff --git a/drivers/lightnvm/Makefile b/drivers/lightnvm/Makefile
index 2c3fd9d..97d9d7c 100644
--- a/drivers/lightnvm/Makefile
+++ b/drivers/lightnvm/Makefile
@@ -4,7 +4,6 @@
 #
 
 obj-$(CONFIG_NVM)		:= core.o
-obj-$(CONFIG_NVM_RRPC)		+= rrpc.o
 obj-$(CONFIG_NVM_PBLK)		+= pblk.o
 pblk-y				:= pblk-init.o pblk-core.o pblk-rb.o \
 				   pblk-write.o pblk-cache.o pblk-read.o \
diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index 83249b4..dcc9e62 100644
--- a/drivers/lightnvm/core.c
+++ b/drivers/lightnvm/core.c
@@ -45,12 +45,6 @@ struct nvm_dev_map {
 	int nr_chnls;
 };
 
-struct nvm_area {
-	struct list_head list;
-	sector_t begin;
-	sector_t end;	/* end is excluded */
-};
-
 static struct nvm_target *nvm_find_target(struct nvm_dev *dev, const char *name)
 {
 	struct nvm_target *tgt;
@@ -62,6 +56,30 @@ static struct nvm_target *nvm_find_target(struct nvm_dev *dev, const char *name)
 	return NULL;
 }
 
+static bool nvm_target_exists(const char *name)
+{
+	struct nvm_dev *dev;
+	struct nvm_target *tgt;
+	bool ret = false;
+
+	down_write(&nvm_lock);
+	list_for_each_entry(dev, &nvm_devices, devices) {
+		mutex_lock(&dev->mlock);
+		list_for_each_entry(tgt, &dev->targets, list) {
+			if (!strcmp(name, tgt->disk->disk_name)) {
+				ret = true;
+				mutex_unlock(&dev->mlock);
+				goto out;
+			}
+		}
+		mutex_unlock(&dev->mlock);
+	}
+
+out:
+	up_write(&nvm_lock);
+	return ret;
+}
+
 static int nvm_reserve_luns(struct nvm_dev *dev, int lun_begin, int lun_end)
 {
 	int i;
@@ -104,7 +122,7 @@ static void nvm_remove_tgt_dev(struct nvm_tgt_dev *tgt_dev, int clear)
 		if (clear) {
 			for (j = 0; j < ch_map->nr_luns; j++) {
 				int lun = j + lun_offs[j];
-				int lunid = (ch * dev->geo.luns_per_chnl) + lun;
+				int lunid = (ch * dev->geo.nr_luns) + lun;
 
 				WARN_ON(!test_and_clear_bit(lunid,
 							dev->lun_map));
@@ -122,7 +140,8 @@ static void nvm_remove_tgt_dev(struct nvm_tgt_dev *tgt_dev, int clear)
 }
 
 static struct nvm_tgt_dev *nvm_create_tgt_dev(struct nvm_dev *dev,
-					      int lun_begin, int lun_end)
+					      u16 lun_begin, u16 lun_end,
+					      u16 op)
 {
 	struct nvm_tgt_dev *tgt_dev = NULL;
 	struct nvm_dev_map *dev_rmap = dev->rmap;
@@ -130,10 +149,10 @@ static struct nvm_tgt_dev *nvm_create_tgt_dev(struct nvm_dev *dev,
 	struct ppa_addr *luns;
 	int nr_luns = lun_end - lun_begin + 1;
 	int luns_left = nr_luns;
-	int nr_chnls = nr_luns / dev->geo.luns_per_chnl;
-	int nr_chnls_mod = nr_luns % dev->geo.luns_per_chnl;
-	int bch = lun_begin / dev->geo.luns_per_chnl;
-	int blun = lun_begin % dev->geo.luns_per_chnl;
+	int nr_chnls = nr_luns / dev->geo.nr_luns;
+	int nr_chnls_mod = nr_luns % dev->geo.nr_luns;
+	int bch = lun_begin / dev->geo.nr_luns;
+	int blun = lun_begin % dev->geo.nr_luns;
 	int lunid = 0;
 	int lun_balanced = 1;
 	int prev_nr_luns;
@@ -154,15 +173,15 @@ static struct nvm_tgt_dev *nvm_create_tgt_dev(struct nvm_dev *dev,
 	if (!luns)
 		goto err_luns;
 
-	prev_nr_luns = (luns_left > dev->geo.luns_per_chnl) ?
-					dev->geo.luns_per_chnl : luns_left;
+	prev_nr_luns = (luns_left > dev->geo.nr_luns) ?
+					dev->geo.nr_luns : luns_left;
 	for (i = 0; i < nr_chnls; i++) {
 		struct nvm_ch_map *ch_rmap = &dev_rmap->chnls[i + bch];
 		int *lun_roffs = ch_rmap->lun_offs;
 		struct nvm_ch_map *ch_map = &dev_map->chnls[i];
 		int *lun_offs;
-		int luns_in_chnl = (luns_left > dev->geo.luns_per_chnl) ?
-					dev->geo.luns_per_chnl : luns_left;
+		int luns_in_chnl = (luns_left > dev->geo.nr_luns) ?
+					dev->geo.nr_luns : luns_left;
 
 		if (lun_balanced && prev_nr_luns != luns_in_chnl)
 			lun_balanced = 0;
@@ -199,8 +218,9 @@ static struct nvm_tgt_dev *nvm_create_tgt_dev(struct nvm_dev *dev,
 	memcpy(&tgt_dev->geo, &dev->geo, sizeof(struct nvm_geo));
 	/* Target device only owns a portion of the physical device */
 	tgt_dev->geo.nr_chnls = nr_chnls;
-	tgt_dev->geo.nr_luns = nr_luns;
-	tgt_dev->geo.luns_per_chnl = (lun_balanced) ? prev_nr_luns : -1;
+	tgt_dev->geo.all_luns = nr_luns;
+	tgt_dev->geo.nr_luns = (lun_balanced) ? prev_nr_luns : -1;
+	tgt_dev->geo.op = op;
 	tgt_dev->total_secs = nr_luns * tgt_dev->geo.sec_per_lun;
 	tgt_dev->q = dev->q;
 	tgt_dev->map = dev_map;
@@ -226,27 +246,79 @@ static const struct block_device_operations nvm_fops = {
 	.owner		= THIS_MODULE,
 };
 
-static struct nvm_tgt_type *nvm_find_target_type(const char *name, int lock)
+static struct nvm_tgt_type *__nvm_find_target_type(const char *name)
 {
-	struct nvm_tgt_type *tmp, *tt = NULL;
+	struct nvm_tgt_type *tt;
 
-	if (lock)
-		down_write(&nvm_tgtt_lock);
+	list_for_each_entry(tt, &nvm_tgt_types, list)
+		if (!strcmp(name, tt->name))
+			return tt;
 
-	list_for_each_entry(tmp, &nvm_tgt_types, list)
-		if (!strcmp(name, tmp->name)) {
-			tt = tmp;
-			break;
-		}
+	return NULL;
+}
 
-	if (lock)
-		up_write(&nvm_tgtt_lock);
+static struct nvm_tgt_type *nvm_find_target_type(const char *name)
+{
+	struct nvm_tgt_type *tt;
+
+	down_write(&nvm_tgtt_lock);
+	tt = __nvm_find_target_type(name);
+	up_write(&nvm_tgtt_lock);
+
 	return tt;
 }
 
+static int nvm_config_check_luns(struct nvm_geo *geo, int lun_begin,
+				 int lun_end)
+{
+	if (lun_begin > lun_end || lun_end >= geo->all_luns) {
+		pr_err("nvm: lun out of bound (%u:%u > %u)\n",
+			lun_begin, lun_end, geo->all_luns - 1);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int __nvm_config_simple(struct nvm_dev *dev,
+			       struct nvm_ioctl_create_simple *s)
+{
+	struct nvm_geo *geo = &dev->geo;
+
+	if (s->lun_begin == -1 && s->lun_end == -1) {
+		s->lun_begin = 0;
+		s->lun_end = geo->all_luns - 1;
+	}
+
+	return nvm_config_check_luns(geo, s->lun_begin, s->lun_end);
+}
+
+static int __nvm_config_extended(struct nvm_dev *dev,
+				 struct nvm_ioctl_create_extended *e)
+{
+	struct nvm_geo *geo = &dev->geo;
+
+	if (e->lun_begin == 0xFFFF && e->lun_end == 0xFFFF) {
+		e->lun_begin = 0;
+		e->lun_end = dev->geo.all_luns - 1;
+	}
+
+	/* op not set falls into target's default */
+	if (e->op == 0xFFFF)
+		e->op = NVM_TARGET_DEFAULT_OP;
+
+	if (e->op < NVM_TARGET_MIN_OP ||
+	    e->op > NVM_TARGET_MAX_OP) {
+		pr_err("nvm: invalid over provisioning value\n");
+		return -EINVAL;
+	}
+
+	return nvm_config_check_luns(geo, e->lun_begin, e->lun_end);
+}
+
 static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create)
 {
-	struct nvm_ioctl_create_simple *s = &create->conf.s;
+	struct nvm_ioctl_create_extended e;
 	struct request_queue *tqueue;
 	struct gendisk *tdisk;
 	struct nvm_tgt_type *tt;
@@ -255,22 +327,41 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create)
 	void *targetdata;
 	int ret;
 
-	tt = nvm_find_target_type(create->tgttype, 1);
+	switch (create->conf.type) {
+	case NVM_CONFIG_TYPE_SIMPLE:
+		ret = __nvm_config_simple(dev, &create->conf.s);
+		if (ret)
+			return ret;
+
+		e.lun_begin = create->conf.s.lun_begin;
+		e.lun_end = create->conf.s.lun_end;
+		e.op = NVM_TARGET_DEFAULT_OP;
+		break;
+	case NVM_CONFIG_TYPE_EXTENDED:
+		ret = __nvm_config_extended(dev, &create->conf.e);
+		if (ret)
+			return ret;
+
+		e = create->conf.e;
+		break;
+	default:
+		pr_err("nvm: config type not valid\n");
+		return -EINVAL;
+	}
+
+	tt = nvm_find_target_type(create->tgttype);
 	if (!tt) {
 		pr_err("nvm: target type %s not found\n", create->tgttype);
 		return -EINVAL;
 	}
 
-	mutex_lock(&dev->mlock);
-	t = nvm_find_target(dev, create->tgtname);
-	if (t) {
-		pr_err("nvm: target name already exists.\n");
-		mutex_unlock(&dev->mlock);
+	if (nvm_target_exists(create->tgtname)) {
+		pr_err("nvm: target name already exists (%s)\n",
+							create->tgtname);
 		return -EINVAL;
 	}
-	mutex_unlock(&dev->mlock);
 
-	ret = nvm_reserve_luns(dev, s->lun_begin, s->lun_end);
+	ret = nvm_reserve_luns(dev, e.lun_begin, e.lun_end);
 	if (ret)
 		return ret;
 
@@ -280,7 +371,7 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create)
 		goto err_reserve;
 	}
 
-	tgt_dev = nvm_create_tgt_dev(dev, s->lun_begin, s->lun_end);
+	tgt_dev = nvm_create_tgt_dev(dev, e.lun_begin, e.lun_end, e.op);
 	if (!tgt_dev) {
 		pr_err("nvm: could not create target device\n");
 		ret = -ENOMEM;
@@ -350,7 +441,7 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct nvm_ioctl_create *create)
 err_t:
 	kfree(t);
 err_reserve:
-	nvm_release_luns_err(dev, s->lun_begin, s->lun_end);
+	nvm_release_luns_err(dev, e.lun_begin, e.lun_end);
 	return ret;
 }
 
@@ -420,7 +511,7 @@ static int nvm_register_map(struct nvm_dev *dev)
 	for (i = 0; i < dev->geo.nr_chnls; i++) {
 		struct nvm_ch_map *ch_rmap;
 		int *lun_roffs;
-		int luns_in_chnl = dev->geo.luns_per_chnl;
+		int luns_in_chnl = dev->geo.nr_luns;
 
 		ch_rmap = &rmap->chnls[i];
 
@@ -524,41 +615,12 @@ static void nvm_rq_dev_to_tgt(struct nvm_tgt_dev *tgt_dev, struct nvm_rq *rqd)
 	nvm_ppa_dev_to_tgt(tgt_dev, rqd->ppa_list, rqd->nr_ppas);
 }
 
-void nvm_part_to_tgt(struct nvm_dev *dev, sector_t *entries,
-		     int len)
-{
-	struct nvm_geo *geo = &dev->geo;
-	struct nvm_dev_map *dev_rmap = dev->rmap;
-	u64 i;
-
-	for (i = 0; i < len; i++) {
-		struct nvm_ch_map *ch_rmap;
-		int *lun_roffs;
-		struct ppa_addr gaddr;
-		u64 pba = le64_to_cpu(entries[i]);
-		u64 diff;
-
-		if (!pba)
-			continue;
-
-		gaddr = linear_to_generic_addr(geo, pba);
-		ch_rmap = &dev_rmap->chnls[gaddr.g.ch];
-		lun_roffs = ch_rmap->lun_offs;
-
-		diff = ((ch_rmap->ch_off * geo->luns_per_chnl) +
-				(lun_roffs[gaddr.g.lun])) * geo->sec_per_lun;
-
-		entries[i] -= cpu_to_le64(diff);
-	}
-}
-EXPORT_SYMBOL(nvm_part_to_tgt);
-
 int nvm_register_tgt_type(struct nvm_tgt_type *tt)
 {
 	int ret = 0;
 
 	down_write(&nvm_tgtt_lock);
-	if (nvm_find_target_type(tt->name, 0))
+	if (__nvm_find_target_type(tt->name))
 		ret = -EEXIST;
 	else
 		list_add(&tt->list, &nvm_tgt_types);
@@ -726,112 +788,6 @@ int nvm_submit_io_sync(struct nvm_tgt_dev *tgt_dev, struct nvm_rq *rqd)
 }
 EXPORT_SYMBOL(nvm_submit_io_sync);
 
-int nvm_erase_sync(struct nvm_tgt_dev *tgt_dev, struct ppa_addr *ppas,
-								int nr_ppas)
-{
-	struct nvm_geo *geo = &tgt_dev->geo;
-	struct nvm_rq rqd;
-	int ret;
-
-	memset(&rqd, 0, sizeof(struct nvm_rq));
-
-	rqd.opcode = NVM_OP_ERASE;
-	rqd.flags = geo->plane_mode >> 1;
-
-	ret = nvm_set_rqd_ppalist(tgt_dev, &rqd, ppas, nr_ppas);
-	if (ret)
-		return ret;
-
-	ret = nvm_submit_io_sync(tgt_dev, &rqd);
-	if (ret) {
-		pr_err("rrpr: erase I/O submission failed: %d\n", ret);
-		goto free_ppa_list;
-	}
-
-free_ppa_list:
-	nvm_free_rqd_ppalist(tgt_dev, &rqd);
-
-	return ret;
-}
-EXPORT_SYMBOL(nvm_erase_sync);
-
-int nvm_get_l2p_tbl(struct nvm_tgt_dev *tgt_dev, u64 slba, u32 nlb,
-		    nvm_l2p_update_fn *update_l2p, void *priv)
-{
-	struct nvm_dev *dev = tgt_dev->parent;
-
-	if (!dev->ops->get_l2p_tbl)
-		return 0;
-
-	return dev->ops->get_l2p_tbl(dev, slba, nlb, update_l2p, priv);
-}
-EXPORT_SYMBOL(nvm_get_l2p_tbl);
-
-int nvm_get_area(struct nvm_tgt_dev *tgt_dev, sector_t *lba, sector_t len)
-{
-	struct nvm_dev *dev = tgt_dev->parent;
-	struct nvm_geo *geo = &dev->geo;
-	struct nvm_area *area, *prev, *next;
-	sector_t begin = 0;
-	sector_t max_sectors = (geo->sec_size * dev->total_secs) >> 9;
-
-	if (len > max_sectors)
-		return -EINVAL;
-
-	area = kmalloc(sizeof(struct nvm_area), GFP_KERNEL);
-	if (!area)
-		return -ENOMEM;
-
-	prev = NULL;
-
-	spin_lock(&dev->lock);
-	list_for_each_entry(next, &dev->area_list, list) {
-		if (begin + len > next->begin) {
-			begin = next->end;
-			prev = next;
-			continue;
-		}
-		break;
-	}
-
-	if ((begin + len) > max_sectors) {
-		spin_unlock(&dev->lock);
-		kfree(area);
-		return -EINVAL;
-	}
-
-	area->begin = *lba = begin;
-	area->end = begin + len;
-
-	if (prev) /* insert into sorted order */
-		list_add(&area->list, &prev->list);
-	else
-		list_add(&area->list, &dev->area_list);
-	spin_unlock(&dev->lock);
-
-	return 0;
-}
-EXPORT_SYMBOL(nvm_get_area);
-
-void nvm_put_area(struct nvm_tgt_dev *tgt_dev, sector_t begin)
-{
-	struct nvm_dev *dev = tgt_dev->parent;
-	struct nvm_area *area;
-
-	spin_lock(&dev->lock);
-	list_for_each_entry(area, &dev->area_list, list) {
-		if (area->begin != begin)
-			continue;
-
-		list_del(&area->list);
-		spin_unlock(&dev->lock);
-		kfree(area);
-		return;
-	}
-	spin_unlock(&dev->lock);
-}
-EXPORT_SYMBOL(nvm_put_area);
-
 void nvm_end_io(struct nvm_rq *rqd)
 {
 	struct nvm_tgt_dev *tgt_dev = rqd->dev;
@@ -858,10 +814,10 @@ int nvm_bb_tbl_fold(struct nvm_dev *dev, u8 *blks, int nr_blks)
 	struct nvm_geo *geo = &dev->geo;
 	int blk, offset, pl, blktype;
 
-	if (nr_blks != geo->blks_per_lun * geo->plane_mode)
+	if (nr_blks != geo->nr_chks * geo->plane_mode)
 		return -EINVAL;
 
-	for (blk = 0; blk < geo->blks_per_lun; blk++) {
+	for (blk = 0; blk < geo->nr_chks; blk++) {
 		offset = blk * geo->plane_mode;
 		blktype = blks[offset];
 
@@ -877,7 +833,7 @@ int nvm_bb_tbl_fold(struct nvm_dev *dev, u8 *blks, int nr_blks)
 		blks[blk] = blktype;
 	}
 
-	return geo->blks_per_lun;
+	return geo->nr_chks;
 }
 EXPORT_SYMBOL(nvm_bb_tbl_fold);
 
@@ -892,53 +848,6 @@ int nvm_get_tgt_bb_tbl(struct nvm_tgt_dev *tgt_dev, struct ppa_addr ppa,
 }
 EXPORT_SYMBOL(nvm_get_tgt_bb_tbl);
 
-static int nvm_init_slc_tbl(struct nvm_dev *dev, struct nvm_id_group *grp)
-{
-	struct nvm_geo *geo = &dev->geo;
-	int i;
-
-	dev->lps_per_blk = geo->pgs_per_blk;
-	dev->lptbl = kcalloc(dev->lps_per_blk, sizeof(int), GFP_KERNEL);
-	if (!dev->lptbl)
-		return -ENOMEM;
-
-	/* Just a linear array */
-	for (i = 0; i < dev->lps_per_blk; i++)
-		dev->lptbl[i] = i;
-
-	return 0;
-}
-
-static int nvm_init_mlc_tbl(struct nvm_dev *dev, struct nvm_id_group *grp)
-{
-	int i, p;
-	struct nvm_id_lp_mlc *mlc = &grp->lptbl.mlc;
-
-	if (!mlc->num_pairs)
-		return 0;
-
-	dev->lps_per_blk = mlc->num_pairs;
-	dev->lptbl = kcalloc(dev->lps_per_blk, sizeof(int), GFP_KERNEL);
-	if (!dev->lptbl)
-		return -ENOMEM;
-
-	/* The lower page table encoding consists of a list of bytes, where each
-	 * has a lower and an upper half. The first half byte maintains the
-	 * increment value and every value after is an offset added to the
-	 * previous incrementation value
-	 */
-	dev->lptbl[0] = mlc->pairs[0] & 0xF;
-	for (i = 1; i < dev->lps_per_blk; i++) {
-		p = mlc->pairs[i >> 1];
-		if (i & 0x1) /* upper */
-			dev->lptbl[i] = dev->lptbl[i - 1] + ((p & 0xF0) >> 4);
-		else /* lower */
-			dev->lptbl[i] = dev->lptbl[i - 1] + (p & 0xF);
-	}
-
-	return 0;
-}
-
 static int nvm_core_init(struct nvm_dev *dev)
 {
 	struct nvm_id *id = &dev->identity;
@@ -946,66 +855,44 @@ static int nvm_core_init(struct nvm_dev *dev)
 	struct nvm_geo *geo = &dev->geo;
 	int ret;
 
-	/* Whole device values */
-	geo->nr_chnls = grp->num_ch;
-	geo->luns_per_chnl = grp->num_lun;
-
-	/* Generic device values */
-	geo->pgs_per_blk = grp->num_pg;
-	geo->blks_per_lun = grp->num_blk;
-	geo->nr_planes = grp->num_pln;
-	geo->fpg_size = grp->fpg_sz;
-	geo->pfpg_size = grp->fpg_sz * grp->num_pln;
-	geo->sec_size = grp->csecs;
-	geo->oob_size = grp->sos;
-	geo->sec_per_pg = grp->fpg_sz / grp->csecs;
-	geo->mccap = grp->mccap;
 	memcpy(&geo->ppaf, &id->ppaf, sizeof(struct nvm_addr_format));
 
-	geo->plane_mode = NVM_PLANE_SINGLE;
-	geo->max_rq_size = dev->ops->max_phys_sect * geo->sec_size;
-
-	if (grp->mpos & 0x020202)
-		geo->plane_mode = NVM_PLANE_DOUBLE;
-	if (grp->mpos & 0x040404)
-		geo->plane_mode = NVM_PLANE_QUAD;
-
 	if (grp->mtype != 0) {
 		pr_err("nvm: memory type not supported\n");
 		return -EINVAL;
 	}
 
-	/* calculated values */
-	geo->sec_per_pl = geo->sec_per_pg * geo->nr_planes;
-	geo->sec_per_blk = geo->sec_per_pl * geo->pgs_per_blk;
-	geo->sec_per_lun = geo->sec_per_blk * geo->blks_per_lun;
-	geo->nr_luns = geo->luns_per_chnl * geo->nr_chnls;
+	/* Whole device values */
+	geo->nr_chnls = grp->num_ch;
+	geo->nr_luns = grp->num_lun;
 
-	dev->total_secs = geo->nr_luns * geo->sec_per_lun;
-	dev->lun_map = kcalloc(BITS_TO_LONGS(geo->nr_luns),
+	/* Generic device geometry values */
+	geo->ws_min = grp->ws_min;
+	geo->ws_opt = grp->ws_opt;
+	geo->ws_seq = grp->ws_seq;
+	geo->ws_per_chk = grp->ws_per_chk;
+	geo->nr_chks = grp->num_chk;
+	geo->sec_size = grp->csecs;
+	geo->oob_size = grp->sos;
+	geo->mccap = grp->mccap;
+	geo->max_rq_size = dev->ops->max_phys_sect * geo->sec_size;
+
+	geo->sec_per_chk = grp->clba;
+	geo->sec_per_lun = geo->sec_per_chk * geo->nr_chks;
+	geo->all_luns = geo->nr_luns * geo->nr_chnls;
+
+	/* 1.2 spec device geometry values */
+	geo->plane_mode = 1 << geo->ws_seq;
+	geo->nr_planes = geo->ws_opt / geo->ws_min;
+	geo->sec_per_pg = geo->ws_min;
+	geo->sec_per_pl = geo->sec_per_pg * geo->nr_planes;
+
+	dev->total_secs = geo->all_luns * geo->sec_per_lun;
+	dev->lun_map = kcalloc(BITS_TO_LONGS(geo->all_luns),
 					sizeof(unsigned long), GFP_KERNEL);
 	if (!dev->lun_map)
 		return -ENOMEM;
 
-	switch (grp->fmtype) {
-	case NVM_ID_FMTYPE_SLC:
-		if (nvm_init_slc_tbl(dev, grp)) {
-			ret = -ENOMEM;
-			goto err_fmtype;
-		}
-		break;
-	case NVM_ID_FMTYPE_MLC:
-		if (nvm_init_mlc_tbl(dev, grp)) {
-			ret = -ENOMEM;
-			goto err_fmtype;
-		}
-		break;
-	default:
-		pr_err("nvm: flash type not supported\n");
-		ret = -EINVAL;
-		goto err_fmtype;
-	}
-
 	INIT_LIST_HEAD(&dev->area_list);
 	INIT_LIST_HEAD(&dev->targets);
 	mutex_init(&dev->mlock);
@@ -1031,7 +918,6 @@ static void nvm_free(struct nvm_dev *dev)
 		dev->ops->destroy_dma_pool(dev->dma_pool);
 
 	nvm_unregister_map(dev);
-	kfree(dev->lptbl);
 	kfree(dev->lun_map);
 	kfree(dev);
 }
@@ -1062,8 +948,8 @@ static int nvm_init(struct nvm_dev *dev)
 
 	pr_info("nvm: registered %s [%u/%u/%u/%u/%u/%u]\n",
 			dev->name, geo->sec_per_pg, geo->nr_planes,
-			geo->pgs_per_blk, geo->blks_per_lun,
-			geo->nr_luns, geo->nr_chnls);
+			geo->ws_per_chk, geo->nr_chks,
+			geo->all_luns, geo->nr_chnls);
 	return 0;
 err:
 	pr_err("nvm: failed to initialize nvm\n");
@@ -1135,7 +1021,6 @@ EXPORT_SYMBOL(nvm_unregister);
 static int __nvm_configure_create(struct nvm_ioctl_create *create)
 {
 	struct nvm_dev *dev;
-	struct nvm_ioctl_create_simple *s;
 
 	down_write(&nvm_lock);
 	dev = nvm_find_nvm_dev(create->dev);
@@ -1146,23 +1031,6 @@ static int __nvm_configure_create(struct nvm_ioctl_create *create)
 		return -EINVAL;
 	}
 
-	if (create->conf.type != NVM_CONFIG_TYPE_SIMPLE) {
-		pr_err("nvm: config type not valid\n");
-		return -EINVAL;
-	}
-	s = &create->conf.s;
-
-	if (s->lun_begin == -1 && s->lun_end == -1) {
-		s->lun_begin = 0;
-		s->lun_end = dev->geo.nr_luns - 1;
-	}
-
-	if (s->lun_begin > s->lun_end || s->lun_end >= dev->geo.nr_luns) {
-		pr_err("nvm: lun out of bound (%u:%u > %u)\n",
-			s->lun_begin, s->lun_end, dev->geo.nr_luns - 1);
-		return -EINVAL;
-	}
-
 	return nvm_create_tgt(dev, create);
 }
 
@@ -1262,6 +1130,12 @@ static long nvm_ioctl_dev_create(struct file *file, void __user *arg)
 	if (copy_from_user(&create, arg, sizeof(struct nvm_ioctl_create)))
 		return -EFAULT;
 
+	if (create.conf.type == NVM_CONFIG_TYPE_EXTENDED &&
+	    create.conf.e.rsv != 0) {
+		pr_err("nvm: reserved config field in use\n");
+		return -EINVAL;
+	}
+
 	create.dev[DISK_NAME_LEN - 1] = '\0';
 	create.tgttype[NVM_TTYPE_NAME_MAX - 1] = '\0';
 	create.tgtname[DISK_NAME_LEN - 1] = '\0';
diff --git a/drivers/lightnvm/pblk-cache.c b/drivers/lightnvm/pblk-cache.c
index 0d227ef..000fcad 100644
--- a/drivers/lightnvm/pblk-cache.c
+++ b/drivers/lightnvm/pblk-cache.c
@@ -19,12 +19,16 @@
 
 int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
 {
+	struct request_queue *q = pblk->dev->q;
 	struct pblk_w_ctx w_ctx;
 	sector_t lba = pblk_get_lba(bio);
+	unsigned long start_time = jiffies;
 	unsigned int bpos, pos;
 	int nr_entries = pblk_get_secs(bio);
 	int i, ret;
 
+	generic_start_io_acct(q, WRITE, bio_sectors(bio), &pblk->disk->part0);
+
 	/* Update the write buffer head (mem) with the entries that we can
 	 * write. The write in itself cannot fail, so there is no need to
 	 * rollback from here on.
@@ -67,6 +71,7 @@ int pblk_write_to_cache(struct pblk *pblk, struct bio *bio, unsigned long flags)
 	pblk_rl_inserted(&pblk->rl, nr_entries);
 
 out:
+	generic_end_io_acct(q, WRITE, &pblk->disk->part0, start_time);
 	pblk_write_should_kick(pblk);
 	return ret;
 }
diff --git a/drivers/lightnvm/pblk-core.c b/drivers/lightnvm/pblk-core.c
index 76516ee..0487b93 100644
--- a/drivers/lightnvm/pblk-core.c
+++ b/drivers/lightnvm/pblk-core.c
@@ -32,8 +32,8 @@ static void pblk_line_mark_bb(struct work_struct *work)
 		struct pblk_line *line;
 		int pos;
 
-		line = &pblk->lines[pblk_dev_ppa_to_line(*ppa)];
-		pos = pblk_dev_ppa_to_pos(&dev->geo, *ppa);
+		line = &pblk->lines[pblk_ppa_to_line(*ppa)];
+		pos = pblk_ppa_to_pos(&dev->geo, *ppa);
 
 		pr_err("pblk: failed to mark bb, line:%d, pos:%d\n",
 				line->id, pos);
@@ -48,7 +48,7 @@ static void pblk_mark_bb(struct pblk *pblk, struct pblk_line *line,
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
 	struct nvm_geo *geo = &dev->geo;
-	int pos = pblk_dev_ppa_to_pos(geo, *ppa);
+	int pos = pblk_ppa_to_pos(geo, *ppa);
 
 	pr_debug("pblk: erase failed: line:%d, pos:%d\n", line->id, pos);
 	atomic_long_inc(&pblk->erase_failed);
@@ -66,7 +66,7 @@ static void __pblk_end_io_erase(struct pblk *pblk, struct nvm_rq *rqd)
 {
 	struct pblk_line *line;
 
-	line = &pblk->lines[pblk_dev_ppa_to_line(rqd->ppa_addr)];
+	line = &pblk->lines[pblk_ppa_to_line(rqd->ppa_addr)];
 	atomic_dec(&line->left_seblks);
 
 	if (rqd->error) {
@@ -144,7 +144,7 @@ void pblk_map_invalidate(struct pblk *pblk, struct ppa_addr ppa)
 	BUG_ON(pblk_ppa_empty(ppa));
 #endif
 
-	line_id = pblk_tgt_ppa_to_line(ppa);
+	line_id = pblk_ppa_to_line(ppa);
 	line = &pblk->lines[line_id];
 	paddr = pblk_dev_ppa_to_line_addr(pblk, ppa);
 
@@ -650,7 +650,7 @@ static int pblk_line_submit_emeta_io(struct pblk *pblk, struct pblk_line *line,
 	} else {
 		for (i = 0; i < rqd.nr_ppas; ) {
 			struct ppa_addr ppa = addr_to_gen_ppa(pblk, paddr, id);
-			int pos = pblk_dev_ppa_to_pos(geo, ppa);
+			int pos = pblk_ppa_to_pos(geo, ppa);
 			int read_type = PBLK_READ_RANDOM;
 
 			if (pblk_io_aligned(pblk, rq_ppas))
@@ -668,7 +668,7 @@ static int pblk_line_submit_emeta_io(struct pblk *pblk, struct pblk_line *line,
 				}
 
 				ppa = addr_to_gen_ppa(pblk, paddr, id);
-				pos = pblk_dev_ppa_to_pos(geo, ppa);
+				pos = pblk_ppa_to_pos(geo, ppa);
 			}
 
 			if (pblk_boundary_paddr_checks(pblk, paddr + min)) {
@@ -742,7 +742,7 @@ static int pblk_line_submit_smeta_io(struct pblk *pblk, struct pblk_line *line,
 		cmd_op = NVM_OP_PWRITE;
 		flags = pblk_set_progr_mode(pblk, PBLK_WRITE);
 		lba_list = emeta_to_lbas(pblk, line->emeta->buf);
-	} else if (dir == PBLK_READ) {
+	} else if (dir == PBLK_READ_RECOV || dir == PBLK_READ) {
 		bio_op = REQ_OP_READ;
 		cmd_op = NVM_OP_PREAD;
 		flags = pblk_set_read_mode(pblk, PBLK_READ_SEQUENTIAL);
@@ -802,7 +802,7 @@ static int pblk_line_submit_smeta_io(struct pblk *pblk, struct pblk_line *line,
 	if (rqd.error) {
 		if (dir == PBLK_WRITE)
 			pblk_log_write_err(pblk, &rqd);
-		else
+		else if (dir == PBLK_READ)
 			pblk_log_read_err(pblk, &rqd);
 	}
 
@@ -816,7 +816,7 @@ int pblk_line_read_smeta(struct pblk *pblk, struct pblk_line *line)
 {
 	u64 bpaddr = pblk_line_smeta_start(pblk, line);
 
-	return pblk_line_submit_smeta_io(pblk, line, bpaddr, PBLK_READ);
+	return pblk_line_submit_smeta_io(pblk, line, bpaddr, PBLK_READ_RECOV);
 }
 
 int pblk_line_read_emeta(struct pblk *pblk, struct pblk_line *line,
@@ -854,8 +854,8 @@ static int pblk_blk_erase_sync(struct pblk *pblk, struct ppa_addr ppa)
 		struct nvm_geo *geo = &dev->geo;
 
 		pr_err("pblk: could not sync erase line:%d,blk:%d\n",
-					pblk_dev_ppa_to_line(ppa),
-					pblk_dev_ppa_to_pos(geo, ppa));
+					pblk_ppa_to_line(ppa),
+					pblk_ppa_to_pos(geo, ppa));
 
 		rqd.error = ret;
 		goto out;
@@ -979,7 +979,7 @@ static int pblk_line_init_metadata(struct pblk *pblk, struct pblk_line *line,
 
 	/* Start metadata */
 	smeta_buf->seq_nr = cpu_to_le64(line->seq_nr);
-	smeta_buf->window_wr_lun = cpu_to_le32(geo->nr_luns);
+	smeta_buf->window_wr_lun = cpu_to_le32(geo->all_luns);
 
 	/* Fill metadata among lines */
 	if (cur) {
@@ -1032,7 +1032,7 @@ static int pblk_line_init_bb(struct pblk *pblk, struct pblk_line *line,
 							lm->sec_per_line);
 		bitmap_or(line->map_bitmap, line->map_bitmap, l_mg->bb_aux,
 							lm->sec_per_line);
-		line->sec_in_line -= geo->sec_per_blk;
+		line->sec_in_line -= geo->sec_per_chk;
 		if (bit >= lm->emeta_bb)
 			nr_bb++;
 	}
@@ -1145,7 +1145,7 @@ int pblk_line_recov_alloc(struct pblk *pblk, struct pblk_line *line)
 	}
 	spin_unlock(&l_mg->free_lock);
 
-	pblk_rl_free_lines_dec(&pblk->rl, line);
+	pblk_rl_free_lines_dec(&pblk->rl, line, true);
 
 	if (!pblk_line_init_bb(pblk, line, 0)) {
 		list_add(&line->list, &l_mg->free_list);
@@ -1233,7 +1233,7 @@ static struct pblk_line *pblk_line_retry(struct pblk *pblk,
 	l_mg->data_line = retry_line;
 	spin_unlock(&l_mg->free_lock);
 
-	pblk_rl_free_lines_dec(&pblk->rl, retry_line);
+	pblk_rl_free_lines_dec(&pblk->rl, line, false);
 
 	if (pblk_line_erase(pblk, retry_line))
 		goto retry;
@@ -1252,7 +1252,6 @@ struct pblk_line *pblk_line_get_first_data(struct pblk *pblk)
 {
 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	struct pblk_line *line;
-	int is_next = 0;
 
 	spin_lock(&l_mg->free_lock);
 	line = pblk_line_get(pblk);
@@ -1280,7 +1279,6 @@ struct pblk_line *pblk_line_get_first_data(struct pblk *pblk)
 	} else {
 		l_mg->data_next->seq_nr = l_mg->d_seq_nr++;
 		l_mg->data_next->type = PBLK_LINETYPE_DATA;
-		is_next = 1;
 	}
 	spin_unlock(&l_mg->free_lock);
 
@@ -1290,10 +1288,6 @@ struct pblk_line *pblk_line_get_first_data(struct pblk *pblk)
 			return NULL;
 	}
 
-	pblk_rl_free_lines_dec(&pblk->rl, line);
-	if (is_next)
-		pblk_rl_free_lines_dec(&pblk->rl, l_mg->data_next);
-
 retry_setup:
 	if (!pblk_line_init_metadata(pblk, line, NULL)) {
 		line = pblk_line_retry(pblk, line);
@@ -1311,6 +1305,8 @@ struct pblk_line *pblk_line_get_first_data(struct pblk *pblk)
 		goto retry_setup;
 	}
 
+	pblk_rl_free_lines_dec(&pblk->rl, line, true);
+
 	return line;
 }
 
@@ -1395,7 +1391,6 @@ struct pblk_line *pblk_line_replace_data(struct pblk *pblk)
 	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	struct pblk_line *cur, *new = NULL;
 	unsigned int left_seblks;
-	int is_next = 0;
 
 	cur = l_mg->data_line;
 	new = l_mg->data_next;
@@ -1444,6 +1439,8 @@ struct pblk_line *pblk_line_replace_data(struct pblk *pblk)
 		goto retry_setup;
 	}
 
+	pblk_rl_free_lines_dec(&pblk->rl, new, true);
+
 	/* Allocate next line for preparation */
 	spin_lock(&l_mg->free_lock);
 	l_mg->data_next = pblk_line_get(pblk);
@@ -1457,13 +1454,9 @@ struct pblk_line *pblk_line_replace_data(struct pblk *pblk)
 	} else {
 		l_mg->data_next->seq_nr = l_mg->d_seq_nr++;
 		l_mg->data_next->type = PBLK_LINETYPE_DATA;
-		is_next = 1;
 	}
 	spin_unlock(&l_mg->free_lock);
 
-	if (is_next)
-		pblk_rl_free_lines_dec(&pblk->rl, l_mg->data_next);
-
 out:
 	return new;
 }
@@ -1561,8 +1554,8 @@ int pblk_blk_erase_async(struct pblk *pblk, struct ppa_addr ppa)
 		struct nvm_geo *geo = &dev->geo;
 
 		pr_err("pblk: could not async erase line:%d,blk:%d\n",
-					pblk_dev_ppa_to_line(ppa),
-					pblk_dev_ppa_to_pos(geo, ppa));
+					pblk_ppa_to_line(ppa),
+					pblk_ppa_to_pos(geo, ppa));
 	}
 
 	return err;
@@ -1746,7 +1739,7 @@ void pblk_up_rq(struct pblk *pblk, struct ppa_addr *ppa_list, int nr_ppas,
 	struct nvm_tgt_dev *dev = pblk->dev;
 	struct nvm_geo *geo = &dev->geo;
 	struct pblk_lun *rlun;
-	int nr_luns = geo->nr_luns;
+	int nr_luns = geo->all_luns;
 	int bit = -1;
 
 	while ((bit = find_next_bit(lun_bitmap, nr_luns, bit + 1)) < nr_luns) {
@@ -1884,7 +1877,7 @@ void pblk_lookup_l2p_seq(struct pblk *pblk, struct ppa_addr *ppas,
 
 		/* If the L2P entry maps to a line, the reference is valid */
 		if (!pblk_ppa_empty(ppa) && !pblk_addr_in_cache(ppa)) {
-			int line_id = pblk_dev_ppa_to_line(ppa);
+			int line_id = pblk_ppa_to_line(ppa);
 			struct pblk_line *line = &pblk->lines[line_id];
 
 			kref_get(&line->ref);
diff --git a/drivers/lightnvm/pblk-gc.c b/drivers/lightnvm/pblk-gc.c
index 9c8e114..3d89938 100644
--- a/drivers/lightnvm/pblk-gc.c
+++ b/drivers/lightnvm/pblk-gc.c
@@ -169,7 +169,14 @@ static void pblk_gc_line_prepare_ws(struct work_struct *work)
 	 * the line untouched. TODO: Implement a recovery routine that scans and
 	 * moves all sectors on the line.
 	 */
-	lba_list = pblk_recov_get_lba_list(pblk, emeta_buf);
+
+	ret = pblk_recov_check_emeta(pblk, emeta_buf);
+	if (ret) {
+		pr_err("pblk: inconsistent emeta (line %d)\n", line->id);
+		goto fail_free_emeta;
+	}
+
+	lba_list = emeta_to_lbas(pblk, emeta_buf);
 	if (!lba_list) {
 		pr_err("pblk: could not interpret emeta (line %d)\n", line->id);
 		goto fail_free_emeta;
@@ -519,22 +526,12 @@ void pblk_gc_should_start(struct pblk *pblk)
 	}
 }
 
-/*
- * If flush_wq == 1 then no lock should be held by the caller since
- * flush_workqueue can sleep
- */
-static void pblk_gc_stop(struct pblk *pblk, int flush_wq)
-{
-	pblk->gc.gc_active = 0;
-	pr_debug("pblk: gc stop\n");
-}
-
 void pblk_gc_should_stop(struct pblk *pblk)
 {
 	struct pblk_gc *gc = &pblk->gc;
 
 	if (gc->gc_active && !gc->gc_forced)
-		pblk_gc_stop(pblk, 0);
+		gc->gc_active = 0;
 }
 
 void pblk_gc_should_kick(struct pblk *pblk)
@@ -660,7 +657,7 @@ void pblk_gc_exit(struct pblk *pblk)
 
 	gc->gc_enabled = 0;
 	del_timer_sync(&gc->gc_timer);
-	pblk_gc_stop(pblk, 1);
+	gc->gc_active = 0;
 
 	if (gc->gc_ts)
 		kthread_stop(gc->gc_ts);
diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 695826a..93d671c 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -169,8 +169,8 @@ static int pblk_set_ppaf(struct pblk *pblk)
 	}
 	ppaf.ch_len = power_len;
 
-	power_len = get_count_order(geo->luns_per_chnl);
-	if (1 << power_len != geo->luns_per_chnl) {
+	power_len = get_count_order(geo->nr_luns);
+	if (1 << power_len != geo->nr_luns) {
 		pr_err("pblk: supports only power-of-two LUN config.\n");
 		return -EINVAL;
 	}
@@ -254,7 +254,7 @@ static int pblk_core_init(struct pblk *pblk)
 	struct nvm_geo *geo = &dev->geo;
 
 	pblk->pgs_in_buffer = NVM_MEM_PAGE_WRITE * geo->sec_per_pg *
-						geo->nr_planes * geo->nr_luns;
+						geo->nr_planes * geo->all_luns;
 
 	if (pblk_init_global_caches(pblk))
 		return -ENOMEM;
@@ -270,21 +270,22 @@ static int pblk_core_init(struct pblk *pblk)
 	if (!pblk->gen_ws_pool)
 		goto free_page_bio_pool;
 
-	pblk->rec_pool = mempool_create_slab_pool(geo->nr_luns, pblk_rec_cache);
+	pblk->rec_pool = mempool_create_slab_pool(geo->all_luns,
+							pblk_rec_cache);
 	if (!pblk->rec_pool)
 		goto free_gen_ws_pool;
 
-	pblk->r_rq_pool = mempool_create_slab_pool(geo->nr_luns,
+	pblk->r_rq_pool = mempool_create_slab_pool(geo->all_luns,
 							pblk_g_rq_cache);
 	if (!pblk->r_rq_pool)
 		goto free_rec_pool;
 
-	pblk->e_rq_pool = mempool_create_slab_pool(geo->nr_luns,
+	pblk->e_rq_pool = mempool_create_slab_pool(geo->all_luns,
 							pblk_g_rq_cache);
 	if (!pblk->e_rq_pool)
 		goto free_r_rq_pool;
 
-	pblk->w_rq_pool = mempool_create_slab_pool(geo->nr_luns,
+	pblk->w_rq_pool = mempool_create_slab_pool(geo->all_luns,
 							pblk_w_rq_cache);
 	if (!pblk->w_rq_pool)
 		goto free_e_rq_pool;
@@ -354,6 +355,8 @@ static void pblk_core_free(struct pblk *pblk)
 	mempool_destroy(pblk->e_rq_pool);
 	mempool_destroy(pblk->w_rq_pool);
 
+	pblk_rwb_free(pblk);
+
 	pblk_free_global_caches(pblk);
 }
 
@@ -409,7 +412,7 @@ static int pblk_bb_discovery(struct nvm_tgt_dev *dev, struct pblk_lun *rlun)
 	u8 *blks;
 	int nr_blks, ret;
 
-	nr_blks = geo->blks_per_lun * geo->plane_mode;
+	nr_blks = geo->nr_chks * geo->plane_mode;
 	blks = kmalloc(nr_blks, GFP_KERNEL);
 	if (!blks)
 		return -ENOMEM;
@@ -482,20 +485,21 @@ static int pblk_luns_init(struct pblk *pblk, struct ppa_addr *luns)
 	int i, ret;
 
 	/* TODO: Implement unbalanced LUN support */
-	if (geo->luns_per_chnl < 0) {
+	if (geo->nr_luns < 0) {
 		pr_err("pblk: unbalanced LUN config.\n");
 		return -EINVAL;
 	}
 
-	pblk->luns = kcalloc(geo->nr_luns, sizeof(struct pblk_lun), GFP_KERNEL);
+	pblk->luns = kcalloc(geo->all_luns, sizeof(struct pblk_lun),
+								GFP_KERNEL);
 	if (!pblk->luns)
 		return -ENOMEM;
 
-	for (i = 0; i < geo->nr_luns; i++) {
+	for (i = 0; i < geo->all_luns; i++) {
 		/* Stripe across channels */
 		int ch = i % geo->nr_chnls;
 		int lun_raw = i / geo->nr_chnls;
-		int lunid = lun_raw + ch * geo->luns_per_chnl;
+		int lunid = lun_raw + ch * geo->nr_luns;
 
 		rlun = &pblk->luns[i];
 		rlun->bppa = luns[lunid];
@@ -577,22 +581,37 @@ static unsigned int calc_emeta_len(struct pblk *pblk)
 static void pblk_set_provision(struct pblk *pblk, long nr_free_blks)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
+	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
+	struct pblk_line_meta *lm = &pblk->lm;
 	struct nvm_geo *geo = &dev->geo;
 	sector_t provisioned;
+	int sec_meta, blk_meta;
 
-	pblk->over_pct = 20;
+	if (geo->op == NVM_TARGET_DEFAULT_OP)
+		pblk->op = PBLK_DEFAULT_OP;
+	else
+		pblk->op = geo->op;
 
 	provisioned = nr_free_blks;
-	provisioned *= (100 - pblk->over_pct);
+	provisioned *= (100 - pblk->op);
 	sector_div(provisioned, 100);
 
+	pblk->op_blks = nr_free_blks - provisioned;
+
 	/* Internally pblk manages all free blocks, but all calculations based
 	 * on user capacity consider only provisioned blocks
 	 */
 	pblk->rl.total_blocks = nr_free_blks;
-	pblk->rl.nr_secs = nr_free_blks * geo->sec_per_blk;
-	pblk->capacity = provisioned * geo->sec_per_blk;
+	pblk->rl.nr_secs = nr_free_blks * geo->sec_per_chk;
+
+	/* Consider sectors used for metadata */
+	sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
+	blk_meta = DIV_ROUND_UP(sec_meta, geo->sec_per_chk);
+
+	pblk->capacity = (provisioned - blk_meta) * geo->sec_per_chk;
+
 	atomic_set(&pblk->rl.free_blocks, nr_free_blks);
+	atomic_set(&pblk->rl.free_user_blocks, nr_free_blks);
 }
 
 static int pblk_lines_alloc_metadata(struct pblk *pblk)
@@ -683,7 +702,7 @@ static int pblk_lines_init(struct pblk *pblk)
 	int i, ret;
 
 	pblk->min_write_pgs = geo->sec_per_pl * (geo->sec_size / PAGE_SIZE);
-	max_write_ppas = pblk->min_write_pgs * geo->nr_luns;
+	max_write_ppas = pblk->min_write_pgs * geo->all_luns;
 	pblk->max_write_pgs = (max_write_ppas < nvm_max_phys_sects(dev)) ?
 				max_write_ppas : nvm_max_phys_sects(dev);
 	pblk_set_sec_per_write(pblk, pblk->min_write_pgs);
@@ -693,26 +712,26 @@ static int pblk_lines_init(struct pblk *pblk)
 		return -EINVAL;
 	}
 
-	div_u64_rem(geo->sec_per_blk, pblk->min_write_pgs, &mod);
+	div_u64_rem(geo->sec_per_chk, pblk->min_write_pgs, &mod);
 	if (mod) {
 		pr_err("pblk: bad configuration of sectors/pages\n");
 		return -EINVAL;
 	}
 
-	l_mg->nr_lines = geo->blks_per_lun;
+	l_mg->nr_lines = geo->nr_chks;
 	l_mg->log_line = l_mg->data_line = NULL;
 	l_mg->l_seq_nr = l_mg->d_seq_nr = 0;
 	l_mg->nr_free_lines = 0;
 	bitmap_zero(&l_mg->meta_bitmap, PBLK_DATA_LINES);
 
-	lm->sec_per_line = geo->sec_per_blk * geo->nr_luns;
-	lm->blk_per_line = geo->nr_luns;
-	lm->blk_bitmap_len = BITS_TO_LONGS(geo->nr_luns) * sizeof(long);
+	lm->sec_per_line = geo->sec_per_chk * geo->all_luns;
+	lm->blk_per_line = geo->all_luns;
+	lm->blk_bitmap_len = BITS_TO_LONGS(geo->all_luns) * sizeof(long);
 	lm->sec_bitmap_len = BITS_TO_LONGS(lm->sec_per_line) * sizeof(long);
-	lm->lun_bitmap_len = BITS_TO_LONGS(geo->nr_luns) * sizeof(long);
+	lm->lun_bitmap_len = BITS_TO_LONGS(geo->all_luns) * sizeof(long);
 	lm->mid_thrs = lm->sec_per_line / 2;
 	lm->high_thrs = lm->sec_per_line / 4;
-	lm->meta_distance = (geo->nr_luns / 2) * pblk->min_write_pgs;
+	lm->meta_distance = (geo->all_luns / 2) * pblk->min_write_pgs;
 
 	/* Calculate necessary pages for smeta. See comment over struct
 	 * line_smeta definition
@@ -742,12 +761,12 @@ static int pblk_lines_init(struct pblk *pblk)
 		goto add_emeta_page;
 	}
 
-	lm->emeta_bb = geo->nr_luns > i ? geo->nr_luns - i : 0;
+	lm->emeta_bb = geo->all_luns > i ? geo->all_luns - i : 0;
 
 	lm->min_blk_line = 1;
-	if (geo->nr_luns > 1)
+	if (geo->all_luns > 1)
 		lm->min_blk_line += DIV_ROUND_UP(lm->smeta_sec +
-					lm->emeta_sec[0], geo->sec_per_blk);
+					lm->emeta_sec[0], geo->sec_per_chk);
 
 	if (lm->min_blk_line > lm->blk_per_line) {
 		pr_err("pblk: config. not supported. Min. LUN in line:%d\n",
@@ -772,7 +791,7 @@ static int pblk_lines_init(struct pblk *pblk)
 		goto fail_free_bb_template;
 	}
 
-	bb_distance = (geo->nr_luns) * geo->sec_per_pl;
+	bb_distance = (geo->all_luns) * geo->sec_per_pl;
 	for (i = 0; i < lm->sec_per_line; i += bb_distance)
 		bitmap_set(l_mg->bb_template, i, geo->sec_per_pl);
 
@@ -844,7 +863,7 @@ static int pblk_lines_init(struct pblk *pblk)
 	pblk_set_provision(pblk, nr_free_blks);
 
 	/* Cleanup per-LUN bad block lists - managed within lines on run-time */
-	for (i = 0; i < geo->nr_luns; i++)
+	for (i = 0; i < geo->all_luns; i++)
 		kfree(pblk->luns[i].bb_list);
 
 	return 0;
@@ -858,7 +877,7 @@ static int pblk_lines_init(struct pblk *pblk)
 fail_free_meta:
 	pblk_line_meta_free(pblk);
 fail:
-	for (i = 0; i < geo->nr_luns; i++)
+	for (i = 0; i < geo->all_luns; i++)
 		kfree(pblk->luns[i].bb_list);
 
 	return ret;
@@ -866,15 +885,19 @@ static int pblk_lines_init(struct pblk *pblk)
 
 static int pblk_writer_init(struct pblk *pblk)
 {
-	timer_setup(&pblk->wtimer, pblk_write_timer_fn, 0);
-	mod_timer(&pblk->wtimer, jiffies + msecs_to_jiffies(100));
-
 	pblk->writer_ts = kthread_create(pblk_write_ts, pblk, "pblk-writer-t");
 	if (IS_ERR(pblk->writer_ts)) {
-		pr_err("pblk: could not allocate writer kthread\n");
-		return PTR_ERR(pblk->writer_ts);
+		int err = PTR_ERR(pblk->writer_ts);
+
+		if (err != -EINTR)
+			pr_err("pblk: could not allocate writer kthread (%d)\n",
+					err);
+		return err;
 	}
 
+	timer_setup(&pblk->wtimer, pblk_write_timer_fn, 0);
+	mod_timer(&pblk->wtimer, jiffies + msecs_to_jiffies(100));
+
 	return 0;
 }
 
@@ -910,7 +933,6 @@ static void pblk_tear_down(struct pblk *pblk)
 	pblk_pipeline_stop(pblk);
 	pblk_writer_stop(pblk);
 	pblk_rb_sync_l2p(&pblk->rwb);
-	pblk_rwb_free(pblk);
 	pblk_rl_free(&pblk->rl);
 
 	pr_debug("pblk: consistent tear down\n");
@@ -1025,7 +1047,8 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 
 	ret = pblk_writer_init(pblk);
 	if (ret) {
-		pr_err("pblk: could not initialize write thread\n");
+		if (ret != -EINTR)
+			pr_err("pblk: could not initialize write thread\n");
 		goto fail_free_lines;
 	}
 
@@ -1041,13 +1064,14 @@ static void *pblk_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
 
 	blk_queue_write_cache(tqueue, true, false);
 
-	tqueue->limits.discard_granularity = geo->pgs_per_blk * geo->pfpg_size;
+	tqueue->limits.discard_granularity = geo->sec_per_chk * geo->sec_size;
 	tqueue->limits.discard_alignment = 0;
 	blk_queue_max_discard_sectors(tqueue, UINT_MAX >> 9);
 	queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, tqueue);
 
-	pr_info("pblk init: luns:%u, lines:%d, secs:%llu, buf entries:%u\n",
-			geo->nr_luns, pblk->l_mg.nr_lines,
+	pr_info("pblk(%s): luns:%u, lines:%d, secs:%llu, buf entries:%u\n",
+			tdisk->disk_name,
+			geo->all_luns, pblk->l_mg.nr_lines,
 			(unsigned long long)pblk->rl.nr_secs,
 			pblk->rwb.nr_entries);
 
diff --git a/drivers/lightnvm/pblk-map.c b/drivers/lightnvm/pblk-map.c
index 6f3ecde..7445e64 100644
--- a/drivers/lightnvm/pblk-map.c
+++ b/drivers/lightnvm/pblk-map.c
@@ -146,7 +146,7 @@ void pblk_map_erase_rq(struct pblk *pblk, struct nvm_rq *rqd,
 		return;
 
 	/* Erase blocks that are bad in this line but might not be in next */
-	if (unlikely(ppa_empty(*erase_ppa)) &&
+	if (unlikely(pblk_ppa_empty(*erase_ppa)) &&
 			bitmap_weight(d_line->blk_bitmap, lm->blk_per_line)) {
 		int bit = -1;
 
diff --git a/drivers/lightnvm/pblk-rb.c b/drivers/lightnvm/pblk-rb.c
index b8f78e4..ec8fc31 100644
--- a/drivers/lightnvm/pblk-rb.c
+++ b/drivers/lightnvm/pblk-rb.c
@@ -54,7 +54,7 @@ int pblk_rb_init(struct pblk_rb *rb, struct pblk_rb_entry *rb_entry_base,
 	rb->seg_size = (1 << power_seg_sz);
 	rb->nr_entries = (1 << power_size);
 	rb->mem = rb->subm = rb->sync = rb->l2p_update = 0;
-	rb->sync_point = EMPTY_ENTRY;
+	rb->flush_point = EMPTY_ENTRY;
 
 	spin_lock_init(&rb->w_lock);
 	spin_lock_init(&rb->s_lock);
@@ -112,7 +112,7 @@ int pblk_rb_init(struct pblk_rb *rb, struct pblk_rb_entry *rb_entry_base,
 	up_write(&pblk_rb_lock);
 
 #ifdef CONFIG_NVM_DEBUG
-	atomic_set(&rb->inflight_sync_point, 0);
+	atomic_set(&rb->inflight_flush_point, 0);
 #endif
 
 	/*
@@ -226,7 +226,7 @@ static int __pblk_rb_update_l2p(struct pblk_rb *rb, unsigned int to_update)
 		pblk_update_map_dev(pblk, w_ctx->lba, w_ctx->ppa,
 							entry->cacheline);
 
-		line = &pblk->lines[pblk_tgt_ppa_to_line(w_ctx->ppa)];
+		line = &pblk->lines[pblk_ppa_to_line(w_ctx->ppa)];
 		kref_put(&line->ref, pblk_line_put);
 		clean_wctx(w_ctx);
 		rb->l2p_update = (rb->l2p_update + 1) & (rb->nr_entries - 1);
@@ -349,35 +349,35 @@ void pblk_rb_write_entry_gc(struct pblk_rb *rb, void *data,
 	smp_store_release(&entry->w_ctx.flags, flags);
 }
 
-static int pblk_rb_sync_point_set(struct pblk_rb *rb, struct bio *bio,
+static int pblk_rb_flush_point_set(struct pblk_rb *rb, struct bio *bio,
 				  unsigned int pos)
 {
 	struct pblk_rb_entry *entry;
-	unsigned int subm, sync_point;
+	unsigned int sync, flush_point;
 
-	subm = READ_ONCE(rb->subm);
+	sync = READ_ONCE(rb->sync);
+
+	if (pos == sync)
+		return 0;
 
 #ifdef CONFIG_NVM_DEBUG
-	atomic_inc(&rb->inflight_sync_point);
+	atomic_inc(&rb->inflight_flush_point);
 #endif
 
-	if (pos == subm)
-		return 0;
+	flush_point = (pos == 0) ? (rb->nr_entries - 1) : (pos - 1);
+	entry = &rb->entries[flush_point];
 
-	sync_point = (pos == 0) ? (rb->nr_entries - 1) : (pos - 1);
-	entry = &rb->entries[sync_point];
+	pblk_rb_sync_init(rb, NULL);
 
-	/* Protect syncs */
-	smp_store_release(&rb->sync_point, sync_point);
+	/* Protect flush points */
+	smp_store_release(&rb->flush_point, flush_point);
 
-	if (!bio)
-		return 0;
+	if (bio)
+		bio_list_add(&entry->w_ctx.bios, bio);
 
-	spin_lock_irq(&rb->s_lock);
-	bio_list_add(&entry->w_ctx.bios, bio);
-	spin_unlock_irq(&rb->s_lock);
+	pblk_rb_sync_end(rb, NULL);
 
-	return 1;
+	return bio ? 1 : 0;
 }
 
 static int __pblk_rb_may_write(struct pblk_rb *rb, unsigned int nr_entries,
@@ -416,7 +416,7 @@ void pblk_rb_flush(struct pblk_rb *rb)
 	struct pblk *pblk = container_of(rb, struct pblk, rwb);
 	unsigned int mem = READ_ONCE(rb->mem);
 
-	if (pblk_rb_sync_point_set(rb, NULL, mem))
+	if (pblk_rb_flush_point_set(rb, NULL, mem))
 		return;
 
 	pblk_write_should_kick(pblk);
@@ -440,7 +440,7 @@ static int pblk_rb_may_write_flush(struct pblk_rb *rb, unsigned int nr_entries,
 #ifdef CONFIG_NVM_DEBUG
 		atomic_long_inc(&pblk->nr_flush);
 #endif
-		if (pblk_rb_sync_point_set(&pblk->rwb, bio, mem))
+		if (pblk_rb_flush_point_set(&pblk->rwb, bio, mem))
 			*io_ret = NVM_IO_OK;
 	}
 
@@ -606,21 +606,6 @@ unsigned int pblk_rb_read_to_bio(struct pblk_rb *rb, struct nvm_rq *rqd,
 			return NVM_IO_ERR;
 		}
 
-		if (flags & PBLK_FLUSH_ENTRY) {
-			unsigned int sync_point;
-
-			sync_point = READ_ONCE(rb->sync_point);
-			if (sync_point == pos) {
-				/* Protect syncs */
-				smp_store_release(&rb->sync_point, EMPTY_ENTRY);
-			}
-
-			flags &= ~PBLK_FLUSH_ENTRY;
-#ifdef CONFIG_NVM_DEBUG
-			atomic_dec(&rb->inflight_sync_point);
-#endif
-		}
-
 		flags &= ~PBLK_WRITTEN_DATA;
 		flags |= PBLK_SUBMITTED_ENTRY;
 
@@ -730,15 +715,24 @@ void pblk_rb_sync_end(struct pblk_rb *rb, unsigned long *flags)
 
 unsigned int pblk_rb_sync_advance(struct pblk_rb *rb, unsigned int nr_entries)
 {
-	unsigned int sync;
-	unsigned int i;
-
+	unsigned int sync, flush_point;
 	lockdep_assert_held(&rb->s_lock);
 
 	sync = READ_ONCE(rb->sync);
+	flush_point = READ_ONCE(rb->flush_point);
 
-	for (i = 0; i < nr_entries; i++)
-		sync = (sync + 1) & (rb->nr_entries - 1);
+	if (flush_point != EMPTY_ENTRY) {
+		unsigned int secs_to_flush;
+
+		secs_to_flush = pblk_rb_ring_count(flush_point, sync,
+					rb->nr_entries);
+		if (secs_to_flush < nr_entries) {
+			/* Protect flush points */
+			smp_store_release(&rb->flush_point, EMPTY_ENTRY);
+		}
+	}
+
+	sync = (sync + nr_entries) & (rb->nr_entries - 1);
 
 	/* Protect from counts */
 	smp_store_release(&rb->sync, sync);
@@ -746,22 +740,27 @@ unsigned int pblk_rb_sync_advance(struct pblk_rb *rb, unsigned int nr_entries)
 	return sync;
 }
 
-unsigned int pblk_rb_sync_point_count(struct pblk_rb *rb)
+/* Calculate how many sectors to submit up to the current flush point. */
+unsigned int pblk_rb_flush_point_count(struct pblk_rb *rb)
 {
-	unsigned int subm, sync_point;
-	unsigned int count;
+	unsigned int subm, sync, flush_point;
+	unsigned int submitted, to_flush;
 
-	/* Protect syncs */
-	sync_point = smp_load_acquire(&rb->sync_point);
-	if (sync_point == EMPTY_ENTRY)
+	/* Protect flush points */
+	flush_point = smp_load_acquire(&rb->flush_point);
+	if (flush_point == EMPTY_ENTRY)
 		return 0;
 
+	/* Protect syncs */
+	sync = smp_load_acquire(&rb->sync);
+
 	subm = READ_ONCE(rb->subm);
+	submitted = pblk_rb_ring_count(subm, sync, rb->nr_entries);
 
 	/* The sync point itself counts as a sector to sync */
-	count = pblk_rb_ring_count(sync_point, subm, rb->nr_entries) + 1;
+	to_flush = pblk_rb_ring_count(flush_point, sync, rb->nr_entries) + 1;
 
-	return count;
+	return (submitted < to_flush) ? (to_flush - submitted) : 0;
 }
 
 /*
@@ -801,7 +800,7 @@ int pblk_rb_tear_down_check(struct pblk_rb *rb)
 
 	if ((rb->mem == rb->subm) && (rb->subm == rb->sync) &&
 				(rb->sync == rb->l2p_update) &&
-				(rb->sync_point == EMPTY_ENTRY)) {
+				(rb->flush_point == EMPTY_ENTRY)) {
 		goto out;
 	}
 
@@ -848,7 +847,7 @@ ssize_t pblk_rb_sysfs(struct pblk_rb *rb, char *buf)
 		queued_entries++;
 	spin_unlock_irq(&rb->s_lock);
 
-	if (rb->sync_point != EMPTY_ENTRY)
+	if (rb->flush_point != EMPTY_ENTRY)
 		offset = scnprintf(buf, PAGE_SIZE,
 			"%u\t%u\t%u\t%u\t%u\t%u\t%u - %u/%u/%u - %d\n",
 			rb->nr_entries,
@@ -857,14 +856,14 @@ ssize_t pblk_rb_sysfs(struct pblk_rb *rb, char *buf)
 			rb->sync,
 			rb->l2p_update,
 #ifdef CONFIG_NVM_DEBUG
-			atomic_read(&rb->inflight_sync_point),
+			atomic_read(&rb->inflight_flush_point),
 #else
 			0,
 #endif
-			rb->sync_point,
+			rb->flush_point,
 			pblk_rb_read_count(rb),
 			pblk_rb_space(rb),
-			pblk_rb_sync_point_count(rb),
+			pblk_rb_flush_point_count(rb),
 			queued_entries);
 	else
 		offset = scnprintf(buf, PAGE_SIZE,
@@ -875,13 +874,13 @@ ssize_t pblk_rb_sysfs(struct pblk_rb *rb, char *buf)
 			rb->sync,
 			rb->l2p_update,
 #ifdef CONFIG_NVM_DEBUG
-			atomic_read(&rb->inflight_sync_point),
+			atomic_read(&rb->inflight_flush_point),
 #else
 			0,
 #endif
 			pblk_rb_read_count(rb),
 			pblk_rb_space(rb),
-			pblk_rb_sync_point_count(rb),
+			pblk_rb_flush_point_count(rb),
 			queued_entries);
 
 	return offset;
diff --git a/drivers/lightnvm/pblk-read.c b/drivers/lightnvm/pblk-read.c
index ca79d8f..2f76128 100644
--- a/drivers/lightnvm/pblk-read.c
+++ b/drivers/lightnvm/pblk-read.c
@@ -141,7 +141,7 @@ static void pblk_read_put_rqd_kref(struct pblk *pblk, struct nvm_rq *rqd)
 		struct ppa_addr ppa = ppa_list[i];
 		struct pblk_line *line;
 
-		line = &pblk->lines[pblk_dev_ppa_to_line(ppa)];
+		line = &pblk->lines[pblk_ppa_to_line(ppa)];
 		kref_put(&line->ref, pblk_line_put_wq);
 	}
 }
@@ -158,8 +158,12 @@ static void pblk_end_user_read(struct bio *bio)
 static void __pblk_end_io_read(struct pblk *pblk, struct nvm_rq *rqd,
 			       bool put_line)
 {
+	struct nvm_tgt_dev *dev = pblk->dev;
 	struct pblk_g_ctx *r_ctx = nvm_rq_to_pdu(rqd);
 	struct bio *bio = rqd->bio;
+	unsigned long start_time = r_ctx->start_time;
+
+	generic_end_io_acct(dev->q, READ, &pblk->disk->part0, start_time);
 
 	if (rqd->error)
 		pblk_log_read_err(pblk, rqd);
@@ -193,9 +197,9 @@ static void pblk_end_io_read(struct nvm_rq *rqd)
 	__pblk_end_io_read(pblk, rqd, true);
 }
 
-static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
-				      unsigned int bio_init_idx,
-				      unsigned long *read_bitmap)
+static int pblk_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
+				 unsigned int bio_init_idx,
+				 unsigned long *read_bitmap)
 {
 	struct bio *new_bio, *bio = rqd->bio;
 	struct pblk_sec_meta *meta_list = rqd->meta_list;
@@ -270,7 +274,7 @@ static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
 	i = 0;
 	hole = find_first_zero_bit(read_bitmap, nr_secs);
 	do {
-		int line_id = pblk_dev_ppa_to_line(rqd->ppa_list[i]);
+		int line_id = pblk_ppa_to_line(rqd->ppa_list[i]);
 		struct pblk_line *line = &pblk->lines[line_id];
 
 		kref_put(&line->ref, pblk_line_put);
@@ -306,6 +310,8 @@ static int pblk_fill_partial_read_bio(struct pblk *pblk, struct nvm_rq *rqd,
 	return NVM_IO_OK;
 
 err:
+	pr_err("pblk: failed to perform partial read\n");
+
 	/* Free allocated pages in new bio */
 	pblk_bio_free_pages(pblk, bio, 0, new_bio->bi_vcnt);
 	__pblk_end_io_read(pblk, rqd, false);
@@ -357,6 +363,7 @@ static void pblk_read_rq(struct pblk *pblk, struct nvm_rq *rqd,
 int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 {
 	struct nvm_tgt_dev *dev = pblk->dev;
+	struct request_queue *q = dev->q;
 	sector_t blba = pblk_get_lba(bio);
 	unsigned int nr_secs = pblk_get_secs(bio);
 	struct pblk_g_ctx *r_ctx;
@@ -372,6 +379,8 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 		return NVM_IO_ERR;
 	}
 
+	generic_start_io_acct(q, READ, bio_sectors(bio), &pblk->disk->part0);
+
 	bitmap_zero(&read_bitmap, nr_secs);
 
 	rqd = pblk_alloc_rqd(pblk, PBLK_READ);
@@ -383,6 +392,7 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 	rqd->end_io = pblk_end_io_read;
 
 	r_ctx = nvm_rq_to_pdu(rqd);
+	r_ctx->start_time = jiffies;
 	r_ctx->lba = blba;
 
 	/* Save the index for this bio's start. This is needed in case
@@ -422,7 +432,7 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 		int_bio = bio_clone_fast(bio, GFP_KERNEL, pblk_bio_set);
 		if (!int_bio) {
 			pr_err("pblk: could not clone read bio\n");
-			return NVM_IO_ERR;
+			goto fail_end_io;
 		}
 
 		rqd->bio = int_bio;
@@ -433,7 +443,7 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 			pr_err("pblk: read IO submission failed\n");
 			if (int_bio)
 				bio_put(int_bio);
-			return ret;
+			goto fail_end_io;
 		}
 
 		return NVM_IO_OK;
@@ -442,17 +452,14 @@ int pblk_submit_read(struct pblk *pblk, struct bio *bio)
 	/* The read bio request could be partially filled by the write buffer,
 	 * but there are some holes that need to be read from the drive.
 	 */
-	ret = pblk_fill_partial_read_bio(pblk, rqd, bio_init_idx, &read_bitmap);
-	if (ret) {
-		pr_err("pblk: failed to perform partial read\n");
-		return ret;
-	}
-
-	return NVM_IO_OK;
+	return pblk_partial_read_bio(pblk, rqd, bio_init_idx, &read_bitmap);
 
 fail_rqd_free:
 	pblk_free_rqd(pblk, rqd, PBLK_READ);
 	return ret;
+fail_end_io:
+	__pblk_end_io_read(pblk, rqd, false);
+	return ret;
 }
 
 static int read_ppalist_rq_gc(struct pblk *pblk, struct nvm_rq *rqd,
diff --git a/drivers/lightnvm/pblk-recovery.c b/drivers/lightnvm/pblk-recovery.c
index eadb3eb..1d5e961 100644
--- a/drivers/lightnvm/pblk-recovery.c
+++ b/drivers/lightnvm/pblk-recovery.c
@@ -111,18 +111,18 @@ int pblk_recov_setup_rq(struct pblk *pblk, struct pblk_c_ctx *c_ctx,
 	return 0;
 }
 
-__le64 *pblk_recov_get_lba_list(struct pblk *pblk, struct line_emeta *emeta_buf)
+int pblk_recov_check_emeta(struct pblk *pblk, struct line_emeta *emeta_buf)
 {
 	u32 crc;
 
 	crc = pblk_calc_emeta_crc(pblk, emeta_buf);
 	if (le32_to_cpu(emeta_buf->crc) != crc)
-		return NULL;
+		return 1;
 
 	if (le32_to_cpu(emeta_buf->header.identifier) != PBLK_MAGIC)
-		return NULL;
+		return 1;
 
-	return emeta_to_lbas(pblk, emeta_buf);
+	return 0;
 }
 
 static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line)
@@ -137,7 +137,7 @@ static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line)
 	u64 nr_valid_lbas, nr_lbas = 0;
 	u64 i;
 
-	lba_list = pblk_recov_get_lba_list(pblk, emeta_buf);
+	lba_list = emeta_to_lbas(pblk, emeta_buf);
 	if (!lba_list)
 		return 1;
 
@@ -149,7 +149,7 @@ static int pblk_recov_l2p_from_emeta(struct pblk *pblk, struct pblk_line *line)
 		struct ppa_addr ppa;
 		int pos;
 
-		ppa = addr_to_pblk_ppa(pblk, i, line->id);
+		ppa = addr_to_gen_ppa(pblk, i, line->id);
 		pos = pblk_ppa_to_pos(geo, ppa);
 
 		/* Do not update bad blocks */
@@ -188,7 +188,7 @@ static int pblk_calc_sec_in_line(struct pblk *pblk, struct pblk_line *line)
 	int nr_bb = bitmap_weight(line->blk_bitmap, lm->blk_per_line);
 
 	return lm->sec_per_line - lm->smeta_sec - lm->emeta_sec[0] -
-				nr_bb * geo->sec_per_blk;
+				nr_bb * geo->sec_per_chk;
 }
 
 struct pblk_recov_alloc {
@@ -263,12 +263,12 @@ static int pblk_recov_read_oob(struct pblk *pblk, struct pblk_line *line,
 		int pos;
 
 		ppa = addr_to_gen_ppa(pblk, r_ptr_int, line->id);
-		pos = pblk_dev_ppa_to_pos(geo, ppa);
+		pos = pblk_ppa_to_pos(geo, ppa);
 
 		while (test_bit(pos, line->blk_bitmap)) {
 			r_ptr_int += pblk->min_write_pgs;
 			ppa = addr_to_gen_ppa(pblk, r_ptr_int, line->id);
-			pos = pblk_dev_ppa_to_pos(geo, ppa);
+			pos = pblk_ppa_to_pos(geo, ppa);
 		}
 
 		for (j = 0; j < pblk->min_write_pgs; j++, i++, r_ptr_int++)
@@ -288,7 +288,7 @@ static int pblk_recov_read_oob(struct pblk *pblk, struct pblk_line *line,
 	/* At this point, the read should not fail. If it does, it is a problem
 	 * we cannot recover from here. Need FTL log.
 	 */
-	if (rqd->error) {
+	if (rqd->error && rqd->error != NVM_RSP_WARN_HIGHECC) {
 		pr_err("pblk: L2P recovery failed (%d)\n", rqd->error);
 		return -EINTR;
 	}
@@ -411,12 +411,12 @@ static int pblk_recov_pad_oob(struct pblk *pblk, struct pblk_line *line,
 		int pos;
 
 		w_ptr = pblk_alloc_page(pblk, line, pblk->min_write_pgs);
-		ppa = addr_to_pblk_ppa(pblk, w_ptr, line->id);
+		ppa = addr_to_gen_ppa(pblk, w_ptr, line->id);
 		pos = pblk_ppa_to_pos(geo, ppa);
 
 		while (test_bit(pos, line->blk_bitmap)) {
 			w_ptr += pblk->min_write_pgs;
-			ppa = addr_to_pblk_ppa(pblk, w_ptr, line->id);
+			ppa = addr_to_gen_ppa(pblk, w_ptr, line->id);
 			pos = pblk_ppa_to_pos(geo, ppa);
 		}
 
@@ -541,12 +541,12 @@ static int pblk_recov_scan_all_oob(struct pblk *pblk, struct pblk_line *line,
 
 		w_ptr = pblk_alloc_page(pblk, line, pblk->min_write_pgs);
 		ppa = addr_to_gen_ppa(pblk, w_ptr, line->id);
-		pos = pblk_dev_ppa_to_pos(geo, ppa);
+		pos = pblk_ppa_to_pos(geo, ppa);
 
 		while (test_bit(pos, line->blk_bitmap)) {
 			w_ptr += pblk->min_write_pgs;
 			ppa = addr_to_gen_ppa(pblk, w_ptr, line->id);
-			pos = pblk_dev_ppa_to_pos(geo, ppa);
+			pos = pblk_ppa_to_pos(geo, ppa);
 		}
 
 		for (j = 0; j < pblk->min_write_pgs; j++, i++, w_ptr++)
@@ -672,12 +672,12 @@ static int pblk_recov_scan_oob(struct pblk *pblk, struct pblk_line *line,
 
 		paddr = pblk_alloc_page(pblk, line, pblk->min_write_pgs);
 		ppa = addr_to_gen_ppa(pblk, paddr, line->id);
-		pos = pblk_dev_ppa_to_pos(geo, ppa);
+		pos = pblk_ppa_to_pos(geo, ppa);
 
 		while (test_bit(pos, line->blk_bitmap)) {
 			paddr += pblk->min_write_pgs;
 			ppa = addr_to_gen_ppa(pblk, paddr, line->id);
-			pos = pblk_dev_ppa_to_pos(geo, ppa);
+			pos = pblk_ppa_to_pos(geo, ppa);
 		}
 
 		for (j = 0; j < pblk->min_write_pgs; j++, i++, paddr++)
@@ -817,7 +817,7 @@ static u64 pblk_line_emeta_start(struct pblk *pblk, struct pblk_line *line)
 
 	while (emeta_secs) {
 		emeta_start--;
-		ppa = addr_to_pblk_ppa(pblk, emeta_start, line->id);
+		ppa = addr_to_gen_ppa(pblk, emeta_start, line->id);
 		pos = pblk_ppa_to_pos(geo, ppa);
 		if (!test_bit(pos, line->blk_bitmap))
 			emeta_secs--;
@@ -938,6 +938,11 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
 			goto next;
 		}
 
+		if (pblk_recov_check_emeta(pblk, line->emeta->buf)) {
+			pblk_recov_l2p_from_oob(pblk, line);
+			goto next;
+		}
+
 		if (pblk_recov_l2p_from_emeta(pblk, line))
 			pblk_recov_l2p_from_oob(pblk, line);
 
@@ -984,10 +989,8 @@ struct pblk_line *pblk_recov_l2p(struct pblk *pblk)
 	}
 	spin_unlock(&l_mg->free_lock);
 
-	if (is_next) {
+	if (is_next)
 		pblk_line_erase(pblk, l_mg->data_next);
-		pblk_rl_free_lines_dec(&pblk->rl, l_mg->data_next);
-	}
 
 out:
 	if (found_lines != recovered_lines)
diff --git a/drivers/lightnvm/pblk-rl.c b/drivers/lightnvm/pblk-rl.c
index dacc719..0d457b1 100644
--- a/drivers/lightnvm/pblk-rl.c
+++ b/drivers/lightnvm/pblk-rl.c
@@ -89,17 +89,15 @@ unsigned long pblk_rl_nr_free_blks(struct pblk_rl *rl)
 	return atomic_read(&rl->free_blocks);
 }
 
-/*
- * We check for (i) the number of free blocks in the current LUN and (ii) the
- * total number of free blocks in the pblk instance. This is to even out the
- * number of free blocks on each LUN when GC kicks in.
- *
- * Only the total number of free blocks is used to configure the rate limiter.
- */
-void pblk_rl_update_rates(struct pblk_rl *rl)
+unsigned long pblk_rl_nr_user_free_blks(struct pblk_rl *rl)
+{
+	return atomic_read(&rl->free_user_blocks);
+}
+
+static void __pblk_rl_update_rates(struct pblk_rl *rl,
+				   unsigned long free_blocks)
 {
 	struct pblk *pblk = container_of(rl, struct pblk, rl);
-	unsigned long free_blocks = pblk_rl_nr_free_blks(rl);
 	int max = rl->rb_budget;
 
 	if (free_blocks >= rl->high) {
@@ -132,20 +130,37 @@ void pblk_rl_update_rates(struct pblk_rl *rl)
 		pblk_gc_should_stop(pblk);
 }
 
+void pblk_rl_update_rates(struct pblk_rl *rl)
+{
+	__pblk_rl_update_rates(rl, pblk_rl_nr_user_free_blks(rl));
+}
+
 void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line)
 {
 	int blk_in_line = atomic_read(&line->blk_in_line);
+	int free_blocks;
 
 	atomic_add(blk_in_line, &rl->free_blocks);
-	pblk_rl_update_rates(rl);
+	free_blocks = atomic_add_return(blk_in_line, &rl->free_user_blocks);
+
+	__pblk_rl_update_rates(rl, free_blocks);
 }
 
-void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line)
+void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line,
+			    bool used)
 {
 	int blk_in_line = atomic_read(&line->blk_in_line);
+	int free_blocks;
 
 	atomic_sub(blk_in_line, &rl->free_blocks);
-	pblk_rl_update_rates(rl);
+
+	if (used)
+		free_blocks = atomic_sub_return(blk_in_line,
+							&rl->free_user_blocks);
+	else
+		free_blocks = atomic_read(&rl->free_user_blocks);
+
+	__pblk_rl_update_rates(rl, free_blocks);
 }
 
 int pblk_rl_high_thrs(struct pblk_rl *rl)
@@ -174,16 +189,21 @@ void pblk_rl_free(struct pblk_rl *rl)
 void pblk_rl_init(struct pblk_rl *rl, int budget)
 {
 	struct pblk *pblk = container_of(rl, struct pblk, rl);
+	struct nvm_tgt_dev *dev = pblk->dev;
+	struct nvm_geo *geo = &dev->geo;
+	struct pblk_line_mgmt *l_mg = &pblk->l_mg;
 	struct pblk_line_meta *lm = &pblk->lm;
 	int min_blocks = lm->blk_per_line * PBLK_GC_RSV_LINE;
+	int sec_meta, blk_meta;
+
 	unsigned int rb_windows;
 
-	rl->high = rl->total_blocks / PBLK_USER_HIGH_THRS;
-	rl->high_pw = get_count_order(rl->high);
+	/* Consider sectors used for metadata */
+	sec_meta = (lm->smeta_sec + lm->emeta_sec[0]) * l_mg->nr_free_lines;
+	blk_meta = DIV_ROUND_UP(sec_meta, geo->sec_per_chk);
 
-	rl->low = rl->total_blocks / PBLK_USER_LOW_THRS;
-	if (rl->low < min_blocks)
-		rl->low = min_blocks;
+	rl->high = pblk->op_blks - blk_meta - lm->blk_per_line;
+	rl->high_pw = get_count_order(rl->high);
 
 	rl->rsv_blocks = min_blocks;
 
diff --git a/drivers/lightnvm/pblk-sysfs.c b/drivers/lightnvm/pblk-sysfs.c
index cd49e88..620bab8 100644
--- a/drivers/lightnvm/pblk-sysfs.c
+++ b/drivers/lightnvm/pblk-sysfs.c
@@ -28,7 +28,7 @@ static ssize_t pblk_sysfs_luns_show(struct pblk *pblk, char *page)
 	ssize_t sz = 0;
 	int i;
 
-	for (i = 0; i < geo->nr_luns; i++) {
+	for (i = 0; i < geo->all_luns; i++) {
 		int active = 1;
 
 		rlun = &pblk->luns[i];
@@ -49,11 +49,12 @@ static ssize_t pblk_sysfs_luns_show(struct pblk *pblk, char *page)
 
 static ssize_t pblk_sysfs_rate_limiter(struct pblk *pblk, char *page)
 {
-	int free_blocks, total_blocks;
+	int free_blocks, free_user_blocks, total_blocks;
 	int rb_user_max, rb_user_cnt;
 	int rb_gc_max, rb_gc_cnt, rb_budget, rb_state;
 
-	free_blocks = atomic_read(&pblk->rl.free_blocks);
+	free_blocks = pblk_rl_nr_free_blks(&pblk->rl);
+	free_user_blocks = pblk_rl_nr_user_free_blks(&pblk->rl);
 	rb_user_max = pblk->rl.rb_user_max;
 	rb_user_cnt = atomic_read(&pblk->rl.rb_user_cnt);
 	rb_gc_max = pblk->rl.rb_gc_max;
@@ -64,16 +65,16 @@ static ssize_t pblk_sysfs_rate_limiter(struct pblk *pblk, char *page)
 	total_blocks = pblk->rl.total_blocks;
 
 	return snprintf(page, PAGE_SIZE,
-		"u:%u/%u,gc:%u/%u(%u/%u)(stop:<%u,full:>%u,free:%d/%d)-%d\n",
+		"u:%u/%u,gc:%u/%u(%u)(stop:<%u,full:>%u,free:%d/%d/%d)-%d\n",
 				rb_user_cnt,
 				rb_user_max,
 				rb_gc_cnt,
 				rb_gc_max,
 				rb_state,
 				rb_budget,
-				pblk->rl.low,
 				pblk->rl.high,
 				free_blocks,
+				free_user_blocks,
 				total_blocks,
 				READ_ONCE(pblk->rl.rb_user_active));
 }
@@ -238,7 +239,7 @@ static ssize_t pblk_sysfs_lines(struct pblk *pblk, char *page)
 
 	sz = snprintf(page, PAGE_SIZE - sz,
 		"line: nluns:%d, nblks:%d, nsecs:%d\n",
-		geo->nr_luns, lm->blk_per_line, lm->sec_per_line);
+		geo->all_luns, lm->blk_per_line, lm->sec_per_line);
 
 	sz += snprintf(page + sz, PAGE_SIZE - sz,
 		"lines:d:%d,l:%d-f:%d,m:%d/%d,c:%d,b:%d,co:%d(d:%d,l:%d)t:%d\n",
@@ -287,7 +288,7 @@ static ssize_t pblk_sysfs_lines_info(struct pblk *pblk, char *page)
 				"blk_line:%d, sec_line:%d, sec_blk:%d\n",
 					lm->blk_per_line,
 					lm->sec_per_line,
-					geo->sec_per_blk);
+					geo->sec_per_chk);
 
 	return sz;
 }
diff --git a/drivers/lightnvm/pblk-write.c b/drivers/lightnvm/pblk-write.c
index 6c1cafa..aae86ed 100644
--- a/drivers/lightnvm/pblk-write.c
+++ b/drivers/lightnvm/pblk-write.c
@@ -21,13 +21,28 @@ static unsigned long pblk_end_w_bio(struct pblk *pblk, struct nvm_rq *rqd,
 				    struct pblk_c_ctx *c_ctx)
 {
 	struct bio *original_bio;
+	struct pblk_rb *rwb = &pblk->rwb;
 	unsigned long ret;
 	int i;
 
 	for (i = 0; i < c_ctx->nr_valid; i++) {
 		struct pblk_w_ctx *w_ctx;
+		int pos = c_ctx->sentry + i;
+		int flags;
 
-		w_ctx = pblk_rb_w_ctx(&pblk->rwb, c_ctx->sentry + i);
+		w_ctx = pblk_rb_w_ctx(rwb, pos);
+		flags = READ_ONCE(w_ctx->flags);
+
+		if (flags & PBLK_FLUSH_ENTRY) {
+			flags &= ~PBLK_FLUSH_ENTRY;
+			/* Release flags on context. Protect from writes */
+			smp_store_release(&w_ctx->flags, flags);
+
+#ifdef CONFIG_NVM_DEBUG
+			atomic_dec(&rwb->inflight_flush_point);
+#endif
+		}
+
 		while ((original_bio = bio_list_pop(&w_ctx->bios)))
 			bio_endio(original_bio);
 	}
@@ -439,7 +454,7 @@ static int pblk_submit_io_set(struct pblk *pblk, struct nvm_rq *rqd)
 	struct pblk_line *meta_line;
 	int err;
 
-	ppa_set_empty(&erase_ppa);
+	pblk_ppa_set_empty(&erase_ppa);
 
 	/* Assign lbas to ppas and populate request structure */
 	err = pblk_setup_w_rq(pblk, rqd, &erase_ppa);
@@ -457,7 +472,7 @@ static int pblk_submit_io_set(struct pblk *pblk, struct nvm_rq *rqd)
 		return NVM_IO_ERR;
 	}
 
-	if (!ppa_empty(erase_ppa)) {
+	if (!pblk_ppa_empty(erase_ppa)) {
 		/* Submit erase for next data line */
 		if (pblk_blk_erase_async(pblk, erase_ppa)) {
 			struct pblk_line *e_line = pblk_line_get_erase(pblk);
@@ -508,7 +523,7 @@ static int pblk_submit_write(struct pblk *pblk)
 	if (!secs_avail)
 		return 1;
 
-	secs_to_flush = pblk_rb_sync_point_count(&pblk->rwb);
+	secs_to_flush = pblk_rb_flush_point_count(&pblk->rwb);
 	if (!secs_to_flush && secs_avail < pblk->min_write_pgs)
 		return 1;
 
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 59a64d4..8c357fb 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -51,17 +51,16 @@
 
 #define NR_PHY_IN_LOG (PBLK_EXPOSED_PAGE_SIZE / PBLK_SECTOR)
 
-#define pblk_for_each_lun(pblk, rlun, i) \
-		for ((i) = 0, rlun = &(pblk)->luns[0]; \
-			(i) < (pblk)->nr_luns; (i)++, rlun = &(pblk)->luns[(i)])
-
 /* Static pool sizes */
 #define PBLK_GEN_WS_POOL_SIZE (2)
 
+#define PBLK_DEFAULT_OP (11)
+
 enum {
 	PBLK_READ		= READ,
 	PBLK_WRITE		= WRITE,/* Write from write buffer */
 	PBLK_WRITE_INT,			/* Internal write - no write buffer */
+	PBLK_READ_RECOV,		/* Recovery read - errors allowed */
 	PBLK_ERASE,
 };
 
@@ -114,6 +113,7 @@ struct pblk_c_ctx {
 /* read context */
 struct pblk_g_ctx {
 	void *private;
+	unsigned long start_time;
 	u64 lba;
 };
 
@@ -170,7 +170,7 @@ struct pblk_rb {
 					 * the last submitted entry that has
 					 * been successfully persisted to media
 					 */
-	unsigned int sync_point;	/* Sync point - last entry that must be
+	unsigned int flush_point;	/* Sync point - last entry that must be
 					 * flushed to the media. Used with
 					 * REQ_FLUSH and REQ_FUA
 					 */
@@ -193,7 +193,7 @@ struct pblk_rb {
 	spinlock_t s_lock;		/* Sync lock */
 
 #ifdef CONFIG_NVM_DEBUG
-	atomic_t inflight_sync_point;	/* Not served REQ_FLUSH | REQ_FUA */
+	atomic_t inflight_flush_point;	/* Not served REQ_FLUSH | REQ_FUA */
 #endif
 };
 
@@ -256,9 +256,6 @@ struct pblk_rl {
 	unsigned int high;	/* Upper threshold for rate limiter (free run -
 				 * user I/O rate limiter
 				 */
-	unsigned int low;	/* Lower threshold for rate limiter (user I/O
-				 * rate limiter - stall)
-				 */
 	unsigned int high_pw;	/* High rounded up as a power of 2 */
 
 #define PBLK_USER_HIGH_THRS 8	/* Begin write limit at 12% available blks */
@@ -292,7 +289,9 @@ struct pblk_rl {
 
 	unsigned long long nr_secs;
 	unsigned long total_blocks;
-	atomic_t free_blocks;
+
+	atomic_t free_blocks;		/* Total number of free blocks (+ OP) */
+	atomic_t free_user_blocks;	/* Number of user free blocks (no OP) */
 };
 
 #define PBLK_LINE_EMPTY (~0U)
@@ -583,7 +582,9 @@ struct pblk {
 			    */
 
 	sector_t capacity; /* Device capacity when bad blocks are subtracted */
-	int over_pct;      /* Percentage of device used for over-provisioning */
+
+	int op;      /* Percentage of device used for over-provisioning */
+	int op_blks; /* Number of blocks used for over-provisioning */
 
 	/* pblk provisioning values. Used by rate limiter */
 	struct pblk_rl rl;
@@ -691,7 +692,7 @@ unsigned int pblk_rb_sync_advance(struct pblk_rb *rb, unsigned int nr_entries);
 struct pblk_rb_entry *pblk_rb_sync_scan_entry(struct pblk_rb *rb,
 					      struct ppa_addr *ppa);
 void pblk_rb_sync_end(struct pblk_rb *rb, unsigned long *flags);
-unsigned int pblk_rb_sync_point_count(struct pblk_rb *rb);
+unsigned int pblk_rb_flush_point_count(struct pblk_rb *rb);
 
 unsigned int pblk_rb_read_count(struct pblk_rb *rb);
 unsigned int pblk_rb_sync_count(struct pblk_rb *rb);
@@ -812,7 +813,7 @@ int pblk_submit_read_gc(struct pblk *pblk, struct pblk_gc_rq *gc_rq);
 void pblk_submit_rec(struct work_struct *work);
 struct pblk_line *pblk_recov_l2p(struct pblk *pblk);
 int pblk_recov_pad(struct pblk *pblk);
-__le64 *pblk_recov_get_lba_list(struct pblk *pblk, struct line_emeta *emeta);
+int pblk_recov_check_emeta(struct pblk *pblk, struct line_emeta *emeta);
 int pblk_recov_setup_rq(struct pblk *pblk, struct pblk_c_ctx *c_ctx,
 			struct pblk_rec_ctx *recovery, u64 *comp_bits,
 			unsigned int comp);
@@ -843,6 +844,7 @@ void pblk_rl_free(struct pblk_rl *rl);
 void pblk_rl_update_rates(struct pblk_rl *rl);
 int pblk_rl_high_thrs(struct pblk_rl *rl);
 unsigned long pblk_rl_nr_free_blks(struct pblk_rl *rl);
+unsigned long pblk_rl_nr_user_free_blks(struct pblk_rl *rl);
 int pblk_rl_user_may_insert(struct pblk_rl *rl, int nr_entries);
 void pblk_rl_inserted(struct pblk_rl *rl, int nr_entries);
 void pblk_rl_user_in(struct pblk_rl *rl, int nr_entries);
@@ -851,7 +853,8 @@ void pblk_rl_gc_in(struct pblk_rl *rl, int nr_entries);
 void pblk_rl_out(struct pblk_rl *rl, int nr_user, int nr_gc);
 int pblk_rl_max_io(struct pblk_rl *rl);
 void pblk_rl_free_lines_inc(struct pblk_rl *rl, struct pblk_line *line);
-void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line);
+void pblk_rl_free_lines_dec(struct pblk_rl *rl, struct pblk_line *line,
+			    bool used);
 int pblk_rl_is_limit(struct pblk_rl *rl);
 
 /*
@@ -907,15 +910,10 @@ static inline int pblk_pad_distance(struct pblk *pblk)
 	struct nvm_tgt_dev *dev = pblk->dev;
 	struct nvm_geo *geo = &dev->geo;
 
-	return NVM_MEM_PAGE_WRITE * geo->nr_luns * geo->sec_per_pl;
+	return NVM_MEM_PAGE_WRITE * geo->all_luns * geo->sec_per_pl;
 }
 
-static inline int pblk_dev_ppa_to_line(struct ppa_addr p)
-{
-	return p.g.blk;
-}
-
-static inline int pblk_tgt_ppa_to_line(struct ppa_addr p)
+static inline int pblk_ppa_to_line(struct ppa_addr p)
 {
 	return p.g.blk;
 }
@@ -925,10 +923,34 @@ static inline int pblk_ppa_to_pos(struct nvm_geo *geo, struct ppa_addr p)
 	return p.g.lun * geo->nr_chnls + p.g.ch;
 }
 
-/* A block within a line corresponds to the lun */
-static inline int pblk_dev_ppa_to_pos(struct nvm_geo *geo, struct ppa_addr p)
+static inline struct ppa_addr addr_to_gen_ppa(struct pblk *pblk, u64 paddr,
+					      u64 line_id)
 {
-	return p.g.lun * geo->nr_chnls + p.g.ch;
+	struct ppa_addr ppa;
+
+	ppa.ppa = 0;
+	ppa.g.blk = line_id;
+	ppa.g.pg = (paddr & pblk->ppaf.pg_mask) >> pblk->ppaf.pg_offset;
+	ppa.g.lun = (paddr & pblk->ppaf.lun_mask) >> pblk->ppaf.lun_offset;
+	ppa.g.ch = (paddr & pblk->ppaf.ch_mask) >> pblk->ppaf.ch_offset;
+	ppa.g.pl = (paddr & pblk->ppaf.pln_mask) >> pblk->ppaf.pln_offset;
+	ppa.g.sec = (paddr & pblk->ppaf.sec_mask) >> pblk->ppaf.sec_offset;
+
+	return ppa;
+}
+
+static inline u64 pblk_dev_ppa_to_line_addr(struct pblk *pblk,
+							struct ppa_addr p)
+{
+	u64 paddr;
+
+	paddr = (u64)p.g.pg << pblk->ppaf.pg_offset;
+	paddr |= (u64)p.g.lun << pblk->ppaf.lun_offset;
+	paddr |= (u64)p.g.ch << pblk->ppaf.ch_offset;
+	paddr |= (u64)p.g.pl << pblk->ppaf.pln_offset;
+	paddr |= (u64)p.g.sec << pblk->ppaf.sec_offset;
+
+	return paddr;
 }
 
 static inline struct ppa_addr pblk_ppa32_to_ppa64(struct pblk *pblk, u32 ppa32)
@@ -960,24 +982,6 @@ static inline struct ppa_addr pblk_ppa32_to_ppa64(struct pblk *pblk, u32 ppa32)
 	return ppa64;
 }
 
-static inline struct ppa_addr pblk_trans_map_get(struct pblk *pblk,
-								sector_t lba)
-{
-	struct ppa_addr ppa;
-
-	if (pblk->ppaf_bitsize < 32) {
-		u32 *map = (u32 *)pblk->trans_map;
-
-		ppa = pblk_ppa32_to_ppa64(pblk, map[lba]);
-	} else {
-		struct ppa_addr *map = (struct ppa_addr *)pblk->trans_map;
-
-		ppa = map[lba];
-	}
-
-	return ppa;
-}
-
 static inline u32 pblk_ppa64_to_ppa32(struct pblk *pblk, struct ppa_addr ppa64)
 {
 	u32 ppa32 = 0;
@@ -999,6 +1003,24 @@ static inline u32 pblk_ppa64_to_ppa32(struct pblk *pblk, struct ppa_addr ppa64)
 	return ppa32;
 }
 
+static inline struct ppa_addr pblk_trans_map_get(struct pblk *pblk,
+								sector_t lba)
+{
+	struct ppa_addr ppa;
+
+	if (pblk->ppaf_bitsize < 32) {
+		u32 *map = (u32 *)pblk->trans_map;
+
+		ppa = pblk_ppa32_to_ppa64(pblk, map[lba]);
+	} else {
+		struct ppa_addr *map = (struct ppa_addr *)pblk->trans_map;
+
+		ppa = map[lba];
+	}
+
+	return ppa;
+}
+
 static inline void pblk_trans_map_set(struct pblk *pblk, sector_t lba,
 						struct ppa_addr ppa)
 {
@@ -1013,21 +1035,6 @@ static inline void pblk_trans_map_set(struct pblk *pblk, sector_t lba,
 	}
 }
 
-static inline u64 pblk_dev_ppa_to_line_addr(struct pblk *pblk,
-							struct ppa_addr p)
-{
-	u64 paddr;
-
-	paddr = 0;
-	paddr |= (u64)p.g.pg << pblk->ppaf.pg_offset;
-	paddr |= (u64)p.g.lun << pblk->ppaf.lun_offset;
-	paddr |= (u64)p.g.ch << pblk->ppaf.ch_offset;
-	paddr |= (u64)p.g.pl << pblk->ppaf.pln_offset;
-	paddr |= (u64)p.g.sec << pblk->ppaf.sec_offset;
-
-	return paddr;
-}
-
 static inline int pblk_ppa_empty(struct ppa_addr ppa_addr)
 {
 	return (ppa_addr.ppa == ADDR_EMPTY);
@@ -1040,10 +1047,7 @@ static inline void pblk_ppa_set_empty(struct ppa_addr *ppa_addr)
 
 static inline bool pblk_ppa_comp(struct ppa_addr lppa, struct ppa_addr rppa)
 {
-	if (lppa.ppa == rppa.ppa)
-		return true;
-
-	return false;
+	return (lppa.ppa == rppa.ppa);
 }
 
 static inline int pblk_addr_in_cache(struct ppa_addr ppa)
@@ -1066,32 +1070,6 @@ static inline struct ppa_addr pblk_cacheline_to_addr(int addr)
 	return p;
 }
 
-static inline struct ppa_addr addr_to_gen_ppa(struct pblk *pblk, u64 paddr,
-					      u64 line_id)
-{
-	struct ppa_addr ppa;
-
-	ppa.ppa = 0;
-	ppa.g.blk = line_id;
-	ppa.g.pg = (paddr & pblk->ppaf.pg_mask) >> pblk->ppaf.pg_offset;
-	ppa.g.lun = (paddr & pblk->ppaf.lun_mask) >> pblk->ppaf.lun_offset;
-	ppa.g.ch = (paddr & pblk->ppaf.ch_mask) >> pblk->ppaf.ch_offset;
-	ppa.g.pl = (paddr & pblk->ppaf.pln_mask) >> pblk->ppaf.pln_offset;
-	ppa.g.sec = (paddr & pblk->ppaf.sec_mask) >> pblk->ppaf.sec_offset;
-
-	return ppa;
-}
-
-static inline struct ppa_addr addr_to_pblk_ppa(struct pblk *pblk, u64 paddr,
-					 u64 line_id)
-{
-	struct ppa_addr ppa;
-
-	ppa = addr_to_gen_ppa(pblk, paddr, line_id);
-
-	return ppa;
-}
-
 static inline u32 pblk_calc_meta_header_crc(struct pblk *pblk,
 					    struct line_header *header)
 {
@@ -1212,10 +1190,10 @@ static inline int pblk_boundary_ppa_checks(struct nvm_tgt_dev *tgt_dev,
 
 		if (!ppa->c.is_cached &&
 				ppa->g.ch < geo->nr_chnls &&
-				ppa->g.lun < geo->luns_per_chnl &&
+				ppa->g.lun < geo->nr_luns &&
 				ppa->g.pl < geo->nr_planes &&
-				ppa->g.blk < geo->blks_per_lun &&
-				ppa->g.pg < geo->pgs_per_blk &&
+				ppa->g.blk < geo->nr_chks &&
+				ppa->g.pg < geo->ws_per_chk &&
 				ppa->g.sec < geo->sec_per_pg)
 			continue;
 
@@ -1245,7 +1223,7 @@ static inline int pblk_check_io(struct pblk *pblk, struct nvm_rq *rqd)
 
 		for (i = 0; i < rqd->nr_ppas; i++) {
 			ppa = ppa_list[i];
-			line = &pblk->lines[pblk_dev_ppa_to_line(ppa)];
+			line = &pblk->lines[pblk_ppa_to_line(ppa)];
 
 			spin_lock(&line->lock);
 			if (line->state != PBLK_LINESTATE_OPEN) {
@@ -1288,11 +1266,6 @@ static inline unsigned int pblk_get_secs(struct bio *bio)
 	return  bio->bi_iter.bi_size / PBLK_EXPOSED_PAGE_SIZE;
 }
 
-static inline sector_t pblk_get_sector(sector_t lba)
-{
-	return lba * NR_PHY_IN_LOG;
-}
-
 static inline void pblk_setup_uuid(struct pblk *pblk)
 {
 	uuid_le uuid;
diff --git a/drivers/lightnvm/rrpc.c b/drivers/lightnvm/rrpc.c
deleted file mode 100644
index 0993c14..0000000
--- a/drivers/lightnvm/rrpc.c
+++ /dev/null
@@ -1,1625 +0,0 @@
-/*
- * Copyright (C) 2015 IT University of Copenhagen
- * Initial release: Matias Bjorling <m@bjorling.me>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * Implementation of a Round-robin page-based Hybrid FTL for Open-channel SSDs.
- */
-
-#include "rrpc.h"
-
-static struct kmem_cache *rrpc_gcb_cache, *rrpc_rq_cache;
-static DECLARE_RWSEM(rrpc_lock);
-
-static int rrpc_submit_io(struct rrpc *rrpc, struct bio *bio,
-				struct nvm_rq *rqd, unsigned long flags);
-
-#define rrpc_for_each_lun(rrpc, rlun, i) \
-		for ((i) = 0, rlun = &(rrpc)->luns[0]; \
-			(i) < (rrpc)->nr_luns; (i)++, rlun = &(rrpc)->luns[(i)])
-
-static void rrpc_page_invalidate(struct rrpc *rrpc, struct rrpc_addr *a)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_block *rblk = a->rblk;
-	unsigned int pg_offset;
-
-	lockdep_assert_held(&rrpc->rev_lock);
-
-	if (a->addr == ADDR_EMPTY || !rblk)
-		return;
-
-	spin_lock(&rblk->lock);
-
-	div_u64_rem(a->addr, dev->geo.sec_per_blk, &pg_offset);
-	WARN_ON(test_and_set_bit(pg_offset, rblk->invalid_pages));
-	rblk->nr_invalid_pages++;
-
-	spin_unlock(&rblk->lock);
-
-	rrpc->rev_trans_map[a->addr].addr = ADDR_EMPTY;
-}
-
-static void rrpc_invalidate_range(struct rrpc *rrpc, sector_t slba,
-							unsigned int len)
-{
-	sector_t i;
-
-	spin_lock(&rrpc->rev_lock);
-	for (i = slba; i < slba + len; i++) {
-		struct rrpc_addr *gp = &rrpc->trans_map[i];
-
-		rrpc_page_invalidate(rrpc, gp);
-		gp->rblk = NULL;
-	}
-	spin_unlock(&rrpc->rev_lock);
-}
-
-static struct nvm_rq *rrpc_inflight_laddr_acquire(struct rrpc *rrpc,
-					sector_t laddr, unsigned int pages)
-{
-	struct nvm_rq *rqd;
-	struct rrpc_inflight_rq *inf;
-
-	rqd = mempool_alloc(rrpc->rq_pool, GFP_ATOMIC);
-	if (!rqd)
-		return ERR_PTR(-ENOMEM);
-
-	inf = rrpc_get_inflight_rq(rqd);
-	if (rrpc_lock_laddr(rrpc, laddr, pages, inf)) {
-		mempool_free(rqd, rrpc->rq_pool);
-		return NULL;
-	}
-
-	return rqd;
-}
-
-static void rrpc_inflight_laddr_release(struct rrpc *rrpc, struct nvm_rq *rqd)
-{
-	struct rrpc_inflight_rq *inf = rrpc_get_inflight_rq(rqd);
-
-	rrpc_unlock_laddr(rrpc, inf);
-
-	mempool_free(rqd, rrpc->rq_pool);
-}
-
-static void rrpc_discard(struct rrpc *rrpc, struct bio *bio)
-{
-	sector_t slba = bio->bi_iter.bi_sector / NR_PHY_IN_LOG;
-	sector_t len = bio->bi_iter.bi_size / RRPC_EXPOSED_PAGE_SIZE;
-	struct nvm_rq *rqd;
-
-	while (1) {
-		rqd = rrpc_inflight_laddr_acquire(rrpc, slba, len);
-		if (rqd)
-			break;
-
-		schedule();
-	}
-
-	if (IS_ERR(rqd)) {
-		pr_err("rrpc: unable to acquire inflight IO\n");
-		bio_io_error(bio);
-		return;
-	}
-
-	rrpc_invalidate_range(rrpc, slba, len);
-	rrpc_inflight_laddr_release(rrpc, rqd);
-}
-
-static int block_is_full(struct rrpc *rrpc, struct rrpc_block *rblk)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-
-	return (rblk->next_page == dev->geo.sec_per_blk);
-}
-
-/* Calculate relative addr for the given block, considering instantiated LUNs */
-static u64 block_to_rel_addr(struct rrpc *rrpc, struct rrpc_block *rblk)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_lun *rlun = rblk->rlun;
-
-	return rlun->id * dev->geo.sec_per_blk;
-}
-
-static struct ppa_addr rrpc_ppa_to_gaddr(struct nvm_tgt_dev *dev,
-					 struct rrpc_addr *gp)
-{
-	struct rrpc_block *rblk = gp->rblk;
-	struct rrpc_lun *rlun = rblk->rlun;
-	u64 addr = gp->addr;
-	struct ppa_addr paddr;
-
-	paddr.ppa = addr;
-	paddr = rrpc_linear_to_generic_addr(&dev->geo, paddr);
-	paddr.g.ch = rlun->bppa.g.ch;
-	paddr.g.lun = rlun->bppa.g.lun;
-	paddr.g.blk = rblk->id;
-
-	return paddr;
-}
-
-/* requires lun->lock taken */
-static void rrpc_set_lun_cur(struct rrpc_lun *rlun, struct rrpc_block *new_rblk,
-						struct rrpc_block **cur_rblk)
-{
-	struct rrpc *rrpc = rlun->rrpc;
-
-	if (*cur_rblk) {
-		spin_lock(&(*cur_rblk)->lock);
-		WARN_ON(!block_is_full(rrpc, *cur_rblk));
-		spin_unlock(&(*cur_rblk)->lock);
-	}
-	*cur_rblk = new_rblk;
-}
-
-static struct rrpc_block *__rrpc_get_blk(struct rrpc *rrpc,
-							struct rrpc_lun *rlun)
-{
-	struct rrpc_block *rblk = NULL;
-
-	if (list_empty(&rlun->free_list))
-		goto out;
-
-	rblk = list_first_entry(&rlun->free_list, struct rrpc_block, list);
-
-	list_move_tail(&rblk->list, &rlun->used_list);
-	rblk->state = NVM_BLK_ST_TGT;
-	rlun->nr_free_blocks--;
-
-out:
-	return rblk;
-}
-
-static struct rrpc_block *rrpc_get_blk(struct rrpc *rrpc, struct rrpc_lun *rlun,
-							unsigned long flags)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_block *rblk;
-	int is_gc = flags & NVM_IOTYPE_GC;
-
-	spin_lock(&rlun->lock);
-	if (!is_gc && rlun->nr_free_blocks < rlun->reserved_blocks) {
-		pr_err("nvm: rrpc: cannot give block to non GC request\n");
-		spin_unlock(&rlun->lock);
-		return NULL;
-	}
-
-	rblk = __rrpc_get_blk(rrpc, rlun);
-	if (!rblk) {
-		pr_err("nvm: rrpc: cannot get new block\n");
-		spin_unlock(&rlun->lock);
-		return NULL;
-	}
-	spin_unlock(&rlun->lock);
-
-	bitmap_zero(rblk->invalid_pages, dev->geo.sec_per_blk);
-	rblk->next_page = 0;
-	rblk->nr_invalid_pages = 0;
-	atomic_set(&rblk->data_cmnt_size, 0);
-
-	return rblk;
-}
-
-static void rrpc_put_blk(struct rrpc *rrpc, struct rrpc_block *rblk)
-{
-	struct rrpc_lun *rlun = rblk->rlun;
-
-	spin_lock(&rlun->lock);
-	if (rblk->state & NVM_BLK_ST_TGT) {
-		list_move_tail(&rblk->list, &rlun->free_list);
-		rlun->nr_free_blocks++;
-		rblk->state = NVM_BLK_ST_FREE;
-	} else if (rblk->state & NVM_BLK_ST_BAD) {
-		list_move_tail(&rblk->list, &rlun->bb_list);
-		rblk->state = NVM_BLK_ST_BAD;
-	} else {
-		WARN_ON_ONCE(1);
-		pr_err("rrpc: erroneous type (ch:%d,lun:%d,blk%d-> %u)\n",
-					rlun->bppa.g.ch, rlun->bppa.g.lun,
-					rblk->id, rblk->state);
-		list_move_tail(&rblk->list, &rlun->bb_list);
-	}
-	spin_unlock(&rlun->lock);
-}
-
-static void rrpc_put_blks(struct rrpc *rrpc)
-{
-	struct rrpc_lun *rlun;
-	int i;
-
-	for (i = 0; i < rrpc->nr_luns; i++) {
-		rlun = &rrpc->luns[i];
-		if (rlun->cur)
-			rrpc_put_blk(rrpc, rlun->cur);
-		if (rlun->gc_cur)
-			rrpc_put_blk(rrpc, rlun->gc_cur);
-	}
-}
-
-static struct rrpc_lun *get_next_lun(struct rrpc *rrpc)
-{
-	int next = atomic_inc_return(&rrpc->next_lun);
-
-	return &rrpc->luns[next % rrpc->nr_luns];
-}
-
-static void rrpc_gc_kick(struct rrpc *rrpc)
-{
-	struct rrpc_lun *rlun;
-	unsigned int i;
-
-	for (i = 0; i < rrpc->nr_luns; i++) {
-		rlun = &rrpc->luns[i];
-		queue_work(rrpc->krqd_wq, &rlun->ws_gc);
-	}
-}
-
-/*
- * timed GC every interval.
- */
-static void rrpc_gc_timer(struct timer_list *t)
-{
-	struct rrpc *rrpc = from_timer(rrpc, t, gc_timer);
-
-	rrpc_gc_kick(rrpc);
-	mod_timer(&rrpc->gc_timer, jiffies + msecs_to_jiffies(10));
-}
-
-static void rrpc_end_sync_bio(struct bio *bio)
-{
-	struct completion *waiting = bio->bi_private;
-
-	if (bio->bi_status)
-		pr_err("nvm: gc request failed (%u).\n", bio->bi_status);
-
-	complete(waiting);
-}
-
-/*
- * rrpc_move_valid_pages -- migrate live data off the block
- * @rrpc: the 'rrpc' structure
- * @block: the block from which to migrate live pages
- *
- * Description:
- *   GC algorithms may call this function to migrate remaining live
- *   pages off the block prior to erasing it. This function blocks
- *   further execution until the operation is complete.
- */
-static int rrpc_move_valid_pages(struct rrpc *rrpc, struct rrpc_block *rblk)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct request_queue *q = dev->q;
-	struct rrpc_rev_addr *rev;
-	struct nvm_rq *rqd;
-	struct bio *bio;
-	struct page *page;
-	int slot;
-	int nr_sec_per_blk = dev->geo.sec_per_blk;
-	u64 phys_addr;
-	DECLARE_COMPLETION_ONSTACK(wait);
-
-	if (bitmap_full(rblk->invalid_pages, nr_sec_per_blk))
-		return 0;
-
-	bio = bio_alloc(GFP_NOIO, 1);
-	if (!bio) {
-		pr_err("nvm: could not alloc bio to gc\n");
-		return -ENOMEM;
-	}
-
-	page = mempool_alloc(rrpc->page_pool, GFP_NOIO);
-
-	while ((slot = find_first_zero_bit(rblk->invalid_pages,
-					    nr_sec_per_blk)) < nr_sec_per_blk) {
-
-		/* Lock laddr */
-		phys_addr = rrpc_blk_to_ppa(rrpc, rblk) + slot;
-
-try:
-		spin_lock(&rrpc->rev_lock);
-		/* Get logical address from physical to logical table */
-		rev = &rrpc->rev_trans_map[phys_addr];
-		/* already updated by previous regular write */
-		if (rev->addr == ADDR_EMPTY) {
-			spin_unlock(&rrpc->rev_lock);
-			continue;
-		}
-
-		rqd = rrpc_inflight_laddr_acquire(rrpc, rev->addr, 1);
-		if (IS_ERR_OR_NULL(rqd)) {
-			spin_unlock(&rrpc->rev_lock);
-			schedule();
-			goto try;
-		}
-
-		spin_unlock(&rrpc->rev_lock);
-
-		/* Perform read to do GC */
-		bio->bi_iter.bi_sector = rrpc_get_sector(rev->addr);
-		bio_set_op_attrs(bio,  REQ_OP_READ, 0);
-		bio->bi_private = &wait;
-		bio->bi_end_io = rrpc_end_sync_bio;
-
-		/* TODO: may fail when EXP_PG_SIZE > PAGE_SIZE */
-		bio_add_pc_page(q, bio, page, RRPC_EXPOSED_PAGE_SIZE, 0);
-
-		if (rrpc_submit_io(rrpc, bio, rqd, NVM_IOTYPE_GC)) {
-			pr_err("rrpc: gc read failed.\n");
-			rrpc_inflight_laddr_release(rrpc, rqd);
-			goto finished;
-		}
-		wait_for_completion_io(&wait);
-		if (bio->bi_status) {
-			rrpc_inflight_laddr_release(rrpc, rqd);
-			goto finished;
-		}
-
-		bio_reset(bio);
-		reinit_completion(&wait);
-
-		bio->bi_iter.bi_sector = rrpc_get_sector(rev->addr);
-		bio_set_op_attrs(bio, REQ_OP_WRITE, 0);
-		bio->bi_private = &wait;
-		bio->bi_end_io = rrpc_end_sync_bio;
-
-		bio_add_pc_page(q, bio, page, RRPC_EXPOSED_PAGE_SIZE, 0);
-
-		/* turn the command around and write the data back to a new
-		 * address
-		 */
-		if (rrpc_submit_io(rrpc, bio, rqd, NVM_IOTYPE_GC)) {
-			pr_err("rrpc: gc write failed.\n");
-			rrpc_inflight_laddr_release(rrpc, rqd);
-			goto finished;
-		}
-		wait_for_completion_io(&wait);
-
-		rrpc_inflight_laddr_release(rrpc, rqd);
-		if (bio->bi_status)
-			goto finished;
-
-		bio_reset(bio);
-	}
-
-finished:
-	mempool_free(page, rrpc->page_pool);
-	bio_put(bio);
-
-	if (!bitmap_full(rblk->invalid_pages, nr_sec_per_blk)) {
-		pr_err("nvm: failed to garbage collect block\n");
-		return -EIO;
-	}
-
-	return 0;
-}
-
-static void rrpc_block_gc(struct work_struct *work)
-{
-	struct rrpc_block_gc *gcb = container_of(work, struct rrpc_block_gc,
-									ws_gc);
-	struct rrpc *rrpc = gcb->rrpc;
-	struct rrpc_block *rblk = gcb->rblk;
-	struct rrpc_lun *rlun = rblk->rlun;
-	struct ppa_addr ppa;
-
-	mempool_free(gcb, rrpc->gcb_pool);
-	pr_debug("nvm: block 'ch:%d,lun:%d,blk:%d' being reclaimed\n",
-			rlun->bppa.g.ch, rlun->bppa.g.lun,
-			rblk->id);
-
-	if (rrpc_move_valid_pages(rrpc, rblk))
-		goto put_back;
-
-	ppa.ppa = 0;
-	ppa.g.ch = rlun->bppa.g.ch;
-	ppa.g.lun = rlun->bppa.g.lun;
-	ppa.g.blk = rblk->id;
-
-	if (nvm_erase_sync(rrpc->dev, &ppa, 1))
-		goto put_back;
-
-	rrpc_put_blk(rrpc, rblk);
-
-	return;
-
-put_back:
-	spin_lock(&rlun->lock);
-	list_add_tail(&rblk->prio, &rlun->prio_list);
-	spin_unlock(&rlun->lock);
-}
-
-/* the block with highest number of invalid pages, will be in the beginning
- * of the list
- */
-static struct rrpc_block *rblk_max_invalid(struct rrpc_block *ra,
-							struct rrpc_block *rb)
-{
-	if (ra->nr_invalid_pages == rb->nr_invalid_pages)
-		return ra;
-
-	return (ra->nr_invalid_pages < rb->nr_invalid_pages) ? rb : ra;
-}
-
-/* linearly find the block with highest number of invalid pages
- * requires lun->lock
- */
-static struct rrpc_block *block_prio_find_max(struct rrpc_lun *rlun)
-{
-	struct list_head *prio_list = &rlun->prio_list;
-	struct rrpc_block *rblk, *max;
-
-	BUG_ON(list_empty(prio_list));
-
-	max = list_first_entry(prio_list, struct rrpc_block, prio);
-	list_for_each_entry(rblk, prio_list, prio)
-		max = rblk_max_invalid(max, rblk);
-
-	return max;
-}
-
-static void rrpc_lun_gc(struct work_struct *work)
-{
-	struct rrpc_lun *rlun = container_of(work, struct rrpc_lun, ws_gc);
-	struct rrpc *rrpc = rlun->rrpc;
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_block_gc *gcb;
-	unsigned int nr_blocks_need;
-
-	nr_blocks_need = dev->geo.blks_per_lun / GC_LIMIT_INVERSE;
-
-	if (nr_blocks_need < rrpc->nr_luns)
-		nr_blocks_need = rrpc->nr_luns;
-
-	spin_lock(&rlun->lock);
-	while (nr_blocks_need > rlun->nr_free_blocks &&
-					!list_empty(&rlun->prio_list)) {
-		struct rrpc_block *rblk = block_prio_find_max(rlun);
-
-		if (!rblk->nr_invalid_pages)
-			break;
-
-		gcb = mempool_alloc(rrpc->gcb_pool, GFP_ATOMIC);
-		if (!gcb)
-			break;
-
-		list_del_init(&rblk->prio);
-
-		WARN_ON(!block_is_full(rrpc, rblk));
-
-		pr_debug("rrpc: selected block 'ch:%d,lun:%d,blk:%d' for GC\n",
-					rlun->bppa.g.ch, rlun->bppa.g.lun,
-					rblk->id);
-
-		gcb->rrpc = rrpc;
-		gcb->rblk = rblk;
-		INIT_WORK(&gcb->ws_gc, rrpc_block_gc);
-
-		queue_work(rrpc->kgc_wq, &gcb->ws_gc);
-
-		nr_blocks_need--;
-	}
-	spin_unlock(&rlun->lock);
-
-	/* TODO: Hint that request queue can be started again */
-}
-
-static void rrpc_gc_queue(struct work_struct *work)
-{
-	struct rrpc_block_gc *gcb = container_of(work, struct rrpc_block_gc,
-									ws_gc);
-	struct rrpc *rrpc = gcb->rrpc;
-	struct rrpc_block *rblk = gcb->rblk;
-	struct rrpc_lun *rlun = rblk->rlun;
-
-	spin_lock(&rlun->lock);
-	list_add_tail(&rblk->prio, &rlun->prio_list);
-	spin_unlock(&rlun->lock);
-
-	mempool_free(gcb, rrpc->gcb_pool);
-	pr_debug("nvm: block 'ch:%d,lun:%d,blk:%d' full, allow GC (sched)\n",
-					rlun->bppa.g.ch, rlun->bppa.g.lun,
-					rblk->id);
-}
-
-static const struct block_device_operations rrpc_fops = {
-	.owner		= THIS_MODULE,
-};
-
-static struct rrpc_lun *rrpc_get_lun_rr(struct rrpc *rrpc, int is_gc)
-{
-	unsigned int i;
-	struct rrpc_lun *rlun, *max_free;
-
-	if (!is_gc)
-		return get_next_lun(rrpc);
-
-	/* during GC, we don't care about RR, instead we want to make
-	 * sure that we maintain evenness between the block luns.
-	 */
-	max_free = &rrpc->luns[0];
-	/* prevent GC-ing lun from devouring pages of a lun with
-	 * little free blocks. We don't take the lock as we only need an
-	 * estimate.
-	 */
-	rrpc_for_each_lun(rrpc, rlun, i) {
-		if (rlun->nr_free_blocks > max_free->nr_free_blocks)
-			max_free = rlun;
-	}
-
-	return max_free;
-}
-
-static struct rrpc_addr *rrpc_update_map(struct rrpc *rrpc, sector_t laddr,
-					struct rrpc_block *rblk, u64 paddr)
-{
-	struct rrpc_addr *gp;
-	struct rrpc_rev_addr *rev;
-
-	BUG_ON(laddr >= rrpc->nr_sects);
-
-	gp = &rrpc->trans_map[laddr];
-	spin_lock(&rrpc->rev_lock);
-	if (gp->rblk)
-		rrpc_page_invalidate(rrpc, gp);
-
-	gp->addr = paddr;
-	gp->rblk = rblk;
-
-	rev = &rrpc->rev_trans_map[gp->addr];
-	rev->addr = laddr;
-	spin_unlock(&rrpc->rev_lock);
-
-	return gp;
-}
-
-static u64 rrpc_alloc_addr(struct rrpc *rrpc, struct rrpc_block *rblk)
-{
-	u64 addr = ADDR_EMPTY;
-
-	spin_lock(&rblk->lock);
-	if (block_is_full(rrpc, rblk))
-		goto out;
-
-	addr = rblk->next_page;
-
-	rblk->next_page++;
-out:
-	spin_unlock(&rblk->lock);
-	return addr;
-}
-
-/* Map logical address to a physical page. The mapping implements a round robin
- * approach and allocates a page from the next lun available.
- *
- * Returns rrpc_addr with the physical address and block. Returns NULL if no
- * blocks in the next rlun are available.
- */
-static struct ppa_addr rrpc_map_page(struct rrpc *rrpc, sector_t laddr,
-								int is_gc)
-{
-	struct nvm_tgt_dev *tgt_dev = rrpc->dev;
-	struct rrpc_lun *rlun;
-	struct rrpc_block *rblk, **cur_rblk;
-	struct rrpc_addr *p;
-	struct ppa_addr ppa;
-	u64 paddr;
-	int gc_force = 0;
-
-	ppa.ppa = ADDR_EMPTY;
-	rlun = rrpc_get_lun_rr(rrpc, is_gc);
-
-	if (!is_gc && rlun->nr_free_blocks < rrpc->nr_luns * 4)
-		return ppa;
-
-	/*
-	 * page allocation steps:
-	 * 1. Try to allocate new page from current rblk
-	 * 2a. If succeed, proceed to map it in and return
-	 * 2b. If fail, first try to allocate a new block from media manger,
-	 *     and then retry step 1. Retry until the normal block pool is
-	 *     exhausted.
-	 * 3. If exhausted, and garbage collector is requesting the block,
-	 *    go to the reserved block and retry step 1.
-	 *    In the case that this fails as well, or it is not GC
-	 *    requesting, report not able to retrieve a block and let the
-	 *    caller handle further processing.
-	 */
-
-	spin_lock(&rlun->lock);
-	cur_rblk = &rlun->cur;
-	rblk = rlun->cur;
-retry:
-	paddr = rrpc_alloc_addr(rrpc, rblk);
-
-	if (paddr != ADDR_EMPTY)
-		goto done;
-
-	if (!list_empty(&rlun->wblk_list)) {
-new_blk:
-		rblk = list_first_entry(&rlun->wblk_list, struct rrpc_block,
-									prio);
-		rrpc_set_lun_cur(rlun, rblk, cur_rblk);
-		list_del(&rblk->prio);
-		goto retry;
-	}
-	spin_unlock(&rlun->lock);
-
-	rblk = rrpc_get_blk(rrpc, rlun, gc_force);
-	if (rblk) {
-		spin_lock(&rlun->lock);
-		list_add_tail(&rblk->prio, &rlun->wblk_list);
-		/*
-		 * another thread might already have added a new block,
-		 * Therefore, make sure that one is used, instead of the
-		 * one just added.
-		 */
-		goto new_blk;
-	}
-
-	if (unlikely(is_gc) && !gc_force) {
-		/* retry from emergency gc block */
-		cur_rblk = &rlun->gc_cur;
-		rblk = rlun->gc_cur;
-		gc_force = 1;
-		spin_lock(&rlun->lock);
-		goto retry;
-	}
-
-	pr_err("rrpc: failed to allocate new block\n");
-	return ppa;
-done:
-	spin_unlock(&rlun->lock);
-	p = rrpc_update_map(rrpc, laddr, rblk, paddr);
-	if (!p)
-		return ppa;
-
-	/* return global address */
-	return rrpc_ppa_to_gaddr(tgt_dev, p);
-}
-
-static void rrpc_run_gc(struct rrpc *rrpc, struct rrpc_block *rblk)
-{
-	struct rrpc_block_gc *gcb;
-
-	gcb = mempool_alloc(rrpc->gcb_pool, GFP_ATOMIC);
-	if (!gcb) {
-		pr_err("rrpc: unable to queue block for gc.");
-		return;
-	}
-
-	gcb->rrpc = rrpc;
-	gcb->rblk = rblk;
-
-	INIT_WORK(&gcb->ws_gc, rrpc_gc_queue);
-	queue_work(rrpc->kgc_wq, &gcb->ws_gc);
-}
-
-static struct rrpc_lun *rrpc_ppa_to_lun(struct rrpc *rrpc, struct ppa_addr p)
-{
-	struct rrpc_lun *rlun = NULL;
-	int i;
-
-	for (i = 0; i < rrpc->nr_luns; i++) {
-		if (rrpc->luns[i].bppa.g.ch == p.g.ch &&
-				rrpc->luns[i].bppa.g.lun == p.g.lun) {
-			rlun = &rrpc->luns[i];
-			break;
-		}
-	}
-
-	return rlun;
-}
-
-static void __rrpc_mark_bad_block(struct rrpc *rrpc, struct ppa_addr ppa)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_lun *rlun;
-	struct rrpc_block *rblk;
-
-	rlun = rrpc_ppa_to_lun(rrpc, ppa);
-	rblk = &rlun->blocks[ppa.g.blk];
-	rblk->state = NVM_BLK_ST_BAD;
-
-	nvm_set_tgt_bb_tbl(dev, &ppa, 1, NVM_BLK_T_GRWN_BAD);
-}
-
-static void rrpc_mark_bad_block(struct rrpc *rrpc, struct nvm_rq *rqd)
-{
-	void *comp_bits = &rqd->ppa_status;
-	struct ppa_addr ppa, prev_ppa;
-	int nr_ppas = rqd->nr_ppas;
-	int bit;
-
-	if (rqd->nr_ppas == 1)
-		__rrpc_mark_bad_block(rrpc, rqd->ppa_addr);
-
-	ppa_set_empty(&prev_ppa);
-	bit = -1;
-	while ((bit = find_next_bit(comp_bits, nr_ppas, bit + 1)) < nr_ppas) {
-		ppa = rqd->ppa_list[bit];
-		if (ppa_cmp_blk(ppa, prev_ppa))
-			continue;
-
-		__rrpc_mark_bad_block(rrpc, ppa);
-	}
-}
-
-static void rrpc_end_io_write(struct rrpc *rrpc, struct rrpc_rq *rrqd,
-						sector_t laddr, uint8_t npages)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_addr *p;
-	struct rrpc_block *rblk;
-	int cmnt_size, i;
-
-	for (i = 0; i < npages; i++) {
-		p = &rrpc->trans_map[laddr + i];
-		rblk = p->rblk;
-
-		cmnt_size = atomic_inc_return(&rblk->data_cmnt_size);
-		if (unlikely(cmnt_size == dev->geo.sec_per_blk))
-			rrpc_run_gc(rrpc, rblk);
-	}
-}
-
-static void rrpc_end_io(struct nvm_rq *rqd)
-{
-	struct rrpc *rrpc = rqd->private;
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_rq *rrqd = nvm_rq_to_pdu(rqd);
-	uint8_t npages = rqd->nr_ppas;
-	sector_t laddr = rrpc_get_laddr(rqd->bio) - npages;
-
-	if (bio_data_dir(rqd->bio) == WRITE) {
-		if (rqd->error == NVM_RSP_ERR_FAILWRITE)
-			rrpc_mark_bad_block(rrpc, rqd);
-
-		rrpc_end_io_write(rrpc, rrqd, laddr, npages);
-	}
-
-	bio_put(rqd->bio);
-
-	if (rrqd->flags & NVM_IOTYPE_GC)
-		return;
-
-	rrpc_unlock_rq(rrpc, rqd);
-
-	if (npages > 1)
-		nvm_dev_dma_free(dev->parent, rqd->ppa_list, rqd->dma_ppa_list);
-
-	mempool_free(rqd, rrpc->rq_pool);
-}
-
-static int rrpc_read_ppalist_rq(struct rrpc *rrpc, struct bio *bio,
-			struct nvm_rq *rqd, unsigned long flags, int npages)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_inflight_rq *r = rrpc_get_inflight_rq(rqd);
-	struct rrpc_addr *gp;
-	sector_t laddr = rrpc_get_laddr(bio);
-	int is_gc = flags & NVM_IOTYPE_GC;
-	int i;
-
-	if (!is_gc && rrpc_lock_rq(rrpc, bio, rqd)) {
-		nvm_dev_dma_free(dev->parent, rqd->ppa_list, rqd->dma_ppa_list);
-		return NVM_IO_REQUEUE;
-	}
-
-	for (i = 0; i < npages; i++) {
-		/* We assume that mapping occurs at 4KB granularity */
-		BUG_ON(!(laddr + i < rrpc->nr_sects));
-		gp = &rrpc->trans_map[laddr + i];
-
-		if (gp->rblk) {
-			rqd->ppa_list[i] = rrpc_ppa_to_gaddr(dev, gp);
-		} else {
-			BUG_ON(is_gc);
-			rrpc_unlock_laddr(rrpc, r);
-			nvm_dev_dma_free(dev->parent, rqd->ppa_list,
-							rqd->dma_ppa_list);
-			return NVM_IO_DONE;
-		}
-	}
-
-	rqd->opcode = NVM_OP_HBREAD;
-
-	return NVM_IO_OK;
-}
-
-static int rrpc_read_rq(struct rrpc *rrpc, struct bio *bio, struct nvm_rq *rqd,
-							unsigned long flags)
-{
-	int is_gc = flags & NVM_IOTYPE_GC;
-	sector_t laddr = rrpc_get_laddr(bio);
-	struct rrpc_addr *gp;
-
-	if (!is_gc && rrpc_lock_rq(rrpc, bio, rqd))
-		return NVM_IO_REQUEUE;
-
-	BUG_ON(!(laddr < rrpc->nr_sects));
-	gp = &rrpc->trans_map[laddr];
-
-	if (gp->rblk) {
-		rqd->ppa_addr = rrpc_ppa_to_gaddr(rrpc->dev, gp);
-	} else {
-		BUG_ON(is_gc);
-		rrpc_unlock_rq(rrpc, rqd);
-		return NVM_IO_DONE;
-	}
-
-	rqd->opcode = NVM_OP_HBREAD;
-
-	return NVM_IO_OK;
-}
-
-static int rrpc_write_ppalist_rq(struct rrpc *rrpc, struct bio *bio,
-			struct nvm_rq *rqd, unsigned long flags, int npages)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_inflight_rq *r = rrpc_get_inflight_rq(rqd);
-	struct ppa_addr p;
-	sector_t laddr = rrpc_get_laddr(bio);
-	int is_gc = flags & NVM_IOTYPE_GC;
-	int i;
-
-	if (!is_gc && rrpc_lock_rq(rrpc, bio, rqd)) {
-		nvm_dev_dma_free(dev->parent, rqd->ppa_list, rqd->dma_ppa_list);
-		return NVM_IO_REQUEUE;
-	}
-
-	for (i = 0; i < npages; i++) {
-		/* We assume that mapping occurs at 4KB granularity */
-		p = rrpc_map_page(rrpc, laddr + i, is_gc);
-		if (p.ppa == ADDR_EMPTY) {
-			BUG_ON(is_gc);
-			rrpc_unlock_laddr(rrpc, r);
-			nvm_dev_dma_free(dev->parent, rqd->ppa_list,
-							rqd->dma_ppa_list);
-			rrpc_gc_kick(rrpc);
-			return NVM_IO_REQUEUE;
-		}
-
-		rqd->ppa_list[i] = p;
-	}
-
-	rqd->opcode = NVM_OP_HBWRITE;
-
-	return NVM_IO_OK;
-}
-
-static int rrpc_write_rq(struct rrpc *rrpc, struct bio *bio,
-				struct nvm_rq *rqd, unsigned long flags)
-{
-	struct ppa_addr p;
-	int is_gc = flags & NVM_IOTYPE_GC;
-	sector_t laddr = rrpc_get_laddr(bio);
-
-	if (!is_gc && rrpc_lock_rq(rrpc, bio, rqd))
-		return NVM_IO_REQUEUE;
-
-	p = rrpc_map_page(rrpc, laddr, is_gc);
-	if (p.ppa == ADDR_EMPTY) {
-		BUG_ON(is_gc);
-		rrpc_unlock_rq(rrpc, rqd);
-		rrpc_gc_kick(rrpc);
-		return NVM_IO_REQUEUE;
-	}
-
-	rqd->ppa_addr = p;
-	rqd->opcode = NVM_OP_HBWRITE;
-
-	return NVM_IO_OK;
-}
-
-static int rrpc_setup_rq(struct rrpc *rrpc, struct bio *bio,
-			struct nvm_rq *rqd, unsigned long flags, uint8_t npages)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-
-	if (npages > 1) {
-		rqd->ppa_list = nvm_dev_dma_alloc(dev->parent, GFP_KERNEL,
-							&rqd->dma_ppa_list);
-		if (!rqd->ppa_list) {
-			pr_err("rrpc: not able to allocate ppa list\n");
-			return NVM_IO_ERR;
-		}
-
-		if (bio_op(bio) == REQ_OP_WRITE)
-			return rrpc_write_ppalist_rq(rrpc, bio, rqd, flags,
-									npages);
-
-		return rrpc_read_ppalist_rq(rrpc, bio, rqd, flags, npages);
-	}
-
-	if (bio_op(bio) == REQ_OP_WRITE)
-		return rrpc_write_rq(rrpc, bio, rqd, flags);
-
-	return rrpc_read_rq(rrpc, bio, rqd, flags);
-}
-
-static int rrpc_submit_io(struct rrpc *rrpc, struct bio *bio,
-				struct nvm_rq *rqd, unsigned long flags)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_rq *rrq = nvm_rq_to_pdu(rqd);
-	uint8_t nr_pages = rrpc_get_pages(bio);
-	int bio_size = bio_sectors(bio) << 9;
-	int err;
-
-	if (bio_size < dev->geo.sec_size)
-		return NVM_IO_ERR;
-	else if (bio_size > dev->geo.max_rq_size)
-		return NVM_IO_ERR;
-
-	err = rrpc_setup_rq(rrpc, bio, rqd, flags, nr_pages);
-	if (err)
-		return err;
-
-	bio_get(bio);
-	rqd->bio = bio;
-	rqd->private = rrpc;
-	rqd->nr_ppas = nr_pages;
-	rqd->end_io = rrpc_end_io;
-	rrq->flags = flags;
-
-	err = nvm_submit_io(dev, rqd);
-	if (err) {
-		pr_err("rrpc: I/O submission failed: %d\n", err);
-		bio_put(bio);
-		if (!(flags & NVM_IOTYPE_GC)) {
-			rrpc_unlock_rq(rrpc, rqd);
-			if (rqd->nr_ppas > 1)
-				nvm_dev_dma_free(dev->parent, rqd->ppa_list,
-							rqd->dma_ppa_list);
-		}
-		return NVM_IO_ERR;
-	}
-
-	return NVM_IO_OK;
-}
-
-static blk_qc_t rrpc_make_rq(struct request_queue *q, struct bio *bio)
-{
-	struct rrpc *rrpc = q->queuedata;
-	struct nvm_rq *rqd;
-	int err;
-
-	blk_queue_split(q, &bio);
-
-	if (bio_op(bio) == REQ_OP_DISCARD) {
-		rrpc_discard(rrpc, bio);
-		return BLK_QC_T_NONE;
-	}
-
-	rqd = mempool_alloc(rrpc->rq_pool, GFP_KERNEL);
-	memset(rqd, 0, sizeof(struct nvm_rq));
-
-	err = rrpc_submit_io(rrpc, bio, rqd, NVM_IOTYPE_NONE);
-	switch (err) {
-	case NVM_IO_OK:
-		return BLK_QC_T_NONE;
-	case NVM_IO_ERR:
-		bio_io_error(bio);
-		break;
-	case NVM_IO_DONE:
-		bio_endio(bio);
-		break;
-	case NVM_IO_REQUEUE:
-		spin_lock(&rrpc->bio_lock);
-		bio_list_add(&rrpc->requeue_bios, bio);
-		spin_unlock(&rrpc->bio_lock);
-		queue_work(rrpc->kgc_wq, &rrpc->ws_requeue);
-		break;
-	}
-
-	mempool_free(rqd, rrpc->rq_pool);
-	return BLK_QC_T_NONE;
-}
-
-static void rrpc_requeue(struct work_struct *work)
-{
-	struct rrpc *rrpc = container_of(work, struct rrpc, ws_requeue);
-	struct bio_list bios;
-	struct bio *bio;
-
-	bio_list_init(&bios);
-
-	spin_lock(&rrpc->bio_lock);
-	bio_list_merge(&bios, &rrpc->requeue_bios);
-	bio_list_init(&rrpc->requeue_bios);
-	spin_unlock(&rrpc->bio_lock);
-
-	while ((bio = bio_list_pop(&bios)))
-		rrpc_make_rq(rrpc->disk->queue, bio);
-}
-
-static void rrpc_gc_free(struct rrpc *rrpc)
-{
-	if (rrpc->krqd_wq)
-		destroy_workqueue(rrpc->krqd_wq);
-
-	if (rrpc->kgc_wq)
-		destroy_workqueue(rrpc->kgc_wq);
-}
-
-static int rrpc_gc_init(struct rrpc *rrpc)
-{
-	rrpc->krqd_wq = alloc_workqueue("rrpc-lun", WQ_MEM_RECLAIM|WQ_UNBOUND,
-								rrpc->nr_luns);
-	if (!rrpc->krqd_wq)
-		return -ENOMEM;
-
-	rrpc->kgc_wq = alloc_workqueue("rrpc-bg", WQ_MEM_RECLAIM, 1);
-	if (!rrpc->kgc_wq)
-		return -ENOMEM;
-
-	timer_setup(&rrpc->gc_timer, rrpc_gc_timer, 0);
-
-	return 0;
-}
-
-static void rrpc_map_free(struct rrpc *rrpc)
-{
-	vfree(rrpc->rev_trans_map);
-	vfree(rrpc->trans_map);
-}
-
-static int rrpc_l2p_update(u64 slba, u32 nlb, __le64 *entries, void *private)
-{
-	struct rrpc *rrpc = (struct rrpc *)private;
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_addr *addr = rrpc->trans_map + slba;
-	struct rrpc_rev_addr *raddr = rrpc->rev_trans_map;
-	struct rrpc_lun *rlun;
-	struct rrpc_block *rblk;
-	u64 i;
-
-	for (i = 0; i < nlb; i++) {
-		struct ppa_addr gaddr;
-		u64 pba = le64_to_cpu(entries[i]);
-		unsigned int mod;
-
-		/* LNVM treats address-spaces as silos, LBA and PBA are
-		 * equally large and zero-indexed.
-		 */
-		if (unlikely(pba >= dev->total_secs && pba != U64_MAX)) {
-			pr_err("nvm: L2P data entry is out of bounds!\n");
-			pr_err("nvm: Maybe loaded an old target L2P\n");
-			return -EINVAL;
-		}
-
-		/* Address zero is a special one. The first page on a disk is
-		 * protected. As it often holds internal device boot
-		 * information.
-		 */
-		if (!pba)
-			continue;
-
-		div_u64_rem(pba, rrpc->nr_sects, &mod);
-
-		gaddr = rrpc_recov_addr(dev, pba);
-		rlun = rrpc_ppa_to_lun(rrpc, gaddr);
-		if (!rlun) {
-			pr_err("rrpc: l2p corruption on lba %llu\n",
-							slba + i);
-			return -EINVAL;
-		}
-
-		rblk = &rlun->blocks[gaddr.g.blk];
-		if (!rblk->state) {
-			/* at this point, we don't know anything about the
-			 * block. It's up to the FTL on top to re-etablish the
-			 * block state. The block is assumed to be open.
-			 */
-			list_move_tail(&rblk->list, &rlun->used_list);
-			rblk->state = NVM_BLK_ST_TGT;
-			rlun->nr_free_blocks--;
-		}
-
-		addr[i].addr = pba;
-		addr[i].rblk = rblk;
-		raddr[mod].addr = slba + i;
-	}
-
-	return 0;
-}
-
-static int rrpc_map_init(struct rrpc *rrpc)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	sector_t i;
-	int ret;
-
-	rrpc->trans_map = vzalloc(sizeof(struct rrpc_addr) * rrpc->nr_sects);
-	if (!rrpc->trans_map)
-		return -ENOMEM;
-
-	rrpc->rev_trans_map = vmalloc(sizeof(struct rrpc_rev_addr)
-							* rrpc->nr_sects);
-	if (!rrpc->rev_trans_map)
-		return -ENOMEM;
-
-	for (i = 0; i < rrpc->nr_sects; i++) {
-		struct rrpc_addr *p = &rrpc->trans_map[i];
-		struct rrpc_rev_addr *r = &rrpc->rev_trans_map[i];
-
-		p->addr = ADDR_EMPTY;
-		r->addr = ADDR_EMPTY;
-	}
-
-	/* Bring up the mapping table from device */
-	ret = nvm_get_l2p_tbl(dev, rrpc->soffset, rrpc->nr_sects,
-							rrpc_l2p_update, rrpc);
-	if (ret) {
-		pr_err("nvm: rrpc: could not read L2P table.\n");
-		return -EINVAL;
-	}
-
-	return 0;
-}
-
-/* Minimum pages needed within a lun */
-#define PAGE_POOL_SIZE 16
-#define ADDR_POOL_SIZE 64
-
-static int rrpc_core_init(struct rrpc *rrpc)
-{
-	down_write(&rrpc_lock);
-	if (!rrpc_gcb_cache) {
-		rrpc_gcb_cache = kmem_cache_create("rrpc_gcb",
-				sizeof(struct rrpc_block_gc), 0, 0, NULL);
-		if (!rrpc_gcb_cache) {
-			up_write(&rrpc_lock);
-			return -ENOMEM;
-		}
-
-		rrpc_rq_cache = kmem_cache_create("rrpc_rq",
-				sizeof(struct nvm_rq) + sizeof(struct rrpc_rq),
-				0, 0, NULL);
-		if (!rrpc_rq_cache) {
-			kmem_cache_destroy(rrpc_gcb_cache);
-			up_write(&rrpc_lock);
-			return -ENOMEM;
-		}
-	}
-	up_write(&rrpc_lock);
-
-	rrpc->page_pool = mempool_create_page_pool(PAGE_POOL_SIZE, 0);
-	if (!rrpc->page_pool)
-		return -ENOMEM;
-
-	rrpc->gcb_pool = mempool_create_slab_pool(rrpc->dev->geo.nr_luns,
-								rrpc_gcb_cache);
-	if (!rrpc->gcb_pool)
-		return -ENOMEM;
-
-	rrpc->rq_pool = mempool_create_slab_pool(64, rrpc_rq_cache);
-	if (!rrpc->rq_pool)
-		return -ENOMEM;
-
-	spin_lock_init(&rrpc->inflights.lock);
-	INIT_LIST_HEAD(&rrpc->inflights.reqs);
-
-	return 0;
-}
-
-static void rrpc_core_free(struct rrpc *rrpc)
-{
-	mempool_destroy(rrpc->page_pool);
-	mempool_destroy(rrpc->gcb_pool);
-	mempool_destroy(rrpc->rq_pool);
-}
-
-static void rrpc_luns_free(struct rrpc *rrpc)
-{
-	struct rrpc_lun *rlun;
-	int i;
-
-	if (!rrpc->luns)
-		return;
-
-	for (i = 0; i < rrpc->nr_luns; i++) {
-		rlun = &rrpc->luns[i];
-		vfree(rlun->blocks);
-	}
-
-	kfree(rrpc->luns);
-}
-
-static int rrpc_bb_discovery(struct nvm_tgt_dev *dev, struct rrpc_lun *rlun)
-{
-	struct nvm_geo *geo = &dev->geo;
-	struct rrpc_block *rblk;
-	struct ppa_addr ppa;
-	u8 *blks;
-	int nr_blks;
-	int i;
-	int ret;
-
-	if (!dev->parent->ops->get_bb_tbl)
-		return 0;
-
-	nr_blks = geo->blks_per_lun * geo->plane_mode;
-	blks = kmalloc(nr_blks, GFP_KERNEL);
-	if (!blks)
-		return -ENOMEM;
-
-	ppa.ppa = 0;
-	ppa.g.ch = rlun->bppa.g.ch;
-	ppa.g.lun = rlun->bppa.g.lun;
-
-	ret = nvm_get_tgt_bb_tbl(dev, ppa, blks);
-	if (ret) {
-		pr_err("rrpc: could not get BB table\n");
-		goto out;
-	}
-
-	nr_blks = nvm_bb_tbl_fold(dev->parent, blks, nr_blks);
-	if (nr_blks < 0) {
-		ret = nr_blks;
-		goto out;
-	}
-
-	for (i = 0; i < nr_blks; i++) {
-		if (blks[i] == NVM_BLK_T_FREE)
-			continue;
-
-		rblk = &rlun->blocks[i];
-		list_move_tail(&rblk->list, &rlun->bb_list);
-		rblk->state = NVM_BLK_ST_BAD;
-		rlun->nr_free_blocks--;
-	}
-
-out:
-	kfree(blks);
-	return ret;
-}
-
-static void rrpc_set_lun_ppa(struct rrpc_lun *rlun, struct ppa_addr ppa)
-{
-	rlun->bppa.ppa = 0;
-	rlun->bppa.g.ch = ppa.g.ch;
-	rlun->bppa.g.lun = ppa.g.lun;
-}
-
-static int rrpc_luns_init(struct rrpc *rrpc, struct ppa_addr *luns)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct nvm_geo *geo = &dev->geo;
-	struct rrpc_lun *rlun;
-	int i, j, ret = -EINVAL;
-
-	if (geo->sec_per_blk > MAX_INVALID_PAGES_STORAGE * BITS_PER_LONG) {
-		pr_err("rrpc: number of pages per block too high.");
-		return -EINVAL;
-	}
-
-	spin_lock_init(&rrpc->rev_lock);
-
-	rrpc->luns = kcalloc(rrpc->nr_luns, sizeof(struct rrpc_lun),
-								GFP_KERNEL);
-	if (!rrpc->luns)
-		return -ENOMEM;
-
-	/* 1:1 mapping */
-	for (i = 0; i < rrpc->nr_luns; i++) {
-		rlun = &rrpc->luns[i];
-		rlun->id = i;
-		rrpc_set_lun_ppa(rlun, luns[i]);
-		rlun->blocks = vzalloc(sizeof(struct rrpc_block) *
-							geo->blks_per_lun);
-		if (!rlun->blocks) {
-			ret = -ENOMEM;
-			goto err;
-		}
-
-		INIT_LIST_HEAD(&rlun->free_list);
-		INIT_LIST_HEAD(&rlun->used_list);
-		INIT_LIST_HEAD(&rlun->bb_list);
-
-		for (j = 0; j < geo->blks_per_lun; j++) {
-			struct rrpc_block *rblk = &rlun->blocks[j];
-
-			rblk->id = j;
-			rblk->rlun = rlun;
-			rblk->state = NVM_BLK_T_FREE;
-			INIT_LIST_HEAD(&rblk->prio);
-			INIT_LIST_HEAD(&rblk->list);
-			spin_lock_init(&rblk->lock);
-
-			list_add_tail(&rblk->list, &rlun->free_list);
-		}
-
-		rlun->rrpc = rrpc;
-		rlun->nr_free_blocks = geo->blks_per_lun;
-		rlun->reserved_blocks = 2; /* for GC only */
-
-		INIT_LIST_HEAD(&rlun->prio_list);
-		INIT_LIST_HEAD(&rlun->wblk_list);
-
-		INIT_WORK(&rlun->ws_gc, rrpc_lun_gc);
-		spin_lock_init(&rlun->lock);
-
-		if (rrpc_bb_discovery(dev, rlun))
-			goto err;
-
-	}
-
-	return 0;
-err:
-	return ret;
-}
-
-/* returns 0 on success and stores the beginning address in *begin */
-static int rrpc_area_init(struct rrpc *rrpc, sector_t *begin)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	sector_t size = rrpc->nr_sects * dev->geo.sec_size;
-	int ret;
-
-	size >>= 9;
-
-	ret = nvm_get_area(dev, begin, size);
-	if (!ret)
-		*begin >>= (ilog2(dev->geo.sec_size) - 9);
-
-	return ret;
-}
-
-static void rrpc_area_free(struct rrpc *rrpc)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	sector_t begin = rrpc->soffset << (ilog2(dev->geo.sec_size) - 9);
-
-	nvm_put_area(dev, begin);
-}
-
-static void rrpc_free(struct rrpc *rrpc)
-{
-	rrpc_gc_free(rrpc);
-	rrpc_map_free(rrpc);
-	rrpc_core_free(rrpc);
-	rrpc_luns_free(rrpc);
-	rrpc_area_free(rrpc);
-
-	kfree(rrpc);
-}
-
-static void rrpc_exit(void *private)
-{
-	struct rrpc *rrpc = private;
-
-	del_timer(&rrpc->gc_timer);
-
-	flush_workqueue(rrpc->krqd_wq);
-	flush_workqueue(rrpc->kgc_wq);
-
-	rrpc_free(rrpc);
-}
-
-static sector_t rrpc_capacity(void *private)
-{
-	struct rrpc *rrpc = private;
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	sector_t reserved, provisioned;
-
-	/* cur, gc, and two emergency blocks for each lun */
-	reserved = rrpc->nr_luns * dev->geo.sec_per_blk * 4;
-	provisioned = rrpc->nr_sects - reserved;
-
-	if (reserved > rrpc->nr_sects) {
-		pr_err("rrpc: not enough space available to expose storage.\n");
-		return 0;
-	}
-
-	sector_div(provisioned, 10);
-	return provisioned * 9 * NR_PHY_IN_LOG;
-}
-
-/*
- * Looks up the logical address from reverse trans map and check if its valid by
- * comparing the logical to physical address with the physical address.
- * Returns 0 on free, otherwise 1 if in use
- */
-static void rrpc_block_map_update(struct rrpc *rrpc, struct rrpc_block *rblk)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	int offset;
-	struct rrpc_addr *laddr;
-	u64 bpaddr, paddr, pladdr;
-
-	bpaddr = block_to_rel_addr(rrpc, rblk);
-	for (offset = 0; offset < dev->geo.sec_per_blk; offset++) {
-		paddr = bpaddr + offset;
-
-		pladdr = rrpc->rev_trans_map[paddr].addr;
-		if (pladdr == ADDR_EMPTY)
-			continue;
-
-		laddr = &rrpc->trans_map[pladdr];
-
-		if (paddr == laddr->addr) {
-			laddr->rblk = rblk;
-		} else {
-			set_bit(offset, rblk->invalid_pages);
-			rblk->nr_invalid_pages++;
-		}
-	}
-}
-
-static int rrpc_blocks_init(struct rrpc *rrpc)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct rrpc_lun *rlun;
-	struct rrpc_block *rblk;
-	int lun_iter, blk_iter;
-
-	for (lun_iter = 0; lun_iter < rrpc->nr_luns; lun_iter++) {
-		rlun = &rrpc->luns[lun_iter];
-
-		for (blk_iter = 0; blk_iter < dev->geo.blks_per_lun;
-								blk_iter++) {
-			rblk = &rlun->blocks[blk_iter];
-			rrpc_block_map_update(rrpc, rblk);
-		}
-	}
-
-	return 0;
-}
-
-static int rrpc_luns_configure(struct rrpc *rrpc)
-{
-	struct rrpc_lun *rlun;
-	struct rrpc_block *rblk;
-	int i;
-
-	for (i = 0; i < rrpc->nr_luns; i++) {
-		rlun = &rrpc->luns[i];
-
-		rblk = rrpc_get_blk(rrpc, rlun, 0);
-		if (!rblk)
-			goto err;
-		rrpc_set_lun_cur(rlun, rblk, &rlun->cur);
-
-		/* Emergency gc block */
-		rblk = rrpc_get_blk(rrpc, rlun, 1);
-		if (!rblk)
-			goto err;
-		rrpc_set_lun_cur(rlun, rblk, &rlun->gc_cur);
-	}
-
-	return 0;
-err:
-	rrpc_put_blks(rrpc);
-	return -EINVAL;
-}
-
-static struct nvm_tgt_type tt_rrpc;
-
-static void *rrpc_init(struct nvm_tgt_dev *dev, struct gendisk *tdisk,
-		       int flags)
-{
-	struct request_queue *bqueue = dev->q;
-	struct request_queue *tqueue = tdisk->queue;
-	struct nvm_geo *geo = &dev->geo;
-	struct rrpc *rrpc;
-	sector_t soffset;
-	int ret;
-
-	if (!(dev->identity.dom & NVM_RSP_L2P)) {
-		pr_err("nvm: rrpc: device does not support l2p (%x)\n",
-							dev->identity.dom);
-		return ERR_PTR(-EINVAL);
-	}
-
-	rrpc = kzalloc(sizeof(struct rrpc), GFP_KERNEL);
-	if (!rrpc)
-		return ERR_PTR(-ENOMEM);
-
-	rrpc->dev = dev;
-	rrpc->disk = tdisk;
-
-	bio_list_init(&rrpc->requeue_bios);
-	spin_lock_init(&rrpc->bio_lock);
-	INIT_WORK(&rrpc->ws_requeue, rrpc_requeue);
-
-	rrpc->nr_luns = geo->nr_luns;
-	rrpc->nr_sects = (unsigned long long)geo->sec_per_lun * rrpc->nr_luns;
-
-	/* simple round-robin strategy */
-	atomic_set(&rrpc->next_lun, -1);
-
-	ret = rrpc_area_init(rrpc, &soffset);
-	if (ret < 0) {
-		pr_err("nvm: rrpc: could not initialize area\n");
-		return ERR_PTR(ret);
-	}
-	rrpc->soffset = soffset;
-
-	ret = rrpc_luns_init(rrpc, dev->luns);
-	if (ret) {
-		pr_err("nvm: rrpc: could not initialize luns\n");
-		goto err;
-	}
-
-	ret = rrpc_core_init(rrpc);
-	if (ret) {
-		pr_err("nvm: rrpc: could not initialize core\n");
-		goto err;
-	}
-
-	ret = rrpc_map_init(rrpc);
-	if (ret) {
-		pr_err("nvm: rrpc: could not initialize maps\n");
-		goto err;
-	}
-
-	ret = rrpc_blocks_init(rrpc);
-	if (ret) {
-		pr_err("nvm: rrpc: could not initialize state for blocks\n");
-		goto err;
-	}
-
-	ret = rrpc_luns_configure(rrpc);
-	if (ret) {
-		pr_err("nvm: rrpc: not enough blocks available in LUNs.\n");
-		goto err;
-	}
-
-	ret = rrpc_gc_init(rrpc);
-	if (ret) {
-		pr_err("nvm: rrpc: could not initialize gc\n");
-		goto err;
-	}
-
-	/* inherit the size from the underlying device */
-	blk_queue_logical_block_size(tqueue, queue_physical_block_size(bqueue));
-	blk_queue_max_hw_sectors(tqueue, queue_max_hw_sectors(bqueue));
-
-	pr_info("nvm: rrpc initialized with %u luns and %llu pages.\n",
-			rrpc->nr_luns, (unsigned long long)rrpc->nr_sects);
-
-	mod_timer(&rrpc->gc_timer, jiffies + msecs_to_jiffies(10));
-
-	return rrpc;
-err:
-	rrpc_free(rrpc);
-	return ERR_PTR(ret);
-}
-
-/* round robin, page-based FTL, and cost-based GC */
-static struct nvm_tgt_type tt_rrpc = {
-	.name		= "rrpc",
-	.version	= {1, 0, 0},
-
-	.make_rq	= rrpc_make_rq,
-	.capacity	= rrpc_capacity,
-
-	.init		= rrpc_init,
-	.exit		= rrpc_exit,
-};
-
-static int __init rrpc_module_init(void)
-{
-	return nvm_register_tgt_type(&tt_rrpc);
-}
-
-static void rrpc_module_exit(void)
-{
-	nvm_unregister_tgt_type(&tt_rrpc);
-}
-
-module_init(rrpc_module_init);
-module_exit(rrpc_module_exit);
-MODULE_LICENSE("GPL v2");
-MODULE_DESCRIPTION("Block-Device Target for Open-Channel SSDs");
diff --git a/drivers/lightnvm/rrpc.h b/drivers/lightnvm/rrpc.h
deleted file mode 100644
index fdb6ff9..0000000
--- a/drivers/lightnvm/rrpc.h
+++ /dev/null
@@ -1,290 +0,0 @@
-/*
- * Copyright (C) 2015 IT University of Copenhagen
- * Initial release: Matias Bjorling <m@bjorling.me>
- *
- * This program is free software; you can redistribute it and/or
- * modify it under the terms of the GNU General Public License version
- * 2 as published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful, but
- * WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- * General Public License for more details.
- *
- * Implementation of a Round-robin page-based Hybrid FTL for Open-channel SSDs.
- */
-
-#ifndef RRPC_H_
-#define RRPC_H_
-
-#include <linux/blkdev.h>
-#include <linux/blk-mq.h>
-#include <linux/bio.h>
-#include <linux/module.h>
-#include <linux/kthread.h>
-#include <linux/vmalloc.h>
-
-#include <linux/lightnvm.h>
-
-/* Run only GC if less than 1/X blocks are free */
-#define GC_LIMIT_INVERSE 10
-#define GC_TIME_SECS 100
-
-#define RRPC_SECTOR (512)
-#define RRPC_EXPOSED_PAGE_SIZE (4096)
-
-#define NR_PHY_IN_LOG (RRPC_EXPOSED_PAGE_SIZE / RRPC_SECTOR)
-
-struct rrpc_inflight {
-	struct list_head reqs;
-	spinlock_t lock;
-};
-
-struct rrpc_inflight_rq {
-	struct list_head list;
-	sector_t l_start;
-	sector_t l_end;
-};
-
-struct rrpc_rq {
-	struct rrpc_inflight_rq inflight_rq;
-	unsigned long flags;
-};
-
-struct rrpc_block {
-	int id;				/* id inside of LUN */
-	struct rrpc_lun *rlun;
-
-	struct list_head prio;		/* LUN CG list */
-	struct list_head list;		/* LUN free, used, bb list */
-
-#define MAX_INVALID_PAGES_STORAGE 8
-	/* Bitmap for invalid page intries */
-	unsigned long invalid_pages[MAX_INVALID_PAGES_STORAGE];
-	/* points to the next writable page within a block */
-	unsigned int next_page;
-	/* number of pages that are invalid, wrt host page size */
-	unsigned int nr_invalid_pages;
-
-	int state;
-
-	spinlock_t lock;
-	atomic_t data_cmnt_size; /* data pages committed to stable storage */
-};
-
-struct rrpc_lun {
-	struct rrpc *rrpc;
-
-	int id;
-	struct ppa_addr bppa;
-
-	struct rrpc_block *cur, *gc_cur;
-	struct rrpc_block *blocks;	/* Reference to block allocation */
-
-	struct list_head prio_list;	/* Blocks that may be GC'ed */
-	struct list_head wblk_list;	/* Queued blocks to be written to */
-
-	/* lun block lists */
-	struct list_head used_list;	/* In-use blocks */
-	struct list_head free_list;	/* Not used blocks i.e. released
-					 * and ready for use
-					 */
-	struct list_head bb_list;	/* Bad blocks. Mutually exclusive with
-					 * free_list and used_list
-					 */
-	unsigned int nr_free_blocks;	/* Number of unused blocks */
-
-	struct work_struct ws_gc;
-
-	int reserved_blocks;
-
-	spinlock_t lock;
-};
-
-struct rrpc {
-	struct nvm_tgt_dev *dev;
-	struct gendisk *disk;
-
-	sector_t soffset; /* logical sector offset */
-
-	int nr_luns;
-	struct rrpc_lun *luns;
-
-	/* calculated values */
-	unsigned long long nr_sects;
-
-	/* Write strategy variables. Move these into each for structure for each
-	 * strategy
-	 */
-	atomic_t next_lun; /* Whenever a page is written, this is updated
-			    * to point to the next write lun
-			    */
-
-	spinlock_t bio_lock;
-	struct bio_list requeue_bios;
-	struct work_struct ws_requeue;
-
-	/* Simple translation map of logical addresses to physical addresses.
-	 * The logical addresses is known by the host system, while the physical
-	 * addresses are used when writing to the disk block device.
-	 */
-	struct rrpc_addr *trans_map;
-	/* also store a reverse map for garbage collection */
-	struct rrpc_rev_addr *rev_trans_map;
-	spinlock_t rev_lock;
-
-	struct rrpc_inflight inflights;
-
-	mempool_t *addr_pool;
-	mempool_t *page_pool;
-	mempool_t *gcb_pool;
-	mempool_t *rq_pool;
-
-	struct timer_list gc_timer;
-	struct workqueue_struct *krqd_wq;
-	struct workqueue_struct *kgc_wq;
-};
-
-struct rrpc_block_gc {
-	struct rrpc *rrpc;
-	struct rrpc_block *rblk;
-	struct work_struct ws_gc;
-};
-
-/* Logical to physical mapping */
-struct rrpc_addr {
-	u64 addr;
-	struct rrpc_block *rblk;
-};
-
-/* Physical to logical mapping */
-struct rrpc_rev_addr {
-	u64 addr;
-};
-
-static inline struct ppa_addr rrpc_linear_to_generic_addr(struct nvm_geo *geo,
-							  struct ppa_addr r)
-{
-	struct ppa_addr l;
-	int secs, pgs;
-	sector_t ppa = r.ppa;
-
-	l.ppa = 0;
-
-	div_u64_rem(ppa, geo->sec_per_pg, &secs);
-	l.g.sec = secs;
-
-	sector_div(ppa, geo->sec_per_pg);
-	div_u64_rem(ppa, geo->pgs_per_blk, &pgs);
-	l.g.pg = pgs;
-
-	return l;
-}
-
-static inline struct ppa_addr rrpc_recov_addr(struct nvm_tgt_dev *dev, u64 pba)
-{
-	return linear_to_generic_addr(&dev->geo, pba);
-}
-
-static inline u64 rrpc_blk_to_ppa(struct rrpc *rrpc, struct rrpc_block *rblk)
-{
-	struct nvm_tgt_dev *dev = rrpc->dev;
-	struct nvm_geo *geo = &dev->geo;
-	struct rrpc_lun *rlun = rblk->rlun;
-
-	return (rlun->id * geo->sec_per_lun) + (rblk->id * geo->sec_per_blk);
-}
-
-static inline sector_t rrpc_get_laddr(struct bio *bio)
-{
-	return bio->bi_iter.bi_sector / NR_PHY_IN_LOG;
-}
-
-static inline unsigned int rrpc_get_pages(struct bio *bio)
-{
-	return  bio->bi_iter.bi_size / RRPC_EXPOSED_PAGE_SIZE;
-}
-
-static inline sector_t rrpc_get_sector(sector_t laddr)
-{
-	return laddr * NR_PHY_IN_LOG;
-}
-
-static inline int request_intersects(struct rrpc_inflight_rq *r,
-				sector_t laddr_start, sector_t laddr_end)
-{
-	return (laddr_end >= r->l_start) && (laddr_start <= r->l_end);
-}
-
-static int __rrpc_lock_laddr(struct rrpc *rrpc, sector_t laddr,
-			     unsigned int pages, struct rrpc_inflight_rq *r)
-{
-	sector_t laddr_end = laddr + pages - 1;
-	struct rrpc_inflight_rq *rtmp;
-
-	WARN_ON(irqs_disabled());
-
-	spin_lock_irq(&rrpc->inflights.lock);
-	list_for_each_entry(rtmp, &rrpc->inflights.reqs, list) {
-		if (unlikely(request_intersects(rtmp, laddr, laddr_end))) {
-			/* existing, overlapping request, come back later */
-			spin_unlock_irq(&rrpc->inflights.lock);
-			return 1;
-		}
-	}
-
-	r->l_start = laddr;
-	r->l_end = laddr_end;
-
-	list_add_tail(&r->list, &rrpc->inflights.reqs);
-	spin_unlock_irq(&rrpc->inflights.lock);
-	return 0;
-}
-
-static inline int rrpc_lock_laddr(struct rrpc *rrpc, sector_t laddr,
-				 unsigned int pages,
-				 struct rrpc_inflight_rq *r)
-{
-	BUG_ON((laddr + pages) > rrpc->nr_sects);
-
-	return __rrpc_lock_laddr(rrpc, laddr, pages, r);
-}
-
-static inline struct rrpc_inflight_rq *rrpc_get_inflight_rq(struct nvm_rq *rqd)
-{
-	struct rrpc_rq *rrqd = nvm_rq_to_pdu(rqd);
-
-	return &rrqd->inflight_rq;
-}
-
-static inline int rrpc_lock_rq(struct rrpc *rrpc, struct bio *bio,
-							struct nvm_rq *rqd)
-{
-	sector_t laddr = rrpc_get_laddr(bio);
-	unsigned int pages = rrpc_get_pages(bio);
-	struct rrpc_inflight_rq *r = rrpc_get_inflight_rq(rqd);
-
-	return rrpc_lock_laddr(rrpc, laddr, pages, r);
-}
-
-static inline void rrpc_unlock_laddr(struct rrpc *rrpc,
-						struct rrpc_inflight_rq *r)
-{
-	unsigned long flags;
-
-	spin_lock_irqsave(&rrpc->inflights.lock, flags);
-	list_del_init(&r->list);
-	spin_unlock_irqrestore(&rrpc->inflights.lock, flags);
-}
-
-static inline void rrpc_unlock_rq(struct rrpc *rrpc, struct nvm_rq *rqd)
-{
-	struct rrpc_inflight_rq *r = rrpc_get_inflight_rq(rqd);
-	uint8_t pages = rqd->nr_ppas;
-
-	BUG_ON((r->l_start + pages) > rrpc->nr_sects);
-
-	rrpc_unlock_laddr(rrpc, r);
-}
-
-#endif /* RRPC_H_ */
diff --git a/drivers/md/bcache/alloc.c b/drivers/md/bcache/alloc.c
index a0cc1bc..6cc6c0f 100644
--- a/drivers/md/bcache/alloc.c
+++ b/drivers/md/bcache/alloc.c
@@ -525,15 +525,21 @@ struct open_bucket {
 
 /*
  * We keep multiple buckets open for writes, and try to segregate different
- * write streams for better cache utilization: first we look for a bucket where
- * the last write to it was sequential with the current write, and failing that
- * we look for a bucket that was last used by the same task.
+ * write streams for better cache utilization: first we try to segregate flash
+ * only volume write streams from cached devices, secondly we look for a bucket
+ * where the last write to it was sequential with the current write, and
+ * failing that we look for a bucket that was last used by the same task.
  *
  * The ideas is if you've got multiple tasks pulling data into the cache at the
  * same time, you'll get better cache utilization if you try to segregate their
  * data and preserve locality.
  *
- * For example, say you've starting Firefox at the same time you're copying a
+ * For example, dirty sectors of flash only volume is not reclaimable, if their
+ * dirty sectors mixed with dirty sectors of cached device, such buckets will
+ * be marked as dirty and won't be reclaimed, though the dirty data of cached
+ * device have been written back to backend device.
+ *
+ * And say you've starting Firefox at the same time you're copying a
  * bunch of files. Firefox will likely end up being fairly hot and stay in the
  * cache awhile, but the data you copied might not be; if you wrote all that
  * data to the same buckets it'd get invalidated at the same time.
@@ -550,7 +556,10 @@ static struct open_bucket *pick_data_bucket(struct cache_set *c,
 	struct open_bucket *ret, *ret_task = NULL;
 
 	list_for_each_entry_reverse(ret, &c->data_buckets, list)
-		if (!bkey_cmp(&ret->key, search))
+		if (UUID_FLASH_ONLY(&c->uuids[KEY_INODE(&ret->key)]) !=
+		    UUID_FLASH_ONLY(&c->uuids[KEY_INODE(search)]))
+			continue;
+		else if (!bkey_cmp(&ret->key, search))
 			goto found;
 		else if (ret->last_write_point == write_point)
 			ret_task = ret;
diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 843877e..5e2d4e8 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -320,15 +320,16 @@ struct cached_dev {
 	 */
 	atomic_t		has_dirty;
 
+	/*
+	 * Set to zero by things that touch the backing volume-- except
+	 * writeback.  Incremented by writeback.  Used to determine when to
+	 * accelerate idle writeback.
+	 */
+	atomic_t		backing_idle;
+
 	struct bch_ratelimit	writeback_rate;
 	struct delayed_work	writeback_rate_update;
 
-	/*
-	 * Internal to the writeback code, so read_dirty() can keep track of
-	 * where it's at.
-	 */
-	sector_t		last_read;
-
 	/* Limit number of writeback bios in flight */
 	struct semaphore	in_flight;
 	struct task_struct	*writeback_thread;
@@ -336,6 +337,14 @@ struct cached_dev {
 
 	struct keybuf		writeback_keys;
 
+	/*
+	 * Order the write-half of writeback operations strongly in dispatch
+	 * order.  (Maintain LBA order; don't allow reads completing out of
+	 * order to re-order the writes...)
+	 */
+	struct closure_waitlist writeback_ordering_wait;
+	atomic_t		writeback_sequence_next;
+
 	/* For tracking sequential IO */
 #define RECENT_IO_BITS	7
 #define RECENT_IO	(1 << RECENT_IO_BITS)
@@ -488,6 +497,7 @@ struct cache_set {
 	int			caches_loaded;
 
 	struct bcache_device	**devices;
+	unsigned		devices_max_used;
 	struct list_head	cached_devs;
 	uint64_t		cached_dev_sectors;
 	struct closure		caching;
@@ -852,7 +862,7 @@ static inline void wake_up_allocators(struct cache_set *c)
 
 /* Forward declarations */
 
-void bch_count_io_errors(struct cache *, blk_status_t, const char *);
+void bch_count_io_errors(struct cache *, blk_status_t, int, const char *);
 void bch_bbio_count_io_errors(struct cache_set *, struct bio *,
 			      blk_status_t, const char *);
 void bch_bbio_endio(struct cache_set *, struct bio *, blk_status_t,
diff --git a/drivers/md/bcache/btree.c b/drivers/md/bcache/btree.c
index 81e8dc3..bf3a48a 100644
--- a/drivers/md/bcache/btree.c
+++ b/drivers/md/bcache/btree.c
@@ -419,7 +419,7 @@ static void do_btree_node_write(struct btree *b)
 	SET_PTR_OFFSET(&k.key, 0, PTR_OFFSET(&k.key, 0) +
 		       bset_sector_offset(&b->keys, i));
 
-	if (!bio_alloc_pages(b->bio, __GFP_NOWARN|GFP_NOWAIT)) {
+	if (!bch_bio_alloc_pages(b->bio, __GFP_NOWARN|GFP_NOWAIT)) {
 		int j;
 		struct bio_vec *bv;
 		void *base = (void *) ((unsigned long) i & ~(PAGE_SIZE - 1));
@@ -432,6 +432,7 @@ static void do_btree_node_write(struct btree *b)
 
 		continue_at(cl, btree_node_write_done, NULL);
 	} else {
+		/* No problem for multipage bvec since the bio is just allocated */
 		b->bio->bi_vcnt = 0;
 		bch_bio_map(b->bio, i);
 
@@ -1678,7 +1679,7 @@ static void bch_btree_gc_finish(struct cache_set *c)
 
 	/* don't reclaim buckets to which writeback keys point */
 	rcu_read_lock();
-	for (i = 0; i < c->nr_uuids; i++) {
+	for (i = 0; i < c->devices_max_used; i++) {
 		struct bcache_device *d = c->devices[i];
 		struct cached_dev *dc;
 		struct keybuf_key *w, *n;
@@ -1803,10 +1804,7 @@ static int bch_gc_thread(void *arg)
 int bch_gc_thread_start(struct cache_set *c)
 {
 	c->gc_thread = kthread_run(bch_gc_thread, c, "bcache_gc");
-	if (IS_ERR(c->gc_thread))
-		return PTR_ERR(c->gc_thread);
-
-	return 0;
+	return PTR_ERR_OR_ZERO(c->gc_thread);
 }
 
 /* Initial partial gc */
diff --git a/drivers/md/bcache/closure.c b/drivers/md/bcache/closure.c
index 1841d03..7f12920 100644
--- a/drivers/md/bcache/closure.c
+++ b/drivers/md/bcache/closure.c
@@ -8,6 +8,7 @@
 #include <linux/debugfs.h>
 #include <linux/module.h>
 #include <linux/seq_file.h>
+#include <linux/sched/debug.h>
 
 #include "closure.h"
 
@@ -18,10 +19,6 @@ static inline void closure_put_after_sub(struct closure *cl, int flags)
 	BUG_ON(flags & CLOSURE_GUARD_MASK);
 	BUG_ON(!r && (flags & ~CLOSURE_DESTRUCTOR));
 
-	/* Must deliver precisely one wakeup */
-	if (r == 1 && (flags & CLOSURE_SLEEPING))
-		wake_up_process(cl->task);
-
 	if (!r) {
 		if (cl->fn && !(flags & CLOSURE_DESTRUCTOR)) {
 			atomic_set(&cl->remaining,
@@ -100,28 +97,34 @@ bool closure_wait(struct closure_waitlist *waitlist, struct closure *cl)
 }
 EXPORT_SYMBOL(closure_wait);
 
-/**
- * closure_sync - sleep until a closure has nothing left to wait on
- *
- * Sleeps until the refcount hits 1 - the thread that's running the closure owns
- * the last refcount.
- */
-void closure_sync(struct closure *cl)
+struct closure_syncer {
+	struct task_struct	*task;
+	int			done;
+};
+
+static void closure_sync_fn(struct closure *cl)
 {
+	cl->s->done = 1;
+	wake_up_process(cl->s->task);
+}
+
+void __sched __closure_sync(struct closure *cl)
+{
+	struct closure_syncer s = { .task = current };
+
+	cl->s = &s;
+	continue_at(cl, closure_sync_fn, NULL);
+
 	while (1) {
-		__closure_start_sleep(cl);
-		closure_set_ret_ip(cl);
-
-		if ((atomic_read(&cl->remaining) &
-		     CLOSURE_REMAINING_MASK) == 1)
+		set_current_state(TASK_UNINTERRUPTIBLE);
+		if (s.done)
 			break;
-
 		schedule();
 	}
 
-	__closure_end_sleep(cl);
+	__set_current_state(TASK_RUNNING);
 }
-EXPORT_SYMBOL(closure_sync);
+EXPORT_SYMBOL(__closure_sync);
 
 #ifdef CONFIG_BCACHE_CLOSURES_DEBUG
 
@@ -168,12 +171,10 @@ static int debug_seq_show(struct seq_file *f, void *data)
 			   cl, (void *) cl->ip, cl->fn, cl->parent,
 			   r & CLOSURE_REMAINING_MASK);
 
-		seq_printf(f, "%s%s%s%s\n",
+		seq_printf(f, "%s%s\n",
 			   test_bit(WORK_STRUCT_PENDING_BIT,
 				    work_data_bits(&cl->work)) ? "Q" : "",
-			   r & CLOSURE_RUNNING	? "R" : "",
-			   r & CLOSURE_STACK	? "S" : "",
-			   r & CLOSURE_SLEEPING	? "Sl" : "");
+			   r & CLOSURE_RUNNING	? "R" : "");
 
 		if (r & CLOSURE_WAITING)
 			seq_printf(f, " W %pF\n",
diff --git a/drivers/md/bcache/closure.h b/drivers/md/bcache/closure.h
index ccfbea6..3b9dfc9 100644
--- a/drivers/md/bcache/closure.h
+++ b/drivers/md/bcache/closure.h
@@ -103,6 +103,7 @@
  */
 
 struct closure;
+struct closure_syncer;
 typedef void (closure_fn) (struct closure *);
 
 struct closure_waitlist {
@@ -115,10 +116,6 @@ enum closure_state {
 	 * the thread that owns the closure, and cleared by the thread that's
 	 * waking up the closure.
 	 *
-	 * CLOSURE_SLEEPING: Must be set before a thread uses a closure to sleep
-	 * - indicates that cl->task is valid and closure_put() may wake it up.
-	 * Only set or cleared by the thread that owns the closure.
-	 *
 	 * The rest are for debugging and don't affect behaviour:
 	 *
 	 * CLOSURE_RUNNING: Set when a closure is running (i.e. by
@@ -128,22 +125,16 @@ enum closure_state {
 	 * continue_at() and closure_return() clear it for you, if you're doing
 	 * something unusual you can use closure_set_dead() which also helps
 	 * annotate where references are being transferred.
-	 *
-	 * CLOSURE_STACK: Sanity check - remaining should never hit 0 on a
-	 * closure with this flag set
 	 */
 
-	CLOSURE_BITS_START	= (1 << 23),
-	CLOSURE_DESTRUCTOR	= (1 << 23),
-	CLOSURE_WAITING		= (1 << 25),
-	CLOSURE_SLEEPING	= (1 << 27),
-	CLOSURE_RUNNING		= (1 << 29),
-	CLOSURE_STACK		= (1 << 31),
+	CLOSURE_BITS_START	= (1U << 26),
+	CLOSURE_DESTRUCTOR	= (1U << 26),
+	CLOSURE_WAITING		= (1U << 28),
+	CLOSURE_RUNNING		= (1U << 30),
 };
 
 #define CLOSURE_GUARD_MASK					\
-	((CLOSURE_DESTRUCTOR|CLOSURE_WAITING|CLOSURE_SLEEPING|	\
-	  CLOSURE_RUNNING|CLOSURE_STACK) << 1)
+	((CLOSURE_DESTRUCTOR|CLOSURE_WAITING|CLOSURE_RUNNING) << 1)
 
 #define CLOSURE_REMAINING_MASK		(CLOSURE_BITS_START - 1)
 #define CLOSURE_REMAINING_INITIALIZER	(1|CLOSURE_RUNNING)
@@ -152,7 +143,7 @@ struct closure {
 	union {
 		struct {
 			struct workqueue_struct *wq;
-			struct task_struct	*task;
+			struct closure_syncer	*s;
 			struct llist_node	list;
 			closure_fn		*fn;
 		};
@@ -178,7 +169,19 @@ void closure_sub(struct closure *cl, int v);
 void closure_put(struct closure *cl);
 void __closure_wake_up(struct closure_waitlist *list);
 bool closure_wait(struct closure_waitlist *list, struct closure *cl);
-void closure_sync(struct closure *cl);
+void __closure_sync(struct closure *cl);
+
+/**
+ * closure_sync - sleep until a closure a closure has nothing left to wait on
+ *
+ * Sleeps until the refcount hits 1 - the thread that's running the closure owns
+ * the last refcount.
+ */
+static inline void closure_sync(struct closure *cl)
+{
+	if ((atomic_read(&cl->remaining) & CLOSURE_REMAINING_MASK) != 1)
+		__closure_sync(cl);
+}
 
 #ifdef CONFIG_BCACHE_CLOSURES_DEBUG
 
@@ -215,24 +218,6 @@ static inline void closure_set_waiting(struct closure *cl, unsigned long f)
 #endif
 }
 
-static inline void __closure_end_sleep(struct closure *cl)
-{
-	__set_current_state(TASK_RUNNING);
-
-	if (atomic_read(&cl->remaining) & CLOSURE_SLEEPING)
-		atomic_sub(CLOSURE_SLEEPING, &cl->remaining);
-}
-
-static inline void __closure_start_sleep(struct closure *cl)
-{
-	closure_set_ip(cl);
-	cl->task = current;
-	set_current_state(TASK_UNINTERRUPTIBLE);
-
-	if (!(atomic_read(&cl->remaining) & CLOSURE_SLEEPING))
-		atomic_add(CLOSURE_SLEEPING, &cl->remaining);
-}
-
 static inline void closure_set_stopped(struct closure *cl)
 {
 	atomic_sub(CLOSURE_RUNNING, &cl->remaining);
@@ -241,7 +226,6 @@ static inline void closure_set_stopped(struct closure *cl)
 static inline void set_closure_fn(struct closure *cl, closure_fn *fn,
 				  struct workqueue_struct *wq)
 {
-	BUG_ON(object_is_on_stack(cl));
 	closure_set_ip(cl);
 	cl->fn = fn;
 	cl->wq = wq;
@@ -300,7 +284,7 @@ static inline void closure_init(struct closure *cl, struct closure *parent)
 static inline void closure_init_stack(struct closure *cl)
 {
 	memset(cl, 0, sizeof(struct closure));
-	atomic_set(&cl->remaining, CLOSURE_REMAINING_INITIALIZER|CLOSURE_STACK);
+	atomic_set(&cl->remaining, CLOSURE_REMAINING_INITIALIZER);
 }
 
 /**
@@ -322,6 +306,8 @@ static inline void closure_wake_up(struct closure_waitlist *list)
  * This is because after calling continue_at() you no longer have a ref on @cl,
  * and whatever @cl owns may be freed out from under you - a running closure fn
  * has a ref on its own closure which continue_at() drops.
+ *
+ * Note you are expected to immediately return after using this macro.
  */
 #define continue_at(_cl, _fn, _wq)					\
 do {									\
diff --git a/drivers/md/bcache/debug.c b/drivers/md/bcache/debug.c
index c7a02c4..af89408 100644
--- a/drivers/md/bcache/debug.c
+++ b/drivers/md/bcache/debug.c
@@ -116,7 +116,7 @@ void bch_data_verify(struct cached_dev *dc, struct bio *bio)
 		return;
 	check->bi_opf = REQ_OP_READ;
 
-	if (bio_alloc_pages(check, GFP_NOIO))
+	if (bch_bio_alloc_pages(check, GFP_NOIO))
 		goto out_put;
 
 	submit_bio_wait(check);
@@ -251,8 +251,7 @@ void bch_debug_exit(void)
 
 int __init bch_debug_init(struct kobject *kobj)
 {
-	int ret = 0;
-
 	debug = debugfs_create_dir("bcache", NULL);
-	return ret;
+
+	return IS_ERR_OR_NULL(debug);
 }
diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index fac97ec..a783c5a41 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -51,7 +51,10 @@ void bch_submit_bbio(struct bio *bio, struct cache_set *c,
 
 /* IO errors */
 
-void bch_count_io_errors(struct cache *ca, blk_status_t error, const char *m)
+void bch_count_io_errors(struct cache *ca,
+			 blk_status_t error,
+			 int is_read,
+			 const char *m)
 {
 	/*
 	 * The halflife of an error is:
@@ -94,8 +97,9 @@ void bch_count_io_errors(struct cache *ca, blk_status_t error, const char *m)
 		errors >>= IO_ERROR_SHIFT;
 
 		if (errors < ca->set->error_limit)
-			pr_err("%s: IO error on %s, recovering",
-			       bdevname(ca->bdev, buf), m);
+			pr_err("%s: IO error on %s%s",
+			       bdevname(ca->bdev, buf), m,
+			       is_read ? ", recovering." : ".");
 		else
 			bch_cache_set_error(ca->set,
 					    "%s: too many IO errors %s",
@@ -108,6 +112,7 @@ void bch_bbio_count_io_errors(struct cache_set *c, struct bio *bio,
 {
 	struct bbio *b = container_of(bio, struct bbio, bio);
 	struct cache *ca = PTR_CACHE(c, &b->key, 0);
+	int is_read = (bio_data_dir(bio) == READ ? 1 : 0);
 
 	unsigned threshold = op_is_write(bio_op(bio))
 		? c->congested_write_threshold_us
@@ -129,7 +134,7 @@ void bch_bbio_count_io_errors(struct cache_set *c, struct bio *bio,
 			atomic_inc(&c->congested);
 	}
 
-	bch_count_io_errors(ca, error, m);
+	bch_count_io_errors(ca, error, is_read, m);
 }
 
 void bch_bbio_endio(struct cache_set *c, struct bio *bio,
diff --git a/drivers/md/bcache/movinggc.c b/drivers/md/bcache/movinggc.c
index d50c1c9..a24c3a9 100644
--- a/drivers/md/bcache/movinggc.c
+++ b/drivers/md/bcache/movinggc.c
@@ -162,7 +162,7 @@ static void read_moving(struct cache_set *c)
 		bio_set_op_attrs(bio, REQ_OP_READ, 0);
 		bio->bi_end_io	= read_moving_endio;
 
-		if (bio_alloc_pages(bio, GFP_KERNEL))
+		if (bch_bio_alloc_pages(bio, GFP_KERNEL))
 			goto err;
 
 		trace_bcache_gc_copy(&w->key);
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 643c3021..1a46b41 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -576,6 +576,7 @@ static void cache_lookup(struct closure *cl)
 {
 	struct search *s = container_of(cl, struct search, iop.cl);
 	struct bio *bio = &s->bio.bio;
+	struct cached_dev *dc;
 	int ret;
 
 	bch_btree_op_init(&s->op, -1);
@@ -588,6 +589,27 @@ static void cache_lookup(struct closure *cl)
 		return;
 	}
 
+	/*
+	 * We might meet err when searching the btree, If that happens, we will
+	 * get negative ret, in this scenario we should not recover data from
+	 * backing device (when cache device is dirty) because we don't know
+	 * whether bkeys the read request covered are all clean.
+	 *
+	 * And after that happened, s->iop.status is still its initial value
+	 * before we submit s->bio.bio
+	 */
+	if (ret < 0) {
+		BUG_ON(ret == -EINTR);
+		if (s->d && s->d->c &&
+				!UUID_FLASH_ONLY(&s->d->c->uuids[s->d->id])) {
+			dc = container_of(s->d, struct cached_dev, disk);
+			if (dc && atomic_read(&dc->has_dirty))
+				s->recoverable = false;
+		}
+		if (!s->iop.status)
+			s->iop.status = BLK_STS_IOERR;
+	}
+
 	closure_return(cl);
 }
 
@@ -611,8 +633,8 @@ static void request_endio(struct bio *bio)
 static void bio_complete(struct search *s)
 {
 	if (s->orig_bio) {
-		struct request_queue *q = s->orig_bio->bi_disk->queue;
-		generic_end_io_acct(q, bio_data_dir(s->orig_bio),
+		generic_end_io_acct(s->d->disk->queue,
+				    bio_data_dir(s->orig_bio),
 				    &s->d->disk->part0, s->start_time);
 
 		trace_bcache_request_end(s->d, s->orig_bio);
@@ -841,7 +863,7 @@ static int cached_dev_cache_miss(struct btree *b, struct search *s,
 	cache_bio->bi_private	= &s->cl;
 
 	bch_bio_map(cache_bio, NULL);
-	if (bio_alloc_pages(cache_bio, __GFP_NOWARN|GFP_NOIO))
+	if (bch_bio_alloc_pages(cache_bio, __GFP_NOWARN|GFP_NOIO))
 		goto out_put;
 
 	if (reada)
@@ -974,6 +996,7 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
 	struct cached_dev *dc = container_of(d, struct cached_dev, disk);
 	int rw = bio_data_dir(bio);
 
+	atomic_set(&dc->backing_idle, 0);
 	generic_start_io_acct(q, rw, bio_sectors(bio), &d->disk->part0);
 
 	bio_set_dev(bio, dc->bdev);
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index b4d2892..133b812 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -211,7 +211,7 @@ static void write_bdev_super_endio(struct bio *bio)
 
 static void __write_super(struct cache_sb *sb, struct bio *bio)
 {
-	struct cache_sb *out = page_address(bio->bi_io_vec[0].bv_page);
+	struct cache_sb *out = page_address(bio_first_page_all(bio));
 	unsigned i;
 
 	bio->bi_iter.bi_sector	= SB_SECTOR;
@@ -274,7 +274,9 @@ static void write_super_endio(struct bio *bio)
 {
 	struct cache *ca = bio->bi_private;
 
-	bch_count_io_errors(ca, bio->bi_status, "writing superblock");
+	/* is_read = 0 */
+	bch_count_io_errors(ca, bio->bi_status, 0,
+			    "writing superblock");
 	closure_put(&ca->set->sb_write);
 }
 
@@ -721,6 +723,9 @@ static void bcache_device_attach(struct bcache_device *d, struct cache_set *c,
 	d->c = c;
 	c->devices[id] = d;
 
+	if (id >= c->devices_max_used)
+		c->devices_max_used = id + 1;
+
 	closure_get(&c->caching);
 }
 
@@ -906,6 +911,12 @@ static void cached_dev_detach_finish(struct work_struct *w)
 
 	mutex_lock(&bch_register_lock);
 
+	cancel_delayed_work_sync(&dc->writeback_rate_update);
+	if (!IS_ERR_OR_NULL(dc->writeback_thread)) {
+		kthread_stop(dc->writeback_thread);
+		dc->writeback_thread = NULL;
+	}
+
 	memset(&dc->sb.set_uuid, 0, 16);
 	SET_BDEV_STATE(&dc->sb, BDEV_STATE_NONE);
 
@@ -1166,7 +1177,7 @@ static void register_bdev(struct cache_sb *sb, struct page *sb_page,
 	dc->bdev->bd_holder = dc;
 
 	bio_init(&dc->sb_bio, dc->sb_bio.bi_inline_vecs, 1);
-	dc->sb_bio.bi_io_vec[0].bv_page = sb_page;
+	bio_first_bvec_all(&dc->sb_bio)->bv_page = sb_page;
 	get_page(sb_page);
 
 	if (cached_dev_init(dc, sb->block_size << 9))
@@ -1261,7 +1272,7 @@ static int flash_devs_run(struct cache_set *c)
 	struct uuid_entry *u;
 
 	for (u = c->uuids;
-	     u < c->uuids + c->nr_uuids && !ret;
+	     u < c->uuids + c->devices_max_used && !ret;
 	     u++)
 		if (UUID_FLASH_ONLY(u))
 			ret = flash_dev_run(c, u);
@@ -1427,7 +1438,7 @@ static void __cache_set_unregister(struct closure *cl)
 
 	mutex_lock(&bch_register_lock);
 
-	for (i = 0; i < c->nr_uuids; i++)
+	for (i = 0; i < c->devices_max_used; i++)
 		if (c->devices[i]) {
 			if (!UUID_FLASH_ONLY(&c->uuids[i]) &&
 			    test_bit(CACHE_SET_UNREGISTERING, &c->flags)) {
@@ -1490,7 +1501,7 @@ struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 	c->bucket_bits		= ilog2(sb->bucket_size);
 	c->block_bits		= ilog2(sb->block_size);
 	c->nr_uuids		= bucket_bytes(c) / sizeof(struct uuid_entry);
-
+	c->devices_max_used	= 0;
 	c->btree_pages		= bucket_pages(c);
 	if (c->btree_pages > BTREE_MAX_PAGES)
 		c->btree_pages = max_t(int, c->btree_pages / 4,
@@ -1810,7 +1821,7 @@ void bch_cache_release(struct kobject *kobj)
 		free_fifo(&ca->free[i]);
 
 	if (ca->sb_bio.bi_inline_vecs[0].bv_page)
-		put_page(ca->sb_bio.bi_io_vec[0].bv_page);
+		put_page(bio_first_page_all(&ca->sb_bio));
 
 	if (!IS_ERR_OR_NULL(ca->bdev))
 		blkdev_put(ca->bdev, FMODE_READ|FMODE_WRITE|FMODE_EXCL);
@@ -1864,7 +1875,7 @@ static int register_cache(struct cache_sb *sb, struct page *sb_page,
 	ca->bdev->bd_holder = ca;
 
 	bio_init(&ca->sb_bio, ca->sb_bio.bi_inline_vecs, 1);
-	ca->sb_bio.bi_io_vec[0].bv_page = sb_page;
+	bio_first_bvec_all(&ca->sb_bio)->bv_page = sb_page;
 	get_page(sb_page);
 
 	if (blk_queue_discard(bdev_get_queue(ca->bdev)))
diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
index e548b8b..a23cd6a 100644
--- a/drivers/md/bcache/util.c
+++ b/drivers/md/bcache/util.c
@@ -249,6 +249,13 @@ uint64_t bch_next_delay(struct bch_ratelimit *d, uint64_t done)
 		: 0;
 }
 
+/*
+ * Generally it isn't good to access .bi_io_vec and .bi_vcnt directly,
+ * the preferred way is bio_add_page, but in this case, bch_bio_map()
+ * supposes that the bvec table is empty, so it is safe to access
+ * .bi_vcnt & .bi_io_vec in this way even after multipage bvec is
+ * supported.
+ */
 void bch_bio_map(struct bio *bio, void *base)
 {
 	size_t size = bio->bi_iter.bi_size;
@@ -276,6 +283,33 @@ start:		bv->bv_len	= min_t(size_t, PAGE_SIZE - bv->bv_offset,
 	}
 }
 
+/**
+ * bch_bio_alloc_pages - allocates a single page for each bvec in a bio
+ * @bio: bio to allocate pages for
+ * @gfp_mask: flags for allocation
+ *
+ * Allocates pages up to @bio->bi_vcnt.
+ *
+ * Returns 0 on success, -ENOMEM on failure. On failure, any allocated pages are
+ * freed.
+ */
+int bch_bio_alloc_pages(struct bio *bio, gfp_t gfp_mask)
+{
+	int i;
+	struct bio_vec *bv;
+
+	bio_for_each_segment_all(bv, bio, i) {
+		bv->bv_page = alloc_page(gfp_mask);
+		if (!bv->bv_page) {
+			while (--bv >= bio->bi_io_vec)
+				__free_page(bv->bv_page);
+			return -ENOMEM;
+		}
+	}
+
+	return 0;
+}
+
 /*
  * Portions Copyright (c) 1996-2001, PostgreSQL Global Development Group (Any
  * use permitted, subject to terms of PostgreSQL license; see.)
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index ed5e8a4..4df4c5c 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -558,6 +558,7 @@ static inline unsigned fract_exp_two(unsigned x, unsigned fract_bits)
 }
 
 void bch_bio_map(struct bio *bio, void *base);
+int bch_bio_alloc_pages(struct bio *bio, gfp_t gfp_mask);
 
 static inline sector_t bdev_sectors(struct block_device *bdev)
 {
diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 56a3788..51306a1 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -18,17 +18,39 @@
 #include <trace/events/bcache.h>
 
 /* Rate limiting */
+static uint64_t __calc_target_rate(struct cached_dev *dc)
+{
+	struct cache_set *c = dc->disk.c;
+
+	/*
+	 * This is the size of the cache, minus the amount used for
+	 * flash-only devices
+	 */
+	uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size -
+				bcache_flash_devs_sectors_dirty(c);
+
+	/*
+	 * Unfortunately there is no control of global dirty data.  If the
+	 * user states that they want 10% dirty data in the cache, and has,
+	 * e.g., 5 backing volumes of equal size, we try and ensure each
+	 * backing volume uses about 2% of the cache for dirty data.
+	 */
+	uint32_t bdev_share =
+		div64_u64(bdev_sectors(dc->bdev) << WRITEBACK_SHARE_SHIFT,
+				c->cached_dev_sectors);
+
+	uint64_t cache_dirty_target =
+		div_u64(cache_sectors * dc->writeback_percent, 100);
+
+	/* Ensure each backing dev gets at least one dirty share */
+	if (bdev_share < 1)
+		bdev_share = 1;
+
+	return (cache_dirty_target * bdev_share) >> WRITEBACK_SHARE_SHIFT;
+}
 
 static void __update_writeback_rate(struct cached_dev *dc)
 {
-	struct cache_set *c = dc->disk.c;
-	uint64_t cache_sectors = c->nbuckets * c->sb.bucket_size -
-				bcache_flash_devs_sectors_dirty(c);
-	uint64_t cache_dirty_target =
-		div_u64(cache_sectors * dc->writeback_percent, 100);
-	int64_t target = div64_u64(cache_dirty_target * bdev_sectors(dc->bdev),
-				   c->cached_dev_sectors);
-
 	/*
 	 * PI controller:
 	 * Figures out the amount that should be written per second.
@@ -49,6 +71,7 @@ static void __update_writeback_rate(struct cached_dev *dc)
 	 * This acts as a slow, long-term average that is not subject to
 	 * variations in usage like the p term.
 	 */
+	int64_t target = __calc_target_rate(dc);
 	int64_t dirty = bcache_dev_sectors_dirty(&dc->disk);
 	int64_t error = dirty - target;
 	int64_t proportional_scaled =
@@ -116,6 +139,7 @@ static unsigned writeback_delay(struct cached_dev *dc, unsigned sectors)
 struct dirty_io {
 	struct closure		cl;
 	struct cached_dev	*dc;
+	uint16_t		sequence;
 	struct bio		bio;
 };
 
@@ -194,6 +218,27 @@ static void write_dirty(struct closure *cl)
 {
 	struct dirty_io *io = container_of(cl, struct dirty_io, cl);
 	struct keybuf_key *w = io->bio.bi_private;
+	struct cached_dev *dc = io->dc;
+
+	uint16_t next_sequence;
+
+	if (atomic_read(&dc->writeback_sequence_next) != io->sequence) {
+		/* Not our turn to write; wait for a write to complete */
+		closure_wait(&dc->writeback_ordering_wait, cl);
+
+		if (atomic_read(&dc->writeback_sequence_next) == io->sequence) {
+			/*
+			 * Edge case-- it happened in indeterminate order
+			 * relative to when we were added to wait list..
+			 */
+			closure_wake_up(&dc->writeback_ordering_wait);
+		}
+
+		continue_at(cl, write_dirty, io->dc->writeback_write_wq);
+		return;
+	}
+
+	next_sequence = io->sequence + 1;
 
 	/*
 	 * IO errors are signalled using the dirty bit on the key.
@@ -211,6 +256,9 @@ static void write_dirty(struct closure *cl)
 		closure_bio_submit(&io->bio, cl);
 	}
 
+	atomic_set(&dc->writeback_sequence_next, next_sequence);
+	closure_wake_up(&dc->writeback_ordering_wait);
+
 	continue_at(cl, write_dirty_finish, io->dc->writeback_write_wq);
 }
 
@@ -219,8 +267,10 @@ static void read_dirty_endio(struct bio *bio)
 	struct keybuf_key *w = bio->bi_private;
 	struct dirty_io *io = w->private;
 
+	/* is_read = 1 */
 	bch_count_io_errors(PTR_CACHE(io->dc->disk.c, &w->key, 0),
-			    bio->bi_status, "reading dirty data from cache");
+			    bio->bi_status, 1,
+			    "reading dirty data from cache");
 
 	dirty_endio(bio);
 }
@@ -237,10 +287,15 @@ static void read_dirty_submit(struct closure *cl)
 static void read_dirty(struct cached_dev *dc)
 {
 	unsigned delay = 0;
-	struct keybuf_key *w;
+	struct keybuf_key *next, *keys[MAX_WRITEBACKS_IN_PASS], *w;
+	size_t size;
+	int nk, i;
 	struct dirty_io *io;
 	struct closure cl;
+	uint16_t sequence = 0;
 
+	BUG_ON(!llist_empty(&dc->writeback_ordering_wait.list));
+	atomic_set(&dc->writeback_sequence_next, sequence);
 	closure_init_stack(&cl);
 
 	/*
@@ -248,45 +303,109 @@ static void read_dirty(struct cached_dev *dc)
 	 * mempools.
 	 */
 
-	while (!kthread_should_stop()) {
+	next = bch_keybuf_next(&dc->writeback_keys);
 
-		w = bch_keybuf_next(&dc->writeback_keys);
-		if (!w)
-			break;
+	while (!kthread_should_stop() && next) {
+		size = 0;
+		nk = 0;
 
-		BUG_ON(ptr_stale(dc->disk.c, &w->key, 0));
+		do {
+			BUG_ON(ptr_stale(dc->disk.c, &next->key, 0));
 
-		if (KEY_START(&w->key) != dc->last_read ||
-		    jiffies_to_msecs(delay) > 50)
-			while (!kthread_should_stop() && delay)
-				delay = schedule_timeout_interruptible(delay);
+			/*
+			 * Don't combine too many operations, even if they
+			 * are all small.
+			 */
+			if (nk >= MAX_WRITEBACKS_IN_PASS)
+				break;
 
-		dc->last_read	= KEY_OFFSET(&w->key);
+			/*
+			 * If the current operation is very large, don't
+			 * further combine operations.
+			 */
+			if (size >= MAX_WRITESIZE_IN_PASS)
+				break;
 
-		io = kzalloc(sizeof(struct dirty_io) + sizeof(struct bio_vec)
-			     * DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS),
-			     GFP_KERNEL);
-		if (!io)
-			goto err;
+			/*
+			 * Operations are only eligible to be combined
+			 * if they are contiguous.
+			 *
+			 * TODO: add a heuristic willing to fire a
+			 * certain amount of non-contiguous IO per pass,
+			 * so that we can benefit from backing device
+			 * command queueing.
+			 */
+			if ((nk != 0) && bkey_cmp(&keys[nk-1]->key,
+						&START_KEY(&next->key)))
+				break;
 
-		w->private	= io;
-		io->dc		= dc;
+			size += KEY_SIZE(&next->key);
+			keys[nk++] = next;
+		} while ((next = bch_keybuf_next(&dc->writeback_keys)));
 
-		dirty_init(w);
-		bio_set_op_attrs(&io->bio, REQ_OP_READ, 0);
-		io->bio.bi_iter.bi_sector = PTR_OFFSET(&w->key, 0);
-		bio_set_dev(&io->bio, PTR_CACHE(dc->disk.c, &w->key, 0)->bdev);
-		io->bio.bi_end_io	= read_dirty_endio;
+		/* Now we have gathered a set of 1..5 keys to write back. */
+		for (i = 0; i < nk; i++) {
+			w = keys[i];
 
-		if (bio_alloc_pages(&io->bio, GFP_KERNEL))
-			goto err_free;
+			io = kzalloc(sizeof(struct dirty_io) +
+				     sizeof(struct bio_vec) *
+				     DIV_ROUND_UP(KEY_SIZE(&w->key), PAGE_SECTORS),
+				     GFP_KERNEL);
+			if (!io)
+				goto err;
 
-		trace_bcache_writeback(&w->key);
+			w->private	= io;
+			io->dc		= dc;
+			io->sequence    = sequence++;
 
-		down(&dc->in_flight);
-		closure_call(&io->cl, read_dirty_submit, NULL, &cl);
+			dirty_init(w);
+			bio_set_op_attrs(&io->bio, REQ_OP_READ, 0);
+			io->bio.bi_iter.bi_sector = PTR_OFFSET(&w->key, 0);
+			bio_set_dev(&io->bio,
+				    PTR_CACHE(dc->disk.c, &w->key, 0)->bdev);
+			io->bio.bi_end_io	= read_dirty_endio;
 
-		delay = writeback_delay(dc, KEY_SIZE(&w->key));
+			if (bch_bio_alloc_pages(&io->bio, GFP_KERNEL))
+				goto err_free;
+
+			trace_bcache_writeback(&w->key);
+
+			down(&dc->in_flight);
+
+			/* We've acquired a semaphore for the maximum
+			 * simultaneous number of writebacks; from here
+			 * everything happens asynchronously.
+			 */
+			closure_call(&io->cl, read_dirty_submit, NULL, &cl);
+		}
+
+		delay = writeback_delay(dc, size);
+
+		/* If the control system would wait for at least half a
+		 * second, and there's been no reqs hitting the backing disk
+		 * for awhile: use an alternate mode where we have at most
+		 * one contiguous set of writebacks in flight at a time.  If
+		 * someone wants to do IO it will be quick, as it will only
+		 * have to contend with one operation in flight, and we'll
+		 * be round-tripping data to the backing disk as quickly as
+		 * it can accept it.
+		 */
+		if (delay >= HZ / 2) {
+			/* 3 means at least 1.5 seconds, up to 7.5 if we
+			 * have slowed way down.
+			 */
+			if (atomic_inc_return(&dc->backing_idle) >= 3) {
+				/* Wait for current I/Os to finish */
+				closure_sync(&cl);
+				/* And immediately launch a new set. */
+				delay = 0;
+			}
+		}
+
+		while (!kthread_should_stop() && delay) {
+			schedule_timeout_interruptible(delay);
+			delay = writeback_delay(dc, 0);
+		}
 	}
 
 	if (0) {
diff --git a/drivers/md/bcache/writeback.h b/drivers/md/bcache/writeback.h
index a9e3ffb..66f1c52 100644
--- a/drivers/md/bcache/writeback.h
+++ b/drivers/md/bcache/writeback.h
@@ -5,6 +5,16 @@
 #define CUTOFF_WRITEBACK	40
 #define CUTOFF_WRITEBACK_SYNC	70
 
+#define MAX_WRITEBACKS_IN_PASS  5
+#define MAX_WRITESIZE_IN_PASS   5000	/* *512b */
+
+/*
+ * 14 (16384ths) is chosen here as something that each backing device
+ * should be a reasonable fraction of the share, and not to blow up
+ * until individual backing devices are a petabyte.
+ */
+#define WRITEBACK_SHARE_SHIFT   14
+
 static inline uint64_t bcache_dev_sectors_dirty(struct bcache_device *d)
 {
 	uint64_t i, ret = 0;
@@ -21,7 +31,7 @@ static inline uint64_t  bcache_flash_devs_sectors_dirty(struct cache_set *c)
 
 	mutex_lock(&bch_register_lock);
 
-	for (i = 0; i < c->nr_uuids; i++) {
+	for (i = 0; i < c->devices_max_used; i++) {
 		struct bcache_device *d = c->devices[i];
 
 		if (!d || !UUID_FLASH_ONLY(&c->uuids[i]))
diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 9fc12f5..2ad4291 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -1446,7 +1446,6 @@ static void crypt_free_buffer_pages(struct crypt_config *cc, struct bio *clone)
 	bio_for_each_segment_all(bv, clone, i) {
 		BUG_ON(!bv->bv_page);
 		mempool_free(bv->bv_page, cc->page_pool);
-		bv->bv_page = NULL;
 	}
 }
 
@@ -1954,10 +1953,15 @@ static int crypt_setkey(struct crypt_config *cc)
 	/* Ignore extra keys (which are used for IV etc) */
 	subkey_size = crypt_subkey_size(cc);
 
-	if (crypt_integrity_hmac(cc))
+	if (crypt_integrity_hmac(cc)) {
+		if (subkey_size < cc->key_mac_size)
+			return -EINVAL;
+
 		crypt_copy_authenckey(cc->authenc_key, cc->key,
 				      subkey_size - cc->key_mac_size,
 				      cc->key_mac_size);
+	}
+
 	for (i = 0; i < cc->tfms_count; i++) {
 		if (crypt_integrity_hmac(cc))
 			r = crypto_aead_setkey(cc->cipher_tfm.tfms_aead[i],
@@ -2053,9 +2057,6 @@ static int crypt_set_keyring_key(struct crypt_config *cc, const char *key_string
 
 	ret = crypt_setkey(cc);
 
-	/* wipe the kernel key payload copy in each case */
-	memset(cc->key, 0, cc->key_size * sizeof(u8));
-
 	if (!ret) {
 		set_bit(DM_CRYPT_KEY_VALID, &cc->flags);
 		kzfree(cc->key_string);
@@ -2523,6 +2524,10 @@ static int crypt_ctr_cipher(struct dm_target *ti, char *cipher_in, char *key)
 		}
 	}
 
+	/* wipe the kernel key payload copy */
+	if (cc->key_string)
+		memset(cc->key, 0, cc->key_size * sizeof(u8));
+
 	return ret;
 }
 
@@ -2740,6 +2745,7 @@ static int crypt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 			cc->tag_pool_max_sectors * cc->on_disk_tag_size);
 		if (!cc->tag_pool) {
 			ti->error = "Cannot allocate integrity tags mempool";
+			ret = -ENOMEM;
 			goto bad;
 		}
 
@@ -2961,6 +2967,9 @@ static int crypt_message(struct dm_target *ti, unsigned argc, char **argv)
 				return ret;
 			if (cc->iv_gen_ops && cc->iv_gen_ops->init)
 				ret = cc->iv_gen_ops->init(cc);
+			/* wipe the kernel key payload copy */
+			if (cc->key_string)
+				memset(cc->key, 0, cc->key_size * sizeof(u8));
 			return ret;
 		}
 		if (argc == 2 && !strcasecmp(argv[1], "wipe")) {
@@ -3007,7 +3016,7 @@ static void crypt_io_hints(struct dm_target *ti, struct queue_limits *limits)
 
 static struct target_type crypt_target = {
 	.name   = "crypt",
-	.version = {1, 18, 0},
+	.version = {1, 18, 1},
 	.module = THIS_MODULE,
 	.ctr    = crypt_ctr,
 	.dtr    = crypt_dtr,
diff --git a/drivers/md/dm-integrity.c b/drivers/md/dm-integrity.c
index 05c7bfd..46d7c87 100644
--- a/drivers/md/dm-integrity.c
+++ b/drivers/md/dm-integrity.c
@@ -2559,7 +2559,8 @@ static int create_journal(struct dm_integrity_c *ic, char **error)
 	int r = 0;
 	unsigned i;
 	__u64 journal_pages, journal_desc_size, journal_tree_size;
-	unsigned char *crypt_data = NULL;
+	unsigned char *crypt_data = NULL, *crypt_iv = NULL;
+	struct skcipher_request *req = NULL;
 
 	ic->commit_ids[0] = cpu_to_le64(0x1111111111111111ULL);
 	ic->commit_ids[1] = cpu_to_le64(0x2222222222222222ULL);
@@ -2617,9 +2618,20 @@ static int create_journal(struct dm_integrity_c *ic, char **error)
 
 		if (blocksize == 1) {
 			struct scatterlist *sg;
-			SKCIPHER_REQUEST_ON_STACK(req, ic->journal_crypt);
-			unsigned char iv[ivsize];
-			skcipher_request_set_tfm(req, ic->journal_crypt);
+
+			req = skcipher_request_alloc(ic->journal_crypt, GFP_KERNEL);
+			if (!req) {
+				*error = "Could not allocate crypt request";
+				r = -ENOMEM;
+				goto bad;
+			}
+
+			crypt_iv = kmalloc(ivsize, GFP_KERNEL);
+			if (!crypt_iv) {
+				*error = "Could not allocate iv";
+				r = -ENOMEM;
+				goto bad;
+			}
 
 			ic->journal_xor = dm_integrity_alloc_page_list(ic);
 			if (!ic->journal_xor) {
@@ -2641,9 +2653,9 @@ static int create_journal(struct dm_integrity_c *ic, char **error)
 				sg_set_buf(&sg[i], va, PAGE_SIZE);
 			}
 			sg_set_buf(&sg[i], &ic->commit_ids, sizeof ic->commit_ids);
-			memset(iv, 0x00, ivsize);
+			memset(crypt_iv, 0x00, ivsize);
 
-			skcipher_request_set_crypt(req, sg, sg, PAGE_SIZE * ic->journal_pages + sizeof ic->commit_ids, iv);
+			skcipher_request_set_crypt(req, sg, sg, PAGE_SIZE * ic->journal_pages + sizeof ic->commit_ids, crypt_iv);
 			init_completion(&comp.comp);
 			comp.in_flight = (atomic_t)ATOMIC_INIT(1);
 			if (do_crypt(true, req, &comp))
@@ -2659,10 +2671,22 @@ static int create_journal(struct dm_integrity_c *ic, char **error)
 			crypto_free_skcipher(ic->journal_crypt);
 			ic->journal_crypt = NULL;
 		} else {
-			SKCIPHER_REQUEST_ON_STACK(req, ic->journal_crypt);
-			unsigned char iv[ivsize];
 			unsigned crypt_len = roundup(ivsize, blocksize);
 
+			req = skcipher_request_alloc(ic->journal_crypt, GFP_KERNEL);
+			if (!req) {
+				*error = "Could not allocate crypt request";
+				r = -ENOMEM;
+				goto bad;
+			}
+
+			crypt_iv = kmalloc(ivsize, GFP_KERNEL);
+			if (!crypt_iv) {
+				*error = "Could not allocate iv";
+				r = -ENOMEM;
+				goto bad;
+			}
+
 			crypt_data = kmalloc(crypt_len, GFP_KERNEL);
 			if (!crypt_data) {
 				*error = "Unable to allocate crypt data";
@@ -2670,8 +2694,6 @@ static int create_journal(struct dm_integrity_c *ic, char **error)
 				goto bad;
 			}
 
-			skcipher_request_set_tfm(req, ic->journal_crypt);
-
 			ic->journal_scatterlist = dm_integrity_alloc_journal_scatterlist(ic, ic->journal);
 			if (!ic->journal_scatterlist) {
 				*error = "Unable to allocate sg list";
@@ -2695,12 +2717,12 @@ static int create_journal(struct dm_integrity_c *ic, char **error)
 				struct skcipher_request *section_req;
 				__u32 section_le = cpu_to_le32(i);
 
-				memset(iv, 0x00, ivsize);
+				memset(crypt_iv, 0x00, ivsize);
 				memset(crypt_data, 0x00, crypt_len);
 				memcpy(crypt_data, &section_le, min((size_t)crypt_len, sizeof(section_le)));
 
 				sg_init_one(&sg, crypt_data, crypt_len);
-				skcipher_request_set_crypt(req, &sg, &sg, crypt_len, iv);
+				skcipher_request_set_crypt(req, &sg, &sg, crypt_len, crypt_iv);
 				init_completion(&comp.comp);
 				comp.in_flight = (atomic_t)ATOMIC_INIT(1);
 				if (do_crypt(true, req, &comp))
@@ -2758,6 +2780,9 @@ static int create_journal(struct dm_integrity_c *ic, char **error)
 	}
 bad:
 	kfree(crypt_data);
+	kfree(crypt_iv);
+	skcipher_request_free(req);
+
 	return r;
 }
 
diff --git a/drivers/md/dm-mpath.c b/drivers/md/dm-mpath.c
index f7810cc..ef57c6d 100644
--- a/drivers/md/dm-mpath.c
+++ b/drivers/md/dm-mpath.c
@@ -1475,21 +1475,6 @@ static void activate_path_work(struct work_struct *work)
 	activate_or_offline_path(pgpath);
 }
 
-static int noretry_error(blk_status_t error)
-{
-	switch (error) {
-	case BLK_STS_NOTSUPP:
-	case BLK_STS_NOSPC:
-	case BLK_STS_TARGET:
-	case BLK_STS_NEXUS:
-	case BLK_STS_MEDIUM:
-		return 1;
-	}
-
-	/* Anything else could be a path failure, so should be retried */
-	return 0;
-}
-
 static int multipath_end_io(struct dm_target *ti, struct request *clone,
 			    blk_status_t error, union map_info *map_context)
 {
@@ -1508,7 +1493,7 @@ static int multipath_end_io(struct dm_target *ti, struct request *clone,
 	 * request into dm core, which will remake a clone request and
 	 * clone bios for it and resubmit it later.
 	 */
-	if (error && !noretry_error(error)) {
+	if (error && blk_path_error(error)) {
 		struct multipath *m = ti->private;
 
 		r = DM_ENDIO_REQUEUE;
@@ -1544,7 +1529,7 @@ static int multipath_end_io_bio(struct dm_target *ti, struct bio *clone,
 	unsigned long flags;
 	int r = DM_ENDIO_DONE;
 
-	if (!*error || noretry_error(*error))
+	if (!*error || !blk_path_error(*error))
 		goto done;
 
 	if (pgpath)
diff --git a/drivers/md/dm-rq.c b/drivers/md/dm-rq.c
index 9d32f25..b7d175e9 100644
--- a/drivers/md/dm-rq.c
+++ b/drivers/md/dm-rq.c
@@ -395,7 +395,7 @@ static void end_clone_request(struct request *clone, blk_status_t error)
 	dm_complete_request(tio->orig, error);
 }
 
-static void dm_dispatch_clone_request(struct request *clone, struct request *rq)
+static blk_status_t dm_dispatch_clone_request(struct request *clone, struct request *rq)
 {
 	blk_status_t r;
 
@@ -404,9 +404,10 @@ static void dm_dispatch_clone_request(struct request *clone, struct request *rq)
 
 	clone->start_time = jiffies;
 	r = blk_insert_cloned_request(clone->q, clone);
-	if (r)
+	if (r != BLK_STS_OK && r != BLK_STS_RESOURCE)
 		/* must complete clone in terms of original request */
 		dm_complete_request(rq, r);
+	return r;
 }
 
 static int dm_rq_bio_constructor(struct bio *bio, struct bio *bio_orig,
@@ -476,8 +477,10 @@ static int map_request(struct dm_rq_target_io *tio)
 	struct mapped_device *md = tio->md;
 	struct request *rq = tio->orig;
 	struct request *clone = NULL;
+	blk_status_t ret;
 
 	r = ti->type->clone_and_map_rq(ti, rq, &tio->info, &clone);
+check_again:
 	switch (r) {
 	case DM_MAPIO_SUBMITTED:
 		/* The target has taken the I/O to submit by itself later */
@@ -492,7 +495,17 @@ static int map_request(struct dm_rq_target_io *tio)
 		/* The target has remapped the I/O so dispatch it */
 		trace_block_rq_remap(clone->q, clone, disk_devt(dm_disk(md)),
 				     blk_rq_pos(rq));
-		dm_dispatch_clone_request(clone, rq);
+		ret = dm_dispatch_clone_request(clone, rq);
+		if (ret == BLK_STS_RESOURCE) {
+			blk_rq_unprep_clone(clone);
+			tio->ti->type->release_clone_rq(clone);
+			tio->clone = NULL;
+			if (!rq->q->mq_ops)
+				r = DM_MAPIO_DELAY_REQUEUE;
+			else
+				r = DM_MAPIO_REQUEUE;
+			goto check_again;
+		}
 		break;
 	case DM_MAPIO_REQUEUE:
 		/* The target wants to requeue the I/O */
@@ -713,8 +726,6 @@ int dm_old_init_request_queue(struct mapped_device *md, struct dm_table *t)
 		return error;
 	}
 
-	elv_register_queue(md->queue);
-
 	return 0;
 }
 
@@ -812,15 +823,8 @@ int dm_mq_init_request_queue(struct mapped_device *md, struct dm_table *t)
 	}
 	dm_init_md_queue(md);
 
-	/* backfill 'mq' sysfs registration normally done in blk_register_queue */
-	err = blk_mq_register_dev(disk_to_dev(md->disk), q);
-	if (err)
-		goto out_cleanup_queue;
-
 	return 0;
 
-out_cleanup_queue:
-	blk_cleanup_queue(q);
 out_tag_set:
 	blk_mq_free_tag_set(md->tag_set);
 out_kfree_tag_set:
diff --git a/drivers/md/dm-thin-metadata.c b/drivers/md/dm-thin-metadata.c
index d31d18d..36ef284 100644
--- a/drivers/md/dm-thin-metadata.c
+++ b/drivers/md/dm-thin-metadata.c
@@ -80,10 +80,14 @@
 #define SECTOR_TO_BLOCK_SHIFT 3
 
 /*
+ * For btree insert:
  *  3 for btree insert +
  *  2 for btree lookup used within space map
+ * For btree remove:
+ *  2 for shadow spine +
+ *  4 for rebalance 3 child node
  */
-#define THIN_MAX_CONCURRENT_LOCKS 5
+#define THIN_MAX_CONCURRENT_LOCKS 6
 
 /* This should be plenty */
 #define SPACE_MAP_ROOT_SIZE 128
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index de17b71..8c26bfc 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -920,7 +920,15 @@ int dm_set_target_max_io_len(struct dm_target *ti, sector_t len)
 		return -EINVAL;
 	}
 
-	ti->max_io_len = (uint32_t) len;
+	/*
+	 * BIO based queue uses its own splitting. When multipage bvecs
+	 * is switched on, size of the incoming bio may be too big to
+	 * be handled in some targets, such as crypt.
+	 *
+	 * When these targets are ready for the big bio, we can remove
+	 * the limit.
+	 */
+	ti->max_io_len = min_t(uint32_t, len, BIO_MAX_PAGES * PAGE_SIZE);
 
 	return 0;
 }
@@ -1753,7 +1761,7 @@ static struct mapped_device *alloc_dev(int minor)
 		goto bad;
 	md->dax_dev = dax_dev;
 
-	add_disk(md->disk);
+	add_disk_no_queue_reg(md->disk);
 	format_dev_t(md->name, MKDEV(_major, minor));
 
 	md->wq = alloc_workqueue("kdmflush", WQ_MEM_RECLAIM, 0);
@@ -2013,6 +2021,7 @@ EXPORT_SYMBOL_GPL(dm_get_queue_limits);
 int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 {
 	int r;
+	struct queue_limits limits;
 	enum dm_queue_mode type = dm_get_md_type(md);
 
 	switch (type) {
@@ -2049,6 +2058,14 @@ int dm_setup_md_queue(struct mapped_device *md, struct dm_table *t)
 		break;
 	}
 
+	r = dm_calculate_queue_limits(t, &limits);
+	if (r) {
+		DMERR("Cannot calculate initial queue limits");
+		return r;
+	}
+	dm_table_set_restrictions(t, md->queue, &limits);
+	blk_register_queue(md->disk);
+
 	return 0;
 }
 
diff --git a/drivers/md/persistent-data/dm-btree.c b/drivers/md/persistent-data/dm-btree.c
index f21ce6a..58b3197 100644
--- a/drivers/md/persistent-data/dm-btree.c
+++ b/drivers/md/persistent-data/dm-btree.c
@@ -683,23 +683,8 @@ static int btree_split_beneath(struct shadow_spine *s, uint64_t key)
 	pn->keys[1] = rn->keys[0];
 	memcpy_disk(value_ptr(pn, 1), &val, sizeof(__le64));
 
-	/*
-	 * rejig the spine.  This is ugly, since it knows too
-	 * much about the spine
-	 */
-	if (s->nodes[0] != new_parent) {
-		unlock_block(s->info, s->nodes[0]);
-		s->nodes[0] = new_parent;
-	}
-	if (key < le64_to_cpu(rn->keys[0])) {
-		unlock_block(s->info, right);
-		s->nodes[1] = left;
-	} else {
-		unlock_block(s->info, left);
-		s->nodes[1] = right;
-	}
-	s->count = 2;
-
+	unlock_block(s->info, left);
+	unlock_block(s->info, right);
 	return 0;
 }
 
diff --git a/drivers/memory/omap-gpmc.c b/drivers/memory/omap-gpmc.c
index a385a35..90a66b3 100644
--- a/drivers/memory/omap-gpmc.c
+++ b/drivers/memory/omap-gpmc.c
@@ -32,7 +32,6 @@
 #include <linux/pm_runtime.h>
 
 #include <linux/platform_data/mtd-nand-omap2.h>
-#include <linux/platform_data/mtd-onenand-omap2.h>
 
 #include <asm/mach-types.h>
 
@@ -1138,6 +1137,112 @@ struct gpmc_nand_ops *gpmc_omap_get_nand_ops(struct gpmc_nand_regs *reg, int cs)
 }
 EXPORT_SYMBOL_GPL(gpmc_omap_get_nand_ops);
 
+static void gpmc_omap_onenand_calc_sync_timings(struct gpmc_timings *t,
+						struct gpmc_settings *s,
+						int freq, int latency)
+{
+	struct gpmc_device_timings dev_t;
+	const int t_cer  = 15;
+	const int t_avdp = 12;
+	const int t_cez  = 20; /* max of t_cez, t_oez */
+	const int t_wpl  = 40;
+	const int t_wph  = 30;
+	int min_gpmc_clk_period, t_ces, t_avds, t_avdh, t_ach, t_aavdh, t_rdyo;
+
+	switch (freq) {
+	case 104:
+		min_gpmc_clk_period = 9600; /* 104 MHz */
+		t_ces   = 3;
+		t_avds  = 4;
+		t_avdh  = 2;
+		t_ach   = 3;
+		t_aavdh = 6;
+		t_rdyo  = 6;
+		break;
+	case 83:
+		min_gpmc_clk_period = 12000; /* 83 MHz */
+		t_ces   = 5;
+		t_avds  = 4;
+		t_avdh  = 2;
+		t_ach   = 6;
+		t_aavdh = 6;
+		t_rdyo  = 9;
+		break;
+	case 66:
+		min_gpmc_clk_period = 15000; /* 66 MHz */
+		t_ces   = 6;
+		t_avds  = 5;
+		t_avdh  = 2;
+		t_ach   = 6;
+		t_aavdh = 6;
+		t_rdyo  = 11;
+		break;
+	default:
+		min_gpmc_clk_period = 18500; /* 54 MHz */
+		t_ces   = 7;
+		t_avds  = 7;
+		t_avdh  = 7;
+		t_ach   = 9;
+		t_aavdh = 7;
+		t_rdyo  = 15;
+		break;
+	}
+
+	/* Set synchronous read timings */
+	memset(&dev_t, 0, sizeof(dev_t));
+
+	if (!s->sync_write) {
+		dev_t.t_avdp_w = max(t_avdp, t_cer) * 1000;
+		dev_t.t_wpl = t_wpl * 1000;
+		dev_t.t_wph = t_wph * 1000;
+		dev_t.t_aavdh = t_aavdh * 1000;
+	}
+	dev_t.ce_xdelay = true;
+	dev_t.avd_xdelay = true;
+	dev_t.oe_xdelay = true;
+	dev_t.we_xdelay = true;
+	dev_t.clk = min_gpmc_clk_period;
+	dev_t.t_bacc = dev_t.clk;
+	dev_t.t_ces = t_ces * 1000;
+	dev_t.t_avds = t_avds * 1000;
+	dev_t.t_avdh = t_avdh * 1000;
+	dev_t.t_ach = t_ach * 1000;
+	dev_t.cyc_iaa = (latency + 1);
+	dev_t.t_cez_r = t_cez * 1000;
+	dev_t.t_cez_w = dev_t.t_cez_r;
+	dev_t.cyc_aavdh_oe = 1;
+	dev_t.t_rdyo = t_rdyo * 1000 + min_gpmc_clk_period;
+
+	gpmc_calc_timings(t, s, &dev_t);
+}
+
+int gpmc_omap_onenand_set_timings(struct device *dev, int cs, int freq,
+				  int latency,
+				  struct gpmc_onenand_info *info)
+{
+	int ret;
+	struct gpmc_timings gpmc_t;
+	struct gpmc_settings gpmc_s;
+
+	gpmc_read_settings_dt(dev->of_node, &gpmc_s);
+
+	info->sync_read = gpmc_s.sync_read;
+	info->sync_write = gpmc_s.sync_write;
+	info->burst_len = gpmc_s.burst_len;
+
+	if (!gpmc_s.sync_read && !gpmc_s.sync_write)
+		return 0;
+
+	gpmc_omap_onenand_calc_sync_timings(&gpmc_t, &gpmc_s, freq, latency);
+
+	ret = gpmc_cs_program_settings(cs, &gpmc_s);
+	if (ret < 0)
+		return ret;
+
+	return gpmc_cs_set_timings(cs, &gpmc_t, &gpmc_s);
+}
+EXPORT_SYMBOL_GPL(gpmc_omap_onenand_set_timings);
+
 int gpmc_get_client_irq(unsigned irq_config)
 {
 	if (!gpmc_irq_domain) {
@@ -1916,41 +2021,6 @@ static void __maybe_unused gpmc_read_timings_dt(struct device_node *np,
 		of_property_read_bool(np, "gpmc,time-para-granularity");
 }
 
-#if IS_ENABLED(CONFIG_MTD_ONENAND)
-static int gpmc_probe_onenand_child(struct platform_device *pdev,
-				 struct device_node *child)
-{
-	u32 val;
-	struct omap_onenand_platform_data *gpmc_onenand_data;
-
-	if (of_property_read_u32(child, "reg", &val) < 0) {
-		dev_err(&pdev->dev, "%pOF has no 'reg' property\n",
-			child);
-		return -ENODEV;
-	}
-
-	gpmc_onenand_data = devm_kzalloc(&pdev->dev, sizeof(*gpmc_onenand_data),
-					 GFP_KERNEL);
-	if (!gpmc_onenand_data)
-		return -ENOMEM;
-
-	gpmc_onenand_data->cs = val;
-	gpmc_onenand_data->of_node = child;
-	gpmc_onenand_data->dma_channel = -1;
-
-	if (!of_property_read_u32(child, "dma-channel", &val))
-		gpmc_onenand_data->dma_channel = val;
-
-	return gpmc_onenand_init(gpmc_onenand_data);
-}
-#else
-static int gpmc_probe_onenand_child(struct platform_device *pdev,
-				    struct device_node *child)
-{
-	return 0;
-}
-#endif
-
 /**
  * gpmc_probe_generic_child - configures the gpmc for a child device
  * @pdev:	pointer to gpmc platform device
@@ -2053,6 +2123,16 @@ static int gpmc_probe_generic_child(struct platform_device *pdev,
 		}
 	}
 
+	if (of_node_cmp(child->name, "onenand") == 0) {
+		/* Warn about older DT blobs with no compatible property */
+		if (!of_property_read_bool(child, "compatible")) {
+			dev_warn(&pdev->dev,
+				 "Incompatible OneNAND node: missing compatible");
+			ret = -EINVAL;
+			goto err;
+		}
+	}
+
 	if (of_device_is_compatible(child, "ti,omap2-nand")) {
 		/* NAND specific setup */
 		val = 8;
@@ -2077,8 +2157,9 @@ static int gpmc_probe_generic_child(struct platform_device *pdev,
 	} else {
 		ret = of_property_read_u32(child, "bank-width",
 					   &gpmc_s.device_width);
-		if (ret < 0) {
-			dev_err(&pdev->dev, "%pOF has no 'bank-width' property\n",
+		if (ret < 0 && !gpmc_s.device_width) {
+			dev_err(&pdev->dev,
+				"%pOF has no 'gpmc,device-width' property\n",
 				child);
 			goto err;
 		}
@@ -2188,11 +2269,7 @@ static void gpmc_probe_dt_children(struct platform_device *pdev)
 		if (!child->name)
 			continue;
 
-		if (of_node_cmp(child->name, "onenand") == 0)
-			ret = gpmc_probe_onenand_child(pdev, child);
-		else
-			ret = gpmc_probe_generic_child(pdev, child);
-
+		ret = gpmc_probe_generic_child(pdev, child);
 		if (ret) {
 			dev_err(&pdev->dev, "failed to probe DT child '%s': %d\n",
 				child->name, ret);
diff --git a/drivers/memstick/host/Kconfig b/drivers/memstick/host/Kconfig
index 7310e32..aa2b078 100644
--- a/drivers/memstick/host/Kconfig
+++ b/drivers/memstick/host/Kconfig
@@ -45,7 +45,7 @@
 
 config MEMSTICK_REALTEK_PCI
 	tristate "Realtek PCI-E Memstick Card Interface Driver"
-	depends on MFD_RTSX_PCI
+	depends on MISC_RTSX_PCI
 	help
 	  Say Y here to include driver code to support Memstick card interface
 	  of Realtek PCI-E card reader
@@ -55,7 +55,7 @@
 
 config MEMSTICK_REALTEK_USB
 	tristate "Realtek USB Memstick Card Interface Driver"
-	depends on MFD_RTSX_USB
+	depends on MISC_RTSX_USB
 	help
 	  Say Y here to include driver code to support Memstick card interface
 	  of Realtek RTS5129/39 series USB card reader
diff --git a/drivers/memstick/host/rtsx_pci_ms.c b/drivers/memstick/host/rtsx_pci_ms.c
index 818fa94..a44b457 100644
--- a/drivers/memstick/host/rtsx_pci_ms.c
+++ b/drivers/memstick/host/rtsx_pci_ms.c
@@ -24,7 +24,7 @@
 #include <linux/delay.h>
 #include <linux/platform_device.h>
 #include <linux/memstick.h>
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 #include <asm/unaligned.h>
 
 struct realtek_pci_ms {
diff --git a/drivers/memstick/host/rtsx_usb_ms.c b/drivers/memstick/host/rtsx_usb_ms.c
index 2e3cf01..4f64563 100644
--- a/drivers/memstick/host/rtsx_usb_ms.c
+++ b/drivers/memstick/host/rtsx_usb_ms.c
@@ -25,7 +25,7 @@
 #include <linux/workqueue.h>
 #include <linux/memstick.h>
 #include <linux/kthread.h>
-#include <linux/mfd/rtsx_usb.h>
+#include <linux/rtsx_usb.h>
 #include <linux/pm_runtime.h>
 #include <linux/mutex.h>
 #include <linux/sched.h>
diff --git a/drivers/mfd/Kconfig b/drivers/mfd/Kconfig
index 1d20a80..b860eb5 100644
--- a/drivers/mfd/Kconfig
+++ b/drivers/mfd/Kconfig
@@ -222,6 +222,16 @@
 	  response time cannot be guaranteed, we support ignoring
 	  'pre-amble' bytes before the response actually starts.
 
+config MFD_CROS_EC_CHARDEV
+        tristate "Chrome OS Embedded Controller userspace device interface"
+        depends on MFD_CROS_EC
+        select CROS_EC_CTL
+        ---help---
+          This driver adds support to talk with the ChromeOS EC from userspace.
+
+          If you have a supported Chromebook, choose Y or M here.
+          The module will be called cros_ec_dev.
+
 config MFD_ASIC3
 	bool "Compaq ASIC3"
 	depends on GPIOLIB && ARM
@@ -877,7 +887,7 @@
 
 config MFD_PM8XXX
 	tristate "Qualcomm PM8xxx PMIC chips driver"
-	depends on (ARM || HEXAGON)
+	depends on (ARM || HEXAGON || COMPILE_TEST)
 	select IRQ_DOMAIN
 	select MFD_CORE
 	select REGMAP
@@ -929,17 +939,6 @@
 	  southbridge which provides access to GPIOs and Watchdog using the
 	  southbridge PCI device configuration space.
 
-config MFD_RTSX_PCI
-	tristate "Realtek PCI-E card reader"
-	depends on PCI
-	select MFD_CORE
-	help
-	  This supports for Realtek PCI-Express card reader including rts5209,
-	  rts5227, rts522A, rts5229, rts5249, rts524A, rts525A, rtl8411, etc.
-	  Realtek card reader supports access to many types of memory cards,
-	  such as Memory Stick, Memory Stick Pro, Secure Digital and
-	  MultiMediaCard.
-
 config MFD_RT5033
 	tristate "Richtek RT5033 Power Management IC"
 	depends on I2C
@@ -953,16 +952,6 @@
 	  sub-devices like charger, fuel gauge, flash LED, current source,
 	  LDO and Buck.
 
-config MFD_RTSX_USB
-	tristate "Realtek USB card reader"
-	depends on USB
-	select MFD_CORE
-	help
-	  Select this option to get support for Realtek USB 2.0 card readers
-	  including RTS5129, RTS5139, RTS5179 and RTS5170.
-	  Realtek card reader supports access to many types of memory cards,
-	  such as Memory Stick Pro, Secure Digital and MultiMediaCard.
-
 config MFD_RC5T583
 	bool "Ricoh RC5T583 Power Management system device"
 	depends on I2C=y
@@ -1859,5 +1848,13 @@
 	  System Registers are the platform configuration block
 	  on the ARM Ltd. Versatile Express board.
 
+config RAVE_SP_CORE
+	tristate "RAVE SP MCU core driver"
+	depends on SERIAL_DEV_BUS
+	select CRC_CCITT
+	help
+	  Select this to get support for the Supervisory Processor
+	  device found on several devices in RAVE line of hardware.
+
 endmenu
 endif
diff --git a/drivers/mfd/Makefile b/drivers/mfd/Makefile
index d9474ad..d9d2cf0 100644
--- a/drivers/mfd/Makefile
+++ b/drivers/mfd/Makefile
@@ -17,12 +17,9 @@
 obj-$(CONFIG_MFD_CROS_EC)	+= cros_ec_core.o
 obj-$(CONFIG_MFD_CROS_EC_I2C)	+= cros_ec_i2c.o
 obj-$(CONFIG_MFD_CROS_EC_SPI)	+= cros_ec_spi.o
+obj-$(CONFIG_MFD_CROS_EC_CHARDEV) += cros_ec_dev.o
 obj-$(CONFIG_MFD_EXYNOS_LPASS)	+= exynos-lpass.o
 
-rtsx_pci-objs			:= rtsx_pcr.o rts5209.o rts5229.o rtl8411.o rts5227.o rts5249.o
-obj-$(CONFIG_MFD_RTSX_PCI)	+= rtsx_pci.o
-obj-$(CONFIG_MFD_RTSX_USB)	+= rtsx_usb.o
-
 obj-$(CONFIG_HTC_PASIC3)	+= htc-pasic3.o
 obj-$(CONFIG_HTC_I2CPLD)	+= htc-i2cpld.o
 
@@ -230,3 +227,5 @@
 obj-$(CONFIG_MFD_STM32_TIMERS) 	+= stm32-timers.o
 obj-$(CONFIG_MFD_MXS_LRADC)     += mxs-lradc.o
 obj-$(CONFIG_MFD_SC27XX_PMIC)	+= sprd-sc27xx-spi.o
+obj-$(CONFIG_RAVE_SP_CORE)	+= rave-sp.o
+
diff --git a/drivers/mfd/ab8500-debugfs.c b/drivers/mfd/ab8500-debugfs.c
index c1c8152..1afa27d 100644
--- a/drivers/mfd/ab8500-debugfs.c
+++ b/drivers/mfd/ab8500-debugfs.c
@@ -1258,6 +1258,19 @@ static struct ab8500_prcmu_ranges ab8540_debug_ranges[AB8500_NUM_BANKS] = {
 	},
 };
 
+#define DEFINE_SHOW_ATTRIBUTE(__name)					      \
+static int __name ## _open(struct inode *inode, struct file *file)	      \
+{									      \
+	return single_open(file, __name ## _show, inode->i_private);	      \
+}									      \
+									      \
+static const struct file_operations __name ## _fops = {			      \
+	.owner		= THIS_MODULE,					      \
+	.open		= __name ## _open,				      \
+	.read		= seq_read,					      \
+	.llseek		= seq_lseek,					      \
+	.release	= single_release,				      \
+}									      \
 
 static irqreturn_t ab8500_debug_handler(int irq, void *data)
 {
@@ -1318,7 +1331,7 @@ static int ab8500_registers_print(struct device *dev, u32 bank,
 	return 0;
 }
 
-static int ab8500_print_bank_registers(struct seq_file *s, void *p)
+static int ab8500_bank_registers_show(struct seq_file *s, void *p)
 {
 	struct device *dev = s->private;
 	u32 bank = debug_bank;
@@ -1330,18 +1343,7 @@ static int ab8500_print_bank_registers(struct seq_file *s, void *p)
 	return ab8500_registers_print(dev, bank, s);
 }
 
-static int ab8500_registers_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_print_bank_registers, inode->i_private);
-}
-
-static const struct file_operations ab8500_registers_fops = {
-	.open = ab8500_registers_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
+DEFINE_SHOW_ATTRIBUTE(ab8500_bank_registers);
 
 static int ab8500_print_all_banks(struct seq_file *s, void *p)
 {
@@ -1528,7 +1530,7 @@ void ab8500_debug_register_interrupt(int line)
 		num_interrupts[line]++;
 }
 
-static int ab8500_interrupts_print(struct seq_file *s, void *p)
+static int ab8500_interrupts_show(struct seq_file *s, void *p)
 {
 	int line;
 
@@ -1557,10 +1559,7 @@ static int ab8500_interrupts_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_interrupts_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_interrupts_print, inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_interrupts);
 
 /*
  * - HWREG DB8500 formated routines
@@ -1603,7 +1602,7 @@ static int ab8500_hwreg_open(struct inode *inode, struct file *file)
 #define AB8500_LAST_SIM_REG 0x8B
 #define AB8505_LAST_SIM_REG 0x8C
 
-static int ab8500_print_modem_registers(struct seq_file *s, void *p)
+static int ab8500_modem_show(struct seq_file *s, void *p)
 {
 	struct device *dev = s->private;
 	struct ab8500 *ab8500;
@@ -1620,18 +1619,15 @@ static int ab8500_print_modem_registers(struct seq_file *s, void *p)
 
 	err = abx500_get_register_interruptible(dev,
 		AB8500_REGU_CTRL1, AB8500_SUPPLY_CONTROL_REG, &orig_value);
-	if (err < 0) {
-		dev_err(dev, "ab->read fail %d\n", err);
-		return err;
-	}
+	if (err < 0)
+		goto report_read_failure;
+
 	/* Config 1 will allow APE side to read SIM registers */
 	err = abx500_set_register_interruptible(dev,
 		AB8500_REGU_CTRL1, AB8500_SUPPLY_CONTROL_REG,
 		AB8500_SUPPLY_CONTROL_CONFIG_1);
-	if (err < 0) {
-		dev_err(dev, "ab->write fail %d\n", err);
-		return err;
-	}
+	if (err < 0)
+		goto report_write_failure;
 
 	seq_printf(s, " bank 0x%02X:\n", bank);
 
@@ -1641,36 +1637,30 @@ static int ab8500_print_modem_registers(struct seq_file *s, void *p)
 	for (reg = AB8500_FIRST_SIM_REG; reg <= last_sim_reg; reg++) {
 		err = abx500_get_register_interruptible(dev,
 			bank, reg, &value);
-		if (err < 0) {
-			dev_err(dev, "ab->read fail %d\n", err);
-			return err;
-		}
+		if (err < 0)
+			goto report_read_failure;
+
 		seq_printf(s, "  [0x%02X/0x%02X]: 0x%02X\n", bank, reg, value);
 	}
 	err = abx500_set_register_interruptible(dev,
 		AB8500_REGU_CTRL1, AB8500_SUPPLY_CONTROL_REG, orig_value);
-	if (err < 0) {
-		dev_err(dev, "ab->write fail %d\n", err);
-		return err;
-	}
+	if (err < 0)
+		goto report_write_failure;
+
 	return 0;
+
+report_read_failure:
+	dev_err(dev, "ab->read fail %d\n", err);
+	return err;
+
+report_write_failure:
+	dev_err(dev, "ab->write fail %d\n", err);
+	return err;
 }
 
-static int ab8500_modem_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_print_modem_registers,
-			   inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_modem);
 
-static const struct file_operations ab8500_modem_fops = {
-	.open = ab8500_modem_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_bat_ctrl_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_bat_ctrl_show(struct seq_file *s, void *p)
 {
 	int bat_ctrl_raw;
 	int bat_ctrl_convert;
@@ -1687,21 +1677,9 @@ static int ab8500_gpadc_bat_ctrl_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_bat_ctrl_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_gpadc_bat_ctrl_print,
-			   inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_bat_ctrl);
 
-static const struct file_operations ab8500_gpadc_bat_ctrl_fops = {
-	.open = ab8500_gpadc_bat_ctrl_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_btemp_ball_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_btemp_ball_show(struct seq_file *s, void *p)
 {
 	int btemp_ball_raw;
 	int btemp_ball_convert;
@@ -1718,22 +1696,9 @@ static int ab8500_gpadc_btemp_ball_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_btemp_ball_open(struct inode *inode,
-					struct file *file)
-{
-	return single_open(file, ab8500_gpadc_btemp_ball_print,
-			   inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_btemp_ball);
 
-static const struct file_operations ab8500_gpadc_btemp_ball_fops = {
-	.open = ab8500_gpadc_btemp_ball_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_main_charger_v_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_main_charger_v_show(struct seq_file *s, void *p)
 {
 	int main_charger_v_raw;
 	int main_charger_v_convert;
@@ -1750,22 +1715,9 @@ static int ab8500_gpadc_main_charger_v_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_main_charger_v_open(struct inode *inode,
-					    struct file *file)
-{
-	return single_open(file, ab8500_gpadc_main_charger_v_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_main_charger_v);
 
-static const struct file_operations ab8500_gpadc_main_charger_v_fops = {
-	.open = ab8500_gpadc_main_charger_v_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_acc_detect1_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_acc_detect1_show(struct seq_file *s, void *p)
 {
 	int acc_detect1_raw;
 	int acc_detect1_convert;
@@ -1782,22 +1734,9 @@ static int ab8500_gpadc_acc_detect1_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_acc_detect1_open(struct inode *inode,
-					 struct file *file)
-{
-	return single_open(file, ab8500_gpadc_acc_detect1_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_acc_detect1);
 
-static const struct file_operations ab8500_gpadc_acc_detect1_fops = {
-	.open = ab8500_gpadc_acc_detect1_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_acc_detect2_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_acc_detect2_show(struct seq_file *s, void *p)
 {
 	int acc_detect2_raw;
 	int acc_detect2_convert;
@@ -1814,22 +1753,9 @@ static int ab8500_gpadc_acc_detect2_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_acc_detect2_open(struct inode *inode,
-		struct file *file)
-{
-	return single_open(file, ab8500_gpadc_acc_detect2_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_acc_detect2);
 
-static const struct file_operations ab8500_gpadc_acc_detect2_fops = {
-	.open = ab8500_gpadc_acc_detect2_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_aux1_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_aux1_show(struct seq_file *s, void *p)
 {
 	int aux1_raw;
 	int aux1_convert;
@@ -1846,20 +1772,9 @@ static int ab8500_gpadc_aux1_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_aux1_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_gpadc_aux1_print, inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_aux1);
 
-static const struct file_operations ab8500_gpadc_aux1_fops = {
-	.open = ab8500_gpadc_aux1_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_aux2_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_aux2_show(struct seq_file *s, void *p)
 {
 	int aux2_raw;
 	int aux2_convert;
@@ -1876,20 +1791,9 @@ static int ab8500_gpadc_aux2_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_aux2_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_gpadc_aux2_print, inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_aux2);
 
-static const struct file_operations ab8500_gpadc_aux2_fops = {
-	.open = ab8500_gpadc_aux2_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_main_bat_v_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_main_bat_v_show(struct seq_file *s, void *p)
 {
 	int main_bat_v_raw;
 	int main_bat_v_convert;
@@ -1906,22 +1810,9 @@ static int ab8500_gpadc_main_bat_v_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_main_bat_v_open(struct inode *inode,
-					struct file *file)
-{
-	return single_open(file, ab8500_gpadc_main_bat_v_print,
-			   inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_main_bat_v);
 
-static const struct file_operations ab8500_gpadc_main_bat_v_fops = {
-	.open = ab8500_gpadc_main_bat_v_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_vbus_v_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_vbus_v_show(struct seq_file *s, void *p)
 {
 	int vbus_v_raw;
 	int vbus_v_convert;
@@ -1938,20 +1829,9 @@ static int ab8500_gpadc_vbus_v_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_vbus_v_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_gpadc_vbus_v_print, inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_vbus_v);
 
-static const struct file_operations ab8500_gpadc_vbus_v_fops = {
-	.open = ab8500_gpadc_vbus_v_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_main_charger_c_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_main_charger_c_show(struct seq_file *s, void *p)
 {
 	int main_charger_c_raw;
 	int main_charger_c_convert;
@@ -1968,22 +1848,9 @@ static int ab8500_gpadc_main_charger_c_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_main_charger_c_open(struct inode *inode,
-		struct file *file)
-{
-	return single_open(file, ab8500_gpadc_main_charger_c_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_main_charger_c);
 
-static const struct file_operations ab8500_gpadc_main_charger_c_fops = {
-	.open = ab8500_gpadc_main_charger_c_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_usb_charger_c_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_usb_charger_c_show(struct seq_file *s, void *p)
 {
 	int usb_charger_c_raw;
 	int usb_charger_c_convert;
@@ -2000,22 +1867,9 @@ static int ab8500_gpadc_usb_charger_c_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_usb_charger_c_open(struct inode *inode,
-		struct file *file)
-{
-	return single_open(file, ab8500_gpadc_usb_charger_c_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_usb_charger_c);
 
-static const struct file_operations ab8500_gpadc_usb_charger_c_fops = {
-	.open = ab8500_gpadc_usb_charger_c_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_bk_bat_v_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_bk_bat_v_show(struct seq_file *s, void *p)
 {
 	int bk_bat_v_raw;
 	int bk_bat_v_convert;
@@ -2032,21 +1886,9 @@ static int ab8500_gpadc_bk_bat_v_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_bk_bat_v_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_gpadc_bk_bat_v_print,
-			   inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_bk_bat_v);
 
-static const struct file_operations ab8500_gpadc_bk_bat_v_fops = {
-	.open = ab8500_gpadc_bk_bat_v_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_die_temp_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_die_temp_show(struct seq_file *s, void *p)
 {
 	int die_temp_raw;
 	int die_temp_convert;
@@ -2063,21 +1905,9 @@ static int ab8500_gpadc_die_temp_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_die_temp_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_gpadc_die_temp_print,
-			   inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_die_temp);
 
-static const struct file_operations ab8500_gpadc_die_temp_fops = {
-	.open = ab8500_gpadc_die_temp_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8500_gpadc_usb_id_print(struct seq_file *s, void *p)
+static int ab8500_gpadc_usb_id_show(struct seq_file *s, void *p)
 {
 	int usb_id_raw;
 	int usb_id_convert;
@@ -2094,20 +1924,9 @@ static int ab8500_gpadc_usb_id_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8500_gpadc_usb_id_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8500_gpadc_usb_id_print, inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8500_gpadc_usb_id);
 
-static const struct file_operations ab8500_gpadc_usb_id_fops = {
-	.open = ab8500_gpadc_usb_id_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8540_gpadc_xtal_temp_print(struct seq_file *s, void *p)
+static int ab8540_gpadc_xtal_temp_show(struct seq_file *s, void *p)
 {
 	int xtal_temp_raw;
 	int xtal_temp_convert;
@@ -2124,21 +1943,9 @@ static int ab8540_gpadc_xtal_temp_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8540_gpadc_xtal_temp_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8540_gpadc_xtal_temp_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8540_gpadc_xtal_temp);
 
-static const struct file_operations ab8540_gpadc_xtal_temp_fops = {
-	.open = ab8540_gpadc_xtal_temp_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8540_gpadc_vbat_true_meas_print(struct seq_file *s, void *p)
+static int ab8540_gpadc_vbat_true_meas_show(struct seq_file *s, void *p)
 {
 	int vbat_true_meas_raw;
 	int vbat_true_meas_convert;
@@ -2156,22 +1963,9 @@ static int ab8540_gpadc_vbat_true_meas_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8540_gpadc_vbat_true_meas_open(struct inode *inode,
-		struct file *file)
-{
-	return single_open(file, ab8540_gpadc_vbat_true_meas_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8540_gpadc_vbat_true_meas);
 
-static const struct file_operations ab8540_gpadc_vbat_true_meas_fops = {
-	.open = ab8540_gpadc_vbat_true_meas_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8540_gpadc_bat_ctrl_and_ibat_print(struct seq_file *s, void *p)
+static int ab8540_gpadc_bat_ctrl_and_ibat_show(struct seq_file *s, void *p)
 {
 	int bat_ctrl_raw;
 	int bat_ctrl_convert;
@@ -2197,22 +1991,9 @@ static int ab8540_gpadc_bat_ctrl_and_ibat_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8540_gpadc_bat_ctrl_and_ibat_open(struct inode *inode,
-		struct file *file)
-{
-	return single_open(file, ab8540_gpadc_bat_ctrl_and_ibat_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8540_gpadc_bat_ctrl_and_ibat);
 
-static const struct file_operations ab8540_gpadc_bat_ctrl_and_ibat_fops = {
-	.open = ab8540_gpadc_bat_ctrl_and_ibat_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8540_gpadc_vbat_meas_and_ibat_print(struct seq_file *s, void *p)
+static int ab8540_gpadc_vbat_meas_and_ibat_show(struct seq_file *s, void *p)
 {
 	int vbat_meas_raw;
 	int vbat_meas_convert;
@@ -2237,23 +2018,9 @@ static int ab8540_gpadc_vbat_meas_and_ibat_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8540_gpadc_vbat_meas_and_ibat_open(struct inode *inode,
-		struct file *file)
-{
-	return single_open(file, ab8540_gpadc_vbat_meas_and_ibat_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8540_gpadc_vbat_meas_and_ibat);
 
-static const struct file_operations ab8540_gpadc_vbat_meas_and_ibat_fops = {
-	.open = ab8540_gpadc_vbat_meas_and_ibat_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8540_gpadc_vbat_true_meas_and_ibat_print(struct seq_file *s,
-						      void *p)
+static int ab8540_gpadc_vbat_true_meas_and_ibat_show(struct seq_file *s, void *p)
 {
 	int vbat_true_meas_raw;
 	int vbat_true_meas_convert;
@@ -2279,23 +2046,9 @@ static int ab8540_gpadc_vbat_true_meas_and_ibat_print(struct seq_file *s,
 	return 0;
 }
 
-static int ab8540_gpadc_vbat_true_meas_and_ibat_open(struct inode *inode,
-		struct file *file)
-{
-	return single_open(file, ab8540_gpadc_vbat_true_meas_and_ibat_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8540_gpadc_vbat_true_meas_and_ibat);
 
-static const struct file_operations
-ab8540_gpadc_vbat_true_meas_and_ibat_fops = {
-	.open = ab8540_gpadc_vbat_true_meas_and_ibat_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8540_gpadc_bat_temp_and_ibat_print(struct seq_file *s, void *p)
+static int ab8540_gpadc_bat_temp_and_ibat_show(struct seq_file *s, void *p)
 {
 	int bat_temp_raw;
 	int bat_temp_convert;
@@ -2320,22 +2073,9 @@ static int ab8540_gpadc_bat_temp_and_ibat_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8540_gpadc_bat_temp_and_ibat_open(struct inode *inode,
-		struct file *file)
-{
-	return single_open(file, ab8540_gpadc_bat_temp_and_ibat_print,
-		inode->i_private);
-}
+DEFINE_SHOW_ATTRIBUTE(ab8540_gpadc_bat_temp_and_ibat);
 
-static const struct file_operations ab8540_gpadc_bat_temp_and_ibat_fops = {
-	.open = ab8540_gpadc_bat_temp_and_ibat_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
-static int ab8540_gpadc_otp_cal_print(struct seq_file *s, void *p)
+static int ab8540_gpadc_otp_calib_show(struct seq_file *s, void *p)
 {
 	struct ab8500_gpadc *gpadc;
 	u16 vmain_l, vmain_h, btemp_l, btemp_h;
@@ -2359,18 +2099,7 @@ static int ab8540_gpadc_otp_cal_print(struct seq_file *s, void *p)
 	return 0;
 }
 
-static int ab8540_gpadc_otp_cal_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, ab8540_gpadc_otp_cal_print, inode->i_private);
-}
-
-static const struct file_operations ab8540_gpadc_otp_calib_fops = {
-	.open = ab8540_gpadc_otp_cal_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
+DEFINE_SHOW_ATTRIBUTE(ab8540_gpadc_otp_calib);
 
 static int ab8500_gpadc_avg_sample_print(struct seq_file *s, void *p)
 {
@@ -2903,14 +2632,6 @@ static const struct file_operations ab8500_val_fops = {
 	.owner = THIS_MODULE,
 };
 
-static const struct file_operations ab8500_interrupts_fops = {
-	.open = ab8500_interrupts_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-	.owner = THIS_MODULE,
-};
-
 static const struct file_operations ab8500_subscribe_fops = {
 	.open = ab8500_subscribe_unsubscribe_open,
 	.write = ab8500_subscribe_write,
@@ -2997,7 +2718,7 @@ static int ab8500_debug_probe(struct platform_device *plf)
 		goto err;
 
 	file = debugfs_create_file("all-bank-registers", S_IRUGO, ab8500_dir,
-				   &plf->dev, &ab8500_registers_fops);
+				   &plf->dev, &ab8500_bank_registers_fops);
 	if (!file)
 		goto err;
 
diff --git a/drivers/mfd/atmel-flexcom.c b/drivers/mfd/atmel-flexcom.c
index 064bde9..f684a93 100644
--- a/drivers/mfd/atmel-flexcom.c
+++ b/drivers/mfd/atmel-flexcom.c
@@ -39,34 +39,43 @@
 #define FLEX_MR_OPMODE(opmode)	(((opmode) << FLEX_MR_OPMODE_OFFSET) &	\
 				 FLEX_MR_OPMODE_MASK)
 
+struct atmel_flexcom {
+	void __iomem *base;
+	u32 opmode;
+	struct clk *clk;
+};
 
 static int atmel_flexcom_probe(struct platform_device *pdev)
 {
 	struct device_node *np = pdev->dev.of_node;
-	struct clk *clk;
 	struct resource *res;
-	void __iomem *base;
-	u32 opmode;
+	struct atmel_flexcom *ddata;
 	int err;
 
-	err = of_property_read_u32(np, "atmel,flexcom-mode", &opmode);
+	ddata = devm_kzalloc(&pdev->dev, sizeof(*ddata), GFP_KERNEL);
+	if (!ddata)
+		return -ENOMEM;
+
+	platform_set_drvdata(pdev, ddata);
+
+	err = of_property_read_u32(np, "atmel,flexcom-mode", &ddata->opmode);
 	if (err)
 		return err;
 
-	if (opmode < ATMEL_FLEXCOM_MODE_USART ||
-	    opmode > ATMEL_FLEXCOM_MODE_TWI)
+	if (ddata->opmode < ATMEL_FLEXCOM_MODE_USART ||
+	    ddata->opmode > ATMEL_FLEXCOM_MODE_TWI)
 		return -EINVAL;
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	base = devm_ioremap_resource(&pdev->dev, res);
-	if (IS_ERR(base))
-		return PTR_ERR(base);
+	ddata->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(ddata->base))
+		return PTR_ERR(ddata->base);
 
-	clk = devm_clk_get(&pdev->dev, NULL);
-	if (IS_ERR(clk))
-		return PTR_ERR(clk);
+	ddata->clk = devm_clk_get(&pdev->dev, NULL);
+	if (IS_ERR(ddata->clk))
+		return PTR_ERR(ddata->clk);
 
-	err = clk_prepare_enable(clk);
+	err = clk_prepare_enable(ddata->clk);
 	if (err)
 		return err;
 
@@ -76,9 +85,9 @@ static int atmel_flexcom_probe(struct platform_device *pdev)
 	 * inaccessible and are read as zero. Also the external I/O lines of the
 	 * Flexcom are muxed to reach the selected device.
 	 */
-	writel(FLEX_MR_OPMODE(opmode), base + FLEX_MR);
+	writel(FLEX_MR_OPMODE(ddata->opmode), ddata->base + FLEX_MR);
 
-	clk_disable_unprepare(clk);
+	clk_disable_unprepare(ddata->clk);
 
 	return devm_of_platform_populate(&pdev->dev);
 }
@@ -89,10 +98,34 @@ static const struct of_device_id atmel_flexcom_of_match[] = {
 };
 MODULE_DEVICE_TABLE(of, atmel_flexcom_of_match);
 
+#ifdef CONFIG_PM_SLEEP
+static int atmel_flexcom_resume(struct device *dev)
+{
+	struct atmel_flexcom *ddata = dev_get_drvdata(dev);
+	int err;
+	u32 val;
+
+	err = clk_prepare_enable(ddata->clk);
+	if (err)
+		return err;
+
+	val = FLEX_MR_OPMODE(ddata->opmode),
+	writel(val, ddata->base + FLEX_MR);
+
+	clk_disable_unprepare(ddata->clk);
+
+	return 0;
+}
+#endif
+
+static SIMPLE_DEV_PM_OPS(atmel_flexcom_pm_ops, NULL,
+			 atmel_flexcom_resume);
+
 static struct platform_driver atmel_flexcom_driver = {
 	.probe	= atmel_flexcom_probe,
 	.driver	= {
 		.name		= "atmel_flexcom",
+		.pm		= &atmel_flexcom_pm_ops,
 		.of_match_table	= atmel_flexcom_of_match,
 	},
 };
diff --git a/drivers/mfd/axp20x.c b/drivers/mfd/axp20x.c
index 2468b43..e94c72c 100644
--- a/drivers/mfd/axp20x.c
+++ b/drivers/mfd/axp20x.c
@@ -129,6 +129,7 @@ static const struct regmap_range axp288_volatile_ranges[] = {
 	regmap_reg_range(AXP20X_PWR_INPUT_STATUS, AXP288_POWER_REASON),
 	regmap_reg_range(AXP288_BC_GLOBAL, AXP288_BC_GLOBAL),
 	regmap_reg_range(AXP288_BC_DET_STAT, AXP288_BC_DET_STAT),
+	regmap_reg_range(AXP20X_CHRG_BAK_CTRL, AXP20X_CHRG_BAK_CTRL),
 	regmap_reg_range(AXP20X_IRQ1_EN, AXP20X_IPSOUT_V_HIGH_L),
 	regmap_reg_range(AXP20X_TIMER_CTRL, AXP20X_TIMER_CTRL),
 	regmap_reg_range(AXP22X_GPIO_STATE, AXP22X_GPIO_STATE),
@@ -878,6 +879,9 @@ static struct mfd_cell axp813_cells[] = {
 		.resources		= axp803_pek_resources,
 	}, {
 		.name			= "axp20x-regulator",
+	}, {
+		.name			= "axp20x-gpio",
+		.of_compatible		= "x-powers,axp813-gpio",
 	}
 };
 
diff --git a/drivers/mfd/cros_ec.c b/drivers/mfd/cros_ec.c
index b0ca5a4c..d610241 100644
--- a/drivers/mfd/cros_ec.c
+++ b/drivers/mfd/cros_ec.c
@@ -40,13 +40,13 @@ static struct cros_ec_platform pd_p = {
 };
 
 static const struct mfd_cell ec_cell = {
-	.name = "cros-ec-ctl",
+	.name = "cros-ec-dev",
 	.platform_data = &ec_p,
 	.pdata_size = sizeof(ec_p),
 };
 
 static const struct mfd_cell ec_pd_cell = {
-	.name = "cros-ec-ctl",
+	.name = "cros-ec-dev",
 	.platform_data = &pd_p,
 	.pdata_size = sizeof(pd_p),
 };
diff --git a/drivers/platform/chrome/cros_ec_dev.c b/drivers/mfd/cros_ec_dev.c
similarity index 98%
rename from drivers/platform/chrome/cros_ec_dev.c
rename to drivers/mfd/cros_ec_dev.c
index cf6c4f0..e4fafdd 100644
--- a/drivers/platform/chrome/cros_ec_dev.c
+++ b/drivers/mfd/cros_ec_dev.c
@@ -25,9 +25,10 @@
 #include <linux/slab.h>
 #include <linux/uaccess.h>
 
-#include "cros_ec_debugfs.h"
 #include "cros_ec_dev.h"
 
+#define DRV_NAME "cros-ec-dev"
+
 /* Device variables */
 #define CROS_MAX_DEV 128
 static int ec_major;
@@ -461,7 +462,7 @@ static int ec_device_remove(struct platform_device *pdev)
 }
 
 static const struct platform_device_id cros_ec_id[] = {
-	{ "cros-ec-ctl", 0 },
+	{ DRV_NAME, 0 },
 	{ /* sentinel */ },
 };
 MODULE_DEVICE_TABLE(platform, cros_ec_id);
@@ -493,7 +494,7 @@ static const struct dev_pm_ops cros_ec_dev_pm_ops = {
 
 static struct platform_driver cros_ec_dev_driver = {
 	.driver = {
-		.name = "cros-ec-ctl",
+		.name = DRV_NAME,
 		.pm = &cros_ec_dev_pm_ops,
 	},
 	.probe = ec_device_probe,
@@ -544,6 +545,7 @@ static void __exit cros_ec_dev_exit(void)
 module_init(cros_ec_dev_init);
 module_exit(cros_ec_dev_exit);
 
+MODULE_ALIAS("platform:" DRV_NAME);
 MODULE_AUTHOR("Bill Richardson <wfrichar@chromium.org>");
 MODULE_DESCRIPTION("Userspace interface to the Chrome OS Embedded Controller");
 MODULE_VERSION("1.0");
diff --git a/drivers/platform/chrome/cros_ec_dev.h b/drivers/mfd/cros_ec_dev.h
similarity index 100%
rename from drivers/platform/chrome/cros_ec_dev.h
rename to drivers/mfd/cros_ec_dev.h
diff --git a/drivers/mfd/cros_ec_spi.c b/drivers/mfd/cros_ec_spi.c
index 59c82cd..1b52b85 100644
--- a/drivers/mfd/cros_ec_spi.c
+++ b/drivers/mfd/cros_ec_spi.c
@@ -72,8 +72,7 @@
  * struct cros_ec_spi - information about a SPI-connected EC
  *
  * @spi: SPI device we are connected to
- * @last_transfer_ns: time that we last finished a transfer, or 0 if there
- *	if no record
+ * @last_transfer_ns: time that we last finished a transfer.
  * @start_of_msg_delay: used to set the delay_usecs on the spi_transfer that
  *      is sent when we want to turn on CS at the start of a transaction.
  * @end_of_msg_delay: used to set the delay_usecs on the spi_transfer that
@@ -379,18 +378,15 @@ static int cros_ec_pkt_xfer_spi(struct cros_ec_device *ec_dev,
 	u8 sum;
 	u8 rx_byte;
 	int ret = 0, final_ret;
+	unsigned long delay;
 
 	len = cros_ec_prepare_tx(ec_dev, ec_msg);
 	dev_dbg(ec_dev->dev, "prepared, len=%d\n", len);
 
 	/* If it's too soon to do another transaction, wait */
-	if (ec_spi->last_transfer_ns) {
-		unsigned long delay;	/* The delay completed so far */
-
-		delay = ktime_get_ns() - ec_spi->last_transfer_ns;
-		if (delay < EC_SPI_RECOVERY_TIME_NS)
-			ndelay(EC_SPI_RECOVERY_TIME_NS - delay);
-	}
+	delay = ktime_get_ns() - ec_spi->last_transfer_ns;
+	if (delay < EC_SPI_RECOVERY_TIME_NS)
+		ndelay(EC_SPI_RECOVERY_TIME_NS - delay);
 
 	rx_buf = kzalloc(len, GFP_KERNEL);
 	if (!rx_buf)
@@ -509,18 +505,15 @@ static int cros_ec_cmd_xfer_spi(struct cros_ec_device *ec_dev,
 	u8 rx_byte;
 	int sum;
 	int ret = 0, final_ret;
+	unsigned long delay;
 
 	len = cros_ec_prepare_tx(ec_dev, ec_msg);
 	dev_dbg(ec_dev->dev, "prepared, len=%d\n", len);
 
 	/* If it's too soon to do another transaction, wait */
-	if (ec_spi->last_transfer_ns) {
-		unsigned long delay;	/* The delay completed so far */
-
-		delay = ktime_get_ns() - ec_spi->last_transfer_ns;
-		if (delay < EC_SPI_RECOVERY_TIME_NS)
-			ndelay(EC_SPI_RECOVERY_TIME_NS - delay);
-	}
+	delay = ktime_get_ns() - ec_spi->last_transfer_ns;
+	if (delay < EC_SPI_RECOVERY_TIME_NS)
+		ndelay(EC_SPI_RECOVERY_TIME_NS - delay);
 
 	rx_buf = kzalloc(len, GFP_KERNEL);
 	if (!rx_buf)
diff --git a/drivers/mfd/intel-lpss.c b/drivers/mfd/intel-lpss.c
index 0e0ab9b..9e545eb 100644
--- a/drivers/mfd/intel-lpss.c
+++ b/drivers/mfd/intel-lpss.c
@@ -450,6 +450,8 @@ int intel_lpss_probe(struct device *dev,
 	if (ret)
 		goto err_remove_ltr;
 
+	dev_pm_set_driver_flags(dev, DPM_FLAG_SMART_SUSPEND);
+
 	return 0;
 
 err_remove_ltr:
@@ -478,7 +480,9 @@ EXPORT_SYMBOL_GPL(intel_lpss_remove);
 
 static int resume_lpss_device(struct device *dev, void *data)
 {
-	pm_runtime_resume(dev);
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND))
+		pm_runtime_resume(dev);
+
 	return 0;
 }
 
diff --git a/drivers/mfd/intel_soc_pmic_core.c b/drivers/mfd/intel_soc_pmic_core.c
index 36adf9e..274306d 100644
--- a/drivers/mfd/intel_soc_pmic_core.c
+++ b/drivers/mfd/intel_soc_pmic_core.c
@@ -16,7 +16,6 @@
  * Author: Zhu, Lejun <lejun.zhu@linux.intel.com>
  */
 
-#include <linux/acpi.h>
 #include <linux/module.h>
 #include <linux/mfd/core.h>
 #include <linux/i2c.h>
diff --git a/drivers/mfd/kempld-core.c b/drivers/mfd/kempld-core.c
index 55d824b..390b27c 100644
--- a/drivers/mfd/kempld-core.c
+++ b/drivers/mfd/kempld-core.c
@@ -458,7 +458,7 @@ static int kempld_probe(struct platform_device *pdev)
 		return -EINVAL;
 
 	pld->io_base = devm_ioport_map(dev, ioport->start,
-					ioport->end - ioport->start);
+					resource_size(ioport));
 	if (!pld->io_base)
 		return -ENOMEM;
 
diff --git a/drivers/mfd/lpc_ich.c b/drivers/mfd/lpc_ich.c
index cf1120a..53dc1a4 100644
--- a/drivers/mfd/lpc_ich.c
+++ b/drivers/mfd/lpc_ich.c
@@ -1143,11 +1143,6 @@ static int lpc_ich_init_spi(struct pci_dev *dev)
 			res->end = res->start + SPIBASE_APL_SZ - 1;
 
 			pci_bus_read_config_dword(bus, spi, BCR, &bcr);
-			if (!(bcr & BCR_WPD)) {
-				bcr |= BCR_WPD;
-				pci_bus_write_config_dword(bus, spi, BCR, bcr);
-				pci_bus_read_config_dword(bus, spi, BCR, &bcr);
-			}
 			info->writeable = !!(bcr & BCR_WPD);
 		}
 
diff --git a/drivers/mfd/max77843.c b/drivers/mfd/max77843.c
index dc5caea..da9612d 100644
--- a/drivers/mfd/max77843.c
+++ b/drivers/mfd/max77843.c
@@ -15,7 +15,6 @@
 #include <linux/i2c.h>
 #include <linux/init.h>
 #include <linux/interrupt.h>
-#include <linux/init.h>
 #include <linux/mfd/core.h>
 #include <linux/mfd/max77693-common.h>
 #include <linux/mfd/max77843-private.h>
diff --git a/drivers/mfd/palmas.c b/drivers/mfd/palmas.c
index 3922a93..663a239 100644
--- a/drivers/mfd/palmas.c
+++ b/drivers/mfd/palmas.c
@@ -430,6 +430,7 @@ static void palmas_power_off(void)
 {
 	unsigned int addr;
 	int ret, slave;
+	u8 powerhold_mask;
 	struct device_node *np = palmas_dev->dev->of_node;
 
 	if (of_property_read_bool(np, "ti,palmas-override-powerhold")) {
@@ -437,8 +438,15 @@ static void palmas_power_off(void)
 					  PALMAS_PRIMARY_SECONDARY_PAD2);
 		slave = PALMAS_BASE_TO_SLAVE(PALMAS_PU_PD_OD_BASE);
 
+		if (of_device_is_compatible(np, "ti,tps65917"))
+			powerhold_mask =
+				TPS65917_PRIMARY_SECONDARY_PAD2_GPIO_5_MASK;
+		else
+			powerhold_mask =
+				PALMAS_PRIMARY_SECONDARY_PAD2_GPIO_7_MASK;
+
 		ret = regmap_update_bits(palmas_dev->regmap[slave], addr,
-				PALMAS_PRIMARY_SECONDARY_PAD2_GPIO_7_MASK, 0);
+					 powerhold_mask, 0);
 		if (ret)
 			dev_err(palmas_dev->dev,
 				"Unable to write PRIMARY_SECONDARY_PAD2 %d\n",
diff --git a/drivers/mfd/pcf50633-core.c b/drivers/mfd/pcf50633-core.c
index 6155d12..f952dff 100644
--- a/drivers/mfd/pcf50633-core.c
+++ b/drivers/mfd/pcf50633-core.c
@@ -149,7 +149,7 @@ pcf50633_client_dev_register(struct pcf50633 *pcf, const char *name,
 
 	*pdev = platform_device_alloc(name, -1);
 	if (!*pdev) {
-		dev_err(pcf->dev, "Falied to allocate %s\n", name);
+		dev_err(pcf->dev, "Failed to allocate %s\n", name);
 		return;
 	}
 
diff --git a/drivers/mfd/rave-sp.c b/drivers/mfd/rave-sp.c
new file mode 100644
index 0000000..5c858e7
--- /dev/null
+++ b/drivers/mfd/rave-sp.c
@@ -0,0 +1,710 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+/*
+ * Multifunction core driver for Zodiac Inflight Innovations RAVE
+ * Supervisory Processor(SP) MCU that is connected via dedicated UART
+ * port
+ *
+ * Copyright (C) 2017 Zodiac Inflight Innovations
+ */
+
+#include <linux/atomic.h>
+#include <linux/crc-ccitt.h>
+#include <linux/delay.h>
+#include <linux/export.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/kernel.h>
+#include <linux/mfd/rave-sp.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/sched.h>
+#include <linux/serdev.h>
+#include <asm/unaligned.h>
+
+/*
+ * UART protocol using following entities:
+ *  - message to MCU => ACK response
+ *  - event from MCU => event ACK
+ *
+ * Frame structure:
+ * <STX> <DATA> <CHECKSUM> <ETX>
+ * Where:
+ * - STX - is start of transmission character
+ * - ETX - end of transmission
+ * - DATA - payload
+ * - CHECKSUM - checksum calculated on <DATA>
+ *
+ * If <DATA> or <CHECKSUM> contain one of control characters, then it is
+ * escaped using <DLE> control code. Added <DLE> does not participate in
+ * checksum calculation.
+ */
+#define RAVE_SP_STX			0x02
+#define RAVE_SP_ETX			0x03
+#define RAVE_SP_DLE			0x10
+
+#define RAVE_SP_MAX_DATA_SIZE		64
+#define RAVE_SP_CHECKSUM_SIZE		2  /* Worst case scenario on RDU2 */
+/*
+ * We don't store STX, ETX and unescaped bytes, so Rx is only
+ * DATA + CSUM
+ */
+#define RAVE_SP_RX_BUFFER_SIZE				\
+	(RAVE_SP_MAX_DATA_SIZE + RAVE_SP_CHECKSUM_SIZE)
+
+#define RAVE_SP_STX_ETX_SIZE		2
+/*
+ * For Tx we have to have space for everything, STX, EXT and
+ * potentially stuffed DATA + CSUM data + csum
+ */
+#define RAVE_SP_TX_BUFFER_SIZE				\
+	(RAVE_SP_STX_ETX_SIZE + 2 * RAVE_SP_RX_BUFFER_SIZE)
+
+#define RAVE_SP_BOOT_SOURCE_GET		0
+#define RAVE_SP_BOOT_SOURCE_SET		1
+
+#define RAVE_SP_RDU2_BOARD_TYPE_RMB	0
+#define RAVE_SP_RDU2_BOARD_TYPE_DEB	1
+
+#define RAVE_SP_BOOT_SOURCE_SD		0
+#define RAVE_SP_BOOT_SOURCE_EMMC	1
+#define RAVE_SP_BOOT_SOURCE_NOR		2
+
+/**
+ * enum rave_sp_deframer_state - Possible state for de-framer
+ *
+ * @RAVE_SP_EXPECT_SOF:		 Scanning input for start-of-frame marker
+ * @RAVE_SP_EXPECT_DATA:	 Got start of frame marker, collecting frame
+ * @RAVE_SP_EXPECT_ESCAPED_DATA: Got escape character, collecting escaped byte
+ */
+enum rave_sp_deframer_state {
+	RAVE_SP_EXPECT_SOF,
+	RAVE_SP_EXPECT_DATA,
+	RAVE_SP_EXPECT_ESCAPED_DATA,
+};
+
+/**
+ * struct rave_sp_deframer - Device protocol deframer
+ *
+ * @state:  Current state of the deframer
+ * @data:   Buffer used to collect deframed data
+ * @length: Number of bytes de-framed so far
+ */
+struct rave_sp_deframer {
+	enum rave_sp_deframer_state state;
+	unsigned char data[RAVE_SP_RX_BUFFER_SIZE];
+	size_t length;
+};
+
+/**
+ * struct rave_sp_reply - Reply as per RAVE device protocol
+ *
+ * @length:	Expected reply length
+ * @data:	Buffer to store reply payload in
+ * @code:	Expected reply code
+ * @ackid:	Expected reply ACK ID
+ * @completion: Successful reply reception completion
+ */
+struct rave_sp_reply {
+	size_t length;
+	void  *data;
+	u8     code;
+	u8     ackid;
+	struct completion received;
+};
+
+/**
+ * struct rave_sp_checksum - Variant specific checksum implementation details
+ *
+ * @length:	Caculated checksum length
+ * @subroutine:	Utilized checksum algorithm implementation
+ */
+struct rave_sp_checksum {
+	size_t length;
+	void (*subroutine)(const u8 *, size_t, u8 *);
+};
+
+/**
+ * struct rave_sp_variant_cmds - Variant specific command routines
+ *
+ * @translate:	Generic to variant specific command mapping routine
+ *
+ */
+struct rave_sp_variant_cmds {
+	int (*translate)(enum rave_sp_command);
+};
+
+/**
+ * struct rave_sp_variant - RAVE supervisory processor core variant
+ *
+ * @checksum:	Variant specific checksum implementation
+ * @cmd:	Variant specific command pointer table
+ *
+ */
+struct rave_sp_variant {
+	const struct rave_sp_checksum *checksum;
+	struct rave_sp_variant_cmds cmd;
+};
+
+/**
+ * struct rave_sp - RAVE supervisory processor core
+ *
+ * @serdev:			Pointer to underlying serdev
+ * @deframer:			Stored state of the protocol deframer
+ * @ackid:			ACK ID used in last reply sent to the device
+ * @bus_lock:			Lock to serialize access to the device
+ * @reply_lock:			Lock protecting @reply
+ * @reply:			Pointer to memory to store reply payload
+ *
+ * @variant:			Device variant specific information
+ * @event_notifier_list:	Input event notification chain
+ *
+ */
+struct rave_sp {
+	struct serdev_device *serdev;
+	struct rave_sp_deframer deframer;
+	atomic_t ackid;
+	struct mutex bus_lock;
+	struct mutex reply_lock;
+	struct rave_sp_reply *reply;
+
+	const struct rave_sp_variant *variant;
+	struct blocking_notifier_head event_notifier_list;
+};
+
+static bool rave_sp_id_is_event(u8 code)
+{
+	return (code & 0xF0) == RAVE_SP_EVNT_BASE;
+}
+
+static void rave_sp_unregister_event_notifier(struct device *dev, void *res)
+{
+	struct rave_sp *sp = dev_get_drvdata(dev->parent);
+	struct notifier_block *nb = *(struct notifier_block **)res;
+	struct blocking_notifier_head *bnh = &sp->event_notifier_list;
+
+	WARN_ON(blocking_notifier_chain_unregister(bnh, nb));
+}
+
+int devm_rave_sp_register_event_notifier(struct device *dev,
+					 struct notifier_block *nb)
+{
+	struct rave_sp *sp = dev_get_drvdata(dev->parent);
+	struct notifier_block **rcnb;
+	int ret;
+
+	rcnb = devres_alloc(rave_sp_unregister_event_notifier,
+			    sizeof(*rcnb), GFP_KERNEL);
+	if (!rcnb)
+		return -ENOMEM;
+
+	ret = blocking_notifier_chain_register(&sp->event_notifier_list, nb);
+	if (!ret) {
+		*rcnb = nb;
+		devres_add(dev, rcnb);
+	} else {
+		devres_free(rcnb);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(devm_rave_sp_register_event_notifier);
+
+static void csum_8b2c(const u8 *buf, size_t size, u8 *crc)
+{
+	*crc = *buf++;
+	size--;
+
+	while (size--)
+		*crc += *buf++;
+
+	*crc = 1 + ~(*crc);
+}
+
+static void csum_ccitt(const u8 *buf, size_t size, u8 *crc)
+{
+	const u16 calculated = crc_ccitt_false(0xffff, buf, size);
+
+	/*
+	 * While the rest of the wire protocol is little-endian,
+	 * CCITT-16 CRC in RDU2 device is sent out in big-endian order.
+	 */
+	put_unaligned_be16(calculated, crc);
+}
+
+static void *stuff(unsigned char *dest, const unsigned char *src, size_t n)
+{
+	while (n--) {
+		const unsigned char byte = *src++;
+
+		switch (byte) {
+		case RAVE_SP_STX:
+		case RAVE_SP_ETX:
+		case RAVE_SP_DLE:
+			*dest++ = RAVE_SP_DLE;
+			/* FALLTHROUGH */
+		default:
+			*dest++ = byte;
+		}
+	}
+
+	return dest;
+}
+
+static int rave_sp_write(struct rave_sp *sp, const u8 *data, u8 data_size)
+{
+	const size_t checksum_length = sp->variant->checksum->length;
+	unsigned char frame[RAVE_SP_TX_BUFFER_SIZE];
+	unsigned char crc[RAVE_SP_CHECKSUM_SIZE];
+	unsigned char *dest = frame;
+	size_t length;
+
+	if (WARN_ON(checksum_length > sizeof(crc)))
+		return -ENOMEM;
+
+	if (WARN_ON(data_size > sizeof(frame)))
+		return -ENOMEM;
+
+	sp->variant->checksum->subroutine(data, data_size, crc);
+
+	*dest++ = RAVE_SP_STX;
+	dest = stuff(dest, data, data_size);
+	dest = stuff(dest, crc, checksum_length);
+	*dest++ = RAVE_SP_ETX;
+
+	length = dest - frame;
+
+	print_hex_dump(KERN_DEBUG, "rave-sp tx: ", DUMP_PREFIX_NONE,
+		       16, 1, frame, length, false);
+
+	return serdev_device_write(sp->serdev, frame, length, HZ);
+}
+
+static u8 rave_sp_reply_code(u8 command)
+{
+	/*
+	 * There isn't a single rule that describes command code ->
+	 * ACK code transformation, but, going through various
+	 * versions of ICDs, there appear to be three distinct groups
+	 * that can be described by simple transformation.
+	 */
+	switch (command) {
+	case 0xA0 ... 0xBE:
+		/*
+		 * Commands implemented by firmware found in RDU1 and
+		 * older devices all seem to obey the following rule
+		 */
+		return command + 0x20;
+	case 0xE0 ... 0xEF:
+		/*
+		 * Events emitted by all versions of the firmare use
+		 * least significant bit to get an ACK code
+		 */
+		return command | 0x01;
+	default:
+		/*
+		 * Commands implemented by firmware found in RDU2 are
+		 * similar to "old" commands, but they use slightly
+		 * different offset
+		 */
+		return command + 0x40;
+	}
+}
+
+int rave_sp_exec(struct rave_sp *sp,
+		 void *__data,  size_t data_size,
+		 void *reply_data, size_t reply_data_size)
+{
+	struct rave_sp_reply reply = {
+		.data     = reply_data,
+		.length   = reply_data_size,
+		.received = COMPLETION_INITIALIZER_ONSTACK(reply.received),
+	};
+	unsigned char *data = __data;
+	int command, ret = 0;
+	u8 ackid;
+
+	command = sp->variant->cmd.translate(data[0]);
+	if (command < 0)
+		return command;
+
+	ackid       = atomic_inc_return(&sp->ackid);
+	reply.ackid = ackid;
+	reply.code  = rave_sp_reply_code((u8)command),
+
+	mutex_lock(&sp->bus_lock);
+
+	mutex_lock(&sp->reply_lock);
+	sp->reply = &reply;
+	mutex_unlock(&sp->reply_lock);
+
+	data[0] = command;
+	data[1] = ackid;
+
+	rave_sp_write(sp, data, data_size);
+
+	if (!wait_for_completion_timeout(&reply.received, HZ)) {
+		dev_err(&sp->serdev->dev, "Command timeout\n");
+		ret = -ETIMEDOUT;
+
+		mutex_lock(&sp->reply_lock);
+		sp->reply = NULL;
+		mutex_unlock(&sp->reply_lock);
+	}
+
+	mutex_unlock(&sp->bus_lock);
+	return ret;
+}
+EXPORT_SYMBOL_GPL(rave_sp_exec);
+
+static void rave_sp_receive_event(struct rave_sp *sp,
+				  const unsigned char *data, size_t length)
+{
+	u8 cmd[] = {
+		[0] = rave_sp_reply_code(data[0]),
+		[1] = data[1],
+	};
+
+	rave_sp_write(sp, cmd, sizeof(cmd));
+
+	blocking_notifier_call_chain(&sp->event_notifier_list,
+				     rave_sp_action_pack(data[0], data[2]),
+				     NULL);
+}
+
+static void rave_sp_receive_reply(struct rave_sp *sp,
+				  const unsigned char *data, size_t length)
+{
+	struct device *dev = &sp->serdev->dev;
+	struct rave_sp_reply *reply;
+	const  size_t payload_length = length - 2;
+
+	mutex_lock(&sp->reply_lock);
+	reply = sp->reply;
+
+	if (reply) {
+		if (reply->code == data[0] && reply->ackid == data[1] &&
+		    payload_length >= reply->length) {
+			/*
+			 * We are relying on memcpy(dst, src, 0) to be a no-op
+			 * when handling commands that have a no-payload reply
+			 */
+			memcpy(reply->data, &data[2], reply->length);
+			complete(&reply->received);
+			sp->reply = NULL;
+		} else {
+			dev_err(dev, "Ignoring incorrect reply\n");
+			dev_dbg(dev, "Code:   expected = 0x%08x received = 0x%08x\n",
+				reply->code, data[0]);
+			dev_dbg(dev, "ACK ID: expected = 0x%08x received = 0x%08x\n",
+				reply->ackid, data[1]);
+			dev_dbg(dev, "Length: expected = %zu received = %zu\n",
+				reply->length, payload_length);
+		}
+	}
+
+	mutex_unlock(&sp->reply_lock);
+}
+
+static void rave_sp_receive_frame(struct rave_sp *sp,
+				  const unsigned char *data,
+				  size_t length)
+{
+	const size_t checksum_length = sp->variant->checksum->length;
+	const size_t payload_length  = length - checksum_length;
+	const u8 *crc_reported       = &data[payload_length];
+	struct device *dev           = &sp->serdev->dev;
+	u8 crc_calculated[checksum_length];
+
+	print_hex_dump(KERN_DEBUG, "rave-sp rx: ", DUMP_PREFIX_NONE,
+		       16, 1, data, length, false);
+
+	if (unlikely(length <= checksum_length)) {
+		dev_warn(dev, "Dropping short frame\n");
+		return;
+	}
+
+	sp->variant->checksum->subroutine(data, payload_length,
+					  crc_calculated);
+
+	if (memcmp(crc_calculated, crc_reported, checksum_length)) {
+		dev_warn(dev, "Dropping bad frame\n");
+		return;
+	}
+
+	if (rave_sp_id_is_event(data[0]))
+		rave_sp_receive_event(sp, data, length);
+	else
+		rave_sp_receive_reply(sp, data, length);
+}
+
+static int rave_sp_receive_buf(struct serdev_device *serdev,
+			       const unsigned char *buf, size_t size)
+{
+	struct device *dev = &serdev->dev;
+	struct rave_sp *sp = dev_get_drvdata(dev);
+	struct rave_sp_deframer *deframer = &sp->deframer;
+	const unsigned char *src = buf;
+	const unsigned char *end = buf + size;
+
+	while (src < end) {
+		const unsigned char byte = *src++;
+
+		switch (deframer->state) {
+		case RAVE_SP_EXPECT_SOF:
+			if (byte == RAVE_SP_STX)
+				deframer->state = RAVE_SP_EXPECT_DATA;
+			break;
+
+		case RAVE_SP_EXPECT_DATA:
+			/*
+			 * Treat special byte values first
+			 */
+			switch (byte) {
+			case RAVE_SP_ETX:
+				rave_sp_receive_frame(sp,
+						      deframer->data,
+						      deframer->length);
+				/*
+				 * Once we extracted a complete frame
+				 * out of a stream, we call it done
+				 * and proceed to bailing out while
+				 * resetting the framer to initial
+				 * state, regardless if we've consumed
+				 * all of the stream or not.
+				 */
+				goto reset_framer;
+			case RAVE_SP_STX:
+				dev_warn(dev, "Bad frame: STX before ETX\n");
+				/*
+				 * If we encounter second "start of
+				 * the frame" marker before seeing
+				 * corresponding "end of frame", we
+				 * reset the framer and ignore both:
+				 * frame started by first SOF and
+				 * frame started by current SOF.
+				 *
+				 * NOTE: The above means that only the
+				 * frame started by third SOF, sent
+				 * after this one will have a chance
+				 * to get throught.
+				 */
+				goto reset_framer;
+			case RAVE_SP_DLE:
+				deframer->state = RAVE_SP_EXPECT_ESCAPED_DATA;
+				/*
+				 * If we encounter escape sequence we
+				 * need to skip it and collect the
+				 * byte that follows. We do it by
+				 * forcing the next iteration of the
+				 * encompassing while loop.
+				 */
+				continue;
+			}
+			/*
+			 * For the rest of the bytes, that are not
+			 * speical snoflakes, we do the same thing
+			 * that we do to escaped data - collect it in
+			 * deframer buffer
+			 */
+
+			/* FALLTHROUGH */
+
+		case RAVE_SP_EXPECT_ESCAPED_DATA:
+			deframer->data[deframer->length++] = byte;
+
+			if (deframer->length == sizeof(deframer->data)) {
+				dev_warn(dev, "Bad frame: Too long\n");
+				/*
+				 * If the amount of data we've
+				 * accumulated for current frame so
+				 * far starts to exceed the capacity
+				 * of deframer's buffer, there's
+				 * nothing else we can do but to
+				 * discard that data and start
+				 * assemblying a new frame again
+				 */
+				goto reset_framer;
+			}
+
+			/*
+			 * We've extracted out special byte, now we
+			 * can go back to regular data collecting
+			 */
+			deframer->state = RAVE_SP_EXPECT_DATA;
+			break;
+		}
+	}
+
+	/*
+	 * The only way to get out of the above loop and end up here
+	 * is throught consuming all of the supplied data, so here we
+	 * report that we processed it all.
+	 */
+	return size;
+
+reset_framer:
+	/*
+	 * NOTE: A number of codepaths that will drop us here will do
+	 * so before consuming all 'size' bytes of the data passed by
+	 * serdev layer. We rely on the fact that serdev layer will
+	 * re-execute this handler with the remainder of the Rx bytes
+	 * once we report actual number of bytes that we processed.
+	 */
+	deframer->state  = RAVE_SP_EXPECT_SOF;
+	deframer->length = 0;
+
+	return src - buf;
+}
+
+static int rave_sp_rdu1_cmd_translate(enum rave_sp_command command)
+{
+	if (command >= RAVE_SP_CMD_STATUS &&
+	    command <= RAVE_SP_CMD_CONTROL_EVENTS)
+		return command;
+
+	return -EINVAL;
+}
+
+static int rave_sp_rdu2_cmd_translate(enum rave_sp_command command)
+{
+	if (command >= RAVE_SP_CMD_GET_FIRMWARE_VERSION &&
+	    command <= RAVE_SP_CMD_GET_GPIO_STATE)
+		return command;
+
+	if (command == RAVE_SP_CMD_REQ_COPPER_REV) {
+		/*
+		 * As per RDU2 ICD 3.4.47 CMD_GET_COPPER_REV code is
+		 * different from that for RDU1 and it is set to 0x28.
+		 */
+		return 0x28;
+	}
+
+	return rave_sp_rdu1_cmd_translate(command);
+}
+
+static int rave_sp_default_cmd_translate(enum rave_sp_command command)
+{
+	/*
+	 * All of the following command codes were taken from "Table :
+	 * Communications Protocol Message Types" in section 3.3
+	 * "MESSAGE TYPES" of Rave PIC24 ICD.
+	 */
+	switch (command) {
+	case RAVE_SP_CMD_GET_FIRMWARE_VERSION:
+		return 0x11;
+	case RAVE_SP_CMD_GET_BOOTLOADER_VERSION:
+		return 0x12;
+	case RAVE_SP_CMD_BOOT_SOURCE:
+		return 0x14;
+	case RAVE_SP_CMD_SW_WDT:
+		return 0x1C;
+	case RAVE_SP_CMD_RESET:
+		return 0x1E;
+	case RAVE_SP_CMD_RESET_REASON:
+		return 0x1F;
+	default:
+		return -EINVAL;
+	}
+}
+
+static const struct rave_sp_checksum rave_sp_checksum_8b2c = {
+	.length     = 1,
+	.subroutine = csum_8b2c,
+};
+
+static const struct rave_sp_checksum rave_sp_checksum_ccitt = {
+	.length     = 2,
+	.subroutine = csum_ccitt,
+};
+
+static const struct rave_sp_variant rave_sp_legacy = {
+	.checksum = &rave_sp_checksum_8b2c,
+	.cmd = {
+		.translate = rave_sp_default_cmd_translate,
+	},
+};
+
+static const struct rave_sp_variant rave_sp_rdu1 = {
+	.checksum = &rave_sp_checksum_8b2c,
+	.cmd = {
+		.translate = rave_sp_rdu1_cmd_translate,
+	},
+};
+
+static const struct rave_sp_variant rave_sp_rdu2 = {
+	.checksum = &rave_sp_checksum_ccitt,
+	.cmd = {
+		.translate = rave_sp_rdu2_cmd_translate,
+	},
+};
+
+static const struct of_device_id rave_sp_dt_ids[] = {
+	{ .compatible = "zii,rave-sp-niu",  .data = &rave_sp_legacy },
+	{ .compatible = "zii,rave-sp-mezz", .data = &rave_sp_legacy },
+	{ .compatible = "zii,rave-sp-esb",  .data = &rave_sp_legacy },
+	{ .compatible = "zii,rave-sp-rdu1", .data = &rave_sp_rdu1   },
+	{ .compatible = "zii,rave-sp-rdu2", .data = &rave_sp_rdu2   },
+	{ /* sentinel */ }
+};
+
+static const struct serdev_device_ops rave_sp_serdev_device_ops = {
+	.receive_buf  = rave_sp_receive_buf,
+	.write_wakeup = serdev_device_write_wakeup,
+};
+
+static int rave_sp_probe(struct serdev_device *serdev)
+{
+	struct device *dev = &serdev->dev;
+	struct rave_sp *sp;
+	u32 baud;
+	int ret;
+
+	if (of_property_read_u32(dev->of_node, "current-speed", &baud)) {
+		dev_err(dev,
+			"'current-speed' is not specified in device node\n");
+		return -EINVAL;
+	}
+
+	sp = devm_kzalloc(dev, sizeof(*sp), GFP_KERNEL);
+	if (!sp)
+		return -ENOMEM;
+
+	sp->serdev = serdev;
+	dev_set_drvdata(dev, sp);
+
+	sp->variant = of_device_get_match_data(dev);
+	if (!sp->variant)
+		return -ENODEV;
+
+	mutex_init(&sp->bus_lock);
+	mutex_init(&sp->reply_lock);
+	BLOCKING_INIT_NOTIFIER_HEAD(&sp->event_notifier_list);
+
+	serdev_device_set_client_ops(serdev, &rave_sp_serdev_device_ops);
+	ret = devm_serdev_device_open(dev, serdev);
+	if (ret)
+		return ret;
+
+	serdev_device_set_baudrate(serdev, baud);
+
+	return devm_of_platform_populate(dev);
+}
+
+MODULE_DEVICE_TABLE(of, rave_sp_dt_ids);
+
+static struct serdev_device_driver rave_sp_drv = {
+	.probe			= rave_sp_probe,
+	.driver = {
+		.name		= "rave-sp",
+		.of_match_table	= rave_sp_dt_ids,
+	},
+};
+module_serdev_device_driver(rave_sp_drv);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Andrey Vostrikov <andrey.vostrikov@cogentembedded.com>");
+MODULE_AUTHOR("Nikita Yushchenko <nikita.yoush@cogentembedded.com>");
+MODULE_AUTHOR("Andrey Smirnov <andrew.smirnov@gmail.com>");
+MODULE_DESCRIPTION("RAVE SP core driver");
diff --git a/drivers/mfd/stm32-lptimer.c b/drivers/mfd/stm32-lptimer.c
index 075330a..a00f99f 100644
--- a/drivers/mfd/stm32-lptimer.c
+++ b/drivers/mfd/stm32-lptimer.c
@@ -1,13 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * STM32 Low-Power Timer parent driver.
- *
  * Copyright (C) STMicroelectronics 2017
- *
  * Author: Fabrice Gasnier <fabrice.gasnier@st.com>
- *
  * Inspired by Benjamin Gaignard's stm32-timers driver
- *
- * License terms:  GNU General Public License (GPL), version 2
  */
 
 #include <linux/mfd/stm32-lptimer.h>
diff --git a/drivers/mfd/stm32-timers.c b/drivers/mfd/stm32-timers.c
index a6675a4..1d347e5 100644
--- a/drivers/mfd/stm32-timers.c
+++ b/drivers/mfd/stm32-timers.c
@@ -1,9 +1,7 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
  * Copyright (C) STMicroelectronics 2016
- *
  * Author: Benjamin Gaignard <benjamin.gaignard@st.com>
- *
- * License terms:  GNU General Public License (GPL), version 2
  */
 
 #include <linux/mfd/stm32-timers.h>
diff --git a/drivers/mfd/syscon.c b/drivers/mfd/syscon.c
index b93fe4c..7eaa40b 100644
--- a/drivers/mfd/syscon.c
+++ b/drivers/mfd/syscon.c
@@ -13,6 +13,7 @@
  */
 
 #include <linux/err.h>
+#include <linux/hwspinlock.h>
 #include <linux/io.h>
 #include <linux/module.h>
 #include <linux/list.h>
@@ -87,6 +88,24 @@ static struct syscon *of_syscon_register(struct device_node *np)
 	if (ret)
 		reg_io_width = 4;
 
+	ret = of_hwspin_lock_get_id(np, 0);
+	if (ret > 0 || (IS_ENABLED(CONFIG_HWSPINLOCK) && ret == 0)) {
+		syscon_config.use_hwlock = true;
+		syscon_config.hwlock_id = ret;
+		syscon_config.hwlock_mode = HWLOCK_IRQSTATE;
+	} else if (ret < 0) {
+		switch (ret) {
+		case -ENOENT:
+			/* Ignore missing hwlock, it's optional. */
+			break;
+		default:
+			pr_err("Failed to retrieve valid hwlock: %d\n", ret);
+			/* fall-through */
+		case -EPROBE_DEFER:
+			goto err_regmap;
+		}
+	}
+
 	syscon_config.reg_stride = reg_io_width;
 	syscon_config.val_bits = reg_io_width * 8;
 	syscon_config.max_register = resource_size(&res) - reg_io_width;
diff --git a/drivers/mfd/ti_am335x_tscadc.c b/drivers/mfd/ti_am335x_tscadc.c
index 0f3fab4..3cd958a 100644
--- a/drivers/mfd/ti_am335x_tscadc.c
+++ b/drivers/mfd/ti_am335x_tscadc.c
@@ -124,7 +124,7 @@ static	int ti_tscadc_probe(struct platform_device *pdev)
 	struct ti_tscadc_dev	*tscadc;
 	struct resource		*res;
 	struct clk		*clk;
-	struct device_node	*node = pdev->dev.of_node;
+	struct device_node	*node;
 	struct mfd_cell		*cell;
 	struct property         *prop;
 	const __be32            *cur;
diff --git a/drivers/mfd/tmio_core.c b/drivers/mfd/tmio_core.c
index 83af78c..ebf54cc2 100644
--- a/drivers/mfd/tmio_core.c
+++ b/drivers/mfd/tmio_core.c
@@ -9,6 +9,26 @@
 #include <linux/export.h>
 #include <linux/mfd/tmio.h>
 
+#define CNF_CMD     0x04
+#define CNF_CTL_BASE   0x10
+#define CNF_INT_PIN  0x3d
+#define CNF_STOP_CLK_CTL 0x40
+#define CNF_GCLK_CTL 0x41
+#define CNF_SD_CLK_MODE 0x42
+#define CNF_PIN_STATUS 0x44
+#define CNF_PWR_CTL_1 0x48
+#define CNF_PWR_CTL_2 0x49
+#define CNF_PWR_CTL_3 0x4a
+#define CNF_CARD_DETECT_MODE 0x4c
+#define CNF_SD_SLOT 0x50
+#define CNF_EXT_GCLK_CTL_1 0xf0
+#define CNF_EXT_GCLK_CTL_2 0xf1
+#define CNF_EXT_GCLK_CTL_3 0xf9
+#define CNF_SD_LED_EN_1 0xfa
+#define CNF_SD_LED_EN_2 0xfe
+
+#define   SDCREN 0x2   /* Enable access to MMC CTL regs. (flag in COMMAND_REG)*/
+
 int tmio_core_mmc_enable(void __iomem *cnf, int shift, unsigned long base)
 {
 	/* Enable the MMC/SD Control registers */
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index f1a5c23..7c0fa24 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -496,6 +496,10 @@
            Enable this configuration option to enable the host side test driver
            for PCI Endpoint.
 
+config MISC_RTSX
+	tristate
+	default MISC_RTSX_PCI || MISC_RTSX_USB
+
 source "drivers/misc/c2port/Kconfig"
 source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
@@ -508,4 +512,5 @@
 source "drivers/misc/genwqe/Kconfig"
 source "drivers/misc/echo/Kconfig"
 source "drivers/misc/cxl/Kconfig"
+source "drivers/misc/cardreader/Kconfig"
 endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 5ca5f64..8d8cc09 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -55,6 +55,7 @@
 obj-$(CONFIG_ASPEED_LPC_CTRL)	+= aspeed-lpc-ctrl.o
 obj-$(CONFIG_ASPEED_LPC_SNOOP)	+= aspeed-lpc-snoop.o
 obj-$(CONFIG_PCI_ENDPOINT_TEST)	+= pci_endpoint_test.o
+obj-$(CONFIG_MISC_RTSX)	+= cardreader/
 
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_core.o
 lkdtm-$(CONFIG_LKDTM)		+= lkdtm_bugs.o
diff --git a/drivers/misc/cardreader/Kconfig b/drivers/misc/cardreader/Kconfig
new file mode 100644
index 0000000..69e815e
--- /dev/null
+++ b/drivers/misc/cardreader/Kconfig
@@ -0,0 +1,20 @@
+config MISC_RTSX_PCI
+	tristate "Realtek PCI-E card reader"
+	depends on PCI
+	select MFD_CORE
+	help
+	  This supports for Realtek PCI-Express card reader including rts5209,
+	  rts5227, rts522A, rts5229, rts5249, rts524A, rts525A, rtl8411, rts5260.
+	  Realtek card readers support access to many types of memory cards,
+	  such as Memory Stick, Memory Stick Pro, Secure Digital and
+	  MultiMediaCard.
+
+config MISC_RTSX_USB
+	tristate "Realtek USB card reader"
+	depends on USB
+	select MFD_CORE
+	help
+	  Select this option to get support for Realtek USB 2.0 card readers
+	  including RTS5129, RTS5139, RTS5179 and RTS5170.
+	  Realtek card reader supports access to many types of memory cards,
+	  such as Memory Stick Pro, Secure Digital and MultiMediaCard.
diff --git a/drivers/misc/cardreader/Makefile b/drivers/misc/cardreader/Makefile
new file mode 100644
index 0000000..9fabfcc
--- /dev/null
+++ b/drivers/misc/cardreader/Makefile
@@ -0,0 +1,4 @@
+rtsx_pci-objs := rtsx_pcr.o rts5209.o rts5229.o rtl8411.o rts5227.o rts5249.o rts5260.o
+
+obj-$(CONFIG_MISC_RTSX_PCI)	+= rtsx_pci.o
+obj-$(CONFIG_MISC_RTSX_USB)	+= rtsx_usb.o
diff --git a/drivers/mfd/rtl8411.c b/drivers/misc/cardreader/rtl8411.c
similarity index 99%
rename from drivers/mfd/rtl8411.c
rename to drivers/misc/cardreader/rtl8411.c
index b3ae659..434fd07 100644
--- a/drivers/mfd/rtl8411.c
+++ b/drivers/misc/cardreader/rtl8411.c
@@ -23,7 +23,7 @@
 #include <linux/module.h>
 #include <linux/bitops.h>
 #include <linux/delay.h>
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 
 #include "rtsx_pcr.h"
 
diff --git a/drivers/mfd/rts5209.c b/drivers/misc/cardreader/rts5209.c
similarity index 99%
rename from drivers/mfd/rts5209.c
rename to drivers/misc/cardreader/rts5209.c
index b95beec..ce68c48 100644
--- a/drivers/mfd/rts5209.c
+++ b/drivers/misc/cardreader/rts5209.c
@@ -21,7 +21,7 @@
 
 #include <linux/module.h>
 #include <linux/delay.h>
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 
 #include "rtsx_pcr.h"
 
diff --git a/drivers/mfd/rts5227.c b/drivers/misc/cardreader/rts5227.c
similarity index 99%
rename from drivers/mfd/rts5227.c
rename to drivers/misc/cardreader/rts5227.c
index ff296a4..024dcba 100644
--- a/drivers/mfd/rts5227.c
+++ b/drivers/misc/cardreader/rts5227.c
@@ -22,7 +22,7 @@
 
 #include <linux/module.h>
 #include <linux/delay.h>
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 
 #include "rtsx_pcr.h"
 
diff --git a/drivers/mfd/rts5229.c b/drivers/misc/cardreader/rts5229.c
similarity index 99%
rename from drivers/mfd/rts5229.c
rename to drivers/misc/cardreader/rts5229.c
index 9ed9dc8..9119261 100644
--- a/drivers/mfd/rts5229.c
+++ b/drivers/misc/cardreader/rts5229.c
@@ -21,7 +21,7 @@
 
 #include <linux/module.h>
 #include <linux/delay.h>
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 
 #include "rtsx_pcr.h"
 
diff --git a/drivers/mfd/rts5249.c b/drivers/misc/cardreader/rts5249.c
similarity index 99%
rename from drivers/mfd/rts5249.c
rename to drivers/misc/cardreader/rts5249.c
index 7fcf37b..dbe013a 100644
--- a/drivers/mfd/rts5249.c
+++ b/drivers/misc/cardreader/rts5249.c
@@ -21,7 +21,7 @@
 
 #include <linux/module.h>
 #include <linux/delay.h>
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 
 #include "rtsx_pcr.h"
 
@@ -738,4 +738,3 @@ void rts525a_init_params(struct rtsx_pcr *pcr)
 	pcr->reg_pm_ctrl3 = RTS524A_PM_CTRL3;
 	pcr->ops = &rts525a_pcr_ops;
 }
-
diff --git a/drivers/misc/cardreader/rts5260.c b/drivers/misc/cardreader/rts5260.c
new file mode 100644
index 0000000..07cb93a
--- /dev/null
+++ b/drivers/misc/cardreader/rts5260.c
@@ -0,0 +1,748 @@
+/* Driver for Realtek PCI-Express card reader
+ *
+ * Copyright(c) 2016-2017 Realtek Semiconductor Corp. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation; either version 2, or (at your option) any
+ * later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author:
+ *   Steven FENG <steven_feng@realsil.com.cn>
+ *   Rui FENG <rui_feng@realsil.com.cn>
+ *   Wei WANG <wei_wang@realsil.com.cn>
+ */
+
+#include <linux/module.h>
+#include <linux/delay.h>
+#include <linux/rtsx_pci.h>
+
+#include "rts5260.h"
+#include "rtsx_pcr.h"
+
+static u8 rts5260_get_ic_version(struct rtsx_pcr *pcr)
+{
+	u8 val;
+
+	rtsx_pci_read_register(pcr, DUMMY_REG_RESET_0, &val);
+	return val & IC_VERSION_MASK;
+}
+
+static void rts5260_fill_driving(struct rtsx_pcr *pcr, u8 voltage)
+{
+	u8 driving_3v3[6][3] = {
+		{0x94, 0x94, 0x94},
+		{0x11, 0x11, 0x18},
+		{0x55, 0x55, 0x5C},
+		{0x94, 0x94, 0x94},
+		{0x94, 0x94, 0x94},
+		{0xFF, 0xFF, 0xFF},
+	};
+	u8 driving_1v8[6][3] = {
+		{0x9A, 0x89, 0x89},
+		{0xC4, 0xC4, 0xC4},
+		{0x3C, 0x3C, 0x3C},
+		{0x9B, 0x99, 0x99},
+		{0x9A, 0x89, 0x89},
+		{0xFE, 0xFE, 0xFE},
+	};
+	u8 (*driving)[3], drive_sel;
+
+	if (voltage == OUTPUT_3V3) {
+		driving = driving_3v3;
+		drive_sel = pcr->sd30_drive_sel_3v3;
+	} else {
+		driving = driving_1v8;
+		drive_sel = pcr->sd30_drive_sel_1v8;
+	}
+
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, SD30_CLK_DRIVE_SEL,
+			 0xFF, driving[drive_sel][0]);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, SD30_CMD_DRIVE_SEL,
+			 0xFF, driving[drive_sel][1]);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, SD30_DAT_DRIVE_SEL,
+			 0xFF, driving[drive_sel][2]);
+}
+
+static void rtsx_base_fetch_vendor_settings(struct rtsx_pcr *pcr)
+{
+	u32 reg;
+
+	rtsx_pci_read_config_dword(pcr, PCR_SETTING_REG1, &reg);
+	pcr_dbg(pcr, "Cfg 0x%x: 0x%x\n", PCR_SETTING_REG1, reg);
+
+	if (!rtsx_vendor_setting_valid(reg)) {
+		pcr_dbg(pcr, "skip fetch vendor setting\n");
+		return;
+	}
+
+	pcr->aspm_en = rtsx_reg_to_aspm(reg);
+	pcr->sd30_drive_sel_1v8 = rtsx_reg_to_sd30_drive_sel_1v8(reg);
+	pcr->card_drive_sel &= 0x3F;
+	pcr->card_drive_sel |= rtsx_reg_to_card_drive_sel(reg);
+
+	rtsx_pci_read_config_dword(pcr, PCR_SETTING_REG2, &reg);
+	pcr_dbg(pcr, "Cfg 0x%x: 0x%x\n", PCR_SETTING_REG2, reg);
+	pcr->sd30_drive_sel_3v3 = rtsx_reg_to_sd30_drive_sel_3v3(reg);
+	if (rtsx_reg_check_reverse_socket(reg))
+		pcr->flags |= PCR_REVERSE_SOCKET;
+}
+
+static void rtsx_base_force_power_down(struct rtsx_pcr *pcr, u8 pm_state)
+{
+	/* Set relink_time to 0 */
+	rtsx_pci_write_register(pcr, AUTOLOAD_CFG_BASE + 1, MASK_8_BIT_DEF, 0);
+	rtsx_pci_write_register(pcr, AUTOLOAD_CFG_BASE + 2, MASK_8_BIT_DEF, 0);
+	rtsx_pci_write_register(pcr, AUTOLOAD_CFG_BASE + 3,
+				RELINK_TIME_MASK, 0);
+
+	if (pm_state == HOST_ENTER_S3)
+		rtsx_pci_write_register(pcr, pcr->reg_pm_ctrl3,
+					D3_DELINK_MODE_EN, D3_DELINK_MODE_EN);
+
+	rtsx_pci_write_register(pcr, FPDCTL, ALL_POWER_DOWN, ALL_POWER_DOWN);
+}
+
+static int rtsx_base_enable_auto_blink(struct rtsx_pcr *pcr)
+{
+	return rtsx_pci_write_register(pcr, OLT_LED_CTL,
+		LED_SHINE_MASK, LED_SHINE_EN);
+}
+
+static int rtsx_base_disable_auto_blink(struct rtsx_pcr *pcr)
+{
+	return rtsx_pci_write_register(pcr, OLT_LED_CTL,
+		LED_SHINE_MASK, LED_SHINE_DISABLE);
+}
+
+static int rts5260_turn_on_led(struct rtsx_pcr *pcr)
+{
+	return rtsx_pci_write_register(pcr, RTS5260_REG_GPIO_CTL0,
+		RTS5260_REG_GPIO_MASK, RTS5260_REG_GPIO_ON);
+}
+
+static int rts5260_turn_off_led(struct rtsx_pcr *pcr)
+{
+	return rtsx_pci_write_register(pcr, RTS5260_REG_GPIO_CTL0,
+		RTS5260_REG_GPIO_MASK, RTS5260_REG_GPIO_OFF);
+}
+
+/* SD Pull Control Enable:
+ *     SD_DAT[3:0] ==> pull up
+ *     SD_CD       ==> pull up
+ *     SD_WP       ==> pull up
+ *     SD_CMD      ==> pull up
+ *     SD_CLK      ==> pull down
+ */
+static const u32 rts5260_sd_pull_ctl_enable_tbl[] = {
+	RTSX_REG_PAIR(CARD_PULL_CTL1, 0x66),
+	RTSX_REG_PAIR(CARD_PULL_CTL2, 0xAA),
+	RTSX_REG_PAIR(CARD_PULL_CTL3, 0xE9),
+	RTSX_REG_PAIR(CARD_PULL_CTL4, 0xAA),
+	0,
+};
+
+/* SD Pull Control Disable:
+ *     SD_DAT[3:0] ==> pull down
+ *     SD_CD       ==> pull up
+ *     SD_WP       ==> pull down
+ *     SD_CMD      ==> pull down
+ *     SD_CLK      ==> pull down
+ */
+static const u32 rts5260_sd_pull_ctl_disable_tbl[] = {
+	RTSX_REG_PAIR(CARD_PULL_CTL1, 0x66),
+	RTSX_REG_PAIR(CARD_PULL_CTL2, 0x55),
+	RTSX_REG_PAIR(CARD_PULL_CTL3, 0xD5),
+	RTSX_REG_PAIR(CARD_PULL_CTL4, 0x55),
+	0,
+};
+
+/* MS Pull Control Enable:
+ *     MS CD       ==> pull up
+ *     others      ==> pull down
+ */
+static const u32 rts5260_ms_pull_ctl_enable_tbl[] = {
+	RTSX_REG_PAIR(CARD_PULL_CTL4, 0x55),
+	RTSX_REG_PAIR(CARD_PULL_CTL5, 0x55),
+	RTSX_REG_PAIR(CARD_PULL_CTL6, 0x15),
+	0,
+};
+
+/* MS Pull Control Disable:
+ *     MS CD       ==> pull up
+ *     others      ==> pull down
+ */
+static const u32 rts5260_ms_pull_ctl_disable_tbl[] = {
+	RTSX_REG_PAIR(CARD_PULL_CTL4, 0x55),
+	RTSX_REG_PAIR(CARD_PULL_CTL5, 0x55),
+	RTSX_REG_PAIR(CARD_PULL_CTL6, 0x15),
+	0,
+};
+
+static int sd_set_sample_push_timing_sd30(struct rtsx_pcr *pcr)
+{
+	rtsx_pci_write_register(pcr, SD_CFG1, SD_MODE_SELECT_MASK
+		| SD_ASYNC_FIFO_NOT_RST, SD_30_MODE | SD_ASYNC_FIFO_NOT_RST);
+	rtsx_pci_write_register(pcr, CLK_CTL, CLK_LOW_FREQ, CLK_LOW_FREQ);
+	rtsx_pci_write_register(pcr, CARD_CLK_SOURCE, 0xFF,
+				CRC_VAR_CLK0 | SD30_FIX_CLK | SAMPLE_VAR_CLK1);
+	rtsx_pci_write_register(pcr, CLK_CTL, CLK_LOW_FREQ, 0);
+
+	return 0;
+}
+
+static int rts5260_card_power_on(struct rtsx_pcr *pcr, int card)
+{
+	int err = 0;
+	struct rtsx_cr_option *option = &pcr->option;
+
+	if (option->ocp_en)
+		rtsx_pci_enable_ocp(pcr);
+
+	rtsx_pci_init_cmd(pcr);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, LDO_CONFIG2,
+			 DV331812_VDD1, DV331812_VDD1);
+	err = rtsx_pci_send_cmd(pcr, CMD_TIMEOUT_DEF);
+	if (err < 0)
+		return err;
+
+	rtsx_pci_init_cmd(pcr);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, LDO_VCC_CFG0,
+			 RTS5260_DVCC_TUNE_MASK, RTS5260_DVCC_33);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, LDO_VCC_CFG1,
+			 LDO_POW_SDVDD1_MASK, LDO_POW_SDVDD1_ON);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, LDO_CONFIG2,
+			 DV331812_POWERON, DV331812_POWERON);
+	err = rtsx_pci_send_cmd(pcr, CMD_TIMEOUT_DEF);
+
+	msleep(20);
+
+	if (pcr->extra_caps & EXTRA_CAPS_SD_SDR50 ||
+	    pcr->extra_caps & EXTRA_CAPS_SD_SDR104)
+		sd_set_sample_push_timing_sd30(pcr);
+
+	/* Initialize SD_CFG1 register */
+	rtsx_pci_write_register(pcr, SD_CFG1, 0xFF,
+				SD_CLK_DIVIDE_128 | SD_20_MODE);
+
+	rtsx_pci_write_register(pcr, SD_SAMPLE_POINT_CTL,
+				0xFF, SD20_RX_POS_EDGE);
+	rtsx_pci_write_register(pcr, SD_PUSH_POINT_CTL, 0xFF, 0);
+	rtsx_pci_write_register(pcr, CARD_STOP, SD_STOP | SD_CLR_ERR,
+				SD_STOP | SD_CLR_ERR);
+
+	/* Reset SD_CFG3 register */
+	rtsx_pci_write_register(pcr, SD_CFG3, SD30_CLK_END_EN, 0);
+	rtsx_pci_write_register(pcr, REG_SD_STOP_SDCLK_CFG,
+				SD30_CLK_STOP_CFG_EN | SD30_CLK_STOP_CFG1 |
+				SD30_CLK_STOP_CFG0, 0);
+
+	rtsx_pci_write_register(pcr, REG_PRE_RW_MODE, EN_INFINITE_MODE, 0);
+
+	return err;
+}
+
+static int rts5260_switch_output_voltage(struct rtsx_pcr *pcr, u8 voltage)
+{
+	switch (voltage) {
+	case OUTPUT_3V3:
+		rtsx_pci_write_register(pcr, LDO_CONFIG2,
+					DV331812_VDD1, DV331812_VDD1);
+		rtsx_pci_write_register(pcr, LDO_DV18_CFG,
+					DV331812_MASK, DV331812_33);
+		rtsx_pci_write_register(pcr, SD_PAD_CTL, SD_IO_USING_1V8, 0);
+		break;
+	case OUTPUT_1V8:
+		rtsx_pci_write_register(pcr, LDO_CONFIG2,
+					DV331812_VDD1, DV331812_VDD1);
+		rtsx_pci_write_register(pcr, LDO_DV18_CFG,
+					DV331812_MASK, DV331812_17);
+		rtsx_pci_write_register(pcr, SD_PAD_CTL, SD_IO_USING_1V8,
+					SD_IO_USING_1V8);
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* set pad drive */
+	rtsx_pci_init_cmd(pcr);
+	rts5260_fill_driving(pcr, voltage);
+	return rtsx_pci_send_cmd(pcr, CMD_TIMEOUT_DEF);
+}
+
+static void rts5260_stop_cmd(struct rtsx_pcr *pcr)
+{
+	rtsx_pci_writel(pcr, RTSX_HCBCTLR, STOP_CMD);
+	rtsx_pci_writel(pcr, RTSX_HDBCTLR, STOP_DMA);
+	rtsx_pci_write_register(pcr, RTS5260_DMA_RST_CTL_0,
+				RTS5260_DMA_RST | RTS5260_ADMA3_RST,
+				RTS5260_DMA_RST | RTS5260_ADMA3_RST);
+	rtsx_pci_write_register(pcr, RBCTL, RB_FLUSH, RB_FLUSH);
+}
+
+static void rts5260_card_before_power_off(struct rtsx_pcr *pcr)
+{
+	struct rtsx_cr_option *option = &pcr->option;
+
+	rts5260_stop_cmd(pcr);
+	rts5260_switch_output_voltage(pcr, OUTPUT_3V3);
+
+	if (option->ocp_en)
+		rtsx_pci_disable_ocp(pcr);
+}
+
+static int rts5260_card_power_off(struct rtsx_pcr *pcr, int card)
+{
+	int err = 0;
+
+	rts5260_card_before_power_off(pcr);
+
+	rtsx_pci_init_cmd(pcr);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, LDO_VCC_CFG1,
+			 LDO_POW_SDVDD1_MASK, LDO_POW_SDVDD1_OFF);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, LDO_CONFIG2,
+			 DV331812_POWERON, DV331812_POWEROFF);
+	err = rtsx_pci_send_cmd(pcr, CMD_TIMEOUT_DEF);
+
+	return err;
+}
+
+static void rts5260_init_ocp(struct rtsx_pcr *pcr)
+{
+	struct rtsx_cr_option *option = &pcr->option;
+
+	if (option->ocp_en) {
+		u8 mask, val;
+
+		rtsx_pci_write_register(pcr, RTS5260_DVCC_CTRL,
+					RTS5260_DVCC_OCP_EN |
+					RTS5260_DVCC_OCP_CL_EN,
+					RTS5260_DVCC_OCP_EN |
+					RTS5260_DVCC_OCP_CL_EN);
+		rtsx_pci_write_register(pcr, RTS5260_DVIO_CTRL,
+					RTS5260_DVIO_OCP_EN |
+					RTS5260_DVIO_OCP_CL_EN,
+					RTS5260_DVIO_OCP_EN |
+					RTS5260_DVIO_OCP_CL_EN);
+
+		rtsx_pci_write_register(pcr, RTS5260_DVCC_CTRL,
+					RTS5260_DVCC_OCP_THD_MASK,
+					option->sd_400mA_ocp_thd);
+
+		rtsx_pci_write_register(pcr, RTS5260_DVIO_CTRL,
+					RTS5260_DVIO_OCP_THD_MASK,
+					RTS5260_DVIO_OCP_THD_350);
+
+		rtsx_pci_write_register(pcr, RTS5260_DV331812_CFG,
+					RTS5260_DV331812_OCP_THD_MASK,
+					RTS5260_DV331812_OCP_THD_210);
+
+		mask = SD_OCP_GLITCH_MASK | SDVIO_OCP_GLITCH_MASK;
+		val = pcr->hw_param.ocp_glitch;
+		rtsx_pci_write_register(pcr, REG_OCPGLITCH, mask, val);
+
+		rtsx_pci_enable_ocp(pcr);
+	} else {
+		rtsx_pci_write_register(pcr, RTS5260_DVCC_CTRL,
+					RTS5260_DVCC_OCP_EN |
+					RTS5260_DVCC_OCP_CL_EN, 0);
+		rtsx_pci_write_register(pcr, RTS5260_DVIO_CTRL,
+					RTS5260_DVIO_OCP_EN |
+					RTS5260_DVIO_OCP_CL_EN, 0);
+	}
+}
+
+static void rts5260_enable_ocp(struct rtsx_pcr *pcr)
+{
+	u8 val = 0;
+
+	rtsx_pci_write_register(pcr, FPDCTL, OC_POWER_DOWN, 0);
+
+	val = SD_OCP_INT_EN | SD_DETECT_EN;
+	val |= SDVIO_OCP_INT_EN | SDVIO_DETECT_EN;
+	rtsx_pci_write_register(pcr, REG_OCPCTL, 0xFF, val);
+	rtsx_pci_write_register(pcr, REG_DV3318_OCPCTL,
+				DV3318_DETECT_EN | DV3318_OCP_INT_EN,
+				DV3318_DETECT_EN | DV3318_OCP_INT_EN);
+}
+
+static void rts5260_disable_ocp(struct rtsx_pcr *pcr)
+{
+	u8 mask = 0;
+
+	mask = SD_OCP_INT_EN | SD_DETECT_EN;
+	mask |= SDVIO_OCP_INT_EN | SDVIO_DETECT_EN;
+	rtsx_pci_write_register(pcr, REG_OCPCTL, mask, 0);
+	rtsx_pci_write_register(pcr, REG_DV3318_OCPCTL,
+				DV3318_DETECT_EN | DV3318_OCP_INT_EN, 0);
+
+	rtsx_pci_write_register(pcr, FPDCTL, OC_POWER_DOWN,
+				OC_POWER_DOWN);
+}
+
+int rts5260_get_ocpstat(struct rtsx_pcr *pcr, u8 *val)
+{
+	return rtsx_pci_read_register(pcr, REG_OCPSTAT, val);
+}
+
+int rts5260_get_ocpstat2(struct rtsx_pcr *pcr, u8 *val)
+{
+	return rtsx_pci_read_register(pcr, REG_DV3318_OCPSTAT, val);
+}
+
+void rts5260_clear_ocpstat(struct rtsx_pcr *pcr)
+{
+	u8 mask = 0;
+	u8 val = 0;
+
+	mask = SD_OCP_INT_CLR | SD_OC_CLR;
+	mask |= SDVIO_OCP_INT_CLR | SDVIO_OC_CLR;
+	val = SD_OCP_INT_CLR | SD_OC_CLR;
+	val |= SDVIO_OCP_INT_CLR | SDVIO_OC_CLR;
+
+	rtsx_pci_write_register(pcr, REG_OCPCTL, mask, val);
+	rtsx_pci_write_register(pcr, REG_DV3318_OCPCTL,
+				DV3318_OCP_INT_CLR | DV3318_OCP_CLR,
+				DV3318_OCP_INT_CLR | DV3318_OCP_CLR);
+	udelay(10);
+	rtsx_pci_write_register(pcr, REG_OCPCTL, mask, 0);
+	rtsx_pci_write_register(pcr, REG_DV3318_OCPCTL,
+				DV3318_OCP_INT_CLR | DV3318_OCP_CLR, 0);
+}
+
+void rts5260_process_ocp(struct rtsx_pcr *pcr)
+{
+	if (!pcr->option.ocp_en)
+		return;
+
+	rtsx_pci_get_ocpstat(pcr, &pcr->ocp_stat);
+	rts5260_get_ocpstat2(pcr, &pcr->ocp_stat2);
+	if (pcr->card_exist & SD_EXIST)
+		rtsx_sd_power_off_card3v3(pcr);
+	else if (pcr->card_exist & MS_EXIST)
+		rtsx_ms_power_off_card3v3(pcr);
+
+	if (!(pcr->card_exist & MS_EXIST) && !(pcr->card_exist & SD_EXIST)) {
+		if ((pcr->ocp_stat & (SD_OC_NOW | SD_OC_EVER |
+			SDVIO_OC_NOW | SDVIO_OC_EVER)) ||
+			(pcr->ocp_stat2 & (DV3318_OCP_NOW | DV3318_OCP_EVER)))
+			rtsx_pci_clear_ocpstat(pcr);
+		pcr->ocp_stat = 0;
+		pcr->ocp_stat2 = 0;
+	}
+
+	if ((pcr->ocp_stat & (SD_OC_NOW | SD_OC_EVER |
+			SDVIO_OC_NOW | SDVIO_OC_EVER)) ||
+			(pcr->ocp_stat2 & (DV3318_OCP_NOW | DV3318_OCP_EVER))) {
+		if (pcr->card_exist & SD_EXIST)
+			rtsx_pci_write_register(pcr, CARD_OE, SD_OUTPUT_EN, 0);
+		else if (pcr->card_exist & MS_EXIST)
+			rtsx_pci_write_register(pcr, CARD_OE, MS_OUTPUT_EN, 0);
+	}
+}
+
+int rts5260_init_hw(struct rtsx_pcr *pcr)
+{
+	int err;
+
+	rtsx_pci_init_ocp(pcr);
+
+	rtsx_pci_init_cmd(pcr);
+
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, L1SUB_CONFIG1,
+			 AUX_CLK_ACTIVE_SEL_MASK, MAC_CKSW_DONE);
+	/* Rest L1SUB Config */
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, L1SUB_CONFIG3, 0xFF, 0x00);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, PM_CLK_FORCE_CTL,
+			 CLK_PM_EN, CLK_PM_EN);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, PWD_SUSPEND_EN, 0xFF, 0xFF);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, PWR_GATE_CTRL,
+			 PWR_GATE_EN, PWR_GATE_EN);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, REG_VREF,
+			 PWD_SUSPND_EN, PWD_SUSPND_EN);
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, RBCTL,
+			 U_AUTO_DMA_EN_MASK, U_AUTO_DMA_DISABLE);
+
+	if (pcr->flags & PCR_REVERSE_SOCKET)
+		rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, PETXCFG, 0xB0, 0xB0);
+	else
+		rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, PETXCFG, 0xB0, 0x80);
+
+	rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, OBFF_CFG,
+			 OBFF_EN_MASK, OBFF_DISABLE);
+
+	err = rtsx_pci_send_cmd(pcr, CMD_TIMEOUT_DEF);
+	if (err < 0)
+		return err;
+
+	return 0;
+}
+
+static void rts5260_pwr_saving_setting(struct rtsx_pcr *pcr)
+{
+	int lss_l1_1, lss_l1_2;
+
+	lss_l1_1 = rtsx_check_dev_flag(pcr, ASPM_L1_1_EN)
+			| rtsx_check_dev_flag(pcr, PM_L1_1_EN);
+	lss_l1_2 = rtsx_check_dev_flag(pcr, ASPM_L1_2_EN)
+			| rtsx_check_dev_flag(pcr, PM_L1_2_EN);
+
+	if (lss_l1_2) {
+		pcr_dbg(pcr, "Set parameters for L1.2.");
+		rtsx_pci_write_register(pcr, PWR_GLOBAL_CTRL,
+					0xFF, PCIE_L1_2_EN);
+		rtsx_pci_write_register(pcr, PWR_FE_CTL,
+					0xFF, PCIE_L1_2_PD_FE_EN);
+	} else if (lss_l1_1) {
+		pcr_dbg(pcr, "Set parameters for L1.1.");
+		rtsx_pci_write_register(pcr, PWR_GLOBAL_CTRL,
+					0xFF, PCIE_L1_1_EN);
+		rtsx_pci_write_register(pcr, PWR_FE_CTL,
+					0xFF, PCIE_L1_1_PD_FE_EN);
+	} else {
+		pcr_dbg(pcr, "Set parameters for L1.");
+		rtsx_pci_write_register(pcr, PWR_GLOBAL_CTRL,
+					0xFF, PCIE_L1_0_EN);
+		rtsx_pci_write_register(pcr, PWR_FE_CTL,
+					0xFF, PCIE_L1_0_PD_FE_EN);
+	}
+
+	rtsx_pci_write_register(pcr, CFG_L1_0_PCIE_DPHY_RET_VALUE,
+				0xFF, CFG_L1_0_RET_VALUE_DEFAULT);
+	rtsx_pci_write_register(pcr, CFG_L1_0_PCIE_MAC_RET_VALUE,
+				0xFF, CFG_L1_0_RET_VALUE_DEFAULT);
+	rtsx_pci_write_register(pcr, CFG_L1_0_CRC_SD30_RET_VALUE,
+				0xFF, CFG_L1_0_RET_VALUE_DEFAULT);
+	rtsx_pci_write_register(pcr, CFG_L1_0_CRC_SD40_RET_VALUE,
+				0xFF, CFG_L1_0_RET_VALUE_DEFAULT);
+	rtsx_pci_write_register(pcr, CFG_L1_0_SYS_RET_VALUE,
+				0xFF, CFG_L1_0_RET_VALUE_DEFAULT);
+	/*Option cut APHY*/
+	rtsx_pci_write_register(pcr, CFG_PCIE_APHY_OFF_0,
+				0xFF, CFG_PCIE_APHY_OFF_0_DEFAULT);
+	rtsx_pci_write_register(pcr, CFG_PCIE_APHY_OFF_1,
+				0xFF, CFG_PCIE_APHY_OFF_1_DEFAULT);
+	rtsx_pci_write_register(pcr, CFG_PCIE_APHY_OFF_2,
+				0xFF, CFG_PCIE_APHY_OFF_2_DEFAULT);
+	rtsx_pci_write_register(pcr, CFG_PCIE_APHY_OFF_3,
+				0xFF, CFG_PCIE_APHY_OFF_3_DEFAULT);
+	/*CDR DEC*/
+	rtsx_pci_write_register(pcr, PWC_CDR, 0xFF, PWC_CDR_DEFAULT);
+	/*PWMPFM*/
+	rtsx_pci_write_register(pcr, CFG_LP_FPWM_VALUE,
+				0xFF, CFG_LP_FPWM_VALUE_DEFAULT);
+	/*No Power Saving WA*/
+	rtsx_pci_write_register(pcr, CFG_L1_0_CRC_MISC_RET_VALUE,
+				0xFF, CFG_L1_0_CRC_MISC_RET_VALUE_DEFAULT);
+}
+
+static void rts5260_init_from_cfg(struct rtsx_pcr *pcr)
+{
+	struct rtsx_cr_option *option = &pcr->option;
+	u32 lval;
+
+	rtsx_pci_read_config_dword(pcr, PCR_ASPM_SETTING_5260, &lval);
+
+	if (lval & ASPM_L1_1_EN_MASK)
+		rtsx_set_dev_flag(pcr, ASPM_L1_1_EN);
+
+	if (lval & ASPM_L1_2_EN_MASK)
+		rtsx_set_dev_flag(pcr, ASPM_L1_2_EN);
+
+	if (lval & PM_L1_1_EN_MASK)
+		rtsx_set_dev_flag(pcr, PM_L1_1_EN);
+
+	if (lval & PM_L1_2_EN_MASK)
+		rtsx_set_dev_flag(pcr, PM_L1_2_EN);
+
+	rts5260_pwr_saving_setting(pcr);
+
+	if (option->ltr_en) {
+		u16 val;
+
+		pcie_capability_read_word(pcr->pci, PCI_EXP_DEVCTL2, &val);
+		if (val & PCI_EXP_DEVCTL2_LTR_EN) {
+			option->ltr_enabled = true;
+			option->ltr_active = true;
+			rtsx_set_ltr_latency(pcr, option->ltr_active_latency);
+		} else {
+			option->ltr_enabled = false;
+		}
+	}
+
+	if (rtsx_check_dev_flag(pcr, ASPM_L1_1_EN | ASPM_L1_2_EN
+				| PM_L1_1_EN | PM_L1_2_EN))
+		option->force_clkreq_0 = false;
+	else
+		option->force_clkreq_0 = true;
+}
+
+static int rts5260_extra_init_hw(struct rtsx_pcr *pcr)
+{
+	struct rtsx_cr_option *option = &pcr->option;
+
+	/* Set mcu_cnt to 7 to ensure data can be sampled properly */
+	rtsx_pci_write_register(pcr, 0xFC03, 0x7F, 0x07);
+	rtsx_pci_write_register(pcr, SSC_DIV_N_0, 0xFF, 0x5D);
+
+	rts5260_init_from_cfg(pcr);
+
+	/* force no MDIO*/
+	rtsx_pci_write_register(pcr, RTS5260_AUTOLOAD_CFG4,
+				0xFF, RTS5260_MIMO_DISABLE);
+	/*Modify SDVCC Tune Default Parameters!*/
+	rtsx_pci_write_register(pcr, LDO_VCC_CFG0,
+				RTS5260_DVCC_TUNE_MASK, RTS5260_DVCC_33);
+
+	rtsx_pci_write_register(pcr, PCLK_CTL, PCLK_MODE_SEL, PCLK_MODE_SEL);
+
+	rts5260_init_hw(pcr);
+
+	/*
+	 * If u_force_clkreq_0 is enabled, CLKREQ# PIN will be forced
+	 * to drive low, and we forcibly request clock.
+	 */
+	if (option->force_clkreq_0)
+		rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, PETXCFG,
+				 FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_LOW);
+	else
+		rtsx_pci_add_cmd(pcr, WRITE_REG_CMD, PETXCFG,
+				 FORCE_CLKREQ_DELINK_MASK, FORCE_CLKREQ_HIGH);
+
+	return 0;
+}
+
+void rts5260_set_aspm(struct rtsx_pcr *pcr, bool enable)
+{
+	struct rtsx_cr_option *option = &pcr->option;
+	u8 val = 0;
+
+	if (pcr->aspm_enabled == enable)
+		return;
+
+	if (option->dev_aspm_mode == DEV_ASPM_DYNAMIC) {
+		if (enable)
+			val = pcr->aspm_en;
+		rtsx_pci_update_cfg_byte(pcr, pcr->pcie_cap + PCI_EXP_LNKCTL,
+					 ASPM_MASK_NEG, val);
+	} else if (option->dev_aspm_mode == DEV_ASPM_BACKDOOR) {
+		u8 mask = FORCE_ASPM_VAL_MASK | FORCE_ASPM_CTL0;
+
+		if (!enable)
+			val = FORCE_ASPM_CTL0;
+		rtsx_pci_write_register(pcr, ASPM_FORCE_CTL, mask, val);
+	}
+
+	pcr->aspm_enabled = enable;
+}
+
+static void rts5260_set_l1off_cfg_sub_d0(struct rtsx_pcr *pcr, int active)
+{
+	struct rtsx_cr_option *option = &pcr->option;
+	u32 interrupt = rtsx_pci_readl(pcr, RTSX_BIPR);
+	int card_exist = (interrupt & SD_EXIST) | (interrupt & MS_EXIST);
+	int aspm_L1_1, aspm_L1_2;
+	u8 val = 0;
+
+	aspm_L1_1 = rtsx_check_dev_flag(pcr, ASPM_L1_1_EN);
+	aspm_L1_2 = rtsx_check_dev_flag(pcr, ASPM_L1_2_EN);
+
+	if (active) {
+		/* run, latency: 60us */
+		if (aspm_L1_1)
+			val = option->ltr_l1off_snooze_sspwrgate;
+	} else {
+		/* l1off, latency: 300us */
+		if (aspm_L1_2)
+			val = option->ltr_l1off_sspwrgate;
+	}
+
+	if (aspm_L1_1 || aspm_L1_2) {
+		if (rtsx_check_dev_flag(pcr,
+					LTR_L1SS_PWR_GATE_CHECK_CARD_EN)) {
+			if (card_exist)
+				val &= ~L1OFF_MBIAS2_EN_5250;
+			else
+				val |= L1OFF_MBIAS2_EN_5250;
+		}
+	}
+	rtsx_set_l1off_sub(pcr, val);
+}
+
+static const struct pcr_ops rts5260_pcr_ops = {
+	.fetch_vendor_settings = rtsx_base_fetch_vendor_settings,
+	.turn_on_led = rts5260_turn_on_led,
+	.turn_off_led = rts5260_turn_off_led,
+	.extra_init_hw = rts5260_extra_init_hw,
+	.enable_auto_blink = rtsx_base_enable_auto_blink,
+	.disable_auto_blink = rtsx_base_disable_auto_blink,
+	.card_power_on = rts5260_card_power_on,
+	.card_power_off = rts5260_card_power_off,
+	.switch_output_voltage = rts5260_switch_output_voltage,
+	.force_power_down = rtsx_base_force_power_down,
+	.stop_cmd = rts5260_stop_cmd,
+	.set_aspm = rts5260_set_aspm,
+	.set_l1off_cfg_sub_d0 = rts5260_set_l1off_cfg_sub_d0,
+	.enable_ocp = rts5260_enable_ocp,
+	.disable_ocp = rts5260_disable_ocp,
+	.init_ocp = rts5260_init_ocp,
+	.process_ocp = rts5260_process_ocp,
+	.get_ocpstat = rts5260_get_ocpstat,
+	.clear_ocpstat = rts5260_clear_ocpstat,
+};
+
+void rts5260_init_params(struct rtsx_pcr *pcr)
+{
+	struct rtsx_cr_option *option = &pcr->option;
+	struct rtsx_hw_param *hw_param = &pcr->hw_param;
+
+	pcr->extra_caps = EXTRA_CAPS_SD_SDR50 | EXTRA_CAPS_SD_SDR104;
+	pcr->num_slots = 2;
+
+	pcr->flags = 0;
+	pcr->card_drive_sel = RTSX_CARD_DRIVE_DEFAULT;
+	pcr->sd30_drive_sel_1v8 = CFG_DRIVER_TYPE_B;
+	pcr->sd30_drive_sel_3v3 = CFG_DRIVER_TYPE_B;
+	pcr->aspm_en = ASPM_L1_EN;
+	pcr->tx_initial_phase = SET_CLOCK_PHASE(1, 29, 16);
+	pcr->rx_initial_phase = SET_CLOCK_PHASE(24, 6, 5);
+
+	pcr->ic_version = rts5260_get_ic_version(pcr);
+	pcr->sd_pull_ctl_enable_tbl = rts5260_sd_pull_ctl_enable_tbl;
+	pcr->sd_pull_ctl_disable_tbl = rts5260_sd_pull_ctl_disable_tbl;
+	pcr->ms_pull_ctl_enable_tbl = rts5260_ms_pull_ctl_enable_tbl;
+	pcr->ms_pull_ctl_disable_tbl = rts5260_ms_pull_ctl_disable_tbl;
+
+	pcr->reg_pm_ctrl3 = RTS524A_PM_CTRL3;
+
+	pcr->ops = &rts5260_pcr_ops;
+
+	option->dev_flags = (LTR_L1SS_PWR_GATE_CHECK_CARD_EN
+				| LTR_L1SS_PWR_GATE_EN);
+	option->ltr_en = true;
+
+	/* init latency of active, idle, L1OFF to 60us, 300us, 3ms */
+	option->ltr_active_latency = LTR_ACTIVE_LATENCY_DEF;
+	option->ltr_idle_latency = LTR_IDLE_LATENCY_DEF;
+	option->ltr_l1off_latency = LTR_L1OFF_LATENCY_DEF;
+	option->dev_aspm_mode = DEV_ASPM_DYNAMIC;
+	option->l1_snooze_delay = L1_SNOOZE_DELAY_DEF;
+	option->ltr_l1off_sspwrgate = LTR_L1OFF_SSPWRGATE_5250_DEF;
+	option->ltr_l1off_snooze_sspwrgate =
+		LTR_L1OFF_SNOOZE_SSPWRGATE_5250_DEF;
+
+	option->ocp_en = 1;
+	if (option->ocp_en)
+		hw_param->interrupt_en |= SD_OC_INT_EN;
+	hw_param->ocp_glitch = SD_OCP_GLITCH_10M | SDVIO_OCP_GLITCH_800U;
+	option->sd_400mA_ocp_thd = RTS5260_DVCC_OCP_THD_550;
+	option->sd_800mA_ocp_thd = RTS5260_DVCC_OCP_THD_970;
+}
diff --git a/drivers/misc/cardreader/rts5260.h b/drivers/misc/cardreader/rts5260.h
new file mode 100644
index 0000000..53a1411
--- /dev/null
+++ b/drivers/misc/cardreader/rts5260.h
@@ -0,0 +1,45 @@
+#ifndef __RTS5260_H__
+#define __RTS5260_H__
+
+#define RTS5260_DVCC_CTRL		0xFF73
+#define RTS5260_DVCC_OCP_EN		(0x01 << 7)
+#define RTS5260_DVCC_OCP_THD_MASK	(0x07 << 4)
+#define RTS5260_DVCC_POWERON		(0x01 << 3)
+#define RTS5260_DVCC_OCP_CL_EN		(0x01 << 2)
+
+#define RTS5260_DVIO_CTRL		0xFF75
+#define RTS5260_DVIO_OCP_EN		(0x01 << 7)
+#define RTS5260_DVIO_OCP_THD_MASK	(0x07 << 4)
+#define RTS5260_DVIO_POWERON		(0x01 << 3)
+#define RTS5260_DVIO_OCP_CL_EN		(0x01 << 2)
+
+#define RTS5260_DV331812_CFG		0xFF71
+#define RTS5260_DV331812_OCP_EN		(0x01 << 7)
+#define RTS5260_DV331812_OCP_THD_MASK	(0x07 << 4)
+#define RTS5260_DV331812_POWERON	(0x01 << 3)
+#define RTS5260_DV331812_SEL		(0x01 << 2)
+#define RTS5260_DV331812_VDD1		(0x01 << 2)
+#define RTS5260_DV331812_VDD2		(0x00 << 2)
+
+#define RTS5260_DV331812_OCP_THD_120	(0x00 << 4)
+#define RTS5260_DV331812_OCP_THD_140	(0x01 << 4)
+#define RTS5260_DV331812_OCP_THD_160	(0x02 << 4)
+#define RTS5260_DV331812_OCP_THD_180	(0x03 << 4)
+#define RTS5260_DV331812_OCP_THD_210	(0x04 << 4)
+#define RTS5260_DV331812_OCP_THD_240	(0x05 << 4)
+#define RTS5260_DV331812_OCP_THD_270	(0x06 << 4)
+#define RTS5260_DV331812_OCP_THD_300	(0x07 << 4)
+
+#define RTS5260_DVIO_OCP_THD_250	(0x00 << 4)
+#define RTS5260_DVIO_OCP_THD_300	(0x01 << 4)
+#define RTS5260_DVIO_OCP_THD_350	(0x02 << 4)
+#define RTS5260_DVIO_OCP_THD_400	(0x03 << 4)
+#define RTS5260_DVIO_OCP_THD_450	(0x04 << 4)
+#define RTS5260_DVIO_OCP_THD_500	(0x05 << 4)
+#define RTS5260_DVIO_OCP_THD_550	(0x06 << 4)
+#define RTS5260_DVIO_OCP_THD_600	(0x07 << 4)
+
+#define RTS5260_DVCC_OCP_THD_550	(0x00 << 4)
+#define RTS5260_DVCC_OCP_THD_970	(0x05 << 4)
+
+#endif
diff --git a/drivers/mfd/rtsx_pcr.c b/drivers/misc/cardreader/rtsx_pcr.c
similarity index 92%
rename from drivers/mfd/rtsx_pcr.c
rename to drivers/misc/cardreader/rtsx_pcr.c
index 590fb9a..fd09b09 100644
--- a/drivers/mfd/rtsx_pcr.c
+++ b/drivers/misc/cardreader/rtsx_pcr.c
@@ -29,7 +29,7 @@
 #include <linux/idr.h>
 #include <linux/platform_device.h>
 #include <linux/mfd/core.h>
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 #include <linux/mmc/card.h>
 #include <asm/unaligned.h>
 
@@ -62,6 +62,7 @@ static const struct pci_device_id rtsx_pci_ids[] = {
 	{ PCI_DEVICE(0x10EC, 0x5286), PCI_CLASS_OTHERS << 16, 0xFF0000 },
 	{ PCI_DEVICE(0x10EC, 0x524A), PCI_CLASS_OTHERS << 16, 0xFF0000 },
 	{ PCI_DEVICE(0x10EC, 0x525A), PCI_CLASS_OTHERS << 16, 0xFF0000 },
+	{ PCI_DEVICE(0x10EC, 0x5260), PCI_CLASS_OTHERS << 16, 0xFF0000 },
 	{ 0, }
 };
 
@@ -334,6 +335,9 @@ EXPORT_SYMBOL_GPL(rtsx_pci_read_phy_register);
 
 void rtsx_pci_stop_cmd(struct rtsx_pcr *pcr)
 {
+	if (pcr->ops->stop_cmd)
+		return pcr->ops->stop_cmd(pcr);
+
 	rtsx_pci_writel(pcr, RTSX_HCBCTLR, STOP_CMD);
 	rtsx_pci_writel(pcr, RTSX_HDBCTLR, STOP_DMA);
 
@@ -826,7 +830,7 @@ int rtsx_pci_switch_clock(struct rtsx_pcr *pcr, unsigned int card_clock,
 		return err;
 
 	/* Wait SSC clock stable */
-	udelay(10);
+	udelay(SSC_CLOCK_STABLE_WAIT);
 	err = rtsx_pci_write_register(pcr, CLK_CTL, CLK_LOW_FREQ, 0);
 	if (err < 0)
 		return err;
@@ -963,6 +967,20 @@ static void rtsx_pci_card_detect(struct work_struct *work)
 				pcr->slots[RTSX_MS_CARD].p_dev);
 }
 
+void rtsx_pci_process_ocp(struct rtsx_pcr *pcr)
+{
+	if (pcr->ops->process_ocp)
+		pcr->ops->process_ocp(pcr);
+}
+
+int rtsx_pci_process_ocp_interrupt(struct rtsx_pcr *pcr)
+{
+	if (pcr->option.ocp_en)
+		rtsx_pci_process_ocp(pcr);
+
+	return 0;
+}
+
 static irqreturn_t rtsx_pci_isr(int irq, void *dev_id)
 {
 	struct rtsx_pcr *pcr = dev_id;
@@ -987,6 +1005,9 @@ static irqreturn_t rtsx_pci_isr(int irq, void *dev_id)
 
 	int_reg &= (pcr->bier | 0x7FFFFF);
 
+	if (int_reg & SD_OC_INT)
+		rtsx_pci_process_ocp_interrupt(pcr);
+
 	if (int_reg & SD_INT) {
 		if (int_reg & SD_EXIST) {
 			pcr->card_inserted |= SD_EXIST;
@@ -1119,6 +1140,102 @@ static void rtsx_pci_power_off(struct rtsx_pcr *pcr, u8 pm_state)
 }
 #endif
 
+void rtsx_pci_enable_ocp(struct rtsx_pcr *pcr)
+{
+	u8 val = SD_OCP_INT_EN | SD_DETECT_EN;
+
+	if (pcr->ops->enable_ocp)
+		pcr->ops->enable_ocp(pcr);
+	else
+		rtsx_pci_write_register(pcr, REG_OCPCTL, 0xFF, val);
+
+}
+
+void rtsx_pci_disable_ocp(struct rtsx_pcr *pcr)
+{
+	u8 mask = SD_OCP_INT_EN | SD_DETECT_EN;
+
+	if (pcr->ops->disable_ocp)
+		pcr->ops->disable_ocp(pcr);
+	else
+		rtsx_pci_write_register(pcr, REG_OCPCTL, mask, 0);
+}
+
+void rtsx_pci_init_ocp(struct rtsx_pcr *pcr)
+{
+	if (pcr->ops->init_ocp) {
+		pcr->ops->init_ocp(pcr);
+	} else {
+		struct rtsx_cr_option *option = &(pcr->option);
+
+		if (option->ocp_en) {
+			u8 val = option->sd_400mA_ocp_thd;
+
+			rtsx_pci_write_register(pcr, FPDCTL, OC_POWER_DOWN, 0);
+			rtsx_pci_write_register(pcr, REG_OCPPARA1,
+				SD_OCP_TIME_MASK, SD_OCP_TIME_800);
+			rtsx_pci_write_register(pcr, REG_OCPPARA2,
+				SD_OCP_THD_MASK, val);
+			rtsx_pci_write_register(pcr, REG_OCPGLITCH,
+				SD_OCP_GLITCH_MASK, pcr->hw_param.ocp_glitch);
+			rtsx_pci_enable_ocp(pcr);
+		} else {
+			/* OC power down */
+			rtsx_pci_write_register(pcr, FPDCTL, OC_POWER_DOWN,
+				OC_POWER_DOWN);
+		}
+	}
+}
+
+int rtsx_pci_get_ocpstat(struct rtsx_pcr *pcr, u8 *val)
+{
+	if (pcr->ops->get_ocpstat)
+		return pcr->ops->get_ocpstat(pcr, val);
+	else
+		return rtsx_pci_read_register(pcr, REG_OCPSTAT, val);
+}
+
+void rtsx_pci_clear_ocpstat(struct rtsx_pcr *pcr)
+{
+	if (pcr->ops->clear_ocpstat) {
+		pcr->ops->clear_ocpstat(pcr);
+	} else {
+		u8 mask = SD_OCP_INT_CLR | SD_OC_CLR;
+		u8 val = SD_OCP_INT_CLR | SD_OC_CLR;
+
+		rtsx_pci_write_register(pcr, REG_OCPCTL, mask, val);
+		rtsx_pci_write_register(pcr, REG_OCPCTL, mask, 0);
+	}
+}
+
+int rtsx_sd_power_off_card3v3(struct rtsx_pcr *pcr)
+{
+	rtsx_pci_write_register(pcr, CARD_CLK_EN, SD_CLK_EN |
+		MS_CLK_EN | SD40_CLK_EN, 0);
+	rtsx_pci_write_register(pcr, CARD_OE, SD_OUTPUT_EN, 0);
+
+	rtsx_pci_card_power_off(pcr, RTSX_SD_CARD);
+
+	msleep(50);
+
+	rtsx_pci_card_pull_ctl_disable(pcr, RTSX_SD_CARD);
+
+	return 0;
+}
+
+int rtsx_ms_power_off_card3v3(struct rtsx_pcr *pcr)
+{
+	rtsx_pci_write_register(pcr, CARD_CLK_EN, SD_CLK_EN |
+		MS_CLK_EN | SD40_CLK_EN, 0);
+
+	rtsx_pci_card_pull_ctl_disable(pcr, RTSX_MS_CARD);
+
+	rtsx_pci_write_register(pcr, CARD_OE, MS_OUTPUT_EN, 0);
+	rtsx_pci_card_power_off(pcr, RTSX_MS_CARD);
+
+	return 0;
+}
+
 static int rtsx_pci_init_hw(struct rtsx_pcr *pcr)
 {
 	int err;
@@ -1189,6 +1306,7 @@ static int rtsx_pci_init_hw(struct rtsx_pcr *pcr)
 	case PID_5250:
 	case PID_524A:
 	case PID_525A:
+	case PID_5260:
 		rtsx_pci_write_register(pcr, PM_CLK_FORCE_CTL, 1, 1);
 		break;
 	default:
@@ -1265,6 +1383,9 @@ static int rtsx_pci_init_chip(struct rtsx_pcr *pcr)
 	case 0x5286:
 		rtl8402_init_params(pcr);
 		break;
+	case 0x5260:
+		rts5260_init_params(pcr);
+		break;
 	}
 
 	pcr_dbg(pcr, "PID: 0x%04x, IC version: 0x%02x\n",
@@ -1543,6 +1664,9 @@ static void rtsx_pci_shutdown(struct pci_dev *pcidev)
 	rtsx_pci_power_off(pcr, HOST_ENTER_S1);
 
 	pci_disable_device(pcidev);
+	free_irq(pcr->irq, (void *)pcr);
+	if (pcr->msi_en)
+		pci_disable_msi(pcr->pci);
 }
 
 #else /* CONFIG_PM */
diff --git a/drivers/mfd/rtsx_pcr.h b/drivers/misc/cardreader/rtsx_pcr.h
similarity index 88%
rename from drivers/mfd/rtsx_pcr.h
rename to drivers/misc/cardreader/rtsx_pcr.h
index ec784e0..6ea1655 100644
--- a/drivers/mfd/rtsx_pcr.h
+++ b/drivers/misc/cardreader/rtsx_pcr.h
@@ -22,7 +22,7 @@
 #ifndef __RTSX_PCR_H
 #define __RTSX_PCR_H
 
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 
 #define MIN_DIV_N_PCR		80
 #define MAX_DIV_N_PCR		208
@@ -44,6 +44,8 @@
 #define ASPM_MASK_NEG		0xFC
 #define MASK_8_BIT_DEF		0xFF
 
+#define SSC_CLOCK_STABLE_WAIT	130
+
 int __rtsx_pci_write_phy_register(struct rtsx_pcr *pcr, u8 addr, u16 val);
 int __rtsx_pci_read_phy_register(struct rtsx_pcr *pcr, u8 addr, u16 *val);
 
@@ -57,6 +59,7 @@ void rts5249_init_params(struct rtsx_pcr *pcr);
 void rts524a_init_params(struct rtsx_pcr *pcr);
 void rts525a_init_params(struct rtsx_pcr *pcr);
 void rtl8411b_init_params(struct rtsx_pcr *pcr);
+void rts5260_init_params(struct rtsx_pcr *pcr);
 
 static inline u8 map_sd_drive(int idx)
 {
@@ -99,5 +102,12 @@ do {									\
 int rtsx_gops_pm_reset(struct rtsx_pcr *pcr);
 int rtsx_set_ltr_latency(struct rtsx_pcr *pcr, u32 latency);
 int rtsx_set_l1off_sub(struct rtsx_pcr *pcr, u8 val);
+void rtsx_pci_init_ocp(struct rtsx_pcr *pcr);
+void rtsx_pci_disable_ocp(struct rtsx_pcr *pcr);
+void rtsx_pci_enable_ocp(struct rtsx_pcr *pcr);
+int rtsx_pci_get_ocpstat(struct rtsx_pcr *pcr, u8 *val);
+void rtsx_pci_clear_ocpstat(struct rtsx_pcr *pcr);
+int rtsx_sd_power_off_card3v3(struct rtsx_pcr *pcr);
+int rtsx_ms_power_off_card3v3(struct rtsx_pcr *pcr);
 
 #endif
diff --git a/drivers/mfd/rtsx_usb.c b/drivers/misc/cardreader/rtsx_usb.c
similarity index 99%
rename from drivers/mfd/rtsx_usb.c
rename to drivers/misc/cardreader/rtsx_usb.c
index 59d61b0..b97903f 100644
--- a/drivers/mfd/rtsx_usb.c
+++ b/drivers/misc/cardreader/rtsx_usb.c
@@ -23,7 +23,7 @@
 #include <linux/usb.h>
 #include <linux/platform_device.h>
 #include <linux/mfd/core.h>
-#include <linux/mfd/rtsx_usb.h>
+#include <linux/rtsx_usb.h>
 
 static int polling_pipe = 1;
 module_param(polling_pipe, int, S_IRUGO | S_IWUSR);
diff --git a/drivers/mmc/core/block.c b/drivers/mmc/core/block.c
index ccfa98a..20135a5 100644
--- a/drivers/mmc/core/block.c
+++ b/drivers/mmc/core/block.c
@@ -63,7 +63,13 @@ MODULE_ALIAS("mmc:block");
 #endif
 #define MODULE_PARAM_PREFIX "mmcblk."
 
-#define MMC_BLK_TIMEOUT_MS  (10 * 60 * 1000)        /* 10 minute timeout */
+/*
+ * Set a 10 second timeout for polling write request busy state. Note, mmc core
+ * is setting a 3 second timeout for SD cards, and SDHCI has long had a 10
+ * second software timer to timeout the whole request, so 10 seconds should be
+ * ample.
+ */
+#define MMC_BLK_TIMEOUT_MS  (10 * 1000)
 #define MMC_SANITIZE_REQ_TIMEOUT 240000
 #define MMC_EXTRACT_INDEX_FROM_ARG(x) ((x & 0x00FF0000) >> 16)
 
@@ -112,6 +118,7 @@ struct mmc_blk_data {
 #define MMC_BLK_WRITE		BIT(1)
 #define MMC_BLK_DISCARD		BIT(2)
 #define MMC_BLK_SECDISCARD	BIT(3)
+#define MMC_BLK_CQE_RECOVERY	BIT(4)
 
 	/*
 	 * Only set in main mmc_blk_data associated
@@ -189,7 +196,7 @@ static void mmc_blk_put(struct mmc_blk_data *md)
 	md->usage--;
 	if (md->usage == 0) {
 		int devidx = mmc_get_devidx(md->disk);
-		blk_cleanup_queue(md->queue.queue);
+		blk_put_queue(md->queue.queue);
 		ida_simple_remove(&mmc_blk_ida, devidx);
 		put_disk(md->disk);
 		kfree(md);
@@ -921,14 +928,54 @@ static int mmc_sd_num_wr_blocks(struct mmc_card *card, u32 *written_blocks)
 	return 0;
 }
 
+static unsigned int mmc_blk_clock_khz(struct mmc_host *host)
+{
+	if (host->actual_clock)
+		return host->actual_clock / 1000;
+
+	/* Clock may be subject to a divisor, fudge it by a factor of 2. */
+	if (host->ios.clock)
+		return host->ios.clock / 2000;
+
+	/* How can there be no clock */
+	WARN_ON_ONCE(1);
+	return 100; /* 100 kHz is minimum possible value */
+}
+
+static unsigned int mmc_blk_data_timeout_ms(struct mmc_host *host,
+					    struct mmc_data *data)
+{
+	unsigned int ms = DIV_ROUND_UP(data->timeout_ns, 1000000);
+	unsigned int khz;
+
+	if (data->timeout_clks) {
+		khz = mmc_blk_clock_khz(host);
+		ms += DIV_ROUND_UP(data->timeout_clks, khz);
+	}
+
+	return ms;
+}
+
+static inline bool mmc_blk_in_tran_state(u32 status)
+{
+	/*
+	 * Some cards mishandle the status bits, so make sure to check both the
+	 * busy indication and the card state.
+	 */
+	return status & R1_READY_FOR_DATA &&
+	       (R1_CURRENT_STATE(status) == R1_STATE_TRAN);
+}
+
 static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
-		bool hw_busy_detect, struct request *req, bool *gen_err)
+			    struct request *req, u32 *resp_errs)
 {
 	unsigned long timeout = jiffies + msecs_to_jiffies(timeout_ms);
 	int err = 0;
 	u32 status;
 
 	do {
+		bool done = time_after(jiffies, timeout);
+
 		err = __mmc_send_status(card, &status, 5);
 		if (err) {
 			pr_err("%s: error %d requesting status\n",
@@ -936,25 +983,18 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 			return err;
 		}
 
-		if (status & R1_ERROR) {
-			pr_err("%s: %s: error sending status cmd, status %#x\n",
-				req->rq_disk->disk_name, __func__, status);
-			*gen_err = true;
-		}
-
-		/* We may rely on the host hw to handle busy detection.*/
-		if ((card->host->caps & MMC_CAP_WAIT_WHILE_BUSY) &&
-			hw_busy_detect)
-			break;
+		/* Accumulate any response error bits seen */
+		if (resp_errs)
+			*resp_errs |= status;
 
 		/*
 		 * Timeout if the device never becomes ready for data and never
 		 * leaves the program state.
 		 */
-		if (time_after(jiffies, timeout)) {
-			pr_err("%s: Card stuck in programming state! %s %s\n",
+		if (done) {
+			pr_err("%s: Card stuck in wrong state! %s %s status: %#x\n",
 				mmc_hostname(card->host),
-				req->rq_disk->disk_name, __func__);
+				req->rq_disk->disk_name, __func__, status);
 			return -ETIMEDOUT;
 		}
 
@@ -963,229 +1003,11 @@ static int card_busy_detect(struct mmc_card *card, unsigned int timeout_ms,
 		 * so make sure to check both the busy
 		 * indication and the card state.
 		 */
-	} while (!(status & R1_READY_FOR_DATA) ||
-		 (R1_CURRENT_STATE(status) == R1_STATE_PRG));
+	} while (!mmc_blk_in_tran_state(status));
 
 	return err;
 }
 
-static int send_stop(struct mmc_card *card, unsigned int timeout_ms,
-		struct request *req, bool *gen_err, u32 *stop_status)
-{
-	struct mmc_host *host = card->host;
-	struct mmc_command cmd = {};
-	int err;
-	bool use_r1b_resp = rq_data_dir(req) == WRITE;
-
-	/*
-	 * Normally we use R1B responses for WRITE, but in cases where the host
-	 * has specified a max_busy_timeout we need to validate it. A failure
-	 * means we need to prevent the host from doing hw busy detection, which
-	 * is done by converting to a R1 response instead.
-	 */
-	if (host->max_busy_timeout && (timeout_ms > host->max_busy_timeout))
-		use_r1b_resp = false;
-
-	cmd.opcode = MMC_STOP_TRANSMISSION;
-	if (use_r1b_resp) {
-		cmd.flags = MMC_RSP_SPI_R1B | MMC_RSP_R1B | MMC_CMD_AC;
-		cmd.busy_timeout = timeout_ms;
-	} else {
-		cmd.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_AC;
-	}
-
-	err = mmc_wait_for_cmd(host, &cmd, 5);
-	if (err)
-		return err;
-
-	*stop_status = cmd.resp[0];
-
-	/* No need to check card status in case of READ. */
-	if (rq_data_dir(req) == READ)
-		return 0;
-
-	if (!mmc_host_is_spi(host) &&
-		(*stop_status & R1_ERROR)) {
-		pr_err("%s: %s: general error sending stop command, resp %#x\n",
-			req->rq_disk->disk_name, __func__, *stop_status);
-		*gen_err = true;
-	}
-
-	return card_busy_detect(card, timeout_ms, use_r1b_resp, req, gen_err);
-}
-
-#define ERR_NOMEDIUM	3
-#define ERR_RETRY	2
-#define ERR_ABORT	1
-#define ERR_CONTINUE	0
-
-static int mmc_blk_cmd_error(struct request *req, const char *name, int error,
-	bool status_valid, u32 status)
-{
-	switch (error) {
-	case -EILSEQ:
-		/* response crc error, retry the r/w cmd */
-		pr_err("%s: %s sending %s command, card status %#x\n",
-			req->rq_disk->disk_name, "response CRC error",
-			name, status);
-		return ERR_RETRY;
-
-	case -ETIMEDOUT:
-		pr_err("%s: %s sending %s command, card status %#x\n",
-			req->rq_disk->disk_name, "timed out", name, status);
-
-		/* If the status cmd initially failed, retry the r/w cmd */
-		if (!status_valid) {
-			pr_err("%s: status not valid, retrying timeout\n",
-				req->rq_disk->disk_name);
-			return ERR_RETRY;
-		}
-
-		/*
-		 * If it was a r/w cmd crc error, or illegal command
-		 * (eg, issued in wrong state) then retry - we should
-		 * have corrected the state problem above.
-		 */
-		if (status & (R1_COM_CRC_ERROR | R1_ILLEGAL_COMMAND)) {
-			pr_err("%s: command error, retrying timeout\n",
-				req->rq_disk->disk_name);
-			return ERR_RETRY;
-		}
-
-		/* Otherwise abort the command */
-		return ERR_ABORT;
-
-	default:
-		/* We don't understand the error code the driver gave us */
-		pr_err("%s: unknown error %d sending read/write command, card status %#x\n",
-		       req->rq_disk->disk_name, error, status);
-		return ERR_ABORT;
-	}
-}
-
-/*
- * Initial r/w and stop cmd error recovery.
- * We don't know whether the card received the r/w cmd or not, so try to
- * restore things back to a sane state.  Essentially, we do this as follows:
- * - Obtain card status.  If the first attempt to obtain card status fails,
- *   the status word will reflect the failed status cmd, not the failed
- *   r/w cmd.  If we fail to obtain card status, it suggests we can no
- *   longer communicate with the card.
- * - Check the card state.  If the card received the cmd but there was a
- *   transient problem with the response, it might still be in a data transfer
- *   mode.  Try to send it a stop command.  If this fails, we can't recover.
- * - If the r/w cmd failed due to a response CRC error, it was probably
- *   transient, so retry the cmd.
- * - If the r/w cmd timed out, but we didn't get the r/w cmd status, retry.
- * - If the r/w cmd timed out, and the r/w cmd failed due to CRC error or
- *   illegal cmd, retry.
- * Otherwise we don't understand what happened, so abort.
- */
-static int mmc_blk_cmd_recovery(struct mmc_card *card, struct request *req,
-	struct mmc_blk_request *brq, bool *ecc_err, bool *gen_err)
-{
-	bool prev_cmd_status_valid = true;
-	u32 status, stop_status = 0;
-	int err, retry;
-
-	if (mmc_card_removed(card))
-		return ERR_NOMEDIUM;
-
-	/*
-	 * Try to get card status which indicates both the card state
-	 * and why there was no response.  If the first attempt fails,
-	 * we can't be sure the returned status is for the r/w command.
-	 */
-	for (retry = 2; retry >= 0; retry--) {
-		err = __mmc_send_status(card, &status, 0);
-		if (!err)
-			break;
-
-		/* Re-tune if needed */
-		mmc_retune_recheck(card->host);
-
-		prev_cmd_status_valid = false;
-		pr_err("%s: error %d sending status command, %sing\n",
-		       req->rq_disk->disk_name, err, retry ? "retry" : "abort");
-	}
-
-	/* We couldn't get a response from the card.  Give up. */
-	if (err) {
-		/* Check if the card is removed */
-		if (mmc_detect_card_removed(card->host))
-			return ERR_NOMEDIUM;
-		return ERR_ABORT;
-	}
-
-	/* Flag ECC errors */
-	if ((status & R1_CARD_ECC_FAILED) ||
-	    (brq->stop.resp[0] & R1_CARD_ECC_FAILED) ||
-	    (brq->cmd.resp[0] & R1_CARD_ECC_FAILED))
-		*ecc_err = true;
-
-	/* Flag General errors */
-	if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ)
-		if ((status & R1_ERROR) ||
-			(brq->stop.resp[0] & R1_ERROR)) {
-			pr_err("%s: %s: general error sending stop or status command, stop cmd response %#x, card status %#x\n",
-			       req->rq_disk->disk_name, __func__,
-			       brq->stop.resp[0], status);
-			*gen_err = true;
-		}
-
-	/*
-	 * Check the current card state.  If it is in some data transfer
-	 * mode, tell it to stop (and hopefully transition back to TRAN.)
-	 */
-	if (R1_CURRENT_STATE(status) == R1_STATE_DATA ||
-	    R1_CURRENT_STATE(status) == R1_STATE_RCV) {
-		err = send_stop(card,
-			DIV_ROUND_UP(brq->data.timeout_ns, 1000000),
-			req, gen_err, &stop_status);
-		if (err) {
-			pr_err("%s: error %d sending stop command\n",
-			       req->rq_disk->disk_name, err);
-			/*
-			 * If the stop cmd also timed out, the card is probably
-			 * not present, so abort. Other errors are bad news too.
-			 */
-			return ERR_ABORT;
-		}
-
-		if (stop_status & R1_CARD_ECC_FAILED)
-			*ecc_err = true;
-	}
-
-	/* Check for set block count errors */
-	if (brq->sbc.error)
-		return mmc_blk_cmd_error(req, "SET_BLOCK_COUNT", brq->sbc.error,
-				prev_cmd_status_valid, status);
-
-	/* Check for r/w command errors */
-	if (brq->cmd.error)
-		return mmc_blk_cmd_error(req, "r/w cmd", brq->cmd.error,
-				prev_cmd_status_valid, status);
-
-	/* Data errors */
-	if (!brq->stop.error)
-		return ERR_CONTINUE;
-
-	/* Now for stop errors.  These aren't fatal to the transfer. */
-	pr_info("%s: error %d sending stop command, original cmd response %#x, card status %#x\n",
-	       req->rq_disk->disk_name, brq->stop.error,
-	       brq->cmd.resp[0], status);
-
-	/*
-	 * Subsitute in our own stop status as this will give the error
-	 * state which happened during the execution of the r/w command.
-	 */
-	if (stop_status) {
-		brq->stop.resp[0] = stop_status;
-		brq->stop.error = 0;
-	}
-	return ERR_CONTINUE;
-}
-
 static int mmc_blk_reset(struct mmc_blk_data *md, struct mmc_host *host,
 			 int type)
 {
@@ -1281,7 +1103,7 @@ static void mmc_blk_issue_drv_op(struct mmc_queue *mq, struct request *req)
 		break;
 	}
 	mq_rq->drv_op_result = ret;
-	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	blk_mq_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
@@ -1324,7 +1146,7 @@ static void mmc_blk_issue_discard_rq(struct mmc_queue *mq, struct request *req)
 	else
 		mmc_blk_reset_success(md, type);
 fail:
-	blk_end_request(req, status, blk_rq_bytes(req));
+	blk_mq_end_request(req, status);
 }
 
 static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
@@ -1394,7 +1216,7 @@ static void mmc_blk_issue_secdiscard_rq(struct mmc_queue *mq,
 	if (!err)
 		mmc_blk_reset_success(md, type);
 out:
-	blk_end_request(req, status, blk_rq_bytes(req));
+	blk_mq_end_request(req, status);
 }
 
 static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
@@ -1404,7 +1226,7 @@ static void mmc_blk_issue_flush(struct mmc_queue *mq, struct request *req)
 	int ret = 0;
 
 	ret = mmc_flush_cache(card);
-	blk_end_request_all(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
+	blk_mq_end_request(req, ret ? BLK_STS_IOERR : BLK_STS_OK);
 }
 
 /*
@@ -1430,15 +1252,18 @@ static inline void mmc_apply_rel_rw(struct mmc_blk_request *brq,
 	}
 }
 
-#define CMD_ERRORS							\
-	(R1_OUT_OF_RANGE |	/* Command argument out of range */	\
-	 R1_ADDRESS_ERROR |	/* Misaligned address */		\
+#define CMD_ERRORS_EXCL_OOR						\
+	(R1_ADDRESS_ERROR |	/* Misaligned address */		\
 	 R1_BLOCK_LEN_ERROR |	/* Transferred block length incorrect */\
 	 R1_WP_VIOLATION |	/* Tried to write to protected block */	\
 	 R1_CARD_ECC_FAILED |	/* Card ECC failed */			\
 	 R1_CC_ERROR |		/* Card controller error */		\
 	 R1_ERROR)		/* General/unknown error */
 
+#define CMD_ERRORS							\
+	(CMD_ERRORS_EXCL_OOR |						\
+	 R1_OUT_OF_RANGE)	/* Command argument out of range */	\
+
 static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 {
 	u32 val;
@@ -1481,116 +1306,6 @@ static void mmc_blk_eval_resp_error(struct mmc_blk_request *brq)
 	}
 }
 
-static enum mmc_blk_status mmc_blk_err_check(struct mmc_card *card,
-					     struct mmc_async_req *areq)
-{
-	struct mmc_queue_req *mq_mrq = container_of(areq, struct mmc_queue_req,
-						    areq);
-	struct mmc_blk_request *brq = &mq_mrq->brq;
-	struct request *req = mmc_queue_req_to_req(mq_mrq);
-	int need_retune = card->host->need_retune;
-	bool ecc_err = false;
-	bool gen_err = false;
-
-	/*
-	 * sbc.error indicates a problem with the set block count
-	 * command.  No data will have been transferred.
-	 *
-	 * cmd.error indicates a problem with the r/w command.  No
-	 * data will have been transferred.
-	 *
-	 * stop.error indicates a problem with the stop command.  Data
-	 * may have been transferred, or may still be transferring.
-	 */
-
-	mmc_blk_eval_resp_error(brq);
-
-	if (brq->sbc.error || brq->cmd.error ||
-	    brq->stop.error || brq->data.error) {
-		switch (mmc_blk_cmd_recovery(card, req, brq, &ecc_err, &gen_err)) {
-		case ERR_RETRY:
-			return MMC_BLK_RETRY;
-		case ERR_ABORT:
-			return MMC_BLK_ABORT;
-		case ERR_NOMEDIUM:
-			return MMC_BLK_NOMEDIUM;
-		case ERR_CONTINUE:
-			break;
-		}
-	}
-
-	/*
-	 * Check for errors relating to the execution of the
-	 * initial command - such as address errors.  No data
-	 * has been transferred.
-	 */
-	if (brq->cmd.resp[0] & CMD_ERRORS) {
-		pr_err("%s: r/w command failed, status = %#x\n",
-		       req->rq_disk->disk_name, brq->cmd.resp[0]);
-		return MMC_BLK_ABORT;
-	}
-
-	/*
-	 * Everything else is either success, or a data error of some
-	 * kind.  If it was a write, we may have transitioned to
-	 * program mode, which we have to wait for it to complete.
-	 */
-	if (!mmc_host_is_spi(card->host) && rq_data_dir(req) != READ) {
-		int err;
-
-		/* Check stop command response */
-		if (brq->stop.resp[0] & R1_ERROR) {
-			pr_err("%s: %s: general error sending stop command, stop cmd response %#x\n",
-			       req->rq_disk->disk_name, __func__,
-			       brq->stop.resp[0]);
-			gen_err = true;
-		}
-
-		err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, false, req,
-					&gen_err);
-		if (err)
-			return MMC_BLK_CMD_ERR;
-	}
-
-	/* if general error occurs, retry the write operation. */
-	if (gen_err) {
-		pr_warn("%s: retrying write for general error\n",
-				req->rq_disk->disk_name);
-		return MMC_BLK_RETRY;
-	}
-
-	/* Some errors (ECC) are flagged on the next commmand, so check stop, too */
-	if (brq->data.error || brq->stop.error) {
-		if (need_retune && !brq->retune_retry_done) {
-			pr_debug("%s: retrying because a re-tune was needed\n",
-				 req->rq_disk->disk_name);
-			brq->retune_retry_done = 1;
-			return MMC_BLK_RETRY;
-		}
-		pr_err("%s: error %d transferring data, sector %u, nr %u, cmd response %#x, card status %#x\n",
-		       req->rq_disk->disk_name, brq->data.error ?: brq->stop.error,
-		       (unsigned)blk_rq_pos(req),
-		       (unsigned)blk_rq_sectors(req),
-		       brq->cmd.resp[0], brq->stop.resp[0]);
-
-		if (rq_data_dir(req) == READ) {
-			if (ecc_err)
-				return MMC_BLK_ECC_ERR;
-			return MMC_BLK_DATA_ERR;
-		} else {
-			return MMC_BLK_CMD_ERR;
-		}
-	}
-
-	if (!brq->data.bytes_xfered)
-		return MMC_BLK_RETRY;
-
-	if (blk_rq_bytes(req) != brq->data.bytes_xfered)
-		return MMC_BLK_PARTIAL;
-
-	return MMC_BLK_SUCCESS;
-}
-
 static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 			      int disable_multi, bool *do_rel_wr_p,
 			      bool *do_data_tag_p)
@@ -1706,8 +1421,6 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		brq->data.sg_len = i;
 	}
 
-	mqrq->areq.mrq = &brq->mrq;
-
 	if (do_rel_wr_p)
 		*do_rel_wr_p = do_rel_wr;
 
@@ -1715,6 +1428,138 @@ static void mmc_blk_data_prep(struct mmc_queue *mq, struct mmc_queue_req *mqrq,
 		*do_data_tag_p = do_data_tag;
 }
 
+#define MMC_CQE_RETRIES 2
+
+static void mmc_blk_cqe_complete_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct request_queue *q = req->q;
+	struct mmc_host *host = mq->card->host;
+	unsigned long flags;
+	bool put_card;
+	int err;
+
+	mmc_cqe_post_req(host, mrq);
+
+	if (mrq->cmd && mrq->cmd->error)
+		err = mrq->cmd->error;
+	else if (mrq->data && mrq->data->error)
+		err = mrq->data->error;
+	else
+		err = 0;
+
+	if (err) {
+		if (mqrq->retries++ < MMC_CQE_RETRIES)
+			blk_mq_requeue_request(req, true);
+		else
+			blk_mq_end_request(req, BLK_STS_IOERR);
+	} else if (mrq->data) {
+		if (blk_update_request(req, BLK_STS_OK, mrq->data->bytes_xfered))
+			blk_mq_requeue_request(req, true);
+		else
+			__blk_mq_end_request(req, BLK_STS_OK);
+	} else {
+		blk_mq_end_request(req, BLK_STS_OK);
+	}
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	mq->in_flight[mmc_issue_type(mq, req)] -= 1;
+
+	put_card = (mmc_tot_in_flight(mq) == 0);
+
+	mmc_cqe_check_busy(mq);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	if (!mq->cqe_busy)
+		blk_mq_run_hw_queues(q, true);
+
+	if (put_card)
+		mmc_put_card(mq->card, &mq->ctx);
+}
+
+void mmc_blk_cqe_recovery(struct mmc_queue *mq)
+{
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
+	int err;
+
+	pr_debug("%s: CQE recovery start\n", mmc_hostname(host));
+
+	err = mmc_cqe_recovery(host);
+	if (err)
+		mmc_blk_reset(mq->blkdata, host, MMC_BLK_CQE_RECOVERY);
+	else
+		mmc_blk_reset_success(mq->blkdata, MMC_BLK_CQE_RECOVERY);
+
+	pr_debug("%s: CQE recovery done\n", mmc_hostname(host));
+}
+
+static void mmc_blk_cqe_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->in_recovery)
+		mmc_blk_cqe_complete_rq(mq, req);
+	else
+		blk_mq_complete_request(req);
+}
+
+static int mmc_blk_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	mrq->done		= mmc_blk_cqe_req_done;
+	mrq->recovery_notifier	= mmc_cqe_recovery_notifier;
+
+	return mmc_cqe_start_req(host, mrq);
+}
+
+static struct mmc_request *mmc_blk_cqe_prep_dcmd(struct mmc_queue_req *mqrq,
+						 struct request *req)
+{
+	struct mmc_blk_request *brq = &mqrq->brq;
+
+	memset(brq, 0, sizeof(*brq));
+
+	brq->mrq.cmd = &brq->cmd;
+	brq->mrq.tag = req->tag;
+
+	return &brq->mrq;
+}
+
+static int mmc_blk_cqe_issue_flush(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = mmc_blk_cqe_prep_dcmd(mqrq, req);
+
+	mrq->cmd->opcode = MMC_SWITCH;
+	mrq->cmd->arg = (MMC_SWITCH_MODE_WRITE_BYTE << 24) |
+			(EXT_CSD_FLUSH_CACHE << 16) |
+			(1 << 8) |
+			EXT_CSD_CMD_SET_NORMAL;
+	mrq->cmd->flags = MMC_CMD_AC | MMC_RSP_R1B;
+
+	return mmc_blk_cqe_start_req(mq->card->host, mrq);
+}
+
+static int mmc_blk_cqe_issue_rw_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mmc_blk_data_prep(mq, mqrq, 0, NULL, NULL);
+
+	return mmc_blk_cqe_start_req(mq->card->host, &mqrq->brq.mrq);
+}
+
 static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 			       struct mmc_card *card,
 			       int disable_multi,
@@ -1779,318 +1624,637 @@ static void mmc_blk_rw_rq_prep(struct mmc_queue_req *mqrq,
 		brq->sbc.flags = MMC_RSP_R1 | MMC_CMD_AC;
 		brq->mrq.sbc = &brq->sbc;
 	}
-
-	mqrq->areq.err_check = mmc_blk_err_check;
 }
 
-static bool mmc_blk_rw_cmd_err(struct mmc_blk_data *md, struct mmc_card *card,
-			       struct mmc_blk_request *brq, struct request *req,
-			       bool old_req_pending)
+#define MMC_MAX_RETRIES		5
+#define MMC_DATA_RETRIES	2
+#define MMC_NO_RETRIES		(MMC_MAX_RETRIES + 1)
+
+static int mmc_blk_send_stop(struct mmc_card *card, unsigned int timeout)
 {
-	bool req_pending;
+	struct mmc_command cmd = {
+		.opcode = MMC_STOP_TRANSMISSION,
+		.flags = MMC_RSP_SPI_R1 | MMC_RSP_R1 | MMC_CMD_AC,
+		/* Some hosts wait for busy anyway, so provide a busy timeout */
+		.busy_timeout = timeout,
+	};
 
-	/*
-	 * If this is an SD card and we're writing, we can first
-	 * mark the known good sectors as ok.
-	 *
-	 * If the card is not SD, we can still ok written sectors
-	 * as reported by the controller (which might be less than
-	 * the real number of written sectors, but never more).
-	 */
-	if (mmc_card_sd(card)) {
-		u32 blocks;
-		int err;
-
-		err = mmc_sd_num_wr_blocks(card, &blocks);
-		if (err)
-			req_pending = old_req_pending;
-		else
-			req_pending = blk_end_request(req, BLK_STS_OK, blocks << 9);
-	} else {
-		req_pending = blk_end_request(req, BLK_STS_OK, brq->data.bytes_xfered);
-	}
-	return req_pending;
+	return mmc_wait_for_cmd(card->host, &cmd, 5);
 }
 
-static void mmc_blk_rw_cmd_abort(struct mmc_queue *mq, struct mmc_card *card,
-				 struct request *req,
-				 struct mmc_queue_req *mqrq)
+static int mmc_blk_fix_state(struct mmc_card *card, struct request *req)
 {
-	if (mmc_card_removed(card))
-		req->rq_flags |= RQF_QUIET;
-	while (blk_end_request(req, BLK_STS_IOERR, blk_rq_cur_bytes(req)));
-	mq->qcnt--;
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_blk_request *brq = &mqrq->brq;
+	unsigned int timeout = mmc_blk_data_timeout_ms(card->host, &brq->data);
+	int err;
+
+	mmc_retune_hold_now(card->host);
+
+	mmc_blk_send_stop(card, timeout);
+
+	err = card_busy_detect(card, timeout, req, NULL);
+
+	mmc_retune_release(card->host);
+
+	return err;
 }
 
-/**
- * mmc_blk_rw_try_restart() - tries to restart the current async request
- * @mq: the queue with the card and host to restart
- * @req: a new request that want to be started after the current one
- */
-static void mmc_blk_rw_try_restart(struct mmc_queue *mq, struct request *req,
-				   struct mmc_queue_req *mqrq)
+#define MMC_READ_SINGLE_RETRIES	2
+
+/* Single sector read during recovery */
+static void mmc_blk_read_single(struct mmc_queue *mq, struct request *req)
 {
-	if (!req)
-		return;
-
-	/*
-	 * If the card was removed, just cancel everything and return.
-	 */
-	if (mmc_card_removed(mq->card)) {
-		req->rq_flags |= RQF_QUIET;
-		blk_end_request_all(req, BLK_STS_IOERR);
-		mq->qcnt--; /* FIXME: just set to 0? */
-		return;
-	}
-	/* Else proceed and try to restart the current async request */
-	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
-	mmc_start_areq(mq->card->host, &mqrq->areq, NULL);
-}
-
-static void mmc_blk_issue_rw_rq(struct mmc_queue *mq, struct request *new_req)
-{
-	struct mmc_blk_data *md = mq->blkdata;
-	struct mmc_card *card = md->queue.card;
-	struct mmc_blk_request *brq;
-	int disable_multi = 0, retry = 0, type, retune_retry_done = 0;
-	enum mmc_blk_status status;
-	struct mmc_queue_req *mqrq_cur = NULL;
-	struct mmc_queue_req *mq_rq;
-	struct request *old_req;
-	struct mmc_async_req *new_areq;
-	struct mmc_async_req *old_areq;
-	bool req_pending = true;
-
-	if (new_req) {
-		mqrq_cur = req_to_mmc_queue_req(new_req);
-		mq->qcnt++;
-	}
-
-	if (!mq->qcnt)
-		return;
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
+	blk_status_t error = BLK_STS_OK;
+	int retries = 0;
 
 	do {
-		if (new_req) {
-			/*
-			 * When 4KB native sector is enabled, only 8 blocks
-			 * multiple read or write is allowed
-			 */
-			if (mmc_large_sector(card) &&
-				!IS_ALIGNED(blk_rq_sectors(new_req), 8)) {
-				pr_err("%s: Transfer size is not 4KB sector size aligned\n",
-					new_req->rq_disk->disk_name);
-				mmc_blk_rw_cmd_abort(mq, card, new_req, mqrq_cur);
-				return;
-			}
+		u32 status;
+		int err;
 
-			mmc_blk_rw_rq_prep(mqrq_cur, card, 0, mq);
-			new_areq = &mqrq_cur->areq;
-		} else
-			new_areq = NULL;
+		mmc_blk_rw_rq_prep(mqrq, card, 1, mq);
 
-		old_areq = mmc_start_areq(card->host, new_areq, &status);
-		if (!old_areq) {
-			/*
-			 * We have just put the first request into the pipeline
-			 * and there is nothing more to do until it is
-			 * complete.
-			 */
-			return;
+		mmc_wait_for_req(host, mrq);
+
+		err = mmc_send_status(card, &status);
+		if (err)
+			goto error_exit;
+
+		if (!mmc_host_is_spi(host) &&
+		    !mmc_blk_in_tran_state(status)) {
+			err = mmc_blk_fix_state(card, req);
+			if (err)
+				goto error_exit;
 		}
 
-		/*
-		 * An asynchronous request has been completed and we proceed
-		 * to handle the result of it.
-		 */
-		mq_rq =	container_of(old_areq, struct mmc_queue_req, areq);
-		brq = &mq_rq->brq;
-		old_req = mmc_queue_req_to_req(mq_rq);
-		type = rq_data_dir(old_req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+		if (mrq->cmd->error && retries++ < MMC_READ_SINGLE_RETRIES)
+			continue;
 
-		switch (status) {
-		case MMC_BLK_SUCCESS:
-		case MMC_BLK_PARTIAL:
-			/*
-			 * A block was successfully transferred.
-			 */
-			mmc_blk_reset_success(md, type);
+		retries = 0;
 
-			req_pending = blk_end_request(old_req, BLK_STS_OK,
-						      brq->data.bytes_xfered);
-			/*
-			 * If the blk_end_request function returns non-zero even
-			 * though all data has been transferred and no errors
-			 * were returned by the host controller, it's a bug.
-			 */
-			if (status == MMC_BLK_SUCCESS && req_pending) {
-				pr_err("%s BUG rq_tot %d d_xfer %d\n",
-				       __func__, blk_rq_bytes(old_req),
-				       brq->data.bytes_xfered);
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				return;
-			}
-			break;
-		case MMC_BLK_CMD_ERR:
-			req_pending = mmc_blk_rw_cmd_err(md, card, brq, old_req, req_pending);
-			if (mmc_blk_reset(md, card->host, type)) {
-				if (req_pending)
-					mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				else
-					mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			if (!req_pending) {
-				mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			break;
-		case MMC_BLK_RETRY:
-			retune_retry_done = brq->retune_retry_done;
-			if (retry++ < 5)
-				break;
-			/* Fall through */
-		case MMC_BLK_ABORT:
-			if (!mmc_blk_reset(md, card->host, type))
-				break;
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		case MMC_BLK_DATA_ERR: {
-			int err;
+		if (mrq->cmd->error ||
+		    mrq->data->error ||
+		    (!mmc_host_is_spi(host) &&
+		     (mrq->cmd->resp[0] & CMD_ERRORS || status & CMD_ERRORS)))
+			error = BLK_STS_IOERR;
+		else
+			error = BLK_STS_OK;
 
-			err = mmc_blk_reset(md, card->host, type);
-			if (!err)
-				break;
-			if (err == -ENODEV) {
-				mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			/* Fall through */
-		}
-		case MMC_BLK_ECC_ERR:
-			if (brq->data.blocks > 1) {
-				/* Redo read one sector at a time */
-				pr_warn("%s: retrying using single block read\n",
-					old_req->rq_disk->disk_name);
-				disable_multi = 1;
-				break;
-			}
-			/*
-			 * After an error, we redo I/O one sector at a
-			 * time, so we only reach here after trying to
-			 * read a single sector.
-			 */
-			req_pending = blk_end_request(old_req, BLK_STS_IOERR,
-						      brq->data.blksz);
-			if (!req_pending) {
-				mq->qcnt--;
-				mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-				return;
-			}
-			break;
-		case MMC_BLK_NOMEDIUM:
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		default:
-			pr_err("%s: Unhandled return value (%d)",
-					old_req->rq_disk->disk_name, status);
-			mmc_blk_rw_cmd_abort(mq, card, old_req, mq_rq);
-			mmc_blk_rw_try_restart(mq, new_req, mqrq_cur);
-			return;
-		}
+	} while (blk_update_request(req, error, 512));
 
-		if (req_pending) {
-			/*
-			 * In case of a incomplete request
-			 * prepare it again and resend.
-			 */
-			mmc_blk_rw_rq_prep(mq_rq, card,
-					disable_multi, mq);
-			mmc_start_areq(card->host,
-					&mq_rq->areq, NULL);
-			mq_rq->brq.retune_retry_done = retune_retry_done;
-		}
-	} while (req_pending);
+	return;
 
-	mq->qcnt--;
+error_exit:
+	mrq->data->bytes_xfered = 0;
+	blk_update_request(req, BLK_STS_IOERR, 512);
+	/* Let it try the remaining request again */
+	if (mqrq->retries > MMC_MAX_RETRIES - 1)
+		mqrq->retries = MMC_MAX_RETRIES - 1;
 }
 
-void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req)
+static inline bool mmc_blk_oor_valid(struct mmc_blk_request *brq)
 {
-	int ret;
+	return !!brq->mrq.sbc;
+}
+
+static inline u32 mmc_blk_stop_err_bits(struct mmc_blk_request *brq)
+{
+	return mmc_blk_oor_valid(brq) ? CMD_ERRORS : CMD_ERRORS_EXCL_OOR;
+}
+
+/*
+ * Check for errors the host controller driver might not have seen such as
+ * response mode errors or invalid card state.
+ */
+static bool mmc_blk_status_error(struct request *req, u32 status)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_blk_request *brq = &mqrq->brq;
+	struct mmc_queue *mq = req->q->queuedata;
+	u32 stop_err_bits;
+
+	if (mmc_host_is_spi(mq->card->host))
+		return false;
+
+	stop_err_bits = mmc_blk_stop_err_bits(brq);
+
+	return brq->cmd.resp[0]  & CMD_ERRORS    ||
+	       brq->stop.resp[0] & stop_err_bits ||
+	       status            & stop_err_bits ||
+	       (rq_data_dir(req) == WRITE && !mmc_blk_in_tran_state(status));
+}
+
+static inline bool mmc_blk_cmd_started(struct mmc_blk_request *brq)
+{
+	return !brq->sbc.error && !brq->cmd.error &&
+	       !(brq->cmd.resp[0] & CMD_ERRORS);
+}
+
+/*
+ * Requests are completed by mmc_blk_mq_complete_rq() which sets simple
+ * policy:
+ * 1. A request that has transferred at least some data is considered
+ * successful and will be requeued if there is remaining data to
+ * transfer.
+ * 2. Otherwise the number of retries is incremented and the request
+ * will be requeued if there are remaining retries.
+ * 3. Otherwise the request will be errored out.
+ * That means mmc_blk_mq_complete_rq() is controlled by bytes_xfered and
+ * mqrq->retries. So there are only 4 possible actions here:
+ *	1. do not accept the bytes_xfered value i.e. set it to zero
+ *	2. change mqrq->retries to determine the number of retries
+ *	3. try to reset the card
+ *	4. read one sector at a time
+ */
+static void mmc_blk_mq_rw_recovery(struct mmc_queue *mq, struct request *req)
+{
+	int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_blk_request *brq = &mqrq->brq;
 	struct mmc_blk_data *md = mq->blkdata;
-	struct mmc_card *card = md->queue.card;
+	struct mmc_card *card = mq->card;
+	u32 status;
+	u32 blocks;
+	int err;
 
-	if (req && !mq->qcnt)
-		/* claim host only for the first request */
-		mmc_get_card(card, NULL);
+	/*
+	 * Some errors the host driver might not have seen. Set the number of
+	 * bytes transferred to zero in that case.
+	 */
+	err = __mmc_send_status(card, &status, 0);
+	if (err || mmc_blk_status_error(req, status))
+		brq->data.bytes_xfered = 0;
 
-	ret = mmc_blk_part_switch(card, md->part_type);
-	if (ret) {
-		if (req) {
-			blk_end_request_all(req, BLK_STS_IOERR);
-		}
-		goto out;
+	mmc_retune_release(card->host);
+
+	/*
+	 * Try again to get the status. This also provides an opportunity for
+	 * re-tuning.
+	 */
+	if (err)
+		err = __mmc_send_status(card, &status, 0);
+
+	/*
+	 * Nothing more to do after the number of bytes transferred has been
+	 * updated and there is no card.
+	 */
+	if (err && mmc_detect_card_removed(card->host))
+		return;
+
+	/* Try to get back to "tran" state */
+	if (!mmc_host_is_spi(mq->card->host) &&
+	    (err || !mmc_blk_in_tran_state(status)))
+		err = mmc_blk_fix_state(mq->card, req);
+
+	/*
+	 * Special case for SD cards where the card might record the number of
+	 * blocks written.
+	 */
+	if (!err && mmc_blk_cmd_started(brq) && mmc_card_sd(card) &&
+	    rq_data_dir(req) == WRITE) {
+		if (mmc_sd_num_wr_blocks(card, &blocks))
+			brq->data.bytes_xfered = 0;
+		else
+			brq->data.bytes_xfered = blocks << 9;
 	}
 
-	if (req) {
+	/* Reset if the card is in a bad state */
+	if (!mmc_host_is_spi(mq->card->host) &&
+	    err && mmc_blk_reset(md, card->host, type)) {
+		pr_err("%s: recovery failed!\n", req->rq_disk->disk_name);
+		mqrq->retries = MMC_NO_RETRIES;
+		return;
+	}
+
+	/*
+	 * If anything was done, just return and if there is anything remaining
+	 * on the request it will get requeued.
+	 */
+	if (brq->data.bytes_xfered)
+		return;
+
+	/* Reset before last retry */
+	if (mqrq->retries + 1 == MMC_MAX_RETRIES)
+		mmc_blk_reset(md, card->host, type);
+
+	/* Command errors fail fast, so use all MMC_MAX_RETRIES */
+	if (brq->sbc.error || brq->cmd.error)
+		return;
+
+	/* Reduce the remaining retries for data errors */
+	if (mqrq->retries < MMC_MAX_RETRIES - MMC_DATA_RETRIES) {
+		mqrq->retries = MMC_MAX_RETRIES - MMC_DATA_RETRIES;
+		return;
+	}
+
+	/* FIXME: Missing single sector read for large sector size */
+	if (!mmc_large_sector(card) && rq_data_dir(req) == READ &&
+	    brq->data.blocks > 1) {
+		/* Read one sector at a time */
+		mmc_blk_read_single(mq, req);
+		return;
+	}
+}
+
+static inline bool mmc_blk_rq_error(struct mmc_blk_request *brq)
+{
+	mmc_blk_eval_resp_error(brq);
+
+	return brq->sbc.error || brq->cmd.error || brq->stop.error ||
+	       brq->data.error || brq->cmd.resp[0] & CMD_ERRORS;
+}
+
+static int mmc_blk_card_busy(struct mmc_card *card, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	u32 status = 0;
+	int err;
+
+	if (mmc_host_is_spi(card->host) || rq_data_dir(req) == READ)
+		return 0;
+
+	err = card_busy_detect(card, MMC_BLK_TIMEOUT_MS, req, &status);
+
+	/*
+	 * Do not assume data transferred correctly if there are any error bits
+	 * set.
+	 */
+	if (status & mmc_blk_stop_err_bits(&mqrq->brq)) {
+		mqrq->brq.data.bytes_xfered = 0;
+		err = err ? err : -EIO;
+	}
+
+	/* Copy the exception bit so it will be seen later on */
+	if (mmc_card_mmc(card) && status & R1_EXCEPTION_EVENT)
+		mqrq->brq.cmd.resp[0] |= R1_EXCEPTION_EVENT;
+
+	return err;
+}
+
+static inline void mmc_blk_rw_reset_success(struct mmc_queue *mq,
+					    struct request *req)
+{
+	int type = rq_data_dir(req) == READ ? MMC_BLK_READ : MMC_BLK_WRITE;
+
+	mmc_blk_reset_success(mq->blkdata, type);
+}
+
+static void mmc_blk_mq_complete_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	unsigned int nr_bytes = mqrq->brq.data.bytes_xfered;
+
+	if (nr_bytes) {
+		if (blk_update_request(req, BLK_STS_OK, nr_bytes))
+			blk_mq_requeue_request(req, true);
+		else
+			__blk_mq_end_request(req, BLK_STS_OK);
+	} else if (!blk_rq_bytes(req)) {
+		__blk_mq_end_request(req, BLK_STS_IOERR);
+	} else if (mqrq->retries++ < MMC_MAX_RETRIES) {
+		blk_mq_requeue_request(req, true);
+	} else {
+		if (mmc_card_removed(mq->card))
+			req->rq_flags |= RQF_QUIET;
+		blk_mq_end_request(req, BLK_STS_IOERR);
+	}
+}
+
+static bool mmc_blk_urgent_bkops_needed(struct mmc_queue *mq,
+					struct mmc_queue_req *mqrq)
+{
+	return mmc_card_mmc(mq->card) && !mmc_host_is_spi(mq->card->host) &&
+	       (mqrq->brq.cmd.resp[0] & R1_EXCEPTION_EVENT ||
+		mqrq->brq.stop.resp[0] & R1_EXCEPTION_EVENT);
+}
+
+static void mmc_blk_urgent_bkops(struct mmc_queue *mq,
+				 struct mmc_queue_req *mqrq)
+{
+	if (mmc_blk_urgent_bkops_needed(mq, mqrq))
+		mmc_start_bkops(mq->card, true);
+}
+
+void mmc_blk_mq_complete(struct request *req)
+{
+	struct mmc_queue *mq = req->q->queuedata;
+
+	if (mq->use_cqe)
+		mmc_blk_cqe_complete_rq(mq, req);
+	else
+		mmc_blk_mq_complete_rq(mq, req);
+}
+
+static void mmc_blk_mq_poll_completion(struct mmc_queue *mq,
+				       struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
+
+	if (mmc_blk_rq_error(&mqrq->brq) ||
+	    mmc_blk_card_busy(mq->card, req)) {
+		mmc_blk_mq_rw_recovery(mq, req);
+	} else {
+		mmc_blk_rw_reset_success(mq, req);
+		mmc_retune_release(host);
+	}
+
+	mmc_blk_urgent_bkops(mq, mqrq);
+}
+
+static void mmc_blk_mq_dec_in_flight(struct mmc_queue *mq, struct request *req)
+{
+	struct request_queue *q = req->q;
+	unsigned long flags;
+	bool put_card;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	mq->in_flight[mmc_issue_type(mq, req)] -= 1;
+
+	put_card = (mmc_tot_in_flight(mq) == 0);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	if (put_card)
+		mmc_put_card(mq->card, &mq->ctx);
+}
+
+static void mmc_blk_mq_post_req(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_host *host = mq->card->host;
+
+	mmc_post_req(host, mrq, 0);
+
+	/*
+	 * Block layer timeouts race with completions which means the normal
+	 * completion path cannot be used during recovery.
+	 */
+	if (mq->in_recovery)
+		mmc_blk_mq_complete_rq(mq, req);
+	else
+		blk_mq_complete_request(req);
+
+	mmc_blk_mq_dec_in_flight(mq, req);
+}
+
+void mmc_blk_mq_recovery(struct mmc_queue *mq)
+{
+	struct request *req = mq->recovery_req;
+	struct mmc_host *host = mq->card->host;
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+
+	mq->recovery_req = NULL;
+	mq->rw_wait = false;
+
+	if (mmc_blk_rq_error(&mqrq->brq)) {
+		mmc_retune_hold_now(host);
+		mmc_blk_mq_rw_recovery(mq, req);
+	}
+
+	mmc_blk_urgent_bkops(mq, mqrq);
+
+	mmc_blk_mq_post_req(mq, req);
+}
+
+static void mmc_blk_mq_complete_prev_req(struct mmc_queue *mq,
+					 struct request **prev_req)
+{
+	if (mmc_host_done_complete(mq->card->host))
+		return;
+
+	mutex_lock(&mq->complete_lock);
+
+	if (!mq->complete_req)
+		goto out_unlock;
+
+	mmc_blk_mq_poll_completion(mq, mq->complete_req);
+
+	if (prev_req)
+		*prev_req = mq->complete_req;
+	else
+		mmc_blk_mq_post_req(mq, mq->complete_req);
+
+	mq->complete_req = NULL;
+
+out_unlock:
+	mutex_unlock(&mq->complete_lock);
+}
+
+void mmc_blk_mq_complete_work(struct work_struct *work)
+{
+	struct mmc_queue *mq = container_of(work, struct mmc_queue,
+					    complete_work);
+
+	mmc_blk_mq_complete_prev_req(mq, NULL);
+}
+
+static void mmc_blk_mq_req_done(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	struct mmc_host *host = mq->card->host;
+	unsigned long flags;
+
+	if (!mmc_host_done_complete(host)) {
+		bool waiting;
+
+		/*
+		 * We cannot complete the request in this context, so record
+		 * that there is a request to complete, and that a following
+		 * request does not need to wait (although it does need to
+		 * complete complete_req first).
+		 */
+		spin_lock_irqsave(q->queue_lock, flags);
+		mq->complete_req = req;
+		mq->rw_wait = false;
+		waiting = mq->waiting;
+		spin_unlock_irqrestore(q->queue_lock, flags);
+
+		/*
+		 * If 'waiting' then the waiting task will complete this
+		 * request, otherwise queue a work to do it. Note that
+		 * complete_work may still race with the dispatch of a following
+		 * request.
+		 */
+		if (waiting)
+			wake_up(&mq->wait);
+		else
+			kblockd_schedule_work(&mq->complete_work);
+
+		return;
+	}
+
+	/* Take the recovery path for errors or urgent background operations */
+	if (mmc_blk_rq_error(&mqrq->brq) ||
+	    mmc_blk_urgent_bkops_needed(mq, mqrq)) {
+		spin_lock_irqsave(q->queue_lock, flags);
+		mq->recovery_needed = true;
+		mq->recovery_req = req;
+		spin_unlock_irqrestore(q->queue_lock, flags);
+		wake_up(&mq->wait);
+		schedule_work(&mq->recovery_work);
+		return;
+	}
+
+	mmc_blk_rw_reset_success(mq, req);
+
+	mq->rw_wait = false;
+	wake_up(&mq->wait);
+
+	mmc_blk_mq_post_req(mq, req);
+}
+
+static bool mmc_blk_rw_wait_cond(struct mmc_queue *mq, int *err)
+{
+	struct request_queue *q = mq->queue;
+	unsigned long flags;
+	bool done;
+
+	/*
+	 * Wait while there is another request in progress, but not if recovery
+	 * is needed. Also indicate whether there is a request waiting to start.
+	 */
+	spin_lock_irqsave(q->queue_lock, flags);
+	if (mq->recovery_needed) {
+		*err = -EBUSY;
+		done = true;
+	} else {
+		done = !mq->rw_wait;
+	}
+	mq->waiting = !done;
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	return done;
+}
+
+static int mmc_blk_rw_wait(struct mmc_queue *mq, struct request **prev_req)
+{
+	int err = 0;
+
+	wait_event(mq->wait, mmc_blk_rw_wait_cond(mq, &err));
+
+	/* Always complete the previous request if there is one */
+	mmc_blk_mq_complete_prev_req(mq, prev_req);
+
+	return err;
+}
+
+static int mmc_blk_mq_issue_rw_rq(struct mmc_queue *mq,
+				  struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_host *host = mq->card->host;
+	struct request *prev_req = NULL;
+	int err = 0;
+
+	mmc_blk_rw_rq_prep(mqrq, mq->card, 0, mq);
+
+	mqrq->brq.mrq.done = mmc_blk_mq_req_done;
+
+	mmc_pre_req(host, &mqrq->brq.mrq);
+
+	err = mmc_blk_rw_wait(mq, &prev_req);
+	if (err)
+		goto out_post_req;
+
+	mq->rw_wait = true;
+
+	err = mmc_start_request(host, &mqrq->brq.mrq);
+
+	if (prev_req)
+		mmc_blk_mq_post_req(mq, prev_req);
+
+	if (err)
+		mq->rw_wait = false;
+
+	/* Release re-tuning here where there is no synchronization required */
+	if (err || mmc_host_done_complete(host))
+		mmc_retune_release(host);
+
+out_post_req:
+	if (err)
+		mmc_post_req(host, &mqrq->brq.mrq, err);
+
+	return err;
+}
+
+static int mmc_blk_wait_for_idle(struct mmc_queue *mq, struct mmc_host *host)
+{
+	if (mq->use_cqe)
+		return host->cqe_ops->cqe_wait_for_idle(host);
+
+	return mmc_blk_rw_wait(mq, NULL);
+}
+
+enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_blk_data *md = mq->blkdata;
+	struct mmc_card *card = md->queue.card;
+	struct mmc_host *host = card->host;
+	int ret;
+
+	ret = mmc_blk_part_switch(card, md->part_type);
+	if (ret)
+		return MMC_REQ_FAILED_TO_START;
+
+	switch (mmc_issue_type(mq, req)) {
+	case MMC_ISSUE_SYNC:
+		ret = mmc_blk_wait_for_idle(mq, host);
+		if (ret)
+			return MMC_REQ_BUSY;
 		switch (req_op(req)) {
 		case REQ_OP_DRV_IN:
 		case REQ_OP_DRV_OUT:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * ioctl()s
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
 			mmc_blk_issue_drv_op(mq, req);
 			break;
 		case REQ_OP_DISCARD:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * discard.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
 			mmc_blk_issue_discard_rq(mq, req);
 			break;
 		case REQ_OP_SECURE_ERASE:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * secure erase.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
 			mmc_blk_issue_secdiscard_rq(mq, req);
 			break;
 		case REQ_OP_FLUSH:
-			/*
-			 * Complete ongoing async transfer before issuing
-			 * flush.
-			 */
-			if (mq->qcnt)
-				mmc_blk_issue_rw_rq(mq, NULL);
 			mmc_blk_issue_flush(mq, req);
 			break;
 		default:
-			/* Normal request, just issue it */
-			mmc_blk_issue_rw_rq(mq, req);
-			card->host->context_info.is_waiting_last_req = false;
-			break;
+			WARN_ON_ONCE(1);
+			return MMC_REQ_FAILED_TO_START;
 		}
-	} else {
-		/* No request, flushing the pipeline with NULL */
-		mmc_blk_issue_rw_rq(mq, NULL);
-		card->host->context_info.is_waiting_last_req = false;
+		return MMC_REQ_FINISHED;
+	case MMC_ISSUE_DCMD:
+	case MMC_ISSUE_ASYNC:
+		switch (req_op(req)) {
+		case REQ_OP_FLUSH:
+			ret = mmc_blk_cqe_issue_flush(mq, req);
+			break;
+		case REQ_OP_READ:
+		case REQ_OP_WRITE:
+			if (mq->use_cqe)
+				ret = mmc_blk_cqe_issue_rw_rq(mq, req);
+			else
+				ret = mmc_blk_mq_issue_rw_rq(mq, req);
+			break;
+		default:
+			WARN_ON_ONCE(1);
+			ret = -EINVAL;
+		}
+		if (!ret)
+			return MMC_REQ_STARTED;
+		return ret == -EBUSY ? MMC_REQ_BUSY : MMC_REQ_FAILED_TO_START;
+	default:
+		WARN_ON_ONCE(1);
+		return MMC_REQ_FAILED_TO_START;
 	}
-
-out:
-	if (!mq->qcnt)
-		mmc_put_card(card, NULL);
 }
 
 static inline int mmc_blk_readonly(struct mmc_card *card)
@@ -2156,6 +2320,18 @@ static struct mmc_blk_data *mmc_blk_alloc_req(struct mmc_card *card,
 
 	md->queue.blkdata = md;
 
+	/*
+	 * Keep an extra reference to the queue so that we can shutdown the
+	 * queue (i.e. call blk_cleanup_queue()) while there are still
+	 * references to the 'md'. The corresponding blk_put_queue() is in
+	 * mmc_blk_put().
+	 */
+	if (!blk_get_queue(md->queue.queue)) {
+		mmc_cleanup_queue(&md->queue);
+		ret = -ENODEV;
+		goto err_putdisk;
+	}
+
 	md->disk->major	= MMC_BLOCK_MAJOR;
 	md->disk->first_minor = devidx * perdev_minors;
 	md->disk->fops = &mmc_bdops;
@@ -2471,10 +2647,6 @@ static void mmc_blk_remove_req(struct mmc_blk_data *md)
 		 * from being accepted.
 		 */
 		card = md->queue.card;
-		spin_lock_irq(md->queue.queue->queue_lock);
-		queue_flag_set(QUEUE_FLAG_BYPASS, md->queue.queue);
-		spin_unlock_irq(md->queue.queue->queue_lock);
-		blk_set_queue_dying(md->queue.queue);
 		mmc_cleanup_queue(&md->queue);
 		if (md->disk->flags & GENHD_FL_UP) {
 			device_remove_file(disk_to_dev(md->disk), &md->force_ro);
@@ -2623,6 +2795,7 @@ static int mmc_ext_csd_open(struct inode *inode, struct file *filp)
 
 	if (n != EXT_CSD_STR_LEN) {
 		err = -EINVAL;
+		kfree(ext_csd);
 		goto out_free;
 	}
 
diff --git a/drivers/mmc/core/block.h b/drivers/mmc/core/block.h
index 5946636..31153f6 100644
--- a/drivers/mmc/core/block.h
+++ b/drivers/mmc/core/block.h
@@ -5,6 +5,16 @@
 struct mmc_queue;
 struct request;
 
-void mmc_blk_issue_rq(struct mmc_queue *mq, struct request *req);
+void mmc_blk_cqe_recovery(struct mmc_queue *mq);
+
+enum mmc_issued;
+
+enum mmc_issued mmc_blk_mq_issue_rq(struct mmc_queue *mq, struct request *req);
+void mmc_blk_mq_complete(struct request *req);
+void mmc_blk_mq_recovery(struct mmc_queue *mq);
+
+struct work_struct;
+
+void mmc_blk_mq_complete_work(struct work_struct *work);
 
 #endif
diff --git a/drivers/mmc/core/bus.c b/drivers/mmc/core/bus.c
index 7586ff2..fc92c6c 100644
--- a/drivers/mmc/core/bus.c
+++ b/drivers/mmc/core/bus.c
@@ -351,8 +351,6 @@ int mmc_add_card(struct mmc_card *card)
 #ifdef CONFIG_DEBUG_FS
 	mmc_add_card_debugfs(card);
 #endif
-	mmc_init_context_info(card->host);
-
 	card->dev.of_node = mmc_of_find_child_device(card->host, 0);
 
 	device_enable_async_suspend(&card->dev);
diff --git a/drivers/mmc/core/core.c b/drivers/mmc/core/core.c
index 1f0f44f..c0ba6d8 100644
--- a/drivers/mmc/core/core.c
+++ b/drivers/mmc/core/core.c
@@ -341,6 +341,8 @@ int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
 {
 	int err;
 
+	init_completion(&mrq->cmd_completion);
+
 	mmc_retune_hold(host);
 
 	if (mmc_card_removed(host->card))
@@ -361,20 +363,6 @@ int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq)
 }
 EXPORT_SYMBOL(mmc_start_request);
 
-/*
- * mmc_wait_data_done() - done callback for data request
- * @mrq: done data request
- *
- * Wakes up mmc context, passed as a callback to host controller driver
- */
-static void mmc_wait_data_done(struct mmc_request *mrq)
-{
-	struct mmc_context_info *context_info = &mrq->host->context_info;
-
-	context_info->is_done_rcv = true;
-	wake_up_interruptible(&context_info->wait);
-}
-
 static void mmc_wait_done(struct mmc_request *mrq)
 {
 	complete(&mrq->completion);
@@ -392,37 +380,6 @@ static inline void mmc_wait_ongoing_tfr_cmd(struct mmc_host *host)
 		wait_for_completion(&ongoing_mrq->cmd_completion);
 }
 
-/*
- *__mmc_start_data_req() - starts data request
- * @host: MMC host to start the request
- * @mrq: data request to start
- *
- * Sets the done callback to be called when request is completed by the card.
- * Starts data mmc request execution
- * If an ongoing transfer is already in progress, wait for the command line
- * to become available before sending another command.
- */
-static int __mmc_start_data_req(struct mmc_host *host, struct mmc_request *mrq)
-{
-	int err;
-
-	mmc_wait_ongoing_tfr_cmd(host);
-
-	mrq->done = mmc_wait_data_done;
-	mrq->host = host;
-
-	init_completion(&mrq->cmd_completion);
-
-	err = mmc_start_request(host, mrq);
-	if (err) {
-		mrq->cmd->error = err;
-		mmc_complete_cmd(mrq);
-		mmc_wait_data_done(mrq);
-	}
-
-	return err;
-}
-
 static int __mmc_start_req(struct mmc_host *host, struct mmc_request *mrq)
 {
 	int err;
@@ -432,8 +389,6 @@ static int __mmc_start_req(struct mmc_host *host, struct mmc_request *mrq)
 	init_completion(&mrq->completion);
 	mrq->done = mmc_wait_done;
 
-	init_completion(&mrq->cmd_completion);
-
 	err = mmc_start_request(host, mrq);
 	if (err) {
 		mrq->cmd->error = err;
@@ -650,164 +605,11 @@ EXPORT_SYMBOL(mmc_cqe_recovery);
  */
 bool mmc_is_req_done(struct mmc_host *host, struct mmc_request *mrq)
 {
-	if (host->areq)
-		return host->context_info.is_done_rcv;
-	else
-		return completion_done(&mrq->completion);
+	return completion_done(&mrq->completion);
 }
 EXPORT_SYMBOL(mmc_is_req_done);
 
 /**
- *	mmc_pre_req - Prepare for a new request
- *	@host: MMC host to prepare command
- *	@mrq: MMC request to prepare for
- *
- *	mmc_pre_req() is called in prior to mmc_start_req() to let
- *	host prepare for the new request. Preparation of a request may be
- *	performed while another request is running on the host.
- */
-static void mmc_pre_req(struct mmc_host *host, struct mmc_request *mrq)
-{
-	if (host->ops->pre_req)
-		host->ops->pre_req(host, mrq);
-}
-
-/**
- *	mmc_post_req - Post process a completed request
- *	@host: MMC host to post process command
- *	@mrq: MMC request to post process for
- *	@err: Error, if non zero, clean up any resources made in pre_req
- *
- *	Let the host post process a completed request. Post processing of
- *	a request may be performed while another reuqest is running.
- */
-static void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
-			 int err)
-{
-	if (host->ops->post_req)
-		host->ops->post_req(host, mrq, err);
-}
-
-/**
- * mmc_finalize_areq() - finalize an asynchronous request
- * @host: MMC host to finalize any ongoing request on
- *
- * Returns the status of the ongoing asynchronous request, but
- * MMC_BLK_SUCCESS if no request was going on.
- */
-static enum mmc_blk_status mmc_finalize_areq(struct mmc_host *host)
-{
-	struct mmc_context_info *context_info = &host->context_info;
-	enum mmc_blk_status status;
-
-	if (!host->areq)
-		return MMC_BLK_SUCCESS;
-
-	while (1) {
-		wait_event_interruptible(context_info->wait,
-				(context_info->is_done_rcv ||
-				 context_info->is_new_req));
-
-		if (context_info->is_done_rcv) {
-			struct mmc_command *cmd;
-
-			context_info->is_done_rcv = false;
-			cmd = host->areq->mrq->cmd;
-
-			if (!cmd->error || !cmd->retries ||
-			    mmc_card_removed(host->card)) {
-				status = host->areq->err_check(host->card,
-							       host->areq);
-				break; /* return status */
-			} else {
-				mmc_retune_recheck(host);
-				pr_info("%s: req failed (CMD%u): %d, retrying...\n",
-					mmc_hostname(host),
-					cmd->opcode, cmd->error);
-				cmd->retries--;
-				cmd->error = 0;
-				__mmc_start_request(host, host->areq->mrq);
-				continue; /* wait for done/new event again */
-			}
-		}
-
-		return MMC_BLK_NEW_REQUEST;
-	}
-
-	mmc_retune_release(host);
-
-	/*
-	 * Check BKOPS urgency for each R1 response
-	 */
-	if (host->card && mmc_card_mmc(host->card) &&
-	    ((mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1) ||
-	     (mmc_resp_type(host->areq->mrq->cmd) == MMC_RSP_R1B)) &&
-	    (host->areq->mrq->cmd->resp[0] & R1_EXCEPTION_EVENT)) {
-		mmc_start_bkops(host->card, true);
-	}
-
-	return status;
-}
-
-/**
- *	mmc_start_areq - start an asynchronous request
- *	@host: MMC host to start command
- *	@areq: asynchronous request to start
- *	@ret_stat: out parameter for status
- *
- *	Start a new MMC custom command request for a host.
- *	If there is on ongoing async request wait for completion
- *	of that request and start the new one and return.
- *	Does not wait for the new request to complete.
- *
- *      Returns the completed request, NULL in case of none completed.
- *	Wait for the an ongoing request (previoulsy started) to complete and
- *	return the completed request. If there is no ongoing request, NULL
- *	is returned without waiting. NULL is not an error condition.
- */
-struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
-				     struct mmc_async_req *areq,
-				     enum mmc_blk_status *ret_stat)
-{
-	enum mmc_blk_status status;
-	int start_err = 0;
-	struct mmc_async_req *previous = host->areq;
-
-	/* Prepare a new request */
-	if (areq)
-		mmc_pre_req(host, areq->mrq);
-
-	/* Finalize previous request */
-	status = mmc_finalize_areq(host);
-	if (ret_stat)
-		*ret_stat = status;
-
-	/* The previous request is still going on... */
-	if (status == MMC_BLK_NEW_REQUEST)
-		return NULL;
-
-	/* Fine so far, start the new request! */
-	if (status == MMC_BLK_SUCCESS && areq)
-		start_err = __mmc_start_data_req(host, areq->mrq);
-
-	/* Postprocess the old request at this point */
-	if (host->areq)
-		mmc_post_req(host, host->areq->mrq, 0);
-
-	/* Cancel a prepared request if it was not started. */
-	if ((status != MMC_BLK_SUCCESS || start_err) && areq)
-		mmc_post_req(host, areq->mrq, -EINVAL);
-
-	if (status != MMC_BLK_SUCCESS)
-		host->areq = NULL;
-	else
-		host->areq = areq;
-
-	return previous;
-}
-EXPORT_SYMBOL(mmc_start_areq);
-
-/**
  *	mmc_wait_for_req - start a request and wait for completion
  *	@host: MMC host to start command
  *	@mrq: MMC request to start
@@ -2959,6 +2761,14 @@ static int mmc_pm_notify(struct notifier_block *notify_block,
 		if (!err)
 			break;
 
+		if (!mmc_card_is_removable(host)) {
+			dev_warn(mmc_dev(host),
+				 "pre_suspend failed for non-removable host: "
+				 "%d\n", err);
+			/* Avoid removing non-removable hosts */
+			break;
+		}
+
 		/* Calling bus_ops->remove() with a claimed host can deadlock */
 		host->bus_ops->remove(host);
 		mmc_claim_host(host);
@@ -2994,22 +2804,6 @@ void mmc_unregister_pm_notifier(struct mmc_host *host)
 }
 #endif
 
-/**
- * mmc_init_context_info() - init synchronization context
- * @host: mmc host
- *
- * Init struct context_info needed to implement asynchronous
- * request mechanism, used by mmc core, host driver and mmc requests
- * supplier.
- */
-void mmc_init_context_info(struct mmc_host *host)
-{
-	host->context_info.is_new_req = false;
-	host->context_info.is_done_rcv = false;
-	host->context_info.is_waiting_last_req = false;
-	init_waitqueue_head(&host->context_info.wait);
-}
-
 static int __init mmc_init(void)
 {
 	int ret;
diff --git a/drivers/mmc/core/core.h b/drivers/mmc/core/core.h
index 71e6c6d..d6303d6 100644
--- a/drivers/mmc/core/core.h
+++ b/drivers/mmc/core/core.h
@@ -62,12 +62,10 @@ void mmc_set_initial_state(struct mmc_host *host);
 
 static inline void mmc_delay(unsigned int ms)
 {
-	if (ms < 1000 / HZ) {
-		cond_resched();
-		mdelay(ms);
-	} else {
+	if (ms <= 20)
+		usleep_range(ms * 1000, ms * 1250);
+	else
 		msleep(ms);
-	}
 }
 
 void mmc_rescan(struct work_struct *work);
@@ -91,8 +89,6 @@ void mmc_remove_host_debugfs(struct mmc_host *host);
 void mmc_add_card_debugfs(struct mmc_card *card);
 void mmc_remove_card_debugfs(struct mmc_card *card);
 
-void mmc_init_context_info(struct mmc_host *host);
-
 int mmc_execute_tuning(struct mmc_card *card);
 int mmc_hs200_to_hs400(struct mmc_card *card);
 int mmc_hs400_to_hs200(struct mmc_card *card);
@@ -110,12 +106,6 @@ bool mmc_is_req_done(struct mmc_host *host, struct mmc_request *mrq);
 
 int mmc_start_request(struct mmc_host *host, struct mmc_request *mrq);
 
-struct mmc_async_req;
-
-struct mmc_async_req *mmc_start_areq(struct mmc_host *host,
-				     struct mmc_async_req *areq,
-				     enum mmc_blk_status *ret_stat);
-
 int mmc_erase(struct mmc_card *card, unsigned int from, unsigned int nr,
 		unsigned int arg);
 int mmc_can_erase(struct mmc_card *card);
@@ -152,4 +142,35 @@ int mmc_cqe_start_req(struct mmc_host *host, struct mmc_request *mrq);
 void mmc_cqe_post_req(struct mmc_host *host, struct mmc_request *mrq);
 int mmc_cqe_recovery(struct mmc_host *host);
 
+/**
+ *	mmc_pre_req - Prepare for a new request
+ *	@host: MMC host to prepare command
+ *	@mrq: MMC request to prepare for
+ *
+ *	mmc_pre_req() is called in prior to mmc_start_req() to let
+ *	host prepare for the new request. Preparation of a request may be
+ *	performed while another request is running on the host.
+ */
+static inline void mmc_pre_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	if (host->ops->pre_req)
+		host->ops->pre_req(host, mrq);
+}
+
+/**
+ *	mmc_post_req - Post process a completed request
+ *	@host: MMC host to post process command
+ *	@mrq: MMC request to post process for
+ *	@err: Error, if non zero, clean up any resources made in pre_req
+ *
+ *	Let the host post process a completed request. Post processing of
+ *	a request may be performed while another request is running.
+ */
+static inline void mmc_post_req(struct mmc_host *host, struct mmc_request *mrq,
+				int err)
+{
+	if (host->ops->post_req)
+		host->ops->post_req(host, mrq, err);
+}
+
 #endif
diff --git a/drivers/mmc/core/host.h b/drivers/mmc/core/host.h
index fb689a1..06ec19b 100644
--- a/drivers/mmc/core/host.h
+++ b/drivers/mmc/core/host.h
@@ -41,6 +41,11 @@ static inline int mmc_host_cmd23(struct mmc_host *host)
 	return host->caps & MMC_CAP_CMD23;
 }
 
+static inline bool mmc_host_done_complete(struct mmc_host *host)
+{
+	return host->caps & MMC_CAP_DONE_COMPLETE;
+}
+
 static inline int mmc_boot_partition_access(struct mmc_host *host)
 {
 	return !(host->caps2 & MMC_CAP2_BOOTPART_NOACC);
@@ -74,6 +79,5 @@ static inline bool mmc_card_hs400es(struct mmc_card *card)
 	return card->host->ios.enhanced_strobe;
 }
 
-
 #endif
 
diff --git a/drivers/mmc/core/mmc_test.c b/drivers/mmc/core/mmc_test.c
index 4788698..ef18dae 100644
--- a/drivers/mmc/core/mmc_test.c
+++ b/drivers/mmc/core/mmc_test.c
@@ -101,7 +101,7 @@ struct mmc_test_transfer_result {
 	struct list_head link;
 	unsigned int count;
 	unsigned int sectors;
-	struct timespec ts;
+	struct timespec64 ts;
 	unsigned int rate;
 	unsigned int iops;
 };
@@ -171,11 +171,6 @@ struct mmc_test_multiple_rw {
 	enum mmc_test_prep_media prepare;
 };
 
-struct mmc_test_async_req {
-	struct mmc_async_req areq;
-	struct mmc_test_card *test;
-};
-
 /*******************************************************************/
 /*  General helper functions                                       */
 /*******************************************************************/
@@ -515,14 +510,11 @@ static int mmc_test_map_sg_max_scatter(struct mmc_test_mem *mem,
 /*
  * Calculate transfer rate in bytes per second.
  */
-static unsigned int mmc_test_rate(uint64_t bytes, struct timespec *ts)
+static unsigned int mmc_test_rate(uint64_t bytes, struct timespec64 *ts)
 {
 	uint64_t ns;
 
-	ns = ts->tv_sec;
-	ns *= 1000000000;
-	ns += ts->tv_nsec;
-
+	ns = timespec64_to_ns(ts);
 	bytes *= 1000000000;
 
 	while (ns > UINT_MAX) {
@@ -542,7 +534,7 @@ static unsigned int mmc_test_rate(uint64_t bytes, struct timespec *ts)
  * Save transfer results for future usage
  */
 static void mmc_test_save_transfer_result(struct mmc_test_card *test,
-	unsigned int count, unsigned int sectors, struct timespec ts,
+	unsigned int count, unsigned int sectors, struct timespec64 ts,
 	unsigned int rate, unsigned int iops)
 {
 	struct mmc_test_transfer_result *tr;
@@ -567,21 +559,21 @@ static void mmc_test_save_transfer_result(struct mmc_test_card *test,
  * Print the transfer rate.
  */
 static void mmc_test_print_rate(struct mmc_test_card *test, uint64_t bytes,
-				struct timespec *ts1, struct timespec *ts2)
+				struct timespec64 *ts1, struct timespec64 *ts2)
 {
 	unsigned int rate, iops, sectors = bytes >> 9;
-	struct timespec ts;
+	struct timespec64 ts;
 
-	ts = timespec_sub(*ts2, *ts1);
+	ts = timespec64_sub(*ts2, *ts1);
 
 	rate = mmc_test_rate(bytes, &ts);
 	iops = mmc_test_rate(100, &ts); /* I/O ops per sec x 100 */
 
-	pr_info("%s: Transfer of %u sectors (%u%s KiB) took %lu.%09lu "
+	pr_info("%s: Transfer of %u sectors (%u%s KiB) took %llu.%09u "
 			 "seconds (%u kB/s, %u KiB/s, %u.%02u IOPS)\n",
 			 mmc_hostname(test->card->host), sectors, sectors >> 1,
-			 (sectors & 1 ? ".5" : ""), (unsigned long)ts.tv_sec,
-			 (unsigned long)ts.tv_nsec, rate / 1000, rate / 1024,
+			 (sectors & 1 ? ".5" : ""), (u64)ts.tv_sec,
+			 (u32)ts.tv_nsec, rate / 1000, rate / 1024,
 			 iops / 100, iops % 100);
 
 	mmc_test_save_transfer_result(test, 1, sectors, ts, rate, iops);
@@ -591,24 +583,24 @@ static void mmc_test_print_rate(struct mmc_test_card *test, uint64_t bytes,
  * Print the average transfer rate.
  */
 static void mmc_test_print_avg_rate(struct mmc_test_card *test, uint64_t bytes,
-				    unsigned int count, struct timespec *ts1,
-				    struct timespec *ts2)
+				    unsigned int count, struct timespec64 *ts1,
+				    struct timespec64 *ts2)
 {
 	unsigned int rate, iops, sectors = bytes >> 9;
 	uint64_t tot = bytes * count;
-	struct timespec ts;
+	struct timespec64 ts;
 
-	ts = timespec_sub(*ts2, *ts1);
+	ts = timespec64_sub(*ts2, *ts1);
 
 	rate = mmc_test_rate(tot, &ts);
 	iops = mmc_test_rate(count * 100, &ts); /* I/O ops per sec x 100 */
 
 	pr_info("%s: Transfer of %u x %u sectors (%u x %u%s KiB) took "
-			 "%lu.%09lu seconds (%u kB/s, %u KiB/s, "
+			 "%llu.%09u seconds (%u kB/s, %u KiB/s, "
 			 "%u.%02u IOPS, sg_len %d)\n",
 			 mmc_hostname(test->card->host), count, sectors, count,
 			 sectors >> 1, (sectors & 1 ? ".5" : ""),
-			 (unsigned long)ts.tv_sec, (unsigned long)ts.tv_nsec,
+			 (u64)ts.tv_sec, (u32)ts.tv_nsec,
 			 rate / 1000, rate / 1024, iops / 100, iops % 100,
 			 test->area.sg_len);
 
@@ -741,30 +733,6 @@ static int mmc_test_check_result(struct mmc_test_card *test,
 	return ret;
 }
 
-static enum mmc_blk_status mmc_test_check_result_async(struct mmc_card *card,
-				       struct mmc_async_req *areq)
-{
-	struct mmc_test_async_req *test_async =
-		container_of(areq, struct mmc_test_async_req, areq);
-	int ret;
-
-	mmc_test_wait_busy(test_async->test);
-
-	/*
-	 * FIXME: this would earlier just casts a regular error code,
-	 * either of the kernel type -ERRORCODE or the local test framework
-	 * RESULT_* errorcode, into an enum mmc_blk_status and return as
-	 * result check. Instead, convert it to some reasonable type by just
-	 * returning either MMC_BLK_SUCCESS or MMC_BLK_CMD_ERR.
-	 * If possible, a reasonable error code should be returned.
-	 */
-	ret = mmc_test_check_result(test_async->test, areq->mrq);
-	if (ret)
-		return MMC_BLK_CMD_ERR;
-
-	return MMC_BLK_SUCCESS;
-}
-
 /*
  * Checks that a "short transfer" behaved as expected
  */
@@ -831,6 +799,45 @@ static struct mmc_test_req *mmc_test_req_alloc(void)
 	return rq;
 }
 
+static void mmc_test_wait_done(struct mmc_request *mrq)
+{
+	complete(&mrq->completion);
+}
+
+static int mmc_test_start_areq(struct mmc_test_card *test,
+			       struct mmc_request *mrq,
+			       struct mmc_request *prev_mrq)
+{
+	struct mmc_host *host = test->card->host;
+	int err = 0;
+
+	if (mrq) {
+		init_completion(&mrq->completion);
+		mrq->done = mmc_test_wait_done;
+		mmc_pre_req(host, mrq);
+	}
+
+	if (prev_mrq) {
+		wait_for_completion(&prev_mrq->completion);
+		err = mmc_test_wait_busy(test);
+		if (!err)
+			err = mmc_test_check_result(test, prev_mrq);
+	}
+
+	if (!err && mrq) {
+		err = mmc_start_request(host, mrq);
+		if (err)
+			mmc_retune_release(host);
+	}
+
+	if (prev_mrq)
+		mmc_post_req(host, prev_mrq, 0);
+
+	if (err && mrq)
+		mmc_post_req(host, mrq, err);
+
+	return err;
+}
 
 static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 				      struct scatterlist *sg, unsigned sg_len,
@@ -838,17 +845,10 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 				      unsigned blksz, int write, int count)
 {
 	struct mmc_test_req *rq1, *rq2;
-	struct mmc_test_async_req test_areq[2];
-	struct mmc_async_req *done_areq;
-	struct mmc_async_req *cur_areq = &test_areq[0].areq;
-	struct mmc_async_req *other_areq = &test_areq[1].areq;
-	enum mmc_blk_status status;
+	struct mmc_request *mrq, *prev_mrq;
 	int i;
 	int ret = RESULT_OK;
 
-	test_areq[0].test = test;
-	test_areq[1].test = test;
-
 	rq1 = mmc_test_req_alloc();
 	rq2 = mmc_test_req_alloc();
 	if (!rq1 || !rq2) {
@@ -856,33 +856,25 @@ static int mmc_test_nonblock_transfer(struct mmc_test_card *test,
 		goto err;
 	}
 
-	cur_areq->mrq = &rq1->mrq;
-	cur_areq->err_check = mmc_test_check_result_async;
-	other_areq->mrq = &rq2->mrq;
-	other_areq->err_check = mmc_test_check_result_async;
+	mrq = &rq1->mrq;
+	prev_mrq = NULL;
 
 	for (i = 0; i < count; i++) {
-		mmc_test_prepare_mrq(test, cur_areq->mrq, sg, sg_len, dev_addr,
-				     blocks, blksz, write);
-		done_areq = mmc_start_areq(test->card->host, cur_areq, &status);
-
-		if (status != MMC_BLK_SUCCESS || (!done_areq && i > 0)) {
-			ret = RESULT_FAIL;
+		mmc_test_req_reset(container_of(mrq, struct mmc_test_req, mrq));
+		mmc_test_prepare_mrq(test, mrq, sg, sg_len, dev_addr, blocks,
+				     blksz, write);
+		ret = mmc_test_start_areq(test, mrq, prev_mrq);
+		if (ret)
 			goto err;
-		}
 
-		if (done_areq)
-			mmc_test_req_reset(container_of(done_areq->mrq,
-						struct mmc_test_req, mrq));
+		if (!prev_mrq)
+			prev_mrq = &rq2->mrq;
 
-		swap(cur_areq, other_areq);
+		swap(mrq, prev_mrq);
 		dev_addr += blocks;
 	}
 
-	done_areq = mmc_start_areq(test->card->host, NULL, &status);
-	if (status != MMC_BLK_SUCCESS)
-		ret = RESULT_FAIL;
-
+	ret = mmc_test_start_areq(test, NULL, prev_mrq);
 err:
 	kfree(rq1);
 	kfree(rq2);
@@ -1449,7 +1441,7 @@ static int mmc_test_area_io_seq(struct mmc_test_card *test, unsigned long sz,
 				int max_scatter, int timed, int count,
 				bool nonblock, int min_sg_len)
 {
-	struct timespec ts1, ts2;
+	struct timespec64 ts1, ts2;
 	int ret = 0;
 	int i;
 	struct mmc_test_area *t = &test->area;
@@ -1475,7 +1467,7 @@ static int mmc_test_area_io_seq(struct mmc_test_card *test, unsigned long sz,
 		return ret;
 
 	if (timed)
-		getnstimeofday(&ts1);
+		ktime_get_ts64(&ts1);
 	if (nonblock)
 		ret = mmc_test_nonblock_transfer(test, t->sg, t->sg_len,
 				 dev_addr, t->blocks, 512, write, count);
@@ -1489,7 +1481,7 @@ static int mmc_test_area_io_seq(struct mmc_test_card *test, unsigned long sz,
 		return ret;
 
 	if (timed)
-		getnstimeofday(&ts2);
+		ktime_get_ts64(&ts2);
 
 	if (timed)
 		mmc_test_print_avg_rate(test, sz, count, &ts1, &ts2);
@@ -1747,7 +1739,7 @@ static int mmc_test_profile_trim_perf(struct mmc_test_card *test)
 	struct mmc_test_area *t = &test->area;
 	unsigned long sz;
 	unsigned int dev_addr;
-	struct timespec ts1, ts2;
+	struct timespec64 ts1, ts2;
 	int ret;
 
 	if (!mmc_can_trim(test->card))
@@ -1758,19 +1750,19 @@ static int mmc_test_profile_trim_perf(struct mmc_test_card *test)
 
 	for (sz = 512; sz < t->max_sz; sz <<= 1) {
 		dev_addr = t->dev_addr + (sz >> 9);
-		getnstimeofday(&ts1);
+		ktime_get_ts64(&ts1);
 		ret = mmc_erase(test->card, dev_addr, sz >> 9, MMC_TRIM_ARG);
 		if (ret)
 			return ret;
-		getnstimeofday(&ts2);
+		ktime_get_ts64(&ts2);
 		mmc_test_print_rate(test, sz, &ts1, &ts2);
 	}
 	dev_addr = t->dev_addr;
-	getnstimeofday(&ts1);
+	ktime_get_ts64(&ts1);
 	ret = mmc_erase(test->card, dev_addr, sz >> 9, MMC_TRIM_ARG);
 	if (ret)
 		return ret;
-	getnstimeofday(&ts2);
+	ktime_get_ts64(&ts2);
 	mmc_test_print_rate(test, sz, &ts1, &ts2);
 	return 0;
 }
@@ -1779,19 +1771,19 @@ static int mmc_test_seq_read_perf(struct mmc_test_card *test, unsigned long sz)
 {
 	struct mmc_test_area *t = &test->area;
 	unsigned int dev_addr, i, cnt;
-	struct timespec ts1, ts2;
+	struct timespec64 ts1, ts2;
 	int ret;
 
 	cnt = t->max_sz / sz;
 	dev_addr = t->dev_addr;
-	getnstimeofday(&ts1);
+	ktime_get_ts64(&ts1);
 	for (i = 0; i < cnt; i++) {
 		ret = mmc_test_area_io(test, sz, dev_addr, 0, 0, 0);
 		if (ret)
 			return ret;
 		dev_addr += (sz >> 9);
 	}
-	getnstimeofday(&ts2);
+	ktime_get_ts64(&ts2);
 	mmc_test_print_avg_rate(test, sz, cnt, &ts1, &ts2);
 	return 0;
 }
@@ -1818,7 +1810,7 @@ static int mmc_test_seq_write_perf(struct mmc_test_card *test, unsigned long sz)
 {
 	struct mmc_test_area *t = &test->area;
 	unsigned int dev_addr, i, cnt;
-	struct timespec ts1, ts2;
+	struct timespec64 ts1, ts2;
 	int ret;
 
 	ret = mmc_test_area_erase(test);
@@ -1826,14 +1818,14 @@ static int mmc_test_seq_write_perf(struct mmc_test_card *test, unsigned long sz)
 		return ret;
 	cnt = t->max_sz / sz;
 	dev_addr = t->dev_addr;
-	getnstimeofday(&ts1);
+	ktime_get_ts64(&ts1);
 	for (i = 0; i < cnt; i++) {
 		ret = mmc_test_area_io(test, sz, dev_addr, 1, 0, 0);
 		if (ret)
 			return ret;
 		dev_addr += (sz >> 9);
 	}
-	getnstimeofday(&ts2);
+	ktime_get_ts64(&ts2);
 	mmc_test_print_avg_rate(test, sz, cnt, &ts1, &ts2);
 	return 0;
 }
@@ -1864,7 +1856,7 @@ static int mmc_test_profile_seq_trim_perf(struct mmc_test_card *test)
 	struct mmc_test_area *t = &test->area;
 	unsigned long sz;
 	unsigned int dev_addr, i, cnt;
-	struct timespec ts1, ts2;
+	struct timespec64 ts1, ts2;
 	int ret;
 
 	if (!mmc_can_trim(test->card))
@@ -1882,7 +1874,7 @@ static int mmc_test_profile_seq_trim_perf(struct mmc_test_card *test)
 			return ret;
 		cnt = t->max_sz / sz;
 		dev_addr = t->dev_addr;
-		getnstimeofday(&ts1);
+		ktime_get_ts64(&ts1);
 		for (i = 0; i < cnt; i++) {
 			ret = mmc_erase(test->card, dev_addr, sz >> 9,
 					MMC_TRIM_ARG);
@@ -1890,7 +1882,7 @@ static int mmc_test_profile_seq_trim_perf(struct mmc_test_card *test)
 				return ret;
 			dev_addr += (sz >> 9);
 		}
-		getnstimeofday(&ts2);
+		ktime_get_ts64(&ts2);
 		mmc_test_print_avg_rate(test, sz, cnt, &ts1, &ts2);
 	}
 	return 0;
@@ -1912,7 +1904,7 @@ static int mmc_test_rnd_perf(struct mmc_test_card *test, int write, int print,
 {
 	unsigned int dev_addr, cnt, rnd_addr, range1, range2, last_ea = 0, ea;
 	unsigned int ssz;
-	struct timespec ts1, ts2, ts;
+	struct timespec64 ts1, ts2, ts;
 	int ret;
 
 	ssz = sz >> 9;
@@ -1921,10 +1913,10 @@ static int mmc_test_rnd_perf(struct mmc_test_card *test, int write, int print,
 	range1 = rnd_addr / test->card->pref_erase;
 	range2 = range1 / ssz;
 
-	getnstimeofday(&ts1);
+	ktime_get_ts64(&ts1);
 	for (cnt = 0; cnt < UINT_MAX; cnt++) {
-		getnstimeofday(&ts2);
-		ts = timespec_sub(ts2, ts1);
+		ktime_get_ts64(&ts2);
+		ts = timespec64_sub(ts2, ts1);
 		if (ts.tv_sec >= 10)
 			break;
 		ea = mmc_test_rnd_num(range1);
@@ -1998,7 +1990,7 @@ static int mmc_test_seq_perf(struct mmc_test_card *test, int write,
 {
 	struct mmc_test_area *t = &test->area;
 	unsigned int dev_addr, i, cnt, sz, ssz;
-	struct timespec ts1, ts2;
+	struct timespec64 ts1, ts2;
 	int ret;
 
 	sz = t->max_tfr;
@@ -2025,7 +2017,7 @@ static int mmc_test_seq_perf(struct mmc_test_card *test, int write,
 	cnt = tot_sz / sz;
 	dev_addr &= 0xffff0000; /* Round to 64MiB boundary */
 
-	getnstimeofday(&ts1);
+	ktime_get_ts64(&ts1);
 	for (i = 0; i < cnt; i++) {
 		ret = mmc_test_area_io(test, sz, dev_addr, write,
 				       max_scatter, 0);
@@ -2033,7 +2025,7 @@ static int mmc_test_seq_perf(struct mmc_test_card *test, int write,
 			return ret;
 		dev_addr += ssz;
 	}
-	getnstimeofday(&ts2);
+	ktime_get_ts64(&ts2);
 
 	mmc_test_print_avg_rate(test, sz, cnt, &ts1, &ts2);
 
@@ -2328,10 +2320,17 @@ static int mmc_test_reset(struct mmc_test_card *test)
 	int err;
 
 	err = mmc_hw_reset(host);
-	if (!err)
+	if (!err) {
+		/*
+		 * Reset will re-enable the card's command queue, but tests
+		 * expect it to be disabled.
+		 */
+		if (card->ext_csd.cmdq_en)
+			mmc_cmdq_disable(card);
 		return RESULT_OK;
-	else if (err == -EOPNOTSUPP)
+	} else if (err == -EOPNOTSUPP) {
 		return RESULT_UNSUP_HOST;
+	}
 
 	return RESULT_FAIL;
 }
@@ -2356,11 +2355,9 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 	struct mmc_test_req *rq = mmc_test_req_alloc();
 	struct mmc_host *host = test->card->host;
 	struct mmc_test_area *t = &test->area;
-	struct mmc_test_async_req test_areq = { .test = test };
 	struct mmc_request *mrq;
 	unsigned long timeout;
 	bool expired = false;
-	enum mmc_blk_status blkstat = MMC_BLK_SUCCESS;
 	int ret = 0, cmd_ret;
 	u32 status = 0;
 	int count = 0;
@@ -2373,9 +2370,6 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 		mrq->sbc = &rq->sbc;
 	mrq->cap_cmd_during_tfr = true;
 
-	test_areq.areq.mrq = mrq;
-	test_areq.areq.err_check = mmc_test_check_result_async;
-
 	mmc_test_prepare_mrq(test, mrq, t->sg, t->sg_len, dev_addr, t->blocks,
 			     512, write);
 
@@ -2388,11 +2382,9 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 
 	/* Start ongoing data request */
 	if (use_areq) {
-		mmc_start_areq(host, &test_areq.areq, &blkstat);
-		if (blkstat != MMC_BLK_SUCCESS) {
-			ret = RESULT_FAIL;
+		ret = mmc_test_start_areq(test, mrq, NULL);
+		if (ret)
 			goto out_free;
-		}
 	} else {
 		mmc_wait_for_req(host, mrq);
 	}
@@ -2426,9 +2418,7 @@ static int mmc_test_ongoing_transfer(struct mmc_test_card *test,
 
 	/* Wait for data request to complete */
 	if (use_areq) {
-		mmc_start_areq(host, NULL, &blkstat);
-		if (blkstat != MMC_BLK_SUCCESS)
-			ret = RESULT_FAIL;
+		ret = mmc_test_start_areq(test, NULL, mrq);
 	} else {
 		mmc_wait_for_req_done(test->card->host, mrq);
 	}
@@ -3066,10 +3056,9 @@ static int mtf_test_show(struct seq_file *sf, void *data)
 		seq_printf(sf, "Test %d: %d\n", gr->testcase + 1, gr->result);
 
 		list_for_each_entry(tr, &gr->tr_lst, link) {
-			seq_printf(sf, "%u %d %lu.%09lu %u %u.%02u\n",
+			seq_printf(sf, "%u %d %llu.%09u %u %u.%02u\n",
 				tr->count, tr->sectors,
-				(unsigned long)tr->ts.tv_sec,
-				(unsigned long)tr->ts.tv_nsec,
+				(u64)tr->ts.tv_sec, (u32)tr->ts.tv_nsec,
 				tr->rate, tr->iops / 100, tr->iops % 100);
 		}
 	}
diff --git a/drivers/mmc/core/queue.c b/drivers/mmc/core/queue.c
index 4f33d27..421fab7 100644
--- a/drivers/mmc/core/queue.c
+++ b/drivers/mmc/core/queue.c
@@ -22,100 +22,147 @@
 #include "block.h"
 #include "core.h"
 #include "card.h"
+#include "host.h"
 
-/*
- * Prepare a MMC request. This just filters out odd stuff.
- */
-static int mmc_prep_request(struct request_queue *q, struct request *req)
+static inline bool mmc_cqe_dcmd_busy(struct mmc_queue *mq)
 {
-	struct mmc_queue *mq = q->queuedata;
-
-	if (mq && mmc_card_removed(mq->card))
-		return BLKPREP_KILL;
-
-	req->rq_flags |= RQF_DONTPREP;
-
-	return BLKPREP_OK;
+	/* Allow only 1 DCMD at a time */
+	return mq->in_flight[MMC_ISSUE_DCMD];
 }
 
-static int mmc_queue_thread(void *d)
+void mmc_cqe_check_busy(struct mmc_queue *mq)
 {
-	struct mmc_queue *mq = d;
+	if ((mq->cqe_busy & MMC_CQE_DCMD_BUSY) && !mmc_cqe_dcmd_busy(mq))
+		mq->cqe_busy &= ~MMC_CQE_DCMD_BUSY;
+
+	mq->cqe_busy &= ~MMC_CQE_QUEUE_FULL;
+}
+
+static inline bool mmc_cqe_can_dcmd(struct mmc_host *host)
+{
+	return host->caps2 & MMC_CAP2_CQE_DCMD;
+}
+
+static enum mmc_issue_type mmc_cqe_issue_type(struct mmc_host *host,
+					      struct request *req)
+{
+	switch (req_op(req)) {
+	case REQ_OP_DRV_IN:
+	case REQ_OP_DRV_OUT:
+	case REQ_OP_DISCARD:
+	case REQ_OP_SECURE_ERASE:
+		return MMC_ISSUE_SYNC;
+	case REQ_OP_FLUSH:
+		return mmc_cqe_can_dcmd(host) ? MMC_ISSUE_DCMD : MMC_ISSUE_SYNC;
+	default:
+		return MMC_ISSUE_ASYNC;
+	}
+}
+
+enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req)
+{
+	struct mmc_host *host = mq->card->host;
+
+	if (mq->use_cqe)
+		return mmc_cqe_issue_type(host, req);
+
+	if (req_op(req) == REQ_OP_READ || req_op(req) == REQ_OP_WRITE)
+		return MMC_ISSUE_ASYNC;
+
+	return MMC_ISSUE_SYNC;
+}
+
+static void __mmc_cqe_recovery_notifier(struct mmc_queue *mq)
+{
+	if (!mq->recovery_needed) {
+		mq->recovery_needed = true;
+		schedule_work(&mq->recovery_work);
+	}
+}
+
+void mmc_cqe_recovery_notifier(struct mmc_request *mrq)
+{
+	struct mmc_queue_req *mqrq = container_of(mrq, struct mmc_queue_req,
+						  brq.mrq);
+	struct request *req = mmc_queue_req_to_req(mqrq);
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+	__mmc_cqe_recovery_notifier(mq);
+	spin_unlock_irqrestore(q->queue_lock, flags);
+}
+
+static enum blk_eh_timer_return mmc_cqe_timed_out(struct request *req)
+{
+	struct mmc_queue_req *mqrq = req_to_mmc_queue_req(req);
+	struct mmc_request *mrq = &mqrq->brq.mrq;
+	struct mmc_queue *mq = req->q->queuedata;
+	struct mmc_host *host = mq->card->host;
+	enum mmc_issue_type issue_type = mmc_issue_type(mq, req);
+	bool recovery_needed = false;
+
+	switch (issue_type) {
+	case MMC_ISSUE_ASYNC:
+	case MMC_ISSUE_DCMD:
+		if (host->cqe_ops->cqe_timeout(host, mrq, &recovery_needed)) {
+			if (recovery_needed)
+				__mmc_cqe_recovery_notifier(mq);
+			return BLK_EH_RESET_TIMER;
+		}
+		/* No timeout */
+		return BLK_EH_HANDLED;
+	default:
+		/* Timeout is handled by mmc core */
+		return BLK_EH_RESET_TIMER;
+	}
+}
+
+static enum blk_eh_timer_return mmc_mq_timed_out(struct request *req,
+						 bool reserved)
+{
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	unsigned long flags;
+	int ret;
+
+	spin_lock_irqsave(q->queue_lock, flags);
+
+	if (mq->recovery_needed || !mq->use_cqe)
+		ret = BLK_EH_RESET_TIMER;
+	else
+		ret = mmc_cqe_timed_out(req);
+
+	spin_unlock_irqrestore(q->queue_lock, flags);
+
+	return ret;
+}
+
+static void mmc_mq_recovery_handler(struct work_struct *work)
+{
+	struct mmc_queue *mq = container_of(work, struct mmc_queue,
+					    recovery_work);
 	struct request_queue *q = mq->queue;
-	struct mmc_context_info *cntx = &mq->card->host->context_info;
 
-	current->flags |= PF_MEMALLOC;
+	mmc_get_card(mq->card, &mq->ctx);
 
-	down(&mq->thread_sem);
-	do {
-		struct request *req;
+	mq->in_recovery = true;
 
-		spin_lock_irq(q->queue_lock);
-		set_current_state(TASK_INTERRUPTIBLE);
-		req = blk_fetch_request(q);
-		mq->asleep = false;
-		cntx->is_waiting_last_req = false;
-		cntx->is_new_req = false;
-		if (!req) {
-			/*
-			 * Dispatch queue is empty so set flags for
-			 * mmc_request_fn() to wake us up.
-			 */
-			if (mq->qcnt)
-				cntx->is_waiting_last_req = true;
-			else
-				mq->asleep = true;
-		}
-		spin_unlock_irq(q->queue_lock);
+	if (mq->use_cqe)
+		mmc_blk_cqe_recovery(mq);
+	else
+		mmc_blk_mq_recovery(mq);
 
-		if (req || mq->qcnt) {
-			set_current_state(TASK_RUNNING);
-			mmc_blk_issue_rq(mq, req);
-			cond_resched();
-		} else {
-			if (kthread_should_stop()) {
-				set_current_state(TASK_RUNNING);
-				break;
-			}
-			up(&mq->thread_sem);
-			schedule();
-			down(&mq->thread_sem);
-		}
-	} while (1);
-	up(&mq->thread_sem);
+	mq->in_recovery = false;
 
-	return 0;
-}
+	spin_lock_irq(q->queue_lock);
+	mq->recovery_needed = false;
+	spin_unlock_irq(q->queue_lock);
 
-/*
- * Generic MMC request handler.  This is called for any queue on a
- * particular host.  When the host is not busy, we look for a request
- * on any queue on this host, and attempt to issue it.  This may
- * not be the queue we were asked to process.
- */
-static void mmc_request_fn(struct request_queue *q)
-{
-	struct mmc_queue *mq = q->queuedata;
-	struct request *req;
-	struct mmc_context_info *cntx;
+	mmc_put_card(mq->card, &mq->ctx);
 
-	if (!mq) {
-		while ((req = blk_fetch_request(q)) != NULL) {
-			req->rq_flags |= RQF_QUIET;
-			__blk_end_request_all(req, BLK_STS_IOERR);
-		}
-		return;
-	}
-
-	cntx = &mq->card->host->context_info;
-
-	if (cntx->is_waiting_last_req) {
-		cntx->is_new_req = true;
-		wake_up_interruptible(&cntx->wait);
-	}
-
-	if (mq->asleep)
-		wake_up_process(mq->thread);
+	blk_mq_run_hw_queues(q, true);
 }
 
 static struct scatterlist *mmc_alloc_sg(int sg_len, gfp_t gfp)
@@ -154,11 +201,10 @@ static void mmc_queue_setup_discard(struct request_queue *q,
  * @req: the request
  * @gfp: memory allocation policy
  */
-static int mmc_init_request(struct request_queue *q, struct request *req,
-			    gfp_t gfp)
+static int __mmc_init_request(struct mmc_queue *mq, struct request *req,
+			      gfp_t gfp)
 {
 	struct mmc_queue_req *mq_rq = req_to_mmc_queue_req(req);
-	struct mmc_queue *mq = q->queuedata;
 	struct mmc_card *card = mq->card;
 	struct mmc_host *host = card->host;
 
@@ -177,6 +223,131 @@ static void mmc_exit_request(struct request_queue *q, struct request *req)
 	mq_rq->sg = NULL;
 }
 
+static int mmc_mq_init_request(struct blk_mq_tag_set *set, struct request *req,
+			       unsigned int hctx_idx, unsigned int numa_node)
+{
+	return __mmc_init_request(set->driver_data, req, GFP_KERNEL);
+}
+
+static void mmc_mq_exit_request(struct blk_mq_tag_set *set, struct request *req,
+				unsigned int hctx_idx)
+{
+	struct mmc_queue *mq = set->driver_data;
+
+	mmc_exit_request(mq->queue, req);
+}
+
+/*
+ * We use BLK_MQ_F_BLOCKING and have only 1 hardware queue, which means requests
+ * will not be dispatched in parallel.
+ */
+static blk_status_t mmc_mq_queue_rq(struct blk_mq_hw_ctx *hctx,
+				    const struct blk_mq_queue_data *bd)
+{
+	struct request *req = bd->rq;
+	struct request_queue *q = req->q;
+	struct mmc_queue *mq = q->queuedata;
+	struct mmc_card *card = mq->card;
+	struct mmc_host *host = card->host;
+	enum mmc_issue_type issue_type;
+	enum mmc_issued issued;
+	bool get_card, cqe_retune_ok;
+	int ret;
+
+	if (mmc_card_removed(mq->card)) {
+		req->rq_flags |= RQF_QUIET;
+		return BLK_STS_IOERR;
+	}
+
+	issue_type = mmc_issue_type(mq, req);
+
+	spin_lock_irq(q->queue_lock);
+
+	if (mq->recovery_needed) {
+		spin_unlock_irq(q->queue_lock);
+		return BLK_STS_RESOURCE;
+	}
+
+	switch (issue_type) {
+	case MMC_ISSUE_DCMD:
+		if (mmc_cqe_dcmd_busy(mq)) {
+			mq->cqe_busy |= MMC_CQE_DCMD_BUSY;
+			spin_unlock_irq(q->queue_lock);
+			return BLK_STS_RESOURCE;
+		}
+		break;
+	case MMC_ISSUE_ASYNC:
+		break;
+	default:
+		/*
+		 * Timeouts are handled by mmc core, and we don't have a host
+		 * API to abort requests, so we can't handle the timeout anyway.
+		 * However, when the timeout happens, blk_mq_complete_request()
+		 * no longer works (to stop the request disappearing under us).
+		 * To avoid racing with that, set a large timeout.
+		 */
+		req->timeout = 600 * HZ;
+		break;
+	}
+
+	mq->in_flight[issue_type] += 1;
+	get_card = (mmc_tot_in_flight(mq) == 1);
+	cqe_retune_ok = (mmc_cqe_qcnt(mq) == 1);
+
+	spin_unlock_irq(q->queue_lock);
+
+	if (!(req->rq_flags & RQF_DONTPREP)) {
+		req_to_mmc_queue_req(req)->retries = 0;
+		req->rq_flags |= RQF_DONTPREP;
+	}
+
+	if (get_card)
+		mmc_get_card(card, &mq->ctx);
+
+	if (mq->use_cqe) {
+		host->retune_now = host->need_retune && cqe_retune_ok &&
+				   !host->hold_retune;
+	}
+
+	blk_mq_start_request(req);
+
+	issued = mmc_blk_mq_issue_rq(mq, req);
+
+	switch (issued) {
+	case MMC_REQ_BUSY:
+		ret = BLK_STS_RESOURCE;
+		break;
+	case MMC_REQ_FAILED_TO_START:
+		ret = BLK_STS_IOERR;
+		break;
+	default:
+		ret = BLK_STS_OK;
+		break;
+	}
+
+	if (issued != MMC_REQ_STARTED) {
+		bool put_card = false;
+
+		spin_lock_irq(q->queue_lock);
+		mq->in_flight[issue_type] -= 1;
+		if (mmc_tot_in_flight(mq) == 0)
+			put_card = true;
+		spin_unlock_irq(q->queue_lock);
+		if (put_card)
+			mmc_put_card(card, &mq->ctx);
+	}
+
+	return ret;
+}
+
+static const struct blk_mq_ops mmc_mq_ops = {
+	.queue_rq	= mmc_mq_queue_rq,
+	.init_request	= mmc_mq_init_request,
+	.exit_request	= mmc_mq_exit_request,
+	.complete	= mmc_blk_mq_complete,
+	.timeout	= mmc_mq_timed_out,
+};
+
 static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 {
 	struct mmc_host *host = card->host;
@@ -196,8 +367,78 @@ static void mmc_setup_queue(struct mmc_queue *mq, struct mmc_card *card)
 	blk_queue_max_segments(mq->queue, host->max_segs);
 	blk_queue_max_segment_size(mq->queue, host->max_seg_size);
 
-	/* Initialize thread_sem even if it is not used */
-	sema_init(&mq->thread_sem, 1);
+	INIT_WORK(&mq->recovery_work, mmc_mq_recovery_handler);
+	INIT_WORK(&mq->complete_work, mmc_blk_mq_complete_work);
+
+	mutex_init(&mq->complete_lock);
+
+	init_waitqueue_head(&mq->wait);
+}
+
+static int mmc_mq_init_queue(struct mmc_queue *mq, int q_depth,
+			     const struct blk_mq_ops *mq_ops, spinlock_t *lock)
+{
+	int ret;
+
+	memset(&mq->tag_set, 0, sizeof(mq->tag_set));
+	mq->tag_set.ops = mq_ops;
+	mq->tag_set.queue_depth = q_depth;
+	mq->tag_set.numa_node = NUMA_NO_NODE;
+	mq->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE |
+			    BLK_MQ_F_BLOCKING;
+	mq->tag_set.nr_hw_queues = 1;
+	mq->tag_set.cmd_size = sizeof(struct mmc_queue_req);
+	mq->tag_set.driver_data = mq;
+
+	ret = blk_mq_alloc_tag_set(&mq->tag_set);
+	if (ret)
+		return ret;
+
+	mq->queue = blk_mq_init_queue(&mq->tag_set);
+	if (IS_ERR(mq->queue)) {
+		ret = PTR_ERR(mq->queue);
+		goto free_tag_set;
+	}
+
+	mq->queue->queue_lock = lock;
+	mq->queue->queuedata = mq;
+
+	return 0;
+
+free_tag_set:
+	blk_mq_free_tag_set(&mq->tag_set);
+
+	return ret;
+}
+
+/* Set queue depth to get a reasonable value for q->nr_requests */
+#define MMC_QUEUE_DEPTH 64
+
+static int mmc_mq_init(struct mmc_queue *mq, struct mmc_card *card,
+			 spinlock_t *lock)
+{
+	struct mmc_host *host = card->host;
+	int q_depth;
+	int ret;
+
+	/*
+	 * The queue depth for CQE must match the hardware because the request
+	 * tag is used to index the hardware queue.
+	 */
+	if (mq->use_cqe)
+		q_depth = min_t(int, card->ext_csd.cmdq_depth, host->cqe_qdepth);
+	else
+		q_depth = MMC_QUEUE_DEPTH;
+
+	ret = mmc_mq_init_queue(mq, q_depth, &mmc_mq_ops, lock);
+	if (ret)
+		return ret;
+
+	blk_queue_rq_timeout(mq->queue, 60 * HZ);
+
+	mmc_setup_queue(mq, card);
+
+	return 0;
 }
 
 /**
@@ -213,108 +454,53 @@ int mmc_init_queue(struct mmc_queue *mq, struct mmc_card *card,
 		   spinlock_t *lock, const char *subname)
 {
 	struct mmc_host *host = card->host;
-	int ret = -ENOMEM;
 
 	mq->card = card;
-	mq->queue = blk_alloc_queue(GFP_KERNEL);
-	if (!mq->queue)
-		return -ENOMEM;
-	mq->queue->queue_lock = lock;
-	mq->queue->request_fn = mmc_request_fn;
-	mq->queue->init_rq_fn = mmc_init_request;
-	mq->queue->exit_rq_fn = mmc_exit_request;
-	mq->queue->cmd_size = sizeof(struct mmc_queue_req);
-	mq->queue->queuedata = mq;
-	mq->qcnt = 0;
-	ret = blk_init_allocated_queue(mq->queue);
-	if (ret) {
-		blk_cleanup_queue(mq->queue);
-		return ret;
-	}
 
-	blk_queue_prep_rq(mq->queue, mmc_prep_request);
+	mq->use_cqe = host->cqe_enabled;
 
-	mmc_setup_queue(mq, card);
+	return mmc_mq_init(mq, card, lock);
+}
 
-	mq->thread = kthread_run(mmc_queue_thread, mq, "mmcqd/%d%s",
-		host->index, subname ? subname : "");
+void mmc_queue_suspend(struct mmc_queue *mq)
+{
+	blk_mq_quiesce_queue(mq->queue);
 
-	if (IS_ERR(mq->thread)) {
-		ret = PTR_ERR(mq->thread);
-		goto cleanup_queue;
-	}
+	/*
+	 * The host remains claimed while there are outstanding requests, so
+	 * simply claiming and releasing here ensures there are none.
+	 */
+	mmc_claim_host(mq->card->host);
+	mmc_release_host(mq->card->host);
+}
 
-	return 0;
-
-cleanup_queue:
-	blk_cleanup_queue(mq->queue);
-	return ret;
+void mmc_queue_resume(struct mmc_queue *mq)
+{
+	blk_mq_unquiesce_queue(mq->queue);
 }
 
 void mmc_cleanup_queue(struct mmc_queue *mq)
 {
 	struct request_queue *q = mq->queue;
-	unsigned long flags;
 
-	/* Make sure the queue isn't suspended, as that will deadlock */
-	mmc_queue_resume(mq);
+	/*
+	 * The legacy code handled the possibility of being suspended,
+	 * so do that here too.
+	 */
+	if (blk_queue_quiesced(q))
+		blk_mq_unquiesce_queue(q);
 
-	/* Then terminate our worker thread */
-	kthread_stop(mq->thread);
+	blk_cleanup_queue(q);
 
-	/* Empty the queue */
-	spin_lock_irqsave(q->queue_lock, flags);
-	q->queuedata = NULL;
-	blk_start_queue(q);
-	spin_unlock_irqrestore(q->queue_lock, flags);
+	/*
+	 * A request can be completed before the next request, potentially
+	 * leaving a complete_work with nothing to do. Such a work item might
+	 * still be queued at this point. Flush it.
+	 */
+	flush_work(&mq->complete_work);
 
 	mq->card = NULL;
 }
-EXPORT_SYMBOL(mmc_cleanup_queue);
-
-/**
- * mmc_queue_suspend - suspend a MMC request queue
- * @mq: MMC queue to suspend
- *
- * Stop the block request queue, and wait for our thread to
- * complete any outstanding requests.  This ensures that we
- * won't suspend while a request is being processed.
- */
-void mmc_queue_suspend(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-	unsigned long flags;
-
-	if (!mq->suspended) {
-		mq->suspended |= true;
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_stop_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-
-		down(&mq->thread_sem);
-	}
-}
-
-/**
- * mmc_queue_resume - resume a previously suspended MMC request queue
- * @mq: MMC queue to resume
- */
-void mmc_queue_resume(struct mmc_queue *mq)
-{
-	struct request_queue *q = mq->queue;
-	unsigned long flags;
-
-	if (mq->suspended) {
-		mq->suspended = false;
-
-		up(&mq->thread_sem);
-
-		spin_lock_irqsave(q->queue_lock, flags);
-		blk_start_queue(q);
-		spin_unlock_irqrestore(q->queue_lock, flags);
-	}
-}
 
 /*
  * Prepare the sg list(s) to be handed of to the host driver
diff --git a/drivers/mmc/core/queue.h b/drivers/mmc/core/queue.h
index 547b457..17e59d5 100644
--- a/drivers/mmc/core/queue.h
+++ b/drivers/mmc/core/queue.h
@@ -8,6 +8,20 @@
 #include <linux/mmc/core.h>
 #include <linux/mmc/host.h>
 
+enum mmc_issued {
+	MMC_REQ_STARTED,
+	MMC_REQ_BUSY,
+	MMC_REQ_FAILED_TO_START,
+	MMC_REQ_FINISHED,
+};
+
+enum mmc_issue_type {
+	MMC_ISSUE_SYNC,
+	MMC_ISSUE_DCMD,
+	MMC_ISSUE_ASYNC,
+	MMC_ISSUE_MAX,
+};
+
 static inline struct mmc_queue_req *req_to_mmc_queue_req(struct request *rq)
 {
 	return blk_mq_rq_to_pdu(rq);
@@ -20,7 +34,6 @@ static inline struct request *mmc_queue_req_to_req(struct mmc_queue_req *mqr)
 	return blk_mq_rq_from_pdu(mqr);
 }
 
-struct task_struct;
 struct mmc_blk_data;
 struct mmc_blk_ioc_data;
 
@@ -30,7 +43,6 @@ struct mmc_blk_request {
 	struct mmc_command	cmd;
 	struct mmc_command	stop;
 	struct mmc_data		data;
-	int			retune_retry_done;
 };
 
 /**
@@ -52,28 +64,34 @@ enum mmc_drv_op {
 struct mmc_queue_req {
 	struct mmc_blk_request	brq;
 	struct scatterlist	*sg;
-	struct mmc_async_req	areq;
 	enum mmc_drv_op		drv_op;
 	int			drv_op_result;
 	void			*drv_op_data;
 	unsigned int		ioc_count;
+	int			retries;
 };
 
 struct mmc_queue {
 	struct mmc_card		*card;
-	struct task_struct	*thread;
-	struct semaphore	thread_sem;
-	bool			suspended;
-	bool			asleep;
+	struct mmc_ctx		ctx;
+	struct blk_mq_tag_set	tag_set;
 	struct mmc_blk_data	*blkdata;
 	struct request_queue	*queue;
-	/*
-	 * FIXME: this counter is not a very reliable way of keeping
-	 * track of how many requests that are ongoing. Switch to just
-	 * letting the block core keep track of requests and per-request
-	 * associated mmc_queue_req data.
-	 */
-	int			qcnt;
+	int			in_flight[MMC_ISSUE_MAX];
+	unsigned int		cqe_busy;
+#define MMC_CQE_DCMD_BUSY	BIT(0)
+#define MMC_CQE_QUEUE_FULL	BIT(1)
+	bool			use_cqe;
+	bool			recovery_needed;
+	bool			in_recovery;
+	bool			rw_wait;
+	bool			waiting;
+	struct work_struct	recovery_work;
+	wait_queue_head_t	wait;
+	struct request		*recovery_req;
+	struct request		*complete_req;
+	struct mutex		complete_lock;
+	struct work_struct	complete_work;
 };
 
 extern int mmc_init_queue(struct mmc_queue *, struct mmc_card *, spinlock_t *,
@@ -84,4 +102,22 @@ extern void mmc_queue_resume(struct mmc_queue *);
 extern unsigned int mmc_queue_map_sg(struct mmc_queue *,
 				     struct mmc_queue_req *);
 
+void mmc_cqe_check_busy(struct mmc_queue *mq);
+void mmc_cqe_recovery_notifier(struct mmc_request *mrq);
+
+enum mmc_issue_type mmc_issue_type(struct mmc_queue *mq, struct request *req);
+
+static inline int mmc_tot_in_flight(struct mmc_queue *mq)
+{
+	return mq->in_flight[MMC_ISSUE_SYNC] +
+	       mq->in_flight[MMC_ISSUE_DCMD] +
+	       mq->in_flight[MMC_ISSUE_ASYNC];
+}
+
+static inline int mmc_cqe_qcnt(struct mmc_queue *mq)
+{
+	return mq->in_flight[MMC_ISSUE_DCMD] +
+	       mq->in_flight[MMC_ISSUE_ASYNC];
+}
+
 #endif
diff --git a/drivers/mmc/core/slot-gpio.c b/drivers/mmc/core/slot-gpio.c
index 863f1db..3698b05 100644
--- a/drivers/mmc/core/slot-gpio.c
+++ b/drivers/mmc/core/slot-gpio.c
@@ -121,20 +121,18 @@ EXPORT_SYMBOL(mmc_gpio_request_ro);
 void mmc_gpiod_request_cd_irq(struct mmc_host *host)
 {
 	struct mmc_gpio *ctx = host->slot.handler_priv;
-	int ret, irq;
+	int irq = -EINVAL;
+	int ret;
 
 	if (host->slot.cd_irq >= 0 || !ctx || !ctx->cd_gpio)
 		return;
 
-	irq = gpiod_to_irq(ctx->cd_gpio);
-
 	/*
-	 * Even if gpiod_to_irq() returns a valid IRQ number, the platform might
-	 * still prefer to poll, e.g., because that IRQ number is already used
-	 * by another unit and cannot be shared.
+	 * Do not use IRQ if the platform prefers to poll, e.g., because that
+	 * IRQ number is already used by another unit and cannot be shared.
 	 */
-	if (irq >= 0 && host->caps & MMC_CAP_NEEDS_POLL)
-		irq = -EINVAL;
+	if (!(host->caps & MMC_CAP_NEEDS_POLL))
+		irq = gpiod_to_irq(ctx->cd_gpio);
 
 	if (irq >= 0) {
 		if (!ctx->cd_gpio_isr)
@@ -307,3 +305,11 @@ int mmc_gpiod_request_ro(struct mmc_host *host, const char *con_id,
 	return 0;
 }
 EXPORT_SYMBOL(mmc_gpiod_request_ro);
+
+bool mmc_can_gpio_ro(struct mmc_host *host)
+{
+	struct mmc_gpio *ctx = host->slot.handler_priv;
+
+	return ctx->ro_gpio ? true : false;
+}
+EXPORT_SYMBOL(mmc_can_gpio_ro);
diff --git a/drivers/mmc/host/Kconfig b/drivers/mmc/host/Kconfig
index 567028c..67bd334 100644
--- a/drivers/mmc/host/Kconfig
+++ b/drivers/mmc/host/Kconfig
@@ -81,6 +81,7 @@
 config MMC_SDHCI_PCI
 	tristate "SDHCI support on PCI bus"
 	depends on MMC_SDHCI && PCI
+	select MMC_CQHCI
 	help
 	  This selects the PCI Secure Digital Host Controller Interface.
 	  Most controllers found today are PCI devices.
@@ -132,6 +133,7 @@
 	depends on MMC_SDHCI_PLTFM
 	depends on OF
 	depends on COMMON_CLK
+	select MMC_CQHCI
 	help
 	  This selects the Arasan Secure Digital Host Controller Interface
 	  (SDHCI). This hardware is found e.g. in Xilinx' Zynq SoC.
@@ -320,7 +322,7 @@
 config MMC_SDHCI_F_SDH30
 	tristate "SDHCI support for Fujitsu Semiconductor F_SDH30"
 	depends on MMC_SDHCI_PLTFM
-	depends on OF
+	depends on OF || ACPI
 	help
 	  This selects the Secure Digital Host Controller Interface (SDHCI)
 	  Needed by some Fujitsu SoC for MMC / SD / SDIO support.
@@ -595,11 +597,8 @@
 
 config MMC_SDHI
 	tristate "Renesas SDHI SD/SDIO controller support"
-	depends on SUPERH || ARM || ARM64
 	depends on SUPERH || ARCH_RENESAS || COMPILE_TEST
 	select MMC_TMIO_CORE
-	select MMC_SDHI_SYS_DMAC if (SUPERH || ARM)
-	select MMC_SDHI_INTERNAL_DMAC if ARM64
 	help
 	  This provides support for the SDHI SD/SDIO controller found in
 	  Renesas SuperH, ARM and ARM64 based SoCs
@@ -607,6 +606,7 @@
 config MMC_SDHI_SYS_DMAC
 	tristate "DMA for SDHI SD/SDIO controllers using SYS-DMAC"
 	depends on MMC_SDHI
+	default MMC_SDHI if (SUPERH || ARM)
 	help
 	  This provides DMA support for SDHI SD/SDIO controllers
 	  using SYS-DMAC via DMA Engine. This supports the controllers
@@ -616,6 +616,7 @@
 	tristate "DMA for SDHI SD/SDIO controllers using on-chip bus mastering"
 	depends on ARM64 || COMPILE_TEST
 	depends on MMC_SDHI
+	default MMC_SDHI if ARM64
 	help
 	  This provides DMA support for SDHI SD/SDIO controllers
 	  using on-chip bus mastering. This supports the controllers
@@ -838,14 +839,14 @@
 
 config MMC_REALTEK_PCI
 	tristate "Realtek PCI-E SD/MMC Card Interface Driver"
-	depends on MFD_RTSX_PCI
+	depends on MISC_RTSX_PCI
 	help
 	  Say Y here to include driver code to support SD/MMC card interface
 	  of Realtek PCI-E card reader
 
 config MMC_REALTEK_USB
 	tristate "Realtek USB SD/MMC Card Interface Driver"
-	depends on MFD_RTSX_USB
+	depends on MISC_RTSX_USB
 	help
 	  Say Y here to include driver code to support SD/MMC card interface
 	  of Realtek RTS5129/39 series card reader
@@ -857,6 +858,19 @@
 	  This selects support for the SD/MMC Host Controller on
 	  Allwinner sunxi SoCs.
 
+config MMC_CQHCI
+	tristate "Command Queue Host Controller Interface support"
+	depends on HAS_DMA
+	help
+	  This selects the Command Queue Host Controller Interface (CQHCI)
+	  support present in host controllers of Qualcomm Technologies, Inc
+	  amongst others.
+	  This controller supports eMMC devices with command queue support.
+
+	  If you have a controller with this interface, say Y or M here.
+
+	  If unsure, say N.
+
 config MMC_TOSHIBA_PCI
 	tristate "Toshiba Type A SD/MMC Card Interface Driver"
 	depends on PCI
diff --git a/drivers/mmc/host/Makefile b/drivers/mmc/host/Makefile
index a43cf0d..84cd138 100644
--- a/drivers/mmc/host/Makefile
+++ b/drivers/mmc/host/Makefile
@@ -11,7 +11,7 @@
 obj-$(CONFIG_MMC_MXS)		+= mxs-mmc.o
 obj-$(CONFIG_MMC_SDHCI)		+= sdhci.o
 obj-$(CONFIG_MMC_SDHCI_PCI)	+= sdhci-pci.o
-sdhci-pci-y			+= sdhci-pci-core.o sdhci-pci-o2micro.o
+sdhci-pci-y			+= sdhci-pci-core.o sdhci-pci-o2micro.o sdhci-pci-arasan.o
 obj-$(subst m,y,$(CONFIG_MMC_SDHCI_PCI))	+= sdhci-pci-data.o
 obj-$(CONFIG_MMC_SDHCI_ACPI)	+= sdhci-acpi.o
 obj-$(CONFIG_MMC_SDHCI_PXAV3)	+= sdhci-pxav3.o
@@ -39,12 +39,8 @@
 obj-$(CONFIG_MMC_TMIO)		+= tmio_mmc.o
 obj-$(CONFIG_MMC_TMIO_CORE)	+= tmio_mmc_core.o
 obj-$(CONFIG_MMC_SDHI)		+= renesas_sdhi_core.o
-ifeq ($(subst m,y,$(CONFIG_MMC_SDHI_SYS_DMAC)),y)
-obj-$(CONFIG_MMC_SDHI)		+= renesas_sdhi_sys_dmac.o
-endif
-ifeq ($(subst m,y,$(CONFIG_MMC_SDHI_INTERNAL_DMAC)),y)
-obj-$(CONFIG_MMC_SDHI)		+= renesas_sdhi_internal_dmac.o
-endif
+obj-$(CONFIG_MMC_SDHI_SYS_DMAC)		+= renesas_sdhi_sys_dmac.o
+obj-$(CONFIG_MMC_SDHI_INTERNAL_DMAC)	+= renesas_sdhi_internal_dmac.o
 obj-$(CONFIG_MMC_CB710)		+= cb710-mmc.o
 obj-$(CONFIG_MMC_VIA_SDMMC)	+= via-sdmmc.o
 obj-$(CONFIG_SDH_BFIN)		+= bfin_sdh.o
@@ -92,6 +88,7 @@
 obj-$(CONFIG_MMC_SDHCI_MICROCHIP_PIC32)	+= sdhci-pic32.o
 obj-$(CONFIG_MMC_SDHCI_BRCMSTB)		+= sdhci-brcmstb.o
 obj-$(CONFIG_MMC_SDHCI_OMAP)		+= sdhci-omap.o
+obj-$(CONFIG_MMC_CQHCI)			+= cqhci.o
 
 ifeq ($(CONFIG_CB710_DEBUG),y)
 	CFLAGS-cb710-mmc	+= -DDEBUG
diff --git a/drivers/mmc/host/android-goldfish.c b/drivers/mmc/host/android-goldfish.c
index 63fe509..63d2758 100644
--- a/drivers/mmc/host/android-goldfish.c
+++ b/drivers/mmc/host/android-goldfish.c
@@ -42,13 +42,11 @@
 #include <linux/spinlock.h>
 #include <linux/timer.h>
 #include <linux/clk.h>
-#include <linux/scatterlist.h>
 
 #include <asm/io.h>
 #include <asm/irq.h>
 
 #include <asm/types.h>
-#include <asm/io.h>
 #include <linux/uaccess.h>
 
 #define DRIVER_NAME "goldfish_mmc"
diff --git a/drivers/mmc/host/cqhci.c b/drivers/mmc/host/cqhci.c
new file mode 100644
index 0000000..159270e
--- /dev/null
+++ b/drivers/mmc/host/cqhci.c
@@ -0,0 +1,1150 @@
+/* Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+
+#include <linux/delay.h>
+#include <linux/highmem.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/dma-mapping.h>
+#include <linux/slab.h>
+#include <linux/scatterlist.h>
+#include <linux/platform_device.h>
+#include <linux/ktime.h>
+
+#include <linux/mmc/mmc.h>
+#include <linux/mmc/host.h>
+#include <linux/mmc/card.h>
+
+#include "cqhci.h"
+
+#define DCMD_SLOT 31
+#define NUM_SLOTS 32
+
+struct cqhci_slot {
+	struct mmc_request *mrq;
+	unsigned int flags;
+#define CQHCI_EXTERNAL_TIMEOUT	BIT(0)
+#define CQHCI_COMPLETED		BIT(1)
+#define CQHCI_HOST_CRC		BIT(2)
+#define CQHCI_HOST_TIMEOUT	BIT(3)
+#define CQHCI_HOST_OTHER	BIT(4)
+};
+
+static inline u8 *get_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->desc_base + (tag * cq_host->slot_sz);
+}
+
+static inline u8 *get_link_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	u8 *desc = get_desc(cq_host, tag);
+
+	return desc + cq_host->task_desc_len;
+}
+
+static inline dma_addr_t get_trans_desc_dma(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->trans_desc_dma_base +
+		(cq_host->mmc->max_segs * tag *
+		 cq_host->trans_desc_len);
+}
+
+static inline u8 *get_trans_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	return cq_host->trans_desc_base +
+		(cq_host->trans_desc_len * cq_host->mmc->max_segs * tag);
+}
+
+static void setup_trans_desc(struct cqhci_host *cq_host, u8 tag)
+{
+	u8 *link_temp;
+	dma_addr_t trans_temp;
+
+	link_temp = get_link_desc(cq_host, tag);
+	trans_temp = get_trans_desc_dma(cq_host, tag);
+
+	memset(link_temp, 0, cq_host->link_desc_len);
+	if (cq_host->link_desc_len > 8)
+		*(link_temp + 8) = 0;
+
+	if (tag == DCMD_SLOT && (cq_host->mmc->caps2 & MMC_CAP2_CQE_DCMD)) {
+		*link_temp = CQHCI_VALID(0) | CQHCI_ACT(0) | CQHCI_END(1);
+		return;
+	}
+
+	*link_temp = CQHCI_VALID(1) | CQHCI_ACT(0x6) | CQHCI_END(0);
+
+	if (cq_host->dma64) {
+		__le64 *data_addr = (__le64 __force *)(link_temp + 4);
+
+		data_addr[0] = cpu_to_le64(trans_temp);
+	} else {
+		__le32 *data_addr = (__le32 __force *)(link_temp + 4);
+
+		data_addr[0] = cpu_to_le32(trans_temp);
+	}
+}
+
+static void cqhci_set_irqs(struct cqhci_host *cq_host, u32 set)
+{
+	cqhci_writel(cq_host, set, CQHCI_ISTE);
+	cqhci_writel(cq_host, set, CQHCI_ISGE);
+}
+
+#define DRV_NAME "cqhci"
+
+#define CQHCI_DUMP(f, x...) \
+	pr_err("%s: " DRV_NAME ": " f, mmc_hostname(mmc), ## x)
+
+static void cqhci_dumpregs(struct cqhci_host *cq_host)
+{
+	struct mmc_host *mmc = cq_host->mmc;
+
+	CQHCI_DUMP("============ CQHCI REGISTER DUMP ===========\n");
+
+	CQHCI_DUMP("Caps:      0x%08x | Version:  0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CAP),
+		   cqhci_readl(cq_host, CQHCI_VER));
+	CQHCI_DUMP("Config:    0x%08x | Control:  0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CFG),
+		   cqhci_readl(cq_host, CQHCI_CTL));
+	CQHCI_DUMP("Int stat:  0x%08x | Int enab: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_IS),
+		   cqhci_readl(cq_host, CQHCI_ISTE));
+	CQHCI_DUMP("Int sig:   0x%08x | Int Coal: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_ISGE),
+		   cqhci_readl(cq_host, CQHCI_IC));
+	CQHCI_DUMP("TDL base:  0x%08x | TDL up32: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TDLBA),
+		   cqhci_readl(cq_host, CQHCI_TDLBAU));
+	CQHCI_DUMP("Doorbell:  0x%08x | TCN:      0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TDBR),
+		   cqhci_readl(cq_host, CQHCI_TCN));
+	CQHCI_DUMP("Dev queue: 0x%08x | Dev Pend: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_DQS),
+		   cqhci_readl(cq_host, CQHCI_DPT));
+	CQHCI_DUMP("Task clr:  0x%08x | SSC1:     0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_TCLR),
+		   cqhci_readl(cq_host, CQHCI_SSC1));
+	CQHCI_DUMP("SSC2:      0x%08x | DCMD rsp: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_SSC2),
+		   cqhci_readl(cq_host, CQHCI_CRDCT));
+	CQHCI_DUMP("RED mask:  0x%08x | TERRI:    0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_RMEM),
+		   cqhci_readl(cq_host, CQHCI_TERRI));
+	CQHCI_DUMP("Resp idx:  0x%08x | Resp arg: 0x%08x\n",
+		   cqhci_readl(cq_host, CQHCI_CRI),
+		   cqhci_readl(cq_host, CQHCI_CRA));
+
+	if (cq_host->ops->dumpregs)
+		cq_host->ops->dumpregs(mmc);
+	else
+		CQHCI_DUMP(": ===========================================\n");
+}
+
+/**
+ * The allocated descriptor table for task, link & transfer descritors
+ * looks like:
+ * |----------|
+ * |task desc |  |->|----------|
+ * |----------|  |  |trans desc|
+ * |link desc-|->|  |----------|
+ * |----------|          .
+ *      .                .
+ *  no. of slots      max-segs
+ *      .           |----------|
+ * |----------|
+ * The idea here is to create the [task+trans] table and mark & point the
+ * link desc to the transfer desc table on a per slot basis.
+ */
+static int cqhci_host_alloc_tdl(struct cqhci_host *cq_host)
+{
+	int i = 0;
+
+	/* task descriptor can be 64/128 bit irrespective of arch */
+	if (cq_host->caps & CQHCI_TASK_DESC_SZ_128) {
+		cqhci_writel(cq_host, cqhci_readl(cq_host, CQHCI_CFG) |
+			       CQHCI_TASK_DESC_SZ, CQHCI_CFG);
+		cq_host->task_desc_len = 16;
+	} else {
+		cq_host->task_desc_len = 8;
+	}
+
+	/*
+	 * 96 bits length of transfer desc instead of 128 bits which means
+	 * ADMA would expect next valid descriptor at the 96th bit
+	 * or 128th bit
+	 */
+	if (cq_host->dma64) {
+		if (cq_host->quirks & CQHCI_QUIRK_SHORT_TXFR_DESC_SZ)
+			cq_host->trans_desc_len = 12;
+		else
+			cq_host->trans_desc_len = 16;
+		cq_host->link_desc_len = 16;
+	} else {
+		cq_host->trans_desc_len = 8;
+		cq_host->link_desc_len = 8;
+	}
+
+	/* total size of a slot: 1 task & 1 transfer (link) */
+	cq_host->slot_sz = cq_host->task_desc_len + cq_host->link_desc_len;
+
+	cq_host->desc_size = cq_host->slot_sz * cq_host->num_slots;
+
+	cq_host->data_size = cq_host->trans_desc_len * cq_host->mmc->max_segs *
+		(cq_host->num_slots - 1);
+
+	pr_debug("%s: cqhci: desc_size: %zu data_sz: %zu slot-sz: %d\n",
+		 mmc_hostname(cq_host->mmc), cq_host->desc_size, cq_host->data_size,
+		 cq_host->slot_sz);
+
+	/*
+	 * allocate a dma-mapped chunk of memory for the descriptors
+	 * allocate a dma-mapped chunk of memory for link descriptors
+	 * setup each link-desc memory offset per slot-number to
+	 * the descriptor table.
+	 */
+	cq_host->desc_base = dmam_alloc_coherent(mmc_dev(cq_host->mmc),
+						 cq_host->desc_size,
+						 &cq_host->desc_dma_base,
+						 GFP_KERNEL);
+	cq_host->trans_desc_base = dmam_alloc_coherent(mmc_dev(cq_host->mmc),
+					      cq_host->data_size,
+					      &cq_host->trans_desc_dma_base,
+					      GFP_KERNEL);
+	if (!cq_host->desc_base || !cq_host->trans_desc_base)
+		return -ENOMEM;
+
+	pr_debug("%s: cqhci: desc-base: 0x%p trans-base: 0x%p\n desc_dma 0x%llx trans_dma: 0x%llx\n",
+		 mmc_hostname(cq_host->mmc), cq_host->desc_base, cq_host->trans_desc_base,
+		(unsigned long long)cq_host->desc_dma_base,
+		(unsigned long long)cq_host->trans_desc_dma_base);
+
+	for (; i < (cq_host->num_slots); i++)
+		setup_trans_desc(cq_host, i);
+
+	return 0;
+}
+
+static void __cqhci_enable(struct cqhci_host *cq_host)
+{
+	struct mmc_host *mmc = cq_host->mmc;
+	u32 cqcfg;
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+
+	/* Configuration must not be changed while enabled */
+	if (cqcfg & CQHCI_ENABLE) {
+		cqcfg &= ~CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+	}
+
+	cqcfg &= ~(CQHCI_DCMD | CQHCI_TASK_DESC_SZ);
+
+	if (mmc->caps2 & MMC_CAP2_CQE_DCMD)
+		cqcfg |= CQHCI_DCMD;
+
+	if (cq_host->caps & CQHCI_TASK_DESC_SZ_128)
+		cqcfg |= CQHCI_TASK_DESC_SZ;
+
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cqhci_writel(cq_host, lower_32_bits(cq_host->desc_dma_base),
+		     CQHCI_TDLBA);
+	cqhci_writel(cq_host, upper_32_bits(cq_host->desc_dma_base),
+		     CQHCI_TDLBAU);
+
+	cqhci_writel(cq_host, cq_host->rca, CQHCI_SSC2);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	cqcfg |= CQHCI_ENABLE;
+
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	mmc->cqe_on = true;
+
+	if (cq_host->ops->enable)
+		cq_host->ops->enable(mmc);
+
+	/* Ensure all writes are done before interrupts are enabled */
+	wmb();
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_MASK);
+
+	cq_host->activated = true;
+}
+
+static void __cqhci_disable(struct cqhci_host *cq_host)
+{
+	u32 cqcfg;
+
+	cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+	cqcfg &= ~CQHCI_ENABLE;
+	cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+
+	cq_host->mmc->cqe_on = false;
+
+	cq_host->activated = false;
+}
+
+int cqhci_suspend(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (cq_host->enabled)
+		__cqhci_disable(cq_host);
+
+	return 0;
+}
+EXPORT_SYMBOL(cqhci_suspend);
+
+int cqhci_resume(struct mmc_host *mmc)
+{
+	/* Re-enable is done upon first request */
+	return 0;
+}
+EXPORT_SYMBOL(cqhci_resume);
+
+static int cqhci_enable(struct mmc_host *mmc, struct mmc_card *card)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int err;
+
+	if (cq_host->enabled)
+		return 0;
+
+	cq_host->rca = card->rca;
+
+	err = cqhci_host_alloc_tdl(cq_host);
+	if (err)
+		return err;
+
+	__cqhci_enable(cq_host);
+
+	cq_host->enabled = true;
+
+#ifdef DEBUG
+	cqhci_dumpregs(cq_host);
+#endif
+	return 0;
+}
+
+/* CQHCI is idle and should halt immediately, so set a small timeout */
+#define CQHCI_OFF_TIMEOUT 100
+
+static void cqhci_off(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	ktime_t timeout;
+	bool timed_out;
+	u32 reg;
+
+	if (!cq_host->enabled || !mmc->cqe_on || cq_host->recovery_halt)
+		return;
+
+	if (cq_host->ops->disable)
+		cq_host->ops->disable(mmc, false);
+
+	cqhci_writel(cq_host, CQHCI_HALT, CQHCI_CTL);
+
+	timeout = ktime_add_us(ktime_get(), CQHCI_OFF_TIMEOUT);
+	while (1) {
+		timed_out = ktime_compare(ktime_get(), timeout) > 0;
+		reg = cqhci_readl(cq_host, CQHCI_CTL);
+		if ((reg & CQHCI_HALT) || timed_out)
+			break;
+	}
+
+	if (timed_out)
+		pr_err("%s: cqhci: CQE stuck on\n", mmc_hostname(mmc));
+	else
+		pr_debug("%s: cqhci: CQE off\n", mmc_hostname(mmc));
+
+	mmc->cqe_on = false;
+}
+
+static void cqhci_disable(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (!cq_host->enabled)
+		return;
+
+	cqhci_off(mmc);
+
+	__cqhci_disable(cq_host);
+
+	dmam_free_coherent(mmc_dev(mmc), cq_host->data_size,
+			   cq_host->trans_desc_base,
+			   cq_host->trans_desc_dma_base);
+
+	dmam_free_coherent(mmc_dev(mmc), cq_host->desc_size,
+			   cq_host->desc_base,
+			   cq_host->desc_dma_base);
+
+	cq_host->trans_desc_base = NULL;
+	cq_host->desc_base = NULL;
+
+	cq_host->enabled = false;
+}
+
+static void cqhci_prep_task_desc(struct mmc_request *mrq,
+					u64 *data, bool intr)
+{
+	u32 req_flags = mrq->data->flags;
+
+	*data = CQHCI_VALID(1) |
+		CQHCI_END(1) |
+		CQHCI_INT(intr) |
+		CQHCI_ACT(0x5) |
+		CQHCI_FORCED_PROG(!!(req_flags & MMC_DATA_FORCED_PRG)) |
+		CQHCI_DATA_TAG(!!(req_flags & MMC_DATA_DAT_TAG)) |
+		CQHCI_DATA_DIR(!!(req_flags & MMC_DATA_READ)) |
+		CQHCI_PRIORITY(!!(req_flags & MMC_DATA_PRIO)) |
+		CQHCI_QBAR(!!(req_flags & MMC_DATA_QBR)) |
+		CQHCI_REL_WRITE(!!(req_flags & MMC_DATA_REL_WR)) |
+		CQHCI_BLK_COUNT(mrq->data->blocks) |
+		CQHCI_BLK_ADDR((u64)mrq->data->blk_addr);
+
+	pr_debug("%s: cqhci: tag %d task descriptor 0x016%llx\n",
+		 mmc_hostname(mrq->host), mrq->tag, (unsigned long long)*data);
+}
+
+static int cqhci_dma_map(struct mmc_host *host, struct mmc_request *mrq)
+{
+	int sg_count;
+	struct mmc_data *data = mrq->data;
+
+	if (!data)
+		return -EINVAL;
+
+	sg_count = dma_map_sg(mmc_dev(host), data->sg,
+			      data->sg_len,
+			      (data->flags & MMC_DATA_WRITE) ?
+			      DMA_TO_DEVICE : DMA_FROM_DEVICE);
+	if (!sg_count) {
+		pr_err("%s: sg-len: %d\n", __func__, data->sg_len);
+		return -ENOMEM;
+	}
+
+	return sg_count;
+}
+
+static void cqhci_set_tran_desc(u8 *desc, dma_addr_t addr, int len, bool end,
+				bool dma64)
+{
+	__le32 *attr = (__le32 __force *)desc;
+
+	*attr = (CQHCI_VALID(1) |
+		 CQHCI_END(end ? 1 : 0) |
+		 CQHCI_INT(0) |
+		 CQHCI_ACT(0x4) |
+		 CQHCI_DAT_LENGTH(len));
+
+	if (dma64) {
+		__le64 *dataddr = (__le64 __force *)(desc + 4);
+
+		dataddr[0] = cpu_to_le64(addr);
+	} else {
+		__le32 *dataddr = (__le32 __force *)(desc + 4);
+
+		dataddr[0] = cpu_to_le32(addr);
+	}
+}
+
+static int cqhci_prep_tran_desc(struct mmc_request *mrq,
+			       struct cqhci_host *cq_host, int tag)
+{
+	struct mmc_data *data = mrq->data;
+	int i, sg_count, len;
+	bool end = false;
+	bool dma64 = cq_host->dma64;
+	dma_addr_t addr;
+	u8 *desc;
+	struct scatterlist *sg;
+
+	sg_count = cqhci_dma_map(mrq->host, mrq);
+	if (sg_count < 0) {
+		pr_err("%s: %s: unable to map sg lists, %d\n",
+				mmc_hostname(mrq->host), __func__, sg_count);
+		return sg_count;
+	}
+
+	desc = get_trans_desc(cq_host, tag);
+
+	for_each_sg(data->sg, sg, sg_count, i) {
+		addr = sg_dma_address(sg);
+		len = sg_dma_len(sg);
+
+		if ((i+1) == sg_count)
+			end = true;
+		cqhci_set_tran_desc(desc, addr, len, end, dma64);
+		desc += cq_host->trans_desc_len;
+	}
+
+	return 0;
+}
+
+static void cqhci_prep_dcmd_desc(struct mmc_host *mmc,
+				   struct mmc_request *mrq)
+{
+	u64 *task_desc = NULL;
+	u64 data = 0;
+	u8 resp_type;
+	u8 *desc;
+	__le64 *dataddr;
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	u8 timing;
+
+	if (!(mrq->cmd->flags & MMC_RSP_PRESENT)) {
+		resp_type = 0x0;
+		timing = 0x1;
+	} else {
+		if (mrq->cmd->flags & MMC_RSP_R1B) {
+			resp_type = 0x3;
+			timing = 0x0;
+		} else {
+			resp_type = 0x2;
+			timing = 0x1;
+		}
+	}
+
+	task_desc = (__le64 __force *)get_desc(cq_host, cq_host->dcmd_slot);
+	memset(task_desc, 0, cq_host->task_desc_len);
+	data |= (CQHCI_VALID(1) |
+		 CQHCI_END(1) |
+		 CQHCI_INT(1) |
+		 CQHCI_QBAR(1) |
+		 CQHCI_ACT(0x5) |
+		 CQHCI_CMD_INDEX(mrq->cmd->opcode) |
+		 CQHCI_CMD_TIMING(timing) | CQHCI_RESP_TYPE(resp_type));
+	*task_desc |= data;
+	desc = (u8 *)task_desc;
+	pr_debug("%s: cqhci: dcmd: cmd: %d timing: %d resp: %d\n",
+		 mmc_hostname(mmc), mrq->cmd->opcode, timing, resp_type);
+	dataddr = (__le64 __force *)(desc + 4);
+	dataddr[0] = cpu_to_le64((u64)mrq->cmd->arg);
+
+}
+
+static void cqhci_post_req(struct mmc_host *host, struct mmc_request *mrq)
+{
+	struct mmc_data *data = mrq->data;
+
+	if (data) {
+		dma_unmap_sg(mmc_dev(host), data->sg, data->sg_len,
+			     (data->flags & MMC_DATA_READ) ?
+			     DMA_FROM_DEVICE : DMA_TO_DEVICE);
+	}
+}
+
+static inline int cqhci_tag(struct mmc_request *mrq)
+{
+	return mrq->cmd ? DCMD_SLOT : mrq->tag;
+}
+
+static int cqhci_request(struct mmc_host *mmc, struct mmc_request *mrq)
+{
+	int err = 0;
+	u64 data = 0;
+	u64 *task_desc = NULL;
+	int tag = cqhci_tag(mrq);
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	unsigned long flags;
+
+	if (!cq_host->enabled) {
+		pr_err("%s: cqhci: not enabled\n", mmc_hostname(mmc));
+		return -EINVAL;
+	}
+
+	/* First request after resume has to re-enable */
+	if (!cq_host->activated)
+		__cqhci_enable(cq_host);
+
+	if (!mmc->cqe_on) {
+		cqhci_writel(cq_host, 0, CQHCI_CTL);
+		mmc->cqe_on = true;
+		pr_debug("%s: cqhci: CQE on\n", mmc_hostname(mmc));
+		if (cqhci_readl(cq_host, CQHCI_CTL) && CQHCI_HALT) {
+			pr_err("%s: cqhci: CQE failed to exit halt state\n",
+			       mmc_hostname(mmc));
+		}
+		if (cq_host->ops->enable)
+			cq_host->ops->enable(mmc);
+	}
+
+	if (mrq->data) {
+		task_desc = (__le64 __force *)get_desc(cq_host, tag);
+		cqhci_prep_task_desc(mrq, &data, 1);
+		*task_desc = cpu_to_le64(data);
+		err = cqhci_prep_tran_desc(mrq, cq_host, tag);
+		if (err) {
+			pr_err("%s: cqhci: failed to setup tx desc: %d\n",
+			       mmc_hostname(mmc), err);
+			return err;
+		}
+	} else {
+		cqhci_prep_dcmd_desc(mmc, mrq);
+	}
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+
+	if (cq_host->recovery_halt) {
+		err = -EBUSY;
+		goto out_unlock;
+	}
+
+	cq_host->slot[tag].mrq = mrq;
+	cq_host->slot[tag].flags = 0;
+
+	cq_host->qcnt += 1;
+
+	cqhci_writel(cq_host, 1 << tag, CQHCI_TDBR);
+	if (!(cqhci_readl(cq_host, CQHCI_TDBR) & (1 << tag)))
+		pr_debug("%s: cqhci: doorbell not set for tag %d\n",
+			 mmc_hostname(mmc), tag);
+out_unlock:
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	if (err)
+		cqhci_post_req(mmc, mrq);
+
+	return err;
+}
+
+static void cqhci_recovery_needed(struct mmc_host *mmc, struct mmc_request *mrq,
+				  bool notify)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	if (!cq_host->recovery_halt) {
+		cq_host->recovery_halt = true;
+		pr_debug("%s: cqhci: recovery needed\n", mmc_hostname(mmc));
+		wake_up(&cq_host->wait_queue);
+		if (notify && mrq->recovery_notifier)
+			mrq->recovery_notifier(mrq);
+	}
+}
+
+static unsigned int cqhci_error_flags(int error1, int error2)
+{
+	int error = error1 ? error1 : error2;
+
+	switch (error) {
+	case -EILSEQ:
+		return CQHCI_HOST_CRC;
+	case -ETIMEDOUT:
+		return CQHCI_HOST_TIMEOUT;
+	default:
+		return CQHCI_HOST_OTHER;
+	}
+}
+
+static void cqhci_error_irq(struct mmc_host *mmc, u32 status, int cmd_error,
+			    int data_error)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	struct cqhci_slot *slot;
+	u32 terri;
+	int tag;
+
+	spin_lock(&cq_host->lock);
+
+	terri = cqhci_readl(cq_host, CQHCI_TERRI);
+
+	pr_debug("%s: cqhci: error IRQ status: 0x%08x cmd error %d data error %d TERRI: 0x%08x\n",
+		 mmc_hostname(mmc), status, cmd_error, data_error, terri);
+
+	/* Forget about errors when recovery has already been triggered */
+	if (cq_host->recovery_halt)
+		goto out_unlock;
+
+	if (!cq_host->qcnt) {
+		WARN_ONCE(1, "%s: cqhci: error when idle. IRQ status: 0x%08x cmd error %d data error %d TERRI: 0x%08x\n",
+			  mmc_hostname(mmc), status, cmd_error, data_error,
+			  terri);
+		goto out_unlock;
+	}
+
+	if (CQHCI_TERRI_C_VALID(terri)) {
+		tag = CQHCI_TERRI_C_TASK(terri);
+		slot = &cq_host->slot[tag];
+		if (slot->mrq) {
+			slot->flags = cqhci_error_flags(cmd_error, data_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+		}
+	}
+
+	if (CQHCI_TERRI_D_VALID(terri)) {
+		tag = CQHCI_TERRI_D_TASK(terri);
+		slot = &cq_host->slot[tag];
+		if (slot->mrq) {
+			slot->flags = cqhci_error_flags(data_error, cmd_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+		}
+	}
+
+	if (!cq_host->recovery_halt) {
+		/*
+		 * The only way to guarantee forward progress is to mark at
+		 * least one task in error, so if none is indicated, pick one.
+		 */
+		for (tag = 0; tag < NUM_SLOTS; tag++) {
+			slot = &cq_host->slot[tag];
+			if (!slot->mrq)
+				continue;
+			slot->flags = cqhci_error_flags(data_error, cmd_error);
+			cqhci_recovery_needed(mmc, slot->mrq, true);
+			break;
+		}
+	}
+
+out_unlock:
+	spin_unlock(&cq_host->lock);
+}
+
+static void cqhci_finish_mrq(struct mmc_host *mmc, unsigned int tag)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	struct mmc_request *mrq = slot->mrq;
+	struct mmc_data *data;
+
+	if (!mrq) {
+		WARN_ONCE(1, "%s: cqhci: spurious TCN for tag %d\n",
+			  mmc_hostname(mmc), tag);
+		return;
+	}
+
+	/* No completions allowed during recovery */
+	if (cq_host->recovery_halt) {
+		slot->flags |= CQHCI_COMPLETED;
+		return;
+	}
+
+	slot->mrq = NULL;
+
+	cq_host->qcnt -= 1;
+
+	data = mrq->data;
+	if (data) {
+		if (data->error)
+			data->bytes_xfered = 0;
+		else
+			data->bytes_xfered = data->blksz * data->blocks;
+	}
+
+	mmc_cqe_request_done(mmc, mrq);
+}
+
+irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
+		      int data_error)
+{
+	u32 status;
+	unsigned long tag = 0, comp_status;
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	status = cqhci_readl(cq_host, CQHCI_IS);
+	cqhci_writel(cq_host, status, CQHCI_IS);
+
+	pr_debug("%s: cqhci: IRQ status: 0x%08x\n", mmc_hostname(mmc), status);
+
+	if ((status & CQHCI_IS_RED) || cmd_error || data_error)
+		cqhci_error_irq(mmc, status, cmd_error, data_error);
+
+	if (status & CQHCI_IS_TCC) {
+		/* read TCN and complete the request */
+		comp_status = cqhci_readl(cq_host, CQHCI_TCN);
+		cqhci_writel(cq_host, comp_status, CQHCI_TCN);
+		pr_debug("%s: cqhci: TCN: 0x%08lx\n",
+			 mmc_hostname(mmc), comp_status);
+
+		spin_lock(&cq_host->lock);
+
+		for_each_set_bit(tag, &comp_status, cq_host->num_slots) {
+			/* complete the corresponding mrq */
+			pr_debug("%s: cqhci: completing tag %lu\n",
+				 mmc_hostname(mmc), tag);
+			cqhci_finish_mrq(mmc, tag);
+		}
+
+		if (cq_host->waiting_for_idle && !cq_host->qcnt) {
+			cq_host->waiting_for_idle = false;
+			wake_up(&cq_host->wait_queue);
+		}
+
+		spin_unlock(&cq_host->lock);
+	}
+
+	if (status & CQHCI_IS_TCL)
+		wake_up(&cq_host->wait_queue);
+
+	if (status & CQHCI_IS_HAC)
+		wake_up(&cq_host->wait_queue);
+
+	return IRQ_HANDLED;
+}
+EXPORT_SYMBOL(cqhci_irq);
+
+static bool cqhci_is_idle(struct cqhci_host *cq_host, int *ret)
+{
+	unsigned long flags;
+	bool is_idle;
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	is_idle = !cq_host->qcnt || cq_host->recovery_halt;
+	*ret = cq_host->recovery_halt ? -EBUSY : 0;
+	cq_host->waiting_for_idle = !is_idle;
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	return is_idle;
+}
+
+static int cqhci_wait_for_idle(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int ret;
+
+	wait_event(cq_host->wait_queue, cqhci_is_idle(cq_host, &ret));
+
+	return ret;
+}
+
+static bool cqhci_timeout(struct mmc_host *mmc, struct mmc_request *mrq,
+			  bool *recovery_needed)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	int tag = cqhci_tag(mrq);
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	unsigned long flags;
+	bool timed_out;
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	timed_out = slot->mrq == mrq;
+	if (timed_out) {
+		slot->flags |= CQHCI_EXTERNAL_TIMEOUT;
+		cqhci_recovery_needed(mmc, mrq, false);
+		*recovery_needed = cq_host->recovery_halt;
+	}
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	if (timed_out) {
+		pr_err("%s: cqhci: timeout for tag %d\n",
+		       mmc_hostname(mmc), tag);
+		cqhci_dumpregs(cq_host);
+	}
+
+	return timed_out;
+}
+
+static bool cqhci_tasks_cleared(struct cqhci_host *cq_host)
+{
+	return !(cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_CLEAR_ALL_TASKS);
+}
+
+static bool cqhci_clear_all_tasks(struct mmc_host *mmc, unsigned int timeout)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	bool ret;
+	u32 ctl;
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_TCL);
+
+	ctl = cqhci_readl(cq_host, CQHCI_CTL);
+	ctl |= CQHCI_CLEAR_ALL_TASKS;
+	cqhci_writel(cq_host, ctl, CQHCI_CTL);
+
+	wait_event_timeout(cq_host->wait_queue, cqhci_tasks_cleared(cq_host),
+			   msecs_to_jiffies(timeout) + 1);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	ret = cqhci_tasks_cleared(cq_host);
+
+	if (!ret)
+		pr_debug("%s: cqhci: Failed to clear tasks\n",
+			 mmc_hostname(mmc));
+
+	return ret;
+}
+
+static bool cqhci_halted(struct cqhci_host *cq_host)
+{
+	return cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT;
+}
+
+static bool cqhci_halt(struct mmc_host *mmc, unsigned int timeout)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	bool ret;
+	u32 ctl;
+
+	if (cqhci_halted(cq_host))
+		return true;
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_HAC);
+
+	ctl = cqhci_readl(cq_host, CQHCI_CTL);
+	ctl |= CQHCI_HALT;
+	cqhci_writel(cq_host, ctl, CQHCI_CTL);
+
+	wait_event_timeout(cq_host->wait_queue, cqhci_halted(cq_host),
+			   msecs_to_jiffies(timeout) + 1);
+
+	cqhci_set_irqs(cq_host, 0);
+
+	ret = cqhci_halted(cq_host);
+
+	if (!ret)
+		pr_debug("%s: cqhci: Failed to halt\n", mmc_hostname(mmc));
+
+	return ret;
+}
+
+/*
+ * After halting we expect to be able to use the command line. We interpret the
+ * failure to halt to mean the data lines might still be in use (and the upper
+ * layers will need to send a STOP command), so we set the timeout based on a
+ * generous command timeout.
+ */
+#define CQHCI_START_HALT_TIMEOUT	5
+
+static void cqhci_recovery_start(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+
+	pr_debug("%s: cqhci: %s\n", mmc_hostname(mmc), __func__);
+
+	WARN_ON(!cq_host->recovery_halt);
+
+	cqhci_halt(mmc, CQHCI_START_HALT_TIMEOUT);
+
+	if (cq_host->ops->disable)
+		cq_host->ops->disable(mmc, true);
+
+	mmc->cqe_on = false;
+}
+
+static int cqhci_error_from_flags(unsigned int flags)
+{
+	if (!flags)
+		return 0;
+
+	/* CRC errors might indicate re-tuning so prefer to report that */
+	if (flags & CQHCI_HOST_CRC)
+		return -EILSEQ;
+
+	if (flags & (CQHCI_EXTERNAL_TIMEOUT | CQHCI_HOST_TIMEOUT))
+		return -ETIMEDOUT;
+
+	return -EIO;
+}
+
+static void cqhci_recover_mrq(struct cqhci_host *cq_host, unsigned int tag)
+{
+	struct cqhci_slot *slot = &cq_host->slot[tag];
+	struct mmc_request *mrq = slot->mrq;
+	struct mmc_data *data;
+
+	if (!mrq)
+		return;
+
+	slot->mrq = NULL;
+
+	cq_host->qcnt -= 1;
+
+	data = mrq->data;
+	if (data) {
+		data->bytes_xfered = 0;
+		data->error = cqhci_error_from_flags(slot->flags);
+	} else {
+		mrq->cmd->error = cqhci_error_from_flags(slot->flags);
+	}
+
+	mmc_cqe_request_done(cq_host->mmc, mrq);
+}
+
+static void cqhci_recover_mrqs(struct cqhci_host *cq_host)
+{
+	int i;
+
+	for (i = 0; i < cq_host->num_slots; i++)
+		cqhci_recover_mrq(cq_host, i);
+}
+
+/*
+ * By now the command and data lines should be unused so there is no reason for
+ * CQHCI to take a long time to halt, but if it doesn't halt there could be
+ * problems clearing tasks, so be generous.
+ */
+#define CQHCI_FINISH_HALT_TIMEOUT	20
+
+/* CQHCI could be expected to clear it's internal state pretty quickly */
+#define CQHCI_CLEAR_TIMEOUT		20
+
+static void cqhci_recovery_finish(struct mmc_host *mmc)
+{
+	struct cqhci_host *cq_host = mmc->cqe_private;
+	unsigned long flags;
+	u32 cqcfg;
+	bool ok;
+
+	pr_debug("%s: cqhci: %s\n", mmc_hostname(mmc), __func__);
+
+	WARN_ON(!cq_host->recovery_halt);
+
+	ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+
+	if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+		ok = false;
+
+	/*
+	 * The specification contradicts itself, by saying that tasks cannot be
+	 * cleared if CQHCI does not halt, but if CQHCI does not halt, it should
+	 * be disabled/re-enabled, but not to disable before clearing tasks.
+	 * Have a go anyway.
+	 */
+	if (!ok) {
+		pr_debug("%s: cqhci: disable / re-enable\n", mmc_hostname(mmc));
+		cqcfg = cqhci_readl(cq_host, CQHCI_CFG);
+		cqcfg &= ~CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+		cqcfg |= CQHCI_ENABLE;
+		cqhci_writel(cq_host, cqcfg, CQHCI_CFG);
+		/* Be sure that there are no tasks */
+		ok = cqhci_halt(mmc, CQHCI_FINISH_HALT_TIMEOUT);
+		if (!cqhci_clear_all_tasks(mmc, CQHCI_CLEAR_TIMEOUT))
+			ok = false;
+		WARN_ON(!ok);
+	}
+
+	cqhci_recover_mrqs(cq_host);
+
+	WARN_ON(cq_host->qcnt);
+
+	spin_lock_irqsave(&cq_host->lock, flags);
+	cq_host->qcnt = 0;
+	cq_host->recovery_halt = false;
+	mmc->cqe_on = false;
+	spin_unlock_irqrestore(&cq_host->lock, flags);
+
+	/* Ensure all writes are done before interrupts are re-enabled */
+	wmb();
+
+	cqhci_writel(cq_host, CQHCI_IS_HAC | CQHCI_IS_TCL, CQHCI_IS);
+
+	cqhci_set_irqs(cq_host, CQHCI_IS_MASK);
+
+	pr_debug("%s: cqhci: recovery done\n", mmc_hostname(mmc));
+}
+
+static const struct mmc_cqe_ops cqhci_cqe_ops = {
+	.cqe_enable = cqhci_enable,
+	.cqe_disable = cqhci_disable,
+	.cqe_request = cqhci_request,
+	.cqe_post_req = cqhci_post_req,
+	.cqe_off = cqhci_off,
+	.cqe_wait_for_idle = cqhci_wait_for_idle,
+	.cqe_timeout = cqhci_timeout,
+	.cqe_recovery_start = cqhci_recovery_start,
+	.cqe_recovery_finish = cqhci_recovery_finish,
+};
+
+struct cqhci_host *cqhci_pltfm_init(struct platform_device *pdev)
+{
+	struct cqhci_host *cq_host;
+	struct resource *cqhci_memres = NULL;
+
+	/* check and setup CMDQ interface */
+	cqhci_memres = platform_get_resource_byname(pdev, IORESOURCE_MEM,
+						   "cqhci_mem");
+	if (!cqhci_memres) {
+		dev_dbg(&pdev->dev, "CMDQ not supported\n");
+		return ERR_PTR(-EINVAL);
+	}
+
+	cq_host = devm_kzalloc(&pdev->dev, sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host)
+		return ERR_PTR(-ENOMEM);
+	cq_host->mmio = devm_ioremap(&pdev->dev,
+				     cqhci_memres->start,
+				     resource_size(cqhci_memres));
+	if (!cq_host->mmio) {
+		dev_err(&pdev->dev, "failed to remap cqhci regs\n");
+		return ERR_PTR(-EBUSY);
+	}
+	dev_dbg(&pdev->dev, "CMDQ ioremap: done\n");
+
+	return cq_host;
+}
+EXPORT_SYMBOL(cqhci_pltfm_init);
+
+static unsigned int cqhci_ver_major(struct cqhci_host *cq_host)
+{
+	return CQHCI_VER_MAJOR(cqhci_readl(cq_host, CQHCI_VER));
+}
+
+static unsigned int cqhci_ver_minor(struct cqhci_host *cq_host)
+{
+	u32 ver = cqhci_readl(cq_host, CQHCI_VER);
+
+	return CQHCI_VER_MINOR1(ver) * 10 + CQHCI_VER_MINOR2(ver);
+}
+
+int cqhci_init(struct cqhci_host *cq_host, struct mmc_host *mmc,
+	      bool dma64)
+{
+	int err;
+
+	cq_host->dma64 = dma64;
+	cq_host->mmc = mmc;
+	cq_host->mmc->cqe_private = cq_host;
+
+	cq_host->num_slots = NUM_SLOTS;
+	cq_host->dcmd_slot = DCMD_SLOT;
+
+	mmc->cqe_ops = &cqhci_cqe_ops;
+
+	mmc->cqe_qdepth = NUM_SLOTS;
+	if (mmc->caps2 & MMC_CAP2_CQE_DCMD)
+		mmc->cqe_qdepth -= 1;
+
+	cq_host->slot = devm_kcalloc(mmc_dev(mmc), cq_host->num_slots,
+				     sizeof(*cq_host->slot), GFP_KERNEL);
+	if (!cq_host->slot) {
+		err = -ENOMEM;
+		goto out_err;
+	}
+
+	spin_lock_init(&cq_host->lock);
+
+	init_completion(&cq_host->halt_comp);
+	init_waitqueue_head(&cq_host->wait_queue);
+
+	pr_info("%s: CQHCI version %u.%02u\n",
+		mmc_hostname(mmc), cqhci_ver_major(cq_host),
+		cqhci_ver_minor(cq_host));
+
+	return 0;
+
+out_err:
+	pr_err("%s: CQHCI version %u.%02u failed to initialize, error %d\n",
+	       mmc_hostname(mmc), cqhci_ver_major(cq_host),
+	       cqhci_ver_minor(cq_host), err);
+	return err;
+}
+EXPORT_SYMBOL(cqhci_init);
+
+MODULE_AUTHOR("Venkat Gopalakrishnan <venkatg@codeaurora.org>");
+MODULE_DESCRIPTION("Command Queue Host Controller Interface driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/mmc/host/cqhci.h b/drivers/mmc/host/cqhci.h
new file mode 100644
index 0000000..9e68286
--- /dev/null
+++ b/drivers/mmc/host/cqhci.h
@@ -0,0 +1,240 @@
+/* Copyright (c) 2015, The Linux Foundation. All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 and
+ * only version 2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ */
+#ifndef LINUX_MMC_CQHCI_H
+#define LINUX_MMC_CQHCI_H
+
+#include <linux/compiler.h>
+#include <linux/bitops.h>
+#include <linux/spinlock_types.h>
+#include <linux/types.h>
+#include <linux/completion.h>
+#include <linux/wait.h>
+#include <linux/irqreturn.h>
+#include <asm/io.h>
+
+/* registers */
+/* version */
+#define CQHCI_VER			0x00
+#define CQHCI_VER_MAJOR(x)		(((x) & GENMASK(11, 8)) >> 8)
+#define CQHCI_VER_MINOR1(x)		(((x) & GENMASK(7, 4)) >> 4)
+#define CQHCI_VER_MINOR2(x)		((x) & GENMASK(3, 0))
+
+/* capabilities */
+#define CQHCI_CAP			0x04
+/* configuration */
+#define CQHCI_CFG			0x08
+#define CQHCI_DCMD			0x00001000
+#define CQHCI_TASK_DESC_SZ		0x00000100
+#define CQHCI_ENABLE			0x00000001
+
+/* control */
+#define CQHCI_CTL			0x0C
+#define CQHCI_CLEAR_ALL_TASKS		0x00000100
+#define CQHCI_HALT			0x00000001
+
+/* interrupt status */
+#define CQHCI_IS			0x10
+#define CQHCI_IS_HAC			BIT(0)
+#define CQHCI_IS_TCC			BIT(1)
+#define CQHCI_IS_RED			BIT(2)
+#define CQHCI_IS_TCL			BIT(3)
+
+#define CQHCI_IS_MASK (CQHCI_IS_TCC | CQHCI_IS_RED)
+
+/* interrupt status enable */
+#define CQHCI_ISTE			0x14
+
+/* interrupt signal enable */
+#define CQHCI_ISGE			0x18
+
+/* interrupt coalescing */
+#define CQHCI_IC			0x1C
+#define CQHCI_IC_ENABLE			BIT(31)
+#define CQHCI_IC_RESET			BIT(16)
+#define CQHCI_IC_ICCTHWEN		BIT(15)
+#define CQHCI_IC_ICCTH(x)		(((x) & 0x1F) << 8)
+#define CQHCI_IC_ICTOVALWEN		BIT(7)
+#define CQHCI_IC_ICTOVAL(x)		((x) & 0x7F)
+
+/* task list base address */
+#define CQHCI_TDLBA			0x20
+
+/* task list base address upper */
+#define CQHCI_TDLBAU			0x24
+
+/* door-bell */
+#define CQHCI_TDBR			0x28
+
+/* task completion notification */
+#define CQHCI_TCN			0x2C
+
+/* device queue status */
+#define CQHCI_DQS			0x30
+
+/* device pending tasks */
+#define CQHCI_DPT			0x34
+
+/* task clear */
+#define CQHCI_TCLR			0x38
+
+/* send status config 1 */
+#define CQHCI_SSC1			0x40
+
+/* send status config 2 */
+#define CQHCI_SSC2			0x44
+
+/* response for dcmd */
+#define CQHCI_CRDCT			0x48
+
+/* response mode error mask */
+#define CQHCI_RMEM			0x50
+
+/* task error info */
+#define CQHCI_TERRI			0x54
+
+#define CQHCI_TERRI_C_INDEX(x)		((x) & GENMASK(5, 0))
+#define CQHCI_TERRI_C_TASK(x)		(((x) & GENMASK(12, 8)) >> 8)
+#define CQHCI_TERRI_C_VALID(x)		((x) & BIT(15))
+#define CQHCI_TERRI_D_INDEX(x)		(((x) & GENMASK(21, 16)) >> 16)
+#define CQHCI_TERRI_D_TASK(x)		(((x) & GENMASK(28, 24)) >> 24)
+#define CQHCI_TERRI_D_VALID(x)		((x) & BIT(31))
+
+/* command response index */
+#define CQHCI_CRI			0x58
+
+/* command response argument */
+#define CQHCI_CRA			0x5C
+
+#define CQHCI_INT_ALL			0xF
+#define CQHCI_IC_DEFAULT_ICCTH		31
+#define CQHCI_IC_DEFAULT_ICTOVAL	1
+
+/* attribute fields */
+#define CQHCI_VALID(x)			(((x) & 1) << 0)
+#define CQHCI_END(x)			(((x) & 1) << 1)
+#define CQHCI_INT(x)			(((x) & 1) << 2)
+#define CQHCI_ACT(x)			(((x) & 0x7) << 3)
+
+/* data command task descriptor fields */
+#define CQHCI_FORCED_PROG(x)		(((x) & 1) << 6)
+#define CQHCI_CONTEXT(x)		(((x) & 0xF) << 7)
+#define CQHCI_DATA_TAG(x)		(((x) & 1) << 11)
+#define CQHCI_DATA_DIR(x)		(((x) & 1) << 12)
+#define CQHCI_PRIORITY(x)		(((x) & 1) << 13)
+#define CQHCI_QBAR(x)			(((x) & 1) << 14)
+#define CQHCI_REL_WRITE(x)		(((x) & 1) << 15)
+#define CQHCI_BLK_COUNT(x)		(((x) & 0xFFFF) << 16)
+#define CQHCI_BLK_ADDR(x)		(((x) & 0xFFFFFFFF) << 32)
+
+/* direct command task descriptor fields */
+#define CQHCI_CMD_INDEX(x)		(((x) & 0x3F) << 16)
+#define CQHCI_CMD_TIMING(x)		(((x) & 1) << 22)
+#define CQHCI_RESP_TYPE(x)		(((x) & 0x3) << 23)
+
+/* transfer descriptor fields */
+#define CQHCI_DAT_LENGTH(x)		(((x) & 0xFFFF) << 16)
+#define CQHCI_DAT_ADDR_LO(x)		(((x) & 0xFFFFFFFF) << 32)
+#define CQHCI_DAT_ADDR_HI(x)		(((x) & 0xFFFFFFFF) << 0)
+
+struct cqhci_host_ops;
+struct mmc_host;
+struct cqhci_slot;
+
+struct cqhci_host {
+	const struct cqhci_host_ops *ops;
+	void __iomem *mmio;
+	struct mmc_host *mmc;
+
+	spinlock_t lock;
+
+	/* relative card address of device */
+	unsigned int rca;
+
+	/* 64 bit DMA */
+	bool dma64;
+	int num_slots;
+	int qcnt;
+
+	u32 dcmd_slot;
+	u32 caps;
+#define CQHCI_TASK_DESC_SZ_128		0x1
+
+	u32 quirks;
+#define CQHCI_QUIRK_SHORT_TXFR_DESC_SZ	0x1
+
+	bool enabled;
+	bool halted;
+	bool init_done;
+	bool activated;
+	bool waiting_for_idle;
+	bool recovery_halt;
+
+	size_t desc_size;
+	size_t data_size;
+
+	u8 *desc_base;
+
+	/* total descriptor size */
+	u8 slot_sz;
+
+	/* 64/128 bit depends on CQHCI_CFG */
+	u8 task_desc_len;
+
+	/* 64 bit on 32-bit arch, 128 bit on 64-bit */
+	u8 link_desc_len;
+
+	u8 *trans_desc_base;
+	/* same length as transfer descriptor */
+	u8 trans_desc_len;
+
+	dma_addr_t desc_dma_base;
+	dma_addr_t trans_desc_dma_base;
+
+	struct completion halt_comp;
+	wait_queue_head_t wait_queue;
+	struct cqhci_slot *slot;
+};
+
+struct cqhci_host_ops {
+	void (*dumpregs)(struct mmc_host *mmc);
+	void (*write_l)(struct cqhci_host *host, u32 val, int reg);
+	u32 (*read_l)(struct cqhci_host *host, int reg);
+	void (*enable)(struct mmc_host *mmc);
+	void (*disable)(struct mmc_host *mmc, bool recovery);
+};
+
+static inline void cqhci_writel(struct cqhci_host *host, u32 val, int reg)
+{
+	if (unlikely(host->ops->write_l))
+		host->ops->write_l(host, val, reg);
+	else
+		writel_relaxed(val, host->mmio + reg);
+}
+
+static inline u32 cqhci_readl(struct cqhci_host *host, int reg)
+{
+	if (unlikely(host->ops->read_l))
+		return host->ops->read_l(host, reg);
+	else
+		return readl_relaxed(host->mmio + reg);
+}
+
+struct platform_device;
+
+irqreturn_t cqhci_irq(struct mmc_host *mmc, u32 intmask, int cmd_error,
+		      int data_error);
+int cqhci_init(struct cqhci_host *cq_host, struct mmc_host *mmc, bool dma64);
+struct cqhci_host *cqhci_pltfm_init(struct platform_device *pdev);
+int cqhci_suspend(struct mmc_host *mmc);
+int cqhci_resume(struct mmc_host *mmc);
+
+#endif
diff --git a/drivers/mmc/host/davinci_mmc.c b/drivers/mmc/host/davinci_mmc.c
index 351330d..8e36317 100644
--- a/drivers/mmc/host/davinci_mmc.c
+++ b/drivers/mmc/host/davinci_mmc.c
@@ -174,7 +174,7 @@ module_param(poll_loopcount, uint, S_IRUGO);
 MODULE_PARM_DESC(poll_loopcount,
 		 "Maximum polling loop count. Default = 32");
 
-static unsigned __initdata use_dma = 1;
+static unsigned use_dma = 1;
 module_param(use_dma, uint, 0);
 MODULE_PARM_DESC(use_dma, "Whether to use DMA or not. Default = 1");
 
@@ -496,8 +496,7 @@ static int mmc_davinci_start_dma_transfer(struct mmc_davinci_host *host,
 	return ret;
 }
 
-static void __init_or_module
-davinci_release_dma_channels(struct mmc_davinci_host *host)
+static void davinci_release_dma_channels(struct mmc_davinci_host *host)
 {
 	if (!host->use_dma)
 		return;
@@ -506,7 +505,7 @@ davinci_release_dma_channels(struct mmc_davinci_host *host)
 	dma_release_channel(host->dma_rx);
 }
 
-static int __init davinci_acquire_dma_channels(struct mmc_davinci_host *host)
+static int davinci_acquire_dma_channels(struct mmc_davinci_host *host)
 {
 	host->dma_tx = dma_request_chan(mmc_dev(host->mmc), "tx");
 	if (IS_ERR(host->dma_tx)) {
@@ -1201,7 +1200,7 @@ static int mmc_davinci_parse_pdata(struct mmc_host *mmc)
 	return 0;
 }
 
-static int __init davinci_mmcsd_probe(struct platform_device *pdev)
+static int davinci_mmcsd_probe(struct platform_device *pdev)
 {
 	const struct of_device_id *match;
 	struct mmc_davinci_host *host = NULL;
@@ -1254,8 +1253,9 @@ static int __init davinci_mmcsd_probe(struct platform_device *pdev)
 		pdev->id_entry = match->data;
 		ret = mmc_of_parse(mmc);
 		if (ret) {
-			dev_err(&pdev->dev,
-				"could not parse of data: %d\n", ret);
+			if (ret != -EPROBE_DEFER)
+				dev_err(&pdev->dev,
+					"could not parse of data: %d\n", ret);
 			goto parse_fail;
 		}
 	} else {
@@ -1414,11 +1414,12 @@ static struct platform_driver davinci_mmcsd_driver = {
 		.pm	= davinci_mmcsd_pm_ops,
 		.of_match_table = davinci_mmc_dt_ids,
 	},
+	.probe		= davinci_mmcsd_probe,
 	.remove		= __exit_p(davinci_mmcsd_remove),
 	.id_table	= davinci_mmc_devtype,
 };
 
-module_platform_driver_probe(davinci_mmcsd_driver, davinci_mmcsd_probe);
+module_platform_driver(davinci_mmcsd_driver);
 
 MODULE_AUTHOR("Texas Instruments India");
 MODULE_LICENSE("GPL");
diff --git a/drivers/mmc/host/meson-gx-mmc.c b/drivers/mmc/host/meson-gx-mmc.c
index e0862d3..32a6a22 100644
--- a/drivers/mmc/host/meson-gx-mmc.c
+++ b/drivers/mmc/host/meson-gx-mmc.c
@@ -1208,7 +1208,7 @@ static int meson_mmc_probe(struct platform_device *pdev)
 	}
 
 	irq = platform_get_irq(pdev, 0);
-	if (!irq) {
+	if (irq <= 0) {
 		dev_err(&pdev->dev, "failed to get interrupt resource.\n");
 		ret = -EINVAL;
 		goto free_host;
diff --git a/drivers/mmc/host/mmci.c b/drivers/mmc/host/mmci.c
index e8a1bb1..70b0df8 100644
--- a/drivers/mmc/host/mmci.c
+++ b/drivers/mmc/host/mmci.c
@@ -82,6 +82,10 @@ static unsigned int fmax = 515633;
  * @qcom_fifo: enables qcom specific fifo pio read logic.
  * @qcom_dml: enables qcom specific dma glue for dma transfers.
  * @reversed_irq_handling: handle data irq before cmd irq.
+ * @mmcimask1: true if variant have a MMCIMASK1 register.
+ * @start_err: bitmask identifying the STARTBITERR bit inside MMCISTATUS
+ *	       register.
+ * @opendrain: bitmask identifying the OPENDRAIN bit inside MMCIPOWER register
  */
 struct variant_data {
 	unsigned int		clkreg;
@@ -111,6 +115,9 @@ struct variant_data {
 	bool			qcom_fifo;
 	bool			qcom_dml;
 	bool			reversed_irq_handling;
+	bool			mmcimask1;
+	u32			start_err;
+	u32			opendrain;
 };
 
 static struct variant_data variant_arm = {
@@ -120,6 +127,9 @@ static struct variant_data variant_arm = {
 	.pwrreg_powerup		= MCI_PWR_UP,
 	.f_max			= 100000000,
 	.reversed_irq_handling	= true,
+	.mmcimask1		= true,
+	.start_err		= MCI_STARTBITERR,
+	.opendrain		= MCI_ROD,
 };
 
 static struct variant_data variant_arm_extended_fifo = {
@@ -128,6 +138,9 @@ static struct variant_data variant_arm_extended_fifo = {
 	.datalength_bits	= 16,
 	.pwrreg_powerup		= MCI_PWR_UP,
 	.f_max			= 100000000,
+	.mmcimask1		= true,
+	.start_err		= MCI_STARTBITERR,
+	.opendrain		= MCI_ROD,
 };
 
 static struct variant_data variant_arm_extended_fifo_hwfc = {
@@ -137,6 +150,9 @@ static struct variant_data variant_arm_extended_fifo_hwfc = {
 	.datalength_bits	= 16,
 	.pwrreg_powerup		= MCI_PWR_UP,
 	.f_max			= 100000000,
+	.mmcimask1		= true,
+	.start_err		= MCI_STARTBITERR,
+	.opendrain		= MCI_ROD,
 };
 
 static struct variant_data variant_u300 = {
@@ -152,6 +168,9 @@ static struct variant_data variant_u300 = {
 	.signal_direction	= true,
 	.pwrreg_clkgate		= true,
 	.pwrreg_nopower		= true,
+	.mmcimask1		= true,
+	.start_err		= MCI_STARTBITERR,
+	.opendrain		= MCI_OD,
 };
 
 static struct variant_data variant_nomadik = {
@@ -168,6 +187,9 @@ static struct variant_data variant_nomadik = {
 	.signal_direction	= true,
 	.pwrreg_clkgate		= true,
 	.pwrreg_nopower		= true,
+	.mmcimask1		= true,
+	.start_err		= MCI_STARTBITERR,
+	.opendrain		= MCI_OD,
 };
 
 static struct variant_data variant_ux500 = {
@@ -190,6 +212,9 @@ static struct variant_data variant_ux500 = {
 	.busy_detect_flag	= MCI_ST_CARDBUSY,
 	.busy_detect_mask	= MCI_ST_BUSYENDMASK,
 	.pwrreg_nopower		= true,
+	.mmcimask1		= true,
+	.start_err		= MCI_STARTBITERR,
+	.opendrain		= MCI_OD,
 };
 
 static struct variant_data variant_ux500v2 = {
@@ -214,6 +239,26 @@ static struct variant_data variant_ux500v2 = {
 	.busy_detect_flag	= MCI_ST_CARDBUSY,
 	.busy_detect_mask	= MCI_ST_BUSYENDMASK,
 	.pwrreg_nopower		= true,
+	.mmcimask1		= true,
+	.start_err		= MCI_STARTBITERR,
+	.opendrain		= MCI_OD,
+};
+
+static struct variant_data variant_stm32 = {
+	.fifosize		= 32 * 4,
+	.fifohalfsize		= 8 * 4,
+	.clkreg			= MCI_CLK_ENABLE,
+	.clkreg_enable		= MCI_ST_UX500_HWFCEN,
+	.clkreg_8bit_bus_enable = MCI_ST_8BIT_BUS,
+	.clkreg_neg_edge_enable	= MCI_ST_UX500_NEG_EDGE,
+	.datalength_bits	= 24,
+	.datactrl_mask_sdio	= MCI_DPSM_ST_SDIOEN,
+	.st_sdio		= true,
+	.st_clkdiv		= true,
+	.pwrreg_powerup		= MCI_PWR_ON,
+	.f_max			= 48000000,
+	.pwrreg_clkgate		= true,
+	.pwrreg_nopower		= true,
 };
 
 static struct variant_data variant_qcom = {
@@ -232,6 +277,9 @@ static struct variant_data variant_qcom = {
 	.explicit_mclk_control	= true,
 	.qcom_fifo		= true,
 	.qcom_dml		= true,
+	.mmcimask1		= true,
+	.start_err		= MCI_STARTBITERR,
+	.opendrain		= MCI_ROD,
 };
 
 /* Busy detection for the ST Micro variant */
@@ -396,6 +444,7 @@ mmci_request_end(struct mmci_host *host, struct mmc_request *mrq)
 static void mmci_set_mask1(struct mmci_host *host, unsigned int mask)
 {
 	void __iomem *base = host->base;
+	struct variant_data *variant = host->variant;
 
 	if (host->singleirq) {
 		unsigned int mask0 = readl(base + MMCIMASK0);
@@ -406,7 +455,10 @@ static void mmci_set_mask1(struct mmci_host *host, unsigned int mask)
 		writel(mask0, base + MMCIMASK0);
 	}
 
-	writel(mask, base + MMCIMASK1);
+	if (variant->mmcimask1)
+		writel(mask, base + MMCIMASK1);
+
+	host->mask1_reg = mask;
 }
 
 static void mmci_stop_data(struct mmci_host *host)
@@ -921,8 +973,9 @@ mmci_data_irq(struct mmci_host *host, struct mmc_data *data,
 		return;
 
 	/* First check for errors */
-	if (status & (MCI_DATACRCFAIL|MCI_DATATIMEOUT|MCI_STARTBITERR|
-		      MCI_TXUNDERRUN|MCI_RXOVERRUN)) {
+	if (status & (MCI_DATACRCFAIL | MCI_DATATIMEOUT |
+		      host->variant->start_err |
+		      MCI_TXUNDERRUN | MCI_RXOVERRUN)) {
 		u32 remain, success;
 
 		/* Terminate the DMA transfer */
@@ -1286,7 +1339,7 @@ static irqreturn_t mmci_irq(int irq, void *dev_id)
 		status = readl(host->base + MMCISTATUS);
 
 		if (host->singleirq) {
-			if (status & readl(host->base + MMCIMASK1))
+			if (status & host->mask1_reg)
 				mmci_pio_irq(irq, dev_id);
 
 			status &= ~MCI_IRQ1MASK;
@@ -1429,16 +1482,18 @@ static void mmci_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
 				~MCI_ST_DATA2DIREN);
 	}
 
-	if (ios->bus_mode == MMC_BUSMODE_OPENDRAIN) {
-		if (host->hw_designer != AMBA_VENDOR_ST)
-			pwr |= MCI_ROD;
-		else {
-			/*
-			 * The ST Micro variant use the ROD bit for something
-			 * else and only has OD (Open Drain).
-			 */
-			pwr |= MCI_OD;
-		}
+	if (variant->opendrain) {
+		if (ios->bus_mode == MMC_BUSMODE_OPENDRAIN)
+			pwr |= variant->opendrain;
+	} else {
+		/*
+		 * If the variant cannot configure the pads by its own, then we
+		 * expect the pinctrl to be able to do that for us
+		 */
+		if (ios->bus_mode == MMC_BUSMODE_OPENDRAIN)
+			pinctrl_select_state(host->pinctrl, host->pins_opendrain);
+		else
+			pinctrl_select_state(host->pinctrl, host->pins_default);
 	}
 
 	/*
@@ -1583,6 +1638,35 @@ static int mmci_probe(struct amba_device *dev,
 	host = mmc_priv(mmc);
 	host->mmc = mmc;
 
+	/*
+	 * Some variant (STM32) doesn't have opendrain bit, nevertheless
+	 * pins can be set accordingly using pinctrl
+	 */
+	if (!variant->opendrain) {
+		host->pinctrl = devm_pinctrl_get(&dev->dev);
+		if (IS_ERR(host->pinctrl)) {
+			dev_err(&dev->dev, "failed to get pinctrl");
+			ret = PTR_ERR(host->pinctrl);
+			goto host_free;
+		}
+
+		host->pins_default = pinctrl_lookup_state(host->pinctrl,
+							  PINCTRL_STATE_DEFAULT);
+		if (IS_ERR(host->pins_default)) {
+			dev_err(mmc_dev(mmc), "Can't select default pins\n");
+			ret = PTR_ERR(host->pins_default);
+			goto host_free;
+		}
+
+		host->pins_opendrain = pinctrl_lookup_state(host->pinctrl,
+							    MMCI_PINCTRL_STATE_OPENDRAIN);
+		if (IS_ERR(host->pins_opendrain)) {
+			dev_err(mmc_dev(mmc), "Can't select opendrain pins\n");
+			ret = PTR_ERR(host->pins_opendrain);
+			goto host_free;
+		}
+	}
+
 	host->hw_designer = amba_manf(dev);
 	host->hw_revision = amba_rev(dev);
 	dev_dbg(mmc_dev(mmc), "designer ID = 0x%02x\n", host->hw_designer);
@@ -1729,7 +1813,10 @@ static int mmci_probe(struct amba_device *dev,
 	spin_lock_init(&host->lock);
 
 	writel(0, host->base + MMCIMASK0);
-	writel(0, host->base + MMCIMASK1);
+
+	if (variant->mmcimask1)
+		writel(0, host->base + MMCIMASK1);
+
 	writel(0xfff, host->base + MMCICLEAR);
 
 	/*
@@ -1809,6 +1896,7 @@ static int mmci_remove(struct amba_device *dev)
 
 	if (mmc) {
 		struct mmci_host *host = mmc_priv(mmc);
+		struct variant_data *variant = host->variant;
 
 		/*
 		 * Undo pm_runtime_put() in probe.  We use the _sync
@@ -1819,7 +1907,9 @@ static int mmci_remove(struct amba_device *dev)
 		mmc_remove_host(mmc);
 
 		writel(0, host->base + MMCIMASK0);
-		writel(0, host->base + MMCIMASK1);
+
+		if (variant->mmcimask1)
+			writel(0, host->base + MMCIMASK1);
 
 		writel(0, host->base + MMCICOMMAND);
 		writel(0, host->base + MMCIDATACTRL);
@@ -1951,6 +2041,11 @@ static const struct amba_id mmci_ids[] = {
 		.mask   = 0xf0ffffff,
 		.data	= &variant_ux500v2,
 	},
+	{
+		.id     = 0x00880180,
+		.mask   = 0x00ffffff,
+		.data	= &variant_stm32,
+	},
 	/* Qualcomm variants */
 	{
 		.id     = 0x00051180,
diff --git a/drivers/mmc/host/mmci.h b/drivers/mmc/host/mmci.h
index 4a8bef1..f91cdf7 100644
--- a/drivers/mmc/host/mmci.h
+++ b/drivers/mmc/host/mmci.h
@@ -192,6 +192,8 @@
 
 #define NR_SG		128
 
+#define MMCI_PINCTRL_STATE_OPENDRAIN "opendrain"
+
 struct clk;
 struct variant_data;
 struct dma_chan;
@@ -223,9 +225,13 @@ struct mmci_host {
 	u32			clk_reg;
 	u32			datactrl_reg;
 	u32			busy_status;
+	u32			mask1_reg;
 	bool			vqmmc_enabled;
 	struct mmci_platform_data *plat;
 	struct variant_data	*variant;
+	struct pinctrl		*pinctrl;
+	struct pinctrl_state	*pins_default;
+	struct pinctrl_state	*pins_opendrain;
 
 	u8			hw_designer;
 	u8			hw_revision:4;
diff --git a/drivers/mmc/host/renesas_sdhi.h b/drivers/mmc/host/renesas_sdhi.h
index b9dfea5..f13f798 100644
--- a/drivers/mmc/host/renesas_sdhi.h
+++ b/drivers/mmc/host/renesas_sdhi.h
@@ -35,6 +35,28 @@ struct renesas_sdhi_of_data {
 	unsigned short max_segs;
 };
 
+struct tmio_mmc_dma {
+	enum dma_slave_buswidth dma_buswidth;
+	bool (*filter)(struct dma_chan *chan, void *arg);
+	void (*enable)(struct tmio_mmc_host *host, bool enable);
+	struct completion	dma_dataend;
+	struct tasklet_struct	dma_complete;
+};
+
+struct renesas_sdhi {
+	struct clk *clk;
+	struct clk *clk_cd;
+	struct tmio_mmc_data mmc_data;
+	struct tmio_mmc_dma dma_priv;
+	struct pinctrl *pinctrl;
+	struct pinctrl_state *pins_default, *pins_uhs;
+	void __iomem *scc_ctl;
+	u32 scc_tappos;
+};
+
+#define host_to_priv(host) \
+	container_of((host)->pdata, struct renesas_sdhi, mmc_data)
+
 int renesas_sdhi_probe(struct platform_device *pdev,
 		       const struct tmio_mmc_dma_ops *dma_ops);
 int renesas_sdhi_remove(struct platform_device *pdev);
diff --git a/drivers/mmc/host/renesas_sdhi_core.c b/drivers/mmc/host/renesas_sdhi_core.c
index fcf7235..80943fa 100644
--- a/drivers/mmc/host/renesas_sdhi_core.c
+++ b/drivers/mmc/host/renesas_sdhi_core.c
@@ -24,6 +24,7 @@
 #include <linux/kernel.h>
 #include <linux/clk.h>
 #include <linux/slab.h>
+#include <linux/module.h>
 #include <linux/of_device.h>
 #include <linux/platform_device.h>
 #include <linux/mmc/host.h>
@@ -46,19 +47,6 @@
 #define SDHI_VER_GEN3_SD	0xcc10
 #define SDHI_VER_GEN3_SDMMC	0xcd10
 
-#define host_to_priv(host) \
-	container_of((host)->pdata, struct renesas_sdhi, mmc_data)
-
-struct renesas_sdhi {
-	struct clk *clk;
-	struct clk *clk_cd;
-	struct tmio_mmc_data mmc_data;
-	struct tmio_mmc_dma dma_priv;
-	struct pinctrl *pinctrl;
-	struct pinctrl_state *pins_default, *pins_uhs;
-	void __iomem *scc_ctl;
-};
-
 static void renesas_sdhi_sdbuf_width(struct tmio_mmc_host *host, int width)
 {
 	u32 val;
@@ -280,7 +268,7 @@ static unsigned int renesas_sdhi_init_tuning(struct tmio_mmc_host *host)
 		       ~SH_MOBILE_SDHI_SCC_RVSCNTL_RVSEN &
 		       sd_scc_read32(host, priv, SH_MOBILE_SDHI_SCC_RVSCNTL));
 
-	sd_scc_write32(host, priv, SH_MOBILE_SDHI_SCC_DT2FF, host->scc_tappos);
+	sd_scc_write32(host, priv, SH_MOBILE_SDHI_SCC_DT2FF, priv->scc_tappos);
 
 	/* Read TAPNUM */
 	return (sd_scc_read32(host, priv, SH_MOBILE_SDHI_SCC_DTCNTL) >>
@@ -497,7 +485,7 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 	if (IS_ERR(priv->clk)) {
 		ret = PTR_ERR(priv->clk);
 		dev_err(&pdev->dev, "cannot get clock: %d\n", ret);
-		goto eprobe;
+		return ret;
 	}
 
 	/*
@@ -523,11 +511,9 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 						"state_uhs");
 	}
 
-	host = tmio_mmc_host_alloc(pdev);
-	if (!host) {
-		ret = -ENOMEM;
-		goto eprobe;
-	}
+	host = tmio_mmc_host_alloc(pdev, mmc_data);
+	if (IS_ERR(host))
+		return PTR_ERR(host);
 
 	if (of_data) {
 		mmc_data->flags |= of_data->tmio_flags;
@@ -541,18 +527,18 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 		host->bus_shift = of_data->bus_shift;
 	}
 
-	host->dma		= dma_priv;
 	host->write16_hook	= renesas_sdhi_write16_hook;
 	host->clk_enable	= renesas_sdhi_clk_enable;
 	host->clk_update	= renesas_sdhi_clk_update;
 	host->clk_disable	= renesas_sdhi_clk_disable;
 	host->multi_io_quirk	= renesas_sdhi_multi_io_quirk;
+	host->dma_ops		= dma_ops;
 
 	/* SDR speeds are only available on Gen2+ */
 	if (mmc_data->flags & TMIO_MMC_MIN_RCAR2) {
 		/* card_busy caused issues on r8a73a4 (pre-Gen2) CD-less SDHI */
-		host->card_busy	= renesas_sdhi_card_busy;
-		host->start_signal_voltage_switch =
+		host->ops.card_busy = renesas_sdhi_card_busy;
+		host->ops.start_signal_voltage_switch =
 			renesas_sdhi_start_signal_voltage_switch;
 	}
 
@@ -586,10 +572,14 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 	/* All SDHI have SDIO status bits which must be 1 */
 	mmc_data->flags |= TMIO_MMC_SDIO_STATUS_SETBITS;
 
-	ret = tmio_mmc_host_probe(host, mmc_data, dma_ops);
-	if (ret < 0)
+	ret = renesas_sdhi_clk_enable(host);
+	if (ret)
 		goto efree;
 
+	ret = tmio_mmc_host_probe(host);
+	if (ret < 0)
+		goto edisclk;
+
 	/* One Gen2 SDHI incarnation does NOT have a CBSY bit */
 	if (sd_ctrl_read16(host, CTL_VERSION) == SDHI_VER_GEN2_SDR50)
 		mmc_data->flags &= ~TMIO_MMC_HAVE_CBSY;
@@ -606,7 +596,7 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 		for (i = 0; i < of_data->taps_num; i++) {
 			if (taps[i].clk_rate == 0 ||
 			    taps[i].clk_rate == host->mmc->f_max) {
-				host->scc_tappos = taps->tap;
+				priv->scc_tappos = taps->tap;
 				hit = true;
 				break;
 			}
@@ -650,20 +640,24 @@ int renesas_sdhi_probe(struct platform_device *pdev,
 
 eirq:
 	tmio_mmc_host_remove(host);
+edisclk:
+	renesas_sdhi_clk_disable(host);
 efree:
 	tmio_mmc_host_free(host);
-eprobe:
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(renesas_sdhi_probe);
 
 int renesas_sdhi_remove(struct platform_device *pdev)
 {
-	struct mmc_host *mmc = platform_get_drvdata(pdev);
-	struct tmio_mmc_host *host = mmc_priv(mmc);
+	struct tmio_mmc_host *host = platform_get_drvdata(pdev);
 
 	tmio_mmc_host_remove(host);
+	renesas_sdhi_clk_disable(host);
 
 	return 0;
 }
 EXPORT_SYMBOL_GPL(renesas_sdhi_remove);
+
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/mmc/host/renesas_sdhi_internal_dmac.c b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
index 41cbe84..7c03cfe 100644
--- a/drivers/mmc/host/renesas_sdhi_internal_dmac.c
+++ b/drivers/mmc/host/renesas_sdhi_internal_dmac.c
@@ -103,6 +103,8 @@ renesas_sdhi_internal_dmac_dm_write(struct tmio_mmc_host *host,
 static void
 renesas_sdhi_internal_dmac_enable_dma(struct tmio_mmc_host *host, bool enable)
 {
+	struct renesas_sdhi *priv = host_to_priv(host);
+
 	if (!host->chan_tx || !host->chan_rx)
 		return;
 
@@ -110,8 +112,8 @@ renesas_sdhi_internal_dmac_enable_dma(struct tmio_mmc_host *host, bool enable)
 		renesas_sdhi_internal_dmac_dm_write(host, DM_CM_INFO1,
 						    INFO1_CLEAR);
 
-	if (host->dma->enable)
-		host->dma->enable(host, enable);
+	if (priv->dma_priv.enable)
+		priv->dma_priv.enable(host, enable);
 }
 
 static void
@@ -130,7 +132,9 @@ renesas_sdhi_internal_dmac_abort_dma(struct tmio_mmc_host *host) {
 
 static void
 renesas_sdhi_internal_dmac_dataend_dma(struct tmio_mmc_host *host) {
-	tasklet_schedule(&host->dma_complete);
+	struct renesas_sdhi *priv = host_to_priv(host);
+
+	tasklet_schedule(&priv->dma_priv.dma_complete);
 }
 
 static void
@@ -220,10 +224,12 @@ static void
 renesas_sdhi_internal_dmac_request_dma(struct tmio_mmc_host *host,
 				       struct tmio_mmc_data *pdata)
 {
+	struct renesas_sdhi *priv = host_to_priv(host);
+
 	/* Each value is set to non-zero to assume "enabling" each DMA */
 	host->chan_rx = host->chan_tx = (void *)0xdeadbeaf;
 
-	tasklet_init(&host->dma_complete,
+	tasklet_init(&priv->dma_priv.dma_complete,
 		     renesas_sdhi_internal_dmac_complete_tasklet_fn,
 		     (unsigned long)host);
 	tasklet_init(&host->dma_issue,
@@ -255,6 +261,7 @@ static const struct soc_device_attribute gen3_soc_whitelist[] = {
         { .soc_id = "r8a7795", .revision = "ES1.*" },
         { .soc_id = "r8a7795", .revision = "ES2.0" },
         { .soc_id = "r8a7796", .revision = "ES1.0" },
+        { .soc_id = "r8a77995", .revision = "ES1.0" },
         { /* sentinel */ }
 };
 
diff --git a/drivers/mmc/host/renesas_sdhi_sys_dmac.c b/drivers/mmc/host/renesas_sdhi_sys_dmac.c
index 9ab1043..82d757c 100644
--- a/drivers/mmc/host/renesas_sdhi_sys_dmac.c
+++ b/drivers/mmc/host/renesas_sdhi_sys_dmac.c
@@ -117,11 +117,13 @@ MODULE_DEVICE_TABLE(of, renesas_sdhi_sys_dmac_of_match);
 static void renesas_sdhi_sys_dmac_enable_dma(struct tmio_mmc_host *host,
 					     bool enable)
 {
+	struct renesas_sdhi *priv = host_to_priv(host);
+
 	if (!host->chan_tx || !host->chan_rx)
 		return;
 
-	if (host->dma->enable)
-		host->dma->enable(host, enable);
+	if (priv->dma_priv.enable)
+		priv->dma_priv.enable(host, enable);
 }
 
 static void renesas_sdhi_sys_dmac_abort_dma(struct tmio_mmc_host *host)
@@ -138,12 +140,15 @@ static void renesas_sdhi_sys_dmac_abort_dma(struct tmio_mmc_host *host)
 
 static void renesas_sdhi_sys_dmac_dataend_dma(struct tmio_mmc_host *host)
 {
-	complete(&host->dma_dataend);
+	struct renesas_sdhi *priv = host_to_priv(host);
+
+	complete(&priv->dma_priv.dma_dataend);
 }
 
 static void renesas_sdhi_sys_dmac_dma_callback(void *arg)
 {
 	struct tmio_mmc_host *host = arg;
+	struct renesas_sdhi *priv = host_to_priv(host);
 
 	spin_lock_irq(&host->lock);
 
@@ -161,7 +166,7 @@ static void renesas_sdhi_sys_dmac_dma_callback(void *arg)
 
 	spin_unlock_irq(&host->lock);
 
-	wait_for_completion(&host->dma_dataend);
+	wait_for_completion(&priv->dma_priv.dma_dataend);
 
 	spin_lock_irq(&host->lock);
 	tmio_mmc_do_data_irq(host);
@@ -171,6 +176,7 @@ static void renesas_sdhi_sys_dmac_dma_callback(void *arg)
 
 static void renesas_sdhi_sys_dmac_start_dma_rx(struct tmio_mmc_host *host)
 {
+	struct renesas_sdhi *priv = host_to_priv(host);
 	struct scatterlist *sg = host->sg_ptr, *sg_tmp;
 	struct dma_async_tx_descriptor *desc = NULL;
 	struct dma_chan *chan = host->chan_rx;
@@ -214,7 +220,7 @@ static void renesas_sdhi_sys_dmac_start_dma_rx(struct tmio_mmc_host *host)
 					       DMA_CTRL_ACK);
 
 	if (desc) {
-		reinit_completion(&host->dma_dataend);
+		reinit_completion(&priv->dma_priv.dma_dataend);
 		desc->callback = renesas_sdhi_sys_dmac_dma_callback;
 		desc->callback_param = host;
 
@@ -245,6 +251,7 @@ static void renesas_sdhi_sys_dmac_start_dma_rx(struct tmio_mmc_host *host)
 
 static void renesas_sdhi_sys_dmac_start_dma_tx(struct tmio_mmc_host *host)
 {
+	struct renesas_sdhi *priv = host_to_priv(host);
 	struct scatterlist *sg = host->sg_ptr, *sg_tmp;
 	struct dma_async_tx_descriptor *desc = NULL;
 	struct dma_chan *chan = host->chan_tx;
@@ -293,7 +300,7 @@ static void renesas_sdhi_sys_dmac_start_dma_tx(struct tmio_mmc_host *host)
 					       DMA_CTRL_ACK);
 
 	if (desc) {
-		reinit_completion(&host->dma_dataend);
+		reinit_completion(&priv->dma_priv.dma_dataend);
 		desc->callback = renesas_sdhi_sys_dmac_dma_callback;
 		desc->callback_param = host;
 
@@ -341,7 +348,7 @@ static void renesas_sdhi_sys_dmac_issue_tasklet_fn(unsigned long priv)
 
 	spin_lock_irq(&host->lock);
 
-	if (host && host->data) {
+	if (host->data) {
 		if (host->data->flags & MMC_DATA_READ)
 			chan = host->chan_rx;
 		else
@@ -359,9 +366,11 @@ static void renesas_sdhi_sys_dmac_issue_tasklet_fn(unsigned long priv)
 static void renesas_sdhi_sys_dmac_request_dma(struct tmio_mmc_host *host,
 					      struct tmio_mmc_data *pdata)
 {
+	struct renesas_sdhi *priv = host_to_priv(host);
+
 	/* We can only either use DMA for both Tx and Rx or not use it at all */
-	if (!host->dma || (!host->pdev->dev.of_node &&
-			   (!pdata->chan_priv_tx || !pdata->chan_priv_rx)))
+	if (!host->pdev->dev.of_node &&
+	    (!pdata->chan_priv_tx || !pdata->chan_priv_rx))
 		return;
 
 	if (!host->chan_tx && !host->chan_rx) {
@@ -378,7 +387,7 @@ static void renesas_sdhi_sys_dmac_request_dma(struct tmio_mmc_host *host,
 		dma_cap_set(DMA_SLAVE, mask);
 
 		host->chan_tx = dma_request_slave_channel_compat(mask,
-					host->dma->filter, pdata->chan_priv_tx,
+					priv->dma_priv.filter, pdata->chan_priv_tx,
 					&host->pdev->dev, "tx");
 		dev_dbg(&host->pdev->dev, "%s: TX: got channel %p\n", __func__,
 			host->chan_tx);
@@ -389,7 +398,7 @@ static void renesas_sdhi_sys_dmac_request_dma(struct tmio_mmc_host *host,
 		cfg.direction = DMA_MEM_TO_DEV;
 		cfg.dst_addr = res->start +
 			(CTL_SD_DATA_PORT << host->bus_shift);
-		cfg.dst_addr_width = host->dma->dma_buswidth;
+		cfg.dst_addr_width = priv->dma_priv.dma_buswidth;
 		if (!cfg.dst_addr_width)
 			cfg.dst_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
 		cfg.src_addr = 0;
@@ -398,7 +407,7 @@ static void renesas_sdhi_sys_dmac_request_dma(struct tmio_mmc_host *host,
 			goto ecfgtx;
 
 		host->chan_rx = dma_request_slave_channel_compat(mask,
-					host->dma->filter, pdata->chan_priv_rx,
+					priv->dma_priv.filter, pdata->chan_priv_rx,
 					&host->pdev->dev, "rx");
 		dev_dbg(&host->pdev->dev, "%s: RX: got channel %p\n", __func__,
 			host->chan_rx);
@@ -408,7 +417,7 @@ static void renesas_sdhi_sys_dmac_request_dma(struct tmio_mmc_host *host,
 
 		cfg.direction = DMA_DEV_TO_MEM;
 		cfg.src_addr = cfg.dst_addr + host->pdata->dma_rx_offset;
-		cfg.src_addr_width = host->dma->dma_buswidth;
+		cfg.src_addr_width = priv->dma_priv.dma_buswidth;
 		if (!cfg.src_addr_width)
 			cfg.src_addr_width = DMA_SLAVE_BUSWIDTH_2_BYTES;
 		cfg.dst_addr = 0;
@@ -420,7 +429,7 @@ static void renesas_sdhi_sys_dmac_request_dma(struct tmio_mmc_host *host,
 		if (!host->bounce_buf)
 			goto ebouncebuf;
 
-		init_completion(&host->dma_dataend);
+		init_completion(&priv->dma_priv.dma_dataend);
 		tasklet_init(&host->dma_issue,
 			     renesas_sdhi_sys_dmac_issue_tasklet_fn,
 			     (unsigned long)host);
diff --git a/drivers/mmc/host/rtsx_pci_sdmmc.c b/drivers/mmc/host/rtsx_pci_sdmmc.c
index 0848dc0..30bd808 100644
--- a/drivers/mmc/host/rtsx_pci_sdmmc.c
+++ b/drivers/mmc/host/rtsx_pci_sdmmc.c
@@ -30,7 +30,7 @@
 #include <linux/mmc/sd.h>
 #include <linux/mmc/sdio.h>
 #include <linux/mmc/card.h>
-#include <linux/mfd/rtsx_pci.h>
+#include <linux/rtsx_pci.h>
 #include <asm/unaligned.h>
 
 struct realtek_pci_sdmmc {
diff --git a/drivers/mmc/host/rtsx_usb_sdmmc.c b/drivers/mmc/host/rtsx_usb_sdmmc.c
index 76da168..7842207 100644
--- a/drivers/mmc/host/rtsx_usb_sdmmc.c
+++ b/drivers/mmc/host/rtsx_usb_sdmmc.c
@@ -31,7 +31,7 @@
 #include <linux/scatterlist.h>
 #include <linux/pm_runtime.h>
 
-#include <linux/mfd/rtsx_usb.h>
+#include <linux/rtsx_usb.h>
 #include <asm/unaligned.h>
 
 #if defined(CONFIG_LEDS_CLASS) || (defined(CONFIG_LEDS_CLASS_MODULE) && \
diff --git a/drivers/mmc/host/s3cmci.c b/drivers/mmc/host/s3cmci.c
index f7f157a..f774936 100644
--- a/drivers/mmc/host/s3cmci.c
+++ b/drivers/mmc/host/s3cmci.c
@@ -1424,7 +1424,9 @@ static const struct file_operations s3cmci_fops_state = {
 struct s3cmci_reg {
 	unsigned short	addr;
 	unsigned char	*name;
-} debug_regs[] = {
+};
+
+static const struct s3cmci_reg debug_regs[] = {
 	DBG_REG(CON),
 	DBG_REG(PRE),
 	DBG_REG(CMDARG),
@@ -1446,7 +1448,7 @@ struct s3cmci_reg {
 static int s3cmci_regs_show(struct seq_file *seq, void *v)
 {
 	struct s3cmci_host *host = seq->private;
-	struct s3cmci_reg *rptr = debug_regs;
+	const struct s3cmci_reg *rptr = debug_regs;
 
 	for (; rptr->name; rptr++)
 		seq_printf(seq, "SDI%s\t=0x%08x\n", rptr->name,
@@ -1658,7 +1660,7 @@ static int s3cmci_probe(struct platform_device *pdev)
 	}
 
 	host->irq = platform_get_irq(pdev, 0);
-	if (host->irq == 0) {
+	if (host->irq <= 0) {
 		dev_err(&pdev->dev, "failed to get interrupt resource.\n");
 		ret = -EINVAL;
 		goto probe_iounmap;
diff --git a/drivers/mmc/host/sdhci-acpi.c b/drivers/mmc/host/sdhci-acpi.c
index b988997..4065da5 100644
--- a/drivers/mmc/host/sdhci-acpi.c
+++ b/drivers/mmc/host/sdhci-acpi.c
@@ -76,6 +76,7 @@ struct sdhci_acpi_slot {
 	size_t		priv_size;
 	int (*probe_slot)(struct platform_device *, const char *, const char *);
 	int (*remove_slot)(struct platform_device *);
+	int (*setup_host)(struct platform_device *pdev);
 };
 
 struct sdhci_acpi_host {
@@ -96,14 +97,21 @@ static inline bool sdhci_acpi_flag(struct sdhci_acpi_host *c, unsigned int flag)
 	return c->slot && (c->slot->flags & flag);
 }
 
+#define INTEL_DSM_HS_CAPS_SDR25		BIT(0)
+#define INTEL_DSM_HS_CAPS_DDR50		BIT(1)
+#define INTEL_DSM_HS_CAPS_SDR50		BIT(2)
+#define INTEL_DSM_HS_CAPS_SDR104	BIT(3)
+
 enum {
 	INTEL_DSM_FNS		=  0,
 	INTEL_DSM_V18_SWITCH	=  3,
 	INTEL_DSM_V33_SWITCH	=  4,
+	INTEL_DSM_HS_CAPS	=  8,
 };
 
 struct intel_host {
 	u32	dsm_fns;
+	u32	hs_caps;
 };
 
 static const guid_t intel_dsm_guid =
@@ -152,6 +160,8 @@ static void intel_dsm_init(struct intel_host *intel_host, struct device *dev,
 {
 	int err;
 
+	intel_host->hs_caps = ~0;
+
 	err = __intel_dsm(intel_host, dev, INTEL_DSM_FNS, &intel_host->dsm_fns);
 	if (err) {
 		pr_debug("%s: DSM not supported, error %d\n",
@@ -161,6 +171,8 @@ static void intel_dsm_init(struct intel_host *intel_host, struct device *dev,
 
 	pr_debug("%s: DSM function mask %#x\n",
 		 mmc_hostname(mmc), intel_host->dsm_fns);
+
+	intel_dsm(intel_host, dev, INTEL_DSM_HS_CAPS, &intel_host->hs_caps);
 }
 
 static int intel_start_signal_voltage_switch(struct mmc_host *mmc,
@@ -398,6 +410,26 @@ static int intel_probe_slot(struct platform_device *pdev, const char *hid,
 	return 0;
 }
 
+static int intel_setup_host(struct platform_device *pdev)
+{
+	struct sdhci_acpi_host *c = platform_get_drvdata(pdev);
+	struct intel_host *intel_host = sdhci_acpi_priv(c);
+
+	if (!(intel_host->hs_caps & INTEL_DSM_HS_CAPS_SDR25))
+		c->host->mmc->caps &= ~MMC_CAP_UHS_SDR25;
+
+	if (!(intel_host->hs_caps & INTEL_DSM_HS_CAPS_SDR50))
+		c->host->mmc->caps &= ~MMC_CAP_UHS_SDR50;
+
+	if (!(intel_host->hs_caps & INTEL_DSM_HS_CAPS_DDR50))
+		c->host->mmc->caps &= ~MMC_CAP_UHS_DDR50;
+
+	if (!(intel_host->hs_caps & INTEL_DSM_HS_CAPS_SDR104))
+		c->host->mmc->caps &= ~MMC_CAP_UHS_SDR104;
+
+	return 0;
+}
+
 static const struct sdhci_acpi_slot sdhci_acpi_slot_int_emmc = {
 	.chip    = &sdhci_acpi_chip_int,
 	.caps    = MMC_CAP_8_BIT_DATA | MMC_CAP_NONREMOVABLE |
@@ -409,6 +441,7 @@ static const struct sdhci_acpi_slot sdhci_acpi_slot_int_emmc = {
 		   SDHCI_QUIRK2_STOP_WITH_TC |
 		   SDHCI_QUIRK2_CAPS_BIT63_FOR_HS400,
 	.probe_slot	= intel_probe_slot,
+	.setup_host	= intel_setup_host,
 	.priv_size	= sizeof(struct intel_host),
 };
 
@@ -421,6 +454,7 @@ static const struct sdhci_acpi_slot sdhci_acpi_slot_int_sdio = {
 	.flags   = SDHCI_ACPI_RUNTIME_PM,
 	.pm_caps = MMC_PM_KEEP_POWER,
 	.probe_slot	= intel_probe_slot,
+	.setup_host	= intel_setup_host,
 	.priv_size	= sizeof(struct intel_host),
 };
 
@@ -432,6 +466,7 @@ static const struct sdhci_acpi_slot sdhci_acpi_slot_int_sd = {
 		   SDHCI_QUIRK2_STOP_WITH_TC,
 	.caps    = MMC_CAP_WAIT_WHILE_BUSY | MMC_CAP_AGGRESSIVE_PM,
 	.probe_slot	= intel_probe_slot,
+	.setup_host	= intel_setup_host,
 	.priv_size	= sizeof(struct intel_host),
 };
 
@@ -446,6 +481,83 @@ static const struct sdhci_acpi_slot sdhci_acpi_slot_qcom_sd = {
 	.caps    = MMC_CAP_NONREMOVABLE,
 };
 
+/* AMD sdhci reset dll register. */
+#define SDHCI_AMD_RESET_DLL_REGISTER    0x908
+
+static int amd_select_drive_strength(struct mmc_card *card,
+				     unsigned int max_dtr, int host_drv,
+				     int card_drv, int *drv_type)
+{
+	return MMC_SET_DRIVER_TYPE_A;
+}
+
+static void sdhci_acpi_amd_hs400_dll(struct sdhci_host *host)
+{
+	/* AMD Platform requires dll setting */
+	sdhci_writel(host, 0x40003210, SDHCI_AMD_RESET_DLL_REGISTER);
+	usleep_range(10, 20);
+	sdhci_writel(host, 0x40033210, SDHCI_AMD_RESET_DLL_REGISTER);
+}
+
+/*
+ * For AMD Platform it is required to disable the tuning
+ * bit first controller to bring to HS Mode from HS200
+ * mode, later enable to tune to HS400 mode.
+ */
+static void amd_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	unsigned int old_timing = host->timing;
+
+	sdhci_set_ios(mmc, ios);
+	if (old_timing == MMC_TIMING_MMC_HS200 &&
+	    ios->timing == MMC_TIMING_MMC_HS)
+		sdhci_writew(host, 0x9, SDHCI_HOST_CONTROL2);
+	if (old_timing != MMC_TIMING_MMC_HS400 &&
+	    ios->timing == MMC_TIMING_MMC_HS400) {
+		sdhci_writew(host, 0x80, SDHCI_HOST_CONTROL2);
+		sdhci_acpi_amd_hs400_dll(host);
+	}
+}
+
+static const struct sdhci_ops sdhci_acpi_ops_amd = {
+	.set_clock	= sdhci_set_clock,
+	.set_bus_width	= sdhci_set_bus_width,
+	.reset		= sdhci_reset,
+	.set_uhs_signaling = sdhci_set_uhs_signaling,
+};
+
+static const struct sdhci_acpi_chip sdhci_acpi_chip_amd = {
+	.ops = &sdhci_acpi_ops_amd,
+};
+
+static int sdhci_acpi_emmc_amd_probe_slot(struct platform_device *pdev,
+					  const char *hid, const char *uid)
+{
+	struct sdhci_acpi_host *c = platform_get_drvdata(pdev);
+	struct sdhci_host *host   = c->host;
+
+	sdhci_read_caps(host);
+	if (host->caps1 & SDHCI_SUPPORT_DDR50)
+		host->mmc->caps = MMC_CAP_1_8V_DDR;
+
+	if ((host->caps1 & SDHCI_SUPPORT_SDR104) &&
+	    (host->mmc->caps & MMC_CAP_1_8V_DDR))
+		host->mmc->caps2 = MMC_CAP2_HS400_1_8V;
+
+	host->mmc_host_ops.select_drive_strength = amd_select_drive_strength;
+	host->mmc_host_ops.set_ios = amd_set_ios;
+	return 0;
+}
+
+static const struct sdhci_acpi_slot sdhci_acpi_slot_amd_emmc = {
+	.chip   = &sdhci_acpi_chip_amd,
+	.caps   = MMC_CAP_8_BIT_DATA | MMC_CAP_NONREMOVABLE,
+	.quirks = SDHCI_QUIRK_32BIT_DMA_ADDR | SDHCI_QUIRK_32BIT_DMA_SIZE |
+			SDHCI_QUIRK_32BIT_ADMA_SIZE,
+	.probe_slot     = sdhci_acpi_emmc_amd_probe_slot,
+};
+
 struct sdhci_acpi_uid_slot {
 	const char *hid;
 	const char *uid;
@@ -469,6 +581,7 @@ static const struct sdhci_acpi_uid_slot sdhci_acpi_uids[] = {
 	{ "PNP0D40"  },
 	{ "QCOM8051", NULL, &sdhci_acpi_slot_qcom_sd_3v },
 	{ "QCOM8052", NULL, &sdhci_acpi_slot_qcom_sd },
+	{ "AMDI0040", NULL, &sdhci_acpi_slot_amd_emmc },
 	{ },
 };
 
@@ -485,6 +598,7 @@ static const struct acpi_device_id sdhci_acpi_ids[] = {
 	{ "PNP0D40"  },
 	{ "QCOM8051" },
 	{ "QCOM8052" },
+	{ "AMDI0040" },
 	{ },
 };
 MODULE_DEVICE_TABLE(acpi, sdhci_acpi_ids);
@@ -566,6 +680,10 @@ static int sdhci_acpi_probe(struct platform_device *pdev)
 	host->hw_name	= "ACPI";
 	host->ops	= &sdhci_acpi_ops_dflt;
 	host->irq	= platform_get_irq(pdev, 0);
+	if (host->irq <= 0) {
+		err = -EINVAL;
+		goto err_free;
+	}
 
 	host->ioaddr = devm_ioremap_nocache(dev, iomem->start,
 					    resource_size(iomem));
@@ -609,10 +727,20 @@ static int sdhci_acpi_probe(struct platform_device *pdev)
 		}
 	}
 
-	err = sdhci_add_host(host);
+	err = sdhci_setup_host(host);
 	if (err)
 		goto err_free;
 
+	if (c->slot && c->slot->setup_host) {
+		err = c->slot->setup_host(pdev);
+		if (err)
+			goto err_cleanup;
+	}
+
+	err = __sdhci_add_host(host);
+	if (err)
+		goto err_cleanup;
+
 	if (c->use_runtime_pm) {
 		pm_runtime_set_active(dev);
 		pm_suspend_ignore_children(dev, 1);
@@ -625,6 +753,8 @@ static int sdhci_acpi_probe(struct platform_device *pdev)
 
 	return 0;
 
+err_cleanup:
+	sdhci_cleanup_host(c->host);
 err_free:
 	sdhci_free_host(c->host);
 	return err;
diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c
index 85140c9..cd2b5f6 100644
--- a/drivers/mmc/host/sdhci-esdhc-imx.c
+++ b/drivers/mmc/host/sdhci-esdhc-imx.c
@@ -193,6 +193,7 @@ struct pltfm_imx_data {
 	struct clk *clk_ipg;
 	struct clk *clk_ahb;
 	struct clk *clk_per;
+	unsigned int actual_clock;
 	enum {
 		NO_CMD_PENDING,      /* no multiblock command pending */
 		MULTIBLK_IN_PROCESS, /* exact multiblock cmd in process */
@@ -687,6 +688,20 @@ static inline void esdhc_pltfm_set_clock(struct sdhci_host *host,
 		return;
 	}
 
+	/* For i.MX53 eSDHCv3, SYSCTL.SDCLKFS may not be set to 0. */
+	if (is_imx53_esdhc(imx_data)) {
+		/*
+		 * According to the i.MX53 reference manual, if DLLCTRL[10] can
+		 * be set, then the controller is eSDHCv3, else it is eSDHCv2.
+		 */
+		val = readl(host->ioaddr + ESDHC_DLL_CTRL);
+		writel(val | BIT(10), host->ioaddr + ESDHC_DLL_CTRL);
+		temp = readl(host->ioaddr + ESDHC_DLL_CTRL);
+		writel(val, host->ioaddr + ESDHC_DLL_CTRL);
+		if (temp & BIT(10))
+			pre_div = 2;
+	}
+
 	temp = sdhci_readl(host, ESDHC_SYSTEM_CONTROL);
 	temp &= ~(ESDHC_CLOCK_IPGEN | ESDHC_CLOCK_HCKEN | ESDHC_CLOCK_PEREN
 		| ESDHC_CLOCK_MASK);
@@ -1389,11 +1404,15 @@ static int sdhci_esdhc_runtime_suspend(struct device *dev)
 	int ret;
 
 	ret = sdhci_runtime_suspend_host(host);
+	if (ret)
+		return ret;
 
 	if (host->tuning_mode != SDHCI_TUNING_MODE_3)
 		mmc_retune_needed(host->mmc);
 
 	if (!sdhci_sdio_irq_enabled(host)) {
+		imx_data->actual_clock = host->mmc->actual_clock;
+		esdhc_pltfm_set_clock(host, 0);
 		clk_disable_unprepare(imx_data->clk_per);
 		clk_disable_unprepare(imx_data->clk_ipg);
 	}
@@ -1409,31 +1428,34 @@ static int sdhci_esdhc_runtime_resume(struct device *dev)
 	struct pltfm_imx_data *imx_data = sdhci_pltfm_priv(pltfm_host);
 	int err;
 
+	err = clk_prepare_enable(imx_data->clk_ahb);
+	if (err)
+		return err;
+
 	if (!sdhci_sdio_irq_enabled(host)) {
 		err = clk_prepare_enable(imx_data->clk_per);
 		if (err)
-			return err;
+			goto disable_ahb_clk;
 		err = clk_prepare_enable(imx_data->clk_ipg);
 		if (err)
 			goto disable_per_clk;
+		esdhc_pltfm_set_clock(host, imx_data->actual_clock);
 	}
-	err = clk_prepare_enable(imx_data->clk_ahb);
-	if (err)
-		goto disable_ipg_clk;
+
 	err = sdhci_runtime_resume_host(host);
 	if (err)
-		goto disable_ahb_clk;
+		goto disable_ipg_clk;
 
 	return 0;
 
-disable_ahb_clk:
-	clk_disable_unprepare(imx_data->clk_ahb);
 disable_ipg_clk:
 	if (!sdhci_sdio_irq_enabled(host))
 		clk_disable_unprepare(imx_data->clk_ipg);
 disable_per_clk:
 	if (!sdhci_sdio_irq_enabled(host))
 		clk_disable_unprepare(imx_data->clk_per);
+disable_ahb_clk:
+	clk_disable_unprepare(imx_data->clk_ahb);
 	return err;
 }
 #endif
diff --git a/drivers/mmc/host/sdhci-of-arasan.c b/drivers/mmc/host/sdhci-of-arasan.c
index 0720ea7..c33a5f7 100644
--- a/drivers/mmc/host/sdhci-of-arasan.c
+++ b/drivers/mmc/host/sdhci-of-arasan.c
@@ -25,11 +25,13 @@
 #include <linux/of_device.h>
 #include <linux/phy/phy.h>
 #include <linux/regmap.h>
-#include "sdhci-pltfm.h"
 #include <linux/of.h>
 
-#define SDHCI_ARASAN_VENDOR_REGISTER	0x78
+#include "cqhci.h"
+#include "sdhci-pltfm.h"
 
+#define SDHCI_ARASAN_VENDOR_REGISTER	0x78
+#define SDHCI_ARASAN_CQE_BASE_ADDR	0x200
 #define VENDOR_ENHANCED_STROBE		BIT(0)
 
 #define PHY_CLK_TOO_SLOW_HZ		400000
@@ -90,6 +92,7 @@ struct sdhci_arasan_data {
 	struct phy	*phy;
 	bool		is_phy_on;
 
+	bool		has_cqe;
 	struct clk_hw	sdcardclk_hw;
 	struct clk      *sdcardclk;
 
@@ -262,6 +265,17 @@ static int sdhci_arasan_voltage_switch(struct mmc_host *mmc,
 	return -EINVAL;
 }
 
+static void sdhci_arasan_set_power(struct sdhci_host *host, unsigned char mode,
+		     unsigned short vdd)
+{
+	if (!IS_ERR(host->mmc->supply.vmmc)) {
+		struct mmc_host *mmc = host->mmc;
+
+		mmc_regulator_set_ocr(mmc, mmc->supply.vmmc, vdd);
+	}
+	sdhci_set_power_noreg(host, mode, vdd);
+}
+
 static const struct sdhci_ops sdhci_arasan_ops = {
 	.set_clock = sdhci_arasan_set_clock,
 	.get_max_clock = sdhci_pltfm_clk_get_max_clock,
@@ -269,6 +283,7 @@ static const struct sdhci_ops sdhci_arasan_ops = {
 	.set_bus_width = sdhci_set_bus_width,
 	.reset = sdhci_arasan_reset,
 	.set_uhs_signaling = sdhci_set_uhs_signaling,
+	.set_power = sdhci_arasan_set_power,
 };
 
 static const struct sdhci_pltfm_data sdhci_arasan_pdata = {
@@ -278,6 +293,62 @@ static const struct sdhci_pltfm_data sdhci_arasan_pdata = {
 			SDHCI_QUIRK2_CLOCK_DIV_ZERO_BROKEN,
 };
 
+static u32 sdhci_arasan_cqhci_irq(struct sdhci_host *host, u32 intmask)
+{
+	int cmd_error = 0;
+	int data_error = 0;
+
+	if (!sdhci_cqe_irq(host, intmask, &cmd_error, &data_error))
+		return intmask;
+
+	cqhci_irq(host->mmc, intmask, cmd_error, data_error);
+
+	return 0;
+}
+
+static void sdhci_arasan_dumpregs(struct mmc_host *mmc)
+{
+	sdhci_dumpregs(mmc_priv(mmc));
+}
+
+static void sdhci_arasan_cqe_enable(struct mmc_host *mmc)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	u32 reg;
+
+	reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	while (reg & SDHCI_DATA_AVAILABLE) {
+		sdhci_readl(host, SDHCI_BUFFER);
+		reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	}
+
+	sdhci_cqe_enable(mmc);
+}
+
+static const struct cqhci_host_ops sdhci_arasan_cqhci_ops = {
+	.enable         = sdhci_arasan_cqe_enable,
+	.disable        = sdhci_cqe_disable,
+	.dumpregs       = sdhci_arasan_dumpregs,
+};
+
+static const struct sdhci_ops sdhci_arasan_cqe_ops = {
+	.set_clock = sdhci_arasan_set_clock,
+	.get_max_clock = sdhci_pltfm_clk_get_max_clock,
+	.get_timeout_clock = sdhci_pltfm_clk_get_max_clock,
+	.set_bus_width = sdhci_set_bus_width,
+	.reset = sdhci_arasan_reset,
+	.set_uhs_signaling = sdhci_set_uhs_signaling,
+	.set_power = sdhci_arasan_set_power,
+	.irq = sdhci_arasan_cqhci_irq,
+};
+
+static const struct sdhci_pltfm_data sdhci_arasan_cqe_pdata = {
+	.ops = &sdhci_arasan_cqe_ops,
+	.quirks = SDHCI_QUIRK_CAP_CLOCK_BASE_BROKEN,
+	.quirks2 = SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
+			SDHCI_QUIRK2_CLOCK_DIV_ZERO_BROKEN,
+};
+
 #ifdef CONFIG_PM_SLEEP
 /**
  * sdhci_arasan_suspend - Suspend method for the driver
@@ -297,6 +368,12 @@ static int sdhci_arasan_suspend(struct device *dev)
 	if (host->tuning_mode != SDHCI_TUNING_MODE_3)
 		mmc_retune_needed(host->mmc);
 
+	if (sdhci_arasan->has_cqe) {
+		ret = cqhci_suspend(host->mmc);
+		if (ret)
+			return ret;
+	}
+
 	ret = sdhci_suspend_host(host);
 	if (ret)
 		return ret;
@@ -353,7 +430,16 @@ static int sdhci_arasan_resume(struct device *dev)
 		sdhci_arasan->is_phy_on = true;
 	}
 
-	return sdhci_resume_host(host);
+	ret = sdhci_resume_host(host);
+	if (ret) {
+		dev_err(dev, "Cannot resume host.\n");
+		return ret;
+	}
+
+	if (sdhci_arasan->has_cqe)
+		return cqhci_resume(host->mmc);
+
+	return 0;
 }
 #endif /* ! CONFIG_PM_SLEEP */
 
@@ -556,6 +642,49 @@ static void sdhci_arasan_unregister_sdclk(struct device *dev)
 	of_clk_del_provider(dev->of_node);
 }
 
+static int sdhci_arasan_add_host(struct sdhci_arasan_data *sdhci_arasan)
+{
+	struct sdhci_host *host = sdhci_arasan->host;
+	struct cqhci_host *cq_host;
+	bool dma64;
+	int ret;
+
+	if (!sdhci_arasan->has_cqe)
+		return sdhci_add_host(host);
+
+	ret = sdhci_setup_host(host);
+	if (ret)
+		return ret;
+
+	cq_host = devm_kzalloc(host->mmc->parent,
+			       sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	cq_host->mmio = host->ioaddr + SDHCI_ARASAN_CQE_BASE_ADDR;
+	cq_host->ops = &sdhci_arasan_cqhci_ops;
+
+	dma64 = host->flags & SDHCI_USE_64_BIT_DMA;
+	if (dma64)
+		cq_host->caps |= CQHCI_TASK_DESC_SZ_128;
+
+	ret = cqhci_init(cq_host, host->mmc, dma64);
+	if (ret)
+		goto cleanup;
+
+	ret = __sdhci_add_host(host);
+	if (ret)
+		goto cleanup;
+
+	return 0;
+
+cleanup:
+	sdhci_cleanup_host(host);
+	return ret;
+}
+
 static int sdhci_arasan_probe(struct platform_device *pdev)
 {
 	int ret;
@@ -566,9 +695,15 @@ static int sdhci_arasan_probe(struct platform_device *pdev)
 	struct sdhci_pltfm_host *pltfm_host;
 	struct sdhci_arasan_data *sdhci_arasan;
 	struct device_node *np = pdev->dev.of_node;
+	const struct sdhci_pltfm_data *pdata;
 
-	host = sdhci_pltfm_init(pdev, &sdhci_arasan_pdata,
-				sizeof(*sdhci_arasan));
+	if (of_device_is_compatible(pdev->dev.of_node, "arasan,sdhci-5.1"))
+		pdata = &sdhci_arasan_cqe_pdata;
+	else
+		pdata = &sdhci_arasan_pdata;
+
+	host = sdhci_pltfm_init(pdev, pdata, sizeof(*sdhci_arasan));
+
 	if (IS_ERR(host))
 		return PTR_ERR(host);
 
@@ -663,9 +798,11 @@ static int sdhci_arasan_probe(struct platform_device *pdev)
 					sdhci_arasan_hs400_enhanced_strobe;
 		host->mmc_host_ops.start_signal_voltage_switch =
 					sdhci_arasan_voltage_switch;
+		sdhci_arasan->has_cqe = true;
+		host->mmc->caps2 |= MMC_CAP2_CQE | MMC_CAP2_CQE_DCMD;
 	}
 
-	ret = sdhci_add_host(host);
+	ret = sdhci_arasan_add_host(sdhci_arasan);
 	if (ret)
 		goto err_add_host;
 
diff --git a/drivers/mmc/host/sdhci-of-esdhc.c b/drivers/mmc/host/sdhci-of-esdhc.c
index 1f42437..4ffa6b1 100644
--- a/drivers/mmc/host/sdhci-of-esdhc.c
+++ b/drivers/mmc/host/sdhci-of-esdhc.c
@@ -589,10 +589,18 @@ static void esdhc_pltfm_set_bus_width(struct sdhci_host *host, int width)
 
 static void esdhc_reset(struct sdhci_host *host, u8 mask)
 {
+	u32 val;
+
 	sdhci_reset(host, mask);
 
 	sdhci_writel(host, host->ier, SDHCI_INT_ENABLE);
 	sdhci_writel(host, host->ier, SDHCI_SIGNAL_ENABLE);
+
+	if (mask & SDHCI_RESET_ALL) {
+		val = sdhci_readl(host, ESDHC_TBCTL);
+		val &= ~ESDHC_TB_EN;
+		sdhci_writel(host, val, ESDHC_TBCTL);
+	}
 }
 
 /* The SCFG, Supplemental Configuration Unit, provides SoC specific
diff --git a/drivers/mmc/host/sdhci-pci-arasan.c b/drivers/mmc/host/sdhci-pci-arasan.c
new file mode 100644
index 0000000..499f320
--- /dev/null
+++ b/drivers/mmc/host/sdhci-pci-arasan.c
@@ -0,0 +1,331 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * sdhci-pci-arasan.c - Driver for Arasan PCI Controller with
+ * integrated phy.
+ *
+ * Copyright (C) 2017 Arasan Chip Systems Inc.
+ *
+ * Author: Atul Garg <agarg@arasan.com>
+ */
+
+#include <linux/pci.h>
+#include <linux/delay.h>
+
+#include "sdhci.h"
+#include "sdhci-pci.h"
+
+/* Extra registers for Arasan SD/SDIO/MMC Host Controller with PHY */
+#define PHY_ADDR_REG	0x300
+#define PHY_DAT_REG	0x304
+
+#define PHY_WRITE	BIT(8)
+#define PHY_BUSY	BIT(9)
+#define DATA_MASK	0xFF
+
+/* PHY Specific Registers */
+#define DLL_STATUS	0x00
+#define IPAD_CTRL1	0x01
+#define IPAD_CTRL2	0x02
+#define IPAD_STS	0x03
+#define IOREN_CTRL1	0x06
+#define IOREN_CTRL2	0x07
+#define IOPU_CTRL1	0x08
+#define IOPU_CTRL2	0x09
+#define ITAP_DELAY	0x0C
+#define OTAP_DELAY	0x0D
+#define STRB_SEL	0x0E
+#define CLKBUF_SEL	0x0F
+#define MODE_CTRL	0x11
+#define DLL_TRIM	0x12
+#define CMD_CTRL	0x20
+#define DATA_CTRL	0x21
+#define STRB_CTRL	0x22
+#define CLK_CTRL	0x23
+#define PHY_CTRL	0x24
+
+#define DLL_ENBL	BIT(3)
+#define RTRIM_EN	BIT(1)
+#define PDB_ENBL	BIT(1)
+#define RETB_ENBL	BIT(6)
+#define ODEN_CMD	BIT(1)
+#define ODEN_DAT	0xFF
+#define REN_STRB	BIT(0)
+#define REN_CMND	BIT(1)
+#define REN_DATA	0xFF
+#define PU_CMD		BIT(1)
+#define PU_DAT		0xFF
+#define ITAPDLY_EN	BIT(0)
+#define OTAPDLY_EN	BIT(0)
+#define OD_REL_CMD	BIT(1)
+#define OD_REL_DAT	0xFF
+#define DLLTRM_ICP	0x8
+#define PDB_CMND	BIT(0)
+#define PDB_DATA	0xFF
+#define PDB_STRB	BIT(0)
+#define PDB_CLOCK	BIT(0)
+#define CALDONE_MASK	0x10
+#define DLL_RDY_MASK	0x10
+#define MAX_CLK_BUF	0x7
+
+/* Mode Controls */
+#define ENHSTRB_MODE	BIT(0)
+#define HS400_MODE	BIT(1)
+#define LEGACY_MODE	BIT(2)
+#define DDR50_MODE	BIT(3)
+
+/*
+ * Controller has no specific bits for HS200/HS.
+ * Used BIT(4), BIT(5) for software programming.
+ */
+#define HS200_MODE	BIT(4)
+#define HISPD_MODE	BIT(5)
+
+#define OTAPDLY(x)	(((x) << 1) | OTAPDLY_EN)
+#define ITAPDLY(x)	(((x) << 1) | ITAPDLY_EN)
+#define FREQSEL(x)	(((x) << 5) | DLL_ENBL)
+#define IOPAD(x, y)	((x) | ((y) << 2))
+
+/* Arasan private data */
+struct arasan_host {
+	u32 chg_clk;
+};
+
+static int arasan_phy_addr_poll(struct sdhci_host *host, u32 offset, u32 mask)
+{
+	ktime_t timeout = ktime_add_us(ktime_get(), 100);
+	bool failed;
+	u8 val = 0;
+
+	while (1) {
+		failed = ktime_after(ktime_get(), timeout);
+		val = sdhci_readw(host, PHY_ADDR_REG);
+		if (!(val & mask))
+			return 0;
+		if (failed)
+			return -EBUSY;
+	}
+}
+
+static int arasan_phy_write(struct sdhci_host *host, u8 data, u8 offset)
+{
+	sdhci_writew(host, data, PHY_DAT_REG);
+	sdhci_writew(host, (PHY_WRITE | offset), PHY_ADDR_REG);
+	return arasan_phy_addr_poll(host, PHY_ADDR_REG, PHY_BUSY);
+}
+
+static int arasan_phy_read(struct sdhci_host *host, u8 offset, u8 *data)
+{
+	int ret;
+
+	sdhci_writew(host, 0, PHY_DAT_REG);
+	sdhci_writew(host, offset, PHY_ADDR_REG);
+	ret = arasan_phy_addr_poll(host, PHY_ADDR_REG, PHY_BUSY);
+
+	/* Masking valid data bits */
+	*data = sdhci_readw(host, PHY_DAT_REG) & DATA_MASK;
+	return ret;
+}
+
+static int arasan_phy_sts_poll(struct sdhci_host *host, u32 offset, u32 mask)
+{
+	int ret;
+	ktime_t timeout = ktime_add_us(ktime_get(), 100);
+	bool failed;
+	u8 val = 0;
+
+	while (1) {
+		failed = ktime_after(ktime_get(), timeout);
+		ret = arasan_phy_read(host, offset, &val);
+		if (ret)
+			return -EBUSY;
+		else if (val & mask)
+			return 0;
+		if (failed)
+			return -EBUSY;
+	}
+}
+
+/* Initialize the Arasan PHY */
+static int arasan_phy_init(struct sdhci_host *host)
+{
+	int ret;
+	u8 val;
+
+	/* Program IOPADs and wait for calibration to be done */
+	if (arasan_phy_read(host, IPAD_CTRL1, &val) ||
+	    arasan_phy_write(host, val | RETB_ENBL | PDB_ENBL, IPAD_CTRL1) ||
+	    arasan_phy_read(host, IPAD_CTRL2, &val) ||
+	    arasan_phy_write(host, val | RTRIM_EN, IPAD_CTRL2))
+		return -EBUSY;
+	ret = arasan_phy_sts_poll(host, IPAD_STS, CALDONE_MASK);
+	if (ret)
+		return -EBUSY;
+
+	/* Program CMD/Data lines */
+	if (arasan_phy_read(host, IOREN_CTRL1, &val) ||
+	    arasan_phy_write(host, val | REN_CMND | REN_STRB, IOREN_CTRL1) ||
+	    arasan_phy_read(host, IOPU_CTRL1, &val) ||
+	    arasan_phy_write(host, val | PU_CMD, IOPU_CTRL1) ||
+	    arasan_phy_read(host, CMD_CTRL, &val) ||
+	    arasan_phy_write(host, val | PDB_CMND, CMD_CTRL) ||
+	    arasan_phy_read(host, IOREN_CTRL2, &val) ||
+	    arasan_phy_write(host, val | REN_DATA, IOREN_CTRL2) ||
+	    arasan_phy_read(host, IOPU_CTRL2, &val) ||
+	    arasan_phy_write(host, val | PU_DAT, IOPU_CTRL2) ||
+	    arasan_phy_read(host, DATA_CTRL, &val) ||
+	    arasan_phy_write(host, val | PDB_DATA, DATA_CTRL) ||
+	    arasan_phy_read(host, STRB_CTRL, &val) ||
+	    arasan_phy_write(host, val | PDB_STRB, STRB_CTRL) ||
+	    arasan_phy_read(host, CLK_CTRL, &val) ||
+	    arasan_phy_write(host, val | PDB_CLOCK, CLK_CTRL) ||
+	    arasan_phy_read(host, CLKBUF_SEL, &val) ||
+	    arasan_phy_write(host, val | MAX_CLK_BUF, CLKBUF_SEL) ||
+	    arasan_phy_write(host, LEGACY_MODE, MODE_CTRL))
+		return -EBUSY;
+	return 0;
+}
+
+/* Set Arasan PHY for different modes */
+static int arasan_phy_set(struct sdhci_host *host, u8 mode, u8 otap,
+			  u8 drv_type, u8 itap, u8 trim, u8 clk)
+{
+	u8 val;
+	int ret;
+
+	if (mode == HISPD_MODE || mode == HS200_MODE)
+		ret = arasan_phy_write(host, 0x0, MODE_CTRL);
+	else
+		ret = arasan_phy_write(host, mode, MODE_CTRL);
+	if (ret)
+		return ret;
+	if (mode == HS400_MODE || mode == HS200_MODE) {
+		ret = arasan_phy_read(host, IPAD_CTRL1, &val);
+		if (ret)
+			return ret;
+		ret = arasan_phy_write(host, IOPAD(val, drv_type), IPAD_CTRL1);
+		if (ret)
+			return ret;
+	}
+	if (mode == LEGACY_MODE) {
+		ret = arasan_phy_write(host, 0x0, OTAP_DELAY);
+		if (ret)
+			return ret;
+		ret = arasan_phy_write(host, 0x0, ITAP_DELAY);
+	} else {
+		ret = arasan_phy_write(host, OTAPDLY(otap), OTAP_DELAY);
+		if (ret)
+			return ret;
+		if (mode != HS200_MODE)
+			ret = arasan_phy_write(host, ITAPDLY(itap), ITAP_DELAY);
+		else
+			ret = arasan_phy_write(host, 0x0, ITAP_DELAY);
+	}
+	if (ret)
+		return ret;
+	if (mode != LEGACY_MODE) {
+		ret = arasan_phy_write(host, trim, DLL_TRIM);
+		if (ret)
+			return ret;
+	}
+	ret = arasan_phy_write(host, 0, DLL_STATUS);
+	if (ret)
+		return ret;
+	if (mode != LEGACY_MODE) {
+		ret = arasan_phy_write(host, FREQSEL(clk), DLL_STATUS);
+		if (ret)
+			return ret;
+		ret = arasan_phy_sts_poll(host, DLL_STATUS, DLL_RDY_MASK);
+		if (ret)
+			return -EBUSY;
+	}
+	return 0;
+}
+
+static int arasan_select_phy_clock(struct sdhci_host *host)
+{
+	struct sdhci_pci_slot *slot = sdhci_priv(host);
+	struct arasan_host *arasan_host = sdhci_pci_priv(slot);
+	u8 clk;
+
+	if (arasan_host->chg_clk == host->mmc->ios.clock)
+		return 0;
+
+	arasan_host->chg_clk = host->mmc->ios.clock;
+	if (host->mmc->ios.clock == 200000000)
+		clk = 0x0;
+	else if (host->mmc->ios.clock == 100000000)
+		clk = 0x2;
+	else if (host->mmc->ios.clock == 50000000)
+		clk = 0x1;
+	else
+		clk = 0x0;
+
+	if (host->mmc_host_ops.hs400_enhanced_strobe) {
+		arasan_phy_set(host, ENHSTRB_MODE, 1, 0x0, 0x0,
+			       DLLTRM_ICP, clk);
+	} else {
+		switch (host->mmc->ios.timing) {
+		case MMC_TIMING_LEGACY:
+			arasan_phy_set(host, LEGACY_MODE, 0x0, 0x0, 0x0,
+				       0x0, 0x0);
+			break;
+		case MMC_TIMING_MMC_HS:
+		case MMC_TIMING_SD_HS:
+			arasan_phy_set(host, HISPD_MODE, 0x3, 0x0, 0x2,
+				       DLLTRM_ICP, clk);
+			break;
+		case MMC_TIMING_MMC_HS200:
+		case MMC_TIMING_UHS_SDR104:
+			arasan_phy_set(host, HS200_MODE, 0x2,
+				       host->mmc->ios.drv_type, 0x0,
+				       DLLTRM_ICP, clk);
+			break;
+		case MMC_TIMING_MMC_DDR52:
+		case MMC_TIMING_UHS_DDR50:
+			arasan_phy_set(host, DDR50_MODE, 0x1, 0x0,
+				       0x0, DLLTRM_ICP, clk);
+			break;
+		case MMC_TIMING_MMC_HS400:
+			arasan_phy_set(host, HS400_MODE, 0x1,
+				       host->mmc->ios.drv_type, 0xa,
+				       DLLTRM_ICP, clk);
+			break;
+		default:
+			break;
+		}
+	}
+	return 0;
+}
+
+static int arasan_pci_probe_slot(struct sdhci_pci_slot *slot)
+{
+	int err;
+
+	slot->host->mmc->caps |= MMC_CAP_NONREMOVABLE | MMC_CAP_8_BIT_DATA;
+	err = arasan_phy_init(slot->host);
+	if (err)
+		return -ENODEV;
+	return 0;
+}
+
+static void arasan_sdhci_set_clock(struct sdhci_host *host, unsigned int clock)
+{
+	sdhci_set_clock(host, clock);
+
+	/* Change phy settings for the new clock */
+	arasan_select_phy_clock(host);
+}
+
+static const struct sdhci_ops arasan_sdhci_pci_ops = {
+	.set_clock	= arasan_sdhci_set_clock,
+	.enable_dma	= sdhci_pci_enable_dma,
+	.set_bus_width	= sdhci_set_bus_width,
+	.reset		= sdhci_reset,
+	.set_uhs_signaling	= sdhci_set_uhs_signaling,
+};
+
+const struct sdhci_pci_fixes sdhci_arasan = {
+	.probe_slot = arasan_pci_probe_slot,
+	.ops        = &arasan_sdhci_pci_ops,
+	.priv_size  = sizeof(struct arasan_host),
+};
diff --git a/drivers/mmc/host/sdhci-pci-core.c b/drivers/mmc/host/sdhci-pci-core.c
index 3e4f04f..6d1a983 100644
--- a/drivers/mmc/host/sdhci-pci-core.c
+++ b/drivers/mmc/host/sdhci-pci-core.c
@@ -30,17 +30,37 @@
 #include <linux/mmc/sdhci-pci-data.h>
 #include <linux/acpi.h>
 
+#include "cqhci.h"
+
 #include "sdhci.h"
 #include "sdhci-pci.h"
 
-static int sdhci_pci_enable_dma(struct sdhci_host *host);
 static void sdhci_pci_hw_reset(struct sdhci_host *host);
 
 #ifdef CONFIG_PM_SLEEP
-static int __sdhci_pci_suspend_host(struct sdhci_pci_chip *chip)
+static int sdhci_pci_init_wakeup(struct sdhci_pci_chip *chip)
+{
+	mmc_pm_flag_t pm_flags = 0;
+	int i;
+
+	for (i = 0; i < chip->num_slots; i++) {
+		struct sdhci_pci_slot *slot = chip->slots[i];
+
+		if (slot)
+			pm_flags |= slot->host->mmc->pm_flags;
+	}
+
+	return device_set_wakeup_enable(&chip->pdev->dev,
+					(pm_flags & MMC_PM_KEEP_POWER) &&
+					(pm_flags & MMC_PM_WAKE_SDIO_IRQ));
+}
+
+static int sdhci_pci_suspend_host(struct sdhci_pci_chip *chip)
 {
 	int i, ret;
 
+	sdhci_pci_init_wakeup(chip);
+
 	for (i = 0; i < chip->num_slots; i++) {
 		struct sdhci_pci_slot *slot = chip->slots[i];
 		struct sdhci_host *host;
@@ -56,9 +76,6 @@ static int __sdhci_pci_suspend_host(struct sdhci_pci_chip *chip)
 		ret = sdhci_suspend_host(host);
 		if (ret)
 			goto err_pci_suspend;
-
-		if (host->mmc->pm_flags & MMC_PM_WAKE_SDIO_IRQ)
-			sdhci_enable_irq_wakeups(host);
 	}
 
 	return 0;
@@ -69,36 +86,6 @@ static int __sdhci_pci_suspend_host(struct sdhci_pci_chip *chip)
 	return ret;
 }
 
-static int sdhci_pci_init_wakeup(struct sdhci_pci_chip *chip)
-{
-	mmc_pm_flag_t pm_flags = 0;
-	int i;
-
-	for (i = 0; i < chip->num_slots; i++) {
-		struct sdhci_pci_slot *slot = chip->slots[i];
-
-		if (slot)
-			pm_flags |= slot->host->mmc->pm_flags;
-	}
-
-	return device_init_wakeup(&chip->pdev->dev,
-				  (pm_flags & MMC_PM_KEEP_POWER) &&
-				  (pm_flags & MMC_PM_WAKE_SDIO_IRQ));
-}
-
-static int sdhci_pci_suspend_host(struct sdhci_pci_chip *chip)
-{
-	int ret;
-
-	ret = __sdhci_pci_suspend_host(chip);
-	if (ret)
-		return ret;
-
-	sdhci_pci_init_wakeup(chip);
-
-	return 0;
-}
-
 int sdhci_pci_resume_host(struct sdhci_pci_chip *chip)
 {
 	struct sdhci_pci_slot *slot;
@@ -116,6 +103,28 @@ int sdhci_pci_resume_host(struct sdhci_pci_chip *chip)
 
 	return 0;
 }
+
+static int sdhci_cqhci_suspend(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = cqhci_suspend(chip->slots[0]->host->mmc);
+	if (ret)
+		return ret;
+
+	return sdhci_pci_suspend_host(chip);
+}
+
+static int sdhci_cqhci_resume(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = sdhci_pci_resume_host(chip);
+	if (ret)
+		return ret;
+
+	return cqhci_resume(chip->slots[0]->host->mmc);
+}
 #endif
 
 #ifdef CONFIG_PM
@@ -166,8 +175,48 @@ static int sdhci_pci_runtime_resume_host(struct sdhci_pci_chip *chip)
 
 	return 0;
 }
+
+static int sdhci_cqhci_runtime_suspend(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = cqhci_suspend(chip->slots[0]->host->mmc);
+	if (ret)
+		return ret;
+
+	return sdhci_pci_runtime_suspend_host(chip);
+}
+
+static int sdhci_cqhci_runtime_resume(struct sdhci_pci_chip *chip)
+{
+	int ret;
+
+	ret = sdhci_pci_runtime_resume_host(chip);
+	if (ret)
+		return ret;
+
+	return cqhci_resume(chip->slots[0]->host->mmc);
+}
 #endif
 
+static u32 sdhci_cqhci_irq(struct sdhci_host *host, u32 intmask)
+{
+	int cmd_error = 0;
+	int data_error = 0;
+
+	if (!sdhci_cqe_irq(host, intmask, &cmd_error, &data_error))
+		return intmask;
+
+	cqhci_irq(host->mmc, intmask, cmd_error, data_error);
+
+	return 0;
+}
+
+static void sdhci_pci_dumpregs(struct mmc_host *mmc)
+{
+	sdhci_dumpregs(mmc_priv(mmc));
+}
+
 /*****************************************************************************\
  *                                                                           *
  * Hardware specific quirk handling                                          *
@@ -583,6 +632,18 @@ static const struct sdhci_ops sdhci_intel_byt_ops = {
 	.voltage_switch		= sdhci_intel_voltage_switch,
 };
 
+static const struct sdhci_ops sdhci_intel_glk_ops = {
+	.set_clock		= sdhci_set_clock,
+	.set_power		= sdhci_intel_set_power,
+	.enable_dma		= sdhci_pci_enable_dma,
+	.set_bus_width		= sdhci_set_bus_width,
+	.reset			= sdhci_reset,
+	.set_uhs_signaling	= sdhci_set_uhs_signaling,
+	.hw_reset		= sdhci_pci_hw_reset,
+	.voltage_switch		= sdhci_intel_voltage_switch,
+	.irq			= sdhci_cqhci_irq,
+};
+
 static void byt_read_dsm(struct sdhci_pci_slot *slot)
 {
 	struct intel_host *intel_host = sdhci_pci_priv(slot);
@@ -612,15 +673,83 @@ static int glk_emmc_probe_slot(struct sdhci_pci_slot *slot)
 {
 	int ret = byt_emmc_probe_slot(slot);
 
+	slot->host->mmc->caps2 |= MMC_CAP2_CQE;
+
 	if (slot->chip->pdev->device != PCI_DEVICE_ID_INTEL_GLK_EMMC) {
 		slot->host->mmc->caps2 |= MMC_CAP2_HS400_ES,
 		slot->host->mmc_host_ops.hs400_enhanced_strobe =
 						intel_hs400_enhanced_strobe;
+		slot->host->mmc->caps2 |= MMC_CAP2_CQE_DCMD;
 	}
 
 	return ret;
 }
 
+static void glk_cqe_enable(struct mmc_host *mmc)
+{
+	struct sdhci_host *host = mmc_priv(mmc);
+	u32 reg;
+
+	/*
+	 * CQE gets stuck if it sees Buffer Read Enable bit set, which can be
+	 * the case after tuning, so ensure the buffer is drained.
+	 */
+	reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	while (reg & SDHCI_DATA_AVAILABLE) {
+		sdhci_readl(host, SDHCI_BUFFER);
+		reg = sdhci_readl(host, SDHCI_PRESENT_STATE);
+	}
+
+	sdhci_cqe_enable(mmc);
+}
+
+static const struct cqhci_host_ops glk_cqhci_ops = {
+	.enable		= glk_cqe_enable,
+	.disable	= sdhci_cqe_disable,
+	.dumpregs	= sdhci_pci_dumpregs,
+};
+
+static int glk_emmc_add_host(struct sdhci_pci_slot *slot)
+{
+	struct device *dev = &slot->chip->pdev->dev;
+	struct sdhci_host *host = slot->host;
+	struct cqhci_host *cq_host;
+	bool dma64;
+	int ret;
+
+	ret = sdhci_setup_host(host);
+	if (ret)
+		return ret;
+
+	cq_host = devm_kzalloc(dev, sizeof(*cq_host), GFP_KERNEL);
+	if (!cq_host) {
+		ret = -ENOMEM;
+		goto cleanup;
+	}
+
+	cq_host->mmio = host->ioaddr + 0x200;
+	cq_host->quirks |= CQHCI_QUIRK_SHORT_TXFR_DESC_SZ;
+	cq_host->ops = &glk_cqhci_ops;
+
+	dma64 = host->flags & SDHCI_USE_64_BIT_DMA;
+	if (dma64)
+		cq_host->caps |= CQHCI_TASK_DESC_SZ_128;
+
+	ret = cqhci_init(cq_host, host->mmc, dma64);
+	if (ret)
+		goto cleanup;
+
+	ret = __sdhci_add_host(host);
+	if (ret)
+		goto cleanup;
+
+	return 0;
+
+cleanup:
+	sdhci_cleanup_host(host);
+	return ret;
+}
+
 #ifdef CONFIG_ACPI
 static int ni_set_max_freq(struct sdhci_pci_slot *slot)
 {
@@ -699,11 +828,20 @@ static const struct sdhci_pci_fixes sdhci_intel_byt_emmc = {
 static const struct sdhci_pci_fixes sdhci_intel_glk_emmc = {
 	.allow_runtime_pm	= true,
 	.probe_slot		= glk_emmc_probe_slot,
+	.add_host		= glk_emmc_add_host,
+#ifdef CONFIG_PM_SLEEP
+	.suspend		= sdhci_cqhci_suspend,
+	.resume			= sdhci_cqhci_resume,
+#endif
+#ifdef CONFIG_PM
+	.runtime_suspend	= sdhci_cqhci_runtime_suspend,
+	.runtime_resume		= sdhci_cqhci_runtime_resume,
+#endif
 	.quirks			= SDHCI_QUIRK_NO_ENDATTR_IN_NOPDESC,
 	.quirks2		= SDHCI_QUIRK2_PRESET_VALUE_BROKEN |
 				  SDHCI_QUIRK2_CAPS_BIT63_FOR_HS400 |
 				  SDHCI_QUIRK2_STOP_WITH_TC,
-	.ops			= &sdhci_intel_byt_ops,
+	.ops			= &sdhci_intel_glk_ops,
 	.priv_size		= sizeof(struct intel_host),
 };
 
@@ -778,6 +916,8 @@ static int intel_mrfld_mmc_probe_slot(struct sdhci_pci_slot *slot)
 		slot->host->quirks2 |= SDHCI_QUIRK2_NO_1_8_V;
 		break;
 	case INTEL_MRFLD_SDIO:
+		/* Advertise 2.0v for compatibility with the SDIO card's OCR */
+		slot->host->ocr_mask = MMC_VDD_20_21 | MMC_VDD_165_195;
 		slot->host->mmc->caps |= MMC_CAP_NONREMOVABLE |
 					 MMC_CAP_POWER_OFF_CARD;
 		break;
@@ -955,7 +1095,7 @@ static int jmicron_suspend(struct sdhci_pci_chip *chip)
 {
 	int i, ret;
 
-	ret = __sdhci_pci_suspend_host(chip);
+	ret = sdhci_pci_suspend_host(chip);
 	if (ret)
 		return ret;
 
@@ -965,8 +1105,6 @@ static int jmicron_suspend(struct sdhci_pci_chip *chip)
 			jmicron_enable_mmc(chip->slots[i]->host, 0);
 	}
 
-	sdhci_pci_init_wakeup(chip);
-
 	return 0;
 }
 
@@ -1306,6 +1444,7 @@ static const struct pci_device_id pci_ids[] = {
 	SDHCI_PCI_DEVICE(O2, SDS1,     o2),
 	SDHCI_PCI_DEVICE(O2, SEABIRD0, o2),
 	SDHCI_PCI_DEVICE(O2, SEABIRD1, o2),
+	SDHCI_PCI_DEVICE(ARASAN, PHY_EMMC, arasan),
 	SDHCI_PCI_DEVICE_CLASS(AMD, SYSTEM_SDHCI, PCI_CLASS_MASK, amd),
 	/* Generic SD host controller */
 	{PCI_DEVICE_CLASS(SYSTEM_SDHCI, PCI_CLASS_MASK)},
@@ -1320,7 +1459,7 @@ MODULE_DEVICE_TABLE(pci, pci_ids);
  *                                                                           *
 \*****************************************************************************/
 
-static int sdhci_pci_enable_dma(struct sdhci_host *host)
+int sdhci_pci_enable_dma(struct sdhci_host *host)
 {
 	struct sdhci_pci_slot *slot;
 	struct pci_dev *pdev;
@@ -1543,10 +1682,13 @@ static struct sdhci_pci_slot *sdhci_pci_probe_slot(
 		}
 	}
 
-	host->mmc->pm_caps = MMC_PM_KEEP_POWER | MMC_PM_WAKE_SDIO_IRQ;
+	host->mmc->pm_caps = MMC_PM_KEEP_POWER;
 	host->mmc->slotno = slotno;
 	host->mmc->caps2 |= MMC_CAP2_NO_PRESCAN_POWERUP;
 
+	if (device_can_wakeup(&pdev->dev))
+		host->mmc->pm_caps |= MMC_PM_WAKE_SDIO_IRQ;
+
 	if (slot->cd_idx >= 0) {
 		ret = mmc_gpiod_request_cd(host->mmc, NULL, slot->cd_idx,
 					   slot->cd_override_level, 0, NULL);
diff --git a/drivers/mmc/host/sdhci-pci.h b/drivers/mmc/host/sdhci-pci.h
index 0056f08..5cbcdc4 100644
--- a/drivers/mmc/host/sdhci-pci.h
+++ b/drivers/mmc/host/sdhci-pci.h
@@ -55,6 +55,9 @@
 
 #define PCI_SUBDEVICE_ID_NI_7884	0x7884
 
+#define PCI_VENDOR_ID_ARASAN		0x16e6
+#define PCI_DEVICE_ID_ARASAN_PHY_EMMC	0x0670
+
 /*
  * PCI device class and mask
  */
@@ -170,11 +173,13 @@ static inline void *sdhci_pci_priv(struct sdhci_pci_slot *slot)
 #ifdef CONFIG_PM_SLEEP
 int sdhci_pci_resume_host(struct sdhci_pci_chip *chip);
 #endif
-
+int sdhci_pci_enable_dma(struct sdhci_host *host);
 int sdhci_pci_o2_probe_slot(struct sdhci_pci_slot *slot);
 int sdhci_pci_o2_probe(struct sdhci_pci_chip *chip);
 #ifdef CONFIG_PM_SLEEP
 int sdhci_pci_o2_resume(struct sdhci_pci_chip *chip);
 #endif
 
+extern const struct sdhci_pci_fixes sdhci_arasan;
+
 #endif /* __SDHCI_PCI_H */
diff --git a/drivers/mmc/host/sdhci-spear.c b/drivers/mmc/host/sdhci-spear.c
index 8c0f884..1451152 100644
--- a/drivers/mmc/host/sdhci-spear.c
+++ b/drivers/mmc/host/sdhci-spear.c
@@ -82,6 +82,10 @@ static int sdhci_probe(struct platform_device *pdev)
 	host->hw_name = "sdhci";
 	host->ops = &sdhci_pltfm_ops;
 	host->irq = platform_get_irq(pdev, 0);
+	if (host->irq <= 0) {
+		ret = -EINVAL;
+		goto err_host;
+	}
 	host->quirks = SDHCI_QUIRK_BROKEN_ADMA;
 
 	sdhci = sdhci_priv(host);
diff --git a/drivers/mmc/host/sdhci-xenon.c b/drivers/mmc/host/sdhci-xenon.c
index 0842bbc..4d0791f 100644
--- a/drivers/mmc/host/sdhci-xenon.c
+++ b/drivers/mmc/host/sdhci-xenon.c
@@ -230,7 +230,14 @@ static void xenon_set_power(struct sdhci_host *host, unsigned char mode,
 		mmc_regulator_set_ocr(mmc, mmc->supply.vmmc, vdd);
 }
 
+static void xenon_voltage_switch(struct sdhci_host *host)
+{
+	/* Wait for 5ms after set 1.8V signal enable bit */
+	usleep_range(5000, 5500);
+}
+
 static const struct sdhci_ops sdhci_xenon_ops = {
+	.voltage_switch		= xenon_voltage_switch,
 	.set_clock		= sdhci_set_clock,
 	.set_power		= xenon_set_power,
 	.set_bus_width		= sdhci_set_bus_width,
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index e9290a3..070aff9 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -1434,6 +1434,13 @@ void sdhci_set_power_noreg(struct sdhci_host *host, unsigned char mode,
 	if (mode != MMC_POWER_OFF) {
 		switch (1 << vdd) {
 		case MMC_VDD_165_195:
+		/*
+		 * Without a regulator, SDHCI does not support 2.0v
+		 * so we only get here if the driver deliberately
+		 * added the 2.0v range to ocr_avail. Map it to 1.8v
+		 * for the purpose of turning on the power.
+		 */
+		case MMC_VDD_20_21:
 			pwr = SDHCI_POWER_180;
 			break;
 		case MMC_VDD_29_30:
@@ -2821,25 +2828,33 @@ static irqreturn_t sdhci_thread_irq(int irq, void *dev_id)
  * sdhci_disable_irq_wakeups() since it will be set by
  * sdhci_enable_card_detection() or sdhci_init().
  */
-void sdhci_enable_irq_wakeups(struct sdhci_host *host)
+static bool sdhci_enable_irq_wakeups(struct sdhci_host *host)
 {
+	u8 mask = SDHCI_WAKE_ON_INSERT | SDHCI_WAKE_ON_REMOVE |
+		  SDHCI_WAKE_ON_INT;
+	u32 irq_val = 0;
+	u8 wake_val = 0;
 	u8 val;
-	u8 mask = SDHCI_WAKE_ON_INSERT | SDHCI_WAKE_ON_REMOVE
-			| SDHCI_WAKE_ON_INT;
-	u32 irq_val = SDHCI_INT_CARD_INSERT | SDHCI_INT_CARD_REMOVE |
-		      SDHCI_INT_CARD_INT;
+
+	if (!(host->quirks & SDHCI_QUIRK_BROKEN_CARD_DETECTION)) {
+		wake_val |= SDHCI_WAKE_ON_INSERT | SDHCI_WAKE_ON_REMOVE;
+		irq_val |= SDHCI_INT_CARD_INSERT | SDHCI_INT_CARD_REMOVE;
+	}
+
+	wake_val |= SDHCI_WAKE_ON_INT;
+	irq_val |= SDHCI_INT_CARD_INT;
 
 	val = sdhci_readb(host, SDHCI_WAKE_UP_CONTROL);
-	val |= mask ;
-	/* Avoid fake wake up */
-	if (host->quirks & SDHCI_QUIRK_BROKEN_CARD_DETECTION) {
-		val &= ~(SDHCI_WAKE_ON_INSERT | SDHCI_WAKE_ON_REMOVE);
-		irq_val &= ~(SDHCI_INT_CARD_INSERT | SDHCI_INT_CARD_REMOVE);
-	}
+	val &= ~mask;
+	val |= wake_val;
 	sdhci_writeb(host, val, SDHCI_WAKE_UP_CONTROL);
+
 	sdhci_writel(host, irq_val, SDHCI_INT_ENABLE);
+
+	host->irq_wake_enabled = !enable_irq_wake(host->irq);
+
+	return host->irq_wake_enabled;
 }
-EXPORT_SYMBOL_GPL(sdhci_enable_irq_wakeups);
 
 static void sdhci_disable_irq_wakeups(struct sdhci_host *host)
 {
@@ -2850,6 +2865,10 @@ static void sdhci_disable_irq_wakeups(struct sdhci_host *host)
 	val = sdhci_readb(host, SDHCI_WAKE_UP_CONTROL);
 	val &= ~mask;
 	sdhci_writeb(host, val, SDHCI_WAKE_UP_CONTROL);
+
+	disable_irq_wake(host->irq);
+
+	host->irq_wake_enabled = false;
 }
 
 int sdhci_suspend_host(struct sdhci_host *host)
@@ -2858,15 +2877,14 @@ int sdhci_suspend_host(struct sdhci_host *host)
 
 	mmc_retune_timer_stop(host->mmc);
 
-	if (!device_may_wakeup(mmc_dev(host->mmc))) {
+	if (!device_may_wakeup(mmc_dev(host->mmc)) ||
+	    !sdhci_enable_irq_wakeups(host)) {
 		host->ier = 0;
 		sdhci_writel(host, 0, SDHCI_INT_ENABLE);
 		sdhci_writel(host, 0, SDHCI_SIGNAL_ENABLE);
 		free_irq(host->irq, host);
-	} else {
-		sdhci_enable_irq_wakeups(host);
-		enable_irq_wake(host->irq);
 	}
+
 	return 0;
 }
 
@@ -2894,15 +2912,14 @@ int sdhci_resume_host(struct sdhci_host *host)
 		mmiowb();
 	}
 
-	if (!device_may_wakeup(mmc_dev(host->mmc))) {
+	if (host->irq_wake_enabled) {
+		sdhci_disable_irq_wakeups(host);
+	} else {
 		ret = request_threaded_irq(host->irq, sdhci_irq,
 					   sdhci_thread_irq, IRQF_SHARED,
 					   mmc_hostname(host->mmc), host);
 		if (ret)
 			return ret;
-	} else {
-		sdhci_disable_irq_wakeups(host);
-		disable_irq_wake(host->irq);
 	}
 
 	sdhci_enable_card_detection(host);
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 54bc444..afab26f 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -484,6 +484,7 @@ struct sdhci_host {
 	bool bus_on;		/* Bus power prevents runtime suspend */
 	bool preset_enabled;	/* Preset is enabled */
 	bool pending_reset;	/* Cmd/data reset is pending */
+	bool irq_wake_enabled;	/* IRQ wakeup is enabled */
 
 	struct mmc_request *mrqs_done[SDHCI_MAX_MRQS];	/* Requests done */
 	struct mmc_command *cmd;	/* Current command */
@@ -718,7 +719,6 @@ void sdhci_enable_sdio_irq(struct mmc_host *mmc, int enable);
 #ifdef CONFIG_PM
 int sdhci_suspend_host(struct sdhci_host *host);
 int sdhci_resume_host(struct sdhci_host *host);
-void sdhci_enable_irq_wakeups(struct sdhci_host *host);
 int sdhci_runtime_suspend_host(struct sdhci_host *host);
 int sdhci_runtime_resume_host(struct sdhci_host *host);
 #endif
diff --git a/drivers/mmc/host/sdhci_f_sdh30.c b/drivers/mmc/host/sdhci_f_sdh30.c
index 04ca0d3..485f759 100644
--- a/drivers/mmc/host/sdhci_f_sdh30.c
+++ b/drivers/mmc/host/sdhci_f_sdh30.c
@@ -10,9 +10,11 @@
  * the Free Software Foundation, version 2 of the License.
  */
 
+#include <linux/acpi.h>
 #include <linux/err.h>
 #include <linux/delay.h>
 #include <linux/module.h>
+#include <linux/of.h>
 #include <linux/property.h>
 #include <linux/clk.h>
 
@@ -146,7 +148,6 @@ static int sdhci_f_sdh30_probe(struct platform_device *pdev)
 
 	platform_set_drvdata(pdev, host);
 
-	sdhci_get_of_property(pdev);
 	host->hw_name = "f_sdh30";
 	host->ops = &sdhci_f_sdh30_ops;
 	host->irq = irq;
@@ -158,26 +159,30 @@ static int sdhci_f_sdh30_probe(struct platform_device *pdev)
 		goto err;
 	}
 
-	priv->clk_iface = devm_clk_get(&pdev->dev, "iface");
-	if (IS_ERR(priv->clk_iface)) {
-		ret = PTR_ERR(priv->clk_iface);
-		goto err;
+	if (dev_of_node(dev)) {
+		sdhci_get_of_property(pdev);
+
+		priv->clk_iface = devm_clk_get(&pdev->dev, "iface");
+		if (IS_ERR(priv->clk_iface)) {
+			ret = PTR_ERR(priv->clk_iface);
+			goto err;
+		}
+
+		ret = clk_prepare_enable(priv->clk_iface);
+		if (ret)
+			goto err;
+
+		priv->clk = devm_clk_get(&pdev->dev, "core");
+		if (IS_ERR(priv->clk)) {
+			ret = PTR_ERR(priv->clk);
+			goto err_clk;
+		}
+
+		ret = clk_prepare_enable(priv->clk);
+		if (ret)
+			goto err_clk;
 	}
 
-	ret = clk_prepare_enable(priv->clk_iface);
-	if (ret)
-		goto err;
-
-	priv->clk = devm_clk_get(&pdev->dev, "core");
-	if (IS_ERR(priv->clk)) {
-		ret = PTR_ERR(priv->clk);
-		goto err_clk;
-	}
-
-	ret = clk_prepare_enable(priv->clk);
-	if (ret)
-		goto err_clk;
-
 	/* init vendor specific regs */
 	ctrl = sdhci_readw(host, F_SDH30_AHB_CONFIG);
 	ctrl |= F_SDH30_SIN | F_SDH30_AHB_INCR_16 | F_SDH30_AHB_INCR_8 |
@@ -226,16 +231,27 @@ static int sdhci_f_sdh30_remove(struct platform_device *pdev)
 	return 0;
 }
 
+#ifdef CONFIG_OF
 static const struct of_device_id f_sdh30_dt_ids[] = {
 	{ .compatible = "fujitsu,mb86s70-sdhci-3.0" },
 	{ /* sentinel */ }
 };
 MODULE_DEVICE_TABLE(of, f_sdh30_dt_ids);
+#endif
+
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id f_sdh30_acpi_ids[] = {
+	{ "SCX0002" },
+	{ /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(acpi, f_sdh30_acpi_ids);
+#endif
 
 static struct platform_driver sdhci_f_sdh30_driver = {
 	.driver = {
 		.name = "f_sdh30",
-		.of_match_table = f_sdh30_dt_ids,
+		.of_match_table = of_match_ptr(f_sdh30_dt_ids),
+		.acpi_match_table = ACPI_PTR(f_sdh30_acpi_ids),
 		.pm	= &sdhci_pltfm_pmops,
 	},
 	.probe	= sdhci_f_sdh30_probe,
diff --git a/drivers/mmc/host/sh_mmcif.c b/drivers/mmc/host/sh_mmcif.c
index 53fb18b..7bb00c6 100644
--- a/drivers/mmc/host/sh_mmcif.c
+++ b/drivers/mmc/host/sh_mmcif.c
@@ -916,7 +916,7 @@ static void sh_mmcif_start_cmd(struct sh_mmcif_host *host,
 			       struct mmc_request *mrq)
 {
 	struct mmc_command *cmd = mrq->cmd;
-	u32 opc = cmd->opcode;
+	u32 opc;
 	u32 mask = 0;
 	unsigned long flags;
 
diff --git a/drivers/mmc/host/sunxi-mmc.c b/drivers/mmc/host/sunxi-mmc.c
index cc98355d..bad612d 100644
--- a/drivers/mmc/host/sunxi-mmc.c
+++ b/drivers/mmc/host/sunxi-mmc.c
@@ -3,7 +3,7 @@
  * (C) Copyright 2007-2011 Reuuimlla Technology Co., Ltd.
  * (C) Copyright 2007-2011 Aaron Maoye <leafy.myeh@reuuimllatech.com>
  * (C) Copyright 2013-2014 O2S GmbH <www.o2s.ch>
- * (C) Copyright 2013-2014 David Lanzend�rfer <david.lanzendoerfer@o2s.ch>
+ * (C) Copyright 2013-2014 David Lanzendörfer <david.lanzendoerfer@o2s.ch>
  * (C) Copyright 2013-2014 Hans de Goede <hdegoede@redhat.com>
  * (C) Copyright 2017 Sootech SA
  *
@@ -1255,6 +1255,11 @@ static int sunxi_mmc_resource_request(struct sunxi_mmc_host *host,
 		goto error_assert_reset;
 
 	host->irq = platform_get_irq(pdev, 0);
+	if (host->irq <= 0) {
+		ret = -EINVAL;
+		goto error_assert_reset;
+	}
+
 	return devm_request_threaded_irq(&pdev->dev, host->irq, sunxi_mmc_irq,
 			sunxi_mmc_handle_manual_stop, 0, "sunxi-mmc", host);
 
@@ -1393,5 +1398,5 @@ module_platform_driver(sunxi_mmc_driver);
 
 MODULE_DESCRIPTION("Allwinner's SD/MMC Card Controller Driver");
 MODULE_LICENSE("GPL v2");
-MODULE_AUTHOR("David Lanzend�rfer <david.lanzendoerfer@o2s.ch>");
+MODULE_AUTHOR("David Lanzendörfer <david.lanzendoerfer@o2s.ch>");
 MODULE_ALIAS("platform:sunxi-mmc");
diff --git a/drivers/mmc/host/tmio_mmc.c b/drivers/mmc/host/tmio_mmc.c
index 64b7e9f..43a2ea5 100644
--- a/drivers/mmc/host/tmio_mmc.c
+++ b/drivers/mmc/host/tmio_mmc.c
@@ -92,14 +92,19 @@ static int tmio_mmc_probe(struct platform_device *pdev)
 
 	pdata->flags |= TMIO_MMC_HAVE_HIGH_REG;
 
-	host = tmio_mmc_host_alloc(pdev);
-	if (!host)
+	host = tmio_mmc_host_alloc(pdev, pdata);
+	if (IS_ERR(host)) {
+		ret = PTR_ERR(host);
 		goto cell_disable;
+	}
 
 	/* SD control register space size is 0x200, 0x400 for bus_shift=1 */
 	host->bus_shift = resource_size(res) >> 10;
 
-	ret = tmio_mmc_host_probe(host, pdata, NULL);
+	host->mmc->f_max = pdata->hclk;
+	host->mmc->f_min = pdata->hclk / 512;
+
+	ret = tmio_mmc_host_probe(host);
 	if (ret)
 		goto host_free;
 
@@ -128,15 +133,11 @@ static int tmio_mmc_probe(struct platform_device *pdev)
 static int tmio_mmc_remove(struct platform_device *pdev)
 {
 	const struct mfd_cell *cell = mfd_get_cell(pdev);
-	struct mmc_host *mmc = platform_get_drvdata(pdev);
+	struct tmio_mmc_host *host = platform_get_drvdata(pdev);
 
-	if (mmc) {
-		struct tmio_mmc_host *host = mmc_priv(mmc);
-
-		tmio_mmc_host_remove(host);
-		if (cell->disable)
-			cell->disable(pdev);
-	}
+	tmio_mmc_host_remove(host);
+	if (cell->disable)
+		cell->disable(pdev);
 
 	return 0;
 }
diff --git a/drivers/mmc/host/tmio_mmc.h b/drivers/mmc/host/tmio_mmc.h
index 3e6ff892..e7d6513 100644
--- a/drivers/mmc/host/tmio_mmc.h
+++ b/drivers/mmc/host/tmio_mmc.h
@@ -112,12 +112,6 @@
 struct tmio_mmc_data;
 struct tmio_mmc_host;
 
-struct tmio_mmc_dma {
-	enum dma_slave_buswidth dma_buswidth;
-	bool (*filter)(struct dma_chan *chan, void *arg);
-	void (*enable)(struct tmio_mmc_host *host, bool enable);
-};
-
 struct tmio_mmc_dma_ops {
 	void (*start)(struct tmio_mmc_host *host, struct mmc_data *data);
 	void (*enable)(struct tmio_mmc_host *host, bool enable);
@@ -134,6 +128,7 @@ struct tmio_mmc_host {
 	struct mmc_request      *mrq;
 	struct mmc_data         *data;
 	struct mmc_host         *mmc;
+	struct mmc_host_ops     ops;
 
 	/* Callbacks for clock / power control */
 	void (*set_pwr)(struct platform_device *host, int state);
@@ -144,18 +139,15 @@ struct tmio_mmc_host {
 	struct scatterlist      *sg_orig;
 	unsigned int            sg_len;
 	unsigned int            sg_off;
-	unsigned long		bus_shift;
+	unsigned int		bus_shift;
 
 	struct platform_device *pdev;
 	struct tmio_mmc_data *pdata;
-	struct tmio_mmc_dma	*dma;
 
 	/* DMA support */
 	bool			force_pio;
 	struct dma_chan		*chan_rx;
 	struct dma_chan		*chan_tx;
-	struct completion	dma_dataend;
-	struct tasklet_struct	dma_complete;
 	struct tasklet_struct	dma_issue;
 	struct scatterlist	bounce_sg;
 	u8			*bounce_buf;
@@ -174,7 +166,6 @@ struct tmio_mmc_host {
 	struct mutex		ios_lock;	/* protect set_ios() context */
 	bool			native_hotplug;
 	bool			sdio_irq_enabled;
-	u32			scc_tappos;
 
 	/* Mandatory callback */
 	int (*clk_enable)(struct tmio_mmc_host *host);
@@ -185,9 +176,6 @@ struct tmio_mmc_host {
 	void (*clk_disable)(struct tmio_mmc_host *host);
 	int (*multi_io_quirk)(struct mmc_card *card,
 			      unsigned int direction, int blk_size);
-	int (*card_busy)(struct mmc_host *mmc);
-	int (*start_signal_voltage_switch)(struct mmc_host *mmc,
-					   struct mmc_ios *ios);
 	int (*write16_hook)(struct tmio_mmc_host *host, int addr);
 	void (*hw_reset)(struct tmio_mmc_host *host);
 	void (*prepare_tuning)(struct tmio_mmc_host *host, unsigned long tap);
@@ -207,11 +195,10 @@ struct tmio_mmc_host {
 	const struct tmio_mmc_dma_ops *dma_ops;
 };
 
-struct tmio_mmc_host *tmio_mmc_host_alloc(struct platform_device *pdev);
+struct tmio_mmc_host *tmio_mmc_host_alloc(struct platform_device *pdev,
+					  struct tmio_mmc_data *pdata);
 void tmio_mmc_host_free(struct tmio_mmc_host *host);
-int tmio_mmc_host_probe(struct tmio_mmc_host *host,
-			struct tmio_mmc_data *pdata,
-			const struct tmio_mmc_dma_ops *dma_ops);
+int tmio_mmc_host_probe(struct tmio_mmc_host *host);
 void tmio_mmc_host_remove(struct tmio_mmc_host *host);
 void tmio_mmc_do_data_irq(struct tmio_mmc_host *host);
 
@@ -240,26 +227,26 @@ int tmio_mmc_host_runtime_resume(struct device *dev);
 
 static inline u16 sd_ctrl_read16(struct tmio_mmc_host *host, int addr)
 {
-	return readw(host->ctl + (addr << host->bus_shift));
+	return ioread16(host->ctl + (addr << host->bus_shift));
 }
 
 static inline void sd_ctrl_read16_rep(struct tmio_mmc_host *host, int addr,
 				      u16 *buf, int count)
 {
-	readsw(host->ctl + (addr << host->bus_shift), buf, count);
+	ioread16_rep(host->ctl + (addr << host->bus_shift), buf, count);
 }
 
 static inline u32 sd_ctrl_read16_and_16_as_32(struct tmio_mmc_host *host,
 					      int addr)
 {
-	return readw(host->ctl + (addr << host->bus_shift)) |
-	       readw(host->ctl + ((addr + 2) << host->bus_shift)) << 16;
+	return ioread16(host->ctl + (addr << host->bus_shift)) |
+	       ioread16(host->ctl + ((addr + 2) << host->bus_shift)) << 16;
 }
 
 static inline void sd_ctrl_read32_rep(struct tmio_mmc_host *host, int addr,
 				      u32 *buf, int count)
 {
-	readsl(host->ctl + (addr << host->bus_shift), buf, count);
+	ioread32_rep(host->ctl + (addr << host->bus_shift), buf, count);
 }
 
 static inline void sd_ctrl_write16(struct tmio_mmc_host *host, int addr,
@@ -270,26 +257,26 @@ static inline void sd_ctrl_write16(struct tmio_mmc_host *host, int addr,
 	 */
 	if (host->write16_hook && host->write16_hook(host, addr))
 		return;
-	writew(val, host->ctl + (addr << host->bus_shift));
+	iowrite16(val, host->ctl + (addr << host->bus_shift));
 }
 
 static inline void sd_ctrl_write16_rep(struct tmio_mmc_host *host, int addr,
 				       u16 *buf, int count)
 {
-	writesw(host->ctl + (addr << host->bus_shift), buf, count);
+	iowrite16_rep(host->ctl + (addr << host->bus_shift), buf, count);
 }
 
 static inline void sd_ctrl_write32_as_16_and_16(struct tmio_mmc_host *host,
 						int addr, u32 val)
 {
-	writew(val & 0xffff, host->ctl + (addr << host->bus_shift));
-	writew(val >> 16, host->ctl + ((addr + 2) << host->bus_shift));
+	iowrite16(val & 0xffff, host->ctl + (addr << host->bus_shift));
+	iowrite16(val >> 16, host->ctl + ((addr + 2) << host->bus_shift));
 }
 
 static inline void sd_ctrl_write32_rep(struct tmio_mmc_host *host, int addr,
 				       const u32 *buf, int count)
 {
-	writesl(host->ctl + (addr << host->bus_shift), buf, count);
+	iowrite32_rep(host->ctl + (addr << host->bus_shift), buf, count);
 }
 
 #endif
diff --git a/drivers/mmc/host/tmio_mmc_core.c b/drivers/mmc/host/tmio_mmc_core.c
index 583bf32..3349424 100644
--- a/drivers/mmc/host/tmio_mmc_core.c
+++ b/drivers/mmc/host/tmio_mmc_core.c
@@ -806,7 +806,7 @@ static int tmio_mmc_execute_tuning(struct mmc_host *mmc, u32 opcode)
 		if (ret == 0)
 			set_bit(i, host->taps);
 
-		mdelay(1);
+		usleep_range(1000, 1200);
 	}
 
 	ret = host->select_tuning(host);
@@ -926,20 +926,6 @@ static void tmio_mmc_done_work(struct work_struct *work)
 	tmio_mmc_finish_request(host);
 }
 
-static int tmio_mmc_clk_enable(struct tmio_mmc_host *host)
-{
-	if (!host->clk_enable)
-		return -ENOTSUPP;
-
-	return host->clk_enable(host);
-}
-
-static void tmio_mmc_clk_disable(struct tmio_mmc_host *host)
-{
-	if (host->clk_disable)
-		host->clk_disable(host);
-}
-
 static void tmio_mmc_power_on(struct tmio_mmc_host *host, unsigned short vdd)
 {
 	struct mmc_host *mmc = host->mmc;
@@ -958,7 +944,7 @@ static void tmio_mmc_power_on(struct tmio_mmc_host *host, unsigned short vdd)
 		 * 100us were not enough. Is this the same 140us delay, as in
 		 * tmio_mmc_set_ios()?
 		 */
-		udelay(200);
+		usleep_range(200, 300);
 	}
 	/*
 	 * It seems, VccQ should be switched on after Vcc, this is also what the
@@ -966,7 +952,7 @@ static void tmio_mmc_power_on(struct tmio_mmc_host *host, unsigned short vdd)
 	 */
 	if (!IS_ERR(mmc->supply.vqmmc) && !ret) {
 		ret = regulator_enable(mmc->supply.vqmmc);
-		udelay(200);
+		usleep_range(200, 300);
 	}
 
 	if (ret < 0)
@@ -1059,7 +1045,7 @@ static void tmio_mmc_set_ios(struct mmc_host *mmc, struct mmc_ios *ios)
 	}
 
 	/* Let things settle. delay taken from winCE driver */
-	udelay(140);
+	usleep_range(140, 200);
 	if (PTR_ERR(host->mrq) == -EINTR)
 		dev_dbg(&host->pdev->dev,
 			"%s.%d: IOS interrupted: clk %u, mode %u",
@@ -1076,15 +1062,9 @@ static int tmio_mmc_get_ro(struct mmc_host *mmc)
 {
 	struct tmio_mmc_host *host = mmc_priv(mmc);
 	struct tmio_mmc_data *pdata = host->pdata;
-	int ret = mmc_gpio_get_ro(mmc);
 
-	if (ret >= 0)
-		return ret;
-
-	ret = !((pdata->flags & TMIO_MMC_WRPROTECT_DISABLE) ||
-		(sd_ctrl_read16_and_16_as_32(host, CTL_STATUS) & TMIO_STAT_WRPROTECT));
-
-	return ret;
+	return !((pdata->flags & TMIO_MMC_WRPROTECT_DISABLE) ||
+		 (sd_ctrl_read16_and_16_as_32(host, CTL_STATUS) & TMIO_STAT_WRPROTECT));
 }
 
 static int tmio_multi_io_quirk(struct mmc_card *card,
@@ -1098,7 +1078,7 @@ static int tmio_multi_io_quirk(struct mmc_card *card,
 	return blk_size;
 }
 
-static struct mmc_host_ops tmio_mmc_ops = {
+static const struct mmc_host_ops tmio_mmc_ops = {
 	.request	= tmio_mmc_request,
 	.set_ios	= tmio_mmc_set_ios,
 	.get_ro         = tmio_mmc_get_ro,
@@ -1145,19 +1125,45 @@ static void tmio_mmc_of_parse(struct platform_device *pdev,
 		pdata->flags |= TMIO_MMC_WRPROTECT_DISABLE;
 }
 
-struct tmio_mmc_host*
-tmio_mmc_host_alloc(struct platform_device *pdev)
+struct tmio_mmc_host *tmio_mmc_host_alloc(struct platform_device *pdev,
+					  struct tmio_mmc_data *pdata)
 {
 	struct tmio_mmc_host *host;
 	struct mmc_host *mmc;
+	struct resource *res;
+	void __iomem *ctl;
+	int ret;
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	ctl = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(ctl))
+		return ERR_CAST(ctl);
 
 	mmc = mmc_alloc_host(sizeof(struct tmio_mmc_host), &pdev->dev);
 	if (!mmc)
-		return NULL;
+		return ERR_PTR(-ENOMEM);
 
 	host = mmc_priv(mmc);
+	host->ctl = ctl;
 	host->mmc = mmc;
 	host->pdev = pdev;
+	host->pdata = pdata;
+	host->ops = tmio_mmc_ops;
+	mmc->ops = &host->ops;
+
+	ret = mmc_of_parse(host->mmc);
+	if (ret) {
+		host = ERR_PTR(ret);
+		goto free;
+	}
+
+	tmio_mmc_of_parse(pdev, pdata);
+
+	platform_set_drvdata(pdev, host);
+
+	return host;
+free:
+	mmc_free_host(mmc);
 
 	return host;
 }
@@ -1169,32 +1175,24 @@ void tmio_mmc_host_free(struct tmio_mmc_host *host)
 }
 EXPORT_SYMBOL_GPL(tmio_mmc_host_free);
 
-int tmio_mmc_host_probe(struct tmio_mmc_host *_host,
-			struct tmio_mmc_data *pdata,
-			const struct tmio_mmc_dma_ops *dma_ops)
+int tmio_mmc_host_probe(struct tmio_mmc_host *_host)
 {
 	struct platform_device *pdev = _host->pdev;
+	struct tmio_mmc_data *pdata = _host->pdata;
 	struct mmc_host *mmc = _host->mmc;
-	struct resource *res_ctl;
 	int ret;
 	u32 irq_mask = TMIO_MASK_CMD;
 
-	tmio_mmc_of_parse(pdev, pdata);
+	/*
+	 * Check the sanity of mmc->f_min to prevent tmio_mmc_set_clock() from
+	 * looping forever...
+	 */
+	if (mmc->f_min == 0)
+		return -EINVAL;
 
 	if (!(pdata->flags & TMIO_MMC_HAS_IDLE_WAIT))
 		_host->write16_hook = NULL;
 
-	res_ctl = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (!res_ctl)
-		return -EINVAL;
-
-	ret = mmc_of_parse(mmc);
-	if (ret < 0)
-		return ret;
-
-	_host->pdata = pdata;
-	platform_set_drvdata(pdev, mmc);
-
 	_host->set_pwr = pdata->set_pwr;
 	_host->set_clk_div = pdata->set_clk_div;
 
@@ -1202,15 +1200,11 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host,
 	if (ret < 0)
 		return ret;
 
-	_host->ctl = devm_ioremap(&pdev->dev,
-				  res_ctl->start, resource_size(res_ctl));
-	if (!_host->ctl)
-		return -ENOMEM;
-
-	tmio_mmc_ops.card_busy = _host->card_busy;
-	tmio_mmc_ops.start_signal_voltage_switch =
-		_host->start_signal_voltage_switch;
-	mmc->ops = &tmio_mmc_ops;
+	if (pdata->flags & TMIO_MMC_USE_GPIO_CD) {
+		ret = mmc_gpio_request_cd(mmc, pdata->cd_gpio, 0);
+		if (ret)
+			return ret;
+	}
 
 	mmc->caps |= MMC_CAP_4_BIT_DATA | pdata->capabilities;
 	mmc->caps2 |= pdata->capabilities2;
@@ -1233,7 +1227,10 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host,
 	}
 	mmc->max_seg_size = mmc->max_req_size;
 
-	_host->native_hotplug = !(pdata->flags & TMIO_MMC_USE_GPIO_CD ||
+	if (mmc_can_gpio_ro(mmc))
+		_host->ops.get_ro = mmc_gpio_get_ro;
+
+	_host->native_hotplug = !(mmc_can_gpio_cd(mmc) ||
 				  mmc->caps & MMC_CAP_NEEDS_POLL ||
 				  !mmc_card_is_removable(mmc));
 
@@ -1246,18 +1243,6 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host,
 	if (pdata->flags & TMIO_MMC_MIN_RCAR2)
 		_host->native_hotplug = true;
 
-	if (tmio_mmc_clk_enable(_host) < 0) {
-		mmc->f_max = pdata->hclk;
-		mmc->f_min = mmc->f_max / 512;
-	}
-
-	/*
-	 * Check the sanity of mmc->f_min to prevent tmio_mmc_set_clock() from
-	 * looping forever...
-	 */
-	if (mmc->f_min == 0)
-		return -EINVAL;
-
 	/*
 	 * While using internal tmio hardware logic for card detection, we need
 	 * to ensure it stays powered for it to work.
@@ -1293,7 +1278,6 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host,
 	INIT_WORK(&_host->done, tmio_mmc_done_work);
 
 	/* See if we also get DMA */
-	_host->dma_ops = dma_ops;
 	tmio_mmc_request_dma(_host, pdata);
 
 	pm_runtime_set_active(&pdev->dev);
@@ -1307,14 +1291,6 @@ int tmio_mmc_host_probe(struct tmio_mmc_host *_host,
 
 	dev_pm_qos_expose_latency_limit(&pdev->dev, 100);
 
-	if (pdata->flags & TMIO_MMC_USE_GPIO_CD) {
-		ret = mmc_gpio_request_cd(mmc, pdata->cd_gpio, 0);
-		if (ret)
-			goto remove_host;
-
-		mmc_gpiod_request_cd_irq(mmc);
-	}
-
 	return 0;
 
 remove_host:
@@ -1343,16 +1319,27 @@ void tmio_mmc_host_remove(struct tmio_mmc_host *host)
 
 	pm_runtime_put_sync(&pdev->dev);
 	pm_runtime_disable(&pdev->dev);
-
-	tmio_mmc_clk_disable(host);
 }
 EXPORT_SYMBOL_GPL(tmio_mmc_host_remove);
 
 #ifdef CONFIG_PM
+static int tmio_mmc_clk_enable(struct tmio_mmc_host *host)
+{
+	if (!host->clk_enable)
+		return -ENOTSUPP;
+
+	return host->clk_enable(host);
+}
+
+static void tmio_mmc_clk_disable(struct tmio_mmc_host *host)
+{
+	if (host->clk_disable)
+		host->clk_disable(host);
+}
+
 int tmio_mmc_host_runtime_suspend(struct device *dev)
 {
-	struct mmc_host *mmc = dev_get_drvdata(dev);
-	struct tmio_mmc_host *host = mmc_priv(mmc);
+	struct tmio_mmc_host *host = dev_get_drvdata(dev);
 
 	tmio_mmc_disable_mmc_irqs(host, TMIO_MASK_ALL);
 
@@ -1372,8 +1359,7 @@ static bool tmio_mmc_can_retune(struct tmio_mmc_host *host)
 
 int tmio_mmc_host_runtime_resume(struct device *dev)
 {
-	struct mmc_host *mmc = dev_get_drvdata(dev);
-	struct tmio_mmc_host *host = mmc_priv(mmc);
+	struct tmio_mmc_host *host = dev_get_drvdata(dev);
 
 	tmio_mmc_reset(host);
 	tmio_mmc_clk_enable(host);
diff --git a/drivers/mtd/devices/docg3.c b/drivers/mtd/devices/docg3.c
index 0806f72..a85af23 100644
--- a/drivers/mtd/devices/docg3.c
+++ b/drivers/mtd/devices/docg3.c
@@ -904,9 +904,6 @@ static int doc_read_oob(struct mtd_info *mtd, loff_t from,
 	if (ooblen % DOC_LAYOUT_OOB_SIZE)
 		return -EINVAL;
 
-	if (from + len > mtd->size)
-		return -EINVAL;
-
 	ops->oobretlen = 0;
 	ops->retlen = 0;
 	ret = 0;
@@ -990,36 +987,6 @@ static int doc_read_oob(struct mtd_info *mtd, loff_t from,
 	goto out;
 }
 
-/**
- * doc_read - Read bytes from flash
- * @mtd: the device
- * @from: the offset from first block and first page, in bytes, aligned on page
- *        size
- * @len: the number of bytes to read (must be a multiple of 4)
- * @retlen: the number of bytes actually read
- * @buf: the filled in buffer
- *
- * Reads flash memory pages. This function does not read the OOB chunk, but only
- * the page data.
- *
- * Returns 0 if read successful, of -EIO, -EINVAL if an error occurred
- */
-static int doc_read(struct mtd_info *mtd, loff_t from, size_t len,
-	     size_t *retlen, u_char *buf)
-{
-	struct mtd_oob_ops ops;
-	size_t ret;
-
-	memset(&ops, 0, sizeof(ops));
-	ops.datbuf = buf;
-	ops.len = len;
-	ops.mode = MTD_OPS_AUTO_OOB;
-
-	ret = doc_read_oob(mtd, from, &ops);
-	*retlen = ops.retlen;
-	return ret;
-}
-
 static int doc_reload_bbt(struct docg3 *docg3)
 {
 	int block = DOC_LAYOUT_BLOCK_BBT;
@@ -1471,8 +1438,6 @@ static int doc_write_oob(struct mtd_info *mtd, loff_t ofs,
 	if (len && ooblen &&
 	    (len / DOC_LAYOUT_PAGE_SIZE) != (ooblen / oobdelta))
 		return -EINVAL;
-	if (ofs + len > mtd->size)
-		return -EINVAL;
 
 	ops->oobretlen = 0;
 	ops->retlen = 0;
@@ -1513,39 +1478,6 @@ static int doc_write_oob(struct mtd_info *mtd, loff_t ofs,
 	return ret;
 }
 
-/**
- * doc_write - Write a buffer to the chip
- * @mtd: the device
- * @to: the offset from first block and first page, in bytes, aligned on page
- *      size
- * @len: the number of bytes to write (must be a full page size, ie. 512)
- * @retlen: the number of bytes actually written (0 or 512)
- * @buf: the buffer to get bytes from
- *
- * Writes data to the chip.
- *
- * Returns 0 if write successful, -EIO if write error
- */
-static int doc_write(struct mtd_info *mtd, loff_t to, size_t len,
-		     size_t *retlen, const u_char *buf)
-{
-	struct docg3 *docg3 = mtd->priv;
-	int ret;
-	struct mtd_oob_ops ops;
-
-	doc_dbg("doc_write(to=%lld, len=%zu)\n", to, len);
-	ops.datbuf = (char *)buf;
-	ops.len = len;
-	ops.mode = MTD_OPS_PLACE_OOB;
-	ops.oobbuf = NULL;
-	ops.ooblen = 0;
-	ops.ooboffs = 0;
-
-	ret = doc_write_oob(mtd, to, &ops);
-	*retlen = ops.retlen;
-	return ret;
-}
-
 static struct docg3 *sysfs_dev2docg3(struct device *dev,
 				     struct device_attribute *attr)
 {
@@ -1866,8 +1798,6 @@ static int __init doc_set_driver_info(int chip_id, struct mtd_info *mtd)
 	mtd->writebufsize = mtd->writesize = DOC_LAYOUT_PAGE_SIZE;
 	mtd->oobsize = DOC_LAYOUT_OOB_SIZE;
 	mtd->_erase = doc_erase;
-	mtd->_read = doc_read;
-	mtd->_write = doc_write;
 	mtd->_read_oob = doc_read_oob;
 	mtd->_write_oob = doc_write_oob;
 	mtd->_block_isbad = doc_block_isbad;
diff --git a/drivers/mtd/devices/m25p80.c b/drivers/mtd/devices/m25p80.c
index dbe6a1d..a4e18f6 100644
--- a/drivers/mtd/devices/m25p80.c
+++ b/drivers/mtd/devices/m25p80.c
@@ -307,10 +307,18 @@ static int m25p_remove(struct spi_device *spi)
 {
 	struct m25p	*flash = spi_get_drvdata(spi);
 
+	spi_nor_restore(&flash->spi_nor);
+
 	/* Clean up MTD stuff. */
 	return mtd_device_unregister(&flash->spi_nor.mtd);
 }
 
+static void m25p_shutdown(struct spi_device *spi)
+{
+	struct m25p *flash = spi_get_drvdata(spi);
+
+	spi_nor_restore(&flash->spi_nor);
+}
 /*
  * Do NOT add to this array without reading the following:
  *
@@ -386,6 +394,7 @@ static struct spi_driver m25p80_driver = {
 	.id_table	= m25p_ids,
 	.probe	= m25p_probe,
 	.remove	= m25p_remove,
+	.shutdown	= m25p_shutdown,
 
 	/* REVISIT: many of these chips have deep power-down modes, which
 	 * should clearly be entered on suspend() to minimize power use.
diff --git a/drivers/mtd/devices/mchp23k256.c b/drivers/mtd/devices/mchp23k256.c
index 8956b7d..75f71d1 100644
--- a/drivers/mtd/devices/mchp23k256.c
+++ b/drivers/mtd/devices/mchp23k256.c
@@ -68,6 +68,7 @@ static int mchp23k256_write(struct mtd_info *mtd, loff_t to, size_t len,
 	struct spi_transfer transfer[2] = {};
 	struct spi_message message;
 	unsigned char command[MAX_CMD_SIZE];
+	int ret;
 
 	spi_message_init(&message);
 
@@ -84,12 +85,16 @@ static int mchp23k256_write(struct mtd_info *mtd, loff_t to, size_t len,
 
 	mutex_lock(&flash->lock);
 
-	spi_sync(flash->spi, &message);
+	ret = spi_sync(flash->spi, &message);
+
+	mutex_unlock(&flash->lock);
+
+	if (ret)
+		return ret;
 
 	if (retlen && message.actual_length > sizeof(command))
 		*retlen += message.actual_length - sizeof(command);
 
-	mutex_unlock(&flash->lock);
 	return 0;
 }
 
@@ -100,6 +105,7 @@ static int mchp23k256_read(struct mtd_info *mtd, loff_t from, size_t len,
 	struct spi_transfer transfer[2] = {};
 	struct spi_message message;
 	unsigned char command[MAX_CMD_SIZE];
+	int ret;
 
 	spi_message_init(&message);
 
@@ -117,12 +123,16 @@ static int mchp23k256_read(struct mtd_info *mtd, loff_t from, size_t len,
 
 	mutex_lock(&flash->lock);
 
-	spi_sync(flash->spi, &message);
+	ret = spi_sync(flash->spi, &message);
+
+	mutex_unlock(&flash->lock);
+
+	if (ret)
+		return ret;
 
 	if (retlen && message.actual_length > sizeof(command))
 		*retlen += message.actual_length - sizeof(command);
 
-	mutex_unlock(&flash->lock);
 	return 0;
 }
 
diff --git a/drivers/mtd/mtdcore.c b/drivers/mtd/mtdcore.c
index 73b6055..28553c8 100644
--- a/drivers/mtd/mtdcore.c
+++ b/drivers/mtd/mtdcore.c
@@ -503,6 +503,11 @@ int add_mtd_device(struct mtd_info *mtd)
 		return -EEXIST;
 
 	BUG_ON(mtd->writesize == 0);
+
+	if (WARN_ON((!mtd->erasesize || !mtd->_erase) &&
+		    !(mtd->flags & MTD_NO_ERASE)))
+		return -EINVAL;
+
 	mutex_lock(&mtd_table_mutex);
 
 	i = idr_alloc(&mtd_idr, mtd, 0, 0, GFP_KERNEL);
@@ -1053,7 +1058,20 @@ int mtd_read(struct mtd_info *mtd, loff_t from, size_t len, size_t *retlen,
 	 * representing the maximum number of bitflips that were corrected on
 	 * any one ecc region (if applicable; zero otherwise).
 	 */
-	ret_code = mtd->_read(mtd, from, len, retlen, buf);
+	if (mtd->_read) {
+		ret_code = mtd->_read(mtd, from, len, retlen, buf);
+	} else if (mtd->_read_oob) {
+		struct mtd_oob_ops ops = {
+			.len = len,
+			.datbuf = buf,
+		};
+
+		ret_code = mtd->_read_oob(mtd, from, &ops);
+		*retlen = ops.retlen;
+	} else {
+		return -ENOTSUPP;
+	}
+
 	if (unlikely(ret_code < 0))
 		return ret_code;
 	if (mtd->ecc_strength == 0)
@@ -1068,11 +1086,25 @@ int mtd_write(struct mtd_info *mtd, loff_t to, size_t len, size_t *retlen,
 	*retlen = 0;
 	if (to < 0 || to >= mtd->size || len > mtd->size - to)
 		return -EINVAL;
-	if (!mtd->_write || !(mtd->flags & MTD_WRITEABLE))
+	if ((!mtd->_write && !mtd->_write_oob) ||
+	    !(mtd->flags & MTD_WRITEABLE))
 		return -EROFS;
 	if (!len)
 		return 0;
 	ledtrig_mtd_activity();
+
+	if (!mtd->_write) {
+		struct mtd_oob_ops ops = {
+			.len = len,
+			.datbuf = (u8 *)buf,
+		};
+		int ret;
+
+		ret = mtd->_write_oob(mtd, to, &ops);
+		*retlen = ops.retlen;
+		return ret;
+	}
+
 	return mtd->_write(mtd, to, len, retlen, buf);
 }
 EXPORT_SYMBOL_GPL(mtd_write);
diff --git a/drivers/mtd/mtdpart.c b/drivers/mtd/mtdpart.c
index be088bc..76cd21d 100644
--- a/drivers/mtd/mtdpart.c
+++ b/drivers/mtd/mtdpart.c
@@ -105,34 +105,17 @@ static int part_read_oob(struct mtd_info *mtd, loff_t from,
 		struct mtd_oob_ops *ops)
 {
 	struct mtd_part *part = mtd_to_part(mtd);
+	struct mtd_ecc_stats stats;
 	int res;
 
-	if (from >= mtd->size)
-		return -EINVAL;
-	if (ops->datbuf && from + ops->len > mtd->size)
-		return -EINVAL;
-
-	/*
-	 * If OOB is also requested, make sure that we do not read past the end
-	 * of this partition.
-	 */
-	if (ops->oobbuf) {
-		size_t len, pages;
-
-		len = mtd_oobavail(mtd, ops);
-		pages = mtd_div_by_ws(mtd->size, mtd);
-		pages -= mtd_div_by_ws(from, mtd);
-		if (ops->ooboffs + ops->ooblen > pages * len)
-			return -EINVAL;
-	}
-
+	stats = part->parent->ecc_stats;
 	res = part->parent->_read_oob(part->parent, from + part->offset, ops);
-	if (unlikely(res)) {
-		if (mtd_is_bitflip(res))
-			mtd->ecc_stats.corrected++;
-		if (mtd_is_eccerr(res))
-			mtd->ecc_stats.failed++;
-	}
+	if (unlikely(mtd_is_eccerr(res)))
+		mtd->ecc_stats.failed +=
+			part->parent->ecc_stats.failed - stats.failed;
+	else
+		mtd->ecc_stats.corrected +=
+			part->parent->ecc_stats.corrected - stats.corrected;
 	return res;
 }
 
@@ -189,10 +172,6 @@ static int part_write_oob(struct mtd_info *mtd, loff_t to,
 {
 	struct mtd_part *part = mtd_to_part(mtd);
 
-	if (to >= mtd->size)
-		return -EINVAL;
-	if (ops->datbuf && to + ops->len > mtd->size)
-		return -EINVAL;
 	return part->parent->_write_oob(part->parent, to + part->offset, ops);
 }
 
@@ -435,8 +414,10 @@ static struct mtd_part *allocate_partition(struct mtd_info *parent,
 				parent->dev.parent;
 	slave->mtd.dev.of_node = part->of_node;
 
-	slave->mtd._read = part_read;
-	slave->mtd._write = part_write;
+	if (parent->_read)
+		slave->mtd._read = part_read;
+	if (parent->_write)
+		slave->mtd._write = part_write;
 
 	if (parent->_panic_write)
 		slave->mtd._panic_write = part_panic_write;
diff --git a/drivers/mtd/mtdswap.c b/drivers/mtd/mtdswap.c
index f07492c..7eb0e1f 100644
--- a/drivers/mtd/mtdswap.c
+++ b/drivers/mtd/mtdswap.c
@@ -1223,8 +1223,9 @@ static int mtdswap_show(struct seq_file *s, void *data)
 	unsigned int max[MTDSWAP_TREE_CNT];
 	unsigned int i, cw = 0, cwp = 0, cwecount = 0, bb_cnt, mapped, pages;
 	uint64_t use_size;
-	char *name[] = {"clean", "used", "low", "high", "dirty", "bitflip",
-			"failing"};
+	static const char * const name[] = {
+		"clean", "used", "low", "high", "dirty", "bitflip", "failing"
+	};
 
 	mutex_lock(&d->mbd_dev->lock);
 
diff --git a/drivers/mtd/nand/Kconfig b/drivers/mtd/nand/Kconfig
index bb48aaf..e6b8c59 100644
--- a/drivers/mtd/nand/Kconfig
+++ b/drivers/mtd/nand/Kconfig
@@ -315,6 +315,7 @@
 
 config MTD_NAND_PXA3xx
 	tristate "NAND support on PXA3xx and Armada 370/XP"
+	depends on !MTD_NAND_MARVELL
 	depends on PXA3xx || ARCH_MMP || PLAT_ORION || ARCH_MVEBU
 	help
 
@@ -323,6 +324,18 @@
 	  platforms (XP, 370, 375, 38x, 39x) and 64-bit Armada
 	  platforms (7K, 8K) (NFCv2).
 
+config MTD_NAND_MARVELL
+	tristate "NAND controller support on Marvell boards"
+	depends on PXA3xx || ARCH_MMP || PLAT_ORION || ARCH_MVEBU || \
+		   COMPILE_TEST
+	depends on HAS_IOMEM
+	help
+	  This enables the NAND flash controller driver for Marvell boards,
+	  including:
+	  - PXA3xx processors (NFCv1)
+	  - 32-bit Armada platforms (XP, 37x, 38x, 39x) (NFCv2)
+	  - 64-bit Aramda platforms (7k, 8k) (NFCv2)
+
 config MTD_NAND_SLC_LPC32XX
 	tristate "NXP LPC32xx SLC Controller"
 	depends on ARCH_LPC32XX
@@ -376,9 +389,7 @@
 	 Enables NAND Flash support for IMX23, IMX28 or IMX6.
 	 The GPMI controller is very powerful, with the help of BCH
 	 module, it can do the hardware ECC. The GPMI supports several
-	 NAND flashs at the same time. The GPMI may conflicts with other
-	 block, such as SD card. So pay attention to it when you enable
-	 the GPMI.
+	 NAND flashs at the same time.
 
 config MTD_NAND_BRCMNAND
 	tristate "Broadcom STB NAND controller"
diff --git a/drivers/mtd/nand/Makefile b/drivers/mtd/nand/Makefile
index 118a134..921634b 100644
--- a/drivers/mtd/nand/Makefile
+++ b/drivers/mtd/nand/Makefile
@@ -32,6 +32,7 @@
 obj-$(CONFIG_MTD_NAND_OMAP_BCH_BUILD)	+= omap_elm.o
 obj-$(CONFIG_MTD_NAND_CM_X270)		+= cmx270_nand.o
 obj-$(CONFIG_MTD_NAND_PXA3xx)		+= pxa3xx_nand.o
+obj-$(CONFIG_MTD_NAND_MARVELL)		+= marvell_nand.o
 obj-$(CONFIG_MTD_NAND_TMIO)		+= tmio_nand.o
 obj-$(CONFIG_MTD_NAND_PLATFORM)		+= plat_nand.o
 obj-$(CONFIG_MTD_NAND_PASEMI)		+= pasemi_nand.o
diff --git a/drivers/mtd/nand/atmel/nand-controller.c b/drivers/mtd/nand/atmel/nand-controller.c
index 90a71a5..b2f00b3 100644
--- a/drivers/mtd/nand/atmel/nand-controller.c
+++ b/drivers/mtd/nand/atmel/nand-controller.c
@@ -841,6 +841,8 @@ static int atmel_nand_pmecc_write_pg(struct nand_chip *chip, const u8 *buf,
 	struct atmel_nand *nand = to_atmel_nand(chip);
 	int ret;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	ret = atmel_nand_pmecc_enable(chip, NAND_ECC_WRITE, raw);
 	if (ret)
 		return ret;
@@ -857,7 +859,7 @@ static int atmel_nand_pmecc_write_pg(struct nand_chip *chip, const u8 *buf,
 
 	atmel_nand_write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int atmel_nand_pmecc_write_page(struct mtd_info *mtd,
@@ -881,6 +883,8 @@ static int atmel_nand_pmecc_read_pg(struct nand_chip *chip, u8 *buf,
 	struct mtd_info *mtd = nand_to_mtd(chip);
 	int ret;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	ret = atmel_nand_pmecc_enable(chip, NAND_ECC_READ, raw);
 	if (ret)
 		return ret;
@@ -1000,7 +1004,7 @@ static int atmel_hsmc_nand_pmecc_read_pg(struct nand_chip *chip, u8 *buf,
 	 * to the non-optimized one.
 	 */
 	if (nand->activecs->rb.type != ATMEL_NAND_NATIVE_RB) {
-		chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
+		nand_read_page_op(chip, page, 0, NULL, 0);
 
 		return atmel_nand_pmecc_read_pg(chip, buf, oob_required, page,
 						raw);
@@ -1178,7 +1182,6 @@ static int atmel_hsmc_nand_ecc_init(struct atmel_nand *nand)
 	chip->ecc.write_page = atmel_hsmc_nand_pmecc_write_page;
 	chip->ecc.read_page_raw = atmel_hsmc_nand_pmecc_read_page_raw;
 	chip->ecc.write_page_raw = atmel_hsmc_nand_pmecc_write_page_raw;
-	chip->ecc.options |= NAND_ECC_CUSTOM_PAGE_ACCESS;
 
 	return 0;
 }
diff --git a/drivers/mtd/nand/bf5xx_nand.c b/drivers/mtd/nand/bf5xx_nand.c
index 5655dca..87bbd17 100644
--- a/drivers/mtd/nand/bf5xx_nand.c
+++ b/drivers/mtd/nand/bf5xx_nand.c
@@ -572,6 +572,8 @@ static void bf5xx_nand_dma_write_buf(struct mtd_info *mtd,
 static int bf5xx_nand_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 		uint8_t *buf, int oob_required, int page)
 {
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	bf5xx_nand_read_buf(mtd, buf, mtd->writesize);
 	bf5xx_nand_read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
@@ -582,10 +584,10 @@ static int bf5xx_nand_write_page_raw(struct mtd_info *mtd,
 		struct nand_chip *chip,	const uint8_t *buf, int oob_required,
 		int page)
 {
-	bf5xx_nand_write_buf(mtd, buf, mtd->writesize);
+	nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
 	bf5xx_nand_write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 /*
diff --git a/drivers/mtd/nand/brcmnand/brcmnand.c b/drivers/mtd/nand/brcmnand/brcmnand.c
index dd56a67..c28fd2b 100644
--- a/drivers/mtd/nand/brcmnand/brcmnand.c
+++ b/drivers/mtd/nand/brcmnand/brcmnand.c
@@ -1071,7 +1071,7 @@ static void brcmnand_wp(struct mtd_info *mtd, int wp)
 			return;
 
 		brcmnand_set_wp(ctrl, wp);
-		chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
+		nand_status_op(chip, NULL);
 		/* NAND_STATUS_WP 0x00 = protected, 0x80 = not protected */
 		ret = bcmnand_ctrl_poll_status(ctrl,
 					       NAND_CTRL_RDY |
@@ -1453,7 +1453,7 @@ static uint8_t brcmnand_read_byte(struct mtd_info *mtd)
 
 		/* At FC_BYTES boundary, switch to next column */
 		if (host->last_byte > 0 && offs == 0)
-			chip->cmdfunc(mtd, NAND_CMD_RNDOUT, addr, -1);
+			nand_change_read_column_op(chip, addr, NULL, 0, false);
 
 		ret = ctrl->flash_cache[offs];
 		break;
@@ -1681,7 +1681,7 @@ static int brcmstb_nand_verify_erased_page(struct mtd_info *mtd,
 	int ret;
 
 	if (!buf) {
-		buf = chip->buffers->databuf;
+		buf = chip->data_buf;
 		/* Invalidate page cache */
 		chip->pagebuf = -1;
 	}
@@ -1689,7 +1689,6 @@ static int brcmstb_nand_verify_erased_page(struct mtd_info *mtd,
 	sas = mtd->oobsize / chip->ecc.steps;
 
 	/* read without ecc for verification */
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
 	ret = chip->ecc.read_page_raw(mtd, chip, buf, true, page);
 	if (ret)
 		return ret;
@@ -1793,6 +1792,8 @@ static int brcmnand_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	struct brcmnand_host *host = nand_get_controller_data(chip);
 	u8 *oob = oob_required ? (u8 *)chip->oob_poi : NULL;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	return brcmnand_read(mtd, chip, host->last_addr,
 			mtd->writesize >> FC_SHIFT, (u32 *)buf, oob);
 }
@@ -1804,6 +1805,8 @@ static int brcmnand_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 	u8 *oob = oob_required ? (u8 *)chip->oob_poi : NULL;
 	int ret;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	brcmnand_set_ecc_enabled(host, 0);
 	ret = brcmnand_read(mtd, chip, host->last_addr,
 			mtd->writesize >> FC_SHIFT, (u32 *)buf, oob);
@@ -1909,8 +1912,10 @@ static int brcmnand_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	struct brcmnand_host *host = nand_get_controller_data(chip);
 	void *oob = oob_required ? chip->oob_poi : NULL;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
 	brcmnand_write(mtd, chip, host->last_addr, (const u32 *)buf, oob);
-	return 0;
+
+	return nand_prog_page_end_op(chip);
 }
 
 static int brcmnand_write_page_raw(struct mtd_info *mtd,
@@ -1920,10 +1925,12 @@ static int brcmnand_write_page_raw(struct mtd_info *mtd,
 	struct brcmnand_host *host = nand_get_controller_data(chip);
 	void *oob = oob_required ? chip->oob_poi : NULL;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
 	brcmnand_set_ecc_enabled(host, 0);
 	brcmnand_write(mtd, chip, host->last_addr, (const u32 *)buf, oob);
 	brcmnand_set_ecc_enabled(host, 1);
-	return 0;
+
+	return nand_prog_page_end_op(chip);
 }
 
 static int brcmnand_write_oob(struct mtd_info *mtd, struct nand_chip *chip,
@@ -2193,16 +2200,9 @@ static int brcmnand_setup_dev(struct brcmnand_host *host)
 	if (ctrl->nand_version >= 0x0702)
 		tmp |= ACC_CONTROL_RD_ERASED;
 	tmp &= ~ACC_CONTROL_FAST_PGM_RDIN;
-	if (ctrl->features & BRCMNAND_HAS_PREFETCH) {
-		/*
-		 * FIXME: Flash DMA + prefetch may see spurious erased-page ECC
-		 * errors
-		 */
-		if (has_flash_dma(ctrl))
-			tmp &= ~ACC_CONTROL_PREFETCH;
-		else
-			tmp |= ACC_CONTROL_PREFETCH;
-	}
+	if (ctrl->features & BRCMNAND_HAS_PREFETCH)
+		tmp &= ~ACC_CONTROL_PREFETCH;
+
 	nand_writereg(ctrl, offs, tmp);
 
 	return 0;
@@ -2230,6 +2230,9 @@ static int brcmnand_init_cs(struct brcmnand_host *host, struct device_node *dn)
 	nand_set_controller_data(chip, host);
 	mtd->name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "brcmnand.%d",
 				   host->cs);
+	if (!mtd->name)
+		return -ENOMEM;
+
 	mtd->owner = THIS_MODULE;
 	mtd->dev.parent = &pdev->dev;
 
@@ -2369,12 +2372,11 @@ static int brcmnand_resume(struct device *dev)
 
 	list_for_each_entry(host, &ctrl->host_list, node) {
 		struct nand_chip *chip = &host->chip;
-		struct mtd_info *mtd = nand_to_mtd(chip);
 
 		brcmnand_save_restore_cs_config(host, 1);
 
 		/* Reset the chip, required by some chips after power-up */
-		chip->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
+		nand_reset_op(chip);
 	}
 
 	return 0;
diff --git a/drivers/mtd/nand/cafe_nand.c b/drivers/mtd/nand/cafe_nand.c
index bc558c4..567ff972d 100644
--- a/drivers/mtd/nand/cafe_nand.c
+++ b/drivers/mtd/nand/cafe_nand.c
@@ -353,23 +353,15 @@ static void cafe_nand_bug(struct mtd_info *mtd)
 static int cafe_nand_write_oob(struct mtd_info *mtd,
 			       struct nand_chip *chip, int page)
 {
-	int status = 0;
-
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, mtd->writesize, page);
-	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_op(chip, page, mtd->writesize, chip->oob_poi,
+				 mtd->oobsize);
 }
 
 /* Don't use -- use nand_read_oob_std for now */
 static int cafe_nand_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 			      int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READOOB, 0, page);
-	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
-	return 0;
+	return nand_read_oob_op(chip, page, 0, chip->oob_poi, mtd->oobsize);
 }
 /**
  * cafe_nand_read_page_syndrome - [REPLACEABLE] hardware ecc syndrome based page read
@@ -391,7 +383,7 @@ static int cafe_nand_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 		     cafe_readl(cafe, NAND_ECC_RESULT),
 		     cafe_readl(cafe, NAND_ECC_SYN01));
 
-	chip->read_buf(mtd, buf, mtd->writesize);
+	nand_read_page_op(chip, page, 0, buf, mtd->writesize);
 	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
 	if (checkecc && cafe_readl(cafe, NAND_ECC_RESULT) & (1<<18)) {
@@ -549,13 +541,13 @@ static int cafe_nand_write_page_lowlevel(struct mtd_info *mtd,
 {
 	struct cafe_priv *cafe = nand_get_controller_data(chip);
 
-	chip->write_buf(mtd, buf, mtd->writesize);
+	nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
 	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
 	/* Set up ECC autogeneration */
 	cafe->ctl2 |= (1<<30);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int cafe_nand_block_bad(struct mtd_info *mtd, loff_t ofs)
@@ -613,7 +605,6 @@ static int cafe_nand_probe(struct pci_dev *pdev,
 	uint32_t ctrl;
 	int err = 0;
 	int old_dma;
-	struct nand_buffers *nbuf;
 
 	/* Very old versions shared the same PCI ident for all three
 	   functions on the chip. Verify the class too... */
@@ -661,7 +652,6 @@ static int cafe_nand_probe(struct pci_dev *pdev,
 
 	/* Enable the following for a flash based bad block table */
 	cafe->nand.bbt_options = NAND_BBT_USE_FLASH;
-	cafe->nand.options = NAND_OWN_BUFFERS;
 
 	if (skipbbt) {
 		cafe->nand.options |= NAND_SKIP_BBTSCAN;
@@ -731,32 +721,20 @@ static int cafe_nand_probe(struct pci_dev *pdev,
 	if (err)
 		goto out_irq;
 
-	cafe->dmabuf = dma_alloc_coherent(&cafe->pdev->dev,
-				2112 + sizeof(struct nand_buffers) +
-				mtd->writesize + mtd->oobsize,
-				&cafe->dmaaddr, GFP_KERNEL);
+	cafe->dmabuf = dma_alloc_coherent(&cafe->pdev->dev, 2112,
+					  &cafe->dmaaddr, GFP_KERNEL);
 	if (!cafe->dmabuf) {
 		err = -ENOMEM;
 		goto out_irq;
 	}
-	cafe->nand.buffers = nbuf = (void *)cafe->dmabuf + 2112;
 
 	/* Set up DMA address */
-	cafe_writel(cafe, cafe->dmaaddr & 0xffffffff, NAND_DMA_ADDR0);
-	if (sizeof(cafe->dmaaddr) > 4)
-		/* Shift in two parts to shut the compiler up */
-		cafe_writel(cafe, (cafe->dmaaddr >> 16) >> 16, NAND_DMA_ADDR1);
-	else
-		cafe_writel(cafe, 0, NAND_DMA_ADDR1);
+	cafe_writel(cafe, lower_32_bits(cafe->dmaaddr), NAND_DMA_ADDR0);
+	cafe_writel(cafe, upper_32_bits(cafe->dmaaddr), NAND_DMA_ADDR1);
 
 	cafe_dev_dbg(&cafe->pdev->dev, "Set DMA address to %x (virt %p)\n",
 		cafe_readl(cafe, NAND_DMA_ADDR0), cafe->dmabuf);
 
-	/* this driver does not need the @ecccalc and @ecccode */
-	nbuf->ecccalc = NULL;
-	nbuf->ecccode = NULL;
-	nbuf->databuf = (uint8_t *)(nbuf + 1);
-
 	/* Restore the DMA flag */
 	usedma = old_dma;
 
@@ -801,10 +779,7 @@ static int cafe_nand_probe(struct pci_dev *pdev,
 	goto out;
 
  out_free_dma:
-	dma_free_coherent(&cafe->pdev->dev,
-			2112 + sizeof(struct nand_buffers) +
-			mtd->writesize + mtd->oobsize,
-			cafe->dmabuf, cafe->dmaaddr);
+	dma_free_coherent(&cafe->pdev->dev, 2112, cafe->dmabuf, cafe->dmaaddr);
  out_irq:
 	/* Disable NAND IRQ in global IRQ mask register */
 	cafe_writel(cafe, ~1 & cafe_readl(cafe, GLOBAL_IRQ_MASK), GLOBAL_IRQ_MASK);
@@ -829,10 +804,7 @@ static void cafe_nand_remove(struct pci_dev *pdev)
 	nand_release(mtd);
 	free_rs(cafe->rs);
 	pci_iounmap(pdev, cafe->mmio);
-	dma_free_coherent(&cafe->pdev->dev,
-			2112 + sizeof(struct nand_buffers) +
-			mtd->writesize + mtd->oobsize,
-			cafe->dmabuf, cafe->dmaaddr);
+	dma_free_coherent(&cafe->pdev->dev, 2112, cafe->dmabuf, cafe->dmaaddr);
 	kfree(cafe);
 }
 
diff --git a/drivers/mtd/nand/denali.c b/drivers/mtd/nand/denali.c
index 5124f8a..313c7f5 100644
--- a/drivers/mtd/nand/denali.c
+++ b/drivers/mtd/nand/denali.c
@@ -330,16 +330,12 @@ static int denali_check_erased_page(struct mtd_info *mtd,
 				    unsigned long uncor_ecc_flags,
 				    unsigned int max_bitflips)
 {
-	uint8_t *ecc_code = chip->buffers->ecccode;
+	struct denali_nand_info *denali = mtd_to_denali(mtd);
+	uint8_t *ecc_code = chip->oob_poi + denali->oob_skip_bytes;
 	int ecc_steps = chip->ecc.steps;
 	int ecc_size = chip->ecc.size;
 	int ecc_bytes = chip->ecc.bytes;
-	int i, ret, stat;
-
-	ret = mtd_ooblayout_get_eccbytes(mtd, ecc_code, chip->oob_poi, 0,
-					 chip->ecc.total);
-	if (ret)
-		return ret;
+	int i, stat;
 
 	for (i = 0; i < ecc_steps; i++) {
 		if (!(uncor_ecc_flags & BIT(i)))
@@ -645,8 +641,6 @@ static void denali_oob_xfer(struct mtd_info *mtd, struct nand_chip *chip,
 			    int page, int write)
 {
 	struct denali_nand_info *denali = mtd_to_denali(mtd);
-	unsigned int start_cmd = write ? NAND_CMD_SEQIN : NAND_CMD_READ0;
-	unsigned int rnd_cmd = write ? NAND_CMD_RNDIN : NAND_CMD_RNDOUT;
 	int writesize = mtd->writesize;
 	int oobsize = mtd->oobsize;
 	uint8_t *bufpoi = chip->oob_poi;
@@ -658,11 +652,11 @@ static void denali_oob_xfer(struct mtd_info *mtd, struct nand_chip *chip,
 	int i, pos, len;
 
 	/* BBM at the beginning of the OOB area */
-	chip->cmdfunc(mtd, start_cmd, writesize, page);
 	if (write)
-		chip->write_buf(mtd, bufpoi, oob_skip);
+		nand_prog_page_begin_op(chip, page, writesize, bufpoi,
+					oob_skip);
 	else
-		chip->read_buf(mtd, bufpoi, oob_skip);
+		nand_read_page_op(chip, page, writesize, bufpoi, oob_skip);
 	bufpoi += oob_skip;
 
 	/* OOB ECC */
@@ -675,30 +669,35 @@ static void denali_oob_xfer(struct mtd_info *mtd, struct nand_chip *chip,
 		else if (pos + len > writesize)
 			len = writesize - pos;
 
-		chip->cmdfunc(mtd, rnd_cmd, pos, -1);
 		if (write)
-			chip->write_buf(mtd, bufpoi, len);
+			nand_change_write_column_op(chip, pos, bufpoi, len,
+						    false);
 		else
-			chip->read_buf(mtd, bufpoi, len);
+			nand_change_read_column_op(chip, pos, bufpoi, len,
+						   false);
 		bufpoi += len;
 		if (len < ecc_bytes) {
 			len = ecc_bytes - len;
-			chip->cmdfunc(mtd, rnd_cmd, writesize + oob_skip, -1);
 			if (write)
-				chip->write_buf(mtd, bufpoi, len);
+				nand_change_write_column_op(chip, writesize +
+							    oob_skip, bufpoi,
+							    len, false);
 			else
-				chip->read_buf(mtd, bufpoi, len);
+				nand_change_read_column_op(chip, writesize +
+							   oob_skip, bufpoi,
+							   len, false);
 			bufpoi += len;
 		}
 	}
 
 	/* OOB free */
 	len = oobsize - (bufpoi - chip->oob_poi);
-	chip->cmdfunc(mtd, rnd_cmd, size - len, -1);
 	if (write)
-		chip->write_buf(mtd, bufpoi, len);
+		nand_change_write_column_op(chip, size - len, bufpoi, len,
+					    false);
 	else
-		chip->read_buf(mtd, bufpoi, len);
+		nand_change_read_column_op(chip, size - len, bufpoi, len,
+					   false);
 }
 
 static int denali_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
@@ -710,12 +709,12 @@ static int denali_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 	int ecc_steps = chip->ecc.steps;
 	int ecc_size = chip->ecc.size;
 	int ecc_bytes = chip->ecc.bytes;
-	void *dma_buf = denali->buf;
+	void *tmp_buf = denali->buf;
 	int oob_skip = denali->oob_skip_bytes;
 	size_t size = writesize + oobsize;
 	int ret, i, pos, len;
 
-	ret = denali_data_xfer(denali, dma_buf, size, page, 1, 0);
+	ret = denali_data_xfer(denali, tmp_buf, size, page, 1, 0);
 	if (ret)
 		return ret;
 
@@ -730,11 +729,11 @@ static int denali_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 			else if (pos + len > writesize)
 				len = writesize - pos;
 
-			memcpy(buf, dma_buf + pos, len);
+			memcpy(buf, tmp_buf + pos, len);
 			buf += len;
 			if (len < ecc_size) {
 				len = ecc_size - len;
-				memcpy(buf, dma_buf + writesize + oob_skip,
+				memcpy(buf, tmp_buf + writesize + oob_skip,
 				       len);
 				buf += len;
 			}
@@ -745,7 +744,7 @@ static int denali_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 		uint8_t *oob = chip->oob_poi;
 
 		/* BBM at the beginning of the OOB area */
-		memcpy(oob, dma_buf + writesize, oob_skip);
+		memcpy(oob, tmp_buf + writesize, oob_skip);
 		oob += oob_skip;
 
 		/* OOB ECC */
@@ -758,11 +757,11 @@ static int denali_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 			else if (pos + len > writesize)
 				len = writesize - pos;
 
-			memcpy(oob, dma_buf + pos, len);
+			memcpy(oob, tmp_buf + pos, len);
 			oob += len;
 			if (len < ecc_bytes) {
 				len = ecc_bytes - len;
-				memcpy(oob, dma_buf + writesize + oob_skip,
+				memcpy(oob, tmp_buf + writesize + oob_skip,
 				       len);
 				oob += len;
 			}
@@ -770,7 +769,7 @@ static int denali_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 
 		/* OOB free */
 		len = oobsize - (oob - chip->oob_poi);
-		memcpy(oob, dma_buf + size - len, len);
+		memcpy(oob, tmp_buf + size - len, len);
 	}
 
 	return 0;
@@ -788,16 +787,12 @@ static int denali_write_oob(struct mtd_info *mtd, struct nand_chip *chip,
 			    int page)
 {
 	struct denali_nand_info *denali = mtd_to_denali(mtd);
-	int status;
 
 	denali_reset_irq(denali);
 
 	denali_oob_xfer(mtd, chip, page, 1);
 
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int denali_read_page(struct mtd_info *mtd, struct nand_chip *chip,
@@ -841,7 +836,7 @@ static int denali_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 	int ecc_steps = chip->ecc.steps;
 	int ecc_size = chip->ecc.size;
 	int ecc_bytes = chip->ecc.bytes;
-	void *dma_buf = denali->buf;
+	void *tmp_buf = denali->buf;
 	int oob_skip = denali->oob_skip_bytes;
 	size_t size = writesize + oobsize;
 	int i, pos, len;
@@ -851,7 +846,7 @@ static int denali_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 	 * This simplifies the logic.
 	 */
 	if (!buf || !oob_required)
-		memset(dma_buf, 0xff, size);
+		memset(tmp_buf, 0xff, size);
 
 	/* Arrange the buffer for syndrome payload/ecc layout */
 	if (buf) {
@@ -864,11 +859,11 @@ static int denali_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 			else if (pos + len > writesize)
 				len = writesize - pos;
 
-			memcpy(dma_buf + pos, buf, len);
+			memcpy(tmp_buf + pos, buf, len);
 			buf += len;
 			if (len < ecc_size) {
 				len = ecc_size - len;
-				memcpy(dma_buf + writesize + oob_skip, buf,
+				memcpy(tmp_buf + writesize + oob_skip, buf,
 				       len);
 				buf += len;
 			}
@@ -879,7 +874,7 @@ static int denali_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 		const uint8_t *oob = chip->oob_poi;
 
 		/* BBM at the beginning of the OOB area */
-		memcpy(dma_buf + writesize, oob, oob_skip);
+		memcpy(tmp_buf + writesize, oob, oob_skip);
 		oob += oob_skip;
 
 		/* OOB ECC */
@@ -892,11 +887,11 @@ static int denali_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 			else if (pos + len > writesize)
 				len = writesize - pos;
 
-			memcpy(dma_buf + pos, oob, len);
+			memcpy(tmp_buf + pos, oob, len);
 			oob += len;
 			if (len < ecc_bytes) {
 				len = ecc_bytes - len;
-				memcpy(dma_buf + writesize + oob_skip, oob,
+				memcpy(tmp_buf + writesize + oob_skip, oob,
 				       len);
 				oob += len;
 			}
@@ -904,10 +899,10 @@ static int denali_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 
 		/* OOB free */
 		len = oobsize - (oob - chip->oob_poi);
-		memcpy(dma_buf + size - len, oob, len);
+		memcpy(tmp_buf + size - len, oob, len);
 	}
 
-	return denali_data_xfer(denali, dma_buf, size, page, 1, 1);
+	return denali_data_xfer(denali, tmp_buf, size, page, 1, 1);
 }
 
 static int denali_write_page(struct mtd_info *mtd, struct nand_chip *chip,
@@ -951,7 +946,7 @@ static int denali_erase(struct mtd_info *mtd, int page)
 	irq_status = denali_wait_for_irq(denali,
 					 INTR__ERASE_COMP | INTR__ERASE_FAIL);
 
-	return irq_status & INTR__ERASE_COMP ? 0 : NAND_STATUS_FAIL;
+	return irq_status & INTR__ERASE_COMP ? 0 : -EIO;
 }
 
 static int denali_setup_data_interface(struct mtd_info *mtd, int chipnr,
@@ -1359,7 +1354,6 @@ int denali_init(struct denali_nand_info *denali)
 		chip->read_buf = denali_read_buf;
 		chip->write_buf = denali_write_buf;
 	}
-	chip->ecc.options |= NAND_ECC_CUSTOM_PAGE_ACCESS;
 	chip->ecc.read_page = denali_read_page;
 	chip->ecc.read_page_raw = denali_read_page_raw;
 	chip->ecc.write_page = denali_write_page;
diff --git a/drivers/mtd/nand/denali.h b/drivers/mtd/nand/denali.h
index 2911066..9ad33d2 100644
--- a/drivers/mtd/nand/denali.h
+++ b/drivers/mtd/nand/denali.h
@@ -329,7 +329,7 @@ struct denali_nand_info {
 #define DENALI_CAP_DMA_64BIT			BIT(1)
 
 int denali_calc_ecc_bytes(int step_size, int strength);
-extern int denali_init(struct denali_nand_info *denali);
-extern void denali_remove(struct denali_nand_info *denali);
+int denali_init(struct denali_nand_info *denali);
+void denali_remove(struct denali_nand_info *denali);
 
 #endif /* __DENALI_H__ */
diff --git a/drivers/mtd/nand/denali_pci.c b/drivers/mtd/nand/denali_pci.c
index 57fb7ae..49cb3e1 100644
--- a/drivers/mtd/nand/denali_pci.c
+++ b/drivers/mtd/nand/denali_pci.c
@@ -125,3 +125,7 @@ static struct pci_driver denali_pci_driver = {
 	.remove = denali_pci_remove,
 };
 module_pci_driver(denali_pci_driver);
+
+MODULE_DESCRIPTION("PCI driver for Denali NAND controller");
+MODULE_AUTHOR("Intel Corporation and its suppliers");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/mtd/nand/diskonchip.c b/drivers/mtd/nand/diskonchip.c
index 72671dc..6bc93ea 100644
--- a/drivers/mtd/nand/diskonchip.c
+++ b/drivers/mtd/nand/diskonchip.c
@@ -448,7 +448,7 @@ static int doc200x_wait(struct mtd_info *mtd, struct nand_chip *this)
 	int status;
 
 	DoC_WaitReady(doc);
-	this->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
+	nand_status_op(this, NULL);
 	DoC_WaitReady(doc);
 	status = (int)this->read_byte(mtd);
 
@@ -595,7 +595,7 @@ static void doc2001plus_select_chip(struct mtd_info *mtd, int chip)
 
 	/* Assert ChipEnable and deassert WriteProtect */
 	WriteDOC((DOC_FLASH_CE), docptr, Mplus_FlashSelect);
-	this->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
+	nand_reset_op(this);
 
 	doc->curchip = chip;
 	doc->curfloor = floor;
diff --git a/drivers/mtd/nand/docg4.c b/drivers/mtd/nand/docg4.c
index 2436cbc..72f1327 100644
--- a/drivers/mtd/nand/docg4.c
+++ b/drivers/mtd/nand/docg4.c
@@ -785,6 +785,8 @@ static int read_page(struct mtd_info *mtd, struct nand_chip *nand,
 
 	dev_dbg(doc->dev, "%s: page %08x\n", __func__, page);
 
+	nand_read_page_op(nand, page, 0, NULL, 0);
+
 	writew(DOC_ECCCONF0_READ_MODE |
 	       DOC_ECCCONF0_ECC_ENABLE |
 	       DOC_ECCCONF0_UNKNOWN |
@@ -864,7 +866,7 @@ static int docg4_read_oob(struct mtd_info *mtd, struct nand_chip *nand,
 
 	dev_dbg(doc->dev, "%s: page %x\n", __func__, page);
 
-	docg4_command(mtd, NAND_CMD_READ0, nand->ecc.size, page);
+	nand_read_page_op(nand, page, nand->ecc.size, NULL, 0);
 
 	writew(DOC_ECCCONF0_READ_MODE | DOCG4_OOB_SIZE, docptr + DOC_ECCCONF0);
 	write_nop(docptr);
@@ -900,6 +902,7 @@ static int docg4_erase_block(struct mtd_info *mtd, int page)
 	struct docg4_priv *doc = nand_get_controller_data(nand);
 	void __iomem *docptr = doc->virtadr;
 	uint16_t g4_page;
+	int status;
 
 	dev_dbg(doc->dev, "%s: page %04x\n", __func__, page);
 
@@ -939,11 +942,15 @@ static int docg4_erase_block(struct mtd_info *mtd, int page)
 	poll_status(doc);
 	write_nop(docptr);
 
-	return nand->waitfunc(mtd, nand);
+	status = nand->waitfunc(mtd, nand);
+	if (status < 0)
+		return status;
+
+	return status & NAND_STATUS_FAIL ? -EIO : 0;
 }
 
 static int write_page(struct mtd_info *mtd, struct nand_chip *nand,
-		       const uint8_t *buf, bool use_ecc)
+		      const uint8_t *buf, int page, bool use_ecc)
 {
 	struct docg4_priv *doc = nand_get_controller_data(nand);
 	void __iomem *docptr = doc->virtadr;
@@ -951,6 +958,8 @@ static int write_page(struct mtd_info *mtd, struct nand_chip *nand,
 
 	dev_dbg(doc->dev, "%s...\n", __func__);
 
+	nand_prog_page_begin_op(nand, page, 0, NULL, 0);
+
 	writew(DOC_ECCCONF0_ECC_ENABLE |
 	       DOC_ECCCONF0_UNKNOWN |
 	       DOCG4_BCH_SIZE,
@@ -995,19 +1004,19 @@ static int write_page(struct mtd_info *mtd, struct nand_chip *nand,
 	writew(0, docptr + DOC_DATAEND);
 	write_nop(docptr);
 
-	return 0;
+	return nand_prog_page_end_op(nand);
 }
 
 static int docg4_write_page_raw(struct mtd_info *mtd, struct nand_chip *nand,
 				const uint8_t *buf, int oob_required, int page)
 {
-	return write_page(mtd, nand, buf, false);
+	return write_page(mtd, nand, buf, page, false);
 }
 
 static int docg4_write_page(struct mtd_info *mtd, struct nand_chip *nand,
 			     const uint8_t *buf, int oob_required, int page)
 {
-	return write_page(mtd, nand, buf, true);
+	return write_page(mtd, nand, buf, page, true);
 }
 
 static int docg4_write_oob(struct mtd_info *mtd, struct nand_chip *nand,
diff --git a/drivers/mtd/nand/fsl_elbc_nand.c b/drivers/mtd/nand/fsl_elbc_nand.c
index 17db2f9..8b6dcd7 100644
--- a/drivers/mtd/nand/fsl_elbc_nand.c
+++ b/drivers/mtd/nand/fsl_elbc_nand.c
@@ -713,7 +713,7 @@ static int fsl_elbc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	struct fsl_lbc_ctrl *ctrl = priv->ctrl;
 	struct fsl_elbc_fcm_ctrl *elbc_fcm_ctrl = ctrl->nand;
 
-	fsl_elbc_read_buf(mtd, buf, mtd->writesize);
+	nand_read_page_op(chip, page, 0, buf, mtd->writesize);
 	if (oob_required)
 		fsl_elbc_read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
@@ -729,10 +729,10 @@ static int fsl_elbc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 static int fsl_elbc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 				const uint8_t *buf, int oob_required, int page)
 {
-	fsl_elbc_write_buf(mtd, buf, mtd->writesize);
+	nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
 	fsl_elbc_write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 /* ECC will be calculated automatically, and errors will be detected in
@@ -742,10 +742,10 @@ static int fsl_elbc_write_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 				uint32_t offset, uint32_t data_len,
 				const uint8_t *buf, int oob_required, int page)
 {
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
 	fsl_elbc_write_buf(mtd, buf, mtd->writesize);
 	fsl_elbc_write_buf(mtd, chip->oob_poi, mtd->oobsize);
-
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int fsl_elbc_chip_init(struct fsl_elbc_mtd *priv)
diff --git a/drivers/mtd/nand/fsl_ifc_nand.c b/drivers/mtd/nand/fsl_ifc_nand.c
index 9e03bac..4872a7b 100644
--- a/drivers/mtd/nand/fsl_ifc_nand.c
+++ b/drivers/mtd/nand/fsl_ifc_nand.c
@@ -688,7 +688,7 @@ static int fsl_ifc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	struct fsl_ifc_ctrl *ctrl = priv->ctrl;
 	struct fsl_ifc_nand_ctrl *nctrl = ifc_nand_ctrl;
 
-	fsl_ifc_read_buf(mtd, buf, mtd->writesize);
+	nand_read_page_op(chip, page, 0, buf, mtd->writesize);
 	if (oob_required)
 		fsl_ifc_read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
@@ -711,10 +711,10 @@ static int fsl_ifc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 static int fsl_ifc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 			       const uint8_t *buf, int oob_required, int page)
 {
-	fsl_ifc_write_buf(mtd, buf, mtd->writesize);
+	nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
 	fsl_ifc_write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int fsl_ifc_chip_init_tail(struct mtd_info *mtd)
@@ -916,6 +916,13 @@ static int fsl_ifc_chip_init(struct fsl_ifc_mtd *priv)
 	if (ctrl->version >= FSL_IFC_VERSION_1_1_0)
 		fsl_ifc_sram_init(priv);
 
+	/*
+	 * As IFC version 2.0.0 has 16KB of internal SRAM as compared to older
+	 * versions which had 8KB. Hence bufnum mask needs to be updated.
+	 */
+	if (ctrl->version >= FSL_IFC_VERSION_2_0_0)
+		priv->bufnum_mask = (priv->bufnum_mask * 2) + 1;
+
 	return 0;
 }
 
diff --git a/drivers/mtd/nand/fsmc_nand.c b/drivers/mtd/nand/fsmc_nand.c
index eac15d9..f49ed46 100644
--- a/drivers/mtd/nand/fsmc_nand.c
+++ b/drivers/mtd/nand/fsmc_nand.c
@@ -684,8 +684,8 @@ static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
 	uint8_t *p = buf;
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
-	uint8_t *ecc_code = chip->buffers->ecccode;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
+	uint8_t *ecc_code = chip->ecc.code_buf;
 	int off, len, group = 0;
 	/*
 	 * ecc_oob is intentionally taken as uint16_t. In 16bit devices, we
@@ -697,7 +697,7 @@ static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 	unsigned int max_bitflips = 0;
 
 	for (i = 0, s = 0; s < eccsteps; s++, i += eccbytes, p += eccsize) {
-		chip->cmdfunc(mtd, NAND_CMD_READ0, s * eccsize, page);
+		nand_read_page_op(chip, page, s * eccsize, NULL, 0);
 		chip->ecc.hwctl(mtd, NAND_ECC_READ);
 		chip->read_buf(mtd, p, eccsize);
 
@@ -720,8 +720,7 @@ static int fsmc_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 			if (chip->options & NAND_BUSWIDTH_16)
 				len = roundup(len, 2);
 
-			chip->cmdfunc(mtd, NAND_CMD_READOOB, off, page);
-			chip->read_buf(mtd, oob + j, len);
+			nand_read_oob_op(chip, page, off, oob + j, len);
 			j += len;
 		}
 
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
index d4d824e..61fdd73 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.c
@@ -1029,11 +1029,13 @@ static void block_mark_swapping(struct gpmi_nand_data *this,
 	p[1] = (p[1] & mask) | (from_oob >> (8 - bit));
 }
 
-static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
-				uint8_t *buf, int oob_required, int page)
+static int gpmi_ecc_read_page_data(struct nand_chip *chip,
+				   uint8_t *buf, int oob_required,
+				   int page)
 {
 	struct gpmi_nand_data *this = nand_get_controller_data(chip);
 	struct bch_geometry *nfc_geo = &this->bch_geometry;
+	struct mtd_info *mtd = nand_to_mtd(chip);
 	void          *payload_virt;
 	dma_addr_t    payload_phys;
 	void          *auxiliary_virt;
@@ -1094,8 +1096,8 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 			eccbytes = DIV_ROUND_UP(offset + eccbits, 8);
 			offset /= 8;
 			eccbytes -= offset;
-			chip->cmdfunc(mtd, NAND_CMD_RNDOUT, offset, -1);
-			chip->read_buf(mtd, eccbuf, eccbytes);
+			nand_change_read_column_op(chip, offset, eccbuf,
+						   eccbytes, false);
 
 			/*
 			 * ECC data are not byte aligned and we may have
@@ -1176,6 +1178,14 @@ static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	return max_bitflips;
 }
 
+static int gpmi_ecc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
+			      uint8_t *buf, int oob_required, int page)
+{
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
+	return gpmi_ecc_read_page_data(chip, buf, oob_required, page);
+}
+
 /* Fake a virtual small page for the subpage read */
 static int gpmi_ecc_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 			uint32_t offs, uint32_t len, uint8_t *buf, int page)
@@ -1220,12 +1230,12 @@ static int gpmi_ecc_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 	meta = geo->metadata_size;
 	if (first) {
 		col = meta + (size + ecc_parity_size) * first;
-		chip->cmdfunc(mtd, NAND_CMD_RNDOUT, col, -1);
-
 		meta = 0;
 		buf = buf + first * size;
 	}
 
+	nand_read_page_op(chip, page, col, NULL, 0);
+
 	/* Save the old environment */
 	r1_old = r1_new = readl(bch_regs + HW_BCH_FLASH0LAYOUT0);
 	r2_old = r2_new = readl(bch_regs + HW_BCH_FLASH0LAYOUT1);
@@ -1254,7 +1264,7 @@ static int gpmi_ecc_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 
 	/* Read the subpage now */
 	this->swap_block_mark = false;
-	max_bitflips = gpmi_ecc_read_page(mtd, chip, buf, 0, page);
+	max_bitflips = gpmi_ecc_read_page_data(chip, buf, 0, page);
 
 	/* Restore */
 	writel(r1_old, bch_regs + HW_BCH_FLASH0LAYOUT0);
@@ -1277,6 +1287,9 @@ static int gpmi_ecc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	int        ret;
 
 	dev_dbg(this->dev, "ecc write page.\n");
+
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	if (this->swap_block_mark) {
 		/*
 		 * If control arrives here, we're doing block mark swapping.
@@ -1338,7 +1351,10 @@ static int gpmi_ecc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 				payload_virt, payload_phys);
 	}
 
-	return 0;
+	if (ret)
+		return ret;
+
+	return nand_prog_page_end_op(chip);
 }
 
 /*
@@ -1411,7 +1427,7 @@ static int gpmi_ecc_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 	memset(chip->oob_poi, ~0, mtd->oobsize);
 
 	/* Read out the conventional OOB. */
-	chip->cmdfunc(mtd, NAND_CMD_READ0, mtd->writesize, page);
+	nand_read_page_op(chip, page, mtd->writesize, NULL, 0);
 	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
 	/*
@@ -1421,7 +1437,7 @@ static int gpmi_ecc_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 	 */
 	if (GPMI_IS_MX23(this)) {
 		/* Read the block mark into the first byte of the OOB buffer. */
-		chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
+		nand_read_page_op(chip, page, 0, NULL, 0);
 		chip->oob_poi[0] = chip->read_byte(mtd);
 	}
 
@@ -1432,7 +1448,6 @@ static int
 gpmi_ecc_write_oob(struct mtd_info *mtd, struct nand_chip *chip, int page)
 {
 	struct mtd_oob_region of = { };
-	int status = 0;
 
 	/* Do we have available oob area? */
 	mtd_ooblayout_free(mtd, 0, &of);
@@ -1442,12 +1457,8 @@ gpmi_ecc_write_oob(struct mtd_info *mtd, struct nand_chip *chip, int page)
 	if (!nand_is_slc(chip))
 		return -EPERM;
 
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, mtd->writesize + of.offset, page);
-	chip->write_buf(mtd, chip->oob_poi + of.offset, of.length);
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_op(chip, page, mtd->writesize + of.offset,
+				 chip->oob_poi + of.offset, of.length);
 }
 
 /*
@@ -1477,8 +1488,8 @@ static int gpmi_ecc_read_page_raw(struct mtd_info *mtd,
 	uint8_t *oob = chip->oob_poi;
 	int step;
 
-	chip->read_buf(mtd, tmp_buf,
-		       mtd->writesize + mtd->oobsize);
+	nand_read_page_op(chip, page, 0, tmp_buf,
+			  mtd->writesize + mtd->oobsize);
 
 	/*
 	 * If required, swap the bad block marker and the data stored in the
@@ -1487,12 +1498,8 @@ static int gpmi_ecc_read_page_raw(struct mtd_info *mtd,
 	 * See the layout description for a detailed explanation on why this
 	 * is needed.
 	 */
-	if (this->swap_block_mark) {
-		u8 swap = tmp_buf[0];
-
-		tmp_buf[0] = tmp_buf[mtd->writesize];
-		tmp_buf[mtd->writesize] = swap;
-	}
+	if (this->swap_block_mark)
+		swap(tmp_buf[0], tmp_buf[mtd->writesize]);
 
 	/*
 	 * Copy the metadata section into the oob buffer (this section is
@@ -1615,31 +1622,22 @@ static int gpmi_ecc_write_page_raw(struct mtd_info *mtd,
 	 * See the layout description for a detailed explanation on why this
 	 * is needed.
 	 */
-	if (this->swap_block_mark) {
-		u8 swap = tmp_buf[0];
+	if (this->swap_block_mark)
+		swap(tmp_buf[0], tmp_buf[mtd->writesize]);
 
-		tmp_buf[0] = tmp_buf[mtd->writesize];
-		tmp_buf[mtd->writesize] = swap;
-	}
-
-	chip->write_buf(mtd, tmp_buf, mtd->writesize + mtd->oobsize);
-
-	return 0;
+	return nand_prog_page_op(chip, page, 0, tmp_buf,
+				 mtd->writesize + mtd->oobsize);
 }
 
 static int gpmi_ecc_read_oob_raw(struct mtd_info *mtd, struct nand_chip *chip,
 				 int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
-
 	return gpmi_ecc_read_page_raw(mtd, chip, NULL, 1, page);
 }
 
 static int gpmi_ecc_write_oob_raw(struct mtd_info *mtd, struct nand_chip *chip,
 				 int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0, page);
-
 	return gpmi_ecc_write_page_raw(mtd, chip, NULL, 1, page);
 }
 
@@ -1649,7 +1647,7 @@ static int gpmi_block_markbad(struct mtd_info *mtd, loff_t ofs)
 	struct gpmi_nand_data *this = nand_get_controller_data(chip);
 	int ret = 0;
 	uint8_t *block_mark;
-	int column, page, status, chipnr;
+	int column, page, chipnr;
 
 	chipnr = (int)(ofs >> chip->chip_shift);
 	chip->select_chip(mtd, chipnr);
@@ -1663,13 +1661,7 @@ static int gpmi_block_markbad(struct mtd_info *mtd, loff_t ofs)
 	/* Shift to get page */
 	page = (int)(ofs >> chip->page_shift);
 
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, column, page);
-	chip->write_buf(mtd, block_mark, 1);
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-	if (status & NAND_STATUS_FAIL)
-		ret = -EIO;
+	ret = nand_prog_page_op(chip, page, column, block_mark, 1);
 
 	chip->select_chip(mtd, -1);
 
@@ -1712,7 +1704,7 @@ static int mx23_check_transcription_stamp(struct gpmi_nand_data *this)
 	unsigned int search_area_size_in_strides;
 	unsigned int stride;
 	unsigned int page;
-	uint8_t *buffer = chip->buffers->databuf;
+	uint8_t *buffer = chip->data_buf;
 	int saved_chip_number;
 	int found_an_ncb_fingerprint = false;
 
@@ -1737,7 +1729,7 @@ static int mx23_check_transcription_stamp(struct gpmi_nand_data *this)
 		 * Read the NCB fingerprint. The fingerprint is four bytes long
 		 * and starts in the 12th byte of the page.
 		 */
-		chip->cmdfunc(mtd, NAND_CMD_READ0, 12, page);
+		nand_read_page_op(chip, page, 12, NULL, 0);
 		chip->read_buf(mtd, buffer, strlen(fingerprint));
 
 		/* Look for the fingerprint. */
@@ -1771,7 +1763,7 @@ static int mx23_write_transcription_stamp(struct gpmi_nand_data *this)
 	unsigned int block;
 	unsigned int stride;
 	unsigned int page;
-	uint8_t      *buffer = chip->buffers->databuf;
+	uint8_t      *buffer = chip->data_buf;
 	int saved_chip_number;
 	int status;
 
@@ -1797,17 +1789,10 @@ static int mx23_write_transcription_stamp(struct gpmi_nand_data *this)
 	dev_dbg(dev, "Erasing the search area...\n");
 
 	for (block = 0; block < search_area_size_in_blocks; block++) {
-		/* Compute the page address. */
-		page = block * block_size_in_pages;
-
 		/* Erase this block. */
 		dev_dbg(dev, "\tErasing block 0x%x\n", block);
-		chip->cmdfunc(mtd, NAND_CMD_ERASE1, -1, page);
-		chip->cmdfunc(mtd, NAND_CMD_ERASE2, -1, -1);
-
-		/* Wait for the erase to finish. */
-		status = chip->waitfunc(mtd, chip);
-		if (status & NAND_STATUS_FAIL)
+		status = nand_erase_op(chip, block);
+		if (status)
 			dev_err(dev, "[%s] Erase failed.\n", __func__);
 	}
 
@@ -1823,13 +1808,9 @@ static int mx23_write_transcription_stamp(struct gpmi_nand_data *this)
 
 		/* Write the first page of the current stride. */
 		dev_dbg(dev, "Writing an NCB fingerprint in page 0x%x\n", page);
-		chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
-		chip->ecc.write_page_raw(mtd, chip, buffer, 0, page);
-		chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
 
-		/* Wait for the write to finish. */
-		status = chip->waitfunc(mtd, chip);
-		if (status & NAND_STATUS_FAIL)
+		status = chip->ecc.write_page_raw(mtd, chip, buffer, 0, page);
+		if (status)
 			dev_err(dev, "[%s] Write failed.\n", __func__);
 	}
 
@@ -1884,7 +1865,7 @@ static int mx23_boot_init(struct gpmi_nand_data  *this)
 
 		/* Send the command to read the conventional block mark. */
 		chip->select_chip(mtd, chipnr);
-		chip->cmdfunc(mtd, NAND_CMD_READ0, mtd->writesize, page);
+		nand_read_page_op(chip, page, mtd->writesize, NULL, 0);
 		block_mark = chip->read_byte(mtd);
 		chip->select_chip(mtd, -1);
 
diff --git a/drivers/mtd/nand/gpmi-nand/gpmi-nand.h b/drivers/mtd/nand/gpmi-nand/gpmi-nand.h
index a45e4ce..06c1f99 100644
--- a/drivers/mtd/nand/gpmi-nand/gpmi-nand.h
+++ b/drivers/mtd/nand/gpmi-nand/gpmi-nand.h
@@ -268,31 +268,31 @@ struct timing_threshold {
 };
 
 /* Common Services */
-extern int common_nfc_set_geometry(struct gpmi_nand_data *);
-extern struct dma_chan *get_dma_chan(struct gpmi_nand_data *);
-extern void prepare_data_dma(struct gpmi_nand_data *,
-				enum dma_data_direction dr);
-extern int start_dma_without_bch_irq(struct gpmi_nand_data *,
-				struct dma_async_tx_descriptor *);
-extern int start_dma_with_bch_irq(struct gpmi_nand_data *,
-				struct dma_async_tx_descriptor *);
+int common_nfc_set_geometry(struct gpmi_nand_data *);
+struct dma_chan *get_dma_chan(struct gpmi_nand_data *);
+void prepare_data_dma(struct gpmi_nand_data *,
+		      enum dma_data_direction dr);
+int start_dma_without_bch_irq(struct gpmi_nand_data *,
+			      struct dma_async_tx_descriptor *);
+int start_dma_with_bch_irq(struct gpmi_nand_data *,
+			   struct dma_async_tx_descriptor *);
 
 /* GPMI-NAND helper function library */
-extern int gpmi_init(struct gpmi_nand_data *);
-extern int gpmi_extra_init(struct gpmi_nand_data *);
-extern void gpmi_clear_bch(struct gpmi_nand_data *);
-extern void gpmi_dump_info(struct gpmi_nand_data *);
-extern int bch_set_geometry(struct gpmi_nand_data *);
-extern int gpmi_is_ready(struct gpmi_nand_data *, unsigned chip);
-extern int gpmi_send_command(struct gpmi_nand_data *);
-extern void gpmi_begin(struct gpmi_nand_data *);
-extern void gpmi_end(struct gpmi_nand_data *);
-extern int gpmi_read_data(struct gpmi_nand_data *);
-extern int gpmi_send_data(struct gpmi_nand_data *);
-extern int gpmi_send_page(struct gpmi_nand_data *,
-			dma_addr_t payload, dma_addr_t auxiliary);
-extern int gpmi_read_page(struct gpmi_nand_data *,
-			dma_addr_t payload, dma_addr_t auxiliary);
+int gpmi_init(struct gpmi_nand_data *);
+int gpmi_extra_init(struct gpmi_nand_data *);
+void gpmi_clear_bch(struct gpmi_nand_data *);
+void gpmi_dump_info(struct gpmi_nand_data *);
+int bch_set_geometry(struct gpmi_nand_data *);
+int gpmi_is_ready(struct gpmi_nand_data *, unsigned chip);
+int gpmi_send_command(struct gpmi_nand_data *);
+void gpmi_begin(struct gpmi_nand_data *);
+void gpmi_end(struct gpmi_nand_data *);
+int gpmi_read_data(struct gpmi_nand_data *);
+int gpmi_send_data(struct gpmi_nand_data *);
+int gpmi_send_page(struct gpmi_nand_data *,
+		   dma_addr_t payload, dma_addr_t auxiliary);
+int gpmi_read_page(struct gpmi_nand_data *,
+		   dma_addr_t payload, dma_addr_t auxiliary);
 
 void gpmi_copy_bits(u8 *dst, size_t dst_bit_off,
 		    const u8 *src, size_t src_bit_off,
diff --git a/drivers/mtd/nand/hisi504_nand.c b/drivers/mtd/nand/hisi504_nand.c
index 0897261..cb86279 100644
--- a/drivers/mtd/nand/hisi504_nand.c
+++ b/drivers/mtd/nand/hisi504_nand.c
@@ -544,7 +544,7 @@ static int hisi_nand_read_page_hwecc(struct mtd_info *mtd,
 	int max_bitflips = 0, stat = 0, stat_max = 0, status_ecc;
 	int stat_1, stat_2;
 
-	chip->read_buf(mtd, buf, mtd->writesize);
+	nand_read_page_op(chip, page, 0, buf, mtd->writesize);
 	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
 	/* errors which can not be corrected by ECC */
@@ -574,8 +574,7 @@ static int hisi_nand_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 {
 	struct hinfc_host *host = nand_get_controller_data(chip);
 
-	chip->cmdfunc(mtd, NAND_CMD_READOOB, 0, page);
-	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
+	nand_read_oob_op(chip, page, 0, chip->oob_poi, mtd->oobsize);
 
 	if (host->irq_status & HINFC504_INTS_UE) {
 		host->irq_status = 0;
@@ -590,11 +589,11 @@ static int hisi_nand_write_page_hwecc(struct mtd_info *mtd,
 		struct nand_chip *chip, const uint8_t *buf, int oob_required,
 		int page)
 {
-	chip->write_buf(mtd, buf, mtd->writesize);
+	nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
 	if (oob_required)
 		chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static void hisi_nfc_host_init(struct hinfc_host *host)
diff --git a/drivers/mtd/nand/jz4740_nand.c b/drivers/mtd/nand/jz4740_nand.c
index ad827d4..613b00a 100644
--- a/drivers/mtd/nand/jz4740_nand.c
+++ b/drivers/mtd/nand/jz4740_nand.c
@@ -313,6 +313,7 @@ static int jz_nand_detect_bank(struct platform_device *pdev,
 	uint32_t ctrl;
 	struct nand_chip *chip = &nand->chip;
 	struct mtd_info *mtd = nand_to_mtd(chip);
+	u8 id[2];
 
 	/* Request I/O resource. */
 	sprintf(res_name, "bank%d", bank);
@@ -335,17 +336,16 @@ static int jz_nand_detect_bank(struct platform_device *pdev,
 
 		/* Retrieve the IDs from the first chip. */
 		chip->select_chip(mtd, 0);
-		chip->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
-		chip->cmdfunc(mtd, NAND_CMD_READID, 0x00, -1);
-		*nand_maf_id = chip->read_byte(mtd);
-		*nand_dev_id = chip->read_byte(mtd);
+		nand_reset_op(chip);
+		nand_readid_op(chip, 0, id, sizeof(id));
+		*nand_maf_id = id[0];
+		*nand_dev_id = id[1];
 	} else {
 		/* Detect additional chip. */
 		chip->select_chip(mtd, chipnr);
-		chip->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
-		chip->cmdfunc(mtd, NAND_CMD_READID, 0x00, -1);
-		if (*nand_maf_id != chip->read_byte(mtd)
-		 || *nand_dev_id != chip->read_byte(mtd)) {
+		nand_reset_op(chip);
+		nand_readid_op(chip, 0, id, sizeof(id));
+		if (*nand_maf_id != id[0] || *nand_dev_id != id[1]) {
 			ret = -ENODEV;
 			goto notfound_id;
 		}
diff --git a/drivers/mtd/nand/lpc32xx_mlc.c b/drivers/mtd/nand/lpc32xx_mlc.c
index 5796468..e357948 100644
--- a/drivers/mtd/nand/lpc32xx_mlc.c
+++ b/drivers/mtd/nand/lpc32xx_mlc.c
@@ -461,7 +461,7 @@ static int lpc32xx_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	}
 
 	/* Writing Command and Address */
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
+	nand_read_page_op(chip, page, 0, NULL, 0);
 
 	/* For all sub-pages */
 	for (i = 0; i < host->mlcsubpages; i++) {
@@ -522,6 +522,8 @@ static int lpc32xx_write_page_lowlevel(struct mtd_info *mtd,
 		memcpy(dma_buf, buf, mtd->writesize);
 	}
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	for (i = 0; i < host->mlcsubpages; i++) {
 		/* Start Encode */
 		writeb(0x00, MLC_ECC_ENC_REG(host->io_base));
@@ -550,7 +552,8 @@ static int lpc32xx_write_page_lowlevel(struct mtd_info *mtd,
 		/* Wait for Controller Ready */
 		lpc32xx_waitfunc_controller(mtd, chip);
 	}
-	return 0;
+
+	return nand_prog_page_end_op(chip);
 }
 
 static int lpc32xx_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
diff --git a/drivers/mtd/nand/lpc32xx_slc.c b/drivers/mtd/nand/lpc32xx_slc.c
index b61f28a..5f7cc6d 100644
--- a/drivers/mtd/nand/lpc32xx_slc.c
+++ b/drivers/mtd/nand/lpc32xx_slc.c
@@ -399,10 +399,7 @@ static void lpc32xx_nand_write_buf(struct mtd_info *mtd, const uint8_t *buf, int
 static int lpc32xx_nand_read_oob_syndrome(struct mtd_info *mtd,
 					  struct nand_chip *chip, int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READOOB, 0, page);
-	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
-
-	return 0;
+	return nand_read_oob_op(chip, page, 0, chip->oob_poi, mtd->oobsize);
 }
 
 /*
@@ -411,17 +408,8 @@ static int lpc32xx_nand_read_oob_syndrome(struct mtd_info *mtd,
 static int lpc32xx_nand_write_oob_syndrome(struct mtd_info *mtd,
 	struct nand_chip *chip, int page)
 {
-	int status;
-
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, mtd->writesize, page);
-	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
-
-	/* Send command to program the OOB data */
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_op(chip, page, mtd->writesize, chip->oob_poi,
+				 mtd->oobsize);
 }
 
 /*
@@ -632,7 +620,7 @@ static int lpc32xx_nand_read_page_syndrome(struct mtd_info *mtd,
 	uint8_t *oobecc, tmpecc[LPC32XX_ECC_SAVE_SIZE];
 
 	/* Issue read command */
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
+	nand_read_page_op(chip, page, 0, NULL, 0);
 
 	/* Read data and oob, calculate ECC */
 	status = lpc32xx_xfer(mtd, buf, chip->ecc.steps, 1);
@@ -675,7 +663,7 @@ static int lpc32xx_nand_read_page_raw_syndrome(struct mtd_info *mtd,
 					       int page)
 {
 	/* Issue read command */
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
+	nand_read_page_op(chip, page, 0, NULL, 0);
 
 	/* Raw reads can just use the FIFO interface */
 	chip->read_buf(mtd, buf, chip->ecc.size * chip->ecc.steps);
@@ -698,6 +686,8 @@ static int lpc32xx_nand_write_page_syndrome(struct mtd_info *mtd,
 	uint8_t *pb;
 	int error;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	/* Write data, calculate ECC on outbound data */
 	error = lpc32xx_xfer(mtd, (uint8_t *)buf, chip->ecc.steps, 0);
 	if (error)
@@ -716,7 +706,8 @@ static int lpc32xx_nand_write_page_syndrome(struct mtd_info *mtd,
 
 	/* Write ECC data to device */
 	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
-	return 0;
+
+	return nand_prog_page_end_op(chip);
 }
 
 /*
@@ -729,9 +720,11 @@ static int lpc32xx_nand_write_page_raw_syndrome(struct mtd_info *mtd,
 						int oob_required, int page)
 {
 	/* Raw writes can just use the FIFO interface */
-	chip->write_buf(mtd, buf, chip->ecc.size * chip->ecc.steps);
+	nand_prog_page_begin_op(chip, page, 0, buf,
+				chip->ecc.size * chip->ecc.steps);
 	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
-	return 0;
+
+	return nand_prog_page_end_op(chip);
 }
 
 static int lpc32xx_nand_dma_setup(struct lpc32xx_nand_host *host)
diff --git a/drivers/mtd/nand/marvell_nand.c b/drivers/mtd/nand/marvell_nand.c
new file mode 100644
index 0000000..2196f2a
--- /dev/null
+++ b/drivers/mtd/nand/marvell_nand.c
@@ -0,0 +1,2896 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Marvell NAND flash controller driver
+ *
+ * Copyright (C) 2017 Marvell
+ * Author: Miquel RAYNAL <miquel.raynal@free-electrons.com>
+ *
+ */
+
+#include <linux/module.h>
+#include <linux/clk.h>
+#include <linux/mtd/rawnand.h>
+#include <linux/of_platform.h>
+#include <linux/iopoll.h>
+#include <linux/interrupt.h>
+#include <linux/slab.h>
+#include <linux/mfd/syscon.h>
+#include <linux/regmap.h>
+#include <asm/unaligned.h>
+
+#include <linux/dmaengine.h>
+#include <linux/dma-mapping.h>
+#include <linux/dma/pxa-dma.h>
+#include <linux/platform_data/mtd-nand-pxa3xx.h>
+
+/* Data FIFO granularity, FIFO reads/writes must be a multiple of this length */
+#define FIFO_DEPTH		8
+#define FIFO_REP(x)		(x / sizeof(u32))
+#define BCH_SEQ_READS		(32 / FIFO_DEPTH)
+/* NFC does not support transfers of larger chunks at a time */
+#define MAX_CHUNK_SIZE		2112
+/* NFCv1 cannot read more that 7 bytes of ID */
+#define NFCV1_READID_LEN	7
+/* Polling is done at a pace of POLL_PERIOD us until POLL_TIMEOUT is reached */
+#define POLL_PERIOD		0
+#define POLL_TIMEOUT		100000
+/* Interrupt maximum wait period in ms */
+#define IRQ_TIMEOUT		1000
+/* Latency in clock cycles between SoC pins and NFC logic */
+#define MIN_RD_DEL_CNT		3
+/* Maximum number of contiguous address cycles */
+#define MAX_ADDRESS_CYC_NFCV1	5
+#define MAX_ADDRESS_CYC_NFCV2	7
+/* System control registers/bits to enable the NAND controller on some SoCs */
+#define GENCONF_SOC_DEVICE_MUX	0x208
+#define GENCONF_SOC_DEVICE_MUX_NFC_EN BIT(0)
+#define GENCONF_SOC_DEVICE_MUX_ECC_CLK_RST BIT(20)
+#define GENCONF_SOC_DEVICE_MUX_ECC_CORE_RST BIT(21)
+#define GENCONF_SOC_DEVICE_MUX_NFC_INT_EN BIT(25)
+#define GENCONF_CLK_GATING_CTRL	0x220
+#define GENCONF_CLK_GATING_CTRL_ND_GATE BIT(2)
+#define GENCONF_ND_CLK_CTRL	0x700
+#define GENCONF_ND_CLK_CTRL_EN	BIT(0)
+
+/* NAND controller data flash control register */
+#define NDCR			0x00
+#define NDCR_ALL_INT		GENMASK(11, 0)
+#define NDCR_CS1_CMDDM		BIT(7)
+#define NDCR_CS0_CMDDM		BIT(8)
+#define NDCR_RDYM		BIT(11)
+#define NDCR_ND_ARB_EN		BIT(12)
+#define NDCR_RA_START		BIT(15)
+#define NDCR_RD_ID_CNT(x)	(min_t(unsigned int, x, 0x7) << 16)
+#define NDCR_PAGE_SZ(x)		(x >= 2048 ? BIT(24) : 0)
+#define NDCR_DWIDTH_M		BIT(26)
+#define NDCR_DWIDTH_C		BIT(27)
+#define NDCR_ND_RUN		BIT(28)
+#define NDCR_DMA_EN		BIT(29)
+#define NDCR_ECC_EN		BIT(30)
+#define NDCR_SPARE_EN		BIT(31)
+#define NDCR_GENERIC_FIELDS_MASK (~(NDCR_RA_START | NDCR_PAGE_SZ(2048) | \
+				    NDCR_DWIDTH_M | NDCR_DWIDTH_C))
+
+/* NAND interface timing parameter 0 register */
+#define NDTR0			0x04
+#define NDTR0_TRP(x)		((min_t(unsigned int, x, 0xF) & 0x7) << 0)
+#define NDTR0_TRH(x)		(min_t(unsigned int, x, 0x7) << 3)
+#define NDTR0_ETRP(x)		((min_t(unsigned int, x, 0xF) & 0x8) << 3)
+#define NDTR0_SEL_NRE_EDGE	BIT(7)
+#define NDTR0_TWP(x)		(min_t(unsigned int, x, 0x7) << 8)
+#define NDTR0_TWH(x)		(min_t(unsigned int, x, 0x7) << 11)
+#define NDTR0_TCS(x)		(min_t(unsigned int, x, 0x7) << 16)
+#define NDTR0_TCH(x)		(min_t(unsigned int, x, 0x7) << 19)
+#define NDTR0_RD_CNT_DEL(x)	(min_t(unsigned int, x, 0xF) << 22)
+#define NDTR0_SELCNTR		BIT(26)
+#define NDTR0_TADL(x)		(min_t(unsigned int, x, 0x1F) << 27)
+
+/* NAND interface timing parameter 1 register */
+#define NDTR1			0x0C
+#define NDTR1_TAR(x)		(min_t(unsigned int, x, 0xF) << 0)
+#define NDTR1_TWHR(x)		(min_t(unsigned int, x, 0xF) << 4)
+#define NDTR1_TRHW(x)		(min_t(unsigned int, x / 16, 0x3) << 8)
+#define NDTR1_PRESCALE		BIT(14)
+#define NDTR1_WAIT_MODE		BIT(15)
+#define NDTR1_TR(x)		(min_t(unsigned int, x, 0xFFFF) << 16)
+
+/* NAND controller status register */
+#define NDSR			0x14
+#define NDSR_WRCMDREQ		BIT(0)
+#define NDSR_RDDREQ		BIT(1)
+#define NDSR_WRDREQ		BIT(2)
+#define NDSR_CORERR		BIT(3)
+#define NDSR_UNCERR		BIT(4)
+#define NDSR_CMDD(cs)		BIT(8 - cs)
+#define NDSR_RDY(rb)		BIT(11 + rb)
+#define NDSR_ERRCNT(x)		((x >> 16) & 0x1F)
+
+/* NAND ECC control register */
+#define NDECCCTRL		0x28
+#define NDECCCTRL_BCH_EN	BIT(0)
+
+/* NAND controller data buffer register */
+#define NDDB			0x40
+
+/* NAND controller command buffer 0 register */
+#define NDCB0			0x48
+#define NDCB0_CMD1(x)		((x & 0xFF) << 0)
+#define NDCB0_CMD2(x)		((x & 0xFF) << 8)
+#define NDCB0_ADDR_CYC(x)	((x & 0x7) << 16)
+#define NDCB0_ADDR_GET_NUM_CYC(x) (((x) >> 16) & 0x7)
+#define NDCB0_DBC		BIT(19)
+#define NDCB0_CMD_TYPE(x)	((x & 0x7) << 21)
+#define NDCB0_CSEL		BIT(24)
+#define NDCB0_RDY_BYP		BIT(27)
+#define NDCB0_LEN_OVRD		BIT(28)
+#define NDCB0_CMD_XTYPE(x)	((x & 0x7) << 29)
+
+/* NAND controller command buffer 1 register */
+#define NDCB1			0x4C
+#define NDCB1_COLS(x)		((x & 0xFFFF) << 0)
+#define NDCB1_ADDRS_PAGE(x)	(x << 16)
+
+/* NAND controller command buffer 2 register */
+#define NDCB2			0x50
+#define NDCB2_ADDR5_PAGE(x)	(((x >> 16) & 0xFF) << 0)
+#define NDCB2_ADDR5_CYC(x)	((x & 0xFF) << 0)
+
+/* NAND controller command buffer 3 register */
+#define NDCB3			0x54
+#define NDCB3_ADDR6_CYC(x)	((x & 0xFF) << 16)
+#define NDCB3_ADDR7_CYC(x)	((x & 0xFF) << 24)
+
+/* NAND controller command buffer 0 register 'type' and 'xtype' fields */
+#define TYPE_READ		0
+#define TYPE_WRITE		1
+#define TYPE_ERASE		2
+#define TYPE_READ_ID		3
+#define TYPE_STATUS		4
+#define TYPE_RESET		5
+#define TYPE_NAKED_CMD		6
+#define TYPE_NAKED_ADDR		7
+#define TYPE_MASK		7
+#define XTYPE_MONOLITHIC_RW	0
+#define XTYPE_LAST_NAKED_RW	1
+#define XTYPE_FINAL_COMMAND	3
+#define XTYPE_READ		4
+#define XTYPE_WRITE_DISPATCH	4
+#define XTYPE_NAKED_RW		5
+#define XTYPE_COMMAND_DISPATCH	6
+#define XTYPE_MASK		7
+
+/**
+ * Marvell ECC engine works differently than the others, in order to limit the
+ * size of the IP, hardware engineers chose to set a fixed strength at 16 bits
+ * per subpage, and depending on a the desired strength needed by the NAND chip,
+ * a particular layout mixing data/spare/ecc is defined, with a possible last
+ * chunk smaller that the others.
+ *
+ * @writesize:		Full page size on which the layout applies
+ * @chunk:		Desired ECC chunk size on which the layout applies
+ * @strength:		Desired ECC strength (per chunk size bytes) on which the
+ *			layout applies
+ * @nchunks:		Total number of chunks
+ * @full_chunk_cnt:	Number of full-sized chunks, which is the number of
+ *			repetitions of the pattern:
+ *			(data_bytes + spare_bytes + ecc_bytes).
+ * @data_bytes:		Number of data bytes per chunk
+ * @spare_bytes:	Number of spare bytes per chunk
+ * @ecc_bytes:		Number of ecc bytes per chunk
+ * @last_data_bytes:	Number of data bytes in the last chunk
+ * @last_spare_bytes:	Number of spare bytes in the last chunk
+ * @last_ecc_bytes:	Number of ecc bytes in the last chunk
+ */
+struct marvell_hw_ecc_layout {
+	/* Constraints */
+	int writesize;
+	int chunk;
+	int strength;
+	/* Corresponding layout */
+	int nchunks;
+	int full_chunk_cnt;
+	int data_bytes;
+	int spare_bytes;
+	int ecc_bytes;
+	int last_data_bytes;
+	int last_spare_bytes;
+	int last_ecc_bytes;
+};
+
+#define MARVELL_LAYOUT(ws, dc, ds, nc, fcc, db, sb, eb, ldb, lsb, leb)	\
+	{								\
+		.writesize = ws,					\
+		.chunk = dc,						\
+		.strength = ds,						\
+		.nchunks = nc,						\
+		.full_chunk_cnt = fcc,					\
+		.data_bytes = db,					\
+		.spare_bytes = sb,					\
+		.ecc_bytes = eb,					\
+		.last_data_bytes = ldb,					\
+		.last_spare_bytes = lsb,				\
+		.last_ecc_bytes = leb,					\
+	}
+
+/* Layouts explained in AN-379_Marvell_SoC_NFC_ECC */
+static const struct marvell_hw_ecc_layout marvell_nfc_layouts[] = {
+	MARVELL_LAYOUT(  512,   512,  1,  1,  1,  512,  8,  8,  0,  0,  0),
+	MARVELL_LAYOUT( 2048,   512,  1,  1,  1, 2048, 40, 24,  0,  0,  0),
+	MARVELL_LAYOUT( 2048,   512,  4,  1,  1, 2048, 32, 30,  0,  0,  0),
+	MARVELL_LAYOUT( 4096,   512,  4,  2,  2, 2048, 32, 30,  0,  0,  0),
+	MARVELL_LAYOUT( 4096,   512,  8,  5,  4, 1024,  0, 30,  0, 64, 30),
+};
+
+/**
+ * The Nand Flash Controller has up to 4 CE and 2 RB pins. The CE selection
+ * is made by a field in NDCB0 register, and in another field in NDCB2 register.
+ * The datasheet describes the logic with an error: ADDR5 field is once
+ * declared at the beginning of NDCB2, and another time at its end. Because the
+ * ADDR5 field of NDCB2 may be used by other bytes, it would be more logical
+ * to use the last bit of this field instead of the first ones.
+ *
+ * @cs:			Wanted CE lane.
+ * @ndcb0_csel:		Value of the NDCB0 register with or without the flag
+ *			selecting the wanted CE lane. This is set once when
+ *			the Device Tree is probed.
+ * @rb:			Ready/Busy pin for the flash chip
+ */
+struct marvell_nand_chip_sel {
+	unsigned int cs;
+	u32 ndcb0_csel;
+	unsigned int rb;
+};
+
+/**
+ * NAND chip structure: stores NAND chip device related information
+ *
+ * @chip:		Base NAND chip structure
+ * @node:		Used to store NAND chips into a list
+ * @layout		NAND layout when using hardware ECC
+ * @ndcr:		Controller register value for this NAND chip
+ * @ndtr0:		Timing registers 0 value for this NAND chip
+ * @ndtr1:		Timing registers 1 value for this NAND chip
+ * @selected_die:	Current active CS
+ * @nsels:		Number of CS lines required by the NAND chip
+ * @sels:		Array of CS lines descriptions
+ */
+struct marvell_nand_chip {
+	struct nand_chip chip;
+	struct list_head node;
+	const struct marvell_hw_ecc_layout *layout;
+	u32 ndcr;
+	u32 ndtr0;
+	u32 ndtr1;
+	int addr_cyc;
+	int selected_die;
+	unsigned int nsels;
+	struct marvell_nand_chip_sel sels[0];
+};
+
+static inline struct marvell_nand_chip *to_marvell_nand(struct nand_chip *chip)
+{
+	return container_of(chip, struct marvell_nand_chip, chip);
+}
+
+static inline struct marvell_nand_chip_sel *to_nand_sel(struct marvell_nand_chip
+							*nand)
+{
+	return &nand->sels[nand->selected_die];
+}
+
+/**
+ * NAND controller capabilities for distinction between compatible strings
+ *
+ * @max_cs_nb:		Number of Chip Select lines available
+ * @max_rb_nb:		Number of Ready/Busy lines available
+ * @need_system_controller: Indicates if the SoC needs to have access to the
+ *                      system controller (ie. to enable the NAND controller)
+ * @legacy_of_bindings:	Indicates if DT parsing must be done using the old
+ *			fashion way
+ * @is_nfcv2:		NFCv2 has numerous enhancements compared to NFCv1, ie.
+ *			BCH error detection and correction algorithm,
+ *			NDCB3 register has been added
+ * @use_dma:		Use dma for data transfers
+ */
+struct marvell_nfc_caps {
+	unsigned int max_cs_nb;
+	unsigned int max_rb_nb;
+	bool need_system_controller;
+	bool legacy_of_bindings;
+	bool is_nfcv2;
+	bool use_dma;
+};
+
+/**
+ * NAND controller structure: stores Marvell NAND controller information
+ *
+ * @controller:		Base controller structure
+ * @dev:		Parent device (used to print error messages)
+ * @regs:		NAND controller registers
+ * @ecc_clk:		ECC block clock, two times the NAND controller clock
+ * @complete:		Completion object to wait for NAND controller events
+ * @assigned_cs:	Bitmask describing already assigned CS lines
+ * @chips:		List containing all the NAND chips attached to
+ *			this NAND controller
+ * @caps:		NAND controller capabilities for each compatible string
+ * @dma_chan:		DMA channel (NFCv1 only)
+ * @dma_buf:		32-bit aligned buffer for DMA transfers (NFCv1 only)
+ */
+struct marvell_nfc {
+	struct nand_hw_control controller;
+	struct device *dev;
+	void __iomem *regs;
+	struct clk *ecc_clk;
+	struct completion complete;
+	unsigned long assigned_cs;
+	struct list_head chips;
+	struct nand_chip *selected_chip;
+	const struct marvell_nfc_caps *caps;
+
+	/* DMA (NFCv1 only) */
+	bool use_dma;
+	struct dma_chan *dma_chan;
+	u8 *dma_buf;
+};
+
+static inline struct marvell_nfc *to_marvell_nfc(struct nand_hw_control *ctrl)
+{
+	return container_of(ctrl, struct marvell_nfc, controller);
+}
+
+/**
+ * NAND controller timings expressed in NAND Controller clock cycles
+ *
+ * @tRP:		ND_nRE pulse width
+ * @tRH:		ND_nRE high duration
+ * @tWP:		ND_nWE pulse time
+ * @tWH:		ND_nWE high duration
+ * @tCS:		Enable signal setup time
+ * @tCH:		Enable signal hold time
+ * @tADL:		Address to write data delay
+ * @tAR:		ND_ALE low to ND_nRE low delay
+ * @tWHR:		ND_nWE high to ND_nRE low for status read
+ * @tRHW:		ND_nRE high duration, read to write delay
+ * @tR:			ND_nWE high to ND_nRE low for read
+ */
+struct marvell_nfc_timings {
+	/* NDTR0 fields */
+	unsigned int tRP;
+	unsigned int tRH;
+	unsigned int tWP;
+	unsigned int tWH;
+	unsigned int tCS;
+	unsigned int tCH;
+	unsigned int tADL;
+	/* NDTR1 fields */
+	unsigned int tAR;
+	unsigned int tWHR;
+	unsigned int tRHW;
+	unsigned int tR;
+};
+
+/**
+ * Derives a duration in numbers of clock cycles.
+ *
+ * @ps: Duration in pico-seconds
+ * @period_ns:  Clock period in nano-seconds
+ *
+ * Convert the duration in nano-seconds, then divide by the period and
+ * return the number of clock periods.
+ */
+#define TO_CYCLES(ps, period_ns) (DIV_ROUND_UP(ps / 1000, period_ns))
+
+/**
+ * NAND driver structure filled during the parsing of the ->exec_op() subop
+ * subset of instructions.
+ *
+ * @ndcb:		Array of values written to NDCBx registers
+ * @cle_ale_delay_ns:	Optional delay after the last CMD or ADDR cycle
+ * @rdy_timeout_ms:	Timeout for waits on Ready/Busy pin
+ * @rdy_delay_ns:	Optional delay after waiting for the RB pin
+ * @data_delay_ns:	Optional delay after the data xfer
+ * @data_instr_idx:	Index of the data instruction in the subop
+ * @data_instr:		Pointer to the data instruction in the subop
+ */
+struct marvell_nfc_op {
+	u32 ndcb[4];
+	unsigned int cle_ale_delay_ns;
+	unsigned int rdy_timeout_ms;
+	unsigned int rdy_delay_ns;
+	unsigned int data_delay_ns;
+	unsigned int data_instr_idx;
+	const struct nand_op_instr *data_instr;
+};
+
+/*
+ * Internal helper to conditionnally apply a delay (from the above structure,
+ * most of the time).
+ */
+static void cond_delay(unsigned int ns)
+{
+	if (!ns)
+		return;
+
+	if (ns < 10000)
+		ndelay(ns);
+	else
+		udelay(DIV_ROUND_UP(ns, 1000));
+}
+
+/*
+ * The controller has many flags that could generate interrupts, most of them
+ * are disabled and polling is used. For the very slow signals, using interrupts
+ * may relax the CPU charge.
+ */
+static void marvell_nfc_disable_int(struct marvell_nfc *nfc, u32 int_mask)
+{
+	u32 reg;
+
+	/* Writing 1 disables the interrupt */
+	reg = readl_relaxed(nfc->regs + NDCR);
+	writel_relaxed(reg | int_mask, nfc->regs + NDCR);
+}
+
+static void marvell_nfc_enable_int(struct marvell_nfc *nfc, u32 int_mask)
+{
+	u32 reg;
+
+	/* Writing 0 enables the interrupt */
+	reg = readl_relaxed(nfc->regs + NDCR);
+	writel_relaxed(reg & ~int_mask, nfc->regs + NDCR);
+}
+
+static void marvell_nfc_clear_int(struct marvell_nfc *nfc, u32 int_mask)
+{
+	writel_relaxed(int_mask, nfc->regs + NDSR);
+}
+
+static void marvell_nfc_force_byte_access(struct nand_chip *chip,
+					  bool force_8bit)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	u32 ndcr;
+
+	/*
+	 * Callers of this function do not verify if the NAND is using a 16-bit
+	 * an 8-bit bus for normal operations, so we need to take care of that
+	 * here by leaving the configuration unchanged if the NAND does not have
+	 * the NAND_BUSWIDTH_16 flag set.
+	 */
+	if (!(chip->options & NAND_BUSWIDTH_16))
+		return;
+
+	ndcr = readl_relaxed(nfc->regs + NDCR);
+
+	if (force_8bit)
+		ndcr &= ~(NDCR_DWIDTH_M | NDCR_DWIDTH_C);
+	else
+		ndcr |= NDCR_DWIDTH_M | NDCR_DWIDTH_C;
+
+	writel_relaxed(ndcr, nfc->regs + NDCR);
+}
+
+static int marvell_nfc_wait_ndrun(struct nand_chip *chip)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	u32 val;
+	int ret;
+
+	/*
+	 * The command is being processed, wait for the ND_RUN bit to be
+	 * cleared by the NFC. If not, we must clear it by hand.
+	 */
+	ret = readl_relaxed_poll_timeout(nfc->regs + NDCR, val,
+					 (val & NDCR_ND_RUN) == 0,
+					 POLL_PERIOD, POLL_TIMEOUT);
+	if (ret) {
+		dev_err(nfc->dev, "Timeout on NAND controller run mode\n");
+		writel_relaxed(readl(nfc->regs + NDCR) & ~NDCR_ND_RUN,
+			       nfc->regs + NDCR);
+		return ret;
+	}
+
+	return 0;
+}
+
+/*
+ * Any time a command has to be sent to the controller, the following sequence
+ * has to be followed:
+ * - call marvell_nfc_prepare_cmd()
+ *      -> activate the ND_RUN bit that will kind of 'start a job'
+ *      -> wait the signal indicating the NFC is waiting for a command
+ * - send the command (cmd and address cycles)
+ * - enventually send or receive the data
+ * - call marvell_nfc_end_cmd() with the corresponding flag
+ *      -> wait the flag to be triggered or cancel the job with a timeout
+ *
+ * The following helpers are here to factorize the code a bit so that
+ * specialized functions responsible for executing the actual NAND
+ * operations do not have to replicate the same code blocks.
+ */
+static int marvell_nfc_prepare_cmd(struct nand_chip *chip)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	u32 ndcr, val;
+	int ret;
+
+	/* Poll ND_RUN and clear NDSR before issuing any command */
+	ret = marvell_nfc_wait_ndrun(chip);
+	if (ret) {
+		dev_err(nfc->dev, "Last operation did not succeed\n");
+		return ret;
+	}
+
+	ndcr = readl_relaxed(nfc->regs + NDCR);
+	writel_relaxed(readl(nfc->regs + NDSR), nfc->regs + NDSR);
+
+	/* Assert ND_RUN bit and wait the NFC to be ready */
+	writel_relaxed(ndcr | NDCR_ND_RUN, nfc->regs + NDCR);
+	ret = readl_relaxed_poll_timeout(nfc->regs + NDSR, val,
+					 val & NDSR_WRCMDREQ,
+					 POLL_PERIOD, POLL_TIMEOUT);
+	if (ret) {
+		dev_err(nfc->dev, "Timeout on WRCMDRE\n");
+		return -ETIMEDOUT;
+	}
+
+	/* Command may be written, clear WRCMDREQ status bit */
+	writel_relaxed(NDSR_WRCMDREQ, nfc->regs + NDSR);
+
+	return 0;
+}
+
+static void marvell_nfc_send_cmd(struct nand_chip *chip,
+				 struct marvell_nfc_op *nfc_op)
+{
+	struct marvell_nand_chip *marvell_nand = to_marvell_nand(chip);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+
+	dev_dbg(nfc->dev, "\nNDCR:  0x%08x\n"
+		"NDCB0: 0x%08x\nNDCB1: 0x%08x\nNDCB2: 0x%08x\nNDCB3: 0x%08x\n",
+		(u32)readl_relaxed(nfc->regs + NDCR), nfc_op->ndcb[0],
+		nfc_op->ndcb[1], nfc_op->ndcb[2], nfc_op->ndcb[3]);
+
+	writel_relaxed(to_nand_sel(marvell_nand)->ndcb0_csel | nfc_op->ndcb[0],
+		       nfc->regs + NDCB0);
+	writel_relaxed(nfc_op->ndcb[1], nfc->regs + NDCB0);
+	writel(nfc_op->ndcb[2], nfc->regs + NDCB0);
+
+	/*
+	 * Write NDCB0 four times only if LEN_OVRD is set or if ADDR6 or ADDR7
+	 * fields are used (only available on NFCv2).
+	 */
+	if (nfc_op->ndcb[0] & NDCB0_LEN_OVRD ||
+	    NDCB0_ADDR_GET_NUM_CYC(nfc_op->ndcb[0]) >= 6) {
+		if (!WARN_ON_ONCE(!nfc->caps->is_nfcv2))
+			writel(nfc_op->ndcb[3], nfc->regs + NDCB0);
+	}
+}
+
+static int marvell_nfc_end_cmd(struct nand_chip *chip, int flag,
+			       const char *label)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	u32 val;
+	int ret;
+
+	ret = readl_relaxed_poll_timeout(nfc->regs + NDSR, val,
+					 val & flag,
+					 POLL_PERIOD, POLL_TIMEOUT);
+
+	if (ret) {
+		dev_err(nfc->dev, "Timeout on %s (NDSR: 0x%08x)\n",
+			label, val);
+		if (nfc->dma_chan)
+			dmaengine_terminate_all(nfc->dma_chan);
+		return ret;
+	}
+
+	/*
+	 * DMA function uses this helper to poll on CMDD bits without wanting
+	 * them to be cleared.
+	 */
+	if (nfc->use_dma && (readl_relaxed(nfc->regs + NDCR) & NDCR_DMA_EN))
+		return 0;
+
+	writel_relaxed(flag, nfc->regs + NDSR);
+
+	return 0;
+}
+
+static int marvell_nfc_wait_cmdd(struct nand_chip *chip)
+{
+	struct marvell_nand_chip *marvell_nand = to_marvell_nand(chip);
+	int cs_flag = NDSR_CMDD(to_nand_sel(marvell_nand)->ndcb0_csel);
+
+	return marvell_nfc_end_cmd(chip, cs_flag, "CMDD");
+}
+
+static int marvell_nfc_wait_op(struct nand_chip *chip, unsigned int timeout_ms)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	int ret;
+
+	/* Timeout is expressed in ms */
+	if (!timeout_ms)
+		timeout_ms = IRQ_TIMEOUT;
+
+	init_completion(&nfc->complete);
+
+	marvell_nfc_enable_int(nfc, NDCR_RDYM);
+	ret = wait_for_completion_timeout(&nfc->complete,
+					  msecs_to_jiffies(timeout_ms));
+	marvell_nfc_disable_int(nfc, NDCR_RDYM);
+	marvell_nfc_clear_int(nfc, NDSR_RDY(0) | NDSR_RDY(1));
+	if (!ret) {
+		dev_err(nfc->dev, "Timeout waiting for RB signal\n");
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static void marvell_nfc_select_chip(struct mtd_info *mtd, int die_nr)
+{
+	struct nand_chip *chip = mtd_to_nand(mtd);
+	struct marvell_nand_chip *marvell_nand = to_marvell_nand(chip);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	u32 ndcr_generic;
+
+	if (chip == nfc->selected_chip && die_nr == marvell_nand->selected_die)
+		return;
+
+	if (die_nr < 0 || die_nr >= marvell_nand->nsels) {
+		nfc->selected_chip = NULL;
+		marvell_nand->selected_die = -1;
+		return;
+	}
+
+	/*
+	 * Do not change the timing registers when using the DT property
+	 * marvell,nand-keep-config; in that case ->ndtr0 and ->ndtr1 from the
+	 * marvell_nand structure are supposedly empty.
+	 */
+	writel_relaxed(marvell_nand->ndtr0, nfc->regs + NDTR0);
+	writel_relaxed(marvell_nand->ndtr1, nfc->regs + NDTR1);
+
+	/*
+	 * Reset the NDCR register to a clean state for this particular chip,
+	 * also clear ND_RUN bit.
+	 */
+	ndcr_generic = readl_relaxed(nfc->regs + NDCR) &
+		       NDCR_GENERIC_FIELDS_MASK & ~NDCR_ND_RUN;
+	writel_relaxed(ndcr_generic | marvell_nand->ndcr, nfc->regs + NDCR);
+
+	/* Also reset the interrupt status register */
+	marvell_nfc_clear_int(nfc, NDCR_ALL_INT);
+
+	nfc->selected_chip = chip;
+	marvell_nand->selected_die = die_nr;
+}
+
+static irqreturn_t marvell_nfc_isr(int irq, void *dev_id)
+{
+	struct marvell_nfc *nfc = dev_id;
+	u32 st = readl_relaxed(nfc->regs + NDSR);
+	u32 ien = (~readl_relaxed(nfc->regs + NDCR)) & NDCR_ALL_INT;
+
+	/*
+	 * RDY interrupt mask is one bit in NDCR while there are two status
+	 * bit in NDSR (RDY[cs0/cs2] and RDY[cs1/cs3]).
+	 */
+	if (st & NDSR_RDY(1))
+		st |= NDSR_RDY(0);
+
+	if (!(st & ien))
+		return IRQ_NONE;
+
+	marvell_nfc_disable_int(nfc, st & NDCR_ALL_INT);
+
+	if (!(st & (NDSR_RDDREQ | NDSR_WRDREQ | NDSR_WRCMDREQ)))
+		complete(&nfc->complete);
+
+	return IRQ_HANDLED;
+}
+
+/* HW ECC related functions */
+static void marvell_nfc_enable_hw_ecc(struct nand_chip *chip)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	u32 ndcr = readl_relaxed(nfc->regs + NDCR);
+
+	if (!(ndcr & NDCR_ECC_EN)) {
+		writel_relaxed(ndcr | NDCR_ECC_EN, nfc->regs + NDCR);
+
+		/*
+		 * When enabling BCH, set threshold to 0 to always know the
+		 * number of corrected bitflips.
+		 */
+		if (chip->ecc.algo == NAND_ECC_BCH)
+			writel_relaxed(NDECCCTRL_BCH_EN, nfc->regs + NDECCCTRL);
+	}
+}
+
+static void marvell_nfc_disable_hw_ecc(struct nand_chip *chip)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	u32 ndcr = readl_relaxed(nfc->regs + NDCR);
+
+	if (ndcr & NDCR_ECC_EN) {
+		writel_relaxed(ndcr & ~NDCR_ECC_EN, nfc->regs + NDCR);
+		if (chip->ecc.algo == NAND_ECC_BCH)
+			writel_relaxed(0, nfc->regs + NDECCCTRL);
+	}
+}
+
+/* DMA related helpers */
+static void marvell_nfc_enable_dma(struct marvell_nfc *nfc)
+{
+	u32 reg;
+
+	reg = readl_relaxed(nfc->regs + NDCR);
+	writel_relaxed(reg | NDCR_DMA_EN, nfc->regs + NDCR);
+}
+
+static void marvell_nfc_disable_dma(struct marvell_nfc *nfc)
+{
+	u32 reg;
+
+	reg = readl_relaxed(nfc->regs + NDCR);
+	writel_relaxed(reg & ~NDCR_DMA_EN, nfc->regs + NDCR);
+}
+
+/* Read/write PIO/DMA accessors */
+static int marvell_nfc_xfer_data_dma(struct marvell_nfc *nfc,
+				     enum dma_data_direction direction,
+				     unsigned int len)
+{
+	unsigned int dma_len = min_t(int, ALIGN(len, 32), MAX_CHUNK_SIZE);
+	struct dma_async_tx_descriptor *tx;
+	struct scatterlist sg;
+	dma_cookie_t cookie;
+	int ret;
+
+	marvell_nfc_enable_dma(nfc);
+	/* Prepare the DMA transfer */
+	sg_init_one(&sg, nfc->dma_buf, dma_len);
+	dma_map_sg(nfc->dma_chan->device->dev, &sg, 1, direction);
+	tx = dmaengine_prep_slave_sg(nfc->dma_chan, &sg, 1,
+				     direction == DMA_FROM_DEVICE ?
+				     DMA_DEV_TO_MEM : DMA_MEM_TO_DEV,
+				     DMA_PREP_INTERRUPT);
+	if (!tx) {
+		dev_err(nfc->dev, "Could not prepare DMA S/G list\n");
+		return -ENXIO;
+	}
+
+	/* Do the task and wait for it to finish */
+	cookie = dmaengine_submit(tx);
+	ret = dma_submit_error(cookie);
+	if (ret)
+		return -EIO;
+
+	dma_async_issue_pending(nfc->dma_chan);
+	ret = marvell_nfc_wait_cmdd(nfc->selected_chip);
+	dma_unmap_sg(nfc->dma_chan->device->dev, &sg, 1, direction);
+	marvell_nfc_disable_dma(nfc);
+	if (ret) {
+		dev_err(nfc->dev, "Timeout waiting for DMA (status: %d)\n",
+			dmaengine_tx_status(nfc->dma_chan, cookie, NULL));
+		dmaengine_terminate_all(nfc->dma_chan);
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int marvell_nfc_xfer_data_in_pio(struct marvell_nfc *nfc, u8 *in,
+					unsigned int len)
+{
+	unsigned int last_len = len % FIFO_DEPTH;
+	unsigned int last_full_offset = round_down(len, FIFO_DEPTH);
+	int i;
+
+	for (i = 0; i < last_full_offset; i += FIFO_DEPTH)
+		ioread32_rep(nfc->regs + NDDB, in + i, FIFO_REP(FIFO_DEPTH));
+
+	if (last_len) {
+		u8 tmp_buf[FIFO_DEPTH];
+
+		ioread32_rep(nfc->regs + NDDB, tmp_buf, FIFO_REP(FIFO_DEPTH));
+		memcpy(in + last_full_offset, tmp_buf, last_len);
+	}
+
+	return 0;
+}
+
+static int marvell_nfc_xfer_data_out_pio(struct marvell_nfc *nfc, const u8 *out,
+					 unsigned int len)
+{
+	unsigned int last_len = len % FIFO_DEPTH;
+	unsigned int last_full_offset = round_down(len, FIFO_DEPTH);
+	int i;
+
+	for (i = 0; i < last_full_offset; i += FIFO_DEPTH)
+		iowrite32_rep(nfc->regs + NDDB, out + i, FIFO_REP(FIFO_DEPTH));
+
+	if (last_len) {
+		u8 tmp_buf[FIFO_DEPTH];
+
+		memcpy(tmp_buf, out + last_full_offset, last_len);
+		iowrite32_rep(nfc->regs + NDDB, tmp_buf, FIFO_REP(FIFO_DEPTH));
+	}
+
+	return 0;
+}
+
+static void marvell_nfc_check_empty_chunk(struct nand_chip *chip,
+					  u8 *data, int data_len,
+					  u8 *spare, int spare_len,
+					  u8 *ecc, int ecc_len,
+					  unsigned int *max_bitflips)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	int bf;
+
+	/*
+	 * Blank pages (all 0xFF) that have not been written may be recognized
+	 * as bad if bitflips occur, so whenever an uncorrectable error occurs,
+	 * check if the entire page (with ECC bytes) is actually blank or not.
+	 */
+	if (!data)
+		data_len = 0;
+	if (!spare)
+		spare_len = 0;
+	if (!ecc)
+		ecc_len = 0;
+
+	bf = nand_check_erased_ecc_chunk(data, data_len, ecc, ecc_len,
+					 spare, spare_len, chip->ecc.strength);
+	if (bf < 0) {
+		mtd->ecc_stats.failed++;
+		return;
+	}
+
+	/* Update the stats and max_bitflips */
+	mtd->ecc_stats.corrected += bf;
+	*max_bitflips = max_t(unsigned int, *max_bitflips, bf);
+}
+
+/*
+ * Check a chunk is correct or not according to hardware ECC engine.
+ * mtd->ecc_stats.corrected is updated, as well as max_bitflips, however
+ * mtd->ecc_stats.failure is not, the function will instead return a non-zero
+ * value indicating that a check on the emptyness of the subpage must be
+ * performed before declaring the subpage corrupted.
+ */
+static int marvell_nfc_hw_ecc_correct(struct nand_chip *chip,
+				      unsigned int *max_bitflips)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	int bf = 0;
+	u32 ndsr;
+
+	ndsr = readl_relaxed(nfc->regs + NDSR);
+
+	/* Check uncorrectable error flag */
+	if (ndsr & NDSR_UNCERR) {
+		writel_relaxed(ndsr, nfc->regs + NDSR);
+
+		/*
+		 * Do not increment ->ecc_stats.failed now, instead, return a
+		 * non-zero value to indicate that this chunk was apparently
+		 * bad, and it should be check to see if it empty or not. If
+		 * the chunk (with ECC bytes) is not declared empty, the calling
+		 * function must increment the failure count.
+		 */
+		return -EBADMSG;
+	}
+
+	/* Check correctable error flag */
+	if (ndsr & NDSR_CORERR) {
+		writel_relaxed(ndsr, nfc->regs + NDSR);
+
+		if (chip->ecc.algo == NAND_ECC_BCH)
+			bf = NDSR_ERRCNT(ndsr);
+		else
+			bf = 1;
+	}
+
+	/* Update the stats and max_bitflips */
+	mtd->ecc_stats.corrected += bf;
+	*max_bitflips = max_t(unsigned int, *max_bitflips, bf);
+
+	return 0;
+}
+
+/* Hamming read helpers */
+static int marvell_nfc_hw_ecc_hmg_do_read_page(struct nand_chip *chip,
+					       u8 *data_buf, u8 *oob_buf,
+					       bool raw, int page)
+{
+	struct marvell_nand_chip *marvell_nand = to_marvell_nand(chip);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	struct marvell_nfc_op nfc_op = {
+		.ndcb[0] = NDCB0_CMD_TYPE(TYPE_READ) |
+			   NDCB0_ADDR_CYC(marvell_nand->addr_cyc) |
+			   NDCB0_DBC |
+			   NDCB0_CMD1(NAND_CMD_READ0) |
+			   NDCB0_CMD2(NAND_CMD_READSTART),
+		.ndcb[1] = NDCB1_ADDRS_PAGE(page),
+		.ndcb[2] = NDCB2_ADDR5_PAGE(page),
+	};
+	unsigned int oob_bytes = lt->spare_bytes + (raw ? lt->ecc_bytes : 0);
+	int ret;
+
+	/* NFCv2 needs more information about the operation being executed */
+	if (nfc->caps->is_nfcv2)
+		nfc_op.ndcb[0] |= NDCB0_CMD_XTYPE(XTYPE_MONOLITHIC_RW);
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+	ret = marvell_nfc_end_cmd(chip, NDSR_RDDREQ,
+				  "RDDREQ while draining FIFO (data/oob)");
+	if (ret)
+		return ret;
+
+	/*
+	 * Read the page then the OOB area. Unlike what is shown in current
+	 * documentation, spare bytes are protected by the ECC engine, and must
+	 * be at the beginning of the OOB area or running this driver on legacy
+	 * systems will prevent the discovery of the BBM/BBT.
+	 */
+	if (nfc->use_dma) {
+		marvell_nfc_xfer_data_dma(nfc, DMA_FROM_DEVICE,
+					  lt->data_bytes + oob_bytes);
+		memcpy(data_buf, nfc->dma_buf, lt->data_bytes);
+		memcpy(oob_buf, nfc->dma_buf + lt->data_bytes, oob_bytes);
+	} else {
+		marvell_nfc_xfer_data_in_pio(nfc, data_buf, lt->data_bytes);
+		marvell_nfc_xfer_data_in_pio(nfc, oob_buf, oob_bytes);
+	}
+
+	ret = marvell_nfc_wait_cmdd(chip);
+
+	return ret;
+}
+
+static int marvell_nfc_hw_ecc_hmg_read_page_raw(struct mtd_info *mtd,
+						struct nand_chip *chip, u8 *buf,
+						int oob_required, int page)
+{
+	return marvell_nfc_hw_ecc_hmg_do_read_page(chip, buf, chip->oob_poi,
+						   true, page);
+}
+
+static int marvell_nfc_hw_ecc_hmg_read_page(struct mtd_info *mtd,
+					    struct nand_chip *chip,
+					    u8 *buf, int oob_required,
+					    int page)
+{
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	unsigned int full_sz = lt->data_bytes + lt->spare_bytes + lt->ecc_bytes;
+	int max_bitflips = 0, ret;
+	u8 *raw_buf;
+
+	marvell_nfc_enable_hw_ecc(chip);
+	marvell_nfc_hw_ecc_hmg_do_read_page(chip, buf, chip->oob_poi, false,
+					    page);
+	ret = marvell_nfc_hw_ecc_correct(chip, &max_bitflips);
+	marvell_nfc_disable_hw_ecc(chip);
+
+	if (!ret)
+		return max_bitflips;
+
+	/*
+	 * When ECC failures are detected, check if the full page has been
+	 * written or not. Ignore the failure if it is actually empty.
+	 */
+	raw_buf = kmalloc(full_sz, GFP_KERNEL);
+	if (!raw_buf)
+		return -ENOMEM;
+
+	marvell_nfc_hw_ecc_hmg_do_read_page(chip, raw_buf, raw_buf +
+					    lt->data_bytes, true, page);
+	marvell_nfc_check_empty_chunk(chip, raw_buf, full_sz, NULL, 0, NULL, 0,
+				      &max_bitflips);
+	kfree(raw_buf);
+
+	return max_bitflips;
+}
+
+/*
+ * Spare area in Hamming layouts is not protected by the ECC engine (even if
+ * it appears before the ECC bytes when reading), the ->read_oob_raw() function
+ * also stands for ->read_oob().
+ */
+static int marvell_nfc_hw_ecc_hmg_read_oob_raw(struct mtd_info *mtd,
+					       struct nand_chip *chip, int page)
+{
+	/* Invalidate page cache */
+	chip->pagebuf = -1;
+
+	return marvell_nfc_hw_ecc_hmg_do_read_page(chip, chip->data_buf,
+						   chip->oob_poi, true, page);
+}
+
+/* Hamming write helpers */
+static int marvell_nfc_hw_ecc_hmg_do_write_page(struct nand_chip *chip,
+						const u8 *data_buf,
+						const u8 *oob_buf, bool raw,
+						int page)
+{
+	struct marvell_nand_chip *marvell_nand = to_marvell_nand(chip);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	struct marvell_nfc_op nfc_op = {
+		.ndcb[0] = NDCB0_CMD_TYPE(TYPE_WRITE) |
+			   NDCB0_ADDR_CYC(marvell_nand->addr_cyc) |
+			   NDCB0_CMD1(NAND_CMD_SEQIN) |
+			   NDCB0_CMD2(NAND_CMD_PAGEPROG) |
+			   NDCB0_DBC,
+		.ndcb[1] = NDCB1_ADDRS_PAGE(page),
+		.ndcb[2] = NDCB2_ADDR5_PAGE(page),
+	};
+	unsigned int oob_bytes = lt->spare_bytes + (raw ? lt->ecc_bytes : 0);
+	int ret;
+
+	/* NFCv2 needs more information about the operation being executed */
+	if (nfc->caps->is_nfcv2)
+		nfc_op.ndcb[0] |= NDCB0_CMD_XTYPE(XTYPE_MONOLITHIC_RW);
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+	ret = marvell_nfc_end_cmd(chip, NDSR_WRDREQ,
+				  "WRDREQ while loading FIFO (data)");
+	if (ret)
+		return ret;
+
+	/* Write the page then the OOB area */
+	if (nfc->use_dma) {
+		memcpy(nfc->dma_buf, data_buf, lt->data_bytes);
+		memcpy(nfc->dma_buf + lt->data_bytes, oob_buf, oob_bytes);
+		marvell_nfc_xfer_data_dma(nfc, DMA_TO_DEVICE, lt->data_bytes +
+					  lt->ecc_bytes + lt->spare_bytes);
+	} else {
+		marvell_nfc_xfer_data_out_pio(nfc, data_buf, lt->data_bytes);
+		marvell_nfc_xfer_data_out_pio(nfc, oob_buf, oob_bytes);
+	}
+
+	ret = marvell_nfc_wait_cmdd(chip);
+	if (ret)
+		return ret;
+
+	ret = marvell_nfc_wait_op(chip,
+				  chip->data_interface.timings.sdr.tPROG_max);
+	return ret;
+}
+
+static int marvell_nfc_hw_ecc_hmg_write_page_raw(struct mtd_info *mtd,
+						 struct nand_chip *chip,
+						 const u8 *buf,
+						 int oob_required, int page)
+{
+	return marvell_nfc_hw_ecc_hmg_do_write_page(chip, buf, chip->oob_poi,
+						    true, page);
+}
+
+static int marvell_nfc_hw_ecc_hmg_write_page(struct mtd_info *mtd,
+					     struct nand_chip *chip,
+					     const u8 *buf,
+					     int oob_required, int page)
+{
+	int ret;
+
+	marvell_nfc_enable_hw_ecc(chip);
+	ret = marvell_nfc_hw_ecc_hmg_do_write_page(chip, buf, chip->oob_poi,
+						   false, page);
+	marvell_nfc_disable_hw_ecc(chip);
+
+	return ret;
+}
+
+/*
+ * Spare area in Hamming layouts is not protected by the ECC engine (even if
+ * it appears before the ECC bytes when reading), the ->write_oob_raw() function
+ * also stands for ->write_oob().
+ */
+static int marvell_nfc_hw_ecc_hmg_write_oob_raw(struct mtd_info *mtd,
+						struct nand_chip *chip,
+						int page)
+{
+	/* Invalidate page cache */
+	chip->pagebuf = -1;
+
+	memset(chip->data_buf, 0xFF, mtd->writesize);
+
+	return marvell_nfc_hw_ecc_hmg_do_write_page(chip, chip->data_buf,
+						    chip->oob_poi, true, page);
+}
+
+/* BCH read helpers */
+static int marvell_nfc_hw_ecc_bch_read_page_raw(struct mtd_info *mtd,
+						struct nand_chip *chip, u8 *buf,
+						int oob_required, int page)
+{
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	u8 *oob = chip->oob_poi;
+	int chunk_size = lt->data_bytes + lt->spare_bytes + lt->ecc_bytes;
+	int ecc_offset = (lt->full_chunk_cnt * lt->spare_bytes) +
+		lt->last_spare_bytes;
+	int data_len = lt->data_bytes;
+	int spare_len = lt->spare_bytes;
+	int ecc_len = lt->ecc_bytes;
+	int chunk;
+
+	if (oob_required)
+		memset(chip->oob_poi, 0xFF, mtd->oobsize);
+
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
+	for (chunk = 0; chunk < lt->nchunks; chunk++) {
+		/* Update last chunk length */
+		if (chunk >= lt->full_chunk_cnt) {
+			data_len = lt->last_data_bytes;
+			spare_len = lt->last_spare_bytes;
+			ecc_len = lt->last_ecc_bytes;
+		}
+
+		/* Read data bytes*/
+		nand_change_read_column_op(chip, chunk * chunk_size,
+					   buf + (lt->data_bytes * chunk),
+					   data_len, false);
+
+		/* Read spare bytes */
+		nand_read_data_op(chip, oob + (lt->spare_bytes * chunk),
+				  spare_len, false);
+
+		/* Read ECC bytes */
+		nand_read_data_op(chip, oob + ecc_offset +
+				  (ALIGN(lt->ecc_bytes, 32) * chunk),
+				  ecc_len, false);
+	}
+
+	return 0;
+}
+
+static void marvell_nfc_hw_ecc_bch_read_chunk(struct nand_chip *chip, int chunk,
+					      u8 *data, unsigned int data_len,
+					      u8 *spare, unsigned int spare_len,
+					      int page)
+{
+	struct marvell_nand_chip *marvell_nand = to_marvell_nand(chip);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	int i, ret;
+	struct marvell_nfc_op nfc_op = {
+		.ndcb[0] = NDCB0_CMD_TYPE(TYPE_READ) |
+			   NDCB0_ADDR_CYC(marvell_nand->addr_cyc) |
+			   NDCB0_LEN_OVRD,
+		.ndcb[1] = NDCB1_ADDRS_PAGE(page),
+		.ndcb[2] = NDCB2_ADDR5_PAGE(page),
+		.ndcb[3] = data_len + spare_len,
+	};
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return;
+
+	if (chunk == 0)
+		nfc_op.ndcb[0] |= NDCB0_DBC |
+				  NDCB0_CMD1(NAND_CMD_READ0) |
+				  NDCB0_CMD2(NAND_CMD_READSTART);
+
+	/*
+	 * Trigger the naked read operation only on the last chunk.
+	 * Otherwise, use monolithic read.
+	 */
+	if (lt->nchunks == 1 || (chunk < lt->nchunks - 1))
+		nfc_op.ndcb[0] |= NDCB0_CMD_XTYPE(XTYPE_MONOLITHIC_RW);
+	else
+		nfc_op.ndcb[0] |= NDCB0_CMD_XTYPE(XTYPE_LAST_NAKED_RW);
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+
+	/*
+	 * According to the datasheet, when reading from NDDB
+	 * with BCH enabled, after each 32 bytes reads, we
+	 * have to make sure that the NDSR.RDDREQ bit is set.
+	 *
+	 * Drain the FIFO, 8 32-bit reads at a time, and skip
+	 * the polling on the last read.
+	 *
+	 * Length is a multiple of 32 bytes, hence it is a multiple of 8 too.
+	 */
+	for (i = 0; i < data_len; i += FIFO_DEPTH * BCH_SEQ_READS) {
+		marvell_nfc_end_cmd(chip, NDSR_RDDREQ,
+				    "RDDREQ while draining FIFO (data)");
+		marvell_nfc_xfer_data_in_pio(nfc, data,
+					     FIFO_DEPTH * BCH_SEQ_READS);
+		data += FIFO_DEPTH * BCH_SEQ_READS;
+	}
+
+	for (i = 0; i < spare_len; i += FIFO_DEPTH * BCH_SEQ_READS) {
+		marvell_nfc_end_cmd(chip, NDSR_RDDREQ,
+				    "RDDREQ while draining FIFO (OOB)");
+		marvell_nfc_xfer_data_in_pio(nfc, spare,
+					     FIFO_DEPTH * BCH_SEQ_READS);
+		spare += FIFO_DEPTH * BCH_SEQ_READS;
+	}
+}
+
+static int marvell_nfc_hw_ecc_bch_read_page(struct mtd_info *mtd,
+					    struct nand_chip *chip,
+					    u8 *buf, int oob_required,
+					    int page)
+{
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	int data_len = lt->data_bytes, spare_len = lt->spare_bytes, ecc_len;
+	u8 *data = buf, *spare = chip->oob_poi, *ecc;
+	int max_bitflips = 0;
+	u32 failure_mask = 0;
+	int chunk, ecc_offset_in_page, ret;
+
+	/*
+	 * With BCH, OOB is not fully used (and thus not read entirely), not
+	 * expected bytes could show up at the end of the OOB buffer if not
+	 * explicitly erased.
+	 */
+	if (oob_required)
+		memset(chip->oob_poi, 0xFF, mtd->oobsize);
+
+	marvell_nfc_enable_hw_ecc(chip);
+
+	for (chunk = 0; chunk < lt->nchunks; chunk++) {
+		/* Update length for the last chunk */
+		if (chunk >= lt->full_chunk_cnt) {
+			data_len = lt->last_data_bytes;
+			spare_len = lt->last_spare_bytes;
+		}
+
+		/* Read the chunk and detect number of bitflips */
+		marvell_nfc_hw_ecc_bch_read_chunk(chip, chunk, data, data_len,
+						  spare, spare_len, page);
+		ret = marvell_nfc_hw_ecc_correct(chip, &max_bitflips);
+		if (ret)
+			failure_mask |= BIT(chunk);
+
+		data += data_len;
+		spare += spare_len;
+	}
+
+	marvell_nfc_disable_hw_ecc(chip);
+
+	if (!failure_mask)
+		return max_bitflips;
+
+	/*
+	 * Please note that dumping the ECC bytes during a normal read with OOB
+	 * area would add a significant overhead as ECC bytes are "consumed" by
+	 * the controller in normal mode and must be re-read in raw mode. To
+	 * avoid dropping the performances, we prefer not to include them. The
+	 * user should re-read the page in raw mode if ECC bytes are required.
+	 *
+	 * However, for any subpage read error reported by ->correct(), the ECC
+	 * bytes must be read in raw mode and the full subpage must be checked
+	 * to see if it is entirely empty of if there was an actual error.
+	 */
+	for (chunk = 0; chunk < lt->nchunks; chunk++) {
+		/* No failure reported for this chunk, move to the next one */
+		if (!(failure_mask & BIT(chunk)))
+			continue;
+
+		/* Derive ECC bytes positions (in page/buffer) and length */
+		ecc = chip->oob_poi +
+			(lt->full_chunk_cnt * lt->spare_bytes) +
+			lt->last_spare_bytes +
+			(chunk * ALIGN(lt->ecc_bytes, 32));
+		ecc_offset_in_page =
+			(chunk * (lt->data_bytes + lt->spare_bytes +
+				  lt->ecc_bytes)) +
+			(chunk < lt->full_chunk_cnt ?
+			 lt->data_bytes + lt->spare_bytes :
+			 lt->last_data_bytes + lt->last_spare_bytes);
+		ecc_len = chunk < lt->full_chunk_cnt ?
+			lt->ecc_bytes : lt->last_ecc_bytes;
+
+		/* Do the actual raw read of the ECC bytes */
+		nand_change_read_column_op(chip, ecc_offset_in_page,
+					   ecc, ecc_len, false);
+
+		/* Derive data/spare bytes positions (in buffer) and length */
+		data = buf + (chunk * lt->data_bytes);
+		data_len = chunk < lt->full_chunk_cnt ?
+			lt->data_bytes : lt->last_data_bytes;
+		spare = chip->oob_poi + (chunk * (lt->spare_bytes +
+						  lt->ecc_bytes));
+		spare_len = chunk < lt->full_chunk_cnt ?
+			lt->spare_bytes : lt->last_spare_bytes;
+
+		/* Check the entire chunk (data + spare + ecc) for emptyness */
+		marvell_nfc_check_empty_chunk(chip, data, data_len, spare,
+					      spare_len, ecc, ecc_len,
+					      &max_bitflips);
+	}
+
+	return max_bitflips;
+}
+
+static int marvell_nfc_hw_ecc_bch_read_oob_raw(struct mtd_info *mtd,
+					       struct nand_chip *chip, int page)
+{
+	/* Invalidate page cache */
+	chip->pagebuf = -1;
+
+	return chip->ecc.read_page_raw(mtd, chip, chip->data_buf, true, page);
+}
+
+static int marvell_nfc_hw_ecc_bch_read_oob(struct mtd_info *mtd,
+					   struct nand_chip *chip, int page)
+{
+	/* Invalidate page cache */
+	chip->pagebuf = -1;
+
+	return chip->ecc.read_page(mtd, chip, chip->data_buf, true, page);
+}
+
+/* BCH write helpers */
+static int marvell_nfc_hw_ecc_bch_write_page_raw(struct mtd_info *mtd,
+						 struct nand_chip *chip,
+						 const u8 *buf,
+						 int oob_required, int page)
+{
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	int full_chunk_size = lt->data_bytes + lt->spare_bytes + lt->ecc_bytes;
+	int data_len = lt->data_bytes;
+	int spare_len = lt->spare_bytes;
+	int ecc_len = lt->ecc_bytes;
+	int spare_offset = 0;
+	int ecc_offset = (lt->full_chunk_cnt * lt->spare_bytes) +
+		lt->last_spare_bytes;
+	int chunk;
+
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
+	for (chunk = 0; chunk < lt->nchunks; chunk++) {
+		if (chunk >= lt->full_chunk_cnt) {
+			data_len = lt->last_data_bytes;
+			spare_len = lt->last_spare_bytes;
+			ecc_len = lt->last_ecc_bytes;
+		}
+
+		/* Point to the column of the next chunk */
+		nand_change_write_column_op(chip, chunk * full_chunk_size,
+					    NULL, 0, false);
+
+		/* Write the data */
+		nand_write_data_op(chip, buf + (chunk * lt->data_bytes),
+				   data_len, false);
+
+		if (!oob_required)
+			continue;
+
+		/* Write the spare bytes */
+		if (spare_len)
+			nand_write_data_op(chip, chip->oob_poi + spare_offset,
+					   spare_len, false);
+
+		/* Write the ECC bytes */
+		if (ecc_len)
+			nand_write_data_op(chip, chip->oob_poi + ecc_offset,
+					   ecc_len, false);
+
+		spare_offset += spare_len;
+		ecc_offset += ALIGN(ecc_len, 32);
+	}
+
+	return nand_prog_page_end_op(chip);
+}
+
+static int
+marvell_nfc_hw_ecc_bch_write_chunk(struct nand_chip *chip, int chunk,
+				   const u8 *data, unsigned int data_len,
+				   const u8 *spare, unsigned int spare_len,
+				   int page)
+{
+	struct marvell_nand_chip *marvell_nand = to_marvell_nand(chip);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	int ret;
+	struct marvell_nfc_op nfc_op = {
+		.ndcb[0] = NDCB0_CMD_TYPE(TYPE_WRITE) | NDCB0_LEN_OVRD,
+		.ndcb[3] = data_len + spare_len,
+	};
+
+	/*
+	 * First operation dispatches the CMD_SEQIN command, issue the address
+	 * cycles and asks for the first chunk of data.
+	 * All operations in the middle (if any) will issue a naked write and
+	 * also ask for data.
+	 * Last operation (if any) asks for the last chunk of data through a
+	 * last naked write.
+	 */
+	if (chunk == 0) {
+		nfc_op.ndcb[0] |= NDCB0_CMD_XTYPE(XTYPE_WRITE_DISPATCH) |
+				  NDCB0_ADDR_CYC(marvell_nand->addr_cyc) |
+				  NDCB0_CMD1(NAND_CMD_SEQIN);
+		nfc_op.ndcb[1] |= NDCB1_ADDRS_PAGE(page);
+		nfc_op.ndcb[2] |= NDCB2_ADDR5_PAGE(page);
+	} else if (chunk < lt->nchunks - 1) {
+		nfc_op.ndcb[0] |= NDCB0_CMD_XTYPE(XTYPE_NAKED_RW);
+	} else {
+		nfc_op.ndcb[0] |= NDCB0_CMD_XTYPE(XTYPE_LAST_NAKED_RW);
+	}
+
+	/* Always dispatch the PAGEPROG command on the last chunk */
+	if (chunk == lt->nchunks - 1)
+		nfc_op.ndcb[0] |= NDCB0_CMD2(NAND_CMD_PAGEPROG) | NDCB0_DBC;
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+	ret = marvell_nfc_end_cmd(chip, NDSR_WRDREQ,
+				  "WRDREQ while loading FIFO (data)");
+	if (ret)
+		return ret;
+
+	/* Transfer the contents */
+	iowrite32_rep(nfc->regs + NDDB, data, FIFO_REP(data_len));
+	iowrite32_rep(nfc->regs + NDDB, spare, FIFO_REP(spare_len));
+
+	return 0;
+}
+
+static int marvell_nfc_hw_ecc_bch_write_page(struct mtd_info *mtd,
+					     struct nand_chip *chip,
+					     const u8 *buf,
+					     int oob_required, int page)
+{
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+	const u8 *data = buf;
+	const u8 *spare = chip->oob_poi;
+	int data_len = lt->data_bytes;
+	int spare_len = lt->spare_bytes;
+	int chunk, ret;
+
+	/* Spare data will be written anyway, so clear it to avoid garbage */
+	if (!oob_required)
+		memset(chip->oob_poi, 0xFF, mtd->oobsize);
+
+	marvell_nfc_enable_hw_ecc(chip);
+
+	for (chunk = 0; chunk < lt->nchunks; chunk++) {
+		if (chunk >= lt->full_chunk_cnt) {
+			data_len = lt->last_data_bytes;
+			spare_len = lt->last_spare_bytes;
+		}
+
+		marvell_nfc_hw_ecc_bch_write_chunk(chip, chunk, data, data_len,
+						   spare, spare_len, page);
+		data += data_len;
+		spare += spare_len;
+
+		/*
+		 * Waiting only for CMDD or PAGED is not enough, ECC are
+		 * partially written. No flag is set once the operation is
+		 * really finished but the ND_RUN bit is cleared, so wait for it
+		 * before stepping into the next command.
+		 */
+		marvell_nfc_wait_ndrun(chip);
+	}
+
+	ret = marvell_nfc_wait_op(chip,
+				  chip->data_interface.timings.sdr.tPROG_max);
+
+	marvell_nfc_disable_hw_ecc(chip);
+
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int marvell_nfc_hw_ecc_bch_write_oob_raw(struct mtd_info *mtd,
+						struct nand_chip *chip,
+						int page)
+{
+	/* Invalidate page cache */
+	chip->pagebuf = -1;
+
+	memset(chip->data_buf, 0xFF, mtd->writesize);
+
+	return chip->ecc.write_page_raw(mtd, chip, chip->data_buf, true, page);
+}
+
+static int marvell_nfc_hw_ecc_bch_write_oob(struct mtd_info *mtd,
+					    struct nand_chip *chip, int page)
+{
+	/* Invalidate page cache */
+	chip->pagebuf = -1;
+
+	memset(chip->data_buf, 0xFF, mtd->writesize);
+
+	return chip->ecc.write_page(mtd, chip, chip->data_buf, true, page);
+}
+
+/* NAND framework ->exec_op() hooks and related helpers */
+static void marvell_nfc_parse_instructions(struct nand_chip *chip,
+					   const struct nand_subop *subop,
+					   struct marvell_nfc_op *nfc_op)
+{
+	const struct nand_op_instr *instr = NULL;
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	bool first_cmd = true;
+	unsigned int op_id;
+	int i;
+
+	/* Reset the input structure as most of its fields will be OR'ed */
+	memset(nfc_op, 0, sizeof(struct marvell_nfc_op));
+
+	for (op_id = 0; op_id < subop->ninstrs; op_id++) {
+		unsigned int offset, naddrs;
+		const u8 *addrs;
+		int len = nand_subop_get_data_len(subop, op_id);
+
+		instr = &subop->instrs[op_id];
+
+		switch (instr->type) {
+		case NAND_OP_CMD_INSTR:
+			if (first_cmd)
+				nfc_op->ndcb[0] |=
+					NDCB0_CMD1(instr->ctx.cmd.opcode);
+			else
+				nfc_op->ndcb[0] |=
+					NDCB0_CMD2(instr->ctx.cmd.opcode) |
+					NDCB0_DBC;
+
+			nfc_op->cle_ale_delay_ns = instr->delay_ns;
+			first_cmd = false;
+			break;
+
+		case NAND_OP_ADDR_INSTR:
+			offset = nand_subop_get_addr_start_off(subop, op_id);
+			naddrs = nand_subop_get_num_addr_cyc(subop, op_id);
+			addrs = &instr->ctx.addr.addrs[offset];
+
+			nfc_op->ndcb[0] |= NDCB0_ADDR_CYC(naddrs);
+
+			for (i = 0; i < min_t(unsigned int, 4, naddrs); i++)
+				nfc_op->ndcb[1] |= addrs[i] << (8 * i);
+
+			if (naddrs >= 5)
+				nfc_op->ndcb[2] |= NDCB2_ADDR5_CYC(addrs[4]);
+			if (naddrs >= 6)
+				nfc_op->ndcb[3] |= NDCB3_ADDR6_CYC(addrs[5]);
+			if (naddrs == 7)
+				nfc_op->ndcb[3] |= NDCB3_ADDR7_CYC(addrs[6]);
+
+			nfc_op->cle_ale_delay_ns = instr->delay_ns;
+			break;
+
+		case NAND_OP_DATA_IN_INSTR:
+			nfc_op->data_instr = instr;
+			nfc_op->data_instr_idx = op_id;
+			nfc_op->ndcb[0] |= NDCB0_CMD_TYPE(TYPE_READ);
+			if (nfc->caps->is_nfcv2) {
+				nfc_op->ndcb[0] |=
+					NDCB0_CMD_XTYPE(XTYPE_MONOLITHIC_RW) |
+					NDCB0_LEN_OVRD;
+				nfc_op->ndcb[3] |= round_up(len, FIFO_DEPTH);
+			}
+			nfc_op->data_delay_ns = instr->delay_ns;
+			break;
+
+		case NAND_OP_DATA_OUT_INSTR:
+			nfc_op->data_instr = instr;
+			nfc_op->data_instr_idx = op_id;
+			nfc_op->ndcb[0] |= NDCB0_CMD_TYPE(TYPE_WRITE);
+			if (nfc->caps->is_nfcv2) {
+				nfc_op->ndcb[0] |=
+					NDCB0_CMD_XTYPE(XTYPE_MONOLITHIC_RW) |
+					NDCB0_LEN_OVRD;
+				nfc_op->ndcb[3] |= round_up(len, FIFO_DEPTH);
+			}
+			nfc_op->data_delay_ns = instr->delay_ns;
+			break;
+
+		case NAND_OP_WAITRDY_INSTR:
+			nfc_op->rdy_timeout_ms = instr->ctx.waitrdy.timeout_ms;
+			nfc_op->rdy_delay_ns = instr->delay_ns;
+			break;
+		}
+	}
+}
+
+static int marvell_nfc_xfer_data_pio(struct nand_chip *chip,
+				     const struct nand_subop *subop,
+				     struct marvell_nfc_op *nfc_op)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	const struct nand_op_instr *instr = nfc_op->data_instr;
+	unsigned int op_id = nfc_op->data_instr_idx;
+	unsigned int len = nand_subop_get_data_len(subop, op_id);
+	unsigned int offset = nand_subop_get_data_start_off(subop, op_id);
+	bool reading = (instr->type == NAND_OP_DATA_IN_INSTR);
+	int ret;
+
+	if (instr->ctx.data.force_8bit)
+		marvell_nfc_force_byte_access(chip, true);
+
+	if (reading) {
+		u8 *in = instr->ctx.data.buf.in + offset;
+
+		ret = marvell_nfc_xfer_data_in_pio(nfc, in, len);
+	} else {
+		const u8 *out = instr->ctx.data.buf.out + offset;
+
+		ret = marvell_nfc_xfer_data_out_pio(nfc, out, len);
+	}
+
+	if (instr->ctx.data.force_8bit)
+		marvell_nfc_force_byte_access(chip, false);
+
+	return ret;
+}
+
+static int marvell_nfc_monolithic_access_exec(struct nand_chip *chip,
+					      const struct nand_subop *subop)
+{
+	struct marvell_nfc_op nfc_op;
+	bool reading;
+	int ret;
+
+	marvell_nfc_parse_instructions(chip, subop, &nfc_op);
+	reading = (nfc_op.data_instr->type == NAND_OP_DATA_IN_INSTR);
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+	ret = marvell_nfc_end_cmd(chip, NDSR_RDDREQ | NDSR_WRDREQ,
+				  "RDDREQ/WRDREQ while draining raw data");
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.cle_ale_delay_ns);
+
+	if (reading) {
+		if (nfc_op.rdy_timeout_ms) {
+			ret = marvell_nfc_wait_op(chip, nfc_op.rdy_timeout_ms);
+			if (ret)
+				return ret;
+		}
+
+		cond_delay(nfc_op.rdy_delay_ns);
+	}
+
+	marvell_nfc_xfer_data_pio(chip, subop, &nfc_op);
+	ret = marvell_nfc_wait_cmdd(chip);
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.data_delay_ns);
+
+	if (!reading) {
+		if (nfc_op.rdy_timeout_ms) {
+			ret = marvell_nfc_wait_op(chip, nfc_op.rdy_timeout_ms);
+			if (ret)
+				return ret;
+		}
+
+		cond_delay(nfc_op.rdy_delay_ns);
+	}
+
+	/*
+	 * NDCR ND_RUN bit should be cleared automatically at the end of each
+	 * operation but experience shows that the behavior is buggy when it
+	 * comes to writes (with LEN_OVRD). Clear it by hand in this case.
+	 */
+	if (!reading) {
+		struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+
+		writel_relaxed(readl(nfc->regs + NDCR) & ~NDCR_ND_RUN,
+			       nfc->regs + NDCR);
+	}
+
+	return 0;
+}
+
+static int marvell_nfc_naked_access_exec(struct nand_chip *chip,
+					 const struct nand_subop *subop)
+{
+	struct marvell_nfc_op nfc_op;
+	int ret;
+
+	marvell_nfc_parse_instructions(chip, subop, &nfc_op);
+
+	/*
+	 * Naked access are different in that they need to be flagged as naked
+	 * by the controller. Reset the controller registers fields that inform
+	 * on the type and refill them according to the ongoing operation.
+	 */
+	nfc_op.ndcb[0] &= ~(NDCB0_CMD_TYPE(TYPE_MASK) |
+			    NDCB0_CMD_XTYPE(XTYPE_MASK));
+	switch (subop->instrs[0].type) {
+	case NAND_OP_CMD_INSTR:
+		nfc_op.ndcb[0] |= NDCB0_CMD_TYPE(TYPE_NAKED_CMD);
+		break;
+	case NAND_OP_ADDR_INSTR:
+		nfc_op.ndcb[0] |= NDCB0_CMD_TYPE(TYPE_NAKED_ADDR);
+		break;
+	case NAND_OP_DATA_IN_INSTR:
+		nfc_op.ndcb[0] |= NDCB0_CMD_TYPE(TYPE_READ) |
+				  NDCB0_CMD_XTYPE(XTYPE_LAST_NAKED_RW);
+		break;
+	case NAND_OP_DATA_OUT_INSTR:
+		nfc_op.ndcb[0] |= NDCB0_CMD_TYPE(TYPE_WRITE) |
+				  NDCB0_CMD_XTYPE(XTYPE_LAST_NAKED_RW);
+		break;
+	default:
+		/* This should never happen */
+		break;
+	}
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+
+	if (!nfc_op.data_instr) {
+		ret = marvell_nfc_wait_cmdd(chip);
+		cond_delay(nfc_op.cle_ale_delay_ns);
+		return ret;
+	}
+
+	ret = marvell_nfc_end_cmd(chip, NDSR_RDDREQ | NDSR_WRDREQ,
+				  "RDDREQ/WRDREQ while draining raw data");
+	if (ret)
+		return ret;
+
+	marvell_nfc_xfer_data_pio(chip, subop, &nfc_op);
+	ret = marvell_nfc_wait_cmdd(chip);
+	if (ret)
+		return ret;
+
+	/*
+	 * NDCR ND_RUN bit should be cleared automatically at the end of each
+	 * operation but experience shows that the behavior is buggy when it
+	 * comes to writes (with LEN_OVRD). Clear it by hand in this case.
+	 */
+	if (subop->instrs[0].type == NAND_OP_DATA_OUT_INSTR) {
+		struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+
+		writel_relaxed(readl(nfc->regs + NDCR) & ~NDCR_ND_RUN,
+			       nfc->regs + NDCR);
+	}
+
+	return 0;
+}
+
+static int marvell_nfc_naked_waitrdy_exec(struct nand_chip *chip,
+					  const struct nand_subop *subop)
+{
+	struct marvell_nfc_op nfc_op;
+	int ret;
+
+	marvell_nfc_parse_instructions(chip, subop, &nfc_op);
+
+	ret = marvell_nfc_wait_op(chip, nfc_op.rdy_timeout_ms);
+	cond_delay(nfc_op.rdy_delay_ns);
+
+	return ret;
+}
+
+static int marvell_nfc_read_id_type_exec(struct nand_chip *chip,
+					 const struct nand_subop *subop)
+{
+	struct marvell_nfc_op nfc_op;
+	int ret;
+
+	marvell_nfc_parse_instructions(chip, subop, &nfc_op);
+	nfc_op.ndcb[0] &= ~NDCB0_CMD_TYPE(TYPE_READ);
+	nfc_op.ndcb[0] |= NDCB0_CMD_TYPE(TYPE_READ_ID);
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+	ret = marvell_nfc_end_cmd(chip, NDSR_RDDREQ,
+				  "RDDREQ while reading ID");
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.cle_ale_delay_ns);
+
+	if (nfc_op.rdy_timeout_ms) {
+		ret = marvell_nfc_wait_op(chip, nfc_op.rdy_timeout_ms);
+		if (ret)
+			return ret;
+	}
+
+	cond_delay(nfc_op.rdy_delay_ns);
+
+	marvell_nfc_xfer_data_pio(chip, subop, &nfc_op);
+	ret = marvell_nfc_wait_cmdd(chip);
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.data_delay_ns);
+
+	return 0;
+}
+
+static int marvell_nfc_read_status_exec(struct nand_chip *chip,
+					const struct nand_subop *subop)
+{
+	struct marvell_nfc_op nfc_op;
+	int ret;
+
+	marvell_nfc_parse_instructions(chip, subop, &nfc_op);
+	nfc_op.ndcb[0] &= ~NDCB0_CMD_TYPE(TYPE_READ);
+	nfc_op.ndcb[0] |= NDCB0_CMD_TYPE(TYPE_STATUS);
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+	ret = marvell_nfc_end_cmd(chip, NDSR_RDDREQ,
+				  "RDDREQ while reading status");
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.cle_ale_delay_ns);
+
+	if (nfc_op.rdy_timeout_ms) {
+		ret = marvell_nfc_wait_op(chip, nfc_op.rdy_timeout_ms);
+		if (ret)
+			return ret;
+	}
+
+	cond_delay(nfc_op.rdy_delay_ns);
+
+	marvell_nfc_xfer_data_pio(chip, subop, &nfc_op);
+	ret = marvell_nfc_wait_cmdd(chip);
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.data_delay_ns);
+
+	return 0;
+}
+
+static int marvell_nfc_reset_cmd_type_exec(struct nand_chip *chip,
+					   const struct nand_subop *subop)
+{
+	struct marvell_nfc_op nfc_op;
+	int ret;
+
+	marvell_nfc_parse_instructions(chip, subop, &nfc_op);
+	nfc_op.ndcb[0] |= NDCB0_CMD_TYPE(TYPE_RESET);
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+	ret = marvell_nfc_wait_cmdd(chip);
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.cle_ale_delay_ns);
+
+	ret = marvell_nfc_wait_op(chip, nfc_op.rdy_timeout_ms);
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.rdy_delay_ns);
+
+	return 0;
+}
+
+static int marvell_nfc_erase_cmd_type_exec(struct nand_chip *chip,
+					   const struct nand_subop *subop)
+{
+	struct marvell_nfc_op nfc_op;
+	int ret;
+
+	marvell_nfc_parse_instructions(chip, subop, &nfc_op);
+	nfc_op.ndcb[0] |= NDCB0_CMD_TYPE(TYPE_ERASE);
+
+	ret = marvell_nfc_prepare_cmd(chip);
+	if (ret)
+		return ret;
+
+	marvell_nfc_send_cmd(chip, &nfc_op);
+	ret = marvell_nfc_wait_cmdd(chip);
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.cle_ale_delay_ns);
+
+	ret = marvell_nfc_wait_op(chip, nfc_op.rdy_timeout_ms);
+	if (ret)
+		return ret;
+
+	cond_delay(nfc_op.rdy_delay_ns);
+
+	return 0;
+}
+
+static const struct nand_op_parser marvell_nfcv2_op_parser = NAND_OP_PARSER(
+	/* Monolithic reads/writes */
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_monolithic_access_exec,
+		NAND_OP_PARSER_PAT_CMD_ELEM(false),
+		NAND_OP_PARSER_PAT_ADDR_ELEM(true, MAX_ADDRESS_CYC_NFCV2),
+		NAND_OP_PARSER_PAT_CMD_ELEM(true),
+		NAND_OP_PARSER_PAT_WAITRDY_ELEM(true),
+		NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, MAX_CHUNK_SIZE)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_monolithic_access_exec,
+		NAND_OP_PARSER_PAT_CMD_ELEM(false),
+		NAND_OP_PARSER_PAT_ADDR_ELEM(false, MAX_ADDRESS_CYC_NFCV2),
+		NAND_OP_PARSER_PAT_DATA_OUT_ELEM(false, MAX_CHUNK_SIZE),
+		NAND_OP_PARSER_PAT_CMD_ELEM(true),
+		NAND_OP_PARSER_PAT_WAITRDY_ELEM(true)),
+	/* Naked commands */
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_naked_access_exec,
+		NAND_OP_PARSER_PAT_CMD_ELEM(false)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_naked_access_exec,
+		NAND_OP_PARSER_PAT_ADDR_ELEM(false, MAX_ADDRESS_CYC_NFCV2)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_naked_access_exec,
+		NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, MAX_CHUNK_SIZE)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_naked_access_exec,
+		NAND_OP_PARSER_PAT_DATA_OUT_ELEM(false, MAX_CHUNK_SIZE)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_naked_waitrdy_exec,
+		NAND_OP_PARSER_PAT_WAITRDY_ELEM(false)),
+	);
+
+static const struct nand_op_parser marvell_nfcv1_op_parser = NAND_OP_PARSER(
+	/* Naked commands not supported, use a function for each pattern */
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_read_id_type_exec,
+		NAND_OP_PARSER_PAT_CMD_ELEM(false),
+		NAND_OP_PARSER_PAT_ADDR_ELEM(false, MAX_ADDRESS_CYC_NFCV1),
+		NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, 8)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_erase_cmd_type_exec,
+		NAND_OP_PARSER_PAT_CMD_ELEM(false),
+		NAND_OP_PARSER_PAT_ADDR_ELEM(false, MAX_ADDRESS_CYC_NFCV1),
+		NAND_OP_PARSER_PAT_CMD_ELEM(false),
+		NAND_OP_PARSER_PAT_WAITRDY_ELEM(false)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_read_status_exec,
+		NAND_OP_PARSER_PAT_CMD_ELEM(false),
+		NAND_OP_PARSER_PAT_DATA_IN_ELEM(false, 1)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_reset_cmd_type_exec,
+		NAND_OP_PARSER_PAT_CMD_ELEM(false),
+		NAND_OP_PARSER_PAT_WAITRDY_ELEM(false)),
+	NAND_OP_PARSER_PATTERN(
+		marvell_nfc_naked_waitrdy_exec,
+		NAND_OP_PARSER_PAT_WAITRDY_ELEM(false)),
+	);
+
+static int marvell_nfc_exec_op(struct nand_chip *chip,
+			       const struct nand_operation *op,
+			       bool check_only)
+{
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+
+	if (nfc->caps->is_nfcv2)
+		return nand_op_parser_exec_op(chip, &marvell_nfcv2_op_parser,
+					      op, check_only);
+	else
+		return nand_op_parser_exec_op(chip, &marvell_nfcv1_op_parser,
+					      op, check_only);
+}
+
+/*
+ * Layouts were broken in old pxa3xx_nand driver, these are supposed to be
+ * usable.
+ */
+static int marvell_nand_ooblayout_ecc(struct mtd_info *mtd, int section,
+				      struct mtd_oob_region *oobregion)
+{
+	struct nand_chip *chip = mtd_to_nand(mtd);
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+
+	if (section)
+		return -ERANGE;
+
+	oobregion->length = (lt->full_chunk_cnt * lt->ecc_bytes) +
+			    lt->last_ecc_bytes;
+	oobregion->offset = mtd->oobsize - oobregion->length;
+
+	return 0;
+}
+
+static int marvell_nand_ooblayout_free(struct mtd_info *mtd, int section,
+				       struct mtd_oob_region *oobregion)
+{
+	struct nand_chip *chip = mtd_to_nand(mtd);
+	const struct marvell_hw_ecc_layout *lt = to_marvell_nand(chip)->layout;
+
+	if (section)
+		return -ERANGE;
+
+	/*
+	 * Bootrom looks in bytes 0 & 5 for bad blocks for the
+	 * 4KB page / 4bit BCH combination.
+	 */
+	if (mtd->writesize == SZ_4K && lt->data_bytes == SZ_2K)
+		oobregion->offset = 6;
+	else
+		oobregion->offset = 2;
+
+	oobregion->length = (lt->full_chunk_cnt * lt->spare_bytes) +
+			    lt->last_spare_bytes - oobregion->offset;
+
+	return 0;
+}
+
+static const struct mtd_ooblayout_ops marvell_nand_ooblayout_ops = {
+	.ecc = marvell_nand_ooblayout_ecc,
+	.free = marvell_nand_ooblayout_free,
+};
+
+static int marvell_nand_hw_ecc_ctrl_init(struct mtd_info *mtd,
+					 struct nand_ecc_ctrl *ecc)
+{
+	struct nand_chip *chip = mtd_to_nand(mtd);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	const struct marvell_hw_ecc_layout *l;
+	int i;
+
+	if (!nfc->caps->is_nfcv2 &&
+	    (mtd->writesize + mtd->oobsize > MAX_CHUNK_SIZE)) {
+		dev_err(nfc->dev,
+			"NFCv1: writesize (%d) cannot be bigger than a chunk (%d)\n",
+			mtd->writesize, MAX_CHUNK_SIZE - mtd->oobsize);
+		return -ENOTSUPP;
+	}
+
+	to_marvell_nand(chip)->layout = NULL;
+	for (i = 0; i < ARRAY_SIZE(marvell_nfc_layouts); i++) {
+		l = &marvell_nfc_layouts[i];
+		if (mtd->writesize == l->writesize &&
+		    ecc->size == l->chunk && ecc->strength == l->strength) {
+			to_marvell_nand(chip)->layout = l;
+			break;
+		}
+	}
+
+	if (!to_marvell_nand(chip)->layout ||
+	    (!nfc->caps->is_nfcv2 && ecc->strength > 1)) {
+		dev_err(nfc->dev,
+			"ECC strength %d at page size %d is not supported\n",
+			ecc->strength, mtd->writesize);
+		return -ENOTSUPP;
+	}
+
+	mtd_set_ooblayout(mtd, &marvell_nand_ooblayout_ops);
+	ecc->steps = l->nchunks;
+	ecc->size = l->data_bytes;
+
+	if (ecc->strength == 1) {
+		chip->ecc.algo = NAND_ECC_HAMMING;
+		ecc->read_page_raw = marvell_nfc_hw_ecc_hmg_read_page_raw;
+		ecc->read_page = marvell_nfc_hw_ecc_hmg_read_page;
+		ecc->read_oob_raw = marvell_nfc_hw_ecc_hmg_read_oob_raw;
+		ecc->read_oob = ecc->read_oob_raw;
+		ecc->write_page_raw = marvell_nfc_hw_ecc_hmg_write_page_raw;
+		ecc->write_page = marvell_nfc_hw_ecc_hmg_write_page;
+		ecc->write_oob_raw = marvell_nfc_hw_ecc_hmg_write_oob_raw;
+		ecc->write_oob = ecc->write_oob_raw;
+	} else {
+		chip->ecc.algo = NAND_ECC_BCH;
+		ecc->strength = 16;
+		ecc->read_page_raw = marvell_nfc_hw_ecc_bch_read_page_raw;
+		ecc->read_page = marvell_nfc_hw_ecc_bch_read_page;
+		ecc->read_oob_raw = marvell_nfc_hw_ecc_bch_read_oob_raw;
+		ecc->read_oob = marvell_nfc_hw_ecc_bch_read_oob;
+		ecc->write_page_raw = marvell_nfc_hw_ecc_bch_write_page_raw;
+		ecc->write_page = marvell_nfc_hw_ecc_bch_write_page;
+		ecc->write_oob_raw = marvell_nfc_hw_ecc_bch_write_oob_raw;
+		ecc->write_oob = marvell_nfc_hw_ecc_bch_write_oob;
+	}
+
+	return 0;
+}
+
+static int marvell_nand_ecc_init(struct mtd_info *mtd,
+				 struct nand_ecc_ctrl *ecc)
+{
+	struct nand_chip *chip = mtd_to_nand(mtd);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	int ret;
+
+	if (ecc->mode != NAND_ECC_NONE && (!ecc->size || !ecc->strength)) {
+		if (chip->ecc_step_ds && chip->ecc_strength_ds) {
+			ecc->size = chip->ecc_step_ds;
+			ecc->strength = chip->ecc_strength_ds;
+		} else {
+			dev_info(nfc->dev,
+				 "No minimum ECC strength, using 1b/512B\n");
+			ecc->size = 512;
+			ecc->strength = 1;
+		}
+	}
+
+	switch (ecc->mode) {
+	case NAND_ECC_HW:
+		ret = marvell_nand_hw_ecc_ctrl_init(mtd, ecc);
+		if (ret)
+			return ret;
+		break;
+	case NAND_ECC_NONE:
+	case NAND_ECC_SOFT:
+		if (!nfc->caps->is_nfcv2 && mtd->writesize != SZ_512 &&
+		    mtd->writesize != SZ_2K) {
+			dev_err(nfc->dev, "NFCv1 cannot write %d bytes pages\n",
+				mtd->writesize);
+			return -EINVAL;
+		}
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static u8 bbt_pattern[] = {'M', 'V', 'B', 'b', 't', '0' };
+static u8 bbt_mirror_pattern[] = {'1', 't', 'b', 'B', 'V', 'M' };
+
+static struct nand_bbt_descr bbt_main_descr = {
+	.options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE |
+		   NAND_BBT_2BIT | NAND_BBT_VERSION,
+	.offs =	8,
+	.len = 6,
+	.veroffs = 14,
+	.maxblocks = 8,	/* Last 8 blocks in each chip */
+	.pattern = bbt_pattern
+};
+
+static struct nand_bbt_descr bbt_mirror_descr = {
+	.options = NAND_BBT_LASTBLOCK | NAND_BBT_CREATE | NAND_BBT_WRITE |
+		   NAND_BBT_2BIT | NAND_BBT_VERSION,
+	.offs =	8,
+	.len = 6,
+	.veroffs = 14,
+	.maxblocks = 8,	/* Last 8 blocks in each chip */
+	.pattern = bbt_mirror_pattern
+};
+
+static int marvell_nfc_setup_data_interface(struct mtd_info *mtd, int chipnr,
+					    const struct nand_data_interface
+					    *conf)
+{
+	struct nand_chip *chip = mtd_to_nand(mtd);
+	struct marvell_nand_chip *marvell_nand = to_marvell_nand(chip);
+	struct marvell_nfc *nfc = to_marvell_nfc(chip->controller);
+	unsigned int period_ns = 1000000000 / clk_get_rate(nfc->ecc_clk) * 2;
+	const struct nand_sdr_timings *sdr;
+	struct marvell_nfc_timings nfc_tmg;
+	int read_delay;
+
+	sdr = nand_get_sdr_timings(conf);
+	if (IS_ERR(sdr))
+		return PTR_ERR(sdr);
+
+	/*
+	 * SDR timings are given in pico-seconds while NFC timings must be
+	 * expressed in NAND controller clock cycles, which is half of the
+	 * frequency of the accessible ECC clock retrieved by clk_get_rate().
+	 * This is not written anywhere in the datasheet but was observed
+	 * with an oscilloscope.
+	 *
+	 * NFC datasheet gives equations from which thoses calculations
+	 * are derived, they tend to be slightly more restrictives than the
+	 * given core timings and may improve the overall speed.
+	 */
+	nfc_tmg.tRP = TO_CYCLES(DIV_ROUND_UP(sdr->tRC_min, 2), period_ns) - 1;
+	nfc_tmg.tRH = nfc_tmg.tRP;
+	nfc_tmg.tWP = TO_CYCLES(DIV_ROUND_UP(sdr->tWC_min, 2), period_ns) - 1;
+	nfc_tmg.tWH = nfc_tmg.tWP;
+	nfc_tmg.tCS = TO_CYCLES(sdr->tCS_min, period_ns);
+	nfc_tmg.tCH = TO_CYCLES(sdr->tCH_min, period_ns) - 1;
+	nfc_tmg.tADL = TO_CYCLES(sdr->tADL_min, period_ns);
+	/*
+	 * Read delay is the time of propagation from SoC pins to NFC internal
+	 * logic. With non-EDO timings, this is MIN_RD_DEL_CNT clock cycles. In
+	 * EDO mode, an additional delay of tRH must be taken into account so
+	 * the data is sampled on the falling edge instead of the rising edge.
+	 */
+	read_delay = sdr->tRC_min >= 30000 ?
+		MIN_RD_DEL_CNT : MIN_RD_DEL_CNT + nfc_tmg.tRH;
+
+	nfc_tmg.tAR = TO_CYCLES(sdr->tAR_min, period_ns);
+	/*
+	 * tWHR and tRHW are supposed to be read to write delays (and vice
+	 * versa) but in some cases, ie. when doing a change column, they must
+	 * be greater than that to be sure tCCS delay is respected.
+	 */
+	nfc_tmg.tWHR = TO_CYCLES(max_t(int, sdr->tWHR_min, sdr->tCCS_min),
+				 period_ns) - 2,
+	nfc_tmg.tRHW = TO_CYCLES(max_t(int, sdr->tRHW_min, sdr->tCCS_min),
+				 period_ns);
+
+	/* Use WAIT_MODE (wait for RB line) instead of only relying on delays */
+	nfc_tmg.tR = TO_CYCLES(sdr->tWB_max, period_ns);
+
+	if (chipnr < 0)
+		return 0;
+
+	marvell_nand->ndtr0 =
+		NDTR0_TRP(nfc_tmg.tRP) |
+		NDTR0_TRH(nfc_tmg.tRH) |
+		NDTR0_ETRP(nfc_tmg.tRP) |
+		NDTR0_TWP(nfc_tmg.tWP) |
+		NDTR0_TWH(nfc_tmg.tWH) |
+		NDTR0_TCS(nfc_tmg.tCS) |
+		NDTR0_TCH(nfc_tmg.tCH) |
+		NDTR0_RD_CNT_DEL(read_delay) |
+		NDTR0_SELCNTR |
+		NDTR0_TADL(nfc_tmg.tADL);
+
+	marvell_nand->ndtr1 =
+		NDTR1_TAR(nfc_tmg.tAR) |
+		NDTR1_TWHR(nfc_tmg.tWHR) |
+		NDTR1_TRHW(nfc_tmg.tRHW) |
+		NDTR1_WAIT_MODE |
+		NDTR1_TR(nfc_tmg.tR);
+
+	return 0;
+}
+
+static int marvell_nand_chip_init(struct device *dev, struct marvell_nfc *nfc,
+				  struct device_node *np)
+{
+	struct pxa3xx_nand_platform_data *pdata = dev_get_platdata(dev);
+	struct marvell_nand_chip *marvell_nand;
+	struct mtd_info *mtd;
+	struct nand_chip *chip;
+	int nsels, ret, i;
+	u32 cs, rb;
+
+	/*
+	 * The legacy "num-cs" property indicates the number of CS on the only
+	 * chip connected to the controller (legacy bindings does not support
+	 * more than one chip). CS are only incremented one by one while the RB
+	 * pin is always the #0.
+	 *
+	 * When not using legacy bindings, a couple of "reg" and "nand-rb"
+	 * properties must be filled. For each chip, expressed as a subnode,
+	 * "reg" points to the CS lines and "nand-rb" to the RB line.
+	 */
+	if (pdata) {
+		nsels = 1;
+	} else if (nfc->caps->legacy_of_bindings &&
+		   !of_get_property(np, "num-cs", &nsels)) {
+		dev_err(dev, "missing num-cs property\n");
+		return -EINVAL;
+	} else if (!of_get_property(np, "reg", &nsels)) {
+		dev_err(dev, "missing reg property\n");
+		return -EINVAL;
+	}
+
+	if (!pdata)
+		nsels /= sizeof(u32);
+	if (!nsels) {
+		dev_err(dev, "invalid reg property size\n");
+		return -EINVAL;
+	}
+
+	/* Alloc the nand chip structure */
+	marvell_nand = devm_kzalloc(dev, sizeof(*marvell_nand) +
+				    (nsels *
+				     sizeof(struct marvell_nand_chip_sel)),
+				    GFP_KERNEL);
+	if (!marvell_nand) {
+		dev_err(dev, "could not allocate chip structure\n");
+		return -ENOMEM;
+	}
+
+	marvell_nand->nsels = nsels;
+	marvell_nand->selected_die = -1;
+
+	for (i = 0; i < nsels; i++) {
+		if (pdata || nfc->caps->legacy_of_bindings) {
+			/*
+			 * Legacy bindings use the CS lines in natural
+			 * order (0, 1, ...)
+			 */
+			cs = i;
+		} else {
+			/* Retrieve CS id */
+			ret = of_property_read_u32_index(np, "reg", i, &cs);
+			if (ret) {
+				dev_err(dev, "could not retrieve reg property: %d\n",
+					ret);
+				return ret;
+			}
+		}
+
+		if (cs >= nfc->caps->max_cs_nb) {
+			dev_err(dev, "invalid reg value: %u (max CS = %d)\n",
+				cs, nfc->caps->max_cs_nb);
+			return -EINVAL;
+		}
+
+		if (test_and_set_bit(cs, &nfc->assigned_cs)) {
+			dev_err(dev, "CS %d already assigned\n", cs);
+			return -EINVAL;
+		}
+
+		/*
+		 * The cs variable represents the chip select id, which must be
+		 * converted in bit fields for NDCB0 and NDCB2 to select the
+		 * right chip. Unfortunately, due to a lack of information on
+		 * the subject and incoherent documentation, the user should not
+		 * use CS1 and CS3 at all as asserting them is not supported in
+		 * a reliable way (due to multiplexing inside ADDR5 field).
+		 */
+		marvell_nand->sels[i].cs = cs;
+		switch (cs) {
+		case 0:
+		case 2:
+			marvell_nand->sels[i].ndcb0_csel = 0;
+			break;
+		case 1:
+		case 3:
+			marvell_nand->sels[i].ndcb0_csel = NDCB0_CSEL;
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		/* Retrieve RB id */
+		if (pdata || nfc->caps->legacy_of_bindings) {
+			/* Legacy bindings always use RB #0 */
+			rb = 0;
+		} else {
+			ret = of_property_read_u32_index(np, "nand-rb", i,
+							 &rb);
+			if (ret) {
+				dev_err(dev,
+					"could not retrieve RB property: %d\n",
+					ret);
+				return ret;
+			}
+		}
+
+		if (rb >= nfc->caps->max_rb_nb) {
+			dev_err(dev, "invalid reg value: %u (max RB = %d)\n",
+				rb, nfc->caps->max_rb_nb);
+			return -EINVAL;
+		}
+
+		marvell_nand->sels[i].rb = rb;
+	}
+
+	chip = &marvell_nand->chip;
+	chip->controller = &nfc->controller;
+	nand_set_flash_node(chip, np);
+
+	chip->exec_op = marvell_nfc_exec_op;
+	chip->select_chip = marvell_nfc_select_chip;
+	if (nfc->caps->is_nfcv2 &&
+	    !of_property_read_bool(np, "marvell,nand-keep-config"))
+		chip->setup_data_interface = marvell_nfc_setup_data_interface;
+
+	mtd = nand_to_mtd(chip);
+	mtd->dev.parent = dev;
+
+	/*
+	 * Default to HW ECC engine mode. If the nand-ecc-mode property is given
+	 * in the DT node, this entry will be overwritten in nand_scan_ident().
+	 */
+	chip->ecc.mode = NAND_ECC_HW;
+
+	/*
+	 * Save a reference value for timing registers before
+	 * ->setup_data_interface() is called.
+	 */
+	marvell_nand->ndtr0 = readl_relaxed(nfc->regs + NDTR0);
+	marvell_nand->ndtr1 = readl_relaxed(nfc->regs + NDTR1);
+
+	chip->options |= NAND_BUSWIDTH_AUTO;
+	ret = nand_scan_ident(mtd, marvell_nand->nsels, NULL);
+	if (ret) {
+		dev_err(dev, "could not identify the nand chip\n");
+		return ret;
+	}
+
+	if (pdata && pdata->flash_bbt)
+		chip->bbt_options |= NAND_BBT_USE_FLASH;
+
+	if (chip->bbt_options & NAND_BBT_USE_FLASH) {
+		/*
+		 * We'll use a bad block table stored in-flash and don't
+		 * allow writing the bad block marker to the flash.
+		 */
+		chip->bbt_options |= NAND_BBT_NO_OOB_BBM;
+		chip->bbt_td = &bbt_main_descr;
+		chip->bbt_md = &bbt_mirror_descr;
+	}
+
+	/* Save the chip-specific fields of NDCR */
+	marvell_nand->ndcr = NDCR_PAGE_SZ(mtd->writesize);
+	if (chip->options & NAND_BUSWIDTH_16)
+		marvell_nand->ndcr |= NDCR_DWIDTH_M | NDCR_DWIDTH_C;
+
+	/*
+	 * On small page NANDs, only one cycle is needed to pass the
+	 * column address.
+	 */
+	if (mtd->writesize <= 512) {
+		marvell_nand->addr_cyc = 1;
+	} else {
+		marvell_nand->addr_cyc = 2;
+		marvell_nand->ndcr |= NDCR_RA_START;
+	}
+
+	/*
+	 * Now add the number of cycles needed to pass the row
+	 * address.
+	 *
+	 * Addressing a chip using CS 2 or 3 should also need the third row
+	 * cycle but due to inconsistance in the documentation and lack of
+	 * hardware to test this situation, this case is not supported.
+	 */
+	if (chip->options & NAND_ROW_ADDR_3)
+		marvell_nand->addr_cyc += 3;
+	else
+		marvell_nand->addr_cyc += 2;
+
+	if (pdata) {
+		chip->ecc.size = pdata->ecc_step_size;
+		chip->ecc.strength = pdata->ecc_strength;
+	}
+
+	ret = marvell_nand_ecc_init(mtd, &chip->ecc);
+	if (ret) {
+		dev_err(dev, "ECC init failed: %d\n", ret);
+		return ret;
+	}
+
+	if (chip->ecc.mode == NAND_ECC_HW) {
+		/*
+		 * Subpage write not available with hardware ECC, prohibit also
+		 * subpage read as in userspace subpage access would still be
+		 * allowed and subpage write, if used, would lead to numerous
+		 * uncorrectable ECC errors.
+		 */
+		chip->options |= NAND_NO_SUBPAGE_WRITE;
+	}
+
+	if (pdata || nfc->caps->legacy_of_bindings) {
+		/*
+		 * We keep the MTD name unchanged to avoid breaking platforms
+		 * where the MTD cmdline parser is used and the bootloader
+		 * has not been updated to use the new naming scheme.
+		 */
+		mtd->name = "pxa3xx_nand-0";
+	} else if (!mtd->name) {
+		/*
+		 * If the new bindings are used and the bootloader has not been
+		 * updated to pass a new mtdparts parameter on the cmdline, you
+		 * should define the following property in your NAND node, ie:
+		 *
+		 *	label = "main-storage";
+		 *
+		 * This way, mtd->name will be set by the core when
+		 * nand_set_flash_node() is called.
+		 */
+		mtd->name = devm_kasprintf(nfc->dev, GFP_KERNEL,
+					   "%s:nand.%d", dev_name(nfc->dev),
+					   marvell_nand->sels[0].cs);
+		if (!mtd->name) {
+			dev_err(nfc->dev, "Failed to allocate mtd->name\n");
+			return -ENOMEM;
+		}
+	}
+
+	ret = nand_scan_tail(mtd);
+	if (ret) {
+		dev_err(dev, "nand_scan_tail failed: %d\n", ret);
+		return ret;
+	}
+
+	if (pdata)
+		/* Legacy bindings support only one chip */
+		ret = mtd_device_register(mtd, pdata->parts[0],
+					  pdata->nr_parts[0]);
+	else
+		ret = mtd_device_register(mtd, NULL, 0);
+	if (ret) {
+		dev_err(dev, "failed to register mtd device: %d\n", ret);
+		nand_release(mtd);
+		return ret;
+	}
+
+	list_add_tail(&marvell_nand->node, &nfc->chips);
+
+	return 0;
+}
+
+static int marvell_nand_chips_init(struct device *dev, struct marvell_nfc *nfc)
+{
+	struct device_node *np = dev->of_node;
+	struct device_node *nand_np;
+	int max_cs = nfc->caps->max_cs_nb;
+	int nchips;
+	int ret;
+
+	if (!np)
+		nchips = 1;
+	else
+		nchips = of_get_child_count(np);
+
+	if (nchips > max_cs) {
+		dev_err(dev, "too many NAND chips: %d (max = %d CS)\n", nchips,
+			max_cs);
+		return -EINVAL;
+	}
+
+	/*
+	 * Legacy bindings do not use child nodes to exhibit NAND chip
+	 * properties and layout. Instead, NAND properties are mixed with the
+	 * controller ones, and partitions are defined as direct subnodes of the
+	 * NAND controller node.
+	 */
+	if (nfc->caps->legacy_of_bindings) {
+		ret = marvell_nand_chip_init(dev, nfc, np);
+		return ret;
+	}
+
+	for_each_child_of_node(np, nand_np) {
+		ret = marvell_nand_chip_init(dev, nfc, nand_np);
+		if (ret) {
+			of_node_put(nand_np);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static void marvell_nand_chips_cleanup(struct marvell_nfc *nfc)
+{
+	struct marvell_nand_chip *entry, *temp;
+
+	list_for_each_entry_safe(entry, temp, &nfc->chips, node) {
+		nand_release(nand_to_mtd(&entry->chip));
+		list_del(&entry->node);
+	}
+}
+
+static int marvell_nfc_init_dma(struct marvell_nfc *nfc)
+{
+	struct platform_device *pdev = container_of(nfc->dev,
+						    struct platform_device,
+						    dev);
+	struct dma_slave_config config = {};
+	struct resource *r;
+	dma_cap_mask_t mask;
+	struct pxad_param param;
+	int ret;
+
+	if (!IS_ENABLED(CONFIG_PXA_DMA)) {
+		dev_warn(nfc->dev,
+			 "DMA not enabled in configuration\n");
+		return -ENOTSUPP;
+	}
+
+	ret = dma_set_mask_and_coherent(nfc->dev, DMA_BIT_MASK(32));
+	if (ret)
+		return ret;
+
+	r = platform_get_resource(pdev, IORESOURCE_DMA, 0);
+	if (!r) {
+		dev_err(nfc->dev, "No resource defined for data DMA\n");
+		return -ENXIO;
+	}
+
+	param.drcmr = r->start;
+	param.prio = PXAD_PRIO_LOWEST;
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_SLAVE, mask);
+	nfc->dma_chan =
+		dma_request_slave_channel_compat(mask, pxad_filter_fn,
+						 &param, nfc->dev,
+						 "data");
+	if (!nfc->dma_chan) {
+		dev_err(nfc->dev,
+			"Unable to request data DMA channel\n");
+		return -ENODEV;
+	}
+
+	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!r)
+		return -ENXIO;
+
+	config.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+	config.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+	config.src_addr = r->start + NDDB;
+	config.dst_addr = r->start + NDDB;
+	config.src_maxburst = 32;
+	config.dst_maxburst = 32;
+	ret = dmaengine_slave_config(nfc->dma_chan, &config);
+	if (ret < 0) {
+		dev_err(nfc->dev, "Failed to configure DMA channel\n");
+		return ret;
+	}
+
+	/*
+	 * DMA must act on length multiple of 32 and this length may be
+	 * bigger than the destination buffer. Use this buffer instead
+	 * for DMA transfers and then copy the desired amount of data to
+	 * the provided buffer.
+	 */
+	nfc->dma_buf = kmalloc(MAX_CHUNK_SIZE, GFP_KERNEL | GFP_DMA);
+	if (!nfc->dma_buf)
+		return -ENOMEM;
+
+	nfc->use_dma = true;
+
+	return 0;
+}
+
+static int marvell_nfc_init(struct marvell_nfc *nfc)
+{
+	struct device_node *np = nfc->dev->of_node;
+
+	/*
+	 * Some SoCs like A7k/A8k need to enable manually the NAND
+	 * controller, gated clocks and reset bits to avoid being bootloader
+	 * dependent. This is done through the use of the System Functions
+	 * registers.
+	 */
+	if (nfc->caps->need_system_controller) {
+		struct regmap *sysctrl_base =
+			syscon_regmap_lookup_by_phandle(np,
+							"marvell,system-controller");
+		u32 reg;
+
+		if (IS_ERR(sysctrl_base))
+			return PTR_ERR(sysctrl_base);
+
+		reg = GENCONF_SOC_DEVICE_MUX_NFC_EN |
+		      GENCONF_SOC_DEVICE_MUX_ECC_CLK_RST |
+		      GENCONF_SOC_DEVICE_MUX_ECC_CORE_RST |
+		      GENCONF_SOC_DEVICE_MUX_NFC_INT_EN;
+		regmap_write(sysctrl_base, GENCONF_SOC_DEVICE_MUX, reg);
+
+		regmap_read(sysctrl_base, GENCONF_CLK_GATING_CTRL, &reg);
+		reg |= GENCONF_CLK_GATING_CTRL_ND_GATE;
+		regmap_write(sysctrl_base, GENCONF_CLK_GATING_CTRL, reg);
+
+		regmap_read(sysctrl_base, GENCONF_ND_CLK_CTRL, &reg);
+		reg |= GENCONF_ND_CLK_CTRL_EN;
+		regmap_write(sysctrl_base, GENCONF_ND_CLK_CTRL, reg);
+	}
+
+	/* Configure the DMA if appropriate */
+	if (!nfc->caps->is_nfcv2)
+		marvell_nfc_init_dma(nfc);
+
+	/*
+	 * ECC operations and interruptions are only enabled when specifically
+	 * needed. ECC shall not be activated in the early stages (fails probe).
+	 * Arbiter flag, even if marked as "reserved", must be set (empirical).
+	 * SPARE_EN bit must always be set or ECC bytes will not be at the same
+	 * offset in the read page and this will fail the protection.
+	 */
+	writel_relaxed(NDCR_ALL_INT | NDCR_ND_ARB_EN | NDCR_SPARE_EN |
+		       NDCR_RD_ID_CNT(NFCV1_READID_LEN), nfc->regs + NDCR);
+	writel_relaxed(0xFFFFFFFF, nfc->regs + NDSR);
+	writel_relaxed(0, nfc->regs + NDECCCTRL);
+
+	return 0;
+}
+
+static int marvell_nfc_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct resource *r;
+	struct marvell_nfc *nfc;
+	int ret;
+	int irq;
+
+	nfc = devm_kzalloc(&pdev->dev, sizeof(struct marvell_nfc),
+			   GFP_KERNEL);
+	if (!nfc)
+		return -ENOMEM;
+
+	nfc->dev = dev;
+	nand_hw_control_init(&nfc->controller);
+	INIT_LIST_HEAD(&nfc->chips);
+
+	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	nfc->regs = devm_ioremap_resource(dev, r);
+	if (IS_ERR(nfc->regs))
+		return PTR_ERR(nfc->regs);
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_err(dev, "failed to retrieve irq\n");
+		return irq;
+	}
+
+	nfc->ecc_clk = devm_clk_get(&pdev->dev, NULL);
+	if (IS_ERR(nfc->ecc_clk))
+		return PTR_ERR(nfc->ecc_clk);
+
+	ret = clk_prepare_enable(nfc->ecc_clk);
+	if (ret)
+		return ret;
+
+	marvell_nfc_disable_int(nfc, NDCR_ALL_INT);
+	marvell_nfc_clear_int(nfc, NDCR_ALL_INT);
+	ret = devm_request_irq(dev, irq, marvell_nfc_isr,
+			       0, "marvell-nfc", nfc);
+	if (ret)
+		goto unprepare_clk;
+
+	/* Get NAND controller capabilities */
+	if (pdev->id_entry)
+		nfc->caps = (void *)pdev->id_entry->driver_data;
+	else
+		nfc->caps = of_device_get_match_data(&pdev->dev);
+
+	if (!nfc->caps) {
+		dev_err(dev, "Could not retrieve NFC caps\n");
+		ret = -EINVAL;
+		goto unprepare_clk;
+	}
+
+	/* Init the controller and then probe the chips */
+	ret = marvell_nfc_init(nfc);
+	if (ret)
+		goto unprepare_clk;
+
+	platform_set_drvdata(pdev, nfc);
+
+	ret = marvell_nand_chips_init(dev, nfc);
+	if (ret)
+		goto unprepare_clk;
+
+	return 0;
+
+unprepare_clk:
+	clk_disable_unprepare(nfc->ecc_clk);
+
+	return ret;
+}
+
+static int marvell_nfc_remove(struct platform_device *pdev)
+{
+	struct marvell_nfc *nfc = platform_get_drvdata(pdev);
+
+	marvell_nand_chips_cleanup(nfc);
+
+	if (nfc->use_dma) {
+		dmaengine_terminate_all(nfc->dma_chan);
+		dma_release_channel(nfc->dma_chan);
+	}
+
+	clk_disable_unprepare(nfc->ecc_clk);
+
+	return 0;
+}
+
+static const struct marvell_nfc_caps marvell_armada_8k_nfc_caps = {
+	.max_cs_nb = 4,
+	.max_rb_nb = 2,
+	.need_system_controller = true,
+	.is_nfcv2 = true,
+};
+
+static const struct marvell_nfc_caps marvell_armada370_nfc_caps = {
+	.max_cs_nb = 4,
+	.max_rb_nb = 2,
+	.is_nfcv2 = true,
+};
+
+static const struct marvell_nfc_caps marvell_pxa3xx_nfc_caps = {
+	.max_cs_nb = 2,
+	.max_rb_nb = 1,
+	.use_dma = true,
+};
+
+static const struct marvell_nfc_caps marvell_armada_8k_nfc_legacy_caps = {
+	.max_cs_nb = 4,
+	.max_rb_nb = 2,
+	.need_system_controller = true,
+	.legacy_of_bindings = true,
+	.is_nfcv2 = true,
+};
+
+static const struct marvell_nfc_caps marvell_armada370_nfc_legacy_caps = {
+	.max_cs_nb = 4,
+	.max_rb_nb = 2,
+	.legacy_of_bindings = true,
+	.is_nfcv2 = true,
+};
+
+static const struct marvell_nfc_caps marvell_pxa3xx_nfc_legacy_caps = {
+	.max_cs_nb = 2,
+	.max_rb_nb = 1,
+	.legacy_of_bindings = true,
+	.use_dma = true,
+};
+
+static const struct platform_device_id marvell_nfc_platform_ids[] = {
+	{
+		.name = "pxa3xx-nand",
+		.driver_data = (kernel_ulong_t)&marvell_pxa3xx_nfc_legacy_caps,
+	},
+	{ /* sentinel */ },
+};
+MODULE_DEVICE_TABLE(platform, marvell_nfc_platform_ids);
+
+static const struct of_device_id marvell_nfc_of_ids[] = {
+	{
+		.compatible = "marvell,armada-8k-nand-controller",
+		.data = &marvell_armada_8k_nfc_caps,
+	},
+	{
+		.compatible = "marvell,armada370-nand-controller",
+		.data = &marvell_armada370_nfc_caps,
+	},
+	{
+		.compatible = "marvell,pxa3xx-nand-controller",
+		.data = &marvell_pxa3xx_nfc_caps,
+	},
+	/* Support for old/deprecated bindings: */
+	{
+		.compatible = "marvell,armada-8k-nand",
+		.data = &marvell_armada_8k_nfc_legacy_caps,
+	},
+	{
+		.compatible = "marvell,armada370-nand",
+		.data = &marvell_armada370_nfc_legacy_caps,
+	},
+	{
+		.compatible = "marvell,pxa3xx-nand",
+		.data = &marvell_pxa3xx_nfc_legacy_caps,
+	},
+	{ /* sentinel */ },
+};
+MODULE_DEVICE_TABLE(of, marvell_nfc_of_ids);
+
+static struct platform_driver marvell_nfc_driver = {
+	.driver	= {
+		.name		= "marvell-nfc",
+		.of_match_table = marvell_nfc_of_ids,
+	},
+	.id_table = marvell_nfc_platform_ids,
+	.probe = marvell_nfc_probe,
+	.remove	= marvell_nfc_remove,
+};
+module_platform_driver(marvell_nfc_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Marvell NAND controller driver");
diff --git a/drivers/mtd/nand/mtk_ecc.c b/drivers/mtd/nand/mtk_ecc.c
index c51d214..40d86a8 100644
--- a/drivers/mtd/nand/mtk_ecc.c
+++ b/drivers/mtd/nand/mtk_ecc.c
@@ -34,34 +34,28 @@
 
 #define ECC_ENCCON		(0x00)
 #define ECC_ENCCNFG		(0x04)
-#define		ECC_MODE_SHIFT		(5)
 #define		ECC_MS_SHIFT		(16)
 #define ECC_ENCDIADDR		(0x08)
 #define ECC_ENCIDLE		(0x0C)
-#define ECC_ENCIRQ_EN		(0x80)
-#define ECC_ENCIRQ_STA		(0x84)
 #define ECC_DECCON		(0x100)
 #define ECC_DECCNFG		(0x104)
 #define		DEC_EMPTY_EN		BIT(31)
 #define		DEC_CNFG_CORRECT	(0x3 << 12)
 #define ECC_DECIDLE		(0x10C)
 #define ECC_DECENUM0		(0x114)
-#define ECC_DECDONE		(0x124)
-#define ECC_DECIRQ_EN		(0x200)
-#define ECC_DECIRQ_STA		(0x204)
 
 #define ECC_TIMEOUT		(500000)
 
 #define ECC_IDLE_REG(op)	((op) == ECC_ENCODE ? ECC_ENCIDLE : ECC_DECIDLE)
 #define ECC_CTL_REG(op)		((op) == ECC_ENCODE ? ECC_ENCCON : ECC_DECCON)
-#define ECC_IRQ_REG(op)		((op) == ECC_ENCODE ? \
-					ECC_ENCIRQ_EN : ECC_DECIRQ_EN)
 
 struct mtk_ecc_caps {
 	u32 err_mask;
 	const u8 *ecc_strength;
+	const u32 *ecc_regs;
 	u8 num_ecc_strength;
-	u32 encode_parity_reg0;
+	u8 ecc_mode_shift;
+	u32 parity_bits;
 	int pg_irq_sel;
 };
 
@@ -89,6 +83,46 @@ static const u8 ecc_strength_mt2712[] = {
 	40, 44, 48, 52, 56, 60, 68, 72, 80
 };
 
+static const u8 ecc_strength_mt7622[] = {
+	4, 6, 8, 10, 12, 14, 16
+};
+
+enum mtk_ecc_regs {
+	ECC_ENCPAR00,
+	ECC_ENCIRQ_EN,
+	ECC_ENCIRQ_STA,
+	ECC_DECDONE,
+	ECC_DECIRQ_EN,
+	ECC_DECIRQ_STA,
+};
+
+static int mt2701_ecc_regs[] = {
+	[ECC_ENCPAR00] =        0x10,
+	[ECC_ENCIRQ_EN] =       0x80,
+	[ECC_ENCIRQ_STA] =      0x84,
+	[ECC_DECDONE] =         0x124,
+	[ECC_DECIRQ_EN] =       0x200,
+	[ECC_DECIRQ_STA] =      0x204,
+};
+
+static int mt2712_ecc_regs[] = {
+	[ECC_ENCPAR00] =        0x300,
+	[ECC_ENCIRQ_EN] =       0x80,
+	[ECC_ENCIRQ_STA] =      0x84,
+	[ECC_DECDONE] =         0x124,
+	[ECC_DECIRQ_EN] =       0x200,
+	[ECC_DECIRQ_STA] =      0x204,
+};
+
+static int mt7622_ecc_regs[] = {
+	[ECC_ENCPAR00] =        0x10,
+	[ECC_ENCIRQ_EN] =       0x30,
+	[ECC_ENCIRQ_STA] =      0x34,
+	[ECC_DECDONE] =         0x11c,
+	[ECC_DECIRQ_EN] =       0x140,
+	[ECC_DECIRQ_STA] =      0x144,
+};
+
 static inline void mtk_ecc_wait_idle(struct mtk_ecc *ecc,
 				     enum mtk_ecc_operation op)
 {
@@ -107,32 +141,30 @@ static inline void mtk_ecc_wait_idle(struct mtk_ecc *ecc,
 static irqreturn_t mtk_ecc_irq(int irq, void *id)
 {
 	struct mtk_ecc *ecc = id;
-	enum mtk_ecc_operation op;
 	u32 dec, enc;
 
-	dec = readw(ecc->regs + ECC_DECIRQ_STA) & ECC_IRQ_EN;
+	dec = readw(ecc->regs + ecc->caps->ecc_regs[ECC_DECIRQ_STA])
+		    & ECC_IRQ_EN;
 	if (dec) {
-		op = ECC_DECODE;
-		dec = readw(ecc->regs + ECC_DECDONE);
+		dec = readw(ecc->regs + ecc->caps->ecc_regs[ECC_DECDONE]);
 		if (dec & ecc->sectors) {
 			/*
 			 * Clear decode IRQ status once again to ensure that
 			 * there will be no extra IRQ.
 			 */
-			readw(ecc->regs + ECC_DECIRQ_STA);
+			readw(ecc->regs + ecc->caps->ecc_regs[ECC_DECIRQ_STA]);
 			ecc->sectors = 0;
 			complete(&ecc->done);
 		} else {
 			return IRQ_HANDLED;
 		}
 	} else {
-		enc = readl(ecc->regs + ECC_ENCIRQ_STA) & ECC_IRQ_EN;
-		if (enc) {
-			op = ECC_ENCODE;
+		enc = readl(ecc->regs + ecc->caps->ecc_regs[ECC_ENCIRQ_STA])
+		      & ECC_IRQ_EN;
+		if (enc)
 			complete(&ecc->done);
-		} else {
+		else
 			return IRQ_NONE;
-		}
 	}
 
 	return IRQ_HANDLED;
@@ -160,7 +192,7 @@ static int mtk_ecc_config(struct mtk_ecc *ecc, struct mtk_ecc_config *config)
 		/* configure ECC encoder (in bits) */
 		enc_sz = config->len << 3;
 
-		reg = ecc_bit | (config->mode << ECC_MODE_SHIFT);
+		reg = ecc_bit | (config->mode << ecc->caps->ecc_mode_shift);
 		reg |= (enc_sz << ECC_MS_SHIFT);
 		writel(reg, ecc->regs + ECC_ENCCNFG);
 
@@ -171,9 +203,9 @@ static int mtk_ecc_config(struct mtk_ecc *ecc, struct mtk_ecc_config *config)
 	} else {
 		/* configure ECC decoder (in bits) */
 		dec_sz = (config->len << 3) +
-					config->strength * ECC_PARITY_BITS;
+			 config->strength * ecc->caps->parity_bits;
 
-		reg = ecc_bit | (config->mode << ECC_MODE_SHIFT);
+		reg = ecc_bit | (config->mode << ecc->caps->ecc_mode_shift);
 		reg |= (dec_sz << ECC_MS_SHIFT) | DEC_CNFG_CORRECT;
 		reg |= DEC_EMPTY_EN;
 		writel(reg, ecc->regs + ECC_DECCNFG);
@@ -291,7 +323,12 @@ int mtk_ecc_enable(struct mtk_ecc *ecc, struct mtk_ecc_config *config)
 		 */
 		if (ecc->caps->pg_irq_sel && config->mode == ECC_NFI_MODE)
 			reg_val |= ECC_PG_IRQ_SEL;
-		writew(reg_val, ecc->regs + ECC_IRQ_REG(op));
+		if (op == ECC_ENCODE)
+			writew(reg_val, ecc->regs +
+			       ecc->caps->ecc_regs[ECC_ENCIRQ_EN]);
+		else
+			writew(reg_val, ecc->regs +
+			       ecc->caps->ecc_regs[ECC_DECIRQ_EN]);
 	}
 
 	writew(ECC_OP_ENABLE, ecc->regs + ECC_CTL_REG(op));
@@ -310,13 +347,17 @@ void mtk_ecc_disable(struct mtk_ecc *ecc)
 
 	/* disable it */
 	mtk_ecc_wait_idle(ecc, op);
-	if (op == ECC_DECODE)
+	if (op == ECC_DECODE) {
 		/*
 		 * Clear decode IRQ status in case there is a timeout to wait
 		 * decode IRQ.
 		 */
-		readw(ecc->regs + ECC_DECIRQ_STA);
-	writew(0, ecc->regs + ECC_IRQ_REG(op));
+		readw(ecc->regs + ecc->caps->ecc_regs[ECC_DECDONE]);
+		writew(0, ecc->regs + ecc->caps->ecc_regs[ECC_DECIRQ_EN]);
+	} else {
+		writew(0, ecc->regs + ecc->caps->ecc_regs[ECC_ENCIRQ_EN]);
+	}
+
 	writew(ECC_OP_DISABLE, ecc->regs + ECC_CTL_REG(op));
 
 	mutex_unlock(&ecc->lock);
@@ -367,11 +408,11 @@ int mtk_ecc_encode(struct mtk_ecc *ecc, struct mtk_ecc_config *config,
 	mtk_ecc_wait_idle(ecc, ECC_ENCODE);
 
 	/* Program ECC bytes to OOB: per sector oob = FDM + ECC + SPARE */
-	len = (config->strength * ECC_PARITY_BITS + 7) >> 3;
+	len = (config->strength * ecc->caps->parity_bits + 7) >> 3;
 
 	/* write the parity bytes generated by the ECC back to temp buffer */
 	__ioread32_copy(ecc->eccdata,
-			ecc->regs + ecc->caps->encode_parity_reg0,
+			ecc->regs + ecc->caps->ecc_regs[ECC_ENCPAR00],
 			round_up(len, 4));
 
 	/* copy into possibly unaligned OOB region with actual length */
@@ -404,22 +445,42 @@ void mtk_ecc_adjust_strength(struct mtk_ecc *ecc, u32 *p)
 }
 EXPORT_SYMBOL(mtk_ecc_adjust_strength);
 
+unsigned int mtk_ecc_get_parity_bits(struct mtk_ecc *ecc)
+{
+	return ecc->caps->parity_bits;
+}
+EXPORT_SYMBOL(mtk_ecc_get_parity_bits);
+
 static const struct mtk_ecc_caps mtk_ecc_caps_mt2701 = {
 	.err_mask = 0x3f,
 	.ecc_strength = ecc_strength_mt2701,
+	.ecc_regs = mt2701_ecc_regs,
 	.num_ecc_strength = 20,
-	.encode_parity_reg0 = 0x10,
+	.ecc_mode_shift = 5,
+	.parity_bits = 14,
 	.pg_irq_sel = 0,
 };
 
 static const struct mtk_ecc_caps mtk_ecc_caps_mt2712 = {
 	.err_mask = 0x7f,
 	.ecc_strength = ecc_strength_mt2712,
+	.ecc_regs = mt2712_ecc_regs,
 	.num_ecc_strength = 23,
-	.encode_parity_reg0 = 0x300,
+	.ecc_mode_shift = 5,
+	.parity_bits = 14,
 	.pg_irq_sel = 1,
 };
 
+static const struct mtk_ecc_caps mtk_ecc_caps_mt7622 = {
+	.err_mask = 0x3f,
+	.ecc_strength = ecc_strength_mt7622,
+	.ecc_regs = mt7622_ecc_regs,
+	.num_ecc_strength = 7,
+	.ecc_mode_shift = 4,
+	.parity_bits = 13,
+	.pg_irq_sel = 0,
+};
+
 static const struct of_device_id mtk_ecc_dt_match[] = {
 	{
 		.compatible = "mediatek,mt2701-ecc",
@@ -427,6 +488,9 @@ static const struct of_device_id mtk_ecc_dt_match[] = {
 	}, {
 		.compatible = "mediatek,mt2712-ecc",
 		.data = &mtk_ecc_caps_mt2712,
+	}, {
+		.compatible = "mediatek,mt7622-ecc",
+		.data = &mtk_ecc_caps_mt7622,
 	},
 	{},
 };
@@ -452,7 +516,7 @@ static int mtk_ecc_probe(struct platform_device *pdev)
 
 	max_eccdata_size = ecc->caps->num_ecc_strength - 1;
 	max_eccdata_size = ecc->caps->ecc_strength[max_eccdata_size];
-	max_eccdata_size = (max_eccdata_size * ECC_PARITY_BITS + 7) >> 3;
+	max_eccdata_size = (max_eccdata_size * ecc->caps->parity_bits + 7) >> 3;
 	max_eccdata_size = round_up(max_eccdata_size, 4);
 	ecc->eccdata = devm_kzalloc(dev, max_eccdata_size, GFP_KERNEL);
 	if (!ecc->eccdata)
diff --git a/drivers/mtd/nand/mtk_ecc.h b/drivers/mtd/nand/mtk_ecc.h
index d245c14..a455df0 100644
--- a/drivers/mtd/nand/mtk_ecc.h
+++ b/drivers/mtd/nand/mtk_ecc.h
@@ -14,8 +14,6 @@
 
 #include <linux/types.h>
 
-#define ECC_PARITY_BITS		(14)
-
 enum mtk_ecc_mode {ECC_DMA_MODE = 0, ECC_NFI_MODE = 1};
 enum mtk_ecc_operation {ECC_ENCODE, ECC_DECODE};
 
@@ -43,6 +41,7 @@ int mtk_ecc_wait_done(struct mtk_ecc *, enum mtk_ecc_operation);
 int mtk_ecc_enable(struct mtk_ecc *, struct mtk_ecc_config *);
 void mtk_ecc_disable(struct mtk_ecc *);
 void mtk_ecc_adjust_strength(struct mtk_ecc *ecc, u32 *p);
+unsigned int mtk_ecc_get_parity_bits(struct mtk_ecc *ecc);
 
 struct mtk_ecc *of_mtk_ecc_get(struct device_node *);
 void mtk_ecc_release(struct mtk_ecc *);
diff --git a/drivers/mtd/nand/mtk_nand.c b/drivers/mtd/nand/mtk_nand.c
index d86a7d1..6977da3 100644
--- a/drivers/mtd/nand/mtk_nand.c
+++ b/drivers/mtd/nand/mtk_nand.c
@@ -97,7 +97,6 @@
 
 #define MTK_TIMEOUT		(500000)
 #define MTK_RESET_TIMEOUT	(1000000)
-#define MTK_MAX_SECTOR		(16)
 #define MTK_NAND_MAX_NSELS	(2)
 #define MTK_NFC_MIN_SPARE	(16)
 #define ACCTIMING(tpoecs, tprecs, tc2r, tw2r, twh, twst, trlt) \
@@ -109,6 +108,8 @@ struct mtk_nfc_caps {
 	u8 num_spare_size;
 	u8 pageformat_spare_shift;
 	u8 nfi_clk_div;
+	u8 max_sector;
+	u32 max_sector_size;
 };
 
 struct mtk_nfc_bad_mark_ctl {
@@ -173,6 +174,10 @@ static const u8 spare_size_mt2712[] = {
 	74
 };
 
+static const u8 spare_size_mt7622[] = {
+	16, 26, 27, 28
+};
+
 static inline struct mtk_nfc_nand_chip *to_mtk_nand(struct nand_chip *nand)
 {
 	return container_of(nand, struct mtk_nfc_nand_chip, nand);
@@ -450,7 +455,7 @@ static inline u8 mtk_nfc_read_byte(struct mtd_info *mtd)
 		 * set to max sector to allow the HW to continue reading over
 		 * unaligned accesses
 		 */
-		reg = (MTK_MAX_SECTOR << CON_SEC_SHIFT) | CON_BRD;
+		reg = (nfc->caps->max_sector << CON_SEC_SHIFT) | CON_BRD;
 		nfi_writel(nfc, reg, NFI_CON);
 
 		/* trigger to fetch data */
@@ -481,7 +486,7 @@ static void mtk_nfc_write_byte(struct mtd_info *mtd, u8 byte)
 		reg = nfi_readw(nfc, NFI_CNFG) | CNFG_BYTE_RW;
 		nfi_writew(nfc, reg, NFI_CNFG);
 
-		reg = MTK_MAX_SECTOR << CON_SEC_SHIFT | CON_BWR;
+		reg = nfc->caps->max_sector << CON_SEC_SHIFT | CON_BWR;
 		nfi_writel(nfc, reg, NFI_CON);
 
 		nfi_writew(nfc, STAR_EN, NFI_STRDATA);
@@ -761,6 +766,8 @@ static int mtk_nfc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	u32 reg;
 	int ret;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	if (!raw) {
 		/* OOB => FDM: from register,  ECC: from HW */
 		reg = nfi_readw(nfc, NFI_CNFG) | CNFG_AUTO_FMT_EN;
@@ -794,7 +801,10 @@ static int mtk_nfc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	if (!raw)
 		mtk_ecc_disable(nfc->ecc);
 
-	return ret;
+	if (ret)
+		return ret;
+
+	return nand_prog_page_end_op(chip);
 }
 
 static int mtk_nfc_write_page_hwecc(struct mtd_info *mtd,
@@ -832,18 +842,7 @@ static int mtk_nfc_write_subpage_hwecc(struct mtd_info *mtd,
 static int mtk_nfc_write_oob_std(struct mtd_info *mtd, struct nand_chip *chip,
 				 int page)
 {
-	int ret;
-
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
-
-	ret = mtk_nfc_write_page_raw(mtd, chip, NULL, 1, page);
-	if (ret < 0)
-		return -EIO;
-
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-	ret = chip->waitfunc(mtd, chip);
-
-	return ret & NAND_STATUS_FAIL ? -EIO : 0;
+	return mtk_nfc_write_page_raw(mtd, chip, NULL, 1, page);
 }
 
 static int mtk_nfc_update_ecc_stats(struct mtd_info *mtd, u8 *buf, u32 sectors)
@@ -892,8 +891,7 @@ static int mtk_nfc_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 	len = sectors * chip->ecc.size + (raw ? sectors * spare : 0);
 	buf = bufpoi + start * chip->ecc.size;
 
-	if (column != 0)
-		chip->cmdfunc(mtd, NAND_CMD_RNDOUT, column, -1);
+	nand_read_page_op(chip, page, column, NULL, 0);
 
 	addr = dma_map_single(nfc->dev, buf, len, DMA_FROM_DEVICE);
 	rc = dma_mapping_error(nfc->dev, addr);
@@ -1016,8 +1014,6 @@ static int mtk_nfc_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 static int mtk_nfc_read_oob_std(struct mtd_info *mtd, struct nand_chip *chip,
 				int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
-
 	return mtk_nfc_read_page_raw(mtd, chip, NULL, 1, page);
 }
 
@@ -1126,9 +1122,11 @@ static void mtk_nfc_set_fdm(struct mtk_nfc_fdm *fdm, struct mtd_info *mtd)
 {
 	struct nand_chip *nand = mtd_to_nand(mtd);
 	struct mtk_nfc_nand_chip *chip = to_mtk_nand(nand);
+	struct mtk_nfc *nfc = nand_get_controller_data(nand);
 	u32 ecc_bytes;
 
-	ecc_bytes = DIV_ROUND_UP(nand->ecc.strength * ECC_PARITY_BITS, 8);
+	ecc_bytes = DIV_ROUND_UP(nand->ecc.strength *
+				 mtk_ecc_get_parity_bits(nfc->ecc), 8);
 
 	fdm->reg_size = chip->spare_per_sector - ecc_bytes;
 	if (fdm->reg_size > NFI_FDM_MAX_SIZE)
@@ -1208,7 +1206,8 @@ static int mtk_nfc_ecc_init(struct device *dev, struct mtd_info *mtd)
 		 * this controller only supports 512 and 1024 sizes
 		 */
 		if (nand->ecc.size < 1024) {
-			if (mtd->writesize > 512) {
+			if (mtd->writesize > 512 &&
+			    nfc->caps->max_sector_size > 512) {
 				nand->ecc.size = 1024;
 				nand->ecc.strength <<= 1;
 			} else {
@@ -1223,7 +1222,8 @@ static int mtk_nfc_ecc_init(struct device *dev, struct mtd_info *mtd)
 			return ret;
 
 		/* calculate oob bytes except ecc parity data */
-		free = ((nand->ecc.strength * ECC_PARITY_BITS) + 7) >> 3;
+		free = (nand->ecc.strength * mtk_ecc_get_parity_bits(nfc->ecc)
+			+ 7) >> 3;
 		free = spare - free;
 
 		/*
@@ -1233,10 +1233,12 @@ static int mtk_nfc_ecc_init(struct device *dev, struct mtd_info *mtd)
 		 */
 		if (free > NFI_FDM_MAX_SIZE) {
 			spare -= NFI_FDM_MAX_SIZE;
-			nand->ecc.strength = (spare << 3) / ECC_PARITY_BITS;
+			nand->ecc.strength = (spare << 3) /
+					     mtk_ecc_get_parity_bits(nfc->ecc);
 		} else if (free < 0) {
 			spare -= NFI_FDM_MIN_SIZE;
-			nand->ecc.strength = (spare << 3) / ECC_PARITY_BITS;
+			nand->ecc.strength = (spare << 3) /
+					     mtk_ecc_get_parity_bits(nfc->ecc);
 		}
 	}
 
@@ -1389,6 +1391,8 @@ static const struct mtk_nfc_caps mtk_nfc_caps_mt2701 = {
 	.num_spare_size = 16,
 	.pageformat_spare_shift = 4,
 	.nfi_clk_div = 1,
+	.max_sector = 16,
+	.max_sector_size = 1024,
 };
 
 static const struct mtk_nfc_caps mtk_nfc_caps_mt2712 = {
@@ -1396,6 +1400,17 @@ static const struct mtk_nfc_caps mtk_nfc_caps_mt2712 = {
 	.num_spare_size = 19,
 	.pageformat_spare_shift = 16,
 	.nfi_clk_div = 2,
+	.max_sector = 16,
+	.max_sector_size = 1024,
+};
+
+static const struct mtk_nfc_caps mtk_nfc_caps_mt7622 = {
+	.spare_size = spare_size_mt7622,
+	.num_spare_size = 4,
+	.pageformat_spare_shift = 4,
+	.nfi_clk_div = 1,
+	.max_sector = 8,
+	.max_sector_size = 512,
 };
 
 static const struct of_device_id mtk_nfc_id_table[] = {
@@ -1405,6 +1420,9 @@ static const struct of_device_id mtk_nfc_id_table[] = {
 	}, {
 		.compatible = "mediatek,mt2712-nfc",
 		.data = &mtk_nfc_caps_mt2712,
+	}, {
+		.compatible = "mediatek,mt7622-nfc",
+		.data = &mtk_nfc_caps_mt7622,
 	},
 	{}
 };
@@ -1540,7 +1558,6 @@ static int mtk_nfc_resume(struct device *dev)
 	struct mtk_nfc *nfc = dev_get_drvdata(dev);
 	struct mtk_nfc_nand_chip *chip;
 	struct nand_chip *nand;
-	struct mtd_info *mtd;
 	int ret;
 	u32 i;
 
@@ -1553,11 +1570,8 @@ static int mtk_nfc_resume(struct device *dev)
 	/* reset NAND chip if VCC was powered off */
 	list_for_each_entry(chip, &nfc->chips, node) {
 		nand = &chip->nand;
-		mtd = nand_to_mtd(nand);
-		for (i = 0; i < chip->nsels; i++) {
-			nand->select_chip(mtd, i);
-			nand->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
-		}
+		for (i = 0; i < chip->nsels; i++)
+			nand_reset(nand, i);
 	}
 
 	return 0;
diff --git a/drivers/mtd/nand/nand_base.c b/drivers/mtd/nand/nand_base.c
index 6135d00..e70ca16 100644
--- a/drivers/mtd/nand/nand_base.c
+++ b/drivers/mtd/nand/nand_base.c
@@ -561,14 +561,19 @@ static int nand_block_markbad_lowlevel(struct mtd_info *mtd, loff_t ofs)
 static int nand_check_wp(struct mtd_info *mtd)
 {
 	struct nand_chip *chip = mtd_to_nand(mtd);
+	u8 status;
+	int ret;
 
 	/* Broken xD cards report WP despite being writable */
 	if (chip->options & NAND_BROKEN_XD)
 		return 0;
 
 	/* Check the WP bit */
-	chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
-	return (chip->read_byte(mtd) & NAND_STATUS_WP) ? 0 : 1;
+	ret = nand_status_op(chip, &status);
+	if (ret)
+		return ret;
+
+	return status & NAND_STATUS_WP ? 0 : 1;
 }
 
 /**
@@ -667,16 +672,83 @@ EXPORT_SYMBOL_GPL(nand_wait_ready);
 static void nand_wait_status_ready(struct mtd_info *mtd, unsigned long timeo)
 {
 	register struct nand_chip *chip = mtd_to_nand(mtd);
+	int ret;
 
 	timeo = jiffies + msecs_to_jiffies(timeo);
 	do {
-		if ((chip->read_byte(mtd) & NAND_STATUS_READY))
+		u8 status;
+
+		ret = nand_read_data_op(chip, &status, sizeof(status), true);
+		if (ret)
+			return;
+
+		if (status & NAND_STATUS_READY)
 			break;
 		touch_softlockup_watchdog();
 	} while (time_before(jiffies, timeo));
 };
 
 /**
+ * nand_soft_waitrdy - Poll STATUS reg until RDY bit is set to 1
+ * @chip: NAND chip structure
+ * @timeout_ms: Timeout in ms
+ *
+ * Poll the STATUS register using ->exec_op() until the RDY bit becomes 1.
+ * If that does not happen whitin the specified timeout, -ETIMEDOUT is
+ * returned.
+ *
+ * This helper is intended to be used when the controller does not have access
+ * to the NAND R/B pin.
+ *
+ * Be aware that calling this helper from an ->exec_op() implementation means
+ * ->exec_op() must be re-entrant.
+ *
+ * Return 0 if the NAND chip is ready, a negative error otherwise.
+ */
+int nand_soft_waitrdy(struct nand_chip *chip, unsigned long timeout_ms)
+{
+	u8 status = 0;
+	int ret;
+
+	if (!chip->exec_op)
+		return -ENOTSUPP;
+
+	ret = nand_status_op(chip, NULL);
+	if (ret)
+		return ret;
+
+	timeout_ms = jiffies + msecs_to_jiffies(timeout_ms);
+	do {
+		ret = nand_read_data_op(chip, &status, sizeof(status), true);
+		if (ret)
+			break;
+
+		if (status & NAND_STATUS_READY)
+			break;
+
+		/*
+		 * Typical lowest execution time for a tR on most NANDs is 10us,
+		 * use this as polling delay before doing something smarter (ie.
+		 * deriving a delay from the timeout value, timeout_ms/ratio).
+		 */
+		udelay(10);
+	} while	(time_before(jiffies, timeout_ms));
+
+	/*
+	 * We have to exit READ_STATUS mode in order to read real data on the
+	 * bus in case the WAITRDY instruction is preceding a DATA_IN
+	 * instruction.
+	 */
+	nand_exit_status_op(chip);
+
+	if (ret)
+		return ret;
+
+	return status & NAND_STATUS_READY ? 0 : -ETIMEDOUT;
+};
+EXPORT_SYMBOL_GPL(nand_soft_waitrdy);
+
+/**
  * nand_command - [DEFAULT] Send command to NAND device
  * @mtd: MTD device structure
  * @command: the command to be sent
@@ -710,7 +782,8 @@ static void nand_command(struct mtd_info *mtd, unsigned int command,
 		chip->cmd_ctrl(mtd, readcmd, ctrl);
 		ctrl &= ~NAND_CTRL_CHANGE;
 	}
-	chip->cmd_ctrl(mtd, command, ctrl);
+	if (command != NAND_CMD_NONE)
+		chip->cmd_ctrl(mtd, command, ctrl);
 
 	/* Address cycle, when necessary */
 	ctrl = NAND_CTRL_ALE | NAND_CTRL_CHANGE;
@@ -738,6 +811,7 @@ static void nand_command(struct mtd_info *mtd, unsigned int command,
 	 */
 	switch (command) {
 
+	case NAND_CMD_NONE:
 	case NAND_CMD_PAGEPROG:
 	case NAND_CMD_ERASE1:
 	case NAND_CMD_ERASE2:
@@ -802,8 +876,8 @@ static void nand_ccs_delay(struct nand_chip *chip)
 	 * Wait tCCS_min if it is correctly defined, otherwise wait 500ns
 	 * (which should be safe for all NANDs).
 	 */
-	if (chip->data_interface && chip->data_interface->timings.sdr.tCCS_min)
-		ndelay(chip->data_interface->timings.sdr.tCCS_min / 1000);
+	if (chip->setup_data_interface)
+		ndelay(chip->data_interface.timings.sdr.tCCS_min / 1000);
 	else
 		ndelay(500);
 }
@@ -831,7 +905,9 @@ static void nand_command_lp(struct mtd_info *mtd, unsigned int command,
 	}
 
 	/* Command latch cycle */
-	chip->cmd_ctrl(mtd, command, NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
+	if (command != NAND_CMD_NONE)
+		chip->cmd_ctrl(mtd, command,
+			       NAND_NCE | NAND_CLE | NAND_CTRL_CHANGE);
 
 	if (column != -1 || page_addr != -1) {
 		int ctrl = NAND_CTRL_CHANGE | NAND_NCE | NAND_ALE;
@@ -866,6 +942,7 @@ static void nand_command_lp(struct mtd_info *mtd, unsigned int command,
 	 */
 	switch (command) {
 
+	case NAND_CMD_NONE:
 	case NAND_CMD_CACHEDPROG:
 	case NAND_CMD_PAGEPROG:
 	case NAND_CMD_ERASE1:
@@ -1014,7 +1091,15 @@ static void panic_nand_wait(struct mtd_info *mtd, struct nand_chip *chip,
 			if (chip->dev_ready(mtd))
 				break;
 		} else {
-			if (chip->read_byte(mtd) & NAND_STATUS_READY)
+			int ret;
+			u8 status;
+
+			ret = nand_read_data_op(chip, &status, sizeof(status),
+						true);
+			if (ret)
+				return;
+
+			if (status & NAND_STATUS_READY)
 				break;
 		}
 		mdelay(1);
@@ -1031,8 +1116,9 @@ static void panic_nand_wait(struct mtd_info *mtd, struct nand_chip *chip,
 static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 {
 
-	int status;
 	unsigned long timeo = 400;
+	u8 status;
+	int ret;
 
 	/*
 	 * Apply this short delay always to ensure that we do wait tWB in any
@@ -1040,7 +1126,9 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 	 */
 	ndelay(100);
 
-	chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
+	ret = nand_status_op(chip, NULL);
+	if (ret)
+		return ret;
 
 	if (in_interrupt() || oops_in_progress)
 		panic_nand_wait(mtd, chip, timeo);
@@ -1051,14 +1139,22 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 				if (chip->dev_ready(mtd))
 					break;
 			} else {
-				if (chip->read_byte(mtd) & NAND_STATUS_READY)
+				ret = nand_read_data_op(chip, &status,
+							sizeof(status), true);
+				if (ret)
+					return ret;
+
+				if (status & NAND_STATUS_READY)
 					break;
 			}
 			cond_resched();
 		} while (time_before(jiffies, timeo));
 	}
 
-	status = (int)chip->read_byte(mtd);
+	ret = nand_read_data_op(chip, &status, sizeof(status), true);
+	if (ret)
+		return ret;
+
 	/* This can happen if in case of timeout or buggy dev_ready */
 	WARN_ON(!(status & NAND_STATUS_READY));
 	return status;
@@ -1076,7 +1172,6 @@ static int nand_wait(struct mtd_info *mtd, struct nand_chip *chip)
 static int nand_reset_data_interface(struct nand_chip *chip, int chipnr)
 {
 	struct mtd_info *mtd = nand_to_mtd(chip);
-	const struct nand_data_interface *conf;
 	int ret;
 
 	if (!chip->setup_data_interface)
@@ -1096,8 +1191,8 @@ static int nand_reset_data_interface(struct nand_chip *chip, int chipnr)
 	 * timings to timing mode 0.
 	 */
 
-	conf = nand_get_default_data_interface();
-	ret = chip->setup_data_interface(mtd, chipnr, conf);
+	onfi_fill_data_interface(chip, NAND_SDR_IFACE, 0);
+	ret = chip->setup_data_interface(mtd, chipnr, &chip->data_interface);
 	if (ret)
 		pr_err("Failed to configure data interface to SDR timing mode 0\n");
 
@@ -1122,7 +1217,7 @@ static int nand_setup_data_interface(struct nand_chip *chip, int chipnr)
 	struct mtd_info *mtd = nand_to_mtd(chip);
 	int ret;
 
-	if (!chip->setup_data_interface || !chip->data_interface)
+	if (!chip->setup_data_interface)
 		return 0;
 
 	/*
@@ -1143,7 +1238,7 @@ static int nand_setup_data_interface(struct nand_chip *chip, int chipnr)
 			goto err;
 	}
 
-	ret = chip->setup_data_interface(mtd, chipnr, chip->data_interface);
+	ret = chip->setup_data_interface(mtd, chipnr, &chip->data_interface);
 err:
 	return ret;
 }
@@ -1183,21 +1278,19 @@ static int nand_init_data_interface(struct nand_chip *chip)
 		modes = GENMASK(chip->onfi_timing_mode_default, 0);
 	}
 
-	chip->data_interface = kzalloc(sizeof(*chip->data_interface),
-				       GFP_KERNEL);
-	if (!chip->data_interface)
-		return -ENOMEM;
 
 	for (mode = fls(modes) - 1; mode >= 0; mode--) {
-		ret = onfi_init_data_interface(chip, chip->data_interface,
-					       NAND_SDR_IFACE, mode);
+		ret = onfi_fill_data_interface(chip, NAND_SDR_IFACE, mode);
 		if (ret)
 			continue;
 
-		/* Pass -1 to only */
+		/*
+		 * Pass NAND_DATA_IFACE_CHECK_ONLY to only check if the
+		 * controller supports the requested timings.
+		 */
 		ret = chip->setup_data_interface(mtd,
 						 NAND_DATA_IFACE_CHECK_ONLY,
-						 chip->data_interface);
+						 &chip->data_interface);
 		if (!ret) {
 			chip->onfi_timing_mode_default = mode;
 			break;
@@ -1207,21 +1300,1429 @@ static int nand_init_data_interface(struct nand_chip *chip)
 	return 0;
 }
 
-static void nand_release_data_interface(struct nand_chip *chip)
+/**
+ * nand_fill_column_cycles - fill the column cycles of an address
+ * @chip: The NAND chip
+ * @addrs: Array of address cycles to fill
+ * @offset_in_page: The offset in the page
+ *
+ * Fills the first or the first two bytes of the @addrs field depending
+ * on the NAND bus width and the page size.
+ *
+ * Returns the number of cycles needed to encode the column, or a negative
+ * error code in case one of the arguments is invalid.
+ */
+static int nand_fill_column_cycles(struct nand_chip *chip, u8 *addrs,
+				   unsigned int offset_in_page)
 {
-	kfree(chip->data_interface);
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	/* Make sure the offset is less than the actual page size. */
+	if (offset_in_page > mtd->writesize + mtd->oobsize)
+		return -EINVAL;
+
+	/*
+	 * On small page NANDs, there's a dedicated command to access the OOB
+	 * area, and the column address is relative to the start of the OOB
+	 * area, not the start of the page. Asjust the address accordingly.
+	 */
+	if (mtd->writesize <= 512 && offset_in_page >= mtd->writesize)
+		offset_in_page -= mtd->writesize;
+
+	/*
+	 * The offset in page is expressed in bytes, if the NAND bus is 16-bit
+	 * wide, then it must be divided by 2.
+	 */
+	if (chip->options & NAND_BUSWIDTH_16) {
+		if (WARN_ON(offset_in_page % 2))
+			return -EINVAL;
+
+		offset_in_page /= 2;
+	}
+
+	addrs[0] = offset_in_page;
+
+	/*
+	 * Small page NANDs use 1 cycle for the columns, while large page NANDs
+	 * need 2
+	 */
+	if (mtd->writesize <= 512)
+		return 1;
+
+	addrs[1] = offset_in_page >> 8;
+
+	return 2;
 }
 
+static int nand_sp_exec_read_page_op(struct nand_chip *chip, unsigned int page,
+				     unsigned int offset_in_page, void *buf,
+				     unsigned int len)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	const struct nand_sdr_timings *sdr =
+		nand_get_sdr_timings(&chip->data_interface);
+	u8 addrs[4];
+	struct nand_op_instr instrs[] = {
+		NAND_OP_CMD(NAND_CMD_READ0, 0),
+		NAND_OP_ADDR(3, addrs, PSEC_TO_NSEC(sdr->tWB_max)),
+		NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tR_max),
+				 PSEC_TO_NSEC(sdr->tRR_min)),
+		NAND_OP_DATA_IN(len, buf, 0),
+	};
+	struct nand_operation op = NAND_OPERATION(instrs);
+	int ret;
+
+	/* Drop the DATA_IN instruction if len is set to 0. */
+	if (!len)
+		op.ninstrs--;
+
+	if (offset_in_page >= mtd->writesize)
+		instrs[0].ctx.cmd.opcode = NAND_CMD_READOOB;
+	else if (offset_in_page >= 256 &&
+		 !(chip->options & NAND_BUSWIDTH_16))
+		instrs[0].ctx.cmd.opcode = NAND_CMD_READ1;
+
+	ret = nand_fill_column_cycles(chip, addrs, offset_in_page);
+	if (ret < 0)
+		return ret;
+
+	addrs[1] = page;
+	addrs[2] = page >> 8;
+
+	if (chip->options & NAND_ROW_ADDR_3) {
+		addrs[3] = page >> 16;
+		instrs[1].ctx.addr.naddrs++;
+	}
+
+	return nand_exec_op(chip, &op);
+}
+
+static int nand_lp_exec_read_page_op(struct nand_chip *chip, unsigned int page,
+				     unsigned int offset_in_page, void *buf,
+				     unsigned int len)
+{
+	const struct nand_sdr_timings *sdr =
+		nand_get_sdr_timings(&chip->data_interface);
+	u8 addrs[5];
+	struct nand_op_instr instrs[] = {
+		NAND_OP_CMD(NAND_CMD_READ0, 0),
+		NAND_OP_ADDR(4, addrs, 0),
+		NAND_OP_CMD(NAND_CMD_READSTART, PSEC_TO_NSEC(sdr->tWB_max)),
+		NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tR_max),
+				 PSEC_TO_NSEC(sdr->tRR_min)),
+		NAND_OP_DATA_IN(len, buf, 0),
+	};
+	struct nand_operation op = NAND_OPERATION(instrs);
+	int ret;
+
+	/* Drop the DATA_IN instruction if len is set to 0. */
+	if (!len)
+		op.ninstrs--;
+
+	ret = nand_fill_column_cycles(chip, addrs, offset_in_page);
+	if (ret < 0)
+		return ret;
+
+	addrs[2] = page;
+	addrs[3] = page >> 8;
+
+	if (chip->options & NAND_ROW_ADDR_3) {
+		addrs[4] = page >> 16;
+		instrs[1].ctx.addr.naddrs++;
+	}
+
+	return nand_exec_op(chip, &op);
+}
+
+/**
+ * nand_read_page_op - Do a READ PAGE operation
+ * @chip: The NAND chip
+ * @page: page to read
+ * @offset_in_page: offset within the page
+ * @buf: buffer used to store the data
+ * @len: length of the buffer
+ *
+ * This function issues a READ PAGE operation.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_read_page_op(struct nand_chip *chip, unsigned int page,
+		      unsigned int offset_in_page, void *buf, unsigned int len)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (len && !buf)
+		return -EINVAL;
+
+	if (offset_in_page + len > mtd->writesize + mtd->oobsize)
+		return -EINVAL;
+
+	if (chip->exec_op) {
+		if (mtd->writesize > 512)
+			return nand_lp_exec_read_page_op(chip, page,
+							 offset_in_page, buf,
+							 len);
+
+		return nand_sp_exec_read_page_op(chip, page, offset_in_page,
+						 buf, len);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_READ0, offset_in_page, page);
+	if (len)
+		chip->read_buf(mtd, buf, len);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_read_page_op);
+
+/**
+ * nand_read_param_page_op - Do a READ PARAMETER PAGE operation
+ * @chip: The NAND chip
+ * @page: parameter page to read
+ * @buf: buffer used to store the data
+ * @len: length of the buffer
+ *
+ * This function issues a READ PARAMETER PAGE operation.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+static int nand_read_param_page_op(struct nand_chip *chip, u8 page, void *buf,
+				   unsigned int len)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	unsigned int i;
+	u8 *p = buf;
+
+	if (len && !buf)
+		return -EINVAL;
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_PARAM, 0),
+			NAND_OP_ADDR(1, &page, PSEC_TO_NSEC(sdr->tWB_max)),
+			NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tR_max),
+					 PSEC_TO_NSEC(sdr->tRR_min)),
+			NAND_OP_8BIT_DATA_IN(len, buf, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		/* Drop the DATA_IN instruction if len is set to 0. */
+		if (!len)
+			op.ninstrs--;
+
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_PARAM, page, -1);
+	for (i = 0; i < len; i++)
+		p[i] = chip->read_byte(mtd);
+
+	return 0;
+}
+
+/**
+ * nand_change_read_column_op - Do a CHANGE READ COLUMN operation
+ * @chip: The NAND chip
+ * @offset_in_page: offset within the page
+ * @buf: buffer used to store the data
+ * @len: length of the buffer
+ * @force_8bit: force 8-bit bus access
+ *
+ * This function issues a CHANGE READ COLUMN operation.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_change_read_column_op(struct nand_chip *chip,
+			       unsigned int offset_in_page, void *buf,
+			       unsigned int len, bool force_8bit)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (len && !buf)
+		return -EINVAL;
+
+	if (offset_in_page + len > mtd->writesize + mtd->oobsize)
+		return -EINVAL;
+
+	/* Small page NANDs do not support column change. */
+	if (mtd->writesize <= 512)
+		return -ENOTSUPP;
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		u8 addrs[2] = {};
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_RNDOUT, 0),
+			NAND_OP_ADDR(2, addrs, 0),
+			NAND_OP_CMD(NAND_CMD_RNDOUTSTART,
+				    PSEC_TO_NSEC(sdr->tCCS_min)),
+			NAND_OP_DATA_IN(len, buf, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+		int ret;
+
+		ret = nand_fill_column_cycles(chip, addrs, offset_in_page);
+		if (ret < 0)
+			return ret;
+
+		/* Drop the DATA_IN instruction if len is set to 0. */
+		if (!len)
+			op.ninstrs--;
+
+		instrs[3].ctx.data.force_8bit = force_8bit;
+
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_RNDOUT, offset_in_page, -1);
+	if (len)
+		chip->read_buf(mtd, buf, len);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_change_read_column_op);
+
+/**
+ * nand_read_oob_op - Do a READ OOB operation
+ * @chip: The NAND chip
+ * @page: page to read
+ * @offset_in_oob: offset within the OOB area
+ * @buf: buffer used to store the data
+ * @len: length of the buffer
+ *
+ * This function issues a READ OOB operation.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_read_oob_op(struct nand_chip *chip, unsigned int page,
+		     unsigned int offset_in_oob, void *buf, unsigned int len)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (len && !buf)
+		return -EINVAL;
+
+	if (offset_in_oob + len > mtd->oobsize)
+		return -EINVAL;
+
+	if (chip->exec_op)
+		return nand_read_page_op(chip, page,
+					 mtd->writesize + offset_in_oob,
+					 buf, len);
+
+	chip->cmdfunc(mtd, NAND_CMD_READOOB, offset_in_oob, page);
+	if (len)
+		chip->read_buf(mtd, buf, len);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_read_oob_op);
+
+static int nand_exec_prog_page_op(struct nand_chip *chip, unsigned int page,
+				  unsigned int offset_in_page, const void *buf,
+				  unsigned int len, bool prog)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	const struct nand_sdr_timings *sdr =
+		nand_get_sdr_timings(&chip->data_interface);
+	u8 addrs[5] = {};
+	struct nand_op_instr instrs[] = {
+		/*
+		 * The first instruction will be dropped if we're dealing
+		 * with a large page NAND and adjusted if we're dealing
+		 * with a small page NAND and the page offset is > 255.
+		 */
+		NAND_OP_CMD(NAND_CMD_READ0, 0),
+		NAND_OP_CMD(NAND_CMD_SEQIN, 0),
+		NAND_OP_ADDR(0, addrs, PSEC_TO_NSEC(sdr->tADL_min)),
+		NAND_OP_DATA_OUT(len, buf, 0),
+		NAND_OP_CMD(NAND_CMD_PAGEPROG, PSEC_TO_NSEC(sdr->tWB_max)),
+		NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tPROG_max), 0),
+	};
+	struct nand_operation op = NAND_OPERATION(instrs);
+	int naddrs = nand_fill_column_cycles(chip, addrs, offset_in_page);
+	int ret;
+	u8 status;
+
+	if (naddrs < 0)
+		return naddrs;
+
+	addrs[naddrs++] = page;
+	addrs[naddrs++] = page >> 8;
+	if (chip->options & NAND_ROW_ADDR_3)
+		addrs[naddrs++] = page >> 16;
+
+	instrs[2].ctx.addr.naddrs = naddrs;
+
+	/* Drop the last two instructions if we're not programming the page. */
+	if (!prog) {
+		op.ninstrs -= 2;
+		/* Also drop the DATA_OUT instruction if empty. */
+		if (!len)
+			op.ninstrs--;
+	}
+
+	if (mtd->writesize <= 512) {
+		/*
+		 * Small pages need some more tweaking: we have to adjust the
+		 * first instruction depending on the page offset we're trying
+		 * to access.
+		 */
+		if (offset_in_page >= mtd->writesize)
+			instrs[0].ctx.cmd.opcode = NAND_CMD_READOOB;
+		else if (offset_in_page >= 256 &&
+			 !(chip->options & NAND_BUSWIDTH_16))
+			instrs[0].ctx.cmd.opcode = NAND_CMD_READ1;
+	} else {
+		/*
+		 * Drop the first command if we're dealing with a large page
+		 * NAND.
+		 */
+		op.instrs++;
+		op.ninstrs--;
+	}
+
+	ret = nand_exec_op(chip, &op);
+	if (!prog || ret)
+		return ret;
+
+	ret = nand_status_op(chip, &status);
+	if (ret)
+		return ret;
+
+	return status;
+}
+
+/**
+ * nand_prog_page_begin_op - starts a PROG PAGE operation
+ * @chip: The NAND chip
+ * @page: page to write
+ * @offset_in_page: offset within the page
+ * @buf: buffer containing the data to write to the page
+ * @len: length of the buffer
+ *
+ * This function issues the first half of a PROG PAGE operation.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_prog_page_begin_op(struct nand_chip *chip, unsigned int page,
+			    unsigned int offset_in_page, const void *buf,
+			    unsigned int len)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (len && !buf)
+		return -EINVAL;
+
+	if (offset_in_page + len > mtd->writesize + mtd->oobsize)
+		return -EINVAL;
+
+	if (chip->exec_op)
+		return nand_exec_prog_page_op(chip, page, offset_in_page, buf,
+					      len, false);
+
+	chip->cmdfunc(mtd, NAND_CMD_SEQIN, offset_in_page, page);
+
+	if (buf)
+		chip->write_buf(mtd, buf, len);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_prog_page_begin_op);
+
+/**
+ * nand_prog_page_end_op - ends a PROG PAGE operation
+ * @chip: The NAND chip
+ *
+ * This function issues the second half of a PROG PAGE operation.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_prog_page_end_op(struct nand_chip *chip)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	int ret;
+	u8 status;
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_PAGEPROG,
+				    PSEC_TO_NSEC(sdr->tWB_max)),
+			NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tPROG_max), 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		ret = nand_exec_op(chip, &op);
+		if (ret)
+			return ret;
+
+		ret = nand_status_op(chip, &status);
+		if (ret)
+			return ret;
+	} else {
+		chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
+		ret = chip->waitfunc(mtd, chip);
+		if (ret < 0)
+			return ret;
+
+		status = ret;
+	}
+
+	if (status & NAND_STATUS_FAIL)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_prog_page_end_op);
+
+/**
+ * nand_prog_page_op - Do a full PROG PAGE operation
+ * @chip: The NAND chip
+ * @page: page to write
+ * @offset_in_page: offset within the page
+ * @buf: buffer containing the data to write to the page
+ * @len: length of the buffer
+ *
+ * This function issues a full PROG PAGE operation.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_prog_page_op(struct nand_chip *chip, unsigned int page,
+		      unsigned int offset_in_page, const void *buf,
+		      unsigned int len)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	int status;
+
+	if (!len || !buf)
+		return -EINVAL;
+
+	if (offset_in_page + len > mtd->writesize + mtd->oobsize)
+		return -EINVAL;
+
+	if (chip->exec_op) {
+		status = nand_exec_prog_page_op(chip, page, offset_in_page, buf,
+						len, true);
+	} else {
+		chip->cmdfunc(mtd, NAND_CMD_SEQIN, offset_in_page, page);
+		chip->write_buf(mtd, buf, len);
+		chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
+		status = chip->waitfunc(mtd, chip);
+	}
+
+	if (status & NAND_STATUS_FAIL)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_prog_page_op);
+
+/**
+ * nand_change_write_column_op - Do a CHANGE WRITE COLUMN operation
+ * @chip: The NAND chip
+ * @offset_in_page: offset within the page
+ * @buf: buffer containing the data to send to the NAND
+ * @len: length of the buffer
+ * @force_8bit: force 8-bit bus access
+ *
+ * This function issues a CHANGE WRITE COLUMN operation.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_change_write_column_op(struct nand_chip *chip,
+				unsigned int offset_in_page,
+				const void *buf, unsigned int len,
+				bool force_8bit)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (len && !buf)
+		return -EINVAL;
+
+	if (offset_in_page + len > mtd->writesize + mtd->oobsize)
+		return -EINVAL;
+
+	/* Small page NANDs do not support column change. */
+	if (mtd->writesize <= 512)
+		return -ENOTSUPP;
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		u8 addrs[2];
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_RNDIN, 0),
+			NAND_OP_ADDR(2, addrs, PSEC_TO_NSEC(sdr->tCCS_min)),
+			NAND_OP_DATA_OUT(len, buf, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+		int ret;
+
+		ret = nand_fill_column_cycles(chip, addrs, offset_in_page);
+		if (ret < 0)
+			return ret;
+
+		instrs[2].ctx.data.force_8bit = force_8bit;
+
+		/* Drop the DATA_OUT instruction if len is set to 0. */
+		if (!len)
+			op.ninstrs--;
+
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_RNDIN, offset_in_page, -1);
+	if (len)
+		chip->write_buf(mtd, buf, len);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_change_write_column_op);
+
+/**
+ * nand_readid_op - Do a READID operation
+ * @chip: The NAND chip
+ * @addr: address cycle to pass after the READID command
+ * @buf: buffer used to store the ID
+ * @len: length of the buffer
+ *
+ * This function sends a READID command and reads back the ID returned by the
+ * NAND.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_readid_op(struct nand_chip *chip, u8 addr, void *buf,
+		   unsigned int len)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	unsigned int i;
+	u8 *id = buf;
+
+	if (len && !buf)
+		return -EINVAL;
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_READID, 0),
+			NAND_OP_ADDR(1, &addr, PSEC_TO_NSEC(sdr->tADL_min)),
+			NAND_OP_8BIT_DATA_IN(len, buf, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		/* Drop the DATA_IN instruction if len is set to 0. */
+		if (!len)
+			op.ninstrs--;
+
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_READID, addr, -1);
+
+	for (i = 0; i < len; i++)
+		id[i] = chip->read_byte(mtd);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_readid_op);
+
+/**
+ * nand_status_op - Do a STATUS operation
+ * @chip: The NAND chip
+ * @status: out variable to store the NAND status
+ *
+ * This function sends a STATUS command and reads back the status returned by
+ * the NAND.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_status_op(struct nand_chip *chip, u8 *status)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_STATUS,
+				    PSEC_TO_NSEC(sdr->tADL_min)),
+			NAND_OP_8BIT_DATA_IN(1, status, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		if (!status)
+			op.ninstrs--;
+
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
+	if (status)
+		*status = chip->read_byte(mtd);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_status_op);
+
+/**
+ * nand_exit_status_op - Exit a STATUS operation
+ * @chip: The NAND chip
+ *
+ * This function sends a READ0 command to cancel the effect of the STATUS
+ * command to avoid reading only the status until a new read command is sent.
+ *
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_exit_status_op(struct nand_chip *chip)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (chip->exec_op) {
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_READ0, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_exit_status_op);
+
+/**
+ * nand_erase_op - Do an erase operation
+ * @chip: The NAND chip
+ * @eraseblock: block to erase
+ *
+ * This function sends an ERASE command and waits for the NAND to be ready
+ * before returning.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_erase_op(struct nand_chip *chip, unsigned int eraseblock)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	unsigned int page = eraseblock <<
+			    (chip->phys_erase_shift - chip->page_shift);
+	int ret;
+	u8 status;
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		u8 addrs[3] = {	page, page >> 8, page >> 16 };
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_ERASE1, 0),
+			NAND_OP_ADDR(2, addrs, 0),
+			NAND_OP_CMD(NAND_CMD_ERASE2,
+				    PSEC_TO_MSEC(sdr->tWB_max)),
+			NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tBERS_max), 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		if (chip->options & NAND_ROW_ADDR_3)
+			instrs[1].ctx.addr.naddrs++;
+
+		ret = nand_exec_op(chip, &op);
+		if (ret)
+			return ret;
+
+		ret = nand_status_op(chip, &status);
+		if (ret)
+			return ret;
+	} else {
+		chip->cmdfunc(mtd, NAND_CMD_ERASE1, -1, page);
+		chip->cmdfunc(mtd, NAND_CMD_ERASE2, -1, -1);
+
+		ret = chip->waitfunc(mtd, chip);
+		if (ret < 0)
+			return ret;
+
+		status = ret;
+	}
+
+	if (status & NAND_STATUS_FAIL)
+		return -EIO;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_erase_op);
+
+/**
+ * nand_set_features_op - Do a SET FEATURES operation
+ * @chip: The NAND chip
+ * @feature: feature id
+ * @data: 4 bytes of data
+ *
+ * This function sends a SET FEATURES command and waits for the NAND to be
+ * ready before returning.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+static int nand_set_features_op(struct nand_chip *chip, u8 feature,
+				const void *data)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	const u8 *params = data;
+	int i, ret;
+	u8 status;
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_SET_FEATURES, 0),
+			NAND_OP_ADDR(1, &feature, PSEC_TO_NSEC(sdr->tADL_min)),
+			NAND_OP_8BIT_DATA_OUT(ONFI_SUBFEATURE_PARAM_LEN, data,
+					      PSEC_TO_NSEC(sdr->tWB_max)),
+			NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tFEAT_max), 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		ret = nand_exec_op(chip, &op);
+		if (ret)
+			return ret;
+
+		ret = nand_status_op(chip, &status);
+		if (ret)
+			return ret;
+	} else {
+		chip->cmdfunc(mtd, NAND_CMD_SET_FEATURES, feature, -1);
+		for (i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; ++i)
+			chip->write_byte(mtd, params[i]);
+
+		ret = chip->waitfunc(mtd, chip);
+		if (ret < 0)
+			return ret;
+
+		status = ret;
+	}
+
+	if (status & NAND_STATUS_FAIL)
+		return -EIO;
+
+	return 0;
+}
+
+/**
+ * nand_get_features_op - Do a GET FEATURES operation
+ * @chip: The NAND chip
+ * @feature: feature id
+ * @data: 4 bytes of data
+ *
+ * This function sends a GET FEATURES command and waits for the NAND to be
+ * ready before returning.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+static int nand_get_features_op(struct nand_chip *chip, u8 feature,
+				void *data)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	u8 *params = data;
+	int i;
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_GET_FEATURES, 0),
+			NAND_OP_ADDR(1, &feature, PSEC_TO_NSEC(sdr->tWB_max)),
+			NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tFEAT_max),
+					 PSEC_TO_NSEC(sdr->tRR_min)),
+			NAND_OP_8BIT_DATA_IN(ONFI_SUBFEATURE_PARAM_LEN,
+					     data, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_GET_FEATURES, feature, -1);
+	for (i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; ++i)
+		params[i] = chip->read_byte(mtd);
+
+	return 0;
+}
+
+/**
+ * nand_reset_op - Do a reset operation
+ * @chip: The NAND chip
+ *
+ * This function sends a RESET command and waits for the NAND to be ready
+ * before returning.
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_reset_op(struct nand_chip *chip)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (chip->exec_op) {
+		const struct nand_sdr_timings *sdr =
+			nand_get_sdr_timings(&chip->data_interface);
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(NAND_CMD_RESET, PSEC_TO_NSEC(sdr->tWB_max)),
+			NAND_OP_WAIT_RDY(PSEC_TO_MSEC(sdr->tRST_max), 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_reset_op);
+
+/**
+ * nand_read_data_op - Read data from the NAND
+ * @chip: The NAND chip
+ * @buf: buffer used to store the data
+ * @len: length of the buffer
+ * @force_8bit: force 8-bit bus access
+ *
+ * This function does a raw data read on the bus. Usually used after launching
+ * another NAND operation like nand_read_page_op().
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_read_data_op(struct nand_chip *chip, void *buf, unsigned int len,
+		      bool force_8bit)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (!len || !buf)
+		return -EINVAL;
+
+	if (chip->exec_op) {
+		struct nand_op_instr instrs[] = {
+			NAND_OP_DATA_IN(len, buf, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		instrs[0].ctx.data.force_8bit = force_8bit;
+
+		return nand_exec_op(chip, &op);
+	}
+
+	if (force_8bit) {
+		u8 *p = buf;
+		unsigned int i;
+
+		for (i = 0; i < len; i++)
+			p[i] = chip->read_byte(mtd);
+	} else {
+		chip->read_buf(mtd, buf, len);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_read_data_op);
+
+/**
+ * nand_write_data_op - Write data from the NAND
+ * @chip: The NAND chip
+ * @buf: buffer containing the data to send on the bus
+ * @len: length of the buffer
+ * @force_8bit: force 8-bit bus access
+ *
+ * This function does a raw data write on the bus. Usually used after launching
+ * another NAND operation like nand_write_page_begin_op().
+ * This function does not select/unselect the CS line.
+ *
+ * Returns 0 on success, a negative error code otherwise.
+ */
+int nand_write_data_op(struct nand_chip *chip, const void *buf,
+		       unsigned int len, bool force_8bit)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+
+	if (!len || !buf)
+		return -EINVAL;
+
+	if (chip->exec_op) {
+		struct nand_op_instr instrs[] = {
+			NAND_OP_DATA_OUT(len, buf, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
+
+		instrs[0].ctx.data.force_8bit = force_8bit;
+
+		return nand_exec_op(chip, &op);
+	}
+
+	if (force_8bit) {
+		const u8 *p = buf;
+		unsigned int i;
+
+		for (i = 0; i < len; i++)
+			chip->write_byte(mtd, p[i]);
+	} else {
+		chip->write_buf(mtd, buf, len);
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_write_data_op);
+
+/**
+ * struct nand_op_parser_ctx - Context used by the parser
+ * @instrs: array of all the instructions that must be addressed
+ * @ninstrs: length of the @instrs array
+ * @subop: Sub-operation to be passed to the NAND controller
+ *
+ * This structure is used by the core to split NAND operations into
+ * sub-operations that can be handled by the NAND controller.
+ */
+struct nand_op_parser_ctx {
+	const struct nand_op_instr *instrs;
+	unsigned int ninstrs;
+	struct nand_subop subop;
+};
+
+/**
+ * nand_op_parser_must_split_instr - Checks if an instruction must be split
+ * @pat: the parser pattern element that matches @instr
+ * @instr: pointer to the instruction to check
+ * @start_offset: this is an in/out parameter. If @instr has already been
+ *		  split, then @start_offset is the offset from which to start
+ *		  (either an address cycle or an offset in the data buffer).
+ *		  Conversely, if the function returns true (ie. instr must be
+ *		  split), this parameter is updated to point to the first
+ *		  data/address cycle that has not been taken care of.
+ *
+ * Some NAND controllers are limited and cannot send X address cycles with a
+ * unique operation, or cannot read/write more than Y bytes at the same time.
+ * In this case, split the instruction that does not fit in a single
+ * controller-operation into two or more chunks.
+ *
+ * Returns true if the instruction must be split, false otherwise.
+ * The @start_offset parameter is also updated to the offset at which the next
+ * bundle of instruction must start (if an address or a data instruction).
+ */
+static bool
+nand_op_parser_must_split_instr(const struct nand_op_parser_pattern_elem *pat,
+				const struct nand_op_instr *instr,
+				unsigned int *start_offset)
+{
+	switch (pat->type) {
+	case NAND_OP_ADDR_INSTR:
+		if (!pat->ctx.addr.maxcycles)
+			break;
+
+		if (instr->ctx.addr.naddrs - *start_offset >
+		    pat->ctx.addr.maxcycles) {
+			*start_offset += pat->ctx.addr.maxcycles;
+			return true;
+		}
+		break;
+
+	case NAND_OP_DATA_IN_INSTR:
+	case NAND_OP_DATA_OUT_INSTR:
+		if (!pat->ctx.data.maxlen)
+			break;
+
+		if (instr->ctx.data.len - *start_offset >
+		    pat->ctx.data.maxlen) {
+			*start_offset += pat->ctx.data.maxlen;
+			return true;
+		}
+		break;
+
+	default:
+		break;
+	}
+
+	return false;
+}
+
+/**
+ * nand_op_parser_match_pat - Checks if a pattern matches the instructions
+ *			      remaining in the parser context
+ * @pat: the pattern to test
+ * @ctx: the parser context structure to match with the pattern @pat
+ *
+ * Check if @pat matches the set or a sub-set of instructions remaining in @ctx.
+ * Returns true if this is the case, false ortherwise. When true is returned,
+ * @ctx->subop is updated with the set of instructions to be passed to the
+ * controller driver.
+ */
+static bool
+nand_op_parser_match_pat(const struct nand_op_parser_pattern *pat,
+			 struct nand_op_parser_ctx *ctx)
+{
+	unsigned int instr_offset = ctx->subop.first_instr_start_off;
+	const struct nand_op_instr *end = ctx->instrs + ctx->ninstrs;
+	const struct nand_op_instr *instr = ctx->subop.instrs;
+	unsigned int i, ninstrs;
+
+	for (i = 0, ninstrs = 0; i < pat->nelems && instr < end; i++) {
+		/*
+		 * The pattern instruction does not match the operation
+		 * instruction. If the instruction is marked optional in the
+		 * pattern definition, we skip the pattern element and continue
+		 * to the next one. If the element is mandatory, there's no
+		 * match and we can return false directly.
+		 */
+		if (instr->type != pat->elems[i].type) {
+			if (!pat->elems[i].optional)
+				return false;
+
+			continue;
+		}
+
+		/*
+		 * Now check the pattern element constraints. If the pattern is
+		 * not able to handle the whole instruction in a single step,
+		 * we have to split it.
+		 * The last_instr_end_off value comes back updated to point to
+		 * the position where we have to split the instruction (the
+		 * start of the next subop chunk).
+		 */
+		if (nand_op_parser_must_split_instr(&pat->elems[i], instr,
+						    &instr_offset)) {
+			ninstrs++;
+			i++;
+			break;
+		}
+
+		instr++;
+		ninstrs++;
+		instr_offset = 0;
+	}
+
+	/*
+	 * This can happen if all instructions of a pattern are optional.
+	 * Still, if there's not at least one instruction handled by this
+	 * pattern, this is not a match, and we should try the next one (if
+	 * any).
+	 */
+	if (!ninstrs)
+		return false;
+
+	/*
+	 * We had a match on the pattern head, but the pattern may be longer
+	 * than the instructions we're asked to execute. We need to make sure
+	 * there's no mandatory elements in the pattern tail.
+	 */
+	for (; i < pat->nelems; i++) {
+		if (!pat->elems[i].optional)
+			return false;
+	}
+
+	/*
+	 * We have a match: update the subop structure accordingly and return
+	 * true.
+	 */
+	ctx->subop.ninstrs = ninstrs;
+	ctx->subop.last_instr_end_off = instr_offset;
+
+	return true;
+}
+
+#if IS_ENABLED(CONFIG_DYNAMIC_DEBUG) || defined(DEBUG)
+static void nand_op_parser_trace(const struct nand_op_parser_ctx *ctx)
+{
+	const struct nand_op_instr *instr;
+	char *prefix = "      ";
+	unsigned int i;
+
+	pr_debug("executing subop:\n");
+
+	for (i = 0; i < ctx->ninstrs; i++) {
+		instr = &ctx->instrs[i];
+
+		if (instr == &ctx->subop.instrs[0])
+			prefix = "    ->";
+
+		switch (instr->type) {
+		case NAND_OP_CMD_INSTR:
+			pr_debug("%sCMD      [0x%02x]\n", prefix,
+				 instr->ctx.cmd.opcode);
+			break;
+		case NAND_OP_ADDR_INSTR:
+			pr_debug("%sADDR     [%d cyc: %*ph]\n", prefix,
+				 instr->ctx.addr.naddrs,
+				 instr->ctx.addr.naddrs < 64 ?
+				 instr->ctx.addr.naddrs : 64,
+				 instr->ctx.addr.addrs);
+			break;
+		case NAND_OP_DATA_IN_INSTR:
+			pr_debug("%sDATA_IN  [%d B%s]\n", prefix,
+				 instr->ctx.data.len,
+				 instr->ctx.data.force_8bit ?
+				 ", force 8-bit" : "");
+			break;
+		case NAND_OP_DATA_OUT_INSTR:
+			pr_debug("%sDATA_OUT [%d B%s]\n", prefix,
+				 instr->ctx.data.len,
+				 instr->ctx.data.force_8bit ?
+				 ", force 8-bit" : "");
+			break;
+		case NAND_OP_WAITRDY_INSTR:
+			pr_debug("%sWAITRDY  [max %d ms]\n", prefix,
+				 instr->ctx.waitrdy.timeout_ms);
+			break;
+		}
+
+		if (instr == &ctx->subop.instrs[ctx->subop.ninstrs - 1])
+			prefix = "      ";
+	}
+}
+#else
+static void nand_op_parser_trace(const struct nand_op_parser_ctx *ctx)
+{
+	/* NOP */
+}
+#endif
+
+/**
+ * nand_op_parser_exec_op - exec_op parser
+ * @chip: the NAND chip
+ * @parser: patterns description provided by the controller driver
+ * @op: the NAND operation to address
+ * @check_only: when true, the function only checks if @op can be handled but
+ *		does not execute the operation
+ *
+ * Helper function designed to ease integration of NAND controller drivers that
+ * only support a limited set of instruction sequences. The supported sequences
+ * are described in @parser, and the framework takes care of splitting @op into
+ * multiple sub-operations (if required) and pass them back to the ->exec()
+ * callback of the matching pattern if @check_only is set to false.
+ *
+ * NAND controller drivers should call this function from their own ->exec_op()
+ * implementation.
+ *
+ * Returns 0 on success, a negative error code otherwise. A failure can be
+ * caused by an unsupported operation (none of the supported patterns is able
+ * to handle the requested operation), or an error returned by one of the
+ * matching pattern->exec() hook.
+ */
+int nand_op_parser_exec_op(struct nand_chip *chip,
+			   const struct nand_op_parser *parser,
+			   const struct nand_operation *op, bool check_only)
+{
+	struct nand_op_parser_ctx ctx = {
+		.subop.instrs = op->instrs,
+		.instrs = op->instrs,
+		.ninstrs = op->ninstrs,
+	};
+	unsigned int i;
+
+	while (ctx.subop.instrs < op->instrs + op->ninstrs) {
+		int ret;
+
+		for (i = 0; i < parser->npatterns; i++) {
+			const struct nand_op_parser_pattern *pattern;
+
+			pattern = &parser->patterns[i];
+			if (!nand_op_parser_match_pat(pattern, &ctx))
+				continue;
+
+			nand_op_parser_trace(&ctx);
+
+			if (check_only)
+				break;
+
+			ret = pattern->exec(chip, &ctx.subop);
+			if (ret)
+				return ret;
+
+			break;
+		}
+
+		if (i == parser->npatterns) {
+			pr_debug("->exec_op() parser: pattern not found!\n");
+			return -ENOTSUPP;
+		}
+
+		/*
+		 * Update the context structure by pointing to the start of the
+		 * next subop.
+		 */
+		ctx.subop.instrs = ctx.subop.instrs + ctx.subop.ninstrs;
+		if (ctx.subop.last_instr_end_off)
+			ctx.subop.instrs -= 1;
+
+		ctx.subop.first_instr_start_off = ctx.subop.last_instr_end_off;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nand_op_parser_exec_op);
+
+static bool nand_instr_is_data(const struct nand_op_instr *instr)
+{
+	return instr && (instr->type == NAND_OP_DATA_IN_INSTR ||
+			 instr->type == NAND_OP_DATA_OUT_INSTR);
+}
+
+static bool nand_subop_instr_is_valid(const struct nand_subop *subop,
+				      unsigned int instr_idx)
+{
+	return subop && instr_idx < subop->ninstrs;
+}
+
+static int nand_subop_get_start_off(const struct nand_subop *subop,
+				    unsigned int instr_idx)
+{
+	if (instr_idx)
+		return 0;
+
+	return subop->first_instr_start_off;
+}
+
+/**
+ * nand_subop_get_addr_start_off - Get the start offset in an address array
+ * @subop: The entire sub-operation
+ * @instr_idx: Index of the instruction inside the sub-operation
+ *
+ * During driver development, one could be tempted to directly use the
+ * ->addr.addrs field of address instructions. This is wrong as address
+ * instructions might be split.
+ *
+ * Given an address instruction, returns the offset of the first cycle to issue.
+ */
+int nand_subop_get_addr_start_off(const struct nand_subop *subop,
+				  unsigned int instr_idx)
+{
+	if (!nand_subop_instr_is_valid(subop, instr_idx) ||
+	    subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR)
+		return -EINVAL;
+
+	return nand_subop_get_start_off(subop, instr_idx);
+}
+EXPORT_SYMBOL_GPL(nand_subop_get_addr_start_off);
+
+/**
+ * nand_subop_get_num_addr_cyc - Get the remaining address cycles to assert
+ * @subop: The entire sub-operation
+ * @instr_idx: Index of the instruction inside the sub-operation
+ *
+ * During driver development, one could be tempted to directly use the
+ * ->addr->naddrs field of a data instruction. This is wrong as instructions
+ * might be split.
+ *
+ * Given an address instruction, returns the number of address cycle to issue.
+ */
+int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
+				unsigned int instr_idx)
+{
+	int start_off, end_off;
+
+	if (!nand_subop_instr_is_valid(subop, instr_idx) ||
+	    subop->instrs[instr_idx].type != NAND_OP_ADDR_INSTR)
+		return -EINVAL;
+
+	start_off = nand_subop_get_addr_start_off(subop, instr_idx);
+
+	if (instr_idx == subop->ninstrs - 1 &&
+	    subop->last_instr_end_off)
+		end_off = subop->last_instr_end_off;
+	else
+		end_off = subop->instrs[instr_idx].ctx.addr.naddrs;
+
+	return end_off - start_off;
+}
+EXPORT_SYMBOL_GPL(nand_subop_get_num_addr_cyc);
+
+/**
+ * nand_subop_get_data_start_off - Get the start offset in a data array
+ * @subop: The entire sub-operation
+ * @instr_idx: Index of the instruction inside the sub-operation
+ *
+ * During driver development, one could be tempted to directly use the
+ * ->data->buf.{in,out} field of data instructions. This is wrong as data
+ * instructions might be split.
+ *
+ * Given a data instruction, returns the offset to start from.
+ */
+int nand_subop_get_data_start_off(const struct nand_subop *subop,
+				  unsigned int instr_idx)
+{
+	if (!nand_subop_instr_is_valid(subop, instr_idx) ||
+	    !nand_instr_is_data(&subop->instrs[instr_idx]))
+		return -EINVAL;
+
+	return nand_subop_get_start_off(subop, instr_idx);
+}
+EXPORT_SYMBOL_GPL(nand_subop_get_data_start_off);
+
+/**
+ * nand_subop_get_data_len - Get the number of bytes to retrieve
+ * @subop: The entire sub-operation
+ * @instr_idx: Index of the instruction inside the sub-operation
+ *
+ * During driver development, one could be tempted to directly use the
+ * ->data->len field of a data instruction. This is wrong as data instructions
+ * might be split.
+ *
+ * Returns the length of the chunk of data to send/receive.
+ */
+int nand_subop_get_data_len(const struct nand_subop *subop,
+			    unsigned int instr_idx)
+{
+	int start_off = 0, end_off;
+
+	if (!nand_subop_instr_is_valid(subop, instr_idx) ||
+	    !nand_instr_is_data(&subop->instrs[instr_idx]))
+		return -EINVAL;
+
+	start_off = nand_subop_get_data_start_off(subop, instr_idx);
+
+	if (instr_idx == subop->ninstrs - 1 &&
+	    subop->last_instr_end_off)
+		end_off = subop->last_instr_end_off;
+	else
+		end_off = subop->instrs[instr_idx].ctx.data.len;
+
+	return end_off - start_off;
+}
+EXPORT_SYMBOL_GPL(nand_subop_get_data_len);
+
 /**
  * nand_reset - Reset and initialize a NAND device
  * @chip: The NAND chip
  * @chipnr: Internal die id
  *
- * Returns 0 for success or negative error code otherwise
+ * Save the timings data structure, then apply SDR timings mode 0 (see
+ * nand_reset_data_interface for details), do the reset operation, and
+ * apply back the previous timings.
+ *
+ * Returns 0 on success, a negative error code otherwise.
  */
 int nand_reset(struct nand_chip *chip, int chipnr)
 {
 	struct mtd_info *mtd = nand_to_mtd(chip);
+	struct nand_data_interface saved_data_intf = chip->data_interface;
 	int ret;
 
 	ret = nand_reset_data_interface(chip, chipnr);
@@ -1233,10 +2734,13 @@ int nand_reset(struct nand_chip *chip, int chipnr)
 	 * interface settings, hence this weird ->select_chip() dance.
 	 */
 	chip->select_chip(mtd, chipnr);
-	chip->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
+	ret = nand_reset_op(chip);
 	chip->select_chip(mtd, -1);
+	if (ret)
+		return ret;
 
 	chip->select_chip(mtd, chipnr);
+	chip->data_interface = saved_data_intf;
 	ret = nand_setup_data_interface(chip, chipnr);
 	chip->select_chip(mtd, -1);
 	if (ret)
@@ -1390,9 +2894,19 @@ EXPORT_SYMBOL(nand_check_erased_ecc_chunk);
 int nand_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 		       uint8_t *buf, int oob_required, int page)
 {
-	chip->read_buf(mtd, buf, mtd->writesize);
-	if (oob_required)
-		chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
+	int ret;
+
+	ret = nand_read_page_op(chip, page, 0, buf, mtd->writesize);
+	if (ret)
+		return ret;
+
+	if (oob_required) {
+		ret = nand_read_data_op(chip, chip->oob_poi, mtd->oobsize,
+					false);
+		if (ret)
+			return ret;
+	}
+
 	return 0;
 }
 EXPORT_SYMBOL(nand_read_page_raw);
@@ -1414,29 +2928,50 @@ static int nand_read_page_raw_syndrome(struct mtd_info *mtd,
 	int eccsize = chip->ecc.size;
 	int eccbytes = chip->ecc.bytes;
 	uint8_t *oob = chip->oob_poi;
-	int steps, size;
+	int steps, size, ret;
+
+	ret = nand_read_page_op(chip, page, 0, NULL, 0);
+	if (ret)
+		return ret;
 
 	for (steps = chip->ecc.steps; steps > 0; steps--) {
-		chip->read_buf(mtd, buf, eccsize);
+		ret = nand_read_data_op(chip, buf, eccsize, false);
+		if (ret)
+			return ret;
+
 		buf += eccsize;
 
 		if (chip->ecc.prepad) {
-			chip->read_buf(mtd, oob, chip->ecc.prepad);
+			ret = nand_read_data_op(chip, oob, chip->ecc.prepad,
+						false);
+			if (ret)
+				return ret;
+
 			oob += chip->ecc.prepad;
 		}
 
-		chip->read_buf(mtd, oob, eccbytes);
+		ret = nand_read_data_op(chip, oob, eccbytes, false);
+		if (ret)
+			return ret;
+
 		oob += eccbytes;
 
 		if (chip->ecc.postpad) {
-			chip->read_buf(mtd, oob, chip->ecc.postpad);
+			ret = nand_read_data_op(chip, oob, chip->ecc.postpad,
+						false);
+			if (ret)
+				return ret;
+
 			oob += chip->ecc.postpad;
 		}
 	}
 
 	size = mtd->oobsize - (oob - chip->oob_poi);
-	if (size)
-		chip->read_buf(mtd, oob, size);
+	if (size) {
+		ret = nand_read_data_op(chip, oob, size, false);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }
@@ -1456,8 +2991,8 @@ static int nand_read_page_swecc(struct mtd_info *mtd, struct nand_chip *chip,
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
 	uint8_t *p = buf;
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
-	uint8_t *ecc_code = chip->buffers->ecccode;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
+	uint8_t *ecc_code = chip->ecc.code_buf;
 	unsigned int max_bitflips = 0;
 
 	chip->ecc.read_page_raw(mtd, chip, buf, 1, page);
@@ -1521,15 +3056,14 @@ static int nand_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 
 	data_col_addr = start_step * chip->ecc.size;
 	/* If we read not a page aligned data */
-	if (data_col_addr != 0)
-		chip->cmdfunc(mtd, NAND_CMD_RNDOUT, data_col_addr, -1);
-
 	p = bufpoi + data_col_addr;
-	chip->read_buf(mtd, p, datafrag_len);
+	ret = nand_read_page_op(chip, page, data_col_addr, p, datafrag_len);
+	if (ret)
+		return ret;
 
 	/* Calculate ECC */
 	for (i = 0; i < eccfrag_len ; i += chip->ecc.bytes, p += chip->ecc.size)
-		chip->ecc.calculate(mtd, p, &chip->buffers->ecccalc[i]);
+		chip->ecc.calculate(mtd, p, &chip->ecc.calc_buf[i]);
 
 	/*
 	 * The performance is faster if we position offsets according to
@@ -1543,8 +3077,11 @@ static int nand_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 		gaps = 1;
 
 	if (gaps) {
-		chip->cmdfunc(mtd, NAND_CMD_RNDOUT, mtd->writesize, -1);
-		chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
+		ret = nand_change_read_column_op(chip, mtd->writesize,
+						 chip->oob_poi, mtd->oobsize,
+						 false);
+		if (ret)
+			return ret;
 	} else {
 		/*
 		 * Send the command to read the particular ECC bytes take care
@@ -1558,12 +3095,15 @@ static int nand_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 		    (busw - 1))
 			aligned_len++;
 
-		chip->cmdfunc(mtd, NAND_CMD_RNDOUT,
-			      mtd->writesize + aligned_pos, -1);
-		chip->read_buf(mtd, &chip->oob_poi[aligned_pos], aligned_len);
+		ret = nand_change_read_column_op(chip,
+						 mtd->writesize + aligned_pos,
+						 &chip->oob_poi[aligned_pos],
+						 aligned_len, false);
+		if (ret)
+			return ret;
 	}
 
-	ret = mtd_ooblayout_get_eccbytes(mtd, chip->buffers->ecccode,
+	ret = mtd_ooblayout_get_eccbytes(mtd, chip->ecc.code_buf,
 					 chip->oob_poi, index, eccfrag_len);
 	if (ret)
 		return ret;
@@ -1572,13 +3112,13 @@ static int nand_read_subpage(struct mtd_info *mtd, struct nand_chip *chip,
 	for (i = 0; i < eccfrag_len ; i += chip->ecc.bytes, p += chip->ecc.size) {
 		int stat;
 
-		stat = chip->ecc.correct(mtd, p,
-			&chip->buffers->ecccode[i], &chip->buffers->ecccalc[i]);
+		stat = chip->ecc.correct(mtd, p, &chip->ecc.code_buf[i],
+					 &chip->ecc.calc_buf[i]);
 		if (stat == -EBADMSG &&
 		    (chip->ecc.options & NAND_ECC_GENERIC_ERASED_CHECK)) {
 			/* check for empty pages with bitflips */
 			stat = nand_check_erased_ecc_chunk(p, chip->ecc.size,
-						&chip->buffers->ecccode[i],
+						&chip->ecc.code_buf[i],
 						chip->ecc.bytes,
 						NULL, 0,
 						chip->ecc.strength);
@@ -1611,16 +3151,27 @@ static int nand_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
 	uint8_t *p = buf;
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
-	uint8_t *ecc_code = chip->buffers->ecccode;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
+	uint8_t *ecc_code = chip->ecc.code_buf;
 	unsigned int max_bitflips = 0;
 
+	ret = nand_read_page_op(chip, page, 0, NULL, 0);
+	if (ret)
+		return ret;
+
 	for (i = 0; eccsteps; eccsteps--, i += eccbytes, p += eccsize) {
 		chip->ecc.hwctl(mtd, NAND_ECC_READ);
-		chip->read_buf(mtd, p, eccsize);
+
+		ret = nand_read_data_op(chip, p, eccsize, false);
+		if (ret)
+			return ret;
+
 		chip->ecc.calculate(mtd, p, &ecc_calc[i]);
 	}
-	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
+
+	ret = nand_read_data_op(chip, chip->oob_poi, mtd->oobsize, false);
+	if (ret)
+		return ret;
 
 	ret = mtd_ooblayout_get_eccbytes(mtd, ecc_code, chip->oob_poi, 0,
 					 chip->ecc.total);
@@ -1674,14 +3225,18 @@ static int nand_read_page_hwecc_oob_first(struct mtd_info *mtd,
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
 	uint8_t *p = buf;
-	uint8_t *ecc_code = chip->buffers->ecccode;
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
+	uint8_t *ecc_code = chip->ecc.code_buf;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
 	unsigned int max_bitflips = 0;
 
 	/* Read the OOB area first */
-	chip->cmdfunc(mtd, NAND_CMD_READOOB, 0, page);
-	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
+	ret = nand_read_oob_op(chip, page, 0, chip->oob_poi, mtd->oobsize);
+	if (ret)
+		return ret;
+
+	ret = nand_read_page_op(chip, page, 0, NULL, 0);
+	if (ret)
+		return ret;
 
 	ret = mtd_ooblayout_get_eccbytes(mtd, ecc_code, chip->oob_poi, 0,
 					 chip->ecc.total);
@@ -1692,7 +3247,11 @@ static int nand_read_page_hwecc_oob_first(struct mtd_info *mtd,
 		int stat;
 
 		chip->ecc.hwctl(mtd, NAND_ECC_READ);
-		chip->read_buf(mtd, p, eccsize);
+
+		ret = nand_read_data_op(chip, p, eccsize, false);
+		if (ret)
+			return ret;
+
 		chip->ecc.calculate(mtd, p, &ecc_calc[i]);
 
 		stat = chip->ecc.correct(mtd, p, &ecc_code[i], NULL);
@@ -1729,7 +3288,7 @@ static int nand_read_page_hwecc_oob_first(struct mtd_info *mtd,
 static int nand_read_page_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
 				   uint8_t *buf, int oob_required, int page)
 {
-	int i, eccsize = chip->ecc.size;
+	int ret, i, eccsize = chip->ecc.size;
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
 	int eccpadbytes = eccbytes + chip->ecc.prepad + chip->ecc.postpad;
@@ -1737,25 +3296,44 @@ static int nand_read_page_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
 	uint8_t *oob = chip->oob_poi;
 	unsigned int max_bitflips = 0;
 
+	ret = nand_read_page_op(chip, page, 0, NULL, 0);
+	if (ret)
+		return ret;
+
 	for (i = 0; eccsteps; eccsteps--, i += eccbytes, p += eccsize) {
 		int stat;
 
 		chip->ecc.hwctl(mtd, NAND_ECC_READ);
-		chip->read_buf(mtd, p, eccsize);
+
+		ret = nand_read_data_op(chip, p, eccsize, false);
+		if (ret)
+			return ret;
 
 		if (chip->ecc.prepad) {
-			chip->read_buf(mtd, oob, chip->ecc.prepad);
+			ret = nand_read_data_op(chip, oob, chip->ecc.prepad,
+						false);
+			if (ret)
+				return ret;
+
 			oob += chip->ecc.prepad;
 		}
 
 		chip->ecc.hwctl(mtd, NAND_ECC_READSYN);
-		chip->read_buf(mtd, oob, eccbytes);
+
+		ret = nand_read_data_op(chip, oob, eccbytes, false);
+		if (ret)
+			return ret;
+
 		stat = chip->ecc.correct(mtd, p, oob, NULL);
 
 		oob += eccbytes;
 
 		if (chip->ecc.postpad) {
-			chip->read_buf(mtd, oob, chip->ecc.postpad);
+			ret = nand_read_data_op(chip, oob, chip->ecc.postpad,
+						false);
+			if (ret)
+				return ret;
+
 			oob += chip->ecc.postpad;
 		}
 
@@ -1779,8 +3357,11 @@ static int nand_read_page_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
 
 	/* Calculate remaining oob bytes */
 	i = mtd->oobsize - (oob - chip->oob_poi);
-	if (i)
-		chip->read_buf(mtd, oob, i);
+	if (i) {
+		ret = nand_read_data_op(chip, oob, i, false);
+		if (ret)
+			return ret;
+	}
 
 	return max_bitflips;
 }
@@ -1894,16 +3475,13 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 
 		/* Is the current page in the buffer? */
 		if (realpage != chip->pagebuf || oob) {
-			bufpoi = use_bufpoi ? chip->buffers->databuf : buf;
+			bufpoi = use_bufpoi ? chip->data_buf : buf;
 
 			if (use_bufpoi && aligned)
 				pr_debug("%s: using read bounce buffer for buf@%p\n",
 						 __func__, buf);
 
 read_retry:
-			if (nand_standard_page_accessors(&chip->ecc))
-				chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
-
 			/*
 			 * Now read the page into the buffer.  Absent an error,
 			 * the read methods return max bitflips per ecc step.
@@ -1938,7 +3516,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 					/* Invalidate page cache */
 					chip->pagebuf = -1;
 				}
-				memcpy(buf, chip->buffers->databuf + col, bytes);
+				memcpy(buf, chip->data_buf + col, bytes);
 			}
 
 			if (unlikely(oob)) {
@@ -1979,7 +3557,7 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 			buf += bytes;
 			max_bitflips = max_t(unsigned int, max_bitflips, ret);
 		} else {
-			memcpy(buf, chip->buffers->databuf + col, bytes);
+			memcpy(buf, chip->data_buf + col, bytes);
 			buf += bytes;
 			max_bitflips = max_t(unsigned int, max_bitflips,
 					     chip->pagebuf_bitflips);
@@ -2027,33 +3605,6 @@ static int nand_do_read_ops(struct mtd_info *mtd, loff_t from,
 }
 
 /**
- * nand_read - [MTD Interface] MTD compatibility function for nand_do_read_ecc
- * @mtd: MTD device structure
- * @from: offset to read from
- * @len: number of bytes to read
- * @retlen: pointer to variable to store the number of read bytes
- * @buf: the databuffer to put data
- *
- * Get hold of the chip and call nand_do_read.
- */
-static int nand_read(struct mtd_info *mtd, loff_t from, size_t len,
-		     size_t *retlen, uint8_t *buf)
-{
-	struct mtd_oob_ops ops;
-	int ret;
-
-	nand_get_device(mtd, FL_READING);
-	memset(&ops, 0, sizeof(ops));
-	ops.len = len;
-	ops.datbuf = buf;
-	ops.mode = MTD_OPS_PLACE_OOB;
-	ret = nand_do_read_ops(mtd, from, &ops);
-	*retlen = ops.retlen;
-	nand_release_device(mtd);
-	return ret;
-}
-
-/**
  * nand_read_oob_std - [REPLACEABLE] the most common OOB data read function
  * @mtd: mtd info structure
  * @chip: nand chip info structure
@@ -2061,9 +3612,7 @@ static int nand_read(struct mtd_info *mtd, loff_t from, size_t len,
  */
 int nand_read_oob_std(struct mtd_info *mtd, struct nand_chip *chip, int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READOOB, 0, page);
-	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
-	return 0;
+	return nand_read_oob_op(chip, page, 0, chip->oob_poi, mtd->oobsize);
 }
 EXPORT_SYMBOL(nand_read_oob_std);
 
@@ -2081,25 +3630,43 @@ int nand_read_oob_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
 	int chunk = chip->ecc.bytes + chip->ecc.prepad + chip->ecc.postpad;
 	int eccsize = chip->ecc.size;
 	uint8_t *bufpoi = chip->oob_poi;
-	int i, toread, sndrnd = 0, pos;
+	int i, toread, sndrnd = 0, pos, ret;
 
-	chip->cmdfunc(mtd, NAND_CMD_READ0, chip->ecc.size, page);
+	ret = nand_read_page_op(chip, page, chip->ecc.size, NULL, 0);
+	if (ret)
+		return ret;
+
 	for (i = 0; i < chip->ecc.steps; i++) {
 		if (sndrnd) {
+			int ret;
+
 			pos = eccsize + i * (eccsize + chunk);
 			if (mtd->writesize > 512)
-				chip->cmdfunc(mtd, NAND_CMD_RNDOUT, pos, -1);
+				ret = nand_change_read_column_op(chip, pos,
+								 NULL, 0,
+								 false);
 			else
-				chip->cmdfunc(mtd, NAND_CMD_READ0, pos, page);
+				ret = nand_read_page_op(chip, page, pos, NULL,
+							0);
+
+			if (ret)
+				return ret;
 		} else
 			sndrnd = 1;
 		toread = min_t(int, length, chunk);
-		chip->read_buf(mtd, bufpoi, toread);
+
+		ret = nand_read_data_op(chip, bufpoi, toread, false);
+		if (ret)
+			return ret;
+
 		bufpoi += toread;
 		length -= toread;
 	}
-	if (length > 0)
-		chip->read_buf(mtd, bufpoi, length);
+	if (length > 0) {
+		ret = nand_read_data_op(chip, bufpoi, length, false);
+		if (ret)
+			return ret;
+	}
 
 	return 0;
 }
@@ -2113,18 +3680,8 @@ EXPORT_SYMBOL(nand_read_oob_syndrome);
  */
 int nand_write_oob_std(struct mtd_info *mtd, struct nand_chip *chip, int page)
 {
-	int status = 0;
-	const uint8_t *buf = chip->oob_poi;
-	int length = mtd->oobsize;
-
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, mtd->writesize, page);
-	chip->write_buf(mtd, buf, length);
-	/* Send command to program the OOB data */
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_op(chip, page, mtd->writesize, chip->oob_poi,
+				 mtd->oobsize);
 }
 EXPORT_SYMBOL(nand_write_oob_std);
 
@@ -2140,7 +3697,7 @@ int nand_write_oob_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
 {
 	int chunk = chip->ecc.bytes + chip->ecc.prepad + chip->ecc.postpad;
 	int eccsize = chip->ecc.size, length = mtd->oobsize;
-	int i, len, pos, status = 0, sndcmd = 0, steps = chip->ecc.steps;
+	int ret, i, len, pos, sndcmd = 0, steps = chip->ecc.steps;
 	const uint8_t *bufpoi = chip->oob_poi;
 
 	/*
@@ -2154,7 +3711,10 @@ int nand_write_oob_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
 	} else
 		pos = eccsize;
 
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, pos, page);
+	ret = nand_prog_page_begin_op(chip, page, pos, NULL, 0);
+	if (ret)
+		return ret;
+
 	for (i = 0; i < steps; i++) {
 		if (sndcmd) {
 			if (mtd->writesize <= 512) {
@@ -2163,28 +3723,40 @@ int nand_write_oob_syndrome(struct mtd_info *mtd, struct nand_chip *chip,
 				len = eccsize;
 				while (len > 0) {
 					int num = min_t(int, len, 4);
-					chip->write_buf(mtd, (uint8_t *)&fill,
-							num);
+
+					ret = nand_write_data_op(chip, &fill,
+								 num, false);
+					if (ret)
+						return ret;
+
 					len -= num;
 				}
 			} else {
 				pos = eccsize + i * (eccsize + chunk);
-				chip->cmdfunc(mtd, NAND_CMD_RNDIN, pos, -1);
+				ret = nand_change_write_column_op(chip, pos,
+								  NULL, 0,
+								  false);
+				if (ret)
+					return ret;
 			}
 		} else
 			sndcmd = 1;
 		len = min_t(int, length, chunk);
-		chip->write_buf(mtd, bufpoi, len);
+
+		ret = nand_write_data_op(chip, bufpoi, len, false);
+		if (ret)
+			return ret;
+
 		bufpoi += len;
 		length -= len;
 	}
-	if (length > 0)
-		chip->write_buf(mtd, bufpoi, length);
+	if (length > 0) {
+		ret = nand_write_data_op(chip, bufpoi, length, false);
+		if (ret)
+			return ret;
+	}
 
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_end_op(chip);
 }
 EXPORT_SYMBOL(nand_write_oob_syndrome);
 
@@ -2199,6 +3771,7 @@ EXPORT_SYMBOL(nand_write_oob_syndrome);
 static int nand_do_read_oob(struct mtd_info *mtd, loff_t from,
 			    struct mtd_oob_ops *ops)
 {
+	unsigned int max_bitflips = 0;
 	int page, realpage, chipnr;
 	struct nand_chip *chip = mtd_to_nand(mtd);
 	struct mtd_ecc_stats stats;
@@ -2214,21 +3787,6 @@ static int nand_do_read_oob(struct mtd_info *mtd, loff_t from,
 
 	len = mtd_oobavail(mtd, ops);
 
-	if (unlikely(ops->ooboffs >= len)) {
-		pr_debug("%s: attempt to start read outside oob\n",
-				__func__);
-		return -EINVAL;
-	}
-
-	/* Do not allow reads past end of device */
-	if (unlikely(from >= mtd->size ||
-		     ops->ooboffs + readlen > ((mtd->size >> chip->page_shift) -
-					(from >> chip->page_shift)) * len)) {
-		pr_debug("%s: attempt to read beyond end of device\n",
-				__func__);
-		return -EINVAL;
-	}
-
 	chipnr = (int)(from >> chip->chip_shift);
 	chip->select_chip(mtd, chipnr);
 
@@ -2256,6 +3814,8 @@ static int nand_do_read_oob(struct mtd_info *mtd, loff_t from,
 				nand_wait_ready(mtd);
 		}
 
+		max_bitflips = max_t(unsigned int, max_bitflips, ret);
+
 		readlen -= len;
 		if (!readlen)
 			break;
@@ -2281,7 +3841,7 @@ static int nand_do_read_oob(struct mtd_info *mtd, loff_t from,
 	if (mtd->ecc_stats.failed - stats.failed)
 		return -EBADMSG;
 
-	return  mtd->ecc_stats.corrected - stats.corrected ? -EUCLEAN : 0;
+	return max_bitflips;
 }
 
 /**
@@ -2299,13 +3859,6 @@ static int nand_read_oob(struct mtd_info *mtd, loff_t from,
 
 	ops->retlen = 0;
 
-	/* Do not allow reads past end of device */
-	if (ops->datbuf && (from + ops->len) > mtd->size) {
-		pr_debug("%s: attempt to read beyond end of device\n",
-				__func__);
-		return -EINVAL;
-	}
-
 	if (ops->mode != MTD_OPS_PLACE_OOB &&
 	    ops->mode != MTD_OPS_AUTO_OOB &&
 	    ops->mode != MTD_OPS_RAW)
@@ -2336,11 +3889,20 @@ static int nand_read_oob(struct mtd_info *mtd, loff_t from,
 int nand_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 			const uint8_t *buf, int oob_required, int page)
 {
-	chip->write_buf(mtd, buf, mtd->writesize);
-	if (oob_required)
-		chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
+	int ret;
 
-	return 0;
+	ret = nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
+	if (ret)
+		return ret;
+
+	if (oob_required) {
+		ret = nand_write_data_op(chip, chip->oob_poi, mtd->oobsize,
+					 false);
+		if (ret)
+			return ret;
+	}
+
+	return nand_prog_page_end_op(chip);
 }
 EXPORT_SYMBOL(nand_write_page_raw);
 
@@ -2362,31 +3924,52 @@ static int nand_write_page_raw_syndrome(struct mtd_info *mtd,
 	int eccsize = chip->ecc.size;
 	int eccbytes = chip->ecc.bytes;
 	uint8_t *oob = chip->oob_poi;
-	int steps, size;
+	int steps, size, ret;
+
+	ret = nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+	if (ret)
+		return ret;
 
 	for (steps = chip->ecc.steps; steps > 0; steps--) {
-		chip->write_buf(mtd, buf, eccsize);
+		ret = nand_write_data_op(chip, buf, eccsize, false);
+		if (ret)
+			return ret;
+
 		buf += eccsize;
 
 		if (chip->ecc.prepad) {
-			chip->write_buf(mtd, oob, chip->ecc.prepad);
+			ret = nand_write_data_op(chip, oob, chip->ecc.prepad,
+						 false);
+			if (ret)
+				return ret;
+
 			oob += chip->ecc.prepad;
 		}
 
-		chip->write_buf(mtd, oob, eccbytes);
+		ret = nand_write_data_op(chip, oob, eccbytes, false);
+		if (ret)
+			return ret;
+
 		oob += eccbytes;
 
 		if (chip->ecc.postpad) {
-			chip->write_buf(mtd, oob, chip->ecc.postpad);
+			ret = nand_write_data_op(chip, oob, chip->ecc.postpad,
+						 false);
+			if (ret)
+				return ret;
+
 			oob += chip->ecc.postpad;
 		}
 	}
 
 	size = mtd->oobsize - (oob - chip->oob_poi);
-	if (size)
-		chip->write_buf(mtd, oob, size);
+	if (size) {
+		ret = nand_write_data_op(chip, oob, size, false);
+		if (ret)
+			return ret;
+	}
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 /**
  * nand_write_page_swecc - [REPLACEABLE] software ECC based page write function
@@ -2403,7 +3986,7 @@ static int nand_write_page_swecc(struct mtd_info *mtd, struct nand_chip *chip,
 	int i, eccsize = chip->ecc.size, ret;
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
 	const uint8_t *p = buf;
 
 	/* Software ECC calculation */
@@ -2433,12 +4016,20 @@ static int nand_write_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 	int i, eccsize = chip->ecc.size, ret;
 	int eccbytes = chip->ecc.bytes;
 	int eccsteps = chip->ecc.steps;
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
 	const uint8_t *p = buf;
 
+	ret = nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+	if (ret)
+		return ret;
+
 	for (i = 0; eccsteps; eccsteps--, i += eccbytes, p += eccsize) {
 		chip->ecc.hwctl(mtd, NAND_ECC_WRITE);
-		chip->write_buf(mtd, p, eccsize);
+
+		ret = nand_write_data_op(chip, p, eccsize, false);
+		if (ret)
+			return ret;
+
 		chip->ecc.calculate(mtd, p, &ecc_calc[i]);
 	}
 
@@ -2447,9 +4038,11 @@ static int nand_write_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 	if (ret)
 		return ret;
 
-	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
+	ret = nand_write_data_op(chip, chip->oob_poi, mtd->oobsize, false);
+	if (ret)
+		return ret;
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 
@@ -2469,7 +4062,7 @@ static int nand_write_subpage_hwecc(struct mtd_info *mtd,
 				int oob_required, int page)
 {
 	uint8_t *oob_buf  = chip->oob_poi;
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
 	int ecc_size      = chip->ecc.size;
 	int ecc_bytes     = chip->ecc.bytes;
 	int ecc_steps     = chip->ecc.steps;
@@ -2478,12 +4071,18 @@ static int nand_write_subpage_hwecc(struct mtd_info *mtd,
 	int oob_bytes       = mtd->oobsize / ecc_steps;
 	int step, ret;
 
+	ret = nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+	if (ret)
+		return ret;
+
 	for (step = 0; step < ecc_steps; step++) {
 		/* configure controller for WRITE access */
 		chip->ecc.hwctl(mtd, NAND_ECC_WRITE);
 
 		/* write data (untouched subpages already masked by 0xFF) */
-		chip->write_buf(mtd, buf, ecc_size);
+		ret = nand_write_data_op(chip, buf, ecc_size, false);
+		if (ret)
+			return ret;
 
 		/* mask ECC of un-touched subpages by padding 0xFF */
 		if ((step < start_step) || (step > end_step))
@@ -2503,16 +4102,18 @@ static int nand_write_subpage_hwecc(struct mtd_info *mtd,
 
 	/* copy calculated ECC for whole page to chip->buffer->oob */
 	/* this include masked-value(0xFF) for unwritten subpages */
-	ecc_calc = chip->buffers->ecccalc;
+	ecc_calc = chip->ecc.calc_buf;
 	ret = mtd_ooblayout_set_eccbytes(mtd, ecc_calc, chip->oob_poi, 0,
 					 chip->ecc.total);
 	if (ret)
 		return ret;
 
 	/* write OOB buffer to NAND device */
-	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
+	ret = nand_write_data_op(chip, chip->oob_poi, mtd->oobsize, false);
+	if (ret)
+		return ret;
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 
@@ -2537,33 +4138,55 @@ static int nand_write_page_syndrome(struct mtd_info *mtd,
 	int eccsteps = chip->ecc.steps;
 	const uint8_t *p = buf;
 	uint8_t *oob = chip->oob_poi;
+	int ret;
+
+	ret = nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+	if (ret)
+		return ret;
 
 	for (i = 0; eccsteps; eccsteps--, i += eccbytes, p += eccsize) {
-
 		chip->ecc.hwctl(mtd, NAND_ECC_WRITE);
-		chip->write_buf(mtd, p, eccsize);
+
+		ret = nand_write_data_op(chip, p, eccsize, false);
+		if (ret)
+			return ret;
 
 		if (chip->ecc.prepad) {
-			chip->write_buf(mtd, oob, chip->ecc.prepad);
+			ret = nand_write_data_op(chip, oob, chip->ecc.prepad,
+						 false);
+			if (ret)
+				return ret;
+
 			oob += chip->ecc.prepad;
 		}
 
 		chip->ecc.calculate(mtd, p, oob);
-		chip->write_buf(mtd, oob, eccbytes);
+
+		ret = nand_write_data_op(chip, oob, eccbytes, false);
+		if (ret)
+			return ret;
+
 		oob += eccbytes;
 
 		if (chip->ecc.postpad) {
-			chip->write_buf(mtd, oob, chip->ecc.postpad);
+			ret = nand_write_data_op(chip, oob, chip->ecc.postpad,
+						 false);
+			if (ret)
+				return ret;
+
 			oob += chip->ecc.postpad;
 		}
 	}
 
 	/* Calculate remaining oob bytes */
 	i = mtd->oobsize - (oob - chip->oob_poi);
-	if (i)
-		chip->write_buf(mtd, oob, i);
+	if (i) {
+		ret = nand_write_data_op(chip, oob, i, false);
+		if (ret)
+			return ret;
+	}
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 /**
@@ -2589,9 +4212,6 @@ static int nand_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	else
 		subpage = 0;
 
-	if (nand_standard_page_accessors(&chip->ecc))
-		chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
-
 	if (unlikely(raw))
 		status = chip->ecc.write_page_raw(mtd, chip, buf,
 						  oob_required, page);
@@ -2605,14 +4225,6 @@ static int nand_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	if (status < 0)
 		return status;
 
-	if (nand_standard_page_accessors(&chip->ecc)) {
-		chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-		status = chip->waitfunc(mtd, chip);
-		if (status & NAND_STATUS_FAIL)
-			return -EIO;
-	}
-
 	return 0;
 }
 
@@ -2737,9 +4349,9 @@ static int nand_do_write_ops(struct mtd_info *mtd, loff_t to,
 			if (part_pagewr)
 				bytes = min_t(int, bytes - column, writelen);
 			chip->pagebuf = -1;
-			memset(chip->buffers->databuf, 0xff, mtd->writesize);
-			memcpy(&chip->buffers->databuf[column], buf, bytes);
-			wbuf = chip->buffers->databuf;
+			memset(chip->data_buf, 0xff, mtd->writesize);
+			memcpy(&chip->data_buf[column], buf, bytes);
+			wbuf = chip->data_buf;
 		}
 
 		if (unlikely(oob)) {
@@ -2822,33 +4434,6 @@ static int panic_nand_write(struct mtd_info *mtd, loff_t to, size_t len,
 }
 
 /**
- * nand_write - [MTD Interface] NAND write with ECC
- * @mtd: MTD device structure
- * @to: offset to write to
- * @len: number of bytes to write
- * @retlen: pointer to variable to store the number of written bytes
- * @buf: the data to write
- *
- * NAND write with ECC.
- */
-static int nand_write(struct mtd_info *mtd, loff_t to, size_t len,
-			  size_t *retlen, const uint8_t *buf)
-{
-	struct mtd_oob_ops ops;
-	int ret;
-
-	nand_get_device(mtd, FL_WRITING);
-	memset(&ops, 0, sizeof(ops));
-	ops.len = len;
-	ops.datbuf = (uint8_t *)buf;
-	ops.mode = MTD_OPS_PLACE_OOB;
-	ret = nand_do_write_ops(mtd, to, &ops);
-	*retlen = ops.retlen;
-	nand_release_device(mtd);
-	return ret;
-}
-
-/**
  * nand_do_write_oob - [MTD Interface] NAND write out-of-band
  * @mtd: MTD device structure
  * @to: offset to write to
@@ -2874,22 +4459,6 @@ static int nand_do_write_oob(struct mtd_info *mtd, loff_t to,
 		return -EINVAL;
 	}
 
-	if (unlikely(ops->ooboffs >= len)) {
-		pr_debug("%s: attempt to start write outside oob\n",
-				__func__);
-		return -EINVAL;
-	}
-
-	/* Do not allow write past end of device */
-	if (unlikely(to >= mtd->size ||
-		     ops->ooboffs + ops->ooblen >
-			((mtd->size >> chip->page_shift) -
-			 (to >> chip->page_shift)) * len)) {
-		pr_debug("%s: attempt to write beyond end of device\n",
-				__func__);
-		return -EINVAL;
-	}
-
 	chipnr = (int)(to >> chip->chip_shift);
 
 	/*
@@ -2945,13 +4514,6 @@ static int nand_write_oob(struct mtd_info *mtd, loff_t to,
 
 	ops->retlen = 0;
 
-	/* Do not allow writes past end of device */
-	if (ops->datbuf && (to + ops->len) > mtd->size) {
-		pr_debug("%s: attempt to write beyond end of device\n",
-				__func__);
-		return -EINVAL;
-	}
-
 	nand_get_device(mtd, FL_WRITING);
 
 	switch (ops->mode) {
@@ -2984,11 +4546,12 @@ static int nand_write_oob(struct mtd_info *mtd, loff_t to,
 static int single_erase(struct mtd_info *mtd, int page)
 {
 	struct nand_chip *chip = mtd_to_nand(mtd);
-	/* Send commands to erase a block */
-	chip->cmdfunc(mtd, NAND_CMD_ERASE1, -1, page);
-	chip->cmdfunc(mtd, NAND_CMD_ERASE2, -1, -1);
+	unsigned int eraseblock;
 
-	return chip->waitfunc(mtd, chip);
+	/* Send commands to erase a block */
+	eraseblock = page >> (chip->phys_erase_shift - chip->page_shift);
+
+	return nand_erase_op(chip, eraseblock);
 }
 
 /**
@@ -3072,7 +4635,7 @@ int nand_erase_nand(struct mtd_info *mtd, struct erase_info *instr,
 		status = chip->erase(mtd, page & chip->pagemask);
 
 		/* See if block erase succeeded */
-		if (status & NAND_STATUS_FAIL) {
+		if (status) {
 			pr_debug("%s: failed erase, page 0x%08x\n",
 					__func__, page);
 			instr->state = MTD_ERASE_FAILED;
@@ -3215,22 +4778,12 @@ static int nand_max_bad_blocks(struct mtd_info *mtd, loff_t ofs, size_t len)
 static int nand_onfi_set_features(struct mtd_info *mtd, struct nand_chip *chip,
 			int addr, uint8_t *subfeature_param)
 {
-	int status;
-	int i;
-
 	if (!chip->onfi_version ||
 	    !(le16_to_cpu(chip->onfi_params.opt_cmd)
 	      & ONFI_OPT_CMD_SET_GET_FEATURES))
 		return -EINVAL;
 
-	chip->cmdfunc(mtd, NAND_CMD_SET_FEATURES, addr, -1);
-	for (i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; ++i)
-		chip->write_byte(mtd, subfeature_param[i]);
-
-	status = chip->waitfunc(mtd, chip);
-	if (status & NAND_STATUS_FAIL)
-		return -EIO;
-	return 0;
+	return nand_set_features_op(chip, addr, subfeature_param);
 }
 
 /**
@@ -3243,17 +4796,12 @@ static int nand_onfi_set_features(struct mtd_info *mtd, struct nand_chip *chip,
 static int nand_onfi_get_features(struct mtd_info *mtd, struct nand_chip *chip,
 			int addr, uint8_t *subfeature_param)
 {
-	int i;
-
 	if (!chip->onfi_version ||
 	    !(le16_to_cpu(chip->onfi_params.opt_cmd)
 	      & ONFI_OPT_CMD_SET_GET_FEATURES))
 		return -EINVAL;
 
-	chip->cmdfunc(mtd, NAND_CMD_GET_FEATURES, addr, -1);
-	for (i = 0; i < ONFI_SUBFEATURE_PARAM_LEN; ++i)
-		*subfeature_param++ = chip->read_byte(mtd);
-	return 0;
+	return nand_get_features_op(chip, addr, subfeature_param);
 }
 
 /**
@@ -3319,7 +4867,7 @@ static void nand_set_defaults(struct nand_chip *chip)
 		chip->chip_delay = 20;
 
 	/* check, if a user supplied command function given */
-	if (chip->cmdfunc == NULL)
+	if (!chip->cmdfunc && !chip->exec_op)
 		chip->cmdfunc = nand_command;
 
 	/* check, if a user supplied wait function given */
@@ -3396,12 +4944,11 @@ static u16 onfi_crc16(u16 crc, u8 const *p, size_t len)
 static int nand_flash_detect_ext_param_page(struct nand_chip *chip,
 					    struct nand_onfi_params *p)
 {
-	struct mtd_info *mtd = nand_to_mtd(chip);
 	struct onfi_ext_param_page *ep;
 	struct onfi_ext_section *s;
 	struct onfi_ext_ecc_info *ecc;
 	uint8_t *cursor;
-	int ret = -EINVAL;
+	int ret;
 	int len;
 	int i;
 
@@ -3411,14 +4958,18 @@ static int nand_flash_detect_ext_param_page(struct nand_chip *chip,
 		return -ENOMEM;
 
 	/* Send our own NAND_CMD_PARAM. */
-	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
+	ret = nand_read_param_page_op(chip, 0, NULL, 0);
+	if (ret)
+		goto ext_out;
 
 	/* Use the Change Read Column command to skip the ONFI param pages. */
-	chip->cmdfunc(mtd, NAND_CMD_RNDOUT,
-			sizeof(*p) * p->num_of_param_pages , -1);
+	ret = nand_change_read_column_op(chip,
+					 sizeof(*p) * p->num_of_param_pages,
+					 ep, len, true);
+	if (ret)
+		goto ext_out;
 
-	/* Read out the Extended Parameter Page. */
-	chip->read_buf(mtd, (uint8_t *)ep, len);
+	ret = -EINVAL;
 	if ((onfi_crc16(ONFI_CRC_BASE, ((uint8_t *)ep) + 2, len - 2)
 		!= le16_to_cpu(ep->crc))) {
 		pr_debug("fail in the CRC.\n");
@@ -3471,19 +5022,23 @@ static int nand_flash_detect_onfi(struct nand_chip *chip)
 {
 	struct mtd_info *mtd = nand_to_mtd(chip);
 	struct nand_onfi_params *p = &chip->onfi_params;
-	int i, j;
-	int val;
+	char id[4];
+	int i, ret, val;
 
 	/* Try ONFI for unknown chip or LP */
-	chip->cmdfunc(mtd, NAND_CMD_READID, 0x20, -1);
-	if (chip->read_byte(mtd) != 'O' || chip->read_byte(mtd) != 'N' ||
-		chip->read_byte(mtd) != 'F' || chip->read_byte(mtd) != 'I')
+	ret = nand_readid_op(chip, 0x20, id, sizeof(id));
+	if (ret || strncmp(id, "ONFI", 4))
 		return 0;
 
-	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0, -1);
+	ret = nand_read_param_page_op(chip, 0, NULL, 0);
+	if (ret)
+		return 0;
+
 	for (i = 0; i < 3; i++) {
-		for (j = 0; j < sizeof(*p); j++)
-			((uint8_t *)p)[j] = chip->read_byte(mtd);
+		ret = nand_read_data_op(chip, p, sizeof(*p), true);
+		if (ret)
+			return 0;
+
 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 254) ==
 				le16_to_cpu(p->crc)) {
 			break;
@@ -3574,20 +5129,22 @@ static int nand_flash_detect_jedec(struct nand_chip *chip)
 	struct mtd_info *mtd = nand_to_mtd(chip);
 	struct nand_jedec_params *p = &chip->jedec_params;
 	struct jedec_ecc_info *ecc;
-	int val;
-	int i, j;
+	char id[5];
+	int i, val, ret;
 
 	/* Try JEDEC for unknown chip or LP */
-	chip->cmdfunc(mtd, NAND_CMD_READID, 0x40, -1);
-	if (chip->read_byte(mtd) != 'J' || chip->read_byte(mtd) != 'E' ||
-		chip->read_byte(mtd) != 'D' || chip->read_byte(mtd) != 'E' ||
-		chip->read_byte(mtd) != 'C')
+	ret = nand_readid_op(chip, 0x40, id, sizeof(id));
+	if (ret || strncmp(id, "JEDEC", sizeof(id)))
 		return 0;
 
-	chip->cmdfunc(mtd, NAND_CMD_PARAM, 0x40, -1);
+	ret = nand_read_param_page_op(chip, 0x40, NULL, 0);
+	if (ret)
+		return 0;
+
 	for (i = 0; i < 3; i++) {
-		for (j = 0; j < sizeof(*p); j++)
-			((uint8_t *)p)[j] = chip->read_byte(mtd);
+		ret = nand_read_data_op(chip, p, sizeof(*p), true);
+		if (ret)
+			return 0;
 
 		if (onfi_crc16(ONFI_CRC_BASE, (uint8_t *)p, 510) ==
 				le16_to_cpu(p->crc))
@@ -3866,8 +5423,7 @@ static int nand_detect(struct nand_chip *chip, struct nand_flash_dev *type)
 {
 	const struct nand_manufacturer *manufacturer;
 	struct mtd_info *mtd = nand_to_mtd(chip);
-	int busw;
-	int i;
+	int busw, ret;
 	u8 *id_data = chip->id.data;
 	u8 maf_id, dev_id;
 
@@ -3875,17 +5431,21 @@ static int nand_detect(struct nand_chip *chip, struct nand_flash_dev *type)
 	 * Reset the chip, required by some chips (e.g. Micron MT29FxGxxxxx)
 	 * after power-up.
 	 */
-	nand_reset(chip, 0);
+	ret = nand_reset(chip, 0);
+	if (ret)
+		return ret;
 
 	/* Select the device */
 	chip->select_chip(mtd, 0);
 
 	/* Send the command for reading device ID */
-	chip->cmdfunc(mtd, NAND_CMD_READID, 0x00, -1);
+	ret = nand_readid_op(chip, 0, id_data, 2);
+	if (ret)
+		return ret;
 
 	/* Read manufacturer and device IDs */
-	maf_id = chip->read_byte(mtd);
-	dev_id = chip->read_byte(mtd);
+	maf_id = id_data[0];
+	dev_id = id_data[1];
 
 	/*
 	 * Try again to make sure, as some systems the bus-hold or other
@@ -3894,11 +5454,10 @@ static int nand_detect(struct nand_chip *chip, struct nand_flash_dev *type)
 	 * not match, ignore the device completely.
 	 */
 
-	chip->cmdfunc(mtd, NAND_CMD_READID, 0x00, -1);
-
 	/* Read entire ID string */
-	for (i = 0; i < ARRAY_SIZE(chip->id.data); i++)
-		id_data[i] = chip->read_byte(mtd);
+	ret = nand_readid_op(chip, 0, id_data, sizeof(chip->id.data));
+	if (ret)
+		return ret;
 
 	if (id_data[0] != maf_id || id_data[1] != dev_id) {
 		pr_info("second ID read did not match %02x,%02x against %02x,%02x\n",
@@ -4190,6 +5749,9 @@ int nand_scan_ident(struct mtd_info *mtd, int maxchips,
 	struct nand_chip *chip = mtd_to_nand(mtd);
 	int ret;
 
+	/* Enforce the right timings for reset/detection */
+	onfi_fill_data_interface(chip, NAND_SDR_IFACE, 0);
+
 	ret = nand_dt_init(chip);
 	if (ret)
 		return ret;
@@ -4197,15 +5759,21 @@ int nand_scan_ident(struct mtd_info *mtd, int maxchips,
 	if (!mtd->name && mtd->dev.parent)
 		mtd->name = dev_name(mtd->dev.parent);
 
-	if ((!chip->cmdfunc || !chip->select_chip) && !chip->cmd_ctrl) {
+	/*
+	 * ->cmdfunc() is legacy and will only be used if ->exec_op() is not
+	 * populated.
+	 */
+	if (!chip->exec_op) {
 		/*
-		 * Default functions assigned for chip_select() and
-		 * cmdfunc() both expect cmd_ctrl() to be populated,
-		 * so we need to check that that's the case
+		 * Default functions assigned for ->cmdfunc() and
+		 * ->select_chip() both expect ->cmd_ctrl() to be populated.
 		 */
-		pr_err("chip.cmd_ctrl() callback is not provided");
-		return -EINVAL;
+		if ((!chip->cmdfunc || !chip->select_chip) && !chip->cmd_ctrl) {
+			pr_err("->cmd_ctrl() should be provided\n");
+			return -EINVAL;
+		}
 	}
+
 	/* Set the default functions */
 	nand_set_defaults(chip);
 
@@ -4225,15 +5793,16 @@ int nand_scan_ident(struct mtd_info *mtd, int maxchips,
 
 	/* Check for a chip array */
 	for (i = 1; i < maxchips; i++) {
+		u8 id[2];
+
 		/* See comment in nand_get_flash_type for reset */
 		nand_reset(chip, i);
 
 		chip->select_chip(mtd, i);
 		/* Send the command for reading device ID */
-		chip->cmdfunc(mtd, NAND_CMD_READID, 0x00, -1);
+		nand_readid_op(chip, 0, id, sizeof(id));
 		/* Read manufacturer and device IDs */
-		if (nand_maf_id != chip->read_byte(mtd) ||
-		    nand_dev_id != chip->read_byte(mtd)) {
+		if (nand_maf_id != id[0] || nand_dev_id != id[1]) {
 			chip->select_chip(mtd, -1);
 			break;
 		}
@@ -4600,26 +6169,6 @@ static bool nand_ecc_strength_good(struct mtd_info *mtd)
 	return corr >= ds_corr && ecc->strength >= chip->ecc_strength_ds;
 }
 
-static bool invalid_ecc_page_accessors(struct nand_chip *chip)
-{
-	struct nand_ecc_ctrl *ecc = &chip->ecc;
-
-	if (nand_standard_page_accessors(ecc))
-		return false;
-
-	/*
-	 * NAND_ECC_CUSTOM_PAGE_ACCESS flag is set, make sure the NAND
-	 * controller driver implements all the page accessors because
-	 * default helpers are not suitable when the core does not
-	 * send the READ0/PAGEPROG commands.
-	 */
-	return (!ecc->read_page || !ecc->write_page ||
-		!ecc->read_page_raw || !ecc->write_page_raw ||
-		(NAND_HAS_SUBPAGE_READ(chip) && !ecc->read_subpage) ||
-		(NAND_HAS_SUBPAGE_WRITE(chip) && !ecc->write_subpage &&
-		 ecc->hwctl && ecc->calculate));
-}
-
 /**
  * nand_scan_tail - [NAND Interface] Scan for the NAND device
  * @mtd: MTD device structure
@@ -4632,7 +6181,6 @@ int nand_scan_tail(struct mtd_info *mtd)
 {
 	struct nand_chip *chip = mtd_to_nand(mtd);
 	struct nand_ecc_ctrl *ecc = &chip->ecc;
-	struct nand_buffers *nbuf = NULL;
 	int ret, i;
 
 	/* New bad blocks should be marked in OOB, flash-based BBT, or both */
@@ -4641,39 +6189,9 @@ int nand_scan_tail(struct mtd_info *mtd)
 		return -EINVAL;
 	}
 
-	if (invalid_ecc_page_accessors(chip)) {
-		pr_err("Invalid ECC page accessors setup\n");
-		return -EINVAL;
-	}
-
-	if (!(chip->options & NAND_OWN_BUFFERS)) {
-		nbuf = kzalloc(sizeof(*nbuf), GFP_KERNEL);
-		if (!nbuf)
-			return -ENOMEM;
-
-		nbuf->ecccalc = kmalloc(mtd->oobsize, GFP_KERNEL);
-		if (!nbuf->ecccalc) {
-			ret = -ENOMEM;
-			goto err_free_nbuf;
-		}
-
-		nbuf->ecccode = kmalloc(mtd->oobsize, GFP_KERNEL);
-		if (!nbuf->ecccode) {
-			ret = -ENOMEM;
-			goto err_free_nbuf;
-		}
-
-		nbuf->databuf = kmalloc(mtd->writesize + mtd->oobsize,
-					GFP_KERNEL);
-		if (!nbuf->databuf) {
-			ret = -ENOMEM;
-			goto err_free_nbuf;
-		}
-
-		chip->buffers = nbuf;
-	} else if (!chip->buffers) {
+	chip->data_buf = kmalloc(mtd->writesize + mtd->oobsize, GFP_KERNEL);
+	if (!chip->data_buf)
 		return -ENOMEM;
-	}
 
 	/*
 	 * FIXME: some NAND manufacturer drivers expect the first die to be
@@ -4685,10 +6203,10 @@ int nand_scan_tail(struct mtd_info *mtd)
 	ret = nand_manufacturer_init(chip);
 	chip->select_chip(mtd, -1);
 	if (ret)
-		goto err_free_nbuf;
+		goto err_free_buf;
 
 	/* Set the internal oob buffer location, just after the page data */
-	chip->oob_poi = chip->buffers->databuf + mtd->writesize;
+	chip->oob_poi = chip->data_buf + mtd->writesize;
 
 	/*
 	 * If no default placement scheme is given, select an appropriate one.
@@ -4836,6 +6354,15 @@ int nand_scan_tail(struct mtd_info *mtd)
 		goto err_nand_manuf_cleanup;
 	}
 
+	if (ecc->correct || ecc->calculate) {
+		ecc->calc_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
+		ecc->code_buf = kmalloc(mtd->oobsize, GFP_KERNEL);
+		if (!ecc->calc_buf || !ecc->code_buf) {
+			ret = -ENOMEM;
+			goto err_nand_manuf_cleanup;
+		}
+	}
+
 	/* For many systems, the standard OOB write also works for raw */
 	if (!ecc->read_oob_raw)
 		ecc->read_oob_raw = ecc->read_oob;
@@ -4917,8 +6444,6 @@ int nand_scan_tail(struct mtd_info *mtd)
 	mtd->_erase = nand_erase;
 	mtd->_point = NULL;
 	mtd->_unpoint = NULL;
-	mtd->_read = nand_read;
-	mtd->_write = nand_write;
 	mtd->_panic_write = panic_nand_write;
 	mtd->_read_oob = nand_read_oob;
 	mtd->_write_oob = nand_write_oob;
@@ -4954,7 +6479,7 @@ int nand_scan_tail(struct mtd_info *mtd)
 		chip->select_chip(mtd, -1);
 
 		if (ret)
-			goto err_nand_data_iface_cleanup;
+			goto err_nand_manuf_cleanup;
 	}
 
 	/* Check, if we should skip the bad block table scan */
@@ -4964,23 +6489,18 @@ int nand_scan_tail(struct mtd_info *mtd)
 	/* Build bad block table */
 	ret = chip->scan_bbt(mtd);
 	if (ret)
-		goto err_nand_data_iface_cleanup;
+		goto err_nand_manuf_cleanup;
 
 	return 0;
 
-err_nand_data_iface_cleanup:
-	nand_release_data_interface(chip);
 
 err_nand_manuf_cleanup:
 	nand_manufacturer_cleanup(chip);
 
-err_free_nbuf:
-	if (nbuf) {
-		kfree(nbuf->databuf);
-		kfree(nbuf->ecccode);
-		kfree(nbuf->ecccalc);
-		kfree(nbuf);
-	}
+err_free_buf:
+	kfree(chip->data_buf);
+	kfree(ecc->code_buf);
+	kfree(ecc->calc_buf);
 
 	return ret;
 }
@@ -5028,16 +6548,11 @@ void nand_cleanup(struct nand_chip *chip)
 	    chip->ecc.algo == NAND_ECC_BCH)
 		nand_bch_free((struct nand_bch_control *)chip->ecc.priv);
 
-	nand_release_data_interface(chip);
-
 	/* Free bad block table memory */
 	kfree(chip->bbt);
-	if (!(chip->options & NAND_OWN_BUFFERS) && chip->buffers) {
-		kfree(chip->buffers->databuf);
-		kfree(chip->buffers->ecccode);
-		kfree(chip->buffers->ecccalc);
-		kfree(chip->buffers);
-	}
+	kfree(chip->data_buf);
+	kfree(chip->ecc.code_buf);
+	kfree(chip->ecc.calc_buf);
 
 	/* Free bad block descriptor memory */
 	if (chip->badblock_pattern && chip->badblock_pattern->options
diff --git a/drivers/mtd/nand/nand_bbt.c b/drivers/mtd/nand/nand_bbt.c
index 2915b67..3609285 100644
--- a/drivers/mtd/nand/nand_bbt.c
+++ b/drivers/mtd/nand/nand_bbt.c
@@ -898,7 +898,7 @@ static inline int nand_memory_bbt(struct mtd_info *mtd, struct nand_bbt_descr *b
 {
 	struct nand_chip *this = mtd_to_nand(mtd);
 
-	return create_bbt(mtd, this->buffers->databuf, bd, -1);
+	return create_bbt(mtd, this->data_buf, bd, -1);
 }
 
 /**
diff --git a/drivers/mtd/nand/nand_hynix.c b/drivers/mtd/nand/nand_hynix.c
index 985751e..d542908a 100644
--- a/drivers/mtd/nand/nand_hynix.c
+++ b/drivers/mtd/nand/nand_hynix.c
@@ -67,15 +67,43 @@ struct hynix_read_retry_otp {
 
 static bool hynix_nand_has_valid_jedecid(struct nand_chip *chip)
 {
+	u8 jedecid[5] = { };
+	int ret;
+
+	ret = nand_readid_op(chip, 0x40, jedecid, sizeof(jedecid));
+	if (ret)
+		return false;
+
+	return !strncmp("JEDEC", jedecid, sizeof(jedecid));
+}
+
+static int hynix_nand_cmd_op(struct nand_chip *chip, u8 cmd)
+{
 	struct mtd_info *mtd = nand_to_mtd(chip);
-	u8 jedecid[6] = { };
-	int i = 0;
 
-	chip->cmdfunc(mtd, NAND_CMD_READID, 0x40, -1);
-	for (i = 0; i < 5; i++)
-		jedecid[i] = chip->read_byte(mtd);
+	if (chip->exec_op) {
+		struct nand_op_instr instrs[] = {
+			NAND_OP_CMD(cmd, 0),
+		};
+		struct nand_operation op = NAND_OPERATION(instrs);
 
-	return !strcmp("JEDEC", jedecid);
+		return nand_exec_op(chip, &op);
+	}
+
+	chip->cmdfunc(mtd, cmd, -1, -1);
+
+	return 0;
+}
+
+static int hynix_nand_reg_write_op(struct nand_chip *chip, u8 addr, u8 val)
+{
+	struct mtd_info *mtd = nand_to_mtd(chip);
+	u16 column = ((u16)addr << 8) | addr;
+
+	chip->cmdfunc(mtd, NAND_CMD_NONE, column, -1);
+	chip->write_byte(mtd, val);
+
+	return 0;
 }
 
 static int hynix_nand_setup_read_retry(struct mtd_info *mtd, int retry_mode)
@@ -83,14 +111,15 @@ static int hynix_nand_setup_read_retry(struct mtd_info *mtd, int retry_mode)
 	struct nand_chip *chip = mtd_to_nand(mtd);
 	struct hynix_nand *hynix = nand_get_manufacturer_data(chip);
 	const u8 *values;
-	int status;
-	int i;
+	int i, ret;
 
 	values = hynix->read_retry->values +
 		 (retry_mode * hynix->read_retry->nregs);
 
 	/* Enter 'Set Hynix Parameters' mode */
-	chip->cmdfunc(mtd, NAND_HYNIX_CMD_SET_PARAMS, -1, -1);
+	ret = hynix_nand_cmd_op(chip, NAND_HYNIX_CMD_SET_PARAMS);
+	if (ret)
+		return ret;
 
 	/*
 	 * Configure the NAND in the requested read-retry mode.
@@ -102,21 +131,14 @@ static int hynix_nand_setup_read_retry(struct mtd_info *mtd, int retry_mode)
 	 * probably tweaked at production in this case).
 	 */
 	for (i = 0; i < hynix->read_retry->nregs; i++) {
-		int column = hynix->read_retry->regs[i];
-
-		column |= column << 8;
-		chip->cmdfunc(mtd, NAND_CMD_NONE, column, -1);
-		chip->write_byte(mtd, values[i]);
+		ret = hynix_nand_reg_write_op(chip, hynix->read_retry->regs[i],
+					      values[i]);
+		if (ret)
+			return ret;
 	}
 
 	/* Apply the new settings. */
-	chip->cmdfunc(mtd, NAND_HYNIX_CMD_APPLY_PARAMS, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-	if (status & NAND_STATUS_FAIL)
-		return -EIO;
-
-	return 0;
+	return hynix_nand_cmd_op(chip, NAND_HYNIX_CMD_APPLY_PARAMS);
 }
 
 /**
@@ -172,40 +194,63 @@ static int hynix_read_rr_otp(struct nand_chip *chip,
 			     const struct hynix_read_retry_otp *info,
 			     void *buf)
 {
-	struct mtd_info *mtd = nand_to_mtd(chip);
-	int i;
+	int i, ret;
 
-	chip->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
+	ret = nand_reset_op(chip);
+	if (ret)
+		return ret;
 
-	chip->cmdfunc(mtd, NAND_HYNIX_CMD_SET_PARAMS, -1, -1);
+	ret = hynix_nand_cmd_op(chip, NAND_HYNIX_CMD_SET_PARAMS);
+	if (ret)
+		return ret;
 
 	for (i = 0; i < info->nregs; i++) {
-		int column = info->regs[i];
-
-		column |= column << 8;
-		chip->cmdfunc(mtd, NAND_CMD_NONE, column, -1);
-		chip->write_byte(mtd, info->values[i]);
+		ret = hynix_nand_reg_write_op(chip, info->regs[i],
+					      info->values[i]);
+		if (ret)
+			return ret;
 	}
 
-	chip->cmdfunc(mtd, NAND_HYNIX_CMD_APPLY_PARAMS, -1, -1);
+	ret = hynix_nand_cmd_op(chip, NAND_HYNIX_CMD_APPLY_PARAMS);
+	if (ret)
+		return ret;
 
 	/* Sequence to enter OTP mode? */
-	chip->cmdfunc(mtd, 0x17, -1, -1);
-	chip->cmdfunc(mtd, 0x04, -1, -1);
-	chip->cmdfunc(mtd, 0x19, -1, -1);
+	ret = hynix_nand_cmd_op(chip, 0x17);
+	if (ret)
+		return ret;
+
+	ret = hynix_nand_cmd_op(chip, 0x4);
+	if (ret)
+		return ret;
+
+	ret = hynix_nand_cmd_op(chip, 0x19);
+	if (ret)
+		return ret;
 
 	/* Now read the page */
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x0, info->page);
-	chip->read_buf(mtd, buf, info->size);
+	ret = nand_read_page_op(chip, info->page, 0, buf, info->size);
+	if (ret)
+		return ret;
 
 	/* Put everything back to normal */
-	chip->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
-	chip->cmdfunc(mtd, NAND_HYNIX_CMD_SET_PARAMS, 0x38, -1);
-	chip->write_byte(mtd, 0x0);
-	chip->cmdfunc(mtd, NAND_HYNIX_CMD_APPLY_PARAMS, -1, -1);
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x0, -1);
+	ret = nand_reset_op(chip);
+	if (ret)
+		return ret;
 
-	return 0;
+	ret = hynix_nand_cmd_op(chip, NAND_HYNIX_CMD_SET_PARAMS);
+	if (ret)
+		return ret;
+
+	ret = hynix_nand_reg_write_op(chip, 0x38, 0);
+	if (ret)
+		return ret;
+
+	ret = hynix_nand_cmd_op(chip, NAND_HYNIX_CMD_APPLY_PARAMS);
+	if (ret)
+		return ret;
+
+	return nand_read_page_op(chip, 0, 0, NULL, 0);
 }
 
 #define NAND_HYNIX_1XNM_RR_COUNT_OFFS				0
diff --git a/drivers/mtd/nand/nand_micron.c b/drivers/mtd/nand/nand_micron.c
index abf6a3c..02e109a 100644
--- a/drivers/mtd/nand/nand_micron.c
+++ b/drivers/mtd/nand/nand_micron.c
@@ -117,16 +117,28 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
 				 uint8_t *buf, int oob_required,
 				 int page)
 {
-	int status;
-	int max_bitflips = 0;
+	u8 status;
+	int ret, max_bitflips = 0;
 
-	micron_nand_on_die_ecc_setup(chip, true);
+	ret = micron_nand_on_die_ecc_setup(chip, true);
+	if (ret)
+		return ret;
 
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
-	chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
-	status = chip->read_byte(mtd);
+	ret = nand_read_page_op(chip, page, 0, NULL, 0);
+	if (ret)
+		goto out;
+
+	ret = nand_status_op(chip, &status);
+	if (ret)
+		goto out;
+
+	ret = nand_exit_status_op(chip);
+	if (ret)
+		goto out;
+
 	if (status & NAND_STATUS_FAIL)
 		mtd->ecc_stats.failed++;
+
 	/*
 	 * The internal ECC doesn't tell us the number of bitflips
 	 * that have been corrected, but tells us if it recommends to
@@ -137,13 +149,15 @@ micron_nand_read_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
 	else if (status & NAND_STATUS_WRITE_RECOMMENDED)
 		max_bitflips = chip->ecc.strength;
 
-	chip->cmdfunc(mtd, NAND_CMD_READ0, -1, -1);
+	ret = nand_read_data_op(chip, buf, mtd->writesize, false);
+	if (!ret && oob_required)
+		ret = nand_read_data_op(chip, chip->oob_poi, mtd->oobsize,
+					false);
 
-	nand_read_page_raw(mtd, chip, buf, oob_required, page);
-
+out:
 	micron_nand_on_die_ecc_setup(chip, false);
 
-	return max_bitflips;
+	return ret ? ret : max_bitflips;
 }
 
 static int
@@ -151,46 +165,16 @@ micron_nand_write_page_on_die_ecc(struct mtd_info *mtd, struct nand_chip *chip,
 				  const uint8_t *buf, int oob_required,
 				  int page)
 {
-	int status;
+	int ret;
 
-	micron_nand_on_die_ecc_setup(chip, true);
+	ret = micron_nand_on_die_ecc_setup(chip, true);
+	if (ret)
+		return ret;
 
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
-	nand_write_page_raw(mtd, chip, buf, oob_required, page);
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-	status = chip->waitfunc(mtd, chip);
-
+	ret = nand_write_page_raw(mtd, chip, buf, oob_required, page);
 	micron_nand_on_die_ecc_setup(chip, false);
 
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
-}
-
-static int
-micron_nand_read_page_raw_on_die_ecc(struct mtd_info *mtd,
-				     struct nand_chip *chip,
-				     uint8_t *buf, int oob_required,
-				     int page)
-{
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0x00, page);
-	nand_read_page_raw(mtd, chip, buf, oob_required, page);
-
-	return 0;
-}
-
-static int
-micron_nand_write_page_raw_on_die_ecc(struct mtd_info *mtd,
-				      struct nand_chip *chip,
-				      const uint8_t *buf, int oob_required,
-				      int page)
-{
-	int status;
-
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0x00, page);
-	nand_write_page_raw(mtd, chip, buf, oob_required, page);
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return ret;
 }
 
 enum {
@@ -285,17 +269,14 @@ static int micron_nand_init(struct nand_chip *chip)
 			return -EINVAL;
 		}
 
-		chip->ecc.options = NAND_ECC_CUSTOM_PAGE_ACCESS;
 		chip->ecc.bytes = 8;
 		chip->ecc.size = 512;
 		chip->ecc.strength = 4;
 		chip->ecc.algo = NAND_ECC_BCH;
 		chip->ecc.read_page = micron_nand_read_page_on_die_ecc;
 		chip->ecc.write_page = micron_nand_write_page_on_die_ecc;
-		chip->ecc.read_page_raw =
-			micron_nand_read_page_raw_on_die_ecc;
-		chip->ecc.write_page_raw =
-			micron_nand_write_page_raw_on_die_ecc;
+		chip->ecc.read_page_raw = nand_read_page_raw;
+		chip->ecc.write_page_raw = nand_write_page_raw;
 
 		mtd_set_ooblayout(mtd, &micron_nand_on_die_ooblayout_ops);
 	}
diff --git a/drivers/mtd/nand/nand_samsung.c b/drivers/mtd/nand/nand_samsung.c
index d348f01..ef022f6 100644
--- a/drivers/mtd/nand/nand_samsung.c
+++ b/drivers/mtd/nand/nand_samsung.c
@@ -91,6 +91,25 @@ static void samsung_nand_decode_id(struct nand_chip *chip)
 		}
 	} else {
 		nand_decode_ext_id(chip);
+
+		if (nand_is_slc(chip)) {
+			switch (chip->id.data[1]) {
+			/* K9F4G08U0D-S[I|C]B0(T00) */
+			case 0xDC:
+				chip->ecc_step_ds = 512;
+				chip->ecc_strength_ds = 1;
+				break;
+
+			/* K9F1G08U0E 21nm chips do not support subpage write */
+			case 0xF1:
+				if (chip->id.len > 4 &&
+				    (chip->id.data[4] & GENMASK(1, 0)) == 0x1)
+					chip->options |= NAND_NO_SUBPAGE_WRITE;
+				break;
+			default:
+				break;
+			}
+		}
 	}
 }
 
diff --git a/drivers/mtd/nand/nand_timings.c b/drivers/mtd/nand/nand_timings.c
index 5d1533b..9400d03 100644
--- a/drivers/mtd/nand/nand_timings.c
+++ b/drivers/mtd/nand/nand_timings.c
@@ -283,16 +283,16 @@ const struct nand_sdr_timings *onfi_async_timing_mode_to_sdr_timings(int mode)
 EXPORT_SYMBOL(onfi_async_timing_mode_to_sdr_timings);
 
 /**
- * onfi_init_data_interface - [NAND Interface] Initialize a data interface from
+ * onfi_fill_data_interface - [NAND Interface] Initialize a data interface from
  * given ONFI mode
- * @iface: The data interface to be initialized
  * @mode: The ONFI timing mode
  */
-int onfi_init_data_interface(struct nand_chip *chip,
-			     struct nand_data_interface *iface,
+int onfi_fill_data_interface(struct nand_chip *chip,
 			     enum nand_data_interface_type type,
 			     int timing_mode)
 {
+	struct nand_data_interface *iface = &chip->data_interface;
+
 	if (type != NAND_SDR_IFACE)
 		return -EINVAL;
 
@@ -321,15 +321,4 @@ int onfi_init_data_interface(struct nand_chip *chip,
 
 	return 0;
 }
-EXPORT_SYMBOL(onfi_init_data_interface);
-
-/**
- * nand_get_default_data_interface - [NAND Interface] Retrieve NAND
- * data interface for mode 0. This is used as default timing after
- * reset.
- */
-const struct nand_data_interface *nand_get_default_data_interface(void)
-{
-	return &onfi_sdr_timings[0];
-}
-EXPORT_SYMBOL(nand_get_default_data_interface);
+EXPORT_SYMBOL(onfi_fill_data_interface);
diff --git a/drivers/mtd/nand/omap2.c b/drivers/mtd/nand/omap2.c
index dad438c..8cdf7d3 100644
--- a/drivers/mtd/nand/omap2.c
+++ b/drivers/mtd/nand/omap2.c
@@ -1530,7 +1530,9 @@ static int omap_write_page_bch(struct mtd_info *mtd, struct nand_chip *chip,
 			       const uint8_t *buf, int oob_required, int page)
 {
 	int ret;
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
+
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
 
 	/* Enable GPMC ecc engine */
 	chip->ecc.hwctl(mtd, NAND_ECC_WRITE);
@@ -1548,7 +1550,8 @@ static int omap_write_page_bch(struct mtd_info *mtd, struct nand_chip *chip,
 
 	/* Write ecc vector to OOB area */
 	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
-	return 0;
+
+	return nand_prog_page_end_op(chip);
 }
 
 /**
@@ -1568,7 +1571,7 @@ static int omap_write_subpage_bch(struct mtd_info *mtd,
 				  u32 data_len, const u8 *buf,
 				  int oob_required, int page)
 {
-	u8 *ecc_calc = chip->buffers->ecccalc;
+	u8 *ecc_calc = chip->ecc.calc_buf;
 	int ecc_size      = chip->ecc.size;
 	int ecc_bytes     = chip->ecc.bytes;
 	int ecc_steps     = chip->ecc.steps;
@@ -1582,6 +1585,7 @@ static int omap_write_subpage_bch(struct mtd_info *mtd,
 	 * ECC is calculated for all subpages but we choose
 	 * only what we want.
 	 */
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
 
 	/* Enable GPMC ECC engine */
 	chip->ecc.hwctl(mtd, NAND_ECC_WRITE);
@@ -1605,7 +1609,7 @@ static int omap_write_subpage_bch(struct mtd_info *mtd,
 
 	/* copy calculated ECC for whole page to chip->buffer->oob */
 	/* this include masked-value(0xFF) for unwritten subpages */
-	ecc_calc = chip->buffers->ecccalc;
+	ecc_calc = chip->ecc.calc_buf;
 	ret = mtd_ooblayout_set_eccbytes(mtd, ecc_calc, chip->oob_poi, 0,
 					 chip->ecc.total);
 	if (ret)
@@ -1614,7 +1618,7 @@ static int omap_write_subpage_bch(struct mtd_info *mtd,
 	/* write OOB buffer to NAND device */
 	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 /**
@@ -1635,11 +1639,13 @@ static int omap_write_subpage_bch(struct mtd_info *mtd,
 static int omap_read_page_bch(struct mtd_info *mtd, struct nand_chip *chip,
 				uint8_t *buf, int oob_required, int page)
 {
-	uint8_t *ecc_calc = chip->buffers->ecccalc;
-	uint8_t *ecc_code = chip->buffers->ecccode;
+	uint8_t *ecc_calc = chip->ecc.calc_buf;
+	uint8_t *ecc_code = chip->ecc.code_buf;
 	int stat, ret;
 	unsigned int max_bitflips = 0;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	/* Enable GPMC ecc engine */
 	chip->ecc.hwctl(mtd, NAND_ECC_READ);
 
@@ -1647,10 +1653,10 @@ static int omap_read_page_bch(struct mtd_info *mtd, struct nand_chip *chip,
 	chip->read_buf(mtd, buf, mtd->writesize);
 
 	/* Read oob bytes */
-	chip->cmdfunc(mtd, NAND_CMD_RNDOUT,
-		      mtd->writesize + BADBLOCK_MARKER_LENGTH, -1);
-	chip->read_buf(mtd, chip->oob_poi + BADBLOCK_MARKER_LENGTH,
-		       chip->ecc.total);
+	nand_change_read_column_op(chip,
+				   mtd->writesize + BADBLOCK_MARKER_LENGTH,
+				   chip->oob_poi + BADBLOCK_MARKER_LENGTH,
+				   chip->ecc.total, false);
 
 	/* Calculate ecc bytes */
 	omap_calculate_ecc_bch_multi(mtd, buf, ecc_calc);
diff --git a/drivers/mtd/nand/pxa3xx_nand.c b/drivers/mtd/nand/pxa3xx_nand.c
index 90b9a9c..d1979c7 100644
--- a/drivers/mtd/nand/pxa3xx_nand.c
+++ b/drivers/mtd/nand/pxa3xx_nand.c
@@ -520,15 +520,13 @@ static int pxa3xx_nand_init_timings_compat(struct pxa3xx_nand_host *host,
 	struct nand_chip *chip = &host->chip;
 	struct pxa3xx_nand_info *info = host->info_data;
 	const struct pxa3xx_nand_flash *f = NULL;
-	struct mtd_info *mtd = nand_to_mtd(&host->chip);
 	int i, id, ntypes;
+	u8 idbuf[2];
 
 	ntypes = ARRAY_SIZE(builtin_flash_types);
 
-	chip->cmdfunc(mtd, NAND_CMD_READID, 0x00, -1);
-
-	id = chip->read_byte(mtd);
-	id |= chip->read_byte(mtd) << 0x8;
+	nand_readid_op(chip, 0, idbuf, sizeof(idbuf));
+	id = idbuf[0] | (idbuf[1] << 8);
 
 	for (i = 0; i < ntypes; i++) {
 		f = &builtin_flash_types[i];
@@ -963,6 +961,7 @@ static void prepare_start_command(struct pxa3xx_nand_info *info, int command)
 
 	switch (command) {
 	case NAND_CMD_READ0:
+	case NAND_CMD_READOOB:
 	case NAND_CMD_PAGEPROG:
 		info->use_ecc = 1;
 		break;
@@ -1350,10 +1349,10 @@ static int pxa3xx_nand_write_page_hwecc(struct mtd_info *mtd,
 		struct nand_chip *chip, const uint8_t *buf, int oob_required,
 		int page)
 {
-	chip->write_buf(mtd, buf, mtd->writesize);
+	nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
 	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int pxa3xx_nand_read_page_hwecc(struct mtd_info *mtd,
@@ -1363,7 +1362,7 @@ static int pxa3xx_nand_read_page_hwecc(struct mtd_info *mtd,
 	struct pxa3xx_nand_host *host = nand_get_controller_data(chip);
 	struct pxa3xx_nand_info *info = host->info_data;
 
-	chip->read_buf(mtd, buf, mtd->writesize);
+	nand_read_page_op(chip, page, 0, buf, mtd->writesize);
 	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
 	if (info->retcode == ERR_CORERR && info->use_ecc) {
diff --git a/drivers/mtd/nand/qcom_nandc.c b/drivers/mtd/nand/qcom_nandc.c
index 2656c1a..6be5558 100644
--- a/drivers/mtd/nand/qcom_nandc.c
+++ b/drivers/mtd/nand/qcom_nandc.c
@@ -1725,6 +1725,7 @@ static int qcom_nandc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	u8 *data_buf, *oob_buf = NULL;
 	int ret;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
 	data_buf = buf;
 	oob_buf = oob_required ? chip->oob_poi : NULL;
 
@@ -1750,6 +1751,7 @@ static int qcom_nandc_read_page_raw(struct mtd_info *mtd,
 	int i, ret;
 	int read_loc;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
 	data_buf = buf;
 	oob_buf = chip->oob_poi;
 
@@ -1850,6 +1852,8 @@ static int qcom_nandc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	u8 *data_buf, *oob_buf;
 	int i, ret;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	clear_read_regs(nandc);
 	clear_bam_transaction(nandc);
 
@@ -1902,6 +1906,9 @@ static int qcom_nandc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 
 	free_descs(nandc);
 
+	if (!ret)
+		ret = nand_prog_page_end_op(chip);
+
 	return ret;
 }
 
@@ -1916,6 +1923,7 @@ static int qcom_nandc_write_page_raw(struct mtd_info *mtd,
 	u8 *data_buf, *oob_buf;
 	int i, ret;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
 	clear_read_regs(nandc);
 	clear_bam_transaction(nandc);
 
@@ -1970,6 +1978,9 @@ static int qcom_nandc_write_page_raw(struct mtd_info *mtd,
 
 	free_descs(nandc);
 
+	if (!ret)
+		ret = nand_prog_page_end_op(chip);
+
 	return ret;
 }
 
@@ -1990,7 +2001,7 @@ static int qcom_nandc_write_oob(struct mtd_info *mtd, struct nand_chip *chip,
 	struct nand_ecc_ctrl *ecc = &chip->ecc;
 	u8 *oob = chip->oob_poi;
 	int data_size, oob_size;
-	int ret, status = 0;
+	int ret;
 
 	host->use_ecc = true;
 
@@ -2027,11 +2038,7 @@ static int qcom_nandc_write_oob(struct mtd_info *mtd, struct nand_chip *chip,
 		return -EIO;
 	}
 
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int qcom_nandc_block_bad(struct mtd_info *mtd, loff_t ofs)
@@ -2081,7 +2088,7 @@ static int qcom_nandc_block_markbad(struct mtd_info *mtd, loff_t ofs)
 	struct qcom_nand_host *host = to_qcom_nand_host(chip);
 	struct qcom_nand_controller *nandc = get_qcom_nand_controller(chip);
 	struct nand_ecc_ctrl *ecc = &chip->ecc;
-	int page, ret, status = 0;
+	int page, ret;
 
 	clear_read_regs(nandc);
 	clear_bam_transaction(nandc);
@@ -2114,11 +2121,7 @@ static int qcom_nandc_block_markbad(struct mtd_info *mtd, loff_t ofs)
 		return -EIO;
 	}
 
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_end_op(chip);
 }
 
 /*
@@ -2636,6 +2639,9 @@ static int qcom_nand_host_init(struct qcom_nand_controller *nandc,
 
 	nand_set_flash_node(chip, dn);
 	mtd->name = devm_kasprintf(dev, GFP_KERNEL, "qcom_nand.%d", host->cs);
+	if (!mtd->name)
+		return -ENOMEM;
+
 	mtd->owner = THIS_MODULE;
 	mtd->dev.parent = dev;
 
diff --git a/drivers/mtd/nand/r852.c b/drivers/mtd/nand/r852.c
index fc9287a..595635b 100644
--- a/drivers/mtd/nand/r852.c
+++ b/drivers/mtd/nand/r852.c
@@ -364,7 +364,7 @@ static int r852_wait(struct mtd_info *mtd, struct nand_chip *chip)
 	struct r852_device *dev = nand_get_controller_data(chip);
 
 	unsigned long timeout;
-	int status;
+	u8 status;
 
 	timeout = jiffies + (chip->state == FL_ERASING ?
 		msecs_to_jiffies(400) : msecs_to_jiffies(20));
@@ -373,8 +373,7 @@ static int r852_wait(struct mtd_info *mtd, struct nand_chip *chip)
 		if (chip->dev_ready(mtd))
 			break;
 
-	chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
-	status = (int)chip->read_byte(mtd);
+	nand_status_op(chip, &status);
 
 	/* Unfortunelly, no way to send detailed error status... */
 	if (dev->dma_error) {
@@ -522,9 +521,7 @@ static int r852_ecc_correct(struct mtd_info *mtd, uint8_t *dat,
 static int r852_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 			     int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READOOB, 0, page);
-	chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
-	return 0;
+	return nand_read_oob_op(chip, page, 0, chip->oob_poi, mtd->oobsize);
 }
 
 /*
@@ -1046,7 +1043,7 @@ static int r852_resume(struct device *device)
 	if (dev->card_registred) {
 		r852_engine_enable(dev);
 		dev->chip->select_chip(mtd, 0);
-		dev->chip->cmdfunc(mtd, NAND_CMD_RESET, -1, -1);
+		nand_reset_op(dev->chip);
 		dev->chip->select_chip(mtd, -1);
 	}
 
diff --git a/drivers/mtd/nand/sh_flctl.c b/drivers/mtd/nand/sh_flctl.c
index 3c5008a..c4e7755 100644
--- a/drivers/mtd/nand/sh_flctl.c
+++ b/drivers/mtd/nand/sh_flctl.c
@@ -614,7 +614,7 @@ static void set_cmd_regs(struct mtd_info *mtd, uint32_t cmd, uint32_t flcmcdr_va
 static int flctl_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 				uint8_t *buf, int oob_required, int page)
 {
-	chip->read_buf(mtd, buf, mtd->writesize);
+	nand_read_page_op(chip, page, 0, buf, mtd->writesize);
 	if (oob_required)
 		chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
 	return 0;
@@ -624,9 +624,9 @@ static int flctl_write_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 				  const uint8_t *buf, int oob_required,
 				  int page)
 {
-	chip->write_buf(mtd, buf, mtd->writesize);
+	nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
 	chip->write_buf(mtd, chip->oob_poi, mtd->oobsize);
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static void execmd_read_page_sector(struct mtd_info *mtd, int page_addr)
diff --git a/drivers/mtd/nand/sm_common.h b/drivers/mtd/nand/sm_common.h
index d3e028e..1581671 100644
--- a/drivers/mtd/nand/sm_common.h
+++ b/drivers/mtd/nand/sm_common.h
@@ -36,7 +36,7 @@ struct sm_oob {
 #define SM_SMALL_OOB_SIZE	8
 
 
-extern int sm_register_device(struct mtd_info *mtd, int smartmedia);
+int sm_register_device(struct mtd_info *mtd, int smartmedia);
 
 
 static inline int sm_sector_valid(struct sm_oob *oob)
diff --git a/drivers/mtd/nand/sunxi_nand.c b/drivers/mtd/nand/sunxi_nand.c
index 82244be..f5a55c6 100644
--- a/drivers/mtd/nand/sunxi_nand.c
+++ b/drivers/mtd/nand/sunxi_nand.c
@@ -958,12 +958,12 @@ static int sunxi_nfc_hw_ecc_read_chunk(struct mtd_info *mtd,
 	int ret;
 
 	if (*cur_off != data_off)
-		nand->cmdfunc(mtd, NAND_CMD_RNDOUT, data_off, -1);
+		nand_change_read_column_op(nand, data_off, NULL, 0, false);
 
 	sunxi_nfc_randomizer_read_buf(mtd, NULL, ecc->size, false, page);
 
 	if (data_off + ecc->size != oob_off)
-		nand->cmdfunc(mtd, NAND_CMD_RNDOUT, oob_off, -1);
+		nand_change_read_column_op(nand, oob_off, NULL, 0, false);
 
 	ret = sunxi_nfc_wait_cmd_fifo_empty(nfc);
 	if (ret)
@@ -991,16 +991,15 @@ static int sunxi_nfc_hw_ecc_read_chunk(struct mtd_info *mtd,
 		 * Re-read the data with the randomizer disabled to identify
 		 * bitflips in erased pages.
 		 */
-		if (nand->options & NAND_NEED_SCRAMBLING) {
-			nand->cmdfunc(mtd, NAND_CMD_RNDOUT, data_off, -1);
-			nand->read_buf(mtd, data, ecc->size);
-		} else {
+		if (nand->options & NAND_NEED_SCRAMBLING)
+			nand_change_read_column_op(nand, data_off, data,
+						   ecc->size, false);
+		else
 			memcpy_fromio(data, nfc->regs + NFC_RAM0_BASE,
 				      ecc->size);
-		}
 
-		nand->cmdfunc(mtd, NAND_CMD_RNDOUT, oob_off, -1);
-		nand->read_buf(mtd, oob, ecc->bytes + 4);
+		nand_change_read_column_op(nand, oob_off, oob, ecc->bytes + 4,
+					   false);
 
 		ret = nand_check_erased_ecc_chunk(data,	ecc->size,
 						  oob, ecc->bytes + 4,
@@ -1011,7 +1010,8 @@ static int sunxi_nfc_hw_ecc_read_chunk(struct mtd_info *mtd,
 		memcpy_fromio(data, nfc->regs + NFC_RAM0_BASE, ecc->size);
 
 		if (oob_required) {
-			nand->cmdfunc(mtd, NAND_CMD_RNDOUT, oob_off, -1);
+			nand_change_read_column_op(nand, oob_off, NULL, 0,
+						   false);
 			sunxi_nfc_randomizer_read_buf(mtd, oob, ecc->bytes + 4,
 						      true, page);
 
@@ -1038,8 +1038,8 @@ static void sunxi_nfc_hw_ecc_read_extra_oob(struct mtd_info *mtd,
 		return;
 
 	if (!cur_off || *cur_off != offset)
-		nand->cmdfunc(mtd, NAND_CMD_RNDOUT,
-			      offset + mtd->writesize, -1);
+		nand_change_read_column_op(nand, mtd->writesize, NULL, 0,
+					   false);
 
 	if (!randomize)
 		sunxi_nfc_read_buf(mtd, oob + offset, len);
@@ -1116,9 +1116,9 @@ static int sunxi_nfc_hw_ecc_read_chunks_dma(struct mtd_info *mtd, uint8_t *buf,
 
 		if (oob_required && !erased) {
 			/* TODO: use DMA to retrieve OOB */
-			nand->cmdfunc(mtd, NAND_CMD_RNDOUT,
-				      mtd->writesize + oob_off, -1);
-			nand->read_buf(mtd, oob, ecc->bytes + 4);
+			nand_change_read_column_op(nand,
+						   mtd->writesize + oob_off,
+						   oob, ecc->bytes + 4, false);
 
 			sunxi_nfc_hw_ecc_get_prot_oob_bytes(mtd, oob, i,
 							    !i, page);
@@ -1143,18 +1143,17 @@ static int sunxi_nfc_hw_ecc_read_chunks_dma(struct mtd_info *mtd, uint8_t *buf,
 			/*
 			 * Re-read the data with the randomizer disabled to
 			 * identify bitflips in erased pages.
+			 * TODO: use DMA to read page in raw mode
 			 */
-			if (randomized) {
-				/* TODO: use DMA to read page in raw mode */
-				nand->cmdfunc(mtd, NAND_CMD_RNDOUT,
-					      data_off, -1);
-				nand->read_buf(mtd, data, ecc->size);
-			}
+			if (randomized)
+				nand_change_read_column_op(nand, data_off,
+							   data, ecc->size,
+							   false);
 
 			/* TODO: use DMA to retrieve OOB */
-			nand->cmdfunc(mtd, NAND_CMD_RNDOUT,
-				      mtd->writesize + oob_off, -1);
-			nand->read_buf(mtd, oob, ecc->bytes + 4);
+			nand_change_read_column_op(nand,
+						   mtd->writesize + oob_off,
+						   oob, ecc->bytes + 4, false);
 
 			ret = nand_check_erased_ecc_chunk(data,	ecc->size,
 							  oob, ecc->bytes + 4,
@@ -1187,12 +1186,12 @@ static int sunxi_nfc_hw_ecc_write_chunk(struct mtd_info *mtd,
 	int ret;
 
 	if (data_off != *cur_off)
-		nand->cmdfunc(mtd, NAND_CMD_RNDIN, data_off, -1);
+		nand_change_write_column_op(nand, data_off, NULL, 0, false);
 
 	sunxi_nfc_randomizer_write_buf(mtd, data, ecc->size, false, page);
 
 	if (data_off + ecc->size != oob_off)
-		nand->cmdfunc(mtd, NAND_CMD_RNDIN, oob_off, -1);
+		nand_change_write_column_op(nand, oob_off, NULL, 0, false);
 
 	ret = sunxi_nfc_wait_cmd_fifo_empty(nfc);
 	if (ret)
@@ -1228,8 +1227,8 @@ static void sunxi_nfc_hw_ecc_write_extra_oob(struct mtd_info *mtd,
 		return;
 
 	if (!cur_off || *cur_off != offset)
-		nand->cmdfunc(mtd, NAND_CMD_RNDIN,
-			      offset + mtd->writesize, -1);
+		nand_change_write_column_op(nand, offset + mtd->writesize,
+					    NULL, 0, false);
 
 	sunxi_nfc_randomizer_write_buf(mtd, oob + offset, len, false, page);
 
@@ -1246,6 +1245,8 @@ static int sunxi_nfc_hw_ecc_read_page(struct mtd_info *mtd,
 	int ret, i, cur_off = 0;
 	bool raw_mode = false;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	sunxi_nfc_hw_ecc_enable(mtd);
 
 	for (i = 0; i < ecc->steps; i++) {
@@ -1279,14 +1280,14 @@ static int sunxi_nfc_hw_ecc_read_page_dma(struct mtd_info *mtd,
 {
 	int ret;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	ret = sunxi_nfc_hw_ecc_read_chunks_dma(mtd, buf, oob_required, page,
 					       chip->ecc.steps);
 	if (ret >= 0)
 		return ret;
 
 	/* Fallback to PIO mode */
-	chip->cmdfunc(mtd, NAND_CMD_RNDOUT, 0, -1);
-
 	return sunxi_nfc_hw_ecc_read_page(mtd, chip, buf, oob_required, page);
 }
 
@@ -1299,6 +1300,8 @@ static int sunxi_nfc_hw_ecc_read_subpage(struct mtd_info *mtd,
 	int ret, i, cur_off = 0;
 	unsigned int max_bitflips = 0;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	sunxi_nfc_hw_ecc_enable(mtd);
 
 	for (i = data_offs / ecc->size;
@@ -1330,13 +1333,13 @@ static int sunxi_nfc_hw_ecc_read_subpage_dma(struct mtd_info *mtd,
 	int nchunks = DIV_ROUND_UP(data_offs + readlen, chip->ecc.size);
 	int ret;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	ret = sunxi_nfc_hw_ecc_read_chunks_dma(mtd, buf, false, page, nchunks);
 	if (ret >= 0)
 		return ret;
 
 	/* Fallback to PIO mode */
-	chip->cmdfunc(mtd, NAND_CMD_RNDOUT, 0, -1);
-
 	return sunxi_nfc_hw_ecc_read_subpage(mtd, chip, data_offs, readlen,
 					     buf, page);
 }
@@ -1349,6 +1352,8 @@ static int sunxi_nfc_hw_ecc_write_page(struct mtd_info *mtd,
 	struct nand_ecc_ctrl *ecc = &chip->ecc;
 	int ret, i, cur_off = 0;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	sunxi_nfc_hw_ecc_enable(mtd);
 
 	for (i = 0; i < ecc->steps; i++) {
@@ -1370,7 +1375,7 @@ static int sunxi_nfc_hw_ecc_write_page(struct mtd_info *mtd,
 
 	sunxi_nfc_hw_ecc_disable(mtd);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int sunxi_nfc_hw_ecc_write_subpage(struct mtd_info *mtd,
@@ -1382,6 +1387,8 @@ static int sunxi_nfc_hw_ecc_write_subpage(struct mtd_info *mtd,
 	struct nand_ecc_ctrl *ecc = &chip->ecc;
 	int ret, i, cur_off = 0;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	sunxi_nfc_hw_ecc_enable(mtd);
 
 	for (i = data_offs / ecc->size;
@@ -1400,7 +1407,7 @@ static int sunxi_nfc_hw_ecc_write_subpage(struct mtd_info *mtd,
 
 	sunxi_nfc_hw_ecc_disable(mtd);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int sunxi_nfc_hw_ecc_write_page_dma(struct mtd_info *mtd,
@@ -1430,6 +1437,8 @@ static int sunxi_nfc_hw_ecc_write_page_dma(struct mtd_info *mtd,
 		sunxi_nfc_hw_ecc_set_prot_oob_bytes(mtd, oob, i, !i, page);
 	}
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	sunxi_nfc_hw_ecc_enable(mtd);
 	sunxi_nfc_randomizer_config(mtd, page, false);
 	sunxi_nfc_randomizer_enable(mtd);
@@ -1460,7 +1469,7 @@ static int sunxi_nfc_hw_ecc_write_page_dma(struct mtd_info *mtd,
 		sunxi_nfc_hw_ecc_write_extra_oob(mtd, chip->oob_poi,
 						 NULL, page);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 
 pio_fallback:
 	return sunxi_nfc_hw_ecc_write_page(mtd, chip, buf, oob_required, page);
@@ -1476,6 +1485,8 @@ static int sunxi_nfc_hw_syndrome_ecc_read_page(struct mtd_info *mtd,
 	int ret, i, cur_off = 0;
 	bool raw_mode = false;
 
+	nand_read_page_op(chip, page, 0, NULL, 0);
+
 	sunxi_nfc_hw_ecc_enable(mtd);
 
 	for (i = 0; i < ecc->steps; i++) {
@@ -1512,6 +1523,8 @@ static int sunxi_nfc_hw_syndrome_ecc_write_page(struct mtd_info *mtd,
 	struct nand_ecc_ctrl *ecc = &chip->ecc;
 	int ret, i, cur_off = 0;
 
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
+
 	sunxi_nfc_hw_ecc_enable(mtd);
 
 	for (i = 0; i < ecc->steps; i++) {
@@ -1533,41 +1546,33 @@ static int sunxi_nfc_hw_syndrome_ecc_write_page(struct mtd_info *mtd,
 
 	sunxi_nfc_hw_ecc_disable(mtd);
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int sunxi_nfc_hw_common_ecc_read_oob(struct mtd_info *mtd,
 					    struct nand_chip *chip,
 					    int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
-
 	chip->pagebuf = -1;
 
-	return chip->ecc.read_page(mtd, chip, chip->buffers->databuf, 1, page);
+	return chip->ecc.read_page(mtd, chip, chip->data_buf, 1, page);
 }
 
 static int sunxi_nfc_hw_common_ecc_write_oob(struct mtd_info *mtd,
 					     struct nand_chip *chip,
 					     int page)
 {
-	int ret, status;
-
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0, page);
+	int ret;
 
 	chip->pagebuf = -1;
 
-	memset(chip->buffers->databuf, 0xff, mtd->writesize);
-	ret = chip->ecc.write_page(mtd, chip, chip->buffers->databuf, 1, page);
+	memset(chip->data_buf, 0xff, mtd->writesize);
+	ret = chip->ecc.write_page(mtd, chip, chip->data_buf, 1, page);
 	if (ret)
 		return ret;
 
 	/* Send command to program the OOB data */
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-
-	return status & NAND_STATUS_FAIL ? -EIO : 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static const s32 tWB_lut[] = {6, 12, 16, 20};
@@ -1853,8 +1858,14 @@ static int sunxi_nand_hw_common_ecc_ctrl_init(struct mtd_info *mtd,
 
 	/* Add ECC info retrieval from DT */
 	for (i = 0; i < ARRAY_SIZE(strengths); i++) {
-		if (ecc->strength <= strengths[i])
+		if (ecc->strength <= strengths[i]) {
+			/*
+			 * Update ecc->strength value with the actual strength
+			 * that will be used by the ECC engine.
+			 */
+			ecc->strength = strengths[i];
 			break;
+		}
 	}
 
 	if (i >= ARRAY_SIZE(strengths)) {
diff --git a/drivers/mtd/nand/tango_nand.c b/drivers/mtd/nand/tango_nand.c
index 766906f..c5bee00b 100644
--- a/drivers/mtd/nand/tango_nand.c
+++ b/drivers/mtd/nand/tango_nand.c
@@ -329,7 +329,7 @@ static void aux_read(struct nand_chip *chip, u8 **buf, int len, int *pos)
 
 	if (!*buf) {
 		/* skip over "len" bytes */
-		chip->cmdfunc(mtd, NAND_CMD_RNDOUT, *pos, -1);
+		nand_change_read_column_op(chip, *pos, NULL, 0, false);
 	} else {
 		tango_read_buf(mtd, *buf, len);
 		*buf += len;
@@ -344,7 +344,7 @@ static void aux_write(struct nand_chip *chip, const u8 **buf, int len, int *pos)
 
 	if (!*buf) {
 		/* skip over "len" bytes */
-		chip->cmdfunc(mtd, NAND_CMD_RNDIN, *pos, -1);
+		nand_change_write_column_op(chip, *pos, NULL, 0, false);
 	} else {
 		tango_write_buf(mtd, *buf, len);
 		*buf += len;
@@ -427,7 +427,7 @@ static void raw_write(struct nand_chip *chip, const u8 *buf, const u8 *oob)
 static int tango_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 			       u8 *buf, int oob_required, int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
+	nand_read_page_op(chip, page, 0, NULL, 0);
 	raw_read(chip, buf, chip->oob_poi);
 	return 0;
 }
@@ -435,23 +435,15 @@ static int tango_read_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 static int tango_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 				const u8 *buf, int oob_required, int page)
 {
-	int status;
-
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0, page);
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
 	raw_write(chip, buf, chip->oob_poi);
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-
-	status = chip->waitfunc(mtd, chip);
-	if (status & NAND_STATUS_FAIL)
-		return -EIO;
-
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int tango_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 			  int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_READ0, 0, page);
+	nand_read_page_op(chip, page, 0, NULL, 0);
 	raw_read(chip, NULL, chip->oob_poi);
 	return 0;
 }
@@ -459,11 +451,9 @@ static int tango_read_oob(struct mtd_info *mtd, struct nand_chip *chip,
 static int tango_write_oob(struct mtd_info *mtd, struct nand_chip *chip,
 			   int page)
 {
-	chip->cmdfunc(mtd, NAND_CMD_SEQIN, 0, page);
+	nand_prog_page_begin_op(chip, page, 0, NULL, 0);
 	raw_write(chip, NULL, chip->oob_poi);
-	chip->cmdfunc(mtd, NAND_CMD_PAGEPROG, -1, -1);
-	chip->waitfunc(mtd, chip);
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static int oob_ecc(struct mtd_info *mtd, int idx, struct mtd_oob_region *res)
@@ -590,7 +580,6 @@ static int chip_init(struct device *dev, struct device_node *np)
 	ecc->write_page = tango_write_page;
 	ecc->read_oob = tango_read_oob;
 	ecc->write_oob = tango_write_oob;
-	ecc->options = NAND_ECC_CUSTOM_PAGE_ACCESS;
 
 	err = nand_scan_tail(mtd);
 	if (err)
diff --git a/drivers/mtd/nand/tmio_nand.c b/drivers/mtd/nand/tmio_nand.c
index 84dbf32..dcaa924 100644
--- a/drivers/mtd/nand/tmio_nand.c
+++ b/drivers/mtd/nand/tmio_nand.c
@@ -192,6 +192,7 @@ tmio_nand_wait(struct mtd_info *mtd, struct nand_chip *nand_chip)
 {
 	struct tmio_nand *tmio = mtd_to_tmio(mtd);
 	long timeout;
+	u8 status;
 
 	/* enable RDYREQ interrupt */
 	tmio_iowrite8(0x0f, tmio->fcr + FCR_ISR);
@@ -212,8 +213,8 @@ tmio_nand_wait(struct mtd_info *mtd, struct nand_chip *nand_chip)
 		dev_warn(&tmio->dev->dev, "timeout waiting for interrupt\n");
 	}
 
-	nand_chip->cmdfunc(mtd, NAND_CMD_STATUS, -1, -1);
-	return nand_chip->read_byte(mtd);
+	nand_status_op(nand_chip, &status);
+	return status;
 }
 
 /*
diff --git a/drivers/mtd/nand/vf610_nfc.c b/drivers/mtd/nand/vf610_nfc.c
index 8037d4b..80d31a5 100644
--- a/drivers/mtd/nand/vf610_nfc.c
+++ b/drivers/mtd/nand/vf610_nfc.c
@@ -560,7 +560,7 @@ static int vf610_nfc_read_page(struct mtd_info *mtd, struct nand_chip *chip,
 	int eccsize = chip->ecc.size;
 	int stat;
 
-	vf610_nfc_read_buf(mtd, buf, eccsize);
+	nand_read_page_op(chip, page, 0, buf, eccsize);
 	if (oob_required)
 		vf610_nfc_read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
@@ -580,7 +580,7 @@ static int vf610_nfc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 {
 	struct vf610_nfc *nfc = mtd_to_nfc(mtd);
 
-	vf610_nfc_write_buf(mtd, buf, mtd->writesize);
+	nand_prog_page_begin_op(chip, page, 0, buf, mtd->writesize);
 	if (oob_required)
 		vf610_nfc_write_buf(mtd, chip->oob_poi, mtd->oobsize);
 
@@ -588,7 +588,7 @@ static int vf610_nfc_write_page(struct mtd_info *mtd, struct nand_chip *chip,
 	nfc->use_hw_ecc = true;
 	nfc->write_sz = mtd->writesize + mtd->oobsize;
 
-	return 0;
+	return nand_prog_page_end_op(chip);
 }
 
 static const struct of_device_id vf610_nfc_dt_ids[] = {
diff --git a/drivers/mtd/onenand/Kconfig b/drivers/mtd/onenand/Kconfig
index dcae2f6..9dc1574 100644
--- a/drivers/mtd/onenand/Kconfig
+++ b/drivers/mtd/onenand/Kconfig
@@ -4,8 +4,7 @@
 	depends on HAS_IOMEM
 	help
 	  This enables support for accessing all type of OneNAND flash
-	  devices. For further information see
-	  <http://www.samsung.com/Products/Semiconductor/OneNAND/index.htm>
+	  devices.
 
 if MTD_ONENAND
 
@@ -26,9 +25,11 @@
 config MTD_ONENAND_OMAP2
 	tristate "OneNAND on OMAP2/OMAP3 support"
 	depends on ARCH_OMAP2 || ARCH_OMAP3
+	depends on OF || COMPILE_TEST
 	help
-	  Support for a OneNAND flash device connected to an OMAP2/OMAP3 CPU
+	  Support for a OneNAND flash device connected to an OMAP2/OMAP3 SoC
 	  via the GPMC memory controller.
+	  Enable dmaengine and gpiolib for better performance.
 
 config MTD_ONENAND_SAMSUNG
         tristate "OneNAND on Samsung SOC controller support"
diff --git a/drivers/mtd/onenand/omap2.c b/drivers/mtd/onenand/omap2.c
index 24a1388..87c34f6 100644
--- a/drivers/mtd/onenand/omap2.c
+++ b/drivers/mtd/onenand/omap2.c
@@ -28,19 +28,18 @@
 #include <linux/mtd/mtd.h>
 #include <linux/mtd/onenand.h>
 #include <linux/mtd/partitions.h>
+#include <linux/of_device.h>
+#include <linux/omap-gpmc.h>
 #include <linux/platform_device.h>
 #include <linux/interrupt.h>
 #include <linux/delay.h>
 #include <linux/dma-mapping.h>
+#include <linux/dmaengine.h>
 #include <linux/io.h>
 #include <linux/slab.h>
-#include <linux/regulator/consumer.h>
-#include <linux/gpio.h>
+#include <linux/gpio/consumer.h>
 
 #include <asm/mach/flash.h>
-#include <linux/platform_data/mtd-onenand-omap2.h>
-
-#include <linux/omap-dma.h>
 
 #define DRIVER_NAME "omap2-onenand"
 
@@ -50,24 +49,17 @@ struct omap2_onenand {
 	struct platform_device *pdev;
 	int gpmc_cs;
 	unsigned long phys_base;
-	unsigned int mem_size;
-	int gpio_irq;
+	struct gpio_desc *int_gpiod;
 	struct mtd_info mtd;
 	struct onenand_chip onenand;
 	struct completion irq_done;
 	struct completion dma_done;
-	int dma_channel;
-	int freq;
-	int (*setup)(void __iomem *base, int *freq_ptr);
-	struct regulator *regulator;
-	u8 flags;
+	struct dma_chan *dma_chan;
 };
 
-static void omap2_onenand_dma_cb(int lch, u16 ch_status, void *data)
+static void omap2_onenand_dma_complete_func(void *completion)
 {
-	struct omap2_onenand *c = data;
-
-	complete(&c->dma_done);
+	complete(completion);
 }
 
 static irqreturn_t omap2_onenand_interrupt(int irq, void *dev_id)
@@ -90,6 +82,65 @@ static inline void write_reg(struct omap2_onenand *c, unsigned short value,
 	writew(value, c->onenand.base + reg);
 }
 
+static int omap2_onenand_set_cfg(struct omap2_onenand *c,
+				 bool sr, bool sw,
+				 int latency, int burst_len)
+{
+	unsigned short reg = ONENAND_SYS_CFG1_RDY | ONENAND_SYS_CFG1_INT;
+
+	reg |= latency << ONENAND_SYS_CFG1_BRL_SHIFT;
+
+	switch (burst_len) {
+	case 0:		/* continuous */
+		break;
+	case 4:
+		reg |= ONENAND_SYS_CFG1_BL_4;
+		break;
+	case 8:
+		reg |= ONENAND_SYS_CFG1_BL_8;
+		break;
+	case 16:
+		reg |= ONENAND_SYS_CFG1_BL_16;
+		break;
+	case 32:
+		reg |= ONENAND_SYS_CFG1_BL_32;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	if (latency > 5)
+		reg |= ONENAND_SYS_CFG1_HF;
+	if (latency > 7)
+		reg |= ONENAND_SYS_CFG1_VHF;
+	if (sr)
+		reg |= ONENAND_SYS_CFG1_SYNC_READ;
+	if (sw)
+		reg |= ONENAND_SYS_CFG1_SYNC_WRITE;
+
+	write_reg(c, reg, ONENAND_REG_SYS_CFG1);
+
+	return 0;
+}
+
+static int omap2_onenand_get_freq(int ver)
+{
+	switch ((ver >> 4) & 0xf) {
+	case 0:
+		return 40;
+	case 1:
+		return 54;
+	case 2:
+		return 66;
+	case 3:
+		return 83;
+	case 4:
+		return 104;
+	}
+
+	return -EINVAL;
+}
+
 static void wait_err(char *msg, int state, unsigned int ctrl, unsigned int intr)
 {
 	printk(KERN_ERR "onenand_wait: %s! state %d ctrl 0x%04x intr 0x%04x\n",
@@ -153,28 +204,22 @@ static int omap2_onenand_wait(struct mtd_info *mtd, int state)
 		if (!(syscfg & ONENAND_SYS_CFG1_IOBE)) {
 			syscfg |= ONENAND_SYS_CFG1_IOBE;
 			write_reg(c, syscfg, ONENAND_REG_SYS_CFG1);
-			if (c->flags & ONENAND_IN_OMAP34XX)
-				/* Add a delay to let GPIO settle */
-				syscfg = read_reg(c, ONENAND_REG_SYS_CFG1);
+			/* Add a delay to let GPIO settle */
+			syscfg = read_reg(c, ONENAND_REG_SYS_CFG1);
 		}
 
 		reinit_completion(&c->irq_done);
-		if (c->gpio_irq) {
-			result = gpio_get_value(c->gpio_irq);
-			if (result == -1) {
-				ctrl = read_reg(c, ONENAND_REG_CTRL_STATUS);
-				intr = read_reg(c, ONENAND_REG_INTERRUPT);
-				wait_err("gpio error", state, ctrl, intr);
-				return -EIO;
-			}
-		} else
-			result = 0;
-		if (result == 0) {
+		result = gpiod_get_value(c->int_gpiod);
+		if (result < 0) {
+			ctrl = read_reg(c, ONENAND_REG_CTRL_STATUS);
+			intr = read_reg(c, ONENAND_REG_INTERRUPT);
+			wait_err("gpio error", state, ctrl, intr);
+			return result;
+		} else if (result == 0) {
 			int retry_cnt = 0;
 retry:
-			result = wait_for_completion_timeout(&c->irq_done,
-						    msecs_to_jiffies(20));
-			if (result == 0) {
+			if (!wait_for_completion_io_timeout(&c->irq_done,
+						msecs_to_jiffies(20))) {
 				/* Timeout after 20ms */
 				ctrl = read_reg(c, ONENAND_REG_CTRL_STATUS);
 				if (ctrl & ONENAND_CTRL_ONGO &&
@@ -291,9 +336,42 @@ static inline int omap2_onenand_bufferram_offset(struct mtd_info *mtd, int area)
 	return 0;
 }
 
-#if defined(CONFIG_ARCH_OMAP3) || defined(MULTI_OMAP2)
+static inline int omap2_onenand_dma_transfer(struct omap2_onenand *c,
+					     dma_addr_t src, dma_addr_t dst,
+					     size_t count)
+{
+	struct dma_async_tx_descriptor *tx;
+	dma_cookie_t cookie;
 
-static int omap3_onenand_read_bufferram(struct mtd_info *mtd, int area,
+	tx = dmaengine_prep_dma_memcpy(c->dma_chan, dst, src, count, 0);
+	if (!tx) {
+		dev_err(&c->pdev->dev, "Failed to prepare DMA memcpy\n");
+		return -EIO;
+	}
+
+	reinit_completion(&c->dma_done);
+
+	tx->callback = omap2_onenand_dma_complete_func;
+	tx->callback_param = &c->dma_done;
+
+	cookie = tx->tx_submit(tx);
+	if (dma_submit_error(cookie)) {
+		dev_err(&c->pdev->dev, "Failed to do DMA tx_submit\n");
+		return -EIO;
+	}
+
+	dma_async_issue_pending(c->dma_chan);
+
+	if (!wait_for_completion_io_timeout(&c->dma_done,
+					    msecs_to_jiffies(20))) {
+		dmaengine_terminate_sync(c->dma_chan);
+		return -ETIMEDOUT;
+	}
+
+	return 0;
+}
+
+static int omap2_onenand_read_bufferram(struct mtd_info *mtd, int area,
 					unsigned char *buffer, int offset,
 					size_t count)
 {
@@ -301,10 +379,9 @@ static int omap3_onenand_read_bufferram(struct mtd_info *mtd, int area,
 	struct onenand_chip *this = mtd->priv;
 	dma_addr_t dma_src, dma_dst;
 	int bram_offset;
-	unsigned long timeout;
 	void *buf = (void *)buffer;
 	size_t xtra;
-	volatile unsigned *done;
+	int ret;
 
 	bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
 	if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
@@ -341,25 +418,10 @@ static int omap3_onenand_read_bufferram(struct mtd_info *mtd, int area,
 		goto out_copy;
 	}
 
-	omap_set_dma_transfer_params(c->dma_channel, OMAP_DMA_DATA_TYPE_S32,
-				     count >> 2, 1, 0, 0, 0);
-	omap_set_dma_src_params(c->dma_channel, 0, OMAP_DMA_AMODE_POST_INC,
-				dma_src, 0, 0);
-	omap_set_dma_dest_params(c->dma_channel, 0, OMAP_DMA_AMODE_POST_INC,
-				 dma_dst, 0, 0);
-
-	reinit_completion(&c->dma_done);
-	omap_start_dma(c->dma_channel);
-
-	timeout = jiffies + msecs_to_jiffies(20);
-	done = &c->dma_done.done;
-	while (time_before(jiffies, timeout))
-		if (*done)
-			break;
-
+	ret = omap2_onenand_dma_transfer(c, dma_src, dma_dst, count);
 	dma_unmap_single(&c->pdev->dev, dma_dst, count, DMA_FROM_DEVICE);
 
-	if (!*done) {
+	if (ret) {
 		dev_err(&c->pdev->dev, "timeout waiting for DMA\n");
 		goto out_copy;
 	}
@@ -371,7 +433,7 @@ static int omap3_onenand_read_bufferram(struct mtd_info *mtd, int area,
 	return 0;
 }
 
-static int omap3_onenand_write_bufferram(struct mtd_info *mtd, int area,
+static int omap2_onenand_write_bufferram(struct mtd_info *mtd, int area,
 					 const unsigned char *buffer,
 					 int offset, size_t count)
 {
@@ -379,9 +441,8 @@ static int omap3_onenand_write_bufferram(struct mtd_info *mtd, int area,
 	struct onenand_chip *this = mtd->priv;
 	dma_addr_t dma_src, dma_dst;
 	int bram_offset;
-	unsigned long timeout;
 	void *buf = (void *)buffer;
-	volatile unsigned *done;
+	int ret;
 
 	bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
 	if (bram_offset & 3 || (size_t)buf & 3 || count < 384)
@@ -412,25 +473,10 @@ static int omap3_onenand_write_bufferram(struct mtd_info *mtd, int area,
 		return -1;
 	}
 
-	omap_set_dma_transfer_params(c->dma_channel, OMAP_DMA_DATA_TYPE_S32,
-				     count >> 2, 1, 0, 0, 0);
-	omap_set_dma_src_params(c->dma_channel, 0, OMAP_DMA_AMODE_POST_INC,
-				dma_src, 0, 0);
-	omap_set_dma_dest_params(c->dma_channel, 0, OMAP_DMA_AMODE_POST_INC,
-				 dma_dst, 0, 0);
-
-	reinit_completion(&c->dma_done);
-	omap_start_dma(c->dma_channel);
-
-	timeout = jiffies + msecs_to_jiffies(20);
-	done = &c->dma_done.done;
-	while (time_before(jiffies, timeout))
-		if (*done)
-			break;
-
+	ret = omap2_onenand_dma_transfer(c, dma_src, dma_dst, count);
 	dma_unmap_single(&c->pdev->dev, dma_src, count, DMA_TO_DEVICE);
 
-	if (!*done) {
+	if (ret) {
 		dev_err(&c->pdev->dev, "timeout waiting for DMA\n");
 		goto out_copy;
 	}
@@ -442,136 +488,6 @@ static int omap3_onenand_write_bufferram(struct mtd_info *mtd, int area,
 	return 0;
 }
 
-#else
-
-static int omap3_onenand_read_bufferram(struct mtd_info *mtd, int area,
-					unsigned char *buffer, int offset,
-					size_t count)
-{
-	return -ENOSYS;
-}
-
-static int omap3_onenand_write_bufferram(struct mtd_info *mtd, int area,
-					 const unsigned char *buffer,
-					 int offset, size_t count)
-{
-	return -ENOSYS;
-}
-
-#endif
-
-#if defined(CONFIG_ARCH_OMAP2) || defined(MULTI_OMAP2)
-
-static int omap2_onenand_read_bufferram(struct mtd_info *mtd, int area,
-					unsigned char *buffer, int offset,
-					size_t count)
-{
-	struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
-	struct onenand_chip *this = mtd->priv;
-	dma_addr_t dma_src, dma_dst;
-	int bram_offset;
-
-	bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
-	/* DMA is not used.  Revisit PM requirements before enabling it. */
-	if (1 || (c->dma_channel < 0) ||
-	    ((void *) buffer >= (void *) high_memory) || (bram_offset & 3) ||
-	    (((unsigned int) buffer) & 3) || (count < 1024) || (count & 3)) {
-		memcpy(buffer, (__force void *)(this->base + bram_offset),
-		       count);
-		return 0;
-	}
-
-	dma_src = c->phys_base + bram_offset;
-	dma_dst = dma_map_single(&c->pdev->dev, buffer, count,
-				 DMA_FROM_DEVICE);
-	if (dma_mapping_error(&c->pdev->dev, dma_dst)) {
-		dev_err(&c->pdev->dev,
-			"Couldn't DMA map a %d byte buffer\n",
-			count);
-		return -1;
-	}
-
-	omap_set_dma_transfer_params(c->dma_channel, OMAP_DMA_DATA_TYPE_S32,
-				     count / 4, 1, 0, 0, 0);
-	omap_set_dma_src_params(c->dma_channel, 0, OMAP_DMA_AMODE_POST_INC,
-				dma_src, 0, 0);
-	omap_set_dma_dest_params(c->dma_channel, 0, OMAP_DMA_AMODE_POST_INC,
-				 dma_dst, 0, 0);
-
-	reinit_completion(&c->dma_done);
-	omap_start_dma(c->dma_channel);
-	wait_for_completion(&c->dma_done);
-
-	dma_unmap_single(&c->pdev->dev, dma_dst, count, DMA_FROM_DEVICE);
-
-	return 0;
-}
-
-static int omap2_onenand_write_bufferram(struct mtd_info *mtd, int area,
-					 const unsigned char *buffer,
-					 int offset, size_t count)
-{
-	struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
-	struct onenand_chip *this = mtd->priv;
-	dma_addr_t dma_src, dma_dst;
-	int bram_offset;
-
-	bram_offset = omap2_onenand_bufferram_offset(mtd, area) + area + offset;
-	/* DMA is not used.  Revisit PM requirements before enabling it. */
-	if (1 || (c->dma_channel < 0) ||
-	    ((void *) buffer >= (void *) high_memory) || (bram_offset & 3) ||
-	    (((unsigned int) buffer) & 3) || (count < 1024) || (count & 3)) {
-		memcpy((__force void *)(this->base + bram_offset), buffer,
-		       count);
-		return 0;
-	}
-
-	dma_src = dma_map_single(&c->pdev->dev, (void *) buffer, count,
-				 DMA_TO_DEVICE);
-	dma_dst = c->phys_base + bram_offset;
-	if (dma_mapping_error(&c->pdev->dev, dma_src)) {
-		dev_err(&c->pdev->dev,
-			"Couldn't DMA map a %d byte buffer\n",
-			count);
-		return -1;
-	}
-
-	omap_set_dma_transfer_params(c->dma_channel, OMAP_DMA_DATA_TYPE_S16,
-				     count / 2, 1, 0, 0, 0);
-	omap_set_dma_src_params(c->dma_channel, 0, OMAP_DMA_AMODE_POST_INC,
-				dma_src, 0, 0);
-	omap_set_dma_dest_params(c->dma_channel, 0, OMAP_DMA_AMODE_POST_INC,
-				 dma_dst, 0, 0);
-
-	reinit_completion(&c->dma_done);
-	omap_start_dma(c->dma_channel);
-	wait_for_completion(&c->dma_done);
-
-	dma_unmap_single(&c->pdev->dev, dma_src, count, DMA_TO_DEVICE);
-
-	return 0;
-}
-
-#else
-
-static int omap2_onenand_read_bufferram(struct mtd_info *mtd, int area,
-					unsigned char *buffer, int offset,
-					size_t count)
-{
-	return -ENOSYS;
-}
-
-static int omap2_onenand_write_bufferram(struct mtd_info *mtd, int area,
-					 const unsigned char *buffer,
-					 int offset, size_t count)
-{
-	return -ENOSYS;
-}
-
-#endif
-
-static struct platform_driver omap2_onenand_driver;
-
 static void omap2_onenand_shutdown(struct platform_device *pdev)
 {
 	struct omap2_onenand *c = dev_get_drvdata(&pdev->dev);
@@ -583,168 +499,117 @@ static void omap2_onenand_shutdown(struct platform_device *pdev)
 	memset((__force void *)c->onenand.base, 0, ONENAND_BUFRAM_SIZE);
 }
 
-static int omap2_onenand_enable(struct mtd_info *mtd)
-{
-	int ret;
-	struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
-
-	ret = regulator_enable(c->regulator);
-	if (ret != 0)
-		dev_err(&c->pdev->dev, "can't enable regulator\n");
-
-	return ret;
-}
-
-static int omap2_onenand_disable(struct mtd_info *mtd)
-{
-	int ret;
-	struct omap2_onenand *c = container_of(mtd, struct omap2_onenand, mtd);
-
-	ret = regulator_disable(c->regulator);
-	if (ret != 0)
-		dev_err(&c->pdev->dev, "can't disable regulator\n");
-
-	return ret;
-}
-
 static int omap2_onenand_probe(struct platform_device *pdev)
 {
-	struct omap_onenand_platform_data *pdata;
-	struct omap2_onenand *c;
-	struct onenand_chip *this;
-	int r;
+	u32 val;
+	dma_cap_mask_t mask;
+	int freq, latency, r;
 	struct resource *res;
+	struct omap2_onenand *c;
+	struct gpmc_onenand_info info;
+	struct device *dev = &pdev->dev;
+	struct device_node *np = dev->of_node;
 
-	pdata = dev_get_platdata(&pdev->dev);
-	if (pdata == NULL) {
-		dev_err(&pdev->dev, "platform data missing\n");
-		return -ENODEV;
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	if (!res) {
+		dev_err(dev, "error getting memory resource\n");
+		return -EINVAL;
 	}
 
-	c = kzalloc(sizeof(struct omap2_onenand), GFP_KERNEL);
+	r = of_property_read_u32(np, "reg", &val);
+	if (r) {
+		dev_err(dev, "reg not found in DT\n");
+		return r;
+	}
+
+	c = devm_kzalloc(dev, sizeof(struct omap2_onenand), GFP_KERNEL);
 	if (!c)
 		return -ENOMEM;
 
 	init_completion(&c->irq_done);
 	init_completion(&c->dma_done);
-	c->flags = pdata->flags;
-	c->gpmc_cs = pdata->cs;
-	c->gpio_irq = pdata->gpio_irq;
-	c->dma_channel = pdata->dma_channel;
-	if (c->dma_channel < 0) {
-		/* if -1, don't use DMA */
-		c->gpio_irq = 0;
-	}
-
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (res == NULL) {
-		r = -EINVAL;
-		dev_err(&pdev->dev, "error getting memory resource\n");
-		goto err_kfree;
-	}
-
+	c->gpmc_cs = val;
 	c->phys_base = res->start;
-	c->mem_size = resource_size(res);
 
-	if (request_mem_region(c->phys_base, c->mem_size,
-			       pdev->dev.driver->name) == NULL) {
-		dev_err(&pdev->dev, "Cannot reserve memory region at 0x%08lx, size: 0x%x\n",
-						c->phys_base, c->mem_size);
-		r = -EBUSY;
-		goto err_kfree;
-	}
-	c->onenand.base = ioremap(c->phys_base, c->mem_size);
-	if (c->onenand.base == NULL) {
-		r = -ENOMEM;
-		goto err_release_mem_region;
+	c->onenand.base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(c->onenand.base))
+		return PTR_ERR(c->onenand.base);
+
+	c->int_gpiod = devm_gpiod_get_optional(dev, "int", GPIOD_IN);
+	if (IS_ERR(c->int_gpiod)) {
+		r = PTR_ERR(c->int_gpiod);
+		/* Just try again if this happens */
+		if (r != -EPROBE_DEFER)
+			dev_err(dev, "error getting gpio: %d\n", r);
+		return r;
 	}
 
-	if (pdata->onenand_setup != NULL) {
-		r = pdata->onenand_setup(c->onenand.base, &c->freq);
-		if (r < 0) {
-			dev_err(&pdev->dev, "Onenand platform setup failed: "
-				"%d\n", r);
-			goto err_iounmap;
-		}
-		c->setup = pdata->onenand_setup;
+	if (c->int_gpiod) {
+		r = devm_request_irq(dev, gpiod_to_irq(c->int_gpiod),
+				     omap2_onenand_interrupt,
+				     IRQF_TRIGGER_RISING, "onenand", c);
+		if (r)
+			return r;
+
+		c->onenand.wait = omap2_onenand_wait;
 	}
 
-	if (c->gpio_irq) {
-		if ((r = gpio_request(c->gpio_irq, "OneNAND irq")) < 0) {
-			dev_err(&pdev->dev,  "Failed to request GPIO%d for "
-				"OneNAND\n", c->gpio_irq);
-			goto err_iounmap;
-	}
-	gpio_direction_input(c->gpio_irq);
+	dma_cap_zero(mask);
+	dma_cap_set(DMA_MEMCPY, mask);
 
-	if ((r = request_irq(gpio_to_irq(c->gpio_irq),
-			     omap2_onenand_interrupt, IRQF_TRIGGER_RISING,
-			     pdev->dev.driver->name, c)) < 0)
-		goto err_release_gpio;
+	c->dma_chan = dma_request_channel(mask, NULL, NULL);
+	if (c->dma_chan) {
+		c->onenand.read_bufferram = omap2_onenand_read_bufferram;
+		c->onenand.write_bufferram = omap2_onenand_write_bufferram;
 	}
 
-	if (c->dma_channel >= 0) {
-		r = omap_request_dma(0, pdev->dev.driver->name,
-				     omap2_onenand_dma_cb, (void *) c,
-				     &c->dma_channel);
-		if (r == 0) {
-			omap_set_dma_write_mode(c->dma_channel,
-						OMAP_DMA_WRITE_NON_POSTED);
-			omap_set_dma_src_data_pack(c->dma_channel, 1);
-			omap_set_dma_src_burst_mode(c->dma_channel,
-						    OMAP_DMA_DATA_BURST_8);
-			omap_set_dma_dest_data_pack(c->dma_channel, 1);
-			omap_set_dma_dest_burst_mode(c->dma_channel,
-						     OMAP_DMA_DATA_BURST_8);
-		} else {
-			dev_info(&pdev->dev,
-				 "failed to allocate DMA for OneNAND, "
-				 "using PIO instead\n");
-			c->dma_channel = -1;
-		}
-	}
-
-	dev_info(&pdev->dev, "initializing on CS%d, phys base 0x%08lx, virtual "
-		 "base %p, freq %d MHz\n", c->gpmc_cs, c->phys_base,
-		 c->onenand.base, c->freq);
-
 	c->pdev = pdev;
 	c->mtd.priv = &c->onenand;
+	c->mtd.dev.parent = dev;
+	mtd_set_of_node(&c->mtd, dev->of_node);
 
-	c->mtd.dev.parent = &pdev->dev;
-	mtd_set_of_node(&c->mtd, pdata->of_node);
-
-	this = &c->onenand;
-	if (c->dma_channel >= 0) {
-		this->wait = omap2_onenand_wait;
-		if (c->flags & ONENAND_IN_OMAP34XX) {
-			this->read_bufferram = omap3_onenand_read_bufferram;
-			this->write_bufferram = omap3_onenand_write_bufferram;
-		} else {
-			this->read_bufferram = omap2_onenand_read_bufferram;
-			this->write_bufferram = omap2_onenand_write_bufferram;
-		}
-	}
-
-	if (pdata->regulator_can_sleep) {
-		c->regulator = regulator_get(&pdev->dev, "vonenand");
-		if (IS_ERR(c->regulator)) {
-			dev_err(&pdev->dev,  "Failed to get regulator\n");
-			r = PTR_ERR(c->regulator);
-			goto err_release_dma;
-		}
-		c->onenand.enable = omap2_onenand_enable;
-		c->onenand.disable = omap2_onenand_disable;
-	}
-
-	if (pdata->skip_initial_unlocking)
-		this->options |= ONENAND_SKIP_INITIAL_UNLOCKING;
+	dev_info(dev, "initializing on CS%d (0x%08lx), va %p, %s mode\n",
+		 c->gpmc_cs, c->phys_base, c->onenand.base,
+		 c->dma_chan ? "DMA" : "PIO");
 
 	if ((r = onenand_scan(&c->mtd, 1)) < 0)
-		goto err_release_regulator;
+		goto err_release_dma;
 
-	r = mtd_device_register(&c->mtd, pdata ? pdata->parts : NULL,
-				pdata ? pdata->nr_parts : 0);
+	freq = omap2_onenand_get_freq(c->onenand.version_id);
+	if (freq > 0) {
+		switch (freq) {
+		case 104:
+			latency = 7;
+			break;
+		case 83:
+			latency = 6;
+			break;
+		case 66:
+			latency = 5;
+			break;
+		case 56:
+			latency = 4;
+			break;
+		default:	/* 40 MHz or lower */
+			latency = 3;
+			break;
+		}
+
+		r = gpmc_omap_onenand_set_timings(dev, c->gpmc_cs,
+						  freq, latency, &info);
+		if (r)
+			goto err_release_onenand;
+
+		r = omap2_onenand_set_cfg(c, info.sync_read, info.sync_write,
+					  latency, info.burst_len);
+		if (r)
+			goto err_release_onenand;
+
+		if (info.sync_read || info.sync_write)
+			dev_info(dev, "optimized timings for %d MHz\n", freq);
+	}
+
+	r = mtd_device_register(&c->mtd, NULL, 0);
 	if (r)
 		goto err_release_onenand;
 
@@ -754,22 +619,9 @@ static int omap2_onenand_probe(struct platform_device *pdev)
 
 err_release_onenand:
 	onenand_release(&c->mtd);
-err_release_regulator:
-	regulator_put(c->regulator);
 err_release_dma:
-	if (c->dma_channel != -1)
-		omap_free_dma(c->dma_channel);
-	if (c->gpio_irq)
-		free_irq(gpio_to_irq(c->gpio_irq), c);
-err_release_gpio:
-	if (c->gpio_irq)
-		gpio_free(c->gpio_irq);
-err_iounmap:
-	iounmap(c->onenand.base);
-err_release_mem_region:
-	release_mem_region(c->phys_base, c->mem_size);
-err_kfree:
-	kfree(c);
+	if (c->dma_chan)
+		dma_release_channel(c->dma_chan);
 
 	return r;
 }
@@ -779,27 +631,26 @@ static int omap2_onenand_remove(struct platform_device *pdev)
 	struct omap2_onenand *c = dev_get_drvdata(&pdev->dev);
 
 	onenand_release(&c->mtd);
-	regulator_put(c->regulator);
-	if (c->dma_channel != -1)
-		omap_free_dma(c->dma_channel);
+	if (c->dma_chan)
+		dma_release_channel(c->dma_chan);
 	omap2_onenand_shutdown(pdev);
-	if (c->gpio_irq) {
-		free_irq(gpio_to_irq(c->gpio_irq), c);
-		gpio_free(c->gpio_irq);
-	}
-	iounmap(c->onenand.base);
-	release_mem_region(c->phys_base, c->mem_size);
-	kfree(c);
 
 	return 0;
 }
 
+static const struct of_device_id omap2_onenand_id_table[] = {
+	{ .compatible = "ti,omap2-onenand", },
+	{},
+};
+MODULE_DEVICE_TABLE(of, omap2_onenand_id_table);
+
 static struct platform_driver omap2_onenand_driver = {
 	.probe		= omap2_onenand_probe,
 	.remove		= omap2_onenand_remove,
 	.shutdown	= omap2_onenand_shutdown,
 	.driver		= {
 		.name	= DRIVER_NAME,
+		.of_match_table = omap2_onenand_id_table,
 	},
 };
 
diff --git a/drivers/mtd/onenand/onenand_base.c b/drivers/mtd/onenand/onenand_base.c
index 1a6d0e3..979f403 100644
--- a/drivers/mtd/onenand/onenand_base.c
+++ b/drivers/mtd/onenand/onenand_base.c
@@ -1383,15 +1383,6 @@ static int onenand_read_oob_nolock(struct mtd_info *mtd, loff_t from,
 		return -EINVAL;
 	}
 
-	/* Do not allow reads past end of device */
-	if (unlikely(from >= mtd->size ||
-		     column + len > ((mtd->size >> this->page_shift) -
-				     (from >> this->page_shift)) * oobsize)) {
-		printk(KERN_ERR "%s: Attempted to read beyond end of device\n",
-			__func__);
-		return -EINVAL;
-	}
-
 	stats = mtd->ecc_stats;
 
 	readcmd = ONENAND_IS_4KB_PAGE(this) ? ONENAND_CMD_READ : ONENAND_CMD_READOOB;
@@ -1448,38 +1439,6 @@ static int onenand_read_oob_nolock(struct mtd_info *mtd, loff_t from,
 }
 
 /**
- * onenand_read - [MTD Interface] Read data from flash
- * @param mtd		MTD device structure
- * @param from		offset to read from
- * @param len		number of bytes to read
- * @param retlen	pointer to variable to store the number of read bytes
- * @param buf		the databuffer to put data
- *
- * Read with ecc
-*/
-static int onenand_read(struct mtd_info *mtd, loff_t from, size_t len,
-	size_t *retlen, u_char *buf)
-{
-	struct onenand_chip *this = mtd->priv;
-	struct mtd_oob_ops ops = {
-		.len	= len,
-		.ooblen	= 0,
-		.datbuf	= buf,
-		.oobbuf	= NULL,
-	};
-	int ret;
-
-	onenand_get_device(mtd, FL_READING);
-	ret = ONENAND_IS_4KB_PAGE(this) ?
-		onenand_mlc_read_ops_nolock(mtd, from, &ops) :
-		onenand_read_ops_nolock(mtd, from, &ops);
-	onenand_release_device(mtd);
-
-	*retlen = ops.retlen;
-	return ret;
-}
-
-/**
  * onenand_read_oob - [MTD Interface] Read main and/or out-of-band
  * @param mtd:		MTD device structure
  * @param from:		offset to read from
@@ -2056,15 +2015,6 @@ static int onenand_write_oob_nolock(struct mtd_info *mtd, loff_t to,
 		return -EINVAL;
 	}
 
-	/* Do not allow reads past end of device */
-	if (unlikely(to >= mtd->size ||
-		     column + len > ((mtd->size >> this->page_shift) -
-				     (to >> this->page_shift)) * oobsize)) {
-		printk(KERN_ERR "%s: Attempted to write past end of device\n",
-		       __func__);
-		return -EINVAL;
-	}
-
 	oobbuf = this->oob_buf;
 
 	oobcmd = ONENAND_IS_4KB_PAGE(this) ? ONENAND_CMD_PROG : ONENAND_CMD_PROGOOB;
@@ -2129,35 +2079,6 @@ static int onenand_write_oob_nolock(struct mtd_info *mtd, loff_t to,
 }
 
 /**
- * onenand_write - [MTD Interface] write buffer to FLASH
- * @param mtd		MTD device structure
- * @param to		offset to write to
- * @param len		number of bytes to write
- * @param retlen	pointer to variable to store the number of written bytes
- * @param buf		the data to write
- *
- * Write with ECC
- */
-static int onenand_write(struct mtd_info *mtd, loff_t to, size_t len,
-	size_t *retlen, const u_char *buf)
-{
-	struct mtd_oob_ops ops = {
-		.len	= len,
-		.ooblen	= 0,
-		.datbuf	= (u_char *) buf,
-		.oobbuf	= NULL,
-	};
-	int ret;
-
-	onenand_get_device(mtd, FL_WRITING);
-	ret = onenand_write_ops_nolock(mtd, to, &ops);
-	onenand_release_device(mtd);
-
-	*retlen = ops.retlen;
-	return ret;
-}
-
-/**
  * onenand_write_oob - [MTD Interface] NAND write data and/or out-of-band
  * @param mtd:		MTD device structure
  * @param to:		offset to write
@@ -4038,8 +3959,6 @@ int onenand_scan(struct mtd_info *mtd, int maxchips)
 	mtd->_erase = onenand_erase;
 	mtd->_point = NULL;
 	mtd->_unpoint = NULL;
-	mtd->_read = onenand_read;
-	mtd->_write = onenand_write;
 	mtd->_read_oob = onenand_read_oob;
 	mtd->_write_oob = onenand_write_oob;
 	mtd->_panic_write = onenand_panic_write;
diff --git a/drivers/mtd/onenand/samsung.c b/drivers/mtd/onenand/samsung.c
index af0ac1a..2e9d076 100644
--- a/drivers/mtd/onenand/samsung.c
+++ b/drivers/mtd/onenand/samsung.c
@@ -25,8 +25,6 @@
 #include <linux/interrupt.h>
 #include <linux/io.h>
 
-#include <asm/mach/flash.h>
-
 #include "samsung.h"
 
 enum soc_type {
@@ -129,16 +127,13 @@ struct s3c_onenand {
 	struct platform_device	*pdev;
 	enum soc_type	type;
 	void __iomem	*base;
-	struct resource *base_res;
 	void __iomem	*ahb_addr;
-	struct resource *ahb_res;
 	int		bootram_command;
-	void __iomem	*page_buf;
-	void __iomem	*oob_buf;
+	void		*page_buf;
+	void		*oob_buf;
 	unsigned int	(*mem_addr)(int fba, int fpa, int fsa);
 	unsigned int	(*cmd_map)(unsigned int type, unsigned int val);
 	void __iomem	*dma_addr;
-	struct resource *dma_res;
 	unsigned long	phys_base;
 	struct completion	complete;
 };
@@ -413,8 +408,8 @@ static int s3c_onenand_command(struct mtd_info *mtd, int cmd, loff_t addr,
 	/*
 	 * Emulate Two BufferRAMs and access with 4 bytes pointer
 	 */
-	m = (unsigned int *) onenand->page_buf;
-	s = (unsigned int *) onenand->oob_buf;
+	m = onenand->page_buf;
+	s = onenand->oob_buf;
 
 	if (index) {
 		m += (this->writesize >> 2);
@@ -486,11 +481,11 @@ static unsigned char *s3c_get_bufferram(struct mtd_info *mtd, int area)
 	unsigned char *p;
 
 	if (area == ONENAND_DATARAM) {
-		p = (unsigned char *) onenand->page_buf;
+		p = onenand->page_buf;
 		if (index == 1)
 			p += this->writesize;
 	} else {
-		p = (unsigned char *) onenand->oob_buf;
+		p = onenand->oob_buf;
 		if (index == 1)
 			p += mtd->oobsize;
 	}
@@ -851,15 +846,14 @@ static int s3c_onenand_probe(struct platform_device *pdev)
 	/* No need to check pdata. the platform data is optional */
 
 	size = sizeof(struct mtd_info) + sizeof(struct onenand_chip);
-	mtd = kzalloc(size, GFP_KERNEL);
+	mtd = devm_kzalloc(&pdev->dev, size, GFP_KERNEL);
 	if (!mtd)
 		return -ENOMEM;
 
-	onenand = kzalloc(sizeof(struct s3c_onenand), GFP_KERNEL);
-	if (!onenand) {
-		err = -ENOMEM;
-		goto onenand_fail;
-	}
+	onenand = devm_kzalloc(&pdev->dev, sizeof(struct s3c_onenand),
+			       GFP_KERNEL);
+	if (!onenand)
+		return -ENOMEM;
 
 	this = (struct onenand_chip *) &mtd[1];
 	mtd->priv = this;
@@ -870,26 +864,12 @@ static int s3c_onenand_probe(struct platform_device *pdev)
 	s3c_onenand_setup(mtd);
 
 	r = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	if (!r) {
-		dev_err(&pdev->dev, "no memory resource defined\n");
-		return -ENOENT;
-		goto ahb_resource_failed;
-	}
+	onenand->base = devm_ioremap_resource(&pdev->dev, r);
+	if (IS_ERR(onenand->base))
+		return PTR_ERR(onenand->base);
 
-	onenand->base_res = request_mem_region(r->start, resource_size(r),
-					       pdev->name);
-	if (!onenand->base_res) {
-		dev_err(&pdev->dev, "failed to request memory resource\n");
-		err = -EBUSY;
-		goto resource_failed;
-	}
+	onenand->phys_base = r->start;
 
-	onenand->base = ioremap(r->start, resource_size(r));
-	if (!onenand->base) {
-		dev_err(&pdev->dev, "failed to map memory resource\n");
-		err = -EFAULT;
-		goto ioremap_failed;
-	}
 	/* Set onenand_chip also */
 	this->base = onenand->base;
 
@@ -898,40 +878,20 @@ static int s3c_onenand_probe(struct platform_device *pdev)
 
 	if (onenand->type != TYPE_S5PC110) {
 		r = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-		if (!r) {
-			dev_err(&pdev->dev, "no buffer memory resource defined\n");
-			err = -ENOENT;
-			goto ahb_resource_failed;
-		}
-
-		onenand->ahb_res = request_mem_region(r->start, resource_size(r),
-						      pdev->name);
-		if (!onenand->ahb_res) {
-			dev_err(&pdev->dev, "failed to request buffer memory resource\n");
-			err = -EBUSY;
-			goto ahb_resource_failed;
-		}
-
-		onenand->ahb_addr = ioremap(r->start, resource_size(r));
-		if (!onenand->ahb_addr) {
-			dev_err(&pdev->dev, "failed to map buffer memory resource\n");
-			err = -EINVAL;
-			goto ahb_ioremap_failed;
-		}
+		onenand->ahb_addr = devm_ioremap_resource(&pdev->dev, r);
+		if (IS_ERR(onenand->ahb_addr))
+			return PTR_ERR(onenand->ahb_addr);
 
 		/* Allocate 4KiB BufferRAM */
-		onenand->page_buf = kzalloc(SZ_4K, GFP_KERNEL);
-		if (!onenand->page_buf) {
-			err = -ENOMEM;
-			goto page_buf_fail;
-		}
+		onenand->page_buf = devm_kzalloc(&pdev->dev, SZ_4K,
+						 GFP_KERNEL);
+		if (!onenand->page_buf)
+			return -ENOMEM;
 
 		/* Allocate 128 SpareRAM */
-		onenand->oob_buf = kzalloc(128, GFP_KERNEL);
-		if (!onenand->oob_buf) {
-			err = -ENOMEM;
-			goto oob_buf_fail;
-		}
+		onenand->oob_buf = devm_kzalloc(&pdev->dev, 128, GFP_KERNEL);
+		if (!onenand->oob_buf)
+			return -ENOMEM;
 
 		/* S3C doesn't handle subpage write */
 		mtd->subpage_sft = 0;
@@ -939,28 +899,9 @@ static int s3c_onenand_probe(struct platform_device *pdev)
 
 	} else { /* S5PC110 */
 		r = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-		if (!r) {
-			dev_err(&pdev->dev, "no dma memory resource defined\n");
-			err = -ENOENT;
-			goto dma_resource_failed;
-		}
-
-		onenand->dma_res = request_mem_region(r->start, resource_size(r),
-						      pdev->name);
-		if (!onenand->dma_res) {
-			dev_err(&pdev->dev, "failed to request dma memory resource\n");
-			err = -EBUSY;
-			goto dma_resource_failed;
-		}
-
-		onenand->dma_addr = ioremap(r->start, resource_size(r));
-		if (!onenand->dma_addr) {
-			dev_err(&pdev->dev, "failed to map dma memory resource\n");
-			err = -EINVAL;
-			goto dma_ioremap_failed;
-		}
-
-		onenand->phys_base = onenand->base_res->start;
+		onenand->dma_addr = devm_ioremap_resource(&pdev->dev, r);
+		if (IS_ERR(onenand->dma_addr))
+			return PTR_ERR(onenand->dma_addr);
 
 		s5pc110_dma_ops = s5pc110_dma_poll;
 		/* Interrupt support */
@@ -968,19 +909,20 @@ static int s3c_onenand_probe(struct platform_device *pdev)
 		if (r) {
 			init_completion(&onenand->complete);
 			s5pc110_dma_ops = s5pc110_dma_irq;
-			err = request_irq(r->start, s5pc110_onenand_irq,
-					IRQF_SHARED, "onenand", &onenand);
+			err = devm_request_irq(&pdev->dev, r->start,
+					       s5pc110_onenand_irq,
+					       IRQF_SHARED, "onenand",
+					       &onenand);
 			if (err) {
 				dev_err(&pdev->dev, "failed to get irq\n");
-				goto scan_failed;
+				return err;
 			}
 		}
 	}
 
-	if (onenand_scan(mtd, 1)) {
-		err = -EFAULT;
-		goto scan_failed;
-	}
+	err = onenand_scan(mtd, 1);
+	if (err)
+		return err;
 
 	if (onenand->type != TYPE_S5PC110) {
 		/* S3C doesn't handle subpage write */
@@ -994,40 +936,15 @@ static int s3c_onenand_probe(struct platform_device *pdev)
 	err = mtd_device_parse_register(mtd, NULL, NULL,
 					pdata ? pdata->parts : NULL,
 					pdata ? pdata->nr_parts : 0);
+	if (err) {
+		dev_err(&pdev->dev, "failed to parse partitions and register the MTD device\n");
+		onenand_release(mtd);
+		return err;
+	}
 
 	platform_set_drvdata(pdev, mtd);
 
 	return 0;
-
-scan_failed:
-	if (onenand->dma_addr)
-		iounmap(onenand->dma_addr);
-dma_ioremap_failed:
-	if (onenand->dma_res)
-		release_mem_region(onenand->dma_res->start,
-				   resource_size(onenand->dma_res));
-	kfree(onenand->oob_buf);
-oob_buf_fail:
-	kfree(onenand->page_buf);
-page_buf_fail:
-	if (onenand->ahb_addr)
-		iounmap(onenand->ahb_addr);
-ahb_ioremap_failed:
-	if (onenand->ahb_res)
-		release_mem_region(onenand->ahb_res->start,
-				   resource_size(onenand->ahb_res));
-dma_resource_failed:
-ahb_resource_failed:
-	iounmap(onenand->base);
-ioremap_failed:
-	if (onenand->base_res)
-		release_mem_region(onenand->base_res->start,
-				   resource_size(onenand->base_res));
-resource_failed:
-	kfree(onenand);
-onenand_fail:
-	kfree(mtd);
-	return err;
 }
 
 static int s3c_onenand_remove(struct platform_device *pdev)
@@ -1035,25 +952,7 @@ static int s3c_onenand_remove(struct platform_device *pdev)
 	struct mtd_info *mtd = platform_get_drvdata(pdev);
 
 	onenand_release(mtd);
-	if (onenand->ahb_addr)
-		iounmap(onenand->ahb_addr);
-	if (onenand->ahb_res)
-		release_mem_region(onenand->ahb_res->start,
-				   resource_size(onenand->ahb_res));
-	if (onenand->dma_addr)
-		iounmap(onenand->dma_addr);
-	if (onenand->dma_res)
-		release_mem_region(onenand->dma_res->start,
-				   resource_size(onenand->dma_res));
 
-	iounmap(onenand->base);
-	release_mem_region(onenand->base_res->start,
-			   resource_size(onenand->base_res));
-
-	kfree(onenand->oob_buf);
-	kfree(onenand->page_buf);
-	kfree(onenand);
-	kfree(mtd);
 	return 0;
 }
 
diff --git a/drivers/mtd/parsers/sharpslpart.c b/drivers/mtd/parsers/sharpslpart.c
index 5fe0079..8893dc8 100644
--- a/drivers/mtd/parsers/sharpslpart.c
+++ b/drivers/mtd/parsers/sharpslpart.c
@@ -192,7 +192,7 @@ static int sharpsl_nand_init_ftl(struct mtd_info *mtd, struct sharpsl_ftl *ftl)
 
 	/* create physical-logical table */
 	for (block_num = 0; block_num < phymax; block_num++) {
-		block_adr = block_num * mtd->erasesize;
+		block_adr = (loff_t)block_num * mtd->erasesize;
 
 		if (mtd_block_isbad(mtd, block_adr))
 			continue;
@@ -219,7 +219,7 @@ static int sharpsl_nand_init_ftl(struct mtd_info *mtd, struct sharpsl_ftl *ftl)
 	return ret;
 }
 
-void sharpsl_nand_cleanup_ftl(struct sharpsl_ftl *ftl)
+static void sharpsl_nand_cleanup_ftl(struct sharpsl_ftl *ftl)
 {
 	kfree(ftl->log2phy);
 }
@@ -244,7 +244,7 @@ static int sharpsl_nand_read_laddr(struct mtd_info *mtd,
 		return -EINVAL;
 
 	block_num = ftl->log2phy[log_num];
-	block_adr = block_num * mtd->erasesize;
+	block_adr = (loff_t)block_num * mtd->erasesize;
 	block_ofs = mtd_mod_by_eb((u32)from, mtd);
 
 	err = mtd_read(mtd, block_adr + block_ofs, len, &retlen, buf);
diff --git a/drivers/mtd/spi-nor/cadence-quadspi.c b/drivers/mtd/spi-nor/cadence-quadspi.c
index 75a2bc4..4b8e918 100644
--- a/drivers/mtd/spi-nor/cadence-quadspi.c
+++ b/drivers/mtd/spi-nor/cadence-quadspi.c
@@ -58,6 +58,7 @@ struct cqspi_flash_pdata {
 	u8		data_width;
 	u8		cs;
 	bool		registered;
+	bool		use_direct_mode;
 };
 
 struct cqspi_st {
@@ -68,6 +69,7 @@ struct cqspi_st {
 
 	void __iomem		*iobase;
 	void __iomem		*ahb_base;
+	resource_size_t		ahb_size;
 	struct completion	transfer_complete;
 	struct mutex		bus_mutex;
 
@@ -103,6 +105,7 @@ struct cqspi_st {
 /* Register map */
 #define CQSPI_REG_CONFIG			0x00
 #define CQSPI_REG_CONFIG_ENABLE_MASK		BIT(0)
+#define CQSPI_REG_CONFIG_ENB_DIR_ACC_CTRL	BIT(7)
 #define CQSPI_REG_CONFIG_DECODE_MASK		BIT(9)
 #define CQSPI_REG_CONFIG_CHIPSELECT_LSB		10
 #define CQSPI_REG_CONFIG_DMA_MASK		BIT(15)
@@ -450,8 +453,7 @@ static int cqspi_command_write_addr(struct spi_nor *nor,
 	return cqspi_exec_flash_cmd(cqspi, reg);
 }
 
-static int cqspi_indirect_read_setup(struct spi_nor *nor,
-				     const unsigned int from_addr)
+static int cqspi_read_setup(struct spi_nor *nor)
 {
 	struct cqspi_flash_pdata *f_pdata = nor->priv;
 	struct cqspi_st *cqspi = f_pdata->cqspi;
@@ -459,8 +461,6 @@ static int cqspi_indirect_read_setup(struct spi_nor *nor,
 	unsigned int dummy_clk = 0;
 	unsigned int reg;
 
-	writel(from_addr, reg_base + CQSPI_REG_INDIRECTRDSTARTADDR);
-
 	reg = nor->read_opcode << CQSPI_REG_RD_INSTR_OPCODE_LSB;
 	reg |= cqspi_calc_rdreg(nor, nor->read_opcode);
 
@@ -493,8 +493,8 @@ static int cqspi_indirect_read_setup(struct spi_nor *nor,
 	return 0;
 }
 
-static int cqspi_indirect_read_execute(struct spi_nor *nor,
-				       u8 *rxbuf, const unsigned n_rx)
+static int cqspi_indirect_read_execute(struct spi_nor *nor, u8 *rxbuf,
+				       loff_t from_addr, const size_t n_rx)
 {
 	struct cqspi_flash_pdata *f_pdata = nor->priv;
 	struct cqspi_st *cqspi = f_pdata->cqspi;
@@ -504,6 +504,7 @@ static int cqspi_indirect_read_execute(struct spi_nor *nor,
 	unsigned int bytes_to_read = 0;
 	int ret = 0;
 
+	writel(from_addr, reg_base + CQSPI_REG_INDIRECTRDSTARTADDR);
 	writel(remaining, reg_base + CQSPI_REG_INDIRECTRDBYTES);
 
 	/* Clear all interrupts. */
@@ -570,8 +571,7 @@ static int cqspi_indirect_read_execute(struct spi_nor *nor,
 	return ret;
 }
 
-static int cqspi_indirect_write_setup(struct spi_nor *nor,
-				      const unsigned int to_addr)
+static int cqspi_write_setup(struct spi_nor *nor)
 {
 	unsigned int reg;
 	struct cqspi_flash_pdata *f_pdata = nor->priv;
@@ -584,8 +584,6 @@ static int cqspi_indirect_write_setup(struct spi_nor *nor,
 	reg = cqspi_calc_rdreg(nor, nor->program_opcode);
 	writel(reg, reg_base + CQSPI_REG_RD_INSTR);
 
-	writel(to_addr, reg_base + CQSPI_REG_INDIRECTWRSTARTADDR);
-
 	reg = readl(reg_base + CQSPI_REG_SIZE);
 	reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK;
 	reg |= (nor->addr_width - 1);
@@ -593,8 +591,8 @@ static int cqspi_indirect_write_setup(struct spi_nor *nor,
 	return 0;
 }
 
-static int cqspi_indirect_write_execute(struct spi_nor *nor,
-					const u8 *txbuf, const unsigned n_tx)
+static int cqspi_indirect_write_execute(struct spi_nor *nor, loff_t to_addr,
+					const u8 *txbuf, const size_t n_tx)
 {
 	const unsigned int page_size = nor->page_size;
 	struct cqspi_flash_pdata *f_pdata = nor->priv;
@@ -604,6 +602,7 @@ static int cqspi_indirect_write_execute(struct spi_nor *nor,
 	unsigned int write_bytes;
 	int ret;
 
+	writel(to_addr, reg_base + CQSPI_REG_INDIRECTWRSTARTADDR);
 	writel(remaining, reg_base + CQSPI_REG_INDIRECTWRBYTES);
 
 	/* Clear all interrupts. */
@@ -894,17 +893,22 @@ static int cqspi_set_protocol(struct spi_nor *nor, const int read)
 static ssize_t cqspi_write(struct spi_nor *nor, loff_t to,
 			   size_t len, const u_char *buf)
 {
+	struct cqspi_flash_pdata *f_pdata = nor->priv;
+	struct cqspi_st *cqspi = f_pdata->cqspi;
 	int ret;
 
 	ret = cqspi_set_protocol(nor, 0);
 	if (ret)
 		return ret;
 
-	ret = cqspi_indirect_write_setup(nor, to);
+	ret = cqspi_write_setup(nor);
 	if (ret)
 		return ret;
 
-	ret = cqspi_indirect_write_execute(nor, buf, len);
+	if (f_pdata->use_direct_mode)
+		memcpy_toio(cqspi->ahb_base + to, buf, len);
+	else
+		ret = cqspi_indirect_write_execute(nor, to, buf, len);
 	if (ret)
 		return ret;
 
@@ -914,17 +918,22 @@ static ssize_t cqspi_write(struct spi_nor *nor, loff_t to,
 static ssize_t cqspi_read(struct spi_nor *nor, loff_t from,
 			  size_t len, u_char *buf)
 {
+	struct cqspi_flash_pdata *f_pdata = nor->priv;
+	struct cqspi_st *cqspi = f_pdata->cqspi;
 	int ret;
 
 	ret = cqspi_set_protocol(nor, 1);
 	if (ret)
 		return ret;
 
-	ret = cqspi_indirect_read_setup(nor, from);
+	ret = cqspi_read_setup(nor);
 	if (ret)
 		return ret;
 
-	ret = cqspi_indirect_read_execute(nor, buf, len);
+	if (f_pdata->use_direct_mode)
+		memcpy_fromio(buf, cqspi->ahb_base + from, len);
+	else
+		ret = cqspi_indirect_read_execute(nor, buf, from, len);
 	if (ret)
 		return ret;
 
@@ -1059,6 +1068,8 @@ static int cqspi_of_get_pdata(struct platform_device *pdev)
 
 static void cqspi_controller_init(struct cqspi_st *cqspi)
 {
+	u32 reg;
+
 	cqspi_controller_enable(cqspi, 0);
 
 	/* Configure the remap address register, no remap */
@@ -1081,6 +1092,11 @@ static void cqspi_controller_init(struct cqspi_st *cqspi)
 	writel(cqspi->fifo_depth * cqspi->fifo_width / 8,
 	       cqspi->iobase + CQSPI_REG_INDIRECTWRWATERMARK);
 
+	/* Enable Direct Access Controller */
+	reg = readl(cqspi->iobase + CQSPI_REG_CONFIG);
+	reg |= CQSPI_REG_CONFIG_ENB_DIR_ACC_CTRL;
+	writel(reg, cqspi->iobase + CQSPI_REG_CONFIG);
+
 	cqspi_controller_enable(cqspi, 1);
 }
 
@@ -1156,6 +1172,12 @@ static int cqspi_setup_flash(struct cqspi_st *cqspi, struct device_node *np)
 			goto err;
 
 		f_pdata->registered = true;
+
+		if (mtd->size <= cqspi->ahb_size) {
+			f_pdata->use_direct_mode = true;
+			dev_dbg(nor->dev, "using direct mode for %s\n",
+				mtd->name);
+		}
 	}
 
 	return 0;
@@ -1215,6 +1237,7 @@ static int cqspi_probe(struct platform_device *pdev)
 		dev_err(dev, "Cannot remap AHB address.\n");
 		return PTR_ERR(cqspi->ahb_base);
 	}
+	cqspi->ahb_size = resource_size(res_ahb);
 
 	init_completion(&cqspi->transfer_complete);
 
diff --git a/drivers/mtd/spi-nor/fsl-quadspi.c b/drivers/mtd/spi-nor/fsl-quadspi.c
index f17d224..2901c7b 100644
--- a/drivers/mtd/spi-nor/fsl-quadspi.c
+++ b/drivers/mtd/spi-nor/fsl-quadspi.c
@@ -801,10 +801,10 @@ static int fsl_qspi_nor_setup_last(struct fsl_qspi *q)
 }
 
 static const struct of_device_id fsl_qspi_dt_ids[] = {
-	{ .compatible = "fsl,vf610-qspi", .data = (void *)&vybrid_data, },
-	{ .compatible = "fsl,imx6sx-qspi", .data = (void *)&imx6sx_data, },
-	{ .compatible = "fsl,imx7d-qspi", .data = (void *)&imx7d_data, },
-	{ .compatible = "fsl,imx6ul-qspi", .data = (void *)&imx6ul_data, },
+	{ .compatible = "fsl,vf610-qspi", .data = &vybrid_data, },
+	{ .compatible = "fsl,imx6sx-qspi", .data = &imx6sx_data, },
+	{ .compatible = "fsl,imx7d-qspi", .data = &imx7d_data, },
+	{ .compatible = "fsl,imx6ul-qspi", .data = &imx6ul_data, },
 	{ .compatible = "fsl,ls1021a-qspi", .data = (void *)&ls1021a_data, },
 	{ /* sentinel */ }
 };
diff --git a/drivers/mtd/spi-nor/intel-spi.c b/drivers/mtd/spi-nor/intel-spi.c
index ef034d8..6999515 100644
--- a/drivers/mtd/spi-nor/intel-spi.c
+++ b/drivers/mtd/spi-nor/intel-spi.c
@@ -138,7 +138,6 @@
  * @erase_64k: 64k erase supported
  * @opcodes: Opcodes which are supported. This are programmed by BIOS
  *           before it locks down the controller.
- * @preopcodes: Preopcodes which are supported.
  */
 struct intel_spi {
 	struct device *dev;
@@ -155,7 +154,6 @@ struct intel_spi {
 	bool swseq_erase;
 	bool erase_64k;
 	u8 opcodes[8];
-	u8 preopcodes[2];
 };
 
 static bool writeable;
@@ -400,10 +398,6 @@ static int intel_spi_init(struct intel_spi *ispi)
 				ispi->opcodes[i] = opmenu0 >> i * 8;
 				ispi->opcodes[i + 4] = opmenu1 >> i * 8;
 			}
-
-			val = readl(ispi->sregs + PREOP_OPTYPE);
-			ispi->preopcodes[0] = val;
-			ispi->preopcodes[1] = val >> 8;
 		}
 	}
 
diff --git a/drivers/mtd/spi-nor/mtk-quadspi.c b/drivers/mtd/spi-nor/mtk-quadspi.c
index abe455c..5442993 100644
--- a/drivers/mtd/spi-nor/mtk-quadspi.c
+++ b/drivers/mtd/spi-nor/mtk-quadspi.c
@@ -110,7 +110,7 @@
 #define MTK_NOR_PRG_REG(n)		(MTK_NOR_PRGDATA0_REG + 4 * (n))
 #define MTK_NOR_SHREG(n)		(MTK_NOR_SHREG0_REG + 4 * (n))
 
-struct mt8173_nor {
+struct mtk_nor {
 	struct spi_nor nor;
 	struct device *dev;
 	void __iomem *base;	/* nor flash base address */
@@ -118,48 +118,48 @@ struct mt8173_nor {
 	struct clk *nor_clk;
 };
 
-static void mt8173_nor_set_read_mode(struct mt8173_nor *mt8173_nor)
+static void mtk_nor_set_read_mode(struct mtk_nor *mtk_nor)
 {
-	struct spi_nor *nor = &mt8173_nor->nor;
+	struct spi_nor *nor = &mtk_nor->nor;
 
 	switch (nor->read_proto) {
 	case SNOR_PROTO_1_1_1:
-		writeb(nor->read_opcode, mt8173_nor->base +
+		writeb(nor->read_opcode, mtk_nor->base +
 		       MTK_NOR_PRGDATA3_REG);
-		writeb(MTK_NOR_FAST_READ, mt8173_nor->base +
+		writeb(MTK_NOR_FAST_READ, mtk_nor->base +
 		       MTK_NOR_CFG1_REG);
 		break;
 	case SNOR_PROTO_1_1_2:
-		writeb(nor->read_opcode, mt8173_nor->base +
+		writeb(nor->read_opcode, mtk_nor->base +
 		       MTK_NOR_PRGDATA3_REG);
-		writeb(MTK_NOR_DUAL_READ_EN, mt8173_nor->base +
+		writeb(MTK_NOR_DUAL_READ_EN, mtk_nor->base +
 		       MTK_NOR_DUAL_REG);
 		break;
 	case SNOR_PROTO_1_1_4:
-		writeb(nor->read_opcode, mt8173_nor->base +
+		writeb(nor->read_opcode, mtk_nor->base +
 		       MTK_NOR_PRGDATA4_REG);
-		writeb(MTK_NOR_QUAD_READ_EN, mt8173_nor->base +
+		writeb(MTK_NOR_QUAD_READ_EN, mtk_nor->base +
 		       MTK_NOR_DUAL_REG);
 		break;
 	default:
-		writeb(MTK_NOR_DUAL_DISABLE, mt8173_nor->base +
+		writeb(MTK_NOR_DUAL_DISABLE, mtk_nor->base +
 		       MTK_NOR_DUAL_REG);
 		break;
 	}
 }
 
-static int mt8173_nor_execute_cmd(struct mt8173_nor *mt8173_nor, u8 cmdval)
+static int mtk_nor_execute_cmd(struct mtk_nor *mtk_nor, u8 cmdval)
 {
 	int reg;
 	u8 val = cmdval & 0x1f;
 
-	writeb(cmdval, mt8173_nor->base + MTK_NOR_CMD_REG);
-	return readl_poll_timeout(mt8173_nor->base + MTK_NOR_CMD_REG, reg,
+	writeb(cmdval, mtk_nor->base + MTK_NOR_CMD_REG);
+	return readl_poll_timeout(mtk_nor->base + MTK_NOR_CMD_REG, reg,
 				  !(reg & val), 100, 10000);
 }
 
-static int mt8173_nor_do_tx_rx(struct mt8173_nor *mt8173_nor, u8 op,
-			       u8 *tx, int txlen, u8 *rx, int rxlen)
+static int mtk_nor_do_tx_rx(struct mtk_nor *mtk_nor, u8 op,
+			    u8 *tx, int txlen, u8 *rx, int rxlen)
 {
 	int len = 1 + txlen + rxlen;
 	int i, ret, idx;
@@ -167,26 +167,26 @@ static int mt8173_nor_do_tx_rx(struct mt8173_nor *mt8173_nor, u8 op,
 	if (len > MTK_NOR_MAX_SHIFT)
 		return -EINVAL;
 
-	writeb(len * 8, mt8173_nor->base + MTK_NOR_CNT_REG);
+	writeb(len * 8, mtk_nor->base + MTK_NOR_CNT_REG);
 
 	/* start at PRGDATA5, go down to PRGDATA0 */
 	idx = MTK_NOR_MAX_RX_TX_SHIFT - 1;
 
 	/* opcode */
-	writeb(op, mt8173_nor->base + MTK_NOR_PRG_REG(idx));
+	writeb(op, mtk_nor->base + MTK_NOR_PRG_REG(idx));
 	idx--;
 
 	/* program TX data */
 	for (i = 0; i < txlen; i++, idx--)
-		writeb(tx[i], mt8173_nor->base + MTK_NOR_PRG_REG(idx));
+		writeb(tx[i], mtk_nor->base + MTK_NOR_PRG_REG(idx));
 
 	/* clear out rest of TX registers */
 	while (idx >= 0) {
-		writeb(0, mt8173_nor->base + MTK_NOR_PRG_REG(idx));
+		writeb(0, mtk_nor->base + MTK_NOR_PRG_REG(idx));
 		idx--;
 	}
 
-	ret = mt8173_nor_execute_cmd(mt8173_nor, MTK_NOR_PRG_CMD);
+	ret = mtk_nor_execute_cmd(mtk_nor, MTK_NOR_PRG_CMD);
 	if (ret)
 		return ret;
 
@@ -195,20 +195,20 @@ static int mt8173_nor_do_tx_rx(struct mt8173_nor *mt8173_nor, u8 op,
 
 	/* read out RX data */
 	for (i = 0; i < rxlen; i++, idx--)
-		rx[i] = readb(mt8173_nor->base + MTK_NOR_SHREG(idx));
+		rx[i] = readb(mtk_nor->base + MTK_NOR_SHREG(idx));
 
 	return 0;
 }
 
 /* Do a WRSR (Write Status Register) command */
-static int mt8173_nor_wr_sr(struct mt8173_nor *mt8173_nor, u8 sr)
+static int mtk_nor_wr_sr(struct mtk_nor *mtk_nor, u8 sr)
 {
-	writeb(sr, mt8173_nor->base + MTK_NOR_PRGDATA5_REG);
-	writeb(8, mt8173_nor->base + MTK_NOR_CNT_REG);
-	return mt8173_nor_execute_cmd(mt8173_nor, MTK_NOR_WRSR_CMD);
+	writeb(sr, mtk_nor->base + MTK_NOR_PRGDATA5_REG);
+	writeb(8, mtk_nor->base + MTK_NOR_CNT_REG);
+	return mtk_nor_execute_cmd(mtk_nor, MTK_NOR_WRSR_CMD);
 }
 
-static int mt8173_nor_write_buffer_enable(struct mt8173_nor *mt8173_nor)
+static int mtk_nor_write_buffer_enable(struct mtk_nor *mtk_nor)
 {
 	u8 reg;
 
@@ -216,27 +216,27 @@ static int mt8173_nor_write_buffer_enable(struct mt8173_nor *mt8173_nor)
 	 * 0: pre-fetch buffer use for read
 	 * 1: pre-fetch buffer use for page program
 	 */
-	writel(MTK_NOR_WR_BUF_ENABLE, mt8173_nor->base + MTK_NOR_CFG2_REG);
-	return readb_poll_timeout(mt8173_nor->base + MTK_NOR_CFG2_REG, reg,
+	writel(MTK_NOR_WR_BUF_ENABLE, mtk_nor->base + MTK_NOR_CFG2_REG);
+	return readb_poll_timeout(mtk_nor->base + MTK_NOR_CFG2_REG, reg,
 				  0x01 == (reg & 0x01), 100, 10000);
 }
 
-static int mt8173_nor_write_buffer_disable(struct mt8173_nor *mt8173_nor)
+static int mtk_nor_write_buffer_disable(struct mtk_nor *mtk_nor)
 {
 	u8 reg;
 
-	writel(MTK_NOR_WR_BUF_DISABLE, mt8173_nor->base + MTK_NOR_CFG2_REG);
-	return readb_poll_timeout(mt8173_nor->base + MTK_NOR_CFG2_REG, reg,
+	writel(MTK_NOR_WR_BUF_DISABLE, mtk_nor->base + MTK_NOR_CFG2_REG);
+	return readb_poll_timeout(mtk_nor->base + MTK_NOR_CFG2_REG, reg,
 				  MTK_NOR_WR_BUF_DISABLE == (reg & 0x1), 100,
 				  10000);
 }
 
-static void mt8173_nor_set_addr_width(struct mt8173_nor *mt8173_nor)
+static void mtk_nor_set_addr_width(struct mtk_nor *mtk_nor)
 {
 	u8 val;
-	struct spi_nor *nor = &mt8173_nor->nor;
+	struct spi_nor *nor = &mtk_nor->nor;
 
-	val = readb(mt8173_nor->base + MTK_NOR_DUAL_REG);
+	val = readb(mtk_nor->base + MTK_NOR_DUAL_REG);
 
 	switch (nor->addr_width) {
 	case 3:
@@ -246,115 +246,115 @@ static void mt8173_nor_set_addr_width(struct mt8173_nor *mt8173_nor)
 		val |= MTK_NOR_4B_ADDR_EN;
 		break;
 	default:
-		dev_warn(mt8173_nor->dev, "Unexpected address width %u.\n",
+		dev_warn(mtk_nor->dev, "Unexpected address width %u.\n",
 			 nor->addr_width);
 		break;
 	}
 
-	writeb(val, mt8173_nor->base + MTK_NOR_DUAL_REG);
+	writeb(val, mtk_nor->base + MTK_NOR_DUAL_REG);
 }
 
-static void mt8173_nor_set_addr(struct mt8173_nor *mt8173_nor, u32 addr)
+static void mtk_nor_set_addr(struct mtk_nor *mtk_nor, u32 addr)
 {
 	int i;
 
-	mt8173_nor_set_addr_width(mt8173_nor);
+	mtk_nor_set_addr_width(mtk_nor);
 
 	for (i = 0; i < 3; i++) {
-		writeb(addr & 0xff, mt8173_nor->base + MTK_NOR_RADR0_REG + i * 4);
+		writeb(addr & 0xff, mtk_nor->base + MTK_NOR_RADR0_REG + i * 4);
 		addr >>= 8;
 	}
 	/* Last register is non-contiguous */
-	writeb(addr & 0xff, mt8173_nor->base + MTK_NOR_RADR3_REG);
+	writeb(addr & 0xff, mtk_nor->base + MTK_NOR_RADR3_REG);
 }
 
-static ssize_t mt8173_nor_read(struct spi_nor *nor, loff_t from, size_t length,
-			       u_char *buffer)
+static ssize_t mtk_nor_read(struct spi_nor *nor, loff_t from, size_t length,
+			    u_char *buffer)
 {
 	int i, ret;
 	int addr = (int)from;
 	u8 *buf = (u8 *)buffer;
-	struct mt8173_nor *mt8173_nor = nor->priv;
+	struct mtk_nor *mtk_nor = nor->priv;
 
 	/* set mode for fast read mode ,dual mode or quad mode */
-	mt8173_nor_set_read_mode(mt8173_nor);
-	mt8173_nor_set_addr(mt8173_nor, addr);
+	mtk_nor_set_read_mode(mtk_nor);
+	mtk_nor_set_addr(mtk_nor, addr);
 
 	for (i = 0; i < length; i++) {
-		ret = mt8173_nor_execute_cmd(mt8173_nor, MTK_NOR_PIO_READ_CMD);
+		ret = mtk_nor_execute_cmd(mtk_nor, MTK_NOR_PIO_READ_CMD);
 		if (ret < 0)
 			return ret;
-		buf[i] = readb(mt8173_nor->base + MTK_NOR_RDATA_REG);
+		buf[i] = readb(mtk_nor->base + MTK_NOR_RDATA_REG);
 	}
 	return length;
 }
 
-static int mt8173_nor_write_single_byte(struct mt8173_nor *mt8173_nor,
-					int addr, int length, u8 *data)
+static int mtk_nor_write_single_byte(struct mtk_nor *mtk_nor,
+				     int addr, int length, u8 *data)
 {
 	int i, ret;
 
-	mt8173_nor_set_addr(mt8173_nor, addr);
+	mtk_nor_set_addr(mtk_nor, addr);
 
 	for (i = 0; i < length; i++) {
-		writeb(*data++, mt8173_nor->base + MTK_NOR_WDATA_REG);
-		ret = mt8173_nor_execute_cmd(mt8173_nor, MTK_NOR_PIO_WR_CMD);
+		writeb(*data++, mtk_nor->base + MTK_NOR_WDATA_REG);
+		ret = mtk_nor_execute_cmd(mtk_nor, MTK_NOR_PIO_WR_CMD);
 		if (ret < 0)
 			return ret;
 	}
 	return 0;
 }
 
-static int mt8173_nor_write_buffer(struct mt8173_nor *mt8173_nor, int addr,
-				   const u8 *buf)
+static int mtk_nor_write_buffer(struct mtk_nor *mtk_nor, int addr,
+				const u8 *buf)
 {
 	int i, bufidx, data;
 
-	mt8173_nor_set_addr(mt8173_nor, addr);
+	mtk_nor_set_addr(mtk_nor, addr);
 
 	bufidx = 0;
 	for (i = 0; i < SFLASH_WRBUF_SIZE; i += 4) {
 		data = buf[bufidx + 3]<<24 | buf[bufidx + 2]<<16 |
 		       buf[bufidx + 1]<<8 | buf[bufidx];
 		bufidx += 4;
-		writel(data, mt8173_nor->base + MTK_NOR_PP_DATA_REG);
+		writel(data, mtk_nor->base + MTK_NOR_PP_DATA_REG);
 	}
-	return mt8173_nor_execute_cmd(mt8173_nor, MTK_NOR_WR_CMD);
+	return mtk_nor_execute_cmd(mtk_nor, MTK_NOR_WR_CMD);
 }
 
-static ssize_t mt8173_nor_write(struct spi_nor *nor, loff_t to, size_t len,
-				const u_char *buf)
+static ssize_t mtk_nor_write(struct spi_nor *nor, loff_t to, size_t len,
+			     const u_char *buf)
 {
 	int ret;
-	struct mt8173_nor *mt8173_nor = nor->priv;
+	struct mtk_nor *mtk_nor = nor->priv;
 	size_t i;
 
-	ret = mt8173_nor_write_buffer_enable(mt8173_nor);
+	ret = mtk_nor_write_buffer_enable(mtk_nor);
 	if (ret < 0) {
-		dev_warn(mt8173_nor->dev, "write buffer enable failed!\n");
+		dev_warn(mtk_nor->dev, "write buffer enable failed!\n");
 		return ret;
 	}
 
 	for (i = 0; i + SFLASH_WRBUF_SIZE <= len; i += SFLASH_WRBUF_SIZE) {
-		ret = mt8173_nor_write_buffer(mt8173_nor, to, buf);
+		ret = mtk_nor_write_buffer(mtk_nor, to, buf);
 		if (ret < 0) {
-			dev_err(mt8173_nor->dev, "write buffer failed!\n");
+			dev_err(mtk_nor->dev, "write buffer failed!\n");
 			return ret;
 		}
 		to += SFLASH_WRBUF_SIZE;
 		buf += SFLASH_WRBUF_SIZE;
 	}
-	ret = mt8173_nor_write_buffer_disable(mt8173_nor);
+	ret = mtk_nor_write_buffer_disable(mtk_nor);
 	if (ret < 0) {
-		dev_warn(mt8173_nor->dev, "write buffer disable failed!\n");
+		dev_warn(mtk_nor->dev, "write buffer disable failed!\n");
 		return ret;
 	}
 
 	if (i < len) {
-		ret = mt8173_nor_write_single_byte(mt8173_nor, to,
-						   (int)(len - i), (u8 *)buf);
+		ret = mtk_nor_write_single_byte(mtk_nor, to,
+						(int)(len - i), (u8 *)buf);
 		if (ret < 0) {
-			dev_err(mt8173_nor->dev, "write single byte failed!\n");
+			dev_err(mtk_nor->dev, "write single byte failed!\n");
 			return ret;
 		}
 	}
@@ -362,72 +362,72 @@ static ssize_t mt8173_nor_write(struct spi_nor *nor, loff_t to, size_t len,
 	return len;
 }
 
-static int mt8173_nor_read_reg(struct spi_nor *nor, u8 opcode, u8 *buf, int len)
+static int mtk_nor_read_reg(struct spi_nor *nor, u8 opcode, u8 *buf, int len)
 {
 	int ret;
-	struct mt8173_nor *mt8173_nor = nor->priv;
+	struct mtk_nor *mtk_nor = nor->priv;
 
 	switch (opcode) {
 	case SPINOR_OP_RDSR:
-		ret = mt8173_nor_execute_cmd(mt8173_nor, MTK_NOR_RDSR_CMD);
+		ret = mtk_nor_execute_cmd(mtk_nor, MTK_NOR_RDSR_CMD);
 		if (ret < 0)
 			return ret;
 		if (len == 1)
-			*buf = readb(mt8173_nor->base + MTK_NOR_RDSR_REG);
+			*buf = readb(mtk_nor->base + MTK_NOR_RDSR_REG);
 		else
-			dev_err(mt8173_nor->dev, "len should be 1 for read status!\n");
+			dev_err(mtk_nor->dev, "len should be 1 for read status!\n");
 		break;
 	default:
-		ret = mt8173_nor_do_tx_rx(mt8173_nor, opcode, NULL, 0, buf, len);
+		ret = mtk_nor_do_tx_rx(mtk_nor, opcode, NULL, 0, buf, len);
 		break;
 	}
 	return ret;
 }
 
-static int mt8173_nor_write_reg(struct spi_nor *nor, u8 opcode, u8 *buf,
-				int len)
+static int mtk_nor_write_reg(struct spi_nor *nor, u8 opcode, u8 *buf,
+			     int len)
 {
 	int ret;
-	struct mt8173_nor *mt8173_nor = nor->priv;
+	struct mtk_nor *mtk_nor = nor->priv;
 
 	switch (opcode) {
 	case SPINOR_OP_WRSR:
 		/* We only handle 1 byte */
-		ret = mt8173_nor_wr_sr(mt8173_nor, *buf);
+		ret = mtk_nor_wr_sr(mtk_nor, *buf);
 		break;
 	default:
-		ret = mt8173_nor_do_tx_rx(mt8173_nor, opcode, buf, len, NULL, 0);
+		ret = mtk_nor_do_tx_rx(mtk_nor, opcode, buf, len, NULL, 0);
 		if (ret)
-			dev_warn(mt8173_nor->dev, "write reg failure!\n");
+			dev_warn(mtk_nor->dev, "write reg failure!\n");
 		break;
 	}
 	return ret;
 }
 
-static void mt8173_nor_disable_clk(struct mt8173_nor *mt8173_nor)
+static void mtk_nor_disable_clk(struct mtk_nor *mtk_nor)
 {
-	clk_disable_unprepare(mt8173_nor->spi_clk);
-	clk_disable_unprepare(mt8173_nor->nor_clk);
+	clk_disable_unprepare(mtk_nor->spi_clk);
+	clk_disable_unprepare(mtk_nor->nor_clk);
 }
 
-static int mt8173_nor_enable_clk(struct mt8173_nor *mt8173_nor)
+static int mtk_nor_enable_clk(struct mtk_nor *mtk_nor)
 {
 	int ret;
 
-	ret = clk_prepare_enable(mt8173_nor->spi_clk);
+	ret = clk_prepare_enable(mtk_nor->spi_clk);
 	if (ret)
 		return ret;
 
-	ret = clk_prepare_enable(mt8173_nor->nor_clk);
+	ret = clk_prepare_enable(mtk_nor->nor_clk);
 	if (ret) {
-		clk_disable_unprepare(mt8173_nor->spi_clk);
+		clk_disable_unprepare(mtk_nor->spi_clk);
 		return ret;
 	}
 
 	return 0;
 }
 
-static int mtk_nor_init(struct mt8173_nor *mt8173_nor,
+static int mtk_nor_init(struct mtk_nor *mtk_nor,
 			struct device_node *flash_node)
 {
 	const struct spi_nor_hwcaps hwcaps = {
@@ -439,18 +439,18 @@ static int mtk_nor_init(struct mt8173_nor *mt8173_nor,
 	struct spi_nor *nor;
 
 	/* initialize controller to accept commands */
-	writel(MTK_NOR_ENABLE_SF_CMD, mt8173_nor->base + MTK_NOR_WRPROT_REG);
+	writel(MTK_NOR_ENABLE_SF_CMD, mtk_nor->base + MTK_NOR_WRPROT_REG);
 
-	nor = &mt8173_nor->nor;
-	nor->dev = mt8173_nor->dev;
-	nor->priv = mt8173_nor;
+	nor = &mtk_nor->nor;
+	nor->dev = mtk_nor->dev;
+	nor->priv = mtk_nor;
 	spi_nor_set_flash_node(nor, flash_node);
 
 	/* fill the hooks to spi nor */
-	nor->read = mt8173_nor_read;
-	nor->read_reg = mt8173_nor_read_reg;
-	nor->write = mt8173_nor_write;
-	nor->write_reg = mt8173_nor_write_reg;
+	nor->read = mtk_nor_read;
+	nor->read_reg = mtk_nor_read_reg;
+	nor->write = mtk_nor_write;
+	nor->write_reg = mtk_nor_write_reg;
 	nor->mtd.name = "mtk_nor";
 	/* initialized with NULL */
 	ret = spi_nor_scan(nor, NULL, &hwcaps);
@@ -465,34 +465,34 @@ static int mtk_nor_drv_probe(struct platform_device *pdev)
 	struct device_node *flash_np;
 	struct resource *res;
 	int ret;
-	struct mt8173_nor *mt8173_nor;
+	struct mtk_nor *mtk_nor;
 
 	if (!pdev->dev.of_node) {
 		dev_err(&pdev->dev, "No DT found\n");
 		return -EINVAL;
 	}
 
-	mt8173_nor = devm_kzalloc(&pdev->dev, sizeof(*mt8173_nor), GFP_KERNEL);
-	if (!mt8173_nor)
+	mtk_nor = devm_kzalloc(&pdev->dev, sizeof(*mtk_nor), GFP_KERNEL);
+	if (!mtk_nor)
 		return -ENOMEM;
-	platform_set_drvdata(pdev, mt8173_nor);
+	platform_set_drvdata(pdev, mtk_nor);
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	mt8173_nor->base = devm_ioremap_resource(&pdev->dev, res);
-	if (IS_ERR(mt8173_nor->base))
-		return PTR_ERR(mt8173_nor->base);
+	mtk_nor->base = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(mtk_nor->base))
+		return PTR_ERR(mtk_nor->base);
 
-	mt8173_nor->spi_clk = devm_clk_get(&pdev->dev, "spi");
-	if (IS_ERR(mt8173_nor->spi_clk))
-		return PTR_ERR(mt8173_nor->spi_clk);
+	mtk_nor->spi_clk = devm_clk_get(&pdev->dev, "spi");
+	if (IS_ERR(mtk_nor->spi_clk))
+		return PTR_ERR(mtk_nor->spi_clk);
 
-	mt8173_nor->nor_clk = devm_clk_get(&pdev->dev, "sf");
-	if (IS_ERR(mt8173_nor->nor_clk))
-		return PTR_ERR(mt8173_nor->nor_clk);
+	mtk_nor->nor_clk = devm_clk_get(&pdev->dev, "sf");
+	if (IS_ERR(mtk_nor->nor_clk))
+		return PTR_ERR(mtk_nor->nor_clk);
 
-	mt8173_nor->dev = &pdev->dev;
+	mtk_nor->dev = &pdev->dev;
 
-	ret = mt8173_nor_enable_clk(mt8173_nor);
+	ret = mtk_nor_enable_clk(mtk_nor);
 	if (ret)
 		return ret;
 
@@ -503,20 +503,20 @@ static int mtk_nor_drv_probe(struct platform_device *pdev)
 		ret = -ENODEV;
 		goto nor_free;
 	}
-	ret = mtk_nor_init(mt8173_nor, flash_np);
+	ret = mtk_nor_init(mtk_nor, flash_np);
 
 nor_free:
 	if (ret)
-		mt8173_nor_disable_clk(mt8173_nor);
+		mtk_nor_disable_clk(mtk_nor);
 
 	return ret;
 }
 
 static int mtk_nor_drv_remove(struct platform_device *pdev)
 {
-	struct mt8173_nor *mt8173_nor = platform_get_drvdata(pdev);
+	struct mtk_nor *mtk_nor = platform_get_drvdata(pdev);
 
-	mt8173_nor_disable_clk(mt8173_nor);
+	mtk_nor_disable_clk(mtk_nor);
 
 	return 0;
 }
@@ -524,18 +524,18 @@ static int mtk_nor_drv_remove(struct platform_device *pdev)
 #ifdef CONFIG_PM_SLEEP
 static int mtk_nor_suspend(struct device *dev)
 {
-	struct mt8173_nor *mt8173_nor = dev_get_drvdata(dev);
+	struct mtk_nor *mtk_nor = dev_get_drvdata(dev);
 
-	mt8173_nor_disable_clk(mt8173_nor);
+	mtk_nor_disable_clk(mtk_nor);
 
 	return 0;
 }
 
 static int mtk_nor_resume(struct device *dev)
 {
-	struct mt8173_nor *mt8173_nor = dev_get_drvdata(dev);
+	struct mtk_nor *mtk_nor = dev_get_drvdata(dev);
 
-	return mt8173_nor_enable_clk(mt8173_nor);
+	return mtk_nor_enable_clk(mtk_nor);
 }
 
 static const struct dev_pm_ops mtk_nor_dev_pm_ops = {
diff --git a/drivers/mtd/spi-nor/spi-nor.c b/drivers/mtd/spi-nor/spi-nor.c
index bc266f7..d445a4d 100644
--- a/drivers/mtd/spi-nor/spi-nor.c
+++ b/drivers/mtd/spi-nor/spi-nor.c
@@ -330,8 +330,22 @@ static inline int spi_nor_fsr_ready(struct spi_nor *nor)
 	int fsr = read_fsr(nor);
 	if (fsr < 0)
 		return fsr;
-	else
-		return fsr & FSR_READY;
+
+	if (fsr & (FSR_E_ERR | FSR_P_ERR)) {
+		if (fsr & FSR_E_ERR)
+			dev_err(nor->dev, "Erase operation failed.\n");
+		else
+			dev_err(nor->dev, "Program operation failed.\n");
+
+		if (fsr & FSR_PT_ERR)
+			dev_err(nor->dev,
+			"Attempted to modify a protected sector.\n");
+
+		nor->write_reg(nor, SPINOR_OP_CLFSR, NULL, 0);
+		return -EIO;
+	}
+
+	return fsr & FSR_READY;
 }
 
 static int spi_nor_ready(struct spi_nor *nor)
@@ -552,6 +566,27 @@ static int spi_nor_erase(struct mtd_info *mtd, struct erase_info *instr)
 	return ret;
 }
 
+/* Write status register and ensure bits in mask match written values */
+static int write_sr_and_check(struct spi_nor *nor, u8 status_new, u8 mask)
+{
+	int ret;
+
+	write_enable(nor);
+	ret = write_sr(nor, status_new);
+	if (ret)
+		return ret;
+
+	ret = spi_nor_wait_till_ready(nor);
+	if (ret)
+		return ret;
+
+	ret = read_sr(nor);
+	if (ret < 0)
+		return ret;
+
+	return ((ret & mask) != (status_new & mask)) ? -EIO : 0;
+}
+
 static void stm_get_locked_range(struct spi_nor *nor, u8 sr, loff_t *ofs,
 				 uint64_t *len)
 {
@@ -650,7 +685,6 @@ static int stm_lock(struct spi_nor *nor, loff_t ofs, uint64_t len)
 	loff_t lock_len;
 	bool can_be_top = true, can_be_bottom = nor->flags & SNOR_F_HAS_SR_TB;
 	bool use_top;
-	int ret;
 
 	status_old = read_sr(nor);
 	if (status_old < 0)
@@ -714,11 +748,7 @@ static int stm_lock(struct spi_nor *nor, loff_t ofs, uint64_t len)
 	if ((status_new & mask) < (status_old & mask))
 		return -EINVAL;
 
-	write_enable(nor);
-	ret = write_sr(nor, status_new);
-	if (ret)
-		return ret;
-	return spi_nor_wait_till_ready(nor);
+	return write_sr_and_check(nor, status_new, mask);
 }
 
 /*
@@ -735,7 +765,6 @@ static int stm_unlock(struct spi_nor *nor, loff_t ofs, uint64_t len)
 	loff_t lock_len;
 	bool can_be_top = true, can_be_bottom = nor->flags & SNOR_F_HAS_SR_TB;
 	bool use_top;
-	int ret;
 
 	status_old = read_sr(nor);
 	if (status_old < 0)
@@ -802,11 +831,7 @@ static int stm_unlock(struct spi_nor *nor, loff_t ofs, uint64_t len)
 	if ((status_new & mask) > (status_old & mask))
 		return -EINVAL;
 
-	write_enable(nor);
-	ret = write_sr(nor, status_new);
-	if (ret)
-		return ret;
-	return spi_nor_wait_till_ready(nor);
+	return write_sr_and_check(nor, status_new, mask);
 }
 
 /*
@@ -1020,7 +1045,13 @@ static const struct flash_info spi_nor_ids[] = {
 	{ "640s33b",  INFO(0x898913, 0, 64 * 1024, 128, 0) },
 
 	/* ISSI */
-	{ "is25cd512", INFO(0x7f9d20, 0, 32 * 1024,   2, SECT_4K) },
+	{ "is25cd512",  INFO(0x7f9d20, 0, 32 * 1024,   2, SECT_4K) },
+	{ "is25lq040b", INFO(0x9d4013, 0, 64 * 1024,   8,
+			SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
+	{ "is25lp080d", INFO(0x9d6014, 0, 64 * 1024,  16,
+			SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
+	{ "is25lp128",  INFO(0x9d6018, 0, 64 * 1024, 256,
+			SECT_4K | SPI_NOR_DUAL_READ) },
 
 	/* Macronix */
 	{ "mx25l512e",   INFO(0xc22010, 0, 64 * 1024,   1, SECT_4K) },
@@ -1065,7 +1096,7 @@ static const struct flash_info spi_nor_ids[] = {
 	{ "pm25lv010",   INFO(0,        0, 32 * 1024,    4, SECT_4K_PMC) },
 	{ "pm25lq032",   INFO(0x7f9d46, 0, 64 * 1024,   64, SECT_4K) },
 
-	/* Spansion -- single (large) sector size only, at least
+	/* Spansion/Cypress -- single (large) sector size only, at least
 	 * for the chips listed here (without boot sectors).
 	 */
 	{ "s25sl032p",  INFO(0x010215, 0x4d00,  64 * 1024,  64, SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ) },
@@ -1094,6 +1125,8 @@ static const struct flash_info spi_nor_ids[] = {
 	{ "s25fl204k",  INFO(0x014013,      0,  64 * 1024,   8, SECT_4K | SPI_NOR_DUAL_READ) },
 	{ "s25fl208k",  INFO(0x014014,      0,  64 * 1024,  16, SECT_4K | SPI_NOR_DUAL_READ) },
 	{ "s25fl064l",  INFO(0x016017,      0,  64 * 1024, 128, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ | SPI_NOR_4B_OPCODES) },
+	{ "s25fl128l",  INFO(0x016018,      0,  64 * 1024, 256, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ | SPI_NOR_4B_OPCODES) },
+	{ "s25fl256l",  INFO(0x016019,      0,  64 * 1024, 512, SECT_4K | SPI_NOR_DUAL_READ | SPI_NOR_QUAD_READ | SPI_NOR_4B_OPCODES) },
 
 	/* SST -- large erase sizes are "overlays", "sectors" are 4K */
 	{ "sst25vf040b", INFO(0xbf258d, 0, 64 * 1024,  8, SECT_4K | SST_WRITE) },
@@ -2713,6 +2746,16 @@ static void spi_nor_resume(struct mtd_info *mtd)
 		dev_err(dev, "resume() failed\n");
 }
 
+void spi_nor_restore(struct spi_nor *nor)
+{
+	/* restore the addressing mode */
+	if ((nor->addr_width == 4) &&
+	    (JEDEC_MFR(nor->info) != SNOR_MFR_SPANSION) &&
+	    !(nor->info->flags & SPI_NOR_4B_OPCODES))
+		set_4byte(nor, nor->info, 0);
+}
+EXPORT_SYMBOL_GPL(spi_nor_restore);
+
 int spi_nor_scan(struct spi_nor *nor, const char *name,
 		 const struct spi_nor_hwcaps *hwcaps)
 {
diff --git a/drivers/mtd/tests/nandbiterrs.c b/drivers/mtd/tests/nandbiterrs.c
index 5f03b8c..cde19c9 100644
--- a/drivers/mtd/tests/nandbiterrs.c
+++ b/drivers/mtd/tests/nandbiterrs.c
@@ -151,7 +151,7 @@ static int read_page(int log)
 	memcpy(&oldstats, &mtd->ecc_stats, sizeof(oldstats));
 
 	err = mtd_read(mtd, offset, mtd->writesize, &read, rbuffer);
-	if (err == -EUCLEAN)
+	if (!err || err == -EUCLEAN)
 		err = mtd->ecc_stats.corrected - oldstats.corrected;
 
 	if (err < 0 || read != mtd->writesize) {
diff --git a/drivers/mtd/tests/oobtest.c b/drivers/mtd/tests/oobtest.c
index 1cb3f77..766b2c3 100644
--- a/drivers/mtd/tests/oobtest.c
+++ b/drivers/mtd/tests/oobtest.c
@@ -193,6 +193,9 @@ static int verify_eraseblock(int ebnum)
 		ops.datbuf    = NULL;
 		ops.oobbuf    = readbuf;
 		err = mtd_read_oob(mtd, addr, &ops);
+		if (mtd_is_bitflip(err))
+			err = 0;
+
 		if (err || ops.oobretlen != use_len) {
 			pr_err("error: readoob failed at %#llx\n",
 			       (long long)addr);
@@ -227,6 +230,9 @@ static int verify_eraseblock(int ebnum)
 			ops.datbuf    = NULL;
 			ops.oobbuf    = readbuf;
 			err = mtd_read_oob(mtd, addr, &ops);
+			if (mtd_is_bitflip(err))
+				err = 0;
+
 			if (err || ops.oobretlen != mtd->oobavail) {
 				pr_err("error: readoob failed at %#llx\n",
 						(long long)addr);
@@ -286,6 +292,9 @@ static int verify_eraseblock_in_one_go(int ebnum)
 
 	/* read entire block's OOB at one go */
 	err = mtd_read_oob(mtd, addr, &ops);
+	if (mtd_is_bitflip(err))
+		err = 0;
+
 	if (err || ops.oobretlen != len) {
 		pr_err("error: readoob failed at %#llx\n",
 		       (long long)addr);
@@ -527,6 +536,9 @@ static int __init mtd_oobtest_init(void)
 	pr_info("attempting to start read past end of OOB\n");
 	pr_info("an error is expected...\n");
 	err = mtd_read_oob(mtd, addr0, &ops);
+	if (mtd_is_bitflip(err))
+		err = 0;
+
 	if (err) {
 		pr_info("error occurred as expected\n");
 		err = 0;
@@ -571,6 +583,9 @@ static int __init mtd_oobtest_init(void)
 		pr_info("attempting to read past end of device\n");
 		pr_info("an error is expected...\n");
 		err = mtd_read_oob(mtd, mtd->size - mtd->writesize, &ops);
+		if (mtd_is_bitflip(err))
+			err = 0;
+
 		if (err) {
 			pr_info("error occurred as expected\n");
 			err = 0;
@@ -615,6 +630,9 @@ static int __init mtd_oobtest_init(void)
 		pr_info("attempting to read past end of device\n");
 		pr_info("an error is expected...\n");
 		err = mtd_read_oob(mtd, mtd->size - mtd->writesize, &ops);
+		if (mtd_is_bitflip(err))
+			err = 0;
+
 		if (err) {
 			pr_info("error occurred as expected\n");
 			err = 0;
@@ -684,6 +702,9 @@ static int __init mtd_oobtest_init(void)
 		ops.datbuf    = NULL;
 		ops.oobbuf    = readbuf;
 		err = mtd_read_oob(mtd, addr, &ops);
+		if (mtd_is_bitflip(err))
+			err = 0;
+
 		if (err)
 			goto out;
 		if (memcmpshow(addr, readbuf, writebuf,
diff --git a/drivers/mtd/ubi/block.c b/drivers/mtd/ubi/block.c
index b210fdb..b1fc28f 100644
--- a/drivers/mtd/ubi/block.c
+++ b/drivers/mtd/ubi/block.c
@@ -99,6 +99,8 @@ struct ubiblock {
 
 /* Linked list of all ubiblock instances */
 static LIST_HEAD(ubiblock_devices);
+static DEFINE_IDR(ubiblock_minor_idr);
+/* Protects ubiblock_devices and ubiblock_minor_idr */
 static DEFINE_MUTEX(devices_mutex);
 static int ubiblock_major;
 
@@ -351,8 +353,6 @@ static const struct blk_mq_ops ubiblock_mq_ops = {
 	.init_request	= ubiblock_init_request,
 };
 
-static DEFINE_IDR(ubiblock_minor_idr);
-
 int ubiblock_create(struct ubi_volume_info *vi)
 {
 	struct ubiblock *dev;
@@ -365,14 +365,15 @@ int ubiblock_create(struct ubi_volume_info *vi)
 	/* Check that the volume isn't already handled */
 	mutex_lock(&devices_mutex);
 	if (find_dev_nolock(vi->ubi_num, vi->vol_id)) {
-		mutex_unlock(&devices_mutex);
-		return -EEXIST;
+		ret = -EEXIST;
+		goto out_unlock;
 	}
-	mutex_unlock(&devices_mutex);
 
 	dev = kzalloc(sizeof(struct ubiblock), GFP_KERNEL);
-	if (!dev)
-		return -ENOMEM;
+	if (!dev) {
+		ret = -ENOMEM;
+		goto out_unlock;
+	}
 
 	mutex_init(&dev->dev_mutex);
 
@@ -437,14 +438,13 @@ int ubiblock_create(struct ubi_volume_info *vi)
 		goto out_free_queue;
 	}
 
-	mutex_lock(&devices_mutex);
 	list_add_tail(&dev->list, &ubiblock_devices);
-	mutex_unlock(&devices_mutex);
 
 	/* Must be the last step: anyone can call file ops from now on */
 	add_disk(dev->gd);
 	dev_info(disk_to_dev(dev->gd), "created from ubi%d:%d(%s)",
 		 dev->ubi_num, dev->vol_id, vi->name);
+	mutex_unlock(&devices_mutex);
 	return 0;
 
 out_free_queue:
@@ -457,6 +457,8 @@ int ubiblock_create(struct ubi_volume_info *vi)
 	put_disk(dev->gd);
 out_free_dev:
 	kfree(dev);
+out_unlock:
+	mutex_unlock(&devices_mutex);
 
 	return ret;
 }
@@ -478,30 +480,36 @@ static void ubiblock_cleanup(struct ubiblock *dev)
 int ubiblock_remove(struct ubi_volume_info *vi)
 {
 	struct ubiblock *dev;
+	int ret;
 
 	mutex_lock(&devices_mutex);
 	dev = find_dev_nolock(vi->ubi_num, vi->vol_id);
 	if (!dev) {
-		mutex_unlock(&devices_mutex);
-		return -ENODEV;
+		ret = -ENODEV;
+		goto out_unlock;
 	}
 
 	/* Found a device, let's lock it so we can check if it's busy */
 	mutex_lock(&dev->dev_mutex);
 	if (dev->refcnt > 0) {
-		mutex_unlock(&dev->dev_mutex);
-		mutex_unlock(&devices_mutex);
-		return -EBUSY;
+		ret = -EBUSY;
+		goto out_unlock_dev;
 	}
 
 	/* Remove from device list */
 	list_del(&dev->list);
-	mutex_unlock(&devices_mutex);
-
 	ubiblock_cleanup(dev);
 	mutex_unlock(&dev->dev_mutex);
+	mutex_unlock(&devices_mutex);
+
 	kfree(dev);
 	return 0;
+
+out_unlock_dev:
+	mutex_unlock(&dev->dev_mutex);
+out_unlock:
+	mutex_unlock(&devices_mutex);
+	return ret;
 }
 
 static int ubiblock_resize(struct ubi_volume_info *vi)
@@ -630,6 +638,7 @@ static void ubiblock_remove_all(void)
 	struct ubiblock *next;
 	struct ubiblock *dev;
 
+	mutex_lock(&devices_mutex);
 	list_for_each_entry_safe(dev, next, &ubiblock_devices, list) {
 		/* The module is being forcefully removed */
 		WARN_ON(dev->desc);
@@ -638,6 +647,7 @@ static void ubiblock_remove_all(void)
 		ubiblock_cleanup(dev);
 		kfree(dev);
 	}
+	mutex_unlock(&devices_mutex);
 }
 
 int __init ubiblock_init(void)
diff --git a/drivers/mtd/ubi/build.c b/drivers/mtd/ubi/build.c
index 136ce05..e941395 100644
--- a/drivers/mtd/ubi/build.c
+++ b/drivers/mtd/ubi/build.c
@@ -535,8 +535,17 @@ static int get_bad_peb_limit(const struct ubi_device *ubi, int max_beb_per1024)
 	int limit, device_pebs;
 	uint64_t device_size;
 
-	if (!max_beb_per1024)
-		return 0;
+	if (!max_beb_per1024) {
+		/*
+		 * Since max_beb_per1024 has not been set by the user in either
+		 * the cmdline or Kconfig, use mtd_max_bad_blocks to set the
+		 * limit if it is supported by the device.
+		 */
+		limit = mtd_max_bad_blocks(ubi->mtd, 0, ubi->mtd->size);
+		if (limit < 0)
+			return 0;
+		return limit;
+	}
 
 	/*
 	 * Here we are using size of the entire flash chip and
diff --git a/drivers/mtd/ubi/eba.c b/drivers/mtd/ubi/eba.c
index 388e46b..250e30f 100644
--- a/drivers/mtd/ubi/eba.c
+++ b/drivers/mtd/ubi/eba.c
@@ -384,7 +384,7 @@ static int leb_write_lock(struct ubi_device *ubi, int vol_id, int lnum)
 }
 
 /**
- * leb_write_lock - lock logical eraseblock for writing.
+ * leb_write_trylock - try to lock logical eraseblock for writing.
  * @ubi: UBI device description object
  * @vol_id: volume ID
  * @lnum: logical eraseblock number
diff --git a/drivers/mtd/ubi/fastmap-wl.c b/drivers/mtd/ubi/fastmap-wl.c
index 4f0bd6b..590d967 100644
--- a/drivers/mtd/ubi/fastmap-wl.c
+++ b/drivers/mtd/ubi/fastmap-wl.c
@@ -66,7 +66,7 @@ static void return_unused_pool_pebs(struct ubi_device *ubi,
 	}
 }
 
-static int anchor_pebs_avalible(struct rb_root *root)
+static int anchor_pebs_available(struct rb_root *root)
 {
 	struct rb_node *p;
 	struct ubi_wl_entry *e;
diff --git a/drivers/mtd/ubi/fastmap.c b/drivers/mtd/ubi/fastmap.c
index 5a832bc..9170596 100644
--- a/drivers/mtd/ubi/fastmap.c
+++ b/drivers/mtd/ubi/fastmap.c
@@ -214,9 +214,8 @@ static void assign_aeb_to_av(struct ubi_attach_info *ai,
 			     struct ubi_ainf_volume *av)
 {
 	struct ubi_ainf_peb *tmp_aeb;
-	struct rb_node **p = &ai->volumes.rb_node, *parent = NULL;
+	struct rb_node **p = &av->root.rb_node, *parent = NULL;
 
-	p = &av->root.rb_node;
 	while (*p) {
 		parent = *p;
 
@@ -1063,7 +1062,7 @@ int ubi_scan_fastmap(struct ubi_device *ubi, struct ubi_attach_info *ai,
 		e = kmem_cache_alloc(ubi_wl_entry_slab, GFP_KERNEL);
 		if (!e) {
 			while (i--)
-				kfree(fm->e[i]);
+				kmem_cache_free(ubi_wl_entry_slab, fm->e[i]);
 
 			ret = -ENOMEM;
 			goto free_hdr;
diff --git a/drivers/mtd/ubi/vmt.c b/drivers/mtd/ubi/vmt.c
index 85237cf..3fd8d7f 100644
--- a/drivers/mtd/ubi/vmt.c
+++ b/drivers/mtd/ubi/vmt.c
@@ -270,6 +270,12 @@ int ubi_create_volume(struct ubi_device *ubi, struct ubi_mkvol_req *req)
 			vol->last_eb_bytes = vol->usable_leb_size;
 	}
 
+	/* Make volume "available" before it becomes accessible via sysfs */
+	spin_lock(&ubi->volumes_lock);
+	ubi->volumes[vol_id] = vol;
+	ubi->vol_count += 1;
+	spin_unlock(&ubi->volumes_lock);
+
 	/* Register character device for the volume */
 	cdev_init(&vol->cdev, &ubi_vol_cdev_operations);
 	vol->cdev.owner = THIS_MODULE;
@@ -298,11 +304,6 @@ int ubi_create_volume(struct ubi_device *ubi, struct ubi_mkvol_req *req)
 	if (err)
 		goto out_sysfs;
 
-	spin_lock(&ubi->volumes_lock);
-	ubi->volumes[vol_id] = vol;
-	ubi->vol_count += 1;
-	spin_unlock(&ubi->volumes_lock);
-
 	ubi_volume_notify(ubi, vol, UBI_VOLUME_ADDED);
 	self_check_volumes(ubi);
 	return err;
@@ -315,6 +316,10 @@ int ubi_create_volume(struct ubi_device *ubi, struct ubi_mkvol_req *req)
 	 */
 	cdev_device_del(&vol->cdev, &vol->dev);
 out_mapping:
+	spin_lock(&ubi->volumes_lock);
+	ubi->volumes[vol_id] = NULL;
+	ubi->vol_count -= 1;
+	spin_unlock(&ubi->volumes_lock);
 	ubi_eba_destroy_table(eba_tbl);
 out_acc:
 	spin_lock(&ubi->volumes_lock);
diff --git a/drivers/mtd/ubi/wl.c b/drivers/mtd/ubi/wl.c
index b5b8cd6..2052a64 100644
--- a/drivers/mtd/ubi/wl.c
+++ b/drivers/mtd/ubi/wl.c
@@ -692,7 +692,7 @@ static int wear_leveling_worker(struct ubi_device *ubi, struct ubi_work *wrk,
 #ifdef CONFIG_MTD_UBI_FASTMAP
 	/* Check whether we need to produce an anchor PEB */
 	if (!anchor)
-		anchor = !anchor_pebs_avalible(&ubi->free);
+		anchor = !anchor_pebs_available(&ubi->free);
 
 	if (anchor) {
 		e1 = find_anchor_wl_entry(&ubi->used);
@@ -1529,6 +1529,46 @@ static void shutdown_work(struct ubi_device *ubi)
 }
 
 /**
+ * erase_aeb - erase a PEB given in UBI attach info PEB
+ * @ubi: UBI device description object
+ * @aeb: UBI attach info PEB
+ * @sync: If true, erase synchronously. Otherwise schedule for erasure
+ */
+static int erase_aeb(struct ubi_device *ubi, struct ubi_ainf_peb *aeb, bool sync)
+{
+	struct ubi_wl_entry *e;
+	int err;
+
+	e = kmem_cache_alloc(ubi_wl_entry_slab, GFP_KERNEL);
+	if (!e)
+		return -ENOMEM;
+
+	e->pnum = aeb->pnum;
+	e->ec = aeb->ec;
+	ubi->lookuptbl[e->pnum] = e;
+
+	if (sync) {
+		err = sync_erase(ubi, e, false);
+		if (err)
+			goto out_free;
+
+		wl_tree_add(e, &ubi->free);
+		ubi->free_count++;
+	} else {
+		err = schedule_erase(ubi, e, aeb->vol_id, aeb->lnum, 0, false);
+		if (err)
+			goto out_free;
+	}
+
+	return 0;
+
+out_free:
+	wl_entry_destroy(ubi, e);
+
+	return err;
+}
+
+/**
  * ubi_wl_init - initialize the WL sub-system using attaching information.
  * @ubi: UBI device description object
  * @ai: attaching information
@@ -1566,18 +1606,10 @@ int ubi_wl_init(struct ubi_device *ubi, struct ubi_attach_info *ai)
 	list_for_each_entry_safe(aeb, tmp, &ai->erase, u.list) {
 		cond_resched();
 
-		e = kmem_cache_alloc(ubi_wl_entry_slab, GFP_KERNEL);
-		if (!e)
+		err = erase_aeb(ubi, aeb, false);
+		if (err)
 			goto out_free;
 
-		e->pnum = aeb->pnum;
-		e->ec = aeb->ec;
-		ubi->lookuptbl[e->pnum] = e;
-		if (schedule_erase(ubi, e, aeb->vol_id, aeb->lnum, 0, false)) {
-			wl_entry_destroy(ubi, e);
-			goto out_free;
-		}
-
 		found_pebs++;
 	}
 
@@ -1585,8 +1617,10 @@ int ubi_wl_init(struct ubi_device *ubi, struct ubi_attach_info *ai)
 		cond_resched();
 
 		e = kmem_cache_alloc(ubi_wl_entry_slab, GFP_KERNEL);
-		if (!e)
+		if (!e) {
+			err = -ENOMEM;
 			goto out_free;
+		}
 
 		e->pnum = aeb->pnum;
 		e->ec = aeb->ec;
@@ -1605,8 +1639,10 @@ int ubi_wl_init(struct ubi_device *ubi, struct ubi_attach_info *ai)
 			cond_resched();
 
 			e = kmem_cache_alloc(ubi_wl_entry_slab, GFP_KERNEL);
-			if (!e)
+			if (!e) {
+				err = -ENOMEM;
 				goto out_free;
+			}
 
 			e->pnum = aeb->pnum;
 			e->ec = aeb->ec;
@@ -1635,6 +1671,8 @@ int ubi_wl_init(struct ubi_device *ubi, struct ubi_attach_info *ai)
 			ubi_assert(!ubi->lookuptbl[e->pnum]);
 			ubi->lookuptbl[e->pnum] = e;
 		} else {
+			bool sync = false;
+
 			/*
 			 * Usually old Fastmap PEBs are scheduled for erasure
 			 * and we don't have to care about them but if we face
@@ -1644,18 +1682,21 @@ int ubi_wl_init(struct ubi_device *ubi, struct ubi_attach_info *ai)
 			if (ubi->lookuptbl[aeb->pnum])
 				continue;
 
-			e = kmem_cache_alloc(ubi_wl_entry_slab, GFP_KERNEL);
-			if (!e)
-				goto out_free;
+			/*
+			 * The fastmap update code might not find a free PEB for
+			 * writing the fastmap anchor to and then reuses the
+			 * current fastmap anchor PEB. When this PEB gets erased
+			 * and a power cut happens before it is written again we
+			 * must make sure that the fastmap attach code doesn't
+			 * find any outdated fastmap anchors, hence we erase the
+			 * outdated fastmap anchor PEBs synchronously here.
+			 */
+			if (aeb->vol_id == UBI_FM_SB_VOLUME_ID)
+				sync = true;
 
-			e->pnum = aeb->pnum;
-			e->ec = aeb->ec;
-			ubi_assert(!ubi->lookuptbl[e->pnum]);
-			ubi->lookuptbl[e->pnum] = e;
-			if (schedule_erase(ubi, e, aeb->vol_id, aeb->lnum, 0, false)) {
-				wl_entry_destroy(ubi, e);
+			err = erase_aeb(ubi, aeb, sync);
+			if (err)
 				goto out_free;
-			}
 		}
 
 		found_pebs++;
diff --git a/drivers/mtd/ubi/wl.h b/drivers/mtd/ubi/wl.h
index 2aaa3f7..a9e2d66 100644
--- a/drivers/mtd/ubi/wl.h
+++ b/drivers/mtd/ubi/wl.h
@@ -2,7 +2,7 @@
 #ifndef UBI_WL_H
 #define UBI_WL_H
 #ifdef CONFIG_MTD_UBI_FASTMAP
-static int anchor_pebs_avalible(struct rb_root *root);
+static int anchor_pebs_available(struct rb_root *root);
 static void update_fastmap_work_fn(struct work_struct *wrk);
 static struct ubi_wl_entry *find_anchor_wl_entry(struct rb_root *root);
 static struct ubi_wl_entry *get_peb_for_wl(struct ubi_device *ubi);
diff --git a/drivers/mux/core.c b/drivers/mux/core.c
index 2260063..6e5cf9d 100644
--- a/drivers/mux/core.c
+++ b/drivers/mux/core.c
@@ -413,6 +413,7 @@ static int of_dev_node_match(struct device *dev, const void *data)
 	return dev->of_node == data;
 }
 
+/* Note this function returns a reference to the mux_chip dev. */
 static struct mux_chip *of_find_mux_chip_by_node(struct device_node *np)
 {
 	struct device *dev;
@@ -466,6 +467,7 @@ struct mux_control *mux_control_get(struct device *dev, const char *mux_name)
 	    (!args.args_count && (mux_chip->controllers > 1))) {
 		dev_err(dev, "%pOF: wrong #mux-control-cells for %pOF\n",
 			np, args.np);
+		put_device(&mux_chip->dev);
 		return ERR_PTR(-EINVAL);
 	}
 
@@ -476,10 +478,10 @@ struct mux_control *mux_control_get(struct device *dev, const char *mux_name)
 	if (controller >= mux_chip->controllers) {
 		dev_err(dev, "%pOF: bad mux controller %u specified in %pOF\n",
 			np, controller, args.np);
+		put_device(&mux_chip->dev);
 		return ERR_PTR(-EINVAL);
 	}
 
-	get_device(&mux_chip->dev);
 	return &mux_chip->mux[controller];
 }
 EXPORT_SYMBOL_GPL(mux_control_get);
diff --git a/drivers/net/can/flexcan.c b/drivers/net/can/flexcan.c
index 0626dcf..760d2c0 100644
--- a/drivers/net/can/flexcan.c
+++ b/drivers/net/can/flexcan.c
@@ -526,7 +526,7 @@ static int flexcan_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		data = be32_to_cpup((__be32 *)&cf->data[0]);
 		flexcan_write(data, &priv->tx_mb->data[0]);
 	}
-	if (cf->can_dlc > 3) {
+	if (cf->can_dlc > 4) {
 		data = be32_to_cpup((__be32 *)&cf->data[4]);
 		flexcan_write(data, &priv->tx_mb->data[1]);
 	}
diff --git a/drivers/net/can/usb/ems_usb.c b/drivers/net/can/usb/ems_usb.c
index b003582..12ff002 100644
--- a/drivers/net/can/usb/ems_usb.c
+++ b/drivers/net/can/usb/ems_usb.c
@@ -395,6 +395,7 @@ static void ems_usb_rx_err(struct ems_usb *dev, struct ems_cpc_msg *msg)
 
 		if (dev->can.state == CAN_STATE_ERROR_WARNING ||
 		    dev->can.state == CAN_STATE_ERROR_PASSIVE) {
+			cf->can_id |= CAN_ERR_CRTL;
 			cf->data[1] = (txerr > rxerr) ?
 			    CAN_ERR_CRTL_TX_PASSIVE : CAN_ERR_CRTL_RX_PASSIVE;
 		}
diff --git a/drivers/net/can/usb/gs_usb.c b/drivers/net/can/usb/gs_usb.c
index 68ac3e8..8bf80ad 100644
--- a/drivers/net/can/usb/gs_usb.c
+++ b/drivers/net/can/usb/gs_usb.c
@@ -449,7 +449,7 @@ static int gs_usb_set_bittiming(struct net_device *netdev)
 		dev_err(netdev->dev.parent, "Couldn't set bittimings (err=%d)",
 			rc);
 
-	return rc;
+	return (rc > 0) ? 0 : rc;
 }
 
 static void gs_usb_xmit_callback(struct urb *urb)
diff --git a/drivers/net/can/usb/peak_usb/pcan_usb_fd.c b/drivers/net/can/usb/peak_usb/pcan_usb_fd.c
index 7ccdc3e..53d6bb0 100644
--- a/drivers/net/can/usb/peak_usb/pcan_usb_fd.c
+++ b/drivers/net/can/usb/peak_usb/pcan_usb_fd.c
@@ -184,7 +184,7 @@ static int pcan_usb_fd_send_cmd(struct peak_usb_device *dev, void *cmd_tail)
 	void *cmd_head = pcan_usb_fd_cmd_buffer(dev);
 	int err = 0;
 	u8 *packet_ptr;
-	int i, n = 1, packet_len;
+	int packet_len;
 	ptrdiff_t cmd_len;
 
 	/* usb device unregistered? */
@@ -201,17 +201,13 @@ static int pcan_usb_fd_send_cmd(struct peak_usb_device *dev, void *cmd_tail)
 	}
 
 	packet_ptr = cmd_head;
+	packet_len = cmd_len;
 
 	/* firmware is not able to re-assemble 512 bytes buffer in full-speed */
-	if ((dev->udev->speed != USB_SPEED_HIGH) &&
-	    (cmd_len > PCAN_UFD_LOSPD_PKT_SIZE)) {
-		packet_len = PCAN_UFD_LOSPD_PKT_SIZE;
-		n += cmd_len / packet_len;
-	} else {
-		packet_len = cmd_len;
-	}
+	if (unlikely(dev->udev->speed != USB_SPEED_HIGH))
+		packet_len = min(packet_len, PCAN_UFD_LOSPD_PKT_SIZE);
 
-	for (i = 0; i < n; i++) {
+	do {
 		err = usb_bulk_msg(dev->udev,
 				   usb_sndbulkpipe(dev->udev,
 						   PCAN_USBPRO_EP_CMDOUT),
@@ -224,7 +220,12 @@ static int pcan_usb_fd_send_cmd(struct peak_usb_device *dev, void *cmd_tail)
 		}
 
 		packet_ptr += packet_len;
-	}
+		cmd_len -= packet_len;
+
+		if (cmd_len < PCAN_UFD_LOSPD_PKT_SIZE)
+			packet_len = cmd_len;
+
+	} while (packet_len > 0);
 
 	return err;
 }
diff --git a/drivers/net/can/vxcan.c b/drivers/net/can/vxcan.c
index 8404e88..b4c4a2c 100644
--- a/drivers/net/can/vxcan.c
+++ b/drivers/net/can/vxcan.c
@@ -194,7 +194,7 @@ static int vxcan_newlink(struct net *net, struct net_device *dev,
 		tbp = peer_tb;
 	}
 
-	if (tbp[IFLA_IFNAME]) {
+	if (ifmp && tbp[IFLA_IFNAME]) {
 		nla_strlcpy(ifname, tbp[IFLA_IFNAME], IFNAMSIZ);
 		name_assign_type = NET_NAME_USER;
 	} else {
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index f5a8dd9..4498ab8 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1500,10 +1500,13 @@ static enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds,
 {
 	struct b53_device *dev = ds->priv;
 
-	/* Older models support a different tag format that we do not
-	 * support in net/dsa/tag_brcm.c yet.
+	/* Older models (5325, 5365) support a different tag format that we do
+	 * not support in net/dsa/tag_brcm.c yet. 539x and 531x5 require managed
+	 * mode to be turned on which means we need to specifically manage ARL
+	 * misses on multicast addresses (TBD).
 	 */
-	if (is5325(dev) || is5365(dev) || !b53_can_enable_brcm_tags(ds, port))
+	if (is5325(dev) || is5365(dev) || is539x(dev) || is531x5(dev) ||
+	    !b53_can_enable_brcm_tags(ds, port))
 		return DSA_TAG_PROTO_NONE;
 
 	/* Broadcom BCM58xx chips have a flow accelerator on Port 8
diff --git a/drivers/net/ethernet/3com/3c59x.c b/drivers/net/ethernet/3com/3c59x.c
index f4e13a7..36c8950 100644
--- a/drivers/net/ethernet/3com/3c59x.c
+++ b/drivers/net/ethernet/3com/3c59x.c
@@ -602,7 +602,7 @@ struct vortex_private {
 	struct sk_buff* rx_skbuff[RX_RING_SIZE];
 	struct sk_buff* tx_skbuff[TX_RING_SIZE];
 	unsigned int cur_rx, cur_tx;		/* The next free ring entry */
-	unsigned int dirty_rx, dirty_tx;	/* The ring entries to be free()ed. */
+	unsigned int dirty_tx;	/* The ring entries to be free()ed. */
 	struct vortex_extra_stats xstats;	/* NIC-specific extra stats */
 	struct sk_buff *tx_skb;				/* Packet being eaten by bus master ctrl.  */
 	dma_addr_t tx_skb_dma;				/* Allocated DMA address for bus master ctrl DMA.   */
@@ -618,7 +618,6 @@ struct vortex_private {
 
 	/* The remainder are related to chip state, mostly media selection. */
 	struct timer_list timer;			/* Media selection timer. */
-	struct timer_list rx_oom_timer;		/* Rx skb allocation retry timer */
 	int options;						/* User-settable misc. driver options. */
 	unsigned int media_override:4, 		/* Passed-in media type. */
 		default_media:4,				/* Read from the EEPROM/Wn3_Config. */
@@ -760,7 +759,6 @@ static void mdio_sync(struct vortex_private *vp, int bits);
 static int mdio_read(struct net_device *dev, int phy_id, int location);
 static void mdio_write(struct net_device *vp, int phy_id, int location, int value);
 static void vortex_timer(struct timer_list *t);
-static void rx_oom_timer(struct timer_list *t);
 static netdev_tx_t vortex_start_xmit(struct sk_buff *skb,
 				     struct net_device *dev);
 static netdev_tx_t boomerang_start_xmit(struct sk_buff *skb,
@@ -1601,7 +1599,6 @@ vortex_up(struct net_device *dev)
 
 	timer_setup(&vp->timer, vortex_timer, 0);
 	mod_timer(&vp->timer, RUN_AT(media_tbl[dev->if_port].wait));
-	timer_setup(&vp->rx_oom_timer, rx_oom_timer, 0);
 
 	if (vortex_debug > 1)
 		pr_debug("%s: Initial media type %s.\n",
@@ -1676,7 +1673,7 @@ vortex_up(struct net_device *dev)
 	window_write16(vp, 0x0040, 4, Wn4_NetDiag);
 
 	if (vp->full_bus_master_rx) { /* Boomerang bus master. */
-		vp->cur_rx = vp->dirty_rx = 0;
+		vp->cur_rx = 0;
 		/* Initialize the RxEarly register as recommended. */
 		iowrite16(SetRxThreshold + (1536>>2), ioaddr + EL3_CMD);
 		iowrite32(0x0020, ioaddr + PktStatus);
@@ -1729,6 +1726,7 @@ vortex_open(struct net_device *dev)
 	struct vortex_private *vp = netdev_priv(dev);
 	int i;
 	int retval;
+	dma_addr_t dma;
 
 	/* Use the now-standard shared IRQ implementation. */
 	if ((retval = request_irq(dev->irq, vp->full_bus_master_rx ?
@@ -1753,7 +1751,11 @@ vortex_open(struct net_device *dev)
 				break;			/* Bad news!  */
 
 			skb_reserve(skb, NET_IP_ALIGN);	/* Align IP on 16 byte boundaries */
-			vp->rx_ring[i].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE));
+			dma = pci_map_single(VORTEX_PCI(vp), skb->data,
+					     PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
+			if (dma_mapping_error(&VORTEX_PCI(vp)->dev, dma))
+				break;
+			vp->rx_ring[i].addr = cpu_to_le32(dma);
 		}
 		if (i != RX_RING_SIZE) {
 			pr_emerg("%s: no memory for rx ring\n", dev->name);
@@ -2067,6 +2069,12 @@ vortex_start_xmit(struct sk_buff *skb, struct net_device *dev)
 		int len = (skb->len + 3) & ~3;
 		vp->tx_skb_dma = pci_map_single(VORTEX_PCI(vp), skb->data, len,
 						PCI_DMA_TODEVICE);
+		if (dma_mapping_error(&VORTEX_PCI(vp)->dev, vp->tx_skb_dma)) {
+			dev_kfree_skb_any(skb);
+			dev->stats.tx_dropped++;
+			return NETDEV_TX_OK;
+		}
+
 		spin_lock_irq(&vp->window_lock);
 		window_set(vp, 7);
 		iowrite32(vp->tx_skb_dma, ioaddr + Wn7_MasterAddr);
@@ -2593,7 +2601,7 @@ boomerang_rx(struct net_device *dev)
 	int entry = vp->cur_rx % RX_RING_SIZE;
 	void __iomem *ioaddr = vp->ioaddr;
 	int rx_status;
-	int rx_work_limit = vp->dirty_rx + RX_RING_SIZE - vp->cur_rx;
+	int rx_work_limit = RX_RING_SIZE;
 
 	if (vortex_debug > 5)
 		pr_debug("boomerang_rx(): status %4.4x\n", ioread16(ioaddr+EL3_STATUS));
@@ -2614,7 +2622,8 @@ boomerang_rx(struct net_device *dev)
 		} else {
 			/* The packet length: up to 4.5K!. */
 			int pkt_len = rx_status & 0x1fff;
-			struct sk_buff *skb;
+			struct sk_buff *skb, *newskb;
+			dma_addr_t newdma;
 			dma_addr_t dma = le32_to_cpu(vp->rx_ring[entry].addr);
 
 			if (vortex_debug > 4)
@@ -2633,9 +2642,27 @@ boomerang_rx(struct net_device *dev)
 				pci_dma_sync_single_for_device(VORTEX_PCI(vp), dma, PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
 				vp->rx_copy++;
 			} else {
+				/* Pre-allocate the replacement skb.  If it or its
+				 * mapping fails then recycle the buffer thats already
+				 * in place
+				 */
+				newskb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
+				if (!newskb) {
+					dev->stats.rx_dropped++;
+					goto clear_complete;
+				}
+				newdma = pci_map_single(VORTEX_PCI(vp), newskb->data,
+							PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
+				if (dma_mapping_error(&VORTEX_PCI(vp)->dev, newdma)) {
+					dev->stats.rx_dropped++;
+					consume_skb(newskb);
+					goto clear_complete;
+				}
+
 				/* Pass up the skbuff already on the Rx ring. */
 				skb = vp->rx_skbuff[entry];
-				vp->rx_skbuff[entry] = NULL;
+				vp->rx_skbuff[entry] = newskb;
+				vp->rx_ring[entry].addr = cpu_to_le32(newdma);
 				skb_put(skb, pkt_len);
 				pci_unmap_single(VORTEX_PCI(vp), dma, PKT_BUF_SZ, PCI_DMA_FROMDEVICE);
 				vp->rx_nocopy++;
@@ -2653,55 +2680,15 @@ boomerang_rx(struct net_device *dev)
 			netif_rx(skb);
 			dev->stats.rx_packets++;
 		}
-		entry = (++vp->cur_rx) % RX_RING_SIZE;
-	}
-	/* Refill the Rx ring buffers. */
-	for (; vp->cur_rx - vp->dirty_rx > 0; vp->dirty_rx++) {
-		struct sk_buff *skb;
-		entry = vp->dirty_rx % RX_RING_SIZE;
-		if (vp->rx_skbuff[entry] == NULL) {
-			skb = netdev_alloc_skb_ip_align(dev, PKT_BUF_SZ);
-			if (skb == NULL) {
-				static unsigned long last_jif;
-				if (time_after(jiffies, last_jif + 10 * HZ)) {
-					pr_warn("%s: memory shortage\n",
-						dev->name);
-					last_jif = jiffies;
-				}
-				if ((vp->cur_rx - vp->dirty_rx) == RX_RING_SIZE)
-					mod_timer(&vp->rx_oom_timer, RUN_AT(HZ * 1));
-				break;			/* Bad news!  */
-			}
 
-			vp->rx_ring[entry].addr = cpu_to_le32(pci_map_single(VORTEX_PCI(vp), skb->data, PKT_BUF_SZ, PCI_DMA_FROMDEVICE));
-			vp->rx_skbuff[entry] = skb;
-		}
+clear_complete:
 		vp->rx_ring[entry].status = 0;	/* Clear complete bit. */
 		iowrite16(UpUnstall, ioaddr + EL3_CMD);
+		entry = (++vp->cur_rx) % RX_RING_SIZE;
 	}
 	return 0;
 }
 
-/*
- * If we've hit a total OOM refilling the Rx ring we poll once a second
- * for some memory.  Otherwise there is no way to restart the rx process.
- */
-static void
-rx_oom_timer(struct timer_list *t)
-{
-	struct vortex_private *vp = from_timer(vp, t, rx_oom_timer);
-	struct net_device *dev = vp->mii.dev;
-
-	spin_lock_irq(&vp->lock);
-	if ((vp->cur_rx - vp->dirty_rx) == RX_RING_SIZE)	/* This test is redundant, but makes me feel good */
-		boomerang_rx(dev);
-	if (vortex_debug > 1) {
-		pr_debug("%s: rx_oom_timer %s\n", dev->name,
-			((vp->cur_rx - vp->dirty_rx) != RX_RING_SIZE) ? "succeeded" : "retrying");
-	}
-	spin_unlock_irq(&vp->lock);
-}
-
 static void
 vortex_down(struct net_device *dev, int final_down)
 {
@@ -2711,7 +2698,6 @@ vortex_down(struct net_device *dev, int final_down)
 	netdev_reset_queue(dev);
 	netif_stop_queue(dev);
 
-	del_timer_sync(&vp->rx_oom_timer);
 	del_timer_sync(&vp->timer);
 
 	/* Turn off statistics ASAP.  We update dev->stats below. */
diff --git a/drivers/net/ethernet/8390/mac8390.c b/drivers/net/ethernet/8390/mac8390.c
index 9497f18e..2f91ce8 100644
--- a/drivers/net/ethernet/8390/mac8390.c
+++ b/drivers/net/ethernet/8390/mac8390.c
@@ -123,7 +123,8 @@ enum mac8390_access {
 };
 
 extern int mac8390_memtest(struct net_device *dev);
-static int mac8390_initdev(struct net_device *dev, struct nubus_dev *ndev,
+static int mac8390_initdev(struct net_device *dev,
+			   struct nubus_rsrc *ndev,
 			   enum mac8390_type type);
 
 static int mac8390_open(struct net_device *dev);
@@ -169,11 +170,11 @@ static void word_memcpy_tocard(unsigned long tp, const void *fp, int count);
 static void word_memcpy_fromcard(void *tp, unsigned long fp, int count);
 static u32 mac8390_msg_enable;
 
-static enum mac8390_type __init mac8390_ident(struct nubus_dev *dev)
+static enum mac8390_type __init mac8390_ident(struct nubus_rsrc *fres)
 {
-	switch (dev->dr_sw) {
+	switch (fres->dr_sw) {
 	case NUBUS_DRSW_3COM:
-		switch (dev->dr_hw) {
+		switch (fres->dr_hw) {
 		case NUBUS_DRHW_APPLE_SONIC_NB:
 		case NUBUS_DRHW_APPLE_SONIC_LC:
 		case NUBUS_DRHW_SONNET:
@@ -184,7 +185,7 @@ static enum mac8390_type __init mac8390_ident(struct nubus_dev *dev)
 		break;
 
 	case NUBUS_DRSW_APPLE:
-		switch (dev->dr_hw) {
+		switch (fres->dr_hw) {
 		case NUBUS_DRHW_ASANTE_LC:
 			return MAC8390_NONE;
 		case NUBUS_DRHW_CABLETRON:
@@ -201,7 +202,7 @@ static enum mac8390_type __init mac8390_ident(struct nubus_dev *dev)
 	case NUBUS_DRSW_TECHWORKS:
 	case NUBUS_DRSW_DAYNA2:
 	case NUBUS_DRSW_DAYNA_LC:
-		if (dev->dr_hw == NUBUS_DRHW_CABLETRON)
+		if (fres->dr_hw == NUBUS_DRHW_CABLETRON)
 			return MAC8390_CABLETRON;
 		else
 			return MAC8390_APPLE;
@@ -212,7 +213,7 @@ static enum mac8390_type __init mac8390_ident(struct nubus_dev *dev)
 		break;
 
 	case NUBUS_DRSW_KINETICS:
-		switch (dev->dr_hw) {
+		switch (fres->dr_hw) {
 		case NUBUS_DRHW_INTERLAN:
 			return MAC8390_INTERLAN;
 		default:
@@ -225,8 +226,8 @@ static enum mac8390_type __init mac8390_ident(struct nubus_dev *dev)
 		 * These correspond to Dayna Sonic cards
 		 * which use the macsonic driver
 		 */
-		if (dev->dr_hw == NUBUS_DRHW_SMC9194 ||
-		    dev->dr_hw == NUBUS_DRHW_INTERLAN)
+		if (fres->dr_hw == NUBUS_DRHW_SMC9194 ||
+		    fres->dr_hw == NUBUS_DRHW_INTERLAN)
 			return MAC8390_NONE;
 		else
 			return MAC8390_DAYNA;
@@ -289,7 +290,8 @@ static int __init mac8390_memsize(unsigned long membase)
 	return i * 0x1000;
 }
 
-static bool __init mac8390_init(struct net_device *dev, struct nubus_dev *ndev,
+static bool __init mac8390_init(struct net_device *dev,
+				struct nubus_rsrc *ndev,
 				enum mac8390_type cardtype)
 {
 	struct nubus_dir dir;
@@ -394,7 +396,7 @@ static bool __init mac8390_init(struct net_device *dev, struct nubus_dev *ndev,
 struct net_device * __init mac8390_probe(int unit)
 {
 	struct net_device *dev;
-	struct nubus_dev *ndev = NULL;
+	struct nubus_rsrc *ndev = NULL;
 	int err = -ENODEV;
 	struct ei_device *ei_local;
 
@@ -414,8 +416,11 @@ struct net_device * __init mac8390_probe(int unit)
 	if (unit >= 0)
 		sprintf(dev->name, "eth%d", unit);
 
-	while ((ndev = nubus_find_type(NUBUS_CAT_NETWORK, NUBUS_TYPE_ETHERNET,
-				       ndev))) {
+	for_each_func_rsrc(ndev) {
+		if (ndev->category != NUBUS_CAT_NETWORK ||
+		    ndev->type != NUBUS_TYPE_ETHERNET)
+			continue;
+
 		/* Have we seen it already? */
 		if (slots & (1 << ndev->board->slot))
 			continue;
@@ -489,7 +494,7 @@ static const struct net_device_ops mac8390_netdev_ops = {
 };
 
 static int __init mac8390_initdev(struct net_device *dev,
-				  struct nubus_dev *ndev,
+				  struct nubus_rsrc *ndev,
 				  enum mac8390_type type)
 {
 	static u32 fwrd4_offsets[16] = {
diff --git a/drivers/net/ethernet/amazon/ena/ena_netdev.c b/drivers/net/ethernet/amazon/ena/ena_netdev.c
index 97c5a89..fbe21a81 100644
--- a/drivers/net/ethernet/amazon/ena/ena_netdev.c
+++ b/drivers/net/ethernet/amazon/ena/ena_netdev.c
@@ -75,6 +75,9 @@ static struct workqueue_struct *ena_wq;
 MODULE_DEVICE_TABLE(pci, ena_pci_tbl);
 
 static int ena_rss_init_default(struct ena_adapter *adapter);
+static void check_for_admin_com_state(struct ena_adapter *adapter);
+static void ena_destroy_device(struct ena_adapter *adapter);
+static int ena_restore_device(struct ena_adapter *adapter);
 
 static void ena_tx_timeout(struct net_device *dev)
 {
@@ -1565,7 +1568,7 @@ static int ena_rss_configure(struct ena_adapter *adapter)
 
 static int ena_up_complete(struct ena_adapter *adapter)
 {
-	int rc, i;
+	int rc;
 
 	rc = ena_rss_configure(adapter);
 	if (rc)
@@ -1584,17 +1587,6 @@ static int ena_up_complete(struct ena_adapter *adapter)
 
 	ena_napi_enable_all(adapter);
 
-	/* Enable completion queues interrupt */
-	for (i = 0; i < adapter->num_queues; i++)
-		ena_unmask_interrupt(&adapter->tx_ring[i],
-				     &adapter->rx_ring[i]);
-
-	/* schedule napi in case we had pending packets
-	 * from the last time we disable napi
-	 */
-	for (i = 0; i < adapter->num_queues; i++)
-		napi_schedule(&adapter->ena_napi[i].napi);
-
 	return 0;
 }
 
@@ -1731,7 +1723,7 @@ static int ena_create_all_io_rx_queues(struct ena_adapter *adapter)
 
 static int ena_up(struct ena_adapter *adapter)
 {
-	int rc;
+	int rc, i;
 
 	netdev_dbg(adapter->netdev, "%s\n", __func__);
 
@@ -1774,6 +1766,17 @@ static int ena_up(struct ena_adapter *adapter)
 
 	set_bit(ENA_FLAG_DEV_UP, &adapter->flags);
 
+	/* Enable completion queues interrupt */
+	for (i = 0; i < adapter->num_queues; i++)
+		ena_unmask_interrupt(&adapter->tx_ring[i],
+				     &adapter->rx_ring[i]);
+
+	/* schedule napi in case we had pending packets
+	 * from the last time we disable napi
+	 */
+	for (i = 0; i < adapter->num_queues; i++)
+		napi_schedule(&adapter->ena_napi[i].napi);
+
 	return rc;
 
 err_up:
@@ -1884,6 +1887,17 @@ static int ena_close(struct net_device *netdev)
 	if (test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
 		ena_down(adapter);
 
+	/* Check for device status and issue reset if needed*/
+	check_for_admin_com_state(adapter);
+	if (unlikely(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags))) {
+		netif_err(adapter, ifdown, adapter->netdev,
+			  "Destroy failure, restarting device\n");
+		ena_dump_stats_to_dmesg(adapter);
+		/* rtnl lock already obtained in dev_ioctl() layer */
+		ena_destroy_device(adapter);
+		ena_restore_device(adapter);
+	}
+
 	return 0;
 }
 
@@ -2544,11 +2558,12 @@ static void ena_destroy_device(struct ena_adapter *adapter)
 
 	ena_com_set_admin_running_state(ena_dev, false);
 
-	ena_close(netdev);
+	if (test_bit(ENA_FLAG_DEV_UP, &adapter->flags))
+		ena_down(adapter);
 
 	/* Before releasing the ENA resources, a device reset is required.
 	 * (to prevent the device from accessing them).
-	 * In case the reset flag is set and the device is up, ena_close
+	 * In case the reset flag is set and the device is up, ena_down()
 	 * already perform the reset, so it can be skipped.
 	 */
 	if (!(test_bit(ENA_FLAG_TRIGGER_RESET, &adapter->flags) && dev_up))
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
index 5ee1866..c961767 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c
@@ -70,7 +70,7 @@ static int bnxt_vf_ndo_prep(struct bnxt *bp, int vf_id)
 		netdev_err(bp->dev, "vf ndo called though sriov is disabled\n");
 		return -EINVAL;
 	}
-	if (vf_id >= bp->pf.max_vfs) {
+	if (vf_id >= bp->pf.active_vfs) {
 		netdev_err(bp->dev, "Invalid VF id %d\n", vf_id);
 		return -EINVAL;
 	}
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
index 3d201d7..d8fee26 100644
--- a/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
+++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_tc.c
@@ -421,7 +421,7 @@ static int bnxt_hwrm_cfa_flow_alloc(struct bnxt *bp, struct bnxt_tc_flow *flow,
 	}
 
 	/* If all IP and L4 fields are wildcarded then this is an L2 flow */
-	if (is_wildcard(&l3_mask, sizeof(l3_mask)) &&
+	if (is_wildcard(l3_mask, sizeof(*l3_mask)) &&
 	    is_wildcard(&flow->l4_mask, sizeof(flow->l4_mask))) {
 		flow_flags |= CFA_FLOW_ALLOC_REQ_FLAGS_FLOWTYPE_L2;
 	} else {
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
index 6f9fa6e..d8424ed 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4.h
@@ -344,7 +344,6 @@ struct adapter_params {
 
 	unsigned int sf_size;             /* serial flash size in bytes */
 	unsigned int sf_nsec;             /* # of flash sectors */
-	unsigned int sf_fw_start;         /* start of FW image in flash */
 
 	unsigned int fw_vers;		  /* firmware version */
 	unsigned int bs_vers;		  /* bootstrap version */
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index d4a548a..a452d5a 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -111,6 +111,9 @@ static void cxgb4_process_flow_match(struct net_device *dev,
 			ethtype_mask = 0;
 		}
 
+		if (ethtype_key == ETH_P_IPV6)
+			fs->type = 1;
+
 		fs->val.ethtype = ethtype_key;
 		fs->mask.ethtype = ethtype_mask;
 		fs->val.proto = key->ip_proto;
@@ -205,8 +208,8 @@ static void cxgb4_process_flow_match(struct net_device *dev,
 					   VLAN_PRIO_SHIFT);
 		vlan_tci_mask = mask->vlan_id | (mask->vlan_priority <<
 						 VLAN_PRIO_SHIFT);
-		fs->val.ivlan = cpu_to_be16(vlan_tci);
-		fs->mask.ivlan = cpu_to_be16(vlan_tci_mask);
+		fs->val.ivlan = vlan_tci;
+		fs->mask.ivlan = vlan_tci_mask;
 
 		/* Chelsio adapters use ivlan_vld bit to match vlan packets
 		 * as 802.1Q. Also, when vlan tag is present in packets,
diff --git a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
index f63210f..375ef86 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/t4_hw.c
@@ -2844,8 +2844,6 @@ enum {
 	SF_RD_DATA_FAST = 0xb,        /* read flash */
 	SF_RD_ID        = 0x9f,       /* read ID */
 	SF_ERASE_SECTOR = 0xd8,       /* erase sector */
-
-	FW_MAX_SIZE = 16 * SF_SEC_SIZE,
 };
 
 /**
@@ -3558,8 +3556,9 @@ int t4_load_fw(struct adapter *adap, const u8 *fw_data, unsigned int size)
 	const __be32 *p = (const __be32 *)fw_data;
 	const struct fw_hdr *hdr = (const struct fw_hdr *)fw_data;
 	unsigned int sf_sec_size = adap->params.sf_size / adap->params.sf_nsec;
-	unsigned int fw_img_start = adap->params.sf_fw_start;
-	unsigned int fw_start_sec = fw_img_start / sf_sec_size;
+	unsigned int fw_start_sec = FLASH_FW_START_SEC;
+	unsigned int fw_size = FLASH_FW_MAX_SIZE;
+	unsigned int fw_start = FLASH_FW_START;
 
 	if (!size) {
 		dev_err(adap->pdev_dev, "FW image has no data\n");
@@ -3575,9 +3574,9 @@ int t4_load_fw(struct adapter *adap, const u8 *fw_data, unsigned int size)
 			"FW image size differs from size in FW header\n");
 		return -EINVAL;
 	}
-	if (size > FW_MAX_SIZE) {
+	if (size > fw_size) {
 		dev_err(adap->pdev_dev, "FW image too large, max is %u bytes\n",
-			FW_MAX_SIZE);
+			fw_size);
 		return -EFBIG;
 	}
 	if (!t4_fw_matches_chip(adap, hdr))
@@ -3604,11 +3603,11 @@ int t4_load_fw(struct adapter *adap, const u8 *fw_data, unsigned int size)
 	 */
 	memcpy(first_page, fw_data, SF_PAGE_SIZE);
 	((struct fw_hdr *)first_page)->fw_ver = cpu_to_be32(0xffffffff);
-	ret = t4_write_flash(adap, fw_img_start, SF_PAGE_SIZE, first_page);
+	ret = t4_write_flash(adap, fw_start, SF_PAGE_SIZE, first_page);
 	if (ret)
 		goto out;
 
-	addr = fw_img_start;
+	addr = fw_start;
 	for (size -= SF_PAGE_SIZE; size; size -= SF_PAGE_SIZE) {
 		addr += SF_PAGE_SIZE;
 		fw_data += SF_PAGE_SIZE;
@@ -3618,7 +3617,7 @@ int t4_load_fw(struct adapter *adap, const u8 *fw_data, unsigned int size)
 	}
 
 	ret = t4_write_flash(adap,
-			     fw_img_start + offsetof(struct fw_hdr, fw_ver),
+			     fw_start + offsetof(struct fw_hdr, fw_ver),
 			     sizeof(hdr->fw_ver), (const u8 *)&hdr->fw_ver);
 out:
 	if (ret)
diff --git a/drivers/net/ethernet/cirrus/cs89x0.c b/drivers/net/ethernet/cirrus/cs89x0.c
index 410a0a9..b3e7faf 100644
--- a/drivers/net/ethernet/cirrus/cs89x0.c
+++ b/drivers/net/ethernet/cirrus/cs89x0.c
@@ -1913,3 +1913,7 @@ static struct platform_driver cs89x0_driver = {
 module_platform_driver_probe(cs89x0_driver, cs89x0_platform_probe);
 
 #endif /* CONFIG_CS89x0_PLATFORM */
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Crystal Semiconductor (Now Cirrus Logic) CS89[02]0 network driver");
+MODULE_AUTHOR("Russell Nelson <nelson@crynwr.com>");
diff --git a/drivers/net/ethernet/cirrus/mac89x0.c b/drivers/net/ethernet/cirrus/mac89x0.c
index f910f0f..977d4c2 100644
--- a/drivers/net/ethernet/cirrus/mac89x0.c
+++ b/drivers/net/ethernet/cirrus/mac89x0.c
@@ -187,6 +187,7 @@ struct net_device * __init mac89x0_probe(int unit)
 	unsigned long ioaddr;
 	unsigned short sig;
 	int err = -ENODEV;
+	struct nubus_rsrc *fres;
 
 	if (!MACH_IS_MAC)
 		return ERR_PTR(-ENODEV);
@@ -207,8 +208,9 @@ struct net_device * __init mac89x0_probe(int unit)
 	/* We might have to parameterize this later */
 	slot = 0xE;
 	/* Get out now if there's a real NuBus card in slot E */
-	if (nubus_find_slot(slot, NULL) != NULL)
-		goto out;
+	for_each_func_rsrc(fres)
+		if (fres->board->slot == slot)
+			goto out;
 
 	/* The pseudo-ISA bits always live at offset 0x300 (gee,
            wonder why...) */
diff --git a/drivers/net/ethernet/emulex/benet/be_main.c b/drivers/net/ethernet/emulex/benet/be_main.c
index c6e859a2..e180657 100644
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -4634,6 +4634,15 @@ int be_update_queues(struct be_adapter *adapter)
 
 	be_schedule_worker(adapter);
 
+	/*
+	 * The IF was destroyed and re-created. We need to clear
+	 * all promiscuous flags valid for the destroyed IF.
+	 * Without this promisc mode is not restored during
+	 * be_open() because the driver thinks that it is
+	 * already enabled in HW.
+	 */
+	adapter->if_flags &= ~BE_IF_FLAGS_ALL_PROMISCUOUS;
+
 	if (netif_running(netdev))
 		status = be_open(netdev);
 
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c
index 8184d2f..a74300a 100644
--- a/drivers/net/ethernet/freescale/fec_main.c
+++ b/drivers/net/ethernet/freescale/fec_main.c
@@ -3469,6 +3469,10 @@ fec_probe(struct platform_device *pdev)
 			goto failed_regulator;
 		}
 	} else {
+		if (PTR_ERR(fep->reg_phy) == -EPROBE_DEFER) {
+			ret = -EPROBE_DEFER;
+			goto failed_regulator;
+		}
 		fep->reg_phy = NULL;
 	}
 
@@ -3552,8 +3556,9 @@ fec_probe(struct platform_device *pdev)
 failed_clk:
 	if (of_phy_is_fixed_link(np))
 		of_phy_deregister_fixed_link(np);
-failed_phy:
 	of_node_put(phy_node);
+failed_phy:
+	dev_id--;
 failed_ioremap:
 	free_netdev(ndev);
 
diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
index 7892f2f0..2c2976a 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c
@@ -613,9 +613,11 @@ static int fs_enet_start_xmit(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }
 
-static void fs_timeout(struct net_device *dev)
+static void fs_timeout_work(struct work_struct *work)
 {
-	struct fs_enet_private *fep = netdev_priv(dev);
+	struct fs_enet_private *fep = container_of(work, struct fs_enet_private,
+						   timeout_work);
+	struct net_device *dev = fep->ndev;
 	unsigned long flags;
 	int wake = 0;
 
@@ -627,7 +629,6 @@ static void fs_timeout(struct net_device *dev)
 		phy_stop(dev->phydev);
 		(*fep->ops->stop)(dev);
 		(*fep->ops->restart)(dev);
-		phy_start(dev->phydev);
 	}
 
 	phy_start(dev->phydev);
@@ -639,6 +640,13 @@ static void fs_timeout(struct net_device *dev)
 		netif_wake_queue(dev);
 }
 
+static void fs_timeout(struct net_device *dev)
+{
+	struct fs_enet_private *fep = netdev_priv(dev);
+
+	schedule_work(&fep->timeout_work);
+}
+
 /*-----------------------------------------------------------------------------
  *  generic link-change handler - should be sufficient for most cases
  *-----------------------------------------------------------------------------*/
@@ -759,6 +767,7 @@ static int fs_enet_close(struct net_device *dev)
 	netif_stop_queue(dev);
 	netif_carrier_off(dev);
 	napi_disable(&fep->napi);
+	cancel_work_sync(&fep->timeout_work);
 	phy_stop(dev->phydev);
 
 	spin_lock_irqsave(&fep->lock, flags);
@@ -1019,6 +1028,7 @@ static int fs_enet_probe(struct platform_device *ofdev)
 
 	ndev->netdev_ops = &fs_enet_netdev_ops;
 	ndev->watchdog_timeo = 2 * HZ;
+	INIT_WORK(&fep->timeout_work, fs_timeout_work);
 	netif_napi_add(ndev, &fep->napi, fs_enet_napi, fpi->napi_weight);
 
 	ndev->ethtool_ops = &fs_ethtool_ops;
diff --git a/drivers/net/ethernet/freescale/fs_enet/fs_enet.h b/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
index 92e06b3..195fae6 100644
--- a/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
+++ b/drivers/net/ethernet/freescale/fs_enet/fs_enet.h
@@ -125,6 +125,7 @@ struct fs_enet_private {
 	spinlock_t lock;	/* during all ops except TX pckt processing */
 	spinlock_t tx_lock;	/* during fs_start_xmit and fs_tx         */
 	struct fs_platform_info *fpi;
+	struct work_struct timeout_work;
 	const struct fs_ops *ops;
 	int rx_ring, tx_ring;
 	dma_addr_t ring_mem_addr;
diff --git a/drivers/net/ethernet/freescale/gianfar_ptp.c b/drivers/net/ethernet/freescale/gianfar_ptp.c
index 5441142..9f8d4f8 100644
--- a/drivers/net/ethernet/freescale/gianfar_ptp.c
+++ b/drivers/net/ethernet/freescale/gianfar_ptp.c
@@ -319,11 +319,10 @@ static int ptp_gianfar_adjtime(struct ptp_clock_info *ptp, s64 delta)
 	now = tmr_cnt_read(etsects);
 	now += delta;
 	tmr_cnt_write(etsects, now);
+	set_fipers(etsects);
 
 	spin_unlock_irqrestore(&etsects->lock, flags);
 
-	set_fipers(etsects);
-
 	return 0;
 }
 
diff --git a/drivers/net/ethernet/ibm/emac/core.c b/drivers/net/ethernet/ibm/emac/core.c
index 7feff24..241db31 100644
--- a/drivers/net/ethernet/ibm/emac/core.c
+++ b/drivers/net/ethernet/ibm/emac/core.c
@@ -494,6 +494,9 @@ static u32 __emac_calc_base_mr1(struct emac_instance *dev, int tx_size, int rx_s
 	case 16384:
 		ret |= EMAC_MR1_RFS_16K;
 		break;
+	case 8192:
+		ret |= EMAC4_MR1_RFS_8K;
+		break;
 	case 4096:
 		ret |= EMAC_MR1_RFS_4K;
 		break;
@@ -516,6 +519,9 @@ static u32 __emac4_calc_base_mr1(struct emac_instance *dev, int tx_size, int rx_
 	case 16384:
 		ret |= EMAC4_MR1_TFS_16K;
 		break;
+	case 8192:
+		ret |= EMAC4_MR1_TFS_8K;
+		break;
 	case 4096:
 		ret |= EMAC4_MR1_TFS_4K;
 		break;
diff --git a/drivers/net/ethernet/ibm/emac/emac.h b/drivers/net/ethernet/ibm/emac/emac.h
index 5afcc27..c26d263 100644
--- a/drivers/net/ethernet/ibm/emac/emac.h
+++ b/drivers/net/ethernet/ibm/emac/emac.h
@@ -151,9 +151,11 @@ struct emac_regs {
 
 #define EMAC4_MR1_RFS_2K		0x00100000
 #define EMAC4_MR1_RFS_4K		0x00180000
+#define EMAC4_MR1_RFS_8K		0x00200000
 #define EMAC4_MR1_RFS_16K		0x00280000
 #define EMAC4_MR1_TFS_2K       		0x00020000
 #define EMAC4_MR1_TFS_4K		0x00030000
+#define EMAC4_MR1_TFS_8K		0x00040000
 #define EMAC4_MR1_TFS_16K		0x00050000
 #define EMAC4_MR1_TR			0x00008000
 #define EMAC4_MR1_MWSW_001		0x00001000
@@ -242,7 +244,7 @@ struct emac_regs {
 #define EMAC_STACR_PHYE			0x00004000
 #define EMAC_STACR_STAC_MASK		0x00003000
 #define EMAC_STACR_STAC_READ		0x00001000
-#define EMAC_STACR_STAC_WRITE		0x00002000
+#define EMAC_STACR_STAC_WRITE		0x00000800
 #define EMAC_STACR_OPBC_MASK		0x00000C00
 #define EMAC_STACR_OPBC_50		0x00000000
 #define EMAC_STACR_OPBC_66		0x00000400
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c
index 1dc4aef..b65f5f3 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -410,6 +410,10 @@ static int reset_rx_pools(struct ibmvnic_adapter *adapter)
 	struct ibmvnic_rx_pool *rx_pool;
 	int rx_scrqs;
 	int i, j, rc;
+	u64 *size_array;
+
+	size_array = (u64 *)((u8 *)(adapter->login_rsp_buf) +
+		be32_to_cpu(adapter->login_rsp_buf->off_rxadd_buff_size));
 
 	rx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs);
 	for (i = 0; i < rx_scrqs; i++) {
@@ -417,7 +421,17 @@ static int reset_rx_pools(struct ibmvnic_adapter *adapter)
 
 		netdev_dbg(adapter->netdev, "Re-setting rx_pool[%d]\n", i);
 
-		rc = reset_long_term_buff(adapter, &rx_pool->long_term_buff);
+		if (rx_pool->buff_size != be64_to_cpu(size_array[i])) {
+			free_long_term_buff(adapter, &rx_pool->long_term_buff);
+			rx_pool->buff_size = be64_to_cpu(size_array[i]);
+			alloc_long_term_buff(adapter, &rx_pool->long_term_buff,
+					     rx_pool->size *
+					     rx_pool->buff_size);
+		} else {
+			rc = reset_long_term_buff(adapter,
+						  &rx_pool->long_term_buff);
+		}
+
 		if (rc)
 			return rc;
 
@@ -439,14 +453,12 @@ static int reset_rx_pools(struct ibmvnic_adapter *adapter)
 static void release_rx_pools(struct ibmvnic_adapter *adapter)
 {
 	struct ibmvnic_rx_pool *rx_pool;
-	int rx_scrqs;
 	int i, j;
 
 	if (!adapter->rx_pool)
 		return;
 
-	rx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_rxadd_subcrqs);
-	for (i = 0; i < rx_scrqs; i++) {
+	for (i = 0; i < adapter->num_active_rx_pools; i++) {
 		rx_pool = &adapter->rx_pool[i];
 
 		netdev_dbg(adapter->netdev, "Releasing rx_pool[%d]\n", i);
@@ -469,6 +481,7 @@ static void release_rx_pools(struct ibmvnic_adapter *adapter)
 
 	kfree(adapter->rx_pool);
 	adapter->rx_pool = NULL;
+	adapter->num_active_rx_pools = 0;
 }
 
 static int init_rx_pools(struct net_device *netdev)
@@ -493,6 +506,8 @@ static int init_rx_pools(struct net_device *netdev)
 		return -1;
 	}
 
+	adapter->num_active_rx_pools = 0;
+
 	for (i = 0; i < rxadd_subcrqs; i++) {
 		rx_pool = &adapter->rx_pool[i];
 
@@ -536,6 +551,8 @@ static int init_rx_pools(struct net_device *netdev)
 		rx_pool->next_free = 0;
 	}
 
+	adapter->num_active_rx_pools = rxadd_subcrqs;
+
 	return 0;
 }
 
@@ -586,13 +603,12 @@ static void release_vpd_data(struct ibmvnic_adapter *adapter)
 static void release_tx_pools(struct ibmvnic_adapter *adapter)
 {
 	struct ibmvnic_tx_pool *tx_pool;
-	int i, tx_scrqs;
+	int i;
 
 	if (!adapter->tx_pool)
 		return;
 
-	tx_scrqs = be32_to_cpu(adapter->login_rsp_buf->num_txsubm_subcrqs);
-	for (i = 0; i < tx_scrqs; i++) {
+	for (i = 0; i < adapter->num_active_tx_pools; i++) {
 		netdev_dbg(adapter->netdev, "Releasing tx_pool[%d]\n", i);
 		tx_pool = &adapter->tx_pool[i];
 		kfree(tx_pool->tx_buff);
@@ -603,6 +619,7 @@ static void release_tx_pools(struct ibmvnic_adapter *adapter)
 
 	kfree(adapter->tx_pool);
 	adapter->tx_pool = NULL;
+	adapter->num_active_tx_pools = 0;
 }
 
 static int init_tx_pools(struct net_device *netdev)
@@ -619,6 +636,8 @@ static int init_tx_pools(struct net_device *netdev)
 	if (!adapter->tx_pool)
 		return -1;
 
+	adapter->num_active_tx_pools = 0;
+
 	for (i = 0; i < tx_subcrqs; i++) {
 		tx_pool = &adapter->tx_pool[i];
 
@@ -666,6 +685,8 @@ static int init_tx_pools(struct net_device *netdev)
 		tx_pool->producer_index = 0;
 	}
 
+	adapter->num_active_tx_pools = tx_subcrqs;
+
 	return 0;
 }
 
@@ -756,6 +777,12 @@ static int ibmvnic_login(struct net_device *netdev)
 		}
 	} while (adapter->renegotiate);
 
+	/* handle pending MAC address changes after successful login */
+	if (adapter->mac_change_pending) {
+		__ibmvnic_set_mac(netdev, &adapter->desired.mac);
+		adapter->mac_change_pending = false;
+	}
+
 	return 0;
 }
 
@@ -854,7 +881,7 @@ static int ibmvnic_get_vpd(struct ibmvnic_adapter *adapter)
 	if (adapter->vpd->buff)
 		len = adapter->vpd->len;
 
-	reinit_completion(&adapter->fw_done);
+	init_completion(&adapter->fw_done);
 	crq.get_vpd_size.first = IBMVNIC_CRQ_CMD;
 	crq.get_vpd_size.cmd = GET_VPD_SIZE;
 	ibmvnic_send_crq(adapter, &crq);
@@ -916,6 +943,13 @@ static int init_resources(struct ibmvnic_adapter *adapter)
 	if (!adapter->vpd)
 		return -ENOMEM;
 
+	/* Vital Product Data (VPD) */
+	rc = ibmvnic_get_vpd(adapter);
+	if (rc) {
+		netdev_err(netdev, "failed to initialize Vital Product Data (VPD)\n");
+		return rc;
+	}
+
 	adapter->map_id = 1;
 	adapter->napi = kcalloc(adapter->req_rx_queues,
 				sizeof(struct napi_struct), GFP_KERNEL);
@@ -989,15 +1023,10 @@ static int __ibmvnic_open(struct net_device *netdev)
 static int ibmvnic_open(struct net_device *netdev)
 {
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
-	int rc, vpd;
+	int rc;
 
 	mutex_lock(&adapter->reset_lock);
 
-	if (adapter->mac_change_pending) {
-		__ibmvnic_set_mac(netdev, &adapter->desired.mac);
-		adapter->mac_change_pending = false;
-	}
-
 	if (adapter->state != VNIC_CLOSED) {
 		rc = ibmvnic_login(netdev);
 		if (rc) {
@@ -1017,11 +1046,6 @@ static int ibmvnic_open(struct net_device *netdev)
 	rc = __ibmvnic_open(netdev);
 	netif_carrier_on(netdev);
 
-	/* Vital Product Data (VPD) */
-	vpd = ibmvnic_get_vpd(adapter);
-	if (vpd)
-		netdev_err(netdev, "failed to initialize Vital Product Data (VPD)\n");
-
 	mutex_unlock(&adapter->reset_lock);
 
 	return rc;
@@ -1275,6 +1299,7 @@ static int ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	unsigned char *dst;
 	u64 *handle_array;
 	int index = 0;
+	u8 proto = 0;
 	int ret = 0;
 
 	if (adapter->resetting) {
@@ -1363,17 +1388,18 @@ static int ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev)
 	}
 
 	if (skb->protocol == htons(ETH_P_IP)) {
-		if (ip_hdr(skb)->version == 4)
-			tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_IPV4;
-		else if (ip_hdr(skb)->version == 6)
-			tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_IPV6;
-
-		if (ip_hdr(skb)->protocol == IPPROTO_TCP)
-			tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_TCP;
-		else if (ip_hdr(skb)->protocol != IPPROTO_TCP)
-			tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_UDP;
+		tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_IPV4;
+		proto = ip_hdr(skb)->protocol;
+	} else if (skb->protocol == htons(ETH_P_IPV6)) {
+		tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_IPV6;
+		proto = ipv6_hdr(skb)->nexthdr;
 	}
 
+	if (proto == IPPROTO_TCP)
+		tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_TCP;
+	else if (proto == IPPROTO_UDP)
+		tx_crq.v1.flags1 |= IBMVNIC_TX_PROT_UDP;
+
 	if (skb->ip_summed == CHECKSUM_PARTIAL) {
 		tx_crq.v1.flags1 |= IBMVNIC_TX_CHKSUM_OFFLOAD;
 		hdrs += 2;
@@ -1527,7 +1553,7 @@ static int ibmvnic_set_mac(struct net_device *netdev, void *p)
 	struct ibmvnic_adapter *adapter = netdev_priv(netdev);
 	struct sockaddr *addr = p;
 
-	if (adapter->state != VNIC_OPEN) {
+	if (adapter->state == VNIC_PROBED) {
 		memcpy(&adapter->desired.mac, addr, sizeof(struct sockaddr));
 		adapter->mac_change_pending = true;
 		return 0;
@@ -1545,6 +1571,7 @@ static int ibmvnic_set_mac(struct net_device *netdev, void *p)
 static int do_reset(struct ibmvnic_adapter *adapter,
 		    struct ibmvnic_rwi *rwi, u32 reset_state)
 {
+	u64 old_num_rx_queues, old_num_tx_queues;
 	struct net_device *netdev = adapter->netdev;
 	int i, rc;
 
@@ -1554,6 +1581,9 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 	netif_carrier_off(netdev);
 	adapter->reset_reason = rwi->reset_reason;
 
+	old_num_rx_queues = adapter->req_rx_queues;
+	old_num_tx_queues = adapter->req_tx_queues;
+
 	if (rwi->reset_reason == VNIC_RESET_MOBILITY) {
 		rc = ibmvnic_reenable_crq_queue(adapter);
 		if (rc)
@@ -1598,6 +1628,12 @@ static int do_reset(struct ibmvnic_adapter *adapter,
 			rc = init_resources(adapter);
 			if (rc)
 				return rc;
+		} else if (adapter->req_rx_queues != old_num_rx_queues ||
+			   adapter->req_tx_queues != old_num_tx_queues) {
+			release_rx_pools(adapter);
+			release_tx_pools(adapter);
+			init_rx_pools(netdev);
+			init_tx_pools(netdev);
 		} else {
 			rc = reset_tx_pools(adapter);
 			if (rc)
@@ -3345,7 +3381,11 @@ static void handle_query_ip_offload_rsp(struct ibmvnic_adapter *adapter)
 		return;
 	}
 
+	adapter->ip_offload_ctrl.len =
+	    cpu_to_be32(sizeof(adapter->ip_offload_ctrl));
 	adapter->ip_offload_ctrl.version = cpu_to_be32(INITIAL_VERSION_IOB);
+	adapter->ip_offload_ctrl.ipv4_chksum = buf->ipv4_chksum;
+	adapter->ip_offload_ctrl.ipv6_chksum = buf->ipv6_chksum;
 	adapter->ip_offload_ctrl.tcp_ipv4_chksum = buf->tcp_ipv4_chksum;
 	adapter->ip_offload_ctrl.udp_ipv4_chksum = buf->udp_ipv4_chksum;
 	adapter->ip_offload_ctrl.tcp_ipv6_chksum = buf->tcp_ipv6_chksum;
@@ -3585,7 +3625,17 @@ static void handle_request_cap_rsp(union ibmvnic_crq *crq,
 			 *req_value,
 			 (long int)be64_to_cpu(crq->request_capability_rsp.
 					       number), name);
-		*req_value = be64_to_cpu(crq->request_capability_rsp.number);
+
+		if (be16_to_cpu(crq->request_capability_rsp.capability) ==
+		    REQ_MTU) {
+			pr_err("mtu of %llu is not supported. Reverting.\n",
+			       *req_value);
+			*req_value = adapter->fallback.mtu;
+		} else {
+			*req_value =
+				be64_to_cpu(crq->request_capability_rsp.number);
+		}
+
 		ibmvnic_send_req_caps(adapter, 1);
 		return;
 	default:
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h
index 4487f1e..3aec421 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.h
+++ b/drivers/net/ethernet/ibm/ibmvnic.h
@@ -1091,6 +1091,8 @@ struct ibmvnic_adapter {
 	u64 opt_rxba_entries_per_subcrq;
 	__be64 tx_rx_desc_req;
 	u8 map_id;
+	u64 num_active_rx_pools;
+	u64 num_active_tx_pools;
 
 	struct tasklet_struct tasklet;
 	enum vnic_state state;
diff --git a/drivers/net/ethernet/intel/e1000/e1000.h b/drivers/net/ethernet/intel/e1000/e1000.h
index d7bdea7..8fd2458 100644
--- a/drivers/net/ethernet/intel/e1000/e1000.h
+++ b/drivers/net/ethernet/intel/e1000/e1000.h
@@ -331,7 +331,8 @@ struct e1000_adapter {
 enum e1000_state_t {
 	__E1000_TESTING,
 	__E1000_RESETTING,
-	__E1000_DOWN
+	__E1000_DOWN,
+	__E1000_DISABLED
 };
 
 #undef pr_fmt
diff --git a/drivers/net/ethernet/intel/e1000/e1000_main.c b/drivers/net/ethernet/intel/e1000/e1000_main.c
index 1982f79..3dd4aeb 100644
--- a/drivers/net/ethernet/intel/e1000/e1000_main.c
+++ b/drivers/net/ethernet/intel/e1000/e1000_main.c
@@ -945,7 +945,7 @@ static int e1000_init_hw_struct(struct e1000_adapter *adapter,
 static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 {
 	struct net_device *netdev;
-	struct e1000_adapter *adapter;
+	struct e1000_adapter *adapter = NULL;
 	struct e1000_hw *hw;
 
 	static int cards_found;
@@ -955,6 +955,7 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	u16 tmp = 0;
 	u16 eeprom_apme_mask = E1000_EEPROM_APME;
 	int bars, need_ioport;
+	bool disable_dev = false;
 
 	/* do not allocate ioport bars when not needed */
 	need_ioport = e1000_is_need_ioport(pdev);
@@ -1259,11 +1260,13 @@ static int e1000_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
 	iounmap(hw->ce4100_gbe_mdio_base_virt);
 	iounmap(hw->hw_addr);
 err_ioremap:
+	disable_dev = !test_and_set_bit(__E1000_DISABLED, &adapter->flags);
 	free_netdev(netdev);
 err_alloc_etherdev:
 	pci_release_selected_regions(pdev, bars);
 err_pci_reg:
-	pci_disable_device(pdev);
+	if (!adapter || disable_dev)
+		pci_disable_device(pdev);
 	return err;
 }
 
@@ -1281,6 +1284,7 @@ static void e1000_remove(struct pci_dev *pdev)
 	struct net_device *netdev = pci_get_drvdata(pdev);
 	struct e1000_adapter *adapter = netdev_priv(netdev);
 	struct e1000_hw *hw = &adapter->hw;
+	bool disable_dev;
 
 	e1000_down_and_stop(adapter);
 	e1000_release_manageability(adapter);
@@ -1299,9 +1303,11 @@ static void e1000_remove(struct pci_dev *pdev)
 		iounmap(hw->flash_address);
 	pci_release_selected_regions(pdev, adapter->bars);
 
+	disable_dev = !test_and_set_bit(__E1000_DISABLED, &adapter->flags);
 	free_netdev(netdev);
 
-	pci_disable_device(pdev);
+	if (disable_dev)
+		pci_disable_device(pdev);
 }
 
 /**
@@ -5156,7 +5162,8 @@ static int __e1000_shutdown(struct pci_dev *pdev, bool *enable_wake)
 	if (netif_running(netdev))
 		e1000_free_irq(adapter);
 
-	pci_disable_device(pdev);
+	if (!test_and_set_bit(__E1000_DISABLED, &adapter->flags))
+		pci_disable_device(pdev);
 
 	return 0;
 }
@@ -5200,6 +5207,10 @@ static int e1000_resume(struct pci_dev *pdev)
 		pr_err("Cannot enable PCI device from suspend\n");
 		return err;
 	}
+
+	/* flush memory to make sure state is correct */
+	smp_mb__before_atomic();
+	clear_bit(__E1000_DISABLED, &adapter->flags);
 	pci_set_master(pdev);
 
 	pci_enable_wake(pdev, PCI_D3hot, 0);
@@ -5274,7 +5285,9 @@ static pci_ers_result_t e1000_io_error_detected(struct pci_dev *pdev,
 
 	if (netif_running(netdev))
 		e1000_down(adapter);
-	pci_disable_device(pdev);
+
+	if (!test_and_set_bit(__E1000_DISABLED, &adapter->flags))
+		pci_disable_device(pdev);
 
 	/* Request a slot slot reset. */
 	return PCI_ERS_RESULT_NEED_RESET;
@@ -5302,6 +5315,10 @@ static pci_ers_result_t e1000_io_slot_reset(struct pci_dev *pdev)
 		pr_err("Cannot re-enable PCI device after reset.\n");
 		return PCI_ERS_RESULT_DISCONNECT;
 	}
+
+	/* flush memory to make sure state is correct */
+	smp_mb__before_atomic();
+	clear_bit(__E1000_DISABLED, &adapter->flags);
 	pci_set_master(pdev);
 
 	pci_enable_wake(pdev, PCI_D3hot, 0);
diff --git a/drivers/net/ethernet/intel/e1000e/ich8lan.c b/drivers/net/ethernet/intel/e1000e/ich8lan.c
index d6d4ed7..31277d3 100644
--- a/drivers/net/ethernet/intel/e1000e/ich8lan.c
+++ b/drivers/net/ethernet/intel/e1000e/ich8lan.c
@@ -1367,6 +1367,9 @@ static s32 e1000_disable_ulp_lpt_lp(struct e1000_hw *hw, bool force)
  *  Checks to see of the link status of the hardware has changed.  If a
  *  change in link status has been detected, then we read the PHY registers
  *  to get the current speed/duplex if link exists.
+ *
+ *  Returns a negative error code (-E1000_ERR_*) or 0 (link down) or 1 (link
+ *  up).
  **/
 static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 {
@@ -1382,7 +1385,7 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 	 * Change or Rx Sequence Error interrupt.
 	 */
 	if (!mac->get_link_status)
-		return 0;
+		return 1;
 
 	/* First we want to see if the MII Status Register reports
 	 * link.  If so, then we want to get the current speed/duplex
@@ -1613,10 +1616,12 @@ static s32 e1000_check_for_copper_link_ich8lan(struct e1000_hw *hw)
 	 * different link partner.
 	 */
 	ret_val = e1000e_config_fc_after_link_up(hw);
-	if (ret_val)
+	if (ret_val) {
 		e_dbg("Error configuring flow control\n");
+		return ret_val;
+	}
 
-	return ret_val;
+	return 1;
 }
 
 static s32 e1000_get_variants_ich8lan(struct e1000_adapter *adapter)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
index 7f60522..a434fec 100644
--- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
+++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c
@@ -2463,7 +2463,6 @@ static int fm10k_handle_resume(struct fm10k_intfc *interface)
 	return err;
 }
 
-#ifdef CONFIG_PM
 /**
  * fm10k_resume - Generic PM resume hook
  * @dev: generic device structure
@@ -2472,7 +2471,7 @@ static int fm10k_handle_resume(struct fm10k_intfc *interface)
  * suspend or hibernation. This function does not need to handle lower PCIe
  * device state as the stack takes care of that for us.
  **/
-static int fm10k_resume(struct device *dev)
+static int __maybe_unused fm10k_resume(struct device *dev)
 {
 	struct fm10k_intfc *interface = pci_get_drvdata(to_pci_dev(dev));
 	struct net_device *netdev = interface->netdev;
@@ -2499,7 +2498,7 @@ static int fm10k_resume(struct device *dev)
  * system suspend or hibernation. This function does not need to handle lower
  * PCIe device state as the stack takes care of that for us.
  **/
-static int fm10k_suspend(struct device *dev)
+static int __maybe_unused fm10k_suspend(struct device *dev)
 {
 	struct fm10k_intfc *interface = pci_get_drvdata(to_pci_dev(dev));
 	struct net_device *netdev = interface->netdev;
@@ -2511,8 +2510,6 @@ static int fm10k_suspend(struct device *dev)
 	return 0;
 }
 
-#endif /* CONFIG_PM */
-
 /**
  * fm10k_io_error_detected - called when PCI error is detected
  * @pdev: Pointer to PCI device
@@ -2643,11 +2640,9 @@ static struct pci_driver fm10k_driver = {
 	.id_table		= fm10k_pci_tbl,
 	.probe			= fm10k_probe,
 	.remove			= fm10k_remove,
-#ifdef CONFIG_PM
 	.driver = {
 		.pm		= &fm10k_pm_ops,
 	},
-#endif /* CONFIG_PM */
 	.sriov_configure	= fm10k_iov_configure,
 	.err_handler		= &fm10k_err_handler
 };
diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c b/drivers/net/ethernet/intel/i40e/i40e_main.c
index 321d8be..af792112 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_main.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
@@ -1573,11 +1573,18 @@ static int i40e_set_mac(struct net_device *netdev, void *p)
 	else
 		netdev_info(netdev, "set new mac address %pM\n", addr->sa_data);
 
+	/* Copy the address first, so that we avoid a possible race with
+	 * .set_rx_mode(). If we copy after changing the address in the filter
+	 * list, we might open ourselves to a narrow race window where
+	 * .set_rx_mode could delete our dev_addr filter and prevent traffic
+	 * from passing.
+	 */
+	ether_addr_copy(netdev->dev_addr, addr->sa_data);
+
 	spin_lock_bh(&vsi->mac_filter_hash_lock);
 	i40e_del_mac_filter(vsi, netdev->dev_addr);
 	i40e_add_mac_filter(vsi, addr->sa_data);
 	spin_unlock_bh(&vsi->mac_filter_hash_lock);
-	ether_addr_copy(netdev->dev_addr, addr->sa_data);
 	if (vsi->type == I40E_VSI_MAIN) {
 		i40e_status ret;
 
@@ -1923,6 +1930,14 @@ static int i40e_addr_unsync(struct net_device *netdev, const u8 *addr)
 	struct i40e_netdev_priv *np = netdev_priv(netdev);
 	struct i40e_vsi *vsi = np->vsi;
 
+	/* Under some circumstances, we might receive a request to delete
+	 * our own device address from our uc list. Because we store the
+	 * device address in the VSI's MAC/VLAN filter list, we need to ignore
+	 * such requests and not delete our device address from this list.
+	 */
+	if (ether_addr_equal(addr, netdev->dev_addr))
+		return 0;
+
 	i40e_del_mac_filter(vsi, addr);
 
 	return 0;
@@ -6038,8 +6053,8 @@ static int i40e_validate_and_set_switch_mode(struct i40e_vsi *vsi)
 	/* Set Bit 7 to be valid */
 	mode = I40E_AQ_SET_SWITCH_BIT7_VALID;
 
-	/* Set L4type to both TCP and UDP support */
-	mode |= I40E_AQ_SET_SWITCH_L4_TYPE_BOTH;
+	/* Set L4type for TCP support */
+	mode |= I40E_AQ_SET_SWITCH_L4_TYPE_TCP;
 
 	/* Set cloud filter mode */
 	mode |= I40E_AQ_SET_SWITCH_MODE_NON_TUNNEL;
@@ -6969,18 +6984,18 @@ static int i40e_add_del_cloud_filter_big_buf(struct i40e_vsi *vsi,
 	     is_valid_ether_addr(filter->src_mac)) ||
 	    (is_multicast_ether_addr(filter->dst_mac) &&
 	     is_multicast_ether_addr(filter->src_mac)))
-		return -EINVAL;
+		return -EOPNOTSUPP;
 
-	/* Make sure port is specified, otherwise bail out, for channel
-	 * specific cloud filter needs 'L4 port' to be non-zero
+	/* Big buffer cloud filter needs 'L4 port' to be non-zero. Also, UDP
+	 * ports are not supported via big buffer now.
 	 */
-	if (!filter->dst_port)
-		return -EINVAL;
+	if (!filter->dst_port || filter->ip_proto == IPPROTO_UDP)
+		return -EOPNOTSUPP;
 
 	/* adding filter using src_port/src_ip is not supported at this stage */
 	if (filter->src_port || filter->src_ipv4 ||
 	    !ipv6_addr_any(&filter->ip.v6.src_ip6))
-		return -EINVAL;
+		return -EOPNOTSUPP;
 
 	/* copy element needed to add cloud filter from filter */
 	i40e_set_cld_element(filter, &cld_filter.element);
@@ -6991,7 +7006,7 @@ static int i40e_add_del_cloud_filter_big_buf(struct i40e_vsi *vsi,
 	    is_multicast_ether_addr(filter->src_mac)) {
 		/* MAC + IP : unsupported mode */
 		if (filter->dst_ipv4)
-			return -EINVAL;
+			return -EOPNOTSUPP;
 
 		/* since we validated that L4 port must be valid before
 		 * we get here, start with respective "flags" value
@@ -7356,7 +7371,7 @@ static int i40e_configure_clsflower(struct i40e_vsi *vsi,
 
 	if (tc < 0) {
 		dev_err(&vsi->back->pdev->dev, "Invalid traffic class\n");
-		return -EINVAL;
+		return -EOPNOTSUPP;
 	}
 
 	if (test_bit(__I40E_RESET_RECOVERY_PENDING, pf->state) ||
@@ -7490,6 +7505,8 @@ static int i40e_setup_tc_cls_flower(struct i40e_netdev_priv *np,
 {
 	struct i40e_vsi *vsi = np->vsi;
 
+	if (!tc_can_offload(vsi->netdev))
+		return -EOPNOTSUPP;
 	if (cls_flower->common.chain_index)
 		return -EOPNOTSUPP;
 
diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
index 4566d66..5bc2748 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
@@ -3047,10 +3047,30 @@ bool __i40e_chk_linearize(struct sk_buff *skb)
 	/* Walk through fragments adding latest fragment, testing it, and
 	 * then removing stale fragments from the sum.
 	 */
-	stale = &skb_shinfo(skb)->frags[0];
-	for (;;) {
+	for (stale = &skb_shinfo(skb)->frags[0];; stale++) {
+		int stale_size = skb_frag_size(stale);
+
 		sum += skb_frag_size(frag++);
 
+		/* The stale fragment may present us with a smaller
+		 * descriptor than the actual fragment size. To account
+		 * for that we need to remove all the data on the front and
+		 * figure out what the remainder would be in the last
+		 * descriptor associated with the fragment.
+		 */
+		if (stale_size > I40E_MAX_DATA_PER_TXD) {
+			int align_pad = -(stale->page_offset) &
+					(I40E_MAX_READ_REQ_SIZE - 1);
+
+			sum -= align_pad;
+			stale_size -= align_pad;
+
+			do {
+				sum -= I40E_MAX_DATA_PER_TXD_ALIGNED;
+				stale_size -= I40E_MAX_DATA_PER_TXD_ALIGNED;
+			} while (stale_size > I40E_MAX_DATA_PER_TXD);
+		}
+
 		/* if sum is negative we failed to make sufficient progress */
 		if (sum < 0)
 			return true;
@@ -3058,7 +3078,7 @@ bool __i40e_chk_linearize(struct sk_buff *skb)
 		if (!nr_frags--)
 			break;
 
-		sum -= skb_frag_size(stale++);
+		sum -= stale_size;
 	}
 
 	return false;
diff --git a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
index 50864f9..1ba29bb 100644
--- a/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
+++ b/drivers/net/ethernet/intel/i40evf/i40e_txrx.c
@@ -2012,10 +2012,30 @@ bool __i40evf_chk_linearize(struct sk_buff *skb)
 	/* Walk through fragments adding latest fragment, testing it, and
 	 * then removing stale fragments from the sum.
 	 */
-	stale = &skb_shinfo(skb)->frags[0];
-	for (;;) {
+	for (stale = &skb_shinfo(skb)->frags[0];; stale++) {
+		int stale_size = skb_frag_size(stale);
+
 		sum += skb_frag_size(frag++);
 
+		/* The stale fragment may present us with a smaller
+		 * descriptor than the actual fragment size. To account
+		 * for that we need to remove all the data on the front and
+		 * figure out what the remainder would be in the last
+		 * descriptor associated with the fragment.
+		 */
+		if (stale_size > I40E_MAX_DATA_PER_TXD) {
+			int align_pad = -(stale->page_offset) &
+					(I40E_MAX_READ_REQ_SIZE - 1);
+
+			sum -= align_pad;
+			stale_size -= align_pad;
+
+			do {
+				sum -= I40E_MAX_DATA_PER_TXD_ALIGNED;
+				stale_size -= I40E_MAX_DATA_PER_TXD_ALIGNED;
+			} while (stale_size > I40E_MAX_DATA_PER_TXD);
+		}
+
 		/* if sum is negative we failed to make sufficient progress */
 		if (sum < 0)
 			return true;
@@ -2023,7 +2043,7 @@ bool __i40evf_chk_linearize(struct sk_buff *skb)
 		if (!nr_frags--)
 			break;
 
-		sum -= skb_frag_size(stale++);
+		sum -= stale_size;
 	}
 
 	return false;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en.h b/drivers/net/ethernet/mellanox/mlx5/core/en.h
index 543060c..c2d89bf 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en.h
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en.h
@@ -895,7 +895,7 @@ int mlx5e_vlan_rx_kill_vid(struct net_device *dev, __always_unused __be16 proto,
 			   u16 vid);
 void mlx5e_enable_cvlan_filter(struct mlx5e_priv *priv);
 void mlx5e_disable_cvlan_filter(struct mlx5e_priv *priv);
-void mlx5e_timestamp_set(struct mlx5e_priv *priv);
+void mlx5e_timestamp_init(struct mlx5e_priv *priv);
 
 struct mlx5e_redirect_rqt_param {
 	bool is_rss;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
index 9bcf38f..3d46ef4 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_dcbnl.c
@@ -922,8 +922,9 @@ static void mlx5e_dcbnl_query_dcbx_mode(struct mlx5e_priv *priv,
 
 static void mlx5e_ets_init(struct mlx5e_priv *priv)
 {
-	int i;
 	struct ieee_ets ets;
+	int err;
+	int i;
 
 	if (!MLX5_CAP_GEN(priv->mdev, ets))
 		return;
@@ -936,11 +937,16 @@ static void mlx5e_ets_init(struct mlx5e_priv *priv)
 		ets.prio_tc[i] = i;
 	}
 
-	/* tclass[prio=0]=1, tclass[prio=1]=0, tclass[prio=i]=i (for i>1) */
-	ets.prio_tc[0] = 1;
-	ets.prio_tc[1] = 0;
+	if (ets.ets_cap > 1) {
+		/* tclass[prio=0]=1, tclass[prio=1]=0, tclass[prio=i]=i (for i>1) */
+		ets.prio_tc[0] = 1;
+		ets.prio_tc[1] = 0;
+	}
 
-	mlx5e_dcbnl_ieee_setets_core(priv, &ets);
+	err = mlx5e_dcbnl_ieee_setets_core(priv, &ets);
+	if (err)
+		netdev_err(priv->netdev,
+			   "%s, Failed to init ETS: %d\n", __func__, err);
 }
 
 enum {
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
index 8f05efa..ea5fff2 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_ethtool.c
@@ -207,8 +207,7 @@ void mlx5e_ethtool_get_ethtool_stats(struct mlx5e_priv *priv,
 		return;
 
 	mutex_lock(&priv->state_lock);
-	if (test_bit(MLX5E_STATE_OPENED, &priv->state))
-		mlx5e_update_stats(priv, true);
+	mlx5e_update_stats(priv, true);
 	mutex_unlock(&priv->state_lock);
 
 	for (i = 0; i < mlx5e_num_stats_grps; i++)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
index d9d8227..d8aefee 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c
@@ -2669,7 +2669,7 @@ void mlx5e_switch_priv_channels(struct mlx5e_priv *priv,
 		netif_carrier_on(netdev);
 }
 
-void mlx5e_timestamp_set(struct mlx5e_priv *priv)
+void mlx5e_timestamp_init(struct mlx5e_priv *priv)
 {
 	priv->tstamp.tx_type   = HWTSTAMP_TX_OFF;
 	priv->tstamp.rx_filter = HWTSTAMP_FILTER_NONE;
@@ -2690,7 +2690,6 @@ int mlx5e_open_locked(struct net_device *netdev)
 	mlx5e_activate_priv_channels(priv);
 	if (priv->profile->update_carrier)
 		priv->profile->update_carrier(priv);
-	mlx5e_timestamp_set(priv);
 
 	if (priv->profile->update_stats)
 		queue_delayed_work(priv->wq, &priv->update_stats_work, 0);
@@ -3219,12 +3218,12 @@ static int mlx5e_set_mac(struct net_device *netdev, void *addr)
 	return 0;
 }
 
-#define MLX5E_SET_FEATURE(netdev, feature, enable)	\
+#define MLX5E_SET_FEATURE(features, feature, enable)	\
 	do {						\
 		if (enable)				\
-			netdev->features |= feature;	\
+			*features |= feature;		\
 		else					\
-			netdev->features &= ~feature;	\
+			*features &= ~feature;		\
 	} while (0)
 
 typedef int (*mlx5e_feature_handler)(struct net_device *netdev, bool enable);
@@ -3347,6 +3346,7 @@ static int set_feature_arfs(struct net_device *netdev, bool enable)
 #endif
 
 static int mlx5e_handle_feature(struct net_device *netdev,
+				netdev_features_t *features,
 				netdev_features_t wanted_features,
 				netdev_features_t feature,
 				mlx5e_feature_handler feature_handler)
@@ -3365,34 +3365,40 @@ static int mlx5e_handle_feature(struct net_device *netdev,
 		return err;
 	}
 
-	MLX5E_SET_FEATURE(netdev, feature, enable);
+	MLX5E_SET_FEATURE(features, feature, enable);
 	return 0;
 }
 
 static int mlx5e_set_features(struct net_device *netdev,
 			      netdev_features_t features)
 {
+	netdev_features_t oper_features = netdev->features;
 	int err;
 
-	err  = mlx5e_handle_feature(netdev, features, NETIF_F_LRO,
-				    set_feature_lro);
-	err |= mlx5e_handle_feature(netdev, features,
+	err  = mlx5e_handle_feature(netdev, &oper_features, features,
+				    NETIF_F_LRO, set_feature_lro);
+	err |= mlx5e_handle_feature(netdev, &oper_features, features,
 				    NETIF_F_HW_VLAN_CTAG_FILTER,
 				    set_feature_cvlan_filter);
-	err |= mlx5e_handle_feature(netdev, features, NETIF_F_HW_TC,
-				    set_feature_tc_num_filters);
-	err |= mlx5e_handle_feature(netdev, features, NETIF_F_RXALL,
-				    set_feature_rx_all);
-	err |= mlx5e_handle_feature(netdev, features, NETIF_F_RXFCS,
-				    set_feature_rx_fcs);
-	err |= mlx5e_handle_feature(netdev, features, NETIF_F_HW_VLAN_CTAG_RX,
-				    set_feature_rx_vlan);
+	err |= mlx5e_handle_feature(netdev, &oper_features, features,
+				    NETIF_F_HW_TC, set_feature_tc_num_filters);
+	err |= mlx5e_handle_feature(netdev, &oper_features, features,
+				    NETIF_F_RXALL, set_feature_rx_all);
+	err |= mlx5e_handle_feature(netdev, &oper_features, features,
+				    NETIF_F_RXFCS, set_feature_rx_fcs);
+	err |= mlx5e_handle_feature(netdev, &oper_features, features,
+				    NETIF_F_HW_VLAN_CTAG_RX, set_feature_rx_vlan);
 #ifdef CONFIG_RFS_ACCEL
-	err |= mlx5e_handle_feature(netdev, features, NETIF_F_NTUPLE,
-				    set_feature_arfs);
+	err |= mlx5e_handle_feature(netdev, &oper_features, features,
+				    NETIF_F_NTUPLE, set_feature_arfs);
 #endif
 
-	return err ? -EINVAL : 0;
+	if (err) {
+		netdev->features = oper_features;
+		return -EINVAL;
+	}
+
+	return 0;
 }
 
 static netdev_features_t mlx5e_fix_features(struct net_device *netdev,
@@ -4139,6 +4145,8 @@ static void mlx5e_build_nic_netdev_priv(struct mlx5_core_dev *mdev,
 	INIT_WORK(&priv->set_rx_mode_work, mlx5e_set_rx_mode_work);
 	INIT_WORK(&priv->tx_timeout_work, mlx5e_tx_timeout_work);
 	INIT_DELAYED_WORK(&priv->update_stats_work, mlx5e_update_stats_work);
+
+	mlx5e_timestamp_init(priv);
 }
 
 static void mlx5e_set_netdev_dev_addr(struct net_device *netdev)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
index 2c43606..3409d86 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c
@@ -877,6 +877,8 @@ static void mlx5e_init_rep(struct mlx5_core_dev *mdev,
 
 	mlx5e_build_rep_params(mdev, &priv->channels.params);
 	mlx5e_build_rep_netdev(netdev);
+
+	mlx5e_timestamp_init(priv);
 }
 
 static int mlx5e_init_rep_rx(struct mlx5e_priv *priv)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c
index e401d9d..b69a705 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c
@@ -201,9 +201,15 @@ static int mlx5e_am_stats_compare(struct mlx5e_rx_am_stats *curr,
 		return (curr->bpms > prev->bpms) ? MLX5E_AM_STATS_BETTER :
 						   MLX5E_AM_STATS_WORSE;
 
+	if (!prev->ppms)
+		return curr->ppms ? MLX5E_AM_STATS_BETTER :
+				    MLX5E_AM_STATS_SAME;
+
 	if (IS_SIGNIFICANT_DIFF(curr->ppms, prev->ppms))
 		return (curr->ppms > prev->ppms) ? MLX5E_AM_STATS_BETTER :
 						   MLX5E_AM_STATS_WORSE;
+	if (!prev->epms)
+		return MLX5E_AM_STATS_SAME;
 
 	if (IS_SIGNIFICANT_DIFF(curr->epms, prev->epms))
 		return (curr->epms < prev->epms) ? MLX5E_AM_STATS_BETTER :
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
index 1f1f8af8..5a46082 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -238,15 +238,19 @@ static int mlx5e_test_loopback_setup(struct mlx5e_priv *priv,
 	int err = 0;
 
 	/* Temporarily enable local_lb */
-	if (MLX5_CAP_GEN(priv->mdev, disable_local_lb)) {
-		mlx5_nic_vport_query_local_lb(priv->mdev, &lbtp->local_lb);
-		if (!lbtp->local_lb)
-			mlx5_nic_vport_update_local_lb(priv->mdev, true);
+	err = mlx5_nic_vport_query_local_lb(priv->mdev, &lbtp->local_lb);
+	if (err)
+		return err;
+
+	if (!lbtp->local_lb) {
+		err = mlx5_nic_vport_update_local_lb(priv->mdev, true);
+		if (err)
+			return err;
 	}
 
 	err = mlx5e_refresh_tirs(priv, true);
 	if (err)
-		return err;
+		goto out;
 
 	lbtp->loopback_ok = false;
 	init_completion(&lbtp->comp);
@@ -256,16 +260,21 @@ static int mlx5e_test_loopback_setup(struct mlx5e_priv *priv,
 	lbtp->pt.dev = priv->netdev;
 	lbtp->pt.af_packet_priv = lbtp;
 	dev_add_pack(&lbtp->pt);
+
+	return 0;
+
+out:
+	if (!lbtp->local_lb)
+		mlx5_nic_vport_update_local_lb(priv->mdev, false);
+
 	return err;
 }
 
 static void mlx5e_test_loopback_cleanup(struct mlx5e_priv *priv,
 					struct mlx5e_lbt_priv *lbtp)
 {
-	if (MLX5_CAP_GEN(priv->mdev, disable_local_lb)) {
-		if (!lbtp->local_lb)
-			mlx5_nic_vport_update_local_lb(priv->mdev, false);
-	}
+	if (!lbtp->local_lb)
+		mlx5_nic_vport_update_local_lb(priv->mdev, false);
 
 	dev_remove_pack(&lbtp->pt);
 	mlx5e_refresh_tirs(priv, false);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
index 8812d72..ee2f378 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/ipoib/ipoib.c
@@ -86,6 +86,8 @@ void mlx5i_init(struct mlx5_core_dev *mdev,
 	mlx5e_build_nic_params(mdev, &priv->channels.params, profile->max_nch(mdev));
 	mlx5i_build_nic_params(mdev, &priv->channels.params);
 
+	mlx5e_timestamp_init(priv);
+
 	/* netdev init */
 	netdev->hw_features    |= NETIF_F_SG;
 	netdev->hw_features    |= NETIF_F_IP_CSUM;
@@ -450,7 +452,6 @@ static int mlx5i_open(struct net_device *netdev)
 
 	mlx5e_refresh_tirs(epriv, false);
 	mlx5e_activate_priv_channels(epriv);
-	mlx5e_timestamp_set(epriv);
 
 	mutex_unlock(&epriv->state_lock);
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
index fa8aed6..5701f12 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lib/clock.c
@@ -423,9 +423,13 @@ void mlx5_pps_event(struct mlx5_core_dev *mdev,
 
 	switch (clock->ptp_info.pin_config[pin].func) {
 	case PTP_PF_EXTTS:
+		ptp_event.index = pin;
+		ptp_event.timestamp = timecounter_cyc2time(&clock->tc,
+					be64_to_cpu(eqe->data.pps.time_stamp));
 		if (clock->pps_info.enabled) {
 			ptp_event.type = PTP_CLOCK_PPSUSR;
-			ptp_event.pps_times.ts_real = ns_to_timespec64(eqe->data.pps.time_stamp);
+			ptp_event.pps_times.ts_real =
+					ns_to_timespec64(ptp_event.timestamp);
 		} else {
 			ptp_event.type = PTP_CLOCK_EXTTS;
 		}
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/main.c b/drivers/net/ethernet/mellanox/mlx5/core/main.c
index 8a89c7e..0f88fd3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -319,6 +319,7 @@ static int mlx5_alloc_irq_vectors(struct mlx5_core_dev *dev)
 	struct mlx5_eq_table *table = &priv->eq_table;
 	int num_eqs = 1 << MLX5_CAP_GEN(dev, log_max_eq);
 	int nvec;
+	int err;
 
 	nvec = MLX5_CAP_GEN(dev, num_ports) * num_online_cpus() +
 	       MLX5_EQ_VEC_COMP_BASE;
@@ -328,21 +329,23 @@ static int mlx5_alloc_irq_vectors(struct mlx5_core_dev *dev)
 
 	priv->irq_info = kcalloc(nvec, sizeof(*priv->irq_info), GFP_KERNEL);
 	if (!priv->irq_info)
-		goto err_free_msix;
+		return -ENOMEM;
 
 	nvec = pci_alloc_irq_vectors(dev->pdev,
 			MLX5_EQ_VEC_COMP_BASE + 1, nvec,
 			PCI_IRQ_MSIX);
-	if (nvec < 0)
-		return nvec;
+	if (nvec < 0) {
+		err = nvec;
+		goto err_free_irq_info;
+	}
 
 	table->num_comp_vectors = nvec - MLX5_EQ_VEC_COMP_BASE;
 
 	return 0;
 
-err_free_msix:
+err_free_irq_info:
 	kfree(priv->irq_info);
-	return -ENOMEM;
+	return err;
 }
 
 static void mlx5_free_irq_vectors(struct mlx5_core_dev *dev)
@@ -578,8 +581,7 @@ static int mlx5_core_set_hca_defaults(struct mlx5_core_dev *dev)
 	int ret = 0;
 
 	/* Disable local_lb by default */
-	if ((MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
-	    MLX5_CAP_GEN(dev, disable_local_lb))
+	if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH)
 		ret = mlx5_nic_vport_update_local_lb(dev, false);
 
 	return ret;
@@ -1121,9 +1123,12 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 		goto err_stop_poll;
 	}
 
-	if (boot && mlx5_init_once(dev, priv)) {
-		dev_err(&pdev->dev, "sw objs init failed\n");
-		goto err_stop_poll;
+	if (boot) {
+		err = mlx5_init_once(dev, priv);
+		if (err) {
+			dev_err(&pdev->dev, "sw objs init failed\n");
+			goto err_stop_poll;
+		}
 	}
 
 	err = mlx5_alloc_irq_vectors(dev);
@@ -1133,8 +1138,9 @@ static int mlx5_load_one(struct mlx5_core_dev *dev, struct mlx5_priv *priv,
 	}
 
 	dev->priv.uar = mlx5_get_uars_page(dev);
-	if (!dev->priv.uar) {
+	if (IS_ERR(dev->priv.uar)) {
 		dev_err(&pdev->dev, "Failed allocating uar, aborting\n");
+		err = PTR_ERR(dev->priv.uar);
 		goto err_disable_msix;
 	}
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/uar.c b/drivers/net/ethernet/mellanox/mlx5/core/uar.c
index 222b259..8b97066 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/uar.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/uar.c
@@ -168,18 +168,16 @@ struct mlx5_uars_page *mlx5_get_uars_page(struct mlx5_core_dev *mdev)
 	struct mlx5_uars_page *ret;
 
 	mutex_lock(&mdev->priv.bfregs.reg_head.lock);
-	if (list_empty(&mdev->priv.bfregs.reg_head.list)) {
-		ret = alloc_uars_page(mdev, false);
-		if (IS_ERR(ret)) {
-			ret = NULL;
-			goto out;
-		}
-		list_add(&ret->list, &mdev->priv.bfregs.reg_head.list);
-	} else {
+	if (!list_empty(&mdev->priv.bfregs.reg_head.list)) {
 		ret = list_first_entry(&mdev->priv.bfregs.reg_head.list,
 				       struct mlx5_uars_page, list);
 		kref_get(&ret->ref_count);
+		goto out;
 	}
+	ret = alloc_uars_page(mdev, false);
+	if (IS_ERR(ret))
+		goto out;
+	list_add(&ret->list, &mdev->priv.bfregs.reg_head.list);
 out:
 	mutex_unlock(&mdev->priv.bfregs.reg_head.lock);
 
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/vport.c b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
index d653b00..a1296a6 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -908,23 +908,33 @@ int mlx5_nic_vport_update_local_lb(struct mlx5_core_dev *mdev, bool enable)
 	void *in;
 	int err;
 
-	mlx5_core_dbg(mdev, "%s local_lb\n", enable ? "enable" : "disable");
+	if (!MLX5_CAP_GEN(mdev, disable_local_lb_mc) &&
+	    !MLX5_CAP_GEN(mdev, disable_local_lb_uc))
+		return 0;
+
 	in = kvzalloc(inlen, GFP_KERNEL);
 	if (!in)
 		return -ENOMEM;
 
 	MLX5_SET(modify_nic_vport_context_in, in,
-		 field_select.disable_mc_local_lb, 1);
-	MLX5_SET(modify_nic_vport_context_in, in,
 		 nic_vport_context.disable_mc_local_lb, !enable);
-
-	MLX5_SET(modify_nic_vport_context_in, in,
-		 field_select.disable_uc_local_lb, 1);
 	MLX5_SET(modify_nic_vport_context_in, in,
 		 nic_vport_context.disable_uc_local_lb, !enable);
 
+	if (MLX5_CAP_GEN(mdev, disable_local_lb_mc))
+		MLX5_SET(modify_nic_vport_context_in, in,
+			 field_select.disable_mc_local_lb, 1);
+
+	if (MLX5_CAP_GEN(mdev, disable_local_lb_uc))
+		MLX5_SET(modify_nic_vport_context_in, in,
+			 field_select.disable_uc_local_lb, 1);
+
 	err = mlx5_modify_nic_vport_context(mdev, in, inlen);
 
+	if (!err)
+		mlx5_core_dbg(mdev, "%s local_lb\n",
+			      enable ? "enable" : "disable");
+
 	kvfree(in);
 	return err;
 }
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci.c b/drivers/net/ethernet/mellanox/mlxsw/pci.c
index 23f7d82..6ef20e5 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci.c
@@ -1643,7 +1643,12 @@ static int mlxsw_pci_sw_reset(struct mlxsw_pci *mlxsw_pci,
 		return 0;
 	}
 
-	wmb(); /* reset needs to be written before we read control register */
+	/* Reset needs to be written before we read control register, and
+	 * we must wait for the HW to become responsive once again
+	 */
+	wmb();
+	msleep(MLXSW_PCI_SW_RESET_WAIT_MSECS);
+
 	end = jiffies + msecs_to_jiffies(MLXSW_PCI_SW_RESET_TIMEOUT_MSECS);
 	do {
 		u32 val = mlxsw_pci_read32(mlxsw_pci, FW_READY);
diff --git a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
index a644120..fb082ad2 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/pci_hw.h
@@ -59,6 +59,7 @@
 #define MLXSW_PCI_SW_RESET			0xF0010
 #define MLXSW_PCI_SW_RESET_RST_BIT		BIT(0)
 #define MLXSW_PCI_SW_RESET_TIMEOUT_MSECS	5000
+#define MLXSW_PCI_SW_RESET_WAIT_MSECS		100
 #define MLXSW_PCI_FW_READY			0xA1844
 #define MLXSW_PCI_FW_READY_MASK			0xFFFF
 #define MLXSW_PCI_FW_READY_MAGIC		0x5E
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
index 9bd8d28..c3837ca 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c
@@ -4376,7 +4376,10 @@ static int mlxsw_sp_netdevice_port_upper_event(struct net_device *lower_dev,
 		}
 		if (!info->linking)
 			break;
-		if (netdev_has_any_upper_dev(upper_dev)) {
+		if (netdev_has_any_upper_dev(upper_dev) &&
+		    (!netif_is_bridge_master(upper_dev) ||
+		     !mlxsw_sp_bridge_device_is_offloaded(mlxsw_sp,
+							  upper_dev))) {
 			NL_SET_ERR_MSG(extack,
 				       "spectrum: Enslaving a port to a device that already has an upper device is not supported");
 			return -EINVAL;
@@ -4504,6 +4507,7 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
 					      u16 vid)
 {
 	struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
+	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	struct netdev_notifier_changeupper_info *info = ptr;
 	struct netlink_ext_ack *extack;
 	struct net_device *upper_dev;
@@ -4520,7 +4524,10 @@ static int mlxsw_sp_netdevice_port_vlan_event(struct net_device *vlan_dev,
 		}
 		if (!info->linking)
 			break;
-		if (netdev_has_any_upper_dev(upper_dev)) {
+		if (netdev_has_any_upper_dev(upper_dev) &&
+		    (!netif_is_bridge_master(upper_dev) ||
+		     !mlxsw_sp_bridge_device_is_offloaded(mlxsw_sp,
+							  upper_dev))) {
 			NL_SET_ERR_MSG(extack, "spectrum: Enslaving a port to a device that already has an upper device is not supported");
 			return -EINVAL;
 		}
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
index 432ab9b..05ce1be 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h
@@ -365,6 +365,8 @@ int mlxsw_sp_port_bridge_join(struct mlxsw_sp_port *mlxsw_sp_port,
 void mlxsw_sp_port_bridge_leave(struct mlxsw_sp_port *mlxsw_sp_port,
 				struct net_device *brport_dev,
 				struct net_device *br_dev);
+bool mlxsw_sp_bridge_device_is_offloaded(const struct mlxsw_sp *mlxsw_sp,
+					 const struct net_device *br_dev);
 
 /* spectrum.c */
 int mlxsw_sp_port_ets_set(struct mlxsw_sp_port *mlxsw_sp_port,
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
index c33beac..b5397da 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_qdisc.c
@@ -46,7 +46,8 @@ mlxsw_sp_tclass_congestion_enable(struct mlxsw_sp_port *mlxsw_sp_port,
 				  int tclass_num, u32 min, u32 max,
 				  u32 probability, bool is_ecn)
 {
-	char cwtp_cmd[max_t(u8, MLXSW_REG_CWTP_LEN, MLXSW_REG_CWTPM_LEN)];
+	char cwtpm_cmd[MLXSW_REG_CWTPM_LEN];
+	char cwtp_cmd[MLXSW_REG_CWTP_LEN];
 	struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
 	int err;
 
@@ -60,10 +61,10 @@ mlxsw_sp_tclass_congestion_enable(struct mlxsw_sp_port *mlxsw_sp_port,
 	if (err)
 		return err;
 
-	mlxsw_reg_cwtpm_pack(cwtp_cmd, mlxsw_sp_port->local_port, tclass_num,
+	mlxsw_reg_cwtpm_pack(cwtpm_cmd, mlxsw_sp_port->local_port, tclass_num,
 			     MLXSW_REG_CWTP_DEFAULT_PROFILE, true, is_ecn);
 
-	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(cwtpm), cwtp_cmd);
+	return mlxsw_reg_write(mlxsw_sp->core, MLXSW_REG(cwtpm), cwtpm_cmd);
 }
 
 static int
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
index be657b8..7042c85 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -821,13 +821,18 @@ static int mlxsw_sp_vr_lpm_tree_replace(struct mlxsw_sp *mlxsw_sp,
 	struct mlxsw_sp_lpm_tree *old_tree = fib->lpm_tree;
 	int err;
 
-	err = mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, fib, new_tree->id);
-	if (err)
-		return err;
 	fib->lpm_tree = new_tree;
 	mlxsw_sp_lpm_tree_hold(new_tree);
+	err = mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, fib, new_tree->id);
+	if (err)
+		goto err_tree_bind;
 	mlxsw_sp_lpm_tree_put(mlxsw_sp, old_tree);
 	return 0;
+
+err_tree_bind:
+	mlxsw_sp_lpm_tree_put(mlxsw_sp, new_tree);
+	fib->lpm_tree = old_tree;
+	return err;
 }
 
 static int mlxsw_sp_vrs_lpm_tree_replace(struct mlxsw_sp *mlxsw_sp,
@@ -868,11 +873,14 @@ static int mlxsw_sp_vrs_lpm_tree_replace(struct mlxsw_sp *mlxsw_sp,
 	return err;
 
 no_replace:
-	err = mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, fib, new_tree->id);
-	if (err)
-		return err;
 	fib->lpm_tree = new_tree;
 	mlxsw_sp_lpm_tree_hold(new_tree);
+	err = mlxsw_sp_vr_lpm_tree_bind(mlxsw_sp, fib, new_tree->id);
+	if (err) {
+		mlxsw_sp_lpm_tree_put(mlxsw_sp, new_tree);
+		fib->lpm_tree = NULL;
+		return err;
+	}
 	return 0;
 }
 
@@ -1934,11 +1942,8 @@ static void mlxsw_sp_router_neigh_ent_ipv4_process(struct mlxsw_sp *mlxsw_sp,
 	dipn = htonl(dip);
 	dev = mlxsw_sp->router->rifs[rif]->dev;
 	n = neigh_lookup(&arp_tbl, &dipn, dev);
-	if (!n) {
-		netdev_err(dev, "Failed to find matching neighbour for IP=%pI4h\n",
-			   &dip);
+	if (!n)
 		return;
-	}
 
 	netdev_dbg(dev, "Updating neighbour with IP=%pI4h\n", &dip);
 	neigh_event_send(n, NULL);
@@ -1965,11 +1970,8 @@ static void mlxsw_sp_router_neigh_ent_ipv6_process(struct mlxsw_sp *mlxsw_sp,
 
 	dev = mlxsw_sp->router->rifs[rif]->dev;
 	n = neigh_lookup(&nd_tbl, &dip, dev);
-	if (!n) {
-		netdev_err(dev, "Failed to find matching neighbour for IP=%pI6c\n",
-			   &dip);
+	if (!n)
 		return;
-	}
 
 	netdev_dbg(dev, "Updating neighbour with IP=%pI6c\n", &dip);
 	neigh_event_send(n, NULL);
@@ -3228,7 +3230,7 @@ static void __mlxsw_sp_nexthop_neigh_update(struct mlxsw_sp_nexthop *nh,
 {
 	if (!removing)
 		nh->should_offload = 1;
-	else if (nh->offloaded)
+	else
 		nh->should_offload = 0;
 	nh->update = 1;
 }
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 7b8548e..593ad31 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -152,6 +152,12 @@ mlxsw_sp_bridge_device_find(const struct mlxsw_sp_bridge *bridge,
 	return NULL;
 }
 
+bool mlxsw_sp_bridge_device_is_offloaded(const struct mlxsw_sp *mlxsw_sp,
+					 const struct net_device *br_dev)
+{
+	return !!mlxsw_sp_bridge_device_find(mlxsw_sp->bridge, br_dev);
+}
+
 static struct mlxsw_sp_bridge_device *
 mlxsw_sp_bridge_device_create(struct mlxsw_sp_bridge *bridge,
 			      struct net_device *br_dev)
diff --git a/drivers/net/ethernet/natsemi/macsonic.c b/drivers/net/ethernet/natsemi/macsonic.c
index a42433f..b922ab5 100644
--- a/drivers/net/ethernet/natsemi/macsonic.c
+++ b/drivers/net/ethernet/natsemi/macsonic.c
@@ -311,7 +311,7 @@ static int mac_onboard_sonic_probe(struct net_device *dev)
 {
 	struct sonic_local* lp = netdev_priv(dev);
 	int sr;
-	int commslot = 0;
+	bool commslot = macintosh_config->expansion_type == MAC_EXP_PDS_COMM;
 
 	if (!MACH_IS_MAC)
 		return -ENODEV;
@@ -322,10 +322,7 @@ static int mac_onboard_sonic_probe(struct net_device *dev)
 	   Ethernet (BTW, the Ethernet *is* always at the same
 	   address, and nothing else lives there, at least if Apple's
 	   documentation is to be believed) */
-	if (macintosh_config->ident == MAC_MODEL_Q630 ||
-	    macintosh_config->ident == MAC_MODEL_P588 ||
-	    macintosh_config->ident == MAC_MODEL_P575 ||
-	    macintosh_config->ident == MAC_MODEL_C610) {
+	if (commslot || macintosh_config->ident == MAC_MODEL_C610) {
 		int card_present;
 
 		card_present = hwreg_present((void*)ONBOARD_SONIC_REGISTERS);
@@ -333,7 +330,6 @@ static int mac_onboard_sonic_probe(struct net_device *dev)
 			printk("none.\n");
 			return -ENODEV;
 		}
-		commslot = 1;
 	}
 
 	printk("yes\n");
@@ -428,26 +424,26 @@ static int mac_nubus_sonic_ethernet_addr(struct net_device *dev,
 	return 0;
 }
 
-static int macsonic_ident(struct nubus_dev *ndev)
+static int macsonic_ident(struct nubus_rsrc *fres)
 {
-	if (ndev->dr_hw == NUBUS_DRHW_ASANTE_LC &&
-	    ndev->dr_sw == NUBUS_DRSW_SONIC_LC)
+	if (fres->dr_hw == NUBUS_DRHW_ASANTE_LC &&
+	    fres->dr_sw == NUBUS_DRSW_SONIC_LC)
 		return MACSONIC_DAYNALINK;
-	if (ndev->dr_hw == NUBUS_DRHW_SONIC &&
-	    ndev->dr_sw == NUBUS_DRSW_APPLE) {
+	if (fres->dr_hw == NUBUS_DRHW_SONIC &&
+	    fres->dr_sw == NUBUS_DRSW_APPLE) {
 		/* There has to be a better way to do this... */
-		if (strstr(ndev->board->name, "DuoDock"))
+		if (strstr(fres->board->name, "DuoDock"))
 			return MACSONIC_DUODOCK;
 		else
 			return MACSONIC_APPLE;
 	}
 
-	if (ndev->dr_hw == NUBUS_DRHW_SMC9194 &&
-	    ndev->dr_sw == NUBUS_DRSW_DAYNA)
+	if (fres->dr_hw == NUBUS_DRHW_SMC9194 &&
+	    fres->dr_sw == NUBUS_DRSW_DAYNA)
 		return MACSONIC_DAYNA;
 
-	if (ndev->dr_hw == NUBUS_DRHW_APPLE_SONIC_LC &&
-	    ndev->dr_sw == 0) { /* huh? */
+	if (fres->dr_hw == NUBUS_DRHW_APPLE_SONIC_LC &&
+	    fres->dr_sw == 0) { /* huh? */
 		return MACSONIC_APPLE16;
 	}
 	return -1;
@@ -456,7 +452,7 @@ static int macsonic_ident(struct nubus_dev *ndev)
 static int mac_nubus_sonic_probe(struct net_device *dev)
 {
 	static int slots;
-	struct nubus_dev* ndev = NULL;
+	struct nubus_rsrc *ndev = NULL;
 	struct sonic_local* lp = netdev_priv(dev);
 	unsigned long base_addr, prom_addr;
 	u16 sonic_dcr;
@@ -464,9 +460,11 @@ static int mac_nubus_sonic_probe(struct net_device *dev)
 	int reg_offset, dma_bitmode;
 
 	/* Find the first SONIC that hasn't been initialized already */
-	while ((ndev = nubus_find_type(NUBUS_CAT_NETWORK,
-				       NUBUS_TYPE_ETHERNET, ndev)) != NULL)
-	{
+	for_each_func_rsrc(ndev) {
+		if (ndev->category != NUBUS_CAT_NETWORK ||
+		    ndev->type != NUBUS_TYPE_ETHERNET)
+			continue;
+
 		/* Have we seen it already? */
 		if (slots & (1<<ndev->board->slot))
 			continue;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index 1a603fd..99b0487 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -568,6 +568,7 @@ nfp_net_aux_irq_request(struct nfp_net *nn, u32 ctrl_offset,
 		return err;
 	}
 	nn_writeb(nn, ctrl_offset, entry->entry);
+	nfp_net_irq_unmask(nn, entry->entry);
 
 	return 0;
 }
@@ -582,6 +583,7 @@ static void nfp_net_aux_irq_free(struct nfp_net *nn, u32 ctrl_offset,
 				 unsigned int vector_idx)
 {
 	nn_writeb(nn, ctrl_offset, 0xff);
+	nn_pci_flush(nn);
 	free_irq(nn->irq_entries[vector_idx].vector, nn);
 }
 
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
index 2801ecd..6c02b2d 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -333,7 +333,7 @@ nfp_net_get_link_ksettings(struct net_device *netdev,
 	    ls >= ARRAY_SIZE(ls_to_ethtool))
 		return 0;
 
-	cmd->base.speed = ls_to_ethtool[sts];
+	cmd->base.speed = ls_to_ethtool[ls];
 	cmd->base.duplex = DUPLEX_FULL;
 
 	return 0;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_rdma.c b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
index c8c4b39..b7abb82 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_rdma.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_rdma.c
@@ -358,10 +358,27 @@ static void qed_rdma_resc_free(struct qed_hwfn *p_hwfn)
 	kfree(p_rdma_info);
 }
 
+static void qed_rdma_free_tid(void *rdma_cxt, u32 itid)
+{
+        struct qed_hwfn *p_hwfn = (struct qed_hwfn *)rdma_cxt;
+
+        DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "itid = %08x\n", itid);
+
+        spin_lock_bh(&p_hwfn->p_rdma_info->lock);
+        qed_bmap_release_id(p_hwfn, &p_hwfn->p_rdma_info->tid_map, itid);
+        spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
+}
+
+static void qed_rdma_free_reserved_lkey(struct qed_hwfn *p_hwfn)
+{
+	qed_rdma_free_tid(p_hwfn, p_hwfn->p_rdma_info->dev->reserved_lkey);
+}
+
 static void qed_rdma_free(struct qed_hwfn *p_hwfn)
 {
 	DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "Freeing RDMA\n");
 
+	qed_rdma_free_reserved_lkey(p_hwfn);
 	qed_rdma_resc_free(p_hwfn);
 }
 
@@ -615,9 +632,6 @@ static int qed_rdma_reserve_lkey(struct qed_hwfn *p_hwfn)
 {
 	struct qed_rdma_device *dev = p_hwfn->p_rdma_info->dev;
 
-	/* The first DPI is reserved for the Kernel */
-	__set_bit(0, p_hwfn->p_rdma_info->dpi_map.bitmap);
-
 	/* Tid 0 will be used as the key for "reserved MR".
 	 * The driver should allocate memory for it so it can be loaded but no
 	 * ramrod should be passed on it.
@@ -797,17 +811,6 @@ static struct qed_rdma_device *qed_rdma_query_device(void *rdma_cxt)
 	return p_hwfn->p_rdma_info->dev;
 }
 
-static void qed_rdma_free_tid(void *rdma_cxt, u32 itid)
-{
-	struct qed_hwfn *p_hwfn = (struct qed_hwfn *)rdma_cxt;
-
-	DP_VERBOSE(p_hwfn, QED_MSG_RDMA, "itid = %08x\n", itid);
-
-	spin_lock_bh(&p_hwfn->p_rdma_info->lock);
-	qed_bmap_release_id(p_hwfn, &p_hwfn->p_rdma_info->tid_map, itid);
-	spin_unlock_bh(&p_hwfn->p_rdma_info->lock);
-}
-
 static void qed_rdma_cnq_prod_update(void *rdma_cxt, u8 qz_offset, u16 prod)
 {
 	struct qed_hwfn *p_hwfn;
diff --git a/drivers/net/ethernet/qlogic/qed/qed_spq.c b/drivers/net/ethernet/qlogic/qed/qed_spq.c
index be48d9a..cd9a029 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_spq.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_spq.c
@@ -97,9 +97,7 @@ static int __qed_spq_block(struct qed_hwfn *p_hwfn,
 
 	while (iter_cnt--) {
 		/* Validate we receive completion update */
-		if (READ_ONCE(comp_done->done) == 1) {
-			/* Read updated FW return value */
-			smp_read_barrier_depends();
+		if (smp_load_acquire(&comp_done->done) == 1) { /* ^^^ */
 			if (p_fw_ret)
 				*p_fw_ret = comp_done->fw_return_code;
 			return 0;
@@ -776,6 +774,7 @@ int qed_spq_post(struct qed_hwfn *p_hwfn,
 	int rc = 0;
 	struct qed_spq *p_spq = p_hwfn ? p_hwfn->p_spq : NULL;
 	bool b_ret_ent = true;
+	bool eblock;
 
 	if (!p_hwfn)
 		return -EINVAL;
@@ -794,6 +793,11 @@ int qed_spq_post(struct qed_hwfn *p_hwfn,
 	if (rc)
 		goto spq_post_fail;
 
+	/* Check if entry is in block mode before qed_spq_add_entry,
+	 * which might kfree p_ent.
+	 */
+	eblock = (p_ent->comp_mode == QED_SPQ_MODE_EBLOCK);
+
 	/* Add the request to the pending queue */
 	rc = qed_spq_add_entry(p_hwfn, p_ent, p_ent->priority);
 	if (rc)
@@ -811,7 +815,7 @@ int qed_spq_post(struct qed_hwfn *p_hwfn,
 
 	spin_unlock_bh(&p_spq->lock);
 
-	if (p_ent->comp_mode == QED_SPQ_MODE_EBLOCK) {
+	if (eblock) {
 		/* For entries in QED BLOCK mode, the completion code cannot
 		 * perform the necessary cleanup - if it did, we couldn't
 		 * access p_ent here to see whether it's successful or not.
diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index fc0d5fa..734286e 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -2244,19 +2244,14 @@ static bool rtl8169_do_counters(struct net_device *dev, u32 counter_cmd)
 	void __iomem *ioaddr = tp->mmio_addr;
 	dma_addr_t paddr = tp->counters_phys_addr;
 	u32 cmd;
-	bool ret;
 
 	RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
+	RTL_R32(CounterAddrHigh);
 	cmd = (u64)paddr & DMA_BIT_MASK(32);
 	RTL_W32(CounterAddrLow, cmd);
 	RTL_W32(CounterAddrLow, cmd | counter_cmd);
 
-	ret = rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
-
-	RTL_W32(CounterAddrLow, 0);
-	RTL_W32(CounterAddrHigh, 0);
-
-	return ret;
+	return rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
 }
 
 static bool rtl8169_reset_counters(struct net_device *dev)
diff --git a/drivers/net/ethernet/renesas/sh_eth.c b/drivers/net/ethernet/renesas/sh_eth.c
index 7532300..53924a4 100644
--- a/drivers/net/ethernet/renesas/sh_eth.c
+++ b/drivers/net/ethernet/renesas/sh_eth.c
@@ -147,7 +147,7 @@ static const u16 sh_eth_offset_gigabit[SH_ETH_MAX_REGISTER_OFFSET] = {
 	[FWNLCR0]	= 0x0090,
 	[FWALCR0]	= 0x0094,
 	[TXNLCR1]	= 0x00a0,
-	[TXALCR1]	= 0x00a0,
+	[TXALCR1]	= 0x00a4,
 	[RXNLCR1]	= 0x00a8,
 	[RXALCR1]	= 0x00ac,
 	[FWNLCR1]	= 0x00b0,
@@ -399,7 +399,7 @@ static const u16 sh_eth_offset_fast_sh3_sh2[SH_ETH_MAX_REGISTER_OFFSET] = {
 	[FWNLCR0]	= 0x0090,
 	[FWALCR0]	= 0x0094,
 	[TXNLCR1]	= 0x00a0,
-	[TXALCR1]	= 0x00a0,
+	[TXALCR1]	= 0x00a4,
 	[RXNLCR1]	= 0x00a8,
 	[RXALCR1]	= 0x00ac,
 	[FWNLCR1]	= 0x00b0,
@@ -2089,8 +2089,8 @@ static size_t __sh_eth_get_regs(struct net_device *ndev, u32 *buf)
 		add_reg(CSMR);
 	if (cd->select_mii)
 		add_reg(RMII_MII);
-	add_reg(ARSTR);
 	if (cd->tsu) {
+		add_tsu_reg(ARSTR);
 		add_tsu_reg(TSU_CTRST);
 		add_tsu_reg(TSU_FWEN0);
 		add_tsu_reg(TSU_FWEN1);
@@ -3225,18 +3225,37 @@ static int sh_eth_drv_probe(struct platform_device *pdev)
 	/* ioremap the TSU registers */
 	if (mdp->cd->tsu) {
 		struct resource *rtsu;
+
 		rtsu = platform_get_resource(pdev, IORESOURCE_MEM, 1);
-		mdp->tsu_addr = devm_ioremap_resource(&pdev->dev, rtsu);
-		if (IS_ERR(mdp->tsu_addr)) {
-			ret = PTR_ERR(mdp->tsu_addr);
+		if (!rtsu) {
+			dev_err(&pdev->dev, "no TSU resource\n");
+			ret = -ENODEV;
+			goto out_release;
+		}
+		/* We can only request the  TSU region  for the first port
+		 * of the two  sharing this TSU for the probe to succeed...
+		 */
+		if (devno % 2 == 0 &&
+		    !devm_request_mem_region(&pdev->dev, rtsu->start,
+					     resource_size(rtsu),
+					     dev_name(&pdev->dev))) {
+			dev_err(&pdev->dev, "can't request TSU resource.\n");
+			ret = -EBUSY;
+			goto out_release;
+		}
+		mdp->tsu_addr = devm_ioremap(&pdev->dev, rtsu->start,
+					     resource_size(rtsu));
+		if (!mdp->tsu_addr) {
+			dev_err(&pdev->dev, "TSU region ioremap() failed.\n");
+			ret = -ENOMEM;
 			goto out_release;
 		}
 		mdp->port = devno % 2;
 		ndev->features = NETIF_F_HW_VLAN_CTAG_FILTER;
 	}
 
-	/* initialize first or needed device */
-	if (!devno || pd->needs_init) {
+	/* Need to init only the first port of the two sharing a TSU */
+	if (devno % 2 == 0) {
 		if (mdp->cd->chip_reset)
 			mdp->cd->chip_reset(ndev);
 
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 337d53d..c0af0bc 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -364,9 +364,15 @@ static void stmmac_eee_ctrl_timer(struct timer_list *t)
 bool stmmac_eee_init(struct stmmac_priv *priv)
 {
 	struct net_device *ndev = priv->dev;
+	int interface = priv->plat->interface;
 	unsigned long flags;
 	bool ret = false;
 
+	if ((interface != PHY_INTERFACE_MODE_MII) &&
+	    (interface != PHY_INTERFACE_MODE_GMII) &&
+	    !phy_interface_mode_is_rgmii(interface))
+		goto out;
+
 	/* Using PCS we cannot dial with the phy registers at this stage
 	 * so we do not support extra feature like EEE.
 	 */
diff --git a/drivers/net/ethernet/ti/netcp_core.c b/drivers/net/ethernet/ti/netcp_core.c
index ed58c74..f5a7eb2 100644
--- a/drivers/net/ethernet/ti/netcp_core.c
+++ b/drivers/net/ethernet/ti/netcp_core.c
@@ -715,7 +715,7 @@ static int netcp_process_one_rx_packet(struct netcp_intf *netcp)
 		/* warning!!!! We are retrieving the virtual ptr in the sw_data
 		 * field as a 32bit value. Will not work on 64bit machines
 		 */
-		page = (struct page *)GET_SW_DATA0(desc);
+		page = (struct page *)GET_SW_DATA0(ndesc);
 
 		if (likely(dma_buff && buf_len && page)) {
 			dma_unmap_page(netcp->dev, dma_buff, PAGE_SIZE,
diff --git a/drivers/net/geneve.c b/drivers/net/geneve.c
index b718a02..64fda2e 100644
--- a/drivers/net/geneve.c
+++ b/drivers/net/geneve.c
@@ -825,6 +825,13 @@ static int geneve_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 	if (IS_ERR(rt))
 		return PTR_ERR(rt);
 
+	if (skb_dst(skb)) {
+		int mtu = dst_mtu(&rt->dst) - sizeof(struct iphdr) -
+			  GENEVE_BASE_HLEN - info->options_len - 14;
+
+		skb_dst_update_pmtu(skb, mtu);
+	}
+
 	sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
 	if (geneve->collect_md) {
 		tos = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
@@ -864,6 +871,13 @@ static int geneve6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
 	if (IS_ERR(dst))
 		return PTR_ERR(dst);
 
+	if (skb_dst(skb)) {
+		int mtu = dst_mtu(dst) - sizeof(struct ipv6hdr) -
+			  GENEVE_BASE_HLEN - info->options_len - 14;
+
+		skb_dst_update_pmtu(skb, mtu);
+	}
+
 	sport = udp_flow_src_port(geneve->net, skb, 1, USHRT_MAX, true);
 	if (geneve->collect_md) {
 		prio = ip_tunnel_ecn_encap(key->tos, ip_hdr(skb), skb);
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c
index a178c5e..a0f2be8 100644
--- a/drivers/net/macvlan.c
+++ b/drivers/net/macvlan.c
@@ -1444,9 +1444,14 @@ int macvlan_common_newlink(struct net *src_net, struct net_device *dev,
 	return 0;
 
 unregister_netdev:
+	/* macvlan_uninit would free the macvlan port */
 	unregister_netdevice(dev);
+	return err;
 destroy_macvlan_port:
-	if (create)
+	/* the macvlan port may be freed by macvlan_uninit when fail to register.
+	 * so we destroy the macvlan port only when it's valid.
+	 */
+	if (create && macvlan_port_get_rtnl(dev))
 		macvlan_port_destroy(port->dev);
 	return err;
 }
diff --git a/drivers/net/phy/mdio-sun4i.c b/drivers/net/phy/mdio-sun4i.c
index 1352965..6425ce0 100644
--- a/drivers/net/phy/mdio-sun4i.c
+++ b/drivers/net/phy/mdio-sun4i.c
@@ -118,8 +118,10 @@ static int sun4i_mdio_probe(struct platform_device *pdev)
 
 	data->regulator = devm_regulator_get(&pdev->dev, "phy");
 	if (IS_ERR(data->regulator)) {
-		if (PTR_ERR(data->regulator) == -EPROBE_DEFER)
-			return -EPROBE_DEFER;
+		if (PTR_ERR(data->regulator) == -EPROBE_DEFER) {
+			ret = -EPROBE_DEFER;
+			goto err_out_free_mdiobus;
+		}
 
 		dev_info(&pdev->dev, "no regulator found\n");
 		data->regulator = NULL;
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c
index 827f3f9..249ce5c 100644
--- a/drivers/net/phy/phylink.c
+++ b/drivers/net/phy/phylink.c
@@ -1296,6 +1296,7 @@ int phylink_mii_ioctl(struct phylink *pl, struct ifreq *ifr, int cmd)
 		switch (cmd) {
 		case SIOCGMIIPHY:
 			mii->phy_id = pl->phydev->mdio.addr;
+			/* fall through */
 
 		case SIOCGMIIREG:
 			ret = phylink_phy_read(pl, mii->phy_id, mii->reg_num);
@@ -1318,6 +1319,7 @@ int phylink_mii_ioctl(struct phylink *pl, struct ifreq *ifr, int cmd)
 		switch (cmd) {
 		case SIOCGMIIPHY:
 			mii->phy_id = 0;
+			/* fall through */
 
 		case SIOCGMIIREG:
 			ret = phylink_mii_read(pl, mii->phy_id, mii->reg_num);
@@ -1429,9 +1431,8 @@ static void phylink_sfp_link_down(void *upstream)
 	WARN_ON(!lockdep_rtnl_is_held());
 
 	set_bit(PHYLINK_DISABLE_LINK, &pl->phylink_disable_state);
+	queue_work(system_power_efficient_wq, &pl->resolve);
 	flush_work(&pl->resolve);
-
-	netif_carrier_off(pl->netdev);
 }
 
 static void phylink_sfp_link_up(void *upstream)
diff --git a/drivers/net/phy/sfp-bus.c b/drivers/net/phy/sfp-bus.c
index 8a1b1f4..ab64a14 100644
--- a/drivers/net/phy/sfp-bus.c
+++ b/drivers/net/phy/sfp-bus.c
@@ -356,7 +356,8 @@ EXPORT_SYMBOL_GPL(sfp_register_upstream);
 void sfp_unregister_upstream(struct sfp_bus *bus)
 {
 	rtnl_lock();
-	sfp_unregister_bus(bus);
+	if (bus->sfp)
+		sfp_unregister_bus(bus);
 	bus->upstream = NULL;
 	bus->netdev = NULL;
 	rtnl_unlock();
@@ -459,7 +460,8 @@ EXPORT_SYMBOL_GPL(sfp_register_socket);
 void sfp_unregister_socket(struct sfp_bus *bus)
 {
 	rtnl_lock();
-	sfp_unregister_bus(bus);
+	if (bus->netdev)
+		sfp_unregister_bus(bus);
 	bus->sfp_dev = NULL;
 	bus->sfp = NULL;
 	bus->socket_ops = NULL;
diff --git a/drivers/net/ppp/ppp_generic.c b/drivers/net/ppp/ppp_generic.c
index d8e5747..264d4af 100644
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -1006,17 +1006,18 @@ static int ppp_unit_register(struct ppp *ppp, int unit, bool ifname_is_set)
 	if (!ifname_is_set)
 		snprintf(ppp->dev->name, IFNAMSIZ, "ppp%i", ppp->file.index);
 
+	mutex_unlock(&pn->all_ppp_mutex);
+
 	ret = register_netdevice(ppp->dev);
 	if (ret < 0)
 		goto err_unit;
 
 	atomic_inc(&ppp_unit_count);
 
-	mutex_unlock(&pn->all_ppp_mutex);
-
 	return 0;
 
 err_unit:
+	mutex_lock(&pn->all_ppp_mutex);
 	unit_put(&pn->units_idr, ppp->file.index);
 err:
 	mutex_unlock(&pn->all_ppp_mutex);
diff --git a/drivers/net/ppp/pppoe.c b/drivers/net/ppp/pppoe.c
index 4e1da16..5aa59f4 100644
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -842,6 +842,7 @@ static int pppoe_sendmsg(struct socket *sock, struct msghdr *m,
 	struct pppoe_hdr *ph;
 	struct net_device *dev;
 	char *start;
+	int hlen;
 
 	lock_sock(sk);
 	if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED)) {
@@ -860,16 +861,16 @@ static int pppoe_sendmsg(struct socket *sock, struct msghdr *m,
 	if (total_len > (dev->mtu + dev->hard_header_len))
 		goto end;
 
-
-	skb = sock_wmalloc(sk, total_len + dev->hard_header_len + 32,
-			   0, GFP_KERNEL);
+	hlen = LL_RESERVED_SPACE(dev);
+	skb = sock_wmalloc(sk, hlen + sizeof(*ph) + total_len +
+			   dev->needed_tailroom, 0, GFP_KERNEL);
 	if (!skb) {
 		error = -ENOMEM;
 		goto end;
 	}
 
 	/* Reserve space for headers. */
-	skb_reserve(skb, dev->hard_header_len);
+	skb_reserve(skb, hlen);
 	skb_reset_network_header(skb);
 
 	skb->dev = dev;
@@ -930,7 +931,7 @@ static int __pppoe_xmit(struct sock *sk, struct sk_buff *skb)
 	/* Copy the data if there is no space for the header or if it's
 	 * read-only.
 	 */
-	if (skb_cow_head(skb, sizeof(*ph) + dev->hard_header_len))
+	if (skb_cow_head(skb, LL_RESERVED_SPACE(dev) + sizeof(*ph)))
 		goto abort;
 
 	__skb_push(skb, sizeof(*ph));
diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 4f4a842..a8ec589 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -611,6 +611,14 @@ static void tun_queue_purge(struct tun_file *tfile)
 	skb_queue_purge(&tfile->sk.sk_error_queue);
 }
 
+static void tun_cleanup_tx_array(struct tun_file *tfile)
+{
+	if (tfile->tx_array.ring.queue) {
+		skb_array_cleanup(&tfile->tx_array);
+		memset(&tfile->tx_array, 0, sizeof(tfile->tx_array));
+	}
+}
+
 static void __tun_detach(struct tun_file *tfile, bool clean)
 {
 	struct tun_file *ntfile;
@@ -657,8 +665,7 @@ static void __tun_detach(struct tun_file *tfile, bool clean)
 			    tun->dev->reg_state == NETREG_REGISTERED)
 				unregister_netdevice(tun->dev);
 		}
-		if (tun)
-			skb_array_cleanup(&tfile->tx_array);
+		tun_cleanup_tx_array(tfile);
 		sock_put(&tfile->sk);
 	}
 }
@@ -700,11 +707,13 @@ static void tun_detach_all(struct net_device *dev)
 		/* Drop read queue */
 		tun_queue_purge(tfile);
 		sock_put(&tfile->sk);
+		tun_cleanup_tx_array(tfile);
 	}
 	list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
 		tun_enable_queue(tfile);
 		tun_queue_purge(tfile);
 		sock_put(&tfile->sk);
+		tun_cleanup_tx_array(tfile);
 	}
 	BUG_ON(tun->numdisabled != 0);
 
@@ -2851,6 +2860,8 @@ static int tun_chr_open(struct inode *inode, struct file * file)
 
 	sock_set_flag(&tfile->sk, SOCK_ZEROCOPY);
 
+	memset(&tfile->tx_array, 0, sizeof(tfile->tx_array));
+
 	return 0;
 }
 
diff --git a/drivers/net/usb/lan78xx.c b/drivers/net/usb/lan78xx.c
index 94c7804..ec56ff2 100644
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -2396,6 +2396,7 @@ static int lan78xx_reset(struct lan78xx_net *dev)
 		buf = DEFAULT_BURST_CAP_SIZE / FS_USB_PKT_SIZE;
 		dev->rx_urb_size = DEFAULT_BURST_CAP_SIZE;
 		dev->rx_qlen = 4;
+		dev->tx_qlen = 4;
 	}
 
 	ret = lan78xx_write_reg(dev, BURST_CAP, buf);
diff --git a/drivers/net/usb/qmi_wwan.c b/drivers/net/usb/qmi_wwan.c
index 3000ddd..728819f 100644
--- a/drivers/net/usb/qmi_wwan.c
+++ b/drivers/net/usb/qmi_wwan.c
@@ -1100,6 +1100,7 @@ static const struct usb_device_id products[] = {
 	{QMI_FIXED_INTF(0x05c6, 0x9084, 4)},
 	{QMI_FIXED_INTF(0x05c6, 0x920d, 0)},
 	{QMI_FIXED_INTF(0x05c6, 0x920d, 5)},
+	{QMI_QUIRK_SET_DTR(0x05c6, 0x9625, 4)},	/* YUGA CLM920-NC5 */
 	{QMI_FIXED_INTF(0x0846, 0x68a2, 8)},
 	{QMI_FIXED_INTF(0x12d1, 0x140c, 1)},	/* Huawei E173 */
 	{QMI_FIXED_INTF(0x12d1, 0x14ac, 1)},	/* Huawei E1820 */
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index d51d9abf..0657203 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -606,6 +606,7 @@ enum rtl8152_flags {
 	PHY_RESET,
 	SCHEDULE_NAPI,
 	GREEN_ETHERNET,
+	DELL_TB_RX_AGG_BUG,
 };
 
 /* Define these values to match your device */
@@ -1798,6 +1799,9 @@ static int r8152_tx_agg_fill(struct r8152 *tp, struct tx_agg *agg)
 		dev_kfree_skb_any(skb);
 
 		remain = agg_buf_sz - (int)(tx_agg_align(tx_data) - agg->head);
+
+		if (test_bit(DELL_TB_RX_AGG_BUG, &tp->flags))
+			break;
 	}
 
 	if (!skb_queue_empty(&skb_head)) {
@@ -4133,6 +4137,9 @@ static void r8153_init(struct r8152 *tp)
 	/* rx aggregation */
 	ocp_data = ocp_read_word(tp, MCU_TYPE_USB, USB_USB_CTRL);
 	ocp_data &= ~(RX_AGG_DISABLE | RX_ZERO_EN);
+	if (test_bit(DELL_TB_RX_AGG_BUG, &tp->flags))
+		ocp_data |= RX_AGG_DISABLE;
+
 	ocp_write_word(tp, MCU_TYPE_USB, USB_USB_CTRL, ocp_data);
 
 	rtl_tally_reset(tp);
@@ -5207,6 +5214,12 @@ static int rtl8152_probe(struct usb_interface *intf,
 		netdev->hw_features &= ~NETIF_F_RXCSUM;
 	}
 
+	if (le16_to_cpu(udev->descriptor.bcdDevice) == 0x3011 &&
+	    udev->serial && !strcmp(udev->serial, "000001000000")) {
+		dev_info(&udev->dev, "Dell TB16 Dock, disable RX aggregation");
+		set_bit(DELL_TB_RX_AGG_BUG, &tp->flags);
+	}
+
 	netdev->ethtool_ops = &ops;
 	netif_set_gso_max_size(netdev, RTL_LIMITED_TSO_SIZE);
 
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c
index d56fe32..8a22ff6 100644
--- a/drivers/net/usb/usbnet.c
+++ b/drivers/net/usb/usbnet.c
@@ -457,12 +457,10 @@ static enum skb_state defer_bh(struct usbnet *dev, struct sk_buff *skb,
 void usbnet_defer_kevent (struct usbnet *dev, int work)
 {
 	set_bit (work, &dev->flags);
-	if (!schedule_work (&dev->kevent)) {
-		if (net_ratelimit())
-			netdev_err(dev->net, "kevent %d may have been dropped\n", work);
-	} else {
+	if (!schedule_work (&dev->kevent))
+		netdev_dbg(dev->net, "kevent %d may have been dropped\n", work);
+	else
 		netdev_dbg(dev->net, "kevent %d scheduled\n", work);
-	}
 }
 EXPORT_SYMBOL_GPL(usbnet_defer_kevent);
 
diff --git a/drivers/net/vmxnet3/vmxnet3_drv.c b/drivers/net/vmxnet3/vmxnet3_drv.c
index d1c7029..cf95290 100644
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1616,7 +1616,6 @@ static void vmxnet3_rq_destroy(struct vmxnet3_rx_queue *rq,
 					  rq->rx_ring[i].basePA);
 			rq->rx_ring[i].base = NULL;
 		}
-		rq->buf_info[i] = NULL;
 	}
 
 	if (rq->data_ring.base) {
@@ -1638,6 +1637,7 @@ static void vmxnet3_rq_destroy(struct vmxnet3_rx_queue *rq,
 			(rq->rx_ring[0].size + rq->rx_ring[1].size);
 		dma_free_coherent(&adapter->pdev->dev, sz, rq->buf_info[0],
 				  rq->buf_info_pa);
+		rq->buf_info[0] = rq->buf_info[1] = NULL;
 	}
 }
 
diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index feb1b2e..139c61c 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -673,8 +673,9 @@ static struct sk_buff *vrf_ip_out(struct net_device *vrf_dev,
 				  struct sock *sk,
 				  struct sk_buff *skb)
 {
-	/* don't divert multicast */
-	if (ipv4_is_multicast(ip_hdr(skb)->daddr))
+	/* don't divert multicast or local broadcast */
+	if (ipv4_is_multicast(ip_hdr(skb)->daddr) ||
+	    ipv4_is_lbcast(ip_hdr(skb)->daddr))
 		return skb;
 
 	if (qdisc_tx_is_default(vrf_dev))
diff --git a/drivers/net/vxlan.c b/drivers/net/vxlan.c
index 31f4b79..c3e34e3 100644
--- a/drivers/net/vxlan.c
+++ b/drivers/net/vxlan.c
@@ -2158,8 +2158,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		if (skb_dst(skb)) {
 			int mtu = dst_mtu(ndst) - VXLAN_HEADROOM;
 
-			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
-						       skb, mtu);
+			skb_dst_update_pmtu(skb, mtu);
 		}
 
 		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
@@ -2200,8 +2199,7 @@ static void vxlan_xmit_one(struct sk_buff *skb, struct net_device *dev,
 		if (skb_dst(skb)) {
 			int mtu = dst_mtu(ndst) - VXLAN6_HEADROOM;
 
-			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL,
-						       skb, mtu);
+			skb_dst_update_pmtu(skb, mtu);
 		}
 
 		tos = ip_tunnel_ecn_encap(tos, old_iph, skb);
diff --git a/drivers/net/wireless/ath/wcn36xx/main.c b/drivers/net/wireless/ath/wcn36xx/main.c
index f7d228b..987f125 100644
--- a/drivers/net/wireless/ath/wcn36xx/main.c
+++ b/drivers/net/wireless/ath/wcn36xx/main.c
@@ -384,6 +384,18 @@ static int wcn36xx_config(struct ieee80211_hw *hw, u32 changed)
 		}
 	}
 
+	if (changed & IEEE80211_CONF_CHANGE_PS) {
+		list_for_each_entry(tmp, &wcn->vif_list, list) {
+			vif = wcn36xx_priv_to_vif(tmp);
+			if (hw->conf.flags & IEEE80211_CONF_PS) {
+				if (vif->bss_conf.ps) /* ps allowed ? */
+					wcn36xx_pmc_enter_bmps_state(wcn, vif);
+			} else {
+				wcn36xx_pmc_exit_bmps_state(wcn, vif);
+			}
+		}
+	}
+
 	mutex_unlock(&wcn->conf_mutex);
 
 	return 0;
@@ -747,17 +759,6 @@ static void wcn36xx_bss_info_changed(struct ieee80211_hw *hw,
 		vif_priv->dtim_period = bss_conf->dtim_period;
 	}
 
-	if (changed & BSS_CHANGED_PS) {
-		wcn36xx_dbg(WCN36XX_DBG_MAC,
-			    "mac bss PS set %d\n",
-			    bss_conf->ps);
-		if (bss_conf->ps) {
-			wcn36xx_pmc_enter_bmps_state(wcn, vif);
-		} else {
-			wcn36xx_pmc_exit_bmps_state(wcn, vif);
-		}
-	}
-
 	if (changed & BSS_CHANGED_BSSID) {
 		wcn36xx_dbg(WCN36XX_DBG_MAC, "mac bss changed_bssid %pM\n",
 			    bss_conf->bssid);
diff --git a/drivers/net/wireless/ath/wcn36xx/pmc.c b/drivers/net/wireless/ath/wcn36xx/pmc.c
index 589fe5f..1976b80 100644
--- a/drivers/net/wireless/ath/wcn36xx/pmc.c
+++ b/drivers/net/wireless/ath/wcn36xx/pmc.c
@@ -45,8 +45,10 @@ int wcn36xx_pmc_exit_bmps_state(struct wcn36xx *wcn,
 	struct wcn36xx_vif *vif_priv = wcn36xx_vif_to_priv(vif);
 
 	if (WCN36XX_BMPS != vif_priv->pw_state) {
-		wcn36xx_err("Not in BMPS mode, no need to exit from BMPS mode!\n");
-		return -EINVAL;
+		/* Unbalanced call or last BMPS enter failed */
+		wcn36xx_dbg(WCN36XX_DBG_PMC,
+			    "Not in BMPS mode, no need to exit\n");
+		return -EALREADY;
 	}
 	wcn36xx_smd_exit_bmps(wcn, vif);
 	vif_priv->pw_state = WCN36XX_FULL_POWER;
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c
index 6a59d06..9be0b05 100644
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c
+++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c
@@ -182,12 +182,9 @@ static int brcmf_c_process_clm_blob(struct brcmf_if *ifp)
 
 	err = request_firmware(&clm, clm_name, dev);
 	if (err) {
-		if (err == -ENOENT) {
-			brcmf_dbg(INFO, "continue with CLM data currently present in firmware\n");
-			return 0;
-		}
-		brcmf_err("request CLM blob file failed (%d)\n", err);
-		return err;
+		brcmf_info("no clm_blob available(err=%d), device may have limited channels available\n",
+			   err);
+		return 0;
 	}
 
 	chunk_buf = kzalloc(sizeof(*chunk_buf) + MAX_CHUNK_LEN - 1, GFP_KERNEL);
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
index d749abe..403e65c 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/internal.h
@@ -670,11 +670,15 @@ static inline u8 iwl_pcie_get_cmd_index(struct iwl_txq *q, u32 index)
 	return index & (q->n_window - 1);
 }
 
-static inline void *iwl_pcie_get_tfd(struct iwl_trans_pcie *trans_pcie,
+static inline void *iwl_pcie_get_tfd(struct iwl_trans *trans,
 				     struct iwl_txq *txq, int idx)
 {
-	return txq->tfds + trans_pcie->tfd_size * iwl_pcie_get_cmd_index(txq,
-									 idx);
+	struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
+
+	if (trans->cfg->use_tfh)
+		idx = iwl_pcie_get_cmd_index(txq, idx);
+
+	return txq->tfds + trans_pcie->tfd_size * idx;
 }
 
 static inline void iwl_enable_rfkill_int(struct iwl_trans *trans)
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
index 16b345f..6d0a907 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx-gen2.c
@@ -171,8 +171,6 @@ static void iwl_pcie_gen2_tfd_unmap(struct iwl_trans *trans,
 
 static void iwl_pcie_gen2_free_tfd(struct iwl_trans *trans, struct iwl_txq *txq)
 {
-	struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
-
 	/* rd_ptr is bounded by TFD_QUEUE_SIZE_MAX and
 	 * idx is bounded by n_window
 	 */
@@ -181,7 +179,7 @@ static void iwl_pcie_gen2_free_tfd(struct iwl_trans *trans, struct iwl_txq *txq)
 	lockdep_assert_held(&txq->lock);
 
 	iwl_pcie_gen2_tfd_unmap(trans, &txq->entries[idx].meta,
-				iwl_pcie_get_tfd(trans_pcie, txq, idx));
+				iwl_pcie_get_tfd(trans, txq, idx));
 
 	/* free SKB */
 	if (txq->entries) {
@@ -364,11 +362,9 @@ struct iwl_tfh_tfd *iwl_pcie_gen2_build_tfd(struct iwl_trans *trans,
 					    struct sk_buff *skb,
 					    struct iwl_cmd_meta *out_meta)
 {
-	struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
 	struct ieee80211_hdr *hdr = (struct ieee80211_hdr *)skb->data;
 	int idx = iwl_pcie_get_cmd_index(txq, txq->write_ptr);
-	struct iwl_tfh_tfd *tfd =
-		iwl_pcie_get_tfd(trans_pcie, txq, idx);
+	struct iwl_tfh_tfd *tfd = iwl_pcie_get_tfd(trans, txq, idx);
 	dma_addr_t tb_phys;
 	bool amsdu;
 	int i, len, tb1_len, tb2_len, hdr_len;
@@ -565,8 +561,7 @@ static int iwl_pcie_gen2_enqueue_hcmd(struct iwl_trans *trans,
 	u8 group_id = iwl_cmd_groupid(cmd->id);
 	const u8 *cmddata[IWL_MAX_CMD_TBS_PER_TFD];
 	u16 cmdlen[IWL_MAX_CMD_TBS_PER_TFD];
-	struct iwl_tfh_tfd *tfd =
-		iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr);
+	struct iwl_tfh_tfd *tfd = iwl_pcie_get_tfd(trans, txq, txq->write_ptr);
 
 	memset(tfd, 0, sizeof(*tfd));
 
diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
index fed6d84..3f85713 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/tx.c
@@ -373,7 +373,7 @@ static void iwl_pcie_tfd_unmap(struct iwl_trans *trans,
 {
 	struct iwl_trans_pcie *trans_pcie = IWL_TRANS_GET_PCIE_TRANS(trans);
 	int i, num_tbs;
-	void *tfd = iwl_pcie_get_tfd(trans_pcie, txq, index);
+	void *tfd = iwl_pcie_get_tfd(trans, txq, index);
 
 	/* Sanity check on number of chunks */
 	num_tbs = iwl_pcie_tfd_get_num_tbs(trans, tfd);
@@ -2018,7 +2018,7 @@ static int iwl_fill_data_tbs(struct iwl_trans *trans, struct sk_buff *skb,
 	}
 
 	trace_iwlwifi_dev_tx(trans->dev, skb,
-			     iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr),
+			     iwl_pcie_get_tfd(trans, txq, txq->write_ptr),
 			     trans_pcie->tfd_size,
 			     &dev_cmd->hdr, IWL_FIRST_TB_SIZE + tb1_len,
 			     hdr_len);
@@ -2092,7 +2092,7 @@ static int iwl_fill_data_tbs_amsdu(struct iwl_trans *trans, struct sk_buff *skb,
 		IEEE80211_CCMP_HDR_LEN : 0;
 
 	trace_iwlwifi_dev_tx(trans->dev, skb,
-			     iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr),
+			     iwl_pcie_get_tfd(trans, txq, txq->write_ptr),
 			     trans_pcie->tfd_size,
 			     &dev_cmd->hdr, IWL_FIRST_TB_SIZE + tb1_len, 0);
 
@@ -2425,7 +2425,7 @@ int iwl_trans_pcie_tx(struct iwl_trans *trans, struct sk_buff *skb,
 	memcpy(&txq->first_tb_bufs[txq->write_ptr], &dev_cmd->hdr,
 	       IWL_FIRST_TB_SIZE);
 
-	tfd = iwl_pcie_get_tfd(trans_pcie, txq, txq->write_ptr);
+	tfd = iwl_pcie_get_tfd(trans, txq, txq->write_ptr);
 	/* Set up entry for this TFD in Tx byte-count array */
 	iwl_pcie_txq_update_byte_cnt_tbl(trans, txq, le16_to_cpu(tx_cmd->len),
 					 iwl_pcie_tfd_get_num_tbs(trans, tfd));
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c
index e8189c0..f6d4a50f 100644
--- a/drivers/net/wireless/mac80211_hwsim.c
+++ b/drivers/net/wireless/mac80211_hwsim.c
@@ -489,6 +489,7 @@ static const struct ieee80211_iface_combination hwsim_if_comb_p2p_dev[] = {
 
 static spinlock_t hwsim_radio_lock;
 static LIST_HEAD(hwsim_radios);
+static struct workqueue_struct *hwsim_wq;
 static int hwsim_radio_idx;
 
 static struct platform_driver mac80211_hwsim_driver = {
@@ -3120,6 +3121,11 @@ static int hwsim_new_radio_nl(struct sk_buff *msg, struct genl_info *info)
 	if (info->attrs[HWSIM_ATTR_CHANNELS])
 		param.channels = nla_get_u32(info->attrs[HWSIM_ATTR_CHANNELS]);
 
+	if (param.channels > CFG80211_MAX_NUM_DIFFERENT_CHANNELS) {
+		GENL_SET_ERR_MSG(info, "too many channels specified");
+		return -EINVAL;
+	}
+
 	if (info->attrs[HWSIM_ATTR_NO_VIF])
 		param.no_vif = true;
 
@@ -3342,7 +3348,7 @@ static void remove_user_radios(u32 portid)
 		if (entry->destroy_on_close && entry->portid == portid) {
 			list_del(&entry->list);
 			INIT_WORK(&entry->destroy_work, destroy_radio);
-			schedule_work(&entry->destroy_work);
+			queue_work(hwsim_wq, &entry->destroy_work);
 		}
 	}
 	spin_unlock_bh(&hwsim_radio_lock);
@@ -3417,7 +3423,7 @@ static void __net_exit hwsim_exit_net(struct net *net)
 
 		list_del(&data->list);
 		INIT_WORK(&data->destroy_work, destroy_radio);
-		schedule_work(&data->destroy_work);
+		queue_work(hwsim_wq, &data->destroy_work);
 	}
 	spin_unlock_bh(&hwsim_radio_lock);
 }
@@ -3449,6 +3455,10 @@ static int __init init_mac80211_hwsim(void)
 
 	spin_lock_init(&hwsim_radio_lock);
 
+	hwsim_wq = alloc_workqueue("hwsim_wq",WQ_MEM_RECLAIM,0);
+	if (!hwsim_wq)
+		return -ENOMEM;
+
 	err = register_pernet_device(&hwsim_net_ops);
 	if (err)
 		return err;
@@ -3587,8 +3597,11 @@ static void __exit exit_mac80211_hwsim(void)
 	hwsim_exit_netlink();
 
 	mac80211_hwsim_free();
+	flush_workqueue(hwsim_wq);
+
 	unregister_netdev(hwsim_mon);
 	platform_driver_unregister(&mac80211_hwsim_driver);
 	unregister_pernet_device(&hwsim_net_ops);
+	destroy_workqueue(hwsim_wq);
 }
 module_exit(exit_mac80211_hwsim);
diff --git a/drivers/net/xen-netfront.c b/drivers/net/xen-netfront.c
index c5a34671..9bd7dde 100644
--- a/drivers/net/xen-netfront.c
+++ b/drivers/net/xen-netfront.c
@@ -1326,6 +1326,7 @@ static struct net_device *xennet_create_dev(struct xenbus_device *dev)
 
 	netif_carrier_off(netdev);
 
+	xenbus_switch_state(dev, XenbusStateInitialising);
 	return netdev;
 
  exit:
diff --git a/drivers/nubus/Makefile b/drivers/nubus/Makefile
index 21bda20..6d063cd 100644
--- a/drivers/nubus/Makefile
+++ b/drivers/nubus/Makefile
@@ -2,6 +2,6 @@
 # Makefile for the nubus specific drivers.
 #
 
-obj-y   := nubus.o
+obj-y := nubus.o bus.o
 
 obj-$(CONFIG_PROC_FS) += proc.o
diff --git a/drivers/nubus/bus.c b/drivers/nubus/bus.c
new file mode 100644
index 0000000..d306c34
--- /dev/null
+++ b/drivers/nubus/bus.c
@@ -0,0 +1,117 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// Bus implementation for the NuBus subsystem.
+//
+// Copyright (C) 2017 Finn Thain
+
+#include <linux/device.h>
+#include <linux/list.h>
+#include <linux/nubus.h>
+#include <linux/seq_file.h>
+#include <linux/slab.h>
+
+#define to_nubus_board(d)       container_of(d, struct nubus_board, dev)
+#define to_nubus_driver(d)      container_of(d, struct nubus_driver, driver)
+
+static int nubus_bus_match(struct device *dev, struct device_driver *driver)
+{
+	return 1;
+}
+
+static int nubus_device_probe(struct device *dev)
+{
+	struct nubus_driver *ndrv = to_nubus_driver(dev->driver);
+	int err = -ENODEV;
+
+	if (ndrv->probe)
+		err = ndrv->probe(to_nubus_board(dev));
+	return err;
+}
+
+static int nubus_device_remove(struct device *dev)
+{
+	struct nubus_driver *ndrv = to_nubus_driver(dev->driver);
+	int err = -ENODEV;
+
+	if (dev->driver && ndrv->remove)
+		err = ndrv->remove(to_nubus_board(dev));
+	return err;
+}
+
+struct bus_type nubus_bus_type = {
+	.name		= "nubus",
+	.match		= nubus_bus_match,
+	.probe		= nubus_device_probe,
+	.remove		= nubus_device_remove,
+};
+EXPORT_SYMBOL(nubus_bus_type);
+
+int nubus_driver_register(struct nubus_driver *ndrv)
+{
+	ndrv->driver.bus = &nubus_bus_type;
+	return driver_register(&ndrv->driver);
+}
+EXPORT_SYMBOL(nubus_driver_register);
+
+void nubus_driver_unregister(struct nubus_driver *ndrv)
+{
+	driver_unregister(&ndrv->driver);
+}
+EXPORT_SYMBOL(nubus_driver_unregister);
+
+static struct device nubus_parent = {
+	.init_name	= "nubus",
+};
+
+int __init nubus_bus_register(void)
+{
+	int err;
+
+	err = device_register(&nubus_parent);
+	if (err)
+		return err;
+
+	err = bus_register(&nubus_bus_type);
+	if (!err)
+		return 0;
+
+	device_unregister(&nubus_parent);
+	return err;
+}
+
+static void nubus_device_release(struct device *dev)
+{
+	struct nubus_board *board = to_nubus_board(dev);
+	struct nubus_rsrc *fres, *tmp;
+
+	list_for_each_entry_safe(fres, tmp, &nubus_func_rsrcs, list)
+		if (fres->board == board) {
+			list_del(&fres->list);
+			kfree(fres);
+		}
+	kfree(board);
+}
+
+int nubus_device_register(struct nubus_board *board)
+{
+	board->dev.parent = &nubus_parent;
+	board->dev.release = nubus_device_release;
+	board->dev.bus = &nubus_bus_type;
+	dev_set_name(&board->dev, "slot.%X", board->slot);
+	return device_register(&board->dev);
+}
+
+static int nubus_print_device_name_fn(struct device *dev, void *data)
+{
+	struct nubus_board *board = to_nubus_board(dev);
+	struct seq_file *m = data;
+
+	seq_printf(m, "Slot %X: %s\n", board->slot, board->name);
+	return 0;
+}
+
+int nubus_proc_show(struct seq_file *m, void *data)
+{
+	return bus_for_each_dev(&nubus_bus_type, NULL, m,
+				nubus_print_device_name_fn);
+}
diff --git a/drivers/nubus/nubus.c b/drivers/nubus/nubus.c
index b793727..4621ff9 100644
--- a/drivers/nubus/nubus.c
+++ b/drivers/nubus/nubus.c
@@ -15,6 +15,7 @@
 #include <linux/errno.h>
 #include <linux/init.h>
 #include <linux/module.h>
+#include <linux/seq_file.h>
 #include <linux/slab.h>
 #include <asm/setup.h>
 #include <asm/page.h>
@@ -31,8 +32,7 @@
 
 /* Globals */
 
-struct nubus_dev *nubus_devices;
-struct nubus_board *nubus_boards;
+LIST_HEAD(nubus_func_rsrcs);
 
 /* Meaning of "bytelanes":
 
@@ -146,7 +146,7 @@ static inline void *nubus_rom_addr(int slot)
 	return (void *)(0xF1000000 + (slot << 24));
 }
 
-static unsigned char *nubus_dirptr(const struct nubus_dirent *nd)
+unsigned char *nubus_dirptr(const struct nubus_dirent *nd)
 {
 	unsigned char *p = nd->base;
 
@@ -161,7 +161,7 @@ static unsigned char *nubus_dirptr(const struct nubus_dirent *nd)
    pointed to with offsets) out of the card ROM. */
 
 void nubus_get_rsrc_mem(void *dest, const struct nubus_dirent *dirent,
-			int len)
+			unsigned int len)
 {
 	unsigned char *t = (unsigned char *)dest;
 	unsigned char *p = nubus_dirptr(dirent);
@@ -173,21 +173,49 @@ void nubus_get_rsrc_mem(void *dest, const struct nubus_dirent *dirent,
 }
 EXPORT_SYMBOL(nubus_get_rsrc_mem);
 
-void nubus_get_rsrc_str(void *dest, const struct nubus_dirent *dirent,
-			int len)
+unsigned int nubus_get_rsrc_str(char *dest, const struct nubus_dirent *dirent,
+				unsigned int len)
 {
-	unsigned char *t = (unsigned char *)dest;
+	char *t = dest;
 	unsigned char *p = nubus_dirptr(dirent);
 
-	while (len) {
-		*t = nubus_get_rom(&p, 1, dirent->mask);
-		if (!*t++)
+	while (len > 1) {
+		unsigned char c = nubus_get_rom(&p, 1, dirent->mask);
+
+		if (!c)
 			break;
+		*t++ = c;
 		len--;
 	}
+	if (len > 0)
+		*t = '\0';
+	return t - dest;
 }
 EXPORT_SYMBOL(nubus_get_rsrc_str);
 
+void nubus_seq_write_rsrc_mem(struct seq_file *m,
+			      const struct nubus_dirent *dirent,
+			      unsigned int len)
+{
+	unsigned long buf[32];
+	unsigned int buf_size = sizeof(buf);
+	unsigned char *p = nubus_dirptr(dirent);
+
+	/* If possible, write out full buffers */
+	while (len >= buf_size) {
+		unsigned int i;
+
+		for (i = 0; i < ARRAY_SIZE(buf); i++)
+			buf[i] = nubus_get_rom(&p, sizeof(buf[0]),
+					       dirent->mask);
+		seq_write(m, buf, buf_size);
+		len -= buf_size;
+	}
+	/* If not, write out individual bytes */
+	while (len--)
+		seq_putc(m, nubus_get_rom(&p, 1, dirent->mask));
+}
+
 int nubus_get_root_dir(const struct nubus_board *board,
 		       struct nubus_dir *dir)
 {
@@ -199,12 +227,11 @@ int nubus_get_root_dir(const struct nubus_board *board,
 EXPORT_SYMBOL(nubus_get_root_dir);
 
 /* This is a slyly renamed version of the above */
-int nubus_get_func_dir(const struct nubus_dev *dev,
-		       struct nubus_dir *dir)
+int nubus_get_func_dir(const struct nubus_rsrc *fres, struct nubus_dir *dir)
 {
-	dir->ptr = dir->base = dev->directory;
+	dir->ptr = dir->base = fres->directory;
 	dir->done = 0;
-	dir->mask = dev->board->lanes;
+	dir->mask = fres->board->lanes;
 	return 0;
 }
 EXPORT_SYMBOL(nubus_get_func_dir);
@@ -277,51 +304,20 @@ EXPORT_SYMBOL(nubus_rewinddir);
 
 /* Driver interface functions, more or less like in pci.c */
 
-struct nubus_dev*
-nubus_find_device(unsigned short category, unsigned short type,
-		  unsigned short dr_hw, unsigned short dr_sw,
-		  const struct nubus_dev *from)
+struct nubus_rsrc *nubus_first_rsrc_or_null(void)
 {
-	struct nubus_dev *itor = from ? from->next : nubus_devices;
-
-	while (itor) {
-		if (itor->category == category && itor->type == type &&
-		    itor->dr_hw == dr_hw && itor->dr_sw == dr_sw)
-			return itor;
-		itor = itor->next;
-	}
-	return NULL;
+	return list_first_entry_or_null(&nubus_func_rsrcs, struct nubus_rsrc,
+					list);
 }
-EXPORT_SYMBOL(nubus_find_device);
+EXPORT_SYMBOL(nubus_first_rsrc_or_null);
 
-struct nubus_dev*
-nubus_find_type(unsigned short category, unsigned short type,
-		const struct nubus_dev *from)
+struct nubus_rsrc *nubus_next_rsrc_or_null(struct nubus_rsrc *from)
 {
-	struct nubus_dev *itor = from ? from->next : nubus_devices;
-
-	while (itor) {
-		if (itor->category == category && itor->type == type)
-			return itor;
-		itor = itor->next;
-	}
-	return NULL;
+	if (list_is_last(&from->list, &nubus_func_rsrcs))
+		return NULL;
+	return list_next_entry(from, list);
 }
-EXPORT_SYMBOL(nubus_find_type);
-
-struct nubus_dev*
-nubus_find_slot(unsigned int slot, const struct nubus_dev *from)
-{
-	struct nubus_dev *itor = from ? from->next : nubus_devices;
-
-	while (itor) {
-		if (itor->board->slot == slot)
-			return itor;
-		itor = itor->next;
-	}
-	return NULL;
-}
-EXPORT_SYMBOL(nubus_find_slot);
+EXPORT_SYMBOL(nubus_next_rsrc_or_null);
 
 int
 nubus_find_rsrc(struct nubus_dir *dir, unsigned char rsrc_type,
@@ -339,31 +335,83 @@ EXPORT_SYMBOL(nubus_find_rsrc);
    looking at, and print out lots and lots of information from the
    resource blocks. */
 
-/* FIXME: A lot of this stuff will eventually be useful after
-   initialization, for intelligently probing Ethernet and video chips,
-   among other things.  The rest of it should go in the /proc code.
-   For now, we just use it to give verbose boot logs. */
-
-static int __init nubus_show_display_resource(struct nubus_dev *dev,
-					      const struct nubus_dirent *ent)
+static int __init nubus_get_block_rsrc_dir(struct nubus_board *board,
+					   struct proc_dir_entry *procdir,
+					   const struct nubus_dirent *parent)
 {
-	switch (ent->type) {
-	case NUBUS_RESID_GAMMADIR:
-		pr_info("    gamma directory offset: 0x%06x\n", ent->data);
-		break;
-	case 0x0080 ... 0x0085:
-		pr_info("    mode %02X info offset: 0x%06x\n",
-		       ent->type, ent->data);
-		break;
-	default:
-		pr_info("    unknown resource %02X, data 0x%06x\n",
-		       ent->type, ent->data);
+	struct nubus_dir dir;
+	struct nubus_dirent ent;
+
+	nubus_get_subdir(parent, &dir);
+	dir.procdir = nubus_proc_add_rsrc_dir(procdir, parent, board);
+
+	while (nubus_readdir(&dir, &ent) != -1) {
+		u32 size;
+
+		nubus_get_rsrc_mem(&size, &ent, 4);
+		pr_debug("        block (0x%x), size %d\n", ent.type, size);
+		nubus_proc_add_rsrc_mem(dir.procdir, &ent, size);
 	}
 	return 0;
 }
 
-static int __init nubus_show_network_resource(struct nubus_dev *dev,
-					      const struct nubus_dirent *ent)
+static int __init nubus_get_display_vidmode(struct nubus_board *board,
+					    struct proc_dir_entry *procdir,
+					    const struct nubus_dirent *parent)
+{
+	struct nubus_dir dir;
+	struct nubus_dirent ent;
+
+	nubus_get_subdir(parent, &dir);
+	dir.procdir = nubus_proc_add_rsrc_dir(procdir, parent, board);
+
+	while (nubus_readdir(&dir, &ent) != -1) {
+		switch (ent.type) {
+		case 1: /* mVidParams */
+		case 2: /* mTable */
+		{
+			u32 size;
+
+			nubus_get_rsrc_mem(&size, &ent, 4);
+			pr_debug("        block (0x%x), size %d\n", ent.type,
+				size);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, size);
+			break;
+		}
+		default:
+			pr_debug("        unknown resource 0x%02x, data 0x%06x\n",
+				ent.type, ent.data);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, 0);
+		}
+	}
+	return 0;
+}
+
+static int __init nubus_get_display_resource(struct nubus_rsrc *fres,
+					     struct proc_dir_entry *procdir,
+					     const struct nubus_dirent *ent)
+{
+	switch (ent->type) {
+	case NUBUS_RESID_GAMMADIR:
+		pr_debug("    gamma directory offset: 0x%06x\n", ent->data);
+		nubus_get_block_rsrc_dir(fres->board, procdir, ent);
+		break;
+	case 0x0080 ... 0x0085:
+		pr_debug("    mode 0x%02x info offset: 0x%06x\n",
+			ent->type, ent->data);
+		nubus_get_display_vidmode(fres->board, procdir, ent);
+		break;
+	default:
+		pr_debug("    unknown resource 0x%02x, data 0x%06x\n",
+			ent->type, ent->data);
+		nubus_proc_add_rsrc_mem(procdir, ent, 0);
+	}
+	return 0;
+}
+
+static int __init nubus_get_network_resource(struct nubus_rsrc *fres,
+					     struct proc_dir_entry *procdir,
+					     const struct nubus_dirent *ent)
 {
 	switch (ent->type) {
 	case NUBUS_RESID_MAC_ADDRESS:
@@ -371,18 +419,21 @@ static int __init nubus_show_network_resource(struct nubus_dev *dev,
 		char addr[6];
 
 		nubus_get_rsrc_mem(addr, ent, 6);
-		pr_info("    MAC address: %pM\n", addr);
+		pr_debug("    MAC address: %pM\n", addr);
+		nubus_proc_add_rsrc_mem(procdir, ent, 6);
 		break;
 	}
 	default:
-		pr_info("    unknown resource %02X, data 0x%06x\n",
-		       ent->type, ent->data);
+		pr_debug("    unknown resource 0x%02x, data 0x%06x\n",
+			ent->type, ent->data);
+		nubus_proc_add_rsrc_mem(procdir, ent, 0);
 	}
 	return 0;
 }
 
-static int __init nubus_show_cpu_resource(struct nubus_dev *dev,
-					  const struct nubus_dirent *ent)
+static int __init nubus_get_cpu_resource(struct nubus_rsrc *fres,
+					 struct proc_dir_entry *procdir,
+					 const struct nubus_dirent *ent)
 {
 	switch (ent->type) {
 	case NUBUS_RESID_MEMINFO:
@@ -390,8 +441,9 @@ static int __init nubus_show_cpu_resource(struct nubus_dev *dev,
 		unsigned long meminfo[2];
 
 		nubus_get_rsrc_mem(&meminfo, ent, 8);
-		pr_info("    memory: [ 0x%08lx 0x%08lx ]\n",
-		       meminfo[0], meminfo[1]);
+		pr_debug("    memory: [ 0x%08lx 0x%08lx ]\n",
+			meminfo[0], meminfo[1]);
+		nubus_proc_add_rsrc_mem(procdir, ent, 8);
 		break;
 	}
 	case NUBUS_RESID_ROMINFO:
@@ -399,57 +451,60 @@ static int __init nubus_show_cpu_resource(struct nubus_dev *dev,
 		unsigned long rominfo[2];
 
 		nubus_get_rsrc_mem(&rominfo, ent, 8);
-		pr_info("    ROM:    [ 0x%08lx 0x%08lx ]\n",
-		       rominfo[0], rominfo[1]);
+		pr_debug("    ROM:    [ 0x%08lx 0x%08lx ]\n",
+			rominfo[0], rominfo[1]);
+		nubus_proc_add_rsrc_mem(procdir, ent, 8);
 		break;
 	}
 	default:
-		pr_info("    unknown resource %02X, data 0x%06x\n",
-		       ent->type, ent->data);
+		pr_debug("    unknown resource 0x%02x, data 0x%06x\n",
+			ent->type, ent->data);
+		nubus_proc_add_rsrc_mem(procdir, ent, 0);
 	}
 	return 0;
 }
 
-static int __init nubus_show_private_resource(struct nubus_dev *dev,
-					      const struct nubus_dirent *ent)
+static int __init nubus_get_private_resource(struct nubus_rsrc *fres,
+					     struct proc_dir_entry *procdir,
+					     const struct nubus_dirent *ent)
 {
-	switch (dev->category) {
+	switch (fres->category) {
 	case NUBUS_CAT_DISPLAY:
-		nubus_show_display_resource(dev, ent);
+		nubus_get_display_resource(fres, procdir, ent);
 		break;
 	case NUBUS_CAT_NETWORK:
-		nubus_show_network_resource(dev, ent);
+		nubus_get_network_resource(fres, procdir, ent);
 		break;
 	case NUBUS_CAT_CPU:
-		nubus_show_cpu_resource(dev, ent);
+		nubus_get_cpu_resource(fres, procdir, ent);
 		break;
 	default:
-		pr_info("    unknown resource %02X, data 0x%06x\n",
-		       ent->type, ent->data);
+		pr_debug("    unknown resource 0x%02x, data 0x%06x\n",
+			ent->type, ent->data);
+		nubus_proc_add_rsrc_mem(procdir, ent, 0);
 	}
 	return 0;
 }
 
-static struct nubus_dev * __init
+static struct nubus_rsrc * __init
 nubus_get_functional_resource(struct nubus_board *board, int slot,
 			      const struct nubus_dirent *parent)
 {
 	struct nubus_dir dir;
 	struct nubus_dirent ent;
-	struct nubus_dev *dev;
+	struct nubus_rsrc *fres;
 
-	pr_info("  Function 0x%02x:\n", parent->type);
+	pr_debug("  Functional resource 0x%02x:\n", parent->type);
 	nubus_get_subdir(parent, &dir);
-
-	pr_debug("%s: parent is 0x%p, dir is 0x%p\n",
-	         __func__, parent->base, dir.base);
+	dir.procdir = nubus_proc_add_rsrc_dir(board->procdir, parent, board);
 
 	/* Actually we should probably panic if this fails */
-	if ((dev = kzalloc(sizeof(*dev), GFP_ATOMIC)) == NULL)
+	fres = kzalloc(sizeof(*fres), GFP_ATOMIC);
+	if (!fres)
 		return NULL;
-	dev->resid = parent->type;
-	dev->directory = dir.base;
-	dev->board = board;
+	fres->resid = parent->type;
+	fres->directory = dir.base;
+	fres->board = board;
 
 	while (nubus_readdir(&dir, &ent) != -1) {
 		switch (ent.type) {
@@ -458,130 +513,96 @@ nubus_get_functional_resource(struct nubus_board *board, int slot,
 			unsigned short nbtdata[4];
 
 			nubus_get_rsrc_mem(nbtdata, &ent, 8);
-			dev->category = nbtdata[0];
-			dev->type     = nbtdata[1];
-			dev->dr_sw    = nbtdata[2];
-			dev->dr_hw    = nbtdata[3];
-			pr_info("    type: [cat 0x%x type 0x%x sw 0x%x hw 0x%x]\n",
-			        nbtdata[0], nbtdata[1], nbtdata[2], nbtdata[3]);
+			fres->category = nbtdata[0];
+			fres->type     = nbtdata[1];
+			fres->dr_sw    = nbtdata[2];
+			fres->dr_hw    = nbtdata[3];
+			pr_debug("    type: [cat 0x%x type 0x%x sw 0x%x hw 0x%x]\n",
+				nbtdata[0], nbtdata[1], nbtdata[2], nbtdata[3]);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, 8);
 			break;
 		}
 		case NUBUS_RESID_NAME:
 		{
-			nubus_get_rsrc_str(dev->name, &ent, 64);
-			pr_info("    name: %s\n", dev->name);
+			char name[64];
+			unsigned int len;
+
+			len = nubus_get_rsrc_str(name, &ent, sizeof(name));
+			pr_debug("    name: %s\n", name);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, len + 1);
 			break;
 		}
 		case NUBUS_RESID_DRVRDIR:
 		{
 			/* MacOS driver.  If we were NetBSD we might
 			   use this :-) */
-			struct nubus_dir drvr_dir;
-			struct nubus_dirent drvr_ent;
-
-			nubus_get_subdir(&ent, &drvr_dir);
-			nubus_readdir(&drvr_dir, &drvr_ent);
-			dev->driver = nubus_dirptr(&drvr_ent);
-			pr_info("    driver at: 0x%p\n", dev->driver);
+			pr_debug("    driver directory offset: 0x%06x\n",
+				ent.data);
+			nubus_get_block_rsrc_dir(board, dir.procdir, &ent);
 			break;
 		}
 		case NUBUS_RESID_MINOR_BASEOS:
+		{
 			/* We will need this in order to support
 			   multiple framebuffers.  It might be handy
 			   for Ethernet as well */
-			nubus_get_rsrc_mem(&dev->iobase, &ent, 4);
-			pr_info("    memory offset: 0x%08lx\n", dev->iobase);
+			u32 base_offset;
+
+			nubus_get_rsrc_mem(&base_offset, &ent, 4);
+			pr_debug("    memory offset: 0x%08x\n", base_offset);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, 4);
 			break;
+		}
 		case NUBUS_RESID_MINOR_LENGTH:
+		{
 			/* Ditto */
-			nubus_get_rsrc_mem(&dev->iosize, &ent, 4);
-			pr_info("    memory length: 0x%08lx\n", dev->iosize);
+			u32 length;
+
+			nubus_get_rsrc_mem(&length, &ent, 4);
+			pr_debug("    memory length: 0x%08x\n", length);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, 4);
 			break;
+		}
 		case NUBUS_RESID_FLAGS:
-			dev->flags = ent.data;
-			pr_info("    flags: 0x%06x\n", dev->flags);
+			pr_debug("    flags: 0x%06x\n", ent.data);
+			nubus_proc_add_rsrc(dir.procdir, &ent);
 			break;
 		case NUBUS_RESID_HWDEVID:
-			dev->hwdevid = ent.data;
-			pr_info("    hwdevid: 0x%06x\n", dev->hwdevid);
+			pr_debug("    hwdevid: 0x%06x\n", ent.data);
+			nubus_proc_add_rsrc(dir.procdir, &ent);
 			break;
 		default:
 			/* Local/Private resources have their own
 			   function */
-			nubus_show_private_resource(dev, &ent);
+			nubus_get_private_resource(fres, dir.procdir, &ent);
 		}
 	}
 
-	return dev;
-}
-
-/* This is cool. */
-static int __init nubus_get_vidnames(struct nubus_board *board,
-				     const struct nubus_dirent *parent)
-{
-	struct nubus_dir dir;
-	struct nubus_dirent ent;
-
-	/* FIXME: obviously we want to put this in a header file soon */
-	struct vidmode {
-		u32 size;
-		/* Don't know what this is yet */
-		u16 id;
-		/* Longest one I've seen so far is 26 characters */
-		char name[32];
-	};
-
-	pr_info("    video modes supported:\n");
-	nubus_get_subdir(parent, &dir);
-	pr_debug("%s: parent is 0x%p, dir is 0x%p\n",
-	         __func__, parent->base, dir.base);
-
-	while (nubus_readdir(&dir, &ent) != -1) {
-		struct vidmode mode;
-		u32 size;
-
-		/* First get the length */
-		nubus_get_rsrc_mem(&size, &ent, 4);
-
-		/* Now clobber the whole thing */
-		if (size > sizeof(mode) - 1)
-			size = sizeof(mode) - 1;
-		memset(&mode, 0, sizeof(mode));
-		nubus_get_rsrc_mem(&mode, &ent, size);
-		pr_info("      %02X: (%02X) %s\n", ent.type,
-			mode.id, mode.name);
-	}
-	return 0;
+	return fres;
 }
 
 /* This is *really* cool. */
 static int __init nubus_get_icon(struct nubus_board *board,
+				 struct proc_dir_entry *procdir,
 				 const struct nubus_dirent *ent)
 {
 	/* Should be 32x32 if my memory serves me correctly */
-	unsigned char icon[128];
-	int x, y;
+	u32 icon[32];
+	int i;
 
 	nubus_get_rsrc_mem(&icon, ent, 128);
-	pr_info("    icon:\n");
+	pr_debug("    icon:\n");
+	for (i = 0; i < 8; i++)
+		pr_debug("        %08x %08x %08x %08x\n",
+			icon[i * 4 + 0], icon[i * 4 + 1],
+			icon[i * 4 + 2], icon[i * 4 + 3]);
+	nubus_proc_add_rsrc_mem(procdir, ent, 128);
 
-	/* We should actually plot these somewhere in the framebuffer
-	   init.  This is just to demonstrate that they do, in fact,
-	   exist */
-	for (y = 0; y < 32; y++) {
-		pr_info("      ");
-		for (x = 0; x < 32; x++) {
-			if (icon[y * 4 + x / 8] & (0x80 >> (x % 8)))
-				pr_cont("*");
-			else
-				pr_cont(" ");
-		}
-		pr_cont("\n");
-	}
 	return 0;
 }
 
 static int __init nubus_get_vendorinfo(struct nubus_board *board,
+				       struct proc_dir_entry *procdir,
 				       const struct nubus_dirent *parent)
 {
 	struct nubus_dir dir;
@@ -589,19 +610,20 @@ static int __init nubus_get_vendorinfo(struct nubus_board *board,
 	static char *vendor_fields[6] = { "ID", "serial", "revision",
 	                                  "part", "date", "unknown field" };
 
-	pr_info("    vendor info:\n");
+	pr_debug("    vendor info:\n");
 	nubus_get_subdir(parent, &dir);
-	pr_debug("%s: parent is 0x%p, dir is 0x%p\n",
-	         __func__, parent->base, dir.base);
+	dir.procdir = nubus_proc_add_rsrc_dir(procdir, parent, board);
 
 	while (nubus_readdir(&dir, &ent) != -1) {
 		char name[64];
+		unsigned int len;
 
 		/* These are all strings, we think */
-		nubus_get_rsrc_str(name, &ent, 64);
-		if (ent.type > 5)
+		len = nubus_get_rsrc_str(name, &ent, sizeof(name));
+		if (ent.type < 1 || ent.type > 5)
 			ent.type = 5;
-		pr_info("    %s: %s\n", vendor_fields[ent.type - 1], name);
+		pr_debug("    %s: %s\n", vendor_fields[ent.type - 1], name);
+		nubus_proc_add_rsrc_mem(dir.procdir, &ent, len + 1);
 	}
 	return 0;
 }
@@ -612,9 +634,9 @@ static int __init nubus_get_board_resource(struct nubus_board *board, int slot,
 	struct nubus_dir dir;
 	struct nubus_dirent ent;
 
+	pr_debug("  Board resource 0x%02x:\n", parent->type);
 	nubus_get_subdir(parent, &dir);
-	pr_debug("%s: parent is 0x%p, dir is 0x%p\n",
-	         __func__, parent->base, dir.base);
+	dir.procdir = nubus_proc_add_rsrc_dir(board->procdir, parent, board);
 
 	while (nubus_readdir(&dir, &ent) != -1) {
 		switch (ent.type) {
@@ -625,64 +647,81 @@ static int __init nubus_get_board_resource(struct nubus_board *board, int slot,
 			   useful except insofar as it tells us that
 			   we really are looking at a board resource. */
 			nubus_get_rsrc_mem(nbtdata, &ent, 8);
-			pr_info("    type: [cat 0x%x type 0x%x sw 0x%x hw 0x%x]\n",
-			        nbtdata[0], nbtdata[1], nbtdata[2], nbtdata[3]);
+			pr_debug("    type: [cat 0x%x type 0x%x sw 0x%x hw 0x%x]\n",
+				nbtdata[0], nbtdata[1], nbtdata[2], nbtdata[3]);
 			if (nbtdata[0] != 1 || nbtdata[1] != 0 ||
 			    nbtdata[2] != 0 || nbtdata[3] != 0)
-				pr_err("this sResource is not a board resource!\n");
+				pr_err("Slot %X: sResource is not a board resource!\n",
+				       slot);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, 8);
 			break;
 		}
 		case NUBUS_RESID_NAME:
-			nubus_get_rsrc_str(board->name, &ent, 64);
-			pr_info("    name: %s\n", board->name);
+		{
+			unsigned int len;
+
+			len = nubus_get_rsrc_str(board->name, &ent,
+						 sizeof(board->name));
+			pr_debug("    name: %s\n", board->name);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, len + 1);
 			break;
+		}
 		case NUBUS_RESID_ICON:
-			nubus_get_icon(board, &ent);
+			nubus_get_icon(board, dir.procdir, &ent);
 			break;
 		case NUBUS_RESID_BOARDID:
-			pr_info("    board id: 0x%x\n", ent.data);
+			pr_debug("    board id: 0x%x\n", ent.data);
+			nubus_proc_add_rsrc(dir.procdir, &ent);
 			break;
 		case NUBUS_RESID_PRIMARYINIT:
-			pr_info("    primary init offset: 0x%06x\n", ent.data);
+			pr_debug("    primary init offset: 0x%06x\n", ent.data);
+			nubus_proc_add_rsrc(dir.procdir, &ent);
 			break;
 		case NUBUS_RESID_VENDORINFO:
-			nubus_get_vendorinfo(board, &ent);
+			nubus_get_vendorinfo(board, dir.procdir, &ent);
 			break;
 		case NUBUS_RESID_FLAGS:
-			pr_info("    flags: 0x%06x\n", ent.data);
+			pr_debug("    flags: 0x%06x\n", ent.data);
+			nubus_proc_add_rsrc(dir.procdir, &ent);
 			break;
 		case NUBUS_RESID_HWDEVID:
-			pr_info("    hwdevid: 0x%06x\n", ent.data);
+			pr_debug("    hwdevid: 0x%06x\n", ent.data);
+			nubus_proc_add_rsrc(dir.procdir, &ent);
 			break;
 		case NUBUS_RESID_SECONDINIT:
-			pr_info("    secondary init offset: 0x%06x\n", ent.data);
+			pr_debug("    secondary init offset: 0x%06x\n",
+				 ent.data);
+			nubus_proc_add_rsrc(dir.procdir, &ent);
 			break;
 			/* WTF isn't this in the functional resources? */
 		case NUBUS_RESID_VIDNAMES:
-			nubus_get_vidnames(board, &ent);
+			pr_debug("    vidnames directory offset: 0x%06x\n",
+				ent.data);
+			nubus_get_block_rsrc_dir(board, dir.procdir, &ent);
 			break;
 			/* Same goes for this */
 		case NUBUS_RESID_VIDMODES:
-			pr_info("    video mode parameter directory offset: 0x%06x\n",
-			       ent.data);
+			pr_debug("    video mode parameter directory offset: 0x%06x\n",
+				ent.data);
+			nubus_proc_add_rsrc(dir.procdir, &ent);
 			break;
 		default:
-			pr_info("    unknown resource %02X, data 0x%06x\n",
-			       ent.type, ent.data);
+			pr_debug("    unknown resource 0x%02x, data 0x%06x\n",
+				ent.type, ent.data);
+			nubus_proc_add_rsrc_mem(dir.procdir, &ent, 0);
 		}
 	}
 	return 0;
 }
 
-/* Add a board (might be many devices) to the list */
-static struct nubus_board * __init nubus_add_board(int slot, int bytelanes)
+static void __init nubus_add_board(int slot, int bytelanes)
 {
 	struct nubus_board *board;
-	struct nubus_board **boardp;
 	unsigned char *rp;
 	unsigned long dpat;
 	struct nubus_dir dir;
 	struct nubus_dirent ent;
+	int prev_resid = -1;
 
 	/* Move to the start of the format block */
 	rp = nubus_rom_addr(slot);
@@ -690,19 +729,19 @@ static struct nubus_board * __init nubus_add_board(int slot, int bytelanes)
 
 	/* Actually we should probably panic if this fails */
 	if ((board = kzalloc(sizeof(*board), GFP_ATOMIC)) == NULL)
-		return NULL;
+		return;
 	board->fblock = rp;
 
 	/* Dump the format block for debugging purposes */
 	pr_debug("Slot %X, format block at 0x%p:\n", slot, rp);
+	pr_debug("%08lx\n", nubus_get_rom(&rp, 4, bytelanes));
+	pr_debug("%08lx\n", nubus_get_rom(&rp, 4, bytelanes));
+	pr_debug("%08lx\n", nubus_get_rom(&rp, 4, bytelanes));
 	pr_debug("%02lx\n", nubus_get_rom(&rp, 1, bytelanes));
 	pr_debug("%02lx\n", nubus_get_rom(&rp, 1, bytelanes));
 	pr_debug("%08lx\n", nubus_get_rom(&rp, 4, bytelanes));
 	pr_debug("%02lx\n", nubus_get_rom(&rp, 1, bytelanes));
 	pr_debug("%02lx\n", nubus_get_rom(&rp, 1, bytelanes));
-	pr_debug("%08lx\n", nubus_get_rom(&rp, 4, bytelanes));
-	pr_debug("%08lx\n", nubus_get_rom(&rp, 4, bytelanes));
-	pr_debug("%08lx\n", nubus_get_rom(&rp, 4, bytelanes));
 	rp = board->fblock;
 
 	board->slot = slot;
@@ -722,10 +761,10 @@ static struct nubus_board * __init nubus_add_board(int slot, int bytelanes)
 
 	/* Directory offset should be small and negative... */
 	if (!(board->doffset & 0x00FF0000))
-		pr_warn("Dodgy doffset!\n");
+		pr_warn("Slot %X: Dodgy doffset!\n", slot);
 	dpat = nubus_get_rom(&rp, 4, bytelanes);
 	if (dpat != NUBUS_TEST_PATTERN)
-		pr_warn("Wrong test pattern %08lx!\n", dpat);
+		pr_warn("Slot %X: Wrong test pattern %08lx!\n", slot, dpat);
 
 	/*
 	 *	I wonder how the CRC is meant to work -
@@ -742,53 +781,52 @@ static struct nubus_board * __init nubus_add_board(int slot, int bytelanes)
 	nubus_get_root_dir(board, &dir);
 
 	/* We're ready to rock */
-	pr_info("Slot %X:\n", slot);
+	pr_debug("Slot %X resources:\n", slot);
 
 	/* Each slot should have one board resource and any number of
-	   functional resources.  So we'll fill in some fields in the
-	   struct nubus_board from the board resource, then walk down
-	   the list of functional resources, spinning out a nubus_dev
-	   for each of them. */
+	 * functional resources.  So we'll fill in some fields in the
+	 * struct nubus_board from the board resource, then walk down
+	 * the list of functional resources, spinning out a nubus_rsrc
+	 * for each of them.
+	 */
 	if (nubus_readdir(&dir, &ent) == -1) {
 		/* We can't have this! */
-		pr_err("Board resource not found!\n");
-		return NULL;
-	} else {
-		pr_info("  Board resource:\n");
-		nubus_get_board_resource(board, slot, &ent);
+		pr_err("Slot %X: Board resource not found!\n", slot);
+		kfree(board);
+		return;
 	}
 
+	if (ent.type < 1 || ent.type > 127)
+		pr_warn("Slot %X: Board resource ID is invalid!\n", slot);
+
+	board->procdir = nubus_proc_add_board(board);
+
+	nubus_get_board_resource(board, slot, &ent);
+
 	while (nubus_readdir(&dir, &ent) != -1) {
-		struct nubus_dev *dev;
-		struct nubus_dev **devp;
+		struct nubus_rsrc *fres;
 
-		dev = nubus_get_functional_resource(board, slot, &ent);
-		if (dev == NULL)
+		fres = nubus_get_functional_resource(board, slot, &ent);
+		if (fres == NULL)
 			continue;
 
-		/* We zeroed this out above */
-		if (board->first_dev == NULL)
-			board->first_dev = dev;
+		/* Resources should appear in ascending ID order. This sanity
+		 * check prevents duplicate resource IDs.
+		 */
+		if (fres->resid <= prev_resid) {
+			kfree(fres);
+			continue;
+		}
+		prev_resid = fres->resid;
 
-		/* Put it on the global NuBus device chain. Keep entries in order. */
-		for (devp = &nubus_devices; *devp != NULL;
-		     devp = &((*devp)->next))
-			/* spin */;
-		*devp = dev;
-		dev->next = NULL;
+		list_add_tail(&fres->list, &nubus_func_rsrcs);
 	}
 
-	/* Put it on the global NuBus board chain. Keep entries in order. */
-	for (boardp = &nubus_boards; *boardp != NULL;
-	     boardp = &((*boardp)->next))
-		/* spin */;
-	*boardp = board;
-	board->next = NULL;
-
-	return board;
+	if (nubus_device_register(board))
+		put_device(&board->dev);
 }
 
-void __init nubus_probe_slot(int slot)
+static void __init nubus_probe_slot(int slot)
 {
 	unsigned char dp;
 	unsigned char *rp;
@@ -796,11 +834,8 @@ void __init nubus_probe_slot(int slot)
 
 	rp = nubus_rom_addr(slot);
 	for (i = 4; i; i--) {
-		int card_present;
-
 		rp--;
-		card_present = hwreg_present(rp);
-		if (!card_present)
+		if (!hwreg_present(rp))
 			continue;
 
 		dp = *rp;
@@ -822,10 +857,11 @@ void __init nubus_probe_slot(int slot)
 	}
 }
 
-void __init nubus_scan_bus(void)
+static void __init nubus_scan_bus(void)
 {
 	int slot;
 
+	pr_info("NuBus: Scanning NuBus slots.\n");
 	for (slot = 9; slot < 15; slot++) {
 		nubus_probe_slot(slot);
 	}
@@ -833,14 +869,16 @@ void __init nubus_scan_bus(void)
 
 static int __init nubus_init(void)
 {
+	int err;
+
 	if (!MACH_IS_MAC)
 		return 0;
 
-	pr_info("NuBus: Scanning NuBus slots.\n");
-	nubus_devices = NULL;
-	nubus_boards = NULL;
-	nubus_scan_bus();
 	nubus_proc_init();
+	err = nubus_bus_register();
+	if (err)
+		return err;
+	nubus_scan_bus();
 	return 0;
 }
 
diff --git a/drivers/nubus/proc.c b/drivers/nubus/proc.c
index 004a122..c2e5a7e 100644
--- a/drivers/nubus/proc.c
+++ b/drivers/nubus/proc.c
@@ -11,39 +11,37 @@
    structure in /proc analogous to the structure of the NuBus ROM
    resources.
 
-   Therefore each NuBus device is in fact a directory, which may in
-   turn contain subdirectories.  The "files" correspond to NuBus
-   resource records.  For those types of records which we know how to
-   convert to formats that are meaningful to userspace (mostly just
-   icons) these files will provide "cooked" data.  Otherwise they will
-   simply provide raw access (read-only of course) to the ROM.  */
+   Therefore each board function gets a directory, which may in turn
+   contain subdirectories.  Each slot resource is a file.  Unrecognized
+   resources are empty files, since every resource ID requires a special
+   case (e.g. if the resource ID implies a directory or block, then its
+   value has to be interpreted as a slot ROM pointer etc.).
+ */
 
 #include <linux/types.h>
 #include <linux/kernel.h>
 #include <linux/nubus.h>
 #include <linux/proc_fs.h>
 #include <linux/seq_file.h>
+#include <linux/slab.h>
 #include <linux/init.h>
 #include <linux/module.h>
-
 #include <linux/uaccess.h>
 #include <asm/byteorder.h>
 
+/*
+ * /proc/bus/nubus/devices stuff
+ */
+
 static int
 nubus_devices_proc_show(struct seq_file *m, void *v)
 {
-	struct nubus_dev *dev = nubus_devices;
+	struct nubus_rsrc *fres;
 
-	while (dev) {
-		seq_printf(m, "%x\t%04x %04x %04x %04x",
-			      dev->board->slot,
-			      dev->category,
-			      dev->type,
-			      dev->dr_sw,
-			      dev->dr_hw);
-		seq_printf(m, "\t%08lx\n", dev->board->slot_addr);
-		dev = dev->next;
-	}
+	for_each_func_rsrc(fres)
+		seq_printf(m, "%x\t%04x %04x %04x %04x\t%08lx\n",
+			   fres->board->slot, fres->category, fres->type,
+			   fres->dr_sw, fres->dr_hw, fres->board->slot_addr);
 	return 0;
 }
 
@@ -61,174 +59,163 @@ static const struct file_operations nubus_devices_proc_fops = {
 
 static struct proc_dir_entry *proc_bus_nubus_dir;
 
-static const struct file_operations nubus_proc_subdir_fops = {
-#warning Need to set some I/O handlers here
+/*
+ * /proc/bus/nubus/x/ stuff
+ */
+
+struct proc_dir_entry *nubus_proc_add_board(struct nubus_board *board)
+{
+	char name[2];
+
+	if (!proc_bus_nubus_dir)
+		return NULL;
+	snprintf(name, sizeof(name), "%x", board->slot);
+	return proc_mkdir(name, proc_bus_nubus_dir);
+}
+
+/* The PDE private data for any directory under /proc/bus/nubus/x/
+ * is the bytelanes value for the board in slot x.
+ */
+
+struct proc_dir_entry *nubus_proc_add_rsrc_dir(struct proc_dir_entry *procdir,
+					       const struct nubus_dirent *ent,
+					       struct nubus_board *board)
+{
+	char name[9];
+	int lanes = board->lanes;
+
+	if (!procdir)
+		return NULL;
+	snprintf(name, sizeof(name), "%x", ent->type);
+	return proc_mkdir_data(name, 0555, procdir, (void *)lanes);
+}
+
+/* The PDE private data for a file under /proc/bus/nubus/x/ is a pointer to
+ * an instance of the following structure, which gives the location and size
+ * of the resource data in the slot ROM. For slot resources which hold only a
+ * small integer, this integer value is stored directly and size is set to 0.
+ * A NULL private data pointer indicates an unrecognized resource.
+ */
+
+struct nubus_proc_pde_data {
+	unsigned char *res_ptr;
+	unsigned int res_size;
 };
 
-static void nubus_proc_subdir(struct nubus_dev* dev,
-			      struct proc_dir_entry* parent,
-			      struct nubus_dir* dir)
+static struct nubus_proc_pde_data *
+nubus_proc_alloc_pde_data(unsigned char *ptr, unsigned int size)
 {
-	struct nubus_dirent ent;
+	struct nubus_proc_pde_data *pde_data;
 
-	/* Some of these are directories, others aren't */
-	while (nubus_readdir(dir, &ent) != -1) {
-		char name[8];
-		struct proc_dir_entry* e;
-		
-		sprintf(name, "%x", ent.type);
-		e = proc_create(name, S_IFREG | S_IRUGO | S_IWUSR, parent,
-				&nubus_proc_subdir_fops);
-		if (!e)
-			return;
-	}
+	pde_data = kmalloc(sizeof(*pde_data), GFP_KERNEL);
+	if (!pde_data)
+		return NULL;
+
+	pde_data->res_ptr = ptr;
+	pde_data->res_size = size;
+	return pde_data;
 }
 
-/* Can't do this recursively since the root directory is structured
-   somewhat differently from the subdirectories */
-static void nubus_proc_populate(struct nubus_dev* dev,
-				struct proc_dir_entry* parent,
-				struct nubus_dir* root)
+static int nubus_proc_rsrc_show(struct seq_file *m, void *v)
 {
-	struct nubus_dirent ent;
+	struct inode *inode = m->private;
+	struct nubus_proc_pde_data *pde_data;
 
-	/* We know these are all directories (board resource + one or
-	   more functional resources) */
-	while (nubus_readdir(root, &ent) != -1) {
-		char name[8];
-		struct proc_dir_entry* e;
-		struct nubus_dir dir;
-		
-		sprintf(name, "%x", ent.type);
-		e = proc_mkdir(name, parent);
-		if (!e) return;
+	pde_data = PDE_DATA(inode);
+	if (!pde_data)
+		return 0;
 
-		/* And descend */
-		if (nubus_get_subdir(&ent, &dir) == -1) {
-			/* This shouldn't happen */
-			printk(KERN_ERR "NuBus root directory node %x:%x has no subdir!\n",
-			       dev->board->slot, ent.type);
-			continue;
-		} else {
-			nubus_proc_subdir(dev, e, &dir);
-		}
+	if (pde_data->res_size > m->size)
+		return -EFBIG;
+
+	if (pde_data->res_size) {
+		int lanes = (int)proc_get_parent_data(inode);
+		struct nubus_dirent ent;
+
+		if (!lanes)
+			return 0;
+
+		ent.mask = lanes;
+		ent.base = pde_data->res_ptr;
+		ent.data = 0;
+		nubus_seq_write_rsrc_mem(m, &ent, pde_data->res_size);
+	} else {
+		unsigned int data = (unsigned int)pde_data->res_ptr;
+
+		seq_putc(m, data >> 16);
+		seq_putc(m, data >> 8);
+		seq_putc(m, data >> 0);
 	}
-}
-
-int nubus_proc_attach_device(struct nubus_dev *dev)
-{
-	struct proc_dir_entry *e;
-	struct nubus_dir root;
-	char name[8];
-
-	if (dev == NULL) {
-		printk(KERN_ERR
-		       "NULL pointer in nubus_proc_attach_device, shoot the programmer!\n");
-		return -1;
-	}
-		
-	if (dev->board == NULL) {
-		printk(KERN_ERR
-		       "NULL pointer in nubus_proc_attach_device, shoot the programmer!\n");
-		printk("dev = %p, dev->board = %p\n", dev, dev->board);
-		return -1;
-	}
-		
-	/* Create a directory */
-	sprintf(name, "%x", dev->board->slot);
-	e = dev->procdir = proc_mkdir(name, proc_bus_nubus_dir);
-	if (!e)
-		return -ENOMEM;
-
-	/* Now recursively populate it with files */
-	nubus_get_root_dir(dev->board, &root);
-	nubus_proc_populate(dev, e, &root);
-
 	return 0;
 }
-EXPORT_SYMBOL(nubus_proc_attach_device);
+
+static int nubus_proc_rsrc_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, nubus_proc_rsrc_show, inode);
+}
+
+static const struct file_operations nubus_proc_rsrc_fops = {
+	.open		= nubus_proc_rsrc_open,
+	.read		= seq_read,
+	.llseek		= seq_lseek,
+	.release	= single_release,
+};
+
+void nubus_proc_add_rsrc_mem(struct proc_dir_entry *procdir,
+			     const struct nubus_dirent *ent,
+			     unsigned int size)
+{
+	char name[9];
+	struct nubus_proc_pde_data *pde_data;
+
+	if (!procdir)
+		return;
+
+	snprintf(name, sizeof(name), "%x", ent->type);
+	if (size)
+		pde_data = nubus_proc_alloc_pde_data(nubus_dirptr(ent), size);
+	else
+		pde_data = NULL;
+	proc_create_data(name, S_IFREG | 0444, procdir,
+			 &nubus_proc_rsrc_fops, pde_data);
+}
+
+void nubus_proc_add_rsrc(struct proc_dir_entry *procdir,
+			 const struct nubus_dirent *ent)
+{
+	char name[9];
+	unsigned char *data = (unsigned char *)ent->data;
+
+	if (!procdir)
+		return;
+
+	snprintf(name, sizeof(name), "%x", ent->type);
+	proc_create_data(name, S_IFREG | 0444, procdir,
+			 &nubus_proc_rsrc_fops,
+			 nubus_proc_alloc_pde_data(data, 0));
+}
 
 /*
  * /proc/nubus stuff
  */
-static int nubus_proc_show(struct seq_file *m, void *v)
-{
-	const struct nubus_board *board = v;
-
-	/* Display header on line 1 */
-	if (v == SEQ_START_TOKEN)
-		seq_puts(m, "Nubus devices found:\n");
-	else
-		seq_printf(m, "Slot %X: %s\n", board->slot, board->name);
-	return 0;
-}
-
-static void *nubus_proc_start(struct seq_file *m, loff_t *_pos)
-{
-	struct nubus_board *board;
-	unsigned pos;
-
-	if (*_pos > LONG_MAX)
-		return NULL;
-	pos = *_pos;
-	if (pos == 0)
-		return SEQ_START_TOKEN;
-	for (board = nubus_boards; board; board = board->next)
-		if (--pos == 0)
-			break;
-	return board;
-}
-
-static void *nubus_proc_next(struct seq_file *p, void *v, loff_t *_pos)
-{
-	/* Walk the list of NuBus boards */
-	struct nubus_board *board = v;
-
-	++*_pos;
-	if (v == SEQ_START_TOKEN)
-		board = nubus_boards;
-	else if (board)
-		board = board->next;
-	return board;
-}
-
-static void nubus_proc_stop(struct seq_file *p, void *v)
-{
-}
-
-static const struct seq_operations nubus_proc_seqops = {
-	.start	= nubus_proc_start,
-	.next	= nubus_proc_next,
-	.stop	= nubus_proc_stop,
-	.show	= nubus_proc_show,
-};
 
 static int nubus_proc_open(struct inode *inode, struct file *file)
 {
-	return seq_open(file, &nubus_proc_seqops);
+	return single_open(file, nubus_proc_show, NULL);
 }
 
 static const struct file_operations nubus_proc_fops = {
 	.open		= nubus_proc_open,
 	.read		= seq_read,
 	.llseek		= seq_lseek,
-	.release	= seq_release,
+	.release	= single_release,
 };
 
-void __init proc_bus_nubus_add_devices(void)
-{
-	struct nubus_dev *dev;
-	
-	for(dev = nubus_devices; dev; dev = dev->next)
-		nubus_proc_attach_device(dev);
-}
-
 void __init nubus_proc_init(void)
 {
 	proc_create("nubus", 0, NULL, &nubus_proc_fops);
-	if (!MACH_IS_MAC)
-		return;
 	proc_bus_nubus_dir = proc_mkdir("bus/nubus", NULL);
+	if (!proc_bus_nubus_dir)
+		return;
 	proc_create("devices", 0, proc_bus_nubus_dir, &nubus_devices_proc_fops);
-	proc_bus_nubus_add_devices();
 }
diff --git a/drivers/nvme/host/Makefile b/drivers/nvme/host/Makefile
index a25fd43..441e67e 100644
--- a/drivers/nvme/host/Makefile
+++ b/drivers/nvme/host/Makefile
@@ -1,4 +1,7 @@
 # SPDX-License-Identifier: GPL-2.0
+
+ccflags-y				+= -I$(src)
+
 obj-$(CONFIG_NVME_CORE)			+= nvme-core.o
 obj-$(CONFIG_BLK_DEV_NVME)		+= nvme.o
 obj-$(CONFIG_NVME_FABRICS)		+= nvme-fabrics.o
@@ -6,6 +9,7 @@
 obj-$(CONFIG_NVME_FC)			+= nvme-fc.o
 
 nvme-core-y				:= core.o
+nvme-core-$(CONFIG_TRACING)		+= trace.o
 nvme-core-$(CONFIG_NVME_MULTIPATH)	+= multipath.o
 nvme-core-$(CONFIG_NVM)			+= lightnvm.o
 
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 1e46e60..e8104871 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -29,6 +29,9 @@
 #include <linux/pm_qos.h>
 #include <asm/unaligned.h>
 
+#define CREATE_TRACE_POINTS
+#include "trace.h"
+
 #include "nvme.h"
 #include "fabrics.h"
 
@@ -65,9 +68,26 @@ static bool streams;
 module_param(streams, bool, 0644);
 MODULE_PARM_DESC(streams, "turn on support for Streams write directives");
 
+/*
+ * nvme_wq - hosts nvme related works that are not reset or delete
+ * nvme_reset_wq - hosts nvme reset works
+ * nvme_delete_wq - hosts nvme delete works
+ *
+ * nvme_wq will host works such are scan, aen handling, fw activation,
+ * keep-alive error recovery, periodic reconnects etc. nvme_reset_wq
+ * runs reset works which also flush works hosted on nvme_wq for
+ * serialization purposes. nvme_delete_wq host controller deletion
+ * works which flush reset works for serialization.
+ */
 struct workqueue_struct *nvme_wq;
 EXPORT_SYMBOL_GPL(nvme_wq);
 
+struct workqueue_struct *nvme_reset_wq;
+EXPORT_SYMBOL_GPL(nvme_reset_wq);
+
+struct workqueue_struct *nvme_delete_wq;
+EXPORT_SYMBOL_GPL(nvme_delete_wq);
+
 static DEFINE_IDA(nvme_subsystems_ida);
 static LIST_HEAD(nvme_subsystems);
 static DEFINE_MUTEX(nvme_subsystems_lock);
@@ -89,13 +109,13 @@ int nvme_reset_ctrl(struct nvme_ctrl *ctrl)
 {
 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING))
 		return -EBUSY;
-	if (!queue_work(nvme_wq, &ctrl->reset_work))
+	if (!queue_work(nvme_reset_wq, &ctrl->reset_work))
 		return -EBUSY;
 	return 0;
 }
 EXPORT_SYMBOL_GPL(nvme_reset_ctrl);
 
-static int nvme_reset_ctrl_sync(struct nvme_ctrl *ctrl)
+int nvme_reset_ctrl_sync(struct nvme_ctrl *ctrl)
 {
 	int ret;
 
@@ -104,6 +124,7 @@ static int nvme_reset_ctrl_sync(struct nvme_ctrl *ctrl)
 		flush_work(&ctrl->reset_work);
 	return ret;
 }
+EXPORT_SYMBOL_GPL(nvme_reset_ctrl_sync);
 
 static void nvme_delete_ctrl_work(struct work_struct *work)
 {
@@ -122,7 +143,7 @@ int nvme_delete_ctrl(struct nvme_ctrl *ctrl)
 {
 	if (!nvme_change_ctrl_state(ctrl, NVME_CTRL_DELETING))
 		return -EBUSY;
-	if (!queue_work(nvme_wq, &ctrl->delete_work))
+	if (!queue_work(nvme_delete_wq, &ctrl->delete_work))
 		return -EBUSY;
 	return 0;
 }
@@ -157,13 +178,20 @@ static blk_status_t nvme_error_status(struct request *req)
 		return BLK_STS_OK;
 	case NVME_SC_CAP_EXCEEDED:
 		return BLK_STS_NOSPC;
+	case NVME_SC_LBA_RANGE:
+		return BLK_STS_TARGET;
+	case NVME_SC_BAD_ATTRIBUTES:
 	case NVME_SC_ONCS_NOT_SUPPORTED:
+	case NVME_SC_INVALID_OPCODE:
+	case NVME_SC_INVALID_FIELD:
+	case NVME_SC_INVALID_NS:
 		return BLK_STS_NOTSUPP;
 	case NVME_SC_WRITE_FAULT:
 	case NVME_SC_READ_ERROR:
 	case NVME_SC_UNWRITTEN_BLOCK:
 	case NVME_SC_ACCESS_DENIED:
 	case NVME_SC_READ_ONLY:
+	case NVME_SC_COMPARE_FAILED:
 		return BLK_STS_MEDIUM;
 	case NVME_SC_GUARD_CHECK:
 	case NVME_SC_APPTAG_CHECK:
@@ -190,8 +218,12 @@ static inline bool nvme_req_needs_retry(struct request *req)
 
 void nvme_complete_rq(struct request *req)
 {
-	if (unlikely(nvme_req(req)->status && nvme_req_needs_retry(req))) {
-		if (nvme_req_needs_failover(req)) {
+	blk_status_t status = nvme_error_status(req);
+
+	trace_nvme_complete_rq(req);
+
+	if (unlikely(status != BLK_STS_OK && nvme_req_needs_retry(req))) {
+		if (nvme_req_needs_failover(req, status)) {
 			nvme_failover_req(req);
 			return;
 		}
@@ -202,8 +234,7 @@ void nvme_complete_rq(struct request *req)
 			return;
 		}
 	}
-
-	blk_mq_end_request(req, nvme_error_status(req));
+	blk_mq_end_request(req, status);
 }
 EXPORT_SYMBOL_GPL(nvme_complete_rq);
 
@@ -232,6 +263,15 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 
 	old_state = ctrl->state;
 	switch (new_state) {
+	case NVME_CTRL_ADMIN_ONLY:
+		switch (old_state) {
+		case NVME_CTRL_RECONNECTING:
+			changed = true;
+			/* FALLTHRU */
+		default:
+			break;
+		}
+		break;
 	case NVME_CTRL_LIVE:
 		switch (old_state) {
 		case NVME_CTRL_NEW:
@@ -247,6 +287,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 		switch (old_state) {
 		case NVME_CTRL_NEW:
 		case NVME_CTRL_LIVE:
+		case NVME_CTRL_ADMIN_ONLY:
 			changed = true;
 			/* FALLTHRU */
 		default:
@@ -266,6 +307,7 @@ bool nvme_change_ctrl_state(struct nvme_ctrl *ctrl,
 	case NVME_CTRL_DELETING:
 		switch (old_state) {
 		case NVME_CTRL_LIVE:
+		case NVME_CTRL_ADMIN_ONLY:
 		case NVME_CTRL_RESETTING:
 		case NVME_CTRL_RECONNECTING:
 			changed = true;
@@ -591,6 +633,10 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
 	}
 
 	cmd->common.command_id = req->tag;
+	if (ns)
+		trace_nvme_setup_nvm_cmd(req->q->id, cmd);
+	else
+		trace_nvme_setup_admin_cmd(cmd);
 	return ret;
 }
 EXPORT_SYMBOL_GPL(nvme_setup_cmd);
@@ -1217,16 +1263,27 @@ static int nvme_open(struct block_device *bdev, fmode_t mode)
 #ifdef CONFIG_NVME_MULTIPATH
 	/* should never be called due to GENHD_FL_HIDDEN */
 	if (WARN_ON_ONCE(ns->head->disk))
-		return -ENXIO;
+		goto fail;
 #endif
 	if (!kref_get_unless_zero(&ns->kref))
-		return -ENXIO;
+		goto fail;
+	if (!try_module_get(ns->ctrl->ops->module))
+		goto fail_put_ns;
+
 	return 0;
+
+fail_put_ns:
+	nvme_put_ns(ns);
+fail:
+	return -ENXIO;
 }
 
 static void nvme_release(struct gendisk *disk, fmode_t mode)
 {
-	nvme_put_ns(disk->private_data);
+	struct nvme_ns *ns = disk->private_data;
+
+	module_put(ns->ctrl->ops->module);
+	nvme_put_ns(ns);
 }
 
 static int nvme_getgeo(struct block_device *bdev, struct hd_geometry *geo)
@@ -1335,6 +1392,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 		struct nvme_ns *ns, struct nvme_id_ns *id)
 {
 	sector_t capacity = le64_to_cpup(&id->nsze) << (ns->lba_shift - 9);
+	unsigned short bs = 1 << ns->lba_shift;
 	unsigned stream_alignment = 0;
 
 	if (ns->ctrl->nr_streams && ns->sws && ns->sgs)
@@ -1343,7 +1401,10 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	blk_mq_freeze_queue(disk->queue);
 	blk_integrity_unregister(disk);
 
-	blk_queue_logical_block_size(disk->queue, 1 << ns->lba_shift);
+	blk_queue_logical_block_size(disk->queue, bs);
+	blk_queue_physical_block_size(disk->queue, bs);
+	blk_queue_io_min(disk->queue, bs);
+
 	if (ns->ms && !ns->ext &&
 	    (ns->ctrl->ops->flags & NVME_F_METADATA_SUPPORTED))
 		nvme_init_integrity(disk, ns->ms, ns->pi_type);
@@ -2048,6 +2109,22 @@ static const struct attribute_group *nvme_subsys_attrs_groups[] = {
 	NULL,
 };
 
+static int nvme_active_ctrls(struct nvme_subsystem *subsys)
+{
+	int count = 0;
+	struct nvme_ctrl *ctrl;
+
+	mutex_lock(&subsys->lock);
+	list_for_each_entry(ctrl, &subsys->ctrls, subsys_entry) {
+		if (ctrl->state != NVME_CTRL_DELETING &&
+		    ctrl->state != NVME_CTRL_DEAD)
+			count++;
+	}
+	mutex_unlock(&subsys->lock);
+
+	return count;
+}
+
 static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
 {
 	struct nvme_subsystem *subsys, *found;
@@ -2086,7 +2163,7 @@ static int nvme_init_subsystem(struct nvme_ctrl *ctrl, struct nvme_id_ctrl *id)
 		 * Verify that the subsystem actually supports multiple
 		 * controllers, else bail out.
 		 */
-		if (!(id->cmic & (1 << 1))) {
+		if (nvme_active_ctrls(found) && !(id->cmic & (1 << 1))) {
 			dev_err(ctrl->device,
 				"ignoring ctrl due to duplicate subnqn (%s).\n",
 				found->subnqn);
@@ -2253,7 +2330,7 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
 						 shutdown_timeout, 60);
 
 		if (ctrl->shutdown_timeout != shutdown_timeout)
-			dev_warn(ctrl->device,
+			dev_info(ctrl->device,
 				 "Shutdown timeout set to %u seconds\n",
 				 ctrl->shutdown_timeout);
 	} else
@@ -2337,8 +2414,14 @@ static int nvme_dev_open(struct inode *inode, struct file *file)
 	struct nvme_ctrl *ctrl =
 		container_of(inode->i_cdev, struct nvme_ctrl, cdev);
 
-	if (ctrl->state != NVME_CTRL_LIVE)
+	switch (ctrl->state) {
+	case NVME_CTRL_LIVE:
+	case NVME_CTRL_ADMIN_ONLY:
+		break;
+	default:
 		return -EWOULDBLOCK;
+	}
+
 	file->private_data = ctrl;
 	return 0;
 }
@@ -2602,6 +2685,7 @@ static ssize_t nvme_sysfs_show_state(struct device *dev,
 	static const char *const state_name[] = {
 		[NVME_CTRL_NEW]		= "new",
 		[NVME_CTRL_LIVE]	= "live",
+		[NVME_CTRL_ADMIN_ONLY]	= "only-admin",
 		[NVME_CTRL_RESETTING]	= "resetting",
 		[NVME_CTRL_RECONNECTING]= "reconnecting",
 		[NVME_CTRL_DELETING]	= "deleting",
@@ -2987,6 +3071,7 @@ static void nvme_ns_remove(struct nvme_ns *ns)
 	mutex_unlock(&ns->ctrl->namespaces_mutex);
 
 	synchronize_srcu(&ns->head->srcu);
+	nvme_mpath_check_last_path(ns);
 	nvme_put_ns(ns);
 }
 
@@ -3074,6 +3159,8 @@ static void nvme_scan_work(struct work_struct *work)
 	if (ctrl->state != NVME_CTRL_LIVE)
 		return;
 
+	WARN_ON_ONCE(!ctrl->tagset);
+
 	if (nvme_identify_ctrl(ctrl, &id))
 		return;
 
@@ -3094,8 +3181,7 @@ static void nvme_scan_work(struct work_struct *work)
 void nvme_queue_scan(struct nvme_ctrl *ctrl)
 {
 	/*
-	 * Do not queue new scan work when a controller is reset during
-	 * removal.
+	 * Only new queue scan work when admin and IO queues are both alive
 	 */
 	if (ctrl->state == NVME_CTRL_LIVE)
 		queue_work(nvme_wq, &ctrl->scan_work);
@@ -3472,16 +3558,26 @@ EXPORT_SYMBOL_GPL(nvme_reinit_tagset);
 
 int __init nvme_core_init(void)
 {
-	int result;
+	int result = -ENOMEM;
 
 	nvme_wq = alloc_workqueue("nvme-wq",
 			WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS, 0);
 	if (!nvme_wq)
-		return -ENOMEM;
+		goto out;
+
+	nvme_reset_wq = alloc_workqueue("nvme-reset-wq",
+			WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS, 0);
+	if (!nvme_reset_wq)
+		goto destroy_wq;
+
+	nvme_delete_wq = alloc_workqueue("nvme-delete-wq",
+			WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS, 0);
+	if (!nvme_delete_wq)
+		goto destroy_reset_wq;
 
 	result = alloc_chrdev_region(&nvme_chr_devt, 0, NVME_MINORS, "nvme");
 	if (result < 0)
-		goto destroy_wq;
+		goto destroy_delete_wq;
 
 	nvme_class = class_create(THIS_MODULE, "nvme");
 	if (IS_ERR(nvme_class)) {
@@ -3500,8 +3596,13 @@ int __init nvme_core_init(void)
 	class_destroy(nvme_class);
 unregister_chrdev:
 	unregister_chrdev_region(nvme_chr_devt, NVME_MINORS);
+destroy_delete_wq:
+	destroy_workqueue(nvme_delete_wq);
+destroy_reset_wq:
+	destroy_workqueue(nvme_reset_wq);
 destroy_wq:
 	destroy_workqueue(nvme_wq);
+out:
 	return result;
 }
 
@@ -3511,6 +3612,8 @@ void nvme_core_exit(void)
 	class_destroy(nvme_subsys_class);
 	class_destroy(nvme_class);
 	unregister_chrdev_region(nvme_chr_devt, NVME_MINORS);
+	destroy_workqueue(nvme_delete_wq);
+	destroy_workqueue(nvme_reset_wq);
 	destroy_workqueue(nvme_wq);
 }
 
diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index 76b4fe6..5dd4cee 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -74,6 +74,7 @@ static struct nvmf_host *nvmf_host_default(void)
 		return NULL;
 
 	kref_init(&host->ref);
+	uuid_gen(&host->id);
 	snprintf(host->nqn, NVMF_NQN_SIZE,
 		"nqn.2014-08.org.nvmexpress:uuid:%pUb", &host->id);
 
@@ -492,7 +493,7 @@ EXPORT_SYMBOL_GPL(nvmf_should_reconnect);
  */
 int nvmf_register_transport(struct nvmf_transport_ops *ops)
 {
-	if (!ops->create_ctrl)
+	if (!ops->create_ctrl || !ops->module)
 		return -EINVAL;
 
 	down_write(&nvmf_transports_rwsem);
@@ -738,11 +739,14 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
 				ret = -ENOMEM;
 				goto out;
 			}
-			if (uuid_parse(p, &hostid)) {
+			ret = uuid_parse(p, &hostid);
+			if (ret) {
 				pr_err("Invalid hostid %s\n", p);
 				ret = -EINVAL;
+				kfree(p);
 				goto out;
 			}
+			kfree(p);
 			break;
 		case NVMF_OPT_DUP_CONNECT:
 			opts->duplicate_connect = true;
@@ -868,32 +872,41 @@ nvmf_create_ctrl(struct device *dev, const char *buf, size_t count)
 		goto out_unlock;
 	}
 
+	if (!try_module_get(ops->module)) {
+		ret = -EBUSY;
+		goto out_unlock;
+	}
+
 	ret = nvmf_check_required_opts(opts, ops->required_opts);
 	if (ret)
-		goto out_unlock;
+		goto out_module_put;
 	ret = nvmf_check_allowed_opts(opts, NVMF_ALLOWED_OPTS |
 				ops->allowed_opts | ops->required_opts);
 	if (ret)
-		goto out_unlock;
+		goto out_module_put;
 
 	ctrl = ops->create_ctrl(dev, opts);
 	if (IS_ERR(ctrl)) {
 		ret = PTR_ERR(ctrl);
-		goto out_unlock;
+		goto out_module_put;
 	}
 
 	if (strcmp(ctrl->subsys->subnqn, opts->subsysnqn)) {
 		dev_warn(ctrl->device,
 			"controller returned incorrect NQN: \"%s\".\n",
 			ctrl->subsys->subnqn);
+		module_put(ops->module);
 		up_read(&nvmf_transports_rwsem);
 		nvme_delete_ctrl_sync(ctrl);
 		return ERR_PTR(-EINVAL);
 	}
 
+	module_put(ops->module);
 	up_read(&nvmf_transports_rwsem);
 	return ctrl;
 
+out_module_put:
+	module_put(ops->module);
 out_unlock:
 	up_read(&nvmf_transports_rwsem);
 out_free_opts:
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index 9ba6149..25b19f7 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -108,6 +108,7 @@ struct nvmf_ctrl_options {
  *			       fabric implementation of NVMe fabrics.
  * @entry:		Used by the fabrics library to add the new
  *			registration entry to its linked-list internal tree.
+ * @module:             Transport module reference
  * @name:		Name of the NVMe fabric driver implementation.
  * @required_opts:	sysfs command-line options that must be specified
  *			when adding a new NVMe controller.
@@ -126,6 +127,7 @@ struct nvmf_ctrl_options {
  */
 struct nvmf_transport_ops {
 	struct list_head	entry;
+	struct module		*module;
 	const char		*name;
 	int			required_opts;
 	int			allowed_opts;
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 794e66e..99bf51c 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2921,6 +2921,9 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
 	__nvme_fc_delete_hw_queue(ctrl, &ctrl->queues[0], 0);
 	nvme_fc_free_queue(&ctrl->queues[0]);
 
+	/* re-enable the admin_q so anything new can fast fail */
+	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
+
 	nvme_fc_ctlr_inactive_on_rport(ctrl);
 }
 
@@ -2935,6 +2938,9 @@ nvme_fc_delete_ctrl(struct nvme_ctrl *nctrl)
 	 * waiting for io to terminate
 	 */
 	nvme_fc_delete_association(ctrl);
+
+	/* resume the io queues so that things will fast fail */
+	nvme_start_queues(nctrl);
 }
 
 static void
@@ -3380,6 +3386,7 @@ nvme_fc_create_ctrl(struct device *dev, struct nvmf_ctrl_options *opts)
 
 static struct nvmf_transport_ops nvme_fc_transport = {
 	.name		= "fc",
+	.module		= THIS_MODULE,
 	.required_opts	= NVMF_OPT_TRADDR | NVMF_OPT_HOST_TRADDR,
 	.allowed_opts	= NVMF_OPT_RECONNECT_DELAY | NVMF_OPT_CTRL_LOSS_TMO,
 	.create_ctrl	= nvme_fc_create_ctrl,
diff --git a/drivers/nvme/host/lightnvm.c b/drivers/nvme/host/lightnvm.c
index ba3d7f3..50ef71ee 100644
--- a/drivers/nvme/host/lightnvm.c
+++ b/drivers/nvme/host/lightnvm.c
@@ -31,27 +31,10 @@
 
 enum nvme_nvm_admin_opcode {
 	nvme_nvm_admin_identity		= 0xe2,
-	nvme_nvm_admin_get_l2p_tbl	= 0xea,
 	nvme_nvm_admin_get_bb_tbl	= 0xf2,
 	nvme_nvm_admin_set_bb_tbl	= 0xf1,
 };
 
-struct nvme_nvm_hb_rw {
-	__u8			opcode;
-	__u8			flags;
-	__u16			command_id;
-	__le32			nsid;
-	__u64			rsvd2;
-	__le64			metadata;
-	__le64			prp1;
-	__le64			prp2;
-	__le64			spba;
-	__le16			length;
-	__le16			control;
-	__le32			dsmgmt;
-	__le64			slba;
-};
-
 struct nvme_nvm_ph_rw {
 	__u8			opcode;
 	__u8			flags;
@@ -80,19 +63,6 @@ struct nvme_nvm_identity {
 	__u32			rsvd11[5];
 };
 
-struct nvme_nvm_l2ptbl {
-	__u8			opcode;
-	__u8			flags;
-	__u16			command_id;
-	__le32			nsid;
-	__le32			cdw2[4];
-	__le64			prp1;
-	__le64			prp2;
-	__le64			slba;
-	__le32			nlb;
-	__le16			cdw14[6];
-};
-
 struct nvme_nvm_getbbtbl {
 	__u8			opcode;
 	__u8			flags;
@@ -139,9 +109,7 @@ struct nvme_nvm_command {
 	union {
 		struct nvme_common_command common;
 		struct nvme_nvm_identity identity;
-		struct nvme_nvm_hb_rw hb_rw;
 		struct nvme_nvm_ph_rw ph_rw;
-		struct nvme_nvm_l2ptbl l2p;
 		struct nvme_nvm_getbbtbl get_bb;
 		struct nvme_nvm_setbbtbl set_bb;
 		struct nvme_nvm_erase_blk erase;
@@ -167,7 +135,7 @@ struct nvme_nvm_id_group {
 	__u8			num_lun;
 	__u8			num_pln;
 	__u8			rsvd1;
-	__le16			num_blk;
+	__le16			num_chk;
 	__le16			num_pg;
 	__le16			fpg_sz;
 	__le16			csecs;
@@ -234,11 +202,9 @@ struct nvme_nvm_bb_tbl {
 static inline void _nvme_nvm_check_size(void)
 {
 	BUILD_BUG_ON(sizeof(struct nvme_nvm_identity) != 64);
-	BUILD_BUG_ON(sizeof(struct nvme_nvm_hb_rw) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_nvm_ph_rw) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_nvm_getbbtbl) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_nvm_setbbtbl) != 64);
-	BUILD_BUG_ON(sizeof(struct nvme_nvm_l2ptbl) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_nvm_erase_blk) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_nvm_id_group) != 960);
 	BUILD_BUG_ON(sizeof(struct nvme_nvm_addr_format) != 16);
@@ -249,51 +215,58 @@ static inline void _nvme_nvm_check_size(void)
 static int init_grps(struct nvm_id *nvm_id, struct nvme_nvm_id *nvme_nvm_id)
 {
 	struct nvme_nvm_id_group *src;
-	struct nvm_id_group *dst;
+	struct nvm_id_group *grp;
+	int sec_per_pg, sec_per_pl, pg_per_blk;
 
 	if (nvme_nvm_id->cgrps != 1)
 		return -EINVAL;
 
 	src = &nvme_nvm_id->groups[0];
-	dst = &nvm_id->grp;
+	grp = &nvm_id->grp;
 
-	dst->mtype = src->mtype;
-	dst->fmtype = src->fmtype;
-	dst->num_ch = src->num_ch;
-	dst->num_lun = src->num_lun;
-	dst->num_pln = src->num_pln;
+	grp->mtype = src->mtype;
+	grp->fmtype = src->fmtype;
 
-	dst->num_pg = le16_to_cpu(src->num_pg);
-	dst->num_blk = le16_to_cpu(src->num_blk);
-	dst->fpg_sz = le16_to_cpu(src->fpg_sz);
-	dst->csecs = le16_to_cpu(src->csecs);
-	dst->sos = le16_to_cpu(src->sos);
+	grp->num_ch = src->num_ch;
+	grp->num_lun = src->num_lun;
 
-	dst->trdt = le32_to_cpu(src->trdt);
-	dst->trdm = le32_to_cpu(src->trdm);
-	dst->tprt = le32_to_cpu(src->tprt);
-	dst->tprm = le32_to_cpu(src->tprm);
-	dst->tbet = le32_to_cpu(src->tbet);
-	dst->tbem = le32_to_cpu(src->tbem);
-	dst->mpos = le32_to_cpu(src->mpos);
-	dst->mccap = le32_to_cpu(src->mccap);
+	grp->num_chk = le16_to_cpu(src->num_chk);
+	grp->csecs = le16_to_cpu(src->csecs);
+	grp->sos = le16_to_cpu(src->sos);
 
-	dst->cpar = le16_to_cpu(src->cpar);
+	pg_per_blk = le16_to_cpu(src->num_pg);
+	sec_per_pg = le16_to_cpu(src->fpg_sz) / grp->csecs;
+	sec_per_pl = sec_per_pg * src->num_pln;
+	grp->clba = sec_per_pl * pg_per_blk;
+	grp->ws_per_chk = pg_per_blk;
 
-	if (dst->fmtype == NVM_ID_FMTYPE_MLC) {
-		memcpy(dst->lptbl.id, src->lptbl.id, 8);
-		dst->lptbl.mlc.num_pairs =
-				le16_to_cpu(src->lptbl.mlc.num_pairs);
+	grp->mpos = le32_to_cpu(src->mpos);
+	grp->cpar = le16_to_cpu(src->cpar);
+	grp->mccap = le32_to_cpu(src->mccap);
 
-		if (dst->lptbl.mlc.num_pairs > NVME_NVM_LP_MLC_PAIRS) {
-			pr_err("nvm: number of MLC pairs not supported\n");
-			return -EINVAL;
-		}
+	grp->ws_opt = grp->ws_min = sec_per_pg;
+	grp->ws_seq = NVM_IO_SNGL_ACCESS;
 
-		memcpy(dst->lptbl.mlc.pairs, src->lptbl.mlc.pairs,
-					dst->lptbl.mlc.num_pairs);
+	if (grp->mpos & 0x020202) {
+		grp->ws_seq = NVM_IO_DUAL_ACCESS;
+		grp->ws_opt <<= 1;
+	} else if (grp->mpos & 0x040404) {
+		grp->ws_seq = NVM_IO_QUAD_ACCESS;
+		grp->ws_opt <<= 2;
 	}
 
+	grp->trdt = le32_to_cpu(src->trdt);
+	grp->trdm = le32_to_cpu(src->trdm);
+	grp->tprt = le32_to_cpu(src->tprt);
+	grp->tprm = le32_to_cpu(src->tprm);
+	grp->tbet = le32_to_cpu(src->tbet);
+	grp->tbem = le32_to_cpu(src->tbem);
+
+	/* 1.2 compatibility */
+	grp->num_pln = src->num_pln;
+	grp->num_pg = le16_to_cpu(src->num_pg);
+	grp->fpg_sz = le16_to_cpu(src->fpg_sz);
+
 	return 0;
 }
 
@@ -332,62 +305,6 @@ static int nvme_nvm_identity(struct nvm_dev *nvmdev, struct nvm_id *nvm_id)
 	return ret;
 }
 
-static int nvme_nvm_get_l2p_tbl(struct nvm_dev *nvmdev, u64 slba, u32 nlb,
-				nvm_l2p_update_fn *update_l2p, void *priv)
-{
-	struct nvme_ns *ns = nvmdev->q->queuedata;
-	struct nvme_nvm_command c = {};
-	u32 len = queue_max_hw_sectors(ns->ctrl->admin_q) << 9;
-	u32 nlb_pr_rq = len / sizeof(u64);
-	u64 cmd_slba = slba;
-	void *entries;
-	int ret = 0;
-
-	c.l2p.opcode = nvme_nvm_admin_get_l2p_tbl;
-	c.l2p.nsid = cpu_to_le32(ns->head->ns_id);
-	entries = kmalloc(len, GFP_KERNEL);
-	if (!entries)
-		return -ENOMEM;
-
-	while (nlb) {
-		u32 cmd_nlb = min(nlb_pr_rq, nlb);
-		u64 elba = slba + cmd_nlb;
-
-		c.l2p.slba = cpu_to_le64(cmd_slba);
-		c.l2p.nlb = cpu_to_le32(cmd_nlb);
-
-		ret = nvme_submit_sync_cmd(ns->ctrl->admin_q,
-				(struct nvme_command *)&c, entries, len);
-		if (ret) {
-			dev_err(ns->ctrl->device,
-				"L2P table transfer failed (%d)\n", ret);
-			ret = -EIO;
-			goto out;
-		}
-
-		if (unlikely(elba > nvmdev->total_secs)) {
-			pr_err("nvm: L2P data from device is out of bounds!\n");
-			ret = -EINVAL;
-			goto out;
-		}
-
-		/* Transform physical address to target address space */
-		nvm_part_to_tgt(nvmdev, entries, cmd_nlb);
-
-		if (update_l2p(cmd_slba, cmd_nlb, entries, priv)) {
-			ret = -EINTR;
-			goto out;
-		}
-
-		cmd_slba += cmd_nlb;
-		nlb -= cmd_nlb;
-	}
-
-out:
-	kfree(entries);
-	return ret;
-}
-
 static int nvme_nvm_get_bb_tbl(struct nvm_dev *nvmdev, struct ppa_addr ppa,
 								u8 *blks)
 {
@@ -397,7 +314,7 @@ static int nvme_nvm_get_bb_tbl(struct nvm_dev *nvmdev, struct ppa_addr ppa,
 	struct nvme_ctrl *ctrl = ns->ctrl;
 	struct nvme_nvm_command c = {};
 	struct nvme_nvm_bb_tbl *bb_tbl;
-	int nr_blks = geo->blks_per_lun * geo->plane_mode;
+	int nr_blks = geo->nr_chks * geo->plane_mode;
 	int tblsz = sizeof(struct nvme_nvm_bb_tbl) + nr_blks;
 	int ret = 0;
 
@@ -438,7 +355,7 @@ static int nvme_nvm_get_bb_tbl(struct nvm_dev *nvmdev, struct ppa_addr ppa,
 		goto out;
 	}
 
-	memcpy(blks, bb_tbl->blk, geo->blks_per_lun * geo->plane_mode);
+	memcpy(blks, bb_tbl->blk, geo->nr_chks * geo->plane_mode);
 out:
 	kfree(bb_tbl);
 	return ret;
@@ -474,10 +391,6 @@ static inline void nvme_nvm_rqtocmd(struct nvm_rq *rqd, struct nvme_ns *ns,
 	c->ph_rw.metadata = cpu_to_le64(rqd->dma_meta_list);
 	c->ph_rw.control = cpu_to_le16(rqd->flags);
 	c->ph_rw.length = cpu_to_le16(rqd->nr_ppas - 1);
-
-	if (rqd->opcode == NVM_OP_HBWRITE || rqd->opcode == NVM_OP_HBREAD)
-		c->hb_rw.slba = cpu_to_le64(nvme_block_nr(ns,
-					rqd->bio->bi_iter.bi_sector));
 }
 
 static void nvme_nvm_end_io(struct request *rq, blk_status_t status)
@@ -597,8 +510,6 @@ static void nvme_nvm_dev_dma_free(void *pool, void *addr,
 static struct nvm_dev_ops nvme_nvm_dev_ops = {
 	.identity		= nvme_nvm_identity,
 
-	.get_l2p_tbl		= nvme_nvm_get_l2p_tbl,
-
 	.get_bb_tbl		= nvme_nvm_get_bb_tbl,
 	.set_bb_tbl		= nvme_nvm_set_bb_tbl,
 
@@ -883,7 +794,7 @@ static ssize_t nvm_dev_attr_show(struct device *dev,
 	} else if (strcmp(attr->name, "num_planes") == 0) {
 		return scnprintf(page, PAGE_SIZE, "%u\n", grp->num_pln);
 	} else if (strcmp(attr->name, "num_blocks") == 0) {	/* u16 */
-		return scnprintf(page, PAGE_SIZE, "%u\n", grp->num_blk);
+		return scnprintf(page, PAGE_SIZE, "%u\n", grp->num_chk);
 	} else if (strcmp(attr->name, "num_pages") == 0) {
 		return scnprintf(page, PAGE_SIZE, "%u\n", grp->num_pg);
 	} else if (strcmp(attr->name, "page_size") == 0) {
diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c
index 1218a9f..3b211d9 100644
--- a/drivers/nvme/host/multipath.c
+++ b/drivers/nvme/host/multipath.c
@@ -33,51 +33,11 @@ void nvme_failover_req(struct request *req)
 	kblockd_schedule_work(&ns->head->requeue_work);
 }
 
-bool nvme_req_needs_failover(struct request *req)
+bool nvme_req_needs_failover(struct request *req, blk_status_t error)
 {
 	if (!(req->cmd_flags & REQ_NVME_MPATH))
 		return false;
-
-	switch (nvme_req(req)->status & 0x7ff) {
-	/*
-	 * Generic command status:
-	 */
-	case NVME_SC_INVALID_OPCODE:
-	case NVME_SC_INVALID_FIELD:
-	case NVME_SC_INVALID_NS:
-	case NVME_SC_LBA_RANGE:
-	case NVME_SC_CAP_EXCEEDED:
-	case NVME_SC_RESERVATION_CONFLICT:
-		return false;
-
-	/*
-	 * I/O command set specific error.  Unfortunately these values are
-	 * reused for fabrics commands, but those should never get here.
-	 */
-	case NVME_SC_BAD_ATTRIBUTES:
-	case NVME_SC_INVALID_PI:
-	case NVME_SC_READ_ONLY:
-	case NVME_SC_ONCS_NOT_SUPPORTED:
-		WARN_ON_ONCE(nvme_req(req)->cmd->common.opcode ==
-			nvme_fabrics_command);
-		return false;
-
-	/*
-	 * Media and Data Integrity Errors:
-	 */
-	case NVME_SC_WRITE_FAULT:
-	case NVME_SC_READ_ERROR:
-	case NVME_SC_GUARD_CHECK:
-	case NVME_SC_APPTAG_CHECK:
-	case NVME_SC_REFTAG_CHECK:
-	case NVME_SC_COMPARE_FAILED:
-	case NVME_SC_ACCESS_DENIED:
-	case NVME_SC_UNWRITTEN_BLOCK:
-		return false;
-	}
-
-	/* Everything else could be a path failure, so should be retried */
-	return true;
+	return blk_path_error(error);
 }
 
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl)
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index ea1aa52..8e4550f 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -32,6 +32,8 @@ extern unsigned int admin_timeout;
 #define NVME_KATO_GRACE		10
 
 extern struct workqueue_struct *nvme_wq;
+extern struct workqueue_struct *nvme_reset_wq;
+extern struct workqueue_struct *nvme_delete_wq;
 
 enum {
 	NVME_NS_LBA		= 0,
@@ -119,6 +121,7 @@ static inline struct nvme_request *nvme_req(struct request *req)
 enum nvme_ctrl_state {
 	NVME_CTRL_NEW,
 	NVME_CTRL_LIVE,
+	NVME_CTRL_ADMIN_ONLY,    /* Only admin queue live */
 	NVME_CTRL_RESETTING,
 	NVME_CTRL_RECONNECTING,
 	NVME_CTRL_DELETING,
@@ -393,6 +396,7 @@ int nvme_set_queue_count(struct nvme_ctrl *ctrl, int *count);
 void nvme_start_keep_alive(struct nvme_ctrl *ctrl);
 void nvme_stop_keep_alive(struct nvme_ctrl *ctrl);
 int nvme_reset_ctrl(struct nvme_ctrl *ctrl);
+int nvme_reset_ctrl_sync(struct nvme_ctrl *ctrl);
 int nvme_delete_ctrl(struct nvme_ctrl *ctrl);
 int nvme_delete_ctrl_sync(struct nvme_ctrl *ctrl);
 
@@ -401,7 +405,7 @@ extern const struct block_device_operations nvme_ns_head_ops;
 
 #ifdef CONFIG_NVME_MULTIPATH
 void nvme_failover_req(struct request *req);
-bool nvme_req_needs_failover(struct request *req);
+bool nvme_req_needs_failover(struct request *req, blk_status_t error);
 void nvme_kick_requeue_lists(struct nvme_ctrl *ctrl);
 int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl,struct nvme_ns_head *head);
 void nvme_mpath_add_disk(struct nvme_ns_head *head);
@@ -417,11 +421,21 @@ static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
 		rcu_assign_pointer(head->current_path, NULL);
 }
 struct nvme_ns *nvme_find_path(struct nvme_ns_head *head);
+
+static inline void nvme_mpath_check_last_path(struct nvme_ns *ns)
+{
+	struct nvme_ns_head *head = ns->head;
+
+	if (head->disk && list_empty(&head->list))
+		kblockd_schedule_work(&head->requeue_work);
+}
+
 #else
 static inline void nvme_failover_req(struct request *req)
 {
 }
-static inline bool nvme_req_needs_failover(struct request *req)
+static inline bool nvme_req_needs_failover(struct request *req,
+					   blk_status_t error)
 {
 	return false;
 }
@@ -448,6 +462,9 @@ static inline void nvme_mpath_remove_disk_links(struct nvme_ns *ns)
 static inline void nvme_mpath_clear_current_path(struct nvme_ns *ns)
 {
 }
+static inline void nvme_mpath_check_last_path(struct nvme_ns *ns)
+{
+}
 #endif /* CONFIG_NVME_MULTIPATH */
 
 #ifdef CONFIG_NVM
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index f5800c3c..6fe7af0 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -75,7 +75,7 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown);
  * Represents an NVM Express device.  Each nvme_dev is a PCI function.
  */
 struct nvme_dev {
-	struct nvme_queue **queues;
+	struct nvme_queue *queues;
 	struct blk_mq_tag_set tagset;
 	struct blk_mq_tag_set admin_tagset;
 	u32 __iomem *dbs;
@@ -365,7 +365,7 @@ static int nvme_admin_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
 				unsigned int hctx_idx)
 {
 	struct nvme_dev *dev = data;
-	struct nvme_queue *nvmeq = dev->queues[0];
+	struct nvme_queue *nvmeq = &dev->queues[0];
 
 	WARN_ON(hctx_idx != 0);
 	WARN_ON(dev->admin_tagset.tags[0] != hctx->tags);
@@ -387,7 +387,7 @@ static int nvme_init_hctx(struct blk_mq_hw_ctx *hctx, void *data,
 			  unsigned int hctx_idx)
 {
 	struct nvme_dev *dev = data;
-	struct nvme_queue *nvmeq = dev->queues[hctx_idx + 1];
+	struct nvme_queue *nvmeq = &dev->queues[hctx_idx + 1];
 
 	if (!nvmeq->tags)
 		nvmeq->tags = &dev->tagset.tags[hctx_idx];
@@ -403,7 +403,7 @@ static int nvme_init_request(struct blk_mq_tag_set *set, struct request *req,
 	struct nvme_dev *dev = set->driver_data;
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
 	int queue_idx = (set == &dev->tagset) ? hctx_idx + 1 : 0;
-	struct nvme_queue *nvmeq = dev->queues[queue_idx];
+	struct nvme_queue *nvmeq = &dev->queues[queue_idx];
 
 	BUG_ON(!nvmeq);
 	iod->nvmeq = nvmeq;
@@ -448,12 +448,34 @@ static void **nvme_pci_iod_list(struct request *req)
 	return (void **)(iod->sg + blk_rq_nr_phys_segments(req));
 }
 
+static inline bool nvme_pci_use_sgls(struct nvme_dev *dev, struct request *req)
+{
+	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+	int nseg = blk_rq_nr_phys_segments(req);
+	unsigned int avg_seg_size;
+
+	if (nseg == 0)
+		return false;
+
+	avg_seg_size = DIV_ROUND_UP(blk_rq_payload_bytes(req), nseg);
+
+	if (!(dev->ctrl.sgls & ((1 << 0) | (1 << 1))))
+		return false;
+	if (!iod->nvmeq->qid)
+		return false;
+	if (!sgl_threshold || avg_seg_size < sgl_threshold)
+		return false;
+	return true;
+}
+
 static blk_status_t nvme_init_iod(struct request *rq, struct nvme_dev *dev)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(rq);
 	int nseg = blk_rq_nr_phys_segments(rq);
 	unsigned int size = blk_rq_payload_bytes(rq);
 
+	iod->use_sgl = nvme_pci_use_sgls(dev, rq);
+
 	if (nseg > NVME_INT_PAGES || size > NVME_INT_BYTES(dev)) {
 		size_t alloc_size = nvme_pci_iod_alloc_size(dev, size, nseg,
 				iod->use_sgl);
@@ -604,8 +626,6 @@ static blk_status_t nvme_pci_setup_prps(struct nvme_dev *dev,
 	dma_addr_t prp_dma;
 	int nprps, i;
 
-	iod->use_sgl = false;
-
 	length -= (page_size - offset);
 	if (length <= 0) {
 		iod->first_dma = 0;
@@ -705,22 +725,19 @@ static void nvme_pci_sgl_set_seg(struct nvme_sgl_desc *sge,
 }
 
 static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev,
-		struct request *req, struct nvme_rw_command *cmd)
+		struct request *req, struct nvme_rw_command *cmd, int entries)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-	int length = blk_rq_payload_bytes(req);
 	struct dma_pool *pool;
 	struct nvme_sgl_desc *sg_list;
 	struct scatterlist *sg = iod->sg;
-	int entries = iod->nents, i = 0;
 	dma_addr_t sgl_dma;
-
-	iod->use_sgl = true;
+	int i = 0;
 
 	/* setting the transfer type as SGL */
 	cmd->flags = NVME_CMD_SGL_METABUF;
 
-	if (length == sg_dma_len(sg)) {
+	if (entries == 1) {
 		nvme_pci_sgl_set_data(&cmd->dptr.sgl, sg);
 		return BLK_STS_OK;
 	}
@@ -760,33 +777,12 @@ static blk_status_t nvme_pci_setup_sgls(struct nvme_dev *dev,
 		}
 
 		nvme_pci_sgl_set_data(&sg_list[i++], sg);
-
-		length -= sg_dma_len(sg);
 		sg = sg_next(sg);
-		entries--;
-	} while (length > 0);
+	} while (--entries > 0);
 
-	WARN_ON(entries > 0);
 	return BLK_STS_OK;
 }
 
-static inline bool nvme_pci_use_sgls(struct nvme_dev *dev, struct request *req)
-{
-	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
-	unsigned int avg_seg_size;
-
-	avg_seg_size = DIV_ROUND_UP(blk_rq_payload_bytes(req),
-			blk_rq_nr_phys_segments(req));
-
-	if (!(dev->ctrl.sgls & ((1 << 0) | (1 << 1))))
-		return false;
-	if (!iod->nvmeq->qid)
-		return false;
-	if (!sgl_threshold || avg_seg_size < sgl_threshold)
-		return false;
-	return true;
-}
-
 static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 		struct nvme_command *cmnd)
 {
@@ -795,6 +791,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 	enum dma_data_direction dma_dir = rq_data_dir(req) ?
 			DMA_TO_DEVICE : DMA_FROM_DEVICE;
 	blk_status_t ret = BLK_STS_IOERR;
+	int nr_mapped;
 
 	sg_init_table(iod->sg, blk_rq_nr_phys_segments(req));
 	iod->nents = blk_rq_map_sg(q, req, iod->sg);
@@ -802,12 +799,13 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
 		goto out;
 
 	ret = BLK_STS_RESOURCE;
-	if (!dma_map_sg_attrs(dev->dev, iod->sg, iod->nents, dma_dir,
-				DMA_ATTR_NO_WARN))
+	nr_mapped = dma_map_sg_attrs(dev->dev, iod->sg, iod->nents, dma_dir,
+			DMA_ATTR_NO_WARN);
+	if (!nr_mapped)
 		goto out;
 
-	if (nvme_pci_use_sgls(dev, req))
-		ret = nvme_pci_setup_sgls(dev, req, &cmnd->rw);
+	if (iod->use_sgl)
+		ret = nvme_pci_setup_sgls(dev, req, &cmnd->rw, nr_mapped);
 	else
 		ret = nvme_pci_setup_prps(dev, req, &cmnd->rw);
 
@@ -1046,7 +1044,7 @@ static int nvme_poll(struct blk_mq_hw_ctx *hctx, unsigned int tag)
 static void nvme_pci_submit_async_event(struct nvme_ctrl *ctrl)
 {
 	struct nvme_dev *dev = to_nvme_dev(ctrl);
-	struct nvme_queue *nvmeq = dev->queues[0];
+	struct nvme_queue *nvmeq = &dev->queues[0];
 	struct nvme_command c;
 
 	memset(&c, 0, sizeof(c));
@@ -1140,9 +1138,14 @@ static bool nvme_should_reset(struct nvme_dev *dev, u32 csts)
 	 */
 	bool nssro = dev->subsystem && (csts & NVME_CSTS_NSSRO);
 
-	/* If there is a reset ongoing, we shouldn't reset again. */
-	if (dev->ctrl.state == NVME_CTRL_RESETTING)
+	/* If there is a reset/reinit ongoing, we shouldn't reset again. */
+	switch (dev->ctrl.state) {
+	case NVME_CTRL_RESETTING:
+	case NVME_CTRL_RECONNECTING:
 		return false;
+	default:
+		break;
+	}
 
 	/* We shouldn't reset unless the controller is on fatal error state
 	 * _or_ if we lost the communication with it.
@@ -1282,7 +1285,6 @@ static void nvme_free_queue(struct nvme_queue *nvmeq)
 	if (nvmeq->sq_cmds)
 		dma_free_coherent(nvmeq->q_dmadev, SQ_SIZE(nvmeq->q_depth),
 					nvmeq->sq_cmds, nvmeq->sq_dma_addr);
-	kfree(nvmeq);
 }
 
 static void nvme_free_queues(struct nvme_dev *dev, int lowest)
@@ -1290,10 +1292,8 @@ static void nvme_free_queues(struct nvme_dev *dev, int lowest)
 	int i;
 
 	for (i = dev->ctrl.queue_count - 1; i >= lowest; i--) {
-		struct nvme_queue *nvmeq = dev->queues[i];
 		dev->ctrl.queue_count--;
-		dev->queues[i] = NULL;
-		nvme_free_queue(nvmeq);
+		nvme_free_queue(&dev->queues[i]);
 	}
 }
 
@@ -1325,12 +1325,7 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq)
 
 static void nvme_disable_admin_queue(struct nvme_dev *dev, bool shutdown)
 {
-	struct nvme_queue *nvmeq = dev->queues[0];
-
-	if (!nvmeq)
-		return;
-	if (nvme_suspend_queue(nvmeq))
-		return;
+	struct nvme_queue *nvmeq = &dev->queues[0];
 
 	if (shutdown)
 		nvme_shutdown_ctrl(&dev->ctrl);
@@ -1369,7 +1364,7 @@ static int nvme_cmb_qdepth(struct nvme_dev *dev, int nr_io_queues,
 static int nvme_alloc_sq_cmds(struct nvme_dev *dev, struct nvme_queue *nvmeq,
 				int qid, int depth)
 {
-	if (qid && dev->cmb && use_cmb_sqes && NVME_CMB_SQS(dev->cmbsz)) {
+	if (qid && dev->cmb && use_cmb_sqes && (dev->cmbsz & NVME_CMBSZ_SQS)) {
 		unsigned offset = (qid - 1) * roundup(SQ_SIZE(depth),
 						      dev->ctrl.page_size);
 		nvmeq->sq_dma_addr = dev->cmb_bus_addr + offset;
@@ -1384,13 +1379,13 @@ static int nvme_alloc_sq_cmds(struct nvme_dev *dev, struct nvme_queue *nvmeq,
 	return 0;
 }
 
-static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, int qid,
-							int depth, int node)
+static int nvme_alloc_queue(struct nvme_dev *dev, int qid,
+		int depth, int node)
 {
-	struct nvme_queue *nvmeq = kzalloc_node(sizeof(*nvmeq), GFP_KERNEL,
-							node);
-	if (!nvmeq)
-		return NULL;
+	struct nvme_queue *nvmeq = &dev->queues[qid];
+
+	if (dev->ctrl.queue_count > qid)
+		return 0;
 
 	nvmeq->cqes = dma_zalloc_coherent(dev->dev, CQ_SIZE(depth),
 					  &nvmeq->cq_dma_addr, GFP_KERNEL);
@@ -1409,17 +1404,15 @@ static struct nvme_queue *nvme_alloc_queue(struct nvme_dev *dev, int qid,
 	nvmeq->q_depth = depth;
 	nvmeq->qid = qid;
 	nvmeq->cq_vector = -1;
-	dev->queues[qid] = nvmeq;
 	dev->ctrl.queue_count++;
 
-	return nvmeq;
+	return 0;
 
  free_cqdma:
 	dma_free_coherent(dev->dev, CQ_SIZE(depth), (void *)nvmeq->cqes,
 							nvmeq->cq_dma_addr);
  free_nvmeq:
-	kfree(nvmeq);
-	return NULL;
+	return -ENOMEM;
 }
 
 static int queue_request_irq(struct nvme_queue *nvmeq)
@@ -1592,14 +1585,12 @@ static int nvme_pci_configure_admin_queue(struct nvme_dev *dev)
 	if (result < 0)
 		return result;
 
-	nvmeq = dev->queues[0];
-	if (!nvmeq) {
-		nvmeq = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH,
-					dev_to_node(dev->dev));
-		if (!nvmeq)
-			return -ENOMEM;
-	}
+	result = nvme_alloc_queue(dev, 0, NVME_AQ_DEPTH,
+			dev_to_node(dev->dev));
+	if (result)
+		return result;
 
+	nvmeq = &dev->queues[0];
 	aqa = nvmeq->q_depth - 1;
 	aqa |= aqa << 16;
 
@@ -1629,7 +1620,7 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
 
 	for (i = dev->ctrl.queue_count; i <= dev->max_qid; i++) {
 		/* vector == qid - 1, match nvme_create_queue */
-		if (!nvme_alloc_queue(dev, i, dev->q_depth,
+		if (nvme_alloc_queue(dev, i, dev->q_depth,
 		     pci_irq_get_node(to_pci_dev(dev->dev), i - 1))) {
 			ret = -ENOMEM;
 			break;
@@ -1638,15 +1629,15 @@ static int nvme_create_io_queues(struct nvme_dev *dev)
 
 	max = min(dev->max_qid, dev->ctrl.queue_count - 1);
 	for (i = dev->online_queues; i <= max; i++) {
-		ret = nvme_create_queue(dev->queues[i], i);
+		ret = nvme_create_queue(&dev->queues[i], i);
 		if (ret)
 			break;
 	}
 
 	/*
 	 * Ignore failing Create SQ/CQ commands, we can continue with less
-	 * than the desired aount of queues, and even a controller without
-	 * I/O queues an still be used to issue admin commands.  This might
+	 * than the desired amount of queues, and even a controller without
+	 * I/O queues can still be used to issue admin commands.  This might
 	 * be useful to upgrade a buggy firmware for example.
 	 */
 	return ret >= 0 ? 0 : ret;
@@ -1663,30 +1654,40 @@ static ssize_t nvme_cmb_show(struct device *dev,
 }
 static DEVICE_ATTR(cmb, S_IRUGO, nvme_cmb_show, NULL);
 
-static void __iomem *nvme_map_cmb(struct nvme_dev *dev)
+static u64 nvme_cmb_size_unit(struct nvme_dev *dev)
 {
-	u64 szu, size, offset;
+	u8 szu = (dev->cmbsz >> NVME_CMBSZ_SZU_SHIFT) & NVME_CMBSZ_SZU_MASK;
+
+	return 1ULL << (12 + 4 * szu);
+}
+
+static u32 nvme_cmb_size(struct nvme_dev *dev)
+{
+	return (dev->cmbsz >> NVME_CMBSZ_SZ_SHIFT) & NVME_CMBSZ_SZ_MASK;
+}
+
+static void nvme_map_cmb(struct nvme_dev *dev)
+{
+	u64 size, offset;
 	resource_size_t bar_size;
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
-	void __iomem *cmb;
 	int bar;
 
 	dev->cmbsz = readl(dev->bar + NVME_REG_CMBSZ);
-	if (!(NVME_CMB_SZ(dev->cmbsz)))
-		return NULL;
+	if (!dev->cmbsz)
+		return;
 	dev->cmbloc = readl(dev->bar + NVME_REG_CMBLOC);
 
 	if (!use_cmb_sqes)
-		return NULL;
+		return;
 
-	szu = (u64)1 << (12 + 4 * NVME_CMB_SZU(dev->cmbsz));
-	size = szu * NVME_CMB_SZ(dev->cmbsz);
-	offset = szu * NVME_CMB_OFST(dev->cmbloc);
+	size = nvme_cmb_size_unit(dev) * nvme_cmb_size(dev);
+	offset = nvme_cmb_size_unit(dev) * NVME_CMB_OFST(dev->cmbloc);
 	bar = NVME_CMB_BIR(dev->cmbloc);
 	bar_size = pci_resource_len(pdev, bar);
 
 	if (offset > bar_size)
-		return NULL;
+		return;
 
 	/*
 	 * Controllers may support a CMB size larger than their BAR,
@@ -1696,13 +1697,16 @@ static void __iomem *nvme_map_cmb(struct nvme_dev *dev)
 	if (size > bar_size - offset)
 		size = bar_size - offset;
 
-	cmb = ioremap_wc(pci_resource_start(pdev, bar) + offset, size);
-	if (!cmb)
-		return NULL;
-
+	dev->cmb = ioremap_wc(pci_resource_start(pdev, bar) + offset, size);
+	if (!dev->cmb)
+		return;
 	dev->cmb_bus_addr = pci_bus_address(pdev, bar) + offset;
 	dev->cmb_size = size;
-	return cmb;
+
+	if (sysfs_add_file_to_group(&dev->ctrl.device->kobj,
+				    &dev_attr_cmb.attr, NULL))
+		dev_warn(dev->ctrl.device,
+			 "failed to add sysfs attribute for CMB\n");
 }
 
 static inline void nvme_release_cmb(struct nvme_dev *dev)
@@ -1770,7 +1774,7 @@ static int __nvme_alloc_host_mem(struct nvme_dev *dev, u64 preferred,
 	dma_addr_t descs_dma;
 	int i = 0;
 	void **bufs;
-	u64 size = 0, tmp;
+	u64 size, tmp;
 
 	tmp = (preferred + chunk_size - 1);
 	do_div(tmp, chunk_size);
@@ -1853,7 +1857,7 @@ static int nvme_setup_host_mem(struct nvme_dev *dev)
 	u64 preferred = (u64)dev->ctrl.hmpre * 4096;
 	u64 min = (u64)dev->ctrl.hmmin * 4096;
 	u32 enable_bits = NVME_HOST_MEM_ENABLE;
-	int ret = 0;
+	int ret;
 
 	preferred = min(preferred, max);
 	if (min > max) {
@@ -1894,7 +1898,7 @@ static int nvme_setup_host_mem(struct nvme_dev *dev)
 
 static int nvme_setup_io_queues(struct nvme_dev *dev)
 {
-	struct nvme_queue *adminq = dev->queues[0];
+	struct nvme_queue *adminq = &dev->queues[0];
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 	int result, nr_io_queues;
 	unsigned long size;
@@ -1907,7 +1911,7 @@ static int nvme_setup_io_queues(struct nvme_dev *dev)
 	if (nr_io_queues == 0)
 		return 0;
 
-	if (dev->cmb && NVME_CMB_SQS(dev->cmbsz)) {
+	if (dev->cmb && (dev->cmbsz & NVME_CMBSZ_SQS)) {
 		result = nvme_cmb_qdepth(dev, nr_io_queues,
 				sizeof(struct nvme_command));
 		if (result > 0)
@@ -2007,9 +2011,9 @@ static int nvme_delete_queue(struct nvme_queue *nvmeq, u8 opcode)
 	return 0;
 }
 
-static void nvme_disable_io_queues(struct nvme_dev *dev, int queues)
+static void nvme_disable_io_queues(struct nvme_dev *dev)
 {
-	int pass;
+	int pass, queues = dev->online_queues - 1;
 	unsigned long timeout;
 	u8 opcode = nvme_admin_delete_sq;
 
@@ -2020,7 +2024,7 @@ static void nvme_disable_io_queues(struct nvme_dev *dev, int queues)
  retry:
 		timeout = ADMIN_TIMEOUT;
 		for (; i > 0; i--, sent++)
-			if (nvme_delete_queue(dev->queues[i], opcode))
+			if (nvme_delete_queue(&dev->queues[i], opcode))
 				break;
 
 		while (sent--) {
@@ -2035,13 +2039,12 @@ static void nvme_disable_io_queues(struct nvme_dev *dev, int queues)
 }
 
 /*
- * Return: error value if an error occurred setting up the queues or calling
- * Identify Device.  0 if these succeeded, even if adding some of the
- * namespaces failed.  At the moment, these failures are silent.  TBD which
- * failures should be reported.
+ * return error value only when tagset allocation failed
  */
 static int nvme_dev_add(struct nvme_dev *dev)
 {
+	int ret;
+
 	if (!dev->ctrl.tagset) {
 		dev->tagset.ops = &nvme_mq_ops;
 		dev->tagset.nr_hw_queues = dev->online_queues - 1;
@@ -2057,8 +2060,12 @@ static int nvme_dev_add(struct nvme_dev *dev)
 		dev->tagset.flags = BLK_MQ_F_SHOULD_MERGE;
 		dev->tagset.driver_data = dev;
 
-		if (blk_mq_alloc_tag_set(&dev->tagset))
-			return 0;
+		ret = blk_mq_alloc_tag_set(&dev->tagset);
+		if (ret) {
+			dev_warn(dev->ctrl.device,
+				"IO queues tagset allocation failed %d\n", ret);
+			return ret;
+		}
 		dev->ctrl.tagset = &dev->tagset;
 
 		nvme_dbbuf_set(dev);
@@ -2124,22 +2131,7 @@ static int nvme_pci_enable(struct nvme_dev *dev)
                         "set queue depth=%u\n", dev->q_depth);
 	}
 
-	/*
-	 * CMBs can currently only exist on >=1.2 PCIe devices. We only
-	 * populate sysfs if a CMB is implemented. Since nvme_dev_attrs_group
-	 * has no name we can pass NULL as final argument to
-	 * sysfs_add_file_to_group.
-	 */
-
-	if (readl(dev->bar + NVME_REG_VS) >= NVME_VS(1, 2, 0)) {
-		dev->cmb = nvme_map_cmb(dev);
-		if (dev->cmb) {
-			if (sysfs_add_file_to_group(&dev->ctrl.device->kobj,
-						    &dev_attr_cmb.attr, NULL))
-				dev_warn(dev->ctrl.device,
-					 "failed to add sysfs attribute for CMB\n");
-		}
-	}
+	nvme_map_cmb(dev);
 
 	pci_enable_pcie_error_reporting(pdev);
 	pci_save_state(pdev);
@@ -2172,7 +2164,7 @@ static void nvme_pci_disable(struct nvme_dev *dev)
 
 static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 {
-	int i, queues;
+	int i;
 	bool dead = true;
 	struct pci_dev *pdev = to_pci_dev(dev->dev);
 
@@ -2207,21 +2199,13 @@ static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
 	}
 	nvme_stop_queues(&dev->ctrl);
 
-	queues = dev->online_queues - 1;
-	for (i = dev->ctrl.queue_count - 1; i > 0; i--)
-		nvme_suspend_queue(dev->queues[i]);
-
-	if (dead) {
-		/* A device might become IO incapable very soon during
-		 * probe, before the admin queue is configured. Thus,
-		 * queue_count can be 0 here.
-		 */
-		if (dev->ctrl.queue_count)
-			nvme_suspend_queue(dev->queues[0]);
-	} else {
-		nvme_disable_io_queues(dev, queues);
+	if (!dead) {
+		nvme_disable_io_queues(dev);
 		nvme_disable_admin_queue(dev, shutdown);
 	}
+	for (i = dev->ctrl.queue_count - 1; i >= 0; i--)
+		nvme_suspend_queue(&dev->queues[i]);
+
 	nvme_pci_disable(dev);
 
 	blk_mq_tagset_busy_iter(&dev->tagset, nvme_cancel_request, &dev->ctrl);
@@ -2291,6 +2275,7 @@ static void nvme_reset_work(struct work_struct *work)
 		container_of(work, struct nvme_dev, ctrl.reset_work);
 	bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL);
 	int result = -ENODEV;
+	enum nvme_ctrl_state new_state = NVME_CTRL_LIVE;
 
 	if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING))
 		goto out;
@@ -2302,6 +2287,16 @@ static void nvme_reset_work(struct work_struct *work)
 	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
 		nvme_dev_disable(dev, false);
 
+	/*
+	 * Introduce RECONNECTING state from nvme-fc/rdma transports to mark the
+	 * initializing procedure here.
+	 */
+	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RECONNECTING)) {
+		dev_warn(dev->ctrl.device,
+			"failed to mark controller RECONNECTING\n");
+		goto out;
+	}
+
 	result = nvme_pci_enable(dev);
 	if (result)
 		goto out;
@@ -2354,15 +2349,23 @@ static void nvme_reset_work(struct work_struct *work)
 		dev_warn(dev->ctrl.device, "IO queues not created\n");
 		nvme_kill_queues(&dev->ctrl);
 		nvme_remove_namespaces(&dev->ctrl);
+		new_state = NVME_CTRL_ADMIN_ONLY;
 	} else {
 		nvme_start_queues(&dev->ctrl);
 		nvme_wait_freeze(&dev->ctrl);
-		nvme_dev_add(dev);
+		/* hit this only when allocate tagset fails */
+		if (nvme_dev_add(dev))
+			new_state = NVME_CTRL_ADMIN_ONLY;
 		nvme_unfreeze(&dev->ctrl);
 	}
 
-	if (!nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_LIVE)) {
-		dev_warn(dev->ctrl.device, "failed to mark controller live\n");
+	/*
+	 * If only admin queue live, keep it to do further investigation or
+	 * recovery.
+	 */
+	if (!nvme_change_ctrl_state(&dev->ctrl, new_state)) {
+		dev_warn(dev->ctrl.device,
+			"failed to mark controller state %d\n", new_state);
 		goto out;
 	}
 
@@ -2470,8 +2473,9 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	dev = kzalloc_node(sizeof(*dev), GFP_KERNEL, node);
 	if (!dev)
 		return -ENOMEM;
-	dev->queues = kzalloc_node((num_possible_cpus() + 1) * sizeof(void *),
-							GFP_KERNEL, node);
+
+	dev->queues = kcalloc_node(num_possible_cpus() + 1,
+			sizeof(struct nvme_queue), GFP_KERNEL, node);
 	if (!dev->queues)
 		goto free;
 
@@ -2498,10 +2502,10 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
 	if (result)
 		goto release_pools;
 
-	nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_RESETTING);
 	dev_info(dev->ctrl.device, "pci function %s\n", dev_name(&pdev->dev));
 
-	queue_work(nvme_wq, &dev->ctrl.reset_work);
+	nvme_reset_ctrl(&dev->ctrl);
+
 	return 0;
 
  release_pools:
@@ -2525,7 +2529,7 @@ static void nvme_reset_prepare(struct pci_dev *pdev)
 static void nvme_reset_done(struct pci_dev *pdev)
 {
 	struct nvme_dev *dev = pci_get_drvdata(pdev);
-	nvme_reset_ctrl(&dev->ctrl);
+	nvme_reset_ctrl_sync(&dev->ctrl);
 }
 
 static void nvme_shutdown(struct pci_dev *pdev)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 37af565..2bc059f 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -66,7 +66,6 @@ struct nvme_rdma_request {
 	struct ib_sge		sge[1 + NVME_RDMA_MAX_INLINE_SEGMENTS];
 	u32			num_sge;
 	int			nents;
-	bool			inline_data;
 	struct ib_reg_wr	reg_wr;
 	struct ib_cqe		reg_cqe;
 	struct nvme_rdma_queue  *queue;
@@ -974,12 +973,18 @@ static void nvme_rdma_error_recovery_work(struct work_struct *work)
 	blk_mq_unquiesce_queue(ctrl->ctrl.admin_q);
 	nvme_start_queues(&ctrl->ctrl);
 
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING)) {
+		/* state change failure should never happen */
+		WARN_ON_ONCE(1);
+		return;
+	}
+
 	nvme_rdma_reconnect_or_remove(ctrl);
 }
 
 static void nvme_rdma_error_recovery(struct nvme_rdma_ctrl *ctrl)
 {
-	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING))
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RESETTING))
 		return;
 
 	queue_work(nvme_wq, &ctrl->err_work);
@@ -1086,7 +1091,6 @@ static int nvme_rdma_map_sg_inline(struct nvme_rdma_queue *queue,
 	sg->length = cpu_to_le32(sg_dma_len(req->sg_table.sgl));
 	sg->type = (NVME_SGL_FMT_DATA_DESC << 4) | NVME_SGL_FMT_OFFSET;
 
-	req->inline_data = true;
 	req->num_sge++;
 	return 0;
 }
@@ -1158,7 +1162,6 @@ static int nvme_rdma_map_data(struct nvme_rdma_queue *queue,
 	int count, ret;
 
 	req->num_sge = 1;
-	req->inline_data = false;
 	refcount_set(&req->ref, 2); /* send and recv completions */
 
 	c->common.flags |= NVME_CMD_SGL_METABUF;
@@ -1753,6 +1756,12 @@ static void nvme_rdma_reset_ctrl_work(struct work_struct *work)
 	nvme_stop_ctrl(&ctrl->ctrl);
 	nvme_rdma_shutdown_ctrl(ctrl, false);
 
+	if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_RECONNECTING)) {
+		/* state change failure should never happen */
+		WARN_ON_ONCE(1);
+		return;
+	}
+
 	ret = nvme_rdma_configure_admin_queue(ctrl, false);
 	if (ret)
 		goto out_fail;
@@ -2006,6 +2015,7 @@ static struct nvme_ctrl *nvme_rdma_create_ctrl(struct device *dev,
 
 static struct nvmf_transport_ops nvme_rdma_transport = {
 	.name		= "rdma",
+	.module		= THIS_MODULE,
 	.required_opts	= NVMF_OPT_TRADDR,
 	.allowed_opts	= NVMF_OPT_TRSVCID | NVMF_OPT_RECONNECT_DELAY |
 			  NVMF_OPT_HOST_TRADDR | NVMF_OPT_CTRL_LOSS_TMO,
@@ -2028,7 +2038,7 @@ static void nvme_rdma_remove_one(struct ib_device *ib_device, void *client_data)
 	}
 	mutex_unlock(&nvme_rdma_ctrl_mutex);
 
-	flush_workqueue(nvme_wq);
+	flush_workqueue(nvme_delete_wq);
 }
 
 static struct ib_client nvme_rdma_ib_client = {
diff --git a/drivers/nvme/host/trace.c b/drivers/nvme/host/trace.c
new file mode 100644
index 0000000..41944bb
--- /dev/null
+++ b/drivers/nvme/host/trace.c
@@ -0,0 +1,130 @@
+/*
+ * NVM Express device driver tracepoints
+ * Copyright (c) 2018 Johannes Thumshirn, SUSE Linux GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#include <asm/unaligned.h>
+#include "trace.h"
+
+static const char *nvme_trace_create_sq(struct trace_seq *p, u8 *cdw10)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	u16 sqid = get_unaligned_le16(cdw10);
+	u16 qsize = get_unaligned_le16(cdw10 + 2);
+	u16 sq_flags = get_unaligned_le16(cdw10 + 4);
+	u16 cqid = get_unaligned_le16(cdw10 + 6);
+
+
+	trace_seq_printf(p, "sqid=%u, qsize=%u, sq_flags=0x%x, cqid=%u",
+			 sqid, qsize, sq_flags, cqid);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+static const char *nvme_trace_create_cq(struct trace_seq *p, u8 *cdw10)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	u16 cqid = get_unaligned_le16(cdw10);
+	u16 qsize = get_unaligned_le16(cdw10 + 2);
+	u16 cq_flags = get_unaligned_le16(cdw10 + 4);
+	u16 irq_vector = get_unaligned_le16(cdw10 + 6);
+
+	trace_seq_printf(p, "cqid=%u, qsize=%u, cq_flags=0x%x, irq_vector=%u",
+			 cqid, qsize, cq_flags, irq_vector);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+static const char *nvme_trace_admin_identify(struct trace_seq *p, u8 *cdw10)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	u8 cns = cdw10[0];
+	u16 ctrlid = get_unaligned_le16(cdw10 + 2);
+
+	trace_seq_printf(p, "cns=%u, ctrlid=%u", cns, ctrlid);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+
+
+static const char *nvme_trace_read_write(struct trace_seq *p, u8 *cdw10)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	u64 slba = get_unaligned_le64(cdw10);
+	u16 length = get_unaligned_le16(cdw10 + 8);
+	u16 control = get_unaligned_le16(cdw10 + 10);
+	u32 dsmgmt = get_unaligned_le32(cdw10 + 12);
+	u32 reftag = get_unaligned_le32(cdw10 +  16);
+
+	trace_seq_printf(p,
+			 "slba=%llu, len=%u, ctrl=0x%x, dsmgmt=%u, reftag=%u",
+			 slba, length, control, dsmgmt, reftag);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+static const char *nvme_trace_dsm(struct trace_seq *p, u8 *cdw10)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+
+	trace_seq_printf(p, "nr=%u, attributes=%u",
+			 get_unaligned_le32(cdw10),
+			 get_unaligned_le32(cdw10 + 4));
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+static const char *nvme_trace_common(struct trace_seq *p, u8 *cdw10)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+
+	trace_seq_printf(p, "cdw10=%*ph", 24, cdw10);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
+const char *nvme_trace_parse_admin_cmd(struct trace_seq *p,
+				       u8 opcode, u8 *cdw10)
+{
+	switch (opcode) {
+	case nvme_admin_create_sq:
+		return nvme_trace_create_sq(p, cdw10);
+	case nvme_admin_create_cq:
+		return nvme_trace_create_cq(p, cdw10);
+	case nvme_admin_identify:
+		return nvme_trace_admin_identify(p, cdw10);
+	default:
+		return nvme_trace_common(p, cdw10);
+	}
+}
+
+const char *nvme_trace_parse_nvm_cmd(struct trace_seq *p,
+				     u8 opcode, u8 *cdw10)
+{
+	switch (opcode) {
+	case nvme_cmd_read:
+	case nvme_cmd_write:
+	case nvme_cmd_write_zeroes:
+		return nvme_trace_read_write(p, cdw10);
+	case nvme_cmd_dsm:
+		return nvme_trace_dsm(p, cdw10);
+	default:
+		return nvme_trace_common(p, cdw10);
+	}
+}
diff --git a/drivers/nvme/host/trace.h b/drivers/nvme/host/trace.h
new file mode 100644
index 0000000..ea91fcc
--- /dev/null
+++ b/drivers/nvme/host/trace.h
@@ -0,0 +1,165 @@
+/*
+ * NVM Express device driver tracepoints
+ * Copyright (c) 2018 Johannes Thumshirn, SUSE Linux GmbH
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ */
+
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM nvme
+
+#if !defined(_TRACE_NVME_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_NVME_H
+
+#include <linux/nvme.h>
+#include <linux/tracepoint.h>
+#include <linux/trace_seq.h>
+
+#include "nvme.h"
+
+#define nvme_admin_opcode_name(opcode)	{ opcode, #opcode }
+#define show_admin_opcode_name(val)					\
+	__print_symbolic(val,						\
+		nvme_admin_opcode_name(nvme_admin_delete_sq),		\
+		nvme_admin_opcode_name(nvme_admin_create_sq),		\
+		nvme_admin_opcode_name(nvme_admin_get_log_page),	\
+		nvme_admin_opcode_name(nvme_admin_delete_cq),		\
+		nvme_admin_opcode_name(nvme_admin_create_cq),		\
+		nvme_admin_opcode_name(nvme_admin_identify),		\
+		nvme_admin_opcode_name(nvme_admin_abort_cmd),		\
+		nvme_admin_opcode_name(nvme_admin_set_features),	\
+		nvme_admin_opcode_name(nvme_admin_get_features),	\
+		nvme_admin_opcode_name(nvme_admin_async_event),		\
+		nvme_admin_opcode_name(nvme_admin_ns_mgmt),		\
+		nvme_admin_opcode_name(nvme_admin_activate_fw),		\
+		nvme_admin_opcode_name(nvme_admin_download_fw),		\
+		nvme_admin_opcode_name(nvme_admin_ns_attach),		\
+		nvme_admin_opcode_name(nvme_admin_keep_alive),		\
+		nvme_admin_opcode_name(nvme_admin_directive_send),	\
+		nvme_admin_opcode_name(nvme_admin_directive_recv),	\
+		nvme_admin_opcode_name(nvme_admin_dbbuf),		\
+		nvme_admin_opcode_name(nvme_admin_format_nvm),		\
+		nvme_admin_opcode_name(nvme_admin_security_send),	\
+		nvme_admin_opcode_name(nvme_admin_security_recv),	\
+		nvme_admin_opcode_name(nvme_admin_sanitize_nvm))
+
+const char *nvme_trace_parse_admin_cmd(struct trace_seq *p, u8 opcode,
+				       u8 *cdw10);
+#define __parse_nvme_admin_cmd(opcode, cdw10) \
+	nvme_trace_parse_admin_cmd(p, opcode, cdw10)
+
+#define nvme_opcode_name(opcode)	{ opcode, #opcode }
+#define show_opcode_name(val)					\
+	__print_symbolic(val,					\
+		nvme_opcode_name(nvme_cmd_flush),		\
+		nvme_opcode_name(nvme_cmd_write),		\
+		nvme_opcode_name(nvme_cmd_read),		\
+		nvme_opcode_name(nvme_cmd_write_uncor),		\
+		nvme_opcode_name(nvme_cmd_compare),		\
+		nvme_opcode_name(nvme_cmd_write_zeroes),	\
+		nvme_opcode_name(nvme_cmd_dsm),			\
+		nvme_opcode_name(nvme_cmd_resv_register),	\
+		nvme_opcode_name(nvme_cmd_resv_report),		\
+		nvme_opcode_name(nvme_cmd_resv_acquire),	\
+		nvme_opcode_name(nvme_cmd_resv_release))
+
+const char *nvme_trace_parse_nvm_cmd(struct trace_seq *p, u8 opcode,
+				     u8 *cdw10);
+#define __parse_nvme_cmd(opcode, cdw10) \
+	nvme_trace_parse_nvm_cmd(p, opcode, cdw10)
+
+TRACE_EVENT(nvme_setup_admin_cmd,
+	    TP_PROTO(struct nvme_command *cmd),
+	    TP_ARGS(cmd),
+	    TP_STRUCT__entry(
+		    __field(u8, opcode)
+		    __field(u8, flags)
+		    __field(u16, cid)
+		    __field(u64, metadata)
+		    __array(u8, cdw10, 24)
+	    ),
+	    TP_fast_assign(
+		    __entry->opcode = cmd->common.opcode;
+		    __entry->flags = cmd->common.flags;
+		    __entry->cid = cmd->common.command_id;
+		    __entry->metadata = le64_to_cpu(cmd->common.metadata);
+		    memcpy(__entry->cdw10, cmd->common.cdw10,
+			   sizeof(__entry->cdw10));
+	    ),
+	    TP_printk(" cmdid=%u, flags=0x%x, meta=0x%llx, cmd=(%s %s)",
+		      __entry->cid, __entry->flags, __entry->metadata,
+		      show_admin_opcode_name(__entry->opcode),
+		      __parse_nvme_admin_cmd(__entry->opcode, __entry->cdw10))
+);
+
+
+TRACE_EVENT(nvme_setup_nvm_cmd,
+	    TP_PROTO(int qid, struct nvme_command *cmd),
+	    TP_ARGS(qid, cmd),
+	    TP_STRUCT__entry(
+		    __field(int, qid)
+		    __field(u8, opcode)
+		    __field(u8, flags)
+		    __field(u16, cid)
+		    __field(u32, nsid)
+		    __field(u64, metadata)
+		    __array(u8, cdw10, 24)
+	    ),
+	    TP_fast_assign(
+		    __entry->qid = qid;
+		    __entry->opcode = cmd->common.opcode;
+		    __entry->flags = cmd->common.flags;
+		    __entry->cid = cmd->common.command_id;
+		    __entry->nsid = le32_to_cpu(cmd->common.nsid);
+		    __entry->metadata = le64_to_cpu(cmd->common.metadata);
+		    memcpy(__entry->cdw10, cmd->common.cdw10,
+			   sizeof(__entry->cdw10));
+	    ),
+	    TP_printk("qid=%d, nsid=%u, cmdid=%u, flags=0x%x, meta=0x%llx, cmd=(%s %s)",
+		      __entry->qid, __entry->nsid, __entry->cid,
+		      __entry->flags, __entry->metadata,
+		      show_opcode_name(__entry->opcode),
+		      __parse_nvme_cmd(__entry->opcode, __entry->cdw10))
+);
+
+TRACE_EVENT(nvme_complete_rq,
+	    TP_PROTO(struct request *req),
+	    TP_ARGS(req),
+	    TP_STRUCT__entry(
+		    __field(int, qid)
+		    __field(int, cid)
+		    __field(u64, result)
+		    __field(u8, retries)
+		    __field(u8, flags)
+		    __field(u16, status)
+	    ),
+	    TP_fast_assign(
+		    __entry->qid = req->q->id;
+		    __entry->cid = req->tag;
+		    __entry->result = le64_to_cpu(nvme_req(req)->result.u64);
+		    __entry->retries = nvme_req(req)->retries;
+		    __entry->flags = nvme_req(req)->flags;
+		    __entry->status = nvme_req(req)->status;
+	    ),
+	    TP_printk("cmdid=%u, qid=%d, res=%llu, retries=%u, flags=0x%x, status=%u",
+		      __entry->cid, __entry->qid, __entry->result,
+		      __entry->retries, __entry->flags, __entry->status)
+
+);
+
+#endif /* _TRACE_NVME_H */
+
+#undef TRACE_INCLUDE_PATH
+#define TRACE_INCLUDE_PATH .
+#undef TRACE_INCLUDE_FILE
+#define TRACE_INCLUDE_FILE trace
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
diff --git a/drivers/nvme/target/Kconfig b/drivers/nvme/target/Kconfig
index 03e4ab6..5f4f8b1 100644
--- a/drivers/nvme/target/Kconfig
+++ b/drivers/nvme/target/Kconfig
@@ -29,6 +29,7 @@
 	tristate "NVMe over Fabrics RDMA target support"
 	depends on INFINIBAND
 	depends on NVME_TARGET
+	select SGL_ALLOC
 	help
 	  This enables the NVMe RDMA target support, which allows exporting NVMe
 	  devices over RDMA.
@@ -39,6 +40,7 @@
 	tristate "NVMe over Fabrics FC target driver"
 	depends on NVME_TARGET
 	depends on HAS_DMA
+	select SGL_ALLOC
 	help
 	  This enables the NVMe FC target support, which allows exporting NVMe
 	  devices over FC.
diff --git a/drivers/nvme/target/core.c b/drivers/nvme/target/core.c
index b54748a..0bd7371 100644
--- a/drivers/nvme/target/core.c
+++ b/drivers/nvme/target/core.c
@@ -512,6 +512,7 @@ bool nvmet_req_init(struct nvmet_req *req, struct nvmet_cq *cq,
 	req->sg_cnt = 0;
 	req->transfer_len = 0;
 	req->rsp->status = 0;
+	req->ns = NULL;
 
 	/* no support for fused commands yet */
 	if (unlikely(flags & (NVME_CMD_FUSE_FIRST | NVME_CMD_FUSE_SECOND))) {
@@ -557,6 +558,8 @@ EXPORT_SYMBOL_GPL(nvmet_req_init);
 void nvmet_req_uninit(struct nvmet_req *req)
 {
 	percpu_ref_put(&req->sq->ref);
+	if (req->ns)
+		nvmet_put_namespace(req->ns);
 }
 EXPORT_SYMBOL_GPL(nvmet_req_uninit);
 
@@ -830,7 +833,7 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn,
 		/* Don't accept keep-alive timeout for discovery controllers */
 		if (kato) {
 			status = NVME_SC_INVALID_FIELD | NVME_SC_DNR;
-			goto out_free_sqs;
+			goto out_remove_ida;
 		}
 
 		/*
@@ -860,6 +863,8 @@ u16 nvmet_alloc_ctrl(const char *subsysnqn, const char *hostnqn,
 	*ctrlp = ctrl;
 	return 0;
 
+out_remove_ida:
+	ida_simple_remove(&cntlid_ida, ctrl->cntlid);
 out_free_sqs:
 	kfree(ctrl->sqs);
 out_free_cqs:
@@ -877,21 +882,22 @@ static void nvmet_ctrl_free(struct kref *ref)
 	struct nvmet_ctrl *ctrl = container_of(ref, struct nvmet_ctrl, ref);
 	struct nvmet_subsys *subsys = ctrl->subsys;
 
-	nvmet_stop_keep_alive_timer(ctrl);
-
 	mutex_lock(&subsys->lock);
 	list_del(&ctrl->subsys_entry);
 	mutex_unlock(&subsys->lock);
 
+	nvmet_stop_keep_alive_timer(ctrl);
+
 	flush_work(&ctrl->async_event_work);
 	cancel_work_sync(&ctrl->fatal_err_work);
 
 	ida_simple_remove(&cntlid_ida, ctrl->cntlid);
-	nvmet_subsys_put(subsys);
 
 	kfree(ctrl->sqs);
 	kfree(ctrl->cqs);
 	kfree(ctrl);
+
+	nvmet_subsys_put(subsys);
 }
 
 void nvmet_ctrl_put(struct nvmet_ctrl *ctrl)
diff --git a/drivers/nvme/target/fabrics-cmd.c b/drivers/nvme/target/fabrics-cmd.c
index db3bf6b8..19e9e42 100644
--- a/drivers/nvme/target/fabrics-cmd.c
+++ b/drivers/nvme/target/fabrics-cmd.c
@@ -225,7 +225,7 @@ static void nvmet_execute_io_connect(struct nvmet_req *req)
 		goto out_ctrl_put;
 	}
 
-	pr_info("adding queue %d to ctrl %d.\n", qid, ctrl->cntlid);
+	pr_debug("adding queue %d to ctrl %d.\n", qid, ctrl->cntlid);
 
 out:
 	kfree(d);
diff --git a/drivers/nvme/target/fc.c b/drivers/nvme/target/fc.c
index 5fd8603..9b39a6c 100644
--- a/drivers/nvme/target/fc.c
+++ b/drivers/nvme/target/fc.c
@@ -1697,31 +1697,12 @@ static int
 nvmet_fc_alloc_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
 {
 	struct scatterlist *sg;
-	struct page *page;
 	unsigned int nent;
-	u32 page_len, length;
-	int i = 0;
 
-	length = fod->req.transfer_len;
-	nent = DIV_ROUND_UP(length, PAGE_SIZE);
-	sg = kmalloc_array(nent, sizeof(struct scatterlist), GFP_KERNEL);
+	sg = sgl_alloc(fod->req.transfer_len, GFP_KERNEL, &nent);
 	if (!sg)
 		goto out;
 
-	sg_init_table(sg, nent);
-
-	while (length) {
-		page_len = min_t(u32, length, PAGE_SIZE);
-
-		page = alloc_page(GFP_KERNEL);
-		if (!page)
-			goto out_free_pages;
-
-		sg_set_page(&sg[i], page, page_len, 0);
-		length -= page_len;
-		i++;
-	}
-
 	fod->data_sg = sg;
 	fod->data_sg_cnt = nent;
 	fod->data_sg_cnt = fc_dma_map_sg(fod->tgtport->dev, sg, nent,
@@ -1731,14 +1712,6 @@ nvmet_fc_alloc_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
 
 	return 0;
 
-out_free_pages:
-	while (i > 0) {
-		i--;
-		__free_page(sg_page(&sg[i]));
-	}
-	kfree(sg);
-	fod->data_sg = NULL;
-	fod->data_sg_cnt = 0;
 out:
 	return NVME_SC_INTERNAL;
 }
@@ -1746,18 +1719,13 @@ nvmet_fc_alloc_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
 static void
 nvmet_fc_free_tgt_pgs(struct nvmet_fc_fcp_iod *fod)
 {
-	struct scatterlist *sg;
-	int count;
-
 	if (!fod->data_sg || !fod->data_sg_cnt)
 		return;
 
 	fc_dma_unmap_sg(fod->tgtport->dev, fod->data_sg, fod->data_sg_cnt,
 				((fod->io_dir == NVMET_FCP_WRITE) ?
 					DMA_FROM_DEVICE : DMA_TO_DEVICE));
-	for_each_sg(fod->data_sg, sg, fod->data_sg_cnt, count)
-		__free_page(sg_page(sg));
-	kfree(fod->data_sg);
+	sgl_free(fod->data_sg);
 	fod->data_sg = NULL;
 	fod->data_sg_cnt = 0;
 }
@@ -2522,14 +2490,8 @@ nvmet_fc_add_port(struct nvmet_port *port)
 	list_for_each_entry(tgtport, &nvmet_fc_target_list, tgt_list) {
 		if ((tgtport->fc_target_port.node_name == traddr.nn) &&
 		    (tgtport->fc_target_port.port_name == traddr.pn)) {
-			/* a FC port can only be 1 nvmet port id */
-			if (!tgtport->port) {
-				tgtport->port = port;
-				port->priv = tgtport;
-				nvmet_fc_tgtport_get(tgtport);
-				ret = 0;
-			} else
-				ret = -EALREADY;
+			tgtport->port = port;
+			ret = 0;
 			break;
 		}
 	}
@@ -2540,19 +2502,7 @@ nvmet_fc_add_port(struct nvmet_port *port)
 static void
 nvmet_fc_remove_port(struct nvmet_port *port)
 {
-	struct nvmet_fc_tgtport *tgtport = port->priv;
-	unsigned long flags;
-	bool matched = false;
-
-	spin_lock_irqsave(&nvmet_fc_tgtlock, flags);
-	if (tgtport->port == port) {
-		matched = true;
-		tgtport->port = NULL;
-	}
-	spin_unlock_irqrestore(&nvmet_fc_tgtlock, flags);
-
-	if (matched)
-		nvmet_fc_tgtport_put(tgtport);
+	/* nothing to do */
 }
 
 static struct nvmet_fabrics_ops nvmet_fc_tgt_fcp_ops = {
diff --git a/drivers/nvme/target/fcloop.c b/drivers/nvme/target/fcloop.c
index 7b75d9d..34712de 100644
--- a/drivers/nvme/target/fcloop.c
+++ b/drivers/nvme/target/fcloop.c
@@ -204,6 +204,10 @@ struct fcloop_lport {
 	struct completion unreg_done;
 };
 
+struct fcloop_lport_priv {
+	struct fcloop_lport *lport;
+};
+
 struct fcloop_rport {
 	struct nvme_fc_remote_port *remoteport;
 	struct nvmet_fc_target_port *targetport;
@@ -238,21 +242,32 @@ struct fcloop_lsreq {
 	int				status;
 };
 
+enum {
+	INI_IO_START		= 0,
+	INI_IO_ACTIVE		= 1,
+	INI_IO_ABORTED		= 2,
+	INI_IO_COMPLETED	= 3,
+};
+
 struct fcloop_fcpreq {
 	struct fcloop_tport		*tport;
 	struct nvmefc_fcp_req		*fcpreq;
 	spinlock_t			reqlock;
 	u16				status;
+	u32				inistate;
 	bool				active;
 	bool				aborted;
-	struct work_struct		work;
+	struct kref			ref;
+	struct work_struct		fcp_rcv_work;
+	struct work_struct		abort_rcv_work;
+	struct work_struct		tio_done_work;
 	struct nvmefc_tgt_fcp_req	tgt_fcp_req;
 };
 
 struct fcloop_ini_fcpreq {
 	struct nvmefc_fcp_req		*fcpreq;
 	struct fcloop_fcpreq		*tfcp_req;
-	struct work_struct		iniwork;
+	spinlock_t			inilock;
 };
 
 static inline struct fcloop_lsreq *
@@ -343,17 +358,122 @@ fcloop_xmt_ls_rsp(struct nvmet_fc_target_port *tport,
 	return 0;
 }
 
-/*
- * FCP IO operation done by initiator abort.
- * call back up initiator "done" flows.
- */
 static void
-fcloop_tgt_fcprqst_ini_done_work(struct work_struct *work)
+fcloop_tfcp_req_free(struct kref *ref)
 {
-	struct fcloop_ini_fcpreq *inireq =
-		container_of(work, struct fcloop_ini_fcpreq, iniwork);
+	struct fcloop_fcpreq *tfcp_req =
+		container_of(ref, struct fcloop_fcpreq, ref);
 
-	inireq->fcpreq->done(inireq->fcpreq);
+	kfree(tfcp_req);
+}
+
+static void
+fcloop_tfcp_req_put(struct fcloop_fcpreq *tfcp_req)
+{
+	kref_put(&tfcp_req->ref, fcloop_tfcp_req_free);
+}
+
+static int
+fcloop_tfcp_req_get(struct fcloop_fcpreq *tfcp_req)
+{
+	return kref_get_unless_zero(&tfcp_req->ref);
+}
+
+static void
+fcloop_call_host_done(struct nvmefc_fcp_req *fcpreq,
+			struct fcloop_fcpreq *tfcp_req, int status)
+{
+	struct fcloop_ini_fcpreq *inireq = NULL;
+
+	if (fcpreq) {
+		inireq = fcpreq->private;
+		spin_lock(&inireq->inilock);
+		inireq->tfcp_req = NULL;
+		spin_unlock(&inireq->inilock);
+
+		fcpreq->status = status;
+		fcpreq->done(fcpreq);
+	}
+
+	/* release original io reference on tgt struct */
+	fcloop_tfcp_req_put(tfcp_req);
+}
+
+static void
+fcloop_fcp_recv_work(struct work_struct *work)
+{
+	struct fcloop_fcpreq *tfcp_req =
+		container_of(work, struct fcloop_fcpreq, fcp_rcv_work);
+	struct nvmefc_fcp_req *fcpreq = tfcp_req->fcpreq;
+	int ret = 0;
+	bool aborted = false;
+
+	spin_lock(&tfcp_req->reqlock);
+	switch (tfcp_req->inistate) {
+	case INI_IO_START:
+		tfcp_req->inistate = INI_IO_ACTIVE;
+		break;
+	case INI_IO_ABORTED:
+		aborted = true;
+		break;
+	default:
+		spin_unlock(&tfcp_req->reqlock);
+		WARN_ON(1);
+		return;
+	}
+	spin_unlock(&tfcp_req->reqlock);
+
+	if (unlikely(aborted))
+		ret = -ECANCELED;
+	else
+		ret = nvmet_fc_rcv_fcp_req(tfcp_req->tport->targetport,
+				&tfcp_req->tgt_fcp_req,
+				fcpreq->cmdaddr, fcpreq->cmdlen);
+	if (ret)
+		fcloop_call_host_done(fcpreq, tfcp_req, ret);
+
+	return;
+}
+
+static void
+fcloop_fcp_abort_recv_work(struct work_struct *work)
+{
+	struct fcloop_fcpreq *tfcp_req =
+		container_of(work, struct fcloop_fcpreq, abort_rcv_work);
+	struct nvmefc_fcp_req *fcpreq;
+	bool completed = false;
+
+	spin_lock(&tfcp_req->reqlock);
+	fcpreq = tfcp_req->fcpreq;
+	switch (tfcp_req->inistate) {
+	case INI_IO_ABORTED:
+		break;
+	case INI_IO_COMPLETED:
+		completed = true;
+		break;
+	default:
+		spin_unlock(&tfcp_req->reqlock);
+		WARN_ON(1);
+		return;
+	}
+	spin_unlock(&tfcp_req->reqlock);
+
+	if (unlikely(completed)) {
+		/* remove reference taken in original abort downcall */
+		fcloop_tfcp_req_put(tfcp_req);
+		return;
+	}
+
+	if (tfcp_req->tport->targetport)
+		nvmet_fc_rcv_fcp_abort(tfcp_req->tport->targetport,
+					&tfcp_req->tgt_fcp_req);
+
+	spin_lock(&tfcp_req->reqlock);
+	tfcp_req->fcpreq = NULL;
+	spin_unlock(&tfcp_req->reqlock);
+
+	fcloop_call_host_done(fcpreq, tfcp_req, -ECANCELED);
+	/* call_host_done releases reference for abort downcall */
 }
 
 /*
@@ -364,20 +484,15 @@ static void
 fcloop_tgt_fcprqst_done_work(struct work_struct *work)
 {
 	struct fcloop_fcpreq *tfcp_req =
-		container_of(work, struct fcloop_fcpreq, work);
-	struct fcloop_tport *tport = tfcp_req->tport;
+		container_of(work, struct fcloop_fcpreq, tio_done_work);
 	struct nvmefc_fcp_req *fcpreq;
 
 	spin_lock(&tfcp_req->reqlock);
 	fcpreq = tfcp_req->fcpreq;
+	tfcp_req->inistate = INI_IO_COMPLETED;
 	spin_unlock(&tfcp_req->reqlock);
 
-	if (tport->remoteport && fcpreq) {
-		fcpreq->status = tfcp_req->status;
-		fcpreq->done(fcpreq);
-	}
-
-	kfree(tfcp_req);
+	fcloop_call_host_done(fcpreq, tfcp_req, tfcp_req->status);
 }
 
 
@@ -390,7 +505,6 @@ fcloop_fcp_req(struct nvme_fc_local_port *localport,
 	struct fcloop_rport *rport = remoteport->private;
 	struct fcloop_ini_fcpreq *inireq = fcpreq->private;
 	struct fcloop_fcpreq *tfcp_req;
-	int ret = 0;
 
 	if (!rport->targetport)
 		return -ECONNREFUSED;
@@ -401,16 +515,20 @@ fcloop_fcp_req(struct nvme_fc_local_port *localport,
 
 	inireq->fcpreq = fcpreq;
 	inireq->tfcp_req = tfcp_req;
-	INIT_WORK(&inireq->iniwork, fcloop_tgt_fcprqst_ini_done_work);
+	spin_lock_init(&inireq->inilock);
+
 	tfcp_req->fcpreq = fcpreq;
 	tfcp_req->tport = rport->targetport->private;
+	tfcp_req->inistate = INI_IO_START;
 	spin_lock_init(&tfcp_req->reqlock);
-	INIT_WORK(&tfcp_req->work, fcloop_tgt_fcprqst_done_work);
+	INIT_WORK(&tfcp_req->fcp_rcv_work, fcloop_fcp_recv_work);
+	INIT_WORK(&tfcp_req->abort_rcv_work, fcloop_fcp_abort_recv_work);
+	INIT_WORK(&tfcp_req->tio_done_work, fcloop_tgt_fcprqst_done_work);
+	kref_init(&tfcp_req->ref);
 
-	ret = nvmet_fc_rcv_fcp_req(rport->targetport, &tfcp_req->tgt_fcp_req,
-				 fcpreq->cmdaddr, fcpreq->cmdlen);
+	schedule_work(&tfcp_req->fcp_rcv_work);
 
-	return ret;
+	return 0;
 }
 
 static void
@@ -589,7 +707,7 @@ fcloop_fcp_req_release(struct nvmet_fc_target_port *tgtport,
 {
 	struct fcloop_fcpreq *tfcp_req = tgt_fcp_req_to_fcpreq(tgt_fcpreq);
 
-	schedule_work(&tfcp_req->work);
+	schedule_work(&tfcp_req->tio_done_work);
 }
 
 static void
@@ -605,27 +723,47 @@ fcloop_fcp_abort(struct nvme_fc_local_port *localport,
 			void *hw_queue_handle,
 			struct nvmefc_fcp_req *fcpreq)
 {
-	struct fcloop_rport *rport = remoteport->private;
 	struct fcloop_ini_fcpreq *inireq = fcpreq->private;
-	struct fcloop_fcpreq *tfcp_req = inireq->tfcp_req;
+	struct fcloop_fcpreq *tfcp_req;
+	bool abortio = true;
+
+	spin_lock(&inireq->inilock);
+	tfcp_req = inireq->tfcp_req;
+	if (tfcp_req)
+		fcloop_tfcp_req_get(tfcp_req);
+	spin_unlock(&inireq->inilock);
 
 	if (!tfcp_req)
 		/* abort has already been called */
 		return;
 
-	if (rport->targetport)
-		nvmet_fc_rcv_fcp_abort(rport->targetport,
-					&tfcp_req->tgt_fcp_req);
-
 	/* break initiator/target relationship for io */
 	spin_lock(&tfcp_req->reqlock);
-	inireq->tfcp_req = NULL;
-	tfcp_req->fcpreq = NULL;
+	switch (tfcp_req->inistate) {
+	case INI_IO_START:
+	case INI_IO_ACTIVE:
+		tfcp_req->inistate = INI_IO_ABORTED;
+		break;
+	case INI_IO_COMPLETED:
+		abortio = false;
+		break;
+	default:
+		spin_unlock(&tfcp_req->reqlock);
+		WARN_ON(1);
+		return;
+	}
 	spin_unlock(&tfcp_req->reqlock);
 
-	/* post the aborted io completion */
-	fcpreq->status = -ECANCELED;
-	schedule_work(&inireq->iniwork);
+	if (abortio)
+		/* leave the reference while the work item is scheduled */
+		WARN_ON(!schedule_work(&tfcp_req->abort_rcv_work));
+	else  {
+		/*
+		 * as the io has already had the done callback made,
+		 * nothing more to do. So release the reference taken above
+		 */
+		fcloop_tfcp_req_put(tfcp_req);
+	}
 }
 
 static void
@@ -657,7 +795,8 @@ fcloop_nport_get(struct fcloop_nport *nport)
 static void
 fcloop_localport_delete(struct nvme_fc_local_port *localport)
 {
-	struct fcloop_lport *lport = localport->private;
+	struct fcloop_lport_priv *lport_priv = localport->private;
+	struct fcloop_lport *lport = lport_priv->lport;
 
 	/* release any threads waiting for the unreg to complete */
 	complete(&lport->unreg_done);
@@ -697,7 +836,7 @@ static struct nvme_fc_port_template fctemplate = {
 	.max_dif_sgl_segments	= FCLOOP_SGL_SEGS,
 	.dma_boundary		= FCLOOP_DMABOUND_4G,
 	/* sizes of additional private data for data structures */
-	.local_priv_sz		= sizeof(struct fcloop_lport),
+	.local_priv_sz		= sizeof(struct fcloop_lport_priv),
 	.remote_priv_sz		= sizeof(struct fcloop_rport),
 	.lsrqst_priv_sz		= sizeof(struct fcloop_lsreq),
 	.fcprqst_priv_sz	= sizeof(struct fcloop_ini_fcpreq),
@@ -714,8 +853,7 @@ static struct nvmet_fc_target_template tgttemplate = {
 	.max_dif_sgl_segments	= FCLOOP_SGL_SEGS,
 	.dma_boundary		= FCLOOP_DMABOUND_4G,
 	/* optional features */
-	.target_features	= NVMET_FCTGTFEAT_CMD_IN_ISR |
-				  NVMET_FCTGTFEAT_OPDONE_IN_ISR,
+	.target_features	= 0,
 	/* sizes of additional private data for data structures */
 	.target_priv_sz		= sizeof(struct fcloop_tport),
 };
@@ -728,11 +866,17 @@ fcloop_create_local_port(struct device *dev, struct device_attribute *attr,
 	struct fcloop_ctrl_options *opts;
 	struct nvme_fc_local_port *localport;
 	struct fcloop_lport *lport;
-	int ret;
+	struct fcloop_lport_priv *lport_priv;
+	unsigned long flags;
+	int ret = -ENOMEM;
+
+	lport = kzalloc(sizeof(*lport), GFP_KERNEL);
+	if (!lport)
+		return -ENOMEM;
 
 	opts = kzalloc(sizeof(*opts), GFP_KERNEL);
 	if (!opts)
-		return -ENOMEM;
+		goto out_free_lport;
 
 	ret = fcloop_parse_options(opts, buf);
 	if (ret)
@@ -752,23 +896,25 @@ fcloop_create_local_port(struct device *dev, struct device_attribute *attr,
 
 	ret = nvme_fc_register_localport(&pinfo, &fctemplate, NULL, &localport);
 	if (!ret) {
-		unsigned long flags;
-
 		/* success */
-		lport = localport->private;
+		lport_priv = localport->private;
+		lport_priv->lport = lport;
+
 		lport->localport = localport;
 		INIT_LIST_HEAD(&lport->lport_list);
 
 		spin_lock_irqsave(&fcloop_lock, flags);
 		list_add_tail(&lport->lport_list, &fcloop_lports);
 		spin_unlock_irqrestore(&fcloop_lock, flags);
-
-		/* mark all of the input buffer consumed */
-		ret = count;
 	}
 
 out_free_opts:
 	kfree(opts);
+out_free_lport:
+	/* free only if we're going to fail */
+	if (ret)
+		kfree(lport);
+
 	return ret ? ret : count;
 }
 
@@ -790,6 +936,8 @@ __wait_localport_unreg(struct fcloop_lport *lport)
 
 	wait_for_completion(&lport->unreg_done);
 
+	kfree(lport);
+
 	return ret;
 }
 
@@ -1085,7 +1233,7 @@ fcloop_delete_target_port(struct device *dev, struct device_attribute *attr,
 		const char *buf, size_t count)
 {
 	struct fcloop_nport *nport = NULL, *tmpport;
-	struct fcloop_tport *tport;
+	struct fcloop_tport *tport = NULL;
 	u64 nodename, portname;
 	unsigned long flags;
 	int ret;
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index 1e21b28..7991ec3 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -686,6 +686,7 @@ static struct nvmet_fabrics_ops nvme_loop_ops = {
 
 static struct nvmf_transport_ops nvme_loop_transport = {
 	.name		= "loop",
+	.module		= THIS_MODULE,
 	.create_ctrl	= nvme_loop_create_ctrl,
 };
 
@@ -716,7 +717,7 @@ static void __exit nvme_loop_cleanup_module(void)
 		nvme_delete_ctrl(&ctrl->ctrl);
 	mutex_unlock(&nvme_loop_ctrl_mutex);
 
-	flush_workqueue(nvme_wq);
+	flush_workqueue(nvme_delete_wq);
 }
 
 module_init(nvme_loop_init_module);
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 4991290..978e169 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -185,59 +185,6 @@ nvmet_rdma_put_rsp(struct nvmet_rdma_rsp *rsp)
 	spin_unlock_irqrestore(&rsp->queue->rsps_lock, flags);
 }
 
-static void nvmet_rdma_free_sgl(struct scatterlist *sgl, unsigned int nents)
-{
-	struct scatterlist *sg;
-	int count;
-
-	if (!sgl || !nents)
-		return;
-
-	for_each_sg(sgl, sg, nents, count)
-		__free_page(sg_page(sg));
-	kfree(sgl);
-}
-
-static int nvmet_rdma_alloc_sgl(struct scatterlist **sgl, unsigned int *nents,
-		u32 length)
-{
-	struct scatterlist *sg;
-	struct page *page;
-	unsigned int nent;
-	int i = 0;
-
-	nent = DIV_ROUND_UP(length, PAGE_SIZE);
-	sg = kmalloc_array(nent, sizeof(struct scatterlist), GFP_KERNEL);
-	if (!sg)
-		goto out;
-
-	sg_init_table(sg, nent);
-
-	while (length) {
-		u32 page_len = min_t(u32, length, PAGE_SIZE);
-
-		page = alloc_page(GFP_KERNEL);
-		if (!page)
-			goto out_free_pages;
-
-		sg_set_page(&sg[i], page, page_len, 0);
-		length -= page_len;
-		i++;
-	}
-	*sgl = sg;
-	*nents = nent;
-	return 0;
-
-out_free_pages:
-	while (i > 0) {
-		i--;
-		__free_page(sg_page(&sg[i]));
-	}
-	kfree(sg);
-out:
-	return NVME_SC_INTERNAL;
-}
-
 static int nvmet_rdma_alloc_cmd(struct nvmet_rdma_device *ndev,
 			struct nvmet_rdma_cmd *c, bool admin)
 {
@@ -484,7 +431,7 @@ static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
 	}
 
 	if (rsp->req.sg != &rsp->cmd->inline_sg)
-		nvmet_rdma_free_sgl(rsp->req.sg, rsp->req.sg_cnt);
+		sgl_free(rsp->req.sg);
 
 	if (unlikely(!list_empty_careful(&queue->rsp_wr_wait_list)))
 		nvmet_rdma_process_wr_wait_list(queue);
@@ -621,16 +568,14 @@ static u16 nvmet_rdma_map_sgl_keyed(struct nvmet_rdma_rsp *rsp,
 	u32 len = get_unaligned_le24(sgl->length);
 	u32 key = get_unaligned_le32(sgl->key);
 	int ret;
-	u16 status;
 
 	/* no data command? */
 	if (!len)
 		return 0;
 
-	status = nvmet_rdma_alloc_sgl(&rsp->req.sg, &rsp->req.sg_cnt,
-			len);
-	if (status)
-		return status;
+	rsp->req.sg = sgl_alloc(len, GFP_KERNEL, &rsp->req.sg_cnt);
+	if (!rsp->req.sg)
+		return NVME_SC_INTERNAL;
 
 	ret = rdma_rw_ctx_init(&rsp->rw, cm_id->qp, cm_id->port_num,
 			rsp->req.sg, rsp->req.sg_cnt, 0, addr, key,
@@ -976,7 +921,7 @@ static void nvmet_rdma_destroy_queue_ib(struct nvmet_rdma_queue *queue)
 
 static void nvmet_rdma_free_queue(struct nvmet_rdma_queue *queue)
 {
-	pr_info("freeing queue %d\n", queue->idx);
+	pr_debug("freeing queue %d\n", queue->idx);
 
 	nvmet_sq_destroy(&queue->nvme_sq);
 
@@ -1558,25 +1503,9 @@ static int __init nvmet_rdma_init(void)
 
 static void __exit nvmet_rdma_exit(void)
 {
-	struct nvmet_rdma_queue *queue;
-
 	nvmet_unregister_transport(&nvmet_rdma_ops);
-
-	flush_scheduled_work();
-
-	mutex_lock(&nvmet_rdma_queue_mutex);
-	while ((queue = list_first_entry_or_null(&nvmet_rdma_queue_list,
-			struct nvmet_rdma_queue, queue_list))) {
-		list_del_init(&queue->queue_list);
-
-		mutex_unlock(&nvmet_rdma_queue_mutex);
-		__nvmet_rdma_queue_disconnect(queue);
-		mutex_lock(&nvmet_rdma_queue_mutex);
-	}
-	mutex_unlock(&nvmet_rdma_queue_mutex);
-
-	flush_scheduled_work();
 	ib_unregister_client(&nvmet_rdma_ib_client);
+	WARN_ON_ONCE(!list_empty(&nvmet_rdma_queue_list));
 	ida_destroy(&nvmet_rdma_queue_ida);
 }
 
diff --git a/drivers/of/base.c b/drivers/of/base.c
index 26618ba..a9d6fe8 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -316,6 +316,32 @@ struct device_node *of_get_cpu_node(int cpu, unsigned int *thread)
 EXPORT_SYMBOL(of_get_cpu_node);
 
 /**
+ * of_cpu_node_to_id: Get the logical CPU number for a given device_node
+ *
+ * @cpu_node: Pointer to the device_node for CPU.
+ *
+ * Returns the logical CPU number of the given CPU device_node.
+ * Returns -ENODEV if the CPU is not found.
+ */
+int of_cpu_node_to_id(struct device_node *cpu_node)
+{
+	int cpu;
+	bool found = false;
+	struct device_node *np;
+
+	for_each_possible_cpu(cpu) {
+		np = of_cpu_device_node_get(cpu);
+		found = (cpu_node == np);
+		of_node_put(np);
+		if (found)
+			return cpu;
+	}
+
+	return -ENODEV;
+}
+EXPORT_SYMBOL(of_cpu_node_to_id);
+
+/**
  * __of_device_is_compatible() - Check if the node matches given constraints
  * @device: pointer to node
  * @compat: required compatible string, NULL or "" for any match
diff --git a/drivers/of/of_mdio.c b/drivers/of/of_mdio.c
index 3481e69..a327be1 100644
--- a/drivers/of/of_mdio.c
+++ b/drivers/of/of_mdio.c
@@ -231,7 +231,12 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
 			rc = of_mdiobus_register_phy(mdio, child, addr);
 		else
 			rc = of_mdiobus_register_device(mdio, child, addr);
-		if (rc)
+
+		if (rc == -ENODEV)
+			dev_err(&mdio->dev,
+				"MDIO device at address %d is missing.\n",
+				addr);
+		else if (rc)
 			goto unregister;
 	}
 
@@ -255,7 +260,7 @@ int of_mdiobus_register(struct mii_bus *mdio, struct device_node *np)
 
 			if (of_mdiobus_child_is_phy(child)) {
 				rc = of_mdiobus_register_phy(mdio, child, addr);
-				if (rc)
+				if (rc && rc != -ENODEV)
 					goto unregister;
 			}
 		}
diff --git a/drivers/opp/Makefile b/drivers/opp/Makefile
index e70ceb4..6ce6aef 100644
--- a/drivers/opp/Makefile
+++ b/drivers/opp/Makefile
@@ -2,3 +2,4 @@
 obj-y				+= core.o cpu.o
 obj-$(CONFIG_OF)		+= of.o
 obj-$(CONFIG_DEBUG_FS)		+= debugfs.o
+obj-$(CONFIG_ARM_TI_CPUFREQ)	+= ti-opp-supply.o
diff --git a/drivers/opp/ti-opp-supply.c b/drivers/opp/ti-opp-supply.c
new file mode 100644
index 0000000..370eff3
--- /dev/null
+++ b/drivers/opp/ti-opp-supply.c
@@ -0,0 +1,425 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2016-2017 Texas Instruments Incorporated - http://www.ti.com/
+ *	Nishanth Menon <nm@ti.com>
+ *	Dave Gerlach <d-gerlach@ti.com>
+ *
+ * TI OPP supply driver that provides override into the regulator control
+ * for generic opp core to handle devices with ABB regulator and/or
+ * SmartReflex Class0.
+ */
+#include <linux/clk.h>
+#include <linux/cpufreq.h>
+#include <linux/device.h>
+#include <linux/io.h>
+#include <linux/module.h>
+#include <linux/notifier.h>
+#include <linux/of_device.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/pm_opp.h>
+#include <linux/regulator/consumer.h>
+#include <linux/slab.h>
+
+/**
+ * struct ti_opp_supply_optimum_voltage_table - optimized voltage table
+ * @reference_uv:	reference voltage (usually Nominal voltage)
+ * @optimized_uv:	Optimized voltage from efuse
+ */
+struct ti_opp_supply_optimum_voltage_table {
+	unsigned int reference_uv;
+	unsigned int optimized_uv;
+};
+
+/**
+ * struct ti_opp_supply_data - OMAP specific opp supply data
+ * @vdd_table:	Optimized voltage mapping table
+ * @num_vdd_table: number of entries in vdd_table
+ * @vdd_absolute_max_voltage_uv: absolute maximum voltage in UV for the supply
+ */
+struct ti_opp_supply_data {
+	struct ti_opp_supply_optimum_voltage_table *vdd_table;
+	u32 num_vdd_table;
+	u32 vdd_absolute_max_voltage_uv;
+};
+
+static struct ti_opp_supply_data opp_data;
+
+/**
+ * struct ti_opp_supply_of_data - device tree match data
+ * @flags:	specific type of opp supply
+ * @efuse_voltage_mask: mask required for efuse register representing voltage
+ * @efuse_voltage_uv: Are the efuse entries in micro-volts? if not, assume
+ *		milli-volts.
+ */
+struct ti_opp_supply_of_data {
+#define OPPDM_EFUSE_CLASS0_OPTIMIZED_VOLTAGE	BIT(1)
+#define OPPDM_HAS_NO_ABB			BIT(2)
+	const u8 flags;
+	const u32 efuse_voltage_mask;
+	const bool efuse_voltage_uv;
+};
+
+/**
+ * _store_optimized_voltages() - store optimized voltages
+ * @dev:	ti opp supply device for which we need to store info
+ * @data:	data specific to the device
+ *
+ * Picks up efuse based optimized voltages for VDD unique per device and
+ * stores it in internal data structure for use during transition requests.
+ *
+ * Return: If successful, 0, else appropriate error value.
+ */
+static int _store_optimized_voltages(struct device *dev,
+				     struct ti_opp_supply_data *data)
+{
+	void __iomem *base;
+	struct property *prop;
+	struct resource *res;
+	const __be32 *val;
+	int proplen, i;
+	int ret = 0;
+	struct ti_opp_supply_optimum_voltage_table *table;
+	const struct ti_opp_supply_of_data *of_data = dev_get_drvdata(dev);
+
+	/* pick up Efuse based voltages */
+	res = platform_get_resource(to_platform_device(dev), IORESOURCE_MEM, 0);
+	if (!res) {
+		dev_err(dev, "Unable to get IO resource\n");
+		ret = -ENODEV;
+		goto out_map;
+	}
+
+	base = ioremap_nocache(res->start, resource_size(res));
+	if (!base) {
+		dev_err(dev, "Unable to map Efuse registers\n");
+		ret = -ENOMEM;
+		goto out_map;
+	}
+
+	/* Fetch efuse-settings. */
+	prop = of_find_property(dev->of_node, "ti,efuse-settings", NULL);
+	if (!prop) {
+		dev_err(dev, "No 'ti,efuse-settings' property found\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	proplen = prop->length / sizeof(int);
+	data->num_vdd_table = proplen / 2;
+	/* Verify for corrupted OPP entries in dt */
+	if (data->num_vdd_table * 2 * sizeof(int) != prop->length) {
+		dev_err(dev, "Invalid 'ti,efuse-settings'\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	ret = of_property_read_u32(dev->of_node, "ti,absolute-max-voltage-uv",
+				   &data->vdd_absolute_max_voltage_uv);
+	if (ret) {
+		dev_err(dev, "ti,absolute-max-voltage-uv is missing\n");
+		ret = -EINVAL;
+		goto out;
+	}
+
+	table = kzalloc(sizeof(*data->vdd_table) *
+				  data->num_vdd_table, GFP_KERNEL);
+	if (!table) {
+		ret = -ENOMEM;
+		goto out;
+	}
+	data->vdd_table = table;
+
+	val = prop->value;
+	for (i = 0; i < data->num_vdd_table; i++, table++) {
+		u32 efuse_offset;
+		u32 tmp;
+
+		table->reference_uv = be32_to_cpup(val++);
+		efuse_offset = be32_to_cpup(val++);
+
+		tmp = readl(base + efuse_offset);
+		tmp &= of_data->efuse_voltage_mask;
+		tmp >>= __ffs(of_data->efuse_voltage_mask);
+
+		table->optimized_uv = of_data->efuse_voltage_uv ? tmp :
+					tmp * 1000;
+
+		dev_dbg(dev, "[%d] efuse=0x%08x volt_table=%d vset=%d\n",
+			i, efuse_offset, table->reference_uv,
+			table->optimized_uv);
+
+		/*
+		 * Some older samples might not have optimized efuse
+		 * Use reference voltage for those - just add debug message
+		 * for them.
+		 */
+		if (!table->optimized_uv) {
+			dev_dbg(dev, "[%d] efuse=0x%08x volt_table=%d:vset0\n",
+				i, efuse_offset, table->reference_uv);
+			table->optimized_uv = table->reference_uv;
+		}
+	}
+out:
+	iounmap(base);
+out_map:
+	return ret;
+}
+
+/**
+ * _free_optimized_voltages() - free resources for optvoltages
+ * @dev:	device for which we need to free info
+ * @data:	data specific to the device
+ */
+static void _free_optimized_voltages(struct device *dev,
+				     struct ti_opp_supply_data *data)
+{
+	kfree(data->vdd_table);
+	data->vdd_table = NULL;
+	data->num_vdd_table = 0;
+}
+
+/**
+ * _get_optimal_vdd_voltage() - Finds optimal voltage for the supply
+ * @dev:	device for which we need to find info
+ * @data:	data specific to the device
+ * @reference_uv:	reference voltage (OPP voltage) for which we need value
+ *
+ * Return: if a match is found, return optimized voltage, else return
+ * reference_uv, also return reference_uv if no optimization is needed.
+ */
+static int _get_optimal_vdd_voltage(struct device *dev,
+				    struct ti_opp_supply_data *data,
+				    int reference_uv)
+{
+	int i;
+	struct ti_opp_supply_optimum_voltage_table *table;
+
+	if (!data->num_vdd_table)
+		return reference_uv;
+
+	table = data->vdd_table;
+	if (!table)
+		return -EINVAL;
+
+	/* Find a exact match - this list is usually very small */
+	for (i = 0; i < data->num_vdd_table; i++, table++)
+		if (table->reference_uv == reference_uv)
+			return table->optimized_uv;
+
+	/* IF things are screwed up, we'd make a mess on console.. ratelimit */
+	dev_err_ratelimited(dev, "%s: Failed optimized voltage match for %d\n",
+			    __func__, reference_uv);
+	return reference_uv;
+}
+
+static int _opp_set_voltage(struct device *dev,
+			    struct dev_pm_opp_supply *supply,
+			    int new_target_uv, struct regulator *reg,
+			    char *reg_name)
+{
+	int ret;
+	unsigned long vdd_uv, uv_max;
+
+	if (new_target_uv)
+		vdd_uv = new_target_uv;
+	else
+		vdd_uv = supply->u_volt;
+
+	/*
+	 * If we do have an absolute max voltage specified, then we should
+	 * use that voltage instead to allow for cases where the voltage rails
+	 * are ganged (example if we set the max for an opp as 1.12v, and
+	 * the absolute max is 1.5v, for another rail to get 1.25v, it cannot
+	 * be achieved if the regulator is constrainted to max of 1.12v, even
+	 * if it can function at 1.25v
+	 */
+	if (opp_data.vdd_absolute_max_voltage_uv)
+		uv_max = opp_data.vdd_absolute_max_voltage_uv;
+	else
+		uv_max = supply->u_volt_max;
+
+	if (vdd_uv > uv_max ||
+	    vdd_uv < supply->u_volt_min ||
+	    supply->u_volt_min > uv_max) {
+		dev_warn(dev,
+			 "Invalid range voltages [Min:%lu target:%lu Max:%lu]\n",
+			 supply->u_volt_min, vdd_uv, uv_max);
+		return -EINVAL;
+	}
+
+	dev_dbg(dev, "%s scaling to %luuV[min %luuV max %luuV]\n", reg_name,
+		vdd_uv, supply->u_volt_min,
+		uv_max);
+
+	ret = regulator_set_voltage_triplet(reg,
+					    supply->u_volt_min,
+					    vdd_uv,
+					    uv_max);
+	if (ret) {
+		dev_err(dev, "%s failed for %luuV[min %luuV max %luuV]\n",
+			reg_name, vdd_uv, supply->u_volt_min,
+			uv_max);
+		return ret;
+	}
+
+	return 0;
+}
+
+/**
+ * ti_opp_supply_set_opp() - do the opp supply transition
+ * @data:	information on regulators and new and old opps provided by
+ *		opp core to use in transition
+ *
+ * Return: If successful, 0, else appropriate error value.
+ */
+static int ti_opp_supply_set_opp(struct dev_pm_set_opp_data *data)
+{
+	struct dev_pm_opp_supply *old_supply_vdd = &data->old_opp.supplies[0];
+	struct dev_pm_opp_supply *old_supply_vbb = &data->old_opp.supplies[1];
+	struct dev_pm_opp_supply *new_supply_vdd = &data->new_opp.supplies[0];
+	struct dev_pm_opp_supply *new_supply_vbb = &data->new_opp.supplies[1];
+	struct device *dev = data->dev;
+	unsigned long old_freq = data->old_opp.rate, freq = data->new_opp.rate;
+	struct clk *clk = data->clk;
+	struct regulator *vdd_reg = data->regulators[0];
+	struct regulator *vbb_reg = data->regulators[1];
+	int vdd_uv;
+	int ret;
+
+	vdd_uv = _get_optimal_vdd_voltage(dev, &opp_data,
+					  new_supply_vbb->u_volt);
+
+	/* Scaling up? Scale voltage before frequency */
+	if (freq > old_freq) {
+		ret = _opp_set_voltage(dev, new_supply_vdd, vdd_uv, vdd_reg,
+				       "vdd");
+		if (ret)
+			goto restore_voltage;
+
+		ret = _opp_set_voltage(dev, new_supply_vbb, 0, vbb_reg, "vbb");
+		if (ret)
+			goto restore_voltage;
+	}
+
+	/* Change frequency */
+	dev_dbg(dev, "%s: switching OPP: %lu Hz --> %lu Hz\n",
+		__func__, old_freq, freq);
+
+	ret = clk_set_rate(clk, freq);
+	if (ret) {
+		dev_err(dev, "%s: failed to set clock rate: %d\n", __func__,
+			ret);
+		goto restore_voltage;
+	}
+
+	/* Scaling down? Scale voltage after frequency */
+	if (freq < old_freq) {
+		ret = _opp_set_voltage(dev, new_supply_vbb, 0, vbb_reg, "vbb");
+		if (ret)
+			goto restore_freq;
+
+		ret = _opp_set_voltage(dev, new_supply_vdd, vdd_uv, vdd_reg,
+				       "vdd");
+		if (ret)
+			goto restore_freq;
+	}
+
+	return 0;
+
+restore_freq:
+	ret = clk_set_rate(clk, old_freq);
+	if (ret)
+		dev_err(dev, "%s: failed to restore old-freq (%lu Hz)\n",
+			__func__, old_freq);
+restore_voltage:
+	/* This shouldn't harm even if the voltages weren't updated earlier */
+	if (old_supply_vdd->u_volt) {
+		ret = _opp_set_voltage(dev, old_supply_vbb, 0, vbb_reg, "vbb");
+		if (ret)
+			return ret;
+
+		ret = _opp_set_voltage(dev, old_supply_vdd, 0, vdd_reg,
+				       "vdd");
+		if (ret)
+			return ret;
+	}
+
+	return ret;
+}
+
+static const struct ti_opp_supply_of_data omap_generic_of_data = {
+};
+
+static const struct ti_opp_supply_of_data omap_omap5_of_data = {
+	.flags = OPPDM_EFUSE_CLASS0_OPTIMIZED_VOLTAGE,
+	.efuse_voltage_mask = 0xFFF,
+	.efuse_voltage_uv = false,
+};
+
+static const struct ti_opp_supply_of_data omap_omap5core_of_data = {
+	.flags = OPPDM_EFUSE_CLASS0_OPTIMIZED_VOLTAGE | OPPDM_HAS_NO_ABB,
+	.efuse_voltage_mask = 0xFFF,
+	.efuse_voltage_uv = false,
+};
+
+static const struct of_device_id ti_opp_supply_of_match[] = {
+	{.compatible = "ti,omap-opp-supply", .data = &omap_generic_of_data},
+	{.compatible = "ti,omap5-opp-supply", .data = &omap_omap5_of_data},
+	{.compatible = "ti,omap5-core-opp-supply",
+	 .data = &omap_omap5core_of_data},
+	{},
+};
+MODULE_DEVICE_TABLE(of, ti_opp_supply_of_match);
+
+static int ti_opp_supply_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct device *cpu_dev = get_cpu_device(0);
+	const struct of_device_id *match;
+	const struct ti_opp_supply_of_data *of_data;
+	int ret = 0;
+
+	match = of_match_device(ti_opp_supply_of_match, dev);
+	if (!match) {
+		/* We do not expect this to happen */
+		dev_err(dev, "%s: Unable to match device\n", __func__);
+		return -ENODEV;
+	}
+	if (!match->data) {
+		/* Again, unlikely.. but mistakes do happen */
+		dev_err(dev, "%s: Bad data in match\n", __func__);
+		return -EINVAL;
+	}
+	of_data = match->data;
+
+	dev_set_drvdata(dev, (void *)of_data);
+
+	/* If we need optimized voltage */
+	if (of_data->flags & OPPDM_EFUSE_CLASS0_OPTIMIZED_VOLTAGE) {
+		ret = _store_optimized_voltages(dev, &opp_data);
+		if (ret)
+			return ret;
+	}
+
+	ret = PTR_ERR_OR_ZERO(dev_pm_opp_register_set_opp_helper(cpu_dev,
+								 ti_opp_supply_set_opp));
+	if (ret)
+		_free_optimized_voltages(dev, &opp_data);
+
+	return ret;
+}
+
+static struct platform_driver ti_opp_supply_driver = {
+	.probe = ti_opp_supply_probe,
+	.driver = {
+		   .name = "ti_opp_supply",
+		   .owner = THIS_MODULE,
+		   .of_match_table = of_match_ptr(ti_opp_supply_of_match),
+		   },
+};
+module_platform_driver(ti_opp_supply_driver);
+
+MODULE_DESCRIPTION("Texas Instruments OMAP OPP Supply driver");
+MODULE_AUTHOR("Texas Instruments Inc.");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/parisc/dino.c b/drivers/parisc/dino.c
index 0b3fb99..7390fb8 100644
--- a/drivers/parisc/dino.c
+++ b/drivers/parisc/dino.c
@@ -303,7 +303,7 @@ static void dino_mask_irq(struct irq_data *d)
 	struct dino_device *dino_dev = irq_data_get_irq_chip_data(d);
 	int local_irq = gsc_find_local_irq(d->irq, dino_dev->global_irq, DINO_LOCAL_IRQS);
 
-	DBG(KERN_WARNING "%s(0x%p, %d)\n", __func__, dino_dev, d->irq);
+	DBG(KERN_WARNING "%s(0x%px, %d)\n", __func__, dino_dev, d->irq);
 
 	/* Clear the matching bit in the IMR register */
 	dino_dev->imr &= ~(DINO_MASK_IRQ(local_irq));
@@ -316,7 +316,7 @@ static void dino_unmask_irq(struct irq_data *d)
 	int local_irq = gsc_find_local_irq(d->irq, dino_dev->global_irq, DINO_LOCAL_IRQS);
 	u32 tmp;
 
-	DBG(KERN_WARNING "%s(0x%p, %d)\n", __func__, dino_dev, d->irq);
+	DBG(KERN_WARNING "%s(0x%px, %d)\n", __func__, dino_dev, d->irq);
 
 	/*
 	** clear pending IRQ bits
@@ -396,7 +396,7 @@ static irqreturn_t dino_isr(int irq, void *intr_dev)
 	if (mask) {
 		if (--ilr_loop > 0)
 			goto ilr_again;
-		printk(KERN_ERR "Dino 0x%p: stuck interrupt %d\n", 
+		printk(KERN_ERR "Dino 0x%px: stuck interrupt %d\n",
 		       dino_dev->hba.base_addr, mask);
 		return IRQ_NONE;
 	}
@@ -553,7 +553,7 @@ dino_fixup_bus(struct pci_bus *bus)
         struct pci_dev *dev;
         struct dino_device *dino_dev = DINO_DEV(parisc_walk_tree(bus->bridge));
 
-	DBG(KERN_WARNING "%s(0x%p) bus %d platform_data 0x%p\n",
+	DBG(KERN_WARNING "%s(0x%px) bus %d platform_data 0x%px\n",
 	    __func__, bus, bus->busn_res.start,
 	    bus->bridge->platform_data);
 
@@ -854,7 +854,7 @@ static int __init dino_common_init(struct parisc_device *dev,
 	res->flags = IORESOURCE_IO; /* do not mark it busy ! */
 	if (request_resource(&ioport_resource, res) < 0) {
 		printk(KERN_ERR "%s: request I/O Port region failed "
-		       "0x%lx/%lx (hpa 0x%p)\n",
+		       "0x%lx/%lx (hpa 0x%px)\n",
 		       name, (unsigned long)res->start, (unsigned long)res->end,
 		       dino_dev->hba.base_addr);
 		return 1;
diff --git a/drivers/parisc/eisa_eeprom.c b/drivers/parisc/eisa_eeprom.c
index 4dd9b13..99a80da 100644
--- a/drivers/parisc/eisa_eeprom.c
+++ b/drivers/parisc/eisa_eeprom.c
@@ -106,7 +106,7 @@ static int __init eisa_eeprom_init(void)
 		return retval;
 	}
 
-	printk(KERN_INFO "EISA EEPROM at 0x%p\n", eisa_eeprom_addr);
+	printk(KERN_INFO "EISA EEPROM at 0x%px\n", eisa_eeprom_addr);
 	return 0;
 }
 
diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c
index 14fd865..5958c8d 100644
--- a/drivers/pci/pci-driver.c
+++ b/drivers/pci/pci-driver.c
@@ -699,7 +699,7 @@ static void pci_pm_complete(struct device *dev)
 	pm_generic_complete(dev);
 
 	/* Resume device if platform firmware has put it in reset-power-on */
-	if (dev->power.direct_complete && pm_resume_via_firmware()) {
+	if (pm_runtime_suspended(dev) && pm_resume_via_firmware()) {
 		pci_power_t pre_sleep_state = pci_dev->current_state;
 
 		pci_update_current_state(pci_dev, pci_dev->current_state);
@@ -783,8 +783,10 @@ static int pci_pm_suspend_noirq(struct device *dev)
 	struct pci_dev *pci_dev = to_pci_dev(dev);
 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
 
-	if (dev_pm_smart_suspend_and_suspended(dev))
+	if (dev_pm_smart_suspend_and_suspended(dev)) {
+		dev->power.may_skip_resume = true;
 		return 0;
+	}
 
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_suspend_late(dev, PMSG_SUSPEND);
@@ -838,6 +840,16 @@ static int pci_pm_suspend_noirq(struct device *dev)
 Fixup:
 	pci_fixup_device(pci_fixup_suspend_late, pci_dev);
 
+	/*
+	 * If the target system sleep state is suspend-to-idle, it is sufficient
+	 * to check whether or not the device's wakeup settings are good for
+	 * runtime PM.  Otherwise, the pm_resume_via_firmware() check will cause
+	 * pci_pm_complete() to take care of fixing up the device's state
+	 * anyway, if need be.
+	 */
+	dev->power.may_skip_resume = device_may_wakeup(dev) ||
+					!device_can_wakeup(dev);
+
 	return 0;
 }
 
@@ -847,6 +859,9 @@ static int pci_pm_resume_noirq(struct device *dev)
 	struct device_driver *drv = dev->driver;
 	int error = 0;
 
+	if (dev_pm_may_skip_resume(dev))
+		return 0;
+
 	/*
 	 * Devices with DPM_FLAG_SMART_SUSPEND may be left in runtime suspend
 	 * during system suspend, so update their runtime PM status to "active"
@@ -953,7 +968,7 @@ static int pci_pm_freeze_late(struct device *dev)
 	if (dev_pm_smart_suspend_and_suspended(dev))
 		return 0;
 
-	return pm_generic_freeze_late(dev);;
+	return pm_generic_freeze_late(dev);
 }
 
 static int pci_pm_freeze_noirq(struct device *dev)
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index ffbf4e7..fb1c1bb 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -150,6 +150,9 @@ static int pcie_portdrv_probe(struct pci_dev *dev,
 
 	pci_save_state(dev);
 
+	dev_pm_set_driver_flags(&dev->dev, DPM_FLAG_SMART_SUSPEND |
+					   DPM_FLAG_LEAVE_SUSPENDED);
+
 	if (pci_bridge_d3_possible(dev)) {
 		/*
 		 * Keep the port resumed 100ms to make sure things like
diff --git a/drivers/perf/Kconfig b/drivers/perf/Kconfig
index b8f44b0..da5724c 100644
--- a/drivers/perf/Kconfig
+++ b/drivers/perf/Kconfig
@@ -17,6 +17,15 @@
 	depends on ARM_PMU && ACPI
 	def_bool y
 
+config ARM_DSU_PMU
+	tristate "ARM DynamIQ Shared Unit (DSU) PMU"
+	depends on ARM64
+	  help
+	  Provides support for performance monitor unit in ARM DynamIQ Shared
+	  Unit (DSU). The DSU integrates one or more cores with an L3 memory
+	  system, control logic. The PMU allows counting various events related
+	  to DSU.
+
 config HISI_PMU
        bool "HiSilicon SoC PMU"
        depends on ARM64 && ACPI
diff --git a/drivers/perf/Makefile b/drivers/perf/Makefile
index 710a013..c2f2741 100644
--- a/drivers/perf/Makefile
+++ b/drivers/perf/Makefile
@@ -1,4 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ARM_DSU_PMU) += arm_dsu_pmu.o
 obj-$(CONFIG_ARM_PMU) += arm_pmu.o arm_pmu_platform.o
 obj-$(CONFIG_ARM_PMU_ACPI) += arm_pmu_acpi.o
 obj-$(CONFIG_HISI_PMU) += hisilicon/
diff --git a/drivers/perf/arm_dsu_pmu.c b/drivers/perf/arm_dsu_pmu.c
new file mode 100644
index 0000000..93c50e3
--- /dev/null
+++ b/drivers/perf/arm_dsu_pmu.c
@@ -0,0 +1,843 @@
+/*
+ * ARM DynamIQ Shared Unit (DSU) PMU driver
+ *
+ * Copyright (C) ARM Limited, 2017.
+ *
+ * Based on ARM CCI-PMU, ARMv8 PMU-v3 drivers.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * version 2 as published by the Free Software Foundation.
+ */
+
+#define PMUNAME		"arm_dsu"
+#define DRVNAME		PMUNAME "_pmu"
+#define pr_fmt(fmt)	DRVNAME ": " fmt
+
+#include <linux/bitmap.h>
+#include <linux/bitops.h>
+#include <linux/bug.h>
+#include <linux/cpumask.h>
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of_device.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+#include <linux/spinlock.h>
+#include <linux/smp.h>
+#include <linux/sysfs.h>
+#include <linux/types.h>
+
+#include <asm/arm_dsu_pmu.h>
+#include <asm/local64.h>
+
+/* PMU event codes */
+#define DSU_PMU_EVT_CYCLES		0x11
+#define DSU_PMU_EVT_CHAIN		0x1e
+
+#define DSU_PMU_MAX_COMMON_EVENTS	0x40
+
+#define DSU_PMU_MAX_HW_CNTRS		32
+#define DSU_PMU_HW_COUNTER_MASK		(DSU_PMU_MAX_HW_CNTRS - 1)
+
+#define CLUSTERPMCR_E			BIT(0)
+#define CLUSTERPMCR_P			BIT(1)
+#define CLUSTERPMCR_C			BIT(2)
+#define CLUSTERPMCR_N_SHIFT		11
+#define CLUSTERPMCR_N_MASK		0x1f
+#define CLUSTERPMCR_IDCODE_SHIFT	16
+#define CLUSTERPMCR_IDCODE_MASK		0xff
+#define CLUSTERPMCR_IMP_SHIFT		24
+#define CLUSTERPMCR_IMP_MASK		0xff
+#define CLUSTERPMCR_RES_MASK		0x7e8
+#define CLUSTERPMCR_RES_VAL		0x40
+
+#define DSU_ACTIVE_CPU_MASK		0x0
+#define DSU_ASSOCIATED_CPU_MASK		0x1
+
+/*
+ * We use the index of the counters as they appear in the counter
+ * bit maps in the PMU registers (e.g CLUSTERPMSELR).
+ * i.e,
+ *	counter 0	- Bit 0
+ *	counter 1	- Bit 1
+ *	...
+ *	Cycle counter	- Bit 31
+ */
+#define DSU_PMU_IDX_CYCLE_COUNTER	31
+
+/* All event counters are 32bit, with a 64bit Cycle counter */
+#define DSU_PMU_COUNTER_WIDTH(idx)	\
+	(((idx) == DSU_PMU_IDX_CYCLE_COUNTER) ? 64 : 32)
+
+#define DSU_PMU_COUNTER_MASK(idx)	\
+	GENMASK_ULL((DSU_PMU_COUNTER_WIDTH((idx)) - 1), 0)
+
+#define DSU_EXT_ATTR(_name, _func, _config)		\
+	(&((struct dev_ext_attribute[]) {				\
+		{							\
+			.attr = __ATTR(_name, 0444, _func, NULL),	\
+			.var = (void *)_config				\
+		}							\
+	})[0].attr.attr)
+
+#define DSU_EVENT_ATTR(_name, _config)		\
+	DSU_EXT_ATTR(_name, dsu_pmu_sysfs_event_show, (unsigned long)_config)
+
+#define DSU_FORMAT_ATTR(_name, _config)		\
+	DSU_EXT_ATTR(_name, dsu_pmu_sysfs_format_show, (char *)_config)
+
+#define DSU_CPUMASK_ATTR(_name, _config)	\
+	DSU_EXT_ATTR(_name, dsu_pmu_cpumask_show, (unsigned long)_config)
+
+struct dsu_hw_events {
+	DECLARE_BITMAP(used_mask, DSU_PMU_MAX_HW_CNTRS);
+	struct perf_event	*events[DSU_PMU_MAX_HW_CNTRS];
+};
+
+/*
+ * struct dsu_pmu	- DSU PMU descriptor
+ *
+ * @pmu_lock		: Protects accesses to DSU PMU register from normal vs
+ *			  interrupt handler contexts.
+ * @hw_events		: Holds the event counter state.
+ * @associated_cpus	: CPUs attached to the DSU.
+ * @active_cpu		: CPU to which the PMU is bound for accesses.
+ * @cpuhp_node		: Node for CPU hotplug notifier link.
+ * @num_counters	: Number of event counters implemented by the PMU,
+ *			  excluding the cycle counter.
+ * @irq			: Interrupt line for counter overflow.
+ * @cpmceid_bitmap	: Bitmap for the availability of architected common
+ *			  events (event_code < 0x40).
+ */
+struct dsu_pmu {
+	struct pmu			pmu;
+	struct device			*dev;
+	raw_spinlock_t			pmu_lock;
+	struct dsu_hw_events		hw_events;
+	cpumask_t			associated_cpus;
+	cpumask_t			active_cpu;
+	struct hlist_node		cpuhp_node;
+	s8				num_counters;
+	int				irq;
+	DECLARE_BITMAP(cpmceid_bitmap, DSU_PMU_MAX_COMMON_EVENTS);
+};
+
+static unsigned long dsu_pmu_cpuhp_state;
+
+static inline struct dsu_pmu *to_dsu_pmu(struct pmu *pmu)
+{
+	return container_of(pmu, struct dsu_pmu, pmu);
+}
+
+static ssize_t dsu_pmu_sysfs_event_show(struct device *dev,
+					struct device_attribute *attr,
+					char *buf)
+{
+	struct dev_ext_attribute *eattr = container_of(attr,
+					struct dev_ext_attribute, attr);
+	return snprintf(buf, PAGE_SIZE, "event=0x%lx\n",
+					 (unsigned long)eattr->var);
+}
+
+static ssize_t dsu_pmu_sysfs_format_show(struct device *dev,
+					 struct device_attribute *attr,
+					 char *buf)
+{
+	struct dev_ext_attribute *eattr = container_of(attr,
+					struct dev_ext_attribute, attr);
+	return snprintf(buf, PAGE_SIZE, "%s\n", (char *)eattr->var);
+}
+
+static ssize_t dsu_pmu_cpumask_show(struct device *dev,
+				    struct device_attribute *attr,
+				    char *buf)
+{
+	struct pmu *pmu = dev_get_drvdata(dev);
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(pmu);
+	struct dev_ext_attribute *eattr = container_of(attr,
+					struct dev_ext_attribute, attr);
+	unsigned long mask_id = (unsigned long)eattr->var;
+	const cpumask_t *cpumask;
+
+	switch (mask_id) {
+	case DSU_ACTIVE_CPU_MASK:
+		cpumask = &dsu_pmu->active_cpu;
+		break;
+	case DSU_ASSOCIATED_CPU_MASK:
+		cpumask = &dsu_pmu->associated_cpus;
+		break;
+	default:
+		return 0;
+	}
+	return cpumap_print_to_pagebuf(true, buf, cpumask);
+}
+
+static struct attribute *dsu_pmu_format_attrs[] = {
+	DSU_FORMAT_ATTR(event, "config:0-31"),
+	NULL,
+};
+
+static const struct attribute_group dsu_pmu_format_attr_group = {
+	.name = "format",
+	.attrs = dsu_pmu_format_attrs,
+};
+
+static struct attribute *dsu_pmu_event_attrs[] = {
+	DSU_EVENT_ATTR(cycles, 0x11),
+	DSU_EVENT_ATTR(bus_access, 0x19),
+	DSU_EVENT_ATTR(memory_error, 0x1a),
+	DSU_EVENT_ATTR(bus_cycles, 0x1d),
+	DSU_EVENT_ATTR(l3d_cache_allocate, 0x29),
+	DSU_EVENT_ATTR(l3d_cache_refill, 0x2a),
+	DSU_EVENT_ATTR(l3d_cache, 0x2b),
+	DSU_EVENT_ATTR(l3d_cache_wb, 0x2c),
+	NULL,
+};
+
+static umode_t
+dsu_pmu_event_attr_is_visible(struct kobject *kobj, struct attribute *attr,
+				int unused)
+{
+	struct pmu *pmu = dev_get_drvdata(kobj_to_dev(kobj));
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(pmu);
+	struct dev_ext_attribute *eattr = container_of(attr,
+					struct dev_ext_attribute, attr.attr);
+	unsigned long evt = (unsigned long)eattr->var;
+
+	return test_bit(evt, dsu_pmu->cpmceid_bitmap) ? attr->mode : 0;
+}
+
+static const struct attribute_group dsu_pmu_events_attr_group = {
+	.name = "events",
+	.attrs = dsu_pmu_event_attrs,
+	.is_visible = dsu_pmu_event_attr_is_visible,
+};
+
+static struct attribute *dsu_pmu_cpumask_attrs[] = {
+	DSU_CPUMASK_ATTR(cpumask, DSU_ACTIVE_CPU_MASK),
+	DSU_CPUMASK_ATTR(associated_cpus, DSU_ASSOCIATED_CPU_MASK),
+	NULL,
+};
+
+static const struct attribute_group dsu_pmu_cpumask_attr_group = {
+	.attrs = dsu_pmu_cpumask_attrs,
+};
+
+static const struct attribute_group *dsu_pmu_attr_groups[] = {
+	&dsu_pmu_cpumask_attr_group,
+	&dsu_pmu_events_attr_group,
+	&dsu_pmu_format_attr_group,
+	NULL,
+};
+
+static int dsu_pmu_get_online_cpu_any_but(struct dsu_pmu *dsu_pmu, int cpu)
+{
+	struct cpumask online_supported;
+
+	cpumask_and(&online_supported,
+			 &dsu_pmu->associated_cpus, cpu_online_mask);
+	return cpumask_any_but(&online_supported, cpu);
+}
+
+static inline bool dsu_pmu_counter_valid(struct dsu_pmu *dsu_pmu, u32 idx)
+{
+	return (idx < dsu_pmu->num_counters) ||
+	       (idx == DSU_PMU_IDX_CYCLE_COUNTER);
+}
+
+static inline u64 dsu_pmu_read_counter(struct perf_event *event)
+{
+	u64 val;
+	unsigned long flags;
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
+	int idx = event->hw.idx;
+
+	if (WARN_ON(!cpumask_test_cpu(smp_processor_id(),
+				 &dsu_pmu->associated_cpus)))
+		return 0;
+
+	if (!dsu_pmu_counter_valid(dsu_pmu, idx)) {
+		dev_err(event->pmu->dev,
+			"Trying reading invalid counter %d\n", idx);
+		return 0;
+	}
+
+	raw_spin_lock_irqsave(&dsu_pmu->pmu_lock, flags);
+	if (idx == DSU_PMU_IDX_CYCLE_COUNTER)
+		val = __dsu_pmu_read_pmccntr();
+	else
+		val = __dsu_pmu_read_counter(idx);
+	raw_spin_unlock_irqrestore(&dsu_pmu->pmu_lock, flags);
+
+	return val;
+}
+
+static void dsu_pmu_write_counter(struct perf_event *event, u64 val)
+{
+	unsigned long flags;
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
+	int idx = event->hw.idx;
+
+	if (WARN_ON(!cpumask_test_cpu(smp_processor_id(),
+			 &dsu_pmu->associated_cpus)))
+		return;
+
+	if (!dsu_pmu_counter_valid(dsu_pmu, idx)) {
+		dev_err(event->pmu->dev,
+			"writing to invalid counter %d\n", idx);
+		return;
+	}
+
+	raw_spin_lock_irqsave(&dsu_pmu->pmu_lock, flags);
+	if (idx == DSU_PMU_IDX_CYCLE_COUNTER)
+		__dsu_pmu_write_pmccntr(val);
+	else
+		__dsu_pmu_write_counter(idx, val);
+	raw_spin_unlock_irqrestore(&dsu_pmu->pmu_lock, flags);
+}
+
+static int dsu_pmu_get_event_idx(struct dsu_hw_events *hw_events,
+				 struct perf_event *event)
+{
+	int idx;
+	unsigned long evtype = event->attr.config;
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
+	unsigned long *used_mask = hw_events->used_mask;
+
+	if (evtype == DSU_PMU_EVT_CYCLES) {
+		if (test_and_set_bit(DSU_PMU_IDX_CYCLE_COUNTER, used_mask))
+			return -EAGAIN;
+		return DSU_PMU_IDX_CYCLE_COUNTER;
+	}
+
+	idx = find_first_zero_bit(used_mask, dsu_pmu->num_counters);
+	if (idx >= dsu_pmu->num_counters)
+		return -EAGAIN;
+	set_bit(idx, hw_events->used_mask);
+	return idx;
+}
+
+static void dsu_pmu_enable_counter(struct dsu_pmu *dsu_pmu, int idx)
+{
+	__dsu_pmu_counter_interrupt_enable(idx);
+	__dsu_pmu_enable_counter(idx);
+}
+
+static void dsu_pmu_disable_counter(struct dsu_pmu *dsu_pmu, int idx)
+{
+	__dsu_pmu_disable_counter(idx);
+	__dsu_pmu_counter_interrupt_disable(idx);
+}
+
+static inline void dsu_pmu_set_event(struct dsu_pmu *dsu_pmu,
+					struct perf_event *event)
+{
+	int idx = event->hw.idx;
+	unsigned long flags;
+
+	if (!dsu_pmu_counter_valid(dsu_pmu, idx)) {
+		dev_err(event->pmu->dev,
+			"Trying to set invalid counter %d\n", idx);
+		return;
+	}
+
+	raw_spin_lock_irqsave(&dsu_pmu->pmu_lock, flags);
+	__dsu_pmu_set_event(idx, event->hw.config_base);
+	raw_spin_unlock_irqrestore(&dsu_pmu->pmu_lock, flags);
+}
+
+static void dsu_pmu_event_update(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+	u64 delta, prev_count, new_count;
+
+	do {
+		/* We may also be called from the irq handler */
+		prev_count = local64_read(&hwc->prev_count);
+		new_count = dsu_pmu_read_counter(event);
+	} while (local64_cmpxchg(&hwc->prev_count, prev_count, new_count) !=
+			prev_count);
+	delta = (new_count - prev_count) & DSU_PMU_COUNTER_MASK(hwc->idx);
+	local64_add(delta, &event->count);
+}
+
+static void dsu_pmu_read(struct perf_event *event)
+{
+	dsu_pmu_event_update(event);
+}
+
+static inline u32 dsu_pmu_get_reset_overflow(void)
+{
+	return __dsu_pmu_get_reset_overflow();
+}
+
+/**
+ * dsu_pmu_set_event_period: Set the period for the counter.
+ *
+ * All DSU PMU event counters, except the cycle counter are 32bit
+ * counters. To handle cases of extreme interrupt latency, we program
+ * the counter with half of the max count for the counters.
+ */
+static void dsu_pmu_set_event_period(struct perf_event *event)
+{
+	int idx = event->hw.idx;
+	u64 val = DSU_PMU_COUNTER_MASK(idx) >> 1;
+
+	local64_set(&event->hw.prev_count, val);
+	dsu_pmu_write_counter(event, val);
+}
+
+static irqreturn_t dsu_pmu_handle_irq(int irq_num, void *dev)
+{
+	int i;
+	bool handled = false;
+	struct dsu_pmu *dsu_pmu = dev;
+	struct dsu_hw_events *hw_events = &dsu_pmu->hw_events;
+	unsigned long overflow;
+
+	overflow = dsu_pmu_get_reset_overflow();
+	if (!overflow)
+		return IRQ_NONE;
+
+	for_each_set_bit(i, &overflow, DSU_PMU_MAX_HW_CNTRS) {
+		struct perf_event *event = hw_events->events[i];
+
+		if (!event)
+			continue;
+		dsu_pmu_event_update(event);
+		dsu_pmu_set_event_period(event);
+		handled = true;
+	}
+
+	return IRQ_RETVAL(handled);
+}
+
+static void dsu_pmu_start(struct perf_event *event, int pmu_flags)
+{
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
+
+	/* We always reprogram the counter */
+	if (pmu_flags & PERF_EF_RELOAD)
+		WARN_ON(!(event->hw.state & PERF_HES_UPTODATE));
+	dsu_pmu_set_event_period(event);
+	if (event->hw.idx != DSU_PMU_IDX_CYCLE_COUNTER)
+		dsu_pmu_set_event(dsu_pmu, event);
+	event->hw.state = 0;
+	dsu_pmu_enable_counter(dsu_pmu, event->hw.idx);
+}
+
+static void dsu_pmu_stop(struct perf_event *event, int pmu_flags)
+{
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
+
+	if (event->hw.state & PERF_HES_STOPPED)
+		return;
+	dsu_pmu_disable_counter(dsu_pmu, event->hw.idx);
+	dsu_pmu_event_update(event);
+	event->hw.state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+}
+
+static int dsu_pmu_add(struct perf_event *event, int flags)
+{
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
+	struct dsu_hw_events *hw_events = &dsu_pmu->hw_events;
+	struct hw_perf_event *hwc = &event->hw;
+	int idx;
+
+	if (WARN_ON_ONCE(!cpumask_test_cpu(smp_processor_id(),
+					   &dsu_pmu->associated_cpus)))
+		return -ENOENT;
+
+	idx = dsu_pmu_get_event_idx(hw_events, event);
+	if (idx < 0)
+		return idx;
+
+	hwc->idx = idx;
+	hw_events->events[idx] = event;
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+
+	if (flags & PERF_EF_START)
+		dsu_pmu_start(event, PERF_EF_RELOAD);
+
+	perf_event_update_userpage(event);
+	return 0;
+}
+
+static void dsu_pmu_del(struct perf_event *event, int flags)
+{
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
+	struct dsu_hw_events *hw_events = &dsu_pmu->hw_events;
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	dsu_pmu_stop(event, PERF_EF_UPDATE);
+	hw_events->events[idx] = NULL;
+	clear_bit(idx, hw_events->used_mask);
+	perf_event_update_userpage(event);
+}
+
+static void dsu_pmu_enable(struct pmu *pmu)
+{
+	u32 pmcr;
+	unsigned long flags;
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(pmu);
+
+	/* If no counters are added, skip enabling the PMU */
+	if (bitmap_empty(dsu_pmu->hw_events.used_mask, DSU_PMU_MAX_HW_CNTRS))
+		return;
+
+	raw_spin_lock_irqsave(&dsu_pmu->pmu_lock, flags);
+	pmcr = __dsu_pmu_read_pmcr();
+	pmcr |= CLUSTERPMCR_E;
+	__dsu_pmu_write_pmcr(pmcr);
+	raw_spin_unlock_irqrestore(&dsu_pmu->pmu_lock, flags);
+}
+
+static void dsu_pmu_disable(struct pmu *pmu)
+{
+	u32 pmcr;
+	unsigned long flags;
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(pmu);
+
+	raw_spin_lock_irqsave(&dsu_pmu->pmu_lock, flags);
+	pmcr = __dsu_pmu_read_pmcr();
+	pmcr &= ~CLUSTERPMCR_E;
+	__dsu_pmu_write_pmcr(pmcr);
+	raw_spin_unlock_irqrestore(&dsu_pmu->pmu_lock, flags);
+}
+
+static bool dsu_pmu_validate_event(struct pmu *pmu,
+				  struct dsu_hw_events *hw_events,
+				  struct perf_event *event)
+{
+	if (is_software_event(event))
+		return true;
+	/* Reject groups spanning multiple HW PMUs. */
+	if (event->pmu != pmu)
+		return false;
+	return dsu_pmu_get_event_idx(hw_events, event) >= 0;
+}
+
+/*
+ * Make sure the group of events can be scheduled at once
+ * on the PMU.
+ */
+static bool dsu_pmu_validate_group(struct perf_event *event)
+{
+	struct perf_event *sibling, *leader = event->group_leader;
+	struct dsu_hw_events fake_hw;
+
+	if (event->group_leader == event)
+		return true;
+
+	memset(fake_hw.used_mask, 0, sizeof(fake_hw.used_mask));
+	if (!dsu_pmu_validate_event(event->pmu, &fake_hw, leader))
+		return false;
+	list_for_each_entry(sibling, &leader->sibling_list, group_entry) {
+		if (!dsu_pmu_validate_event(event->pmu, &fake_hw, sibling))
+			return false;
+	}
+	return dsu_pmu_validate_event(event->pmu, &fake_hw, event);
+}
+
+static int dsu_pmu_event_init(struct perf_event *event)
+{
+	struct dsu_pmu *dsu_pmu = to_dsu_pmu(event->pmu);
+
+	if (event->attr.type != event->pmu->type)
+		return -ENOENT;
+
+	/* We don't support sampling */
+	if (is_sampling_event(event)) {
+		dev_dbg(dsu_pmu->pmu.dev, "Can't support sampling events\n");
+		return -EOPNOTSUPP;
+	}
+
+	/* We cannot support task bound events */
+	if (event->cpu < 0 || event->attach_state & PERF_ATTACH_TASK) {
+		dev_dbg(dsu_pmu->pmu.dev, "Can't support per-task counters\n");
+		return -EINVAL;
+	}
+
+	if (has_branch_stack(event) ||
+	    event->attr.exclude_user ||
+	    event->attr.exclude_kernel ||
+	    event->attr.exclude_hv ||
+	    event->attr.exclude_idle ||
+	    event->attr.exclude_host ||
+	    event->attr.exclude_guest) {
+		dev_dbg(dsu_pmu->pmu.dev, "Can't support filtering\n");
+		return -EINVAL;
+	}
+
+	if (!cpumask_test_cpu(event->cpu, &dsu_pmu->associated_cpus)) {
+		dev_dbg(dsu_pmu->pmu.dev,
+			 "Requested cpu is not associated with the DSU\n");
+		return -EINVAL;
+	}
+	/*
+	 * Choose the current active CPU to read the events. We don't want
+	 * to migrate the event contexts, irq handling etc to the requested
+	 * CPU. As long as the requested CPU is within the same DSU, we
+	 * are fine.
+	 */
+	event->cpu = cpumask_first(&dsu_pmu->active_cpu);
+	if (event->cpu >= nr_cpu_ids)
+		return -EINVAL;
+	if (!dsu_pmu_validate_group(event))
+		return -EINVAL;
+
+	event->hw.config_base = event->attr.config;
+	return 0;
+}
+
+static struct dsu_pmu *dsu_pmu_alloc(struct platform_device *pdev)
+{
+	struct dsu_pmu *dsu_pmu;
+
+	dsu_pmu = devm_kzalloc(&pdev->dev, sizeof(*dsu_pmu), GFP_KERNEL);
+	if (!dsu_pmu)
+		return ERR_PTR(-ENOMEM);
+
+	raw_spin_lock_init(&dsu_pmu->pmu_lock);
+	/*
+	 * Initialise the number of counters to -1, until we probe
+	 * the real number on a connected CPU.
+	 */
+	dsu_pmu->num_counters = -1;
+	return dsu_pmu;
+}
+
+/**
+ * dsu_pmu_dt_get_cpus: Get the list of CPUs in the cluster.
+ */
+static int dsu_pmu_dt_get_cpus(struct device_node *dev, cpumask_t *mask)
+{
+	int i = 0, n, cpu;
+	struct device_node *cpu_node;
+
+	n = of_count_phandle_with_args(dev, "cpus", NULL);
+	if (n <= 0)
+		return -ENODEV;
+	for (; i < n; i++) {
+		cpu_node = of_parse_phandle(dev, "cpus", i);
+		if (!cpu_node)
+			break;
+		cpu = of_cpu_node_to_id(cpu_node);
+		of_node_put(cpu_node);
+		/*
+		 * We have to ignore the failures here and continue scanning
+		 * the list to handle cases where the nr_cpus could be capped
+		 * in the running kernel.
+		 */
+		if (cpu < 0)
+			continue;
+		cpumask_set_cpu(cpu, mask);
+	}
+	return 0;
+}
+
+/*
+ * dsu_pmu_probe_pmu: Probe the PMU details on a CPU in the cluster.
+ */
+static void dsu_pmu_probe_pmu(struct dsu_pmu *dsu_pmu)
+{
+	u64 num_counters;
+	u32 cpmceid[2];
+
+	num_counters = (__dsu_pmu_read_pmcr() >> CLUSTERPMCR_N_SHIFT) &
+						CLUSTERPMCR_N_MASK;
+	/* We can only support up to 31 independent counters */
+	if (WARN_ON(num_counters > 31))
+		num_counters = 31;
+	dsu_pmu->num_counters = num_counters;
+	if (!dsu_pmu->num_counters)
+		return;
+	cpmceid[0] = __dsu_pmu_read_pmceid(0);
+	cpmceid[1] = __dsu_pmu_read_pmceid(1);
+	bitmap_from_u32array(dsu_pmu->cpmceid_bitmap,
+				DSU_PMU_MAX_COMMON_EVENTS,
+				cpmceid,
+				ARRAY_SIZE(cpmceid));
+}
+
+static void dsu_pmu_set_active_cpu(int cpu, struct dsu_pmu *dsu_pmu)
+{
+	cpumask_set_cpu(cpu, &dsu_pmu->active_cpu);
+	if (irq_set_affinity_hint(dsu_pmu->irq, &dsu_pmu->active_cpu))
+		pr_warn("Failed to set irq affinity to %d\n", cpu);
+}
+
+/*
+ * dsu_pmu_init_pmu: Initialise the DSU PMU configurations if
+ * we haven't done it already.
+ */
+static void dsu_pmu_init_pmu(struct dsu_pmu *dsu_pmu)
+{
+	if (dsu_pmu->num_counters == -1)
+		dsu_pmu_probe_pmu(dsu_pmu);
+	/* Reset the interrupt overflow mask */
+	dsu_pmu_get_reset_overflow();
+}
+
+static int dsu_pmu_device_probe(struct platform_device *pdev)
+{
+	int irq, rc;
+	struct dsu_pmu *dsu_pmu;
+	char *name;
+	static atomic_t pmu_idx = ATOMIC_INIT(-1);
+
+	dsu_pmu = dsu_pmu_alloc(pdev);
+	if (IS_ERR(dsu_pmu))
+		return PTR_ERR(dsu_pmu);
+
+	rc = dsu_pmu_dt_get_cpus(pdev->dev.of_node, &dsu_pmu->associated_cpus);
+	if (rc) {
+		dev_warn(&pdev->dev, "Failed to parse the CPUs\n");
+		return rc;
+	}
+
+	irq = platform_get_irq(pdev, 0);
+	if (irq < 0) {
+		dev_warn(&pdev->dev, "Failed to find IRQ\n");
+		return -EINVAL;
+	}
+
+	name = devm_kasprintf(&pdev->dev, GFP_KERNEL, "%s_%d",
+				PMUNAME, atomic_inc_return(&pmu_idx));
+	if (!name)
+		return -ENOMEM;
+	rc = devm_request_irq(&pdev->dev, irq, dsu_pmu_handle_irq,
+			      IRQF_NOBALANCING, name, dsu_pmu);
+	if (rc) {
+		dev_warn(&pdev->dev, "Failed to request IRQ %d\n", irq);
+		return rc;
+	}
+
+	dsu_pmu->irq = irq;
+	platform_set_drvdata(pdev, dsu_pmu);
+	rc = cpuhp_state_add_instance(dsu_pmu_cpuhp_state,
+						&dsu_pmu->cpuhp_node);
+	if (rc)
+		return rc;
+
+	dsu_pmu->pmu = (struct pmu) {
+		.task_ctx_nr	= perf_invalid_context,
+		.module		= THIS_MODULE,
+		.pmu_enable	= dsu_pmu_enable,
+		.pmu_disable	= dsu_pmu_disable,
+		.event_init	= dsu_pmu_event_init,
+		.add		= dsu_pmu_add,
+		.del		= dsu_pmu_del,
+		.start		= dsu_pmu_start,
+		.stop		= dsu_pmu_stop,
+		.read		= dsu_pmu_read,
+
+		.attr_groups	= dsu_pmu_attr_groups,
+	};
+
+	rc = perf_pmu_register(&dsu_pmu->pmu, name, -1);
+	if (rc) {
+		cpuhp_state_remove_instance(dsu_pmu_cpuhp_state,
+						 &dsu_pmu->cpuhp_node);
+		irq_set_affinity_hint(dsu_pmu->irq, NULL);
+	}
+
+	return rc;
+}
+
+static int dsu_pmu_device_remove(struct platform_device *pdev)
+{
+	struct dsu_pmu *dsu_pmu = platform_get_drvdata(pdev);
+
+	perf_pmu_unregister(&dsu_pmu->pmu);
+	cpuhp_state_remove_instance(dsu_pmu_cpuhp_state, &dsu_pmu->cpuhp_node);
+	irq_set_affinity_hint(dsu_pmu->irq, NULL);
+
+	return 0;
+}
+
+static const struct of_device_id dsu_pmu_of_match[] = {
+	{ .compatible = "arm,dsu-pmu", },
+	{},
+};
+
+static struct platform_driver dsu_pmu_driver = {
+	.driver = {
+		.name	= DRVNAME,
+		.of_match_table = of_match_ptr(dsu_pmu_of_match),
+	},
+	.probe = dsu_pmu_device_probe,
+	.remove = dsu_pmu_device_remove,
+};
+
+static int dsu_pmu_cpu_online(unsigned int cpu, struct hlist_node *node)
+{
+	struct dsu_pmu *dsu_pmu = hlist_entry_safe(node, struct dsu_pmu,
+						   cpuhp_node);
+
+	if (!cpumask_test_cpu(cpu, &dsu_pmu->associated_cpus))
+		return 0;
+
+	/* If the PMU is already managed, there is nothing to do */
+	if (!cpumask_empty(&dsu_pmu->active_cpu))
+		return 0;
+
+	dsu_pmu_init_pmu(dsu_pmu);
+	dsu_pmu_set_active_cpu(cpu, dsu_pmu);
+
+	return 0;
+}
+
+static int dsu_pmu_cpu_teardown(unsigned int cpu, struct hlist_node *node)
+{
+	int dst;
+	struct dsu_pmu *dsu_pmu = hlist_entry_safe(node, struct dsu_pmu,
+						   cpuhp_node);
+
+	if (!cpumask_test_and_clear_cpu(cpu, &dsu_pmu->active_cpu))
+		return 0;
+
+	dst = dsu_pmu_get_online_cpu_any_but(dsu_pmu, cpu);
+	/* If there are no active CPUs in the DSU, leave IRQ disabled */
+	if (dst >= nr_cpu_ids) {
+		irq_set_affinity_hint(dsu_pmu->irq, NULL);
+		return 0;
+	}
+
+	perf_pmu_migrate_context(&dsu_pmu->pmu, cpu, dst);
+	dsu_pmu_set_active_cpu(dst, dsu_pmu);
+
+	return 0;
+}
+
+static int __init dsu_pmu_init(void)
+{
+	int ret;
+
+	ret = cpuhp_setup_state_multi(CPUHP_AP_ONLINE_DYN,
+					DRVNAME,
+					dsu_pmu_cpu_online,
+					dsu_pmu_cpu_teardown);
+	if (ret < 0)
+		return ret;
+	dsu_pmu_cpuhp_state = ret;
+	return platform_driver_register(&dsu_pmu_driver);
+}
+
+static void __exit dsu_pmu_exit(void)
+{
+	platform_driver_unregister(&dsu_pmu_driver);
+	cpuhp_remove_multi_state(dsu_pmu_cpuhp_state);
+}
+
+module_init(dsu_pmu_init);
+module_exit(dsu_pmu_exit);
+
+MODULE_DEVICE_TABLE(of, dsu_pmu_of_match);
+MODULE_DESCRIPTION("Perf driver for ARM DynamIQ Shared Unit");
+MODULE_AUTHOR("Suzuki K Poulose <suzuki.poulose@arm.com>");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/perf/arm_pmu_platform.c b/drivers/perf/arm_pmu_platform.c
index 91b224ec..46501cc 100644
--- a/drivers/perf/arm_pmu_platform.c
+++ b/drivers/perf/arm_pmu_platform.c
@@ -82,19 +82,10 @@ static int pmu_parse_irq_affinity(struct device_node *node, int i)
 		return -EINVAL;
 	}
 
-	/* Now look up the logical CPU number */
-	for_each_possible_cpu(cpu) {
-		struct device_node *cpu_dn;
-
-		cpu_dn = of_cpu_device_node_get(cpu);
-		of_node_put(cpu_dn);
-
-		if (dn == cpu_dn)
-			break;
-	}
-
-	if (cpu >= nr_cpu_ids) {
+	cpu = of_cpu_node_to_id(dn);
+	if (cpu < 0) {
 		pr_warn("failed to find logical CPU for %s\n", dn->name);
+		cpu = nr_cpu_ids;
 	}
 
 	of_node_put(dn);
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 8ce262f..51b40ae 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -1164,6 +1164,15 @@ static int arm_spe_pmu_device_dt_probe(struct platform_device *pdev)
 	struct arm_spe_pmu *spe_pmu;
 	struct device *dev = &pdev->dev;
 
+	/*
+	 * If kernelspace is unmapped when running at EL0, then the SPE
+	 * buffer will fault and prematurely terminate the AUX session.
+	 */
+	if (arm64_kernel_unmapped_at_el0()) {
+		dev_warn_once(dev, "profiling buffer inaccessible. Try passing \"kpti=off\" on the kernel command line\n");
+		return -EPERM;
+	}
+
 	spe_pmu = devm_kzalloc(dev, sizeof(*spe_pmu), GFP_KERNEL);
 	if (!spe_pmu) {
 		dev_err(dev, "failed to allocate spe_pmu\n");
diff --git a/drivers/phy/phy-core.c b/drivers/phy/phy-core.c
index b4964b0..8f6e8e2 100644
--- a/drivers/phy/phy-core.c
+++ b/drivers/phy/phy-core.c
@@ -410,6 +410,10 @@ static struct phy *_of_phy_get(struct device_node *np, int index)
 	if (ret)
 		return ERR_PTR(-ENODEV);
 
+	/* This phy type handled by the usb-phy subsystem for now */
+	if (of_device_is_compatible(args.np, "usb-nop-xceiv"))
+		return ERR_PTR(-ENODEV);
+
 	mutex_lock(&phy_provider_mutex);
 	phy_provider = of_phy_provider_lookup(args.np);
 	if (IS_ERR(phy_provider) || !try_module_get(phy_provider->owner)) {
diff --git a/drivers/platform/chrome/Kconfig b/drivers/platform/chrome/Kconfig
index 0ad6e29..e728a96 100644
--- a/drivers/platform/chrome/Kconfig
+++ b/drivers/platform/chrome/Kconfig
@@ -38,14 +38,8 @@
 	  If you have a supported Chromebook, choose Y or M here.
 	  The module will be called chromeos_pstore.
 
-config CROS_EC_CHARDEV
-        tristate "Chrome OS Embedded Controller userspace device interface"
-        depends on MFD_CROS_EC
-        ---help---
-          This driver adds support to talk with the ChromeOS EC from userspace.
-
-          If you have a supported Chromebook, choose Y or M here.
-          The module will be called cros_ec_dev.
+config CROS_EC_CTL
+        tristate
 
 config CROS_EC_LPC
         tristate "ChromeOS Embedded Controller (LPC)"
diff --git a/drivers/platform/chrome/Makefile b/drivers/platform/chrome/Makefile
index a077b1f..ff3b369 100644
--- a/drivers/platform/chrome/Makefile
+++ b/drivers/platform/chrome/Makefile
@@ -2,10 +2,9 @@
 
 obj-$(CONFIG_CHROMEOS_LAPTOP)		+= chromeos_laptop.o
 obj-$(CONFIG_CHROMEOS_PSTORE)		+= chromeos_pstore.o
-cros_ec_devs-objs			:= cros_ec_dev.o cros_ec_sysfs.o \
-					   cros_ec_lightbar.o cros_ec_vbc.o \
-					   cros_ec_debugfs.o
-obj-$(CONFIG_CROS_EC_CHARDEV)		+= cros_ec_devs.o
+cros_ec_ctl-objs			:= cros_ec_sysfs.o cros_ec_lightbar.o \
+					   cros_ec_vbc.o cros_ec_debugfs.o
+obj-$(CONFIG_CROS_EC_CTL)		+= cros_ec_ctl.o
 cros_ec_lpcs-objs			:= cros_ec_lpc.o cros_ec_lpc_reg.o
 cros_ec_lpcs-$(CONFIG_CROS_EC_LPC_MEC)	+= cros_ec_lpc_mec.o
 obj-$(CONFIG_CROS_EC_LPC)		+= cros_ec_lpcs.o
diff --git a/drivers/platform/chrome/cros_ec_debugfs.c b/drivers/platform/chrome/cros_ec_debugfs.c
index 4cc66f4..98a35d3 100644
--- a/drivers/platform/chrome/cros_ec_debugfs.c
+++ b/drivers/platform/chrome/cros_ec_debugfs.c
@@ -29,9 +29,6 @@
 #include <linux/slab.h>
 #include <linux/wait.h>
 
-#include "cros_ec_dev.h"
-#include "cros_ec_debugfs.h"
-
 #define LOG_SHIFT		14
 #define LOG_SIZE		(1 << LOG_SHIFT)
 #define LOG_POLL_SEC		10
@@ -390,6 +387,7 @@ int cros_ec_debugfs_init(struct cros_ec_dev *ec)
 	debugfs_remove_recursive(debug_info->dir);
 	return ret;
 }
+EXPORT_SYMBOL(cros_ec_debugfs_init);
 
 void cros_ec_debugfs_remove(struct cros_ec_dev *ec)
 {
@@ -399,3 +397,4 @@ void cros_ec_debugfs_remove(struct cros_ec_dev *ec)
 	debugfs_remove_recursive(ec->debug_info->dir);
 	cros_ec_cleanup_console_log(ec->debug_info);
 }
+EXPORT_SYMBOL(cros_ec_debugfs_remove);
diff --git a/drivers/platform/chrome/cros_ec_debugfs.h b/drivers/platform/chrome/cros_ec_debugfs.h
deleted file mode 100644
index 1ff3a50..0000000
--- a/drivers/platform/chrome/cros_ec_debugfs.h
+++ /dev/null
@@ -1,27 +0,0 @@
-/*
- * Copyright 2015 Google, Inc.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program. If not, see <http://www.gnu.org/licenses/>.
- */
-
-#ifndef _DRV_CROS_EC_DEBUGFS_H_
-#define _DRV_CROS_EC_DEBUGFS_H_
-
-#include "cros_ec_dev.h"
-
-/* debugfs stuff */
-int cros_ec_debugfs_init(struct cros_ec_dev *ec);
-void cros_ec_debugfs_remove(struct cros_ec_dev *ec);
-
-#endif  /* _DRV_CROS_EC_DEBUGFS_H_ */
diff --git a/drivers/platform/chrome/cros_ec_lightbar.c b/drivers/platform/chrome/cros_ec_lightbar.c
index fd2b047..6ea79d4 100644
--- a/drivers/platform/chrome/cros_ec_lightbar.c
+++ b/drivers/platform/chrome/cros_ec_lightbar.c
@@ -33,8 +33,6 @@
 #include <linux/uaccess.h>
 #include <linux/slab.h>
 
-#include "cros_ec_dev.h"
-
 /* Rate-limit the lightbar interface to prevent DoS. */
 static unsigned long lb_interval_jiffies = 50 * HZ / 1000;
 
@@ -414,6 +412,7 @@ int lb_manual_suspend_ctrl(struct cros_ec_dev *ec, uint8_t enable)
 
 	return ret;
 }
+EXPORT_SYMBOL(lb_manual_suspend_ctrl);
 
 int lb_suspend(struct cros_ec_dev *ec)
 {
@@ -422,6 +421,7 @@ int lb_suspend(struct cros_ec_dev *ec)
 
 	return lb_send_empty_cmd(ec, LIGHTBAR_CMD_SUSPEND);
 }
+EXPORT_SYMBOL(lb_suspend);
 
 int lb_resume(struct cros_ec_dev *ec)
 {
@@ -430,6 +430,7 @@ int lb_resume(struct cros_ec_dev *ec)
 
 	return lb_send_empty_cmd(ec, LIGHTBAR_CMD_RESUME);
 }
+EXPORT_SYMBOL(lb_resume);
 
 static ssize_t sequence_store(struct device *dev, struct device_attribute *attr,
 			      const char *buf, size_t count)
@@ -622,3 +623,4 @@ struct attribute_group cros_ec_lightbar_attr_group = {
 	.attrs = __lb_cmds_attrs,
 	.is_visible = cros_ec_lightbar_attrs_are_visible,
 };
+EXPORT_SYMBOL(cros_ec_lightbar_attr_group);
diff --git a/drivers/platform/chrome/cros_ec_sysfs.c b/drivers/platform/chrome/cros_ec_sysfs.c
index f3baf99..d6eebe8 100644
--- a/drivers/platform/chrome/cros_ec_sysfs.c
+++ b/drivers/platform/chrome/cros_ec_sysfs.c
@@ -34,8 +34,6 @@
 #include <linux/types.h>
 #include <linux/uaccess.h>
 
-#include "cros_ec_dev.h"
-
 /* Accessor functions */
 
 static ssize_t show_ec_reboot(struct device *dev,
@@ -294,4 +292,7 @@ static struct attribute *__ec_attrs[] = {
 struct attribute_group cros_ec_attr_group = {
 	.attrs = __ec_attrs,
 };
+EXPORT_SYMBOL(cros_ec_attr_group);
 
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("ChromeOS EC control driver");
diff --git a/drivers/platform/chrome/cros_ec_vbc.c b/drivers/platform/chrome/cros_ec_vbc.c
index 564a0d0..6d38e6b 100644
--- a/drivers/platform/chrome/cros_ec_vbc.c
+++ b/drivers/platform/chrome/cros_ec_vbc.c
@@ -135,3 +135,4 @@ struct attribute_group cros_ec_vbc_attr_group = {
 	.bin_attrs = cros_ec_vbc_bin_attrs,
 	.is_bin_visible = cros_ec_vbc_is_visible,
 };
+EXPORT_SYMBOL(cros_ec_vbc_attr_group);
diff --git a/drivers/platform/x86/surfacepro3_button.c b/drivers/platform/x86/surfacepro3_button.c
index 6505c97..1b49169 100644
--- a/drivers/platform/x86/surfacepro3_button.c
+++ b/drivers/platform/x86/surfacepro3_button.c
@@ -119,7 +119,7 @@ static void surface_button_notify(struct acpi_device *device, u32 event)
 	if (key_code == KEY_RESERVED)
 		return;
 	if (pressed)
-		pm_wakeup_event(&device->dev, 0);
+		pm_wakeup_dev_event(&device->dev, 0, button->suspended);
 	if (button->suspended)
 		return;
 	input_report_key(input, key_code, pressed?1:0);
@@ -185,6 +185,8 @@ static int surface_button_add(struct acpi_device *device)
 	error = input_register_device(input);
 	if (error)
 		goto err_free_input;
+
+	device_init_wakeup(&device->dev, true);
 	dev_info(&device->dev,
 			"%s [%s]\n", name, acpi_device_bid(device));
 	return 0;
diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c
index 791449a..daa68ac 100644
--- a/drivers/platform/x86/wmi.c
+++ b/drivers/platform/x86/wmi.c
@@ -1458,5 +1458,5 @@ static void __exit acpi_wmi_exit(void)
 	class_unregister(&wmi_bus_class);
 }
 
-subsys_initcall(acpi_wmi_init);
+subsys_initcall_sync(acpi_wmi_init);
 module_exit(acpi_wmi_exit);
diff --git a/drivers/pnp/pnpbios/core.c b/drivers/pnp/pnpbios/core.c
index e681140..077f334 100644
--- a/drivers/pnp/pnpbios/core.c
+++ b/drivers/pnp/pnpbios/core.c
@@ -581,10 +581,7 @@ static int __init pnpbios_thread_init(void)
 
 	init_completion(&unload_sem);
 	task = kthread_run(pnp_dock_thread, NULL, "kpnpbiosd");
-	if (IS_ERR(task))
-		return PTR_ERR(task);
-
-	return 0;
+	return PTR_ERR_OR_ZERO(task);
 }
 
 /* Start the kernel thread later: */
diff --git a/drivers/pnp/quirks.c b/drivers/pnp/quirks.c
index f054cdd..803666a 100644
--- a/drivers/pnp/quirks.c
+++ b/drivers/pnp/quirks.c
@@ -21,7 +21,6 @@
 #include <linux/slab.h>
 #include <linux/pnp.h>
 #include <linux/io.h>
-#include <linux/kallsyms.h>
 #include "base.h"
 
 static void quirk_awe32_add_ports(struct pnp_dev *dev,
diff --git a/drivers/power/avs/rockchip-io-domain.c b/drivers/power/avs/rockchip-io-domain.c
index 75f63e3..ed2b109 100644
--- a/drivers/power/avs/rockchip-io-domain.c
+++ b/drivers/power/avs/rockchip-io-domain.c
@@ -76,7 +76,7 @@ struct rockchip_iodomain_supply {
 struct rockchip_iodomain {
 	struct device *dev;
 	struct regmap *grf;
-	struct rockchip_iodomain_soc_data *soc_data;
+	const struct rockchip_iodomain_soc_data *soc_data;
 	struct rockchip_iodomain_supply supplies[MAX_SUPPLIES];
 };
 
@@ -382,43 +382,43 @@ static const struct rockchip_iodomain_soc_data soc_data_rv1108_pmu = {
 static const struct of_device_id rockchip_iodomain_match[] = {
 	{
 		.compatible = "rockchip,rk3188-io-voltage-domain",
-		.data = (void *)&soc_data_rk3188
+		.data = &soc_data_rk3188
 	},
 	{
 		.compatible = "rockchip,rk3228-io-voltage-domain",
-		.data = (void *)&soc_data_rk3228
+		.data = &soc_data_rk3228
 	},
 	{
 		.compatible = "rockchip,rk3288-io-voltage-domain",
-		.data = (void *)&soc_data_rk3288
+		.data = &soc_data_rk3288
 	},
 	{
 		.compatible = "rockchip,rk3328-io-voltage-domain",
-		.data = (void *)&soc_data_rk3328
+		.data = &soc_data_rk3328
 	},
 	{
 		.compatible = "rockchip,rk3368-io-voltage-domain",
-		.data = (void *)&soc_data_rk3368
+		.data = &soc_data_rk3368
 	},
 	{
 		.compatible = "rockchip,rk3368-pmu-io-voltage-domain",
-		.data = (void *)&soc_data_rk3368_pmu
+		.data = &soc_data_rk3368_pmu
 	},
 	{
 		.compatible = "rockchip,rk3399-io-voltage-domain",
-		.data = (void *)&soc_data_rk3399
+		.data = &soc_data_rk3399
 	},
 	{
 		.compatible = "rockchip,rk3399-pmu-io-voltage-domain",
-		.data = (void *)&soc_data_rk3399_pmu
+		.data = &soc_data_rk3399_pmu
 	},
 	{
 		.compatible = "rockchip,rv1108-io-voltage-domain",
-		.data = (void *)&soc_data_rv1108
+		.data = &soc_data_rv1108
 	},
 	{
 		.compatible = "rockchip,rv1108-pmu-io-voltage-domain",
-		.data = (void *)&soc_data_rv1108_pmu
+		.data = &soc_data_rv1108_pmu
 	},
 	{ /* sentinel */ },
 };
@@ -443,7 +443,7 @@ static int rockchip_iodomain_probe(struct platform_device *pdev)
 	platform_set_drvdata(pdev, iod);
 
 	match = of_match_node(rockchip_iodomain_match, np);
-	iod->soc_data = (struct rockchip_iodomain_soc_data *)match->data;
+	iod->soc_data = match->data;
 
 	parent = pdev->dev.parent;
 	if (parent && parent->of_node) {
diff --git a/drivers/powercap/intel_rapl.c b/drivers/powercap/intel_rapl.c
index d1694f1..35636e1 100644
--- a/drivers/powercap/intel_rapl.c
+++ b/drivers/powercap/intel_rapl.c
@@ -29,6 +29,7 @@
 #include <linux/sysfs.h>
 #include <linux/cpu.h>
 #include <linux/powercap.h>
+#include <linux/suspend.h>
 #include <asm/iosf_mbi.h>
 
 #include <asm/processor.h>
@@ -155,6 +156,7 @@ struct rapl_power_limit {
 	int prim_id; /* primitive ID used to enable */
 	struct rapl_domain *domain;
 	const char *name;
+	u64 last_power_limit;
 };
 
 static const char pl1_name[] = "long_term";
@@ -1209,7 +1211,7 @@ static int rapl_package_register_powercap(struct rapl_package *rp)
 	struct rapl_domain *rd;
 	char dev_name[17]; /* max domain name = 7 + 1 + 8 for int + 1 for null*/
 	struct powercap_zone *power_zone = NULL;
-	int nr_pl, ret;;
+	int nr_pl, ret;
 
 	/* Update the domain data of the new package */
 	rapl_update_domain_data(rp);
@@ -1533,6 +1535,92 @@ static int rapl_cpu_down_prep(unsigned int cpu)
 
 static enum cpuhp_state pcap_rapl_online;
 
+static void power_limit_state_save(void)
+{
+	struct rapl_package *rp;
+	struct rapl_domain *rd;
+	int nr_pl, ret, i;
+
+	get_online_cpus();
+	list_for_each_entry(rp, &rapl_packages, plist) {
+		if (!rp->power_zone)
+			continue;
+		rd = power_zone_to_rapl_domain(rp->power_zone);
+		nr_pl = find_nr_power_limit(rd);
+		for (i = 0; i < nr_pl; i++) {
+			switch (rd->rpl[i].prim_id) {
+			case PL1_ENABLE:
+				ret = rapl_read_data_raw(rd,
+						POWER_LIMIT1,
+						true,
+						&rd->rpl[i].last_power_limit);
+				if (ret)
+					rd->rpl[i].last_power_limit = 0;
+				break;
+			case PL2_ENABLE:
+				ret = rapl_read_data_raw(rd,
+						POWER_LIMIT2,
+						true,
+						&rd->rpl[i].last_power_limit);
+				if (ret)
+					rd->rpl[i].last_power_limit = 0;
+				break;
+			}
+		}
+	}
+	put_online_cpus();
+}
+
+static void power_limit_state_restore(void)
+{
+	struct rapl_package *rp;
+	struct rapl_domain *rd;
+	int nr_pl, i;
+
+	get_online_cpus();
+	list_for_each_entry(rp, &rapl_packages, plist) {
+		if (!rp->power_zone)
+			continue;
+		rd = power_zone_to_rapl_domain(rp->power_zone);
+		nr_pl = find_nr_power_limit(rd);
+		for (i = 0; i < nr_pl; i++) {
+			switch (rd->rpl[i].prim_id) {
+			case PL1_ENABLE:
+				if (rd->rpl[i].last_power_limit)
+					rapl_write_data_raw(rd,
+						POWER_LIMIT1,
+						rd->rpl[i].last_power_limit);
+				break;
+			case PL2_ENABLE:
+				if (rd->rpl[i].last_power_limit)
+					rapl_write_data_raw(rd,
+						POWER_LIMIT2,
+						rd->rpl[i].last_power_limit);
+				break;
+			}
+		}
+	}
+	put_online_cpus();
+}
+
+static int rapl_pm_callback(struct notifier_block *nb,
+	unsigned long mode, void *_unused)
+{
+	switch (mode) {
+	case PM_SUSPEND_PREPARE:
+		power_limit_state_save();
+		break;
+	case PM_POST_SUSPEND:
+		power_limit_state_restore();
+		break;
+	}
+	return NOTIFY_OK;
+}
+
+static struct notifier_block rapl_pm_notifier = {
+	.notifier_call = rapl_pm_callback,
+};
+
 static int __init rapl_init(void)
 {
 	const struct x86_cpu_id *id;
@@ -1560,8 +1648,16 @@ static int __init rapl_init(void)
 
 	/* Don't bail out if PSys is not supported */
 	rapl_register_psys();
+
+	ret = register_pm_notifier(&rapl_pm_notifier);
+	if (ret)
+		goto err_unreg_all;
+
 	return 0;
 
+err_unreg_all:
+	cpuhp_remove_state(pcap_rapl_online);
+
 err_unreg:
 	rapl_unregister_powercap();
 	return ret;
@@ -1569,6 +1665,7 @@ static int __init rapl_init(void)
 
 static void __exit rapl_exit(void)
 {
+	unregister_pm_notifier(&rapl_pm_notifier);
 	cpuhp_remove_state(pcap_rapl_online);
 	rapl_unregister_powercap();
 }
diff --git a/drivers/powercap/powercap_sys.c b/drivers/powercap/powercap_sys.c
index 5b10b50..64b2b25 100644
--- a/drivers/powercap/powercap_sys.c
+++ b/drivers/powercap/powercap_sys.c
@@ -673,15 +673,13 @@ EXPORT_SYMBOL_GPL(powercap_unregister_control_type);
 
 static int __init powercap_init(void)
 {
-	int result = 0;
+	int result;
 
 	result = seed_constraint_attributes();
 	if (result)
 		return result;
 
-	result = class_register(&powercap_class);
-
-	return result;
+	return class_register(&powercap_class);
 }
 
 device_initcall(powercap_init);
diff --git a/drivers/regulator/Kconfig b/drivers/regulator/Kconfig
index 96cd55f..b27417c 100644
--- a/drivers/regulator/Kconfig
+++ b/drivers/regulator/Kconfig
@@ -744,6 +744,13 @@
 	 via I2C bus. S5M8767A have 9 Bucks and 28 LDOs output and
 	 supports DVS mode with 8bits of output voltage control.
 
+config REGULATOR_SC2731
+	tristate "Spreadtrum SC2731 power regulator driver"
+	depends on MFD_SC27XX_PMIC || COMPILE_TEST
+	help
+	  This driver provides support for the voltage regulators on the
+	  SC2731 PMIC.
+
 config REGULATOR_SKY81452
 	tristate "Skyworks Solutions SKY81452 voltage regulator"
 	depends on MFD_SKY81452
diff --git a/drivers/regulator/Makefile b/drivers/regulator/Makefile
index 80ffc57a..19fea09 100644
--- a/drivers/regulator/Makefile
+++ b/drivers/regulator/Makefile
@@ -95,6 +95,7 @@
 obj-$(CONFIG_REGULATOR_S2MPA01) += s2mpa01.o
 obj-$(CONFIG_REGULATOR_S2MPS11) += s2mps11.o
 obj-$(CONFIG_REGULATOR_S5M8767) += s5m8767.o
+obj-$(CONFIG_REGULATOR_SC2731) += sc2731-regulator.o
 obj-$(CONFIG_REGULATOR_SKY81452) += sky81452-regulator.o
 obj-$(CONFIG_REGULATOR_STM32_VREFBUF) += stm32-vrefbuf.o
 obj-$(CONFIG_REGULATOR_STW481X_VMMC) += stw481x-vmmc.o
diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c
index b64b791..42681c1 100644
--- a/drivers/regulator/core.c
+++ b/drivers/regulator/core.c
@@ -58,8 +58,6 @@ static bool has_full_constraints;
 
 static struct dentry *debugfs_root;
 
-static struct class regulator_class;
-
 /*
  * struct regulator_map
  *
@@ -112,11 +110,6 @@ static struct regulator *create_regulator(struct regulator_dev *rdev,
 					  const char *supply_name);
 static void _regulator_put(struct regulator *regulator);
 
-static struct regulator_dev *dev_to_rdev(struct device *dev)
-{
-	return container_of(dev, struct regulator_dev, dev);
-}
-
 static const char *rdev_get_name(struct regulator_dev *rdev)
 {
 	if (rdev->constraints && rdev->constraints->name)
@@ -236,26 +229,35 @@ static int regulator_check_voltage(struct regulator_dev *rdev,
 	return 0;
 }
 
+/* return 0 if the state is valid */
+static int regulator_check_states(suspend_state_t state)
+{
+	return (state > PM_SUSPEND_MAX || state == PM_SUSPEND_TO_IDLE);
+}
+
 /* Make sure we select a voltage that suits the needs of all
  * regulator consumers
  */
 static int regulator_check_consumers(struct regulator_dev *rdev,
-				     int *min_uV, int *max_uV)
+				     int *min_uV, int *max_uV,
+				     suspend_state_t state)
 {
 	struct regulator *regulator;
+	struct regulator_voltage *voltage;
 
 	list_for_each_entry(regulator, &rdev->consumer_list, list) {
+		voltage = &regulator->voltage[state];
 		/*
 		 * Assume consumers that didn't say anything are OK
 		 * with anything in the constraint range.
 		 */
-		if (!regulator->min_uV && !regulator->max_uV)
+		if (!voltage->min_uV && !voltage->max_uV)
 			continue;
 
-		if (*max_uV > regulator->max_uV)
-			*max_uV = regulator->max_uV;
-		if (*min_uV < regulator->min_uV)
-			*min_uV = regulator->min_uV;
+		if (*max_uV > voltage->max_uV)
+			*max_uV = voltage->max_uV;
+		if (*min_uV < voltage->min_uV)
+			*min_uV = voltage->min_uV;
 	}
 
 	if (*min_uV > *max_uV) {
@@ -324,6 +326,24 @@ static int regulator_mode_constrain(struct regulator_dev *rdev,
 	return -EINVAL;
 }
 
+static inline struct regulator_state *
+regulator_get_suspend_state(struct regulator_dev *rdev, suspend_state_t state)
+{
+	if (rdev->constraints == NULL)
+		return NULL;
+
+	switch (state) {
+	case PM_SUSPEND_STANDBY:
+		return &rdev->constraints->state_standby;
+	case PM_SUSPEND_MEM:
+		return &rdev->constraints->state_mem;
+	case PM_SUSPEND_MAX:
+		return &rdev->constraints->state_disk;
+	default:
+		return NULL;
+	}
+}
+
 static ssize_t regulator_uV_show(struct device *dev,
 				struct device_attribute *attr, char *buf)
 {
@@ -731,29 +751,32 @@ static int drms_uA_update(struct regulator_dev *rdev)
 }
 
 static int suspend_set_state(struct regulator_dev *rdev,
-	struct regulator_state *rstate)
+				    suspend_state_t state)
 {
 	int ret = 0;
+	struct regulator_state *rstate;
+
+	rstate = regulator_get_suspend_state(rdev, state);
+	if (rstate == NULL)
+		return -EINVAL;
 
 	/* If we have no suspend mode configration don't set anything;
 	 * only warn if the driver implements set_suspend_voltage or
 	 * set_suspend_mode callback.
 	 */
-	if (!rstate->enabled && !rstate->disabled) {
+	if (rstate->enabled != ENABLE_IN_SUSPEND &&
+	    rstate->enabled != DISABLE_IN_SUSPEND) {
 		if (rdev->desc->ops->set_suspend_voltage ||
 		    rdev->desc->ops->set_suspend_mode)
 			rdev_warn(rdev, "No configuration\n");
 		return 0;
 	}
 
-	if (rstate->enabled && rstate->disabled) {
-		rdev_err(rdev, "invalid configuration\n");
-		return -EINVAL;
-	}
-
-	if (rstate->enabled && rdev->desc->ops->set_suspend_enable)
+	if (rstate->enabled == ENABLE_IN_SUSPEND &&
+		rdev->desc->ops->set_suspend_enable)
 		ret = rdev->desc->ops->set_suspend_enable(rdev);
-	else if (rstate->disabled && rdev->desc->ops->set_suspend_disable)
+	else if (rstate->enabled == DISABLE_IN_SUSPEND &&
+		rdev->desc->ops->set_suspend_disable)
 		ret = rdev->desc->ops->set_suspend_disable(rdev);
 	else /* OK if set_suspend_enable or set_suspend_disable is NULL */
 		ret = 0;
@@ -778,30 +801,10 @@ static int suspend_set_state(struct regulator_dev *rdev,
 			return ret;
 		}
 	}
+
 	return ret;
 }
 
-/* locks held by caller */
-static int suspend_prepare(struct regulator_dev *rdev, suspend_state_t state)
-{
-	if (!rdev->constraints)
-		return -EINVAL;
-
-	switch (state) {
-	case PM_SUSPEND_STANDBY:
-		return suspend_set_state(rdev,
-			&rdev->constraints->state_standby);
-	case PM_SUSPEND_MEM:
-		return suspend_set_state(rdev,
-			&rdev->constraints->state_mem);
-	case PM_SUSPEND_MAX:
-		return suspend_set_state(rdev,
-			&rdev->constraints->state_disk);
-	default:
-		return -EINVAL;
-	}
-}
-
 static void print_constraints(struct regulator_dev *rdev)
 {
 	struct regulation_constraints *constraints = rdev->constraints;
@@ -1068,7 +1071,7 @@ static int set_machine_constraints(struct regulator_dev *rdev,
 
 	/* do we need to setup our suspend state */
 	if (rdev->constraints->initial_state) {
-		ret = suspend_prepare(rdev, rdev->constraints->initial_state);
+		ret = suspend_set_state(rdev, rdev->constraints->initial_state);
 		if (ret < 0) {
 			rdev_err(rdev, "failed to set suspend state\n");
 			return ret;
@@ -1356,9 +1359,9 @@ static struct regulator *create_regulator(struct regulator_dev *rdev,
 		debugfs_create_u32("uA_load", 0444, regulator->debugfs,
 				   &regulator->uA_load);
 		debugfs_create_u32("min_uV", 0444, regulator->debugfs,
-				   &regulator->min_uV);
+				   &regulator->voltage[PM_SUSPEND_ON].min_uV);
 		debugfs_create_u32("max_uV", 0444, regulator->debugfs,
-				   &regulator->max_uV);
+				   &regulator->voltage[PM_SUSPEND_ON].max_uV);
 		debugfs_create_file("constraint_flags", 0444,
 				    regulator->debugfs, regulator,
 				    &constraint_flags_fops);
@@ -1417,20 +1420,6 @@ static void regulator_supply_alias(struct device **dev, const char **supply)
 	}
 }
 
-static int of_node_match(struct device *dev, const void *data)
-{
-	return dev->of_node == data;
-}
-
-static struct regulator_dev *of_find_regulator_by_node(struct device_node *np)
-{
-	struct device *dev;
-
-	dev = class_find_device(&regulator_class, NULL, np, of_node_match);
-
-	return dev ? dev_to_rdev(dev) : NULL;
-}
-
 static int regulator_match(struct device *dev, const void *data)
 {
 	struct regulator_dev *r = dev_to_rdev(dev);
@@ -2468,10 +2457,9 @@ static int _regulator_is_enabled(struct regulator_dev *rdev)
 	return rdev->desc->ops->is_enabled(rdev);
 }
 
-static int _regulator_list_voltage(struct regulator *regulator,
-				    unsigned selector, int lock)
+static int _regulator_list_voltage(struct regulator_dev *rdev,
+				   unsigned selector, int lock)
 {
-	struct regulator_dev *rdev = regulator->rdev;
 	const struct regulator_ops *ops = rdev->desc->ops;
 	int ret;
 
@@ -2487,7 +2475,8 @@ static int _regulator_list_voltage(struct regulator *regulator,
 		if (lock)
 			mutex_unlock(&rdev->mutex);
 	} else if (rdev->is_switch && rdev->supply) {
-		ret = _regulator_list_voltage(rdev->supply, selector, lock);
+		ret = _regulator_list_voltage(rdev->supply->rdev,
+					      selector, lock);
 	} else {
 		return -EINVAL;
 	}
@@ -2563,7 +2552,7 @@ EXPORT_SYMBOL_GPL(regulator_count_voltages);
  */
 int regulator_list_voltage(struct regulator *regulator, unsigned selector)
 {
-	return _regulator_list_voltage(regulator, selector, 1);
+	return _regulator_list_voltage(regulator->rdev, selector, 1);
 }
 EXPORT_SYMBOL_GPL(regulator_list_voltage);
 
@@ -2605,8 +2594,8 @@ int regulator_get_hardware_vsel_register(struct regulator *regulator,
 	if (ops->set_voltage_sel != regulator_set_voltage_sel_regmap)
 		return -EOPNOTSUPP;
 
-	 *vsel_reg = rdev->desc->vsel_reg;
-	 *vsel_mask = rdev->desc->vsel_mask;
+	*vsel_reg = rdev->desc->vsel_reg;
+	*vsel_mask = rdev->desc->vsel_mask;
 
 	 return 0;
 }
@@ -2897,10 +2886,38 @@ static int _regulator_do_set_voltage(struct regulator_dev *rdev,
 	return ret;
 }
 
+static int _regulator_do_set_suspend_voltage(struct regulator_dev *rdev,
+				  int min_uV, int max_uV, suspend_state_t state)
+{
+	struct regulator_state *rstate;
+	int uV, sel;
+
+	rstate = regulator_get_suspend_state(rdev, state);
+	if (rstate == NULL)
+		return -EINVAL;
+
+	if (min_uV < rstate->min_uV)
+		min_uV = rstate->min_uV;
+	if (max_uV > rstate->max_uV)
+		max_uV = rstate->max_uV;
+
+	sel = regulator_map_voltage(rdev, min_uV, max_uV);
+	if (sel < 0)
+		return sel;
+
+	uV = rdev->desc->ops->list_voltage(rdev, sel);
+	if (uV >= min_uV && uV <= max_uV)
+		rstate->uV = uV;
+
+	return 0;
+}
+
 static int regulator_set_voltage_unlocked(struct regulator *regulator,
-					  int min_uV, int max_uV)
+					  int min_uV, int max_uV,
+					  suspend_state_t state)
 {
 	struct regulator_dev *rdev = regulator->rdev;
+	struct regulator_voltage *voltage = &regulator->voltage[state];
 	int ret = 0;
 	int old_min_uV, old_max_uV;
 	int current_uV;
@@ -2911,7 +2928,7 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 	 * should be a noop (some cpufreq implementations use the same
 	 * voltage for multiple frequencies, for example).
 	 */
-	if (regulator->min_uV == min_uV && regulator->max_uV == max_uV)
+	if (voltage->min_uV == min_uV && voltage->max_uV == max_uV)
 		goto out;
 
 	/* If we're trying to set a range that overlaps the current voltage,
@@ -2921,8 +2938,8 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 	if (!regulator_ops_is_valid(rdev, REGULATOR_CHANGE_VOLTAGE)) {
 		current_uV = _regulator_get_voltage(rdev);
 		if (min_uV <= current_uV && current_uV <= max_uV) {
-			regulator->min_uV = min_uV;
-			regulator->max_uV = max_uV;
+			voltage->min_uV = min_uV;
+			voltage->max_uV = max_uV;
 			goto out;
 		}
 	}
@@ -2940,12 +2957,12 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 		goto out;
 
 	/* restore original values in case of error */
-	old_min_uV = regulator->min_uV;
-	old_max_uV = regulator->max_uV;
-	regulator->min_uV = min_uV;
-	regulator->max_uV = max_uV;
+	old_min_uV = voltage->min_uV;
+	old_max_uV = voltage->max_uV;
+	voltage->min_uV = min_uV;
+	voltage->max_uV = max_uV;
 
-	ret = regulator_check_consumers(rdev, &min_uV, &max_uV);
+	ret = regulator_check_consumers(rdev, &min_uV, &max_uV, state);
 	if (ret < 0)
 		goto out2;
 
@@ -2963,7 +2980,7 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 			goto out2;
 		}
 
-		best_supply_uV = _regulator_list_voltage(regulator, selector, 0);
+		best_supply_uV = _regulator_list_voltage(rdev, selector, 0);
 		if (best_supply_uV < 0) {
 			ret = best_supply_uV;
 			goto out2;
@@ -2982,7 +2999,7 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 
 	if (supply_change_uV > 0) {
 		ret = regulator_set_voltage_unlocked(rdev->supply,
-				best_supply_uV, INT_MAX);
+				best_supply_uV, INT_MAX, state);
 		if (ret) {
 			dev_err(&rdev->dev, "Failed to increase supply voltage: %d\n",
 					ret);
@@ -2990,13 +3007,17 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 		}
 	}
 
-	ret = _regulator_do_set_voltage(rdev, min_uV, max_uV);
+	if (state == PM_SUSPEND_ON)
+		ret = _regulator_do_set_voltage(rdev, min_uV, max_uV);
+	else
+		ret = _regulator_do_set_suspend_voltage(rdev, min_uV,
+							max_uV, state);
 	if (ret < 0)
 		goto out2;
 
 	if (supply_change_uV < 0) {
 		ret = regulator_set_voltage_unlocked(rdev->supply,
-				best_supply_uV, INT_MAX);
+				best_supply_uV, INT_MAX, state);
 		if (ret)
 			dev_warn(&rdev->dev, "Failed to decrease supply voltage: %d\n",
 					ret);
@@ -3007,8 +3028,8 @@ static int regulator_set_voltage_unlocked(struct regulator *regulator,
 out:
 	return ret;
 out2:
-	regulator->min_uV = old_min_uV;
-	regulator->max_uV = old_max_uV;
+	voltage->min_uV = old_min_uV;
+	voltage->max_uV = old_max_uV;
 
 	return ret;
 }
@@ -3037,7 +3058,8 @@ int regulator_set_voltage(struct regulator *regulator, int min_uV, int max_uV)
 
 	regulator_lock_supply(regulator->rdev);
 
-	ret = regulator_set_voltage_unlocked(regulator, min_uV, max_uV);
+	ret = regulator_set_voltage_unlocked(regulator, min_uV, max_uV,
+					     PM_SUSPEND_ON);
 
 	regulator_unlock_supply(regulator->rdev);
 
@@ -3045,6 +3067,89 @@ int regulator_set_voltage(struct regulator *regulator, int min_uV, int max_uV)
 }
 EXPORT_SYMBOL_GPL(regulator_set_voltage);
 
+static inline int regulator_suspend_toggle(struct regulator_dev *rdev,
+					   suspend_state_t state, bool en)
+{
+	struct regulator_state *rstate;
+
+	rstate = regulator_get_suspend_state(rdev, state);
+	if (rstate == NULL)
+		return -EINVAL;
+
+	if (!rstate->changeable)
+		return -EPERM;
+
+	rstate->enabled = en;
+
+	return 0;
+}
+
+int regulator_suspend_enable(struct regulator_dev *rdev,
+				    suspend_state_t state)
+{
+	return regulator_suspend_toggle(rdev, state, true);
+}
+EXPORT_SYMBOL_GPL(regulator_suspend_enable);
+
+int regulator_suspend_disable(struct regulator_dev *rdev,
+				     suspend_state_t state)
+{
+	struct regulator *regulator;
+	struct regulator_voltage *voltage;
+
+	/*
+	 * if any consumer wants this regulator device keeping on in
+	 * suspend states, don't set it as disabled.
+	 */
+	list_for_each_entry(regulator, &rdev->consumer_list, list) {
+		voltage = &regulator->voltage[state];
+		if (voltage->min_uV || voltage->max_uV)
+			return 0;
+	}
+
+	return regulator_suspend_toggle(rdev, state, false);
+}
+EXPORT_SYMBOL_GPL(regulator_suspend_disable);
+
+static int _regulator_set_suspend_voltage(struct regulator *regulator,
+					  int min_uV, int max_uV,
+					  suspend_state_t state)
+{
+	struct regulator_dev *rdev = regulator->rdev;
+	struct regulator_state *rstate;
+
+	rstate = regulator_get_suspend_state(rdev, state);
+	if (rstate == NULL)
+		return -EINVAL;
+
+	if (rstate->min_uV == rstate->max_uV) {
+		rdev_err(rdev, "The suspend voltage can't be changed!\n");
+		return -EPERM;
+	}
+
+	return regulator_set_voltage_unlocked(regulator, min_uV, max_uV, state);
+}
+
+int regulator_set_suspend_voltage(struct regulator *regulator, int min_uV,
+				  int max_uV, suspend_state_t state)
+{
+	int ret = 0;
+
+	/* PM_SUSPEND_ON is handled by regulator_set_voltage() */
+	if (regulator_check_states(state) || state == PM_SUSPEND_ON)
+		return -EINVAL;
+
+	regulator_lock_supply(regulator->rdev);
+
+	ret = _regulator_set_suspend_voltage(regulator, min_uV,
+					     max_uV, state);
+
+	regulator_unlock_supply(regulator->rdev);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(regulator_set_suspend_voltage);
+
 /**
  * regulator_set_voltage_time - get raise/fall time
  * @regulator: regulator source
@@ -3138,6 +3243,7 @@ EXPORT_SYMBOL_GPL(regulator_set_voltage_time_sel);
 int regulator_sync_voltage(struct regulator *regulator)
 {
 	struct regulator_dev *rdev = regulator->rdev;
+	struct regulator_voltage *voltage = &regulator->voltage[PM_SUSPEND_ON];
 	int ret, min_uV, max_uV;
 
 	mutex_lock(&rdev->mutex);
@@ -3149,20 +3255,20 @@ int regulator_sync_voltage(struct regulator *regulator)
 	}
 
 	/* This is only going to work if we've had a voltage configured. */
-	if (!regulator->min_uV && !regulator->max_uV) {
+	if (!voltage->min_uV && !voltage->max_uV) {
 		ret = -EINVAL;
 		goto out;
 	}
 
-	min_uV = regulator->min_uV;
-	max_uV = regulator->max_uV;
+	min_uV = voltage->min_uV;
+	max_uV = voltage->max_uV;
 
 	/* This should be a paranoia check... */
 	ret = regulator_check_voltage(rdev, &min_uV, &max_uV);
 	if (ret < 0)
 		goto out;
 
-	ret = regulator_check_consumers(rdev, &min_uV, &max_uV);
+	ret = regulator_check_consumers(rdev, &min_uV, &max_uV, 0);
 	if (ret < 0)
 		goto out;
 
@@ -3918,12 +4024,6 @@ static void regulator_dev_release(struct device *dev)
 	kfree(rdev);
 }
 
-static struct class regulator_class = {
-	.name = "regulator",
-	.dev_release = regulator_dev_release,
-	.dev_groups = regulator_dev_groups,
-};
-
 static void rdev_init_debugfs(struct regulator_dev *rdev)
 {
 	struct device *parent = rdev->dev.parent;
@@ -4174,81 +4274,86 @@ void regulator_unregister(struct regulator_dev *rdev)
 }
 EXPORT_SYMBOL_GPL(regulator_unregister);
 
-static int _regulator_suspend_prepare(struct device *dev, void *data)
+#ifdef CONFIG_SUSPEND
+static int _regulator_suspend_late(struct device *dev, void *data)
 {
 	struct regulator_dev *rdev = dev_to_rdev(dev);
-	const suspend_state_t *state = data;
+	suspend_state_t *state = data;
 	int ret;
 
 	mutex_lock(&rdev->mutex);
-	ret = suspend_prepare(rdev, *state);
+	ret = suspend_set_state(rdev, *state);
 	mutex_unlock(&rdev->mutex);
 
 	return ret;
 }
 
 /**
- * regulator_suspend_prepare - prepare regulators for system wide suspend
+ * regulator_suspend_late - prepare regulators for system wide suspend
  * @state: system suspend state
  *
  * Configure each regulator with it's suspend operating parameters for state.
- * This will usually be called by machine suspend code prior to supending.
  */
-int regulator_suspend_prepare(suspend_state_t state)
+static int regulator_suspend_late(struct device *dev)
 {
-	/* ON is handled by regulator active state */
-	if (state == PM_SUSPEND_ON)
-		return -EINVAL;
+	suspend_state_t state = pm_suspend_target_state;
 
 	return class_for_each_device(&regulator_class, NULL, &state,
-				     _regulator_suspend_prepare);
+				     _regulator_suspend_late);
 }
-EXPORT_SYMBOL_GPL(regulator_suspend_prepare);
-
-static int _regulator_suspend_finish(struct device *dev, void *data)
+static int _regulator_resume_early(struct device *dev, void *data)
 {
+	int ret = 0;
 	struct regulator_dev *rdev = dev_to_rdev(dev);
-	int ret;
+	suspend_state_t *state = data;
+	struct regulator_state *rstate;
+
+	rstate = regulator_get_suspend_state(rdev, *state);
+	if (rstate == NULL)
+		return -EINVAL;
 
 	mutex_lock(&rdev->mutex);
-	if (rdev->use_count > 0  || rdev->constraints->always_on) {
-		if (!_regulator_is_enabled(rdev)) {
-			ret = _regulator_do_enable(rdev);
-			if (ret)
-				dev_err(dev,
-					"Failed to resume regulator %d\n",
-					ret);
-		}
-	} else {
-		if (!have_full_constraints())
-			goto unlock;
-		if (!_regulator_is_enabled(rdev))
-			goto unlock;
 
-		ret = _regulator_do_disable(rdev);
-		if (ret)
-			dev_err(dev, "Failed to suspend regulator %d\n", ret);
-	}
-unlock:
+	if (rdev->desc->ops->resume_early &&
+	    (rstate->enabled == ENABLE_IN_SUSPEND ||
+	     rstate->enabled == DISABLE_IN_SUSPEND))
+		ret = rdev->desc->ops->resume_early(rdev);
+
 	mutex_unlock(&rdev->mutex);
 
-	/* Keep processing regulators in spite of any errors */
-	return 0;
+	return ret;
 }
 
-/**
- * regulator_suspend_finish - resume regulators from system wide suspend
- *
- * Turn on regulators that might be turned off by regulator_suspend_prepare
- * and that should be turned on according to the regulators properties.
- */
-int regulator_suspend_finish(void)
+static int regulator_resume_early(struct device *dev)
 {
-	return class_for_each_device(&regulator_class, NULL, NULL,
-				     _regulator_suspend_finish);
-}
-EXPORT_SYMBOL_GPL(regulator_suspend_finish);
+	suspend_state_t state = pm_suspend_target_state;
 
+	return class_for_each_device(&regulator_class, NULL, &state,
+				     _regulator_resume_early);
+}
+
+#else /* !CONFIG_SUSPEND */
+
+#define regulator_suspend_late	NULL
+#define regulator_resume_early	NULL
+
+#endif /* !CONFIG_SUSPEND */
+
+#ifdef CONFIG_PM
+static const struct dev_pm_ops __maybe_unused regulator_pm_ops = {
+	.suspend_late	= regulator_suspend_late,
+	.resume_early	= regulator_resume_early,
+};
+#endif
+
+struct class regulator_class = {
+	.name = "regulator",
+	.dev_release = regulator_dev_release,
+	.dev_groups = regulator_dev_groups,
+#ifdef CONFIG_PM
+	.pm = &regulator_pm_ops,
+#endif
+};
 /**
  * regulator_has_full_constraints - the system has fully specified constraints
  *
@@ -4424,8 +4529,8 @@ static void regulator_summary_show_subtree(struct seq_file *s,
 		switch (rdev->desc->type) {
 		case REGULATOR_VOLTAGE:
 			seq_printf(s, "%37dmV %5dmV",
-				   consumer->min_uV / 1000,
-				   consumer->max_uV / 1000);
+				   consumer->voltage[PM_SUSPEND_ON].min_uV / 1000,
+				   consumer->voltage[PM_SUSPEND_ON].max_uV / 1000);
 			break;
 		case REGULATOR_CURRENT:
 			break;
diff --git a/drivers/regulator/internal.h b/drivers/regulator/internal.h
index 66a8ea0..abfd56e 100644
--- a/drivers/regulator/internal.h
+++ b/drivers/regulator/internal.h
@@ -16,10 +16,25 @@
 #ifndef __REGULATOR_INTERNAL_H
 #define __REGULATOR_INTERNAL_H
 
+#include <linux/suspend.h>
+
+#define REGULATOR_STATES_NUM	(PM_SUSPEND_MAX + 1)
+
+struct regulator_voltage {
+	int min_uV;
+	int max_uV;
+};
+
 /*
  * struct regulator
  *
  * One for each consumer device.
+ * @voltage - a voltage array for each state of runtime, i.e.:
+ *            PM_SUSPEND_ON
+ *            PM_SUSPEND_TO_IDLE
+ *            PM_SUSPEND_STANDBY
+ *            PM_SUSPEND_MEM
+ *            PM_SUSPEND_MAX
  */
 struct regulator {
 	struct device *dev;
@@ -27,14 +42,22 @@ struct regulator {
 	unsigned int always_on:1;
 	unsigned int bypass:1;
 	int uA_load;
-	int min_uV;
-	int max_uV;
+	struct regulator_voltage voltage[REGULATOR_STATES_NUM];
 	const char *supply_name;
 	struct device_attribute dev_attr;
 	struct regulator_dev *rdev;
 	struct dentry *debugfs;
 };
 
+extern struct class regulator_class;
+
+static inline struct regulator_dev *dev_to_rdev(struct device *dev)
+{
+	return container_of(dev, struct regulator_dev, dev);
+}
+
+struct regulator_dev *of_find_regulator_by_node(struct device_node *np);
+
 #ifdef CONFIG_OF
 struct regulator_init_data *regulator_of_get_init_data(struct device *dev,
 			         const struct regulator_desc *desc,
diff --git a/drivers/regulator/of_regulator.c b/drivers/regulator/of_regulator.c
index 14637a0..092ed6e 100644
--- a/drivers/regulator/of_regulator.c
+++ b/drivers/regulator/of_regulator.c
@@ -177,14 +177,30 @@ static void of_get_regulation_constraints(struct device_node *np,
 
 		if (of_property_read_bool(suspend_np,
 					"regulator-on-in-suspend"))
-			suspend_state->enabled = true;
+			suspend_state->enabled = ENABLE_IN_SUSPEND;
 		else if (of_property_read_bool(suspend_np,
 					"regulator-off-in-suspend"))
-			suspend_state->disabled = true;
+			suspend_state->enabled = DISABLE_IN_SUSPEND;
+		else
+			suspend_state->enabled = DO_NOTHING_IN_SUSPEND;
+
+		if (!of_property_read_u32(np, "regulator-suspend-min-microvolt",
+					  &pval))
+			suspend_state->min_uV = pval;
+
+		if (!of_property_read_u32(np, "regulator-suspend-max-microvolt",
+					  &pval))
+			suspend_state->max_uV = pval;
 
 		if (!of_property_read_u32(suspend_np,
 					"regulator-suspend-microvolt", &pval))
 			suspend_state->uV = pval;
+		else /* otherwise use min_uV as default suspend voltage */
+			suspend_state->uV = suspend_state->min_uV;
+
+		if (of_property_read_bool(suspend_np,
+					"regulator-changeable-in-suspend"))
+			suspend_state->changeable = true;
 
 		if (i == PM_SUSPEND_MEM)
 			constraints->initial_state = PM_SUSPEND_MEM;
@@ -376,3 +392,17 @@ struct regulator_init_data *regulator_of_get_init_data(struct device *dev,
 
 	return init_data;
 }
+
+static int of_node_match(struct device *dev, const void *data)
+{
+	return dev->of_node == data;
+}
+
+struct regulator_dev *of_find_regulator_by_node(struct device_node *np)
+{
+	struct device *dev;
+
+	dev = class_find_device(&regulator_class, NULL, np, of_node_match);
+
+	return dev ? dev_to_rdev(dev) : NULL;
+}
diff --git a/drivers/regulator/qcom_spmi-regulator.c b/drivers/regulator/qcom_spmi-regulator.c
index 0241ada..63c7a0c1 100644
--- a/drivers/regulator/qcom_spmi-regulator.c
+++ b/drivers/regulator/qcom_spmi-regulator.c
@@ -486,24 +486,6 @@ static int spmi_vreg_update_bits(struct spmi_regulator *vreg, u16 addr, u8 val,
 	return regmap_update_bits(vreg->regmap, vreg->base + addr, mask, val);
 }
 
-static int spmi_regulator_common_is_enabled(struct regulator_dev *rdev)
-{
-	struct spmi_regulator *vreg = rdev_get_drvdata(rdev);
-	u8 reg;
-
-	spmi_vreg_read(vreg, SPMI_COMMON_REG_ENABLE, &reg, 1);
-
-	return (reg & SPMI_COMMON_ENABLE_MASK) == SPMI_COMMON_ENABLE;
-}
-
-static int spmi_regulator_common_enable(struct regulator_dev *rdev)
-{
-	struct spmi_regulator *vreg = rdev_get_drvdata(rdev);
-
-	return spmi_vreg_update_bits(vreg, SPMI_COMMON_REG_ENABLE,
-		SPMI_COMMON_ENABLE, SPMI_COMMON_ENABLE_MASK);
-}
-
 static int spmi_regulator_vs_enable(struct regulator_dev *rdev)
 {
 	struct spmi_regulator *vreg = rdev_get_drvdata(rdev);
@@ -513,7 +495,7 @@ static int spmi_regulator_vs_enable(struct regulator_dev *rdev)
 		vreg->vs_enable_time = ktime_get();
 	}
 
-	return spmi_regulator_common_enable(rdev);
+	return regulator_enable_regmap(rdev);
 }
 
 static int spmi_regulator_vs_ocp(struct regulator_dev *rdev)
@@ -524,14 +506,6 @@ static int spmi_regulator_vs_ocp(struct regulator_dev *rdev)
 	return spmi_vreg_write(vreg, SPMI_VS_REG_OCP, &reg, 1);
 }
 
-static int spmi_regulator_common_disable(struct regulator_dev *rdev)
-{
-	struct spmi_regulator *vreg = rdev_get_drvdata(rdev);
-
-	return spmi_vreg_update_bits(vreg, SPMI_COMMON_REG_ENABLE,
-		SPMI_COMMON_DISABLE, SPMI_COMMON_ENABLE_MASK);
-}
-
 static int spmi_regulator_select_voltage(struct spmi_regulator *vreg,
 					 int min_uV, int max_uV)
 {
@@ -1062,9 +1036,9 @@ static irqreturn_t spmi_regulator_vs_ocp_isr(int irq, void *data)
 }
 
 static struct regulator_ops spmi_smps_ops = {
-	.enable			= spmi_regulator_common_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.enable			= regulator_enable_regmap,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_voltage_sel	= spmi_regulator_common_set_voltage,
 	.set_voltage_time_sel	= spmi_regulator_set_voltage_time_sel,
 	.get_voltage_sel	= spmi_regulator_common_get_voltage,
@@ -1077,9 +1051,9 @@ static struct regulator_ops spmi_smps_ops = {
 };
 
 static struct regulator_ops spmi_ldo_ops = {
-	.enable			= spmi_regulator_common_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.enable			= regulator_enable_regmap,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_voltage_sel	= spmi_regulator_common_set_voltage,
 	.get_voltage_sel	= spmi_regulator_common_get_voltage,
 	.map_voltage		= spmi_regulator_common_map_voltage,
@@ -1094,9 +1068,9 @@ static struct regulator_ops spmi_ldo_ops = {
 };
 
 static struct regulator_ops spmi_ln_ldo_ops = {
-	.enable			= spmi_regulator_common_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.enable			= regulator_enable_regmap,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_voltage_sel	= spmi_regulator_common_set_voltage,
 	.get_voltage_sel	= spmi_regulator_common_get_voltage,
 	.map_voltage		= spmi_regulator_common_map_voltage,
@@ -1107,8 +1081,8 @@ static struct regulator_ops spmi_ln_ldo_ops = {
 
 static struct regulator_ops spmi_vs_ops = {
 	.enable			= spmi_regulator_vs_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_pull_down		= spmi_regulator_common_set_pull_down,
 	.set_soft_start		= spmi_regulator_common_set_soft_start,
 	.set_over_current_protection = spmi_regulator_vs_ocp,
@@ -1117,9 +1091,9 @@ static struct regulator_ops spmi_vs_ops = {
 };
 
 static struct regulator_ops spmi_boost_ops = {
-	.enable			= spmi_regulator_common_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.enable			= regulator_enable_regmap,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_voltage_sel	= spmi_regulator_single_range_set_voltage,
 	.get_voltage_sel	= spmi_regulator_single_range_get_voltage,
 	.map_voltage		= spmi_regulator_single_map_voltage,
@@ -1128,9 +1102,9 @@ static struct regulator_ops spmi_boost_ops = {
 };
 
 static struct regulator_ops spmi_ftsmps_ops = {
-	.enable			= spmi_regulator_common_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.enable			= regulator_enable_regmap,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_voltage_sel	= spmi_regulator_common_set_voltage,
 	.set_voltage_time_sel	= spmi_regulator_set_voltage_time_sel,
 	.get_voltage_sel	= spmi_regulator_common_get_voltage,
@@ -1143,9 +1117,9 @@ static struct regulator_ops spmi_ftsmps_ops = {
 };
 
 static struct regulator_ops spmi_ult_lo_smps_ops = {
-	.enable			= spmi_regulator_common_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.enable			= regulator_enable_regmap,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_voltage_sel	= spmi_regulator_ult_lo_smps_set_voltage,
 	.set_voltage_time_sel	= spmi_regulator_set_voltage_time_sel,
 	.get_voltage_sel	= spmi_regulator_ult_lo_smps_get_voltage,
@@ -1157,9 +1131,9 @@ static struct regulator_ops spmi_ult_lo_smps_ops = {
 };
 
 static struct regulator_ops spmi_ult_ho_smps_ops = {
-	.enable			= spmi_regulator_common_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.enable			= regulator_enable_regmap,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_voltage_sel	= spmi_regulator_single_range_set_voltage,
 	.set_voltage_time_sel	= spmi_regulator_set_voltage_time_sel,
 	.get_voltage_sel	= spmi_regulator_single_range_get_voltage,
@@ -1172,9 +1146,9 @@ static struct regulator_ops spmi_ult_ho_smps_ops = {
 };
 
 static struct regulator_ops spmi_ult_ldo_ops = {
-	.enable			= spmi_regulator_common_enable,
-	.disable		= spmi_regulator_common_disable,
-	.is_enabled		= spmi_regulator_common_is_enabled,
+	.enable			= regulator_enable_regmap,
+	.disable		= regulator_disable_regmap,
+	.is_enabled		= regulator_is_enabled_regmap,
 	.set_voltage_sel	= spmi_regulator_single_range_set_voltage,
 	.get_voltage_sel	= spmi_regulator_single_range_get_voltage,
 	.map_voltage		= spmi_regulator_single_map_voltage,
@@ -1711,6 +1685,9 @@ static int qcom_spmi_regulator_probe(struct platform_device *pdev)
 		vreg->desc.id = -1;
 		vreg->desc.owner = THIS_MODULE;
 		vreg->desc.type = REGULATOR_VOLTAGE;
+		vreg->desc.enable_reg = reg->base + SPMI_COMMON_REG_ENABLE;
+		vreg->desc.enable_mask = SPMI_COMMON_ENABLE_MASK;
+		vreg->desc.enable_val = SPMI_COMMON_ENABLE;
 		vreg->desc.name = name = reg->name;
 		vreg->desc.supply_name = reg->supply;
 		vreg->desc.of_match = reg->name;
@@ -1723,6 +1700,7 @@ static int qcom_spmi_regulator_probe(struct platform_device *pdev)
 
 		config.dev = dev;
 		config.driver_data = vreg;
+		config.regmap = regmap;
 		rdev = devm_regulator_register(dev, &vreg->desc, &config);
 		if (IS_ERR(rdev)) {
 			dev_err(dev, "failed to register %s\n", name);
diff --git a/drivers/regulator/sc2731-regulator.c b/drivers/regulator/sc2731-regulator.c
new file mode 100644
index 0000000..eb2bdf0
--- /dev/null
+++ b/drivers/regulator/sc2731-regulator.c
@@ -0,0 +1,256 @@
+ //SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2017 Spreadtrum Communications Inc.
+ */
+
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+#include <linux/regulator/driver.h>
+#include <linux/regulator/of_regulator.h>
+
+/*
+ * SC2731 regulator lock register
+ */
+#define SC2731_PWR_WR_PROT		0xf0c
+#define SC2731_WR_UNLOCK_VALUE		0x6e7f
+
+/*
+ * SC2731 enable register
+ */
+#define SC2731_POWER_PD_SW		0xc28
+#define SC2731_LDO_CAMA0_PD		0xcfc
+#define SC2731_LDO_CAMA1_PD		0xd04
+#define SC2731_LDO_CAMMOT_PD		0xd0c
+#define SC2731_LDO_VLDO_PD		0xd6c
+#define SC2731_LDO_EMMCCORE_PD		0xd2c
+#define SC2731_LDO_SDCORE_PD		0xd74
+#define SC2731_LDO_SDIO_PD		0xd70
+#define SC2731_LDO_WIFIPA_PD		0xd4c
+#define SC2731_LDO_USB33_PD		0xd5c
+#define SC2731_LDO_CAMD0_PD		0xd7c
+#define SC2731_LDO_CAMD1_PD		0xd84
+#define SC2731_LDO_CON_PD		0xd8c
+#define SC2731_LDO_CAMIO_PD		0xd94
+#define SC2731_LDO_SRAM_PD		0xd78
+
+/*
+ * SC2731 enable mask
+ */
+#define SC2731_DCDC_CPU0_PD_MASK	BIT(4)
+#define SC2731_DCDC_CPU1_PD_MASK	BIT(3)
+#define SC2731_DCDC_RF_PD_MASK		BIT(11)
+#define SC2731_LDO_CAMA0_PD_MASK	BIT(0)
+#define SC2731_LDO_CAMA1_PD_MASK	BIT(0)
+#define SC2731_LDO_CAMMOT_PD_MASK	BIT(0)
+#define SC2731_LDO_VLDO_PD_MASK		BIT(0)
+#define SC2731_LDO_EMMCCORE_PD_MASK	BIT(0)
+#define SC2731_LDO_SDCORE_PD_MASK	BIT(0)
+#define SC2731_LDO_SDIO_PD_MASK		BIT(0)
+#define SC2731_LDO_WIFIPA_PD_MASK	BIT(0)
+#define SC2731_LDO_USB33_PD_MASK	BIT(0)
+#define SC2731_LDO_CAMD0_PD_MASK	BIT(0)
+#define SC2731_LDO_CAMD1_PD_MASK	BIT(0)
+#define SC2731_LDO_CON_PD_MASK		BIT(0)
+#define SC2731_LDO_CAMIO_PD_MASK	BIT(0)
+#define SC2731_LDO_SRAM_PD_MASK		BIT(0)
+
+/*
+ * SC2731 vsel register
+ */
+#define SC2731_DCDC_CPU0_VOL		0xc54
+#define SC2731_DCDC_CPU1_VOL		0xc64
+#define SC2731_DCDC_RF_VOL		0xcb8
+#define SC2731_LDO_CAMA0_VOL		0xd00
+#define SC2731_LDO_CAMA1_VOL		0xd08
+#define SC2731_LDO_CAMMOT_VOL		0xd10
+#define SC2731_LDO_VLDO_VOL		0xd28
+#define SC2731_LDO_EMMCCORE_VOL		0xd30
+#define SC2731_LDO_SDCORE_VOL		0xd38
+#define SC2731_LDO_SDIO_VOL		0xd40
+#define SC2731_LDO_WIFIPA_VOL		0xd50
+#define SC2731_LDO_USB33_VOL		0xd60
+#define SC2731_LDO_CAMD0_VOL		0xd80
+#define SC2731_LDO_CAMD1_VOL		0xd88
+#define SC2731_LDO_CON_VOL		0xd90
+#define SC2731_LDO_CAMIO_VOL		0xd98
+#define SC2731_LDO_SRAM_VOL		0xdB0
+
+/*
+ * SC2731 vsel register mask
+ */
+#define SC2731_DCDC_CPU0_VOL_MASK	GENMASK(8, 0)
+#define SC2731_DCDC_CPU1_VOL_MASK	GENMASK(8, 0)
+#define SC2731_DCDC_RF_VOL_MASK		GENMASK(8, 0)
+#define SC2731_LDO_CAMA0_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_CAMA1_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_CAMMOT_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_VLDO_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_EMMCCORE_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_SDCORE_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_SDIO_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_WIFIPA_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_USB33_VOL_MASK	GENMASK(7, 0)
+#define SC2731_LDO_CAMD0_VOL_MASK	GENMASK(6, 0)
+#define SC2731_LDO_CAMD1_VOL_MASK	GENMASK(6, 0)
+#define SC2731_LDO_CON_VOL_MASK		GENMASK(6, 0)
+#define SC2731_LDO_CAMIO_VOL_MASK	GENMASK(6, 0)
+#define SC2731_LDO_SRAM_VOL_MASK	GENMASK(6, 0)
+
+enum sc2731_regulator_id {
+	SC2731_BUCK_CPU0,
+	SC2731_BUCK_CPU1,
+	SC2731_BUCK_RF,
+	SC2731_LDO_CAMA0,
+	SC2731_LDO_CAMA1,
+	SC2731_LDO_CAMMOT,
+	SC2731_LDO_VLDO,
+	SC2731_LDO_EMMCCORE,
+	SC2731_LDO_SDCORE,
+	SC2731_LDO_SDIO,
+	SC2731_LDO_WIFIPA,
+	SC2731_LDO_USB33,
+	SC2731_LDO_CAMD0,
+	SC2731_LDO_CAMD1,
+	SC2731_LDO_CON,
+	SC2731_LDO_CAMIO,
+	SC2731_LDO_SRAM,
+};
+
+static const struct regulator_ops sc2731_regu_linear_ops = {
+	.enable = regulator_enable_regmap,
+	.disable = regulator_disable_regmap,
+	.is_enabled = regulator_is_enabled_regmap,
+	.list_voltage = regulator_list_voltage_linear,
+	.get_voltage_sel = regulator_get_voltage_sel_regmap,
+	.set_voltage_sel = regulator_set_voltage_sel_regmap,
+};
+
+#define SC2731_REGU_LINEAR(_id, en_reg, en_mask, vreg, vmask,	\
+			  vstep, vmin, vmax) {			\
+	.name			= #_id,				\
+	.of_match		= of_match_ptr(#_id),		\
+	.ops			= &sc2731_regu_linear_ops,	\
+	.type			= REGULATOR_VOLTAGE,		\
+	.id			= SC2731_##_id,			\
+	.owner			= THIS_MODULE,			\
+	.min_uV			= vmin,				\
+	.n_voltages		= ((vmax) - (vmin)) / (vstep) + 1,	\
+	.uV_step		= vstep,			\
+	.enable_is_inverted	= true,				\
+	.enable_val		= 0,				\
+	.enable_reg		= en_reg,			\
+	.enable_mask		= en_mask,			\
+	.vsel_reg		= vreg,				\
+	.vsel_mask		= vmask,			\
+}
+
+static struct regulator_desc regulators[] = {
+	SC2731_REGU_LINEAR(BUCK_CPU0, SC2731_POWER_PD_SW,
+			   SC2731_DCDC_CPU0_PD_MASK, SC2731_DCDC_CPU0_VOL,
+			   SC2731_DCDC_CPU0_VOL_MASK, 3125, 400000, 1996875),
+	SC2731_REGU_LINEAR(BUCK_CPU1, SC2731_POWER_PD_SW,
+			   SC2731_DCDC_CPU1_PD_MASK, SC2731_DCDC_CPU1_VOL,
+			   SC2731_DCDC_CPU1_VOL_MASK, 3125, 400000, 1996875),
+	SC2731_REGU_LINEAR(BUCK_RF, SC2731_POWER_PD_SW, SC2731_DCDC_RF_PD_MASK,
+			   SC2731_DCDC_RF_VOL, SC2731_DCDC_RF_VOL_MASK,
+			   3125, 600000, 2196875),
+	SC2731_REGU_LINEAR(LDO_CAMA0, SC2731_LDO_CAMA0_PD,
+			   SC2731_LDO_CAMA0_PD_MASK, SC2731_LDO_CAMA0_VOL,
+			   SC2731_LDO_CAMA0_VOL_MASK, 10000, 1200000, 3750000),
+	SC2731_REGU_LINEAR(LDO_CAMA1, SC2731_LDO_CAMA1_PD,
+			   SC2731_LDO_CAMA1_PD_MASK, SC2731_LDO_CAMA1_VOL,
+			   SC2731_LDO_CAMA1_VOL_MASK, 10000, 1200000, 3750000),
+	SC2731_REGU_LINEAR(LDO_CAMMOT, SC2731_LDO_CAMMOT_PD,
+			   SC2731_LDO_CAMMOT_PD_MASK, SC2731_LDO_CAMMOT_VOL,
+			   SC2731_LDO_CAMMOT_VOL_MASK, 10000, 1200000, 3750000),
+	SC2731_REGU_LINEAR(LDO_VLDO, SC2731_LDO_VLDO_PD,
+			   SC2731_LDO_VLDO_PD_MASK, SC2731_LDO_VLDO_VOL,
+			   SC2731_LDO_VLDO_VOL_MASK, 10000, 1200000, 3750000),
+	SC2731_REGU_LINEAR(LDO_EMMCCORE, SC2731_LDO_EMMCCORE_PD,
+			   SC2731_LDO_EMMCCORE_PD_MASK, SC2731_LDO_EMMCCORE_VOL,
+			   SC2731_LDO_EMMCCORE_VOL_MASK, 10000, 1200000,
+			   3750000),
+	SC2731_REGU_LINEAR(LDO_SDCORE, SC2731_LDO_SDCORE_PD,
+			   SC2731_LDO_SDCORE_PD_MASK, SC2731_LDO_SDCORE_VOL,
+			   SC2731_LDO_SDCORE_VOL_MASK, 10000, 1200000, 3750000),
+	SC2731_REGU_LINEAR(LDO_SDIO, SC2731_LDO_SDIO_PD,
+			   SC2731_LDO_SDIO_PD_MASK, SC2731_LDO_SDIO_VOL,
+			   SC2731_LDO_SDIO_VOL_MASK, 10000, 1200000, 3750000),
+	SC2731_REGU_LINEAR(LDO_WIFIPA, SC2731_LDO_WIFIPA_PD,
+			   SC2731_LDO_WIFIPA_PD_MASK, SC2731_LDO_WIFIPA_VOL,
+			   SC2731_LDO_WIFIPA_VOL_MASK, 10000, 1200000, 3750000),
+	SC2731_REGU_LINEAR(LDO_USB33, SC2731_LDO_USB33_PD,
+			   SC2731_LDO_USB33_PD_MASK, SC2731_LDO_USB33_VOL,
+			   SC2731_LDO_USB33_VOL_MASK, 10000, 1200000, 3750000),
+	SC2731_REGU_LINEAR(LDO_CAMD0, SC2731_LDO_CAMD0_PD,
+			   SC2731_LDO_CAMD0_PD_MASK, SC2731_LDO_CAMD0_VOL,
+			   SC2731_LDO_CAMD0_VOL_MASK, 6250, 1000000, 1793750),
+	SC2731_REGU_LINEAR(LDO_CAMD1, SC2731_LDO_CAMD1_PD,
+			   SC2731_LDO_CAMD1_PD_MASK, SC2731_LDO_CAMD1_VOL,
+			   SC2731_LDO_CAMD1_VOL_MASK, 6250, 1000000, 1793750),
+	SC2731_REGU_LINEAR(LDO_CON, SC2731_LDO_CON_PD,
+			   SC2731_LDO_CON_PD_MASK, SC2731_LDO_CON_VOL,
+			   SC2731_LDO_CON_VOL_MASK, 6250, 1000000, 1793750),
+	SC2731_REGU_LINEAR(LDO_CAMIO, SC2731_LDO_CAMIO_PD,
+			   SC2731_LDO_CAMIO_PD_MASK, SC2731_LDO_CAMIO_VOL,
+			   SC2731_LDO_CAMIO_VOL_MASK, 6250, 1000000, 1793750),
+	SC2731_REGU_LINEAR(LDO_SRAM, SC2731_LDO_SRAM_PD,
+			   SC2731_LDO_SRAM_PD_MASK, SC2731_LDO_SRAM_VOL,
+			   SC2731_LDO_SRAM_VOL_MASK, 6250, 1000000, 1793750),
+};
+
+static int sc2731_regulator_unlock(struct regmap *regmap)
+{
+	return regmap_write(regmap, SC2731_PWR_WR_PROT,
+			    SC2731_WR_UNLOCK_VALUE);
+}
+
+static int sc2731_regulator_probe(struct platform_device *pdev)
+{
+	int i, ret;
+	struct regmap *regmap;
+	struct regulator_config config = { };
+	struct regulator_dev *rdev;
+
+	regmap = dev_get_regmap(pdev->dev.parent, NULL);
+	if (!regmap) {
+		dev_err(&pdev->dev, "failed to get regmap.\n");
+		return -ENODEV;
+	}
+
+	ret = sc2731_regulator_unlock(regmap);
+	if (ret) {
+		dev_err(&pdev->dev, "failed to release regulator lock\n");
+		return ret;
+	}
+
+	config.dev = &pdev->dev;
+	config.regmap = regmap;
+
+	for (i = 0; i < ARRAY_SIZE(regulators); i++) {
+		rdev = devm_regulator_register(&pdev->dev, &regulators[i],
+					       &config);
+		if (IS_ERR(rdev)) {
+			dev_err(&pdev->dev, "failed to register regulator %s\n",
+				regulators[i].name);
+			return PTR_ERR(rdev);
+		}
+	}
+
+	return 0;
+}
+
+static struct platform_driver sc2731_regulator_driver = {
+	.driver = {
+		.name = "sc27xx-regulator",
+	},
+	.probe = sc2731_regulator_probe,
+};
+
+module_platform_driver(sc2731_regulator_driver);
+
+MODULE_AUTHOR("Chen Junhui <erick.chen@spreadtrum.com>");
+MODULE_DESCRIPTION("Spreadtrum SC2731 regulator driver");
+MODULE_LICENSE("GPL v2");
diff --git a/drivers/regulator/tps65218-regulator.c b/drivers/regulator/tps65218-regulator.c
index bc48995..1827185 100644
--- a/drivers/regulator/tps65218-regulator.c
+++ b/drivers/regulator/tps65218-regulator.c
@@ -28,9 +28,6 @@
 #include <linux/regulator/machine.h>
 #include <linux/mfd/tps65218.h>
 
-enum tps65218_regulators { DCDC1, DCDC2, DCDC3, DCDC4,
-			   DCDC5, DCDC6, LDO1, LS3 };
-
 #define TPS65218_REGULATOR(_name, _of, _id, _type, _ops, _n, _vr, _vm, _er, \
 			   _em, _cr, _cm, _lr, _nlr, _delay, _fuv, _sr, _sm) \
 	{							\
@@ -329,6 +326,8 @@ static int tps65218_regulator_probe(struct platform_device *pdev)
 	/* Allocate memory for strobes */
 	tps->strobes = devm_kzalloc(&pdev->dev, sizeof(u8) *
 				    TPS65218_NUM_REGULATOR, GFP_KERNEL);
+	if (!tps->strobes)
+		return -ENOMEM;
 
 	for (i = 0; i < ARRAY_SIZE(regulators); i++) {
 		rdev = devm_regulator_register(&pdev->dev, &regulators[i],
diff --git a/drivers/s390/block/dasd_3990_erp.c b/drivers/s390/block/dasd_3990_erp.c
index c94b606..ee14d8e 100644
--- a/drivers/s390/block/dasd_3990_erp.c
+++ b/drivers/s390/block/dasd_3990_erp.c
@@ -2803,6 +2803,16 @@ dasd_3990_erp_action(struct dasd_ccw_req * cqr)
 		erp = dasd_3990_erp_handle_match_erp(cqr, erp);
 	}
 
+
+	/*
+	 * For path verification work we need to stick with the path that was
+	 * originally chosen so that the per path configuration data is
+	 * assigned correctly.
+	 */
+	if (test_bit(DASD_CQR_VERIFY_PATH, &erp->flags) && cqr->lpm) {
+		erp->lpm = cqr->lpm;
+	}
+
 	if (device->features & DASD_FEATURE_ERPLOG) {
 		/* print current erp_chain */
 		dev_err(&device->cdev->dev,
diff --git a/drivers/s390/char/Makefile b/drivers/s390/char/Makefile
index 05ac6ba..614b44e 100644
--- a/drivers/s390/char/Makefile
+++ b/drivers/s390/char/Makefile
@@ -17,6 +17,8 @@
 CFLAGS_sclp_early_core.o		+= -march=z900
 endif
 
+CFLAGS_sclp_early_core.o		+= -D__NO_FORTIFY
+
 obj-y += ctrlchar.o keyboard.o defkeymap.o sclp.o sclp_rw.o sclp_quiesce.o \
 	 sclp_cmd.o sclp_config.o sclp_cpi_sys.o sclp_ocf.o sclp_ctl.o \
 	 sclp_early.o sclp_early_core.o
diff --git a/drivers/scsi/libsas/sas_scsi_host.c b/drivers/scsi/libsas/sas_scsi_host.c
index 58476b7..c940685 100644
--- a/drivers/scsi/libsas/sas_scsi_host.c
+++ b/drivers/scsi/libsas/sas_scsi_host.c
@@ -486,15 +486,28 @@ static int sas_queue_reset(struct domain_device *dev, int reset_type,
 
 int sas_eh_abort_handler(struct scsi_cmnd *cmd)
 {
-	int res;
+	int res = TMF_RESP_FUNC_FAILED;
 	struct sas_task *task = TO_SAS_TASK(cmd);
 	struct Scsi_Host *host = cmd->device->host;
+	struct domain_device *dev = cmd_to_domain_dev(cmd);
 	struct sas_internal *i = to_sas_internal(host->transportt);
+	unsigned long flags;
 
 	if (!i->dft->lldd_abort_task)
 		return FAILED;
 
-	res = i->dft->lldd_abort_task(task);
+	spin_lock_irqsave(host->host_lock, flags);
+	/* We cannot do async aborts for SATA devices */
+	if (dev_is_sata(dev) && !host->host_eh_scheduled) {
+		spin_unlock_irqrestore(host->host_lock, flags);
+		return FAILED;
+	}
+	spin_unlock_irqrestore(host->host_lock, flags);
+
+	if (task)
+		res = i->dft->lldd_abort_task(task);
+	else
+		SAS_DPRINTK("no task to abort\n");
 	if (res == TMF_RESP_FUNC_SUCC || res == TMF_RESP_FUNC_COMPLETE)
 		return SUCCESS;
 
diff --git a/drivers/scsi/scsi_transport_spi.c b/drivers/scsi/scsi_transport_spi.c
index 10ebb21..871ea58 100644
--- a/drivers/scsi/scsi_transport_spi.c
+++ b/drivers/scsi/scsi_transport_spi.c
@@ -26,6 +26,7 @@
 #include <linux/mutex.h>
 #include <linux/sysfs.h>
 #include <linux/slab.h>
+#include <linux/suspend.h>
 #include <scsi/scsi.h>
 #include "scsi_priv.h"
 #include <scsi/scsi_device.h>
@@ -1009,11 +1010,20 @@ spi_dv_device(struct scsi_device *sdev)
 	u8 *buffer;
 	const int len = SPI_MAX_ECHO_BUFFER_SIZE*2;
 
+	/*
+	 * Because this function and the power management code both call
+	 * scsi_device_quiesce(), it is not safe to perform domain validation
+	 * while suspend or resume is in progress. Hence the
+	 * lock/unlock_system_sleep() calls.
+	 */
+	lock_system_sleep();
+
 	if (unlikely(spi_dv_in_progress(starget)))
-		return;
+		goto unlock;
 
 	if (unlikely(scsi_device_get(sdev)))
-		return;
+		goto unlock;
+
 	spi_dv_in_progress(starget) = 1;
 
 	buffer = kzalloc(len, GFP_KERNEL);
@@ -1049,6 +1059,8 @@ spi_dv_device(struct scsi_device *sdev)
  out_put:
 	spi_dv_in_progress(starget) = 0;
 	scsi_device_put(sdev);
+unlock:
+	unlock_system_sleep();
 }
 EXPORT_SYMBOL(spi_dv_device);
 
diff --git a/drivers/spi/spi-armada-3700.c b/drivers/spi/spi-armada-3700.c
index d653453..7dcb14d 100644
--- a/drivers/spi/spi-armada-3700.c
+++ b/drivers/spi/spi-armada-3700.c
@@ -27,6 +27,8 @@
 
 #define DRIVER_NAME			"armada_3700_spi"
 
+#define A3700_SPI_MAX_SPEED_HZ		100000000
+#define A3700_SPI_MAX_PRESCALE		30
 #define A3700_SPI_TIMEOUT		10
 
 /* SPI Register Offest */
@@ -184,12 +186,15 @@ static int a3700_spi_pin_mode_set(struct a3700_spi *a3700_spi,
 	return 0;
 }
 
-static void a3700_spi_fifo_mode_set(struct a3700_spi *a3700_spi)
+static void a3700_spi_fifo_mode_set(struct a3700_spi *a3700_spi, bool enable)
 {
 	u32 val;
 
 	val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
-	val |= A3700_SPI_FIFO_MODE;
+	if (enable)
+		val |= A3700_SPI_FIFO_MODE;
+	else
+		val &= ~A3700_SPI_FIFO_MODE;
 	spireg_write(a3700_spi, A3700_SPI_IF_CFG_REG, val);
 }
 
@@ -297,7 +302,7 @@ static int a3700_spi_init(struct a3700_spi *a3700_spi)
 		a3700_spi_deactivate_cs(a3700_spi, i);
 
 	/* Enable FIFO mode */
-	a3700_spi_fifo_mode_set(a3700_spi);
+	a3700_spi_fifo_mode_set(a3700_spi, true);
 
 	/* Set SPI mode */
 	a3700_spi_mode_set(a3700_spi, master->mode_bits);
@@ -416,15 +421,20 @@ static void a3700_spi_transfer_setup(struct spi_device *spi,
 				     struct spi_transfer *xfer)
 {
 	struct a3700_spi *a3700_spi;
-	unsigned int byte_len;
 
 	a3700_spi = spi_master_get_devdata(spi->master);
 
 	a3700_spi_clock_set(a3700_spi, xfer->speed_hz);
 
-	byte_len = xfer->bits_per_word >> 3;
+	/* Use 4 bytes long transfers. Each transfer method has its way to deal
+	 * with the remaining bytes for non 4-bytes aligned transfers.
+	 */
+	a3700_spi_bytelen_set(a3700_spi, 4);
 
-	a3700_spi_fifo_thres_set(a3700_spi, byte_len);
+	/* Initialize the working buffers */
+	a3700_spi->tx_buf  = xfer->tx_buf;
+	a3700_spi->rx_buf  = xfer->rx_buf;
+	a3700_spi->buf_len = xfer->len;
 }
 
 static void a3700_spi_set_cs(struct spi_device *spi, bool enable)
@@ -491,7 +501,7 @@ static int a3700_spi_fifo_write(struct a3700_spi *a3700_spi)
 	u32 val;
 
 	while (!a3700_is_wfifo_full(a3700_spi) && a3700_spi->buf_len) {
-		val = cpu_to_le32(*(u32 *)a3700_spi->tx_buf);
+		val = *(u32 *)a3700_spi->tx_buf;
 		spireg_write(a3700_spi, A3700_SPI_DATA_OUT_REG, val);
 		a3700_spi->buf_len -= 4;
 		a3700_spi->tx_buf += 4;
@@ -514,9 +524,8 @@ static int a3700_spi_fifo_read(struct a3700_spi *a3700_spi)
 	while (!a3700_is_rfifo_empty(a3700_spi) && a3700_spi->buf_len) {
 		val = spireg_read(a3700_spi, A3700_SPI_DATA_IN_REG);
 		if (a3700_spi->buf_len >= 4) {
-			u32 data = le32_to_cpu(val);
 
-			memcpy(a3700_spi->rx_buf, &data, 4);
+			memcpy(a3700_spi->rx_buf, &val, 4);
 
 			a3700_spi->buf_len -= 4;
 			a3700_spi->rx_buf += 4;
@@ -579,27 +588,26 @@ static int a3700_spi_prepare_message(struct spi_master *master,
 	if (ret)
 		return ret;
 
-	a3700_spi_bytelen_set(a3700_spi, 4);
-
 	a3700_spi_mode_set(a3700_spi, spi->mode);
 
 	return 0;
 }
 
-static int a3700_spi_transfer_one(struct spi_master *master,
+static int a3700_spi_transfer_one_fifo(struct spi_master *master,
 				  struct spi_device *spi,
 				  struct spi_transfer *xfer)
 {
 	struct a3700_spi *a3700_spi = spi_master_get_devdata(master);
 	int ret = 0, timeout = A3700_SPI_TIMEOUT;
-	unsigned int nbits = 0;
+	unsigned int nbits = 0, byte_len;
 	u32 val;
 
-	a3700_spi_transfer_setup(spi, xfer);
+	/* Make sure we use FIFO mode */
+	a3700_spi_fifo_mode_set(a3700_spi, true);
 
-	a3700_spi->tx_buf  = xfer->tx_buf;
-	a3700_spi->rx_buf  = xfer->rx_buf;
-	a3700_spi->buf_len = xfer->len;
+	/* Configure FIFO thresholds */
+	byte_len = xfer->bits_per_word >> 3;
+	a3700_spi_fifo_thres_set(a3700_spi, byte_len);
 
 	if (xfer->tx_buf)
 		nbits = xfer->tx_nbits;
@@ -615,6 +623,11 @@ static int a3700_spi_transfer_one(struct spi_master *master,
 	a3700_spi_header_set(a3700_spi);
 
 	if (xfer->rx_buf) {
+		/* Clear WFIFO, since it's last 2 bytes are shifted out during
+		 * a read operation
+		 */
+		spireg_write(a3700_spi, A3700_SPI_DATA_OUT_REG, 0);
+
 		/* Set read data length */
 		spireg_write(a3700_spi, A3700_SPI_IF_DIN_CNT_REG,
 			     a3700_spi->buf_len);
@@ -729,6 +742,63 @@ static int a3700_spi_transfer_one(struct spi_master *master,
 	return ret;
 }
 
+static int a3700_spi_transfer_one_full_duplex(struct spi_master *master,
+				  struct spi_device *spi,
+				  struct spi_transfer *xfer)
+{
+	struct a3700_spi *a3700_spi = spi_master_get_devdata(master);
+	u32 val;
+
+	/* Disable FIFO mode */
+	a3700_spi_fifo_mode_set(a3700_spi, false);
+
+	while (a3700_spi->buf_len) {
+
+		/* When we have less than 4 bytes to transfer, switch to 1 byte
+		 * mode. This is reset after each transfer
+		 */
+		if (a3700_spi->buf_len < 4)
+			a3700_spi_bytelen_set(a3700_spi, 1);
+
+		if (a3700_spi->byte_len == 1)
+			val = *a3700_spi->tx_buf;
+		else
+			val = *(u32 *)a3700_spi->tx_buf;
+
+		spireg_write(a3700_spi, A3700_SPI_DATA_OUT_REG, val);
+
+		/* Wait for all the data to be shifted in / out */
+		while (!(spireg_read(a3700_spi, A3700_SPI_IF_CTRL_REG) &
+				A3700_SPI_XFER_DONE))
+			cpu_relax();
+
+		val = spireg_read(a3700_spi, A3700_SPI_DATA_IN_REG);
+
+		memcpy(a3700_spi->rx_buf, &val, a3700_spi->byte_len);
+
+		a3700_spi->buf_len -= a3700_spi->byte_len;
+		a3700_spi->tx_buf += a3700_spi->byte_len;
+		a3700_spi->rx_buf += a3700_spi->byte_len;
+
+	}
+
+	spi_finalize_current_transfer(master);
+
+	return 0;
+}
+
+static int a3700_spi_transfer_one(struct spi_master *master,
+				  struct spi_device *spi,
+				  struct spi_transfer *xfer)
+{
+	a3700_spi_transfer_setup(spi, xfer);
+
+	if (xfer->tx_buf && xfer->rx_buf)
+		return a3700_spi_transfer_one_full_duplex(master, spi, xfer);
+
+	return a3700_spi_transfer_one_fifo(master, spi, xfer);
+}
+
 static int a3700_spi_unprepare_message(struct spi_master *master,
 				       struct spi_message *message)
 {
@@ -778,7 +848,6 @@ static int a3700_spi_probe(struct platform_device *pdev)
 	master->transfer_one = a3700_spi_transfer_one;
 	master->unprepare_message = a3700_spi_unprepare_message;
 	master->set_cs = a3700_spi_set_cs;
-	master->flags = SPI_MASTER_HALF_DUPLEX;
 	master->mode_bits |= (SPI_RX_DUAL | SPI_TX_DUAL |
 			      SPI_RX_QUAD | SPI_TX_QUAD);
 
@@ -818,6 +887,11 @@ static int a3700_spi_probe(struct platform_device *pdev)
 		goto error;
 	}
 
+	master->max_speed_hz = min_t(unsigned long, A3700_SPI_MAX_SPEED_HZ,
+					clk_get_rate(spi->clk));
+	master->min_speed_hz = DIV_ROUND_UP(clk_get_rate(spi->clk),
+						A3700_SPI_MAX_PRESCALE);
+
 	ret = a3700_spi_init(spi);
 	if (ret)
 		goto error_clk;
diff --git a/drivers/spi/spi-atmel.c b/drivers/spi/spi-atmel.c
index 6694709..4a11fc0 100644
--- a/drivers/spi/spi-atmel.c
+++ b/drivers/spi/spi-atmel.c
@@ -291,6 +291,10 @@ struct atmel_spi {
 	struct spi_transfer	*current_transfer;
 	int			current_remaining_bytes;
 	int			done_status;
+	dma_addr_t		dma_addr_rx_bbuf;
+	dma_addr_t		dma_addr_tx_bbuf;
+	void			*addr_rx_bbuf;
+	void			*addr_tx_bbuf;
 
 	struct completion	xfer_completion;
 
@@ -436,6 +440,11 @@ static void atmel_spi_unlock(struct atmel_spi *as) __releases(&as->lock)
 	spin_unlock_irqrestore(&as->lock, as->flags);
 }
 
+static inline bool atmel_spi_is_vmalloc_xfer(struct spi_transfer *xfer)
+{
+	return is_vmalloc_addr(xfer->tx_buf) || is_vmalloc_addr(xfer->rx_buf);
+}
+
 static inline bool atmel_spi_use_dma(struct atmel_spi *as,
 				struct spi_transfer *xfer)
 {
@@ -448,7 +457,12 @@ static bool atmel_spi_can_dma(struct spi_master *master,
 {
 	struct atmel_spi *as = spi_master_get_devdata(master);
 
-	return atmel_spi_use_dma(as, xfer);
+	if (IS_ENABLED(CONFIG_SOC_SAM_V4_V5))
+		return atmel_spi_use_dma(as, xfer) &&
+			!atmel_spi_is_vmalloc_xfer(xfer);
+	else
+		return atmel_spi_use_dma(as, xfer);
+
 }
 
 static int atmel_spi_dma_slave_config(struct atmel_spi *as,
@@ -594,6 +608,11 @@ static void dma_callback(void *data)
 	struct spi_master	*master = data;
 	struct atmel_spi	*as = spi_master_get_devdata(master);
 
+	if (is_vmalloc_addr(as->current_transfer->rx_buf) &&
+	    IS_ENABLED(CONFIG_SOC_SAM_V4_V5)) {
+		memcpy(as->current_transfer->rx_buf, as->addr_rx_bbuf,
+		       as->current_transfer->len);
+	}
 	complete(&as->xfer_completion);
 }
 
@@ -744,17 +763,41 @@ static int atmel_spi_next_xfer_dma_submit(struct spi_master *master,
 		goto err_exit;
 
 	/* Send both scatterlists */
-	rxdesc = dmaengine_prep_slave_sg(rxchan,
-					 xfer->rx_sg.sgl, xfer->rx_sg.nents,
-					 DMA_FROM_DEVICE,
-					 DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+	if (atmel_spi_is_vmalloc_xfer(xfer) &&
+	    IS_ENABLED(CONFIG_SOC_SAM_V4_V5)) {
+		rxdesc = dmaengine_prep_slave_single(rxchan,
+						     as->dma_addr_rx_bbuf,
+						     xfer->len,
+						     DMA_FROM_DEVICE,
+						     DMA_PREP_INTERRUPT |
+						     DMA_CTRL_ACK);
+	} else {
+		rxdesc = dmaengine_prep_slave_sg(rxchan,
+						 xfer->rx_sg.sgl,
+						 xfer->rx_sg.nents,
+						 DMA_FROM_DEVICE,
+						 DMA_PREP_INTERRUPT |
+						 DMA_CTRL_ACK);
+	}
 	if (!rxdesc)
 		goto err_dma;
 
-	txdesc = dmaengine_prep_slave_sg(txchan,
-					 xfer->tx_sg.sgl, xfer->tx_sg.nents,
-					 DMA_TO_DEVICE,
-					 DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+	if (atmel_spi_is_vmalloc_xfer(xfer) &&
+	    IS_ENABLED(CONFIG_SOC_SAM_V4_V5)) {
+		memcpy(as->addr_tx_bbuf, xfer->tx_buf, xfer->len);
+		txdesc = dmaengine_prep_slave_single(txchan,
+						     as->dma_addr_tx_bbuf,
+						     xfer->len, DMA_TO_DEVICE,
+						     DMA_PREP_INTERRUPT |
+						     DMA_CTRL_ACK);
+	} else {
+		txdesc = dmaengine_prep_slave_sg(txchan,
+						 xfer->tx_sg.sgl,
+						 xfer->tx_sg.nents,
+						 DMA_TO_DEVICE,
+						 DMA_PREP_INTERRUPT |
+						 DMA_CTRL_ACK);
+	}
 	if (!txdesc)
 		goto err_dma;
 
@@ -1426,27 +1469,7 @@ static void atmel_get_caps(struct atmel_spi *as)
 
 	as->caps.is_spi2 = version > 0x121;
 	as->caps.has_wdrbt = version >= 0x210;
-#ifdef CONFIG_SOC_SAM_V4_V5
-	/*
-	 * Atmel SoCs based on ARM9 (SAM9x) cores should not use spi_map_buf()
-	 * since this later function tries to map buffers with dma_map_sg()
-	 * even if they have not been allocated inside DMA-safe areas.
-	 * On SoCs based on Cortex A5 (SAMA5Dx), it works anyway because for
-	 * those ARM cores, the data cache follows the PIPT model.
-	 * Also the L2 cache controller of SAMA5D2 uses the PIPT model too.
-	 * In case of PIPT caches, there cannot be cache aliases.
-	 * However on ARM9 cores, the data cache follows the VIVT model, hence
-	 * the cache aliases issue can occur when buffers are allocated from
-	 * DMA-unsafe areas, by vmalloc() for instance, where cache coherency is
-	 * not taken into account or at least not handled completely (cache
-	 * lines of aliases are not invalidated).
-	 * This is not a theorical issue: it was reproduced when trying to mount
-	 * a UBI file-system on a at91sam9g35ek board.
-	 */
-	as->caps.has_dma_support = false;
-#else
 	as->caps.has_dma_support = version >= 0x212;
-#endif
 	as->caps.has_pdc_support = version < 0x212;
 }
 
@@ -1592,6 +1615,30 @@ static int atmel_spi_probe(struct platform_device *pdev)
 		as->use_pdc = true;
 	}
 
+	if (IS_ENABLED(CONFIG_SOC_SAM_V4_V5)) {
+		as->addr_rx_bbuf = dma_alloc_coherent(&pdev->dev,
+						      SPI_MAX_DMA_XFER,
+						      &as->dma_addr_rx_bbuf,
+						      GFP_KERNEL | GFP_DMA);
+		if (!as->addr_rx_bbuf) {
+			as->use_dma = false;
+		} else {
+			as->addr_tx_bbuf = dma_alloc_coherent(&pdev->dev,
+					SPI_MAX_DMA_XFER,
+					&as->dma_addr_tx_bbuf,
+					GFP_KERNEL | GFP_DMA);
+			if (!as->addr_tx_bbuf) {
+				as->use_dma = false;
+				dma_free_coherent(&pdev->dev, SPI_MAX_DMA_XFER,
+						  as->addr_rx_bbuf,
+						  as->dma_addr_rx_bbuf);
+			}
+		}
+		if (!as->use_dma)
+			dev_info(master->dev.parent,
+				 "  can not allocate dma coherent memory\n");
+	}
+
 	if (as->caps.has_dma_support && !as->use_dma)
 		dev_info(&pdev->dev, "Atmel SPI Controller using PIO only\n");
 
@@ -1664,6 +1711,14 @@ static int atmel_spi_remove(struct platform_device *pdev)
 	if (as->use_dma) {
 		atmel_spi_stop_dma(master);
 		atmel_spi_release_dma(master);
+		if (IS_ENABLED(CONFIG_SOC_SAM_V4_V5)) {
+			dma_free_coherent(&pdev->dev, SPI_MAX_DMA_XFER,
+					  as->addr_tx_bbuf,
+					  as->dma_addr_tx_bbuf);
+			dma_free_coherent(&pdev->dev, SPI_MAX_DMA_XFER,
+					  as->addr_rx_bbuf,
+					  as->dma_addr_rx_bbuf);
+		}
 	}
 
 	spin_lock_irq(&as->lock);
diff --git a/drivers/spi/spi-bcm53xx.c b/drivers/spi/spi-bcm53xx.c
index 6e409ea..d02ceb7 100644
--- a/drivers/spi/spi-bcm53xx.c
+++ b/drivers/spi/spi-bcm53xx.c
@@ -27,8 +27,6 @@ struct bcm53xxspi {
 	struct bcma_device *core;
 	struct spi_master *master;
 	void __iomem *mmio_base;
-
-	size_t read_offset;
 	bool bspi;				/* Boot SPI mode with memory mapping */
 };
 
@@ -172,8 +170,6 @@ static void bcm53xxspi_buf_write(struct bcm53xxspi *b53spi, u8 *w_buf,
 
 	if (!cont)
 		bcm53xxspi_write(b53spi, B53SPI_MSPI_WRITE_LOCK, 0);
-
-	b53spi->read_offset = len;
 }
 
 static void bcm53xxspi_buf_read(struct bcm53xxspi *b53spi, u8 *r_buf,
@@ -182,10 +178,10 @@ static void bcm53xxspi_buf_read(struct bcm53xxspi *b53spi, u8 *r_buf,
 	u32 tmp;
 	int i;
 
-	for (i = 0; i < b53spi->read_offset + len; i++) {
+	for (i = 0; i < len; i++) {
 		tmp = B53SPI_CDRAM_CONT | B53SPI_CDRAM_PCS_DISABLE_ALL |
 		      B53SPI_CDRAM_PCS_DSCK;
-		if (!cont && i == b53spi->read_offset + len - 1)
+		if (!cont && i == len - 1)
 			tmp &= ~B53SPI_CDRAM_CONT;
 		tmp &= ~0x1;
 		/* Command Register File */
@@ -194,8 +190,7 @@ static void bcm53xxspi_buf_read(struct bcm53xxspi *b53spi, u8 *r_buf,
 
 	/* Set queue pointers */
 	bcm53xxspi_write(b53spi, B53SPI_MSPI_NEWQP, 0);
-	bcm53xxspi_write(b53spi, B53SPI_MSPI_ENDQP,
-			 b53spi->read_offset + len - 1);
+	bcm53xxspi_write(b53spi, B53SPI_MSPI_ENDQP, len - 1);
 
 	if (cont)
 		bcm53xxspi_write(b53spi, B53SPI_MSPI_WRITE_LOCK, 1);
@@ -214,13 +209,11 @@ static void bcm53xxspi_buf_read(struct bcm53xxspi *b53spi, u8 *r_buf,
 		bcm53xxspi_write(b53spi, B53SPI_MSPI_WRITE_LOCK, 0);
 
 	for (i = 0; i < len; ++i) {
-		int offset = b53spi->read_offset + i;
+		u16 reg = B53SPI_MSPI_RXRAM + 4 * (1 + i * 2);
 
 		/* Data stored in the transmit register file LSB */
-		r_buf[i] = (u8)bcm53xxspi_read(b53spi, B53SPI_MSPI_RXRAM + 4 * (1 + offset * 2));
+		r_buf[i] = (u8)bcm53xxspi_read(b53spi, reg);
 	}
-
-	b53spi->read_offset = 0;
 }
 
 static int bcm53xxspi_transfer_one(struct spi_master *master,
@@ -238,7 +231,8 @@ static int bcm53xxspi_transfer_one(struct spi_master *master,
 		left = t->len;
 		while (left) {
 			size_t to_write = min_t(size_t, 16, left);
-			bool cont = left - to_write > 0;
+			bool cont = !spi_transfer_is_last(master, t) ||
+				    left - to_write > 0;
 
 			bcm53xxspi_buf_write(b53spi, buf, to_write, cont);
 			left -= to_write;
@@ -250,9 +244,9 @@ static int bcm53xxspi_transfer_one(struct spi_master *master,
 		buf = (u8 *)t->rx_buf;
 		left = t->len;
 		while (left) {
-			size_t to_read = min_t(size_t, 16 - b53spi->read_offset,
-					       left);
-			bool cont = left - to_read > 0;
+			size_t to_read = min_t(size_t, 16, left);
+			bool cont = !spi_transfer_is_last(master, t) ||
+				    left - to_read > 0;
 
 			bcm53xxspi_buf_read(b53spi, buf, to_read, cont);
 			left -= to_read;
diff --git a/drivers/spi/spi-davinci.c b/drivers/spi/spi-davinci.c
index 6ddb6ef..60d59b0 100644
--- a/drivers/spi/spi-davinci.c
+++ b/drivers/spi/spi-davinci.c
@@ -945,6 +945,8 @@ static int davinci_spi_probe(struct platform_device *pdev)
 		goto free_master;
 	}
 
+	init_completion(&dspi->done);
+
 	ret = platform_get_irq(pdev, 0);
 	if (ret == 0)
 		ret = -EINVAL;
@@ -1021,8 +1023,6 @@ static int davinci_spi_probe(struct platform_device *pdev)
 	dspi->get_rx = davinci_spi_rx_buf_u8;
 	dspi->get_tx = davinci_spi_tx_buf_u8;
 
-	init_completion(&dspi->done);
-
 	/* Reset In/OUT SPI module */
 	iowrite32(0, dspi->base + SPIGCR0);
 	udelay(100);
diff --git a/drivers/spi/spi-dw.c b/drivers/spi/spi-dw.c
index b217c22..211cc7d 100644
--- a/drivers/spi/spi-dw.c
+++ b/drivers/spi/spi-dw.c
@@ -30,13 +30,11 @@
 
 /* Slave spi_dev related */
 struct chip_data {
-	u8 cs;			/* chip select pin */
 	u8 tmode;		/* TR/TO/RO/EEPROM */
 	u8 type;		/* SPI/SSP/MicroWire */
 
 	u8 poll_mode;		/* 1 means use poll mode */
 
-	u8 enable_dma;
 	u16 clk_div;		/* baud rate divider */
 	u32 speed_hz;		/* baud rate */
 	void (*cs_control)(u32 command);
diff --git a/drivers/spi/spi-fsl-dspi.c b/drivers/spi/spi-fsl-dspi.c
index f652f70..0630962 100644
--- a/drivers/spi/spi-fsl-dspi.c
+++ b/drivers/spi/spi-fsl-dspi.c
@@ -903,10 +903,9 @@ static irqreturn_t dspi_interrupt(int irq, void *dev_id)
 }
 
 static const struct of_device_id fsl_dspi_dt_ids[] = {
-	{ .compatible = "fsl,vf610-dspi", .data = (void *)&vf610_data, },
-	{ .compatible = "fsl,ls1021a-v1.0-dspi",
-		.data = (void *)&ls1021a_v1_data, },
-	{ .compatible = "fsl,ls2085a-dspi", .data = (void *)&ls2085a_data, },
+	{ .compatible = "fsl,vf610-dspi", .data = &vf610_data, },
+	{ .compatible = "fsl,ls1021a-v1.0-dspi", .data = &ls1021a_v1_data, },
+	{ .compatible = "fsl,ls2085a-dspi", .data = &ls2085a_data, },
 	{ /* sentinel */ }
 };
 MODULE_DEVICE_TABLE(of, fsl_dspi_dt_ids);
@@ -980,7 +979,7 @@ static int dspi_probe(struct platform_device *pdev)
 	master->dev.of_node = pdev->dev.of_node;
 
 	master->cleanup = dspi_cleanup;
-	master->mode_bits = SPI_CPOL | SPI_CPHA;
+	master->mode_bits = SPI_CPOL | SPI_CPHA | SPI_LSB_FIRST;
 	master->bits_per_word_mask = SPI_BPW_MASK(4) | SPI_BPW_MASK(8) |
 					SPI_BPW_MASK(16);
 
diff --git a/drivers/spi/spi-imx.c b/drivers/spi/spi-imx.c
index 79ddefe..6f57592 100644
--- a/drivers/spi/spi-imx.c
+++ b/drivers/spi/spi-imx.c
@@ -1622,6 +1622,11 @@ static int spi_imx_probe(struct platform_device *pdev)
 	spi_imx->devtype_data->intctrl(spi_imx, 0);
 
 	master->dev.of_node = pdev->dev.of_node;
+	ret = spi_bitbang_start(&spi_imx->bitbang);
+	if (ret) {
+		dev_err(&pdev->dev, "bitbang start failed with %d\n", ret);
+		goto out_clk_put;
+	}
 
 	/* Request GPIO CS lines, if any */
 	if (!spi_imx->slave_mode && master->cs_gpios) {
@@ -1640,12 +1645,6 @@ static int spi_imx_probe(struct platform_device *pdev)
 		}
 	}
 
-	ret = spi_bitbang_start(&spi_imx->bitbang);
-	if (ret) {
-		dev_err(&pdev->dev, "bitbang start failed with %d\n", ret);
-		goto out_clk_put;
-	}
-
 	dev_info(&pdev->dev, "probed\n");
 
 	clk_disable(spi_imx->clk_ipg);
@@ -1668,12 +1667,23 @@ static int spi_imx_remove(struct platform_device *pdev)
 {
 	struct spi_master *master = platform_get_drvdata(pdev);
 	struct spi_imx_data *spi_imx = spi_master_get_devdata(master);
+	int ret;
 
 	spi_bitbang_stop(&spi_imx->bitbang);
 
+	ret = clk_enable(spi_imx->clk_per);
+	if (ret)
+		return ret;
+
+	ret = clk_enable(spi_imx->clk_ipg);
+	if (ret) {
+		clk_disable(spi_imx->clk_per);
+		return ret;
+	}
+
 	writel(0, spi_imx->base + MXC_CSPICTRL);
-	clk_unprepare(spi_imx->clk_ipg);
-	clk_unprepare(spi_imx->clk_per);
+	clk_disable_unprepare(spi_imx->clk_ipg);
+	clk_disable_unprepare(spi_imx->clk_per);
 	spi_imx_sdma_exit(spi_imx);
 	spi_master_put(master);
 
diff --git a/drivers/spi/spi-jcore.c b/drivers/spi/spi-jcore.c
index cebfea5..dafed62 100644
--- a/drivers/spi/spi-jcore.c
+++ b/drivers/spi/spi-jcore.c
@@ -198,8 +198,10 @@ static int jcore_spi_probe(struct platform_device *pdev)
 
 	/* Register our spi controller */
 	err = devm_spi_register_master(&pdev->dev, master);
-	if (err)
+	if (err) {
+		clk_disable(clk);
 		goto exit;
+	}
 
 	return 0;
 
diff --git a/drivers/spi/spi-meson-spicc.c b/drivers/spi/spi-meson-spicc.c
index 7f84296..5c82910 100644
--- a/drivers/spi/spi-meson-spicc.c
+++ b/drivers/spi/spi-meson-spicc.c
@@ -599,6 +599,7 @@ static int meson_spicc_remove(struct platform_device *pdev)
 
 static const struct of_device_id meson_spicc_of_match[] = {
 	{ .compatible = "amlogic,meson-gx-spicc", },
+	{ .compatible = "amlogic,meson-axg-spicc", },
 	{ /* sentinel */ }
 };
 MODULE_DEVICE_TABLE(of, meson_spicc_of_match);
diff --git a/drivers/spi/spi-orion.c b/drivers/spi/spi-orion.c
index 8974bb3..deca63e 100644
--- a/drivers/spi/spi-orion.c
+++ b/drivers/spi/spi-orion.c
@@ -94,6 +94,7 @@ struct orion_spi {
 	struct spi_master	*master;
 	void __iomem		*base;
 	struct clk              *clk;
+	struct clk              *axi_clk;
 	const struct orion_spi_dev *devdata;
 
 	struct orion_direct_acc	direct_access[ORION_NUM_CHIPSELECTS];
@@ -634,6 +635,16 @@ static int orion_spi_probe(struct platform_device *pdev)
 	if (status)
 		goto out;
 
+	/* The following clock is only used by some SoCs */
+	spi->axi_clk = devm_clk_get(&pdev->dev, "axi");
+	if (IS_ERR(spi->axi_clk) &&
+	    PTR_ERR(spi->axi_clk) == -EPROBE_DEFER) {
+		status = -EPROBE_DEFER;
+		goto out_rel_clk;
+	}
+	if (!IS_ERR(spi->axi_clk))
+		clk_prepare_enable(spi->axi_clk);
+
 	tclk_hz = clk_get_rate(spi->clk);
 
 	/*
@@ -658,7 +669,7 @@ static int orion_spi_probe(struct platform_device *pdev)
 	spi->base = devm_ioremap_resource(&pdev->dev, r);
 	if (IS_ERR(spi->base)) {
 		status = PTR_ERR(spi->base);
-		goto out_rel_clk;
+		goto out_rel_axi_clk;
 	}
 
 	/* Scan all SPI devices of this controller for direct mapped devices */
@@ -696,7 +707,7 @@ static int orion_spi_probe(struct platform_device *pdev)
 							    PAGE_SIZE);
 		if (!spi->direct_access[cs].vaddr) {
 			status = -ENOMEM;
-			goto out_rel_clk;
+			goto out_rel_axi_clk;
 		}
 		spi->direct_access[cs].size = PAGE_SIZE;
 
@@ -724,6 +735,8 @@ static int orion_spi_probe(struct platform_device *pdev)
 
 out_rel_pm:
 	pm_runtime_disable(&pdev->dev);
+out_rel_axi_clk:
+	clk_disable_unprepare(spi->axi_clk);
 out_rel_clk:
 	clk_disable_unprepare(spi->clk);
 out:
@@ -738,6 +751,7 @@ static int orion_spi_remove(struct platform_device *pdev)
 	struct orion_spi *spi = spi_master_get_devdata(master);
 
 	pm_runtime_get_sync(&pdev->dev);
+	clk_disable_unprepare(spi->axi_clk);
 	clk_disable_unprepare(spi->clk);
 
 	spi_unregister_master(master);
@@ -754,6 +768,7 @@ static int orion_spi_runtime_suspend(struct device *dev)
 	struct spi_master *master = dev_get_drvdata(dev);
 	struct orion_spi *spi = spi_master_get_devdata(master);
 
+	clk_disable_unprepare(spi->axi_clk);
 	clk_disable_unprepare(spi->clk);
 	return 0;
 }
@@ -763,6 +778,8 @@ static int orion_spi_runtime_resume(struct device *dev)
 	struct spi_master *master = dev_get_drvdata(dev);
 	struct orion_spi *spi = spi_master_get_devdata(master);
 
+	if (!IS_ERR(spi->axi_clk))
+		clk_prepare_enable(spi->axi_clk);
 	return clk_prepare_enable(spi->clk);
 }
 #endif
diff --git a/drivers/spi/spi-pxa2xx.c b/drivers/spi/spi-pxa2xx.c
index 4cb515a..b0822d1 100644
--- a/drivers/spi/spi-pxa2xx.c
+++ b/drivers/spi/spi-pxa2xx.c
@@ -1237,7 +1237,7 @@ static int setup_cs(struct spi_device *spi, struct chip_data *chip,
 	 * different chip_info, release previously requested GPIO
 	 */
 	if (chip->gpiod_cs) {
-		gpio_free(desc_to_gpio(chip->gpiod_cs));
+		gpiod_put(chip->gpiod_cs);
 		chip->gpiod_cs = NULL;
 	}
 
@@ -1417,7 +1417,7 @@ static void cleanup(struct spi_device *spi)
 
 	if (drv_data->ssp_type != CE4100_SSP && !drv_data->cs_gpiods &&
 	    chip->gpiod_cs)
-		gpio_free(desc_to_gpio(chip->gpiod_cs));
+		gpiod_put(chip->gpiod_cs);
 
 	kfree(chip);
 }
diff --git a/drivers/spi/spi-s3c64xx.c b/drivers/spi/spi-s3c64xx.c
index de7df20..baa3a9f 100644
--- a/drivers/spi/spi-s3c64xx.c
+++ b/drivers/spi/spi-s3c64xx.c
@@ -1,17 +1,7 @@
-/*
- * Copyright (C) 2009 Samsung Electronics Ltd.
- *	Jaswinder Singh <jassi.brar@samsung.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License as published by
- * the Free Software Foundation; either version 2 of the License, or
- * (at your option) any later version.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- */
+// SPDX-License-Identifier: GPL-2.0+
+//
+// Copyright (c) 2009 Samsung Electronics Co., Ltd.
+//      Jaswinder Singh <jassi.brar@samsung.com>
 
 #include <linux/init.h>
 #include <linux/module.h>
diff --git a/drivers/spi/spi-sh-msiof.c b/drivers/spi/spi-sh-msiof.c
index fcd261f..c5dcfb4 100644
--- a/drivers/spi/spi-sh-msiof.c
+++ b/drivers/spi/spi-sh-msiof.c
@@ -19,6 +19,7 @@
 #include <linux/dmaengine.h>
 #include <linux/err.h>
 #include <linux/gpio.h>
+#include <linux/gpio/consumer.h>
 #include <linux/interrupt.h>
 #include <linux/io.h>
 #include <linux/kernel.h>
@@ -55,9 +56,14 @@ struct sh_msiof_spi_priv {
 	void *rx_dma_page;
 	dma_addr_t tx_dma_addr;
 	dma_addr_t rx_dma_addr;
+	unsigned short unused_ss;
+	bool native_cs_inited;
+	bool native_cs_high;
 	bool slave_aborted;
 };
 
+#define MAX_SS	3	/* Maximum number of native chip selects */
+
 #define TMDR1	0x00	/* Transmit Mode Register 1 */
 #define TMDR2	0x04	/* Transmit Mode Register 2 */
 #define TMDR3	0x08	/* Transmit Mode Register 3 */
@@ -91,6 +97,8 @@ struct sh_msiof_spi_priv {
 #define MDR1_XXSTP	 0x00000001 /* Transmission/Reception Stop on FIFO */
 /* TMDR1 */
 #define TMDR1_PCON	 0x40000000 /* Transfer Signal Connection */
+#define TMDR1_SYNCCH_MASK 0xc000000 /* Synchronization Signal Channel Select */
+#define TMDR1_SYNCCH_SHIFT	 26 /* 0=MSIOF_SYNC, 1=MSIOF_SS1, 2=MSIOF_SS2 */
 
 /* TMDR2 and RMDR2 */
 #define MDR2_BITLEN1(i)	(((i) - 1) << 24) /* Data Size (8-32 bits) */
@@ -324,7 +332,7 @@ static u32 sh_msiof_spi_get_dtdl_and_syncdl(struct sh_msiof_spi_priv *p)
 	return val;
 }
 
-static void sh_msiof_spi_set_pin_regs(struct sh_msiof_spi_priv *p,
+static void sh_msiof_spi_set_pin_regs(struct sh_msiof_spi_priv *p, u32 ss,
 				      u32 cpol, u32 cpha,
 				      u32 tx_hi_z, u32 lsb_first, u32 cs_high)
 {
@@ -342,10 +350,13 @@ static void sh_msiof_spi_set_pin_regs(struct sh_msiof_spi_priv *p,
 	tmp |= !cs_high << MDR1_SYNCAC_SHIFT;
 	tmp |= lsb_first << MDR1_BITLSB_SHIFT;
 	tmp |= sh_msiof_spi_get_dtdl_and_syncdl(p);
-	if (spi_controller_is_slave(p->master))
+	if (spi_controller_is_slave(p->master)) {
 		sh_msiof_write(p, TMDR1, tmp | TMDR1_PCON);
-	else
-		sh_msiof_write(p, TMDR1, tmp | MDR1_TRMD | TMDR1_PCON);
+	} else {
+		sh_msiof_write(p, TMDR1,
+			       tmp | MDR1_TRMD | TMDR1_PCON |
+			       (ss < MAX_SS ? ss : 0) << TMDR1_SYNCCH_SHIFT);
+	}
 	if (p->master->flags & SPI_MASTER_MUST_TX) {
 		/* These bits are reserved if RX needs TX */
 		tmp &= ~0x0000ffff;
@@ -528,8 +539,7 @@ static int sh_msiof_spi_setup(struct spi_device *spi)
 {
 	struct device_node	*np = spi->master->dev.of_node;
 	struct sh_msiof_spi_priv *p = spi_master_get_devdata(spi->master);
-
-	pm_runtime_get_sync(&p->pdev->dev);
+	u32 clr, set, tmp;
 
 	if (!np) {
 		/*
@@ -539,19 +549,31 @@ static int sh_msiof_spi_setup(struct spi_device *spi)
 		spi->cs_gpio = (uintptr_t)spi->controller_data;
 	}
 
-	/* Configure pins before deasserting CS */
-	sh_msiof_spi_set_pin_regs(p, !!(spi->mode & SPI_CPOL),
-				  !!(spi->mode & SPI_CPHA),
-				  !!(spi->mode & SPI_3WIRE),
-				  !!(spi->mode & SPI_LSB_FIRST),
-				  !!(spi->mode & SPI_CS_HIGH));
+	if (gpio_is_valid(spi->cs_gpio)) {
+		gpio_direction_output(spi->cs_gpio, !(spi->mode & SPI_CS_HIGH));
+		return 0;
+	}
 
-	if (spi->cs_gpio >= 0)
-		gpio_set_value(spi->cs_gpio, !(spi->mode & SPI_CS_HIGH));
+	if (spi_controller_is_slave(p->master))
+		return 0;
 
+	if (p->native_cs_inited &&
+	    (p->native_cs_high == !!(spi->mode & SPI_CS_HIGH)))
+		return 0;
 
+	/* Configure native chip select mode/polarity early */
+	clr = MDR1_SYNCMD_MASK;
+	set = MDR1_TRMD | TMDR1_PCON | MDR1_SYNCMD_SPI;
+	if (spi->mode & SPI_CS_HIGH)
+		clr |= BIT(MDR1_SYNCAC_SHIFT);
+	else
+		set |= BIT(MDR1_SYNCAC_SHIFT);
+	pm_runtime_get_sync(&p->pdev->dev);
+	tmp = sh_msiof_read(p, TMDR1) & ~clr;
+	sh_msiof_write(p, TMDR1, tmp | set);
 	pm_runtime_put(&p->pdev->dev);
-
+	p->native_cs_high = spi->mode & SPI_CS_HIGH;
+	p->native_cs_inited = true;
 	return 0;
 }
 
@@ -560,13 +582,20 @@ static int sh_msiof_prepare_message(struct spi_master *master,
 {
 	struct sh_msiof_spi_priv *p = spi_master_get_devdata(master);
 	const struct spi_device *spi = msg->spi;
+	u32 ss, cs_high;
 
 	/* Configure pins before asserting CS */
-	sh_msiof_spi_set_pin_regs(p, !!(spi->mode & SPI_CPOL),
+	if (gpio_is_valid(spi->cs_gpio)) {
+		ss = p->unused_ss;
+		cs_high = p->native_cs_high;
+	} else {
+		ss = spi->chip_select;
+		cs_high = !!(spi->mode & SPI_CS_HIGH);
+	}
+	sh_msiof_spi_set_pin_regs(p, ss, !!(spi->mode & SPI_CPOL),
 				  !!(spi->mode & SPI_CPHA),
 				  !!(spi->mode & SPI_3WIRE),
-				  !!(spi->mode & SPI_LSB_FIRST),
-				  !!(spi->mode & SPI_CS_HIGH));
+				  !!(spi->mode & SPI_LSB_FIRST), cs_high);
 	return 0;
 }
 
@@ -784,11 +813,21 @@ static int sh_msiof_dma_once(struct sh_msiof_spi_priv *p, const void *tx,
 		goto stop_dma;
 	}
 
-	/* wait for tx fifo to be emptied / rx fifo to be filled */
+	/* wait for tx/rx DMA completion */
 	ret = sh_msiof_wait_for_completion(p);
 	if (ret)
 		goto stop_reset;
 
+	if (!rx) {
+		reinit_completion(&p->done);
+		sh_msiof_write(p, IER, IER_TEOFE);
+
+		/* wait for tx fifo to be emptied */
+		ret = sh_msiof_wait_for_completion(p);
+		if (ret)
+			goto stop_reset;
+	}
+
 	/* clear status bits */
 	sh_msiof_reset_str(p);
 
@@ -912,9 +951,8 @@ static int sh_msiof_transfer_one(struct spi_master *master,
 
 		ret = sh_msiof_dma_once(p, tx_buf, rx_buf, l);
 		if (ret == -EAGAIN) {
-			pr_warn_once("%s %s: DMA not available, falling back to PIO\n",
-				     dev_driver_string(&p->pdev->dev),
-				     dev_name(&p->pdev->dev));
+			dev_warn_once(&p->pdev->dev,
+				"DMA not available, falling back to PIO\n");
 			break;
 		}
 		if (ret)
@@ -1071,6 +1109,45 @@ static struct sh_msiof_spi_info *sh_msiof_spi_parse_dt(struct device *dev)
 }
 #endif
 
+static int sh_msiof_get_cs_gpios(struct sh_msiof_spi_priv *p)
+{
+	struct device *dev = &p->pdev->dev;
+	unsigned int used_ss_mask = 0;
+	unsigned int cs_gpios = 0;
+	unsigned int num_cs, i;
+	int ret;
+
+	ret = gpiod_count(dev, "cs");
+	if (ret <= 0)
+		return 0;
+
+	num_cs = max_t(unsigned int, ret, p->master->num_chipselect);
+	for (i = 0; i < num_cs; i++) {
+		struct gpio_desc *gpiod;
+
+		gpiod = devm_gpiod_get_index(dev, "cs", i, GPIOD_ASIS);
+		if (!IS_ERR(gpiod)) {
+			cs_gpios++;
+			continue;
+		}
+
+		if (PTR_ERR(gpiod) != -ENOENT)
+			return PTR_ERR(gpiod);
+
+		if (i >= MAX_SS) {
+			dev_err(dev, "Invalid native chip select %d\n", i);
+			return -EINVAL;
+		}
+		used_ss_mask |= BIT(i);
+	}
+	p->unused_ss = ffz(used_ss_mask);
+	if (cs_gpios && p->unused_ss >= MAX_SS) {
+		dev_err(dev, "No unused native chip select available\n");
+		return -EINVAL;
+	}
+	return 0;
+}
+
 static struct dma_chan *sh_msiof_request_dma_chan(struct device *dev,
 	enum dma_transfer_direction dir, unsigned int id, dma_addr_t port_addr)
 {
@@ -1284,13 +1361,18 @@ static int sh_msiof_spi_probe(struct platform_device *pdev)
 	if (p->info->rx_fifo_override)
 		p->rx_fifo_size = p->info->rx_fifo_override;
 
+	/* Setup GPIO chip selects */
+	master->num_chipselect = p->info->num_chipselect;
+	ret = sh_msiof_get_cs_gpios(p);
+	if (ret)
+		goto err1;
+
 	/* init master code */
 	master->mode_bits = SPI_CPOL | SPI_CPHA | SPI_CS_HIGH;
 	master->mode_bits |= SPI_LSB_FIRST | SPI_3WIRE;
 	master->flags = chipdata->master_flags;
 	master->bus_num = pdev->id;
 	master->dev.of_node = pdev->dev.of_node;
-	master->num_chipselect = p->info->num_chipselect;
 	master->setup = sh_msiof_spi_setup;
 	master->prepare_message = sh_msiof_prepare_message;
 	master->slave_abort = sh_msiof_slave_abort;
diff --git a/drivers/spi/spi-sirf.c b/drivers/spi/spi-sirf.c
index bbb1a27..f009d76 100644
--- a/drivers/spi/spi-sirf.c
+++ b/drivers/spi/spi-sirf.c
@@ -1072,7 +1072,7 @@ static int spi_sirfsoc_probe(struct platform_device *pdev)
 	struct sirfsoc_spi *sspi;
 	struct spi_master *master;
 	struct resource *mem_res;
-	struct sirf_spi_comp_data *spi_comp_data;
+	const struct sirf_spi_comp_data *spi_comp_data;
 	int irq;
 	int ret;
 	const struct of_device_id *match;
@@ -1092,7 +1092,7 @@ static int spi_sirfsoc_probe(struct platform_device *pdev)
 	platform_set_drvdata(pdev, master);
 	sspi = spi_master_get_devdata(master);
 	sspi->fifo_full_offset = ilog2(sspi->fifo_size);
-	spi_comp_data = (struct sirf_spi_comp_data *)match->data;
+	spi_comp_data = match->data;
 	sspi->regs = spi_comp_data->regs;
 	sspi->type = spi_comp_data->type;
 	sspi->fifo_level_chk_mask = (sspi->fifo_size / 4) - 1;
diff --git a/drivers/spi/spi-sun6i.c b/drivers/spi/spi-sun6i.c
index fb38234..8533f4e 100644
--- a/drivers/spi/spi-sun6i.c
+++ b/drivers/spi/spi-sun6i.c
@@ -541,7 +541,7 @@ static int sun6i_spi_probe(struct platform_device *pdev)
 
 static int sun6i_spi_remove(struct platform_device *pdev)
 {
-	pm_runtime_disable(&pdev->dev);
+	pm_runtime_force_suspend(&pdev->dev);
 
 	return 0;
 }
diff --git a/drivers/spi/spi-xilinx.c b/drivers/spi/spi-xilinx.c
index e0b9fe1..63fedc4 100644
--- a/drivers/spi/spi-xilinx.c
+++ b/drivers/spi/spi-xilinx.c
@@ -381,6 +381,7 @@ static int xilinx_spi_find_buffer_size(struct xilinx_spi *xspi)
 }
 
 static const struct of_device_id xilinx_spi_of_match[] = {
+	{ .compatible = "xlnx,axi-quad-spi-1.00.a", },
 	{ .compatible = "xlnx,xps-spi-2.00.a", },
 	{ .compatible = "xlnx,xps-spi-2.00.b", },
 	{}
diff --git a/drivers/ssb/Kconfig b/drivers/ssb/Kconfig
index d8e4219..71c7376 100644
--- a/drivers/ssb/Kconfig
+++ b/drivers/ssb/Kconfig
@@ -32,7 +32,7 @@
 
 config SSB_PCIHOST_POSSIBLE
 	bool
-	depends on SSB && (PCI = y || PCI = SSB)
+	depends on SSB && (PCI = y || PCI = SSB) && PCI_DRIVERS_LEGACY
 	default y
 
 config SSB_PCIHOST
diff --git a/drivers/staging/android/ashmem.c b/drivers/staging/android/ashmem.c
index 0f695df..372ce99 100644
--- a/drivers/staging/android/ashmem.c
+++ b/drivers/staging/android/ashmem.c
@@ -765,10 +765,12 @@ static long ashmem_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 		break;
 	case ASHMEM_SET_SIZE:
 		ret = -EINVAL;
+		mutex_lock(&ashmem_mutex);
 		if (!asma->file) {
 			ret = 0;
 			asma->size = (size_t)arg;
 		}
+		mutex_unlock(&ashmem_mutex);
 		break;
 	case ASHMEM_GET_SIZE:
 		ret = asma->size;
diff --git a/drivers/staging/lustre/lustre/llite/dir.c b/drivers/staging/lustre/lustre/llite/dir.c
index 5b2e47c..6f59045 100644
--- a/drivers/staging/lustre/lustre/llite/dir.c
+++ b/drivers/staging/lustre/lustre/llite/dir.c
@@ -369,8 +369,6 @@ static int ll_readdir(struct file *filp, struct dir_context *ctx)
 	}
 	ctx->pos = pos;
 	ll_finish_md_op_data(op_data);
-	filp->f_version = inode->i_version;
-
 out:
 	if (!rc)
 		ll_stats_ops_tally(sbi, LPROC_LL_READDIR, 1);
@@ -1678,7 +1676,6 @@ static loff_t ll_dir_seek(struct file *file, loff_t offset, int origin)
 			else
 				fd->lfd_pos = offset;
 			file->f_pos = offset;
-			file->f_version = 0;
 		}
 		ret = offset;
 	}
diff --git a/drivers/staging/mt29f_spinand/mt29f_spinand.c b/drivers/staging/mt29f_spinand/mt29f_spinand.c
index 87595c5..264ad36 100644
--- a/drivers/staging/mt29f_spinand/mt29f_spinand.c
+++ b/drivers/staging/mt29f_spinand/mt29f_spinand.c
@@ -637,8 +637,7 @@ static int spinand_write_page_hwecc(struct mtd_info *mtd,
 	int eccsteps = chip->ecc.steps;
 
 	enable_hw_ecc = 1;
-	chip->write_buf(mtd, p, eccsize * eccsteps);
-	return 0;
+	return nand_prog_page_op(chip, page, 0, p, eccsize * eccsteps);
 }
 
 static int spinand_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
@@ -653,7 +652,7 @@ static int spinand_read_page_hwecc(struct mtd_info *mtd, struct nand_chip *chip,
 
 	enable_read_hw_ecc = 1;
 
-	chip->read_buf(mtd, p, eccsize * eccsteps);
+	nand_read_page_op(chip, page, 0, p, eccsize * eccsteps);
 	if (oob_required)
 		chip->read_buf(mtd, chip->oob_poi, mtd->oobsize);
 
diff --git a/drivers/target/Kconfig b/drivers/target/Kconfig
index e2bc999..4c44d7b 100644
--- a/drivers/target/Kconfig
+++ b/drivers/target/Kconfig
@@ -5,6 +5,7 @@
 	select CONFIGFS_FS
 	select CRC_T10DIF
 	select BLK_SCSI_REQUEST # only for scsi_command_size_tbl..
+	select SGL_ALLOC
 	default n
 	help
 	Say Y or M here to enable the TCM Storage Engine and ConfigFS enabled
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c
index 58caacd..c03a78e 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -2300,13 +2300,7 @@ static void target_complete_ok_work(struct work_struct *work)
 
 void target_free_sgl(struct scatterlist *sgl, int nents)
 {
-	struct scatterlist *sg;
-	int count;
-
-	for_each_sg(sgl, sg, nents, count)
-		__free_page(sg_page(sg));
-
-	kfree(sgl);
+	sgl_free_n_order(sgl, nents, 0);
 }
 EXPORT_SYMBOL(target_free_sgl);
 
@@ -2414,42 +2408,10 @@ int
 target_alloc_sgl(struct scatterlist **sgl, unsigned int *nents, u32 length,
 		 bool zero_page, bool chainable)
 {
-	struct scatterlist *sg;
-	struct page *page;
-	gfp_t zero_flag = (zero_page) ? __GFP_ZERO : 0;
-	unsigned int nalloc, nent;
-	int i = 0;
+	gfp_t gfp = GFP_KERNEL | (zero_page ? __GFP_ZERO : 0);
 
-	nalloc = nent = DIV_ROUND_UP(length, PAGE_SIZE);
-	if (chainable)
-		nalloc++;
-	sg = kmalloc_array(nalloc, sizeof(struct scatterlist), GFP_KERNEL);
-	if (!sg)
-		return -ENOMEM;
-
-	sg_init_table(sg, nalloc);
-
-	while (length) {
-		u32 page_len = min_t(u32, length, PAGE_SIZE);
-		page = alloc_page(GFP_KERNEL | zero_flag);
-		if (!page)
-			goto out;
-
-		sg_set_page(&sg[i], page, page_len, 0);
-		length -= page_len;
-		i++;
-	}
-	*sgl = sg;
-	*nents = nent;
-	return 0;
-
-out:
-	while (i > 0) {
-		i--;
-		__free_page(sg_page(&sg[i]));
-	}
-	kfree(sg);
-	return -ENOMEM;
+	*sgl = sgl_alloc_order(length, 0, chainable, gfp, nents);
+	return *sgl ? 0 : -ENOMEM;
 }
 EXPORT_SYMBOL(target_alloc_sgl);
 
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index dc63aba..dfd2324 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -88,7 +88,6 @@ struct time_in_idle {
  * @policy: cpufreq policy.
  * @node: list_head to link all cpufreq_cooling_device together.
  * @idle_time: idle time stats
- * @plat_get_static_power: callback to calculate the static power
  *
  * This structure is required for keeping information of each registered
  * cpufreq_cooling_device.
@@ -104,7 +103,6 @@ struct cpufreq_cooling_device {
 	struct cpufreq_policy *policy;
 	struct list_head node;
 	struct time_in_idle *idle_time;
-	get_static_t plat_get_static_power;
 };
 
 static DEFINE_IDA(cpufreq_ida);
@@ -319,60 +317,6 @@ static u32 get_load(struct cpufreq_cooling_device *cpufreq_cdev, int cpu,
 }
 
 /**
- * get_static_power() - calculate the static power consumed by the cpus
- * @cpufreq_cdev:	struct &cpufreq_cooling_device for this cpu cdev
- * @tz:		thermal zone device in which we're operating
- * @freq:	frequency in KHz
- * @power:	pointer in which to store the calculated static power
- *
- * Calculate the static power consumed by the cpus described by
- * @cpu_actor running at frequency @freq.  This function relies on a
- * platform specific function that should have been provided when the
- * actor was registered.  If it wasn't, the static power is assumed to
- * be negligible.  The calculated static power is stored in @power.
- *
- * Return: 0 on success, -E* on failure.
- */
-static int get_static_power(struct cpufreq_cooling_device *cpufreq_cdev,
-			    struct thermal_zone_device *tz, unsigned long freq,
-			    u32 *power)
-{
-	struct dev_pm_opp *opp;
-	unsigned long voltage;
-	struct cpufreq_policy *policy = cpufreq_cdev->policy;
-	struct cpumask *cpumask = policy->related_cpus;
-	unsigned long freq_hz = freq * 1000;
-	struct device *dev;
-
-	if (!cpufreq_cdev->plat_get_static_power) {
-		*power = 0;
-		return 0;
-	}
-
-	dev = get_cpu_device(policy->cpu);
-	WARN_ON(!dev);
-
-	opp = dev_pm_opp_find_freq_exact(dev, freq_hz, true);
-	if (IS_ERR(opp)) {
-		dev_warn_ratelimited(dev, "Failed to find OPP for frequency %lu: %ld\n",
-				     freq_hz, PTR_ERR(opp));
-		return -EINVAL;
-	}
-
-	voltage = dev_pm_opp_get_voltage(opp);
-	dev_pm_opp_put(opp);
-
-	if (voltage == 0) {
-		dev_err_ratelimited(dev, "Failed to get voltage for frequency %lu\n",
-				    freq_hz);
-		return -EINVAL;
-	}
-
-	return cpufreq_cdev->plat_get_static_power(cpumask, tz->passive_delay,
-						  voltage, power);
-}
-
-/**
  * get_dynamic_power() - calculate the dynamic power
  * @cpufreq_cdev:	&cpufreq_cooling_device for this cdev
  * @freq:	current frequency
@@ -491,8 +435,8 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
 				       u32 *power)
 {
 	unsigned long freq;
-	int i = 0, cpu, ret;
-	u32 static_power, dynamic_power, total_load = 0;
+	int i = 0, cpu;
+	u32 total_load = 0;
 	struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
 	struct cpufreq_policy *policy = cpufreq_cdev->policy;
 	u32 *load_cpu = NULL;
@@ -522,22 +466,15 @@ static int cpufreq_get_requested_power(struct thermal_cooling_device *cdev,
 
 	cpufreq_cdev->last_load = total_load;
 
-	dynamic_power = get_dynamic_power(cpufreq_cdev, freq);
-	ret = get_static_power(cpufreq_cdev, tz, freq, &static_power);
-	if (ret) {
-		kfree(load_cpu);
-		return ret;
-	}
+	*power = get_dynamic_power(cpufreq_cdev, freq);
 
 	if (load_cpu) {
 		trace_thermal_power_cpu_get_power(policy->related_cpus, freq,
-						  load_cpu, i, dynamic_power,
-						  static_power);
+						  load_cpu, i, *power);
 
 		kfree(load_cpu);
 	}
 
-	*power = static_power + dynamic_power;
 	return 0;
 }
 
@@ -561,8 +498,6 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
 			       unsigned long state, u32 *power)
 {
 	unsigned int freq, num_cpus;
-	u32 static_power, dynamic_power;
-	int ret;
 	struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
 
 	/* Request state should be less than max_level */
@@ -572,13 +507,9 @@ static int cpufreq_state2power(struct thermal_cooling_device *cdev,
 	num_cpus = cpumask_weight(cpufreq_cdev->policy->cpus);
 
 	freq = cpufreq_cdev->freq_table[state].frequency;
-	dynamic_power = cpu_freq_to_power(cpufreq_cdev, freq) * num_cpus;
-	ret = get_static_power(cpufreq_cdev, tz, freq, &static_power);
-	if (ret)
-		return ret;
+	*power = cpu_freq_to_power(cpufreq_cdev, freq) * num_cpus;
 
-	*power = static_power + dynamic_power;
-	return ret;
+	return 0;
 }
 
 /**
@@ -606,21 +537,14 @@ static int cpufreq_power2state(struct thermal_cooling_device *cdev,
 			       unsigned long *state)
 {
 	unsigned int cur_freq, target_freq;
-	int ret;
-	s32 dyn_power;
-	u32 last_load, normalised_power, static_power;
+	u32 last_load, normalised_power;
 	struct cpufreq_cooling_device *cpufreq_cdev = cdev->devdata;
 	struct cpufreq_policy *policy = cpufreq_cdev->policy;
 
 	cur_freq = cpufreq_quick_get(policy->cpu);
-	ret = get_static_power(cpufreq_cdev, tz, cur_freq, &static_power);
-	if (ret)
-		return ret;
-
-	dyn_power = power - static_power;
-	dyn_power = dyn_power > 0 ? dyn_power : 0;
+	power = power > 0 ? power : 0;
 	last_load = cpufreq_cdev->last_load ?: 1;
-	normalised_power = (dyn_power * 100) / last_load;
+	normalised_power = (power * 100) / last_load;
 	target_freq = cpu_power_to_freq(cpufreq_cdev, normalised_power);
 
 	*state = get_level(cpufreq_cdev, target_freq);
@@ -671,8 +595,6 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
  * @policy: cpufreq policy
  * Normally this should be same as cpufreq policy->related_cpus.
  * @capacitance: dynamic power coefficient for these cpus
- * @plat_static_func: function to calculate the static power consumed by these
- *                    cpus (optional)
  *
  * This interface function registers the cpufreq cooling device with the name
  * "thermal-cpufreq-%x". This api can support multiple instances of cpufreq
@@ -684,8 +606,7 @@ static unsigned int find_next_max(struct cpufreq_frequency_table *table,
  */
 static struct thermal_cooling_device *
 __cpufreq_cooling_register(struct device_node *np,
-			struct cpufreq_policy *policy, u32 capacitance,
-			get_static_t plat_static_func)
+			struct cpufreq_policy *policy, u32 capacitance)
 {
 	struct thermal_cooling_device *cdev;
 	struct cpufreq_cooling_device *cpufreq_cdev;
@@ -755,8 +676,6 @@ __cpufreq_cooling_register(struct device_node *np,
 	}
 
 	if (capacitance) {
-		cpufreq_cdev->plat_get_static_power = plat_static_func;
-
 		ret = update_freq_table(cpufreq_cdev, capacitance);
 		if (ret) {
 			cdev = ERR_PTR(ret);
@@ -813,13 +732,12 @@ __cpufreq_cooling_register(struct device_node *np,
 struct thermal_cooling_device *
 cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
-	return __cpufreq_cooling_register(NULL, policy, 0, NULL);
+	return __cpufreq_cooling_register(NULL, policy, 0);
 }
 EXPORT_SYMBOL_GPL(cpufreq_cooling_register);
 
 /**
  * of_cpufreq_cooling_register - function to create cpufreq cooling device.
- * @np: a valid struct device_node to the cooling device device tree node
  * @policy: cpufreq policy
  *
  * This interface function registers the cpufreq cooling device with the name
@@ -827,86 +745,45 @@ EXPORT_SYMBOL_GPL(cpufreq_cooling_register);
  * cooling devices. Using this API, the cpufreq cooling device will be
  * linked to the device tree node provided.
  *
- * Return: a valid struct thermal_cooling_device pointer on success,
- * on failure, it returns a corresponding ERR_PTR().
- */
-struct thermal_cooling_device *
-of_cpufreq_cooling_register(struct device_node *np,
-			    struct cpufreq_policy *policy)
-{
-	if (!np)
-		return ERR_PTR(-EINVAL);
-
-	return __cpufreq_cooling_register(np, policy, 0, NULL);
-}
-EXPORT_SYMBOL_GPL(of_cpufreq_cooling_register);
-
-/**
- * cpufreq_power_cooling_register() - create cpufreq cooling device with power extensions
- * @policy:		cpufreq policy
- * @capacitance:	dynamic power coefficient for these cpus
- * @plat_static_func:	function to calculate the static power consumed by these
- *			cpus (optional)
- *
- * This interface function registers the cpufreq cooling device with
- * the name "thermal-cpufreq-%x".  This api can support multiple
- * instances of cpufreq cooling devices.  Using this function, the
- * cooling device will implement the power extensions by using a
- * simple cpu power model.  The cpus must have registered their OPPs
- * using the OPP library.
- *
- * An optional @plat_static_func may be provided to calculate the
- * static power consumed by these cpus.  If the platform's static
- * power consumption is unknown or negligible, make it NULL.
- *
- * Return: a valid struct thermal_cooling_device pointer on success,
- * on failure, it returns a corresponding ERR_PTR().
- */
-struct thermal_cooling_device *
-cpufreq_power_cooling_register(struct cpufreq_policy *policy, u32 capacitance,
-			       get_static_t plat_static_func)
-{
-	return __cpufreq_cooling_register(NULL, policy, capacitance,
-				plat_static_func);
-}
-EXPORT_SYMBOL(cpufreq_power_cooling_register);
-
-/**
- * of_cpufreq_power_cooling_register() - create cpufreq cooling device with power extensions
- * @np:	a valid struct device_node to the cooling device device tree node
- * @policy: cpufreq policy
- * @capacitance:	dynamic power coefficient for these cpus
- * @plat_static_func:	function to calculate the static power consumed by these
- *			cpus (optional)
- *
- * This interface function registers the cpufreq cooling device with
- * the name "thermal-cpufreq-%x".  This api can support multiple
- * instances of cpufreq cooling devices.  Using this API, the cpufreq
- * cooling device will be linked to the device tree node provided.
  * Using this function, the cooling device will implement the power
  * extensions by using a simple cpu power model.  The cpus must have
  * registered their OPPs using the OPP library.
  *
- * An optional @plat_static_func may be provided to calculate the
- * static power consumed by these cpus.  If the platform's static
- * power consumption is unknown or negligible, make it NULL.
+ * It also takes into account, if property present in policy CPU node, the
+ * static power consumed by the cpu.
  *
  * Return: a valid struct thermal_cooling_device pointer on success,
- * on failure, it returns a corresponding ERR_PTR().
+ * and NULL on failure.
  */
 struct thermal_cooling_device *
-of_cpufreq_power_cooling_register(struct device_node *np,
-				  struct cpufreq_policy *policy,
-				  u32 capacitance,
-				  get_static_t plat_static_func)
+of_cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
-	if (!np)
-		return ERR_PTR(-EINVAL);
+	struct device_node *np = of_get_cpu_node(policy->cpu, NULL);
+	struct thermal_cooling_device *cdev = NULL;
+	u32 capacitance = 0;
 
-	return __cpufreq_cooling_register(np, policy, capacitance,
-				plat_static_func);
+	if (!np) {
+		pr_err("cpu_cooling: OF node not available for cpu%d\n",
+		       policy->cpu);
+		return NULL;
+	}
+
+	if (of_find_property(np, "#cooling-cells", NULL)) {
+		of_property_read_u32(np, "dynamic-power-coefficient",
+				     &capacitance);
+
+		cdev = __cpufreq_cooling_register(np, policy, capacitance);
+		if (IS_ERR(cdev)) {
+			pr_err("cpu_cooling: cpu%d is not running as cooling device: %ld\n",
+			       policy->cpu, PTR_ERR(cdev));
+			cdev = NULL;
+		}
+	}
+
+	of_node_put(np);
+	return cdev;
 }
-EXPORT_SYMBOL(of_cpufreq_power_cooling_register);
+EXPORT_SYMBOL_GPL(of_cpufreq_cooling_register);
 
 /**
  * cpufreq_cooling_unregister - function to remove cpufreq cooling device.
diff --git a/drivers/tty/serdev/core.c b/drivers/tty/serdev/core.c
index 1bef398..28133db 100644
--- a/drivers/tty/serdev/core.c
+++ b/drivers/tty/serdev/core.c
@@ -132,6 +132,33 @@ void serdev_device_close(struct serdev_device *serdev)
 }
 EXPORT_SYMBOL_GPL(serdev_device_close);
 
+static void devm_serdev_device_release(struct device *dev, void *dr)
+{
+	serdev_device_close(*(struct serdev_device **)dr);
+}
+
+int devm_serdev_device_open(struct device *dev, struct serdev_device *serdev)
+{
+	struct serdev_device **dr;
+	int ret;
+
+	dr = devres_alloc(devm_serdev_device_release, sizeof(*dr), GFP_KERNEL);
+	if (!dr)
+		return -ENOMEM;
+
+	ret = serdev_device_open(serdev);
+	if (ret) {
+		devres_free(dr);
+		return ret;
+	}
+
+	*dr = serdev;
+	devres_add(dev, dr);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(devm_serdev_device_open);
+
 void serdev_device_write_wakeup(struct serdev_device *serdev)
 {
 	complete(&serdev->write_comp);
@@ -268,8 +295,8 @@ static int serdev_drv_probe(struct device *dev)
 static int serdev_drv_remove(struct device *dev)
 {
 	const struct serdev_device_driver *sdrv = to_serdev_device_driver(dev->driver);
-
-	sdrv->remove(to_serdev_device(dev));
+	if (sdrv->remove)
+		sdrv->remove(to_serdev_device(dev));
 	return 0;
 }
 
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
index c5bce8e..5780fba 100644
--- a/drivers/usb/gadget/function/f_ncm.c
+++ b/drivers/usb/gadget/function/f_ncm.c
@@ -73,9 +73,7 @@ struct f_ncm {
 	struct sk_buff			*skb_tx_ndp;
 	u16				ndp_dgram_count;
 	bool				timer_force_tx;
-	struct tasklet_struct		tx_tasklet;
 	struct hrtimer			task_timer;
-
 	bool				timer_stopping;
 };
 
@@ -1104,7 +1102,7 @@ static struct sk_buff *ncm_wrap_ntb(struct gether *port,
 
 		/* Delay the timer. */
 		hrtimer_start(&ncm->task_timer, TX_TIMEOUT_NSECS,
-			      HRTIMER_MODE_REL);
+			      HRTIMER_MODE_REL_SOFT);
 
 		/* Add the datagram position entries */
 		ntb_ndp = skb_put_zero(ncm->skb_tx_ndp, dgram_idx_len);
@@ -1148,17 +1146,15 @@ static struct sk_buff *ncm_wrap_ntb(struct gether *port,
 }
 
 /*
- * This transmits the NTB if there are frames waiting.
+ * The transmit should only be run if no skb data has been sent
+ * for a certain duration.
  */
-static void ncm_tx_tasklet(unsigned long data)
+static enum hrtimer_restart ncm_tx_timeout(struct hrtimer *data)
 {
-	struct f_ncm	*ncm = (void *)data;
-
-	if (ncm->timer_stopping)
-		return;
+	struct f_ncm *ncm = container_of(data, struct f_ncm, task_timer);
 
 	/* Only send if data is available. */
-	if (ncm->skb_tx_data) {
+	if (!ncm->timer_stopping && ncm->skb_tx_data) {
 		ncm->timer_force_tx = true;
 
 		/* XXX This allowance of a NULL skb argument to ndo_start_xmit
@@ -1171,16 +1167,6 @@ static void ncm_tx_tasklet(unsigned long data)
 
 		ncm->timer_force_tx = false;
 	}
-}
-
-/*
- * The transmit should only be run if no skb data has been sent
- * for a certain duration.
- */
-static enum hrtimer_restart ncm_tx_timeout(struct hrtimer *data)
-{
-	struct f_ncm *ncm = container_of(data, struct f_ncm, task_timer);
-	tasklet_schedule(&ncm->tx_tasklet);
 	return HRTIMER_NORESTART;
 }
 
@@ -1513,8 +1499,7 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f)
 	ncm->port.open = ncm_open;
 	ncm->port.close = ncm_close;
 
-	tasklet_init(&ncm->tx_tasklet, ncm_tx_tasklet, (unsigned long) ncm);
-	hrtimer_init(&ncm->task_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer_init(&ncm->task_timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_SOFT);
 	ncm->task_timer.function = ncm_tx_timeout;
 
 	DBG(cdev, "CDC Network: %s speed IN/%s OUT/%s NOTIFY/%s\n",
@@ -1623,7 +1608,6 @@ static void ncm_unbind(struct usb_configuration *c, struct usb_function *f)
 	DBG(c->cdev, "ncm unbind\n");
 
 	hrtimer_cancel(&ncm->task_timer);
-	tasklet_kill(&ncm->tx_tasklet);
 
 	ncm_string_defs[0].id = 0;
 	usb_free_all_descriptors(f);
diff --git a/drivers/usb/gadget/udc/core.c b/drivers/usb/gadget/udc/core.c
index 93eff7de..1b3efb1 100644
--- a/drivers/usb/gadget/udc/core.c
+++ b/drivers/usb/gadget/udc/core.c
@@ -1147,11 +1147,7 @@ int usb_add_gadget_udc_release(struct device *parent, struct usb_gadget *gadget,
 
 	udc = kzalloc(sizeof(*udc), GFP_KERNEL);
 	if (!udc)
-		goto err1;
-
-	ret = device_add(&gadget->dev);
-	if (ret)
-		goto err2;
+		goto err_put_gadget;
 
 	device_initialize(&udc->dev);
 	udc->dev.release = usb_udc_release;
@@ -1160,7 +1156,11 @@ int usb_add_gadget_udc_release(struct device *parent, struct usb_gadget *gadget,
 	udc->dev.parent = parent;
 	ret = dev_set_name(&udc->dev, "%s", kobject_name(&parent->kobj));
 	if (ret)
-		goto err3;
+		goto err_put_udc;
+
+	ret = device_add(&gadget->dev);
+	if (ret)
+		goto err_put_udc;
 
 	udc->gadget = gadget;
 	gadget->udc = udc;
@@ -1170,7 +1170,7 @@ int usb_add_gadget_udc_release(struct device *parent, struct usb_gadget *gadget,
 
 	ret = device_add(&udc->dev);
 	if (ret)
-		goto err4;
+		goto err_unlist_udc;
 
 	usb_gadget_set_state(gadget, USB_STATE_NOTATTACHED);
 	udc->vbus = true;
@@ -1178,27 +1178,25 @@ int usb_add_gadget_udc_release(struct device *parent, struct usb_gadget *gadget,
 	/* pick up one of pending gadget drivers */
 	ret = check_pending_gadget_drivers(udc);
 	if (ret)
-		goto err5;
+		goto err_del_udc;
 
 	mutex_unlock(&udc_lock);
 
 	return 0;
 
-err5:
+ err_del_udc:
 	device_del(&udc->dev);
 
-err4:
+ err_unlist_udc:
 	list_del(&udc->list);
 	mutex_unlock(&udc_lock);
 
-err3:
-	put_device(&udc->dev);
 	device_del(&gadget->dev);
 
-err2:
-	kfree(udc);
+ err_put_udc:
+	put_device(&udc->dev);
 
-err1:
+ err_put_gadget:
 	put_device(&gadget->dev);
 	return ret;
 }
diff --git a/drivers/usb/misc/usb3503.c b/drivers/usb/misc/usb3503.c
index 465dbf6..f723f7b 100644
--- a/drivers/usb/misc/usb3503.c
+++ b/drivers/usb/misc/usb3503.c
@@ -279,6 +279,8 @@ static int usb3503_probe(struct usb3503 *hub)
 	if (gpio_is_valid(hub->gpio_reset)) {
 		err = devm_gpio_request_one(dev, hub->gpio_reset,
 				GPIOF_OUT_INIT_LOW, "usb3503 reset");
+		/* Datasheet defines a hardware reset to be at least 100us */
+		usleep_range(100, 10000);
 		if (err) {
 			dev_err(dev,
 				"unable to request GPIO %d as reset pin (%d)\n",
diff --git a/drivers/usb/mon/mon_bin.c b/drivers/usb/mon/mon_bin.c
index f6ae753..f932f40 100644
--- a/drivers/usb/mon/mon_bin.c
+++ b/drivers/usb/mon/mon_bin.c
@@ -1004,7 +1004,9 @@ static long mon_bin_ioctl(struct file *file, unsigned int cmd, unsigned long arg
 		break;
 
 	case MON_IOCQ_RING_SIZE:
+		mutex_lock(&rp->fetch_lock);
 		ret = rp->b_size;
+		mutex_unlock(&rp->fetch_lock);
 		break;
 
 	case MON_IOCT_RING_SIZE:
@@ -1231,12 +1233,16 @@ static int mon_bin_vma_fault(struct vm_fault *vmf)
 	unsigned long offset, chunk_idx;
 	struct page *pageptr;
 
+	mutex_lock(&rp->fetch_lock);
 	offset = vmf->pgoff << PAGE_SHIFT;
-	if (offset >= rp->b_size)
+	if (offset >= rp->b_size) {
+		mutex_unlock(&rp->fetch_lock);
 		return VM_FAULT_SIGBUS;
+	}
 	chunk_idx = offset / CHUNK_SIZE;
 	pageptr = rp->b_vec[chunk_idx].pg;
 	get_page(pageptr);
+	mutex_unlock(&rp->fetch_lock);
 	vmf->page = pageptr;
 	return 0;
 }
diff --git a/drivers/usb/serial/cp210x.c b/drivers/usb/serial/cp210x.c
index 7c6273b..06d502b 100644
--- a/drivers/usb/serial/cp210x.c
+++ b/drivers/usb/serial/cp210x.c
@@ -124,6 +124,7 @@ static const struct usb_device_id id_table[] = {
 	{ USB_DEVICE(0x10C4, 0x8470) }, /* Juniper Networks BX Series System Console */
 	{ USB_DEVICE(0x10C4, 0x8477) }, /* Balluff RFID */
 	{ USB_DEVICE(0x10C4, 0x84B6) }, /* Starizona Hyperion */
+	{ USB_DEVICE(0x10C4, 0x85A7) }, /* LifeScan OneTouch Verio IQ */
 	{ USB_DEVICE(0x10C4, 0x85EA) }, /* AC-Services IBUS-IF */
 	{ USB_DEVICE(0x10C4, 0x85EB) }, /* AC-Services CIS-IBUS */
 	{ USB_DEVICE(0x10C4, 0x85F8) }, /* Virtenio Preon32 */
@@ -174,6 +175,7 @@ static const struct usb_device_id id_table[] = {
 	{ USB_DEVICE(0x1843, 0x0200) }, /* Vaisala USB Instrument Cable */
 	{ USB_DEVICE(0x18EF, 0xE00F) }, /* ELV USB-I2C-Interface */
 	{ USB_DEVICE(0x18EF, 0xE025) }, /* ELV Marble Sound Board 1 */
+	{ USB_DEVICE(0x18EF, 0xE030) }, /* ELV ALC 8xxx Battery Charger */
 	{ USB_DEVICE(0x18EF, 0xE032) }, /* ELV TFD500 Data Logger */
 	{ USB_DEVICE(0x1901, 0x0190) }, /* GE B850 CP2105 Recorder interface */
 	{ USB_DEVICE(0x1901, 0x0193) }, /* GE B650 CP2104 PMC interface */
diff --git a/drivers/usb/storage/unusual_uas.h b/drivers/usb/storage/unusual_uas.h
index e6127fb..a7d08ae 100644
--- a/drivers/usb/storage/unusual_uas.h
+++ b/drivers/usb/storage/unusual_uas.h
@@ -143,6 +143,13 @@ UNUSUAL_DEV(0x2109, 0x0711, 0x0000, 0x9999,
 		USB_SC_DEVICE, USB_PR_DEVICE, NULL,
 		US_FL_NO_ATA_1X),
 
+/* Reported-by: Icenowy Zheng <icenowy@aosc.io> */
+UNUSUAL_DEV(0x2537, 0x1068, 0x0000, 0x9999,
+		"Norelsys",
+		"NS1068X",
+		USB_SC_DEVICE, USB_PR_DEVICE, NULL,
+		US_FL_IGNORE_UAS),
+
 /* Reported-by: Takeo Nakayama <javhera@gmx.com> */
 UNUSUAL_DEV(0x357d, 0x7788, 0x0000, 0x9999,
 		"JMicron",
diff --git a/drivers/usb/usbip/usbip_common.c b/drivers/usb/usbip/usbip_common.c
index 7b219d9..ee2bbce 100644
--- a/drivers/usb/usbip/usbip_common.c
+++ b/drivers/usb/usbip/usbip_common.c
@@ -91,7 +91,7 @@ static void usbip_dump_usb_device(struct usb_device *udev)
 	dev_dbg(dev, "       devnum(%d) devpath(%s) usb speed(%s)",
 		udev->devnum, udev->devpath, usb_speed_string(udev->speed));
 
-	pr_debug("tt %p, ttport %d\n", udev->tt, udev->ttport);
+	pr_debug("tt hub ttport %d\n", udev->ttport);
 
 	dev_dbg(dev, "                    ");
 	for (i = 0; i < 16; i++)
@@ -124,12 +124,8 @@ static void usbip_dump_usb_device(struct usb_device *udev)
 	}
 	pr_debug("\n");
 
-	dev_dbg(dev, "parent %p, bus %p\n", udev->parent, udev->bus);
-
-	dev_dbg(dev,
-		"descriptor %p, config %p, actconfig %p, rawdescriptors %p\n",
-		&udev->descriptor, udev->config,
-		udev->actconfig, udev->rawdescriptors);
+	dev_dbg(dev, "parent %s, bus %s\n", dev_name(&udev->parent->dev),
+		udev->bus->bus_name);
 
 	dev_dbg(dev, "have_langid %d, string_langid %d\n",
 		udev->have_langid, udev->string_langid);
@@ -237,9 +233,6 @@ void usbip_dump_urb(struct urb *urb)
 
 	dev = &urb->dev->dev;
 
-	dev_dbg(dev, "   urb                   :%p\n", urb);
-	dev_dbg(dev, "   dev                   :%p\n", urb->dev);
-
 	usbip_dump_usb_device(urb->dev);
 
 	dev_dbg(dev, "   pipe                  :%08x ", urb->pipe);
@@ -248,11 +241,9 @@ void usbip_dump_urb(struct urb *urb)
 
 	dev_dbg(dev, "   status                :%d\n", urb->status);
 	dev_dbg(dev, "   transfer_flags        :%08X\n", urb->transfer_flags);
-	dev_dbg(dev, "   transfer_buffer       :%p\n", urb->transfer_buffer);
 	dev_dbg(dev, "   transfer_buffer_length:%d\n",
 						urb->transfer_buffer_length);
 	dev_dbg(dev, "   actual_length         :%d\n", urb->actual_length);
-	dev_dbg(dev, "   setup_packet          :%p\n", urb->setup_packet);
 
 	if (urb->setup_packet && usb_pipetype(urb->pipe) == PIPE_CONTROL)
 		usbip_dump_usb_ctrlrequest(
@@ -262,8 +253,6 @@ void usbip_dump_urb(struct urb *urb)
 	dev_dbg(dev, "   number_of_packets     :%d\n", urb->number_of_packets);
 	dev_dbg(dev, "   interval              :%d\n", urb->interval);
 	dev_dbg(dev, "   error_count           :%d\n", urb->error_count);
-	dev_dbg(dev, "   context               :%p\n", urb->context);
-	dev_dbg(dev, "   complete              :%p\n", urb->complete);
 }
 EXPORT_SYMBOL_GPL(usbip_dump_urb);
 
diff --git a/drivers/usb/usbip/vudc_rx.c b/drivers/usb/usbip/vudc_rx.c
index df1e309..1e8a23d 100644
--- a/drivers/usb/usbip/vudc_rx.c
+++ b/drivers/usb/usbip/vudc_rx.c
@@ -120,6 +120,25 @@ static int v_recv_cmd_submit(struct vudc *udc,
 	urb_p->new = 1;
 	urb_p->seqnum = pdu->base.seqnum;
 
+	if (urb_p->ep->type == USB_ENDPOINT_XFER_ISOC) {
+		/* validate packet size and number of packets */
+		unsigned int maxp, packets, bytes;
+
+		maxp = usb_endpoint_maxp(urb_p->ep->desc);
+		maxp *= usb_endpoint_maxp_mult(urb_p->ep->desc);
+		bytes = pdu->u.cmd_submit.transfer_buffer_length;
+		packets = DIV_ROUND_UP(bytes, maxp);
+
+		if (pdu->u.cmd_submit.number_of_packets < 0 ||
+		    pdu->u.cmd_submit.number_of_packets > packets) {
+			dev_err(&udc->gadget.dev,
+				"CMD_SUBMIT: isoc invalid num packets %d\n",
+				pdu->u.cmd_submit.number_of_packets);
+			ret = -EMSGSIZE;
+			goto free_urbp;
+		}
+	}
+
 	ret = alloc_urb_from_cmd(&urb_p->urb, pdu, urb_p->ep->type);
 	if (ret) {
 		usbip_event_add(&udc->ud, VUDC_EVENT_ERROR_MALLOC);
diff --git a/drivers/usb/usbip/vudc_tx.c b/drivers/usb/usbip/vudc_tx.c
index 1440ae0..3ccb17c 100644
--- a/drivers/usb/usbip/vudc_tx.c
+++ b/drivers/usb/usbip/vudc_tx.c
@@ -85,6 +85,13 @@ static int v_send_ret_submit(struct vudc *udc, struct urbp *urb_p)
 	memset(&pdu_header, 0, sizeof(pdu_header));
 	memset(&msg, 0, sizeof(msg));
 
+	if (urb->actual_length > 0 && !urb->transfer_buffer) {
+		dev_err(&udc->gadget.dev,
+			"urb: actual_length %d transfer_buffer null\n",
+			urb->actual_length);
+		return -1;
+	}
+
 	if (urb_p->type == USB_ENDPOINT_XFER_ISOC)
 		iovnum = 2 + urb->number_of_packets;
 	else
@@ -100,8 +107,8 @@ static int v_send_ret_submit(struct vudc *udc, struct urbp *urb_p)
 
 	/* 1. setup usbip_header */
 	setup_ret_submit_pdu(&pdu_header, urb_p);
-	usbip_dbg_stub_tx("setup txdata seqnum: %d urb: %p\n",
-			  pdu_header.base.seqnum, urb);
+	usbip_dbg_stub_tx("setup txdata seqnum: %d\n",
+			  pdu_header.base.seqnum);
 	usbip_header_correct_endian(&pdu_header, 1);
 
 	iov[iovnum].iov_base = &pdu_header;
diff --git a/drivers/vhost/vhost.c b/drivers/vhost/vhost.c
index 33ac2b1..e6bb094 100644
--- a/drivers/vhost/vhost.c
+++ b/drivers/vhost/vhost.c
@@ -904,7 +904,7 @@ static void vhost_dev_lock_vqs(struct vhost_dev *d)
 {
 	int i = 0;
 	for (i = 0; i < d->nvqs; ++i)
-		mutex_lock(&d->vqs[i]->mutex);
+		mutex_lock_nested(&d->vqs[i]->mutex, i);
 }
 
 static void vhost_dev_unlock_vqs(struct vhost_dev *d)
@@ -1015,6 +1015,10 @@ static int vhost_process_iotlb_msg(struct vhost_dev *dev,
 		vhost_iotlb_notify_vq(dev, msg);
 		break;
 	case VHOST_IOTLB_INVALIDATE:
+		if (!dev->iotlb) {
+			ret = -EFAULT;
+			break;
+		}
 		vhost_vq_meta_reset(dev);
 		vhost_del_umem_range(dev->iotlb, msg->iova,
 				     msg->iova + msg->size - 1);
@@ -1877,12 +1881,7 @@ static unsigned next_desc(struct vhost_virtqueue *vq, struct vring_desc *desc)
 		return -1U;
 
 	/* Check they're not leading us off end of descriptors. */
-	next = vhost16_to_cpu(vq, desc->next);
-	/* Make sure compiler knows to grab that: we don't want it changing! */
-	/* We will use the result as an index in an array, so most
-	 * architectures only need a compiler barrier here. */
-	read_barrier_depends();
-
+	next = vhost16_to_cpu(vq, READ_ONCE(desc->next));
 	return next;
 }
 
diff --git a/drivers/video/backlight/apple_bl.c b/drivers/video/backlight/apple_bl.c
index d843296..6a34ab9 100644
--- a/drivers/video/backlight/apple_bl.c
+++ b/drivers/video/backlight/apple_bl.c
@@ -143,7 +143,7 @@ static int apple_bl_add(struct acpi_device *dev)
 	struct pci_dev *host;
 	int intensity;
 
-	host = pci_get_bus_and_slot(0, 0);
+	host = pci_get_domain_bus_and_slot(0, 0, 0);
 
 	if (!host) {
 		pr_err("unable to find PCI host\n");
diff --git a/drivers/video/backlight/corgi_lcd.c b/drivers/video/backlight/corgi_lcd.c
index d7c239e..f557406 100644
--- a/drivers/video/backlight/corgi_lcd.c
+++ b/drivers/video/backlight/corgi_lcd.c
@@ -177,7 +177,7 @@ static int corgi_ssp_lcdtg_send(struct corgi_lcd *lcd, int adrs, uint8_t data)
 	struct spi_message msg;
 	struct spi_transfer xfer = {
 		.len		= 1,
-		.cs_change	= 1,
+		.cs_change	= 0,
 		.tx_buf		= lcd->buf,
 	};
 
diff --git a/drivers/video/backlight/tdo24m.c b/drivers/video/backlight/tdo24m.c
index eab1f84..e4bd63e 100644
--- a/drivers/video/backlight/tdo24m.c
+++ b/drivers/video/backlight/tdo24m.c
@@ -369,7 +369,7 @@ static int tdo24m_probe(struct spi_device *spi)
 
 	spi_message_init(m);
 
-	x->cs_change = 1;
+	x->cs_change = 0;
 	x->tx_buf = &lcd->buf[0];
 	spi_message_add_tail(x, m);
 
diff --git a/drivers/video/backlight/tosa_lcd.c b/drivers/video/backlight/tosa_lcd.c
index 6a41ea9..4dc5ee8 100644
--- a/drivers/video/backlight/tosa_lcd.c
+++ b/drivers/video/backlight/tosa_lcd.c
@@ -49,7 +49,7 @@ static int tosa_tg_send(struct spi_device *spi, int adrs, uint8_t data)
 	struct spi_message msg;
 	struct spi_transfer xfer = {
 		.len		= 1,
-		.cs_change	= 1,
+		.cs_change	= 0,
 		.tx_buf		= buf,
 	};
 
diff --git a/drivers/video/fbdev/macfb.c b/drivers/video/fbdev/macfb.c
index cda7587c..e707e61 100644
--- a/drivers/video/fbdev/macfb.c
+++ b/drivers/video/fbdev/macfb.c
@@ -556,7 +556,7 @@ static void __init iounmap_macfb(void)
 static int __init macfb_init(void)
 {
 	int video_cmap_len, video_is_nubus = 0;
-	struct nubus_dev* ndev = NULL;
+	struct nubus_rsrc *ndev = NULL;
 	char *option = NULL;
 	int err;
 
@@ -670,15 +670,17 @@ static int __init macfb_init(void)
 	 * code is really broken :-)
 	 */
 
-	while ((ndev = nubus_find_type(NUBUS_CAT_DISPLAY,
-				       NUBUS_TYPE_VIDEO, ndev)))
-	{
+	for_each_func_rsrc(ndev) {
 		unsigned long base = ndev->board->slot_addr;
 
 		if (mac_bi_data.videoaddr < base ||
 		    mac_bi_data.videoaddr - base > 0xFFFFFF)
 			continue;
 
+		if (ndev->category != NUBUS_CAT_DISPLAY ||
+		    ndev->type != NUBUS_TYPE_VIDEO)
+			continue;
+
 		video_is_nubus = 1;
 		slot_addr = (unsigned char *)base;
 
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index ca200d1..5bf613d 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -223,6 +223,13 @@
 	  To compile this driver as a module, choose M here: the
 	  module will be called ziirave_wdt.
 
+config RAVE_SP_WATCHDOG
+	tristate "RAVE SP Watchdog timer"
+	depends on RAVE_SP_CORE
+	select WATCHDOG_CORE
+	help
+	  Support for the watchdog on RAVE SP device.
+
 # ALPHA Architecture
 
 # ARM Architecture
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index 715a210..135c5e8 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -224,3 +224,4 @@
 obj-$(CONFIG_ZIIRAVE_WATCHDOG) += ziirave_wdt.o
 obj-$(CONFIG_SOFT_WATCHDOG) += softdog.o
 obj-$(CONFIG_MENF21BMC_WATCHDOG) += menf21bmc_wdt.o
+obj-$(CONFIG_RAVE_SP_WATCHDOG) += rave-sp-wdt.o
diff --git a/drivers/watchdog/rave-sp-wdt.c b/drivers/watchdog/rave-sp-wdt.c
new file mode 100644
index 0000000..35db173
--- /dev/null
+++ b/drivers/watchdog/rave-sp-wdt.c
@@ -0,0 +1,337 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+/*
+ * Driver for watchdog aspect of for Zodiac Inflight Innovations RAVE
+ * Supervisory Processor(SP) MCU
+ *
+ * Copyright (C) 2017 Zodiac Inflight Innovation
+ *
+ */
+
+#include <linux/delay.h>
+#include <linux/kernel.h>
+#include <linux/mfd/rave-sp.h>
+#include <linux/module.h>
+#include <linux/nvmem-consumer.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/reboot.h>
+#include <linux/slab.h>
+#include <linux/watchdog.h>
+
+enum {
+	RAVE_SP_RESET_BYTE = 1,
+	RAVE_SP_RESET_REASON_NORMAL = 0,
+	RAVE_SP_RESET_DELAY_MS = 500,
+};
+
+/**
+ * struct rave_sp_wdt_variant - RAVE SP watchdog variant
+ *
+ * @max_timeout:	Largest possible watchdog timeout setting
+ * @min_timeout:	Smallest possible watchdog timeout setting
+ *
+ * @configure:		Function to send configuration command
+ * @restart:		Function to send "restart" command
+ */
+struct rave_sp_wdt_variant {
+	unsigned int max_timeout;
+	unsigned int min_timeout;
+
+	int (*configure)(struct watchdog_device *, bool);
+	int (*restart)(struct watchdog_device *);
+};
+
+/**
+ * struct rave_sp_wdt - RAVE SP watchdog
+ *
+ * @wdd:		Underlying watchdog device
+ * @sp:			Pointer to parent RAVE SP device
+ * @variant:		Device specific variant information
+ * @reboot_notifier:	Reboot notifier implementing machine reset
+ */
+struct rave_sp_wdt {
+	struct watchdog_device wdd;
+	struct rave_sp *sp;
+	const struct rave_sp_wdt_variant *variant;
+	struct notifier_block reboot_notifier;
+};
+
+static struct rave_sp_wdt *to_rave_sp_wdt(struct watchdog_device *wdd)
+{
+	return container_of(wdd, struct rave_sp_wdt, wdd);
+}
+
+static int rave_sp_wdt_exec(struct watchdog_device *wdd, void *data,
+			    size_t data_size)
+{
+	return rave_sp_exec(to_rave_sp_wdt(wdd)->sp,
+			    data, data_size, NULL, 0);
+}
+
+static int rave_sp_wdt_legacy_configure(struct watchdog_device *wdd, bool on)
+{
+	u8 cmd[] = {
+		[0] = RAVE_SP_CMD_SW_WDT,
+		[1] = 0,
+		[2] = 0,
+		[3] = on,
+		[4] = on ? wdd->timeout : 0,
+	};
+
+	return rave_sp_wdt_exec(wdd, cmd, sizeof(cmd));
+}
+
+static int rave_sp_wdt_rdu_configure(struct watchdog_device *wdd, bool on)
+{
+	u8 cmd[] = {
+		[0] = RAVE_SP_CMD_SW_WDT,
+		[1] = 0,
+		[2] = on,
+		[3] = (u8)wdd->timeout,
+		[4] = (u8)(wdd->timeout >> 8),
+	};
+
+	return rave_sp_wdt_exec(wdd, cmd, sizeof(cmd));
+}
+
+/**
+ * rave_sp_wdt_configure - Configure watchdog device
+ *
+ * @wdd:	Device to configure
+ * @on:		Desired state of the watchdog timer (ON/OFF)
+ *
+ * This function configures two aspects of the watchdog timer:
+ *
+ *  - Wheither it is ON or OFF
+ *  - Its timeout duration
+ *
+ * with first aspect specified via function argument and second via
+ * the value of 'wdd->timeout'.
+ */
+static int rave_sp_wdt_configure(struct watchdog_device *wdd, bool on)
+{
+	return to_rave_sp_wdt(wdd)->variant->configure(wdd, on);
+}
+
+static int rave_sp_wdt_legacy_restart(struct watchdog_device *wdd)
+{
+	u8 cmd[] = {
+		[0] = RAVE_SP_CMD_RESET,
+		[1] = 0,
+		[2] = RAVE_SP_RESET_BYTE
+	};
+
+	return rave_sp_wdt_exec(wdd, cmd, sizeof(cmd));
+}
+
+static int rave_sp_wdt_rdu_restart(struct watchdog_device *wdd)
+{
+	u8 cmd[] = {
+		[0] = RAVE_SP_CMD_RESET,
+		[1] = 0,
+		[2] = RAVE_SP_RESET_BYTE,
+		[3] = RAVE_SP_RESET_REASON_NORMAL
+	};
+
+	return rave_sp_wdt_exec(wdd, cmd, sizeof(cmd));
+}
+
+static int rave_sp_wdt_reboot_notifier(struct notifier_block *nb,
+				       unsigned long action, void *data)
+{
+	/*
+	 * Restart handler is called in atomic context which means we
+	 * can't communicate to SP via UART. Luckily for use SP will
+	 * wait 500ms before actually resetting us, so we ask it to do
+	 * so here and let the rest of the system go on wrapping
+	 * things up.
+	 */
+	if (action == SYS_DOWN || action == SYS_HALT) {
+		struct rave_sp_wdt *sp_wd =
+			container_of(nb, struct rave_sp_wdt, reboot_notifier);
+
+		const int ret = sp_wd->variant->restart(&sp_wd->wdd);
+
+		if (ret < 0)
+			dev_err(sp_wd->wdd.parent,
+				"Failed to issue restart command (%d)", ret);
+		return NOTIFY_OK;
+	}
+
+	return NOTIFY_DONE;
+}
+
+static int rave_sp_wdt_restart(struct watchdog_device *wdd,
+			       unsigned long action, void *data)
+{
+	/*
+	 * The actual work was done by reboot notifier above. SP
+	 * firmware waits 500 ms before issuing reset, so let's hang
+	 * here for twice that delay and hopefuly we'd never reach
+	 * the return statement.
+	 */
+	mdelay(2 * RAVE_SP_RESET_DELAY_MS);
+
+	return -EIO;
+}
+
+static int rave_sp_wdt_start(struct watchdog_device *wdd)
+{
+	int ret;
+
+	ret = rave_sp_wdt_configure(wdd, true);
+	if (!ret)
+		set_bit(WDOG_HW_RUNNING, &wdd->status);
+
+	return ret;
+}
+
+static int rave_sp_wdt_stop(struct watchdog_device *wdd)
+{
+	return rave_sp_wdt_configure(wdd, false);
+}
+
+static int rave_sp_wdt_set_timeout(struct watchdog_device *wdd,
+				   unsigned int timeout)
+{
+	wdd->timeout = timeout;
+
+	return rave_sp_wdt_configure(wdd, watchdog_active(wdd));
+}
+
+static int rave_sp_wdt_ping(struct watchdog_device *wdd)
+{
+	u8 cmd[] = {
+		[0] = RAVE_SP_CMD_PET_WDT,
+		[1] = 0,
+	};
+
+	return rave_sp_wdt_exec(wdd, cmd, sizeof(cmd));
+}
+
+static const struct watchdog_info rave_sp_wdt_info = {
+	.options = WDIOF_SETTIMEOUT | WDIOF_KEEPALIVEPING | WDIOF_MAGICCLOSE,
+	.identity = "RAVE SP Watchdog",
+};
+
+static const struct watchdog_ops rave_sp_wdt_ops = {
+	.owner = THIS_MODULE,
+	.start = rave_sp_wdt_start,
+	.stop = rave_sp_wdt_stop,
+	.ping = rave_sp_wdt_ping,
+	.set_timeout = rave_sp_wdt_set_timeout,
+	.restart = rave_sp_wdt_restart,
+};
+
+static const struct rave_sp_wdt_variant rave_sp_wdt_legacy = {
+	.max_timeout = 255,
+	.min_timeout = 1,
+	.configure = rave_sp_wdt_legacy_configure,
+	.restart   = rave_sp_wdt_legacy_restart,
+};
+
+static const struct rave_sp_wdt_variant rave_sp_wdt_rdu = {
+	.max_timeout = 180,
+	.min_timeout = 60,
+	.configure = rave_sp_wdt_rdu_configure,
+	.restart   = rave_sp_wdt_rdu_restart,
+};
+
+static const struct of_device_id rave_sp_wdt_of_match[] = {
+	{
+		.compatible = "zii,rave-sp-watchdog-legacy",
+		.data = &rave_sp_wdt_legacy,
+	},
+	{
+		.compatible = "zii,rave-sp-watchdog",
+		.data = &rave_sp_wdt_rdu,
+	},
+	{ /* sentinel */ }
+};
+
+static int rave_sp_wdt_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct watchdog_device *wdd;
+	struct rave_sp_wdt *sp_wd;
+	struct nvmem_cell *cell;
+	__le16 timeout = 0;
+	int ret;
+
+	sp_wd = devm_kzalloc(dev, sizeof(*sp_wd), GFP_KERNEL);
+	if (!sp_wd)
+		return -ENOMEM;
+
+	sp_wd->variant = of_device_get_match_data(dev);
+	sp_wd->sp      = dev_get_drvdata(dev->parent);
+
+	wdd              = &sp_wd->wdd;
+	wdd->parent      = dev;
+	wdd->info        = &rave_sp_wdt_info;
+	wdd->ops         = &rave_sp_wdt_ops;
+	wdd->min_timeout = sp_wd->variant->min_timeout;
+	wdd->max_timeout = sp_wd->variant->max_timeout;
+	wdd->status      = WATCHDOG_NOWAYOUT_INIT_STATUS;
+	wdd->timeout     = 60;
+
+	cell = nvmem_cell_get(dev, "wdt-timeout");
+	if (!IS_ERR(cell)) {
+		size_t len;
+		void *value = nvmem_cell_read(cell, &len);
+
+		if (!IS_ERR(value)) {
+			memcpy(&timeout, value, min(len, sizeof(timeout)));
+			kfree(value);
+		}
+		nvmem_cell_put(cell);
+	}
+	watchdog_init_timeout(wdd, le16_to_cpu(timeout), dev);
+	watchdog_set_restart_priority(wdd, 255);
+	watchdog_stop_on_unregister(wdd);
+
+	sp_wd->reboot_notifier.notifier_call = rave_sp_wdt_reboot_notifier;
+	ret = devm_register_reboot_notifier(dev, &sp_wd->reboot_notifier);
+	if (ret) {
+		dev_err(dev, "Failed to register reboot notifier\n");
+		return ret;
+	}
+
+	/*
+	 * We don't know if watchdog is running now. To be sure, let's
+	 * start it and depend on watchdog core to ping it
+	 */
+	wdd->max_hw_heartbeat_ms = wdd->max_timeout * 1000;
+	ret = rave_sp_wdt_start(wdd);
+	if (ret) {
+		dev_err(dev, "Watchdog didn't start\n");
+		return ret;
+	}
+
+	ret = devm_watchdog_register_device(dev, wdd);
+	if (ret) {
+		dev_err(dev, "Failed to register watchdog device\n");
+		rave_sp_wdt_stop(wdd);
+		return ret;
+	}
+
+	return 0;
+}
+
+static struct platform_driver rave_sp_wdt_driver = {
+	.probe = rave_sp_wdt_probe,
+	.driver = {
+		.name = KBUILD_MODNAME,
+		.of_match_table = rave_sp_wdt_of_match,
+	},
+};
+
+module_platform_driver(rave_sp_wdt_driver);
+
+MODULE_DEVICE_TABLE(of, rave_sp_wdt_of_match);
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Andrey Vostrikov <andrey.vostrikov@cogentembedded.com>");
+MODULE_AUTHOR("Nikita Yushchenko <nikita.yoush@cogentembedded.com>");
+MODULE_AUTHOR("Andrey Smirnov <andrew.smirnov@gmail.com>");
+MODULE_DESCRIPTION("RAVE SP Watchdog driver");
+MODULE_ALIAS("platform:rave-sp-watchdog");
diff --git a/drivers/xen/gntdev.c b/drivers/xen/gntdev.c
index 57efbd3..bd56653 100644
--- a/drivers/xen/gntdev.c
+++ b/drivers/xen/gntdev.c
@@ -380,10 +380,8 @@ static int unmap_grant_pages(struct grant_map *map, int offset, int pages)
 		}
 		range = 0;
 		while (range < pages) {
-			if (map->unmap_ops[offset+range].handle == -1) {
-				range--;
+			if (map->unmap_ops[offset+range].handle == -1)
 				break;
-			}
 			range++;
 		}
 		err = __unmap_grant_pages(map, offset, range);
@@ -1073,8 +1071,10 @@ static int gntdev_mmap(struct file *flip, struct vm_area_struct *vma)
 out_unlock_put:
 	mutex_unlock(&priv->lock);
 out_put_map:
-	if (use_ptemod)
+	if (use_ptemod) {
 		map->vma = NULL;
+		unmap_grant_pages(map, 0, map->count);
+	}
 	gntdev_put_map(priv, map);
 	return err;
 }
diff --git a/drivers/xen/pvcalls-front.c b/drivers/xen/pvcalls-front.c
index d1e1d8d..4c789e6 100644
--- a/drivers/xen/pvcalls-front.c
+++ b/drivers/xen/pvcalls-front.c
@@ -805,7 +805,7 @@ int pvcalls_front_accept(struct socket *sock, struct socket *newsock, int flags)
 		pvcalls_exit();
 		return ret;
 	}
-	map2 = kzalloc(sizeof(*map2), GFP_KERNEL);
+	map2 = kzalloc(sizeof(*map2), GFP_ATOMIC);
 	if (map2 == NULL) {
 		clear_bit(PVCALLS_FLAG_ACCEPT_INFLIGHT,
 			  (void *)&map->passive.flags);
diff --git a/fs/affs/amigaffs.c b/fs/affs/amigaffs.c
index 0f0e692..14a6c1b 100644
--- a/fs/affs/amigaffs.c
+++ b/fs/affs/amigaffs.c
@@ -10,6 +10,7 @@
  */
 
 #include <linux/math64.h>
+#include <linux/iversion.h>
 #include "affs.h"
 
 /*
@@ -60,7 +61,7 @@ affs_insert_hash(struct inode *dir, struct buffer_head *bh)
 	affs_brelse(dir_bh);
 
 	dir->i_mtime = dir->i_ctime = current_time(dir);
-	dir->i_version++;
+	inode_inc_iversion(dir);
 	mark_inode_dirty(dir);
 
 	return 0;
@@ -114,7 +115,7 @@ affs_remove_hash(struct inode *dir, struct buffer_head *rem_bh)
 	affs_brelse(bh);
 
 	dir->i_mtime = dir->i_ctime = current_time(dir);
-	dir->i_version++;
+	inode_inc_iversion(dir);
 	mark_inode_dirty(dir);
 
 	return retval;
diff --git a/fs/affs/dir.c b/fs/affs/dir.c
index a105e77..d180b46 100644
--- a/fs/affs/dir.c
+++ b/fs/affs/dir.c
@@ -14,6 +14,7 @@
  *
  */
 
+#include <linux/iversion.h>
 #include "affs.h"
 
 static int affs_readdir(struct file *, struct dir_context *);
@@ -80,7 +81,7 @@ affs_readdir(struct file *file, struct dir_context *ctx)
 	 * we can jump directly to where we left off.
 	 */
 	ino = (u32)(long)file->private_data;
-	if (ino && file->f_version == inode->i_version) {
+	if (ino && inode_cmp_iversion(inode, file->f_version) == 0) {
 		pr_debug("readdir() left off=%d\n", ino);
 		goto inside;
 	}
@@ -130,7 +131,7 @@ affs_readdir(struct file *file, struct dir_context *ctx)
 		} while (ino);
 	}
 done:
-	file->f_version = inode->i_version;
+	file->f_version = inode_query_iversion(inode);
 	file->private_data = (void *)(long)ino;
 	affs_brelse(fh_bh);
 
diff --git a/fs/affs/super.c b/fs/affs/super.c
index 1117e36..e602619 100644
--- a/fs/affs/super.c
+++ b/fs/affs/super.c
@@ -21,6 +21,7 @@
 #include <linux/writeback.h>
 #include <linux/blkdev.h>
 #include <linux/seq_file.h>
+#include <linux/iversion.h>
 #include "affs.h"
 
 static int affs_statfs(struct dentry *dentry, struct kstatfs *buf);
@@ -102,7 +103,7 @@ static struct inode *affs_alloc_inode(struct super_block *sb)
 	if (!i)
 		return NULL;
 
-	i->vfs_inode.i_version = 1;
+	inode_set_iversion(&i->vfs_inode, 1);
 	i->i_lc = NULL;
 	i->i_ext_bh = NULL;
 	i->i_pa_cnt = 0;
diff --git a/fs/afs/dir.c b/fs/afs/dir.c
index ff8d5bf..23c7f39 100644
--- a/fs/afs/dir.c
+++ b/fs/afs/dir.c
@@ -895,20 +895,38 @@ static int afs_rmdir(struct inode *dir, struct dentry *dentry)
  * However, if we didn't have a callback promise outstanding, or it was
  * outstanding on a different server, then it won't break it either...
  */
-static int afs_dir_remove_link(struct dentry *dentry, struct key *key)
+static int afs_dir_remove_link(struct dentry *dentry, struct key *key,
+			       unsigned long d_version_before,
+			       unsigned long d_version_after)
 {
+	bool dir_valid;
 	int ret = 0;
 
+	/* There were no intervening changes on the server if the version
+	 * number we got back was incremented by exactly 1.
+	 */
+	dir_valid = (d_version_after == d_version_before + 1);
+
 	if (d_really_is_positive(dentry)) {
 		struct afs_vnode *vnode = AFS_FS_I(d_inode(dentry));
 
-		if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
-			kdebug("AFS_VNODE_DELETED");
-		clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
-
-		ret = afs_validate(vnode, key);
-		if (ret == -ESTALE)
+		if (dir_valid) {
+			drop_nlink(&vnode->vfs_inode);
+			if (vnode->vfs_inode.i_nlink == 0) {
+				set_bit(AFS_VNODE_DELETED, &vnode->flags);
+				clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
+			}
 			ret = 0;
+		} else {
+			clear_bit(AFS_VNODE_CB_PROMISED, &vnode->flags);
+
+			if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
+				kdebug("AFS_VNODE_DELETED");
+
+			ret = afs_validate(vnode, key);
+			if (ret == -ESTALE)
+				ret = 0;
+		}
 		_debug("nlink %d [val %d]", vnode->vfs_inode.i_nlink, ret);
 	}
 
@@ -923,6 +941,7 @@ static int afs_unlink(struct inode *dir, struct dentry *dentry)
 	struct afs_fs_cursor fc;
 	struct afs_vnode *dvnode = AFS_FS_I(dir), *vnode;
 	struct key *key;
+	unsigned long d_version = (unsigned long)dentry->d_fsdata;
 	int ret;
 
 	_enter("{%x:%u},{%pd}",
@@ -955,7 +974,9 @@ static int afs_unlink(struct inode *dir, struct dentry *dentry)
 		afs_vnode_commit_status(&fc, dvnode, fc.cb_break);
 		ret = afs_end_vnode_operation(&fc);
 		if (ret == 0)
-			ret = afs_dir_remove_link(dentry, key);
+			ret = afs_dir_remove_link(
+				dentry, key, d_version,
+				(unsigned long)dvnode->status.data_version);
 	}
 
 error_key:
diff --git a/fs/afs/fsclient.c b/fs/afs/fsclient.c
index b90ef39..88ec38c 100644
--- a/fs/afs/fsclient.c
+++ b/fs/afs/fsclient.c
@@ -13,6 +13,7 @@
 #include <linux/slab.h>
 #include <linux/sched.h>
 #include <linux/circ_buf.h>
+#include <linux/iversion.h>
 #include "internal.h"
 #include "afs_fs.h"
 
@@ -124,7 +125,7 @@ static void xdr_decode_AFSFetchStatus(const __be32 **_bp,
 		vnode->vfs_inode.i_ctime.tv_sec	= status->mtime_client;
 		vnode->vfs_inode.i_mtime	= vnode->vfs_inode.i_ctime;
 		vnode->vfs_inode.i_atime	= vnode->vfs_inode.i_ctime;
-		vnode->vfs_inode.i_version	= data_version;
+		inode_set_iversion_raw(&vnode->vfs_inode, data_version);
 	}
 
 	expected_version = status->data_version;
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 3415eb7..c7f17c4 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -21,6 +21,7 @@
 #include <linux/sched.h>
 #include <linux/mount.h>
 #include <linux/namei.h>
+#include <linux/iversion.h>
 #include "internal.h"
 
 static const struct inode_operations afs_symlink_inode_operations = {
@@ -89,7 +90,7 @@ static int afs_inode_map_status(struct afs_vnode *vnode, struct key *key)
 	inode->i_atime		= inode->i_mtime = inode->i_ctime;
 	inode->i_blocks		= 0;
 	inode->i_generation	= vnode->fid.unique;
-	inode->i_version	= vnode->status.data_version;
+	inode_set_iversion_raw(inode, vnode->status.data_version);
 	inode->i_mapping->a_ops	= &afs_fs_aops;
 
 	read_sequnlock_excl(&vnode->cb_lock);
@@ -218,7 +219,7 @@ struct inode *afs_iget_autocell(struct inode *dir, const char *dev_name,
 	inode->i_ctime.tv_nsec	= 0;
 	inode->i_atime		= inode->i_mtime = inode->i_ctime;
 	inode->i_blocks		= 0;
-	inode->i_version	= 0;
+	inode_set_iversion_raw(inode, 0);
 	inode->i_generation	= 0;
 
 	set_bit(AFS_VNODE_PSEUDODIR, &vnode->flags);
@@ -377,6 +378,10 @@ int afs_validate(struct afs_vnode *vnode, struct key *key)
 	}
 
 	read_sequnlock_excl(&vnode->cb_lock);
+
+	if (test_bit(AFS_VNODE_DELETED, &vnode->flags))
+		clear_nlink(&vnode->vfs_inode);
+
 	if (valid)
 		goto valid;
 
diff --git a/fs/afs/rxrpc.c b/fs/afs/rxrpc.c
index ea1460b..e112665 100644
--- a/fs/afs/rxrpc.c
+++ b/fs/afs/rxrpc.c
@@ -885,7 +885,7 @@ int afs_extract_data(struct afs_call *call, void *buf, size_t count,
 {
 	struct afs_net *net = call->net;
 	enum afs_call_state state;
-	u32 remote_abort;
+	u32 remote_abort = 0;
 	int ret;
 
 	_enter("{%s,%zu},,%zu,%d",
diff --git a/fs/afs/write.c b/fs/afs/write.c
index cb5f8a3..9370e2f 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -198,7 +198,7 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 			ret = afs_fill_page(vnode, key, pos + copied,
 					    len - copied, page);
 			if (ret < 0)
-				return ret;
+				goto out;
 		}
 		SetPageUptodate(page);
 	}
@@ -206,10 +206,12 @@ int afs_write_end(struct file *file, struct address_space *mapping,
 	set_page_dirty(page);
 	if (PageDirty(page))
 		_debug("dirtied");
+	ret = copied;
+
+out:
 	unlock_page(page);
 	put_page(page);
-
-	return copied;
+	return ret;
 }
 
 /*
diff --git a/fs/btrfs/Makefile b/fs/btrfs/Makefile
index 6fe881d..0c43736 100644
--- a/fs/btrfs/Makefile
+++ b/fs/btrfs/Makefile
@@ -19,4 +19,4 @@
 btrfs-$(CONFIG_BTRFS_FS_RUN_SANITY_TESTS) += tests/free-space-tests.o \
 	tests/extent-buffer-tests.o tests/btrfs-tests.o \
 	tests/extent-io-tests.o tests/inode-tests.o tests/qgroup-tests.o \
-	tests/free-space-tree-tests.o
+	tests/free-space-tree-tests.o tests/extent-map-tests.o
diff --git a/fs/btrfs/backref.c b/fs/btrfs/backref.c
index 7d0dc10..e4054e5 100644
--- a/fs/btrfs/backref.c
+++ b/fs/btrfs/backref.c
@@ -216,7 +216,8 @@ static int prelim_ref_compare(struct prelim_ref *ref1,
 	return 0;
 }
 
-void update_share_count(struct share_check *sc, int oldcount, int newcount)
+static void update_share_count(struct share_check *sc, int oldcount,
+			       int newcount)
 {
 	if ((!sc) || (oldcount == 0 && newcount < 1))
 		return;
diff --git a/fs/btrfs/compression.c b/fs/btrfs/compression.c
index 5982c8a..07d049c 100644
--- a/fs/btrfs/compression.c
+++ b/fs/btrfs/compression.c
@@ -33,7 +33,6 @@
 #include <linux/bit_spinlock.h>
 #include <linux/slab.h>
 #include <linux/sched/mm.h>
-#include <linux/sort.h>
 #include <linux/log2.h>
 #include "ctree.h"
 #include "disk-io.h"
@@ -45,6 +44,21 @@
 #include "extent_io.h"
 #include "extent_map.h"
 
+static const char* const btrfs_compress_types[] = { "", "zlib", "lzo", "zstd" };
+
+const char* btrfs_compress_type2str(enum btrfs_compression_type type)
+{
+	switch (type) {
+	case BTRFS_COMPRESS_ZLIB:
+	case BTRFS_COMPRESS_LZO:
+	case BTRFS_COMPRESS_ZSTD:
+	case BTRFS_COMPRESS_NONE:
+		return btrfs_compress_types[type];
+	}
+
+	return NULL;
+}
+
 static int btrfs_decompress_bio(struct compressed_bio *cb);
 
 static inline int compressed_bio_size(struct btrfs_fs_info *fs_info,
@@ -348,8 +362,6 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 		page->mapping = NULL;
 		if (submit || bio_add_page(bio, page, PAGE_SIZE, 0) <
 		    PAGE_SIZE) {
-			bio_get(bio);
-
 			/*
 			 * inc the count before we submit the bio so
 			 * we know the end IO handler won't happen before
@@ -372,8 +384,6 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 				bio_endio(bio);
 			}
 
-			bio_put(bio);
-
 			bio = btrfs_bio_alloc(bdev, first_byte);
 			bio->bi_opf = REQ_OP_WRITE | write_flags;
 			bio->bi_private = cb;
@@ -389,7 +399,6 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 		first_byte += PAGE_SIZE;
 		cond_resched();
 	}
-	bio_get(bio);
 
 	ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA);
 	BUG_ON(ret); /* -ENOMEM */
@@ -405,13 +414,12 @@ blk_status_t btrfs_submit_compressed_write(struct inode *inode, u64 start,
 		bio_endio(bio);
 	}
 
-	bio_put(bio);
 	return 0;
 }
 
 static u64 bio_end_offset(struct bio *bio)
 {
-	struct bio_vec *last = &bio->bi_io_vec[bio->bi_vcnt - 1];
+	struct bio_vec *last = bio_last_bvec_all(bio);
 
 	return page_offset(last->bv_page) + last->bv_len + last->bv_offset;
 }
@@ -563,7 +571,7 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 	/* we need the actual starting offset of this extent in the file */
 	read_lock(&em_tree->lock);
 	em = lookup_extent_mapping(em_tree,
-				   page_offset(bio->bi_io_vec->bv_page),
+				   page_offset(bio_first_page_all(bio)),
 				   PAGE_SIZE);
 	read_unlock(&em_tree->lock);
 	if (!em)
@@ -638,8 +646,6 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 		page->mapping = NULL;
 		if (submit || bio_add_page(comp_bio, page, PAGE_SIZE, 0) <
 		    PAGE_SIZE) {
-			bio_get(comp_bio);
-
 			ret = btrfs_bio_wq_end_io(fs_info, comp_bio,
 						  BTRFS_WQ_ENDIO_DATA);
 			BUG_ON(ret); /* -ENOMEM */
@@ -666,8 +672,6 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 				bio_endio(comp_bio);
 			}
 
-			bio_put(comp_bio);
-
 			comp_bio = btrfs_bio_alloc(bdev, cur_disk_byte);
 			bio_set_op_attrs(comp_bio, REQ_OP_READ, 0);
 			comp_bio->bi_private = cb;
@@ -677,7 +681,6 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 		}
 		cur_disk_byte += PAGE_SIZE;
 	}
-	bio_get(comp_bio);
 
 	ret = btrfs_bio_wq_end_io(fs_info, comp_bio, BTRFS_WQ_ENDIO_DATA);
 	BUG_ON(ret); /* -ENOMEM */
@@ -693,7 +696,6 @@ blk_status_t btrfs_submit_compressed_read(struct inode *inode, struct bio *bio,
 		bio_endio(comp_bio);
 	}
 
-	bio_put(comp_bio);
 	return 0;
 
 fail2:
@@ -752,6 +754,8 @@ struct heuristic_ws {
 	u32 sample_size;
 	/* Buckets store counters for each byte value */
 	struct bucket_item *bucket;
+	/* Sorting buffer */
+	struct bucket_item *bucket_b;
 	struct list_head list;
 };
 
@@ -763,6 +767,7 @@ static void free_heuristic_ws(struct list_head *ws)
 
 	kvfree(workspace->sample);
 	kfree(workspace->bucket);
+	kfree(workspace->bucket_b);
 	kfree(workspace);
 }
 
@@ -782,6 +787,10 @@ static struct list_head *alloc_heuristic_ws(void)
 	if (!ws->bucket)
 		goto fail;
 
+	ws->bucket_b = kcalloc(BUCKET_SIZE, sizeof(*ws->bucket_b), GFP_KERNEL);
+	if (!ws->bucket_b)
+		goto fail;
+
 	INIT_LIST_HEAD(&ws->list);
 	return &ws->list;
 fail:
@@ -1278,13 +1287,103 @@ static u32 shannon_entropy(struct heuristic_ws *ws)
 	return entropy_sum * 100 / entropy_max;
 }
 
-/* Compare buckets by size, ascending */
-static int bucket_comp_rev(const void *lv, const void *rv)
-{
-	const struct bucket_item *l = (const struct bucket_item *)lv;
-	const struct bucket_item *r = (const struct bucket_item *)rv;
+#define RADIX_BASE		4U
+#define COUNTERS_SIZE		(1U << RADIX_BASE)
 
-	return r->count - l->count;
+static u8 get4bits(u64 num, int shift) {
+	u8 low4bits;
+
+	num >>= shift;
+	/* Reverse order */
+	low4bits = (COUNTERS_SIZE - 1) - (num % COUNTERS_SIZE);
+	return low4bits;
+}
+
+/*
+ * Use 4 bits as radix base
+ * Use 16 u32 counters for calculating new possition in buf array
+ *
+ * @array     - array that will be sorted
+ * @array_buf - buffer array to store sorting results
+ *              must be equal in size to @array
+ * @num       - array size
+ */
+static void radix_sort(struct bucket_item *array, struct bucket_item *array_buf,
+		       int num)
+{
+	u64 max_num;
+	u64 buf_num;
+	u32 counters[COUNTERS_SIZE];
+	u32 new_addr;
+	u32 addr;
+	int bitlen;
+	int shift;
+	int i;
+
+	/*
+	 * Try avoid useless loop iterations for small numbers stored in big
+	 * counters.  Example: 48 33 4 ... in 64bit array
+	 */
+	max_num = array[0].count;
+	for (i = 1; i < num; i++) {
+		buf_num = array[i].count;
+		if (buf_num > max_num)
+			max_num = buf_num;
+	}
+
+	buf_num = ilog2(max_num);
+	bitlen = ALIGN(buf_num, RADIX_BASE * 2);
+
+	shift = 0;
+	while (shift < bitlen) {
+		memset(counters, 0, sizeof(counters));
+
+		for (i = 0; i < num; i++) {
+			buf_num = array[i].count;
+			addr = get4bits(buf_num, shift);
+			counters[addr]++;
+		}
+
+		for (i = 1; i < COUNTERS_SIZE; i++)
+			counters[i] += counters[i - 1];
+
+		for (i = num - 1; i >= 0; i--) {
+			buf_num = array[i].count;
+			addr = get4bits(buf_num, shift);
+			counters[addr]--;
+			new_addr = counters[addr];
+			array_buf[new_addr] = array[i];
+		}
+
+		shift += RADIX_BASE;
+
+		/*
+		 * Normal radix expects to move data from a temporary array, to
+		 * the main one.  But that requires some CPU time. Avoid that
+		 * by doing another sort iteration to original array instead of
+		 * memcpy()
+		 */
+		memset(counters, 0, sizeof(counters));
+
+		for (i = 0; i < num; i ++) {
+			buf_num = array_buf[i].count;
+			addr = get4bits(buf_num, shift);
+			counters[addr]++;
+		}
+
+		for (i = 1; i < COUNTERS_SIZE; i++)
+			counters[i] += counters[i - 1];
+
+		for (i = num - 1; i >= 0; i--) {
+			buf_num = array_buf[i].count;
+			addr = get4bits(buf_num, shift);
+			counters[addr]--;
+			new_addr = counters[addr];
+			array[new_addr] = array_buf[i];
+		}
+
+		shift += RADIX_BASE;
+	}
 }
 
 /*
@@ -1314,7 +1413,7 @@ static int byte_core_set_size(struct heuristic_ws *ws)
 	struct bucket_item *bucket = ws->bucket;
 
 	/* Sort in reverse order */
-	sort(bucket, BUCKET_SIZE, sizeof(*bucket), &bucket_comp_rev, NULL);
+	radix_sort(ws->bucket, ws->bucket_b, BUCKET_SIZE);
 
 	for (i = 0; i < BYTE_CORE_SET_LOW; i++)
 		coreset_sum += bucket[i].count;
diff --git a/fs/btrfs/compression.h b/fs/btrfs/compression.h
index 0868cc5..677fa4a 100644
--- a/fs/btrfs/compression.h
+++ b/fs/btrfs/compression.h
@@ -75,7 +75,7 @@ struct compressed_bio {
 	u32 sums;
 };
 
-void btrfs_init_compress(void);
+void __init btrfs_init_compress(void);
 void btrfs_exit_compress(void);
 
 int btrfs_compress_pages(unsigned int type_level, struct address_space *mapping,
@@ -137,6 +137,8 @@ extern const struct btrfs_compress_op btrfs_zlib_compress;
 extern const struct btrfs_compress_op btrfs_lzo_compress;
 extern const struct btrfs_compress_op btrfs_zstd_compress;
 
+const char* btrfs_compress_type2str(enum btrfs_compression_type type);
+
 int btrfs_compress_heuristic(struct inode *inode, u64 start, u64 end);
 
 #endif
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
index 1e74cf8..b88a79e 100644
--- a/fs/btrfs/ctree.c
+++ b/fs/btrfs/ctree.c
@@ -1807,8 +1807,8 @@ static noinline int generic_bin_search(struct extent_buffer *eb,
  * simple bin_search frontend that does the right thing for
  * leaves vs nodes
  */
-static int bin_search(struct extent_buffer *eb, const struct btrfs_key *key,
-		      int level, int *slot)
+int btrfs_bin_search(struct extent_buffer *eb, const struct btrfs_key *key,
+		     int level, int *slot)
 {
 	if (level == 0)
 		return generic_bin_search(eb,
@@ -1824,12 +1824,6 @@ static int bin_search(struct extent_buffer *eb, const struct btrfs_key *key,
 					  slot);
 }
 
-int btrfs_bin_search(struct extent_buffer *eb, const struct btrfs_key *key,
-		     int level, int *slot)
-{
-	return bin_search(eb, key, level, slot);
-}
-
 static void root_add_used(struct btrfs_root *root, u32 size)
 {
 	spin_lock(&root->accounting_lock);
@@ -2614,7 +2608,7 @@ static int key_search(struct extent_buffer *b, const struct btrfs_key *key,
 		      int level, int *prev_cmp, int *slot)
 {
 	if (*prev_cmp != 0) {
-		*prev_cmp = bin_search(b, key, level, slot);
+		*prev_cmp = btrfs_bin_search(b, key, level, slot);
 		return *prev_cmp;
 	}
 
@@ -2660,17 +2654,29 @@ int btrfs_find_item(struct btrfs_root *fs_root, struct btrfs_path *path,
 }
 
 /*
- * look for key in the tree.  path is filled in with nodes along the way
- * if key is found, we return zero and you can find the item in the leaf
- * level of the path (level 0)
+ * btrfs_search_slot - look for a key in a tree and perform necessary
+ * modifications to preserve tree invariants.
  *
- * If the key isn't found, the path points to the slot where it should
- * be inserted, and 1 is returned.  If there are other errors during the
- * search a negative error number is returned.
+ * @trans:	Handle of transaction, used when modifying the tree
+ * @p:		Holds all btree nodes along the search path
+ * @root:	The root node of the tree
+ * @key:	The key we are looking for
+ * @ins_len:	Indicates purpose of search, for inserts it is 1, for
+ *		deletions it's -1. 0 for plain searches
+ * @cow:	boolean should CoW operations be performed. Must always be 1
+ *		when modifying the tree.
  *
- * if ins_len > 0, nodes and leaves will be split as we walk down the
- * tree.  if ins_len < 0, nodes will be merged as we walk down the tree (if
- * possible)
+ * If @ins_len > 0, nodes and leaves will be split as we walk down the tree.
+ * If @ins_len < 0, nodes will be merged as we walk down the tree (if possible)
+ *
+ * If @key is found, 0 is returned and you can find the item in the leaf level
+ * of the path (level 0)
+ *
+ * If @key isn't found, 1 is returned and the leaf level of the path (level 0)
+ * points to the slot where it should be inserted
+ *
+ * If an error is encountered while searching the tree a negative error number
+ * is returned
  */
 int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 		      const struct btrfs_key *key, struct btrfs_path *p,
@@ -2774,6 +2780,8 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 		 * contention with the cow code
 		 */
 		if (cow) {
+			bool last_level = (level == (BTRFS_MAX_LEVEL - 1));
+
 			/*
 			 * if we don't really need to cow this block
 			 * then we don't want to set the path blocking,
@@ -2798,9 +2806,13 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root,
 			}
 
 			btrfs_set_path_blocking(p);
-			err = btrfs_cow_block(trans, root, b,
-					      p->nodes[level + 1],
-					      p->slots[level + 1], &b);
+			if (last_level)
+				err = btrfs_cow_block(trans, root, b, NULL, 0,
+						      &b);
+			else
+				err = btrfs_cow_block(trans, root, b,
+						      p->nodes[level + 1],
+						      p->slots[level + 1], &b);
 			if (err) {
 				ret = err;
 				goto done;
@@ -5175,7 +5187,7 @@ int btrfs_search_forward(struct btrfs_root *root, struct btrfs_key *min_key,
 	while (1) {
 		nritems = btrfs_header_nritems(cur);
 		level = btrfs_header_level(cur);
-		sret = bin_search(cur, min_key, level, &slot);
+		sret = btrfs_bin_search(cur, min_key, level, &slot);
 
 		/* at the lowest level, we're done, setup the path and exit */
 		if (level == path->lowest_level) {
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
index 13c260b..1a462ab 100644
--- a/fs/btrfs/ctree.h
+++ b/fs/btrfs/ctree.h
@@ -679,7 +679,6 @@ enum btrfs_orphan_cleanup_state {
 /* used by the raid56 code to lock stripes for read/modify/write */
 struct btrfs_stripe_hash {
 	struct list_head hash_list;
-	wait_queue_head_t wait;
 	spinlock_t lock;
 };
 
@@ -3060,15 +3059,10 @@ struct btrfs_dir_item *btrfs_lookup_xattr(struct btrfs_trans_handle *trans,
 					  struct btrfs_path *path, u64 dir,
 					  const char *name, u16 name_len,
 					  int mod);
-int verify_dir_item(struct btrfs_fs_info *fs_info,
-		    struct extent_buffer *leaf, int slot,
-		    struct btrfs_dir_item *dir_item);
 struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_fs_info *fs_info,
 						 struct btrfs_path *path,
 						 const char *name,
 						 int name_len);
-bool btrfs_is_name_len_valid(struct extent_buffer *leaf, int slot,
-			     unsigned long start, u16 name_len);
 
 /* orphan.c */
 int btrfs_insert_orphan_item(struct btrfs_trans_handle *trans,
@@ -3197,7 +3191,7 @@ int btrfs_write_inode(struct inode *inode, struct writeback_control *wbc);
 struct inode *btrfs_alloc_inode(struct super_block *sb);
 void btrfs_destroy_inode(struct inode *inode);
 int btrfs_drop_inode(struct inode *inode);
-int btrfs_init_cachep(void);
+int __init btrfs_init_cachep(void);
 void btrfs_destroy_cachep(void);
 long btrfs_ioctl_trans_end(struct file *file);
 struct inode *btrfs_iget(struct super_block *s, struct btrfs_key *location,
@@ -3248,7 +3242,7 @@ ssize_t btrfs_dedupe_file_range(struct file *src_file, u64 loff, u64 olen,
 			   struct file *dst_file, u64 dst_loff);
 
 /* file.c */
-int btrfs_auto_defrag_init(void);
+int __init btrfs_auto_defrag_init(void);
 void btrfs_auto_defrag_exit(void);
 int btrfs_add_inode_defrag(struct btrfs_trans_handle *trans,
 			   struct btrfs_inode *inode);
@@ -3283,7 +3277,7 @@ int btrfs_defrag_leaves(struct btrfs_trans_handle *trans,
 			struct btrfs_root *root);
 
 /* sysfs.c */
-int btrfs_init_sysfs(void);
+int __init btrfs_init_sysfs(void);
 void btrfs_exit_sysfs(void);
 int btrfs_sysfs_add_mounted(struct btrfs_fs_info *fs_info);
 void btrfs_sysfs_remove_mounted(struct btrfs_fs_info *fs_info);
diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index 5d73f79..0530f6f 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -18,6 +18,7 @@
  */
 
 #include <linux/slab.h>
+#include <linux/iversion.h>
 #include "delayed-inode.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -87,6 +88,7 @@ static struct btrfs_delayed_node *btrfs_get_delayed_node(
 
 	spin_lock(&root->inode_lock);
 	node = radix_tree_lookup(&root->delayed_nodes_tree, ino);
+
 	if (node) {
 		if (btrfs_inode->delayed_node) {
 			refcount_inc(&node->refs);	/* can be accessed */
@@ -94,9 +96,30 @@ static struct btrfs_delayed_node *btrfs_get_delayed_node(
 			spin_unlock(&root->inode_lock);
 			return node;
 		}
-		btrfs_inode->delayed_node = node;
-		/* can be accessed and cached in the inode */
-		refcount_add(2, &node->refs);
+
+		/*
+		 * It's possible that we're racing into the middle of removing
+		 * this node from the radix tree.  In this case, the refcount
+		 * was zero and it should never go back to one.  Just return
+		 * NULL like it was never in the radix at all; our release
+		 * function is in the process of removing it.
+		 *
+		 * Some implementations of refcount_inc refuse to bump the
+		 * refcount once it has hit zero.  If we don't do this dance
+		 * here, refcount_inc() may decide to just WARN_ONCE() instead
+		 * of actually bumping the refcount.
+		 *
+		 * If this node is properly in the radix, we want to bump the
+		 * refcount twice, once for the inode and once for this get
+		 * operation.
+		 */
+		if (refcount_inc_not_zero(&node->refs)) {
+			refcount_inc(&node->refs);
+			btrfs_inode->delayed_node = node;
+		} else {
+			node = NULL;
+		}
+
 		spin_unlock(&root->inode_lock);
 		return node;
 	}
@@ -254,17 +277,18 @@ static void __btrfs_release_delayed_node(
 	mutex_unlock(&delayed_node->mutex);
 
 	if (refcount_dec_and_test(&delayed_node->refs)) {
-		bool free = false;
 		struct btrfs_root *root = delayed_node->root;
+
 		spin_lock(&root->inode_lock);
-		if (refcount_read(&delayed_node->refs) == 0) {
-			radix_tree_delete(&root->delayed_nodes_tree,
-					  delayed_node->inode_id);
-			free = true;
-		}
+		/*
+		 * Once our refcount goes to zero, nobody is allowed to bump it
+		 * back up.  We can delete it now.
+		 */
+		ASSERT(refcount_read(&delayed_node->refs) == 0);
+		radix_tree_delete(&root->delayed_nodes_tree,
+				  delayed_node->inode_id);
 		spin_unlock(&root->inode_lock);
-		if (free)
-			kmem_cache_free(delayed_node_cache, delayed_node);
+		kmem_cache_free(delayed_node_cache, delayed_node);
 	}
 }
 
@@ -1279,40 +1303,42 @@ static void btrfs_async_run_delayed_root(struct btrfs_work *work)
 	if (!path)
 		goto out;
 
-again:
-	if (atomic_read(&delayed_root->items) < BTRFS_DELAYED_BACKGROUND / 2)
-		goto free_path;
+	do {
+		if (atomic_read(&delayed_root->items) <
+		    BTRFS_DELAYED_BACKGROUND / 2)
+			break;
 
-	delayed_node = btrfs_first_prepared_delayed_node(delayed_root);
-	if (!delayed_node)
-		goto free_path;
+		delayed_node = btrfs_first_prepared_delayed_node(delayed_root);
+		if (!delayed_node)
+			break;
 
-	path->leave_spinning = 1;
-	root = delayed_node->root;
+		path->leave_spinning = 1;
+		root = delayed_node->root;
 
-	trans = btrfs_join_transaction(root);
-	if (IS_ERR(trans))
-		goto release_path;
+		trans = btrfs_join_transaction(root);
+		if (IS_ERR(trans)) {
+			btrfs_release_path(path);
+			btrfs_release_prepared_delayed_node(delayed_node);
+			total_done++;
+			continue;
+		}
 
-	block_rsv = trans->block_rsv;
-	trans->block_rsv = &root->fs_info->delayed_block_rsv;
+		block_rsv = trans->block_rsv;
+		trans->block_rsv = &root->fs_info->delayed_block_rsv;
 
-	__btrfs_commit_inode_delayed_items(trans, path, delayed_node);
+		__btrfs_commit_inode_delayed_items(trans, path, delayed_node);
 
-	trans->block_rsv = block_rsv;
-	btrfs_end_transaction(trans);
-	btrfs_btree_balance_dirty_nodelay(root->fs_info);
+		trans->block_rsv = block_rsv;
+		btrfs_end_transaction(trans);
+		btrfs_btree_balance_dirty_nodelay(root->fs_info);
 
-release_path:
-	btrfs_release_path(path);
-	total_done++;
+		btrfs_release_path(path);
+		btrfs_release_prepared_delayed_node(delayed_node);
+		total_done++;
 
-	btrfs_release_prepared_delayed_node(delayed_node);
-	if ((async_work->nr == 0 && total_done < BTRFS_DELAYED_WRITEBACK) ||
-	    total_done < async_work->nr)
-		goto again;
+	} while ((async_work->nr == 0 && total_done < BTRFS_DELAYED_WRITEBACK)
+		 || total_done < async_work->nr);
 
-free_path:
 	btrfs_free_path(path);
 out:
 	wake_up(&delayed_root->wait);
@@ -1325,10 +1351,6 @@ static int btrfs_wq_run_delayed_node(struct btrfs_delayed_root *delayed_root,
 {
 	struct btrfs_async_delayed_work *async_work;
 
-	if (atomic_read(&delayed_root->items) < BTRFS_DELAYED_BACKGROUND ||
-	    btrfs_workqueue_normal_congested(fs_info->delayed_workers))
-		return 0;
-
 	async_work = kmalloc(sizeof(*async_work), GFP_NOFS);
 	if (!async_work)
 		return -ENOMEM;
@@ -1364,7 +1386,8 @@ void btrfs_balance_delayed_items(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_delayed_root *delayed_root = fs_info->delayed_root;
 
-	if (atomic_read(&delayed_root->items) < BTRFS_DELAYED_BACKGROUND)
+	if ((atomic_read(&delayed_root->items) < BTRFS_DELAYED_BACKGROUND) ||
+		btrfs_workqueue_normal_congested(fs_info->delayed_workers))
 		return;
 
 	if (atomic_read(&delayed_root->items) >= BTRFS_DELAYED_WRITEBACK) {
@@ -1610,28 +1633,18 @@ void btrfs_readdir_put_delayed_items(struct inode *inode,
 int btrfs_should_delete_dir_index(struct list_head *del_list,
 				  u64 index)
 {
-	struct btrfs_delayed_item *curr, *next;
-	int ret;
+	struct btrfs_delayed_item *curr;
+	int ret = 0;
 
-	if (list_empty(del_list))
-		return 0;
-
-	list_for_each_entry_safe(curr, next, del_list, readdir_list) {
+	list_for_each_entry(curr, del_list, readdir_list) {
 		if (curr->key.offset > index)
 			break;
-
-		list_del(&curr->readdir_list);
-		ret = (curr->key.offset == index);
-
-		if (refcount_dec_and_test(&curr->refs))
-			kfree(curr);
-
-		if (ret)
-			return 1;
-		else
-			continue;
+		if (curr->key.offset == index) {
+			ret = 1;
+			break;
+		}
 	}
-	return 0;
+	return ret;
 }
 
 /*
@@ -1700,7 +1713,8 @@ static void fill_stack_inode_item(struct btrfs_trans_handle *trans,
 	btrfs_set_stack_inode_nbytes(inode_item, inode_get_bytes(inode));
 	btrfs_set_stack_inode_generation(inode_item,
 					 BTRFS_I(inode)->generation);
-	btrfs_set_stack_inode_sequence(inode_item, inode->i_version);
+	btrfs_set_stack_inode_sequence(inode_item,
+				       inode_peek_iversion(inode));
 	btrfs_set_stack_inode_transid(inode_item, trans->transid);
 	btrfs_set_stack_inode_rdev(inode_item, inode->i_rdev);
 	btrfs_set_stack_inode_flags(inode_item, BTRFS_I(inode)->flags);
@@ -1754,7 +1768,8 @@ int btrfs_fill_inode(struct inode *inode, u32 *rdev)
 	BTRFS_I(inode)->generation = btrfs_stack_inode_generation(inode_item);
         BTRFS_I(inode)->last_trans = btrfs_stack_inode_transid(inode_item);
 
-	inode->i_version = btrfs_stack_inode_sequence(inode_item);
+	inode_set_iversion_queried(inode,
+				   btrfs_stack_inode_sequence(inode_item));
 	inode->i_rdev = 0;
 	*rdev = btrfs_stack_inode_rdev(inode_item);
 	BTRFS_I(inode)->flags = btrfs_stack_inode_flags(inode_item);
diff --git a/fs/btrfs/delayed-ref.c b/fs/btrfs/delayed-ref.c
index 83be8f9..a1a40cf 100644
--- a/fs/btrfs/delayed-ref.c
+++ b/fs/btrfs/delayed-ref.c
@@ -937,7 +937,7 @@ void btrfs_delayed_ref_exit(void)
 	kmem_cache_destroy(btrfs_delayed_extent_op_cachep);
 }
 
-int btrfs_delayed_ref_init(void)
+int __init btrfs_delayed_ref_init(void)
 {
 	btrfs_delayed_ref_head_cachep = kmem_cache_create(
 				"btrfs_delayed_ref_head",
diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h
index a43af43..c4f625e 100644
--- a/fs/btrfs/delayed-ref.h
+++ b/fs/btrfs/delayed-ref.h
@@ -203,7 +203,7 @@ extern struct kmem_cache *btrfs_delayed_tree_ref_cachep;
 extern struct kmem_cache *btrfs_delayed_data_ref_cachep;
 extern struct kmem_cache *btrfs_delayed_extent_op_cachep;
 
-int btrfs_delayed_ref_init(void);
+int __init btrfs_delayed_ref_init(void);
 void btrfs_delayed_ref_exit(void);
 
 static inline struct btrfs_delayed_extent_op *
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c
index 7c655f9..7efbc4d 100644
--- a/fs/btrfs/dev-replace.c
+++ b/fs/btrfs/dev-replace.c
@@ -172,7 +172,8 @@ int btrfs_init_dev_replace(struct btrfs_fs_info *fs_info)
 				dev_replace->tgtdev->commit_bytes_used =
 					dev_replace->srcdev->commit_bytes_used;
 			}
-			dev_replace->tgtdev->is_tgtdev_for_dev_replace = 1;
+			set_bit(BTRFS_DEV_STATE_REPLACE_TGT,
+				&dev_replace->tgtdev->dev_state);
 			btrfs_init_dev_replace_tgtdev_for_resume(fs_info,
 				dev_replace->tgtdev);
 		}
@@ -304,6 +305,14 @@ void btrfs_after_dev_replace_commit(struct btrfs_fs_info *fs_info)
 		dev_replace->cursor_left_last_write_of_item;
 }
 
+static char* btrfs_dev_name(struct btrfs_device *device)
+{
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
+		return "<missing disk>";
+	else
+		return rcu_str_deref(device->name);
+}
+
 int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info,
 		const char *tgtdev_name, u64 srcdevid, const char *srcdev_name,
 		int read_src)
@@ -363,8 +372,7 @@ int btrfs_dev_replace_start(struct btrfs_fs_info *fs_info,
 
 	btrfs_info_in_rcu(fs_info,
 		      "dev_replace from %s (devid %llu) to %s started",
-		      src_device->missing ? "<missing disk>" :
-		        rcu_str_deref(src_device->name),
+		      btrfs_dev_name(src_device),
 		      src_device->devid,
 		      rcu_str_deref(tgt_device->name));
 
@@ -538,8 +546,7 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 	} else {
 		btrfs_err_in_rcu(fs_info,
 				 "btrfs_scrub_dev(%s, %llu, %s) failed %d",
-				 src_device->missing ? "<missing disk>" :
-				 rcu_str_deref(src_device->name),
+				 btrfs_dev_name(src_device),
 				 src_device->devid,
 				 rcu_str_deref(tgt_device->name), scrub_ret);
 		btrfs_dev_replace_unlock(dev_replace, 1);
@@ -557,11 +564,10 @@ static int btrfs_dev_replace_finishing(struct btrfs_fs_info *fs_info,
 
 	btrfs_info_in_rcu(fs_info,
 			  "dev_replace from %s (devid %llu) to %s finished",
-			  src_device->missing ? "<missing disk>" :
-			  rcu_str_deref(src_device->name),
+			  btrfs_dev_name(src_device),
 			  src_device->devid,
 			  rcu_str_deref(tgt_device->name));
-	tgt_device->is_tgtdev_for_dev_replace = 0;
+	clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &tgt_device->dev_state);
 	tgt_device->devid = src_device->devid;
 	src_device->devid = BTRFS_DEV_REPLACE_DEVID;
 	memcpy(uuid_tmp, tgt_device->uuid, sizeof(uuid_tmp));
@@ -814,12 +820,10 @@ static int btrfs_dev_replace_kthread(void *data)
 	progress = btrfs_dev_replace_progress(fs_info);
 	progress = div_u64(progress, 10);
 	btrfs_info_in_rcu(fs_info,
-		"continuing dev_replace from %s (devid %llu) to %s @%u%%",
-		dev_replace->srcdev->missing ? "<missing disk>"
-			: rcu_str_deref(dev_replace->srcdev->name),
+		"continuing dev_replace from %s (devid %llu) to target %s @%u%%",
+		btrfs_dev_name(dev_replace->srcdev),
 		dev_replace->srcdev->devid,
-		dev_replace->tgtdev ? rcu_str_deref(dev_replace->tgtdev->name)
-			: "<missing target disk>",
+		btrfs_dev_name(dev_replace->tgtdev),
 		(unsigned int)progress);
 
 	btrfs_dev_replace_continue_on_mount(fs_info);
diff --git a/fs/btrfs/dir-item.c b/fs/btrfs/dir-item.c
index 41cb919..cbe4216 100644
--- a/fs/btrfs/dir-item.c
+++ b/fs/btrfs/dir-item.c
@@ -403,8 +403,6 @@ struct btrfs_dir_item *btrfs_match_dir_item_name(struct btrfs_fs_info *fs_info,
 			btrfs_dir_data_len(leaf, dir_item);
 		name_ptr = (unsigned long)(dir_item + 1);
 
-		if (verify_dir_item(fs_info, leaf, path->slots[0], dir_item))
-			return NULL;
 		if (btrfs_dir_name_len(leaf, dir_item) == name_len &&
 		    memcmp_extent_buffer(leaf, name, name_ptr, name_len) == 0)
 			return dir_item;
@@ -450,109 +448,3 @@ int btrfs_delete_one_dir_name(struct btrfs_trans_handle *trans,
 	}
 	return ret;
 }
-
-int verify_dir_item(struct btrfs_fs_info *fs_info,
-		    struct extent_buffer *leaf,
-		    int slot,
-		    struct btrfs_dir_item *dir_item)
-{
-	u16 namelen = BTRFS_NAME_LEN;
-	int ret;
-	u8 type = btrfs_dir_type(leaf, dir_item);
-
-	if (type >= BTRFS_FT_MAX) {
-		btrfs_crit(fs_info, "invalid dir item type: %d", (int)type);
-		return 1;
-	}
-
-	if (type == BTRFS_FT_XATTR)
-		namelen = XATTR_NAME_MAX;
-
-	if (btrfs_dir_name_len(leaf, dir_item) > namelen) {
-		btrfs_crit(fs_info, "invalid dir item name len: %u",
-		       (unsigned)btrfs_dir_name_len(leaf, dir_item));
-		return 1;
-	}
-
-	namelen = btrfs_dir_name_len(leaf, dir_item);
-	ret = btrfs_is_name_len_valid(leaf, slot,
-				      (unsigned long)(dir_item + 1), namelen);
-	if (!ret)
-		return 1;
-
-	/* BTRFS_MAX_XATTR_SIZE is the same for all dir items */
-	if ((btrfs_dir_data_len(leaf, dir_item) +
-	     btrfs_dir_name_len(leaf, dir_item)) >
-					BTRFS_MAX_XATTR_SIZE(fs_info)) {
-		btrfs_crit(fs_info, "invalid dir item name + data len: %u + %u",
-			   (unsigned)btrfs_dir_name_len(leaf, dir_item),
-			   (unsigned)btrfs_dir_data_len(leaf, dir_item));
-		return 1;
-	}
-
-	return 0;
-}
-
-bool btrfs_is_name_len_valid(struct extent_buffer *leaf, int slot,
-			     unsigned long start, u16 name_len)
-{
-	struct btrfs_fs_info *fs_info = leaf->fs_info;
-	struct btrfs_key key;
-	u32 read_start;
-	u32 read_end;
-	u32 item_start;
-	u32 item_end;
-	u32 size;
-	bool ret = true;
-
-	ASSERT(start > BTRFS_LEAF_DATA_OFFSET);
-
-	read_start = start - BTRFS_LEAF_DATA_OFFSET;
-	read_end = read_start + name_len;
-	item_start = btrfs_item_offset_nr(leaf, slot);
-	item_end = btrfs_item_end_nr(leaf, slot);
-
-	btrfs_item_key_to_cpu(leaf, &key, slot);
-
-	switch (key.type) {
-	case BTRFS_DIR_ITEM_KEY:
-	case BTRFS_XATTR_ITEM_KEY:
-	case BTRFS_DIR_INDEX_KEY:
-		size = sizeof(struct btrfs_dir_item);
-		break;
-	case BTRFS_INODE_REF_KEY:
-		size = sizeof(struct btrfs_inode_ref);
-		break;
-	case BTRFS_INODE_EXTREF_KEY:
-		size = sizeof(struct btrfs_inode_extref);
-		break;
-	case BTRFS_ROOT_REF_KEY:
-	case BTRFS_ROOT_BACKREF_KEY:
-		size = sizeof(struct btrfs_root_ref);
-		break;
-	default:
-		ret = false;
-		goto out;
-	}
-
-	if (read_start < item_start) {
-		ret = false;
-		goto out;
-	}
-	if (read_end > item_end) {
-		ret = false;
-		goto out;
-	}
-
-	/* there shall be item(s) before name */
-	if (read_start - item_start < size) {
-		ret = false;
-		goto out;
-	}
-
-out:
-	if (!ret)
-		btrfs_crit(fs_info, "invalid dir item name len: %u",
-			   (unsigned int)name_len);
-	return ret;
-}
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index a8ecccf..ed09520 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -61,7 +61,8 @@
 				 BTRFS_HEADER_FLAG_RELOC |\
 				 BTRFS_SUPER_FLAG_ERROR |\
 				 BTRFS_SUPER_FLAG_SEEDING |\
-				 BTRFS_SUPER_FLAG_METADUMP)
+				 BTRFS_SUPER_FLAG_METADUMP |\
+				 BTRFS_SUPER_FLAG_METADUMP_V2)
 
 static const struct extent_io_ops btree_extent_io_ops;
 static void end_workqueue_fn(struct btrfs_work *work);
@@ -220,7 +221,7 @@ void btrfs_set_buffer_lockdep_class(u64 objectid, struct extent_buffer *eb,
  * extents on the btree inode are pretty simple, there's one extent
  * that covers the entire device
  */
-static struct extent_map *btree_get_extent(struct btrfs_inode *inode,
+struct extent_map *btree_get_extent(struct btrfs_inode *inode,
 		struct page *page, size_t pg_offset, u64 start, u64 len,
 		int create)
 {
@@ -285,7 +286,7 @@ static int csum_tree_block(struct btrfs_fs_info *fs_info,
 			   int verify)
 {
 	u16 csum_size = btrfs_super_csum_size(fs_info->super_copy);
-	char *result = NULL;
+	char result[BTRFS_CSUM_SIZE];
 	unsigned long len;
 	unsigned long cur_len;
 	unsigned long offset = BTRFS_CSUM_SIZE;
@@ -294,7 +295,6 @@ static int csum_tree_block(struct btrfs_fs_info *fs_info,
 	unsigned long map_len;
 	int err;
 	u32 crc = ~(u32)0;
-	unsigned long inline_result;
 
 	len = buf->len - offset;
 	while (len > 0) {
@@ -308,13 +308,7 @@ static int csum_tree_block(struct btrfs_fs_info *fs_info,
 		len -= cur_len;
 		offset += cur_len;
 	}
-	if (csum_size > sizeof(inline_result)) {
-		result = kzalloc(csum_size, GFP_NOFS);
-		if (!result)
-			return -ENOMEM;
-	} else {
-		result = (char *)&inline_result;
-	}
+	memset(result, 0, BTRFS_CSUM_SIZE);
 
 	btrfs_csum_final(crc, result);
 
@@ -329,15 +323,12 @@ static int csum_tree_block(struct btrfs_fs_info *fs_info,
 				"%s checksum verify failed on %llu wanted %X found %X level %d",
 				fs_info->sb->s_id, buf->start,
 				val, found, btrfs_header_level(buf));
-			if (result != (char *)&inline_result)
-				kfree(result);
 			return -EUCLEAN;
 		}
 	} else {
 		write_extent_buffer(buf, result, 0, csum_size);
 	}
-	if (result != (char *)&inline_result)
-		kfree(result);
+
 	return 0;
 }
 
@@ -391,7 +382,7 @@ static int verify_parent_transid(struct extent_io_tree *io_tree,
 		clear_extent_buffer_uptodate(eb);
 out:
 	unlock_extent_cached(io_tree, eb->start, eb->start + eb->len - 1,
-			     &cached_state, GFP_NOFS);
+			     &cached_state);
 	if (need_lock)
 		btrfs_tree_read_unlock_blocking(eb);
 	return ret;
@@ -455,7 +446,7 @@ static int btree_read_extent_buffer_pages(struct btrfs_fs_info *fs_info,
 	io_tree = &BTRFS_I(fs_info->btree_inode)->io_tree;
 	while (1) {
 		ret = read_extent_buffer_pages(io_tree, eb, WAIT_COMPLETE,
-					       btree_get_extent, mirror_num);
+					       mirror_num);
 		if (!ret) {
 			if (!verify_parent_transid(io_tree, eb,
 						   parent_transid, 0))
@@ -1012,7 +1003,7 @@ void readahead_tree_block(struct btrfs_fs_info *fs_info, u64 bytenr)
 	if (IS_ERR(buf))
 		return;
 	read_extent_buffer_pages(&BTRFS_I(btree_inode)->io_tree,
-				 buf, WAIT_NONE, btree_get_extent, 0);
+				 buf, WAIT_NONE, 0);
 	free_extent_buffer(buf);
 }
 
@@ -1031,7 +1022,7 @@ int reada_tree_block_flagged(struct btrfs_fs_info *fs_info, u64 bytenr,
 	set_bit(EXTENT_BUFFER_READAHEAD, &buf->bflags);
 
 	ret = read_extent_buffer_pages(io_tree, buf, WAIT_PAGE_LOCK,
-				       btree_get_extent, mirror_num);
+				       mirror_num);
 	if (ret) {
 		free_extent_buffer(buf);
 		return ret;
@@ -1243,7 +1234,7 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 	struct btrfs_root *root;
 	struct btrfs_key key;
 	int ret = 0;
-	uuid_le uuid;
+	uuid_le uuid = NULL_UUID_LE;
 
 	root = btrfs_alloc_root(fs_info, GFP_KERNEL);
 	if (!root)
@@ -1284,7 +1275,8 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 	btrfs_set_root_used(&root->root_item, leaf->len);
 	btrfs_set_root_last_snapshot(&root->root_item, 0);
 	btrfs_set_root_dirid(&root->root_item, 0);
-	uuid_le_gen(&uuid);
+	if (is_fstree(objectid))
+		uuid_le_gen(&uuid);
 	memcpy(root->root_item.uuid, uuid.b, BTRFS_UUID_SIZE);
 	root->root_item.drop_level = 0;
 
@@ -2875,7 +2867,7 @@ int open_ctree(struct super_block *sb,
 		goto fail_sysfs;
 	}
 
-	if (!sb_rdonly(sb) && !btrfs_check_rw_degradable(fs_info)) {
+	if (!sb_rdonly(sb) && !btrfs_check_rw_degradable(fs_info, NULL)) {
 		btrfs_warn(fs_info,
 		"writeable mount is not allowed due to too many missing devices");
 		goto fail_sysfs;
@@ -3357,7 +3349,7 @@ static void write_dev_flush(struct btrfs_device *device)
 	bio->bi_private = &device->flush_wait;
 
 	btrfsic_submit_bio(bio);
-	device->flush_bio_sent = 1;
+	set_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state);
 }
 
 /*
@@ -3367,10 +3359,10 @@ static blk_status_t wait_dev_flush(struct btrfs_device *device)
 {
 	struct bio *bio = device->flush_bio;
 
-	if (!device->flush_bio_sent)
+	if (!test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state))
 		return BLK_STS_OK;
 
-	device->flush_bio_sent = 0;
+	clear_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state);
 	wait_for_completion_io(&device->flush_wait);
 
 	return bio->bi_status;
@@ -3378,7 +3370,7 @@ static blk_status_t wait_dev_flush(struct btrfs_device *device)
 
 static int check_barrier_error(struct btrfs_fs_info *fs_info)
 {
-	if (!btrfs_check_rw_degradable(fs_info))
+	if (!btrfs_check_rw_degradable(fs_info, NULL))
 		return -EIO;
 	return 0;
 }
@@ -3394,14 +3386,16 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 	int errors_wait = 0;
 	blk_status_t ret;
 
+	lockdep_assert_held(&info->fs_devices->device_list_mutex);
 	/* send down all the barriers */
 	head = &info->fs_devices->devices;
-	list_for_each_entry_rcu(dev, head, dev_list) {
-		if (dev->missing)
+	list_for_each_entry(dev, head, dev_list) {
+		if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state))
 			continue;
 		if (!dev->bdev)
 			continue;
-		if (!dev->in_fs_metadata || !dev->writeable)
+		if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &dev->dev_state) ||
+		    !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))
 			continue;
 
 		write_dev_flush(dev);
@@ -3409,14 +3403,15 @@ static int barrier_all_devices(struct btrfs_fs_info *info)
 	}
 
 	/* wait for all the barriers */
-	list_for_each_entry_rcu(dev, head, dev_list) {
-		if (dev->missing)
+	list_for_each_entry(dev, head, dev_list) {
+		if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state))
 			continue;
 		if (!dev->bdev) {
 			errors_wait++;
 			continue;
 		}
-		if (!dev->in_fs_metadata || !dev->writeable)
+		if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &dev->dev_state) ||
+		    !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))
 			continue;
 
 		ret = wait_dev_flush(dev);
@@ -3508,12 +3503,13 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors)
 		}
 	}
 
-	list_for_each_entry_rcu(dev, head, dev_list) {
+	list_for_each_entry(dev, head, dev_list) {
 		if (!dev->bdev) {
 			total_errors++;
 			continue;
 		}
-		if (!dev->in_fs_metadata || !dev->writeable)
+		if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &dev->dev_state) ||
+		    !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))
 			continue;
 
 		btrfs_set_stack_device_generation(dev_item, 0);
@@ -3549,10 +3545,11 @@ int write_all_supers(struct btrfs_fs_info *fs_info, int max_mirrors)
 	}
 
 	total_errors = 0;
-	list_for_each_entry_rcu(dev, head, dev_list) {
+	list_for_each_entry(dev, head, dev_list) {
 		if (!dev->bdev)
 			continue;
-		if (!dev->in_fs_metadata || !dev->writeable)
+		if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &dev->dev_state) ||
+		    !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))
 			continue;
 
 		ret = wait_dev_supers(dev, max_mirrors);
@@ -3910,9 +3907,11 @@ static int btrfs_check_super_valid(struct btrfs_fs_info *fs_info)
 		btrfs_err(fs_info, "no valid FS found");
 		ret = -EINVAL;
 	}
-	if (btrfs_super_flags(sb) & ~BTRFS_SUPER_FLAG_SUPP)
-		btrfs_warn(fs_info, "unrecognized super flag: %llu",
+	if (btrfs_super_flags(sb) & ~BTRFS_SUPER_FLAG_SUPP) {
+		btrfs_err(fs_info, "unrecognized or unsupported super flag: %llu",
 				btrfs_super_flags(sb) & ~BTRFS_SUPER_FLAG_SUPP);
+		ret = -EINVAL;
+	}
 	if (btrfs_super_root_level(sb) >= BTRFS_MAX_LEVEL) {
 		btrfs_err(fs_info, "tree_root level too big: %d >= %d",
 				btrfs_super_root_level(sb), BTRFS_MAX_LEVEL);
diff --git a/fs/btrfs/disk-io.h b/fs/btrfs/disk-io.h
index 7f7c35d..301151a 100644
--- a/fs/btrfs/disk-io.h
+++ b/fs/btrfs/disk-io.h
@@ -149,6 +149,9 @@ struct btrfs_root *btrfs_create_tree(struct btrfs_trans_handle *trans,
 				     u64 objectid);
 int btree_lock_page_hook(struct page *page, void *data,
 				void (*flush_fn)(void *));
+struct extent_map *btree_get_extent(struct btrfs_inode *inode,
+		struct page *page, size_t pg_offset, u64 start, u64 len,
+		int create);
 int btrfs_get_num_tolerated_disk_barrier_failures(u64 flags);
 int __init btrfs_end_io_wq_init(void);
 void btrfs_end_io_wq_exit(void);
diff --git a/fs/btrfs/export.c b/fs/btrfs/export.c
index 3aeb577..ddaccad 100644
--- a/fs/btrfs/export.c
+++ b/fs/btrfs/export.c
@@ -283,11 +283,6 @@ static int btrfs_get_name(struct dentry *parent, char *name,
 		name_len = btrfs_inode_ref_name_len(leaf, iref);
 	}
 
-	ret = btrfs_is_name_len_valid(leaf, path->slots[0], name_ptr, name_len);
-	if (!ret) {
-		btrfs_free_path(path);
-		return -EIO;
-	}
 	read_extent_buffer(leaf, name, name_ptr, name_len);
 	btrfs_free_path(path);
 
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 2f43285..05751a6 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -2145,7 +2145,10 @@ int btrfs_discard_extent(struct btrfs_fs_info *fs_info, u64 bytenr,
 
 		for (i = 0; i < bbio->num_stripes; i++, stripe++) {
 			u64 bytes;
-			if (!stripe->dev->can_discard)
+			struct request_queue *req_q;
+
+			req_q = bdev_get_queue(stripe->dev->bdev);
+			if (!blk_queue_discard(req_q))
 				continue;
 
 			ret = btrfs_issue_discard(stripe->dev->bdev,
@@ -2894,7 +2897,7 @@ int btrfs_check_space_for_delayed_refs(struct btrfs_trans_handle *trans,
 	struct btrfs_block_rsv *global_rsv;
 	u64 num_heads = trans->transaction->delayed_refs.num_heads_ready;
 	u64 csum_bytes = trans->transaction->delayed_refs.pending_csums;
-	u64 num_dirty_bgs = trans->transaction->num_dirty_bgs;
+	unsigned int num_dirty_bgs = trans->transaction->num_dirty_bgs;
 	u64 num_bytes, num_dirty_bgs_bytes;
 	int ret = 0;
 
@@ -4945,12 +4948,12 @@ static int may_commit_transaction(struct btrfs_fs_info *fs_info,
 		bytes = 0;
 	else
 		bytes -= delayed_rsv->size;
+	spin_unlock(&delayed_rsv->lock);
+
 	if (percpu_counter_compare(&space_info->total_bytes_pinned,
 				   bytes) < 0) {
-		spin_unlock(&delayed_rsv->lock);
 		return -ENOSPC;
 	}
-	spin_unlock(&delayed_rsv->lock);
 
 commit:
 	trans = btrfs_join_transaction(fs_info->extent_root);
@@ -5738,8 +5741,8 @@ int btrfs_block_rsv_refill(struct btrfs_root *root,
  * or return if we already have enough space.  This will also handle the resreve
  * tracepoint for the reserved amount.
  */
-int btrfs_inode_rsv_refill(struct btrfs_inode *inode,
-			   enum btrfs_reserve_flush_enum flush)
+static int btrfs_inode_rsv_refill(struct btrfs_inode *inode,
+				  enum btrfs_reserve_flush_enum flush)
 {
 	struct btrfs_root *root = inode->root;
 	struct btrfs_block_rsv *block_rsv = &inode->block_rsv;
@@ -5770,7 +5773,7 @@ int btrfs_inode_rsv_refill(struct btrfs_inode *inode,
  * This is the same as btrfs_block_rsv_release, except that it handles the
  * tracepoint for the reservation.
  */
-void btrfs_inode_rsv_release(struct btrfs_inode *inode)
+static void btrfs_inode_rsv_release(struct btrfs_inode *inode)
 {
 	struct btrfs_fs_info *fs_info = inode->root->fs_info;
 	struct btrfs_block_rsv *global_rsv = &fs_info->global_block_rsv;
@@ -9690,7 +9693,7 @@ int btrfs_can_relocate(struct btrfs_fs_info *fs_info, u64 bytenr)
 		 * space to fit our block group in.
 		 */
 		if (device->total_bytes > device->bytes_used + min_free &&
-		    !device->is_tgtdev_for_dev_replace) {
+		    !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
 			ret = find_free_dev_extent(trans, device, min_free,
 						   &dev_offset, NULL);
 			if (!ret)
@@ -10875,7 +10878,7 @@ static int btrfs_trim_free_extents(struct btrfs_device *device,
 	*trimmed = 0;
 
 	/* Not writeable = nothing to do. */
-	if (!device->writeable)
+	if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
 		return 0;
 
 	/* No free space = nothing to do. */
diff --git a/fs/btrfs/extent_io.c b/fs/btrfs/extent_io.c
index 012d638..dfeb74a 100644
--- a/fs/btrfs/extent_io.c
+++ b/fs/btrfs/extent_io.c
@@ -21,6 +21,7 @@
 #include "locking.h"
 #include "rcu-string.h"
 #include "backref.h"
+#include "disk-io.h"
 
 static struct kmem_cache *extent_state_cache;
 static struct kmem_cache *extent_buffer_cache;
@@ -109,8 +110,6 @@ struct tree_entry {
 struct extent_page_data {
 	struct bio *bio;
 	struct extent_io_tree *tree;
-	get_extent_t *get_extent;
-
 	/* tells writepage not to lock the state bits for this range
 	 * it still does the unlocking
 	 */
@@ -139,7 +138,8 @@ static void add_extent_changeset(struct extent_state *state, unsigned bits,
 	BUG_ON(ret < 0);
 }
 
-static noinline void flush_write_bio(void *data);
+static void flush_write_bio(struct extent_page_data *epd);
+
 static inline struct btrfs_fs_info *
 tree_fs_info(struct extent_io_tree *tree)
 {
@@ -581,7 +581,7 @@ static void extent_io_tree_panic(struct extent_io_tree *tree, int err)
  *
  * This takes the tree lock, and returns 0 on success and < 0 on error.
  */
-static int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
+int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 			      unsigned bits, int wake, int delete,
 			      struct extent_state **cached_state,
 			      gfp_t mask, struct extent_changeset *changeset)
@@ -1295,10 +1295,10 @@ int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end,
 
 int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 		     unsigned bits, int wake, int delete,
-		     struct extent_state **cached, gfp_t mask)
+		     struct extent_state **cached)
 {
 	return __clear_extent_bit(tree, start, end, bits, wake, delete,
-				  cached, mask, NULL);
+				  cached, GFP_NOFS, NULL);
 }
 
 int clear_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end,
@@ -1348,7 +1348,7 @@ int try_lock_extent(struct extent_io_tree *tree, u64 start, u64 end)
 	if (err == -EEXIST) {
 		if (failed_start > start)
 			clear_extent_bit(tree, start, failed_start - 1,
-					 EXTENT_LOCKED, 1, 0, NULL, GFP_NOFS);
+					 EXTENT_LOCKED, 1, 0, NULL);
 		return 0;
 	}
 	return 1;
@@ -1648,7 +1648,7 @@ STATIC u64 find_lock_delalloc_range(struct inode *inode,
 			     EXTENT_DELALLOC, 1, cached_state);
 	if (!ret) {
 		unlock_extent_cached(tree, delalloc_start, delalloc_end,
-				     &cached_state, GFP_NOFS);
+				     &cached_state);
 		__unlock_for_delalloc(inode, locked_page,
 			      delalloc_start, delalloc_end);
 		cond_resched();
@@ -1744,7 +1744,7 @@ void extent_clear_unlock_delalloc(struct inode *inode, u64 start, u64 end,
 				 unsigned long page_ops)
 {
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, start, end, clear_bits, 1, 0,
-			 NULL, GFP_NOFS);
+			 NULL);
 
 	__process_pages_contig(inode->i_mapping, locked_page,
 			       start >> PAGE_SHIFT, end >> PAGE_SHIFT,
@@ -2027,7 +2027,8 @@ int repair_io_failure(struct btrfs_fs_info *fs_info, u64 ino, u64 start,
 	bio->bi_iter.bi_sector = sector;
 	dev = bbio->stripes[bbio->mirror_num - 1].dev;
 	btrfs_put_bbio(bbio);
-	if (!dev || !dev->bdev || !dev->writeable) {
+	if (!dev || !dev->bdev ||
+	    !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state)) {
 		btrfs_bio_counter_dec(fs_info);
 		bio_put(bio);
 		return -EIO;
@@ -2257,7 +2258,7 @@ int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
 	return 0;
 }
 
-bool btrfs_check_repairable(struct inode *inode, struct bio *failed_bio,
+bool btrfs_check_repairable(struct inode *inode, unsigned failed_bio_pages,
 			   struct io_failure_record *failrec, int failed_mirror)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -2281,7 +2282,7 @@ bool btrfs_check_repairable(struct inode *inode, struct bio *failed_bio,
 	 *	a) deliver good data to the caller
 	 *	b) correct the bad sectors on disk
 	 */
-	if (failed_bio->bi_vcnt > 1) {
+	if (failed_bio_pages > 1) {
 		/*
 		 * to fulfill b), we need to know the exact failing sectors, as
 		 * we don't want to rewrite any more than the failed ones. thus,
@@ -2374,6 +2375,7 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
 	int read_mode = 0;
 	blk_status_t status;
 	int ret;
+	unsigned failed_bio_pages = bio_pages_all(failed_bio);
 
 	BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
 
@@ -2381,13 +2383,13 @@ static int bio_readpage_error(struct bio *failed_bio, u64 phy_offset,
 	if (ret)
 		return ret;
 
-	if (!btrfs_check_repairable(inode, failed_bio, failrec,
+	if (!btrfs_check_repairable(inode, failed_bio_pages, failrec,
 				    failed_mirror)) {
 		free_io_failure(failure_tree, tree, failrec);
 		return -EIO;
 	}
 
-	if (failed_bio->bi_vcnt > 1)
+	if (failed_bio_pages > 1)
 		read_mode |= REQ_FAILFAST_DEV;
 
 	phy_offset >>= inode->i_sb->s_blocksize_bits;
@@ -2492,7 +2494,7 @@ endio_readpage_release_extent(struct extent_io_tree *tree, u64 start, u64 len,
 
 	if (uptodate && tree->track_uptodate)
 		set_extent_uptodate(tree, start, end, &cached, GFP_ATOMIC);
-	unlock_extent_cached(tree, start, end, &cached, GFP_ATOMIC);
+	unlock_extent_cached_atomic(tree, start, end, &cached);
 }
 
 /*
@@ -2724,7 +2726,7 @@ static int __must_check submit_one_bio(struct bio *bio, int mirror_num,
 				       unsigned long bio_flags)
 {
 	blk_status_t ret = 0;
-	struct bio_vec *bvec = bio->bi_io_vec + bio->bi_vcnt - 1;
+	struct bio_vec *bvec = bio_last_bvec_all(bio);
 	struct page *page = bvec->bv_page;
 	struct extent_io_tree *tree = bio->bi_private;
 	u64 start;
@@ -2732,7 +2734,6 @@ static int __must_check submit_one_bio(struct bio *bio, int mirror_num,
 	start = page_offset(page) + bvec->bv_offset;
 
 	bio->bi_private = NULL;
-	bio_get(bio);
 
 	if (tree->ops)
 		ret = tree->ops->submit_bio_hook(tree->private_data, bio,
@@ -2740,7 +2741,6 @@ static int __must_check submit_one_bio(struct bio *bio, int mirror_num,
 	else
 		btrfsic_submit_bio(bio);
 
-	bio_put(bio);
 	return blk_status_to_errno(ret);
 }
 
@@ -2942,8 +2942,7 @@ static int __do_readpage(struct extent_io_tree *tree,
 			set_extent_uptodate(tree, cur, cur + iosize - 1,
 					    &cached, GFP_NOFS);
 			unlock_extent_cached(tree, cur,
-					     cur + iosize - 1,
-					     &cached, GFP_NOFS);
+					     cur + iosize - 1, &cached);
 			break;
 		}
 		em = __get_extent_map(inode, page, pg_offset, cur,
@@ -3036,8 +3035,7 @@ static int __do_readpage(struct extent_io_tree *tree,
 			set_extent_uptodate(tree, cur, cur + iosize - 1,
 					    &cached, GFP_NOFS);
 			unlock_extent_cached(tree, cur,
-					     cur + iosize - 1,
-					     &cached, GFP_NOFS);
+					     cur + iosize - 1, &cached);
 			cur = cur + iosize;
 			pg_offset += iosize;
 			continue;
@@ -3092,9 +3090,8 @@ static int __do_readpage(struct extent_io_tree *tree,
 static inline void __do_contiguous_readpages(struct extent_io_tree *tree,
 					     struct page *pages[], int nr_pages,
 					     u64 start, u64 end,
-					     get_extent_t *get_extent,
 					     struct extent_map **em_cached,
-					     struct bio **bio, int mirror_num,
+					     struct bio **bio,
 					     unsigned long *bio_flags,
 					     u64 *prev_em_start)
 {
@@ -3115,18 +3112,17 @@ static inline void __do_contiguous_readpages(struct extent_io_tree *tree,
 	}
 
 	for (index = 0; index < nr_pages; index++) {
-		__do_readpage(tree, pages[index], get_extent, em_cached, bio,
-			      mirror_num, bio_flags, 0, prev_em_start);
+		__do_readpage(tree, pages[index], btrfs_get_extent, em_cached,
+				bio, 0, bio_flags, 0, prev_em_start);
 		put_page(pages[index]);
 	}
 }
 
 static void __extent_readpages(struct extent_io_tree *tree,
 			       struct page *pages[],
-			       int nr_pages, get_extent_t *get_extent,
+			       int nr_pages,
 			       struct extent_map **em_cached,
-			       struct bio **bio, int mirror_num,
-			       unsigned long *bio_flags,
+			       struct bio **bio, unsigned long *bio_flags,
 			       u64 *prev_em_start)
 {
 	u64 start = 0;
@@ -3146,8 +3142,8 @@ static void __extent_readpages(struct extent_io_tree *tree,
 		} else {
 			__do_contiguous_readpages(tree, &pages[first_index],
 						  index - first_index, start,
-						  end, get_extent, em_cached,
-						  bio, mirror_num, bio_flags,
+						  end, em_cached,
+						  bio, bio_flags,
 						  prev_em_start);
 			start = page_start;
 			end = start + PAGE_SIZE - 1;
@@ -3158,9 +3154,8 @@ static void __extent_readpages(struct extent_io_tree *tree,
 	if (end)
 		__do_contiguous_readpages(tree, &pages[first_index],
 					  index - first_index, start,
-					  end, get_extent, em_cached, bio,
-					  mirror_num, bio_flags,
-					  prev_em_start);
+					  end, em_cached, bio,
+					  bio_flags, prev_em_start);
 }
 
 static int __extent_read_full_page(struct extent_io_tree *tree,
@@ -3375,7 +3370,7 @@ static noinline_for_stack int __extent_writepage_io(struct inode *inode,
 							 page_end, NULL, 1);
 			break;
 		}
-		em = epd->get_extent(BTRFS_I(inode), page, pg_offset, cur,
+		em = btrfs_get_extent(BTRFS_I(inode), page, pg_offset, cur,
 				     end - cur + 1, 1);
 		if (IS_ERR_OR_NULL(em)) {
 			SetPageError(page);
@@ -3458,10 +3453,9 @@ static noinline_for_stack int __extent_writepage_io(struct inode *inode,
  * and the end_io handler clears the writeback ranges
  */
 static int __extent_writepage(struct page *page, struct writeback_control *wbc,
-			      void *data)
+			      struct extent_page_data *epd)
 {
 	struct inode *inode = page->mapping->host;
-	struct extent_page_data *epd = data;
 	u64 start = page_offset(page);
 	u64 page_end = start + PAGE_SIZE - 1;
 	int ret;
@@ -3895,8 +3889,7 @@ int btree_write_cache_pages(struct address_space *mapping,
  * write_cache_pages - walk the list of dirty pages of the given address space and write all of them.
  * @mapping: address space structure to write
  * @wbc: subtract the number of written pages from *@wbc->nr_to_write
- * @writepage: function called for each page
- * @data: data passed to writepage function
+ * @data: data passed to __extent_writepage function
  *
  * If a page is already under I/O, write_cache_pages() skips it, even
  * if it's dirty.  This is desirable behaviour for memory-cleaning writeback,
@@ -3908,8 +3901,7 @@ int btree_write_cache_pages(struct address_space *mapping,
  */
 static int extent_write_cache_pages(struct address_space *mapping,
 			     struct writeback_control *wbc,
-			     writepage_t writepage, void *data,
-			     void (*flush_fn)(void *))
+			     struct extent_page_data *epd)
 {
 	struct inode *inode = mapping->host;
 	int ret = 0;
@@ -3973,7 +3965,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
 			 * mapping
 			 */
 			if (!trylock_page(page)) {
-				flush_fn(data);
+				flush_write_bio(epd);
 				lock_page(page);
 			}
 
@@ -3984,7 +3976,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
 
 			if (wbc->sync_mode != WB_SYNC_NONE) {
 				if (PageWriteback(page))
-					flush_fn(data);
+					flush_write_bio(epd);
 				wait_on_page_writeback(page);
 			}
 
@@ -3994,7 +3986,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
 				continue;
 			}
 
-			ret = (*writepage)(page, wbc, data);
+			ret = __extent_writepage(page, wbc, epd);
 
 			if (unlikely(ret == AOP_WRITEPAGE_ACTIVATE)) {
 				unlock_page(page);
@@ -4042,7 +4034,7 @@ static int extent_write_cache_pages(struct address_space *mapping,
 	return ret;
 }
 
-static void flush_epd_write_bio(struct extent_page_data *epd)
+static void flush_write_bio(struct extent_page_data *epd)
 {
 	if (epd->bio) {
 		int ret;
@@ -4053,37 +4045,28 @@ static void flush_epd_write_bio(struct extent_page_data *epd)
 	}
 }
 
-static noinline void flush_write_bio(void *data)
-{
-	struct extent_page_data *epd = data;
-	flush_epd_write_bio(epd);
-}
-
-int extent_write_full_page(struct extent_io_tree *tree, struct page *page,
-			  get_extent_t *get_extent,
-			  struct writeback_control *wbc)
+int extent_write_full_page(struct page *page, struct writeback_control *wbc)
 {
 	int ret;
 	struct extent_page_data epd = {
 		.bio = NULL,
-		.tree = tree,
-		.get_extent = get_extent,
+		.tree = &BTRFS_I(page->mapping->host)->io_tree,
 		.extent_locked = 0,
 		.sync_io = wbc->sync_mode == WB_SYNC_ALL,
 	};
 
 	ret = __extent_writepage(page, wbc, &epd);
 
-	flush_epd_write_bio(&epd);
+	flush_write_bio(&epd);
 	return ret;
 }
 
-int extent_write_locked_range(struct extent_io_tree *tree, struct inode *inode,
-			      u64 start, u64 end, get_extent_t *get_extent,
+int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 			      int mode)
 {
 	int ret = 0;
 	struct address_space *mapping = inode->i_mapping;
+	struct extent_io_tree *tree = &BTRFS_I(inode)->io_tree;
 	struct page *page;
 	unsigned long nr_pages = (end - start + PAGE_SIZE) >>
 		PAGE_SHIFT;
@@ -4091,7 +4074,6 @@ int extent_write_locked_range(struct extent_io_tree *tree, struct inode *inode,
 	struct extent_page_data epd = {
 		.bio = NULL,
 		.tree = tree,
-		.get_extent = get_extent,
 		.extent_locked = 1,
 		.sync_io = mode == WB_SYNC_ALL,
 	};
@@ -4117,34 +4099,30 @@ int extent_write_locked_range(struct extent_io_tree *tree, struct inode *inode,
 		start += PAGE_SIZE;
 	}
 
-	flush_epd_write_bio(&epd);
+	flush_write_bio(&epd);
 	return ret;
 }
 
 int extent_writepages(struct extent_io_tree *tree,
 		      struct address_space *mapping,
-		      get_extent_t *get_extent,
 		      struct writeback_control *wbc)
 {
 	int ret = 0;
 	struct extent_page_data epd = {
 		.bio = NULL,
 		.tree = tree,
-		.get_extent = get_extent,
 		.extent_locked = 0,
 		.sync_io = wbc->sync_mode == WB_SYNC_ALL,
 	};
 
-	ret = extent_write_cache_pages(mapping, wbc, __extent_writepage, &epd,
-				       flush_write_bio);
-	flush_epd_write_bio(&epd);
+	ret = extent_write_cache_pages(mapping, wbc, &epd);
+	flush_write_bio(&epd);
 	return ret;
 }
 
 int extent_readpages(struct extent_io_tree *tree,
 		     struct address_space *mapping,
-		     struct list_head *pages, unsigned nr_pages,
-		     get_extent_t get_extent)
+		     struct list_head *pages, unsigned nr_pages)
 {
 	struct bio *bio = NULL;
 	unsigned page_idx;
@@ -4170,13 +4148,13 @@ int extent_readpages(struct extent_io_tree *tree,
 		pagepool[nr++] = page;
 		if (nr < ARRAY_SIZE(pagepool))
 			continue;
-		__extent_readpages(tree, pagepool, nr, get_extent, &em_cached,
-				   &bio, 0, &bio_flags, &prev_em_start);
+		__extent_readpages(tree, pagepool, nr, &em_cached, &bio,
+				&bio_flags, &prev_em_start);
 		nr = 0;
 	}
 	if (nr)
-		__extent_readpages(tree, pagepool, nr, get_extent, &em_cached,
-				   &bio, 0, &bio_flags, &prev_em_start);
+		__extent_readpages(tree, pagepool, nr, &em_cached, &bio,
+				&bio_flags, &prev_em_start);
 
 	if (em_cached)
 		free_extent_map(em_cached);
@@ -4209,7 +4187,7 @@ int extent_invalidatepage(struct extent_io_tree *tree,
 	clear_extent_bit(tree, start, end,
 			 EXTENT_LOCKED | EXTENT_DIRTY | EXTENT_DELALLOC |
 			 EXTENT_DO_ACCOUNTING,
-			 1, 1, &cached_state, GFP_NOFS);
+			 1, 1, &cached_state);
 	return 0;
 }
 
@@ -4234,9 +4212,9 @@ static int try_release_extent_state(struct extent_map_tree *map,
 		 * at this point we can safely clear everything except the
 		 * locked bit and the nodatasum bit
 		 */
-		ret = clear_extent_bit(tree, start, end,
+		ret = __clear_extent_bit(tree, start, end,
 				 ~(EXTENT_LOCKED | EXTENT_NODATASUM),
-				 0, 0, NULL, mask);
+				 0, 0, NULL, mask, NULL);
 
 		/* if clear_extent_bit failed for enomem reasons,
 		 * we can't allow the release to continue.
@@ -4302,9 +4280,7 @@ int try_release_extent_mapping(struct extent_map_tree *map,
  * This maps until we find something past 'last'
  */
 static struct extent_map *get_extent_skip_holes(struct inode *inode,
-						u64 offset,
-						u64 last,
-						get_extent_t *get_extent)
+						u64 offset, u64 last)
 {
 	u64 sectorsize = btrfs_inode_sectorsize(inode);
 	struct extent_map *em;
@@ -4318,15 +4294,14 @@ static struct extent_map *get_extent_skip_holes(struct inode *inode,
 		if (len == 0)
 			break;
 		len = ALIGN(len, sectorsize);
-		em = get_extent(BTRFS_I(inode), NULL, 0, offset, len, 0);
+		em = btrfs_get_extent_fiemap(BTRFS_I(inode), NULL, 0, offset,
+				len, 0);
 		if (IS_ERR_OR_NULL(em))
 			return em;
 
 		/* if this isn't a hole return it */
-		if (!test_bit(EXTENT_FLAG_VACANCY, &em->flags) &&
-		    em->block_start != EXTENT_MAP_HOLE) {
+		if (em->block_start != EXTENT_MAP_HOLE)
 			return em;
-		}
 
 		/* this is a hole, advance to the next extent */
 		offset = extent_map_end(em);
@@ -4451,7 +4426,7 @@ static int emit_last_fiemap_cache(struct btrfs_fs_info *fs_info,
 }
 
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
-		__u64 start, __u64 len, get_extent_t *get_extent)
+		__u64 start, __u64 len)
 {
 	int ret = 0;
 	u64 off = start;
@@ -4533,8 +4508,7 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	lock_extent_bits(&BTRFS_I(inode)->io_tree, start, start + len - 1,
 			 &cached_state);
 
-	em = get_extent_skip_holes(inode, start, last_for_get_extent,
-				   get_extent);
+	em = get_extent_skip_holes(inode, start, last_for_get_extent);
 	if (!em)
 		goto out;
 	if (IS_ERR(em)) {
@@ -4622,8 +4596,7 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 		}
 
 		/* now scan forward to see if this is really the last extent. */
-		em = get_extent_skip_holes(inode, off, last_for_get_extent,
-					   get_extent);
+		em = get_extent_skip_holes(inode, off, last_for_get_extent);
 		if (IS_ERR(em)) {
 			ret = PTR_ERR(em);
 			goto out;
@@ -4647,7 +4620,7 @@ int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 out:
 	btrfs_free_path(path);
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, start, start + len - 1,
-			     &cached_state, GFP_NOFS);
+			     &cached_state);
 	return ret;
 }
 
@@ -5263,8 +5236,7 @@ int extent_buffer_uptodate(struct extent_buffer *eb)
 }
 
 int read_extent_buffer_pages(struct extent_io_tree *tree,
-			     struct extent_buffer *eb, int wait,
-			     get_extent_t *get_extent, int mirror_num)
+			     struct extent_buffer *eb, int wait, int mirror_num)
 {
 	unsigned long i;
 	struct page *page;
@@ -5324,7 +5296,7 @@ int read_extent_buffer_pages(struct extent_io_tree *tree,
 
 			ClearPageError(page);
 			err = __extent_read_full_page(tree, page,
-						      get_extent, &bio,
+						      btree_get_extent, &bio,
 						      mirror_num, &bio_flags,
 						      REQ_META);
 			if (err) {
diff --git a/fs/btrfs/extent_io.h b/fs/btrfs/extent_io.h
index 93dcae0..a7a850a 100644
--- a/fs/btrfs/extent_io.h
+++ b/fs/btrfs/extent_io.h
@@ -300,19 +300,29 @@ int clear_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end,
 		unsigned bits, struct extent_changeset *changeset);
 int clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
 		     unsigned bits, int wake, int delete,
-		     struct extent_state **cached, gfp_t mask);
+		     struct extent_state **cached);
+int __clear_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
+		     unsigned bits, int wake, int delete,
+		     struct extent_state **cached, gfp_t mask,
+		     struct extent_changeset *changeset);
 
 static inline int unlock_extent(struct extent_io_tree *tree, u64 start, u64 end)
 {
-	return clear_extent_bit(tree, start, end, EXTENT_LOCKED, 1, 0, NULL,
-				GFP_NOFS);
+	return clear_extent_bit(tree, start, end, EXTENT_LOCKED, 1, 0, NULL);
 }
 
 static inline int unlock_extent_cached(struct extent_io_tree *tree, u64 start,
-		u64 end, struct extent_state **cached, gfp_t mask)
+		u64 end, struct extent_state **cached)
 {
-	return clear_extent_bit(tree, start, end, EXTENT_LOCKED, 1, 0, cached,
-				mask);
+	return __clear_extent_bit(tree, start, end, EXTENT_LOCKED, 1, 0, cached,
+				GFP_NOFS, NULL);
+}
+
+static inline int unlock_extent_cached_atomic(struct extent_io_tree *tree,
+		u64 start, u64 end, struct extent_state **cached)
+{
+	return __clear_extent_bit(tree, start, end, EXTENT_LOCKED, 1, 0, cached,
+				GFP_ATOMIC, NULL);
 }
 
 static inline int clear_extent_bits(struct extent_io_tree *tree, u64 start,
@@ -323,8 +333,7 @@ static inline int clear_extent_bits(struct extent_io_tree *tree, u64 start,
 	if (bits & EXTENT_LOCKED)
 		wake = 1;
 
-	return clear_extent_bit(tree, start, end, bits, wake, 0, NULL,
-			GFP_NOFS);
+	return clear_extent_bit(tree, start, end, bits, wake, 0, NULL);
 }
 
 int set_record_extent_bits(struct extent_io_tree *tree, u64 start, u64 end,
@@ -340,10 +349,10 @@ static inline int set_extent_bits(struct extent_io_tree *tree, u64 start,
 }
 
 static inline int clear_extent_uptodate(struct extent_io_tree *tree, u64 start,
-		u64 end, struct extent_state **cached_state, gfp_t mask)
+		u64 end, struct extent_state **cached_state)
 {
-	return clear_extent_bit(tree, start, end, EXTENT_UPTODATE, 0, 0,
-				cached_state, mask);
+	return __clear_extent_bit(tree, start, end, EXTENT_UPTODATE, 0, 0,
+				cached_state, GFP_NOFS, NULL);
 }
 
 static inline int set_extent_dirty(struct extent_io_tree *tree, u64 start,
@@ -358,7 +367,7 @@ static inline int clear_extent_dirty(struct extent_io_tree *tree, u64 start,
 {
 	return clear_extent_bit(tree, start, end,
 				EXTENT_DIRTY | EXTENT_DELALLOC |
-				EXTENT_DO_ACCOUNTING, 0, 0, NULL, GFP_NOFS);
+				EXTENT_DO_ACCOUNTING, 0, 0, NULL);
 }
 
 int convert_extent_bit(struct extent_io_tree *tree, u64 start, u64 end,
@@ -401,24 +410,19 @@ int find_first_extent_bit(struct extent_io_tree *tree, u64 start,
 			  struct extent_state **cached_state);
 int extent_invalidatepage(struct extent_io_tree *tree,
 			  struct page *page, unsigned long offset);
-int extent_write_full_page(struct extent_io_tree *tree, struct page *page,
-			  get_extent_t *get_extent,
-			  struct writeback_control *wbc);
-int extent_write_locked_range(struct extent_io_tree *tree, struct inode *inode,
-			      u64 start, u64 end, get_extent_t *get_extent,
+int extent_write_full_page(struct page *page, struct writeback_control *wbc);
+int extent_write_locked_range(struct inode *inode, u64 start, u64 end,
 			      int mode);
 int extent_writepages(struct extent_io_tree *tree,
 		      struct address_space *mapping,
-		      get_extent_t *get_extent,
 		      struct writeback_control *wbc);
 int btree_write_cache_pages(struct address_space *mapping,
 			    struct writeback_control *wbc);
 int extent_readpages(struct extent_io_tree *tree,
 		     struct address_space *mapping,
-		     struct list_head *pages, unsigned nr_pages,
-		     get_extent_t get_extent);
+		     struct list_head *pages, unsigned nr_pages);
 int extent_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
-		__u64 start, __u64 len, get_extent_t *get_extent);
+		__u64 start, __u64 len);
 void set_page_extent_mapped(struct page *page);
 
 struct extent_buffer *alloc_extent_buffer(struct btrfs_fs_info *fs_info,
@@ -437,7 +441,7 @@ void free_extent_buffer_stale(struct extent_buffer *eb);
 #define WAIT_PAGE_LOCK	2
 int read_extent_buffer_pages(struct extent_io_tree *tree,
 			     struct extent_buffer *eb, int wait,
-			     get_extent_t *get_extent, int mirror_num);
+			     int mirror_num);
 void wait_on_extent_buffer_writeback(struct extent_buffer *eb);
 
 static inline unsigned long num_extent_pages(u64 start, u64 len)
@@ -540,7 +544,7 @@ void btrfs_free_io_failure_record(struct btrfs_inode *inode, u64 start,
 		u64 end);
 int btrfs_get_io_failure_record(struct inode *inode, u64 start, u64 end,
 				struct io_failure_record **failrec_ret);
-bool btrfs_check_repairable(struct inode *inode, struct bio *failed_bio,
+bool btrfs_check_repairable(struct inode *inode, unsigned failed_bio_pages,
 			    struct io_failure_record *failrec, int fail_mirror);
 struct bio *btrfs_create_repair_bio(struct inode *inode, struct bio *failed_bio,
 				    struct io_failure_record *failrec,
diff --git a/fs/btrfs/extent_map.c b/fs/btrfs/extent_map.c
index 2e348fb..d3bd021 100644
--- a/fs/btrfs/extent_map.c
+++ b/fs/btrfs/extent_map.c
@@ -454,3 +454,135 @@ void replace_extent_mapping(struct extent_map_tree *tree,
 
 	setup_extent_mapping(tree, new, modified);
 }
+
+static struct extent_map *next_extent_map(struct extent_map *em)
+{
+	struct rb_node *next;
+
+	next = rb_next(&em->rb_node);
+	if (!next)
+		return NULL;
+	return container_of(next, struct extent_map, rb_node);
+}
+
+static struct extent_map *prev_extent_map(struct extent_map *em)
+{
+	struct rb_node *prev;
+
+	prev = rb_prev(&em->rb_node);
+	if (!prev)
+		return NULL;
+	return container_of(prev, struct extent_map, rb_node);
+}
+
+/* helper for btfs_get_extent.  Given an existing extent in the tree,
+ * the existing extent is the nearest extent to map_start,
+ * and an extent that you want to insert, deal with overlap and insert
+ * the best fitted new extent into the tree.
+ */
+static noinline int merge_extent_mapping(struct extent_map_tree *em_tree,
+					 struct extent_map *existing,
+					 struct extent_map *em,
+					 u64 map_start)
+{
+	struct extent_map *prev;
+	struct extent_map *next;
+	u64 start;
+	u64 end;
+	u64 start_diff;
+
+	BUG_ON(map_start < em->start || map_start >= extent_map_end(em));
+
+	if (existing->start > map_start) {
+		next = existing;
+		prev = prev_extent_map(next);
+	} else {
+		prev = existing;
+		next = next_extent_map(prev);
+	}
+
+	start = prev ? extent_map_end(prev) : em->start;
+	start = max_t(u64, start, em->start);
+	end = next ? next->start : extent_map_end(em);
+	end = min_t(u64, end, extent_map_end(em));
+	start_diff = start - em->start;
+	em->start = start;
+	em->len = end - start;
+	if (em->block_start < EXTENT_MAP_LAST_BYTE &&
+	    !test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
+		em->block_start += start_diff;
+		em->block_len = em->len;
+	}
+	return add_extent_mapping(em_tree, em, 0);
+}
+
+/**
+ * btrfs_add_extent_mapping - add extent mapping into em_tree
+ * @em_tree - the extent tree into which we want to insert the extent mapping
+ * @em_in   - extent we are inserting
+ * @start   - start of the logical range btrfs_get_extent() is requesting
+ * @len     - length of the logical range btrfs_get_extent() is requesting
+ *
+ * Note that @em_in's range may be different from [start, start+len),
+ * but they must be overlapped.
+ *
+ * Insert @em_in into @em_tree. In case there is an overlapping range, handle
+ * the -EEXIST by either:
+ * a) Returning the existing extent in @em_in if @start is within the
+ *    existing em.
+ * b) Merge the existing extent with @em_in passed in.
+ *
+ * Return 0 on success, otherwise -EEXIST.
+ *
+ */
+int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
+			     struct extent_map **em_in, u64 start, u64 len)
+{
+	int ret;
+	struct extent_map *em = *em_in;
+
+	ret = add_extent_mapping(em_tree, em, 0);
+	/* it is possible that someone inserted the extent into the tree
+	 * while we had the lock dropped.  It is also possible that
+	 * an overlapping map exists in the tree
+	 */
+	if (ret == -EEXIST) {
+		struct extent_map *existing;
+
+		ret = 0;
+
+		existing = search_extent_mapping(em_tree, start, len);
+		/*
+		 * existing will always be non-NULL, since there must be
+		 * extent causing the -EEXIST.
+		 */
+		if (start >= existing->start &&
+		    start < extent_map_end(existing)) {
+			free_extent_map(em);
+			*em_in = existing;
+			ret = 0;
+		} else {
+			u64 orig_start = em->start;
+			u64 orig_len = em->len;
+
+			/*
+			 * The existing extent map is the one nearest to
+			 * the [start, start + len) range which overlaps
+			 */
+			ret = merge_extent_mapping(em_tree, existing,
+						   em, start);
+			if (ret) {
+				free_extent_map(em);
+				*em_in = NULL;
+				WARN_ONCE(ret,
+"unexpected error %d: merge existing(start %llu len %llu) with em(start %llu len %llu)\n",
+					  ret, existing->start, existing->len,
+					  orig_start, orig_len);
+			}
+			free_extent_map(existing);
+		}
+	}
+
+	ASSERT(ret == 0 || ret == -EEXIST);
+	return ret;
+}
diff --git a/fs/btrfs/extent_map.h b/fs/btrfs/extent_map.h
index 64365bb..b29f77b 100644
--- a/fs/btrfs/extent_map.h
+++ b/fs/btrfs/extent_map.h
@@ -13,7 +13,6 @@
 /* bits for the flags field */
 #define EXTENT_FLAG_PINNED 0 /* this entry not yet on disk, don't free it */
 #define EXTENT_FLAG_COMPRESSED 1
-#define EXTENT_FLAG_VACANCY 2 /* no file extent item found */
 #define EXTENT_FLAG_PREALLOC 3 /* pre-allocated extent */
 #define EXTENT_FLAG_LOGGING 4 /* Logging this extent */
 #define EXTENT_FLAG_FILLING 5 /* Filling in a preallocated extent */
@@ -92,4 +91,6 @@ int unpin_extent_cache(struct extent_map_tree *tree, u64 start, u64 len, u64 gen
 void clear_em_logging(struct extent_map_tree *tree, struct extent_map *em);
 struct extent_map *search_extent_mapping(struct extent_map_tree *tree,
 					 u64 start, u64 len);
+int btrfs_add_extent_mapping(struct extent_map_tree *em_tree,
+			     struct extent_map **em_in, u64 start, u64 len);
 #endif
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index eb1bac7..41ab907 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -31,6 +31,7 @@
 #include <linux/slab.h>
 #include <linux/btrfs.h>
 #include <linux/uio.h>
+#include <linux/iversion.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -1504,7 +1505,7 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages,
 		    ordered->file_offset + ordered->len > start_pos &&
 		    ordered->file_offset <= last_pos) {
 			unlock_extent_cached(&inode->io_tree, start_pos,
-					last_pos, cached_state, GFP_NOFS);
+					last_pos, cached_state);
 			for (i = 0; i < num_pages; i++) {
 				unlock_page(pages[i]);
 				put_page(pages[i]);
@@ -1519,7 +1520,7 @@ lock_and_cleanup_extent_if_need(struct btrfs_inode *inode, struct page **pages,
 		clear_extent_bit(&inode->io_tree, start_pos, last_pos,
 				 EXTENT_DIRTY | EXTENT_DELALLOC |
 				 EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
-				 0, 0, cached_state, GFP_NOFS);
+				 0, 0, cached_state);
 		*lockstart = start_pos;
 		*lockend = last_pos;
 		ret = 1;
@@ -1755,11 +1756,10 @@ static noinline ssize_t __btrfs_buffered_write(struct file *file,
 
 		if (copied > 0)
 			ret = btrfs_dirty_pages(inode, pages, dirty_pages,
-						pos, copied, NULL);
+						pos, copied, &cached_state);
 		if (extents_locked)
 			unlock_extent_cached(&BTRFS_I(inode)->io_tree,
-					     lockstart, lockend, &cached_state,
-					     GFP_NOFS);
+					     lockstart, lockend, &cached_state);
 		btrfs_delalloc_release_extents(BTRFS_I(inode), reserve_bytes);
 		if (ret) {
 			btrfs_drop_pages(pages, num_pages);
@@ -2019,10 +2019,19 @@ int btrfs_release_file(struct inode *inode, struct file *filp)
 static int start_ordered_ops(struct inode *inode, loff_t start, loff_t end)
 {
 	int ret;
+	struct blk_plug plug;
 
+	/*
+	 * This is only called in fsync, which would do synchronous writes, so
+	 * a plug can merge adjacent IOs as much as possible.  Esp. in case of
+	 * multiple disks using raid profile, a large IO can be split to
+	 * several segments of stripe length (currently 64K).
+	 */
+	blk_start_plug(&plug);
 	atomic_inc(&BTRFS_I(inode)->sync_writers);
 	ret = btrfs_fdatawrite_range(inode, start, end);
 	atomic_dec(&BTRFS_I(inode)->sync_writers);
+	blk_finish_plug(&plug);
 
 	return ret;
 }
@@ -2450,6 +2459,46 @@ static int find_first_non_hole(struct inode *inode, u64 *start, u64 *len)
 	return ret;
 }
 
+static int btrfs_punch_hole_lock_range(struct inode *inode,
+				       const u64 lockstart,
+				       const u64 lockend,
+				       struct extent_state **cached_state)
+{
+	while (1) {
+		struct btrfs_ordered_extent *ordered;
+		int ret;
+
+		truncate_pagecache_range(inode, lockstart, lockend);
+
+		lock_extent_bits(&BTRFS_I(inode)->io_tree, lockstart, lockend,
+				 cached_state);
+		ordered = btrfs_lookup_first_ordered_extent(inode, lockend);
+
+		/*
+		 * We need to make sure we have no ordered extents in this range
+		 * and nobody raced in and read a page in this range, if we did
+		 * we need to try again.
+		 */
+		if ((!ordered ||
+		    (ordered->file_offset + ordered->len <= lockstart ||
+		     ordered->file_offset > lockend)) &&
+		     !btrfs_page_exists_in_range(inode, lockstart, lockend)) {
+			if (ordered)
+				btrfs_put_ordered_extent(ordered);
+			break;
+		}
+		if (ordered)
+			btrfs_put_ordered_extent(ordered);
+		unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart,
+				     lockend, cached_state);
+		ret = btrfs_wait_ordered_range(inode, lockstart,
+					       lockend - lockstart + 1);
+		if (ret)
+			return ret;
+	}
+	return 0;
+}
+
 static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 {
 	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
@@ -2566,38 +2615,11 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 		goto out_only_mutex;
 	}
 
-	while (1) {
-		struct btrfs_ordered_extent *ordered;
-
-		truncate_pagecache_range(inode, lockstart, lockend);
-
-		lock_extent_bits(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-				 &cached_state);
-		ordered = btrfs_lookup_first_ordered_extent(inode, lockend);
-
-		/*
-		 * We need to make sure we have no ordered extents in this range
-		 * and nobody raced in and read a page in this range, if we did
-		 * we need to try again.
-		 */
-		if ((!ordered ||
-		    (ordered->file_offset + ordered->len <= lockstart ||
-		     ordered->file_offset > lockend)) &&
-		     !btrfs_page_exists_in_range(inode, lockstart, lockend)) {
-			if (ordered)
-				btrfs_put_ordered_extent(ordered);
-			break;
-		}
-		if (ordered)
-			btrfs_put_ordered_extent(ordered);
-		unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart,
-				     lockend, &cached_state, GFP_NOFS);
-		ret = btrfs_wait_ordered_range(inode, lockstart,
-					       lockend - lockstart + 1);
-		if (ret) {
-			inode_unlock(inode);
-			return ret;
-		}
+	ret = btrfs_punch_hole_lock_range(inode, lockstart, lockend,
+					  &cached_state);
+	if (ret) {
+		inode_unlock(inode);
+		goto out_only_mutex;
 	}
 
 	path = btrfs_alloc_path();
@@ -2742,7 +2764,7 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len)
 	btrfs_free_block_rsv(fs_info, rsv);
 out:
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-			     &cached_state, GFP_NOFS);
+			     &cached_state);
 out_only_mutex:
 	if (!updated_inode && truncated_block && !ret && !err) {
 		/*
@@ -2806,6 +2828,234 @@ static int add_falloc_range(struct list_head *head, u64 start, u64 len)
 	return 0;
 }
 
+static int btrfs_fallocate_update_isize(struct inode *inode,
+					const u64 end,
+					const int mode)
+{
+	struct btrfs_trans_handle *trans;
+	struct btrfs_root *root = BTRFS_I(inode)->root;
+	int ret;
+	int ret2;
+
+	if (mode & FALLOC_FL_KEEP_SIZE || end <= i_size_read(inode))
+		return 0;
+
+	trans = btrfs_start_transaction(root, 1);
+	if (IS_ERR(trans))
+		return PTR_ERR(trans);
+
+	inode->i_ctime = current_time(inode);
+	i_size_write(inode, end);
+	btrfs_ordered_update_i_size(inode, end, NULL);
+	ret = btrfs_update_inode(trans, root, inode);
+	ret2 = btrfs_end_transaction(trans);
+
+	return ret ? ret : ret2;
+}
+
+enum {
+	RANGE_BOUNDARY_WRITTEN_EXTENT = 0,
+	RANGE_BOUNDARY_PREALLOC_EXTENT = 1,
+	RANGE_BOUNDARY_HOLE = 2,
+};
+
+static int btrfs_zero_range_check_range_boundary(struct inode *inode,
+						 u64 offset)
+{
+	const u64 sectorsize = btrfs_inode_sectorsize(inode);
+	struct extent_map *em;
+	int ret;
+
+	offset = round_down(offset, sectorsize);
+	em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, offset, sectorsize, 0);
+	if (IS_ERR(em))
+		return PTR_ERR(em);
+
+	if (em->block_start == EXTENT_MAP_HOLE)
+		ret = RANGE_BOUNDARY_HOLE;
+	else if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
+		ret = RANGE_BOUNDARY_PREALLOC_EXTENT;
+	else
+		ret = RANGE_BOUNDARY_WRITTEN_EXTENT;
+
+	free_extent_map(em);
+	return ret;
+}
+
+static int btrfs_zero_range(struct inode *inode,
+			    loff_t offset,
+			    loff_t len,
+			    const int mode)
+{
+	struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
+	struct extent_map *em;
+	struct extent_changeset *data_reserved = NULL;
+	int ret;
+	u64 alloc_hint = 0;
+	const u64 sectorsize = btrfs_inode_sectorsize(inode);
+	u64 alloc_start = round_down(offset, sectorsize);
+	u64 alloc_end = round_up(offset + len, sectorsize);
+	u64 bytes_to_reserve = 0;
+	bool space_reserved = false;
+
+	inode_dio_wait(inode);
+
+	em = btrfs_get_extent(BTRFS_I(inode), NULL, 0,
+			      alloc_start, alloc_end - alloc_start, 0);
+	if (IS_ERR(em)) {
+		ret = PTR_ERR(em);
+		goto out;
+	}
+
+	/*
+	 * Avoid hole punching and extent allocation for some cases. More cases
+	 * could be considered, but these are unlikely common and we keep things
+	 * as simple as possible for now. Also, intentionally, if the target
+	 * range contains one or more prealloc extents together with regular
+	 * extents and holes, we drop all the existing extents and allocate a
+	 * new prealloc extent, so that we get a larger contiguous disk extent.
+	 */
+	if (em->start <= alloc_start &&
+	    test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) {
+		const u64 em_end = em->start + em->len;
+
+		if (em_end >= offset + len) {
+			/*
+			 * The whole range is already a prealloc extent,
+			 * do nothing except updating the inode's i_size if
+			 * needed.
+			 */
+			free_extent_map(em);
+			ret = btrfs_fallocate_update_isize(inode, offset + len,
+							   mode);
+			goto out;
+		}
+		/*
+		 * Part of the range is already a prealloc extent, so operate
+		 * only on the remaining part of the range.
+		 */
+		alloc_start = em_end;
+		ASSERT(IS_ALIGNED(alloc_start, sectorsize));
+		len = offset + len - alloc_start;
+		offset = alloc_start;
+		alloc_hint = em->block_start + em->len;
+	}
+	free_extent_map(em);
+
+	if (BTRFS_BYTES_TO_BLKS(fs_info, offset) ==
+	    BTRFS_BYTES_TO_BLKS(fs_info, offset + len - 1)) {
+		em = btrfs_get_extent(BTRFS_I(inode), NULL, 0,
+				      alloc_start, sectorsize, 0);
+		if (IS_ERR(em)) {
+			ret = PTR_ERR(em);
+			goto out;
+		}
+
+		if (test_bit(EXTENT_FLAG_PREALLOC, &em->flags)) {
+			free_extent_map(em);
+			ret = btrfs_fallocate_update_isize(inode, offset + len,
+							   mode);
+			goto out;
+		}
+		if (len < sectorsize && em->block_start != EXTENT_MAP_HOLE) {
+			free_extent_map(em);
+			ret = btrfs_truncate_block(inode, offset, len, 0);
+			if (!ret)
+				ret = btrfs_fallocate_update_isize(inode,
+								   offset + len,
+								   mode);
+			return ret;
+		}
+		free_extent_map(em);
+		alloc_start = round_down(offset, sectorsize);
+		alloc_end = alloc_start + sectorsize;
+		goto reserve_space;
+	}
+
+	alloc_start = round_up(offset, sectorsize);
+	alloc_end = round_down(offset + len, sectorsize);
+
+	/*
+	 * For unaligned ranges, check the pages at the boundaries, they might
+	 * map to an extent, in which case we need to partially zero them, or
+	 * they might map to a hole, in which case we need our allocation range
+	 * to cover them.
+	 */
+	if (!IS_ALIGNED(offset, sectorsize)) {
+		ret = btrfs_zero_range_check_range_boundary(inode, offset);
+		if (ret < 0)
+			goto out;
+		if (ret == RANGE_BOUNDARY_HOLE) {
+			alloc_start = round_down(offset, sectorsize);
+			ret = 0;
+		} else if (ret == RANGE_BOUNDARY_WRITTEN_EXTENT) {
+			ret = btrfs_truncate_block(inode, offset, 0, 0);
+			if (ret)
+				goto out;
+		} else {
+			ret = 0;
+		}
+	}
+
+	if (!IS_ALIGNED(offset + len, sectorsize)) {
+		ret = btrfs_zero_range_check_range_boundary(inode,
+							    offset + len);
+		if (ret < 0)
+			goto out;
+		if (ret == RANGE_BOUNDARY_HOLE) {
+			alloc_end = round_up(offset + len, sectorsize);
+			ret = 0;
+		} else if (ret == RANGE_BOUNDARY_WRITTEN_EXTENT) {
+			ret = btrfs_truncate_block(inode, offset + len, 0, 1);
+			if (ret)
+				goto out;
+		} else {
+			ret = 0;
+		}
+	}
+
+reserve_space:
+	if (alloc_start < alloc_end) {
+		struct extent_state *cached_state = NULL;
+		const u64 lockstart = alloc_start;
+		const u64 lockend = alloc_end - 1;
+
+		bytes_to_reserve = alloc_end - alloc_start;
+		ret = btrfs_alloc_data_chunk_ondemand(BTRFS_I(inode),
+						      bytes_to_reserve);
+		if (ret < 0)
+			goto out;
+		space_reserved = true;
+		ret = btrfs_qgroup_reserve_data(inode, &data_reserved,
+						alloc_start, bytes_to_reserve);
+		if (ret)
+			goto out;
+		ret = btrfs_punch_hole_lock_range(inode, lockstart, lockend,
+						  &cached_state);
+		if (ret)
+			goto out;
+		ret = btrfs_prealloc_file_range(inode, mode, alloc_start,
+						alloc_end - alloc_start,
+						i_blocksize(inode),
+						offset + len, &alloc_hint);
+		unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart,
+				     lockend, &cached_state);
+		/* btrfs_prealloc_file_range releases reserved space on error */
+		if (ret) {
+			space_reserved = false;
+			goto out;
+		}
+	}
+	ret = btrfs_fallocate_update_isize(inode, offset + len, mode);
+ out:
+	if (ret && space_reserved)
+		btrfs_free_reserved_data_space(inode, data_reserved,
+					       alloc_start, bytes_to_reserve);
+	extent_changeset_free(data_reserved);
+
+	return ret;
+}
+
 static long btrfs_fallocate(struct file *file, int mode,
 			    loff_t offset, loff_t len)
 {
@@ -2831,7 +3081,8 @@ static long btrfs_fallocate(struct file *file, int mode,
 	cur_offset = alloc_start;
 
 	/* Make sure we aren't being give some crap mode */
-	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE))
+	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
+		     FALLOC_FL_ZERO_RANGE))
 		return -EOPNOTSUPP;
 
 	if (mode & FALLOC_FL_PUNCH_HOLE)
@@ -2842,10 +3093,12 @@ static long btrfs_fallocate(struct file *file, int mode,
 	 *
 	 * For qgroup space, it will be checked later.
 	 */
-	ret = btrfs_alloc_data_chunk_ondemand(BTRFS_I(inode),
-			alloc_end - alloc_start);
-	if (ret < 0)
-		return ret;
+	if (!(mode & FALLOC_FL_ZERO_RANGE)) {
+		ret = btrfs_alloc_data_chunk_ondemand(BTRFS_I(inode),
+						      alloc_end - alloc_start);
+		if (ret < 0)
+			return ret;
+	}
 
 	inode_lock(inode);
 
@@ -2887,6 +3140,12 @@ static long btrfs_fallocate(struct file *file, int mode,
 	if (ret)
 		goto out;
 
+	if (mode & FALLOC_FL_ZERO_RANGE) {
+		ret = btrfs_zero_range(inode, offset, len, mode);
+		inode_unlock(inode);
+		return ret;
+	}
+
 	locked_end = alloc_end - 1;
 	while (1) {
 		struct btrfs_ordered_extent *ordered;
@@ -2896,15 +3155,15 @@ static long btrfs_fallocate(struct file *file, int mode,
 		 */
 		lock_extent_bits(&BTRFS_I(inode)->io_tree, alloc_start,
 				 locked_end, &cached_state);
-		ordered = btrfs_lookup_first_ordered_extent(inode,
-							    alloc_end - 1);
+		ordered = btrfs_lookup_first_ordered_extent(inode, locked_end);
+
 		if (ordered &&
 		    ordered->file_offset + ordered->len > alloc_start &&
 		    ordered->file_offset < alloc_end) {
 			btrfs_put_ordered_extent(ordered);
 			unlock_extent_cached(&BTRFS_I(inode)->io_tree,
 					     alloc_start, locked_end,
-					     &cached_state, GFP_KERNEL);
+					     &cached_state);
 			/*
 			 * we can't wait on the range with the transaction
 			 * running or with the extent lock held
@@ -2922,7 +3181,7 @@ static long btrfs_fallocate(struct file *file, int mode,
 
 	/* First, check if we exceed the qgroup limit */
 	INIT_LIST_HEAD(&reserve_list);
-	while (1) {
+	while (cur_offset < alloc_end) {
 		em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, cur_offset,
 				      alloc_end - cur_offset, 0);
 		if (IS_ERR(em)) {
@@ -2958,8 +3217,6 @@ static long btrfs_fallocate(struct file *file, int mode,
 		}
 		free_extent_map(em);
 		cur_offset = last_byte;
-		if (cur_offset >= alloc_end)
-			break;
 	}
 
 	/*
@@ -2982,37 +3239,18 @@ static long btrfs_fallocate(struct file *file, int mode,
 	if (ret < 0)
 		goto out_unlock;
 
-	if (actual_end > inode->i_size &&
-	    !(mode & FALLOC_FL_KEEP_SIZE)) {
-		struct btrfs_trans_handle *trans;
-		struct btrfs_root *root = BTRFS_I(inode)->root;
-
-		/*
-		 * We didn't need to allocate any more space, but we
-		 * still extended the size of the file so we need to
-		 * update i_size and the inode item.
-		 */
-		trans = btrfs_start_transaction(root, 1);
-		if (IS_ERR(trans)) {
-			ret = PTR_ERR(trans);
-		} else {
-			inode->i_ctime = current_time(inode);
-			i_size_write(inode, actual_end);
-			btrfs_ordered_update_i_size(inode, actual_end, NULL);
-			ret = btrfs_update_inode(trans, root, inode);
-			if (ret)
-				btrfs_end_transaction(trans);
-			else
-				ret = btrfs_end_transaction(trans);
-		}
-	}
+	/*
+	 * We didn't need to allocate any more space, but we still extended the
+	 * size of the file so we need to update i_size and the inode item.
+	 */
+	ret = btrfs_fallocate_update_isize(inode, actual_end, mode);
 out_unlock:
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, alloc_start, locked_end,
-			     &cached_state, GFP_KERNEL);
+			     &cached_state);
 out:
 	inode_unlock(inode);
 	/* Let go of our reservation. */
-	if (ret != 0)
+	if (ret != 0 && !(mode & FALLOC_FL_ZERO_RANGE))
 		btrfs_free_reserved_data_space(inode, data_reserved,
 				alloc_start, alloc_end - cur_offset);
 	extent_changeset_free(data_reserved);
@@ -3081,7 +3319,7 @@ static int find_desired_extent(struct inode *inode, loff_t *offset, int whence)
 			*offset = min_t(loff_t, start, inode->i_size);
 	}
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-			     &cached_state, GFP_NOFS);
+			     &cached_state);
 	return ret;
 }
 
@@ -3145,7 +3383,7 @@ void btrfs_auto_defrag_exit(void)
 	kmem_cache_destroy(btrfs_inode_defrag_cachep);
 }
 
-int btrfs_auto_defrag_init(void)
+int __init btrfs_auto_defrag_init(void)
 {
 	btrfs_inode_defrag_cachep = kmem_cache_create("btrfs_inode_defrag",
 					sizeof(struct inode_defrag), 0,
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c
index 4426d1c..014f3c0 100644
--- a/fs/btrfs/free-space-cache.c
+++ b/fs/btrfs/free-space-cache.c
@@ -993,8 +993,7 @@ update_cache_item(struct btrfs_trans_handle *trans,
 	ret = btrfs_search_slot(trans, root, &key, path, 0, 1);
 	if (ret < 0) {
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, inode->i_size - 1,
-				 EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0, NULL,
-				 GFP_NOFS);
+				 EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0, NULL);
 		goto fail;
 	}
 	leaf = path->nodes[0];
@@ -1008,7 +1007,7 @@ update_cache_item(struct btrfs_trans_handle *trans,
 			clear_extent_bit(&BTRFS_I(inode)->io_tree, 0,
 					 inode->i_size - 1,
 					 EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0,
-					 NULL, GFP_NOFS);
+					 NULL);
 			btrfs_release_path(path);
 			goto fail;
 		}
@@ -1105,8 +1104,7 @@ static int flush_dirty_cache(struct inode *inode)
 	ret = btrfs_wait_ordered_range(inode, 0, (u64)-1);
 	if (ret)
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, inode->i_size - 1,
-				 EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0, NULL,
-				 GFP_NOFS);
+				 EXTENT_DIRTY | EXTENT_DELALLOC, 0, 0, NULL);
 
 	return ret;
 }
@@ -1127,8 +1125,7 @@ cleanup_write_cache_enospc(struct inode *inode,
 {
 	io_ctl_drop_pages(io_ctl);
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, 0,
-			     i_size_read(inode) - 1, cached_state,
-			     GFP_NOFS);
+			     i_size_read(inode) - 1, cached_state);
 }
 
 static int __btrfs_wait_cache_io(struct btrfs_root *root,
@@ -1322,7 +1319,7 @@ static int __btrfs_write_out_cache(struct btrfs_root *root, struct inode *inode,
 	io_ctl_drop_pages(io_ctl);
 
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, 0,
-			     i_size_read(inode) - 1, &cached_state, GFP_NOFS);
+			     i_size_read(inode) - 1, &cached_state);
 
 	/*
 	 * at this point the pages are under IO and we're happy,
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index e1a7f3c..53ca025 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -43,6 +43,7 @@
 #include <linux/posix_acl_xattr.h>
 #include <linux/uio.h>
 #include <linux/magic.h>
+#include <linux/iversion.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -536,9 +537,14 @@ static noinline void compress_file_range(struct inode *inode,
 		 *
 		 * If the compression fails for any reason, we set the pages
 		 * dirty again later on.
+		 *
+		 * Note that the remaining part is redirtied, the start pointer
+		 * has moved, the end is the original one.
 		 */
-		extent_range_clear_dirty_for_io(inode, start, end);
-		redirty = 1;
+		if (!redirty) {
+			extent_range_clear_dirty_for_io(inode, start, end);
+			redirty = 1;
+		}
 
 		/* Compression level is applied here and only here */
 		ret = btrfs_compress_pages(
@@ -765,11 +771,10 @@ static noinline void submit_compressed_extents(struct inode *inode,
 			 * all those pages down to the drive.
 			 */
 			if (!page_started && !ret)
-				extent_write_locked_range(io_tree,
-						  inode, async_extent->start,
+				extent_write_locked_range(inode,
+						  async_extent->start,
 						  async_extent->start +
 						  async_extent->ram_size - 1,
-						  btrfs_get_extent,
 						  WB_SYNC_ALL);
 			else if (ret)
 				unlock_page(async_cow->locked_page);
@@ -1203,7 +1208,7 @@ static int cow_file_range_async(struct inode *inode, struct page *locked_page,
 	u64 cur_end;
 
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, start, end, EXTENT_LOCKED,
-			 1, 0, NULL, GFP_NOFS);
+			 1, 0, NULL);
 	while (start < end) {
 		async_cow = kmalloc(sizeof(*async_cow), GFP_NOFS);
 		BUG_ON(!async_cow); /* -ENOMEM */
@@ -1951,7 +1956,21 @@ static blk_status_t __btrfs_submit_bio_done(void *private_data, struct bio *bio,
 
 /*
  * extent_io.c submission hook. This does the right thing for csum calculation
- * on write, or reading the csums from the tree before a read
+ * on write, or reading the csums from the tree before a read.
+ *
+ * Rules about async/sync submit,
+ * a) read:				sync submit
+ *
+ * b) write without checksum:		sync submit
+ *
+ * c) write with checksum:
+ *    c-1) if bio is issued by fsync:	sync submit
+ *         (sync_writers != 0)
+ *
+ *    c-2) if root is reloc root:	sync submit
+ *         (only in case of buffered IO)
+ *
+ *    c-3) otherwise:			async submit
  */
 static blk_status_t btrfs_submit_bio_hook(void *private_data, struct bio *bio,
 				 int mirror_num, unsigned long bio_flags,
@@ -2023,10 +2042,10 @@ static noinline int add_pending_csums(struct btrfs_trans_handle *trans,
 	struct btrfs_ordered_sum *sum;
 
 	list_for_each_entry(sum, list, list) {
-		trans->adding_csums = 1;
+		trans->adding_csums = true;
 		btrfs_csum_file_blocks(trans,
 		       BTRFS_I(inode)->root->fs_info->csum_root, sum);
-		trans->adding_csums = 0;
+		trans->adding_csums = false;
 	}
 	return 0;
 }
@@ -2082,7 +2101,7 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
 					PAGE_SIZE);
 	if (ordered) {
 		unlock_extent_cached(&BTRFS_I(inode)->io_tree, page_start,
-				     page_end, &cached_state, GFP_NOFS);
+				     page_end, &cached_state);
 		unlock_page(page);
 		btrfs_start_ordered_extent(inode, ordered, 1);
 		btrfs_put_ordered_extent(ordered);
@@ -2098,14 +2117,21 @@ static void btrfs_writepage_fixup_worker(struct btrfs_work *work)
 		goto out;
 	 }
 
-	btrfs_set_extent_delalloc(inode, page_start, page_end, 0, &cached_state,
-				  0);
+	ret = btrfs_set_extent_delalloc(inode, page_start, page_end, 0,
+					&cached_state, 0);
+	if (ret) {
+		mapping_set_error(page->mapping, ret);
+		end_extent_writepage(page, ret, page_start, page_end);
+		ClearPageChecked(page);
+		goto out;
+	}
+
 	ClearPageChecked(page);
 	set_page_dirty(page);
 	btrfs_delalloc_release_extents(BTRFS_I(inode), PAGE_SIZE);
 out:
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, page_start, page_end,
-			     &cached_state, GFP_NOFS);
+			     &cached_state);
 out_page:
 	unlock_page(page);
 	put_page(page);
@@ -2697,7 +2723,7 @@ static noinline int relink_extent_backref(struct btrfs_path *path,
 	btrfs_end_transaction(trans);
 out_unlock:
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree, lock_start, lock_end,
-			     &cached, GFP_NOFS);
+			     &cached);
 	iput(inode);
 	return ret;
 }
@@ -2986,7 +3012,7 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent)
 
 		clear_extent_bit(io_tree, ordered_extent->file_offset,
 			ordered_extent->file_offset + ordered_extent->len - 1,
-			EXTENT_DEFRAG, 0, 0, &cached_state, GFP_NOFS);
+			EXTENT_DEFRAG, 0, 0, &cached_state);
 	}
 
 	if (nolock)
@@ -3056,7 +3082,7 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent)
 				 ordered_extent->len - 1,
 				 clear_bits,
 				 (clear_bits & EXTENT_LOCKED) ? 1 : 0,
-				 0, &cached_state, GFP_NOFS);
+				 0, &cached_state);
 	}
 
 	if (trans)
@@ -3070,7 +3096,7 @@ static int btrfs_finish_ordered_io(struct btrfs_ordered_extent *ordered_extent)
 		else
 			start = ordered_extent->file_offset;
 		end = ordered_extent->file_offset + ordered_extent->len - 1;
-		clear_extent_uptodate(io_tree, start, end, NULL, GFP_NOFS);
+		clear_extent_uptodate(io_tree, start, end, NULL);
 
 		/* Drop the cache for the part of the extent we didn't write. */
 		btrfs_drop_extent_cache(BTRFS_I(inode), start, end, 0);
@@ -3777,7 +3803,8 @@ static int btrfs_read_locked_inode(struct inode *inode)
 	BTRFS_I(inode)->generation = btrfs_inode_generation(leaf, inode_item);
 	BTRFS_I(inode)->last_trans = btrfs_inode_transid(leaf, inode_item);
 
-	inode->i_version = btrfs_inode_sequence(leaf, inode_item);
+	inode_set_iversion_queried(inode,
+				   btrfs_inode_sequence(leaf, inode_item));
 	inode->i_generation = BTRFS_I(inode)->generation;
 	inode->i_rdev = 0;
 	rdev = btrfs_inode_rdev(leaf, inode_item);
@@ -3945,7 +3972,8 @@ static void fill_inode_item(struct btrfs_trans_handle *trans,
 				     &token);
 	btrfs_set_token_inode_generation(leaf, item, BTRFS_I(inode)->generation,
 					 &token);
-	btrfs_set_token_inode_sequence(leaf, item, inode->i_version, &token);
+	btrfs_set_token_inode_sequence(leaf, item, inode_peek_iversion(inode),
+				       &token);
 	btrfs_set_token_inode_transid(leaf, item, trans->transid, &token);
 	btrfs_set_token_inode_rdev(leaf, item, inode->i_rdev, &token);
 	btrfs_set_token_inode_flags(leaf, item, BTRFS_I(inode)->flags, &token);
@@ -4744,8 +4772,8 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	u64 block_start;
 	u64 block_end;
 
-	if ((offset & (blocksize - 1)) == 0 &&
-	    (!len || ((len & (blocksize - 1)) == 0)))
+	if (IS_ALIGNED(offset, blocksize) &&
+	    (!len || IS_ALIGNED(len, blocksize)))
 		goto out;
 
 	block_start = round_down(from, blocksize);
@@ -4787,7 +4815,7 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	ordered = btrfs_lookup_ordered_extent(inode, block_start);
 	if (ordered) {
 		unlock_extent_cached(io_tree, block_start, block_end,
-				     &cached_state, GFP_NOFS);
+				     &cached_state);
 		unlock_page(page);
 		put_page(page);
 		btrfs_start_ordered_extent(inode, ordered, 1);
@@ -4798,13 +4826,13 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, block_start, block_end,
 			  EXTENT_DIRTY | EXTENT_DELALLOC |
 			  EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
-			  0, 0, &cached_state, GFP_NOFS);
+			  0, 0, &cached_state);
 
 	ret = btrfs_set_extent_delalloc(inode, block_start, block_end, 0,
 					&cached_state, 0);
 	if (ret) {
 		unlock_extent_cached(io_tree, block_start, block_end,
-				     &cached_state, GFP_NOFS);
+				     &cached_state);
 		goto out_unlock;
 	}
 
@@ -4823,8 +4851,7 @@ int btrfs_truncate_block(struct inode *inode, loff_t from, loff_t len,
 	}
 	ClearPageChecked(page);
 	set_page_dirty(page);
-	unlock_extent_cached(io_tree, block_start, block_end, &cached_state,
-			     GFP_NOFS);
+	unlock_extent_cached(io_tree, block_start, block_end, &cached_state);
 
 out_unlock:
 	if (ret)
@@ -4925,7 +4952,7 @@ int btrfs_cont_expand(struct inode *inode, loff_t oldsize, loff_t size)
 		if (!ordered)
 			break;
 		unlock_extent_cached(io_tree, hole_start, block_end - 1,
-				     &cached_state, GFP_NOFS);
+				     &cached_state);
 		btrfs_start_ordered_extent(inode, ordered, 1);
 		btrfs_put_ordered_extent(ordered);
 	}
@@ -4990,8 +5017,7 @@ int btrfs_cont_expand(struct inode *inode, loff_t oldsize, loff_t size)
 			break;
 	}
 	free_extent_map(em);
-	unlock_extent_cached(io_tree, hole_start, block_end - 1, &cached_state,
-			     GFP_NOFS);
+	unlock_extent_cached(io_tree, hole_start, block_end - 1, &cached_state);
 	return err;
 }
 
@@ -5234,8 +5260,7 @@ static void evict_inode_truncate_pages(struct inode *inode)
 		clear_extent_bit(io_tree, start, end,
 				 EXTENT_LOCKED | EXTENT_DIRTY |
 				 EXTENT_DELALLOC | EXTENT_DO_ACCOUNTING |
-				 EXTENT_DEFRAG, 1, 1,
-				 &cached_state, GFP_NOFS);
+				 EXTENT_DEFRAG, 1, 1, &cached_state);
 
 		cond_resched();
 		spin_lock(&io_tree->lock);
@@ -5894,7 +5919,6 @@ static int btrfs_filldir(void *addr, int entries, struct dir_context *ctx)
 static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
 {
 	struct inode *inode = file_inode(file);
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_file_private *private = file->private_data;
 	struct btrfs_dir_item *di;
@@ -5962,9 +5986,6 @@ static int btrfs_real_readdir(struct file *file, struct dir_context *ctx)
 		if (btrfs_should_delete_dir_index(&del_list, found_key.offset))
 			goto next;
 		di = btrfs_item_ptr(leaf, slot, struct btrfs_dir_item);
-		if (verify_dir_item(fs_info, leaf, slot, di))
-			goto next;
-
 		name_len = btrfs_dir_name_len(leaf, di);
 		if ((total_len + sizeof(struct dir_entry) + name_len) >=
 		    PAGE_SIZE) {
@@ -6104,19 +6125,20 @@ static int btrfs_update_time(struct inode *inode, struct timespec *now,
 			     int flags)
 {
 	struct btrfs_root *root = BTRFS_I(inode)->root;
+	bool dirty = flags & ~S_VERSION;
 
 	if (btrfs_root_readonly(root))
 		return -EROFS;
 
 	if (flags & S_VERSION)
-		inode_inc_iversion(inode);
+		dirty |= inode_maybe_inc_iversion(inode, dirty);
 	if (flags & S_CTIME)
 		inode->i_ctime = *now;
 	if (flags & S_MTIME)
 		inode->i_mtime = *now;
 	if (flags & S_ATIME)
 		inode->i_atime = *now;
-	return btrfs_dirty_inode(inode);
+	return dirty ? btrfs_dirty_inode(inode) : 0;
 }
 
 /*
@@ -6297,7 +6319,7 @@ static struct inode *btrfs_new_inode(struct btrfs_trans_handle *trans,
 	}
 	/*
 	 * index_cnt is ignored for everything but a dir,
-	 * btrfs_get_inode_index_count has an explanation for the magic
+	 * btrfs_set_inode_index_count has an explanation for the magic
 	 * number
 	 */
 	BTRFS_I(inode)->index_cnt = 2;
@@ -6560,7 +6582,6 @@ static int btrfs_mknod(struct inode *dir, struct dentry *dentry,
 
 out_unlock:
 	btrfs_end_transaction(trans);
-	btrfs_balance_delayed_items(fs_info);
 	btrfs_btree_balance_dirty(fs_info);
 	if (drop_inode) {
 		inode_dec_link_count(inode);
@@ -6641,7 +6662,6 @@ static int btrfs_create(struct inode *dir, struct dentry *dentry,
 		inode_dec_link_count(inode);
 		iput(inode);
 	}
-	btrfs_balance_delayed_items(fs_info);
 	btrfs_btree_balance_dirty(fs_info);
 	return err;
 
@@ -6716,7 +6736,6 @@ static int btrfs_link(struct dentry *old_dentry, struct inode *dir,
 		btrfs_log_new_name(trans, BTRFS_I(inode), NULL, parent);
 	}
 
-	btrfs_balance_delayed_items(fs_info);
 fail:
 	if (trans)
 		btrfs_end_transaction(trans);
@@ -6794,7 +6813,6 @@ static int btrfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 		inode_dec_link_count(inode);
 		iput(inode);
 	}
-	btrfs_balance_delayed_items(fs_info);
 	btrfs_btree_balance_dirty(fs_info);
 	return err;
 
@@ -6803,68 +6821,6 @@ static int btrfs_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	goto out_fail;
 }
 
-/* Find next extent map of a given extent map, caller needs to ensure locks */
-static struct extent_map *next_extent_map(struct extent_map *em)
-{
-	struct rb_node *next;
-
-	next = rb_next(&em->rb_node);
-	if (!next)
-		return NULL;
-	return container_of(next, struct extent_map, rb_node);
-}
-
-static struct extent_map *prev_extent_map(struct extent_map *em)
-{
-	struct rb_node *prev;
-
-	prev = rb_prev(&em->rb_node);
-	if (!prev)
-		return NULL;
-	return container_of(prev, struct extent_map, rb_node);
-}
-
-/* helper for btfs_get_extent.  Given an existing extent in the tree,
- * the existing extent is the nearest extent to map_start,
- * and an extent that you want to insert, deal with overlap and insert
- * the best fitted new extent into the tree.
- */
-static int merge_extent_mapping(struct extent_map_tree *em_tree,
-				struct extent_map *existing,
-				struct extent_map *em,
-				u64 map_start)
-{
-	struct extent_map *prev;
-	struct extent_map *next;
-	u64 start;
-	u64 end;
-	u64 start_diff;
-
-	BUG_ON(map_start < em->start || map_start >= extent_map_end(em));
-
-	if (existing->start > map_start) {
-		next = existing;
-		prev = prev_extent_map(next);
-	} else {
-		prev = existing;
-		next = next_extent_map(prev);
-	}
-
-	start = prev ? extent_map_end(prev) : em->start;
-	start = max_t(u64, start, em->start);
-	end = next ? next->start : extent_map_end(em);
-	end = min_t(u64, end, extent_map_end(em));
-	start_diff = start - em->start;
-	em->start = start;
-	em->len = end - start;
-	if (em->block_start < EXTENT_MAP_LAST_BYTE &&
-	    !test_bit(EXTENT_FLAG_COMPRESSED, &em->flags)) {
-		em->block_start += start_diff;
-		em->block_len -= start_diff;
-	}
-	return add_extent_mapping(em_tree, em, 0);
-}
-
 static noinline int uncompress_inline(struct btrfs_path *path,
 				      struct page *page,
 				      size_t pg_offset, u64 extent_offset,
@@ -6939,10 +6895,8 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 	struct extent_map *em = NULL;
 	struct extent_map_tree *em_tree = &inode->extent_tree;
 	struct extent_io_tree *io_tree = &inode->io_tree;
-	struct btrfs_trans_handle *trans = NULL;
 	const bool new_inline = !page || create;
 
-again:
 	read_lock(&em_tree->lock);
 	em = lookup_extent_mapping(em_tree, start, len);
 	if (em)
@@ -6981,8 +6935,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 		path->reada = READA_FORWARD;
 	}
 
-	ret = btrfs_lookup_file_extent(trans, root, path,
-				       objectid, start, trans != NULL);
+	ret = btrfs_lookup_file_extent(NULL, root, path, objectid, start, 0);
 	if (ret < 0) {
 		err = ret;
 		goto out;
@@ -7083,7 +7036,7 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 		em->orig_block_len = em->len;
 		em->orig_start = em->start;
 		ptr = btrfs_file_extent_inline_start(item) + extent_offset;
-		if (create == 0 && !PageUptodate(page)) {
+		if (!PageUptodate(page)) {
 			if (btrfs_file_extent_compression(leaf, item) !=
 			    BTRFS_COMPRESS_NONE) {
 				ret = uncompress_inline(path, page, pg_offset,
@@ -7104,25 +7057,6 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 				kunmap(page);
 			}
 			flush_dcache_page(page);
-		} else if (create && PageUptodate(page)) {
-			BUG();
-			if (!trans) {
-				kunmap(page);
-				free_extent_map(em);
-				em = NULL;
-
-				btrfs_release_path(path);
-				trans = btrfs_join_transaction(root);
-
-				if (IS_ERR(trans))
-					return ERR_CAST(trans);
-				goto again;
-			}
-			map = kmap(page);
-			write_extent_buffer(leaf, map + pg_offset, ptr,
-					    copy_size);
-			kunmap(page);
-			btrfs_mark_buffer_dirty(leaf);
 		}
 		set_extent_uptodate(io_tree, em->start,
 				    extent_map_end(em) - 1, NULL, GFP_NOFS);
@@ -7134,7 +7068,6 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 	em->len = len;
 not_found_em:
 	em->block_start = EXTENT_MAP_HOLE;
-	set_bit(EXTENT_FLAG_VACANCY, &em->flags);
 insert:
 	btrfs_release_path(path);
 	if (em->start > start || extent_map_end(em) <= start) {
@@ -7147,62 +7080,13 @@ struct extent_map *btrfs_get_extent(struct btrfs_inode *inode,
 
 	err = 0;
 	write_lock(&em_tree->lock);
-	ret = add_extent_mapping(em_tree, em, 0);
-	/* it is possible that someone inserted the extent into the tree
-	 * while we had the lock dropped.  It is also possible that
-	 * an overlapping map exists in the tree
-	 */
-	if (ret == -EEXIST) {
-		struct extent_map *existing;
-
-		ret = 0;
-
-		existing = search_extent_mapping(em_tree, start, len);
-		/*
-		 * existing will always be non-NULL, since there must be
-		 * extent causing the -EEXIST.
-		 */
-		if (existing->start == em->start &&
-		    extent_map_end(existing) >= extent_map_end(em) &&
-		    em->block_start == existing->block_start) {
-			/*
-			 * The existing extent map already encompasses the
-			 * entire extent map we tried to add.
-			 */
-			free_extent_map(em);
-			em = existing;
-			err = 0;
-
-		} else if (start >= extent_map_end(existing) ||
-		    start <= existing->start) {
-			/*
-			 * The existing extent map is the one nearest to
-			 * the [start, start + len) range which overlaps
-			 */
-			err = merge_extent_mapping(em_tree, existing,
-						   em, start);
-			free_extent_map(existing);
-			if (err) {
-				free_extent_map(em);
-				em = NULL;
-			}
-		} else {
-			free_extent_map(em);
-			em = existing;
-			err = 0;
-		}
-	}
+	err = btrfs_add_extent_mapping(em_tree, &em, start, len);
 	write_unlock(&em_tree->lock);
 out:
 
 	trace_btrfs_get_extent(root, inode, em);
 
 	btrfs_free_path(path);
-	if (trans) {
-		ret = btrfs_end_transaction(trans);
-		if (!err)
-			err = ret;
-	}
 	if (err) {
 		free_extent_map(em);
 		return ERR_PTR(err);
@@ -7324,7 +7208,7 @@ struct extent_map *btrfs_get_extent_fiemap(struct btrfs_inode *inode,
 			em->block_start = EXTENT_MAP_DELALLOC;
 			em->block_len = found;
 		}
-	} else if (hole_em) {
+	} else {
 		return hole_em;
 	}
 out:
@@ -7641,7 +7525,7 @@ static int lock_extent_direct(struct inode *inode, u64 lockstart, u64 lockend,
 			break;
 
 		unlock_extent_cached(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-				     cached_state, GFP_NOFS);
+				     cached_state);
 
 		if (ordered) {
 			/*
@@ -7926,7 +7810,7 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 	if (lockstart < lockend) {
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart,
 				 lockend, unlock_bits, 1, 0,
-				 &cached_state, GFP_NOFS);
+				 &cached_state);
 	} else {
 		free_extent_state(cached_state);
 	}
@@ -7937,7 +7821,7 @@ static int btrfs_get_blocks_direct(struct inode *inode, sector_t iblock,
 
 unlock_err:
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, lockstart, lockend,
-			 unlock_bits, 1, 0, &cached_state, GFP_NOFS);
+			 unlock_bits, 1, 0, &cached_state);
 err:
 	if (dio_data)
 		current->journal_info = dio_data;
@@ -7953,15 +7837,12 @@ static inline blk_status_t submit_dio_repair_bio(struct inode *inode,
 
 	BUG_ON(bio_op(bio) == REQ_OP_WRITE);
 
-	bio_get(bio);
-
 	ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DIO_REPAIR);
 	if (ret)
-		goto err;
+		return ret;
 
 	ret = btrfs_map_bio(fs_info, bio, mirror_num, 0);
-err:
-	bio_put(bio);
+
 	return ret;
 }
 
@@ -8015,6 +7896,7 @@ static blk_status_t dio_read_error(struct inode *inode, struct bio *failed_bio,
 	int segs;
 	int ret;
 	blk_status_t status;
+	struct bio_vec bvec;
 
 	BUG_ON(bio_op(failed_bio) == REQ_OP_WRITE);
 
@@ -8030,8 +7912,9 @@ static blk_status_t dio_read_error(struct inode *inode, struct bio *failed_bio,
 	}
 
 	segs = bio_segments(failed_bio);
+	bio_get_first_bvec(failed_bio, &bvec);
 	if (segs > 1 ||
-	    (failed_bio->bi_io_vec->bv_len > btrfs_inode_sectorsize(inode)))
+	    (bvec.bv_len > btrfs_inode_sectorsize(inode)))
 		read_mode |= REQ_FAILFAST_DEV;
 
 	isector = start - btrfs_io_bio(failed_bio)->logical;
@@ -8074,7 +7957,7 @@ static void btrfs_retry_endio_nocsum(struct bio *bio)
 	ASSERT(bio->bi_vcnt == 1);
 	io_tree = &BTRFS_I(inode)->io_tree;
 	failure_tree = &BTRFS_I(inode)->io_failure_tree;
-	ASSERT(bio->bi_io_vec->bv_len == btrfs_inode_sectorsize(inode));
+	ASSERT(bio_first_bvec_all(bio)->bv_len == btrfs_inode_sectorsize(inode));
 
 	done->uptodate = 1;
 	ASSERT(!bio_flagged(bio, BIO_CLONED));
@@ -8164,7 +8047,7 @@ static void btrfs_retry_endio(struct bio *bio)
 	uptodate = 1;
 
 	ASSERT(bio->bi_vcnt == 1);
-	ASSERT(bio->bi_io_vec->bv_len == btrfs_inode_sectorsize(done->inode));
+	ASSERT(bio_first_bvec_all(bio)->bv_len == btrfs_inode_sectorsize(done->inode));
 
 	io_tree = &BTRFS_I(inode)->io_tree;
 	failure_tree = &BTRFS_I(inode)->io_failure_tree;
@@ -8460,11 +8343,10 @@ __btrfs_submit_dio_bio(struct bio *bio, struct inode *inode, u64 file_offset,
 	bool write = bio_op(bio) == REQ_OP_WRITE;
 	blk_status_t ret;
 
+	/* Check btrfs_submit_bio_hook() for rules about async submit. */
 	if (async_submit)
 		async_submit = !atomic_read(&BTRFS_I(inode)->sync_writers);
 
-	bio_get(bio);
-
 	if (!write) {
 		ret = btrfs_bio_wq_end_io(fs_info, bio, BTRFS_WQ_ENDIO_DATA);
 		if (ret)
@@ -8497,7 +8379,6 @@ __btrfs_submit_dio_bio(struct bio *bio, struct inode *inode, u64 file_offset,
 map:
 	ret = btrfs_map_bio(fs_info, bio, 0, 0);
 err:
-	bio_put(bio);
 	return ret;
 }
 
@@ -8854,7 +8735,7 @@ static int btrfs_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
 	if (ret)
 		return ret;
 
-	return extent_fiemap(inode, fieinfo, start, len, btrfs_get_extent_fiemap);
+	return extent_fiemap(inode, fieinfo, start, len);
 }
 
 int btrfs_readpage(struct file *file, struct page *page)
@@ -8866,7 +8747,6 @@ int btrfs_readpage(struct file *file, struct page *page)
 
 static int btrfs_writepage(struct page *page, struct writeback_control *wbc)
 {
-	struct extent_io_tree *tree;
 	struct inode *inode = page->mapping->host;
 	int ret;
 
@@ -8885,8 +8765,7 @@ static int btrfs_writepage(struct page *page, struct writeback_control *wbc)
 		redirty_page_for_writepage(wbc, page);
 		return AOP_WRITEPAGE_ACTIVATE;
 	}
-	tree = &BTRFS_I(page->mapping->host)->io_tree;
-	ret = extent_write_full_page(tree, page, btrfs_get_extent, wbc);
+	ret = extent_write_full_page(page, wbc);
 	btrfs_add_delayed_iput(inode);
 	return ret;
 }
@@ -8897,7 +8776,7 @@ static int btrfs_writepages(struct address_space *mapping,
 	struct extent_io_tree *tree;
 
 	tree = &BTRFS_I(mapping->host)->io_tree;
-	return extent_writepages(tree, mapping, btrfs_get_extent, wbc);
+	return extent_writepages(tree, mapping, wbc);
 }
 
 static int
@@ -8906,8 +8785,7 @@ btrfs_readpages(struct file *file, struct address_space *mapping,
 {
 	struct extent_io_tree *tree;
 	tree = &BTRFS_I(mapping->host)->io_tree;
-	return extent_readpages(tree, mapping, pages, nr_pages,
-				btrfs_get_extent);
+	return extent_readpages(tree, mapping, pages, nr_pages);
 }
 static int __btrfs_releasepage(struct page *page, gfp_t gfp_flags)
 {
@@ -8978,8 +8856,7 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 					 EXTENT_DIRTY | EXTENT_DELALLOC |
 					 EXTENT_DELALLOC_NEW |
 					 EXTENT_LOCKED | EXTENT_DO_ACCOUNTING |
-					 EXTENT_DEFRAG, 1, 0, &cached_state,
-					 GFP_NOFS);
+					 EXTENT_DEFRAG, 1, 0, &cached_state);
 		/*
 		 * whoever cleared the private bit is responsible
 		 * for the finish_ordered_io
@@ -9036,7 +8913,7 @@ static void btrfs_invalidatepage(struct page *page, unsigned int offset,
 				 EXTENT_LOCKED | EXTENT_DIRTY |
 				 EXTENT_DELALLOC | EXTENT_DELALLOC_NEW |
 				 EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 1, 1,
-				 &cached_state, GFP_NOFS);
+				 &cached_state);
 
 		__btrfs_releasepage(page, GFP_NOFS);
 	}
@@ -9137,7 +9014,7 @@ int btrfs_page_mkwrite(struct vm_fault *vmf)
 			PAGE_SIZE);
 	if (ordered) {
 		unlock_extent_cached(io_tree, page_start, page_end,
-				     &cached_state, GFP_NOFS);
+				     &cached_state);
 		unlock_page(page);
 		btrfs_start_ordered_extent(inode, ordered, 1);
 		btrfs_put_ordered_extent(ordered);
@@ -9164,13 +9041,13 @@ int btrfs_page_mkwrite(struct vm_fault *vmf)
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, page_start, end,
 			  EXTENT_DIRTY | EXTENT_DELALLOC |
 			  EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG,
-			  0, 0, &cached_state, GFP_NOFS);
+			  0, 0, &cached_state);
 
 	ret = btrfs_set_extent_delalloc(inode, page_start, end, 0,
 					&cached_state, 0);
 	if (ret) {
 		unlock_extent_cached(io_tree, page_start, page_end,
-				     &cached_state, GFP_NOFS);
+				     &cached_state);
 		ret = VM_FAULT_SIGBUS;
 		goto out_unlock;
 	}
@@ -9196,7 +9073,7 @@ int btrfs_page_mkwrite(struct vm_fault *vmf)
 	BTRFS_I(inode)->last_sub_trans = BTRFS_I(inode)->root->log_transid;
 	BTRFS_I(inode)->last_log_commit = BTRFS_I(inode)->root->last_log_commit;
 
-	unlock_extent_cached(io_tree, page_start, page_end, &cached_state, GFP_NOFS);
+	unlock_extent_cached(io_tree, page_start, page_end, &cached_state);
 
 out_unlock:
 	if (!ret) {
@@ -9421,7 +9298,7 @@ struct inode *btrfs_alloc_inode(struct super_block *sb)
 	struct btrfs_inode *ei;
 	struct inode *inode;
 
-	ei = kmem_cache_alloc(btrfs_inode_cachep, GFP_NOFS);
+	ei = kmem_cache_alloc(btrfs_inode_cachep, GFP_KERNEL);
 	if (!ei)
 		return NULL;
 
@@ -9573,7 +9450,7 @@ void btrfs_destroy_cachep(void)
 	kmem_cache_destroy(btrfs_free_space_cachep);
 }
 
-int btrfs_init_cachep(void)
+int __init btrfs_init_cachep(void)
 {
 	btrfs_inode_cachep = kmem_cache_create("btrfs_inode",
 			sizeof(struct btrfs_inode), 0,
@@ -10688,7 +10565,6 @@ static int btrfs_tmpfile(struct inode *dir, struct dentry *dentry, umode_t mode)
 	btrfs_end_transaction(trans);
 	if (ret)
 		iput(inode);
-	btrfs_balance_delayed_items(fs_info);
 	btrfs_btree_balance_dirty(fs_info);
 	return ret;
 
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 2ef8aca..111ee28 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -43,6 +43,7 @@
 #include <linux/uuid.h>
 #include <linux/btrfs.h>
 #include <linux/uaccess.h>
+#include <linux/iversion.h>
 #include "ctree.h"
 #include "disk-io.h"
 #include "transaction.h"
@@ -307,12 +308,10 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
 		ip->flags |= BTRFS_INODE_COMPRESS;
 		ip->flags &= ~BTRFS_INODE_NOCOMPRESS;
 
-		if (fs_info->compress_type == BTRFS_COMPRESS_LZO)
-			comp = "lzo";
-		else if (fs_info->compress_type == BTRFS_COMPRESS_ZLIB)
-			comp = "zlib";
-		else
-			comp = "zstd";
+		comp = btrfs_compress_type2str(fs_info->compress_type);
+		if (!comp || comp[0] == 0)
+			comp = btrfs_compress_type2str(BTRFS_COMPRESS_ZLIB);
+
 		ret = btrfs_set_prop(inode, "btrfs.compression",
 				     comp, strlen(comp), 0);
 		if (ret)
@@ -979,7 +978,7 @@ static struct extent_map *defrag_lookup_extent(struct inode *inode, u64 start)
 		/* get the big lock and read metadata off disk */
 		lock_extent_bits(io_tree, start, end, &cached);
 		em = btrfs_get_extent(BTRFS_I(inode), NULL, 0, start, len, 0);
-		unlock_extent_cached(io_tree, start, end, &cached, GFP_NOFS);
+		unlock_extent_cached(io_tree, start, end, &cached);
 
 		if (IS_ERR(em))
 			return NULL;
@@ -1130,7 +1129,7 @@ static int cluster_pages_for_defrag(struct inode *inode,
 			ordered = btrfs_lookup_ordered_extent(inode,
 							      page_start);
 			unlock_extent_cached(tree, page_start, page_end,
-					     &cached_state, GFP_NOFS);
+					     &cached_state);
 			if (!ordered)
 				break;
 
@@ -1190,7 +1189,7 @@ static int cluster_pages_for_defrag(struct inode *inode,
 	clear_extent_bit(&BTRFS_I(inode)->io_tree, page_start,
 			  page_end - 1, EXTENT_DIRTY | EXTENT_DELALLOC |
 			  EXTENT_DO_ACCOUNTING | EXTENT_DEFRAG, 0, 0,
-			  &cached_state, GFP_NOFS);
+			  &cached_state);
 
 	if (i_done != page_cnt) {
 		spin_lock(&BTRFS_I(inode)->lock);
@@ -1206,8 +1205,7 @@ static int cluster_pages_for_defrag(struct inode *inode,
 			  &cached_state);
 
 	unlock_extent_cached(&BTRFS_I(inode)->io_tree,
-			     page_start, page_end - 1, &cached_state,
-			     GFP_NOFS);
+			     page_start, page_end - 1, &cached_state);
 
 	for (i = 0; i < i_done; i++) {
 		clear_page_dirty_for_io(pages[i]);
@@ -1503,7 +1501,7 @@ static noinline int btrfs_ioctl_resize(struct file *file,
 		goto out_free;
 	}
 
-	if (!device->writeable) {
+	if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 		btrfs_info(fs_info,
 			   "resizer unable to apply on readonly device %llu",
 		       devid);
@@ -1528,7 +1526,7 @@ static noinline int btrfs_ioctl_resize(struct file *file,
 		}
 	}
 
-	if (device->is_tgtdev_for_dev_replace) {
+	if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
 		ret = -EPERM;
 		goto out_free;
 	}
@@ -2675,14 +2673,12 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg)
 		goto out;
 	}
 
-	mutex_lock(&fs_info->volume_mutex);
 	if (vol_args->flags & BTRFS_DEVICE_SPEC_BY_ID) {
 		ret = btrfs_rm_device(fs_info, NULL, vol_args->devid);
 	} else {
 		vol_args->name[BTRFS_SUBVOL_NAME_MAX] = '\0';
 		ret = btrfs_rm_device(fs_info, vol_args->name, 0);
 	}
-	mutex_unlock(&fs_info->volume_mutex);
 	clear_bit(BTRFS_FS_EXCL_OP, &fs_info->flags);
 
 	if (!ret) {
@@ -2726,9 +2722,7 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg)
 	}
 
 	vol_args->name[BTRFS_PATH_NAME_MAX] = '\0';
-	mutex_lock(&fs_info->volume_mutex);
 	ret = btrfs_rm_device(fs_info, vol_args->name, 0);
-	mutex_unlock(&fs_info->volume_mutex);
 
 	if (!ret)
 		btrfs_info(fs_info, "disk deleted %s", vol_args->name);
@@ -2753,16 +2747,16 @@ static long btrfs_ioctl_fs_info(struct btrfs_fs_info *fs_info,
 	if (!fi_args)
 		return -ENOMEM;
 
-	mutex_lock(&fs_devices->device_list_mutex);
+	rcu_read_lock();
 	fi_args->num_devices = fs_devices->num_devices;
-	memcpy(&fi_args->fsid, fs_info->fsid, sizeof(fi_args->fsid));
 
-	list_for_each_entry(device, &fs_devices->devices, dev_list) {
+	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
 		if (device->devid > fi_args->max_id)
 			fi_args->max_id = device->devid;
 	}
-	mutex_unlock(&fs_devices->device_list_mutex);
+	rcu_read_unlock();
 
+	memcpy(&fi_args->fsid, fs_info->fsid, sizeof(fi_args->fsid));
 	fi_args->nodesize = fs_info->nodesize;
 	fi_args->sectorsize = fs_info->sectorsize;
 	fi_args->clone_alignment = fs_info->sectorsize;
@@ -2779,7 +2773,6 @@ static long btrfs_ioctl_dev_info(struct btrfs_fs_info *fs_info,
 {
 	struct btrfs_ioctl_dev_info_args *di_args;
 	struct btrfs_device *dev;
-	struct btrfs_fs_devices *fs_devices = fs_info->fs_devices;
 	int ret = 0;
 	char *s_uuid = NULL;
 
@@ -2790,7 +2783,7 @@ static long btrfs_ioctl_dev_info(struct btrfs_fs_info *fs_info,
 	if (!btrfs_is_empty_uuid(di_args->uuid))
 		s_uuid = di_args->uuid;
 
-	mutex_lock(&fs_devices->device_list_mutex);
+	rcu_read_lock();
 	dev = btrfs_find_device(fs_info, di_args->devid, s_uuid, NULL);
 
 	if (!dev) {
@@ -2805,17 +2798,15 @@ static long btrfs_ioctl_dev_info(struct btrfs_fs_info *fs_info,
 	if (dev->name) {
 		struct rcu_string *name;
 
-		rcu_read_lock();
 		name = rcu_dereference(dev->name);
-		strncpy(di_args->path, name->str, sizeof(di_args->path));
-		rcu_read_unlock();
+		strncpy(di_args->path, name->str, sizeof(di_args->path) - 1);
 		di_args->path[sizeof(di_args->path) - 1] = 0;
 	} else {
 		di_args->path[0] = '\0';
 	}
 
 out:
-	mutex_unlock(&fs_devices->device_list_mutex);
+	rcu_read_unlock();
 	if (ret == 0 && copy_to_user(arg, di_args, sizeof(*di_args)))
 		ret = -EFAULT;
 
diff --git a/fs/btrfs/props.c b/fs/btrfs/props.c
index f6a05f8..b30a056 100644
--- a/fs/btrfs/props.c
+++ b/fs/btrfs/props.c
@@ -164,7 +164,6 @@ static int iterate_object_props(struct btrfs_root *root,
 						 size_t),
 				void *ctx)
 {
-	struct btrfs_fs_info *fs_info = root->fs_info;
 	int ret;
 	char *name_buf = NULL;
 	char *value_buf = NULL;
@@ -215,12 +214,6 @@ static int iterate_object_props(struct btrfs_root *root,
 			name_ptr = (unsigned long)(di + 1);
 			data_ptr = name_ptr + name_len;
 
-			if (verify_dir_item(fs_info, leaf,
-					    path->slots[0], di)) {
-				ret = -EIO;
-				goto out;
-			}
-
 			if (name_len <= XATTR_BTRFS_PREFIX_LEN ||
 			    memcmp_extent_buffer(leaf, XATTR_BTRFS_PREFIX,
 						 name_ptr,
@@ -430,11 +423,11 @@ static const char *prop_compression_extract(struct inode *inode)
 {
 	switch (BTRFS_I(inode)->prop_compress) {
 	case BTRFS_COMPRESS_ZLIB:
-		return "zlib";
 	case BTRFS_COMPRESS_LZO:
-		return "lzo";
 	case BTRFS_COMPRESS_ZSTD:
-		return "zstd";
+		return btrfs_compress_type2str(BTRFS_I(inode)->prop_compress);
+	default:
+		break;
 	}
 
 	return NULL;
diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c
index 168fd03..9e61dd6 100644
--- a/fs/btrfs/qgroup.c
+++ b/fs/btrfs/qgroup.c
@@ -2883,8 +2883,7 @@ int btrfs_qgroup_reserve_data(struct inode *inode,
 	ULIST_ITER_INIT(&uiter);
 	while ((unode = ulist_next(&reserved->range_changed, &uiter)))
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, unode->val,
-				 unode->aux, EXTENT_QGROUP_RESERVED, 0, 0, NULL,
-				 GFP_NOFS);
+				 unode->aux, EXTENT_QGROUP_RESERVED, 0, 0, NULL);
 	extent_changeset_release(reserved);
 	return ret;
 }
diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index a7f7925..dec0907 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -231,7 +231,6 @@ int btrfs_alloc_stripe_hash_table(struct btrfs_fs_info *info)
 		cur = h + i;
 		INIT_LIST_HEAD(&cur->hash_list);
 		spin_lock_init(&cur->lock);
-		init_waitqueue_head(&cur->wait);
 	}
 
 	x = cmpxchg(&info->stripe_hash_table, NULL, table);
@@ -595,14 +594,31 @@ static int rbio_can_merge(struct btrfs_raid_bio *last,
 	 * bio list here, anyone else that wants to
 	 * change this stripe needs to do their own rmw.
 	 */
-	if (last->operation == BTRFS_RBIO_PARITY_SCRUB ||
-	    cur->operation == BTRFS_RBIO_PARITY_SCRUB)
+	if (last->operation == BTRFS_RBIO_PARITY_SCRUB)
 		return 0;
 
-	if (last->operation == BTRFS_RBIO_REBUILD_MISSING ||
-	    cur->operation == BTRFS_RBIO_REBUILD_MISSING)
+	if (last->operation == BTRFS_RBIO_REBUILD_MISSING)
 		return 0;
 
+	if (last->operation == BTRFS_RBIO_READ_REBUILD) {
+		int fa = last->faila;
+		int fb = last->failb;
+		int cur_fa = cur->faila;
+		int cur_fb = cur->failb;
+
+		if (last->faila >= last->failb) {
+			fa = last->failb;
+			fb = last->faila;
+		}
+
+		if (cur->faila >= cur->failb) {
+			cur_fa = cur->failb;
+			cur_fb = cur->faila;
+		}
+
+		if (fa != cur_fa || fb != cur_fb)
+			return 0;
+	}
 	return 1;
 }
 
@@ -670,7 +686,6 @@ static noinline int lock_stripe_add(struct btrfs_raid_bio *rbio)
 	struct btrfs_raid_bio *cur;
 	struct btrfs_raid_bio *pending;
 	unsigned long flags;
-	DEFINE_WAIT(wait);
 	struct btrfs_raid_bio *freeit = NULL;
 	struct btrfs_raid_bio *cache_drop = NULL;
 	int ret = 0;
@@ -816,15 +831,6 @@ static noinline void unlock_stripe(struct btrfs_raid_bio *rbio)
 			}
 
 			goto done_nolock;
-			/*
-			 * The barrier for this waitqueue_active is not needed,
-			 * we're protected by h->lock and can't miss a wakeup.
-			 */
-		} else if (waitqueue_active(&h->wait)) {
-			spin_unlock(&rbio->bio_list_lock);
-			spin_unlock_irqrestore(&h->lock, flags);
-			wake_up(&h->wait);
-			goto done_nolock;
 		}
 	}
 done:
@@ -858,10 +864,17 @@ static void __free_raid_bio(struct btrfs_raid_bio *rbio)
 	kfree(rbio);
 }
 
-static void free_raid_bio(struct btrfs_raid_bio *rbio)
+static void rbio_endio_bio_list(struct bio *cur, blk_status_t err)
 {
-	unlock_stripe(rbio);
-	__free_raid_bio(rbio);
+	struct bio *next;
+
+	while (cur) {
+		next = cur->bi_next;
+		cur->bi_next = NULL;
+		cur->bi_status = err;
+		bio_endio(cur);
+		cur = next;
+	}
 }
 
 /*
@@ -871,20 +884,26 @@ static void free_raid_bio(struct btrfs_raid_bio *rbio)
 static void rbio_orig_end_io(struct btrfs_raid_bio *rbio, blk_status_t err)
 {
 	struct bio *cur = bio_list_get(&rbio->bio_list);
-	struct bio *next;
+	struct bio *extra;
 
 	if (rbio->generic_bio_cnt)
 		btrfs_bio_counter_sub(rbio->fs_info, rbio->generic_bio_cnt);
 
-	free_raid_bio(rbio);
+	/*
+	 * At this moment, rbio->bio_list is empty, however since rbio does not
+	 * always have RBIO_RMW_LOCKED_BIT set and rbio is still linked on the
+	 * hash list, rbio may be merged with others so that rbio->bio_list
+	 * becomes non-empty.
+	 * Once unlock_stripe() is done, rbio->bio_list will not be updated any
+	 * more and we can call bio_endio() on all queued bios.
+	 */
+	unlock_stripe(rbio);
+	extra = bio_list_get(&rbio->bio_list);
+	__free_raid_bio(rbio);
 
-	while (cur) {
-		next = cur->bi_next;
-		cur->bi_next = NULL;
-		cur->bi_status = err;
-		bio_endio(cur);
-		cur = next;
-	}
+	rbio_endio_bio_list(cur, err);
+	if (extra)
+		rbio_endio_bio_list(extra, err);
 }
 
 /*
@@ -1435,14 +1454,13 @@ static int fail_bio_stripe(struct btrfs_raid_bio *rbio,
  */
 static void set_bio_pages_uptodate(struct bio *bio)
 {
-	struct bio_vec bvec;
-	struct bvec_iter iter;
+	struct bio_vec *bvec;
+	int i;
 
-	if (bio_flagged(bio, BIO_CLONED))
-		bio->bi_iter = btrfs_io_bio(bio)->iter;
+	ASSERT(!bio_flagged(bio, BIO_CLONED));
 
-	bio_for_each_segment(bvec, bio, iter)
-		SetPageUptodate(bvec.bv_page);
+	bio_for_each_segment_all(bvec, bio, i)
+		SetPageUptodate(bvec->bv_page);
 }
 
 /*
@@ -1969,7 +1987,22 @@ static void __raid_recover_end_io(struct btrfs_raid_bio *rbio)
 
 cleanup_io:
 	if (rbio->operation == BTRFS_RBIO_READ_REBUILD) {
-		if (err == BLK_STS_OK)
+		/*
+		 * - In case of two failures, where rbio->failb != -1:
+		 *
+		 *   Do not cache this rbio since the above read reconstruction
+		 *   (raid6_datap_recov() or raid6_2data_recov()) may have
+		 *   changed some content of stripes which are not identical to
+		 *   on-disk content any more, otherwise, a later write/recover
+		 *   may steal stripe_pages from this rbio and end up with
+		 *   corruptions or rebuild failures.
+		 *
+		 * - In case of single failure, where rbio->failb == -1:
+		 *
+		 *   Cache this rbio iff the above read reconstruction is
+		 *   excuted without problems.
+		 */
+		if (err == BLK_STS_OK && rbio->failb < 0)
 			cache_rbio_pages(rbio);
 		else
 			clear_bit(RBIO_CACHE_READY_BIT, &rbio->flags);
@@ -2170,11 +2203,21 @@ int raid56_parity_recover(struct btrfs_fs_info *fs_info, struct bio *bio,
 	}
 
 	/*
-	 * reconstruct from the q stripe if they are
-	 * asking for mirror 3
+	 * Loop retry:
+	 * for 'mirror == 2', reconstruct from all other stripes.
+	 * for 'mirror_num > 2', select a stripe to fail on every retry.
 	 */
-	if (mirror_num == 3)
-		rbio->failb = rbio->real_stripes - 2;
+	if (mirror_num > 2) {
+		/*
+		 * 'mirror == 3' is to fail the p stripe and
+		 * reconstruct from the q stripe.  'mirror > 3' is to
+		 * fail a data stripe and reconstruct from p+q stripe.
+		 */
+		rbio->failb = rbio->real_stripes - (mirror_num - 1);
+		ASSERT(rbio->failb > 0);
+		if (rbio->failb <= rbio->faila)
+			rbio->failb--;
+	}
 
 	ret = lock_stripe_add(rbio);
 
diff --git a/fs/btrfs/ref-verify.c b/fs/btrfs/ref-verify.c
index 3487869..171f3cc 100644
--- a/fs/btrfs/ref-verify.c
+++ b/fs/btrfs/ref-verify.c
@@ -606,8 +606,7 @@ static int walk_down_tree(struct btrfs_root *root, struct btrfs_path *path,
 }
 
 /* Walk up to the next node that needs to be processed */
-static int walk_up_tree(struct btrfs_root *root, struct btrfs_path *path,
-			int *level)
+static int walk_up_tree(struct btrfs_path *path, int *level)
 {
 	int l;
 
@@ -984,7 +983,6 @@ void btrfs_free_ref_tree_range(struct btrfs_fs_info *fs_info, u64 start,
 int btrfs_build_ref_tree(struct btrfs_fs_info *fs_info)
 {
 	struct btrfs_path *path;
-	struct btrfs_root *root;
 	struct extent_buffer *eb;
 	u64 bytenr = 0, num_bytes = 0;
 	int ret, level;
@@ -1014,7 +1012,7 @@ int btrfs_build_ref_tree(struct btrfs_fs_info *fs_info)
 				     &bytenr, &num_bytes);
 		if (ret)
 			break;
-		ret = walk_up_tree(root, path, &level);
+		ret = walk_up_tree(path, &level);
 		if (ret < 0)
 			break;
 		if (ret > 0) {
diff --git a/fs/btrfs/root-tree.c b/fs/btrfs/root-tree.c
index 3338407..aab0194 100644
--- a/fs/btrfs/root-tree.c
+++ b/fs/btrfs/root-tree.c
@@ -387,13 +387,6 @@ int btrfs_del_root_ref(struct btrfs_trans_handle *trans,
 		WARN_ON(btrfs_root_ref_dirid(leaf, ref) != dirid);
 		WARN_ON(btrfs_root_ref_name_len(leaf, ref) != name_len);
 		ptr = (unsigned long)(ref + 1);
-		ret = btrfs_is_name_len_valid(leaf, path->slots[0], ptr,
-					      name_len);
-		if (!ret) {
-			err = -EIO;
-			goto out;
-		}
-
 		WARN_ON(memcmp_extent_buffer(leaf, name, ptr, name_len));
 		*sequence = btrfs_root_ref_sequence(leaf, ref);
 
diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index b2f871d..ec56f33 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -301,6 +301,11 @@ static void __scrub_blocked_if_needed(struct btrfs_fs_info *fs_info);
 static void scrub_blocked_if_needed(struct btrfs_fs_info *fs_info);
 static void scrub_put_ctx(struct scrub_ctx *sctx);
 
+static inline int scrub_is_page_on_raid56(struct scrub_page *page)
+{
+	return page->recover &&
+	       (page->recover->bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK);
+}
 
 static void scrub_pending_bio_inc(struct scrub_ctx *sctx)
 {
@@ -1323,15 +1328,34 @@ static int scrub_handle_errored_block(struct scrub_block *sblock_to_check)
 	 * could happen otherwise that a correct page would be
 	 * overwritten by a bad one).
 	 */
-	for (mirror_index = 0;
-	     mirror_index < BTRFS_MAX_MIRRORS &&
-	     sblocks_for_recheck[mirror_index].page_count > 0;
-	     mirror_index++) {
+	for (mirror_index = 0; ;mirror_index++) {
 		struct scrub_block *sblock_other;
 
 		if (mirror_index == failed_mirror_index)
 			continue;
-		sblock_other = sblocks_for_recheck + mirror_index;
+
+		/* raid56's mirror can be more than BTRFS_MAX_MIRRORS */
+		if (!scrub_is_page_on_raid56(sblock_bad->pagev[0])) {
+			if (mirror_index >= BTRFS_MAX_MIRRORS)
+				break;
+			if (!sblocks_for_recheck[mirror_index].page_count)
+				break;
+
+			sblock_other = sblocks_for_recheck + mirror_index;
+		} else {
+			struct scrub_recover *r = sblock_bad->pagev[0]->recover;
+			int max_allowed = r->bbio->num_stripes -
+						r->bbio->num_tgtdevs;
+
+			if (mirror_index >= max_allowed)
+				break;
+			if (!sblocks_for_recheck[1].page_count)
+				break;
+
+			ASSERT(failed_mirror_index == 0);
+			sblock_other = sblocks_for_recheck + 1;
+			sblock_other->pagev[0]->mirror_num = 1 + mirror_index;
+		}
 
 		/* build and submit the bios, check checksums */
 		scrub_recheck_block(fs_info, sblock_other, 0);
@@ -1666,49 +1690,32 @@ static int scrub_setup_recheck_block(struct scrub_block *original_sblock,
 	return 0;
 }
 
-struct scrub_bio_ret {
-	struct completion event;
-	blk_status_t status;
-};
-
 static void scrub_bio_wait_endio(struct bio *bio)
 {
-	struct scrub_bio_ret *ret = bio->bi_private;
-
-	ret->status = bio->bi_status;
-	complete(&ret->event);
-}
-
-static inline int scrub_is_page_on_raid56(struct scrub_page *page)
-{
-	return page->recover &&
-	       (page->recover->bbio->map_type & BTRFS_BLOCK_GROUP_RAID56_MASK);
+	complete(bio->bi_private);
 }
 
 static int scrub_submit_raid56_bio_wait(struct btrfs_fs_info *fs_info,
 					struct bio *bio,
 					struct scrub_page *page)
 {
-	struct scrub_bio_ret done;
+	DECLARE_COMPLETION_ONSTACK(done);
 	int ret;
+	int mirror_num;
 
-	init_completion(&done.event);
-	done.status = 0;
 	bio->bi_iter.bi_sector = page->logical >> 9;
 	bio->bi_private = &done;
 	bio->bi_end_io = scrub_bio_wait_endio;
 
+	mirror_num = page->sblock->pagev[0]->mirror_num;
 	ret = raid56_parity_recover(fs_info, bio, page->recover->bbio,
 				    page->recover->map_length,
-				    page->mirror_num, 0);
+				    mirror_num, 0);
 	if (ret)
 		return ret;
 
-	wait_for_completion_io(&done.event);
-	if (done.status)
-		return -EIO;
-
-	return 0;
+	wait_for_completion_io(&done);
+	return blk_status_to_errno(bio->bi_status);
 }
 
 /*
@@ -2535,7 +2542,7 @@ static int scrub_pages(struct scrub_ctx *sctx, u64 logical, u64 len,
 	}
 
 	WARN_ON(sblock->page_count == 0);
-	if (dev->missing) {
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state)) {
 		/*
 		 * This case should only be hit for RAID 5/6 device replace. See
 		 * the comment in scrub_missing_raid56_pages() for details.
@@ -2870,7 +2877,7 @@ static int scrub_extent_for_parity(struct scrub_parity *sparity,
 	u8 csum[BTRFS_CSUM_SIZE];
 	u32 blocksize;
 
-	if (dev->missing) {
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state)) {
 		scrub_parity_mark_sectors_error(sparity, logical, len);
 		return 0;
 	}
@@ -4112,12 +4119,14 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
 
 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
 	dev = btrfs_find_device(fs_info, devid, NULL, NULL);
-	if (!dev || (dev->missing && !is_dev_replace)) {
+	if (!dev || (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) &&
+		     !is_dev_replace)) {
 		mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 		return -ENODEV;
 	}
 
-	if (!is_dev_replace && !readonly && !dev->writeable) {
+	if (!is_dev_replace && !readonly &&
+	    !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state)) {
 		mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 		rcu_read_lock();
 		name = rcu_dereference(dev->name);
@@ -4128,14 +4137,15 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
 	}
 
 	mutex_lock(&fs_info->scrub_lock);
-	if (!dev->in_fs_metadata || dev->is_tgtdev_for_dev_replace) {
+	if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &dev->dev_state) ||
+	    test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &dev->dev_state)) {
 		mutex_unlock(&fs_info->scrub_lock);
 		mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 		return -EIO;
 	}
 
 	btrfs_dev_replace_lock(&fs_info->dev_replace, 0);
-	if (dev->scrub_device ||
+	if (dev->scrub_ctx ||
 	    (!is_dev_replace &&
 	     btrfs_dev_replace_is_ongoing(&fs_info->dev_replace))) {
 		btrfs_dev_replace_unlock(&fs_info->dev_replace, 0);
@@ -4160,7 +4170,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
 		return PTR_ERR(sctx);
 	}
 	sctx->readonly = readonly;
-	dev->scrub_device = sctx;
+	dev->scrub_ctx = sctx;
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
 
 	/*
@@ -4195,7 +4205,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start,
 		memcpy(progress, &sctx->stat, sizeof(*progress));
 
 	mutex_lock(&fs_info->scrub_lock);
-	dev->scrub_device = NULL;
+	dev->scrub_ctx = NULL;
 	scrub_workers_put(fs_info);
 	mutex_unlock(&fs_info->scrub_lock);
 
@@ -4252,16 +4262,16 @@ int btrfs_scrub_cancel_dev(struct btrfs_fs_info *fs_info,
 	struct scrub_ctx *sctx;
 
 	mutex_lock(&fs_info->scrub_lock);
-	sctx = dev->scrub_device;
+	sctx = dev->scrub_ctx;
 	if (!sctx) {
 		mutex_unlock(&fs_info->scrub_lock);
 		return -ENOTCONN;
 	}
 	atomic_inc(&sctx->cancel_req);
-	while (dev->scrub_device) {
+	while (dev->scrub_ctx) {
 		mutex_unlock(&fs_info->scrub_lock);
 		wait_event(fs_info->scrub_pause_wait,
-			   dev->scrub_device == NULL);
+			   dev->scrub_ctx == NULL);
 		mutex_lock(&fs_info->scrub_lock);
 	}
 	mutex_unlock(&fs_info->scrub_lock);
@@ -4278,7 +4288,7 @@ int btrfs_scrub_progress(struct btrfs_fs_info *fs_info, u64 devid,
 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
 	dev = btrfs_find_device(fs_info, devid, NULL, NULL);
 	if (dev)
-		sctx = dev->scrub_device;
+		sctx = dev->scrub_ctx;
 	if (sctx)
 		memcpy(progress, &sctx->stat, sizeof(*progress));
 	mutex_unlock(&fs_info->fs_devices->device_list_mutex);
@@ -4478,8 +4488,7 @@ static int check_extent_to_block(struct btrfs_inode *inode, u64 start, u64 len,
 	free_extent_map(em);
 
 out_unlock:
-	unlock_extent_cached(io_tree, lockstart, lockend, &cached_state,
-			     GFP_NOFS);
+	unlock_extent_cached(io_tree, lockstart, lockend, &cached_state);
 	return ret;
 }
 
diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c
index 20d3300..f306c60 100644
--- a/fs/btrfs/send.c
+++ b/fs/btrfs/send.c
@@ -1059,12 +1059,6 @@ static int iterate_dir_item(struct btrfs_root *root, struct btrfs_path *path,
 			}
 		}
 
-		ret = btrfs_is_name_len_valid(eb, path->slots[0],
-			  (unsigned long)(di + 1), name_len + data_len);
-		if (!ret) {
-			ret = -EIO;
-			goto out;
-		}
 		if (name_len + data_len > buf_len) {
 			buf_len = name_len + data_len;
 			if (is_vmalloc_addr(buf)) {
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 3a4dce1..6e71a2a 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -61,12 +61,21 @@
 #include "tests/btrfs-tests.h"
 
 #include "qgroup.h"
-#include "backref.h"
 #define CREATE_TRACE_POINTS
 #include <trace/events/btrfs.h>
 
 static const struct super_operations btrfs_super_ops;
+
+/*
+ * Types for mounting the default subvolume and a subvolume explicitly
+ * requested by subvol=/path. That way the callchain is straightforward and we
+ * don't have to play tricks with the mount options and recursive calls to
+ * btrfs_mount.
+ *
+ * The new btrfs_root_fs_type also servers as a tag for the bdev_holder.
+ */
 static struct file_system_type btrfs_fs_type;
+static struct file_system_type btrfs_root_fs_type;
 
 static int btrfs_remount(struct super_block *sb, int *flags, char *data);
 
@@ -98,30 +107,6 @@ const char *btrfs_decode_error(int errno)
 	return errstr;
 }
 
-/* btrfs handle error by forcing the filesystem readonly */
-static void btrfs_handle_error(struct btrfs_fs_info *fs_info)
-{
-	struct super_block *sb = fs_info->sb;
-
-	if (sb_rdonly(sb))
-		return;
-
-	if (test_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state)) {
-		sb->s_flags |= SB_RDONLY;
-		btrfs_info(fs_info, "forced readonly");
-		/*
-		 * Note that a running device replace operation is not
-		 * canceled here although there is no way to update
-		 * the progress. It would add the risk of a deadlock,
-		 * therefore the canceling is omitted. The only penalty
-		 * is that some I/O remains active until the procedure
-		 * completes. The next time when the filesystem is
-		 * mounted writeable again, the device replace
-		 * operation continues.
-		 */
-	}
-}
-
 /*
  * __btrfs_handle_fs_error decodes expected errors from the caller and
  * invokes the approciate error response.
@@ -168,8 +153,23 @@ void __btrfs_handle_fs_error(struct btrfs_fs_info *fs_info, const char *function
 	set_bit(BTRFS_FS_STATE_ERROR, &fs_info->fs_state);
 
 	/* Don't go through full error handling during mount */
-	if (sb->s_flags & SB_BORN)
-		btrfs_handle_error(fs_info);
+	if (!(sb->s_flags & SB_BORN))
+		return;
+
+	if (sb_rdonly(sb))
+		return;
+
+	/* btrfs handle error by forcing the filesystem readonly */
+	sb->s_flags |= SB_RDONLY;
+	btrfs_info(fs_info, "forced readonly");
+	/*
+	 * Note that a running device replace operation is not canceled here
+	 * although there is no way to update the progress. It would add the
+	 * risk of a deadlock, therefore the canceling is omitted. The only
+	 * penalty is that some I/O remains active until the procedure
+	 * completes. The next time when the filesystem is mounted writeable
+	 * again, the device replace operation continues.
+	 */
 }
 
 #ifdef CONFIG_PRINTK
@@ -405,7 +405,7 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 			unsigned long new_flags)
 {
 	substring_t args[MAX_OPT_ARGS];
-	char *p, *num, *orig = NULL;
+	char *p, *num;
 	u64 cache_gen;
 	int intarg;
 	int ret = 0;
@@ -428,16 +428,6 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 	if (!options)
 		goto check;
 
-	/*
-	 * strsep changes the string, duplicate it because parse_options
-	 * gets called twice
-	 */
-	options = kstrdup(options, GFP_KERNEL);
-	if (!options)
-		return -ENOMEM;
-
-	orig = options;
-
 	while ((p = strsep(&options, ",")) != NULL) {
 		int token;
 		if (!*p)
@@ -454,7 +444,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 		case Opt_subvolrootid:
 		case Opt_device:
 			/*
-			 * These are parsed by btrfs_parse_early_options
+			 * These are parsed by btrfs_parse_subvol_options
+			 * and btrfs_parse_early_options
 			 * and can be happily ignored here.
 			 */
 			break;
@@ -877,7 +868,6 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
 		btrfs_info(info, "disk space caching is enabled");
 	if (!ret && btrfs_test_opt(info, FREE_SPACE_TREE))
 		btrfs_info(info, "using free space tree");
-	kfree(orig);
 	return ret;
 }
 
@@ -888,11 +878,60 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
  * only when we need to allocate a new super block.
  */
 static int btrfs_parse_early_options(const char *options, fmode_t flags,
-		void *holder, char **subvol_name, u64 *subvol_objectid,
-		struct btrfs_fs_devices **fs_devices)
+		void *holder, struct btrfs_fs_devices **fs_devices)
 {
 	substring_t args[MAX_OPT_ARGS];
 	char *device_name, *opts, *orig, *p;
+	int error = 0;
+
+	if (!options)
+		return 0;
+
+	/*
+	 * strsep changes the string, duplicate it because btrfs_parse_options
+	 * gets called later
+	 */
+	opts = kstrdup(options, GFP_KERNEL);
+	if (!opts)
+		return -ENOMEM;
+	orig = opts;
+
+	while ((p = strsep(&opts, ",")) != NULL) {
+		int token;
+
+		if (!*p)
+			continue;
+
+		token = match_token(p, tokens, args);
+		if (token == Opt_device) {
+			device_name = match_strdup(&args[0]);
+			if (!device_name) {
+				error = -ENOMEM;
+				goto out;
+			}
+			error = btrfs_scan_one_device(device_name,
+					flags, holder, fs_devices);
+			kfree(device_name);
+			if (error)
+				goto out;
+		}
+	}
+
+out:
+	kfree(orig);
+	return error;
+}
+
+/*
+ * Parse mount options that are related to subvolume id
+ *
+ * The value is later passed to mount_subvol()
+ */
+static int btrfs_parse_subvol_options(const char *options, fmode_t flags,
+		char **subvol_name, u64 *subvol_objectid)
+{
+	substring_t args[MAX_OPT_ARGS];
+	char *opts, *orig, *p;
 	char *num = NULL;
 	int error = 0;
 
@@ -900,8 +939,8 @@ static int btrfs_parse_early_options(const char *options, fmode_t flags,
 		return 0;
 
 	/*
-	 * strsep changes the string, duplicate it because parse_options
-	 * gets called twice
+	 * strsep changes the string, duplicate it because
+	 * btrfs_parse_early_options gets called later
 	 */
 	opts = kstrdup(options, GFP_KERNEL);
 	if (!opts)
@@ -940,18 +979,6 @@ static int btrfs_parse_early_options(const char *options, fmode_t flags,
 		case Opt_subvolrootid:
 			pr_warn("BTRFS: 'subvolrootid' mount option is deprecated and has no effect\n");
 			break;
-		case Opt_device:
-			device_name = match_strdup(&args[0]);
-			if (!device_name) {
-				error = -ENOMEM;
-				goto out;
-			}
-			error = btrfs_scan_one_device(device_name,
-					flags, holder, fs_devices);
-			kfree(device_name);
-			if (error)
-				goto out;
-			break;
 		default:
 			break;
 		}
@@ -1243,7 +1270,7 @@ int btrfs_sync_fs(struct super_block *sb, int wait)
 static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 {
 	struct btrfs_fs_info *info = btrfs_sb(dentry->d_sb);
-	char *compress_type;
+	const char *compress_type;
 
 	if (btrfs_test_opt(info, DEGRADED))
 		seq_puts(seq, ",degraded");
@@ -1259,12 +1286,7 @@ static int btrfs_show_options(struct seq_file *seq, struct dentry *dentry)
 					     num_online_cpus() + 2, 8))
 		seq_printf(seq, ",thread_pool=%d", info->thread_pool_size);
 	if (btrfs_test_opt(info, COMPRESS)) {
-		if (info->compress_type == BTRFS_COMPRESS_ZLIB)
-			compress_type = "zlib";
-		else if (info->compress_type == BTRFS_COMPRESS_LZO)
-			compress_type = "lzo";
-		else
-			compress_type = "zstd";
+		compress_type = btrfs_compress_type2str(info->compress_type);
 		if (btrfs_test_opt(info, FORCE_COMPRESS))
 			seq_printf(seq, ",compress-force=%s", compress_type);
 		else
@@ -1365,86 +1387,12 @@ static inline int is_subvolume_inode(struct inode *inode)
 	return 0;
 }
 
-/*
- * This will add subvolid=0 to the argument string while removing any subvol=
- * and subvolid= arguments to make sure we get the top-level root for path
- * walking to the subvol we want.
- */
-static char *setup_root_args(char *args)
-{
-	char *buf, *dst, *sep;
-
-	if (!args)
-		return kstrdup("subvolid=0", GFP_KERNEL);
-
-	/* The worst case is that we add ",subvolid=0" to the end. */
-	buf = dst = kmalloc(strlen(args) + strlen(",subvolid=0") + 1,
-			GFP_KERNEL);
-	if (!buf)
-		return NULL;
-
-	while (1) {
-		sep = strchrnul(args, ',');
-		if (!strstarts(args, "subvol=") &&
-		    !strstarts(args, "subvolid=")) {
-			memcpy(dst, args, sep - args);
-			dst += sep - args;
-			*dst++ = ',';
-		}
-		if (*sep)
-			args = sep + 1;
-		else
-			break;
-	}
-	strcpy(dst, "subvolid=0");
-
-	return buf;
-}
-
 static struct dentry *mount_subvol(const char *subvol_name, u64 subvol_objectid,
-				   int flags, const char *device_name,
-				   char *data)
+				   const char *device_name, struct vfsmount *mnt)
 {
 	struct dentry *root;
-	struct vfsmount *mnt = NULL;
-	char *newargs;
 	int ret;
 
-	newargs = setup_root_args(data);
-	if (!newargs) {
-		root = ERR_PTR(-ENOMEM);
-		goto out;
-	}
-
-	mnt = vfs_kern_mount(&btrfs_fs_type, flags, device_name, newargs);
-	if (PTR_ERR_OR_ZERO(mnt) == -EBUSY) {
-		if (flags & SB_RDONLY) {
-			mnt = vfs_kern_mount(&btrfs_fs_type, flags & ~SB_RDONLY,
-					     device_name, newargs);
-		} else {
-			mnt = vfs_kern_mount(&btrfs_fs_type, flags | SB_RDONLY,
-					     device_name, newargs);
-			if (IS_ERR(mnt)) {
-				root = ERR_CAST(mnt);
-				mnt = NULL;
-				goto out;
-			}
-
-			down_write(&mnt->mnt_sb->s_umount);
-			ret = btrfs_remount(mnt->mnt_sb, &flags, NULL);
-			up_write(&mnt->mnt_sb->s_umount);
-			if (ret < 0) {
-				root = ERR_PTR(ret);
-				goto out;
-			}
-		}
-	}
-	if (IS_ERR(mnt)) {
-		root = ERR_CAST(mnt);
-		mnt = NULL;
-		goto out;
-	}
-
 	if (!subvol_name) {
 		if (!subvol_objectid) {
 			ret = get_default_subvol_objectid(btrfs_sb(mnt->mnt_sb),
@@ -1500,7 +1448,6 @@ static struct dentry *mount_subvol(const char *subvol_name, u64 subvol_objectid,
 
 out:
 	mntput(mnt);
-	kfree(newargs);
 	kfree(subvol_name);
 	return root;
 }
@@ -1558,11 +1505,11 @@ static int setup_security_options(struct btrfs_fs_info *fs_info,
 /*
  * Find a superblock for the given device / mount point.
  *
- * Note:  This is based on get_sb_bdev from fs/super.c with a few additions
- *	  for multiple device setup.  Make sure to keep it in sync.
+ * Note: This is based on mount_bdev from fs/super.c with a few additions
+ *       for multiple device setup.  Make sure to keep it in sync.
  */
-static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags,
-		const char *device_name, void *data)
+static struct dentry *btrfs_mount_root(struct file_system_type *fs_type,
+		int flags, const char *device_name, void *data)
 {
 	struct block_device *bdev = NULL;
 	struct super_block *s;
@@ -1570,27 +1517,17 @@ static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags,
 	struct btrfs_fs_info *fs_info = NULL;
 	struct security_mnt_opts new_sec_opts;
 	fmode_t mode = FMODE_READ;
-	char *subvol_name = NULL;
-	u64 subvol_objectid = 0;
 	int error = 0;
 
 	if (!(flags & SB_RDONLY))
 		mode |= FMODE_WRITE;
 
 	error = btrfs_parse_early_options(data, mode, fs_type,
-					  &subvol_name, &subvol_objectid,
 					  &fs_devices);
 	if (error) {
-		kfree(subvol_name);
 		return ERR_PTR(error);
 	}
 
-	if (subvol_name || subvol_objectid != BTRFS_FS_TREE_OBJECTID) {
-		/* mount_subvol() will free subvol_name. */
-		return mount_subvol(subvol_name, subvol_objectid, flags,
-				    device_name, data);
-	}
-
 	security_init_mnt_opts(&new_sec_opts);
 	if (data) {
 		error = parse_security_options(data, &new_sec_opts);
@@ -1674,6 +1611,84 @@ static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags,
 	return ERR_PTR(error);
 }
 
+/*
+ * Mount function which is called by VFS layer.
+ *
+ * In order to allow mounting a subvolume directly, btrfs uses mount_subtree()
+ * which needs vfsmount* of device's root (/).  This means device's root has to
+ * be mounted internally in any case.
+ *
+ * Operation flow:
+ *   1. Parse subvol id related options for later use in mount_subvol().
+ *
+ *   2. Mount device's root (/) by calling vfs_kern_mount().
+ *
+ *      NOTE: vfs_kern_mount() is used by VFS to call btrfs_mount() in the
+ *      first place. In order to avoid calling btrfs_mount() again, we use
+ *      different file_system_type which is not registered to VFS by
+ *      register_filesystem() (btrfs_root_fs_type). As a result,
+ *      btrfs_mount_root() is called. The return value will be used by
+ *      mount_subtree() in mount_subvol().
+ *
+ *   3. Call mount_subvol() to get the dentry of subvolume. Since there is
+ *      "btrfs subvolume set-default", mount_subvol() is called always.
+ */
+static struct dentry *btrfs_mount(struct file_system_type *fs_type, int flags,
+		const char *device_name, void *data)
+{
+	struct vfsmount *mnt_root;
+	struct dentry *root;
+	fmode_t mode = FMODE_READ;
+	char *subvol_name = NULL;
+	u64 subvol_objectid = 0;
+	int error = 0;
+
+	if (!(flags & SB_RDONLY))
+		mode |= FMODE_WRITE;
+
+	error = btrfs_parse_subvol_options(data, mode,
+					  &subvol_name, &subvol_objectid);
+	if (error) {
+		kfree(subvol_name);
+		return ERR_PTR(error);
+	}
+
+	/* mount device's root (/) */
+	mnt_root = vfs_kern_mount(&btrfs_root_fs_type, flags, device_name, data);
+	if (PTR_ERR_OR_ZERO(mnt_root) == -EBUSY) {
+		if (flags & SB_RDONLY) {
+			mnt_root = vfs_kern_mount(&btrfs_root_fs_type,
+				flags & ~SB_RDONLY, device_name, data);
+		} else {
+			mnt_root = vfs_kern_mount(&btrfs_root_fs_type,
+				flags | SB_RDONLY, device_name, data);
+			if (IS_ERR(mnt_root)) {
+				root = ERR_CAST(mnt_root);
+				goto out;
+			}
+
+			down_write(&mnt_root->mnt_sb->s_umount);
+			error = btrfs_remount(mnt_root->mnt_sb, &flags, NULL);
+			up_write(&mnt_root->mnt_sb->s_umount);
+			if (error < 0) {
+				root = ERR_PTR(error);
+				mntput(mnt_root);
+				goto out;
+			}
+		}
+	}
+	if (IS_ERR(mnt_root)) {
+		root = ERR_CAST(mnt_root);
+		goto out;
+	}
+
+	/* mount_subvol() will free subvol_name and mnt_root */
+	root = mount_subvol(subvol_name, subvol_objectid, device_name, mnt_root);
+
+out:
+	return root;
+}
+
 static void btrfs_resize_thread_pool(struct btrfs_fs_info *fs_info,
 				     int new_pool_size, int old_pool_size)
 {
@@ -1820,7 +1835,7 @@ static int btrfs_remount(struct super_block *sb, int *flags, char *data)
 			goto restore;
 		}
 
-		if (!btrfs_check_rw_degradable(fs_info)) {
+		if (!btrfs_check_rw_degradable(fs_info, NULL)) {
 			btrfs_warn(fs_info,
 				"too many missing devices, writeable remount is not allowed");
 			ret = -EACCES;
@@ -1972,8 +1987,10 @@ static int btrfs_calc_avail_data_space(struct btrfs_fs_info *fs_info,
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(device, &fs_devices->devices, dev_list) {
-		if (!device->in_fs_metadata || !device->bdev ||
-		    device->is_tgtdev_for_dev_replace)
+		if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+						&device->dev_state) ||
+		    !device->bdev ||
+		    test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
 			continue;
 
 		if (i >= nr_devices)
@@ -2174,6 +2191,15 @@ static struct file_system_type btrfs_fs_type = {
 	.kill_sb	= btrfs_kill_super,
 	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA,
 };
+
+static struct file_system_type btrfs_root_fs_type = {
+	.owner		= THIS_MODULE,
+	.name		= "btrfs",
+	.mount		= btrfs_mount_root,
+	.kill_sb	= btrfs_kill_super,
+	.fs_flags	= FS_REQUIRES_DEV | FS_BINARY_MOUNTDATA,
+};
+
 MODULE_ALIAS_FS("btrfs");
 
 static int btrfs_control_open(struct inode *inode, struct file *file)
@@ -2207,11 +2233,11 @@ static long btrfs_control_ioctl(struct file *file, unsigned int cmd,
 	switch (cmd) {
 	case BTRFS_IOC_SCAN_DEV:
 		ret = btrfs_scan_one_device(vol->name, FMODE_READ,
-					    &btrfs_fs_type, &fs_devices);
+					    &btrfs_root_fs_type, &fs_devices);
 		break;
 	case BTRFS_IOC_DEVICES_READY:
 		ret = btrfs_scan_one_device(vol->name, FMODE_READ,
-					    &btrfs_fs_type, &fs_devices);
+					    &btrfs_root_fs_type, &fs_devices);
 		if (ret)
 			break;
 		ret = !(fs_devices->num_devices == fs_devices->total_devices);
@@ -2269,7 +2295,7 @@ static int btrfs_show_devname(struct seq_file *m, struct dentry *root)
 	while (cur_devices) {
 		head = &cur_devices->devices;
 		list_for_each_entry(dev, head, dev_list) {
-			if (dev->missing)
+			if (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state))
 				continue;
 			if (!dev->name)
 				continue;
@@ -2324,7 +2350,7 @@ static struct miscdevice btrfs_misc = {
 MODULE_ALIAS_MISCDEV(BTRFS_MINOR);
 MODULE_ALIAS("devname:btrfs-control");
 
-static int btrfs_interface_init(void)
+static int __init btrfs_interface_init(void)
 {
 	return misc_register(&btrfs_misc);
 }
@@ -2334,7 +2360,7 @@ static void btrfs_interface_exit(void)
 	misc_deregister(&btrfs_misc);
 }
 
-static void btrfs_print_mod_info(void)
+static void __init btrfs_print_mod_info(void)
 {
 	pr_info("Btrfs loaded, crc32c=%s"
 #ifdef CONFIG_BTRFS_DEBUG
diff --git a/fs/btrfs/sysfs.c b/fs/btrfs/sysfs.c
index a28bba8..a8bafed 100644
--- a/fs/btrfs/sysfs.c
+++ b/fs/btrfs/sysfs.c
@@ -897,7 +897,7 @@ static int btrfs_init_debugfs(void)
 	return 0;
 }
 
-int btrfs_init_sysfs(void)
+int __init btrfs_init_sysfs(void)
 {
 	int ret;
 
diff --git a/fs/btrfs/tests/btrfs-tests.c b/fs/btrfs/tests/btrfs-tests.c
index d3f2537..9786d8c 100644
--- a/fs/btrfs/tests/btrfs-tests.c
+++ b/fs/btrfs/tests/btrfs-tests.c
@@ -277,6 +277,9 @@ int btrfs_run_sanity_tests(void)
 				goto out;
 		}
 	}
+	ret = btrfs_test_extent_map();
+	if (ret)
+		goto out;
 out:
 	btrfs_destroy_test_fs();
 	return ret;
diff --git a/fs/btrfs/tests/btrfs-tests.h b/fs/btrfs/tests/btrfs-tests.h
index 266f1e3..bc0615b 100644
--- a/fs/btrfs/tests/btrfs-tests.h
+++ b/fs/btrfs/tests/btrfs-tests.h
@@ -33,6 +33,7 @@ int btrfs_test_extent_io(u32 sectorsize, u32 nodesize);
 int btrfs_test_inodes(u32 sectorsize, u32 nodesize);
 int btrfs_test_qgroups(u32 sectorsize, u32 nodesize);
 int btrfs_test_free_space_tree(u32 sectorsize, u32 nodesize);
+int btrfs_test_extent_map(void);
 struct inode *btrfs_new_test_inode(void);
 struct btrfs_fs_info *btrfs_alloc_dummy_fs_info(u32 nodesize, u32 sectorsize);
 void btrfs_free_dummy_fs_info(struct btrfs_fs_info *fs_info);
diff --git a/fs/btrfs/tests/extent-map-tests.c b/fs/btrfs/tests/extent-map-tests.c
new file mode 100644
index 0000000..70c993f0
--- /dev/null
+++ b/fs/btrfs/tests/extent-map-tests.c
@@ -0,0 +1,366 @@
+/*
+ * Copyright (C) 2017 Oracle.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License v2 as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+#include <linux/types.h>
+#include "btrfs-tests.h"
+#include "../ctree.h"
+
+static void free_extent_map_tree(struct extent_map_tree *em_tree)
+{
+	struct extent_map *em;
+	struct rb_node *node;
+
+	while (!RB_EMPTY_ROOT(&em_tree->map)) {
+		node = rb_first(&em_tree->map);
+		em = rb_entry(node, struct extent_map, rb_node);
+		remove_extent_mapping(em_tree, em);
+
+#ifdef CONFIG_BTRFS_DEBUG
+		if (refcount_read(&em->refs) != 1) {
+			test_msg(
+"em leak: em (start 0x%llx len 0x%llx block_start 0x%llx block_len 0x%llx) refs %d\n",
+				 em->start, em->len, em->block_start,
+				 em->block_len, refcount_read(&em->refs));
+
+			refcount_set(&em->refs, 1);
+		}
+#endif
+		free_extent_map(em);
+	}
+}
+
+/*
+ * Test scenario:
+ *
+ * Suppose that no extent map has been loaded into memory yet, there is a file
+ * extent [0, 16K), followed by another file extent [16K, 20K), two dio reads
+ * are entering btrfs_get_extent() concurrently, t1 is reading [8K, 16K), t2 is
+ * reading [0, 8K)
+ *
+ *     t1                            t2
+ *  btrfs_get_extent()              btrfs_get_extent()
+ *    -> lookup_extent_mapping()      ->lookup_extent_mapping()
+ *    -> add_extent_mapping(0, 16K)
+ *    -> return em
+ *                                    ->add_extent_mapping(0, 16K)
+ *                                    -> #handle -EEXIST
+ */
+static void test_case_1(struct extent_map_tree *em_tree)
+{
+	struct extent_map *em;
+	u64 start = 0;
+	u64 len = SZ_8K;
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em)
+		/* Skip the test on error. */
+		return;
+
+	/* Add [0, 16K) */
+	em->start = 0;
+	em->len = SZ_16K;
+	em->block_start = 0;
+	em->block_len = SZ_16K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	/* Add [16K, 20K) following [0, 16K)  */
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	em->start = SZ_16K;
+	em->len = SZ_4K;
+	em->block_start = SZ_32K; /* avoid merging */
+	em->block_len = SZ_4K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	/* Add [0, 8K), should return [0, 16K) instead. */
+	em->start = start;
+	em->len = len;
+	em->block_start = start;
+	em->block_len = len;
+	ret = btrfs_add_extent_mapping(em_tree, &em, em->start, em->len);
+	if (ret)
+		test_msg("case1 [%llu %llu]: ret %d\n", start, start + len, ret);
+	if (em &&
+	    (em->start != 0 || extent_map_end(em) != SZ_16K ||
+	     em->block_start != 0 || em->block_len != SZ_16K))
+		test_msg(
+"case1 [%llu %llu]: ret %d return a wrong em (start %llu len %llu block_start %llu block_len %llu\n",
+			 start, start + len, ret, em->start, em->len,
+			 em->block_start, em->block_len);
+	free_extent_map(em);
+out:
+	/* free memory */
+	free_extent_map_tree(em_tree);
+}
+
+/*
+ * Test scenario:
+ *
+ * Reading the inline ending up with EEXIST, ie. read an inline
+ * extent and discard page cache and read it again.
+ */
+static void test_case_2(struct extent_map_tree *em_tree)
+{
+	struct extent_map *em;
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em)
+		/* Skip the test on error. */
+		return;
+
+	/* Add [0, 1K) */
+	em->start = 0;
+	em->len = SZ_1K;
+	em->block_start = EXTENT_MAP_INLINE;
+	em->block_len = (u64)-1;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	/* Add [4K, 4K) following [0, 1K)  */
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	em->start = SZ_4K;
+	em->len = SZ_4K;
+	em->block_start = SZ_4K;
+	em->block_len = SZ_4K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	/* Add [0, 1K) */
+	em->start = 0;
+	em->len = SZ_1K;
+	em->block_start = EXTENT_MAP_INLINE;
+	em->block_len = (u64)-1;
+	ret = btrfs_add_extent_mapping(em_tree, &em, em->start, em->len);
+	if (ret)
+		test_msg("case2 [0 1K]: ret %d\n", ret);
+	if (em &&
+	    (em->start != 0 || extent_map_end(em) != SZ_1K ||
+	     em->block_start != EXTENT_MAP_INLINE || em->block_len != (u64)-1))
+		test_msg(
+"case2 [0 1K]: ret %d return a wrong em (start %llu len %llu block_start %llu block_len %llu\n",
+			 ret, em->start, em->len, em->block_start,
+			 em->block_len);
+	free_extent_map(em);
+out:
+	/* free memory */
+	free_extent_map_tree(em_tree);
+}
+
+static void __test_case_3(struct extent_map_tree *em_tree, u64 start)
+{
+	struct extent_map *em;
+	u64 len = SZ_4K;
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em)
+		/* Skip this test on error. */
+		return;
+
+	/* Add [4K, 8K) */
+	em->start = SZ_4K;
+	em->len = SZ_4K;
+	em->block_start = SZ_4K;
+	em->block_len = SZ_4K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	/* Add [0, 16K) */
+	em->start = 0;
+	em->len = SZ_16K;
+	em->block_start = 0;
+	em->block_len = SZ_16K;
+	ret = btrfs_add_extent_mapping(em_tree, &em, start, len);
+	if (ret)
+		test_msg("case3 [0x%llx 0x%llx): ret %d\n",
+			 start, start + len, ret);
+	/*
+	 * Since bytes within em are contiguous, em->block_start is identical to
+	 * em->start.
+	 */
+	if (em &&
+	    (start < em->start || start + len > extent_map_end(em) ||
+	     em->start != em->block_start || em->len != em->block_len))
+		test_msg(
+"case3 [0x%llx 0x%llx): ret %d em (start 0x%llx len 0x%llx block_start 0x%llx block_len 0x%llx)\n",
+			 start, start + len, ret, em->start, em->len,
+			 em->block_start, em->block_len);
+	free_extent_map(em);
+out:
+	/* free memory */
+	free_extent_map_tree(em_tree);
+}
+
+/*
+ * Test scenario:
+ *
+ * Suppose that no extent map has been loaded into memory yet.
+ * There is a file extent [0, 16K), two jobs are running concurrently
+ * against it, t1 is buffered writing to [4K, 8K) and t2 is doing dio
+ * read from [0, 4K) or [8K, 12K) or [12K, 16K).
+ *
+ * t1 goes ahead of t2 and adds em [4K, 8K) into tree.
+ *
+ *         t1                       t2
+ *  cow_file_range()	     btrfs_get_extent()
+ *                            -> lookup_extent_mapping()
+ *   -> add_extent_mapping()
+ *                            -> add_extent_mapping()
+ */
+static void test_case_3(struct extent_map_tree *em_tree)
+{
+	__test_case_3(em_tree, 0);
+	__test_case_3(em_tree, SZ_8K);
+	__test_case_3(em_tree, (12 * 1024ULL));
+}
+
+static void __test_case_4(struct extent_map_tree *em_tree, u64 start)
+{
+	struct extent_map *em;
+	u64 len = SZ_4K;
+	int ret;
+
+	em = alloc_extent_map();
+	if (!em)
+		/* Skip this test on error. */
+		return;
+
+	/* Add [0K, 8K) */
+	em->start = 0;
+	em->len = SZ_8K;
+	em->block_start = 0;
+	em->block_len = SZ_8K;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+
+	/* Add [8K, 24K) */
+	em->start = SZ_8K;
+	em->len = 24 * 1024ULL;
+	em->block_start = SZ_16K; /* avoid merging */
+	em->block_len = 24 * 1024ULL;
+	ret = add_extent_mapping(em_tree, em, 0);
+	ASSERT(ret == 0);
+	free_extent_map(em);
+
+	em = alloc_extent_map();
+	if (!em)
+		goto out;
+	/* Add [0K, 32K) */
+	em->start = 0;
+	em->len = SZ_32K;
+	em->block_start = 0;
+	em->block_len = SZ_32K;
+	ret = btrfs_add_extent_mapping(em_tree, &em, start, len);
+	if (ret)
+		test_msg("case4 [0x%llx 0x%llx): ret %d\n",
+			 start, len, ret);
+	if (em &&
+	    (start < em->start || start + len > extent_map_end(em)))
+		test_msg(
+"case4 [0x%llx 0x%llx): ret %d, added wrong em (start 0x%llx len 0x%llx block_start 0x%llx block_len 0x%llx)\n",
+			 start, len, ret, em->start, em->len, em->block_start,
+			 em->block_len);
+	free_extent_map(em);
+out:
+	/* free memory */
+	free_extent_map_tree(em_tree);
+}
+
+/*
+ * Test scenario:
+ *
+ * Suppose that no extent map has been loaded into memory yet.
+ * There is a file extent [0, 32K), two jobs are running concurrently
+ * against it, t1 is doing dio write to [8K, 32K) and t2 is doing dio
+ * read from [0, 4K) or [4K, 8K).
+ *
+ * t1 goes ahead of t2 and splits em [0, 32K) to em [0K, 8K) and [8K 32K).
+ *
+ *         t1                                t2
+ *  btrfs_get_blocks_direct()	       btrfs_get_blocks_direct()
+ *   -> btrfs_get_extent()              -> btrfs_get_extent()
+ *       -> lookup_extent_mapping()
+ *       -> add_extent_mapping()            -> lookup_extent_mapping()
+ *          # load [0, 32K)
+ *   -> btrfs_new_extent_direct()
+ *       -> btrfs_drop_extent_cache()
+ *          # split [0, 32K)
+ *       -> add_extent_mapping()
+ *          # add [8K, 32K)
+ *                                          -> add_extent_mapping()
+ *                                             # handle -EEXIST when adding
+ *                                             # [0, 32K)
+ */
+static void test_case_4(struct extent_map_tree *em_tree)
+{
+	__test_case_4(em_tree, 0);
+	__test_case_4(em_tree, SZ_4K);
+}
+
+int btrfs_test_extent_map()
+{
+	struct extent_map_tree *em_tree;
+
+	test_msg("Running extent_map tests\n");
+
+	em_tree = kzalloc(sizeof(*em_tree), GFP_KERNEL);
+	if (!em_tree)
+		/* Skip the test on error. */
+		return 0;
+
+	extent_map_tree_init(em_tree);
+
+	test_case_1(em_tree);
+	test_case_2(em_tree);
+	test_case_3(em_tree);
+	test_case_4(em_tree);
+
+	kfree(em_tree);
+	return 0;
+}
diff --git a/fs/btrfs/tests/inode-tests.c b/fs/btrfs/tests/inode-tests.c
index 30affb6..13420cd 100644
--- a/fs/btrfs/tests/inode-tests.c
+++ b/fs/btrfs/tests/inode-tests.c
@@ -288,10 +288,6 @@ static noinline int test_btrfs_get_extent(u32 sectorsize, u32 nodesize)
 		test_msg("Expected a hole, got %llu\n", em->block_start);
 		goto out;
 	}
-	if (!test_bit(EXTENT_FLAG_VACANCY, &em->flags)) {
-		test_msg("Vacancy flag wasn't set properly\n");
-		goto out;
-	}
 	free_extent_map(em);
 	btrfs_drop_extent_cache(BTRFS_I(inode), 0, (u64)-1, 0);
 
@@ -1001,8 +997,7 @@ static int test_extent_accounting(u32 sectorsize, u32 nodesize)
 			       BTRFS_MAX_EXTENT_SIZE >> 1,
 			       (BTRFS_MAX_EXTENT_SIZE >> 1) + sectorsize - 1,
 			       EXTENT_DELALLOC | EXTENT_DIRTY |
-			       EXTENT_UPTODATE, 0, 0,
-			       NULL, GFP_KERNEL);
+			       EXTENT_UPTODATE, 0, 0, NULL);
 	if (ret) {
 		test_msg("clear_extent_bit returned %d\n", ret);
 		goto out;
@@ -1070,8 +1065,7 @@ static int test_extent_accounting(u32 sectorsize, u32 nodesize)
 			       BTRFS_MAX_EXTENT_SIZE + sectorsize,
 			       BTRFS_MAX_EXTENT_SIZE + 2 * sectorsize - 1,
 			       EXTENT_DIRTY | EXTENT_DELALLOC |
-			       EXTENT_UPTODATE, 0, 0,
-			       NULL, GFP_KERNEL);
+			       EXTENT_UPTODATE, 0, 0, NULL);
 	if (ret) {
 		test_msg("clear_extent_bit returned %d\n", ret);
 		goto out;
@@ -1104,8 +1098,7 @@ static int test_extent_accounting(u32 sectorsize, u32 nodesize)
 	/* Empty */
 	ret = clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, (u64)-1,
 			       EXTENT_DIRTY | EXTENT_DELALLOC |
-			       EXTENT_UPTODATE, 0, 0,
-			       NULL, GFP_KERNEL);
+			       EXTENT_UPTODATE, 0, 0, NULL);
 	if (ret) {
 		test_msg("clear_extent_bit returned %d\n", ret);
 		goto out;
@@ -1121,8 +1114,7 @@ static int test_extent_accounting(u32 sectorsize, u32 nodesize)
 	if (ret)
 		clear_extent_bit(&BTRFS_I(inode)->io_tree, 0, (u64)-1,
 				 EXTENT_DIRTY | EXTENT_DELALLOC |
-				 EXTENT_UPTODATE, 0, 0,
-				 NULL, GFP_KERNEL);
+				 EXTENT_UPTODATE, 0, 0, NULL);
 	iput(inode);
 	btrfs_free_dummy_root(root);
 	btrfs_free_dummy_fs_info(fs_info);
@@ -1134,7 +1126,6 @@ int btrfs_test_inodes(u32 sectorsize, u32 nodesize)
 	int ret;
 
 	set_bit(EXTENT_FLAG_COMPRESSED, &compressed_only);
-	set_bit(EXTENT_FLAG_VACANCY, &vacancy_only);
 	set_bit(EXTENT_FLAG_PREALLOC, &prealloc_only);
 
 	test_msg("Running btrfs_get_extent tests\n");
diff --git a/fs/btrfs/transaction.c b/fs/btrfs/transaction.c
index 5a8c264..04f0714 100644
--- a/fs/btrfs/transaction.c
+++ b/fs/btrfs/transaction.c
@@ -495,8 +495,8 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
 	if (current->journal_info) {
 		WARN_ON(type & TRANS_EXTWRITERS);
 		h = current->journal_info;
-		h->use_count++;
-		WARN_ON(h->use_count > 2);
+		refcount_inc(&h->use_count);
+		WARN_ON(refcount_read(&h->use_count) > 2);
 		h->orig_rsv = h->block_rsv;
 		h->block_rsv = NULL;
 		goto got_it;
@@ -567,7 +567,7 @@ start_transaction(struct btrfs_root *root, unsigned int num_items,
 	h->transid = cur_trans->transid;
 	h->transaction = cur_trans;
 	h->root = root;
-	h->use_count = 1;
+	refcount_set(&h->use_count, 1);
 	h->fs_info = root->fs_info;
 
 	h->type = type;
@@ -837,8 +837,8 @@ static int __btrfs_end_transaction(struct btrfs_trans_handle *trans,
 	int err = 0;
 	int must_run_delayed_refs = 0;
 
-	if (trans->use_count > 1) {
-		trans->use_count--;
+	if (refcount_read(&trans->use_count) > 1) {
+		refcount_dec(&trans->use_count);
 		trans->block_rsv = trans->orig_rsv;
 		return 0;
 	}
@@ -1016,8 +1016,7 @@ static int __btrfs_wait_marked_extents(struct btrfs_fs_info *fs_info,
 		 * it's safe to do it (through clear_btree_io_tree()).
 		 */
 		err = clear_extent_bit(dirty_pages, start, end,
-				       EXTENT_NEED_WAIT,
-				       0, 0, &cached_state, GFP_NOFS);
+				       EXTENT_NEED_WAIT, 0, 0, &cached_state);
 		if (err == -ENOMEM)
 			err = 0;
 		if (!err)
@@ -1869,7 +1868,7 @@ static void cleanup_transaction(struct btrfs_trans_handle *trans,
 	struct btrfs_transaction *cur_trans = trans->transaction;
 	DEFINE_WAIT(wait);
 
-	WARN_ON(trans->use_count > 1);
+	WARN_ON(refcount_read(&trans->use_count) > 1);
 
 	btrfs_abort_transaction(trans, err);
 
@@ -2266,16 +2265,13 @@ int btrfs_commit_transaction(struct btrfs_trans_handle *trans)
 	}
 
 	ret = write_all_supers(fs_info, 0);
-	if (ret) {
-		mutex_unlock(&fs_info->tree_log_mutex);
-		goto scrub_continue;
-	}
-
 	/*
 	 * the super is written, we can safely allow the tree-loggers
 	 * to go about their business
 	 */
 	mutex_unlock(&fs_info->tree_log_mutex);
+	if (ret)
+		goto scrub_continue;
 
 	btrfs_finish_extent_commit(trans, fs_info);
 
diff --git a/fs/btrfs/transaction.h b/fs/btrfs/transaction.h
index c55e445..6beee07 100644
--- a/fs/btrfs/transaction.h
+++ b/fs/btrfs/transaction.h
@@ -58,6 +58,7 @@ struct btrfs_transaction {
 
 	/* Be protected by fs_info->trans_lock when we want to change it. */
 	enum btrfs_trans_state state;
+	int aborted;
 	struct list_head list;
 	struct extent_io_tree dirty_pages;
 	unsigned long start_time;
@@ -70,7 +71,6 @@ struct btrfs_transaction {
 	struct list_head dirty_bgs;
 	struct list_head io_bgs;
 	struct list_head dropped_roots;
-	u64 num_dirty_bgs;
 
 	/*
 	 * we need to make sure block group deletion doesn't race with
@@ -79,11 +79,11 @@ struct btrfs_transaction {
 	 */
 	struct mutex cache_write_mutex;
 	spinlock_t dirty_bgs_lock;
+	unsigned int num_dirty_bgs;
 	/* Protected by spin lock fs_info->unused_bgs_lock. */
 	struct list_head deleted_bgs;
 	spinlock_t dropped_roots_lock;
 	struct btrfs_delayed_ref_root delayed_refs;
-	int aborted;
 	struct btrfs_fs_info *fs_info;
 };
 
@@ -111,20 +111,19 @@ struct btrfs_trans_handle {
 	u64 transid;
 	u64 bytes_reserved;
 	u64 chunk_bytes_reserved;
-	unsigned long use_count;
-	unsigned long blocks_reserved;
 	unsigned long delayed_ref_updates;
 	struct btrfs_transaction *transaction;
 	struct btrfs_block_rsv *block_rsv;
 	struct btrfs_block_rsv *orig_rsv;
+	refcount_t use_count;
+	unsigned int type;
 	short aborted;
-	short adding_csums;
+	bool adding_csums;
 	bool allocating_chunk;
 	bool can_flush_pending_bgs;
 	bool reloc_reserved;
 	bool sync;
 	bool dirty;
-	unsigned int type;
 	struct btrfs_root *root;
 	struct btrfs_fs_info *fs_info;
 	struct list_head new_bgs;
diff --git a/fs/btrfs/tree-checker.c b/fs/btrfs/tree-checker.c
index ce4ed6e..c3c8d48 100644
--- a/fs/btrfs/tree-checker.c
+++ b/fs/btrfs/tree-checker.c
@@ -30,6 +30,7 @@
 #include "tree-checker.h"
 #include "disk-io.h"
 #include "compression.h"
+#include "hash.h"
 
 /*
  * Error message should follow the following format:
@@ -223,6 +224,142 @@ static int check_csum_item(struct btrfs_root *root, struct extent_buffer *leaf,
 }
 
 /*
+ * Customized reported for dir_item, only important new info is key->objectid,
+ * which represents inode number
+ */
+__printf(4, 5)
+static void dir_item_err(const struct btrfs_root *root,
+			 const struct extent_buffer *eb, int slot,
+			 const char *fmt, ...)
+{
+	struct btrfs_key key;
+	struct va_format vaf;
+	va_list args;
+
+	btrfs_item_key_to_cpu(eb, &key, slot);
+	va_start(args, fmt);
+
+	vaf.fmt = fmt;
+	vaf.va = &args;
+
+	btrfs_crit(root->fs_info,
+	"corrupt %s: root=%llu block=%llu slot=%d ino=%llu, %pV",
+		btrfs_header_level(eb) == 0 ? "leaf" : "node", root->objectid,
+		btrfs_header_bytenr(eb), slot, key.objectid, &vaf);
+	va_end(args);
+}
+
+static int check_dir_item(struct btrfs_root *root,
+			  struct extent_buffer *leaf,
+			  struct btrfs_key *key, int slot)
+{
+	struct btrfs_dir_item *di;
+	u32 item_size = btrfs_item_size_nr(leaf, slot);
+	u32 cur = 0;
+
+	di = btrfs_item_ptr(leaf, slot, struct btrfs_dir_item);
+	while (cur < item_size) {
+		u32 name_len;
+		u32 data_len;
+		u32 max_name_len;
+		u32 total_size;
+		u32 name_hash;
+		u8 dir_type;
+
+		/* header itself should not cross item boundary */
+		if (cur + sizeof(*di) > item_size) {
+			dir_item_err(root, leaf, slot,
+		"dir item header crosses item boundary, have %zu boundary %u",
+				cur + sizeof(*di), item_size);
+			return -EUCLEAN;
+		}
+
+		/* dir type check */
+		dir_type = btrfs_dir_type(leaf, di);
+		if (dir_type >= BTRFS_FT_MAX) {
+			dir_item_err(root, leaf, slot,
+			"invalid dir item type, have %u expect [0, %u)",
+				dir_type, BTRFS_FT_MAX);
+			return -EUCLEAN;
+		}
+
+		if (key->type == BTRFS_XATTR_ITEM_KEY &&
+		    dir_type != BTRFS_FT_XATTR) {
+			dir_item_err(root, leaf, slot,
+		"invalid dir item type for XATTR key, have %u expect %u",
+				dir_type, BTRFS_FT_XATTR);
+			return -EUCLEAN;
+		}
+		if (dir_type == BTRFS_FT_XATTR &&
+		    key->type != BTRFS_XATTR_ITEM_KEY) {
+			dir_item_err(root, leaf, slot,
+			"xattr dir type found for non-XATTR key");
+			return -EUCLEAN;
+		}
+		if (dir_type == BTRFS_FT_XATTR)
+			max_name_len = XATTR_NAME_MAX;
+		else
+			max_name_len = BTRFS_NAME_LEN;
+
+		/* Name/data length check */
+		name_len = btrfs_dir_name_len(leaf, di);
+		data_len = btrfs_dir_data_len(leaf, di);
+		if (name_len > max_name_len) {
+			dir_item_err(root, leaf, slot,
+			"dir item name len too long, have %u max %u",
+				name_len, max_name_len);
+			return -EUCLEAN;
+		}
+		if (name_len + data_len > BTRFS_MAX_XATTR_SIZE(root->fs_info)) {
+			dir_item_err(root, leaf, slot,
+			"dir item name and data len too long, have %u max %u",
+				name_len + data_len,
+				BTRFS_MAX_XATTR_SIZE(root->fs_info));
+			return -EUCLEAN;
+		}
+
+		if (data_len && dir_type != BTRFS_FT_XATTR) {
+			dir_item_err(root, leaf, slot,
+			"dir item with invalid data len, have %u expect 0",
+				data_len);
+			return -EUCLEAN;
+		}
+
+		total_size = sizeof(*di) + name_len + data_len;
+
+		/* header and name/data should not cross item boundary */
+		if (cur + total_size > item_size) {
+			dir_item_err(root, leaf, slot,
+		"dir item data crosses item boundary, have %u boundary %u",
+				cur + total_size, item_size);
+			return -EUCLEAN;
+		}
+
+		/*
+		 * Special check for XATTR/DIR_ITEM, as key->offset is name
+		 * hash, should match its name
+		 */
+		if (key->type == BTRFS_DIR_ITEM_KEY ||
+		    key->type == BTRFS_XATTR_ITEM_KEY) {
+			char namebuf[max(BTRFS_NAME_LEN, XATTR_NAME_MAX)];
+
+			read_extent_buffer(leaf, namebuf,
+					(unsigned long)(di + 1), name_len);
+			name_hash = btrfs_name_hash(namebuf, name_len);
+			if (key->offset != name_hash) {
+				dir_item_err(root, leaf, slot,
+		"name hash mismatch with key, have 0x%016x expect 0x%016llx",
+					name_hash, key->offset);
+				return -EUCLEAN;
+			}
+		}
+		cur += total_size;
+		di = (struct btrfs_dir_item *)((void *)di + total_size);
+	}
+	return 0;
+}
+
+/*
  * Common point to switch the item-specific validation.
  */
 static int check_leaf_item(struct btrfs_root *root,
@@ -238,6 +375,11 @@ static int check_leaf_item(struct btrfs_root *root,
 	case BTRFS_EXTENT_CSUM_KEY:
 		ret = check_csum_item(root, leaf, key, slot);
 		break;
+	case BTRFS_DIR_ITEM_KEY:
+	case BTRFS_DIR_INDEX_KEY:
+	case BTRFS_XATTR_ITEM_KEY:
+		ret = check_dir_item(root, leaf, key, slot);
+		break;
 	}
 	return ret;
 }
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 7bf9b31..afadaad 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -20,6 +20,7 @@
 #include <linux/slab.h>
 #include <linux/blkdev.h>
 #include <linux/list_sort.h>
+#include <linux/iversion.h>
 #include "tree-log.h"
 #include "disk-io.h"
 #include "locking.h"
@@ -1173,19 +1174,15 @@ static inline int __add_inode_ref(struct btrfs_trans_handle *trans,
 	return 0;
 }
 
-static int extref_get_fields(struct extent_buffer *eb, int slot,
-			     unsigned long ref_ptr, u32 *namelen, char **name,
-			     u64 *index, u64 *parent_objectid)
+static int extref_get_fields(struct extent_buffer *eb, unsigned long ref_ptr,
+			     u32 *namelen, char **name, u64 *index,
+			     u64 *parent_objectid)
 {
 	struct btrfs_inode_extref *extref;
 
 	extref = (struct btrfs_inode_extref *)ref_ptr;
 
 	*namelen = btrfs_inode_extref_name_len(eb, extref);
-	if (!btrfs_is_name_len_valid(eb, slot, (unsigned long)&extref->name,
-				     *namelen))
-		return -EIO;
-
 	*name = kmalloc(*namelen, GFP_NOFS);
 	if (*name == NULL)
 		return -ENOMEM;
@@ -1200,19 +1197,14 @@ static int extref_get_fields(struct extent_buffer *eb, int slot,
 	return 0;
 }
 
-static int ref_get_fields(struct extent_buffer *eb, int slot,
-			  unsigned long ref_ptr, u32 *namelen, char **name,
-			  u64 *index)
+static int ref_get_fields(struct extent_buffer *eb, unsigned long ref_ptr,
+			  u32 *namelen, char **name, u64 *index)
 {
 	struct btrfs_inode_ref *ref;
 
 	ref = (struct btrfs_inode_ref *)ref_ptr;
 
 	*namelen = btrfs_inode_ref_name_len(eb, ref);
-	if (!btrfs_is_name_len_valid(eb, slot, (unsigned long)(ref + 1),
-				     *namelen))
-		return -EIO;
-
 	*name = kmalloc(*namelen, GFP_NOFS);
 	if (*name == NULL)
 		return -ENOMEM;
@@ -1287,8 +1279,8 @@ static noinline int add_inode_ref(struct btrfs_trans_handle *trans,
 
 	while (ref_ptr < ref_end) {
 		if (log_ref_ver) {
-			ret = extref_get_fields(eb, slot, ref_ptr, &namelen,
-					  &name, &ref_index, &parent_objectid);
+			ret = extref_get_fields(eb, ref_ptr, &namelen, &name,
+						&ref_index, &parent_objectid);
 			/*
 			 * parent object can change from one array
 			 * item to another.
@@ -1300,8 +1292,8 @@ static noinline int add_inode_ref(struct btrfs_trans_handle *trans,
 				goto out;
 			}
 		} else {
-			ret = ref_get_fields(eb, slot, ref_ptr, &namelen,
-					     &name, &ref_index);
+			ret = ref_get_fields(eb, ref_ptr, &namelen, &name,
+					     &ref_index);
 		}
 		if (ret)
 			goto out;
@@ -1835,7 +1827,6 @@ static noinline int replay_one_dir_item(struct btrfs_trans_handle *trans,
 					struct extent_buffer *eb, int slot,
 					struct btrfs_key *key)
 {
-	struct btrfs_fs_info *fs_info = root->fs_info;
 	int ret = 0;
 	u32 item_size = btrfs_item_size_nr(eb, slot);
 	struct btrfs_dir_item *di;
@@ -1848,8 +1839,6 @@ static noinline int replay_one_dir_item(struct btrfs_trans_handle *trans,
 	ptr_end = ptr + item_size;
 	while (ptr < ptr_end) {
 		di = (struct btrfs_dir_item *)ptr;
-		if (verify_dir_item(fs_info, eb, slot, di))
-			return -EIO;
 		name_len = btrfs_dir_name_len(eb, di);
 		ret = replay_one_name(trans, root, path, eb, di, key);
 		if (ret < 0)
@@ -2024,11 +2013,6 @@ static noinline int check_item_in_log(struct btrfs_trans_handle *trans,
 	ptr_end = ptr + item_size;
 	while (ptr < ptr_end) {
 		di = (struct btrfs_dir_item *)ptr;
-		if (verify_dir_item(fs_info, eb, slot, di)) {
-			ret = -EIO;
-			goto out;
-		}
-
 		name_len = btrfs_dir_name_len(eb, di);
 		name = kmalloc(name_len, GFP_NOFS);
 		if (!name) {
@@ -2109,7 +2093,6 @@ static int replay_xattr_deletes(struct btrfs_trans_handle *trans,
 			      struct btrfs_path *path,
 			      const u64 ino)
 {
-	struct btrfs_fs_info *fs_info = root->fs_info;
 	struct btrfs_key search_key;
 	struct btrfs_path *log_path;
 	int i;
@@ -2151,11 +2134,6 @@ static int replay_xattr_deletes(struct btrfs_trans_handle *trans,
 			u32 this_len = sizeof(*di) + name_len + data_len;
 			char *name;
 
-			ret = verify_dir_item(fs_info, path->nodes[0], i, di);
-			if (ret) {
-				ret = -EIO;
-				goto out;
-			}
 			name = kmalloc(name_len, GFP_NOFS);
 			if (!name) {
 				ret = -ENOMEM;
@@ -3609,7 +3587,8 @@ static void fill_inode_item(struct btrfs_trans_handle *trans,
 	btrfs_set_token_inode_nbytes(leaf, item, inode_get_bytes(inode),
 				     &token);
 
-	btrfs_set_token_inode_sequence(leaf, item, inode->i_version, &token);
+	btrfs_set_token_inode_sequence(leaf, item,
+				       inode_peek_iversion(inode), &token);
 	btrfs_set_token_inode_transid(leaf, item, trans->transid, &token);
 	btrfs_set_token_inode_rdev(leaf, item, inode->i_rdev, &token);
 	btrfs_set_token_inode_flags(leaf, item, BTRFS_I(inode)->flags, &token);
@@ -4572,12 +4551,6 @@ static int btrfs_check_ref_name_override(struct extent_buffer *eb,
 			this_len = sizeof(*extref) + this_name_len;
 		}
 
-		ret = btrfs_is_name_len_valid(eb, slot, name_ptr,
-					      this_name_len);
-		if (!ret) {
-			ret = -EIO;
-			goto out;
-		}
 		if (this_name_len > name_len) {
 			char *new_name;
 
@@ -5432,11 +5405,10 @@ static int btrfs_log_inode_parent(struct btrfs_trans_handle *trans,
 				  struct dentry *parent,
 				  const loff_t start,
 				  const loff_t end,
-				  int exists_only,
+				  int inode_only,
 				  struct btrfs_log_ctx *ctx)
 {
 	struct btrfs_fs_info *fs_info = root->fs_info;
-	int inode_only = exists_only ? LOG_INODE_EXISTS : LOG_INODE_ALL;
 	struct super_block *sb;
 	struct dentry *old_parent = NULL;
 	int ret = 0;
@@ -5602,7 +5574,7 @@ int btrfs_log_dentry_safe(struct btrfs_trans_handle *trans,
 	int ret;
 
 	ret = btrfs_log_inode_parent(trans, root, BTRFS_I(d_inode(dentry)),
-			parent, start, end, 0, ctx);
+			parent, start, end, LOG_INODE_ALL, ctx);
 	dput(parent);
 
 	return ret;
@@ -5865,6 +5837,6 @@ int btrfs_log_new_name(struct btrfs_trans_handle *trans,
 		return 0;
 
 	return btrfs_log_inode_parent(trans, root, inode, parent, 0,
-				      LLONG_MAX, 1, NULL);
+				      LLONG_MAX, LOG_INODE_EXISTS, NULL);
 }
 
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 49810b7..b5036bd 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -145,6 +145,71 @@ static int __btrfs_map_block(struct btrfs_fs_info *fs_info,
 			     struct btrfs_bio **bbio_ret,
 			     int mirror_num, int need_raid_map);
 
+/*
+ * Device locking
+ * ==============
+ *
+ * There are several mutexes that protect manipulation of devices and low-level
+ * structures like chunks but not block groups, extents or files
+ *
+ * uuid_mutex (global lock)
+ * ------------------------
+ * protects the fs_uuids list that tracks all per-fs fs_devices, resulting from
+ * the SCAN_DEV ioctl registration or from mount either implicitly (the first
+ * device) or requested by the device= mount option
+ *
+ * the mutex can be very coarse and can cover long-running operations
+ *
+ * protects: updates to fs_devices counters like missing devices, rw devices,
+ * seeding, structure cloning, openning/closing devices at mount/umount time
+ *
+ * global::fs_devs - add, remove, updates to the global list
+ *
+ * does not protect: manipulation of the fs_devices::devices list!
+ *
+ * btrfs_device::name - renames (write side), read is RCU
+ *
+ * fs_devices::device_list_mutex (per-fs, with RCU)
+ * ------------------------------------------------
+ * protects updates to fs_devices::devices, ie. adding and deleting
+ *
+ * simple list traversal with read-only actions can be done with RCU protection
+ *
+ * may be used to exclude some operations from running concurrently without any
+ * modifications to the list (see write_all_supers)
+ *
+ * volume_mutex
+ * ------------
+ * coarse lock owned by a mounted filesystem; used to exclude some operations
+ * that cannot run in parallel and affect the higher-level properties of the
+ * filesystem like: device add/deleting/resize/replace, or balance
+ *
+ * balance_mutex
+ * -------------
+ * protects balance structures (status, state) and context accessed from
+ * several places (internally, ioctl)
+ *
+ * chunk_mutex
+ * -----------
+ * protects chunks, adding or removing during allocation, trim or when a new
+ * device is added/removed
+ *
+ * cleaner_mutex
+ * -------------
+ * a big lock that is held by the cleaner thread and prevents running subvolume
+ * cleaning together with relocation or delayed iputs
+ *
+ *
+ * Lock nesting
+ * ============
+ *
+ * uuid_mutex
+ *   volume_mutex
+ *     device_list_mutex
+ *       chunk_mutex
+ *     balance_mutex
+ */
+
 DEFINE_MUTEX(uuid_mutex);
 static LIST_HEAD(fs_uuids);
 struct list_head *btrfs_get_fs_uuids(void)
@@ -180,6 +245,13 @@ static struct btrfs_fs_devices *alloc_fs_devices(const u8 *fsid)
 	return fs_devs;
 }
 
+static void free_device(struct btrfs_device *device)
+{
+	rcu_string_free(device->name);
+	bio_put(device->flush_bio);
+	kfree(device);
+}
+
 static void free_fs_devices(struct btrfs_fs_devices *fs_devices)
 {
 	struct btrfs_device *device;
@@ -188,9 +260,7 @@ static void free_fs_devices(struct btrfs_fs_devices *fs_devices)
 		device = list_entry(fs_devices->devices.next,
 				    struct btrfs_device, dev_list);
 		list_del(&device->dev_list);
-		rcu_string_free(device->name);
-		bio_put(device->flush_bio);
-		kfree(device);
+		free_device(device);
 	}
 	kfree(fs_devices);
 }
@@ -220,6 +290,11 @@ void btrfs_cleanup_fs_uuids(void)
 	}
 }
 
+/*
+ * Returns a pointer to a new btrfs_device on success; ERR_PTR() on error.
+ * Returned struct is not linked onto any lists and must be destroyed using
+ * free_device.
+ */
 static struct btrfs_device *__alloc_device(void)
 {
 	struct btrfs_device *dev;
@@ -237,7 +312,6 @@ static struct btrfs_device *__alloc_device(void)
 		kfree(dev);
 		return ERR_PTR(-ENOMEM);
 	}
-	bio_get(dev->flush_bio);
 
 	INIT_LIST_HEAD(&dev->dev_list);
 	INIT_LIST_HEAD(&dev->dev_alloc_list);
@@ -245,7 +319,6 @@ static struct btrfs_device *__alloc_device(void)
 
 	spin_lock_init(&dev->io_lock);
 
-	spin_lock_init(&dev->reada_lock);
 	atomic_set(&dev->reada_in_flight, 0);
 	atomic_set(&dev->dev_stats_ccnt, 0);
 	btrfs_device_data_ordered_init(dev);
@@ -531,45 +604,42 @@ static void pending_bios_fn(struct btrfs_work *work)
 	run_scheduled_bios(device);
 }
 
-
-static void btrfs_free_stale_device(struct btrfs_device *cur_dev)
+/*
+ *  Search and remove all stale (devices which are not mounted) devices.
+ *  When both inputs are NULL, it will search and release all stale devices.
+ *  path:	Optional. When provided will it release all unmounted devices
+ *		matching this path only.
+ *  skip_dev:	Optional. Will skip this device when searching for the stale
+ *		devices.
+ */
+static void btrfs_free_stale_devices(const char *path,
+				     struct btrfs_device *skip_dev)
 {
-	struct btrfs_fs_devices *fs_devs;
-	struct btrfs_device *dev;
+	struct btrfs_fs_devices *fs_devs, *tmp_fs_devs;
+	struct btrfs_device *dev, *tmp_dev;
 
-	if (!cur_dev->name)
-		return;
-
-	list_for_each_entry(fs_devs, &fs_uuids, list) {
-		int del = 1;
+	list_for_each_entry_safe(fs_devs, tmp_fs_devs, &fs_uuids, list) {
 
 		if (fs_devs->opened)
 			continue;
-		if (fs_devs->seeding)
-			continue;
 
-		list_for_each_entry(dev, &fs_devs->devices, dev_list) {
+		list_for_each_entry_safe(dev, tmp_dev,
+					 &fs_devs->devices, dev_list) {
+			int not_found = 0;
 
-			if (dev == cur_dev)
+			if (skip_dev && skip_dev == dev)
 				continue;
-			if (!dev->name)
+			if (path && !dev->name)
 				continue;
 
-			/*
-			 * Todo: This won't be enough. What if the same device
-			 * comes back (with new uuid and) with its mapper path?
-			 * But for now, this does help as mostly an admin will
-			 * either use mapper or non mapper path throughout.
-			 */
 			rcu_read_lock();
-			del = strcmp(rcu_str_deref(dev->name),
-						rcu_str_deref(cur_dev->name));
+			if (path)
+				not_found = strcmp(rcu_str_deref(dev->name),
+						   path);
 			rcu_read_unlock();
-			if (!del)
-				break;
-		}
+			if (not_found)
+				continue;
 
-		if (!del) {
 			/* delete the stale device */
 			if (fs_devs->num_devices == 1) {
 				btrfs_sysfs_remove_fsid(fs_devs);
@@ -578,38 +648,99 @@ static void btrfs_free_stale_device(struct btrfs_device *cur_dev)
 			} else {
 				fs_devs->num_devices--;
 				list_del(&dev->dev_list);
-				rcu_string_free(dev->name);
-				bio_put(dev->flush_bio);
-				kfree(dev);
+				free_device(dev);
 			}
-			break;
 		}
 	}
 }
 
+static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices,
+			struct btrfs_device *device, fmode_t flags,
+			void *holder)
+{
+	struct request_queue *q;
+	struct block_device *bdev;
+	struct buffer_head *bh;
+	struct btrfs_super_block *disk_super;
+	u64 devid;
+	int ret;
+
+	if (device->bdev)
+		return -EINVAL;
+	if (!device->name)
+		return -EINVAL;
+
+	ret = btrfs_get_bdev_and_sb(device->name->str, flags, holder, 1,
+				    &bdev, &bh);
+	if (ret)
+		return ret;
+
+	disk_super = (struct btrfs_super_block *)bh->b_data;
+	devid = btrfs_stack_device_id(&disk_super->dev_item);
+	if (devid != device->devid)
+		goto error_brelse;
+
+	if (memcmp(device->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE))
+		goto error_brelse;
+
+	device->generation = btrfs_super_generation(disk_super);
+
+	if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING) {
+		clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
+		fs_devices->seeding = 1;
+	} else {
+		if (bdev_read_only(bdev))
+			clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
+		else
+			set_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
+	}
+
+	q = bdev_get_queue(bdev);
+	if (!blk_queue_nonrot(q))
+		fs_devices->rotating = 1;
+
+	device->bdev = bdev;
+	clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
+	device->mode = flags;
+
+	fs_devices->open_devices++;
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
+	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
+		fs_devices->rw_devices++;
+		list_add(&device->dev_alloc_list, &fs_devices->alloc_list);
+	}
+	brelse(bh);
+
+	return 0;
+
+error_brelse:
+	brelse(bh);
+	blkdev_put(bdev, flags);
+
+	return -EINVAL;
+}
+
 /*
  * Add new device to list of registered devices
  *
  * Returns:
- * 1   - first time device is seen
- * 0   - device already known
- * < 0 - error
+ * device pointer which was just added or updated when successful
+ * error pointer when failed
  */
-static noinline int device_list_add(const char *path,
-			   struct btrfs_super_block *disk_super,
-			   u64 devid, struct btrfs_fs_devices **fs_devices_ret)
+static noinline struct btrfs_device *device_list_add(const char *path,
+			   struct btrfs_super_block *disk_super)
 {
 	struct btrfs_device *device;
 	struct btrfs_fs_devices *fs_devices;
 	struct rcu_string *name;
-	int ret = 0;
 	u64 found_transid = btrfs_super_generation(disk_super);
+	u64 devid = btrfs_stack_device_id(&disk_super->dev_item);
 
 	fs_devices = find_fsid(disk_super->fsid);
 	if (!fs_devices) {
 		fs_devices = alloc_fs_devices(disk_super->fsid);
 		if (IS_ERR(fs_devices))
-			return PTR_ERR(fs_devices);
+			return ERR_CAST(fs_devices);
 
 		list_add(&fs_devices->list, &fs_uuids);
 
@@ -621,20 +752,19 @@ static noinline int device_list_add(const char *path,
 
 	if (!device) {
 		if (fs_devices->opened)
-			return -EBUSY;
+			return ERR_PTR(-EBUSY);
 
 		device = btrfs_alloc_device(NULL, &devid,
 					    disk_super->dev_item.uuid);
 		if (IS_ERR(device)) {
 			/* we can safely leave the fs_devices entry around */
-			return PTR_ERR(device);
+			return device;
 		}
 
 		name = rcu_string_strdup(path, GFP_NOFS);
 		if (!name) {
-			bio_put(device->flush_bio);
-			kfree(device);
-			return -ENOMEM;
+			free_device(device);
+			return ERR_PTR(-ENOMEM);
 		}
 		rcu_assign_pointer(device->name, name);
 
@@ -643,8 +773,16 @@ static noinline int device_list_add(const char *path,
 		fs_devices->num_devices++;
 		mutex_unlock(&fs_devices->device_list_mutex);
 
-		ret = 1;
 		device->fs_devices = fs_devices;
+		btrfs_free_stale_devices(path, device);
+
+		if (disk_super->label[0])
+			pr_info("BTRFS: device label %s devid %llu transid %llu %s\n",
+				disk_super->label, devid, found_transid, path);
+		else
+			pr_info("BTRFS: device fsid %pU devid %llu transid %llu %s\n",
+				disk_super->fsid, devid, found_transid, path);
+
 	} else if (!device->name || strcmp(device->name->str, path)) {
 		/*
 		 * When FS is already mounted.
@@ -680,17 +818,17 @@ static noinline int device_list_add(const char *path,
 			 * with larger generation number or the last-in if
 			 * generation are equal.
 			 */
-			return -EEXIST;
+			return ERR_PTR(-EEXIST);
 		}
 
 		name = rcu_string_strdup(path, GFP_NOFS);
 		if (!name)
-			return -ENOMEM;
+			return ERR_PTR(-ENOMEM);
 		rcu_string_free(device->name);
 		rcu_assign_pointer(device->name, name);
-		if (device->missing) {
+		if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) {
 			fs_devices->missing_devices--;
-			device->missing = 0;
+			clear_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state);
 		}
 	}
 
@@ -703,16 +841,9 @@ static noinline int device_list_add(const char *path,
 	if (!fs_devices->opened)
 		device->generation = found_transid;
 
-	/*
-	 * if there is new btrfs on an already registered device,
-	 * then remove the stale device entry.
-	 */
-	if (ret > 0)
-		btrfs_free_stale_device(device);
+	fs_devices->total_devices = btrfs_super_num_devices(disk_super);
 
-	*fs_devices_ret = fs_devices;
-
-	return ret;
+	return device;
 }
 
 static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig)
@@ -745,8 +876,7 @@ static struct btrfs_fs_devices *clone_fs_devices(struct btrfs_fs_devices *orig)
 			name = rcu_string_strdup(orig_dev->name->str,
 					GFP_KERNEL);
 			if (!name) {
-				bio_put(device->flush_bio);
-				kfree(device);
+				free_device(device);
 				goto error;
 			}
 			rcu_assign_pointer(device->name, name);
@@ -773,10 +903,12 @@ void btrfs_close_extra_devices(struct btrfs_fs_devices *fs_devices, int step)
 again:
 	/* This is the initialized path, it is safe to release the devices. */
 	list_for_each_entry_safe(device, next, &fs_devices->devices, dev_list) {
-		if (device->in_fs_metadata) {
-			if (!device->is_tgtdev_for_dev_replace &&
-			    (!latest_dev ||
-			     device->generation > latest_dev->generation)) {
+		if (test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+							&device->dev_state)) {
+			if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
+			     &device->dev_state) &&
+			     (!latest_dev ||
+			      device->generation > latest_dev->generation)) {
 				latest_dev = device;
 			}
 			continue;
@@ -793,7 +925,8 @@ void btrfs_close_extra_devices(struct btrfs_fs_devices *fs_devices, int step)
 			 * not, which means whether this device is
 			 * used or whether it should be removed.
 			 */
-			if (step == 0 || device->is_tgtdev_for_dev_replace) {
+			if (step == 0 || test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
+						  &device->dev_state)) {
 				continue;
 			}
 		}
@@ -802,17 +935,16 @@ void btrfs_close_extra_devices(struct btrfs_fs_devices *fs_devices, int step)
 			device->bdev = NULL;
 			fs_devices->open_devices--;
 		}
-		if (device->writeable) {
+		if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 			list_del_init(&device->dev_alloc_list);
-			device->writeable = 0;
-			if (!device->is_tgtdev_for_dev_replace)
+			clear_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
+			if (!test_bit(BTRFS_DEV_STATE_REPLACE_TGT,
+				      &device->dev_state))
 				fs_devices->rw_devices--;
 		}
 		list_del_init(&device->dev_list);
 		fs_devices->num_devices--;
-		rcu_string_free(device->name);
-		bio_put(device->flush_bio);
-		kfree(device);
+		free_device(device);
 	}
 
 	if (fs_devices->seed) {
@@ -825,35 +957,25 @@ void btrfs_close_extra_devices(struct btrfs_fs_devices *fs_devices, int step)
 	mutex_unlock(&uuid_mutex);
 }
 
-static void __free_device(struct work_struct *work)
-{
-	struct btrfs_device *device;
-
-	device = container_of(work, struct btrfs_device, rcu_work);
-	rcu_string_free(device->name);
-	bio_put(device->flush_bio);
-	kfree(device);
-}
-
-static void free_device(struct rcu_head *head)
+static void free_device_rcu(struct rcu_head *head)
 {
 	struct btrfs_device *device;
 
 	device = container_of(head, struct btrfs_device, rcu);
-
-	INIT_WORK(&device->rcu_work, __free_device);
-	schedule_work(&device->rcu_work);
+	free_device(device);
 }
 
 static void btrfs_close_bdev(struct btrfs_device *device)
 {
-	if (device->bdev && device->writeable) {
+	if (!device->bdev)
+		return;
+
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 		sync_blockdev(device->bdev);
 		invalidate_bdev(device->bdev);
 	}
 
-	if (device->bdev)
-		blkdev_put(device->bdev, device->mode);
+	blkdev_put(device->bdev, device->mode);
 }
 
 static void btrfs_prepare_close_one_device(struct btrfs_device *device)
@@ -865,13 +987,13 @@ static void btrfs_prepare_close_one_device(struct btrfs_device *device)
 	if (device->bdev)
 		fs_devices->open_devices--;
 
-	if (device->writeable &&
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
 	    device->devid != BTRFS_DEV_REPLACE_DEVID) {
 		list_del_init(&device->dev_alloc_list);
 		fs_devices->rw_devices--;
 	}
 
-	if (device->missing)
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
 		fs_devices->missing_devices--;
 
 	new_device = btrfs_alloc_device(NULL, &device->devid,
@@ -917,7 +1039,7 @@ static int __btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 				struct btrfs_device, dev_list);
 		list_del(&device->dev_list);
 		btrfs_close_bdev(device);
-		call_rcu(&device->rcu, free_device);
+		call_rcu(&device->rcu, free_device_rcu);
 	}
 
 	WARN_ON(fs_devices->open_devices);
@@ -947,93 +1069,32 @@ int btrfs_close_devices(struct btrfs_fs_devices *fs_devices)
 		__btrfs_close_devices(fs_devices);
 		free_fs_devices(fs_devices);
 	}
-	/*
-	 * Wait for rcu kworkers under __btrfs_close_devices
-	 * to finish all blkdev_puts so device is really
-	 * free when umount is done.
-	 */
-	rcu_barrier();
 	return ret;
 }
 
 static int __btrfs_open_devices(struct btrfs_fs_devices *fs_devices,
 				fmode_t flags, void *holder)
 {
-	struct request_queue *q;
-	struct block_device *bdev;
 	struct list_head *head = &fs_devices->devices;
 	struct btrfs_device *device;
 	struct btrfs_device *latest_dev = NULL;
-	struct buffer_head *bh;
-	struct btrfs_super_block *disk_super;
-	u64 devid;
-	int seeding = 1;
 	int ret = 0;
 
 	flags |= FMODE_EXCL;
 
 	list_for_each_entry(device, head, dev_list) {
-		if (device->bdev)
-			continue;
-		if (!device->name)
-			continue;
-
 		/* Just open everything we can; ignore failures here */
-		if (btrfs_get_bdev_and_sb(device->name->str, flags, holder, 1,
-					    &bdev, &bh))
+		if (btrfs_open_one_device(fs_devices, device, flags, holder))
 			continue;
 
-		disk_super = (struct btrfs_super_block *)bh->b_data;
-		devid = btrfs_stack_device_id(&disk_super->dev_item);
-		if (devid != device->devid)
-			goto error_brelse;
-
-		if (memcmp(device->uuid, disk_super->dev_item.uuid,
-			   BTRFS_UUID_SIZE))
-			goto error_brelse;
-
-		device->generation = btrfs_super_generation(disk_super);
 		if (!latest_dev ||
 		    device->generation > latest_dev->generation)
 			latest_dev = device;
-
-		if (btrfs_super_flags(disk_super) & BTRFS_SUPER_FLAG_SEEDING) {
-			device->writeable = 0;
-		} else {
-			device->writeable = !bdev_read_only(bdev);
-			seeding = 0;
-		}
-
-		q = bdev_get_queue(bdev);
-		if (blk_queue_discard(q))
-			device->can_discard = 1;
-		if (!blk_queue_nonrot(q))
-			fs_devices->rotating = 1;
-
-		device->bdev = bdev;
-		device->in_fs_metadata = 0;
-		device->mode = flags;
-
-		fs_devices->open_devices++;
-		if (device->writeable &&
-		    device->devid != BTRFS_DEV_REPLACE_DEVID) {
-			fs_devices->rw_devices++;
-			list_add(&device->dev_alloc_list,
-				 &fs_devices->alloc_list);
-		}
-		brelse(bh);
-		continue;
-
-error_brelse:
-		brelse(bh);
-		blkdev_put(bdev, flags);
-		continue;
 	}
 	if (fs_devices->open_devices == 0) {
 		ret = -EINVAL;
 		goto out;
 	}
-	fs_devices->seeding = seeding;
 	fs_devices->opened = 1;
 	fs_devices->latest_bdev = latest_dev->bdev;
 	fs_devices->total_rw_bytes = 0;
@@ -1117,12 +1178,10 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
 			  struct btrfs_fs_devices **fs_devices_ret)
 {
 	struct btrfs_super_block *disk_super;
+	struct btrfs_device *device;
 	struct block_device *bdev;
 	struct page *page;
-	int ret = -EINVAL;
-	u64 devid;
-	u64 transid;
-	u64 total_devices;
+	int ret = 0;
 	u64 bytenr;
 
 	/*
@@ -1141,26 +1200,16 @@ int btrfs_scan_one_device(const char *path, fmode_t flags, void *holder,
 		goto error;
 	}
 
-	if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super))
+	if (btrfs_read_disk_super(bdev, bytenr, &page, &disk_super)) {
+		ret = -EINVAL;
 		goto error_bdev_put;
-
-	devid = btrfs_stack_device_id(&disk_super->dev_item);
-	transid = btrfs_super_generation(disk_super);
-	total_devices = btrfs_super_num_devices(disk_super);
-
-	ret = device_list_add(path, disk_super, devid, fs_devices_ret);
-	if (ret > 0) {
-		if (disk_super->label[0]) {
-			pr_info("BTRFS: device label %s ", disk_super->label);
-		} else {
-			pr_info("BTRFS: device fsid %pU ", disk_super->fsid);
-		}
-
-		pr_cont("devid %llu transid %llu %s\n", devid, transid, path);
-		ret = 0;
 	}
-	if (!ret && fs_devices_ret)
-		(*fs_devices_ret)->total_devices = total_devices;
+
+	device = device_list_add(path, disk_super);
+	if (IS_ERR(device))
+		ret = PTR_ERR(device);
+	else
+		*fs_devices_ret = device->fs_devices;
 
 	btrfs_release_disk_super(page);
 
@@ -1186,7 +1235,8 @@ int btrfs_account_dev_extents_size(struct btrfs_device *device, u64 start,
 
 	*length = 0;
 
-	if (start >= device->total_bytes || device->is_tgtdev_for_dev_replace)
+	if (start >= device->total_bytes ||
+		test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
 		return 0;
 
 	path = btrfs_alloc_path();
@@ -1364,7 +1414,8 @@ int find_free_dev_extent_start(struct btrfs_transaction *transaction,
 	max_hole_size = 0;
 
 again:
-	if (search_start >= search_end || device->is_tgtdev_for_dev_replace) {
+	if (search_start >= search_end ||
+		test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
 		ret = -ENOSPC;
 		goto out;
 	}
@@ -1571,8 +1622,8 @@ static int btrfs_alloc_dev_extent(struct btrfs_trans_handle *trans,
 	struct extent_buffer *leaf;
 	struct btrfs_key key;
 
-	WARN_ON(!device->in_fs_metadata);
-	WARN_ON(device->is_tgtdev_for_dev_replace);
+	WARN_ON(!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state));
+	WARN_ON(test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state));
 	path = btrfs_alloc_path();
 	if (!path)
 		return -ENOMEM;
@@ -1662,7 +1713,7 @@ static noinline int find_next_devid(struct btrfs_fs_info *fs_info,
  * the device information is stored in the chunk root
  * the btrfs_device struct should be fully filled in
  */
-static int btrfs_add_device(struct btrfs_trans_handle *trans,
+static int btrfs_add_dev_item(struct btrfs_trans_handle *trans,
 			    struct btrfs_fs_info *fs_info,
 			    struct btrfs_device *device)
 {
@@ -1818,7 +1869,8 @@ static struct btrfs_device * btrfs_find_next_active_device(
 
 	list_for_each_entry(next_device, &fs_devs->devices, dev_list) {
 		if (next_device != device &&
-			!next_device->missing && next_device->bdev)
+		    !test_bit(BTRFS_DEV_STATE_MISSING, &next_device->dev_state)
+		    && next_device->bdev)
 			return next_device;
 	}
 
@@ -1859,6 +1911,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 	u64 num_devices;
 	int ret = 0;
 
+	mutex_lock(&fs_info->volume_mutex);
 	mutex_lock(&uuid_mutex);
 
 	num_devices = fs_info->fs_devices->num_devices;
@@ -1878,17 +1931,18 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 	if (ret)
 		goto out;
 
-	if (device->is_tgtdev_for_dev_replace) {
+	if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
 		ret = BTRFS_ERROR_DEV_TGT_REPLACE;
 		goto out;
 	}
 
-	if (device->writeable && fs_info->fs_devices->rw_devices == 1) {
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
+	    fs_info->fs_devices->rw_devices == 1) {
 		ret = BTRFS_ERROR_DEV_ONLY_WRITABLE;
 		goto out;
 	}
 
-	if (device->writeable) {
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 		mutex_lock(&fs_info->chunk_mutex);
 		list_del_init(&device->dev_alloc_list);
 		device->fs_devices->rw_devices--;
@@ -1910,7 +1964,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 	if (ret)
 		goto error_undo;
 
-	device->in_fs_metadata = 0;
+	clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
 	btrfs_scrub_cancel_dev(fs_info, device);
 
 	/*
@@ -1930,7 +1984,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 	device->fs_devices->num_devices--;
 	device->fs_devices->total_devices--;
 
-	if (device->missing)
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state))
 		device->fs_devices->missing_devices--;
 
 	btrfs_assign_next_active_device(fs_info, device, NULL);
@@ -1950,11 +2004,11 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 	 * the devices list.  All that's left is to zero out the old
 	 * supers and free the device.
 	 */
-	if (device->writeable)
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
 		btrfs_scratch_superblocks(device->bdev, device->name->str);
 
 	btrfs_close_bdev(device);
-	call_rcu(&device->rcu, free_device);
+	call_rcu(&device->rcu, free_device_rcu);
 
 	if (cur_devices->open_devices == 0) {
 		struct btrfs_fs_devices *fs_devices;
@@ -1973,10 +2027,11 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path,
 
 out:
 	mutex_unlock(&uuid_mutex);
+	mutex_unlock(&fs_info->volume_mutex);
 	return ret;
 
 error_undo:
-	if (device->writeable) {
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 		mutex_lock(&fs_info->chunk_mutex);
 		list_add(&device->dev_alloc_list,
 			 &fs_info->fs_devices->alloc_list);
@@ -2004,10 +2059,10 @@ void btrfs_rm_dev_replace_remove_srcdev(struct btrfs_fs_info *fs_info,
 	list_del_rcu(&srcdev->dev_list);
 	list_del(&srcdev->dev_alloc_list);
 	fs_devices->num_devices--;
-	if (srcdev->missing)
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &srcdev->dev_state))
 		fs_devices->missing_devices--;
 
-	if (srcdev->writeable)
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &srcdev->dev_state))
 		fs_devices->rw_devices--;
 
 	if (srcdev->bdev)
@@ -2019,13 +2074,13 @@ void btrfs_rm_dev_replace_free_srcdev(struct btrfs_fs_info *fs_info,
 {
 	struct btrfs_fs_devices *fs_devices = srcdev->fs_devices;
 
-	if (srcdev->writeable) {
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &srcdev->dev_state)) {
 		/* zero out the old super if it is writable */
 		btrfs_scratch_superblocks(srcdev->bdev, srcdev->name->str);
 	}
 
 	btrfs_close_bdev(srcdev);
-	call_rcu(&srcdev->rcu, free_device);
+	call_rcu(&srcdev->rcu, free_device_rcu);
 
 	/* if this is no devs we rather delete the fs_devices */
 	if (!fs_devices->num_devices) {
@@ -2084,7 +2139,7 @@ void btrfs_destroy_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	btrfs_scratch_superblocks(tgtdev->bdev, tgtdev->name->str);
 
 	btrfs_close_bdev(tgtdev);
-	call_rcu(&tgtdev->rcu, free_device);
+	call_rcu(&tgtdev->rcu, free_device_rcu);
 }
 
 static int btrfs_find_device_by_path(struct btrfs_fs_info *fs_info,
@@ -2129,7 +2184,8 @@ int btrfs_find_device_missing_or_by_path(struct btrfs_fs_info *fs_info,
 		 * is held by the caller.
 		 */
 		list_for_each_entry(tmp, devices, dev_list) {
-			if (tmp->in_fs_metadata && !tmp->bdev) {
+			if (test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+					&tmp->dev_state) && !tmp->bdev) {
 				*device = tmp;
 				break;
 			}
@@ -2358,26 +2414,19 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 
 	name = rcu_string_strdup(device_path, GFP_KERNEL);
 	if (!name) {
-		bio_put(device->flush_bio);
-		kfree(device);
 		ret = -ENOMEM;
-		goto error;
+		goto error_free_device;
 	}
 	rcu_assign_pointer(device->name, name);
 
 	trans = btrfs_start_transaction(root, 0);
 	if (IS_ERR(trans)) {
-		rcu_string_free(device->name);
-		bio_put(device->flush_bio);
-		kfree(device);
 		ret = PTR_ERR(trans);
-		goto error;
+		goto error_free_device;
 	}
 
 	q = bdev_get_queue(bdev);
-	if (blk_queue_discard(q))
-		device->can_discard = 1;
-	device->writeable = 1;
+	set_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
 	device->generation = trans->transid;
 	device->io_width = fs_info->sectorsize;
 	device->io_align = fs_info->sectorsize;
@@ -2388,8 +2437,8 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 	device->commit_total_bytes = device->total_bytes;
 	device->fs_info = fs_info;
 	device->bdev = bdev;
-	device->in_fs_metadata = 1;
-	device->is_tgtdev_for_dev_replace = 0;
+	set_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
+	clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
 	device->mode = FMODE_EXCL;
 	device->dev_stats_valid = 1;
 	set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
@@ -2450,7 +2499,7 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 		}
 	}
 
-	ret = btrfs_add_device(trans, fs_info, device);
+	ret = btrfs_add_dev_item(trans, fs_info, device);
 	if (ret) {
 		btrfs_abort_transaction(trans, ret);
 		goto error_sysfs;
@@ -2511,9 +2560,8 @@ int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *device_path
 		sb->s_flags |= SB_RDONLY;
 	if (trans)
 		btrfs_end_transaction(trans);
-	rcu_string_free(device->name);
-	bio_put(device->flush_bio);
-	kfree(device);
+error_free_device:
+	free_device(device);
 error:
 	blkdev_put(bdev, FMODE_EXCL);
 	if (seeding_dev && !unlocked) {
@@ -2528,7 +2576,6 @@ int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 				  struct btrfs_device *srcdev,
 				  struct btrfs_device **device_out)
 {
-	struct request_queue *q;
 	struct btrfs_device *device;
 	struct block_device *bdev;
 	struct list_head *devices;
@@ -2579,18 +2626,14 @@ int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 
 	name = rcu_string_strdup(device_path, GFP_KERNEL);
 	if (!name) {
-		bio_put(device->flush_bio);
-		kfree(device);
+		free_device(device);
 		ret = -ENOMEM;
 		goto error;
 	}
 	rcu_assign_pointer(device->name, name);
 
-	q = bdev_get_queue(bdev);
-	if (blk_queue_discard(q))
-		device->can_discard = 1;
 	mutex_lock(&fs_info->fs_devices->device_list_mutex);
-	device->writeable = 1;
+	set_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state);
 	device->generation = 0;
 	device->io_width = fs_info->sectorsize;
 	device->io_align = fs_info->sectorsize;
@@ -2603,8 +2646,8 @@ int btrfs_init_dev_replace_tgtdev(struct btrfs_fs_info *fs_info,
 	device->commit_bytes_used = device->bytes_used;
 	device->fs_info = fs_info;
 	device->bdev = bdev;
-	device->in_fs_metadata = 1;
-	device->is_tgtdev_for_dev_replace = 1;
+	set_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
+	set_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
 	device->mode = FMODE_EXCL;
 	device->dev_stats_valid = 1;
 	set_blocksize(device->bdev, BTRFS_BDEV_BLOCKSIZE);
@@ -2632,7 +2675,7 @@ void btrfs_init_dev_replace_tgtdev_for_resume(struct btrfs_fs_info *fs_info,
 	tgtdev->io_align = sectorsize;
 	tgtdev->sector_size = sectorsize;
 	tgtdev->fs_info = fs_info;
-	tgtdev->in_fs_metadata = 1;
+	set_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &tgtdev->dev_state);
 }
 
 static noinline int btrfs_update_device(struct btrfs_trans_handle *trans,
@@ -2690,7 +2733,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans,
 	u64 old_total;
 	u64 diff;
 
-	if (!device->writeable)
+	if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
 		return -EACCES;
 
 	new_size = round_down(new_size, fs_info->sectorsize);
@@ -2700,7 +2743,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans,
 	diff = round_down(new_size - device->total_bytes, fs_info->sectorsize);
 
 	if (new_size <= device->total_bytes ||
-	    device->is_tgtdev_for_dev_replace) {
+	    test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
 		mutex_unlock(&fs_info->chunk_mutex);
 		return -EINVAL;
 	}
@@ -3044,6 +3087,48 @@ static int btrfs_relocate_sys_chunks(struct btrfs_fs_info *fs_info)
 	return ret;
 }
 
+/*
+ * return 1 : allocate a data chunk successfully,
+ * return <0: errors during allocating a data chunk,
+ * return 0 : no need to allocate a data chunk.
+ */
+static int btrfs_may_alloc_data_chunk(struct btrfs_fs_info *fs_info,
+				      u64 chunk_offset)
+{
+	struct btrfs_block_group_cache *cache;
+	u64 bytes_used;
+	u64 chunk_type;
+
+	cache = btrfs_lookup_block_group(fs_info, chunk_offset);
+	ASSERT(cache);
+	chunk_type = cache->flags;
+	btrfs_put_block_group(cache);
+
+	if (chunk_type & BTRFS_BLOCK_GROUP_DATA) {
+		spin_lock(&fs_info->data_sinfo->lock);
+		bytes_used = fs_info->data_sinfo->bytes_used;
+		spin_unlock(&fs_info->data_sinfo->lock);
+
+		if (!bytes_used) {
+			struct btrfs_trans_handle *trans;
+			int ret;
+
+			trans =	btrfs_join_transaction(fs_info->tree_root);
+			if (IS_ERR(trans))
+				return PTR_ERR(trans);
+
+			ret = btrfs_force_chunk_alloc(trans, fs_info,
+						      BTRFS_BLOCK_GROUP_DATA);
+			btrfs_end_transaction(trans);
+			if (ret < 0)
+				return ret;
+
+			return 1;
+		}
+	}
+	return 0;
+}
+
 static int insert_balance_item(struct btrfs_fs_info *fs_info,
 			       struct btrfs_balance_control *bctl)
 {
@@ -3502,7 +3587,6 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
 	u32 count_meta = 0;
 	u32 count_sys = 0;
 	int chunk_reserved = 0;
-	u64 bytes_used = 0;
 
 	/* step one make some room on all the devices */
 	devices = &fs_info->fs_devices->devices;
@@ -3510,10 +3594,10 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
 		old_size = btrfs_device_get_total_bytes(device);
 		size_to_free = div_factor(old_size, 1);
 		size_to_free = min_t(u64, size_to_free, SZ_1M);
-		if (!device->writeable ||
+		if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) ||
 		    btrfs_device_get_total_bytes(device) -
 		    btrfs_device_get_bytes_used(device) > size_to_free ||
-		    device->is_tgtdev_for_dev_replace)
+		    test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
 			continue;
 
 		ret = btrfs_shrink_device(device, old_size - size_to_free);
@@ -3661,28 +3745,21 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info)
 			goto loop;
 		}
 
-		ASSERT(fs_info->data_sinfo);
-		spin_lock(&fs_info->data_sinfo->lock);
-		bytes_used = fs_info->data_sinfo->bytes_used;
-		spin_unlock(&fs_info->data_sinfo->lock);
-
-		if ((chunk_type & BTRFS_BLOCK_GROUP_DATA) &&
-		    !chunk_reserved && !bytes_used) {
-			trans = btrfs_start_transaction(chunk_root, 0);
-			if (IS_ERR(trans)) {
-				mutex_unlock(&fs_info->delete_unused_bgs_mutex);
-				ret = PTR_ERR(trans);
-				goto error;
-			}
-
-			ret = btrfs_force_chunk_alloc(trans, fs_info,
-						      BTRFS_BLOCK_GROUP_DATA);
-			btrfs_end_transaction(trans);
+		if (!chunk_reserved) {
+			/*
+			 * We may be relocating the only data chunk we have,
+			 * which could potentially end up with losing data's
+			 * raid profile, so lets allocate an empty one in
+			 * advance.
+			 */
+			ret = btrfs_may_alloc_data_chunk(fs_info,
+							 found_key.offset);
 			if (ret < 0) {
 				mutex_unlock(&fs_info->delete_unused_bgs_mutex);
 				goto error;
+			} else if (ret == 1) {
+				chunk_reserved = 1;
 			}
-			chunk_reserved = 1;
 		}
 
 		ret = btrfs_relocate_chunk(fs_info, found_key.offset);
@@ -4381,7 +4458,7 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
 	new_size = round_down(new_size, fs_info->sectorsize);
 	diff = round_down(old_size - new_size, fs_info->sectorsize);
 
-	if (device->is_tgtdev_for_dev_replace)
+	if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
 		return -EINVAL;
 
 	path = btrfs_alloc_path();
@@ -4393,7 +4470,7 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
 	mutex_lock(&fs_info->chunk_mutex);
 
 	btrfs_device_set_total_bytes(device, new_size);
-	if (device->writeable) {
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 		device->fs_devices->total_rw_bytes -= diff;
 		atomic64_sub(diff, &fs_info->free_chunk_space);
 	}
@@ -4445,6 +4522,18 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
 		chunk_offset = btrfs_dev_extent_chunk_offset(l, dev_extent);
 		btrfs_release_path(path);
 
+		/*
+		 * We may be relocating the only data chunk we have,
+		 * which could potentially end up with losing data's
+		 * raid profile, so lets allocate an empty one in
+		 * advance.
+		 */
+		ret = btrfs_may_alloc_data_chunk(fs_info, chunk_offset);
+		if (ret < 0) {
+			mutex_unlock(&fs_info->delete_unused_bgs_mutex);
+			goto done;
+		}
+
 		ret = btrfs_relocate_chunk(fs_info, chunk_offset);
 		mutex_unlock(&fs_info->delete_unused_bgs_mutex);
 		if (ret && ret != -ENOSPC)
@@ -4518,7 +4607,7 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size)
 	if (ret) {
 		mutex_lock(&fs_info->chunk_mutex);
 		btrfs_device_set_total_bytes(device, old_size);
-		if (device->writeable)
+		if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
 			device->fs_devices->total_rw_bytes += diff;
 		atomic64_add(diff, &fs_info->free_chunk_space);
 		mutex_unlock(&fs_info->chunk_mutex);
@@ -4678,14 +4767,15 @@ static int __btrfs_alloc_chunk(struct btrfs_trans_handle *trans,
 		u64 max_avail;
 		u64 dev_offset;
 
-		if (!device->writeable) {
+		if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) {
 			WARN(1, KERN_ERR
 			       "BTRFS: read-only device in alloc_list\n");
 			continue;
 		}
 
-		if (!device->in_fs_metadata ||
-		    device->is_tgtdev_for_dev_replace)
+		if (!test_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+					&device->dev_state) ||
+		    test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state))
 			continue;
 
 		if (device->total_bytes > device->bytes_used)
@@ -5033,12 +5123,13 @@ int btrfs_chunk_readonly(struct btrfs_fs_info *fs_info, u64 chunk_offset)
 
 	map = em->map_lookup;
 	for (i = 0; i < map->num_stripes; i++) {
-		if (map->stripes[i].dev->missing) {
+		if (test_bit(BTRFS_DEV_STATE_MISSING,
+					&map->stripes[i].dev->dev_state)) {
 			miss_ndevs++;
 			continue;
 		}
-
-		if (!map->stripes[i].dev->writeable) {
+		if (!test_bit(BTRFS_DEV_STATE_WRITEABLE,
+					&map->stripes[i].dev->dev_state)) {
 			readonly = 1;
 			goto end;
 		}
@@ -5104,7 +5195,14 @@ int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len)
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID5)
 		ret = 2;
 	else if (map->type & BTRFS_BLOCK_GROUP_RAID6)
-		ret = 3;
+		/*
+		 * There could be two corrupted data stripes, we need
+		 * to loop retry in order to rebuild the correct data.
+		 * 
+		 * Fail a stripe at a time on every retry except the
+		 * stripe under reconstruction.
+		 */
+		ret = map->num_stripes;
 	else
 		ret = 1;
 	free_extent_map(em);
@@ -6004,15 +6102,14 @@ static void btrfs_end_bio(struct bio *bio)
 			dev = bbio->stripes[stripe_index].dev;
 			if (dev->bdev) {
 				if (bio_op(bio) == REQ_OP_WRITE)
-					btrfs_dev_stat_inc(dev,
+					btrfs_dev_stat_inc_and_print(dev,
 						BTRFS_DEV_STAT_WRITE_ERRS);
 				else
-					btrfs_dev_stat_inc(dev,
+					btrfs_dev_stat_inc_and_print(dev,
 						BTRFS_DEV_STAT_READ_ERRS);
 				if (bio->bi_opf & REQ_PREFLUSH)
-					btrfs_dev_stat_inc(dev,
+					btrfs_dev_stat_inc_and_print(dev,
 						BTRFS_DEV_STAT_FLUSH_ERRS);
-				btrfs_dev_stat_print_on_error(dev);
 			}
 		}
 	}
@@ -6062,16 +6159,15 @@ static noinline void btrfs_schedule_bio(struct btrfs_device *device,
 	int should_queue = 1;
 	struct btrfs_pending_bios *pending_bios;
 
-	if (device->missing || !device->bdev) {
+	if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state) ||
+	    !device->bdev) {
 		bio_io_error(bio);
 		return;
 	}
 
 	/* don't bother with additional async steps for reads, right now */
 	if (bio_op(bio) == REQ_OP_READ) {
-		bio_get(bio);
 		btrfsic_submit_bio(bio);
-		bio_put(bio);
 		return;
 	}
 
@@ -6208,7 +6304,8 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio,
 	for (dev_nr = 0; dev_nr < total_devs; dev_nr++) {
 		dev = bbio->stripes[dev_nr].dev;
 		if (!dev || !dev->bdev ||
-		    (bio_op(first_bio) == REQ_OP_WRITE && !dev->writeable)) {
+		    (bio_op(first_bio) == REQ_OP_WRITE &&
+		    !test_bit(BTRFS_DEV_STATE_WRITEABLE, &dev->dev_state))) {
 			bbio_error(bbio, first_bio, logical);
 			continue;
 		}
@@ -6257,7 +6354,7 @@ static struct btrfs_device *add_missing_dev(struct btrfs_fs_devices *fs_devices,
 	device->fs_devices = fs_devices;
 	fs_devices->num_devices++;
 
-	device->missing = 1;
+	set_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state);
 	fs_devices->missing_devices++;
 
 	return device;
@@ -6273,8 +6370,8 @@ static struct btrfs_device *add_missing_dev(struct btrfs_fs_devices *fs_devices,
  *		is generated.
  *
  * Return: a pointer to a new &struct btrfs_device on success; ERR_PTR()
- * on error.  Returned struct is not linked onto any lists and can be
- * destroyed with kfree() right away.
+ * on error.  Returned struct is not linked onto any lists and must be
+ * destroyed with free_device.
  */
 struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
 					const u64 *devid,
@@ -6297,8 +6394,7 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info,
 
 		ret = find_next_devid(fs_info, &tmp);
 		if (ret) {
-			bio_put(dev->flush_bio);
-			kfree(dev);
+			free_device(dev);
 			return ERR_PTR(ret);
 		}
 	}
@@ -6477,7 +6573,9 @@ static int read_one_chunk(struct btrfs_fs_info *fs_info, struct btrfs_key *key,
 			}
 			btrfs_report_missing_device(fs_info, devid, uuid, false);
 		}
-		map->stripes[i].dev->in_fs_metadata = 1;
+		set_bit(BTRFS_DEV_STATE_IN_FS_METADATA,
+				&(map->stripes[i].dev->dev_state));
+
 	}
 
 	write_lock(&map_tree->map_tree.lock);
@@ -6506,7 +6604,7 @@ static void fill_device_from_item(struct extent_buffer *leaf,
 	device->io_width = btrfs_device_io_width(leaf, dev_item);
 	device->sector_size = btrfs_device_sector_size(leaf, dev_item);
 	WARN_ON(device->devid == BTRFS_DEV_REPLACE_DEVID);
-	device->is_tgtdev_for_dev_replace = 0;
+	clear_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state);
 
 	ptr = btrfs_device_uuid(dev_item);
 	read_extent_buffer(leaf, device->uuid, ptr, BTRFS_UUID_SIZE);
@@ -6618,7 +6716,8 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
 							dev_uuid, false);
 		}
 
-		if(!device->bdev && !device->missing) {
+		if (!device->bdev &&
+		    !test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) {
 			/*
 			 * this happens when a device that was properly setup
 			 * in the device info lists suddenly goes bad.
@@ -6626,12 +6725,13 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
 			 * device->missing to one here
 			 */
 			device->fs_devices->missing_devices++;
-			device->missing = 1;
+			set_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state);
 		}
 
 		/* Move the device to its own fs_devices */
 		if (device->fs_devices != fs_devices) {
-			ASSERT(device->missing);
+			ASSERT(test_bit(BTRFS_DEV_STATE_MISSING,
+							&device->dev_state));
 
 			list_move(&device->dev_list, &fs_devices->devices);
 			device->fs_devices->num_devices--;
@@ -6645,15 +6745,16 @@ static int read_one_dev(struct btrfs_fs_info *fs_info,
 	}
 
 	if (device->fs_devices != fs_info->fs_devices) {
-		BUG_ON(device->writeable);
+		BUG_ON(test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state));
 		if (device->generation !=
 		    btrfs_device_generation(leaf, dev_item))
 			return -EINVAL;
 	}
 
 	fill_device_from_item(leaf, dev_item, device);
-	device->in_fs_metadata = 1;
-	if (device->writeable && !device->is_tgtdev_for_dev_replace) {
+	set_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state);
+	if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) &&
+	   !test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) {
 		device->fs_devices->total_rw_bytes += device->total_bytes;
 		atomic64_add(device->total_bytes - device->bytes_used,
 				&fs_info->free_chunk_space);
@@ -6785,10 +6886,13 @@ int btrfs_read_sys_array(struct btrfs_fs_info *fs_info)
 /*
  * Check if all chunks in the fs are OK for read-write degraded mount
  *
+ * If the @failing_dev is specified, it's accounted as missing.
+ *
  * Return true if all chunks meet the minimal RW mount requirements.
  * Return false if any chunk doesn't meet the minimal RW mount requirements.
  */
-bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info)
+bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info,
+					struct btrfs_device *failing_dev)
 {
 	struct btrfs_mapping_tree *map_tree = &fs_info->mapping_tree;
 	struct extent_map *em;
@@ -6816,12 +6920,16 @@ bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info)
 		for (i = 0; i < map->num_stripes; i++) {
 			struct btrfs_device *dev = map->stripes[i].dev;
 
-			if (!dev || !dev->bdev || dev->missing ||
+			if (!dev || !dev->bdev ||
+			    test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) ||
 			    dev->last_flush_error)
 				missing++;
+			else if (failing_dev && failing_dev == dev)
+				missing++;
 		}
 		if (missing > max_tolerated) {
-			btrfs_warn(fs_info,
+			if (!failing_dev)
+				btrfs_warn(fs_info,
 	"chunk %llu missing %d devices, max tolerance is %d for writeable mount",
 				   em->start, missing, max_tolerated);
 			free_extent_map(em);
@@ -7092,10 +7200,24 @@ int btrfs_run_dev_stats(struct btrfs_trans_handle *trans,
 
 	mutex_lock(&fs_devices->device_list_mutex);
 	list_for_each_entry(device, &fs_devices->devices, dev_list) {
-		if (!device->dev_stats_valid || !btrfs_dev_stats_dirty(device))
+		stats_cnt = atomic_read(&device->dev_stats_ccnt);
+		if (!device->dev_stats_valid || stats_cnt == 0)
 			continue;
 
-		stats_cnt = atomic_read(&device->dev_stats_ccnt);
+
+		/*
+		 * There is a LOAD-LOAD control dependency between the value of
+		 * dev_stats_ccnt and updating the on-disk values which requires
+		 * reading the in-memory counters. Such control dependencies
+		 * require explicit read memory barriers.
+		 *
+		 * This memory barriers pairs with smp_mb__before_atomic in
+		 * btrfs_dev_stat_inc/btrfs_dev_stat_set and with the full
+		 * barrier implied by atomic_xchg in
+		 * btrfs_dev_stats_read_and_reset
+		 */
+		smp_rmb();
+
 		ret = update_dev_stat_item(trans, fs_info, device);
 		if (!ret)
 			atomic_sub(stats_cnt, &device->dev_stats_ccnt);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h
index ff15208..28c28ee 100644
--- a/fs/btrfs/volumes.h
+++ b/fs/btrfs/volumes.h
@@ -47,6 +47,12 @@ struct btrfs_pending_bios {
 #define btrfs_device_data_ordered_init(device) do { } while (0)
 #endif
 
+#define BTRFS_DEV_STATE_WRITEABLE	(0)
+#define BTRFS_DEV_STATE_IN_FS_METADATA	(1)
+#define BTRFS_DEV_STATE_MISSING		(2)
+#define BTRFS_DEV_STATE_REPLACE_TGT	(3)
+#define BTRFS_DEV_STATE_FLUSH_SENT	(4)
+
 struct btrfs_device {
 	struct list_head dev_list;
 	struct list_head dev_alloc_list;
@@ -69,11 +75,7 @@ struct btrfs_device {
 	/* the mode sent to blkdev_get */
 	fmode_t mode;
 
-	int writeable;
-	int in_fs_metadata;
-	int missing;
-	int can_discard;
-	int is_tgtdev_for_dev_replace;
+	unsigned long dev_state;
 	blk_status_t last_flush_error;
 	int flush_bio_sent;
 
@@ -129,14 +131,12 @@ struct btrfs_device {
 	struct completion flush_wait;
 
 	/* per-device scrub information */
-	struct scrub_ctx *scrub_device;
+	struct scrub_ctx *scrub_ctx;
 
 	struct btrfs_work work;
 	struct rcu_head rcu;
-	struct work_struct rcu_work;
 
 	/* readahead state */
-	spinlock_t reada_lock;
 	atomic_t reada_in_flight;
 	u64 reada_next;
 	struct reada_zone *reada_curr_zone;
@@ -489,15 +489,16 @@ int btrfs_finish_chunk_alloc(struct btrfs_trans_handle *trans,
 int btrfs_remove_chunk(struct btrfs_trans_handle *trans,
 		       struct btrfs_fs_info *fs_info, u64 chunk_offset);
 
-static inline int btrfs_dev_stats_dirty(struct btrfs_device *dev)
-{
-	return atomic_read(&dev->dev_stats_ccnt);
-}
-
 static inline void btrfs_dev_stat_inc(struct btrfs_device *dev,
 				      int index)
 {
 	atomic_inc(dev->dev_stat_values + index);
+	/*
+	 * This memory barrier orders stores updating statistics before stores
+	 * updating dev_stats_ccnt.
+	 *
+	 * It pairs with smp_rmb() in btrfs_run_dev_stats().
+	 */
 	smp_mb__before_atomic();
 	atomic_inc(&dev->dev_stats_ccnt);
 }
@@ -514,7 +515,13 @@ static inline int btrfs_dev_stat_read_and_reset(struct btrfs_device *dev,
 	int ret;
 
 	ret = atomic_xchg(dev->dev_stat_values + index, 0);
-	smp_mb__before_atomic();
+	/*
+	 * atomic_xchg implies a full memory barriers as per atomic_t.txt:
+	 * - RMW operations that have a return value are fully ordered;
+	 *
+	 * This implicit memory barriers is paired with the smp_rmb in
+	 * btrfs_run_dev_stats
+	 */
 	atomic_inc(&dev->dev_stats_ccnt);
 	return ret;
 }
@@ -523,6 +530,12 @@ static inline void btrfs_dev_stat_set(struct btrfs_device *dev,
 				      int index, unsigned long val)
 {
 	atomic_set(dev->dev_stat_values + index, val);
+	/*
+	 * This memory barrier orders stores updating statistics before stores
+	 * updating dev_stats_ccnt.
+	 *
+	 * It pairs with smp_rmb() in btrfs_run_dev_stats().
+	 */
 	smp_mb__before_atomic();
 	atomic_inc(&dev->dev_stats_ccnt);
 }
@@ -540,7 +553,7 @@ void btrfs_update_commit_device_bytes_used(struct btrfs_fs_info *fs_info,
 struct list_head *btrfs_get_fs_uuids(void);
 void btrfs_set_fs_info_ptr(struct btrfs_fs_info *fs_info);
 void btrfs_reset_fs_info_ptr(struct btrfs_fs_info *fs_info);
-
-bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info);
+bool btrfs_check_rw_degradable(struct btrfs_fs_info *fs_info,
+					struct btrfs_device *failing_dev);
 
 #endif
diff --git a/fs/btrfs/xattr.c b/fs/btrfs/xattr.c
index 2c7e53f..de7d072 100644
--- a/fs/btrfs/xattr.c
+++ b/fs/btrfs/xattr.c
@@ -23,6 +23,7 @@
 #include <linux/xattr.h>
 #include <linux/security.h>
 #include <linux/posix_acl_xattr.h>
+#include <linux/iversion.h>
 #include "ctree.h"
 #include "btrfs_inode.h"
 #include "transaction.h"
@@ -267,7 +268,6 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
 {
 	struct btrfs_key key;
 	struct inode *inode = d_inode(dentry);
-	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
 	struct btrfs_root *root = BTRFS_I(inode)->root;
 	struct btrfs_path *path;
 	int ret = 0;
@@ -336,11 +336,6 @@ ssize_t btrfs_listxattr(struct dentry *dentry, char *buffer, size_t size)
 			u32 this_len = sizeof(*di) + name_len + data_len;
 			unsigned long name_ptr = (unsigned long)(di + 1);
 
-			if (verify_dir_item(fs_info, leaf, slot, di)) {
-				ret = -EIO;
-				goto err;
-			}
-
 			total_size += name_len + 1;
 			/*
 			 * We are just looking for how big our buffer needs to
diff --git a/fs/btrfs/zstd.c b/fs/btrfs/zstd.c
index 17f2dd8..01a4eab6 100644
--- a/fs/btrfs/zstd.c
+++ b/fs/btrfs/zstd.c
@@ -43,6 +43,8 @@ struct workspace {
 	size_t size;
 	char *buf;
 	struct list_head list;
+	ZSTD_inBuffer in_buf;
+	ZSTD_outBuffer out_buf;
 };
 
 static void zstd_free_workspace(struct list_head *ws)
@@ -94,8 +96,6 @@ static int zstd_compress_pages(struct list_head *ws,
 	int nr_pages = 0;
 	struct page *in_page = NULL;  /* The current page to read */
 	struct page *out_page = NULL; /* The current page to write to */
-	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
-	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
 	unsigned long tot_in = 0;
 	unsigned long tot_out = 0;
 	unsigned long len = *total_out;
@@ -118,9 +118,9 @@ static int zstd_compress_pages(struct list_head *ws,
 
 	/* map in the first page of input data */
 	in_page = find_get_page(mapping, start >> PAGE_SHIFT);
-	in_buf.src = kmap(in_page);
-	in_buf.pos = 0;
-	in_buf.size = min_t(size_t, len, PAGE_SIZE);
+	workspace->in_buf.src = kmap(in_page);
+	workspace->in_buf.pos = 0;
+	workspace->in_buf.size = min_t(size_t, len, PAGE_SIZE);
 
 
 	/* Allocate and map in the output buffer */
@@ -130,14 +130,15 @@ static int zstd_compress_pages(struct list_head *ws,
 		goto out;
 	}
 	pages[nr_pages++] = out_page;
-	out_buf.dst = kmap(out_page);
-	out_buf.pos = 0;
-	out_buf.size = min_t(size_t, max_out, PAGE_SIZE);
+	workspace->out_buf.dst = kmap(out_page);
+	workspace->out_buf.pos = 0;
+	workspace->out_buf.size = min_t(size_t, max_out, PAGE_SIZE);
 
 	while (1) {
 		size_t ret2;
 
-		ret2 = ZSTD_compressStream(stream, &out_buf, &in_buf);
+		ret2 = ZSTD_compressStream(stream, &workspace->out_buf,
+				&workspace->in_buf);
 		if (ZSTD_isError(ret2)) {
 			pr_debug("BTRFS: ZSTD_compressStream returned %d\n",
 					ZSTD_getErrorCode(ret2));
@@ -146,22 +147,22 @@ static int zstd_compress_pages(struct list_head *ws,
 		}
 
 		/* Check to see if we are making it bigger */
-		if (tot_in + in_buf.pos > 8192 &&
-				tot_in + in_buf.pos <
-				tot_out + out_buf.pos) {
+		if (tot_in + workspace->in_buf.pos > 8192 &&
+				tot_in + workspace->in_buf.pos <
+				tot_out + workspace->out_buf.pos) {
 			ret = -E2BIG;
 			goto out;
 		}
 
 		/* We've reached the end of our output range */
-		if (out_buf.pos >= max_out) {
-			tot_out += out_buf.pos;
+		if (workspace->out_buf.pos >= max_out) {
+			tot_out += workspace->out_buf.pos;
 			ret = -E2BIG;
 			goto out;
 		}
 
 		/* Check if we need more output space */
-		if (out_buf.pos == out_buf.size) {
+		if (workspace->out_buf.pos == workspace->out_buf.size) {
 			tot_out += PAGE_SIZE;
 			max_out -= PAGE_SIZE;
 			kunmap(out_page);
@@ -176,19 +177,20 @@ static int zstd_compress_pages(struct list_head *ws,
 				goto out;
 			}
 			pages[nr_pages++] = out_page;
-			out_buf.dst = kmap(out_page);
-			out_buf.pos = 0;
-			out_buf.size = min_t(size_t, max_out, PAGE_SIZE);
+			workspace->out_buf.dst = kmap(out_page);
+			workspace->out_buf.pos = 0;
+			workspace->out_buf.size = min_t(size_t, max_out,
+							PAGE_SIZE);
 		}
 
 		/* We've reached the end of the input */
-		if (in_buf.pos >= len) {
-			tot_in += in_buf.pos;
+		if (workspace->in_buf.pos >= len) {
+			tot_in += workspace->in_buf.pos;
 			break;
 		}
 
 		/* Check if we need more input */
-		if (in_buf.pos == in_buf.size) {
+		if (workspace->in_buf.pos == workspace->in_buf.size) {
 			tot_in += PAGE_SIZE;
 			kunmap(in_page);
 			put_page(in_page);
@@ -196,15 +198,15 @@ static int zstd_compress_pages(struct list_head *ws,
 			start += PAGE_SIZE;
 			len -= PAGE_SIZE;
 			in_page = find_get_page(mapping, start >> PAGE_SHIFT);
-			in_buf.src = kmap(in_page);
-			in_buf.pos = 0;
-			in_buf.size = min_t(size_t, len, PAGE_SIZE);
+			workspace->in_buf.src = kmap(in_page);
+			workspace->in_buf.pos = 0;
+			workspace->in_buf.size = min_t(size_t, len, PAGE_SIZE);
 		}
 	}
 	while (1) {
 		size_t ret2;
 
-		ret2 = ZSTD_endStream(stream, &out_buf);
+		ret2 = ZSTD_endStream(stream, &workspace->out_buf);
 		if (ZSTD_isError(ret2)) {
 			pr_debug("BTRFS: ZSTD_endStream returned %d\n",
 					ZSTD_getErrorCode(ret2));
@@ -212,11 +214,11 @@ static int zstd_compress_pages(struct list_head *ws,
 			goto out;
 		}
 		if (ret2 == 0) {
-			tot_out += out_buf.pos;
+			tot_out += workspace->out_buf.pos;
 			break;
 		}
-		if (out_buf.pos >= max_out) {
-			tot_out += out_buf.pos;
+		if (workspace->out_buf.pos >= max_out) {
+			tot_out += workspace->out_buf.pos;
 			ret = -E2BIG;
 			goto out;
 		}
@@ -235,9 +237,9 @@ static int zstd_compress_pages(struct list_head *ws,
 			goto out;
 		}
 		pages[nr_pages++] = out_page;
-		out_buf.dst = kmap(out_page);
-		out_buf.pos = 0;
-		out_buf.size = min_t(size_t, max_out, PAGE_SIZE);
+		workspace->out_buf.dst = kmap(out_page);
+		workspace->out_buf.pos = 0;
+		workspace->out_buf.size = min_t(size_t, max_out, PAGE_SIZE);
 	}
 
 	if (tot_out >= tot_in) {
@@ -273,8 +275,6 @@ static int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 	unsigned long total_pages_in = DIV_ROUND_UP(srclen, PAGE_SIZE);
 	unsigned long buf_start;
 	unsigned long total_out = 0;
-	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
-	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
 
 	stream = ZSTD_initDStream(
 			ZSTD_BTRFS_MAX_INPUT, workspace->mem, workspace->size);
@@ -284,18 +284,19 @@ static int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 		goto done;
 	}
 
-	in_buf.src = kmap(pages_in[page_in_index]);
-	in_buf.pos = 0;
-	in_buf.size = min_t(size_t, srclen, PAGE_SIZE);
+	workspace->in_buf.src = kmap(pages_in[page_in_index]);
+	workspace->in_buf.pos = 0;
+	workspace->in_buf.size = min_t(size_t, srclen, PAGE_SIZE);
 
-	out_buf.dst = workspace->buf;
-	out_buf.pos = 0;
-	out_buf.size = PAGE_SIZE;
+	workspace->out_buf.dst = workspace->buf;
+	workspace->out_buf.pos = 0;
+	workspace->out_buf.size = PAGE_SIZE;
 
 	while (1) {
 		size_t ret2;
 
-		ret2 = ZSTD_decompressStream(stream, &out_buf, &in_buf);
+		ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
+				&workspace->in_buf);
 		if (ZSTD_isError(ret2)) {
 			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
 					ZSTD_getErrorCode(ret2));
@@ -303,38 +304,38 @@ static int zstd_decompress_bio(struct list_head *ws, struct compressed_bio *cb)
 			goto done;
 		}
 		buf_start = total_out;
-		total_out += out_buf.pos;
-		out_buf.pos = 0;
+		total_out += workspace->out_buf.pos;
+		workspace->out_buf.pos = 0;
 
-		ret = btrfs_decompress_buf2page(out_buf.dst, buf_start,
-				total_out, disk_start, orig_bio);
+		ret = btrfs_decompress_buf2page(workspace->out_buf.dst,
+				buf_start, total_out, disk_start, orig_bio);
 		if (ret == 0)
 			break;
 
-		if (in_buf.pos >= srclen)
+		if (workspace->in_buf.pos >= srclen)
 			break;
 
 		/* Check if we've hit the end of a frame */
 		if (ret2 == 0)
 			break;
 
-		if (in_buf.pos == in_buf.size) {
+		if (workspace->in_buf.pos == workspace->in_buf.size) {
 			kunmap(pages_in[page_in_index++]);
 			if (page_in_index >= total_pages_in) {
-				in_buf.src = NULL;
+				workspace->in_buf.src = NULL;
 				ret = -EIO;
 				goto done;
 			}
 			srclen -= PAGE_SIZE;
-			in_buf.src = kmap(pages_in[page_in_index]);
-			in_buf.pos = 0;
-			in_buf.size = min_t(size_t, srclen, PAGE_SIZE);
+			workspace->in_buf.src = kmap(pages_in[page_in_index]);
+			workspace->in_buf.pos = 0;
+			workspace->in_buf.size = min_t(size_t, srclen, PAGE_SIZE);
 		}
 	}
 	ret = 0;
 	zero_fill_bio(orig_bio);
 done:
-	if (in_buf.src)
+	if (workspace->in_buf.src)
 		kunmap(pages_in[page_in_index]);
 	return ret;
 }
@@ -348,8 +349,6 @@ static int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 	ZSTD_DStream *stream;
 	int ret = 0;
 	size_t ret2;
-	ZSTD_inBuffer in_buf = { NULL, 0, 0 };
-	ZSTD_outBuffer out_buf = { NULL, 0, 0 };
 	unsigned long total_out = 0;
 	unsigned long pg_offset = 0;
 	char *kaddr;
@@ -364,16 +363,17 @@ static int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 
 	destlen = min_t(size_t, destlen, PAGE_SIZE);
 
-	in_buf.src = data_in;
-	in_buf.pos = 0;
-	in_buf.size = srclen;
+	workspace->in_buf.src = data_in;
+	workspace->in_buf.pos = 0;
+	workspace->in_buf.size = srclen;
 
-	out_buf.dst = workspace->buf;
-	out_buf.pos = 0;
-	out_buf.size = PAGE_SIZE;
+	workspace->out_buf.dst = workspace->buf;
+	workspace->out_buf.pos = 0;
+	workspace->out_buf.size = PAGE_SIZE;
 
 	ret2 = 1;
-	while (pg_offset < destlen && in_buf.pos < in_buf.size) {
+	while (pg_offset < destlen
+	       && workspace->in_buf.pos < workspace->in_buf.size) {
 		unsigned long buf_start;
 		unsigned long buf_offset;
 		unsigned long bytes;
@@ -384,7 +384,8 @@ static int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 			ret = -EIO;
 			goto finish;
 		}
-		ret2 = ZSTD_decompressStream(stream, &out_buf, &in_buf);
+		ret2 = ZSTD_decompressStream(stream, &workspace->out_buf,
+				&workspace->in_buf);
 		if (ZSTD_isError(ret2)) {
 			pr_debug("BTRFS: ZSTD_decompressStream returned %d\n",
 					ZSTD_getErrorCode(ret2));
@@ -393,8 +394,8 @@ static int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 		}
 
 		buf_start = total_out;
-		total_out += out_buf.pos;
-		out_buf.pos = 0;
+		total_out += workspace->out_buf.pos;
+		workspace->out_buf.pos = 0;
 
 		if (total_out <= start_byte)
 			continue;
@@ -405,10 +406,11 @@ static int zstd_decompress(struct list_head *ws, unsigned char *data_in,
 			buf_offset = 0;
 
 		bytes = min_t(unsigned long, destlen - pg_offset,
-				out_buf.size - buf_offset);
+				workspace->out_buf.size - buf_offset);
 
 		kaddr = kmap_atomic(dest_page);
-		memcpy(kaddr + pg_offset, out_buf.dst + buf_offset, bytes);
+		memcpy(kaddr + pg_offset, workspace->out_buf.dst + buf_offset,
+				bytes);
 		kunmap_atomic(kaddr);
 
 		pg_offset += bytes;
diff --git a/fs/buffer.c b/fs/buffer.c
index 0736a6a..8b26295 100644
--- a/fs/buffer.c
+++ b/fs/buffer.c
@@ -3014,7 +3014,7 @@ static void end_bio_bh_io_sync(struct bio *bio)
 void guard_bio_eod(int op, struct bio *bio)
 {
 	sector_t maxsector;
-	struct bio_vec *bvec = &bio->bi_io_vec[bio->bi_vcnt - 1];
+	struct bio_vec *bvec = bio_last_bvec_all(bio);
 	unsigned truncated_bytes;
 	struct hd_struct *part;
 
diff --git a/fs/cifs/Kconfig b/fs/cifs/Kconfig
index d5b2e12..c71971c 100644
--- a/fs/cifs/Kconfig
+++ b/fs/cifs/Kconfig
@@ -196,6 +196,14 @@
 	  This dialect includes improved security negotiation features.
 	  If unsure, say N
 
+config CIFS_SMB_DIRECT
+	bool "SMB Direct support (Experimental)"
+	depends on CIFS=m && INFINIBAND || CIFS=y && INFINIBAND=y
+	help
+	  Enables SMB Direct experimental support for SMB 3.0, 3.02 and 3.1.1.
+	  SMB Direct allows transferring SMB packets over RDMA. If unsure,
+	  say N.
+
 config CIFS_FSCACHE
 	  bool "Provide CIFS client caching support"
 	  depends on CIFS=m && FSCACHE || CIFS=y && FSCACHE=y
diff --git a/fs/cifs/Makefile b/fs/cifs/Makefile
index 7134f18..7e4a1e2 100644
--- a/fs/cifs/Makefile
+++ b/fs/cifs/Makefile
@@ -19,3 +19,5 @@
 cifs-$(CONFIG_CIFS_DFS_UPCALL) += dns_resolve.o cifs_dfs_ref.o
 
 cifs-$(CONFIG_CIFS_FSCACHE) += fscache.o cache.o
+
+cifs-$(CONFIG_CIFS_SMB_DIRECT) += smbdirect.o
diff --git a/fs/cifs/cifs_debug.c b/fs/cifs/cifs_debug.c
index cbb9534..c7a8632 100644
--- a/fs/cifs/cifs_debug.c
+++ b/fs/cifs/cifs_debug.c
@@ -30,6 +30,9 @@
 #include "cifsproto.h"
 #include "cifs_debug.h"
 #include "cifsfs.h"
+#ifdef CONFIG_CIFS_SMB_DIRECT
+#include "smbdirect.h"
+#endif
 
 void
 cifs_dump_mem(char *label, void *data, int length)
@@ -107,6 +110,32 @@ void cifs_dump_mids(struct TCP_Server_Info *server)
 }
 
 #ifdef CONFIG_PROC_FS
+static void cifs_debug_tcon(struct seq_file *m, struct cifs_tcon *tcon)
+{
+	__u32 dev_type = le32_to_cpu(tcon->fsDevInfo.DeviceType);
+
+	seq_printf(m, "%s Mounts: %d ", tcon->treeName, tcon->tc_count);
+	if (tcon->nativeFileSystem)
+		seq_printf(m, "Type: %s ", tcon->nativeFileSystem);
+	seq_printf(m, "DevInfo: 0x%x Attributes: 0x%x\n\tPathComponentMax: %d Status: %d",
+		   le32_to_cpu(tcon->fsDevInfo.DeviceCharacteristics),
+		   le32_to_cpu(tcon->fsAttrInfo.Attributes),
+		   le32_to_cpu(tcon->fsAttrInfo.MaxPathNameComponentLength),
+		   tcon->tidStatus);
+	if (dev_type == FILE_DEVICE_DISK)
+		seq_puts(m, " type: DISK ");
+	else if (dev_type == FILE_DEVICE_CD_ROM)
+		seq_puts(m, " type: CDROM ");
+	else
+		seq_printf(m, " type: %d ", dev_type);
+	if (tcon->ses->server->ops->dump_share_caps)
+		tcon->ses->server->ops->dump_share_caps(m, tcon);
+
+	if (tcon->need_reconnect)
+		seq_puts(m, "\tDISCONNECTED ");
+	seq_putc(m, '\n');
+}
+
 static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
 {
 	struct list_head *tmp1, *tmp2, *tmp3;
@@ -115,7 +144,6 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
 	struct cifs_ses *ses;
 	struct cifs_tcon *tcon;
 	int i, j;
-	__u32 dev_type;
 
 	seq_puts(m,
 		    "Display Internal CIFS Data Structures for Debugging\n"
@@ -152,6 +180,72 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
 	list_for_each(tmp1, &cifs_tcp_ses_list) {
 		server = list_entry(tmp1, struct TCP_Server_Info,
 				    tcp_ses_list);
+
+#ifdef CONFIG_CIFS_SMB_DIRECT
+		if (!server->rdma)
+			goto skip_rdma;
+
+		seq_printf(m, "\nSMBDirect (in hex) protocol version: %x "
+			"transport status: %x",
+			server->smbd_conn->protocol,
+			server->smbd_conn->transport_status);
+		seq_printf(m, "\nConn receive_credit_max: %x "
+			"send_credit_target: %x max_send_size: %x",
+			server->smbd_conn->receive_credit_max,
+			server->smbd_conn->send_credit_target,
+			server->smbd_conn->max_send_size);
+		seq_printf(m, "\nConn max_fragmented_recv_size: %x "
+			"max_fragmented_send_size: %x max_receive_size:%x",
+			server->smbd_conn->max_fragmented_recv_size,
+			server->smbd_conn->max_fragmented_send_size,
+			server->smbd_conn->max_receive_size);
+		seq_printf(m, "\nConn keep_alive_interval: %x "
+			"max_readwrite_size: %x rdma_readwrite_threshold: %x",
+			server->smbd_conn->keep_alive_interval,
+			server->smbd_conn->max_readwrite_size,
+			server->smbd_conn->rdma_readwrite_threshold);
+		seq_printf(m, "\nDebug count_get_receive_buffer: %x "
+			"count_put_receive_buffer: %x count_send_empty: %x",
+			server->smbd_conn->count_get_receive_buffer,
+			server->smbd_conn->count_put_receive_buffer,
+			server->smbd_conn->count_send_empty);
+		seq_printf(m, "\nRead Queue count_reassembly_queue: %x "
+			"count_enqueue_reassembly_queue: %x "
+			"count_dequeue_reassembly_queue: %x "
+			"fragment_reassembly_remaining: %x "
+			"reassembly_data_length: %x "
+			"reassembly_queue_length: %x",
+			server->smbd_conn->count_reassembly_queue,
+			server->smbd_conn->count_enqueue_reassembly_queue,
+			server->smbd_conn->count_dequeue_reassembly_queue,
+			server->smbd_conn->fragment_reassembly_remaining,
+			server->smbd_conn->reassembly_data_length,
+			server->smbd_conn->reassembly_queue_length);
+		seq_printf(m, "\nCurrent Credits send_credits: %x "
+			"receive_credits: %x receive_credit_target: %x",
+			atomic_read(&server->smbd_conn->send_credits),
+			atomic_read(&server->smbd_conn->receive_credits),
+			server->smbd_conn->receive_credit_target);
+		seq_printf(m, "\nPending send_pending: %x send_payload_pending:"
+			" %x smbd_send_pending: %x smbd_recv_pending: %x",
+			atomic_read(&server->smbd_conn->send_pending),
+			atomic_read(&server->smbd_conn->send_payload_pending),
+			server->smbd_conn->smbd_send_pending,
+			server->smbd_conn->smbd_recv_pending);
+		seq_printf(m, "\nReceive buffers count_receive_queue: %x "
+			"count_empty_packet_queue: %x",
+			server->smbd_conn->count_receive_queue,
+			server->smbd_conn->count_empty_packet_queue);
+		seq_printf(m, "\nMR responder_resources: %x "
+			"max_frmr_depth: %x mr_type: %x",
+			server->smbd_conn->responder_resources,
+			server->smbd_conn->max_frmr_depth,
+			server->smbd_conn->mr_type);
+		seq_printf(m, "\nMR mr_ready_count: %x mr_used_count: %x",
+			atomic_read(&server->smbd_conn->mr_ready_count),
+			atomic_read(&server->smbd_conn->mr_used_count));
+skip_rdma:
+#endif
 		seq_printf(m, "\nNumber of credits: %d", server->credits);
 		i++;
 		list_for_each(tmp2, &server->smb_ses_list) {
@@ -176,6 +270,8 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
 				ses->ses_count, ses->serverOS, ses->serverNOS,
 				ses->capabilities, ses->status);
 			}
+			if (server->rdma)
+				seq_printf(m, "RDMA\n\t");
 			seq_printf(m, "TCP status: %d\n\tLocal Users To "
 				   "Server: %d SecMode: 0x%x Req On Wire: %d",
 				   server->tcpStatus, server->srv_count,
@@ -189,35 +285,19 @@ static int cifs_debug_data_proc_show(struct seq_file *m, void *v)
 
 			seq_puts(m, "\n\tShares:");
 			j = 0;
+
+			seq_printf(m, "\n\t%d) IPC: ", j);
+			if (ses->tcon_ipc)
+				cifs_debug_tcon(m, ses->tcon_ipc);
+			else
+				seq_puts(m, "none\n");
+
 			list_for_each(tmp3, &ses->tcon_list) {
 				tcon = list_entry(tmp3, struct cifs_tcon,
 						  tcon_list);
 				++j;
-				dev_type = le32_to_cpu(tcon->fsDevInfo.DeviceType);
-				seq_printf(m, "\n\t%d) %s Mounts: %d ", j,
-					   tcon->treeName, tcon->tc_count);
-				if (tcon->nativeFileSystem) {
-					seq_printf(m, "Type: %s ",
-						   tcon->nativeFileSystem);
-				}
-				seq_printf(m, "DevInfo: 0x%x Attributes: 0x%x"
-					"\n\tPathComponentMax: %d Status: %d",
-					le32_to_cpu(tcon->fsDevInfo.DeviceCharacteristics),
-					le32_to_cpu(tcon->fsAttrInfo.Attributes),
-					le32_to_cpu(tcon->fsAttrInfo.MaxPathNameComponentLength),
-					tcon->tidStatus);
-				if (dev_type == FILE_DEVICE_DISK)
-					seq_puts(m, " type: DISK ");
-				else if (dev_type == FILE_DEVICE_CD_ROM)
-					seq_puts(m, " type: CDROM ");
-				else
-					seq_printf(m, " type: %d ", dev_type);
-				if (server->ops->dump_share_caps)
-					server->ops->dump_share_caps(m, tcon);
-
-				if (tcon->need_reconnect)
-					seq_puts(m, "\tDISCONNECTED ");
-				seq_putc(m, '\n');
+				seq_printf(m, "\n\t%d) ", j);
+				cifs_debug_tcon(m, tcon);
 			}
 
 			seq_puts(m, "\n\tMIDs:\n");
@@ -374,6 +454,45 @@ static const struct file_operations cifs_stats_proc_fops = {
 };
 #endif /* STATS */
 
+#ifdef CONFIG_CIFS_SMB_DIRECT
+#define PROC_FILE_DEFINE(name) \
+static ssize_t name##_write(struct file *file, const char __user *buffer, \
+	size_t count, loff_t *ppos) \
+{ \
+	int rc; \
+	rc = kstrtoint_from_user(buffer, count, 10, & name); \
+	if (rc) \
+		return rc; \
+	return count; \
+} \
+static int name##_proc_show(struct seq_file *m, void *v) \
+{ \
+	seq_printf(m, "%d\n", name ); \
+	return 0; \
+} \
+static int name##_open(struct inode *inode, struct file *file) \
+{ \
+	return single_open(file, name##_proc_show, NULL); \
+} \
+\
+static const struct file_operations cifs_##name##_proc_fops = { \
+	.open		= name##_open, \
+	.read		= seq_read, \
+	.llseek		= seq_lseek, \
+	.release	= single_release, \
+	.write		= name##_write, \
+}
+
+PROC_FILE_DEFINE(rdma_readwrite_threshold);
+PROC_FILE_DEFINE(smbd_max_frmr_depth);
+PROC_FILE_DEFINE(smbd_keep_alive_interval);
+PROC_FILE_DEFINE(smbd_max_receive_size);
+PROC_FILE_DEFINE(smbd_max_fragmented_recv_size);
+PROC_FILE_DEFINE(smbd_max_send_size);
+PROC_FILE_DEFINE(smbd_send_credit_target);
+PROC_FILE_DEFINE(smbd_receive_credit_max);
+#endif
+
 static struct proc_dir_entry *proc_fs_cifs;
 static const struct file_operations cifsFYI_proc_fops;
 static const struct file_operations cifs_lookup_cache_proc_fops;
@@ -401,6 +520,24 @@ cifs_proc_init(void)
 		    &cifs_security_flags_proc_fops);
 	proc_create("LookupCacheEnabled", 0, proc_fs_cifs,
 		    &cifs_lookup_cache_proc_fops);
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	proc_create("rdma_readwrite_threshold", 0, proc_fs_cifs,
+		&cifs_rdma_readwrite_threshold_proc_fops);
+	proc_create("smbd_max_frmr_depth", 0, proc_fs_cifs,
+		&cifs_smbd_max_frmr_depth_proc_fops);
+	proc_create("smbd_keep_alive_interval", 0, proc_fs_cifs,
+		&cifs_smbd_keep_alive_interval_proc_fops);
+	proc_create("smbd_max_receive_size", 0, proc_fs_cifs,
+		&cifs_smbd_max_receive_size_proc_fops);
+	proc_create("smbd_max_fragmented_recv_size", 0, proc_fs_cifs,
+		&cifs_smbd_max_fragmented_recv_size_proc_fops);
+	proc_create("smbd_max_send_size", 0, proc_fs_cifs,
+		&cifs_smbd_max_send_size_proc_fops);
+	proc_create("smbd_send_credit_target", 0, proc_fs_cifs,
+		&cifs_smbd_send_credit_target_proc_fops);
+	proc_create("smbd_receive_credit_max", 0, proc_fs_cifs,
+		&cifs_smbd_receive_credit_max_proc_fops);
+#endif
 }
 
 void
@@ -418,6 +555,16 @@ cifs_proc_clean(void)
 	remove_proc_entry("SecurityFlags", proc_fs_cifs);
 	remove_proc_entry("LinuxExtensionsEnabled", proc_fs_cifs);
 	remove_proc_entry("LookupCacheEnabled", proc_fs_cifs);
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	remove_proc_entry("rdma_readwrite_threshold", proc_fs_cifs);
+	remove_proc_entry("smbd_max_frmr_depth", proc_fs_cifs);
+	remove_proc_entry("smbd_keep_alive_interval", proc_fs_cifs);
+	remove_proc_entry("smbd_max_receive_size", proc_fs_cifs);
+	remove_proc_entry("smbd_max_fragmented_recv_size", proc_fs_cifs);
+	remove_proc_entry("smbd_max_send_size", proc_fs_cifs);
+	remove_proc_entry("smbd_send_credit_target", proc_fs_cifs);
+	remove_proc_entry("smbd_receive_credit_max", proc_fs_cifs);
+#endif
 	remove_proc_entry("fs/cifs", NULL);
 }
 
diff --git a/fs/cifs/cifsacl.c b/fs/cifs/cifsacl.c
index b98436f..13a8a77 100644
--- a/fs/cifs/cifsacl.c
+++ b/fs/cifs/cifsacl.c
@@ -1125,7 +1125,7 @@ int set_cifs_acl(struct cifs_ntsd *pnntsd, __u32 acllen,
 	return rc;
 }
 
-/* Translate the CIFS ACL (simlar to NTFS ACL) for a file into mode bits */
+/* Translate the CIFS ACL (similar to NTFS ACL) for a file into mode bits */
 int
 cifs_acl_to_fattr(struct cifs_sb_info *cifs_sb, struct cifs_fattr *fattr,
 		  struct inode *inode, const char *path,
diff --git a/fs/cifs/cifsencrypt.c b/fs/cifs/cifsencrypt.c
index 68abbb0..f2b0a7f 100644
--- a/fs/cifs/cifsencrypt.c
+++ b/fs/cifs/cifsencrypt.c
@@ -325,9 +325,8 @@ int calc_lanman_hash(const char *password, const char *cryptkey, bool encrypt,
 {
 	int i;
 	int rc;
-	char password_with_pad[CIFS_ENCPWD_SIZE];
+	char password_with_pad[CIFS_ENCPWD_SIZE] = {0};
 
-	memset(password_with_pad, 0, CIFS_ENCPWD_SIZE);
 	if (password)
 		strncpy(password_with_pad, password, CIFS_ENCPWD_SIZE);
 
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c
index 31b7565..a7be591 100644
--- a/fs/cifs/cifsfs.c
+++ b/fs/cifs/cifsfs.c
@@ -327,6 +327,8 @@ cifs_show_address(struct seq_file *s, struct TCP_Server_Info *server)
 	default:
 		seq_puts(s, "(unknown)");
 	}
+	if (server->rdma)
+		seq_puts(s, ",rdma");
 }
 
 static void
@@ -1068,6 +1070,7 @@ const struct file_operations cifs_file_ops = {
 	.flush = cifs_flush,
 	.mmap  = cifs_file_mmap,
 	.splice_read = generic_file_splice_read,
+	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
@@ -1086,6 +1089,7 @@ const struct file_operations cifs_file_strict_ops = {
 	.flush = cifs_flush,
 	.mmap = cifs_file_strict_mmap,
 	.splice_read = generic_file_splice_read,
+	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
@@ -1105,6 +1109,7 @@ const struct file_operations cifs_file_direct_ops = {
 	.flush = cifs_flush,
 	.mmap = cifs_file_mmap,
 	.splice_read = generic_file_splice_read,
+	.splice_write = iter_file_splice_write,
 	.unlocked_ioctl  = cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
 	.clone_file_range = cifs_clone_file_range,
@@ -1122,6 +1127,7 @@ const struct file_operations cifs_file_nobrl_ops = {
 	.flush = cifs_flush,
 	.mmap  = cifs_file_mmap,
 	.splice_read = generic_file_splice_read,
+	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
@@ -1139,6 +1145,7 @@ const struct file_operations cifs_file_strict_nobrl_ops = {
 	.flush = cifs_flush,
 	.mmap = cifs_file_strict_mmap,
 	.splice_read = generic_file_splice_read,
+	.splice_write = iter_file_splice_write,
 	.llseek = cifs_llseek,
 	.unlocked_ioctl	= cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
@@ -1157,6 +1164,7 @@ const struct file_operations cifs_file_direct_nobrl_ops = {
 	.flush = cifs_flush,
 	.mmap = cifs_file_mmap,
 	.splice_read = generic_file_splice_read,
+	.splice_write = iter_file_splice_write,
 	.unlocked_ioctl  = cifs_ioctl,
 	.copy_file_range = cifs_copy_file_range,
 	.clone_file_range = cifs_clone_file_range,
diff --git a/fs/cifs/cifsfs.h b/fs/cifs/cifsfs.h
index 5a10e56..013ba2a 100644
--- a/fs/cifs/cifsfs.h
+++ b/fs/cifs/cifsfs.h
@@ -149,5 +149,5 @@ extern long cifs_ioctl(struct file *filep, unsigned int cmd, unsigned long arg);
 extern const struct export_operations cifs_export_ops;
 #endif /* CONFIG_CIFS_NFSD_EXPORT */
 
-#define CIFS_VERSION   "2.10"
+#define CIFS_VERSION   "2.11"
 #endif				/* _CIFSFS_H */
diff --git a/fs/cifs/cifsglob.h b/fs/cifs/cifsglob.h
index b165835..48f7c19 100644
--- a/fs/cifs/cifsglob.h
+++ b/fs/cifs/cifsglob.h
@@ -64,8 +64,8 @@
 #define RFC1001_NAME_LEN 15
 #define RFC1001_NAME_LEN_WITH_NULL (RFC1001_NAME_LEN + 1)
 
-/* currently length of NIP6_FMT */
-#define SERVER_NAME_LENGTH 40
+/* maximum length of ip addr as a string (including ipv6 and sctp) */
+#define SERVER_NAME_LENGTH 80
 #define SERVER_NAME_LEN_WITH_NULL     (SERVER_NAME_LENGTH + 1)
 
 /* echo interval in seconds */
@@ -230,8 +230,14 @@ struct smb_version_operations {
 	__u64 (*get_next_mid)(struct TCP_Server_Info *);
 	/* data offset from read response message */
 	unsigned int (*read_data_offset)(char *);
-	/* data length from read response message */
-	unsigned int (*read_data_length)(char *);
+	/*
+	 * Data length from read response message
+	 * When in_remaining is true, the returned data length is in
+	 * message field DataRemaining for out-of-band data read (e.g through
+	 * Memory Registration RDMA write in SMBD).
+	 * Otherwise, the returned data length is in message field DataLength.
+	 */
+	unsigned int (*read_data_length)(char *, bool in_remaining);
 	/* map smb to linux error */
 	int (*map_error)(char *, bool);
 	/* find mid corresponding to the response message */
@@ -532,6 +538,7 @@ struct smb_vol {
 	bool nopersistent:1;
 	bool resilient:1; /* noresilient not required since not fored for CA */
 	bool domainauto:1;
+	bool rdma:1;
 	unsigned int rsize;
 	unsigned int wsize;
 	bool sockopt_tcp_nodelay:1;
@@ -648,6 +655,10 @@ struct TCP_Server_Info {
 	bool	sec_kerberos;		/* supports plain Kerberos */
 	bool	sec_mskerberos;		/* supports legacy MS Kerberos */
 	bool	large_buf;		/* is current buffer large? */
+	/* use SMBD connection instead of socket */
+	bool	rdma;
+	/* point to the SMBD connection if RDMA is used instead of socket */
+	struct smbd_connection *smbd_conn;
 	struct delayed_work	echo; /* echo ping workqueue job */
 	char	*smallbuf;	/* pointer to current "small" buffer */
 	char	*bigbuf;	/* pointer to current "big" buffer */
@@ -822,12 +833,12 @@ static inline void cifs_set_net_ns(struct TCP_Server_Info *srv, struct net *net)
 struct cifs_ses {
 	struct list_head smb_ses_list;
 	struct list_head tcon_list;
+	struct cifs_tcon *tcon_ipc;
 	struct mutex session_mutex;
 	struct TCP_Server_Info *server;	/* pointer to server info */
 	int ses_count;		/* reference counter */
 	enum statusEnum status;
 	unsigned overrideSecFlg;  /* if non-zero override global sec flags */
-	__u32 ipc_tid;		/* special tid for connection to IPC share */
 	char *serverOS;		/* name of operating system underlying server */
 	char *serverNOS;	/* name of network operating system of server */
 	char *serverDomain;	/* security realm of server */
@@ -835,8 +846,7 @@ struct cifs_ses {
 	kuid_t linux_uid;	/* overriding owner of files on the mount */
 	kuid_t cred_uid;	/* owner of credentials */
 	unsigned int capabilities;
-	char serverName[SERVER_NAME_LEN_WITH_NULL * 2];	/* BB make bigger for
-				TCP names - will ipv6 and sctp addresses fit? */
+	char serverName[SERVER_NAME_LEN_WITH_NULL];
 	char *user_name;	/* must not be null except during init of sess
 				   and after mount option parsing we fill it */
 	char *domainName;
@@ -931,7 +941,9 @@ struct cifs_tcon {
 	FILE_SYSTEM_DEVICE_INFO fsDevInfo;
 	FILE_SYSTEM_ATTRIBUTE_INFO fsAttrInfo; /* ok if fs name truncated */
 	FILE_SYSTEM_UNIX_INFO fsUnixInfo;
-	bool ipc:1;		/* set if connection to IPC$ eg for RPC/PIPES */
+	bool ipc:1;   /* set if connection to IPC$ share (always also pipe) */
+	bool pipe:1;  /* set if connection to pipe share */
+	bool print:1; /* set if connection to printer share */
 	bool retry:1;
 	bool nocase:1;
 	bool seal:1;      /* transport encryption for this mounted share */
@@ -944,7 +956,6 @@ struct cifs_tcon {
 	bool need_reopen_files:1; /* need to reopen tcon file handles */
 	bool use_resilient:1; /* use resilient instead of durable handles */
 	bool use_persistent:1; /* use persistent instead of durable handles */
-	bool print:1;		/* set if connection to printer share */
 	__le32 capabilities;
 	__u32 share_flags;
 	__u32 maximal_access;
@@ -1147,6 +1158,9 @@ struct cifs_readdata {
 				struct cifs_readdata *rdata,
 				struct iov_iter *iter);
 	struct kvec			iov[2];
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	struct smbd_mr			*mr;
+#endif
 	unsigned int			pagesz;
 	unsigned int			tailsz;
 	unsigned int			credits;
@@ -1169,6 +1183,9 @@ struct cifs_writedata {
 	pid_t				pid;
 	unsigned int			bytes;
 	int				result;
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	struct smbd_mr			*mr;
+#endif
 	unsigned int			pagesz;
 	unsigned int			tailsz;
 	unsigned int			credits;
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h
index 4143c9d..93d5651 100644
--- a/fs/cifs/cifsproto.h
+++ b/fs/cifs/cifsproto.h
@@ -106,6 +106,10 @@ extern int SendReceive2(const unsigned int /* xid */ , struct cifs_ses *,
 			struct kvec *, int /* nvec to send */,
 			int * /* type of buf returned */, const int flags,
 			struct kvec * /* resp vec */);
+extern int smb2_send_recv(const unsigned int xid, struct cifs_ses *pses,
+			  struct kvec *pkvec, int nvec_to_send,
+			  int *pbuftype, const int flags,
+			  struct kvec *presp);
 extern int SendReceiveBlockingLock(const unsigned int xid,
 			struct cifs_tcon *ptcon,
 			struct smb_hdr *in_buf ,
diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c
index 35dc5bf..4e0922d 100644
--- a/fs/cifs/cifssmb.c
+++ b/fs/cifs/cifssmb.c
@@ -43,6 +43,7 @@
 #include "cifs_unicode.h"
 #include "cifs_debug.h"
 #include "fscache.h"
+#include "smbdirect.h"
 
 #ifdef CONFIG_CIFS_POSIX
 static struct {
@@ -1454,6 +1455,7 @@ cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid)
 	struct cifs_readdata *rdata = mid->callback_data;
 	char *buf = server->smallbuf;
 	unsigned int buflen = get_rfc1002_length(buf) + 4;
+	bool use_rdma_mr = false;
 
 	cifs_dbg(FYI, "%s: mid=%llu offset=%llu bytes=%u\n",
 		 __func__, mid->mid, rdata->offset, rdata->bytes);
@@ -1542,8 +1544,11 @@ cifs_readv_receive(struct TCP_Server_Info *server, struct mid_q_entry *mid)
 		 rdata->iov[0].iov_base, server->total_read);
 
 	/* how much data is in the response? */
-	data_len = server->ops->read_data_length(buf);
-	if (data_offset + data_len > buflen) {
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	use_rdma_mr = rdata->mr;
+#endif
+	data_len = server->ops->read_data_length(buf, use_rdma_mr);
+	if (!use_rdma_mr && (data_offset + data_len > buflen)) {
 		/* data_len is corrupt -- discard frame */
 		rdata->result = -EIO;
 		return cifs_readv_discard(server, mid);
@@ -1923,6 +1928,12 @@ cifs_writedata_release(struct kref *refcount)
 {
 	struct cifs_writedata *wdata = container_of(refcount,
 					struct cifs_writedata, refcount);
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	if (wdata->mr) {
+		smbd_deregister_mr(wdata->mr);
+		wdata->mr = NULL;
+	}
+#endif
 
 	if (wdata->cfile)
 		cifsFileInfo_put(wdata->cfile);
@@ -4822,10 +4833,11 @@ CIFSGetDFSRefer(const unsigned int xid, struct cifs_ses *ses,
 	*target_nodes = NULL;
 
 	cifs_dbg(FYI, "In GetDFSRefer the path %s\n", search_name);
-	if (ses == NULL)
+	if (ses == NULL || ses->tcon_ipc == NULL)
 		return -ENODEV;
+
 getDFSRetry:
-	rc = smb_init(SMB_COM_TRANSACTION2, 15, NULL, (void **) &pSMB,
+	rc = smb_init(SMB_COM_TRANSACTION2, 15, ses->tcon_ipc, (void **) &pSMB,
 		      (void **) &pSMBr);
 	if (rc)
 		return rc;
@@ -4833,7 +4845,7 @@ CIFSGetDFSRefer(const unsigned int xid, struct cifs_ses *ses,
 	/* server pointer checked in called function,
 	but should never be null here anyway */
 	pSMB->hdr.Mid = get_next_mid(ses->server);
-	pSMB->hdr.Tid = ses->ipc_tid;
+	pSMB->hdr.Tid = ses->tcon_ipc->tid;
 	pSMB->hdr.Uid = ses->Suid;
 	if (ses->capabilities & CAP_STATUS32)
 		pSMB->hdr.Flags2 |= SMBFLG2_ERR_STATUS;
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c
index 0bfc228..a726f52 100644
--- a/fs/cifs/connect.c
+++ b/fs/cifs/connect.c
@@ -44,7 +44,6 @@
 #include <net/ipv6.h>
 #include <linux/parser.h>
 #include <linux/bvec.h>
-
 #include "cifspdu.h"
 #include "cifsglob.h"
 #include "cifsproto.h"
@@ -56,6 +55,7 @@
 #include "rfc1002pdu.h"
 #include "fscache.h"
 #include "smb2proto.h"
+#include "smbdirect.h"
 
 #define CIFS_PORT 445
 #define RFC1001_PORT 139
@@ -92,7 +92,7 @@ enum {
 	Opt_multiuser, Opt_sloppy, Opt_nosharesock,
 	Opt_persistent, Opt_nopersistent,
 	Opt_resilient, Opt_noresilient,
-	Opt_domainauto,
+	Opt_domainauto, Opt_rdma,
 
 	/* Mount options which take numeric value */
 	Opt_backupuid, Opt_backupgid, Opt_uid,
@@ -183,6 +183,7 @@ static const match_table_t cifs_mount_option_tokens = {
 	{ Opt_resilient, "resilienthandles"},
 	{ Opt_noresilient, "noresilienthandles"},
 	{ Opt_domainauto, "domainauto"},
+	{ Opt_rdma, "rdma"},
 
 	{ Opt_backupuid, "backupuid=%s" },
 	{ Opt_backupgid, "backupgid=%s" },
@@ -353,11 +354,12 @@ cifs_reconnect(struct TCP_Server_Info *server)
 	list_for_each(tmp, &server->smb_ses_list) {
 		ses = list_entry(tmp, struct cifs_ses, smb_ses_list);
 		ses->need_reconnect = true;
-		ses->ipc_tid = 0;
 		list_for_each(tmp2, &ses->tcon_list) {
 			tcon = list_entry(tmp2, struct cifs_tcon, tcon_list);
 			tcon->need_reconnect = true;
 		}
+		if (ses->tcon_ipc)
+			ses->tcon_ipc->need_reconnect = true;
 	}
 	spin_unlock(&cifs_tcp_ses_lock);
 
@@ -405,7 +407,10 @@ cifs_reconnect(struct TCP_Server_Info *server)
 
 		/* we should try only the port we connected to before */
 		mutex_lock(&server->srv_mutex);
-		rc = generic_ip_connect(server);
+		if (cifs_rdma_enabled(server))
+			rc = smbd_reconnect(server);
+		else
+			rc = generic_ip_connect(server);
 		if (rc) {
 			cifs_dbg(FYI, "reconnect error %d\n", rc);
 			mutex_unlock(&server->srv_mutex);
@@ -538,8 +543,10 @@ cifs_readv_from_socket(struct TCP_Server_Info *server, struct msghdr *smb_msg)
 
 		if (server_unresponsive(server))
 			return -ECONNABORTED;
-
-		length = sock_recvmsg(server->ssocket, smb_msg, 0);
+		if (cifs_rdma_enabled(server) && server->smbd_conn)
+			length = smbd_recv(server->smbd_conn, smb_msg);
+		else
+			length = sock_recvmsg(server->ssocket, smb_msg, 0);
 
 		if (server->tcpStatus == CifsExiting)
 			return -ESHUTDOWN;
@@ -700,7 +707,10 @@ static void clean_demultiplex_info(struct TCP_Server_Info *server)
 	wake_up_all(&server->request_q);
 	/* give those requests time to exit */
 	msleep(125);
-
+	if (cifs_rdma_enabled(server) && server->smbd_conn) {
+		smbd_destroy(server->smbd_conn);
+		server->smbd_conn = NULL;
+	}
 	if (server->ssocket) {
 		sock_release(server->ssocket);
 		server->ssocket = NULL;
@@ -1550,6 +1560,9 @@ cifs_parse_mount_options(const char *mountdata, const char *devname,
 		case Opt_domainauto:
 			vol->domainauto = true;
 			break;
+		case Opt_rdma:
+			vol->rdma = true;
+			break;
 
 		/* Numeric Values */
 		case Opt_backupuid:
@@ -1707,7 +1720,7 @@ cifs_parse_mount_options(const char *mountdata, const char *devname,
 			tmp_end++;
 			if (!(tmp_end < end && tmp_end[1] == delim)) {
 				/* No it is not. Set the password to NULL */
-				kfree(vol->password);
+				kzfree(vol->password);
 				vol->password = NULL;
 				break;
 			}
@@ -1745,7 +1758,7 @@ cifs_parse_mount_options(const char *mountdata, const char *devname,
 					options = end;
 			}
 
-			kfree(vol->password);
+			kzfree(vol->password);
 			/* Now build new password string */
 			temp_len = strlen(value);
 			vol->password = kzalloc(temp_len+1, GFP_KERNEL);
@@ -1951,6 +1964,19 @@ cifs_parse_mount_options(const char *mountdata, const char *devname,
 		goto cifs_parse_mount_err;
 	}
 
+	if (vol->rdma && vol->vals->protocol_id < SMB30_PROT_ID) {
+		cifs_dbg(VFS, "SMB Direct requires Version >=3.0\n");
+		goto cifs_parse_mount_err;
+	}
+
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	if (vol->rdma && vol->sign) {
+		cifs_dbg(VFS, "Currently SMB direct doesn't support signing."
+			" This is being fixed\n");
+		goto cifs_parse_mount_err;
+	}
+#endif
+
 #ifndef CONFIG_KEYS
 	/* Muliuser mounts require CONFIG_KEYS support */
 	if (vol->multiuser) {
@@ -2162,6 +2188,9 @@ static int match_server(struct TCP_Server_Info *server, struct smb_vol *vol)
 	if (server->echo_interval != vol->echo_interval * HZ)
 		return 0;
 
+	if (server->rdma != vol->rdma)
+		return 0;
+
 	return 1;
 }
 
@@ -2260,6 +2289,7 @@ cifs_get_tcp_session(struct smb_vol *volume_info)
 	tcp_ses->noblocksnd = volume_info->noblocksnd;
 	tcp_ses->noautotune = volume_info->noautotune;
 	tcp_ses->tcp_nodelay = volume_info->sockopt_tcp_nodelay;
+	tcp_ses->rdma = volume_info->rdma;
 	tcp_ses->in_flight = 0;
 	tcp_ses->credits = 1;
 	init_waitqueue_head(&tcp_ses->response_q);
@@ -2297,13 +2327,29 @@ cifs_get_tcp_session(struct smb_vol *volume_info)
 		tcp_ses->echo_interval = volume_info->echo_interval * HZ;
 	else
 		tcp_ses->echo_interval = SMB_ECHO_INTERVAL_DEFAULT * HZ;
-
+	if (tcp_ses->rdma) {
+#ifndef CONFIG_CIFS_SMB_DIRECT
+		cifs_dbg(VFS, "CONFIG_CIFS_SMB_DIRECT is not enabled\n");
+		rc = -ENOENT;
+		goto out_err_crypto_release;
+#endif
+		tcp_ses->smbd_conn = smbd_get_connection(
+			tcp_ses, (struct sockaddr *)&volume_info->dstaddr);
+		if (tcp_ses->smbd_conn) {
+			cifs_dbg(VFS, "RDMA transport established\n");
+			rc = 0;
+			goto smbd_connected;
+		} else {
+			rc = -ENOENT;
+			goto out_err_crypto_release;
+		}
+	}
 	rc = ip_connect(tcp_ses);
 	if (rc < 0) {
 		cifs_dbg(VFS, "Error connecting to socket. Aborting operation.\n");
 		goto out_err_crypto_release;
 	}
-
+smbd_connected:
 	/*
 	 * since we're in a cifs function already, we know that
 	 * this will succeed. No need for try_module_get().
@@ -2381,6 +2427,93 @@ static int match_session(struct cifs_ses *ses, struct smb_vol *vol)
 	return 1;
 }
 
+/**
+ * cifs_setup_ipc - helper to setup the IPC tcon for the session
+ *
+ * A new IPC connection is made and stored in the session
+ * tcon_ipc. The IPC tcon has the same lifetime as the session.
+ */
+static int
+cifs_setup_ipc(struct cifs_ses *ses, struct smb_vol *volume_info)
+{
+	int rc = 0, xid;
+	struct cifs_tcon *tcon;
+	struct nls_table *nls_codepage;
+	char unc[SERVER_NAME_LENGTH + sizeof("//x/IPC$")] = {0};
+	bool seal = false;
+
+	/*
+	 * If the mount request that resulted in the creation of the
+	 * session requires encryption, force IPC to be encrypted too.
+	 */
+	if (volume_info->seal) {
+		if (ses->server->capabilities & SMB2_GLOBAL_CAP_ENCRYPTION)
+			seal = true;
+		else {
+			cifs_dbg(VFS,
+				 "IPC: server doesn't support encryption\n");
+			return -EOPNOTSUPP;
+		}
+	}
+
+	tcon = tconInfoAlloc();
+	if (tcon == NULL)
+		return -ENOMEM;
+
+	snprintf(unc, sizeof(unc), "\\\\%s\\IPC$", ses->serverName);
+
+	/* cannot fail */
+	nls_codepage = load_nls_default();
+
+	xid = get_xid();
+	tcon->ses = ses;
+	tcon->ipc = true;
+	tcon->seal = seal;
+	rc = ses->server->ops->tree_connect(xid, ses, unc, tcon, nls_codepage);
+	free_xid(xid);
+
+	if (rc) {
+		cifs_dbg(VFS, "failed to connect to IPC (rc=%d)\n", rc);
+		tconInfoFree(tcon);
+		goto out;
+	}
+
+	cifs_dbg(FYI, "IPC tcon rc = %d ipc tid = %d\n", rc, tcon->tid);
+
+	ses->tcon_ipc = tcon;
+out:
+	unload_nls(nls_codepage);
+	return rc;
+}
+
+/**
+ * cifs_free_ipc - helper to release the session IPC tcon
+ *
+ * Needs to be called everytime a session is destroyed
+ */
+static int
+cifs_free_ipc(struct cifs_ses *ses)
+{
+	int rc = 0, xid;
+	struct cifs_tcon *tcon = ses->tcon_ipc;
+
+	if (tcon == NULL)
+		return 0;
+
+	if (ses->server->ops->tree_disconnect) {
+		xid = get_xid();
+		rc = ses->server->ops->tree_disconnect(xid, tcon);
+		free_xid(xid);
+	}
+
+	if (rc)
+		cifs_dbg(FYI, "failed to disconnect IPC tcon (rc=%d)\n", rc);
+
+	tconInfoFree(tcon);
+	ses->tcon_ipc = NULL;
+	return rc;
+}
+
 static struct cifs_ses *
 cifs_find_smb_ses(struct TCP_Server_Info *server, struct smb_vol *vol)
 {
@@ -2421,6 +2554,8 @@ cifs_put_smb_ses(struct cifs_ses *ses)
 		ses->status = CifsExiting;
 	spin_unlock(&cifs_tcp_ses_lock);
 
+	cifs_free_ipc(ses);
+
 	if (ses->status == CifsExiting && server->ops->logoff) {
 		xid = get_xid();
 		rc = server->ops->logoff(xid, ses);
@@ -2569,6 +2704,13 @@ cifs_set_cifscreds(struct smb_vol *vol __attribute__((unused)),
 }
 #endif /* CONFIG_KEYS */
 
+/**
+ * cifs_get_smb_ses - get a session matching @volume_info data from @server
+ *
+ * This function assumes it is being called from cifs_mount() where we
+ * already got a server reference (server refcount +1). See
+ * cifs_get_tcon() for refcount explanations.
+ */
 static struct cifs_ses *
 cifs_get_smb_ses(struct TCP_Server_Info *server, struct smb_vol *volume_info)
 {
@@ -2665,6 +2807,9 @@ cifs_get_smb_ses(struct TCP_Server_Info *server, struct smb_vol *volume_info)
 	spin_unlock(&cifs_tcp_ses_lock);
 
 	free_xid(xid);
+
+	cifs_setup_ipc(ses, volume_info);
+
 	return ses;
 
 get_ses_fail:
@@ -2709,8 +2854,16 @@ void
 cifs_put_tcon(struct cifs_tcon *tcon)
 {
 	unsigned int xid;
-	struct cifs_ses *ses = tcon->ses;
+	struct cifs_ses *ses;
 
+	/*
+	 * IPC tcon share the lifetime of their session and are
+	 * destroyed in the session put function
+	 */
+	if (tcon == NULL || tcon->ipc)
+		return;
+
+	ses = tcon->ses;
 	cifs_dbg(FYI, "%s: tc_count=%d\n", __func__, tcon->tc_count);
 	spin_lock(&cifs_tcp_ses_lock);
 	if (--tcon->tc_count > 0) {
@@ -2731,6 +2884,26 @@ cifs_put_tcon(struct cifs_tcon *tcon)
 	cifs_put_smb_ses(ses);
 }
 
+/**
+ * cifs_get_tcon - get a tcon matching @volume_info data from @ses
+ *
+ * - tcon refcount is the number of mount points using the tcon.
+ * - ses refcount is the number of tcon using the session.
+ *
+ * 1. This function assumes it is being called from cifs_mount() where
+ *    we already got a session reference (ses refcount +1).
+ *
+ * 2. Since we're in the context of adding a mount point, the end
+ *    result should be either:
+ *
+ * a) a new tcon already allocated with refcount=1 (1 mount point) and
+ *    its session refcount incremented (1 new tcon). This +1 was
+ *    already done in (1).
+ *
+ * b) an existing tcon with refcount+1 (add a mount point to it) and
+ *    identical ses refcount (no new tcon). Because of (1) we need to
+ *    decrement the ses refcount.
+ */
 static struct cifs_tcon *
 cifs_get_tcon(struct cifs_ses *ses, struct smb_vol *volume_info)
 {
@@ -2739,8 +2912,11 @@ cifs_get_tcon(struct cifs_ses *ses, struct smb_vol *volume_info)
 
 	tcon = cifs_find_tcon(ses, volume_info);
 	if (tcon) {
+		/*
+		 * tcon has refcount already incremented but we need to
+		 * decrement extra ses reference gotten by caller (case b)
+		 */
 		cifs_dbg(FYI, "Found match on UNC path\n");
-		/* existing tcon already has a reference */
 		cifs_put_smb_ses(ses);
 		return tcon;
 	}
@@ -2986,39 +3162,17 @@ get_dfs_path(const unsigned int xid, struct cifs_ses *ses, const char *old_path,
 	     const struct nls_table *nls_codepage, unsigned int *num_referrals,
 	     struct dfs_info3_param **referrals, int remap)
 {
-	char *temp_unc;
 	int rc = 0;
 
-	if (!ses->server->ops->tree_connect || !ses->server->ops->get_dfs_refer)
+	if (!ses->server->ops->get_dfs_refer)
 		return -ENOSYS;
 
 	*num_referrals = 0;
 	*referrals = NULL;
 
-	if (ses->ipc_tid == 0) {
-		temp_unc = kmalloc(2 /* for slashes */ +
-			strnlen(ses->serverName, SERVER_NAME_LEN_WITH_NULL * 2)
-				+ 1 + 4 /* slash IPC$ */ + 2, GFP_KERNEL);
-		if (temp_unc == NULL)
-			return -ENOMEM;
-		temp_unc[0] = '\\';
-		temp_unc[1] = '\\';
-		strcpy(temp_unc + 2, ses->serverName);
-		strcpy(temp_unc + 2 + strlen(ses->serverName), "\\IPC$");
-		rc = ses->server->ops->tree_connect(xid, ses, temp_unc, NULL,
-						    nls_codepage);
-		cifs_dbg(FYI, "Tcon rc = %d ipc_tid = %d\n", rc, ses->ipc_tid);
-		kfree(temp_unc);
-	}
-	if (rc == 0)
-		rc = ses->server->ops->get_dfs_refer(xid, ses, old_path,
-						     referrals, num_referrals,
-						     nls_codepage, remap);
-	/*
-	 * BB - map targetUNCs to dfs_info3 structures, here or in
-	 * ses->server->ops->get_dfs_refer.
-	 */
-
+	rc = ses->server->ops->get_dfs_refer(xid, ses, old_path,
+					     referrals, num_referrals,
+					     nls_codepage, remap);
 	return rc;
 }
 
@@ -3783,7 +3937,7 @@ cifs_mount(struct cifs_sb_info *cifs_sb, struct smb_vol *volume_info)
 		tcon->unix_ext = 0; /* server does not support them */
 
 	/* do not care if a following call succeed - informational */
-	if (!tcon->ipc && server->ops->qfs_tcon)
+	if (!tcon->pipe && server->ops->qfs_tcon)
 		server->ops->qfs_tcon(xid, tcon);
 
 	cifs_sb->wsize = server->ops->negotiate_wsize(tcon, volume_info);
@@ -3913,8 +4067,7 @@ cifs_mount(struct cifs_sb_info *cifs_sb, struct smb_vol *volume_info)
 }
 
 /*
- * Issue a TREE_CONNECT request. Note that for IPC$ shares, that the tcon
- * pointer may be NULL.
+ * Issue a TREE_CONNECT request.
  */
 int
 CIFSTCon(const unsigned int xid, struct cifs_ses *ses,
@@ -3950,7 +4103,7 @@ CIFSTCon(const unsigned int xid, struct cifs_ses *ses,
 	pSMB->AndXCommand = 0xFF;
 	pSMB->Flags = cpu_to_le16(TCON_EXTENDED_SECINFO);
 	bcc_ptr = &pSMB->Password[0];
-	if (!tcon || (ses->server->sec_mode & SECMODE_USER)) {
+	if (tcon->pipe || (ses->server->sec_mode & SECMODE_USER)) {
 		pSMB->PasswordLength = cpu_to_le16(1);	/* minimum */
 		*bcc_ptr = 0; /* password is null byte */
 		bcc_ptr++;              /* skip password */
@@ -4022,7 +4175,7 @@ CIFSTCon(const unsigned int xid, struct cifs_ses *ses,
 			 0);
 
 	/* above now done in SendReceive */
-	if ((rc == 0) && (tcon != NULL)) {
+	if (rc == 0) {
 		bool is_unicode;
 
 		tcon->tidStatus = CifsGood;
@@ -4042,7 +4195,8 @@ CIFSTCon(const unsigned int xid, struct cifs_ses *ses,
 			if ((bcc_ptr[0] == 'I') && (bcc_ptr[1] == 'P') &&
 			    (bcc_ptr[2] == 'C')) {
 				cifs_dbg(FYI, "IPC connection\n");
-				tcon->ipc = 1;
+				tcon->ipc = true;
+				tcon->pipe = true;
 			}
 		} else if (length == 2) {
 			if ((bcc_ptr[0] == 'A') && (bcc_ptr[1] == ':')) {
@@ -4069,9 +4223,6 @@ CIFSTCon(const unsigned int xid, struct cifs_ses *ses,
 		else
 			tcon->Flags = 0;
 		cifs_dbg(FYI, "Tcon flags: 0x%x\n", tcon->Flags);
-	} else if ((rc == 0) && tcon == NULL) {
-		/* all we need to save for IPC$ connection */
-		ses->ipc_tid = smb_buffer_response->Tid;
 	}
 
 	cifs_buf_release(smb_buffer);
@@ -4235,7 +4386,7 @@ cifs_construct_tcon(struct cifs_sb_info *cifs_sb, kuid_t fsuid)
 		reset_cifs_unix_caps(0, tcon, NULL, vol_info);
 out:
 	kfree(vol_info->username);
-	kfree(vol_info->password);
+	kzfree(vol_info->password);
 	kfree(vol_info);
 
 	return tcon;
@@ -4387,7 +4538,7 @@ cifs_prune_tlinks(struct work_struct *work)
 	struct cifs_sb_info *cifs_sb = container_of(work, struct cifs_sb_info,
 						    prune_tlinks.work);
 	struct rb_root *root = &cifs_sb->tlink_tree;
-	struct rb_node *node = rb_first(root);
+	struct rb_node *node;
 	struct rb_node *tmp;
 	struct tcon_link *tlink;
 
diff --git a/fs/cifs/file.c b/fs/cifs/file.c
index df9f682..7cee97b 100644
--- a/fs/cifs/file.c
+++ b/fs/cifs/file.c
@@ -42,7 +42,7 @@
 #include "cifs_debug.h"
 #include "cifs_fs_sb.h"
 #include "fscache.h"
-
+#include "smbdirect.h"
 
 static inline int cifs_convert_flags(unsigned int flags)
 {
@@ -2902,7 +2902,12 @@ cifs_readdata_release(struct kref *refcount)
 {
 	struct cifs_readdata *rdata = container_of(refcount,
 					struct cifs_readdata, refcount);
-
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	if (rdata->mr) {
+		smbd_deregister_mr(rdata->mr);
+		rdata->mr = NULL;
+	}
+#endif
 	if (rdata->cfile)
 		cifsFileInfo_put(rdata->cfile);
 
@@ -3031,6 +3036,10 @@ uncached_fill_pages(struct TCP_Server_Info *server,
 		}
 		if (iter)
 			result = copy_page_from_iter(page, 0, n, iter);
+#ifdef CONFIG_CIFS_SMB_DIRECT
+		else if (rdata->mr)
+			result = n;
+#endif
 		else
 			result = cifs_read_page_from_socket(server, page, n);
 		if (result < 0)
@@ -3471,20 +3480,18 @@ static const struct vm_operations_struct cifs_file_vm_ops = {
 
 int cifs_file_strict_mmap(struct file *file, struct vm_area_struct *vma)
 {
-	int rc, xid;
+	int xid, rc = 0;
 	struct inode *inode = file_inode(file);
 
 	xid = get_xid();
 
-	if (!CIFS_CACHE_READ(CIFS_I(inode))) {
+	if (!CIFS_CACHE_READ(CIFS_I(inode)))
 		rc = cifs_zap_mapping(inode);
-		if (rc)
-			return rc;
-	}
-
-	rc = generic_file_mmap(file, vma);
-	if (rc == 0)
+	if (!rc)
+		rc = generic_file_mmap(file, vma);
+	if (!rc)
 		vma->vm_ops = &cifs_file_vm_ops;
+
 	free_xid(xid);
 	return rc;
 }
@@ -3494,16 +3501,16 @@ int cifs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	int rc, xid;
 
 	xid = get_xid();
+
 	rc = cifs_revalidate_file(file);
-	if (rc) {
+	if (rc)
 		cifs_dbg(FYI, "Validation prior to mmap failed, error=%d\n",
 			 rc);
-		free_xid(xid);
-		return rc;
-	}
-	rc = generic_file_mmap(file, vma);
-	if (rc == 0)
+	if (!rc)
+		rc = generic_file_mmap(file, vma);
+	if (!rc)
 		vma->vm_ops = &cifs_file_vm_ops;
+
 	free_xid(xid);
 	return rc;
 }
@@ -3600,6 +3607,10 @@ readpages_fill_pages(struct TCP_Server_Info *server,
 
 		if (iter)
 			result = copy_page_from_iter(page, 0, n, iter);
+#ifdef CONFIG_CIFS_SMB_DIRECT
+		else if (rdata->mr)
+			result = n;
+#endif
 		else
 			result = cifs_read_page_from_socket(server, page, n);
 		if (result < 0)
diff --git a/fs/cifs/inode.c b/fs/cifs/inode.c
index ecb9907..8f9a8cc 100644
--- a/fs/cifs/inode.c
+++ b/fs/cifs/inode.c
@@ -1049,7 +1049,7 @@ struct inode *cifs_root_iget(struct super_block *sb)
 	tcon->resource_id = CIFS_I(inode)->uniqueid;
 #endif
 
-	if (rc && tcon->ipc) {
+	if (rc && tcon->pipe) {
 		cifs_dbg(FYI, "ipc connection - fake read inode\n");
 		spin_lock(&inode->i_lock);
 		inode->i_mode |= S_IFDIR;
diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c
index eea93ac..a0dbced 100644
--- a/fs/cifs/misc.c
+++ b/fs/cifs/misc.c
@@ -98,14 +98,11 @@ sesInfoFree(struct cifs_ses *buf_to_free)
 	kfree(buf_to_free->serverOS);
 	kfree(buf_to_free->serverDomain);
 	kfree(buf_to_free->serverNOS);
-	if (buf_to_free->password) {
-		memset(buf_to_free->password, 0, strlen(buf_to_free->password));
-		kfree(buf_to_free->password);
-	}
+	kzfree(buf_to_free->password);
 	kfree(buf_to_free->user_name);
 	kfree(buf_to_free->domainName);
-	kfree(buf_to_free->auth_key.response);
-	kfree(buf_to_free);
+	kzfree(buf_to_free->auth_key.response);
+	kzfree(buf_to_free);
 }
 
 struct cifs_tcon *
@@ -136,10 +133,7 @@ tconInfoFree(struct cifs_tcon *buf_to_free)
 	}
 	atomic_dec(&tconInfoAllocCount);
 	kfree(buf_to_free->nativeFileSystem);
-	if (buf_to_free->password) {
-		memset(buf_to_free->password, 0, strlen(buf_to_free->password));
-		kfree(buf_to_free->password);
-	}
+	kzfree(buf_to_free->password);
 	kfree(buf_to_free);
 }
 
diff --git a/fs/cifs/smb1ops.c b/fs/cifs/smb1ops.c
index a723df3e..3d495e4 100644
--- a/fs/cifs/smb1ops.c
+++ b/fs/cifs/smb1ops.c
@@ -87,9 +87,11 @@ cifs_read_data_offset(char *buf)
 }
 
 static unsigned int
-cifs_read_data_length(char *buf)
+cifs_read_data_length(char *buf, bool in_remaining)
 {
 	READ_RSP *rsp = (READ_RSP *)buf;
+	/* It's a bug reading remaining data for SMB1 packets */
+	WARN_ON(in_remaining);
 	return (le16_to_cpu(rsp->DataLengthHigh) << 16) +
 	       le16_to_cpu(rsp->DataLength);
 }
diff --git a/fs/cifs/smb2file.c b/fs/cifs/smb2file.c
index b4b1f03..12af5db 100644
--- a/fs/cifs/smb2file.c
+++ b/fs/cifs/smb2file.c
@@ -74,7 +74,7 @@ smb2_open_file(const unsigned int xid, struct cifs_open_parms *oparms,
 		nr_ioctl_req.Reserved = 0;
 		rc = SMB2_ioctl(xid, oparms->tcon, fid->persistent_fid,
 			fid->volatile_fid, FSCTL_LMR_REQUEST_RESILIENCY,
-			true /* is_fsctl */, false /* use_ipc */,
+			true /* is_fsctl */,
 			(char *)&nr_ioctl_req, sizeof(nr_ioctl_req),
 			NULL, NULL /* no return info */);
 		if (rc == -EOPNOTSUPP) {
diff --git a/fs/cifs/smb2misc.c b/fs/cifs/smb2misc.c
index 7b08a14..76d03ab 100644
--- a/fs/cifs/smb2misc.c
+++ b/fs/cifs/smb2misc.c
@@ -578,7 +578,7 @@ smb2_is_valid_lease_break(char *buffer)
 bool
 smb2_is_valid_oplock_break(char *buffer, struct TCP_Server_Info *server)
 {
-	struct smb2_oplock_break *rsp = (struct smb2_oplock_break *)buffer;
+	struct smb2_oplock_break_rsp *rsp = (struct smb2_oplock_break_rsp *)buffer;
 	struct list_head *tmp, *tmp1, *tmp2;
 	struct cifs_ses *ses;
 	struct cifs_tcon *tcon;
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c
index ed88ab8a..eb68e2f 100644
--- a/fs/cifs/smb2ops.c
+++ b/fs/cifs/smb2ops.c
@@ -32,6 +32,7 @@
 #include "smb2status.h"
 #include "smb2glob.h"
 #include "cifs_ioctl.h"
+#include "smbdirect.h"
 
 static int
 change_conf(struct TCP_Server_Info *server)
@@ -250,7 +251,11 @@ smb2_negotiate_wsize(struct cifs_tcon *tcon, struct smb_vol *volume_info)
 	/* start with specified wsize, or default */
 	wsize = volume_info->wsize ? volume_info->wsize : CIFS_DEFAULT_IOSIZE;
 	wsize = min_t(unsigned int, wsize, server->max_write);
-
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	if (server->rdma)
+		wsize = min_t(unsigned int,
+				wsize, server->smbd_conn->max_readwrite_size);
+#endif
 	if (!(server->capabilities & SMB2_GLOBAL_CAP_LARGE_MTU))
 		wsize = min_t(unsigned int, wsize, SMB2_MAX_BUFFER_SIZE);
 
@@ -266,6 +271,11 @@ smb2_negotiate_rsize(struct cifs_tcon *tcon, struct smb_vol *volume_info)
 	/* start with specified rsize, or default */
 	rsize = volume_info->rsize ? volume_info->rsize : CIFS_DEFAULT_IOSIZE;
 	rsize = min_t(unsigned int, rsize, server->max_read);
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	if (server->rdma)
+		rsize = min_t(unsigned int,
+				rsize, server->smbd_conn->max_readwrite_size);
+#endif
 
 	if (!(server->capabilities & SMB2_GLOBAL_CAP_LARGE_MTU))
 		rsize = min_t(unsigned int, rsize, SMB2_MAX_BUFFER_SIZE);
@@ -283,7 +293,6 @@ SMB3_request_interfaces(const unsigned int xid, struct cifs_tcon *tcon)
 
 	rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
 			FSCTL_QUERY_NETWORK_INTERFACE_INFO, true /* is_fsctl */,
-			false /* use_ipc */,
 			NULL /* no data input */, 0 /* no data input */,
 			(char **)&out_buf, &ret_data_len);
 	if (rc != 0)
@@ -782,7 +791,6 @@ SMB2_request_res_key(const unsigned int xid, struct cifs_tcon *tcon,
 
 	rc = SMB2_ioctl(xid, tcon, persistent_fid, volatile_fid,
 			FSCTL_SRV_REQUEST_RESUME_KEY, true /* is_fsctl */,
-			false /* use_ipc */,
 			NULL, 0 /* no input */,
 			(char **)&res_key, &ret_data_len);
 
@@ -848,8 +856,7 @@ smb2_copychunk_range(const unsigned int xid,
 		/* Request server copy to target from src identified by key */
 		rc = SMB2_ioctl(xid, tcon, trgtfile->fid.persistent_fid,
 			trgtfile->fid.volatile_fid, FSCTL_SRV_COPYCHUNK_WRITE,
-			true /* is_fsctl */, false /* use_ipc */,
-			(char *)pcchunk,
+			true /* is_fsctl */, (char *)pcchunk,
 			sizeof(struct copychunk_ioctl),	(char **)&retbuf,
 			&ret_data_len);
 		if (rc == 0) {
@@ -947,9 +954,13 @@ smb2_read_data_offset(char *buf)
 }
 
 static unsigned int
-smb2_read_data_length(char *buf)
+smb2_read_data_length(char *buf, bool in_remaining)
 {
 	struct smb2_read_rsp *rsp = (struct smb2_read_rsp *)buf;
+
+	if (in_remaining)
+		return le32_to_cpu(rsp->DataRemaining);
+
 	return le32_to_cpu(rsp->DataLength);
 }
 
@@ -1006,7 +1017,7 @@ static bool smb2_set_sparse(const unsigned int xid, struct cifs_tcon *tcon,
 
 	rc = SMB2_ioctl(xid, tcon, cfile->fid.persistent_fid,
 			cfile->fid.volatile_fid, FSCTL_SET_SPARSE,
-			true /* is_fctl */, false /* use_ipc */,
+			true /* is_fctl */,
 			&setsparse, 1, NULL, NULL);
 	if (rc) {
 		tcon->broken_sparse_sup = true;
@@ -1077,7 +1088,7 @@ smb2_duplicate_extents(const unsigned int xid,
 	rc = SMB2_ioctl(xid, tcon, trgtfile->fid.persistent_fid,
 			trgtfile->fid.volatile_fid,
 			FSCTL_DUPLICATE_EXTENTS_TO_FILE,
-			true /* is_fsctl */, false /* use_ipc */,
+			true /* is_fsctl */,
 			(char *)&dup_ext_buf,
 			sizeof(struct duplicate_extents_to_file),
 			NULL,
@@ -1112,7 +1123,7 @@ smb3_set_integrity(const unsigned int xid, struct cifs_tcon *tcon,
 	return SMB2_ioctl(xid, tcon, cfile->fid.persistent_fid,
 			cfile->fid.volatile_fid,
 			FSCTL_SET_INTEGRITY_INFORMATION,
-			true /* is_fsctl */, false /* use_ipc */,
+			true /* is_fsctl */,
 			(char *)&integr_info,
 			sizeof(struct fsctl_set_integrity_information_req),
 			NULL,
@@ -1132,7 +1143,7 @@ smb3_enum_snapshots(const unsigned int xid, struct cifs_tcon *tcon,
 	rc = SMB2_ioctl(xid, tcon, cfile->fid.persistent_fid,
 			cfile->fid.volatile_fid,
 			FSCTL_SRV_ENUMERATE_SNAPSHOTS,
-			true /* is_fsctl */, false /* use_ipc */,
+			true /* is_fsctl */,
 			NULL, 0 /* no input data */,
 			(char **)&retbuf,
 			&ret_data_len);
@@ -1351,16 +1362,20 @@ smb2_get_dfs_refer(const unsigned int xid, struct cifs_ses *ses,
 	cifs_dbg(FYI, "smb2_get_dfs_refer path <%s>\n", search_name);
 
 	/*
-	 * Use any tcon from the current session. Here, the first one.
+	 * Try to use the IPC tcon, otherwise just use any
 	 */
-	spin_lock(&cifs_tcp_ses_lock);
-	tcon = list_first_entry_or_null(&ses->tcon_list, struct cifs_tcon,
-					tcon_list);
-	if (tcon)
-		tcon->tc_count++;
-	spin_unlock(&cifs_tcp_ses_lock);
+	tcon = ses->tcon_ipc;
+	if (tcon == NULL) {
+		spin_lock(&cifs_tcp_ses_lock);
+		tcon = list_first_entry_or_null(&ses->tcon_list,
+						struct cifs_tcon,
+						tcon_list);
+		if (tcon)
+			tcon->tc_count++;
+		spin_unlock(&cifs_tcp_ses_lock);
+	}
 
-	if (!tcon) {
+	if (tcon == NULL) {
 		cifs_dbg(VFS, "session %p has no tcon available for a dfs referral request\n",
 			 ses);
 		rc = -ENOTCONN;
@@ -1389,20 +1404,11 @@ smb2_get_dfs_refer(const unsigned int xid, struct cifs_ses *ses,
 	memcpy(dfs_req->RequestFileName, utf16_path, utf16_path_len);
 
 	do {
-		/* try first with IPC */
 		rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
 				FSCTL_DFS_GET_REFERRALS,
-				true /* is_fsctl */, true /* use_ipc */,
+				true /* is_fsctl */,
 				(char *)dfs_req, dfs_req_size,
 				(char **)&dfs_rsp, &dfs_rsp_size);
-		if (rc == -ENOTCONN) {
-			/* try with normal tcon */
-			rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
-					FSCTL_DFS_GET_REFERRALS,
-					true /* is_fsctl */, false /*use_ipc*/,
-					(char *)dfs_req, dfs_req_size,
-					(char **)&dfs_rsp, &dfs_rsp_size);
-		}
 	} while (rc == -EAGAIN);
 
 	if (rc) {
@@ -1421,7 +1427,8 @@ smb2_get_dfs_refer(const unsigned int xid, struct cifs_ses *ses,
 	}
 
  out:
-	if (tcon) {
+	if (tcon && !tcon->ipc) {
+		/* ipc tcons are not refcounted */
 		spin_lock(&cifs_tcp_ses_lock);
 		tcon->tc_count--;
 		spin_unlock(&cifs_tcp_ses_lock);
@@ -1713,8 +1720,7 @@ static long smb3_zero_range(struct file *file, struct cifs_tcon *tcon,
 
 	rc = SMB2_ioctl(xid, tcon, cfile->fid.persistent_fid,
 			cfile->fid.volatile_fid, FSCTL_SET_ZERO_DATA,
-			true /* is_fctl */, false /* use_ipc */,
-			(char *)&fsctl_buf,
+			true /* is_fctl */, (char *)&fsctl_buf,
 			sizeof(struct file_zero_data_information), NULL, NULL);
 	free_xid(xid);
 	return rc;
@@ -1748,8 +1754,7 @@ static long smb3_punch_hole(struct file *file, struct cifs_tcon *tcon,
 
 	rc = SMB2_ioctl(xid, tcon, cfile->fid.persistent_fid,
 			cfile->fid.volatile_fid, FSCTL_SET_ZERO_DATA,
-			true /* is_fctl */, false /* use_ipc */,
-			(char *)&fsctl_buf,
+			true /* is_fctl */, (char *)&fsctl_buf,
 			sizeof(struct file_zero_data_information), NULL, NULL);
 	free_xid(xid);
 	return rc;
@@ -2411,6 +2416,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid,
 	struct iov_iter iter;
 	struct kvec iov;
 	int length;
+	bool use_rdma_mr = false;
 
 	if (shdr->Command != SMB2_READ) {
 		cifs_dbg(VFS, "only big read responses are supported\n");
@@ -2437,7 +2443,10 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid,
 	}
 
 	data_offset = server->ops->read_data_offset(buf) + 4;
-	data_len = server->ops->read_data_length(buf);
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	use_rdma_mr = rdata->mr;
+#endif
+	data_len = server->ops->read_data_length(buf, use_rdma_mr);
 
 	if (data_offset < server->vals->read_rsp_size) {
 		/*
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c
index 01346b8..63778ac 100644
--- a/fs/cifs/smb2pdu.c
+++ b/fs/cifs/smb2pdu.c
@@ -48,6 +48,7 @@
 #include "smb2glob.h"
 #include "cifspdu.h"
 #include "cifs_spnego.h"
+#include "smbdirect.h"
 
 /*
  *  The following table defines the expected "StructureSize" of SMB2 requests
@@ -319,13 +320,16 @@ fill_small_buf(__le16 smb2_command, struct cifs_tcon *tcon, void *buf,
 	*total_len = parmsize + sizeof(struct smb2_sync_hdr);
 }
 
-/* init request without RFC1001 length at the beginning */
+/*
+ * Allocate and return pointer to an SMB request hdr, and set basic
+ * SMB information in the SMB header. If the return code is zero, this
+ * function must have filled in request_buf pointer.
+ */
 static int
 smb2_plain_req_init(__le16 smb2_command, struct cifs_tcon *tcon,
 		    void **request_buf, unsigned int *total_len)
 {
 	int rc;
-	struct smb2_sync_hdr *shdr;
 
 	rc = smb2_reconnect(smb2_command, tcon);
 	if (rc)
@@ -338,53 +342,9 @@ smb2_plain_req_init(__le16 smb2_command, struct cifs_tcon *tcon,
 		return -ENOMEM;
 	}
 
-	shdr = (struct smb2_sync_hdr *)(*request_buf);
-
-	fill_small_buf(smb2_command, tcon, shdr, total_len);
-
-	if (tcon != NULL) {
-#ifdef CONFIG_CIFS_STATS2
-		uint16_t com_code = le16_to_cpu(smb2_command);
-
-		cifs_stats_inc(&tcon->stats.smb2_stats.smb2_com_sent[com_code]);
-#endif
-		cifs_stats_inc(&tcon->num_smbs_sent);
-	}
-
-	return rc;
-}
-
-/*
- * Allocate and return pointer to an SMB request hdr, and set basic
- * SMB information in the SMB header. If the return code is zero, this
- * function must have filled in request_buf pointer. The returned buffer
- * has RFC1001 length at the beginning.
- */
-static int
-small_smb2_init(__le16 smb2_command, struct cifs_tcon *tcon,
-		void **request_buf)
-{
-	int rc;
-	unsigned int total_len;
-	struct smb2_pdu *pdu;
-
-	rc = smb2_reconnect(smb2_command, tcon);
-	if (rc)
-		return rc;
-
-	/* BB eventually switch this to SMB2 specific small buf size */
-	*request_buf = cifs_small_buf_get();
-	if (*request_buf == NULL) {
-		/* BB should we add a retry in here if not a writepage? */
-		return -ENOMEM;
-	}
-
-	pdu = (struct smb2_pdu *)(*request_buf);
-
-	fill_small_buf(smb2_command, tcon, get_sync_hdr(pdu), &total_len);
-
-	/* Note this is only network field converted to big endian */
-	pdu->hdr.smb2_buf_length = cpu_to_be32(total_len);
+	fill_small_buf(smb2_command, tcon,
+		       (struct smb2_sync_hdr *)(*request_buf),
+		       total_len);
 
 	if (tcon != NULL) {
 #ifdef CONFIG_CIFS_STATS2
@@ -398,8 +358,8 @@ small_smb2_init(__le16 smb2_command, struct cifs_tcon *tcon,
 }
 
 #ifdef CONFIG_CIFS_SMB311
-/* offset is sizeof smb2_negotiate_req - 4 but rounded up to 8 bytes */
-#define OFFSET_OF_NEG_CONTEXT 0x68  /* sizeof(struct smb2_negotiate_req) - 4 */
+/* offset is sizeof smb2_negotiate_req but rounded up to 8 bytes */
+#define OFFSET_OF_NEG_CONTEXT 0x68  /* sizeof(struct smb2_negotiate_req) */
 
 
 #define SMB2_PREAUTH_INTEGRITY_CAPABILITIES	cpu_to_le16(1)
@@ -427,23 +387,25 @@ build_encrypt_ctxt(struct smb2_encryption_neg_context *pneg_ctxt)
 }
 
 static void
-assemble_neg_contexts(struct smb2_negotiate_req *req)
+assemble_neg_contexts(struct smb2_negotiate_req *req,
+		      unsigned int *total_len)
 {
-
-	/* +4 is to account for the RFC1001 len field */
-	char *pneg_ctxt = (char *)req + OFFSET_OF_NEG_CONTEXT + 4;
+	char *pneg_ctxt = (char *)req + OFFSET_OF_NEG_CONTEXT;
 
 	build_preauth_ctxt((struct smb2_preauth_neg_context *)pneg_ctxt);
 	/* Add 2 to size to round to 8 byte boundary */
+
 	pneg_ctxt += 2 + sizeof(struct smb2_preauth_neg_context);
 	build_encrypt_ctxt((struct smb2_encryption_neg_context *)pneg_ctxt);
 	req->NegotiateContextOffset = cpu_to_le32(OFFSET_OF_NEG_CONTEXT);
 	req->NegotiateContextCount = cpu_to_le16(2);
-	inc_rfc1001_len(req, 4 + sizeof(struct smb2_preauth_neg_context)
-			+ sizeof(struct smb2_encryption_neg_context)); /* calculate hash */
+
+	*total_len += 4 + sizeof(struct smb2_preauth_neg_context)
+		+ sizeof(struct smb2_encryption_neg_context);
 }
 #else
-static void assemble_neg_contexts(struct smb2_negotiate_req *req)
+static void assemble_neg_contexts(struct smb2_negotiate_req *req,
+				  unsigned int *total_len)
 {
 	return;
 }
@@ -477,6 +439,7 @@ SMB2_negotiate(const unsigned int xid, struct cifs_ses *ses)
 	int blob_offset, blob_length;
 	char *security_blob;
 	int flags = CIFS_NEG_OP;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "Negotiate protocol\n");
 
@@ -485,30 +448,30 @@ SMB2_negotiate(const unsigned int xid, struct cifs_ses *ses)
 		return -EIO;
 	}
 
-	rc = small_smb2_init(SMB2_NEGOTIATE, NULL, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_NEGOTIATE, NULL, (void **) &req, &total_len);
 	if (rc)
 		return rc;
 
-	req->hdr.sync_hdr.SessionId = 0;
+	req->sync_hdr.SessionId = 0;
 
 	if (strcmp(ses->server->vals->version_string,
 		   SMB3ANY_VERSION_STRING) == 0) {
 		req->Dialects[0] = cpu_to_le16(SMB30_PROT_ID);
 		req->Dialects[1] = cpu_to_le16(SMB302_PROT_ID);
 		req->DialectCount = cpu_to_le16(2);
-		inc_rfc1001_len(req, 4);
+		total_len += 4;
 	} else if (strcmp(ses->server->vals->version_string,
 		   SMBDEFAULT_VERSION_STRING) == 0) {
 		req->Dialects[0] = cpu_to_le16(SMB21_PROT_ID);
 		req->Dialects[1] = cpu_to_le16(SMB30_PROT_ID);
 		req->Dialects[2] = cpu_to_le16(SMB302_PROT_ID);
 		req->DialectCount = cpu_to_le16(3);
-		inc_rfc1001_len(req, 6);
+		total_len += 6;
 	} else {
 		/* otherwise send specific dialect */
 		req->Dialects[0] = cpu_to_le16(ses->server->vals->protocol_id);
 		req->DialectCount = cpu_to_le16(1);
-		inc_rfc1001_len(req, 2);
+		total_len += 2;
 	}
 
 	/* only one of SMB2 signing flags may be set in SMB2 request */
@@ -528,13 +491,12 @@ SMB2_negotiate(const unsigned int xid, struct cifs_ses *ses)
 		memcpy(req->ClientGUID, server->client_guid,
 			SMB2_CLIENT_GUID_SIZE);
 		if (ses->server->vals->protocol_id == SMB311_PROT_ID)
-			assemble_neg_contexts(req);
+			assemble_neg_contexts(req, &total_len);
 	}
 	iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field */
-	iov[0].iov_len = get_rfc1002_length(req) + 4;
+	iov[0].iov_len = total_len;
 
-	rc = SendReceive2(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_negotiate_rsp *)rsp_iov.iov_base;
 	/*
@@ -654,6 +616,11 @@ int smb3_validate_negotiate(const unsigned int xid, struct cifs_tcon *tcon)
 
 	cifs_dbg(FYI, "validate negotiate\n");
 
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	if (tcon->ses->server->rdma)
+		return 0;
+#endif
+
 	/*
 	 * validation ioctl must be signed, so no point sending this if we
 	 * can not sign it (ie are not known user).  Even if signing is not
@@ -713,7 +680,6 @@ int smb3_validate_negotiate(const unsigned int xid, struct cifs_tcon *tcon)
 
 	rc = SMB2_ioctl(xid, tcon, NO_FILE_ID, NO_FILE_ID,
 		FSCTL_VALIDATE_NEGOTIATE_INFO, true /* is_fsctl */,
-		false /* use_ipc */,
 		(char *)&vneg_inbuf, sizeof(struct validate_negotiate_info_req),
 		(char **)&pneg_rsp, &rsplen);
 
@@ -733,8 +699,7 @@ int smb3_validate_negotiate(const unsigned int xid, struct cifs_tcon *tcon)
 	}
 
 	/* check validate negotiate info response matches what we got earlier */
-	if (pneg_rsp->Dialect !=
-			cpu_to_le16(tcon->ses->server->vals->protocol_id))
+	if (pneg_rsp->Dialect != cpu_to_le16(tcon->ses->server->dialect))
 		goto vneg_out;
 
 	if (pneg_rsp->SecurityMode != cpu_to_le16(tcon->ses->server->sec_mode))
@@ -806,20 +771,22 @@ SMB2_sess_alloc_buffer(struct SMB2_sess_data *sess_data)
 	struct cifs_ses *ses = sess_data->ses;
 	struct smb2_sess_setup_req *req;
 	struct TCP_Server_Info *server = ses->server;
+	unsigned int total_len;
 
-	rc = small_smb2_init(SMB2_SESSION_SETUP, NULL, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_SESSION_SETUP, NULL, (void **) &req,
+			     &total_len);
 	if (rc)
 		return rc;
 
 	/* First session, not a reauthenticate */
-	req->hdr.sync_hdr.SessionId = 0;
+	req->sync_hdr.SessionId = 0;
 
 	/* if reconnect, we need to send previous sess id, otherwise it is 0 */
 	req->PreviousSessionId = sess_data->previous_session;
 
 	req->Flags = 0; /* MBZ */
 	/* to enable echos and oplocks */
-	req->hdr.sync_hdr.CreditRequest = cpu_to_le16(3);
+	req->sync_hdr.CreditRequest = cpu_to_le16(3);
 
 	/* only one of SMB2 signing flags may be set in SMB2 request */
 	if (server->sign)
@@ -833,8 +800,8 @@ SMB2_sess_alloc_buffer(struct SMB2_sess_data *sess_data)
 	req->Channel = 0; /* MBZ */
 
 	sess_data->iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field and 1 for pad */
-	sess_data->iov[0].iov_len = get_rfc1002_length(req) + 4 - 1;
+	/* 1 for pad */
+	sess_data->iov[0].iov_len = total_len - 1;
 	/*
 	 * This variable will be used to clear the buffer
 	 * allocated above in case of any error in the calling function.
@@ -860,18 +827,15 @@ SMB2_sess_sendreceive(struct SMB2_sess_data *sess_data)
 
 	/* Testing shows that buffer offset must be at location of Buffer[0] */
 	req->SecurityBufferOffset =
-		cpu_to_le16(sizeof(struct smb2_sess_setup_req) -
-			1 /* pad */ - 4 /* rfc1001 len */);
+		cpu_to_le16(sizeof(struct smb2_sess_setup_req) - 1 /* pad */);
 	req->SecurityBufferLength = cpu_to_le16(sess_data->iov[1].iov_len);
 
-	inc_rfc1001_len(req, sess_data->iov[1].iov_len - 1 /* pad */);
-
 	/* BB add code to build os and lm fields */
 
-	rc = SendReceive2(sess_data->xid, sess_data->ses,
-				sess_data->iov, 2,
-				&sess_data->buf0_type,
-				CIFS_LOG_ERROR | CIFS_NEG_OP, &rsp_iov);
+	rc = smb2_send_recv(sess_data->xid, sess_data->ses,
+			    sess_data->iov, 2,
+			    &sess_data->buf0_type,
+			    CIFS_LOG_ERROR | CIFS_NEG_OP, &rsp_iov);
 	cifs_small_buf_release(sess_data->iov[0].iov_base);
 	memcpy(&sess_data->iov[0], &rsp_iov, sizeof(struct kvec));
 
@@ -1092,7 +1056,7 @@ SMB2_sess_auth_rawntlmssp_authenticate(struct SMB2_sess_data *sess_data)
 		goto out;
 
 	req = (struct smb2_sess_setup_req *) sess_data->iov[0].iov_base;
-	req->hdr.sync_hdr.SessionId = ses->Suid;
+	req->sync_hdr.SessionId = ses->Suid;
 
 	rc = build_ntlmssp_auth_blob(&ntlmssp_blob, &blob_length, ses,
 					sess_data->nls_cp);
@@ -1202,6 +1166,10 @@ SMB2_logoff(const unsigned int xid, struct cifs_ses *ses)
 	int rc = 0;
 	struct TCP_Server_Info *server;
 	int flags = 0;
+	unsigned int total_len;
+	struct kvec iov[1];
+	struct kvec rsp_iov;
+	int resp_buf_type;
 
 	cifs_dbg(FYI, "disconnect session %p\n", ses);
 
@@ -1214,19 +1182,24 @@ SMB2_logoff(const unsigned int xid, struct cifs_ses *ses)
 	if (ses->need_reconnect)
 		goto smb2_session_already_dead;
 
-	rc = small_smb2_init(SMB2_LOGOFF, NULL, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_LOGOFF, NULL, (void **) &req, &total_len);
 	if (rc)
 		return rc;
 
 	 /* since no tcon, smb2_init can not do this, so do here */
-	req->hdr.sync_hdr.SessionId = ses->Suid;
+	req->sync_hdr.SessionId = ses->Suid;
 
 	if (ses->session_flags & SMB2_SESSION_FLAG_ENCRYPT_DATA)
 		flags |= CIFS_TRANSFORM_REQ;
 	else if (server->sign)
-		req->hdr.sync_hdr.Flags |= SMB2_FLAGS_SIGNED;
+		req->sync_hdr.Flags |= SMB2_FLAGS_SIGNED;
 
-	rc = SendReceiveNoRsp(xid, ses, (char *) req, flags);
+	flags |= CIFS_NO_RESP;
+
+	iov[0].iov_base = (char *)req;
+	iov[0].iov_len = total_len;
+
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buf_type, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 	/*
 	 * No tcon so can't do
@@ -1265,6 +1238,7 @@ SMB2_tcon(const unsigned int xid, struct cifs_ses *ses, const char *tree,
 	int unc_path_len;
 	__le16 *unc_path = NULL;
 	int flags = 0;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "TCON\n");
 
@@ -1283,40 +1257,30 @@ SMB2_tcon(const unsigned int xid, struct cifs_ses *ses, const char *tree,
 	}
 
 	/* SMB2 TREE_CONNECT request must be called with TreeId == 0 */
-	if (tcon)
-		tcon->tid = 0;
+	tcon->tid = 0;
 
-	rc = small_smb2_init(SMB2_TREE_CONNECT, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_TREE_CONNECT, tcon, (void **) &req,
+			     &total_len);
 	if (rc) {
 		kfree(unc_path);
 		return rc;
 	}
 
-	if (tcon == NULL) {
-		if ((ses->session_flags & SMB2_SESSION_FLAG_ENCRYPT_DATA))
-			flags |= CIFS_TRANSFORM_REQ;
-
-		/* since no tcon, smb2_init can not do this, so do here */
-		req->hdr.sync_hdr.SessionId = ses->Suid;
-		if (ses->server->sign)
-			req->hdr.sync_hdr.Flags |= SMB2_FLAGS_SIGNED;
-	} else if (encryption_required(tcon))
+	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field and 1 for pad */
-	iov[0].iov_len = get_rfc1002_length(req) + 4 - 1;
+	/* 1 for pad */
+	iov[0].iov_len = total_len - 1;
 
 	/* Testing shows that buffer offset must be at location of Buffer[0] */
 	req->PathOffset = cpu_to_le16(sizeof(struct smb2_tree_connect_req)
-			- 1 /* pad */ - 4 /* do not count rfc1001 len field */);
+			- 1 /* pad */);
 	req->PathLength = cpu_to_le16(unc_path_len - 2);
 	iov[1].iov_base = unc_path;
 	iov[1].iov_len = unc_path_len;
 
-	inc_rfc1001_len(req, unc_path_len - 1 /* pad */);
-
-	rc = SendReceive2(xid, ses, iov, 2, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, 2, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_tree_connect_rsp *)rsp_iov.iov_base;
 
@@ -1328,21 +1292,16 @@ SMB2_tcon(const unsigned int xid, struct cifs_ses *ses, const char *tree,
 		goto tcon_error_exit;
 	}
 
-	if (tcon == NULL) {
-		ses->ipc_tid = rsp->hdr.sync_hdr.TreeId;
-		goto tcon_exit;
-	}
-
 	switch (rsp->ShareType) {
 	case SMB2_SHARE_TYPE_DISK:
 		cifs_dbg(FYI, "connection to disk share\n");
 		break;
 	case SMB2_SHARE_TYPE_PIPE:
-		tcon->ipc = true;
+		tcon->pipe = true;
 		cifs_dbg(FYI, "connection to pipe share\n");
 		break;
 	case SMB2_SHARE_TYPE_PRINT:
-		tcon->ipc = true;
+		tcon->print = true;
 		cifs_dbg(FYI, "connection to printer\n");
 		break;
 	default:
@@ -1389,6 +1348,10 @@ SMB2_tdis(const unsigned int xid, struct cifs_tcon *tcon)
 	int rc = 0;
 	struct cifs_ses *ses = tcon->ses;
 	int flags = 0;
+	unsigned int total_len;
+	struct kvec iov[1];
+	struct kvec rsp_iov;
+	int resp_buf_type;
 
 	cifs_dbg(FYI, "Tree Disconnect\n");
 
@@ -1398,14 +1361,20 @@ SMB2_tdis(const unsigned int xid, struct cifs_tcon *tcon)
 	if ((tcon->need_reconnect) || (tcon->ses->need_reconnect))
 		return 0;
 
-	rc = small_smb2_init(SMB2_TREE_DISCONNECT, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_TREE_DISCONNECT, tcon, (void **) &req,
+			     &total_len);
 	if (rc)
 		return rc;
 
 	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	rc = SendReceiveNoRsp(xid, ses, (char *)req, flags);
+	flags |= CIFS_NO_RESP;
+
+	iov[0].iov_base = (char *)req;
+	iov[0].iov_len = total_len;
+
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buf_type, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 	if (rc)
 		cifs_stats_fail_inc(tcon, SMB2_TREE_DISCONNECT_HE);
@@ -1505,11 +1474,10 @@ add_lease_context(struct TCP_Server_Info *server, struct kvec *iov,
 	req->RequestedOplockLevel = SMB2_OPLOCK_LEVEL_LEASE;
 	if (!req->CreateContextsOffset)
 		req->CreateContextsOffset = cpu_to_le32(
-				sizeof(struct smb2_create_req) - 4 +
+				sizeof(struct smb2_create_req) +
 				iov[num - 1].iov_len);
 	le32_add_cpu(&req->CreateContextsLength,
 		     server->vals->create_lease_size);
-	inc_rfc1001_len(&req->hdr, server->vals->create_lease_size);
 	*num_iovec = num + 1;
 	return 0;
 }
@@ -1589,10 +1557,9 @@ add_durable_v2_context(struct kvec *iov, unsigned int *num_iovec,
 	iov[num].iov_len = sizeof(struct create_durable_v2);
 	if (!req->CreateContextsOffset)
 		req->CreateContextsOffset =
-			cpu_to_le32(sizeof(struct smb2_create_req) - 4 +
+			cpu_to_le32(sizeof(struct smb2_create_req) +
 								iov[1].iov_len);
 	le32_add_cpu(&req->CreateContextsLength, sizeof(struct create_durable_v2));
-	inc_rfc1001_len(&req->hdr, sizeof(struct create_durable_v2));
 	*num_iovec = num + 1;
 	return 0;
 }
@@ -1613,12 +1580,10 @@ add_durable_reconnect_v2_context(struct kvec *iov, unsigned int *num_iovec,
 	iov[num].iov_len = sizeof(struct create_durable_handle_reconnect_v2);
 	if (!req->CreateContextsOffset)
 		req->CreateContextsOffset =
-			cpu_to_le32(sizeof(struct smb2_create_req) - 4 +
+			cpu_to_le32(sizeof(struct smb2_create_req) +
 								iov[1].iov_len);
 	le32_add_cpu(&req->CreateContextsLength,
 			sizeof(struct create_durable_handle_reconnect_v2));
-	inc_rfc1001_len(&req->hdr,
-			sizeof(struct create_durable_handle_reconnect_v2));
 	*num_iovec = num + 1;
 	return 0;
 }
@@ -1649,10 +1614,9 @@ add_durable_context(struct kvec *iov, unsigned int *num_iovec,
 	iov[num].iov_len = sizeof(struct create_durable);
 	if (!req->CreateContextsOffset)
 		req->CreateContextsOffset =
-			cpu_to_le32(sizeof(struct smb2_create_req) - 4 +
+			cpu_to_le32(sizeof(struct smb2_create_req) +
 								iov[1].iov_len);
 	le32_add_cpu(&req->CreateContextsLength, sizeof(struct create_durable));
-	inc_rfc1001_len(&req->hdr, sizeof(struct create_durable));
 	*num_iovec = num + 1;
 	return 0;
 }
@@ -1723,6 +1687,7 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
 	__u32 file_attributes = 0;
 	char *dhc_buf = NULL, *lc_buf = NULL;
 	int flags = 0;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "create/open\n");
 
@@ -1731,7 +1696,8 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
 	else
 		return -EIO;
 
-	rc = small_smb2_init(SMB2_CREATE, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_CREATE, tcon, (void **) &req, &total_len);
+
 	if (rc)
 		return rc;
 
@@ -1752,12 +1718,10 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
 	req->CreateOptions = cpu_to_le32(oparms->create_options & CREATE_OPTIONS_MASK);
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field */
-	iov[0].iov_len = get_rfc1002_length(req) + 4;
 	/* -1 since last byte is buf[0] which is sent below (path) */
-	iov[0].iov_len--;
+	iov[0].iov_len = total_len - 1;
 
-	req->NameOffset = cpu_to_le16(sizeof(struct smb2_create_req) - 4);
+	req->NameOffset = cpu_to_le16(sizeof(struct smb2_create_req));
 
 	/* [MS-SMB2] 2.2.13 NameOffset:
 	 * If SMB2_FLAGS_DFS_OPERATIONS is set in the Flags field of
@@ -1770,7 +1734,7 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
 	if (tcon->share_flags & SHI1005_FLAGS_DFS) {
 		int name_len;
 
-		req->hdr.sync_hdr.Flags |= SMB2_FLAGS_DFS_OPERATIONS;
+		req->sync_hdr.Flags |= SMB2_FLAGS_DFS_OPERATIONS;
 		rc = alloc_path_with_tree_prefix(&copy_path, &copy_size,
 						 &name_len,
 						 tcon->treeName, path);
@@ -1797,8 +1761,6 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
 
 	iov[1].iov_len = uni_path_len;
 	iov[1].iov_base = path;
-	/* -1 since last byte is buf[0] which was counted in smb2_buf_len */
-	inc_rfc1001_len(req, uni_path_len - 1);
 
 	if (!server->oplocks)
 		*oplock = SMB2_OPLOCK_LEVEL_NONE;
@@ -1836,7 +1798,8 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
 		dhc_buf = iov[n_iov-1].iov_base;
 	}
 
-	rc = SendReceive2(xid, ses, iov, n_iov, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, n_iov, &resp_buftype, flags,
+			    &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_create_rsp *)rsp_iov.iov_base;
 
@@ -1877,7 +1840,7 @@ SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms, __le16 *path,
  */
 int
 SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
-	   u64 volatile_fid, u32 opcode, bool is_fsctl, bool use_ipc,
+	   u64 volatile_fid, u32 opcode, bool is_fsctl,
 	   char *in_data, u32 indatalen,
 	   char **out_data, u32 *plen /* returned data len */)
 {
@@ -1891,6 +1854,7 @@ SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
 	int n_iov;
 	int rc = 0;
 	int flags = 0;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "SMB2 IOCTL\n");
 
@@ -1909,20 +1873,10 @@ SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
 	if (!ses || !(ses->server))
 		return -EIO;
 
-	rc = small_smb2_init(SMB2_IOCTL, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_IOCTL, tcon, (void **) &req, &total_len);
 	if (rc)
 		return rc;
 
-	if (use_ipc) {
-		if (ses->ipc_tid == 0) {
-			cifs_small_buf_release(req);
-			return -ENOTCONN;
-		}
-
-		cifs_dbg(FYI, "replacing tid 0x%x with IPC tid 0x%x\n",
-			 req->hdr.sync_hdr.TreeId, ses->ipc_tid);
-		req->hdr.sync_hdr.TreeId = ses->ipc_tid;
-	}
 	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
@@ -1934,7 +1888,7 @@ SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
 		req->InputCount = cpu_to_le32(indatalen);
 		/* do not set InputOffset if no input data */
 		req->InputOffset =
-		       cpu_to_le32(offsetof(struct smb2_ioctl_req, Buffer) - 4);
+		       cpu_to_le32(offsetof(struct smb2_ioctl_req, Buffer));
 		iov[1].iov_base = in_data;
 		iov[1].iov_len = indatalen;
 		n_iov = 2;
@@ -1969,21 +1923,20 @@ SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
 	 * but if input data passed to ioctl, we do not
 	 * want to double count this, so we do not send
 	 * the dummy one byte of data in iovec[0] if sending
-	 * input data (in iovec[1]). We also must add 4 bytes
-	 * in first iovec to allow for rfc1002 length field.
+	 * input data (in iovec[1]).
 	 */
 
 	if (indatalen) {
-		iov[0].iov_len = get_rfc1002_length(req) + 4 - 1;
-		inc_rfc1001_len(req, indatalen - 1);
+		iov[0].iov_len = total_len - 1;
 	} else
-		iov[0].iov_len = get_rfc1002_length(req) + 4;
+		iov[0].iov_len = total_len;
 
 	/* validate negotiate request must be signed - see MS-SMB2 3.2.5.5 */
 	if (opcode == FSCTL_VALIDATE_NEGOTIATE_INFO)
-		req->hdr.sync_hdr.Flags |= SMB2_FLAGS_SIGNED;
+		req->sync_hdr.Flags |= SMB2_FLAGS_SIGNED;
 
-	rc = SendReceive2(xid, ses, iov, n_iov, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, n_iov, &resp_buftype, flags,
+			    &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_ioctl_rsp *)rsp_iov.iov_base;
 
@@ -2052,7 +2005,6 @@ SMB2_set_compression(const unsigned int xid, struct cifs_tcon *tcon,
 
 	rc = SMB2_ioctl(xid, tcon, persistent_fid, volatile_fid,
 			FSCTL_SET_COMPRESSION, true /* is_fsctl */,
-			false /* use_ipc */,
 			(char *)&fsctl_input /* data input */,
 			2 /* in data len */, &ret_data /* out data */, NULL);
 
@@ -2073,13 +2025,14 @@ SMB2_close(const unsigned int xid, struct cifs_tcon *tcon,
 	int resp_buftype;
 	int rc = 0;
 	int flags = 0;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "Close\n");
 
 	if (!ses || !(ses->server))
 		return -EIO;
 
-	rc = small_smb2_init(SMB2_CLOSE, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_CLOSE, tcon, (void **) &req, &total_len);
 	if (rc)
 		return rc;
 
@@ -2090,10 +2043,9 @@ SMB2_close(const unsigned int xid, struct cifs_tcon *tcon,
 	req->VolatileFileId = volatile_fid;
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field */
-	iov[0].iov_len = get_rfc1002_length(req) + 4;
+	iov[0].iov_len = total_len;
 
-	rc = SendReceive2(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_close_rsp *)rsp_iov.iov_base;
 
@@ -2180,13 +2132,15 @@ query_info(const unsigned int xid, struct cifs_tcon *tcon,
 	int resp_buftype;
 	struct cifs_ses *ses = tcon->ses;
 	int flags = 0;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "Query Info\n");
 
 	if (!ses || !(ses->server))
 		return -EIO;
 
-	rc = small_smb2_init(SMB2_QUERY_INFO, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_QUERY_INFO, tcon, (void **) &req,
+			     &total_len);
 	if (rc)
 		return rc;
 
@@ -2203,15 +2157,14 @@ query_info(const unsigned int xid, struct cifs_tcon *tcon,
 	 * We do not use the input buffer (do not send extra byte)
 	 */
 	req->InputBufferOffset = 0;
-	inc_rfc1001_len(req, -1);
 
 	req->OutputBufferLength = cpu_to_le32(output_len);
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field */
-	iov[0].iov_len = get_rfc1002_length(req) + 4;
+	/* 1 for Buffer */
+	iov[0].iov_len = total_len - 1;
 
-	rc = SendReceive2(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_query_info_rsp *)rsp_iov.iov_base;
 
@@ -2338,6 +2291,10 @@ void smb2_reconnect_server(struct work_struct *work)
 				tcon_exist = true;
 			}
 		}
+		if (ses->tcon_ipc && ses->tcon_ipc->need_reconnect) {
+			list_add_tail(&ses->tcon_ipc->rlist, &tmp_list);
+			tcon_exist = true;
+		}
 	}
 	/*
 	 * Get the reference to server struct to be sure that the last call of
@@ -2376,6 +2333,8 @@ SMB2_echo(struct TCP_Server_Info *server)
 	struct kvec iov[2];
 	struct smb_rqst rqst = { .rq_iov = iov,
 				 .rq_nvec = 2 };
+	unsigned int total_len;
+	__be32 rfc1002_marker;
 
 	cifs_dbg(FYI, "In echo request\n");
 
@@ -2385,17 +2344,17 @@ SMB2_echo(struct TCP_Server_Info *server)
 		return rc;
 	}
 
-	rc = small_smb2_init(SMB2_ECHO, NULL, (void **)&req);
+	rc = smb2_plain_req_init(SMB2_ECHO, NULL, (void **)&req, &total_len);
 	if (rc)
 		return rc;
 
-	req->hdr.sync_hdr.CreditRequest = cpu_to_le16(1);
+	req->sync_hdr.CreditRequest = cpu_to_le16(1);
 
-	/* 4 for rfc1002 length field */
 	iov[0].iov_len = 4;
-	iov[0].iov_base = (char *)req;
-	iov[1].iov_len = get_rfc1002_length(req);
-	iov[1].iov_base = (char *)req + 4;
+	rfc1002_marker = cpu_to_be32(total_len);
+	iov[0].iov_base = &rfc1002_marker;
+	iov[1].iov_len = total_len;
+	iov[1].iov_base = (char *)req;
 
 	rc = cifs_call_async(server, &rqst, NULL, smb2_echo_callback, NULL,
 			     server, CIFS_ECHO_OP);
@@ -2417,13 +2376,14 @@ SMB2_flush(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
 	int resp_buftype;
 	int rc = 0;
 	int flags = 0;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "Flush\n");
 
 	if (!ses || !(ses->server))
 		return -EIO;
 
-	rc = small_smb2_init(SMB2_FLUSH, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_FLUSH, tcon, (void **) &req, &total_len);
 	if (rc)
 		return rc;
 
@@ -2434,10 +2394,9 @@ SMB2_flush(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
 	req->VolatileFileId = volatile_fid;
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field */
-	iov[0].iov_len = get_rfc1002_length(req) + 4;
+	iov[0].iov_len = total_len;
 
-	rc = SendReceive2(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 
 	if (rc != 0)
@@ -2453,18 +2412,21 @@ SMB2_flush(const unsigned int xid, struct cifs_tcon *tcon, u64 persistent_fid,
  */
 static int
 smb2_new_read_req(void **buf, unsigned int *total_len,
-		  struct cifs_io_parms *io_parms, unsigned int remaining_bytes,
-		  int request_type)
+	struct cifs_io_parms *io_parms, struct cifs_readdata *rdata,
+	unsigned int remaining_bytes, int request_type)
 {
 	int rc = -EACCES;
 	struct smb2_read_plain_req *req = NULL;
 	struct smb2_sync_hdr *shdr;
+	struct TCP_Server_Info *server;
 
 	rc = smb2_plain_req_init(SMB2_READ, io_parms->tcon, (void **) &req,
 				 total_len);
 	if (rc)
 		return rc;
-	if (io_parms->tcon->ses->server == NULL)
+
+	server = io_parms->tcon->ses->server;
+	if (server == NULL)
 		return -ECONNABORTED;
 
 	shdr = &req->sync_hdr;
@@ -2478,7 +2440,40 @@ smb2_new_read_req(void **buf, unsigned int *total_len,
 	req->MinimumCount = 0;
 	req->Length = cpu_to_le32(io_parms->length);
 	req->Offset = cpu_to_le64(io_parms->offset);
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	/*
+	 * If we want to do a RDMA write, fill in and append
+	 * smbd_buffer_descriptor_v1 to the end of read request
+	 */
+	if (server->rdma && rdata &&
+		rdata->bytes >= server->smbd_conn->rdma_readwrite_threshold) {
 
+		struct smbd_buffer_descriptor_v1 *v1;
+		bool need_invalidate =
+			io_parms->tcon->ses->server->dialect == SMB30_PROT_ID;
+
+		rdata->mr = smbd_register_mr(
+				server->smbd_conn, rdata->pages,
+				rdata->nr_pages, rdata->tailsz,
+				true, need_invalidate);
+		if (!rdata->mr)
+			return -ENOBUFS;
+
+		req->Channel = SMB2_CHANNEL_RDMA_V1_INVALIDATE;
+		if (need_invalidate)
+			req->Channel = SMB2_CHANNEL_RDMA_V1;
+		req->ReadChannelInfoOffset =
+			cpu_to_le16(offsetof(struct smb2_read_plain_req, Buffer));
+		req->ReadChannelInfoLength =
+			cpu_to_le16(sizeof(struct smbd_buffer_descriptor_v1));
+		v1 = (struct smbd_buffer_descriptor_v1 *) &req->Buffer[0];
+		v1->offset = cpu_to_le64(rdata->mr->mr->iova);
+		v1->token = cpu_to_le32(rdata->mr->mr->rkey);
+		v1->length = cpu_to_le32(rdata->mr->mr->length);
+
+		*total_len += sizeof(*v1) - 1;
+	}
+#endif
 	if (request_type & CHAINED_REQUEST) {
 		if (!(request_type & END_OF_CHAIN)) {
 			/* next 8-byte aligned request */
@@ -2557,7 +2552,17 @@ smb2_readv_callback(struct mid_q_entry *mid)
 		if (rdata->result != -ENODATA)
 			rdata->result = -EIO;
 	}
-
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	/*
+	 * If this rdata has a memmory registered, the MR can be freed
+	 * MR needs to be freed as soon as I/O finishes to prevent deadlock
+	 * because they have limited number and are used for future I/Os
+	 */
+	if (rdata->mr) {
+		smbd_deregister_mr(rdata->mr);
+		rdata->mr = NULL;
+	}
+#endif
 	if (rdata->result)
 		cifs_stats_fail_inc(tcon, SMB2_READ_HE);
 
@@ -2592,7 +2597,8 @@ smb2_async_readv(struct cifs_readdata *rdata)
 
 	server = io_parms.tcon->ses->server;
 
-	rc = smb2_new_read_req((void **) &buf, &total_len, &io_parms, 0, 0);
+	rc = smb2_new_read_req(
+		(void **) &buf, &total_len, &io_parms, rdata, 0, 0);
 	if (rc) {
 		if (rc == -EAGAIN && rdata->credits) {
 			/* credits was reset by reconnect */
@@ -2650,31 +2656,24 @@ SMB2_read(const unsigned int xid, struct cifs_io_parms *io_parms,
 	struct smb2_read_plain_req *req = NULL;
 	struct smb2_read_rsp *rsp = NULL;
 	struct smb2_sync_hdr *shdr;
-	struct kvec iov[2];
+	struct kvec iov[1];
 	struct kvec rsp_iov;
 	unsigned int total_len;
-	__be32 req_len;
-	struct smb_rqst rqst = { .rq_iov = iov,
-				 .rq_nvec = 2 };
 	int flags = CIFS_LOG_ERROR;
 	struct cifs_ses *ses = io_parms->tcon->ses;
 
 	*nbytes = 0;
-	rc = smb2_new_read_req((void **)&req, &total_len, io_parms, 0, 0);
+	rc = smb2_new_read_req((void **)&req, &total_len, io_parms, NULL, 0, 0);
 	if (rc)
 		return rc;
 
 	if (encryption_required(io_parms->tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	req_len = cpu_to_be32(total_len);
+	iov[0].iov_base = (char *)req;
+	iov[0].iov_len = total_len;
 
-	iov[0].iov_base = &req_len;
-	iov[0].iov_len = sizeof(__be32);
-	iov[1].iov_base = req;
-	iov[1].iov_len = total_len;
-
-	rc = cifs_send_recv(xid, ses, &rqst, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 
 	rsp = (struct smb2_read_rsp *)rsp_iov.iov_base;
@@ -2755,7 +2754,19 @@ smb2_writev_callback(struct mid_q_entry *mid)
 		wdata->result = -EIO;
 		break;
 	}
-
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	/*
+	 * If this wdata has a memory registered, the MR can be freed
+	 * The number of MRs available is limited, it's important to recover
+	 * used MR as soon as I/O is finished. Hold MR longer in the later
+	 * I/O process can possibly result in I/O deadlock due to lack of MR
+	 * to send request on I/O retry
+	 */
+	if (wdata->mr) {
+		smbd_deregister_mr(wdata->mr);
+		wdata->mr = NULL;
+	}
+#endif
 	if (wdata->result)
 		cifs_stats_fail_inc(tcon, SMB2_WRITE_HE);
 
@@ -2776,8 +2787,10 @@ smb2_async_writev(struct cifs_writedata *wdata,
 	struct TCP_Server_Info *server = tcon->ses->server;
 	struct kvec iov[2];
 	struct smb_rqst rqst = { };
+	unsigned int total_len;
+	__be32 rfc1002_marker;
 
-	rc = small_smb2_init(SMB2_WRITE, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_WRITE, tcon, (void **) &req, &total_len);
 	if (rc) {
 		if (rc == -EAGAIN && wdata->credits) {
 			/* credits was reset by reconnect */
@@ -2793,7 +2806,7 @@ smb2_async_writev(struct cifs_writedata *wdata,
 	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	shdr = get_sync_hdr(req);
+	shdr = (struct smb2_sync_hdr *)req;
 	shdr->ProcessId = cpu_to_le32(wdata->cfile->pid);
 
 	req->PersistentFileId = wdata->cfile->fid.persistent_fid;
@@ -2802,16 +2815,51 @@ smb2_async_writev(struct cifs_writedata *wdata,
 	req->WriteChannelInfoLength = 0;
 	req->Channel = 0;
 	req->Offset = cpu_to_le64(wdata->offset);
-	/* 4 for rfc1002 length field */
 	req->DataOffset = cpu_to_le16(
-				offsetof(struct smb2_write_req, Buffer) - 4);
+				offsetof(struct smb2_write_req, Buffer));
 	req->RemainingBytes = 0;
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	/*
+	 * If we want to do a server RDMA read, fill in and append
+	 * smbd_buffer_descriptor_v1 to the end of write request
+	 */
+	if (server->rdma && wdata->bytes >=
+		server->smbd_conn->rdma_readwrite_threshold) {
 
+		struct smbd_buffer_descriptor_v1 *v1;
+		bool need_invalidate = server->dialect == SMB30_PROT_ID;
+
+		wdata->mr = smbd_register_mr(
+				server->smbd_conn, wdata->pages,
+				wdata->nr_pages, wdata->tailsz,
+				false, need_invalidate);
+		if (!wdata->mr) {
+			rc = -ENOBUFS;
+			goto async_writev_out;
+		}
+		req->Length = 0;
+		req->DataOffset = 0;
+		req->RemainingBytes =
+			cpu_to_le32((wdata->nr_pages-1)*PAGE_SIZE + wdata->tailsz);
+		req->Channel = SMB2_CHANNEL_RDMA_V1_INVALIDATE;
+		if (need_invalidate)
+			req->Channel = SMB2_CHANNEL_RDMA_V1;
+		req->WriteChannelInfoOffset =
+			cpu_to_le16(offsetof(struct smb2_write_req, Buffer));
+		req->WriteChannelInfoLength =
+			cpu_to_le16(sizeof(struct smbd_buffer_descriptor_v1));
+		v1 = (struct smbd_buffer_descriptor_v1 *) &req->Buffer[0];
+		v1->offset = cpu_to_le64(wdata->mr->mr->iova);
+		v1->token = cpu_to_le32(wdata->mr->mr->rkey);
+		v1->length = cpu_to_le32(wdata->mr->mr->length);
+	}
+#endif
 	/* 4 for rfc1002 length field and 1 for Buffer */
 	iov[0].iov_len = 4;
-	iov[0].iov_base = req;
-	iov[1].iov_len = get_rfc1002_length(req) - 1;
-	iov[1].iov_base = (char *)req + 4;
+	rfc1002_marker = cpu_to_be32(total_len - 1 + wdata->bytes);
+	iov[0].iov_base = &rfc1002_marker;
+	iov[1].iov_len = total_len - 1;
+	iov[1].iov_base = (char *)req;
 
 	rqst.rq_iov = iov;
 	rqst.rq_nvec = 2;
@@ -2819,13 +2867,22 @@ smb2_async_writev(struct cifs_writedata *wdata,
 	rqst.rq_npages = wdata->nr_pages;
 	rqst.rq_pagesz = wdata->pagesz;
 	rqst.rq_tailsz = wdata->tailsz;
-
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	if (wdata->mr) {
+		iov[1].iov_len += sizeof(struct smbd_buffer_descriptor_v1);
+		rqst.rq_npages = 0;
+	}
+#endif
 	cifs_dbg(FYI, "async write at %llu %u bytes\n",
 		 wdata->offset, wdata->bytes);
 
+#ifdef CONFIG_CIFS_SMB_DIRECT
+	/* For RDMA read, I/O size is in RemainingBytes not in Length */
+	if (!wdata->mr)
+		req->Length = cpu_to_le32(wdata->bytes);
+#else
 	req->Length = cpu_to_le32(wdata->bytes);
-
-	inc_rfc1001_len(&req->hdr, wdata->bytes - 1 /* Buffer */);
+#endif
 
 	if (wdata->credits) {
 		shdr->CreditCharge = cpu_to_le16(DIV_ROUND_UP(wdata->bytes,
@@ -2869,13 +2926,15 @@ SMB2_write(const unsigned int xid, struct cifs_io_parms *io_parms,
 	int resp_buftype;
 	struct kvec rsp_iov;
 	int flags = 0;
+	unsigned int total_len;
 
 	*nbytes = 0;
 
 	if (n_vec < 1)
 		return rc;
 
-	rc = small_smb2_init(SMB2_WRITE, io_parms->tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_WRITE, io_parms->tcon, (void **) &req,
+			     &total_len);
 	if (rc)
 		return rc;
 
@@ -2885,7 +2944,7 @@ SMB2_write(const unsigned int xid, struct cifs_io_parms *io_parms,
 	if (encryption_required(io_parms->tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	req->hdr.sync_hdr.ProcessId = cpu_to_le32(io_parms->pid);
+	req->sync_hdr.ProcessId = cpu_to_le32(io_parms->pid);
 
 	req->PersistentFileId = io_parms->persistent_fid;
 	req->VolatileFileId = io_parms->volatile_fid;
@@ -2894,20 +2953,16 @@ SMB2_write(const unsigned int xid, struct cifs_io_parms *io_parms,
 	req->Channel = 0;
 	req->Length = cpu_to_le32(io_parms->length);
 	req->Offset = cpu_to_le64(io_parms->offset);
-	/* 4 for rfc1002 length field */
 	req->DataOffset = cpu_to_le16(
-				offsetof(struct smb2_write_req, Buffer) - 4);
+				offsetof(struct smb2_write_req, Buffer));
 	req->RemainingBytes = 0;
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field and 1 for Buffer */
-	iov[0].iov_len = get_rfc1002_length(req) + 4 - 1;
+	/* 1 for Buffer */
+	iov[0].iov_len = total_len - 1;
 
-	/* length of entire message including data to be written */
-	inc_rfc1001_len(req, io_parms->length - 1 /* Buffer */);
-
-	rc = SendReceive2(xid, io_parms->tcon->ses, iov, n_vec + 1,
-			  &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, io_parms->tcon->ses, iov, n_vec + 1,
+			    &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_write_rsp *)rsp_iov.iov_base;
 
@@ -2984,13 +3039,15 @@ SMB2_query_directory(const unsigned int xid, struct cifs_tcon *tcon,
 	unsigned int output_size = CIFSMaxBufSize;
 	size_t info_buf_size;
 	int flags = 0;
+	unsigned int total_len;
 
 	if (ses && (ses->server))
 		server = ses->server;
 	else
 		return -EIO;
 
-	rc = small_smb2_init(SMB2_QUERY_DIRECTORY, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_QUERY_DIRECTORY, tcon, (void **) &req,
+			     &total_len);
 	if (rc)
 		return rc;
 
@@ -3022,7 +3079,7 @@ SMB2_query_directory(const unsigned int xid, struct cifs_tcon *tcon,
 	memcpy(bufptr, &asteriks, len);
 
 	req->FileNameOffset =
-		cpu_to_le16(sizeof(struct smb2_query_directory_req) - 1 - 4);
+		cpu_to_le16(sizeof(struct smb2_query_directory_req) - 1);
 	req->FileNameLength = cpu_to_le16(len);
 	/*
 	 * BB could be 30 bytes or so longer if we used SMB2 specific
@@ -3033,15 +3090,13 @@ SMB2_query_directory(const unsigned int xid, struct cifs_tcon *tcon,
 	req->OutputBufferLength = cpu_to_le32(output_size);
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for RFC1001 length and 1 for Buffer */
-	iov[0].iov_len = get_rfc1002_length(req) + 4 - 1;
+	/* 1 for Buffer */
+	iov[0].iov_len = total_len - 1;
 
 	iov[1].iov_base = (char *)(req->Buffer);
 	iov[1].iov_len = len;
 
-	inc_rfc1001_len(req, len - 1 /* Buffer */);
-
-	rc = SendReceive2(xid, ses, iov, 2, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, 2, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_query_directory_rsp *)rsp_iov.iov_base;
 
@@ -3110,6 +3165,7 @@ send_set_info(const unsigned int xid, struct cifs_tcon *tcon,
 	unsigned int i;
 	struct cifs_ses *ses = tcon->ses;
 	int flags = 0;
+	unsigned int total_len;
 
 	if (!ses || !(ses->server))
 		return -EIO;
@@ -3121,7 +3177,7 @@ send_set_info(const unsigned int xid, struct cifs_tcon *tcon,
 	if (!iov)
 		return -ENOMEM;
 
-	rc = small_smb2_init(SMB2_SET_INFO, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_SET_INFO, tcon, (void **) &req, &total_len);
 	if (rc) {
 		kfree(iov);
 		return rc;
@@ -3130,7 +3186,7 @@ send_set_info(const unsigned int xid, struct cifs_tcon *tcon,
 	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	req->hdr.sync_hdr.ProcessId = cpu_to_le32(pid);
+	req->sync_hdr.ProcessId = cpu_to_le32(pid);
 
 	req->InfoType = info_type;
 	req->FileInfoClass = info_class;
@@ -3138,27 +3194,25 @@ send_set_info(const unsigned int xid, struct cifs_tcon *tcon,
 	req->VolatileFileId = volatile_fid;
 	req->AdditionalInformation = cpu_to_le32(additional_info);
 
-	/* 4 for RFC1001 length and 1 for Buffer */
 	req->BufferOffset =
-			cpu_to_le16(sizeof(struct smb2_set_info_req) - 1 - 4);
+			cpu_to_le16(sizeof(struct smb2_set_info_req) - 1);
 	req->BufferLength = cpu_to_le32(*size);
 
-	inc_rfc1001_len(req, *size - 1 /* Buffer */);
-
 	memcpy(req->Buffer, *data, *size);
+	total_len += *size;
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for RFC1001 length */
-	iov[0].iov_len = get_rfc1002_length(req) + 4;
+	/* 1 for Buffer */
+	iov[0].iov_len = total_len - 1;
 
 	for (i = 1; i < num; i++) {
-		inc_rfc1001_len(req, size[i]);
 		le32_add_cpu(&req->BufferLength, size[i]);
 		iov[i].iov_base = (char *)data[i];
 		iov[i].iov_len = size[i];
 	}
 
-	rc = SendReceive2(xid, ses, iov, num, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, iov, num, &resp_buftype, flags,
+			    &rsp_iov);
 	cifs_small_buf_release(req);
 	rsp = (struct smb2_set_info_rsp *)rsp_iov.iov_base;
 
@@ -3310,11 +3364,17 @@ SMB2_oplock_break(const unsigned int xid, struct cifs_tcon *tcon,
 		  __u8 oplock_level)
 {
 	int rc;
-	struct smb2_oplock_break *req = NULL;
+	struct smb2_oplock_break_req *req = NULL;
+	struct cifs_ses *ses = tcon->ses;
 	int flags = CIFS_OBREAK_OP;
+	unsigned int total_len;
+	struct kvec iov[1];
+	struct kvec rsp_iov;
+	int resp_buf_type;
 
 	cifs_dbg(FYI, "SMB2_oplock_break\n");
-	rc = small_smb2_init(SMB2_OPLOCK_BREAK, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_OPLOCK_BREAK, tcon, (void **) &req,
+			     &total_len);
 	if (rc)
 		return rc;
 
@@ -3324,9 +3384,14 @@ SMB2_oplock_break(const unsigned int xid, struct cifs_tcon *tcon,
 	req->VolatileFid = volatile_fid;
 	req->PersistentFid = persistent_fid;
 	req->OplockLevel = oplock_level;
-	req->hdr.sync_hdr.CreditRequest = cpu_to_le16(1);
+	req->sync_hdr.CreditRequest = cpu_to_le16(1);
 
-	rc = SendReceiveNoRsp(xid, tcon->ses, (char *) req, flags);
+	flags |= CIFS_NO_RESP;
+
+	iov[0].iov_base = (char *)req;
+	iov[0].iov_len = total_len;
+
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buf_type, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 
 	if (rc) {
@@ -3355,13 +3420,15 @@ build_qfs_info_req(struct kvec *iov, struct cifs_tcon *tcon, int level,
 {
 	int rc;
 	struct smb2_query_info_req *req;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "Query FSInfo level %d\n", level);
 
 	if ((tcon->ses == NULL) || (tcon->ses->server == NULL))
 		return -EIO;
 
-	rc = small_smb2_init(SMB2_QUERY_INFO, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_QUERY_INFO, tcon, (void **) &req,
+			     &total_len);
 	if (rc)
 		return rc;
 
@@ -3369,15 +3436,14 @@ build_qfs_info_req(struct kvec *iov, struct cifs_tcon *tcon, int level,
 	req->FileInfoClass = level;
 	req->PersistentFileId = persistent_fid;
 	req->VolatileFileId = volatile_fid;
-	/* 4 for rfc1002 length field and 1 for pad */
+	/* 1 for pad */
 	req->InputBufferOffset =
-			cpu_to_le16(sizeof(struct smb2_query_info_req) - 1 - 4);
+			cpu_to_le16(sizeof(struct smb2_query_info_req) - 1);
 	req->OutputBufferLength = cpu_to_le32(
 		outbuf_len + sizeof(struct smb2_query_info_rsp) - 1 - 4);
 
 	iov->iov_base = (char *)req;
-	/* 4 for rfc1002 length field */
-	iov->iov_len = get_rfc1002_length(req) + 4;
+	iov->iov_len = total_len;
 	return 0;
 }
 
@@ -3403,7 +3469,7 @@ SMB2_QFS_info(const unsigned int xid, struct cifs_tcon *tcon,
 	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	rc = SendReceive2(xid, ses, &iov, 1, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, &iov, 1, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(iov.iov_base);
 	if (rc) {
 		cifs_stats_fail_inc(tcon, SMB2_QUERY_INFO_HE);
@@ -3459,7 +3525,7 @@ SMB2_QFS_attr(const unsigned int xid, struct cifs_tcon *tcon,
 	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	rc = SendReceive2(xid, ses, &iov, 1, &resp_buftype, flags, &rsp_iov);
+	rc = smb2_send_recv(xid, ses, &iov, 1, &resp_buftype, flags, &rsp_iov);
 	cifs_small_buf_release(iov.iov_base);
 	if (rc) {
 		cifs_stats_fail_inc(tcon, SMB2_QUERY_INFO_HE);
@@ -3505,34 +3571,33 @@ smb2_lockv(const unsigned int xid, struct cifs_tcon *tcon,
 	int resp_buf_type;
 	unsigned int count;
 	int flags = CIFS_NO_RESP;
+	unsigned int total_len;
 
 	cifs_dbg(FYI, "smb2_lockv num lock %d\n", num_lock);
 
-	rc = small_smb2_init(SMB2_LOCK, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_LOCK, tcon, (void **) &req, &total_len);
 	if (rc)
 		return rc;
 
 	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	req->hdr.sync_hdr.ProcessId = cpu_to_le32(pid);
+	req->sync_hdr.ProcessId = cpu_to_le32(pid);
 	req->LockCount = cpu_to_le16(num_lock);
 
 	req->PersistentFileId = persist_fid;
 	req->VolatileFileId = volatile_fid;
 
 	count = num_lock * sizeof(struct smb2_lock_element);
-	inc_rfc1001_len(req, count - sizeof(struct smb2_lock_element));
 
 	iov[0].iov_base = (char *)req;
-	/* 4 for rfc1002 length field and count for all locks */
-	iov[0].iov_len = get_rfc1002_length(req) + 4 - count;
+	iov[0].iov_len = total_len - sizeof(struct smb2_lock_element);
 	iov[1].iov_base = (char *)buf;
 	iov[1].iov_len = count;
 
 	cifs_stats_inc(&tcon->stats.cifs_stats.num_locks);
-	rc = SendReceive2(xid, tcon->ses, iov, 2, &resp_buf_type, flags,
-			  &rsp_iov);
+	rc = smb2_send_recv(xid, tcon->ses, iov, 2, &resp_buf_type, flags,
+			    &rsp_iov);
 	cifs_small_buf_release(req);
 	if (rc) {
 		cifs_dbg(FYI, "Send error in smb2_lockv = %d\n", rc);
@@ -3565,24 +3630,35 @@ SMB2_lease_break(const unsigned int xid, struct cifs_tcon *tcon,
 {
 	int rc;
 	struct smb2_lease_ack *req = NULL;
+	struct cifs_ses *ses = tcon->ses;
 	int flags = CIFS_OBREAK_OP;
+	unsigned int total_len;
+	struct kvec iov[1];
+	struct kvec rsp_iov;
+	int resp_buf_type;
 
 	cifs_dbg(FYI, "SMB2_lease_break\n");
-	rc = small_smb2_init(SMB2_OPLOCK_BREAK, tcon, (void **) &req);
+	rc = smb2_plain_req_init(SMB2_OPLOCK_BREAK, tcon, (void **) &req,
+			     &total_len);
 	if (rc)
 		return rc;
 
 	if (encryption_required(tcon))
 		flags |= CIFS_TRANSFORM_REQ;
 
-	req->hdr.sync_hdr.CreditRequest = cpu_to_le16(1);
+	req->sync_hdr.CreditRequest = cpu_to_le16(1);
 	req->StructureSize = cpu_to_le16(36);
-	inc_rfc1001_len(req, 12);
+	total_len += 12;
 
 	memcpy(req->LeaseKey, lease_key, 16);
 	req->LeaseState = lease_state;
 
-	rc = SendReceiveNoRsp(xid, tcon->ses, (char *) req, flags);
+	flags |= CIFS_NO_RESP;
+
+	iov[0].iov_base = (char *)req;
+	iov[0].iov_len = total_len;
+
+	rc = smb2_send_recv(xid, ses, iov, 1, &resp_buf_type, flags, &rsp_iov);
 	cifs_small_buf_release(req);
 
 	if (rc) {
diff --git a/fs/cifs/smb2pdu.h b/fs/cifs/smb2pdu.h
index c2ec934..6eb9f96 100644
--- a/fs/cifs/smb2pdu.h
+++ b/fs/cifs/smb2pdu.h
@@ -195,7 +195,7 @@ struct smb2_symlink_err_rsp {
 #define SMB2_CLIENT_GUID_SIZE 16
 
 struct smb2_negotiate_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize; /* Must be 36 */
 	__le16 DialectCount;
 	__le16 SecurityMode;
@@ -282,7 +282,7 @@ struct smb2_negotiate_rsp {
 #define SMB2_SESSION_REQ_FLAG_ENCRYPT_DATA	0x04
 
 struct smb2_sess_setup_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize; /* Must be 25 */
 	__u8   Flags;
 	__u8   SecurityMode;
@@ -308,7 +308,7 @@ struct smb2_sess_setup_rsp {
 } __packed;
 
 struct smb2_logoff_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize;	/* Must be 4 */
 	__le16 Reserved;
 } __packed;
@@ -323,7 +323,7 @@ struct smb2_logoff_rsp {
 #define SMB2_SHAREFLAG_CLUSTER_RECONNECT	0x0001
 
 struct smb2_tree_connect_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize;	/* Must be 9 */
 	__le16 Reserved; /* Flags in SMB3.1.1 */
 	__le16 PathOffset;
@@ -375,7 +375,7 @@ struct smb2_tree_connect_rsp {
 #define SMB2_SHARE_CAP_ASYMMETRIC cpu_to_le32(0x00000080) /* 3.02 */
 
 struct smb2_tree_disconnect_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize;	/* Must be 4 */
 	__le16 Reserved;
 } __packed;
@@ -496,7 +496,7 @@ struct smb2_tree_disconnect_rsp {
 #define SVHDX_OPEN_DEVICE_CONTEXT	0x83CE6F1AD851E0986E34401CC9BCFCE9
 
 struct smb2_create_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize;	/* Must be 57 */
 	__u8   SecurityFlags;
 	__u8   RequestedOplockLevel;
@@ -753,7 +753,7 @@ struct duplicate_extents_to_file {
 } __packed;
 
 struct smb2_ioctl_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize;	/* Must be 57 */
 	__u16 Reserved;
 	__le32 CtlCode;
@@ -789,7 +789,7 @@ struct smb2_ioctl_rsp {
 /* Currently defined values for close flags */
 #define SMB2_CLOSE_FLAG_POSTQUERY_ATTRIB	cpu_to_le16(0x0001)
 struct smb2_close_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize;	/* Must be 24 */
 	__le16 Flags;
 	__le32 Reserved;
@@ -812,7 +812,7 @@ struct smb2_close_rsp {
 } __packed;
 
 struct smb2_flush_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize;	/* Must be 24 */
 	__le16 Reserved1;
 	__le32 Reserved2;
@@ -830,9 +830,9 @@ struct smb2_flush_rsp {
 #define SMB2_READFLAG_READ_UNBUFFERED	0x01
 
 /* Channel field for read and write: exactly one of following flags can be set*/
-#define SMB2_CHANNEL_NONE		0x00000000
-#define SMB2_CHANNEL_RDMA_V1		0x00000001 /* SMB3 or later */
-#define SMB2_CHANNEL_RDMA_V1_INVALIDATE 0x00000002 /* SMB3.02 or later */
+#define SMB2_CHANNEL_NONE	cpu_to_le32(0x00000000)
+#define SMB2_CHANNEL_RDMA_V1	cpu_to_le32(0x00000001) /* SMB3 or later */
+#define SMB2_CHANNEL_RDMA_V1_INVALIDATE cpu_to_le32(0x00000002) /* >= SMB3.02 */
 
 /* SMB2 read request without RFC1001 length at the beginning */
 struct smb2_read_plain_req {
@@ -847,8 +847,8 @@ struct smb2_read_plain_req {
 	__le32 MinimumCount;
 	__le32 Channel; /* MBZ except for SMB3 or later */
 	__le32 RemainingBytes;
-	__le16 ReadChannelInfoOffset; /* Reserved MBZ */
-	__le16 ReadChannelInfoLength; /* Reserved MBZ */
+	__le16 ReadChannelInfoOffset;
+	__le16 ReadChannelInfoLength;
 	__u8   Buffer[1];
 } __packed;
 
@@ -868,7 +868,7 @@ struct smb2_read_rsp {
 #define SMB2_WRITEFLAG_WRITE_UNBUFFERED	0x00000002	/* SMB3.02 or later */
 
 struct smb2_write_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize; /* Must be 49 */
 	__le16 DataOffset; /* offset from start of SMB2 header to write data */
 	__le32 Length;
@@ -877,8 +877,8 @@ struct smb2_write_req {
 	__u64  VolatileFileId; /* opaque endianness */
 	__le32 Channel; /* Reserved MBZ */
 	__le32 RemainingBytes;
-	__le16 WriteChannelInfoOffset; /* Reserved MBZ */
-	__le16 WriteChannelInfoLength; /* Reserved MBZ */
+	__le16 WriteChannelInfoOffset;
+	__le16 WriteChannelInfoLength;
 	__le32 Flags;
 	__u8   Buffer[1];
 } __packed;
@@ -907,7 +907,7 @@ struct smb2_lock_element {
 } __packed;
 
 struct smb2_lock_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize; /* Must be 48 */
 	__le16 LockCount;
 	__le32 Reserved;
@@ -924,7 +924,7 @@ struct smb2_lock_rsp {
 } __packed;
 
 struct smb2_echo_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize;	/* Must be 4 */
 	__u16  Reserved;
 } __packed;
@@ -942,7 +942,7 @@ struct smb2_echo_rsp {
 #define SMB2_REOPEN			0x10
 
 struct smb2_query_directory_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize; /* Must be 33 */
 	__u8   FileInformationClass;
 	__u8   Flags;
@@ -989,7 +989,7 @@ struct smb2_query_directory_rsp {
 #define SL_INDEX_SPECIFIED	0x00000004
 
 struct smb2_query_info_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize; /* Must be 41 */
 	__u8   InfoType;
 	__u8   FileInfoClass;
@@ -1013,7 +1013,7 @@ struct smb2_query_info_rsp {
 } __packed;
 
 struct smb2_set_info_req {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize; /* Must be 33 */
 	__u8   InfoType;
 	__u8   FileInfoClass;
@@ -1031,7 +1031,19 @@ struct smb2_set_info_rsp {
 	__le16 StructureSize; /* Must be 2 */
 } __packed;
 
-struct smb2_oplock_break {
+/* oplock break without an rfc1002 header */
+struct smb2_oplock_break_req {
+	struct smb2_sync_hdr sync_hdr;
+	__le16 StructureSize; /* Must be 24 */
+	__u8   OplockLevel;
+	__u8   Reserved;
+	__le32 Reserved2;
+	__u64  PersistentFid;
+	__u64  VolatileFid;
+} __packed;
+
+/* oplock break with an rfc1002 header */
+struct smb2_oplock_break_rsp {
 	struct smb2_hdr hdr;
 	__le16 StructureSize; /* Must be 24 */
 	__u8   OplockLevel;
@@ -1057,7 +1069,7 @@ struct smb2_lease_break {
 } __packed;
 
 struct smb2_lease_ack {
-	struct smb2_hdr hdr;
+	struct smb2_sync_hdr sync_hdr;
 	__le16 StructureSize; /* Must be 36 */
 	__le16 Reserved;
 	__le32 Flags;
diff --git a/fs/cifs/smb2proto.h b/fs/cifs/smb2proto.h
index e9ab522..05287b0 100644
--- a/fs/cifs/smb2proto.h
+++ b/fs/cifs/smb2proto.h
@@ -125,8 +125,7 @@ extern int SMB2_open(const unsigned int xid, struct cifs_open_parms *oparms,
 		     struct smb2_err_rsp **err_buf);
 extern int SMB2_ioctl(const unsigned int xid, struct cifs_tcon *tcon,
 		     u64 persistent_fid, u64 volatile_fid, u32 opcode,
-		     bool is_fsctl, bool use_ipc,
-		     char *in_data, u32 indatalen,
+		     bool is_fsctl, char *in_data, u32 indatalen,
 		     char **out_data, u32 *plen /* returned data len */);
 extern int SMB2_close(const unsigned int xid, struct cifs_tcon *tcon,
 		      u64 persistent_file_id, u64 volatile_file_id);
diff --git a/fs/cifs/smbdirect.c b/fs/cifs/smbdirect.c
new file mode 100644
index 0000000..5130492
--- /dev/null
+++ b/fs/cifs/smbdirect.c
@@ -0,0 +1,2610 @@
+/*
+ *   Copyright (C) 2017, Microsoft Corporation.
+ *
+ *   Author(s): Long Li <longli@microsoft.com>
+ *
+ *   This program is free software;  you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; either version 2 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY;  without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
+ *   the GNU General Public License for more details.
+ */
+#include <linux/module.h>
+#include <linux/highmem.h>
+#include "smbdirect.h"
+#include "cifs_debug.h"
+
+static struct smbd_response *get_empty_queue_buffer(
+		struct smbd_connection *info);
+static struct smbd_response *get_receive_buffer(
+		struct smbd_connection *info);
+static void put_receive_buffer(
+		struct smbd_connection *info,
+		struct smbd_response *response);
+static int allocate_receive_buffers(struct smbd_connection *info, int num_buf);
+static void destroy_receive_buffers(struct smbd_connection *info);
+
+static void put_empty_packet(
+		struct smbd_connection *info, struct smbd_response *response);
+static void enqueue_reassembly(
+		struct smbd_connection *info,
+		struct smbd_response *response, int data_length);
+static struct smbd_response *_get_first_reassembly(
+		struct smbd_connection *info);
+
+static int smbd_post_recv(
+		struct smbd_connection *info,
+		struct smbd_response *response);
+
+static int smbd_post_send_empty(struct smbd_connection *info);
+static int smbd_post_send_data(
+		struct smbd_connection *info,
+		struct kvec *iov, int n_vec, int remaining_data_length);
+static int smbd_post_send_page(struct smbd_connection *info,
+		struct page *page, unsigned long offset,
+		size_t size, int remaining_data_length);
+
+static void destroy_mr_list(struct smbd_connection *info);
+static int allocate_mr_list(struct smbd_connection *info);
+
+/* SMBD version number */
+#define SMBD_V1	0x0100
+
+/* Port numbers for SMBD transport */
+#define SMB_PORT	445
+#define SMBD_PORT	5445
+
+/* Address lookup and resolve timeout in ms */
+#define RDMA_RESOLVE_TIMEOUT	5000
+
+/* SMBD negotiation timeout in seconds */
+#define SMBD_NEGOTIATE_TIMEOUT	120
+
+/* SMBD minimum receive size and fragmented sized defined in [MS-SMBD] */
+#define SMBD_MIN_RECEIVE_SIZE		128
+#define SMBD_MIN_FRAGMENTED_SIZE	131072
+
+/*
+ * Default maximum number of RDMA read/write outstanding on this connection
+ * This value is possibly decreased during QP creation on hardware limit
+ */
+#define SMBD_CM_RESPONDER_RESOURCES	32
+
+/* Maximum number of retries on data transfer operations */
+#define SMBD_CM_RETRY			6
+/* No need to retry on Receiver Not Ready since SMBD manages credits */
+#define SMBD_CM_RNR_RETRY		0
+
+/*
+ * User configurable initial values per SMBD transport connection
+ * as defined in [MS-SMBD] 3.1.1.1
+ * Those may change after a SMBD negotiation
+ */
+/* The local peer's maximum number of credits to grant to the peer */
+int smbd_receive_credit_max = 255;
+
+/* The remote peer's credit request of local peer */
+int smbd_send_credit_target = 255;
+
+/* The maximum single message size can be sent to remote peer */
+int smbd_max_send_size = 1364;
+
+/*  The maximum fragmented upper-layer payload receive size supported */
+int smbd_max_fragmented_recv_size = 1024 * 1024;
+
+/*  The maximum single-message size which can be received */
+int smbd_max_receive_size = 8192;
+
+/* The timeout to initiate send of a keepalive message on idle */
+int smbd_keep_alive_interval = 120;
+
+/*
+ * User configurable initial values for RDMA transport
+ * The actual values used may be lower and are limited to hardware capabilities
+ */
+/* Default maximum number of SGEs in a RDMA write/read */
+int smbd_max_frmr_depth = 2048;
+
+/* If payload is less than this byte, use RDMA send/recv not read/write */
+int rdma_readwrite_threshold = 4096;
+
+/* Transport logging functions
+ * Logging are defined as classes. They can be OR'ed to define the actual
+ * logging level via module parameter smbd_logging_class
+ * e.g. cifs.smbd_logging_class=0xa0 will log all log_rdma_recv() and
+ * log_rdma_event()
+ */
+#define LOG_OUTGOING			0x1
+#define LOG_INCOMING			0x2
+#define LOG_READ			0x4
+#define LOG_WRITE			0x8
+#define LOG_RDMA_SEND			0x10
+#define LOG_RDMA_RECV			0x20
+#define LOG_KEEP_ALIVE			0x40
+#define LOG_RDMA_EVENT			0x80
+#define LOG_RDMA_MR			0x100
+static unsigned int smbd_logging_class;
+module_param(smbd_logging_class, uint, 0644);
+MODULE_PARM_DESC(smbd_logging_class,
+	"Logging class for SMBD transport 0x0 to 0x100");
+
+#define ERR		0x0
+#define INFO		0x1
+static unsigned int smbd_logging_level = ERR;
+module_param(smbd_logging_level, uint, 0644);
+MODULE_PARM_DESC(smbd_logging_level,
+	"Logging level for SMBD transport, 0 (default): error, 1: info");
+
+#define log_rdma(level, class, fmt, args...)				\
+do {									\
+	if (level <= smbd_logging_level || class & smbd_logging_class)	\
+		cifs_dbg(VFS, "%s:%d " fmt, __func__, __LINE__, ##args);\
+} while (0)
+
+#define log_outgoing(level, fmt, args...) \
+		log_rdma(level, LOG_OUTGOING, fmt, ##args)
+#define log_incoming(level, fmt, args...) \
+		log_rdma(level, LOG_INCOMING, fmt, ##args)
+#define log_read(level, fmt, args...)	log_rdma(level, LOG_READ, fmt, ##args)
+#define log_write(level, fmt, args...)	log_rdma(level, LOG_WRITE, fmt, ##args)
+#define log_rdma_send(level, fmt, args...) \
+		log_rdma(level, LOG_RDMA_SEND, fmt, ##args)
+#define log_rdma_recv(level, fmt, args...) \
+		log_rdma(level, LOG_RDMA_RECV, fmt, ##args)
+#define log_keep_alive(level, fmt, args...) \
+		log_rdma(level, LOG_KEEP_ALIVE, fmt, ##args)
+#define log_rdma_event(level, fmt, args...) \
+		log_rdma(level, LOG_RDMA_EVENT, fmt, ##args)
+#define log_rdma_mr(level, fmt, args...) \
+		log_rdma(level, LOG_RDMA_MR, fmt, ##args)
+
+/*
+ * Destroy the transport and related RDMA and memory resources
+ * Need to go through all the pending counters and make sure on one is using
+ * the transport while it is destroyed
+ */
+static void smbd_destroy_rdma_work(struct work_struct *work)
+{
+	struct smbd_response *response;
+	struct smbd_connection *info =
+		container_of(work, struct smbd_connection, destroy_work);
+	unsigned long flags;
+
+	log_rdma_event(INFO, "destroying qp\n");
+	ib_drain_qp(info->id->qp);
+	rdma_destroy_qp(info->id);
+
+	/* Unblock all I/O waiting on the send queue */
+	wake_up_interruptible_all(&info->wait_send_queue);
+
+	log_rdma_event(INFO, "cancelling idle timer\n");
+	cancel_delayed_work_sync(&info->idle_timer_work);
+	log_rdma_event(INFO, "cancelling send immediate work\n");
+	cancel_delayed_work_sync(&info->send_immediate_work);
+
+	log_rdma_event(INFO, "wait for all send to finish\n");
+	wait_event(info->wait_smbd_send_pending,
+		info->smbd_send_pending == 0);
+
+	log_rdma_event(INFO, "wait for all recv to finish\n");
+	wake_up_interruptible(&info->wait_reassembly_queue);
+	wait_event(info->wait_smbd_recv_pending,
+		info->smbd_recv_pending == 0);
+
+	log_rdma_event(INFO, "wait for all send posted to IB to finish\n");
+	wait_event(info->wait_send_pending,
+		atomic_read(&info->send_pending) == 0);
+	wait_event(info->wait_send_payload_pending,
+		atomic_read(&info->send_payload_pending) == 0);
+
+	log_rdma_event(INFO, "freeing mr list\n");
+	wake_up_interruptible_all(&info->wait_mr);
+	wait_event(info->wait_for_mr_cleanup,
+		atomic_read(&info->mr_used_count) == 0);
+	destroy_mr_list(info);
+
+	/* It's not posssible for upper layer to get to reassembly */
+	log_rdma_event(INFO, "drain the reassembly queue\n");
+	do {
+		spin_lock_irqsave(&info->reassembly_queue_lock, flags);
+		response = _get_first_reassembly(info);
+		if (response) {
+			list_del(&response->list);
+			spin_unlock_irqrestore(
+				&info->reassembly_queue_lock, flags);
+			put_receive_buffer(info, response);
+		}
+	} while (response);
+	spin_unlock_irqrestore(&info->reassembly_queue_lock, flags);
+	info->reassembly_data_length = 0;
+
+	log_rdma_event(INFO, "free receive buffers\n");
+	wait_event(info->wait_receive_queues,
+		info->count_receive_queue + info->count_empty_packet_queue
+			== info->receive_credit_max);
+	destroy_receive_buffers(info);
+
+	ib_free_cq(info->send_cq);
+	ib_free_cq(info->recv_cq);
+	ib_dealloc_pd(info->pd);
+	rdma_destroy_id(info->id);
+
+	/* free mempools */
+	mempool_destroy(info->request_mempool);
+	kmem_cache_destroy(info->request_cache);
+
+	mempool_destroy(info->response_mempool);
+	kmem_cache_destroy(info->response_cache);
+
+	info->transport_status = SMBD_DESTROYED;
+	wake_up_all(&info->wait_destroy);
+}
+
+static int smbd_process_disconnected(struct smbd_connection *info)
+{
+	schedule_work(&info->destroy_work);
+	return 0;
+}
+
+static void smbd_disconnect_rdma_work(struct work_struct *work)
+{
+	struct smbd_connection *info =
+		container_of(work, struct smbd_connection, disconnect_work);
+
+	if (info->transport_status == SMBD_CONNECTED) {
+		info->transport_status = SMBD_DISCONNECTING;
+		rdma_disconnect(info->id);
+	}
+}
+
+static void smbd_disconnect_rdma_connection(struct smbd_connection *info)
+{
+	queue_work(info->workqueue, &info->disconnect_work);
+}
+
+/* Upcall from RDMA CM */
+static int smbd_conn_upcall(
+		struct rdma_cm_id *id, struct rdma_cm_event *event)
+{
+	struct smbd_connection *info = id->context;
+
+	log_rdma_event(INFO, "event=%d status=%d\n",
+		event->event, event->status);
+
+	switch (event->event) {
+	case RDMA_CM_EVENT_ADDR_RESOLVED:
+	case RDMA_CM_EVENT_ROUTE_RESOLVED:
+		info->ri_rc = 0;
+		complete(&info->ri_done);
+		break;
+
+	case RDMA_CM_EVENT_ADDR_ERROR:
+		info->ri_rc = -EHOSTUNREACH;
+		complete(&info->ri_done);
+		break;
+
+	case RDMA_CM_EVENT_ROUTE_ERROR:
+		info->ri_rc = -ENETUNREACH;
+		complete(&info->ri_done);
+		break;
+
+	case RDMA_CM_EVENT_ESTABLISHED:
+		log_rdma_event(INFO, "connected event=%d\n", event->event);
+		info->transport_status = SMBD_CONNECTED;
+		wake_up_interruptible(&info->conn_wait);
+		break;
+
+	case RDMA_CM_EVENT_CONNECT_ERROR:
+	case RDMA_CM_EVENT_UNREACHABLE:
+	case RDMA_CM_EVENT_REJECTED:
+		log_rdma_event(INFO, "connecting failed event=%d\n", event->event);
+		info->transport_status = SMBD_DISCONNECTED;
+		wake_up_interruptible(&info->conn_wait);
+		break;
+
+	case RDMA_CM_EVENT_DEVICE_REMOVAL:
+	case RDMA_CM_EVENT_DISCONNECTED:
+		/* This happenes when we fail the negotiation */
+		if (info->transport_status == SMBD_NEGOTIATE_FAILED) {
+			info->transport_status = SMBD_DISCONNECTED;
+			wake_up(&info->conn_wait);
+			break;
+		}
+
+		info->transport_status = SMBD_DISCONNECTED;
+		smbd_process_disconnected(info);
+		break;
+
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+/* Upcall from RDMA QP */
+static void
+smbd_qp_async_error_upcall(struct ib_event *event, void *context)
+{
+	struct smbd_connection *info = context;
+
+	log_rdma_event(ERR, "%s on device %s info %p\n",
+		ib_event_msg(event->event), event->device->name, info);
+
+	switch (event->event) {
+	case IB_EVENT_CQ_ERR:
+	case IB_EVENT_QP_FATAL:
+		smbd_disconnect_rdma_connection(info);
+
+	default:
+		break;
+	}
+}
+
+static inline void *smbd_request_payload(struct smbd_request *request)
+{
+	return (void *)request->packet;
+}
+
+static inline void *smbd_response_payload(struct smbd_response *response)
+{
+	return (void *)response->packet;
+}
+
+/* Called when a RDMA send is done */
+static void send_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+	int i;
+	struct smbd_request *request =
+		container_of(wc->wr_cqe, struct smbd_request, cqe);
+
+	log_rdma_send(INFO, "smbd_request %p completed wc->status=%d\n",
+		request, wc->status);
+
+	if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_SEND) {
+		log_rdma_send(ERR, "wc->status=%d wc->opcode=%d\n",
+			wc->status, wc->opcode);
+		smbd_disconnect_rdma_connection(request->info);
+	}
+
+	for (i = 0; i < request->num_sge; i++)
+		ib_dma_unmap_single(request->info->id->device,
+			request->sge[i].addr,
+			request->sge[i].length,
+			DMA_TO_DEVICE);
+
+	if (request->has_payload) {
+		if (atomic_dec_and_test(&request->info->send_payload_pending))
+			wake_up(&request->info->wait_send_payload_pending);
+	} else {
+		if (atomic_dec_and_test(&request->info->send_pending))
+			wake_up(&request->info->wait_send_pending);
+	}
+
+	mempool_free(request, request->info->request_mempool);
+}
+
+static void dump_smbd_negotiate_resp(struct smbd_negotiate_resp *resp)
+{
+	log_rdma_event(INFO, "resp message min_version %u max_version %u "
+		"negotiated_version %u credits_requested %u "
+		"credits_granted %u status %u max_readwrite_size %u "
+		"preferred_send_size %u max_receive_size %u "
+		"max_fragmented_size %u\n",
+		resp->min_version, resp->max_version, resp->negotiated_version,
+		resp->credits_requested, resp->credits_granted, resp->status,
+		resp->max_readwrite_size, resp->preferred_send_size,
+		resp->max_receive_size, resp->max_fragmented_size);
+}
+
+/*
+ * Process a negotiation response message, according to [MS-SMBD]3.1.5.7
+ * response, packet_length: the negotiation response message
+ * return value: true if negotiation is a success, false if failed
+ */
+static bool process_negotiation_response(
+		struct smbd_response *response, int packet_length)
+{
+	struct smbd_connection *info = response->info;
+	struct smbd_negotiate_resp *packet = smbd_response_payload(response);
+
+	if (packet_length < sizeof(struct smbd_negotiate_resp)) {
+		log_rdma_event(ERR,
+			"error: packet_length=%d\n", packet_length);
+		return false;
+	}
+
+	if (le16_to_cpu(packet->negotiated_version) != SMBD_V1) {
+		log_rdma_event(ERR, "error: negotiated_version=%x\n",
+			le16_to_cpu(packet->negotiated_version));
+		return false;
+	}
+	info->protocol = le16_to_cpu(packet->negotiated_version);
+
+	if (packet->credits_requested == 0) {
+		log_rdma_event(ERR, "error: credits_requested==0\n");
+		return false;
+	}
+	info->receive_credit_target = le16_to_cpu(packet->credits_requested);
+
+	if (packet->credits_granted == 0) {
+		log_rdma_event(ERR, "error: credits_granted==0\n");
+		return false;
+	}
+	atomic_set(&info->send_credits, le16_to_cpu(packet->credits_granted));
+
+	atomic_set(&info->receive_credits, 0);
+
+	if (le32_to_cpu(packet->preferred_send_size) > info->max_receive_size) {
+		log_rdma_event(ERR, "error: preferred_send_size=%d\n",
+			le32_to_cpu(packet->preferred_send_size));
+		return false;
+	}
+	info->max_receive_size = le32_to_cpu(packet->preferred_send_size);
+
+	if (le32_to_cpu(packet->max_receive_size) < SMBD_MIN_RECEIVE_SIZE) {
+		log_rdma_event(ERR, "error: max_receive_size=%d\n",
+			le32_to_cpu(packet->max_receive_size));
+		return false;
+	}
+	info->max_send_size = min_t(int, info->max_send_size,
+					le32_to_cpu(packet->max_receive_size));
+
+	if (le32_to_cpu(packet->max_fragmented_size) <
+			SMBD_MIN_FRAGMENTED_SIZE) {
+		log_rdma_event(ERR, "error: max_fragmented_size=%d\n",
+			le32_to_cpu(packet->max_fragmented_size));
+		return false;
+	}
+	info->max_fragmented_send_size =
+		le32_to_cpu(packet->max_fragmented_size);
+	info->rdma_readwrite_threshold =
+		rdma_readwrite_threshold > info->max_fragmented_send_size ?
+		info->max_fragmented_send_size :
+		rdma_readwrite_threshold;
+
+
+	info->max_readwrite_size = min_t(u32,
+			le32_to_cpu(packet->max_readwrite_size),
+			info->max_frmr_depth * PAGE_SIZE);
+	info->max_frmr_depth = info->max_readwrite_size / PAGE_SIZE;
+
+	return true;
+}
+
+/*
+ * Check and schedule to send an immediate packet
+ * This is used to extend credtis to remote peer to keep the transport busy
+ */
+static void check_and_send_immediate(struct smbd_connection *info)
+{
+	if (info->transport_status != SMBD_CONNECTED)
+		return;
+
+	info->send_immediate = true;
+
+	/*
+	 * Promptly send a packet if our peer is running low on receive
+	 * credits
+	 */
+	if (atomic_read(&info->receive_credits) <
+		info->receive_credit_target - 1)
+		queue_delayed_work(
+			info->workqueue, &info->send_immediate_work, 0);
+}
+
+static void smbd_post_send_credits(struct work_struct *work)
+{
+	int ret = 0;
+	int use_receive_queue = 1;
+	int rc;
+	struct smbd_response *response;
+	struct smbd_connection *info =
+		container_of(work, struct smbd_connection,
+			post_send_credits_work);
+
+	if (info->transport_status != SMBD_CONNECTED) {
+		wake_up(&info->wait_receive_queues);
+		return;
+	}
+
+	if (info->receive_credit_target >
+		atomic_read(&info->receive_credits)) {
+		while (true) {
+			if (use_receive_queue)
+				response = get_receive_buffer(info);
+			else
+				response = get_empty_queue_buffer(info);
+			if (!response) {
+				/* now switch to emtpy packet queue */
+				if (use_receive_queue) {
+					use_receive_queue = 0;
+					continue;
+				} else
+					break;
+			}
+
+			response->type = SMBD_TRANSFER_DATA;
+			response->first_segment = false;
+			rc = smbd_post_recv(info, response);
+			if (rc) {
+				log_rdma_recv(ERR,
+					"post_recv failed rc=%d\n", rc);
+				put_receive_buffer(info, response);
+				break;
+			}
+
+			ret++;
+		}
+	}
+
+	spin_lock(&info->lock_new_credits_offered);
+	info->new_credits_offered += ret;
+	spin_unlock(&info->lock_new_credits_offered);
+
+	atomic_add(ret, &info->receive_credits);
+
+	/* Check if we can post new receive and grant credits to peer */
+	check_and_send_immediate(info);
+}
+
+static void smbd_recv_done_work(struct work_struct *work)
+{
+	struct smbd_connection *info =
+		container_of(work, struct smbd_connection, recv_done_work);
+
+	/*
+	 * We may have new send credits granted from remote peer
+	 * If any sender is blcoked on lack of credets, unblock it
+	 */
+	if (atomic_read(&info->send_credits))
+		wake_up_interruptible(&info->wait_send_queue);
+
+	/*
+	 * Check if we need to send something to remote peer to
+	 * grant more credits or respond to KEEP_ALIVE packet
+	 */
+	check_and_send_immediate(info);
+}
+
+/* Called from softirq, when recv is done */
+static void recv_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct smbd_data_transfer *data_transfer;
+	struct smbd_response *response =
+		container_of(wc->wr_cqe, struct smbd_response, cqe);
+	struct smbd_connection *info = response->info;
+	int data_length = 0;
+
+	log_rdma_recv(INFO, "response=%p type=%d wc status=%d wc opcode %d "
+		      "byte_len=%d pkey_index=%x\n",
+		response, response->type, wc->status, wc->opcode,
+		wc->byte_len, wc->pkey_index);
+
+	if (wc->status != IB_WC_SUCCESS || wc->opcode != IB_WC_RECV) {
+		log_rdma_recv(INFO, "wc->status=%d opcode=%d\n",
+			wc->status, wc->opcode);
+		smbd_disconnect_rdma_connection(info);
+		goto error;
+	}
+
+	ib_dma_sync_single_for_cpu(
+		wc->qp->device,
+		response->sge.addr,
+		response->sge.length,
+		DMA_FROM_DEVICE);
+
+	switch (response->type) {
+	/* SMBD negotiation response */
+	case SMBD_NEGOTIATE_RESP:
+		dump_smbd_negotiate_resp(smbd_response_payload(response));
+		info->full_packet_received = true;
+		info->negotiate_done =
+			process_negotiation_response(response, wc->byte_len);
+		complete(&info->negotiate_completion);
+		break;
+
+	/* SMBD data transfer packet */
+	case SMBD_TRANSFER_DATA:
+		data_transfer = smbd_response_payload(response);
+		data_length = le32_to_cpu(data_transfer->data_length);
+
+		/*
+		 * If this is a packet with data playload place the data in
+		 * reassembly queue and wake up the reading thread
+		 */
+		if (data_length) {
+			if (info->full_packet_received)
+				response->first_segment = true;
+
+			if (le32_to_cpu(data_transfer->remaining_data_length))
+				info->full_packet_received = false;
+			else
+				info->full_packet_received = true;
+
+			enqueue_reassembly(
+				info,
+				response,
+				data_length);
+		} else
+			put_empty_packet(info, response);
+
+		if (data_length)
+			wake_up_interruptible(&info->wait_reassembly_queue);
+
+		atomic_dec(&info->receive_credits);
+		info->receive_credit_target =
+			le16_to_cpu(data_transfer->credits_requested);
+		atomic_add(le16_to_cpu(data_transfer->credits_granted),
+			&info->send_credits);
+
+		log_incoming(INFO, "data flags %d data_offset %d "
+			"data_length %d remaining_data_length %d\n",
+			le16_to_cpu(data_transfer->flags),
+			le32_to_cpu(data_transfer->data_offset),
+			le32_to_cpu(data_transfer->data_length),
+			le32_to_cpu(data_transfer->remaining_data_length));
+
+		/* Send a KEEP_ALIVE response right away if requested */
+		info->keep_alive_requested = KEEP_ALIVE_NONE;
+		if (le16_to_cpu(data_transfer->flags) &
+				SMB_DIRECT_RESPONSE_REQUESTED) {
+			info->keep_alive_requested = KEEP_ALIVE_PENDING;
+		}
+
+		queue_work(info->workqueue, &info->recv_done_work);
+		return;
+
+	default:
+		log_rdma_recv(ERR,
+			"unexpected response type=%d\n", response->type);
+	}
+
+error:
+	put_receive_buffer(info, response);
+}
+
+static struct rdma_cm_id *smbd_create_id(
+		struct smbd_connection *info,
+		struct sockaddr *dstaddr, int port)
+{
+	struct rdma_cm_id *id;
+	int rc;
+	__be16 *sport;
+
+	id = rdma_create_id(&init_net, smbd_conn_upcall, info,
+		RDMA_PS_TCP, IB_QPT_RC);
+	if (IS_ERR(id)) {
+		rc = PTR_ERR(id);
+		log_rdma_event(ERR, "rdma_create_id() failed %i\n", rc);
+		return id;
+	}
+
+	if (dstaddr->sa_family == AF_INET6)
+		sport = &((struct sockaddr_in6 *)dstaddr)->sin6_port;
+	else
+		sport = &((struct sockaddr_in *)dstaddr)->sin_port;
+
+	*sport = htons(port);
+
+	init_completion(&info->ri_done);
+	info->ri_rc = -ETIMEDOUT;
+
+	rc = rdma_resolve_addr(id, NULL, (struct sockaddr *)dstaddr,
+		RDMA_RESOLVE_TIMEOUT);
+	if (rc) {
+		log_rdma_event(ERR, "rdma_resolve_addr() failed %i\n", rc);
+		goto out;
+	}
+	wait_for_completion_interruptible_timeout(
+		&info->ri_done, msecs_to_jiffies(RDMA_RESOLVE_TIMEOUT));
+	rc = info->ri_rc;
+	if (rc) {
+		log_rdma_event(ERR, "rdma_resolve_addr() completed %i\n", rc);
+		goto out;
+	}
+
+	info->ri_rc = -ETIMEDOUT;
+	rc = rdma_resolve_route(id, RDMA_RESOLVE_TIMEOUT);
+	if (rc) {
+		log_rdma_event(ERR, "rdma_resolve_route() failed %i\n", rc);
+		goto out;
+	}
+	wait_for_completion_interruptible_timeout(
+		&info->ri_done, msecs_to_jiffies(RDMA_RESOLVE_TIMEOUT));
+	rc = info->ri_rc;
+	if (rc) {
+		log_rdma_event(ERR, "rdma_resolve_route() completed %i\n", rc);
+		goto out;
+	}
+
+	return id;
+
+out:
+	rdma_destroy_id(id);
+	return ERR_PTR(rc);
+}
+
+/*
+ * Test if FRWR (Fast Registration Work Requests) is supported on the device
+ * This implementation requries FRWR on RDMA read/write
+ * return value: true if it is supported
+ */
+static bool frwr_is_supported(struct ib_device_attr *attrs)
+{
+	if (!(attrs->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS))
+		return false;
+	if (attrs->max_fast_reg_page_list_len == 0)
+		return false;
+	return true;
+}
+
+static int smbd_ia_open(
+		struct smbd_connection *info,
+		struct sockaddr *dstaddr, int port)
+{
+	int rc;
+
+	info->id = smbd_create_id(info, dstaddr, port);
+	if (IS_ERR(info->id)) {
+		rc = PTR_ERR(info->id);
+		goto out1;
+	}
+
+	if (!frwr_is_supported(&info->id->device->attrs)) {
+		log_rdma_event(ERR,
+			"Fast Registration Work Requests "
+			"(FRWR) is not supported\n");
+		log_rdma_event(ERR,
+			"Device capability flags = %llx "
+			"max_fast_reg_page_list_len = %u\n",
+			info->id->device->attrs.device_cap_flags,
+			info->id->device->attrs.max_fast_reg_page_list_len);
+		rc = -EPROTONOSUPPORT;
+		goto out2;
+	}
+	info->max_frmr_depth = min_t(int,
+		smbd_max_frmr_depth,
+		info->id->device->attrs.max_fast_reg_page_list_len);
+	info->mr_type = IB_MR_TYPE_MEM_REG;
+	if (info->id->device->attrs.device_cap_flags & IB_DEVICE_SG_GAPS_REG)
+		info->mr_type = IB_MR_TYPE_SG_GAPS;
+
+	info->pd = ib_alloc_pd(info->id->device, 0);
+	if (IS_ERR(info->pd)) {
+		rc = PTR_ERR(info->pd);
+		log_rdma_event(ERR, "ib_alloc_pd() returned %d\n", rc);
+		goto out2;
+	}
+
+	return 0;
+
+out2:
+	rdma_destroy_id(info->id);
+	info->id = NULL;
+
+out1:
+	return rc;
+}
+
+/*
+ * Send a negotiation request message to the peer
+ * The negotiation procedure is in [MS-SMBD] 3.1.5.2 and 3.1.5.3
+ * After negotiation, the transport is connected and ready for
+ * carrying upper layer SMB payload
+ */
+static int smbd_post_send_negotiate_req(struct smbd_connection *info)
+{
+	struct ib_send_wr send_wr, *send_wr_fail;
+	int rc = -ENOMEM;
+	struct smbd_request *request;
+	struct smbd_negotiate_req *packet;
+
+	request = mempool_alloc(info->request_mempool, GFP_KERNEL);
+	if (!request)
+		return rc;
+
+	request->info = info;
+
+	packet = smbd_request_payload(request);
+	packet->min_version = cpu_to_le16(SMBD_V1);
+	packet->max_version = cpu_to_le16(SMBD_V1);
+	packet->reserved = 0;
+	packet->credits_requested = cpu_to_le16(info->send_credit_target);
+	packet->preferred_send_size = cpu_to_le32(info->max_send_size);
+	packet->max_receive_size = cpu_to_le32(info->max_receive_size);
+	packet->max_fragmented_size =
+		cpu_to_le32(info->max_fragmented_recv_size);
+
+	request->num_sge = 1;
+	request->sge[0].addr = ib_dma_map_single(
+				info->id->device, (void *)packet,
+				sizeof(*packet), DMA_TO_DEVICE);
+	if (ib_dma_mapping_error(info->id->device, request->sge[0].addr)) {
+		rc = -EIO;
+		goto dma_mapping_failed;
+	}
+
+	request->sge[0].length = sizeof(*packet);
+	request->sge[0].lkey = info->pd->local_dma_lkey;
+
+	ib_dma_sync_single_for_device(
+		info->id->device, request->sge[0].addr,
+		request->sge[0].length, DMA_TO_DEVICE);
+
+	request->cqe.done = send_done;
+
+	send_wr.next = NULL;
+	send_wr.wr_cqe = &request->cqe;
+	send_wr.sg_list = request->sge;
+	send_wr.num_sge = request->num_sge;
+	send_wr.opcode = IB_WR_SEND;
+	send_wr.send_flags = IB_SEND_SIGNALED;
+
+	log_rdma_send(INFO, "sge addr=%llx length=%x lkey=%x\n",
+		request->sge[0].addr,
+		request->sge[0].length, request->sge[0].lkey);
+
+	request->has_payload = false;
+	atomic_inc(&info->send_pending);
+	rc = ib_post_send(info->id->qp, &send_wr, &send_wr_fail);
+	if (!rc)
+		return 0;
+
+	/* if we reach here, post send failed */
+	log_rdma_send(ERR, "ib_post_send failed rc=%d\n", rc);
+	atomic_dec(&info->send_pending);
+	ib_dma_unmap_single(info->id->device, request->sge[0].addr,
+		request->sge[0].length, DMA_TO_DEVICE);
+
+dma_mapping_failed:
+	mempool_free(request, info->request_mempool);
+	return rc;
+}
+
+/*
+ * Extend the credits to remote peer
+ * This implements [MS-SMBD] 3.1.5.9
+ * The idea is that we should extend credits to remote peer as quickly as
+ * it's allowed, to maintain data flow. We allocate as much receive
+ * buffer as possible, and extend the receive credits to remote peer
+ * return value: the new credtis being granted.
+ */
+static int manage_credits_prior_sending(struct smbd_connection *info)
+{
+	int new_credits;
+
+	spin_lock(&info->lock_new_credits_offered);
+	new_credits = info->new_credits_offered;
+	info->new_credits_offered = 0;
+	spin_unlock(&info->lock_new_credits_offered);
+
+	return new_credits;
+}
+
+/*
+ * Check if we need to send a KEEP_ALIVE message
+ * The idle connection timer triggers a KEEP_ALIVE message when expires
+ * SMB_DIRECT_RESPONSE_REQUESTED is set in the message flag to have peer send
+ * back a response.
+ * return value:
+ * 1 if SMB_DIRECT_RESPONSE_REQUESTED needs to be set
+ * 0: otherwise
+ */
+static int manage_keep_alive_before_sending(struct smbd_connection *info)
+{
+	if (info->keep_alive_requested == KEEP_ALIVE_PENDING) {
+		info->keep_alive_requested = KEEP_ALIVE_SENT;
+		return 1;
+	}
+	return 0;
+}
+
+/*
+ * Build and prepare the SMBD packet header
+ * This function waits for avaialbe send credits and build a SMBD packet
+ * header. The caller then optional append payload to the packet after
+ * the header
+ * intput values
+ * size: the size of the payload
+ * remaining_data_length: remaining data to send if this is part of a
+ * fragmented packet
+ * output values
+ * request_out: the request allocated from this function
+ * return values: 0 on success, otherwise actual error code returned
+ */
+static int smbd_create_header(struct smbd_connection *info,
+		int size, int remaining_data_length,
+		struct smbd_request **request_out)
+{
+	struct smbd_request *request;
+	struct smbd_data_transfer *packet;
+	int header_length;
+	int rc;
+
+	/* Wait for send credits. A SMBD packet needs one credit */
+	rc = wait_event_interruptible(info->wait_send_queue,
+		atomic_read(&info->send_credits) > 0 ||
+		info->transport_status != SMBD_CONNECTED);
+	if (rc)
+		return rc;
+
+	if (info->transport_status != SMBD_CONNECTED) {
+		log_outgoing(ERR, "disconnected not sending\n");
+		return -ENOENT;
+	}
+	atomic_dec(&info->send_credits);
+
+	request = mempool_alloc(info->request_mempool, GFP_KERNEL);
+	if (!request) {
+		rc = -ENOMEM;
+		goto err;
+	}
+
+	request->info = info;
+
+	/* Fill in the packet header */
+	packet = smbd_request_payload(request);
+	packet->credits_requested = cpu_to_le16(info->send_credit_target);
+	packet->credits_granted =
+		cpu_to_le16(manage_credits_prior_sending(info));
+	info->send_immediate = false;
+
+	packet->flags = 0;
+	if (manage_keep_alive_before_sending(info))
+		packet->flags |= cpu_to_le16(SMB_DIRECT_RESPONSE_REQUESTED);
+
+	packet->reserved = 0;
+	if (!size)
+		packet->data_offset = 0;
+	else
+		packet->data_offset = cpu_to_le32(24);
+	packet->data_length = cpu_to_le32(size);
+	packet->remaining_data_length = cpu_to_le32(remaining_data_length);
+	packet->padding = 0;
+
+	log_outgoing(INFO, "credits_requested=%d credits_granted=%d "
+		"data_offset=%d data_length=%d remaining_data_length=%d\n",
+		le16_to_cpu(packet->credits_requested),
+		le16_to_cpu(packet->credits_granted),
+		le32_to_cpu(packet->data_offset),
+		le32_to_cpu(packet->data_length),
+		le32_to_cpu(packet->remaining_data_length));
+
+	/* Map the packet to DMA */
+	header_length = sizeof(struct smbd_data_transfer);
+	/* If this is a packet without payload, don't send padding */
+	if (!size)
+		header_length = offsetof(struct smbd_data_transfer, padding);
+
+	request->num_sge = 1;
+	request->sge[0].addr = ib_dma_map_single(info->id->device,
+						 (void *)packet,
+						 header_length,
+						 DMA_BIDIRECTIONAL);
+	if (ib_dma_mapping_error(info->id->device, request->sge[0].addr)) {
+		mempool_free(request, info->request_mempool);
+		rc = -EIO;
+		goto err;
+	}
+
+	request->sge[0].length = header_length;
+	request->sge[0].lkey = info->pd->local_dma_lkey;
+
+	*request_out = request;
+	return 0;
+
+err:
+	atomic_inc(&info->send_credits);
+	return rc;
+}
+
+static void smbd_destroy_header(struct smbd_connection *info,
+		struct smbd_request *request)
+{
+
+	ib_dma_unmap_single(info->id->device,
+			    request->sge[0].addr,
+			    request->sge[0].length,
+			    DMA_TO_DEVICE);
+	mempool_free(request, info->request_mempool);
+	atomic_inc(&info->send_credits);
+}
+
+/* Post the send request */
+static int smbd_post_send(struct smbd_connection *info,
+		struct smbd_request *request, bool has_payload)
+{
+	struct ib_send_wr send_wr, *send_wr_fail;
+	int rc, i;
+
+	for (i = 0; i < request->num_sge; i++) {
+		log_rdma_send(INFO,
+			"rdma_request sge[%d] addr=%llu legnth=%u\n",
+			i, request->sge[0].addr, request->sge[0].length);
+		ib_dma_sync_single_for_device(
+			info->id->device,
+			request->sge[i].addr,
+			request->sge[i].length,
+			DMA_TO_DEVICE);
+	}
+
+	request->cqe.done = send_done;
+
+	send_wr.next = NULL;
+	send_wr.wr_cqe = &request->cqe;
+	send_wr.sg_list = request->sge;
+	send_wr.num_sge = request->num_sge;
+	send_wr.opcode = IB_WR_SEND;
+	send_wr.send_flags = IB_SEND_SIGNALED;
+
+	if (has_payload) {
+		request->has_payload = true;
+		atomic_inc(&info->send_payload_pending);
+	} else {
+		request->has_payload = false;
+		atomic_inc(&info->send_pending);
+	}
+
+	rc = ib_post_send(info->id->qp, &send_wr, &send_wr_fail);
+	if (rc) {
+		log_rdma_send(ERR, "ib_post_send failed rc=%d\n", rc);
+		if (has_payload) {
+			if (atomic_dec_and_test(&info->send_payload_pending))
+				wake_up(&info->wait_send_payload_pending);
+		} else {
+			if (atomic_dec_and_test(&info->send_pending))
+				wake_up(&info->wait_send_pending);
+		}
+	} else
+		/* Reset timer for idle connection after packet is sent */
+		mod_delayed_work(info->workqueue, &info->idle_timer_work,
+			info->keep_alive_interval*HZ);
+
+	return rc;
+}
+
+static int smbd_post_send_sgl(struct smbd_connection *info,
+	struct scatterlist *sgl, int data_length, int remaining_data_length)
+{
+	int num_sgs;
+	int i, rc;
+	struct smbd_request *request;
+	struct scatterlist *sg;
+
+	rc = smbd_create_header(
+		info, data_length, remaining_data_length, &request);
+	if (rc)
+		return rc;
+
+	num_sgs = sgl ? sg_nents(sgl) : 0;
+	for_each_sg(sgl, sg, num_sgs, i) {
+		request->sge[i+1].addr =
+			ib_dma_map_page(info->id->device, sg_page(sg),
+			       sg->offset, sg->length, DMA_BIDIRECTIONAL);
+		if (ib_dma_mapping_error(
+				info->id->device, request->sge[i+1].addr)) {
+			rc = -EIO;
+			request->sge[i+1].addr = 0;
+			goto dma_mapping_failure;
+		}
+		request->sge[i+1].length = sg->length;
+		request->sge[i+1].lkey = info->pd->local_dma_lkey;
+		request->num_sge++;
+	}
+
+	rc = smbd_post_send(info, request, data_length);
+	if (!rc)
+		return 0;
+
+dma_mapping_failure:
+	for (i = 1; i < request->num_sge; i++)
+		if (request->sge[i].addr)
+			ib_dma_unmap_single(info->id->device,
+					    request->sge[i].addr,
+					    request->sge[i].length,
+					    DMA_TO_DEVICE);
+	smbd_destroy_header(info, request);
+	return rc;
+}
+
+/*
+ * Send a page
+ * page: the page to send
+ * offset: offset in the page to send
+ * size: length in the page to send
+ * remaining_data_length: remaining data to send in this payload
+ */
+static int smbd_post_send_page(struct smbd_connection *info, struct page *page,
+		unsigned long offset, size_t size, int remaining_data_length)
+{
+	struct scatterlist sgl;
+
+	sg_init_table(&sgl, 1);
+	sg_set_page(&sgl, page, size, offset);
+
+	return smbd_post_send_sgl(info, &sgl, size, remaining_data_length);
+}
+
+/*
+ * Send an empty message
+ * Empty message is used to extend credits to peer to for keep live
+ * while there is no upper layer payload to send at the time
+ */
+static int smbd_post_send_empty(struct smbd_connection *info)
+{
+	info->count_send_empty++;
+	return smbd_post_send_sgl(info, NULL, 0, 0);
+}
+
+/*
+ * Send a data buffer
+ * iov: the iov array describing the data buffers
+ * n_vec: number of iov array
+ * remaining_data_length: remaining data to send following this packet
+ * in segmented SMBD packet
+ */
+static int smbd_post_send_data(
+	struct smbd_connection *info, struct kvec *iov, int n_vec,
+	int remaining_data_length)
+{
+	int i;
+	u32 data_length = 0;
+	struct scatterlist sgl[SMBDIRECT_MAX_SGE];
+
+	if (n_vec > SMBDIRECT_MAX_SGE) {
+		cifs_dbg(VFS, "Can't fit data to SGL, n_vec=%d\n", n_vec);
+		return -ENOMEM;
+	}
+
+	sg_init_table(sgl, n_vec);
+	for (i = 0; i < n_vec; i++) {
+		data_length += iov[i].iov_len;
+		sg_set_buf(&sgl[i], iov[i].iov_base, iov[i].iov_len);
+	}
+
+	return smbd_post_send_sgl(info, sgl, data_length, remaining_data_length);
+}
+
+/*
+ * Post a receive request to the transport
+ * The remote peer can only send data when a receive request is posted
+ * The interaction is controlled by send/receive credit system
+ */
+static int smbd_post_recv(
+		struct smbd_connection *info, struct smbd_response *response)
+{
+	struct ib_recv_wr recv_wr, *recv_wr_fail = NULL;
+	int rc = -EIO;
+
+	response->sge.addr = ib_dma_map_single(
+				info->id->device, response->packet,
+				info->max_receive_size, DMA_FROM_DEVICE);
+	if (ib_dma_mapping_error(info->id->device, response->sge.addr))
+		return rc;
+
+	response->sge.length = info->max_receive_size;
+	response->sge.lkey = info->pd->local_dma_lkey;
+
+	response->cqe.done = recv_done;
+
+	recv_wr.wr_cqe = &response->cqe;
+	recv_wr.next = NULL;
+	recv_wr.sg_list = &response->sge;
+	recv_wr.num_sge = 1;
+
+	rc = ib_post_recv(info->id->qp, &recv_wr, &recv_wr_fail);
+	if (rc) {
+		ib_dma_unmap_single(info->id->device, response->sge.addr,
+				    response->sge.length, DMA_FROM_DEVICE);
+
+		log_rdma_recv(ERR, "ib_post_recv failed rc=%d\n", rc);
+	}
+
+	return rc;
+}
+
+/* Perform SMBD negotiate according to [MS-SMBD] 3.1.5.2 */
+static int smbd_negotiate(struct smbd_connection *info)
+{
+	int rc;
+	struct smbd_response *response = get_receive_buffer(info);
+
+	response->type = SMBD_NEGOTIATE_RESP;
+	rc = smbd_post_recv(info, response);
+	log_rdma_event(INFO,
+		"smbd_post_recv rc=%d iov.addr=%llx iov.length=%x "
+		"iov.lkey=%x\n",
+		rc, response->sge.addr,
+		response->sge.length, response->sge.lkey);
+	if (rc)
+		return rc;
+
+	init_completion(&info->negotiate_completion);
+	info->negotiate_done = false;
+	rc = smbd_post_send_negotiate_req(info);
+	if (rc)
+		return rc;
+
+	rc = wait_for_completion_interruptible_timeout(
+		&info->negotiate_completion, SMBD_NEGOTIATE_TIMEOUT * HZ);
+	log_rdma_event(INFO, "wait_for_completion_timeout rc=%d\n", rc);
+
+	if (info->negotiate_done)
+		return 0;
+
+	if (rc == 0)
+		rc = -ETIMEDOUT;
+	else if (rc == -ERESTARTSYS)
+		rc = -EINTR;
+	else
+		rc = -ENOTCONN;
+
+	return rc;
+}
+
+static void put_empty_packet(
+		struct smbd_connection *info, struct smbd_response *response)
+{
+	spin_lock(&info->empty_packet_queue_lock);
+	list_add_tail(&response->list, &info->empty_packet_queue);
+	info->count_empty_packet_queue++;
+	spin_unlock(&info->empty_packet_queue_lock);
+
+	queue_work(info->workqueue, &info->post_send_credits_work);
+}
+
+/*
+ * Implement Connection.FragmentReassemblyBuffer defined in [MS-SMBD] 3.1.1.1
+ * This is a queue for reassembling upper layer payload and present to upper
+ * layer. All the inncoming payload go to the reassembly queue, regardless of
+ * if reassembly is required. The uuper layer code reads from the queue for all
+ * incoming payloads.
+ * Put a received packet to the reassembly queue
+ * response: the packet received
+ * data_length: the size of payload in this packet
+ */
+static void enqueue_reassembly(
+	struct smbd_connection *info,
+	struct smbd_response *response,
+	int data_length)
+{
+	spin_lock(&info->reassembly_queue_lock);
+	list_add_tail(&response->list, &info->reassembly_queue);
+	info->reassembly_queue_length++;
+	/*
+	 * Make sure reassembly_data_length is updated after list and
+	 * reassembly_queue_length are updated. On the dequeue side
+	 * reassembly_data_length is checked without a lock to determine
+	 * if reassembly_queue_length and list is up to date
+	 */
+	virt_wmb();
+	info->reassembly_data_length += data_length;
+	spin_unlock(&info->reassembly_queue_lock);
+	info->count_reassembly_queue++;
+	info->count_enqueue_reassembly_queue++;
+}
+
+/*
+ * Get the first entry at the front of reassembly queue
+ * Caller is responsible for locking
+ * return value: the first entry if any, NULL if queue is empty
+ */
+static struct smbd_response *_get_first_reassembly(struct smbd_connection *info)
+{
+	struct smbd_response *ret = NULL;
+
+	if (!list_empty(&info->reassembly_queue)) {
+		ret = list_first_entry(
+			&info->reassembly_queue,
+			struct smbd_response, list);
+	}
+	return ret;
+}
+
+static struct smbd_response *get_empty_queue_buffer(
+		struct smbd_connection *info)
+{
+	struct smbd_response *ret = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&info->empty_packet_queue_lock, flags);
+	if (!list_empty(&info->empty_packet_queue)) {
+		ret = list_first_entry(
+			&info->empty_packet_queue,
+			struct smbd_response, list);
+		list_del(&ret->list);
+		info->count_empty_packet_queue--;
+	}
+	spin_unlock_irqrestore(&info->empty_packet_queue_lock, flags);
+
+	return ret;
+}
+
+/*
+ * Get a receive buffer
+ * For each remote send, we need to post a receive. The receive buffers are
+ * pre-allocated in advance.
+ * return value: the receive buffer, NULL if none is available
+ */
+static struct smbd_response *get_receive_buffer(struct smbd_connection *info)
+{
+	struct smbd_response *ret = NULL;
+	unsigned long flags;
+
+	spin_lock_irqsave(&info->receive_queue_lock, flags);
+	if (!list_empty(&info->receive_queue)) {
+		ret = list_first_entry(
+			&info->receive_queue,
+			struct smbd_response, list);
+		list_del(&ret->list);
+		info->count_receive_queue--;
+		info->count_get_receive_buffer++;
+	}
+	spin_unlock_irqrestore(&info->receive_queue_lock, flags);
+
+	return ret;
+}
+
+/*
+ * Return a receive buffer
+ * Upon returning of a receive buffer, we can post new receive and extend
+ * more receive credits to remote peer. This is done immediately after a
+ * receive buffer is returned.
+ */
+static void put_receive_buffer(
+	struct smbd_connection *info, struct smbd_response *response)
+{
+	unsigned long flags;
+
+	ib_dma_unmap_single(info->id->device, response->sge.addr,
+		response->sge.length, DMA_FROM_DEVICE);
+
+	spin_lock_irqsave(&info->receive_queue_lock, flags);
+	list_add_tail(&response->list, &info->receive_queue);
+	info->count_receive_queue++;
+	info->count_put_receive_buffer++;
+	spin_unlock_irqrestore(&info->receive_queue_lock, flags);
+
+	queue_work(info->workqueue, &info->post_send_credits_work);
+}
+
+/* Preallocate all receive buffer on transport establishment */
+static int allocate_receive_buffers(struct smbd_connection *info, int num_buf)
+{
+	int i;
+	struct smbd_response *response;
+
+	INIT_LIST_HEAD(&info->reassembly_queue);
+	spin_lock_init(&info->reassembly_queue_lock);
+	info->reassembly_data_length = 0;
+	info->reassembly_queue_length = 0;
+
+	INIT_LIST_HEAD(&info->receive_queue);
+	spin_lock_init(&info->receive_queue_lock);
+	info->count_receive_queue = 0;
+
+	INIT_LIST_HEAD(&info->empty_packet_queue);
+	spin_lock_init(&info->empty_packet_queue_lock);
+	info->count_empty_packet_queue = 0;
+
+	init_waitqueue_head(&info->wait_receive_queues);
+
+	for (i = 0; i < num_buf; i++) {
+		response = mempool_alloc(info->response_mempool, GFP_KERNEL);
+		if (!response)
+			goto allocate_failed;
+
+		response->info = info;
+		list_add_tail(&response->list, &info->receive_queue);
+		info->count_receive_queue++;
+	}
+
+	return 0;
+
+allocate_failed:
+	while (!list_empty(&info->receive_queue)) {
+		response = list_first_entry(
+				&info->receive_queue,
+				struct smbd_response, list);
+		list_del(&response->list);
+		info->count_receive_queue--;
+
+		mempool_free(response, info->response_mempool);
+	}
+	return -ENOMEM;
+}
+
+static void destroy_receive_buffers(struct smbd_connection *info)
+{
+	struct smbd_response *response;
+
+	while ((response = get_receive_buffer(info)))
+		mempool_free(response, info->response_mempool);
+
+	while ((response = get_empty_queue_buffer(info)))
+		mempool_free(response, info->response_mempool);
+}
+
+/*
+ * Check and send an immediate or keep alive packet
+ * The condition to send those packets are defined in [MS-SMBD] 3.1.1.1
+ * Connection.KeepaliveRequested and Connection.SendImmediate
+ * The idea is to extend credits to server as soon as it becomes available
+ */
+static void send_immediate_work(struct work_struct *work)
+{
+	struct smbd_connection *info = container_of(
+					work, struct smbd_connection,
+					send_immediate_work.work);
+
+	if (info->keep_alive_requested == KEEP_ALIVE_PENDING ||
+	    info->send_immediate) {
+		log_keep_alive(INFO, "send an empty message\n");
+		smbd_post_send_empty(info);
+	}
+}
+
+/* Implement idle connection timer [MS-SMBD] 3.1.6.2 */
+static void idle_connection_timer(struct work_struct *work)
+{
+	struct smbd_connection *info = container_of(
+					work, struct smbd_connection,
+					idle_timer_work.work);
+
+	if (info->keep_alive_requested != KEEP_ALIVE_NONE) {
+		log_keep_alive(ERR,
+			"error status info->keep_alive_requested=%d\n",
+			info->keep_alive_requested);
+		smbd_disconnect_rdma_connection(info);
+		return;
+	}
+
+	log_keep_alive(INFO, "about to send an empty idle message\n");
+	smbd_post_send_empty(info);
+
+	/* Setup the next idle timeout work */
+	queue_delayed_work(info->workqueue, &info->idle_timer_work,
+			info->keep_alive_interval*HZ);
+}
+
+/* Destroy this SMBD connection, called from upper layer */
+void smbd_destroy(struct smbd_connection *info)
+{
+	log_rdma_event(INFO, "destroying rdma session\n");
+
+	/* Kick off the disconnection process */
+	smbd_disconnect_rdma_connection(info);
+
+	log_rdma_event(INFO, "wait for transport being destroyed\n");
+	wait_event(info->wait_destroy,
+		info->transport_status == SMBD_DESTROYED);
+
+	destroy_workqueue(info->workqueue);
+	kfree(info);
+}
+
+/*
+ * Reconnect this SMBD connection, called from upper layer
+ * return value: 0 on success, or actual error code
+ */
+int smbd_reconnect(struct TCP_Server_Info *server)
+{
+	log_rdma_event(INFO, "reconnecting rdma session\n");
+
+	if (!server->smbd_conn) {
+		log_rdma_event(ERR, "rdma session already destroyed\n");
+		return -EINVAL;
+	}
+
+	/*
+	 * This is possible if transport is disconnected and we haven't received
+	 * notification from RDMA, but upper layer has detected timeout
+	 */
+	if (server->smbd_conn->transport_status == SMBD_CONNECTED) {
+		log_rdma_event(INFO, "disconnecting transport\n");
+		smbd_disconnect_rdma_connection(server->smbd_conn);
+	}
+
+	/* wait until the transport is destroyed */
+	wait_event(server->smbd_conn->wait_destroy,
+		server->smbd_conn->transport_status == SMBD_DESTROYED);
+
+	destroy_workqueue(server->smbd_conn->workqueue);
+	kfree(server->smbd_conn);
+
+	log_rdma_event(INFO, "creating rdma session\n");
+	server->smbd_conn = smbd_get_connection(
+		server, (struct sockaddr *) &server->dstaddr);
+
+	return server->smbd_conn ? 0 : -ENOENT;
+}
+
+static void destroy_caches_and_workqueue(struct smbd_connection *info)
+{
+	destroy_receive_buffers(info);
+	destroy_workqueue(info->workqueue);
+	mempool_destroy(info->response_mempool);
+	kmem_cache_destroy(info->response_cache);
+	mempool_destroy(info->request_mempool);
+	kmem_cache_destroy(info->request_cache);
+}
+
+#define MAX_NAME_LEN	80
+static int allocate_caches_and_workqueue(struct smbd_connection *info)
+{
+	char name[MAX_NAME_LEN];
+	int rc;
+
+	snprintf(name, MAX_NAME_LEN, "smbd_request_%p", info);
+	info->request_cache =
+		kmem_cache_create(
+			name,
+			sizeof(struct smbd_request) +
+				sizeof(struct smbd_data_transfer),
+			0, SLAB_HWCACHE_ALIGN, NULL);
+	if (!info->request_cache)
+		return -ENOMEM;
+
+	info->request_mempool =
+		mempool_create(info->send_credit_target, mempool_alloc_slab,
+			mempool_free_slab, info->request_cache);
+	if (!info->request_mempool)
+		goto out1;
+
+	snprintf(name, MAX_NAME_LEN, "smbd_response_%p", info);
+	info->response_cache =
+		kmem_cache_create(
+			name,
+			sizeof(struct smbd_response) +
+				info->max_receive_size,
+			0, SLAB_HWCACHE_ALIGN, NULL);
+	if (!info->response_cache)
+		goto out2;
+
+	info->response_mempool =
+		mempool_create(info->receive_credit_max, mempool_alloc_slab,
+		       mempool_free_slab, info->response_cache);
+	if (!info->response_mempool)
+		goto out3;
+
+	snprintf(name, MAX_NAME_LEN, "smbd_%p", info);
+	info->workqueue = create_workqueue(name);
+	if (!info->workqueue)
+		goto out4;
+
+	rc = allocate_receive_buffers(info, info->receive_credit_max);
+	if (rc) {
+		log_rdma_event(ERR, "failed to allocate receive buffers\n");
+		goto out5;
+	}
+
+	return 0;
+
+out5:
+	destroy_workqueue(info->workqueue);
+out4:
+	mempool_destroy(info->response_mempool);
+out3:
+	kmem_cache_destroy(info->response_cache);
+out2:
+	mempool_destroy(info->request_mempool);
+out1:
+	kmem_cache_destroy(info->request_cache);
+	return -ENOMEM;
+}
+
+/* Create a SMBD connection, called by upper layer */
+static struct smbd_connection *_smbd_get_connection(
+	struct TCP_Server_Info *server, struct sockaddr *dstaddr, int port)
+{
+	int rc;
+	struct smbd_connection *info;
+	struct rdma_conn_param conn_param;
+	struct ib_qp_init_attr qp_attr;
+	struct sockaddr_in *addr_in = (struct sockaddr_in *) dstaddr;
+	struct ib_port_immutable port_immutable;
+	u32 ird_ord_hdr[2];
+
+	info = kzalloc(sizeof(struct smbd_connection), GFP_KERNEL);
+	if (!info)
+		return NULL;
+
+	info->transport_status = SMBD_CONNECTING;
+	rc = smbd_ia_open(info, dstaddr, port);
+	if (rc) {
+		log_rdma_event(INFO, "smbd_ia_open rc=%d\n", rc);
+		goto create_id_failed;
+	}
+
+	if (smbd_send_credit_target > info->id->device->attrs.max_cqe ||
+	    smbd_send_credit_target > info->id->device->attrs.max_qp_wr) {
+		log_rdma_event(ERR,
+			"consider lowering send_credit_target = %d. "
+			"Possible CQE overrun, device "
+			"reporting max_cpe %d max_qp_wr %d\n",
+			smbd_send_credit_target,
+			info->id->device->attrs.max_cqe,
+			info->id->device->attrs.max_qp_wr);
+		goto config_failed;
+	}
+
+	if (smbd_receive_credit_max > info->id->device->attrs.max_cqe ||
+	    smbd_receive_credit_max > info->id->device->attrs.max_qp_wr) {
+		log_rdma_event(ERR,
+			"consider lowering receive_credit_max = %d. "
+			"Possible CQE overrun, device "
+			"reporting max_cpe %d max_qp_wr %d\n",
+			smbd_receive_credit_max,
+			info->id->device->attrs.max_cqe,
+			info->id->device->attrs.max_qp_wr);
+		goto config_failed;
+	}
+
+	info->receive_credit_max = smbd_receive_credit_max;
+	info->send_credit_target = smbd_send_credit_target;
+	info->max_send_size = smbd_max_send_size;
+	info->max_fragmented_recv_size = smbd_max_fragmented_recv_size;
+	info->max_receive_size = smbd_max_receive_size;
+	info->keep_alive_interval = smbd_keep_alive_interval;
+
+	if (info->id->device->attrs.max_sge < SMBDIRECT_MAX_SGE) {
+		log_rdma_event(ERR, "warning: device max_sge = %d too small\n",
+			info->id->device->attrs.max_sge);
+		log_rdma_event(ERR, "Queue Pair creation may fail\n");
+	}
+
+	info->send_cq = NULL;
+	info->recv_cq = NULL;
+	info->send_cq = ib_alloc_cq(info->id->device, info,
+			info->send_credit_target, 0, IB_POLL_SOFTIRQ);
+	if (IS_ERR(info->send_cq)) {
+		info->send_cq = NULL;
+		goto alloc_cq_failed;
+	}
+
+	info->recv_cq = ib_alloc_cq(info->id->device, info,
+			info->receive_credit_max, 0, IB_POLL_SOFTIRQ);
+	if (IS_ERR(info->recv_cq)) {
+		info->recv_cq = NULL;
+		goto alloc_cq_failed;
+	}
+
+	memset(&qp_attr, 0, sizeof(qp_attr));
+	qp_attr.event_handler = smbd_qp_async_error_upcall;
+	qp_attr.qp_context = info;
+	qp_attr.cap.max_send_wr = info->send_credit_target;
+	qp_attr.cap.max_recv_wr = info->receive_credit_max;
+	qp_attr.cap.max_send_sge = SMBDIRECT_MAX_SGE;
+	qp_attr.cap.max_recv_sge = SMBDIRECT_MAX_SGE;
+	qp_attr.cap.max_inline_data = 0;
+	qp_attr.sq_sig_type = IB_SIGNAL_REQ_WR;
+	qp_attr.qp_type = IB_QPT_RC;
+	qp_attr.send_cq = info->send_cq;
+	qp_attr.recv_cq = info->recv_cq;
+	qp_attr.port_num = ~0;
+
+	rc = rdma_create_qp(info->id, info->pd, &qp_attr);
+	if (rc) {
+		log_rdma_event(ERR, "rdma_create_qp failed %i\n", rc);
+		goto create_qp_failed;
+	}
+
+	memset(&conn_param, 0, sizeof(conn_param));
+	conn_param.initiator_depth = 0;
+
+	conn_param.responder_resources =
+		info->id->device->attrs.max_qp_rd_atom
+			< SMBD_CM_RESPONDER_RESOURCES ?
+		info->id->device->attrs.max_qp_rd_atom :
+		SMBD_CM_RESPONDER_RESOURCES;
+	info->responder_resources = conn_param.responder_resources;
+	log_rdma_mr(INFO, "responder_resources=%d\n",
+		info->responder_resources);
+
+	/* Need to send IRD/ORD in private data for iWARP */
+	info->id->device->get_port_immutable(
+		info->id->device, info->id->port_num, &port_immutable);
+	if (port_immutable.core_cap_flags & RDMA_CORE_PORT_IWARP) {
+		ird_ord_hdr[0] = info->responder_resources;
+		ird_ord_hdr[1] = 1;
+		conn_param.private_data = ird_ord_hdr;
+		conn_param.private_data_len = sizeof(ird_ord_hdr);
+	} else {
+		conn_param.private_data = NULL;
+		conn_param.private_data_len = 0;
+	}
+
+	conn_param.retry_count = SMBD_CM_RETRY;
+	conn_param.rnr_retry_count = SMBD_CM_RNR_RETRY;
+	conn_param.flow_control = 0;
+	init_waitqueue_head(&info->wait_destroy);
+
+	log_rdma_event(INFO, "connecting to IP %pI4 port %d\n",
+		&addr_in->sin_addr, port);
+
+	init_waitqueue_head(&info->conn_wait);
+	rc = rdma_connect(info->id, &conn_param);
+	if (rc) {
+		log_rdma_event(ERR, "rdma_connect() failed with %i\n", rc);
+		goto rdma_connect_failed;
+	}
+
+	wait_event_interruptible(
+		info->conn_wait, info->transport_status != SMBD_CONNECTING);
+
+	if (info->transport_status != SMBD_CONNECTED) {
+		log_rdma_event(ERR, "rdma_connect failed port=%d\n", port);
+		goto rdma_connect_failed;
+	}
+
+	log_rdma_event(INFO, "rdma_connect connected\n");
+
+	rc = allocate_caches_and_workqueue(info);
+	if (rc) {
+		log_rdma_event(ERR, "cache allocation failed\n");
+		goto allocate_cache_failed;
+	}
+
+	init_waitqueue_head(&info->wait_send_queue);
+	init_waitqueue_head(&info->wait_reassembly_queue);
+
+	INIT_DELAYED_WORK(&info->idle_timer_work, idle_connection_timer);
+	INIT_DELAYED_WORK(&info->send_immediate_work, send_immediate_work);
+	queue_delayed_work(info->workqueue, &info->idle_timer_work,
+		info->keep_alive_interval*HZ);
+
+	init_waitqueue_head(&info->wait_smbd_send_pending);
+	info->smbd_send_pending = 0;
+
+	init_waitqueue_head(&info->wait_smbd_recv_pending);
+	info->smbd_recv_pending = 0;
+
+	init_waitqueue_head(&info->wait_send_pending);
+	atomic_set(&info->send_pending, 0);
+
+	init_waitqueue_head(&info->wait_send_payload_pending);
+	atomic_set(&info->send_payload_pending, 0);
+
+	INIT_WORK(&info->disconnect_work, smbd_disconnect_rdma_work);
+	INIT_WORK(&info->destroy_work, smbd_destroy_rdma_work);
+	INIT_WORK(&info->recv_done_work, smbd_recv_done_work);
+	INIT_WORK(&info->post_send_credits_work, smbd_post_send_credits);
+	info->new_credits_offered = 0;
+	spin_lock_init(&info->lock_new_credits_offered);
+
+	rc = smbd_negotiate(info);
+	if (rc) {
+		log_rdma_event(ERR, "smbd_negotiate rc=%d\n", rc);
+		goto negotiation_failed;
+	}
+
+	rc = allocate_mr_list(info);
+	if (rc) {
+		log_rdma_mr(ERR, "memory registration allocation failed\n");
+		goto allocate_mr_failed;
+	}
+
+	return info;
+
+allocate_mr_failed:
+	/* At this point, need to a full transport shutdown */
+	smbd_destroy(info);
+	return NULL;
+
+negotiation_failed:
+	cancel_delayed_work_sync(&info->idle_timer_work);
+	destroy_caches_and_workqueue(info);
+	info->transport_status = SMBD_NEGOTIATE_FAILED;
+	init_waitqueue_head(&info->conn_wait);
+	rdma_disconnect(info->id);
+	wait_event(info->conn_wait,
+		info->transport_status == SMBD_DISCONNECTED);
+
+allocate_cache_failed:
+rdma_connect_failed:
+	rdma_destroy_qp(info->id);
+
+create_qp_failed:
+alloc_cq_failed:
+	if (info->send_cq)
+		ib_free_cq(info->send_cq);
+	if (info->recv_cq)
+		ib_free_cq(info->recv_cq);
+
+config_failed:
+	ib_dealloc_pd(info->pd);
+	rdma_destroy_id(info->id);
+
+create_id_failed:
+	kfree(info);
+	return NULL;
+}
+
+struct smbd_connection *smbd_get_connection(
+	struct TCP_Server_Info *server, struct sockaddr *dstaddr)
+{
+	struct smbd_connection *ret;
+	int port = SMBD_PORT;
+
+try_again:
+	ret = _smbd_get_connection(server, dstaddr, port);
+
+	/* Try SMB_PORT if SMBD_PORT doesn't work */
+	if (!ret && port == SMBD_PORT) {
+		port = SMB_PORT;
+		goto try_again;
+	}
+	return ret;
+}
+
+/*
+ * Receive data from receive reassembly queue
+ * All the incoming data packets are placed in reassembly queue
+ * buf: the buffer to read data into
+ * size: the length of data to read
+ * return value: actual data read
+ * Note: this implementation copies the data from reassebmly queue to receive
+ * buffers used by upper layer. This is not the optimal code path. A better way
+ * to do it is to not have upper layer allocate its receive buffers but rather
+ * borrow the buffer from reassembly queue, and return it after data is
+ * consumed. But this will require more changes to upper layer code, and also
+ * need to consider packet boundaries while they still being reassembled.
+ */
+static int smbd_recv_buf(struct smbd_connection *info, char *buf,
+		unsigned int size)
+{
+	struct smbd_response *response;
+	struct smbd_data_transfer *data_transfer;
+	int to_copy, to_read, data_read, offset;
+	u32 data_length, remaining_data_length, data_offset;
+	int rc;
+
+again:
+	if (info->transport_status != SMBD_CONNECTED) {
+		log_read(ERR, "disconnected\n");
+		return -ENODEV;
+	}
+
+	/*
+	 * No need to hold the reassembly queue lock all the time as we are
+	 * the only one reading from the front of the queue. The transport
+	 * may add more entries to the back of the queue at the same time
+	 */
+	log_read(INFO, "size=%d info->reassembly_data_length=%d\n", size,
+		info->reassembly_data_length);
+	if (info->reassembly_data_length >= size) {
+		int queue_length;
+		int queue_removed = 0;
+
+		/*
+		 * Need to make sure reassembly_data_length is read before
+		 * reading reassembly_queue_length and calling
+		 * _get_first_reassembly. This call is lock free
+		 * as we never read at the end of the queue which are being
+		 * updated in SOFTIRQ as more data is received
+		 */
+		virt_rmb();
+		queue_length = info->reassembly_queue_length;
+		data_read = 0;
+		to_read = size;
+		offset = info->first_entry_offset;
+		while (data_read < size) {
+			response = _get_first_reassembly(info);
+			data_transfer = smbd_response_payload(response);
+			data_length = le32_to_cpu(data_transfer->data_length);
+			remaining_data_length =
+				le32_to_cpu(
+					data_transfer->remaining_data_length);
+			data_offset = le32_to_cpu(data_transfer->data_offset);
+
+			/*
+			 * The upper layer expects RFC1002 length at the
+			 * beginning of the payload. Return it to indicate
+			 * the total length of the packet. This minimize the
+			 * change to upper layer packet processing logic. This
+			 * will be eventually remove when an intermediate
+			 * transport layer is added
+			 */
+			if (response->first_segment && size == 4) {
+				unsigned int rfc1002_len =
+					data_length + remaining_data_length;
+				*((__be32 *)buf) = cpu_to_be32(rfc1002_len);
+				data_read = 4;
+				response->first_segment = false;
+				log_read(INFO, "returning rfc1002 length %d\n",
+					rfc1002_len);
+				goto read_rfc1002_done;
+			}
+
+			to_copy = min_t(int, data_length - offset, to_read);
+			memcpy(
+				buf + data_read,
+				(char *)data_transfer + data_offset + offset,
+				to_copy);
+
+			/* move on to the next buffer? */
+			if (to_copy == data_length - offset) {
+				queue_length--;
+				/*
+				 * No need to lock if we are not at the
+				 * end of the queue
+				 */
+				if (!queue_length)
+					spin_lock_irq(
+						&info->reassembly_queue_lock);
+				list_del(&response->list);
+				queue_removed++;
+				if (!queue_length)
+					spin_unlock_irq(
+						&info->reassembly_queue_lock);
+
+				info->count_reassembly_queue--;
+				info->count_dequeue_reassembly_queue++;
+				put_receive_buffer(info, response);
+				offset = 0;
+				log_read(INFO, "put_receive_buffer offset=0\n");
+			} else
+				offset += to_copy;
+
+			to_read -= to_copy;
+			data_read += to_copy;
+
+			log_read(INFO, "_get_first_reassembly memcpy %d bytes "
+				"data_transfer_length-offset=%d after that "
+				"to_read=%d data_read=%d offset=%d\n",
+				to_copy, data_length - offset,
+				to_read, data_read, offset);
+		}
+
+		spin_lock_irq(&info->reassembly_queue_lock);
+		info->reassembly_data_length -= data_read;
+		info->reassembly_queue_length -= queue_removed;
+		spin_unlock_irq(&info->reassembly_queue_lock);
+
+		info->first_entry_offset = offset;
+		log_read(INFO, "returning to thread data_read=%d "
+			"reassembly_data_length=%d first_entry_offset=%d\n",
+			data_read, info->reassembly_data_length,
+			info->first_entry_offset);
+read_rfc1002_done:
+		return data_read;
+	}
+
+	log_read(INFO, "wait_event on more data\n");
+	rc = wait_event_interruptible(
+		info->wait_reassembly_queue,
+		info->reassembly_data_length >= size ||
+			info->transport_status != SMBD_CONNECTED);
+	/* Don't return any data if interrupted */
+	if (rc)
+		return -ENODEV;
+
+	goto again;
+}
+
+/*
+ * Receive a page from receive reassembly queue
+ * page: the page to read data into
+ * to_read: the length of data to read
+ * return value: actual data read
+ */
+static int smbd_recv_page(struct smbd_connection *info,
+		struct page *page, unsigned int to_read)
+{
+	int ret;
+	char *to_address;
+
+	/* make sure we have the page ready for read */
+	ret = wait_event_interruptible(
+		info->wait_reassembly_queue,
+		info->reassembly_data_length >= to_read ||
+			info->transport_status != SMBD_CONNECTED);
+	if (ret)
+		return 0;
+
+	/* now we can read from reassembly queue and not sleep */
+	to_address = kmap_atomic(page);
+
+	log_read(INFO, "reading from page=%p address=%p to_read=%d\n",
+		page, to_address, to_read);
+
+	ret = smbd_recv_buf(info, to_address, to_read);
+	kunmap_atomic(to_address);
+
+	return ret;
+}
+
+/*
+ * Receive data from transport
+ * msg: a msghdr point to the buffer, can be ITER_KVEC or ITER_BVEC
+ * return: total bytes read, or 0. SMB Direct will not do partial read.
+ */
+int smbd_recv(struct smbd_connection *info, struct msghdr *msg)
+{
+	char *buf;
+	struct page *page;
+	unsigned int to_read;
+	int rc;
+
+	info->smbd_recv_pending++;
+
+	switch (msg->msg_iter.type) {
+	case READ | ITER_KVEC:
+		buf = msg->msg_iter.kvec->iov_base;
+		to_read = msg->msg_iter.kvec->iov_len;
+		rc = smbd_recv_buf(info, buf, to_read);
+		break;
+
+	case READ | ITER_BVEC:
+		page = msg->msg_iter.bvec->bv_page;
+		to_read = msg->msg_iter.bvec->bv_len;
+		rc = smbd_recv_page(info, page, to_read);
+		break;
+
+	default:
+		/* It's a bug in upper layer to get there */
+		cifs_dbg(VFS, "CIFS: invalid msg type %d\n",
+			msg->msg_iter.type);
+		rc = -EIO;
+	}
+
+	info->smbd_recv_pending--;
+	wake_up(&info->wait_smbd_recv_pending);
+
+	/* SMBDirect will read it all or nothing */
+	if (rc > 0)
+		msg->msg_iter.count = 0;
+	return rc;
+}
+
+/*
+ * Send data to transport
+ * Each rqst is transported as a SMBDirect payload
+ * rqst: the data to write
+ * return value: 0 if successfully write, otherwise error code
+ */
+int smbd_send(struct smbd_connection *info, struct smb_rqst *rqst)
+{
+	struct kvec vec;
+	int nvecs;
+	int size;
+	int buflen = 0, remaining_data_length;
+	int start, i, j;
+	int max_iov_size =
+		info->max_send_size - sizeof(struct smbd_data_transfer);
+	struct kvec iov[SMBDIRECT_MAX_SGE];
+	int rc;
+
+	info->smbd_send_pending++;
+	if (info->transport_status != SMBD_CONNECTED) {
+		rc = -ENODEV;
+		goto done;
+	}
+
+	/*
+	 * This usually means a configuration error
+	 * We use RDMA read/write for packet size > rdma_readwrite_threshold
+	 * as long as it's properly configured we should never get into this
+	 * situation
+	 */
+	if (rqst->rq_nvec + rqst->rq_npages > SMBDIRECT_MAX_SGE) {
+		log_write(ERR, "maximum send segment %x exceeding %x\n",
+			 rqst->rq_nvec + rqst->rq_npages, SMBDIRECT_MAX_SGE);
+		rc = -EINVAL;
+		goto done;
+	}
+
+	/*
+	 * Remove the RFC1002 length defined in MS-SMB2 section 2.1
+	 * It is used only for TCP transport
+	 * In future we may want to add a transport layer under protocol
+	 * layer so this will only be issued to TCP transport
+	 */
+	iov[0].iov_base = (char *)rqst->rq_iov[0].iov_base + 4;
+	iov[0].iov_len = rqst->rq_iov[0].iov_len - 4;
+	buflen += iov[0].iov_len;
+
+	/* total up iov array first */
+	for (i = 1; i < rqst->rq_nvec; i++) {
+		iov[i].iov_base = rqst->rq_iov[i].iov_base;
+		iov[i].iov_len = rqst->rq_iov[i].iov_len;
+		buflen += iov[i].iov_len;
+	}
+
+	/* add in the page array if there is one */
+	if (rqst->rq_npages) {
+		buflen += rqst->rq_pagesz * (rqst->rq_npages - 1);
+		buflen += rqst->rq_tailsz;
+	}
+
+	if (buflen + sizeof(struct smbd_data_transfer) >
+		info->max_fragmented_send_size) {
+		log_write(ERR, "payload size %d > max size %d\n",
+			buflen, info->max_fragmented_send_size);
+		rc = -EINVAL;
+		goto done;
+	}
+
+	remaining_data_length = buflen;
+
+	log_write(INFO, "rqst->rq_nvec=%d rqst->rq_npages=%d rq_pagesz=%d "
+		"rq_tailsz=%d buflen=%d\n",
+		rqst->rq_nvec, rqst->rq_npages, rqst->rq_pagesz,
+		rqst->rq_tailsz, buflen);
+
+	start = i = iov[0].iov_len ? 0 : 1;
+	buflen = 0;
+	while (true) {
+		buflen += iov[i].iov_len;
+		if (buflen > max_iov_size) {
+			if (i > start) {
+				remaining_data_length -=
+					(buflen-iov[i].iov_len);
+				log_write(INFO, "sending iov[] from start=%d "
+					"i=%d nvecs=%d "
+					"remaining_data_length=%d\n",
+					start, i, i-start,
+					remaining_data_length);
+				rc = smbd_post_send_data(
+					info, &iov[start], i-start,
+					remaining_data_length);
+				if (rc)
+					goto done;
+			} else {
+				/* iov[start] is too big, break it */
+				nvecs = (buflen+max_iov_size-1)/max_iov_size;
+				log_write(INFO, "iov[%d] iov_base=%p buflen=%d"
+					" break to %d vectors\n",
+					start, iov[start].iov_base,
+					buflen, nvecs);
+				for (j = 0; j < nvecs; j++) {
+					vec.iov_base =
+						(char *)iov[start].iov_base +
+						j*max_iov_size;
+					vec.iov_len = max_iov_size;
+					if (j == nvecs-1)
+						vec.iov_len =
+							buflen -
+							max_iov_size*(nvecs-1);
+					remaining_data_length -= vec.iov_len;
+					log_write(INFO,
+						"sending vec j=%d iov_base=%p"
+						" iov_len=%zu "
+						"remaining_data_length=%d\n",
+						j, vec.iov_base, vec.iov_len,
+						remaining_data_length);
+					rc = smbd_post_send_data(
+						info, &vec, 1,
+						remaining_data_length);
+					if (rc)
+						goto done;
+				}
+				i++;
+			}
+			start = i;
+			buflen = 0;
+		} else {
+			i++;
+			if (i == rqst->rq_nvec) {
+				/* send out all remaining vecs */
+				remaining_data_length -= buflen;
+				log_write(INFO,
+					"sending iov[] from start=%d i=%d "
+					"nvecs=%d remaining_data_length=%d\n",
+					start, i, i-start,
+					remaining_data_length);
+				rc = smbd_post_send_data(info, &iov[start],
+					i-start, remaining_data_length);
+				if (rc)
+					goto done;
+				break;
+			}
+		}
+		log_write(INFO, "looping i=%d buflen=%d\n", i, buflen);
+	}
+
+	/* now sending pages if there are any */
+	for (i = 0; i < rqst->rq_npages; i++) {
+		buflen = (i == rqst->rq_npages-1) ?
+			rqst->rq_tailsz : rqst->rq_pagesz;
+		nvecs = (buflen + max_iov_size - 1) / max_iov_size;
+		log_write(INFO, "sending pages buflen=%d nvecs=%d\n",
+			buflen, nvecs);
+		for (j = 0; j < nvecs; j++) {
+			size = max_iov_size;
+			if (j == nvecs-1)
+				size = buflen - j*max_iov_size;
+			remaining_data_length -= size;
+			log_write(INFO, "sending pages i=%d offset=%d size=%d"
+				" remaining_data_length=%d\n",
+				i, j*max_iov_size, size, remaining_data_length);
+			rc = smbd_post_send_page(
+				info, rqst->rq_pages[i], j*max_iov_size,
+				size, remaining_data_length);
+			if (rc)
+				goto done;
+		}
+	}
+
+done:
+	/*
+	 * As an optimization, we don't wait for individual I/O to finish
+	 * before sending the next one.
+	 * Send them all and wait for pending send count to get to 0
+	 * that means all the I/Os have been out and we are good to return
+	 */
+
+	wait_event(info->wait_send_payload_pending,
+		atomic_read(&info->send_payload_pending) == 0);
+
+	info->smbd_send_pending--;
+	wake_up(&info->wait_smbd_send_pending);
+
+	return rc;
+}
+
+static void register_mr_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct smbd_mr *mr;
+	struct ib_cqe *cqe;
+
+	if (wc->status) {
+		log_rdma_mr(ERR, "status=%d\n", wc->status);
+		cqe = wc->wr_cqe;
+		mr = container_of(cqe, struct smbd_mr, cqe);
+		smbd_disconnect_rdma_connection(mr->conn);
+	}
+}
+
+/*
+ * The work queue function that recovers MRs
+ * We need to call ib_dereg_mr() and ib_alloc_mr() before this MR can be used
+ * again. Both calls are slow, so finish them in a workqueue. This will not
+ * block I/O path.
+ * There is one workqueue that recovers MRs, there is no need to lock as the
+ * I/O requests calling smbd_register_mr will never update the links in the
+ * mr_list.
+ */
+static void smbd_mr_recovery_work(struct work_struct *work)
+{
+	struct smbd_connection *info =
+		container_of(work, struct smbd_connection, mr_recovery_work);
+	struct smbd_mr *smbdirect_mr;
+	int rc;
+
+	list_for_each_entry(smbdirect_mr, &info->mr_list, list) {
+		if (smbdirect_mr->state == MR_INVALIDATED ||
+			smbdirect_mr->state == MR_ERROR) {
+
+			if (smbdirect_mr->state == MR_INVALIDATED) {
+				ib_dma_unmap_sg(
+					info->id->device, smbdirect_mr->sgl,
+					smbdirect_mr->sgl_count,
+					smbdirect_mr->dir);
+				smbdirect_mr->state = MR_READY;
+			} else if (smbdirect_mr->state == MR_ERROR) {
+
+				/* recover this MR entry */
+				rc = ib_dereg_mr(smbdirect_mr->mr);
+				if (rc) {
+					log_rdma_mr(ERR,
+						"ib_dereg_mr faield rc=%x\n",
+						rc);
+					smbd_disconnect_rdma_connection(info);
+				}
+
+				smbdirect_mr->mr = ib_alloc_mr(
+					info->pd, info->mr_type,
+					info->max_frmr_depth);
+				if (IS_ERR(smbdirect_mr->mr)) {
+					log_rdma_mr(ERR,
+						"ib_alloc_mr failed mr_type=%x "
+						"max_frmr_depth=%x\n",
+						info->mr_type,
+						info->max_frmr_depth);
+					smbd_disconnect_rdma_connection(info);
+				}
+
+				smbdirect_mr->state = MR_READY;
+			}
+			/* smbdirect_mr->state is updated by this function
+			 * and is read and updated by I/O issuing CPUs trying
+			 * to get a MR, the call to atomic_inc_return
+			 * implicates a memory barrier and guarantees this
+			 * value is updated before waking up any calls to
+			 * get_mr() from the I/O issuing CPUs
+			 */
+			if (atomic_inc_return(&info->mr_ready_count) == 1)
+				wake_up_interruptible(&info->wait_mr);
+		}
+	}
+}
+
+static void destroy_mr_list(struct smbd_connection *info)
+{
+	struct smbd_mr *mr, *tmp;
+
+	cancel_work_sync(&info->mr_recovery_work);
+	list_for_each_entry_safe(mr, tmp, &info->mr_list, list) {
+		if (mr->state == MR_INVALIDATED)
+			ib_dma_unmap_sg(info->id->device, mr->sgl,
+				mr->sgl_count, mr->dir);
+		ib_dereg_mr(mr->mr);
+		kfree(mr->sgl);
+		kfree(mr);
+	}
+}
+
+/*
+ * Allocate MRs used for RDMA read/write
+ * The number of MRs will not exceed hardware capability in responder_resources
+ * All MRs are kept in mr_list. The MR can be recovered after it's used
+ * Recovery is done in smbd_mr_recovery_work. The content of list entry changes
+ * as MRs are used and recovered for I/O, but the list links will not change
+ */
+static int allocate_mr_list(struct smbd_connection *info)
+{
+	int i;
+	struct smbd_mr *smbdirect_mr, *tmp;
+
+	INIT_LIST_HEAD(&info->mr_list);
+	init_waitqueue_head(&info->wait_mr);
+	spin_lock_init(&info->mr_list_lock);
+	atomic_set(&info->mr_ready_count, 0);
+	atomic_set(&info->mr_used_count, 0);
+	init_waitqueue_head(&info->wait_for_mr_cleanup);
+	/* Allocate more MRs (2x) than hardware responder_resources */
+	for (i = 0; i < info->responder_resources * 2; i++) {
+		smbdirect_mr = kzalloc(sizeof(*smbdirect_mr), GFP_KERNEL);
+		if (!smbdirect_mr)
+			goto out;
+		smbdirect_mr->mr = ib_alloc_mr(info->pd, info->mr_type,
+					info->max_frmr_depth);
+		if (IS_ERR(smbdirect_mr->mr)) {
+			log_rdma_mr(ERR, "ib_alloc_mr failed mr_type=%x "
+				"max_frmr_depth=%x\n",
+				info->mr_type, info->max_frmr_depth);
+			goto out;
+		}
+		smbdirect_mr->sgl = kcalloc(
+					info->max_frmr_depth,
+					sizeof(struct scatterlist),
+					GFP_KERNEL);
+		if (!smbdirect_mr->sgl) {
+			log_rdma_mr(ERR, "failed to allocate sgl\n");
+			ib_dereg_mr(smbdirect_mr->mr);
+			goto out;
+		}
+		smbdirect_mr->state = MR_READY;
+		smbdirect_mr->conn = info;
+
+		list_add_tail(&smbdirect_mr->list, &info->mr_list);
+		atomic_inc(&info->mr_ready_count);
+	}
+	INIT_WORK(&info->mr_recovery_work, smbd_mr_recovery_work);
+	return 0;
+
+out:
+	kfree(smbdirect_mr);
+
+	list_for_each_entry_safe(smbdirect_mr, tmp, &info->mr_list, list) {
+		ib_dereg_mr(smbdirect_mr->mr);
+		kfree(smbdirect_mr->sgl);
+		kfree(smbdirect_mr);
+	}
+	return -ENOMEM;
+}
+
+/*
+ * Get a MR from mr_list. This function waits until there is at least one
+ * MR available in the list. It may access the list while the
+ * smbd_mr_recovery_work is recovering the MR list. This doesn't need a lock
+ * as they never modify the same places. However, there may be several CPUs
+ * issueing I/O trying to get MR at the same time, mr_list_lock is used to
+ * protect this situation.
+ */
+static struct smbd_mr *get_mr(struct smbd_connection *info)
+{
+	struct smbd_mr *ret;
+	int rc;
+again:
+	rc = wait_event_interruptible(info->wait_mr,
+		atomic_read(&info->mr_ready_count) ||
+		info->transport_status != SMBD_CONNECTED);
+	if (rc) {
+		log_rdma_mr(ERR, "wait_event_interruptible rc=%x\n", rc);
+		return NULL;
+	}
+
+	if (info->transport_status != SMBD_CONNECTED) {
+		log_rdma_mr(ERR, "info->transport_status=%x\n",
+			info->transport_status);
+		return NULL;
+	}
+
+	spin_lock(&info->mr_list_lock);
+	list_for_each_entry(ret, &info->mr_list, list) {
+		if (ret->state == MR_READY) {
+			ret->state = MR_REGISTERED;
+			spin_unlock(&info->mr_list_lock);
+			atomic_dec(&info->mr_ready_count);
+			atomic_inc(&info->mr_used_count);
+			return ret;
+		}
+	}
+
+	spin_unlock(&info->mr_list_lock);
+	/*
+	 * It is possible that we could fail to get MR because other processes may
+	 * try to acquire a MR at the same time. If this is the case, retry it.
+	 */
+	goto again;
+}
+
+/*
+ * Register memory for RDMA read/write
+ * pages[]: the list of pages to register memory with
+ * num_pages: the number of pages to register
+ * tailsz: if non-zero, the bytes to register in the last page
+ * writing: true if this is a RDMA write (SMB read), false for RDMA read
+ * need_invalidate: true if this MR needs to be locally invalidated after I/O
+ * return value: the MR registered, NULL if failed.
+ */
+struct smbd_mr *smbd_register_mr(
+	struct smbd_connection *info, struct page *pages[], int num_pages,
+	int tailsz, bool writing, bool need_invalidate)
+{
+	struct smbd_mr *smbdirect_mr;
+	int rc, i;
+	enum dma_data_direction dir;
+	struct ib_reg_wr *reg_wr;
+	struct ib_send_wr *bad_wr;
+
+	if (num_pages > info->max_frmr_depth) {
+		log_rdma_mr(ERR, "num_pages=%d max_frmr_depth=%d\n",
+			num_pages, info->max_frmr_depth);
+		return NULL;
+	}
+
+	smbdirect_mr = get_mr(info);
+	if (!smbdirect_mr) {
+		log_rdma_mr(ERR, "get_mr returning NULL\n");
+		return NULL;
+	}
+	smbdirect_mr->need_invalidate = need_invalidate;
+	smbdirect_mr->sgl_count = num_pages;
+	sg_init_table(smbdirect_mr->sgl, num_pages);
+
+	for (i = 0; i < num_pages - 1; i++)
+		sg_set_page(&smbdirect_mr->sgl[i], pages[i], PAGE_SIZE, 0);
+
+	sg_set_page(&smbdirect_mr->sgl[i], pages[i],
+		tailsz ? tailsz : PAGE_SIZE, 0);
+
+	dir = writing ? DMA_FROM_DEVICE : DMA_TO_DEVICE;
+	smbdirect_mr->dir = dir;
+	rc = ib_dma_map_sg(info->id->device, smbdirect_mr->sgl, num_pages, dir);
+	if (!rc) {
+		log_rdma_mr(INFO, "ib_dma_map_sg num_pages=%x dir=%x rc=%x\n",
+			num_pages, dir, rc);
+		goto dma_map_error;
+	}
+
+	rc = ib_map_mr_sg(smbdirect_mr->mr, smbdirect_mr->sgl, num_pages,
+		NULL, PAGE_SIZE);
+	if (rc != num_pages) {
+		log_rdma_mr(INFO,
+			"ib_map_mr_sg failed rc = %x num_pages = %x\n",
+			rc, num_pages);
+		goto map_mr_error;
+	}
+
+	ib_update_fast_reg_key(smbdirect_mr->mr,
+		ib_inc_rkey(smbdirect_mr->mr->rkey));
+	reg_wr = &smbdirect_mr->wr;
+	reg_wr->wr.opcode = IB_WR_REG_MR;
+	smbdirect_mr->cqe.done = register_mr_done;
+	reg_wr->wr.wr_cqe = &smbdirect_mr->cqe;
+	reg_wr->wr.num_sge = 0;
+	reg_wr->wr.send_flags = IB_SEND_SIGNALED;
+	reg_wr->mr = smbdirect_mr->mr;
+	reg_wr->key = smbdirect_mr->mr->rkey;
+	reg_wr->access = writing ?
+			IB_ACCESS_REMOTE_WRITE | IB_ACCESS_LOCAL_WRITE :
+			IB_ACCESS_REMOTE_READ;
+
+	/*
+	 * There is no need for waiting for complemtion on ib_post_send
+	 * on IB_WR_REG_MR. Hardware enforces a barrier and order of execution
+	 * on the next ib_post_send when we actaully send I/O to remote peer
+	 */
+	rc = ib_post_send(info->id->qp, &reg_wr->wr, &bad_wr);
+	if (!rc)
+		return smbdirect_mr;
+
+	log_rdma_mr(ERR, "ib_post_send failed rc=%x reg_wr->key=%x\n",
+		rc, reg_wr->key);
+
+	/* If all failed, attempt to recover this MR by setting it MR_ERROR*/
+map_mr_error:
+	ib_dma_unmap_sg(info->id->device, smbdirect_mr->sgl,
+		smbdirect_mr->sgl_count, smbdirect_mr->dir);
+
+dma_map_error:
+	smbdirect_mr->state = MR_ERROR;
+	if (atomic_dec_and_test(&info->mr_used_count))
+		wake_up(&info->wait_for_mr_cleanup);
+
+	return NULL;
+}
+
+static void local_inv_done(struct ib_cq *cq, struct ib_wc *wc)
+{
+	struct smbd_mr *smbdirect_mr;
+	struct ib_cqe *cqe;
+
+	cqe = wc->wr_cqe;
+	smbdirect_mr = container_of(cqe, struct smbd_mr, cqe);
+	smbdirect_mr->state = MR_INVALIDATED;
+	if (wc->status != IB_WC_SUCCESS) {
+		log_rdma_mr(ERR, "invalidate failed status=%x\n", wc->status);
+		smbdirect_mr->state = MR_ERROR;
+	}
+	complete(&smbdirect_mr->invalidate_done);
+}
+
+/*
+ * Deregister a MR after I/O is done
+ * This function may wait if remote invalidation is not used
+ * and we have to locally invalidate the buffer to prevent data is being
+ * modified by remote peer after upper layer consumes it
+ */
+int smbd_deregister_mr(struct smbd_mr *smbdirect_mr)
+{
+	struct ib_send_wr *wr, *bad_wr;
+	struct smbd_connection *info = smbdirect_mr->conn;
+	int rc = 0;
+
+	if (smbdirect_mr->need_invalidate) {
+		/* Need to finish local invalidation before returning */
+		wr = &smbdirect_mr->inv_wr;
+		wr->opcode = IB_WR_LOCAL_INV;
+		smbdirect_mr->cqe.done = local_inv_done;
+		wr->wr_cqe = &smbdirect_mr->cqe;
+		wr->num_sge = 0;
+		wr->ex.invalidate_rkey = smbdirect_mr->mr->rkey;
+		wr->send_flags = IB_SEND_SIGNALED;
+
+		init_completion(&smbdirect_mr->invalidate_done);
+		rc = ib_post_send(info->id->qp, wr, &bad_wr);
+		if (rc) {
+			log_rdma_mr(ERR, "ib_post_send failed rc=%x\n", rc);
+			smbd_disconnect_rdma_connection(info);
+			goto done;
+		}
+		wait_for_completion(&smbdirect_mr->invalidate_done);
+		smbdirect_mr->need_invalidate = false;
+	} else
+		/*
+		 * For remote invalidation, just set it to MR_INVALIDATED
+		 * and defer to mr_recovery_work to recover the MR for next use
+		 */
+		smbdirect_mr->state = MR_INVALIDATED;
+
+	/*
+	 * Schedule the work to do MR recovery for future I/Os
+	 * MR recovery is slow and we don't want it to block the current I/O
+	 */
+	queue_work(info->workqueue, &info->mr_recovery_work);
+
+done:
+	if (atomic_dec_and_test(&info->mr_used_count))
+		wake_up(&info->wait_for_mr_cleanup);
+
+	return rc;
+}
diff --git a/fs/cifs/smbdirect.h b/fs/cifs/smbdirect.h
new file mode 100644
index 0000000..f9038da
--- /dev/null
+++ b/fs/cifs/smbdirect.h
@@ -0,0 +1,338 @@
+/*
+ *   Copyright (C) 2017, Microsoft Corporation.
+ *
+ *   Author(s): Long Li <longli@microsoft.com>
+ *
+ *   This program is free software;  you can redistribute it and/or modify
+ *   it under the terms of the GNU General Public License as published by
+ *   the Free Software Foundation; either version 2 of the License, or
+ *   (at your option) any later version.
+ *
+ *   This program is distributed in the hope that it will be useful,
+ *   but WITHOUT ANY WARRANTY;  without even the implied warranty of
+ *   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See
+ *   the GNU General Public License for more details.
+ */
+#ifndef _SMBDIRECT_H
+#define _SMBDIRECT_H
+
+#ifdef CONFIG_CIFS_SMB_DIRECT
+#define cifs_rdma_enabled(server)	((server)->rdma)
+
+#include "cifsglob.h"
+#include <rdma/ib_verbs.h>
+#include <rdma/rdma_cm.h>
+#include <linux/mempool.h>
+
+extern int rdma_readwrite_threshold;
+extern int smbd_max_frmr_depth;
+extern int smbd_keep_alive_interval;
+extern int smbd_max_receive_size;
+extern int smbd_max_fragmented_recv_size;
+extern int smbd_max_send_size;
+extern int smbd_send_credit_target;
+extern int smbd_receive_credit_max;
+
+enum keep_alive_status {
+	KEEP_ALIVE_NONE,
+	KEEP_ALIVE_PENDING,
+	KEEP_ALIVE_SENT,
+};
+
+enum smbd_connection_status {
+	SMBD_CREATED,
+	SMBD_CONNECTING,
+	SMBD_CONNECTED,
+	SMBD_NEGOTIATE_FAILED,
+	SMBD_DISCONNECTING,
+	SMBD_DISCONNECTED,
+	SMBD_DESTROYED
+};
+
+/*
+ * The context for the SMBDirect transport
+ * Everything related to the transport is here. It has several logical parts
+ * 1. RDMA related structures
+ * 2. SMBDirect connection parameters
+ * 3. Memory registrations
+ * 4. Receive and reassembly queues for data receive path
+ * 5. mempools for allocating packets
+ */
+struct smbd_connection {
+	enum smbd_connection_status transport_status;
+
+	/* RDMA related */
+	struct rdma_cm_id *id;
+	struct ib_qp_init_attr qp_attr;
+	struct ib_pd *pd;
+	struct ib_cq *send_cq, *recv_cq;
+	struct ib_device_attr dev_attr;
+	int ri_rc;
+	struct completion ri_done;
+	wait_queue_head_t conn_wait;
+	wait_queue_head_t wait_destroy;
+
+	struct completion negotiate_completion;
+	bool negotiate_done;
+
+	struct work_struct destroy_work;
+	struct work_struct disconnect_work;
+	struct work_struct recv_done_work;
+	struct work_struct post_send_credits_work;
+
+	spinlock_t lock_new_credits_offered;
+	int new_credits_offered;
+
+	/* Connection parameters defined in [MS-SMBD] 3.1.1.1 */
+	int receive_credit_max;
+	int send_credit_target;
+	int max_send_size;
+	int max_fragmented_recv_size;
+	int max_fragmented_send_size;
+	int max_receive_size;
+	int keep_alive_interval;
+	int max_readwrite_size;
+	enum keep_alive_status keep_alive_requested;
+	int protocol;
+	atomic_t send_credits;
+	atomic_t receive_credits;
+	int receive_credit_target;
+	int fragment_reassembly_remaining;
+
+	/* Memory registrations */
+	/* Maximum number of RDMA read/write outstanding on this connection */
+	int responder_resources;
+	/* Maximum number of SGEs in a RDMA write/read */
+	int max_frmr_depth;
+	/*
+	 * If payload is less than or equal to the threshold,
+	 * use RDMA send/recv to send upper layer I/O.
+	 * If payload is more than the threshold,
+	 * use RDMA read/write through memory registration for I/O.
+	 */
+	int rdma_readwrite_threshold;
+	enum ib_mr_type mr_type;
+	struct list_head mr_list;
+	spinlock_t mr_list_lock;
+	/* The number of available MRs ready for memory registration */
+	atomic_t mr_ready_count;
+	atomic_t mr_used_count;
+	wait_queue_head_t wait_mr;
+	struct work_struct mr_recovery_work;
+	/* Used by transport to wait until all MRs are returned */
+	wait_queue_head_t wait_for_mr_cleanup;
+
+	/* Activity accoutning */
+	/* Pending reqeusts issued from upper layer */
+	int smbd_send_pending;
+	wait_queue_head_t wait_smbd_send_pending;
+
+	int smbd_recv_pending;
+	wait_queue_head_t wait_smbd_recv_pending;
+
+	atomic_t send_pending;
+	wait_queue_head_t wait_send_pending;
+	atomic_t send_payload_pending;
+	wait_queue_head_t wait_send_payload_pending;
+
+	/* Receive queue */
+	struct list_head receive_queue;
+	int count_receive_queue;
+	spinlock_t receive_queue_lock;
+
+	struct list_head empty_packet_queue;
+	int count_empty_packet_queue;
+	spinlock_t empty_packet_queue_lock;
+
+	wait_queue_head_t wait_receive_queues;
+
+	/* Reassembly queue */
+	struct list_head reassembly_queue;
+	spinlock_t reassembly_queue_lock;
+	wait_queue_head_t wait_reassembly_queue;
+
+	/* total data length of reassembly queue */
+	int reassembly_data_length;
+	int reassembly_queue_length;
+	/* the offset to first buffer in reassembly queue */
+	int first_entry_offset;
+
+	bool send_immediate;
+
+	wait_queue_head_t wait_send_queue;
+
+	/*
+	 * Indicate if we have received a full packet on the connection
+	 * This is used to identify the first SMBD packet of a assembled
+	 * payload (SMB packet) in reassembly queue so we can return a
+	 * RFC1002 length to upper layer to indicate the length of the SMB
+	 * packet received
+	 */
+	bool full_packet_received;
+
+	struct workqueue_struct *workqueue;
+	struct delayed_work idle_timer_work;
+	struct delayed_work send_immediate_work;
+
+	/* Memory pool for preallocating buffers */
+	/* request pool for RDMA send */
+	struct kmem_cache *request_cache;
+	mempool_t *request_mempool;
+
+	/* response pool for RDMA receive */
+	struct kmem_cache *response_cache;
+	mempool_t *response_mempool;
+
+	/* for debug purposes */
+	unsigned int count_get_receive_buffer;
+	unsigned int count_put_receive_buffer;
+	unsigned int count_reassembly_queue;
+	unsigned int count_enqueue_reassembly_queue;
+	unsigned int count_dequeue_reassembly_queue;
+	unsigned int count_send_empty;
+};
+
+enum smbd_message_type {
+	SMBD_NEGOTIATE_RESP,
+	SMBD_TRANSFER_DATA,
+};
+
+#define SMB_DIRECT_RESPONSE_REQUESTED 0x0001
+
+/* SMBD negotiation request packet [MS-SMBD] 2.2.1 */
+struct smbd_negotiate_req {
+	__le16 min_version;
+	__le16 max_version;
+	__le16 reserved;
+	__le16 credits_requested;
+	__le32 preferred_send_size;
+	__le32 max_receive_size;
+	__le32 max_fragmented_size;
+} __packed;
+
+/* SMBD negotiation response packet [MS-SMBD] 2.2.2 */
+struct smbd_negotiate_resp {
+	__le16 min_version;
+	__le16 max_version;
+	__le16 negotiated_version;
+	__le16 reserved;
+	__le16 credits_requested;
+	__le16 credits_granted;
+	__le32 status;
+	__le32 max_readwrite_size;
+	__le32 preferred_send_size;
+	__le32 max_receive_size;
+	__le32 max_fragmented_size;
+} __packed;
+
+/* SMBD data transfer packet with payload [MS-SMBD] 2.2.3 */
+struct smbd_data_transfer {
+	__le16 credits_requested;
+	__le16 credits_granted;
+	__le16 flags;
+	__le16 reserved;
+	__le32 remaining_data_length;
+	__le32 data_offset;
+	__le32 data_length;
+	__le32 padding;
+	__u8 buffer[];
+} __packed;
+
+/* The packet fields for a registered RDMA buffer */
+struct smbd_buffer_descriptor_v1 {
+	__le64 offset;
+	__le32 token;
+	__le32 length;
+} __packed;
+
+/* Default maximum number of SGEs in a RDMA send/recv */
+#define SMBDIRECT_MAX_SGE	16
+/* The context for a SMBD request */
+struct smbd_request {
+	struct smbd_connection *info;
+	struct ib_cqe cqe;
+
+	/* true if this request carries upper layer payload */
+	bool has_payload;
+
+	/* the SGE entries for this packet */
+	struct ib_sge sge[SMBDIRECT_MAX_SGE];
+	int num_sge;
+
+	/* SMBD packet header follows this structure */
+	u8 packet[];
+};
+
+/* The context for a SMBD response */
+struct smbd_response {
+	struct smbd_connection *info;
+	struct ib_cqe cqe;
+	struct ib_sge sge;
+
+	enum smbd_message_type type;
+
+	/* Link to receive queue or reassembly queue */
+	struct list_head list;
+
+	/* Indicate if this is the 1st packet of a payload */
+	bool first_segment;
+
+	/* SMBD packet header and payload follows this structure */
+	u8 packet[];
+};
+
+/* Create a SMBDirect session */
+struct smbd_connection *smbd_get_connection(
+	struct TCP_Server_Info *server, struct sockaddr *dstaddr);
+
+/* Reconnect SMBDirect session */
+int smbd_reconnect(struct TCP_Server_Info *server);
+/* Destroy SMBDirect session */
+void smbd_destroy(struct smbd_connection *info);
+
+/* Interface for carrying upper layer I/O through send/recv */
+int smbd_recv(struct smbd_connection *info, struct msghdr *msg);
+int smbd_send(struct smbd_connection *info, struct smb_rqst *rqst);
+
+enum mr_state {
+	MR_READY,
+	MR_REGISTERED,
+	MR_INVALIDATED,
+	MR_ERROR
+};
+
+struct smbd_mr {
+	struct smbd_connection	*conn;
+	struct list_head	list;
+	enum mr_state		state;
+	struct ib_mr		*mr;
+	struct scatterlist	*sgl;
+	int			sgl_count;
+	enum dma_data_direction	dir;
+	union {
+		struct ib_reg_wr	wr;
+		struct ib_send_wr	inv_wr;
+	};
+	struct ib_cqe		cqe;
+	bool			need_invalidate;
+	struct completion	invalidate_done;
+};
+
+/* Interfaces to register and deregister MR for RDMA read/write */
+struct smbd_mr *smbd_register_mr(
+	struct smbd_connection *info, struct page *pages[], int num_pages,
+	int tailsz, bool writing, bool need_invalidate);
+int smbd_deregister_mr(struct smbd_mr *mr);
+
+#else
+#define cifs_rdma_enabled(server)	0
+struct smbd_connection {};
+static inline void *smbd_get_connection(
+	struct TCP_Server_Info *server, struct sockaddr *dstaddr) {return NULL;}
+static inline int smbd_reconnect(struct TCP_Server_Info *server) {return -1; }
+static inline void smbd_destroy(struct smbd_connection *info) {}
+static inline int smbd_recv(struct smbd_connection *info, struct msghdr *msg) {return -1; }
+static inline int smbd_send(struct smbd_connection *info, struct smb_rqst *rqst) {return -1; }
+#endif
+
+#endif
diff --git a/fs/cifs/transport.c b/fs/cifs/transport.c
index 7efbab0..9779b32 100644
--- a/fs/cifs/transport.c
+++ b/fs/cifs/transport.c
@@ -37,6 +37,10 @@
 #include "cifsglob.h"
 #include "cifsproto.h"
 #include "cifs_debug.h"
+#include "smbdirect.h"
+
+/* Max number of iovectors we can use off the stack when sending requests. */
+#define CIFS_MAX_IOV_SIZE 8
 
 void
 cifs_wake_up_task(struct mid_q_entry *mid)
@@ -229,7 +233,10 @@ __smb_send_rqst(struct TCP_Server_Info *server, struct smb_rqst *rqst)
 	struct socket *ssocket = server->ssocket;
 	struct msghdr smb_msg;
 	int val = 1;
-
+	if (cifs_rdma_enabled(server) && server->smbd_conn) {
+		rc = smbd_send(server->smbd_conn, rqst);
+		goto smbd_done;
+	}
 	if (ssocket == NULL)
 		return -ENOTSOCK;
 
@@ -298,7 +305,7 @@ __smb_send_rqst(struct TCP_Server_Info *server, struct smb_rqst *rqst)
 		 */
 		server->tcpStatus = CifsNeedReconnect;
 	}
-
+smbd_done:
 	if (rc < 0 && rc != -EINTR)
 		cifs_dbg(VFS, "Error %d sending data on socket to server\n",
 			 rc);
@@ -803,12 +810,16 @@ SendReceive2(const unsigned int xid, struct cifs_ses *ses,
 	     const int flags, struct kvec *resp_iov)
 {
 	struct smb_rqst rqst;
-	struct kvec *new_iov;
+	struct kvec s_iov[CIFS_MAX_IOV_SIZE], *new_iov;
 	int rc;
 
-	new_iov = kmalloc(sizeof(struct kvec) * (n_vec + 1), GFP_KERNEL);
-	if (!new_iov)
-		return -ENOMEM;
+	if (n_vec + 1 > CIFS_MAX_IOV_SIZE) {
+		new_iov = kmalloc(sizeof(struct kvec) * (n_vec + 1),
+				  GFP_KERNEL);
+		if (!new_iov)
+			return -ENOMEM;
+	} else
+		new_iov = s_iov;
 
 	/* 1st iov is a RFC1001 length followed by the rest of the packet */
 	memcpy(new_iov + 1, iov, (sizeof(struct kvec) * n_vec));
@@ -823,7 +834,51 @@ SendReceive2(const unsigned int xid, struct cifs_ses *ses,
 	rqst.rq_nvec = n_vec + 1;
 
 	rc = cifs_send_recv(xid, ses, &rqst, resp_buf_type, flags, resp_iov);
-	kfree(new_iov);
+	if (n_vec + 1 > CIFS_MAX_IOV_SIZE)
+		kfree(new_iov);
+	return rc;
+}
+
+/* Like SendReceive2 but iov[0] does not contain an rfc1002 header */
+int
+smb2_send_recv(const unsigned int xid, struct cifs_ses *ses,
+	       struct kvec *iov, int n_vec, int *resp_buf_type /* ret */,
+	       const int flags, struct kvec *resp_iov)
+{
+	struct smb_rqst rqst;
+	struct kvec s_iov[CIFS_MAX_IOV_SIZE], *new_iov;
+	int rc;
+	int i;
+	__u32 count;
+	__be32 rfc1002_marker;
+
+	if (n_vec + 1 > CIFS_MAX_IOV_SIZE) {
+		new_iov = kmalloc(sizeof(struct kvec) * (n_vec + 1),
+				  GFP_KERNEL);
+		if (!new_iov)
+			return -ENOMEM;
+	} else
+		new_iov = s_iov;
+
+	/* 1st iov is an RFC1002 Session Message length */
+	memcpy(new_iov + 1, iov, (sizeof(struct kvec) * n_vec));
+
+	count = 0;
+	for (i = 1; i < n_vec + 1; i++)
+		count += new_iov[i].iov_len;
+
+	rfc1002_marker = cpu_to_be32(count);
+
+	new_iov[0].iov_base = &rfc1002_marker;
+	new_iov[0].iov_len = 4;
+
+	memset(&rqst, 0, sizeof(struct smb_rqst));
+	rqst.rq_iov = new_iov;
+	rqst.rq_nvec = n_vec + 1;
+
+	rc = cifs_send_recv(xid, ses, &rqst, resp_buf_type, flags, resp_iov);
+	if (n_vec + 1 > CIFS_MAX_IOV_SIZE)
+		kfree(new_iov);
 	return rc;
 }
 
diff --git a/fs/dcache.c b/fs/dcache.c
index 5c7df1d..379dce8 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1636,8 +1636,7 @@ struct dentry *__d_alloc(struct super_block *sb, const struct qstr *name)
 	dname[name->len] = 0;
 
 	/* Make sure we always see the terminating NUL character */
-	smp_wmb();
-	dentry->d_name.name = dname;
+	smp_store_release(&dentry->d_name.name, dname); /* ^^^ */
 
 	dentry->d_lockref.count = 1;
 	dentry->d_flags = 0;
@@ -3047,17 +3046,14 @@ static int prepend(char **buffer, int *buflen, const char *str, int namelen)
  * retry it again when a d_move() does happen. So any garbage in the buffer
  * due to mismatched pointer and length will be discarded.
  *
- * Data dependency barrier is needed to make sure that we see that terminating
- * NUL.  Alpha strikes again, film at 11...
+ * Load acquire is needed to make sure that we see that terminating NUL.
  */
 static int prepend_name(char **buffer, int *buflen, const struct qstr *name)
 {
-	const char *dname = READ_ONCE(name->name);
+	const char *dname = smp_load_acquire(&name->name); /* ^^^ */
 	u32 dlen = READ_ONCE(name->len);
 	char *p;
 
-	smp_read_barrier_depends();
-
 	*buflen -= dlen + 1;
 	if (*buflen < 0)
 		return -ENAMETOOLONG;
diff --git a/fs/exec.c b/fs/exec.c
index 5688b5e..7eb8d21 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1349,9 +1349,14 @@ void setup_new_exec(struct linux_binprm * bprm)
 
 	current->sas_ss_sp = current->sas_ss_size = 0;
 
-	/* Figure out dumpability. */
+	/*
+	 * Figure out dumpability. Note that this checking only of current
+	 * is wrong, but userspace depends on it. This should be testing
+	 * bprm->secureexec instead.
+	 */
 	if (bprm->interp_flags & BINPRM_FLAGS_ENFORCE_NONDUMP ||
-	    bprm->secureexec)
+	    !(uid_eq(current_euid(), current_uid()) &&
+	      gid_eq(current_egid(), current_gid())))
 		set_dumpable(current->mm, suid_dumpable);
 	else
 		set_dumpable(current->mm, SUID_DUMP_USER);
diff --git a/fs/exofs/dir.c b/fs/exofs/dir.c
index 98233a9..c5a53fc 100644
--- a/fs/exofs/dir.c
+++ b/fs/exofs/dir.c
@@ -31,6 +31,7 @@
  * Foundation, Inc., 51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
  */
 
+#include <linux/iversion.h>
 #include "exofs.h"
 
 static inline unsigned exofs_chunk_size(struct inode *inode)
@@ -60,7 +61,7 @@ static int exofs_commit_chunk(struct page *page, loff_t pos, unsigned len)
 	struct inode *dir = mapping->host;
 	int err = 0;
 
-	dir->i_version++;
+	inode_inc_iversion(dir);
 
 	if (!PageUptodate(page))
 		SetPageUptodate(page);
@@ -241,7 +242,7 @@ exofs_readdir(struct file *file, struct dir_context *ctx)
 	unsigned long n = pos >> PAGE_SHIFT;
 	unsigned long npages = dir_pages(inode);
 	unsigned chunk_mask = ~(exofs_chunk_size(inode)-1);
-	int need_revalidate = (file->f_version != inode->i_version);
+	bool need_revalidate = inode_cmp_iversion(inode, file->f_version);
 
 	if (pos > inode->i_size - EXOFS_DIR_REC_LEN(1))
 		return 0;
@@ -264,8 +265,8 @@ exofs_readdir(struct file *file, struct dir_context *ctx)
 								chunk_mask);
 				ctx->pos = (n<<PAGE_SHIFT) + offset;
 			}
-			file->f_version = inode->i_version;
-			need_revalidate = 0;
+			file->f_version = inode_query_iversion(inode);
+			need_revalidate = false;
 		}
 		de = (struct exofs_dir_entry *)(kaddr + offset);
 		limit = kaddr + exofs_last_byte(inode, n) -
diff --git a/fs/exofs/super.c b/fs/exofs/super.c
index 819624c..7e24409 100644
--- a/fs/exofs/super.c
+++ b/fs/exofs/super.c
@@ -38,6 +38,7 @@
 #include <linux/module.h>
 #include <linux/exportfs.h>
 #include <linux/slab.h>
+#include <linux/iversion.h>
 
 #include "exofs.h"
 
@@ -159,7 +160,7 @@ static struct inode *exofs_alloc_inode(struct super_block *sb)
 	if (!oi)
 		return NULL;
 
-	oi->vfs_inode.i_version = 1;
+	inode_set_iversion(&oi->vfs_inode, 1);
 	return &oi->vfs_inode;
 }
 
diff --git a/fs/ext2/dir.c b/fs/ext2/dir.c
index 9876479..4111085 100644
--- a/fs/ext2/dir.c
+++ b/fs/ext2/dir.c
@@ -26,6 +26,7 @@
 #include <linux/buffer_head.h>
 #include <linux/pagemap.h>
 #include <linux/swap.h>
+#include <linux/iversion.h>
 
 typedef struct ext2_dir_entry_2 ext2_dirent;
 
@@ -92,7 +93,7 @@ static int ext2_commit_chunk(struct page *page, loff_t pos, unsigned len)
 	struct inode *dir = mapping->host;
 	int err = 0;
 
-	dir->i_version++;
+	inode_inc_iversion(dir);
 	block_write_end(NULL, mapping, pos, len, len, page, NULL);
 
 	if (pos+len > dir->i_size) {
@@ -293,7 +294,7 @@ ext2_readdir(struct file *file, struct dir_context *ctx)
 	unsigned long npages = dir_pages(inode);
 	unsigned chunk_mask = ~(ext2_chunk_size(inode)-1);
 	unsigned char *types = NULL;
-	int need_revalidate = file->f_version != inode->i_version;
+	bool need_revalidate = inode_cmp_iversion(inode, file->f_version);
 
 	if (pos > inode->i_size - EXT2_DIR_REC_LEN(1))
 		return 0;
@@ -319,8 +320,8 @@ ext2_readdir(struct file *file, struct dir_context *ctx)
 				offset = ext2_validate_entry(kaddr, offset, chunk_mask);
 				ctx->pos = (n<<PAGE_SHIFT) + offset;
 			}
-			file->f_version = inode->i_version;
-			need_revalidate = 0;
+			file->f_version = inode_query_iversion(inode);
+			need_revalidate = false;
 		}
 		de = (ext2_dirent *)(kaddr+offset);
 		limit = kaddr + ext2_last_byte(inode, n) - EXT2_DIR_REC_LEN(1);
diff --git a/fs/ext2/super.c b/fs/ext2/super.c
index 7646818..554c98b 100644
--- a/fs/ext2/super.c
+++ b/fs/ext2/super.c
@@ -33,6 +33,7 @@
 #include <linux/quotaops.h>
 #include <linux/uaccess.h>
 #include <linux/dax.h>
+#include <linux/iversion.h>
 #include "ext2.h"
 #include "xattr.h"
 #include "acl.h"
@@ -184,7 +185,7 @@ static struct inode *ext2_alloc_inode(struct super_block *sb)
 	if (!ei)
 		return NULL;
 	ei->i_block_alloc_info = NULL;
-	ei->vfs_inode.i_version = 1;
+	inode_set_iversion(&ei->vfs_inode, 1);
 #ifdef CONFIG_QUOTA
 	memset(&ei->i_dquot, 0, sizeof(ei->i_dquot));
 #endif
@@ -1569,7 +1570,7 @@ static ssize_t ext2_quota_write(struct super_block *sb, int type,
 		return err;
 	if (inode->i_size < off+len-towrite)
 		i_size_write(inode, off+len-towrite);
-	inode->i_version++;
+	inode_inc_iversion(inode);
 	inode->i_mtime = inode->i_ctime = current_time(inode);
 	mark_inode_dirty(inode);
 	return len - towrite;
diff --git a/fs/ext4/dir.c b/fs/ext4/dir.c
index d5babc9..afda0a0 100644
--- a/fs/ext4/dir.c
+++ b/fs/ext4/dir.c
@@ -25,6 +25,7 @@
 #include <linux/fs.h>
 #include <linux/buffer_head.h>
 #include <linux/slab.h>
+#include <linux/iversion.h>
 #include "ext4.h"
 #include "xattr.h"
 
@@ -208,7 +209,7 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
 		 * readdir(2), then we might be pointing to an invalid
 		 * dirent right now.  Scan from the start of the block
 		 * to make sure. */
-		if (file->f_version != inode->i_version) {
+		if (inode_cmp_iversion(inode, file->f_version)) {
 			for (i = 0; i < sb->s_blocksize && i < offset; ) {
 				de = (struct ext4_dir_entry_2 *)
 					(bh->b_data + i);
@@ -227,7 +228,7 @@ static int ext4_readdir(struct file *file, struct dir_context *ctx)
 			offset = i;
 			ctx->pos = (ctx->pos & ~(sb->s_blocksize - 1))
 				| offset;
-			file->f_version = inode->i_version;
+			file->f_version = inode_query_iversion(inode);
 		}
 
 		while (ctx->pos < inode->i_size
@@ -568,10 +569,10 @@ static int ext4_dx_readdir(struct file *file, struct dir_context *ctx)
 		 * cached entries.
 		 */
 		if ((!info->curr_node) ||
-		    (file->f_version != inode->i_version)) {
+		    inode_cmp_iversion(inode, file->f_version)) {
 			info->curr_node = NULL;
 			free_rb_tree_fname(&info->root);
-			file->f_version = inode->i_version;
+			file->f_version = inode_query_iversion(inode);
 			ret = ext4_htree_fill_tree(file, info->curr_hash,
 						   info->curr_minor_hash,
 						   &info->next_hash);
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index 1367553..a8b987b 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -14,6 +14,7 @@
 
 #include <linux/iomap.h>
 #include <linux/fiemap.h>
+#include <linux/iversion.h>
 
 #include "ext4_jbd2.h"
 #include "ext4.h"
@@ -1042,7 +1043,7 @@ static int ext4_add_dirent_to_inline(handle_t *handle,
 	 */
 	dir->i_mtime = dir->i_ctime = current_time(dir);
 	ext4_update_dx_flag(dir);
-	dir->i_version++;
+	inode_inc_iversion(dir);
 	return 1;
 }
 
@@ -1494,7 +1495,7 @@ int ext4_read_inline_dir(struct file *file,
 	 * dirent right now.  Scan from the start of the inline
 	 * dir to make sure.
 	 */
-	if (file->f_version != inode->i_version) {
+	if (inode_cmp_iversion(inode, file->f_version)) {
 		for (i = 0; i < extra_size && i < offset;) {
 			/*
 			 * "." is with offset 0 and
@@ -1526,7 +1527,7 @@ int ext4_read_inline_dir(struct file *file,
 		}
 		offset = i;
 		ctx->pos = offset;
-		file->f_version = inode->i_version;
+		file->f_version = inode_query_iversion(inode);
 	}
 
 	while (ctx->pos < extra_size) {
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 534a913..0eff5b7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -39,6 +39,7 @@
 #include <linux/slab.h>
 #include <linux/bitops.h>
 #include <linux/iomap.h>
+#include <linux/iversion.h>
 
 #include "ext4_jbd2.h"
 #include "xattr.h"
@@ -4882,12 +4883,14 @@ struct inode *ext4_iget(struct super_block *sb, unsigned long ino)
 	EXT4_EINODE_GET_XTIME(i_crtime, ei, raw_inode);
 
 	if (likely(!test_opt2(inode->i_sb, HURD_COMPAT))) {
-		inode->i_version = le32_to_cpu(raw_inode->i_disk_version);
+		u64 ivers = le32_to_cpu(raw_inode->i_disk_version);
+
 		if (EXT4_INODE_SIZE(inode->i_sb) > EXT4_GOOD_OLD_INODE_SIZE) {
 			if (EXT4_FITS_IN_INODE(raw_inode, ei, i_version_hi))
-				inode->i_version |=
+				ivers |=
 		    (__u64)(le32_to_cpu(raw_inode->i_version_hi)) << 32;
 		}
+		inode_set_iversion_queried(inode, ivers);
 	}
 
 	ret = 0;
@@ -5173,11 +5176,13 @@ static int ext4_do_update_inode(handle_t *handle,
 	}
 
 	if (likely(!test_opt2(inode->i_sb, HURD_COMPAT))) {
-		raw_inode->i_disk_version = cpu_to_le32(inode->i_version);
+		u64 ivers = inode_peek_iversion(inode);
+
+		raw_inode->i_disk_version = cpu_to_le32(ivers);
 		if (ei->i_extra_isize) {
 			if (EXT4_FITS_IN_INODE(raw_inode, ei, i_version_hi))
 				raw_inode->i_version_hi =
-					cpu_to_le32(inode->i_version >> 32);
+					cpu_to_le32(ivers >> 32);
 			raw_inode->i_extra_isize =
 				cpu_to_le16(ei->i_extra_isize);
 		}
diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c
index 1eec250..7e99ad0 100644
--- a/fs/ext4/ioctl.c
+++ b/fs/ext4/ioctl.c
@@ -19,6 +19,7 @@
 #include <linux/uuid.h>
 #include <linux/uaccess.h>
 #include <linux/delay.h>
+#include <linux/iversion.h>
 #include "ext4_jbd2.h"
 #include "ext4.h"
 #include <linux/fsmap.h>
@@ -144,7 +145,7 @@ static long swap_inode_boot_loader(struct super_block *sb,
 		i_gid_write(inode_bl, 0);
 		inode_bl->i_flags = 0;
 		ei_bl->i_flags = 0;
-		inode_bl->i_version = 1;
+		inode_set_iversion(inode_bl, 1);
 		i_size_write(inode_bl, 0);
 		inode_bl->i_mode = S_IFREG;
 		if (ext4_has_feature_extents(sb)) {
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index e750d68..6660686 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -34,6 +34,7 @@
 #include <linux/quotaops.h>
 #include <linux/buffer_head.h>
 #include <linux/bio.h>
+#include <linux/iversion.h>
 #include "ext4.h"
 #include "ext4_jbd2.h"
 
@@ -2959,7 +2960,7 @@ static int ext4_rmdir(struct inode *dir, struct dentry *dentry)
 			     "empty directory '%.*s' has too many links (%u)",
 			     dentry->d_name.len, dentry->d_name.name,
 			     inode->i_nlink);
-	inode->i_version++;
+	inode_inc_iversion(inode);
 	clear_nlink(inode);
 	/* There's no need to set i_disksize: the fact that i_nlink is
 	 * zero will ensure that the right thing happens during any
@@ -3365,7 +3366,7 @@ static int ext4_setent(handle_t *handle, struct ext4_renament *ent,
 	ent->de->inode = cpu_to_le32(ino);
 	if (ext4_has_feature_filetype(ent->dir->i_sb))
 		ent->de->file_type = file_type;
-	ent->dir->i_version++;
+	inode_inc_iversion(ent->dir);
 	ent->dir->i_ctime = ent->dir->i_mtime =
 		current_time(ent->dir);
 	ext4_mark_inode_dirty(handle, ent->dir);
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 7c46693..5de959f 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -40,6 +40,7 @@
 #include <linux/dax.h>
 #include <linux/cleancache.h>
 #include <linux/uaccess.h>
+#include <linux/iversion.h>
 
 #include <linux/kthread.h>
 #include <linux/freezer.h>
@@ -967,7 +968,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
 	if (!ei)
 		return NULL;
 
-	ei->vfs_inode.i_version = 1;
+	inode_set_iversion(&ei->vfs_inode, 1);
 	spin_lock_init(&ei->i_raw_lock);
 	INIT_LIST_HEAD(&ei->i_prealloc_list);
 	spin_lock_init(&ei->i_prealloc_lock);
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 218a7ba..63656db 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -56,6 +56,7 @@
 #include <linux/slab.h>
 #include <linux/mbcache.h>
 #include <linux/quotaops.h>
+#include <linux/iversion.h>
 #include "ext4_jbd2.h"
 #include "ext4.h"
 #include "xattr.h"
@@ -294,13 +295,13 @@ ext4_xattr_inode_hash(struct ext4_sb_info *sbi, const void *buffer, size_t size)
 static u64 ext4_xattr_inode_get_ref(struct inode *ea_inode)
 {
 	return ((u64)ea_inode->i_ctime.tv_sec << 32) |
-	       ((u32)ea_inode->i_version);
+		(u32) inode_peek_iversion_raw(ea_inode);
 }
 
 static void ext4_xattr_inode_set_ref(struct inode *ea_inode, u64 ref_count)
 {
 	ea_inode->i_ctime.tv_sec = (u32)(ref_count >> 32);
-	ea_inode->i_version = (u32)ref_count;
+	inode_set_iversion_raw(ea_inode, ref_count & 0xffffffff);
 }
 
 static u32 ext4_xattr_inode_get_hash(struct inode *ea_inode)
diff --git a/fs/f2fs/data.c b/fs/f2fs/data.c
index 516fa0d..455f086 100644
--- a/fs/f2fs/data.c
+++ b/fs/f2fs/data.c
@@ -56,7 +56,7 @@ static void f2fs_read_end_io(struct bio *bio)
 	int i;
 
 #ifdef CONFIG_F2FS_FAULT_INJECTION
-	if (time_to_inject(F2FS_P_SB(bio->bi_io_vec->bv_page), FAULT_IO)) {
+	if (time_to_inject(F2FS_P_SB(bio_first_page_all(bio)), FAULT_IO)) {
 		f2fs_show_injection_info(FAULT_IO);
 		bio->bi_status = BLK_STS_IOERR;
 	}
diff --git a/fs/fat/dir.c b/fs/fat/dir.c
index b833ffe..8e100c3bf7 100644
--- a/fs/fat/dir.c
+++ b/fs/fat/dir.c
@@ -16,6 +16,7 @@
 #include <linux/slab.h>
 #include <linux/compat.h>
 #include <linux/uaccess.h>
+#include <linux/iversion.h>
 #include "fat.h"
 
 /*
@@ -1055,7 +1056,7 @@ int fat_remove_entries(struct inode *dir, struct fat_slot_info *sinfo)
 	brelse(bh);
 	if (err)
 		return err;
-	dir->i_version++;
+	inode_inc_iversion(dir);
 
 	if (nr_slots) {
 		/*
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 20a0a89..ffbbf05 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -20,6 +20,7 @@
 #include <linux/blkdev.h>
 #include <linux/backing-dev.h>
 #include <asm/unaligned.h>
+#include <linux/iversion.h>
 #include "fat.h"
 
 #ifndef CONFIG_FAT_DEFAULT_IOCHARSET
@@ -507,7 +508,7 @@ int fat_fill_inode(struct inode *inode, struct msdos_dir_entry *de)
 	MSDOS_I(inode)->i_pos = 0;
 	inode->i_uid = sbi->options.fs_uid;
 	inode->i_gid = sbi->options.fs_gid;
-	inode->i_version++;
+	inode_inc_iversion(inode);
 	inode->i_generation = get_seconds();
 
 	if ((de->attr & ATTR_DIR) && !IS_FREE(de->name)) {
@@ -590,7 +591,7 @@ struct inode *fat_build_inode(struct super_block *sb,
 		goto out;
 	}
 	inode->i_ino = iunique(sb, MSDOS_ROOT_INO);
-	inode->i_version = 1;
+	inode_set_iversion(inode, 1);
 	err = fat_fill_inode(inode, de);
 	if (err) {
 		iput(inode);
@@ -1377,7 +1378,7 @@ static int fat_read_root(struct inode *inode)
 	MSDOS_I(inode)->i_pos = MSDOS_ROOT_INO;
 	inode->i_uid = sbi->options.fs_uid;
 	inode->i_gid = sbi->options.fs_gid;
-	inode->i_version++;
+	inode_inc_iversion(inode);
 	inode->i_generation = 0;
 	inode->i_mode = fat_make_mode(sbi, ATTR_DIR, S_IRWXUGO);
 	inode->i_op = sbi->dir_ops;
@@ -1828,7 +1829,7 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat,
 	if (!root_inode)
 		goto out_fail;
 	root_inode->i_ino = MSDOS_ROOT_INO;
-	root_inode->i_version = 1;
+	inode_set_iversion(root_inode, 1);
 	error = fat_read_root(root_inode);
 	if (error < 0) {
 		iput(root_inode);
diff --git a/fs/fat/namei_msdos.c b/fs/fat/namei_msdos.c
index d24d275..582ca73 100644
--- a/fs/fat/namei_msdos.c
+++ b/fs/fat/namei_msdos.c
@@ -7,6 +7,7 @@
  */
 
 #include <linux/module.h>
+#include <linux/iversion.h>
 #include "fat.h"
 
 /* Characters that are undesirable in an MS-DOS file name */
@@ -480,7 +481,7 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name,
 			} else
 				mark_inode_dirty(old_inode);
 
-			old_dir->i_version++;
+			inode_inc_iversion(old_dir);
 			old_dir->i_ctime = old_dir->i_mtime = current_time(old_dir);
 			if (IS_DIRSYNC(old_dir))
 				(void)fat_sync_inode(old_dir);
@@ -508,7 +509,7 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name,
 			goto out;
 		new_i_pos = sinfo.i_pos;
 	}
-	new_dir->i_version++;
+	inode_inc_iversion(new_dir);
 
 	fat_detach(old_inode);
 	fat_attach(old_inode, new_i_pos);
@@ -540,7 +541,7 @@ static int do_msdos_rename(struct inode *old_dir, unsigned char *old_name,
 	old_sinfo.bh = NULL;
 	if (err)
 		goto error_dotdot;
-	old_dir->i_version++;
+	inode_inc_iversion(old_dir);
 	old_dir->i_ctime = old_dir->i_mtime = ts;
 	if (IS_DIRSYNC(old_dir))
 		(void)fat_sync_inode(old_dir);
diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index 02c0666..cefea792c 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -20,7 +20,7 @@
 #include <linux/slab.h>
 #include <linux/namei.h>
 #include <linux/kernel.h>
-
+#include <linux/iversion.h>
 #include "fat.h"
 
 static inline unsigned long vfat_d_version(struct dentry *dentry)
@@ -46,7 +46,7 @@ static int vfat_revalidate_shortname(struct dentry *dentry)
 {
 	int ret = 1;
 	spin_lock(&dentry->d_lock);
-	if (vfat_d_version(dentry) != d_inode(dentry->d_parent)->i_version)
+	if (inode_cmp_iversion(d_inode(dentry->d_parent), vfat_d_version(dentry)))
 		ret = 0;
 	spin_unlock(&dentry->d_lock);
 	return ret;
@@ -759,7 +759,7 @@ static struct dentry *vfat_lookup(struct inode *dir, struct dentry *dentry,
 out:
 	mutex_unlock(&MSDOS_SB(sb)->s_lock);
 	if (!inode)
-		vfat_d_version_set(dentry, dir->i_version);
+		vfat_d_version_set(dentry, inode_query_iversion(dir));
 	return d_splice_alias(inode, dentry);
 error:
 	mutex_unlock(&MSDOS_SB(sb)->s_lock);
@@ -781,7 +781,7 @@ static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 	err = vfat_add_entry(dir, &dentry->d_name, 0, 0, &ts, &sinfo);
 	if (err)
 		goto out;
-	dir->i_version++;
+	inode_inc_iversion(dir);
 
 	inode = fat_build_inode(sb, sinfo.de, sinfo.i_pos);
 	brelse(sinfo.bh);
@@ -789,7 +789,7 @@ static int vfat_create(struct inode *dir, struct dentry *dentry, umode_t mode,
 		err = PTR_ERR(inode);
 		goto out;
 	}
-	inode->i_version++;
+	inode_inc_iversion(inode);
 	inode->i_mtime = inode->i_atime = inode->i_ctime = ts;
 	/* timestamp is already written, so mark_inode_dirty() is unneeded. */
 
@@ -823,7 +823,7 @@ static int vfat_rmdir(struct inode *dir, struct dentry *dentry)
 	clear_nlink(inode);
 	inode->i_mtime = inode->i_atime = current_time(inode);
 	fat_detach(inode);
-	vfat_d_version_set(dentry, dir->i_version);
+	vfat_d_version_set(dentry, inode_query_iversion(dir));
 out:
 	mutex_unlock(&MSDOS_SB(sb)->s_lock);
 
@@ -849,7 +849,7 @@ static int vfat_unlink(struct inode *dir, struct dentry *dentry)
 	clear_nlink(inode);
 	inode->i_mtime = inode->i_atime = current_time(inode);
 	fat_detach(inode);
-	vfat_d_version_set(dentry, dir->i_version);
+	vfat_d_version_set(dentry, inode_query_iversion(dir));
 out:
 	mutex_unlock(&MSDOS_SB(sb)->s_lock);
 
@@ -875,7 +875,7 @@ static int vfat_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 	err = vfat_add_entry(dir, &dentry->d_name, 1, cluster, &ts, &sinfo);
 	if (err)
 		goto out_free;
-	dir->i_version++;
+	inode_inc_iversion(dir);
 	inc_nlink(dir);
 
 	inode = fat_build_inode(sb, sinfo.de, sinfo.i_pos);
@@ -885,7 +885,7 @@ static int vfat_mkdir(struct inode *dir, struct dentry *dentry, umode_t mode)
 		/* the directory was completed, just return a error */
 		goto out;
 	}
-	inode->i_version++;
+	inode_inc_iversion(inode);
 	set_nlink(inode, 2);
 	inode->i_mtime = inode->i_atime = inode->i_ctime = ts;
 	/* timestamp is already written, so mark_inode_dirty() is unneeded. */
@@ -951,7 +951,7 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry,
 			goto out;
 		new_i_pos = sinfo.i_pos;
 	}
-	new_dir->i_version++;
+	inode_inc_iversion(new_dir);
 
 	fat_detach(old_inode);
 	fat_attach(old_inode, new_i_pos);
@@ -979,7 +979,7 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry,
 	old_sinfo.bh = NULL;
 	if (err)
 		goto error_dotdot;
-	old_dir->i_version++;
+	inode_inc_iversion(old_dir);
 	old_dir->i_ctime = old_dir->i_mtime = ts;
 	if (IS_DIRSYNC(old_dir))
 		(void)fat_sync_inode(old_dir);
diff --git a/fs/file.c b/fs/file.c
index 3b08083..fc0eeb8 100644
--- a/fs/file.c
+++ b/fs/file.c
@@ -391,7 +391,7 @@ static struct fdtable *close_files(struct files_struct * files)
 				struct file * file = xchg(&fdt->fd[i], NULL);
 				if (file) {
 					filp_close(file, files);
-					cond_resched_rcu_qs();
+					cond_resched();
 				}
 			}
 			i++;
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index cea48363..d4d04fe 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -126,7 +126,7 @@ static void wb_io_lists_depopulated(struct bdi_writeback *wb)
  * inode_io_list_move_locked - move an inode onto a bdi_writeback IO list
  * @inode: inode to be moved
  * @wb: target bdi_writeback
- * @head: one of @wb->b_{dirty|io|more_io}
+ * @head: one of @wb->b_{dirty|io|more_io|dirty_time}
  *
  * Move @inode->i_io_list to @list of @wb and set %WB_has_dirty_io.
  * Returns %true if @inode is the first occupant of the !dirty_time IO
diff --git a/fs/inode.c b/fs/inode.c
index 03102d6..e2ca0f4 100644
--- a/fs/inode.c
+++ b/fs/inode.c
@@ -18,6 +18,7 @@
 #include <linux/buffer_head.h> /* for inode_has_buffers */
 #include <linux/ratelimit.h>
 #include <linux/list_lru.h>
+#include <linux/iversion.h>
 #include <trace/events/writeback.h>
 #include "internal.h"
 
@@ -1634,17 +1635,21 @@ static int relatime_need_update(const struct path *path, struct inode *inode,
 int generic_update_time(struct inode *inode, struct timespec *time, int flags)
 {
 	int iflags = I_DIRTY_TIME;
+	bool dirty = false;
 
 	if (flags & S_ATIME)
 		inode->i_atime = *time;
 	if (flags & S_VERSION)
-		inode_inc_iversion(inode);
+		dirty = inode_maybe_inc_iversion(inode, false);
 	if (flags & S_CTIME)
 		inode->i_ctime = *time;
 	if (flags & S_MTIME)
 		inode->i_mtime = *time;
+	if ((flags & (S_ATIME | S_CTIME | S_MTIME)) &&
+	    !(inode->i_sb->s_flags & SB_LAZYTIME))
+		dirty = true;
 
-	if (!(inode->i_sb->s_flags & SB_LAZYTIME) || (flags & S_VERSION))
+	if (dirty)
 		iflags |= I_DIRTY_SYNC;
 	__mark_inode_dirty(inode, iflags);
 	return 0;
@@ -1863,7 +1868,7 @@ int file_update_time(struct file *file)
 	if (!timespec_equal(&inode->i_ctime, &now))
 		sync_it |= S_CTIME;
 
-	if (IS_I_VERSION(inode))
+	if (IS_I_VERSION(inode) && inode_iversion_need_inc(inode))
 		sync_it |= S_VERSION;
 
 	if (!sync_it)
diff --git a/fs/nfs/delegation.c b/fs/nfs/delegation.c
index ade44ca..d8b4762 100644
--- a/fs/nfs/delegation.c
+++ b/fs/nfs/delegation.c
@@ -12,6 +12,7 @@
 #include <linux/sched.h>
 #include <linux/slab.h>
 #include <linux/spinlock.h>
+#include <linux/iversion.h>
 
 #include <linux/nfs4.h>
 #include <linux/nfs_fs.h>
@@ -347,7 +348,7 @@ int nfs_inode_set_delegation(struct inode *inode, struct rpc_cred *cred, struct
 	nfs4_stateid_copy(&delegation->stateid, &res->delegation);
 	delegation->type = res->delegation_type;
 	delegation->pagemod_limit = res->pagemod_limit;
-	delegation->change_attr = inode->i_version;
+	delegation->change_attr = inode_peek_iversion_raw(inode);
 	delegation->cred = get_rpccred(cred);
 	delegation->inode = inode;
 	delegation->flags = 1<<NFS_DELEGATION_REFERENCED;
diff --git a/fs/nfs/fscache-index.c b/fs/nfs/fscache-index.c
index 3025fe8..0ee4b93 100644
--- a/fs/nfs/fscache-index.c
+++ b/fs/nfs/fscache-index.c
@@ -16,6 +16,7 @@
 #include <linux/nfs_fs.h>
 #include <linux/nfs_fs_sb.h>
 #include <linux/in6.h>
+#include <linux/iversion.h>
 
 #include "internal.h"
 #include "fscache.h"
@@ -211,7 +212,7 @@ static uint16_t nfs_fscache_inode_get_aux(const void *cookie_netfs_data,
 	auxdata.ctime = nfsi->vfs_inode.i_ctime;
 
 	if (NFS_SERVER(&nfsi->vfs_inode)->nfs_client->rpc_ops->version == 4)
-		auxdata.change_attr = nfsi->vfs_inode.i_version;
+		auxdata.change_attr = inode_peek_iversion_raw(&nfsi->vfs_inode);
 
 	if (bufmax > sizeof(auxdata))
 		bufmax = sizeof(auxdata);
@@ -243,7 +244,7 @@ enum fscache_checkaux nfs_fscache_inode_check_aux(void *cookie_netfs_data,
 	auxdata.ctime = nfsi->vfs_inode.i_ctime;
 
 	if (NFS_SERVER(&nfsi->vfs_inode)->nfs_client->rpc_ops->version == 4)
-		auxdata.change_attr = nfsi->vfs_inode.i_version;
+		auxdata.change_attr = inode_peek_iversion_raw(&nfsi->vfs_inode);
 
 	if (memcmp(data, &auxdata, datalen) != 0)
 		return FSCACHE_CHECKAUX_OBSOLETE;
diff --git a/fs/nfs/inode.c b/fs/nfs/inode.c
index b992d23..93552c4 100644
--- a/fs/nfs/inode.c
+++ b/fs/nfs/inode.c
@@ -38,8 +38,8 @@
 #include <linux/slab.h>
 #include <linux/compat.h>
 #include <linux/freezer.h>
-
 #include <linux/uaccess.h>
+#include <linux/iversion.h>
 
 #include "nfs4_fs.h"
 #include "callback.h"
@@ -483,7 +483,7 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr, st
 		memset(&inode->i_atime, 0, sizeof(inode->i_atime));
 		memset(&inode->i_mtime, 0, sizeof(inode->i_mtime));
 		memset(&inode->i_ctime, 0, sizeof(inode->i_ctime));
-		inode->i_version = 0;
+		inode_set_iversion_raw(inode, 0);
 		inode->i_size = 0;
 		clear_nlink(inode);
 		inode->i_uid = make_kuid(&init_user_ns, -2);
@@ -508,7 +508,7 @@ nfs_fhget(struct super_block *sb, struct nfs_fh *fh, struct nfs_fattr *fattr, st
 		else if (nfs_server_capable(inode, NFS_CAP_CTIME))
 			nfs_set_cache_invalid(inode, NFS_INO_INVALID_ATTR);
 		if (fattr->valid & NFS_ATTR_FATTR_CHANGE)
-			inode->i_version = fattr->change_attr;
+			inode_set_iversion_raw(inode, fattr->change_attr);
 		else
 			nfs_set_cache_invalid(inode, NFS_INO_INVALID_ATTR
 				| NFS_INO_REVAL_PAGECACHE);
@@ -1289,8 +1289,8 @@ static unsigned long nfs_wcc_update_inode(struct inode *inode, struct nfs_fattr
 
 	if ((fattr->valid & NFS_ATTR_FATTR_PRECHANGE)
 			&& (fattr->valid & NFS_ATTR_FATTR_CHANGE)
-			&& inode->i_version == fattr->pre_change_attr) {
-		inode->i_version = fattr->change_attr;
+			&& !inode_cmp_iversion_raw(inode, fattr->pre_change_attr)) {
+		inode_set_iversion_raw(inode, fattr->change_attr);
 		if (S_ISDIR(inode->i_mode))
 			nfs_set_cache_invalid(inode, NFS_INO_INVALID_DATA);
 		ret |= NFS_INO_INVALID_ATTR;
@@ -1348,7 +1348,7 @@ static int nfs_check_inode_attributes(struct inode *inode, struct nfs_fattr *fat
 
 	if (!nfs_file_has_buffered_writers(nfsi)) {
 		/* Verify a few of the more important attributes */
-		if ((fattr->valid & NFS_ATTR_FATTR_CHANGE) != 0 && inode->i_version != fattr->change_attr)
+		if ((fattr->valid & NFS_ATTR_FATTR_CHANGE) != 0 && inode_cmp_iversion_raw(inode, fattr->change_attr))
 			invalid |= NFS_INO_INVALID_ATTR | NFS_INO_REVAL_PAGECACHE;
 
 		if ((fattr->valid & NFS_ATTR_FATTR_MTIME) && !timespec_equal(&inode->i_mtime, &fattr->mtime))
@@ -1642,7 +1642,7 @@ int nfs_post_op_update_inode_force_wcc_locked(struct inode *inode, struct nfs_fa
 	}
 	if ((fattr->valid & NFS_ATTR_FATTR_CHANGE) != 0 &&
 			(fattr->valid & NFS_ATTR_FATTR_PRECHANGE) == 0) {
-		fattr->pre_change_attr = inode->i_version;
+		fattr->pre_change_attr = inode_peek_iversion_raw(inode);
 		fattr->valid |= NFS_ATTR_FATTR_PRECHANGE;
 	}
 	if ((fattr->valid & NFS_ATTR_FATTR_CTIME) != 0 &&
@@ -1778,7 +1778,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
 
 	/* More cache consistency checks */
 	if (fattr->valid & NFS_ATTR_FATTR_CHANGE) {
-		if (inode->i_version != fattr->change_attr) {
+		if (inode_cmp_iversion_raw(inode, fattr->change_attr)) {
 			dprintk("NFS: change_attr change on server for file %s/%ld\n",
 					inode->i_sb->s_id, inode->i_ino);
 			/* Could it be a race with writeback? */
@@ -1790,7 +1790,7 @@ static int nfs_update_inode(struct inode *inode, struct nfs_fattr *fattr)
 				if (S_ISDIR(inode->i_mode))
 					nfs_force_lookup_revalidate(inode);
 			}
-			inode->i_version = fattr->change_attr;
+			inode_set_iversion_raw(inode, fattr->change_attr);
 		}
 	} else {
 		nfsi->cache_validity |= save_cache_validity;
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 56fa5a1..17a03f2 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -54,6 +54,7 @@
 #include <linux/xattr.h>
 #include <linux/utsname.h>
 #include <linux/freezer.h>
+#include <linux/iversion.h>
 
 #include "nfs4_fs.h"
 #include "delegation.h"
@@ -1045,16 +1046,16 @@ static void update_changeattr(struct inode *dir, struct nfs4_change_info *cinfo,
 
 	spin_lock(&dir->i_lock);
 	nfsi->cache_validity |= NFS_INO_INVALID_ATTR|NFS_INO_INVALID_DATA;
-	if (cinfo->atomic && cinfo->before == dir->i_version) {
+	if (cinfo->atomic && cinfo->before == inode_peek_iversion_raw(dir)) {
 		nfsi->cache_validity &= ~NFS_INO_REVAL_PAGECACHE;
 		nfsi->attrtimeo_timestamp = jiffies;
 	} else {
 		nfs_force_lookup_revalidate(dir);
-		if (cinfo->before != dir->i_version)
+		if (cinfo->before != inode_peek_iversion_raw(dir))
 			nfsi->cache_validity |= NFS_INO_INVALID_ACCESS |
 				NFS_INO_INVALID_ACL;
 	}
-	dir->i_version = cinfo->after;
+	inode_set_iversion_raw(dir, cinfo->after);
 	nfsi->read_cache_jiffies = timestamp;
 	nfsi->attr_gencount = nfs_inc_attr_generation_counter();
 	nfs_fscache_invalidate(dir);
@@ -2454,7 +2455,8 @@ static int _nfs4_proc_open(struct nfs4_opendata *data)
 			data->file_created = true;
 		else if (o_res->cinfo.before != o_res->cinfo.after)
 			data->file_created = true;
-		if (data->file_created || dir->i_version != o_res->cinfo.after)
+		if (data->file_created ||
+		    inode_peek_iversion_raw(dir) != o_res->cinfo.after)
 			update_changeattr(dir, &o_res->cinfo,
 					o_res->f_attr->time_start);
 	}
diff --git a/fs/nfs/nfstrace.h b/fs/nfs/nfstrace.h
index 093290c..610d89d 100644
--- a/fs/nfs/nfstrace.h
+++ b/fs/nfs/nfstrace.h
@@ -9,6 +9,7 @@
 #define _TRACE_NFS_H
 
 #include <linux/tracepoint.h>
+#include <linux/iversion.h>
 
 #define nfs_show_file_type(ftype) \
 	__print_symbolic(ftype, \
@@ -61,7 +62,7 @@ DECLARE_EVENT_CLASS(nfs_inode_event,
 			__entry->dev = inode->i_sb->s_dev;
 			__entry->fileid = nfsi->fileid;
 			__entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
-			__entry->version = inode->i_version;
+			__entry->version = inode_peek_iversion_raw(inode);
 		),
 
 		TP_printk(
@@ -100,7 +101,7 @@ DECLARE_EVENT_CLASS(nfs_inode_event_done,
 			__entry->fileid = nfsi->fileid;
 			__entry->fhandle = nfs_fhandle_hash(&nfsi->fh);
 			__entry->type = nfs_umode_to_dtype(inode->i_mode);
-			__entry->version = inode->i_version;
+			__entry->version = inode_peek_iversion_raw(inode);
 			__entry->size = i_size_read(inode);
 			__entry->nfsi_flags = nfsi->flags;
 			__entry->cache_validity = nfsi->cache_validity;
diff --git a/fs/nfs/write.c b/fs/nfs/write.c
index 4a379d7..12b2d47 100644
--- a/fs/nfs/write.c
+++ b/fs/nfs/write.c
@@ -23,6 +23,7 @@
 #include <linux/export.h>
 #include <linux/freezer.h>
 #include <linux/wait.h>
+#include <linux/iversion.h>
 
 #include <linux/uaccess.h>
 
@@ -753,11 +754,8 @@ static void nfs_inode_add_request(struct inode *inode, struct nfs_page *req)
 	 */
 	spin_lock(&mapping->private_lock);
 	if (!nfs_have_writebacks(inode) &&
-	    NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE)) {
-		spin_lock(&inode->i_lock);
-		inode->i_version++;
-		spin_unlock(&inode->i_lock);
-	}
+	    NFS_PROTO(inode)->have_delegation(inode, FMODE_WRITE))
+		inode_inc_iversion_raw(inode);
 	if (likely(!PageSwapCache(req->wb_page))) {
 		set_bit(PG_MAPPED, &req->wb_flags);
 		SetPagePrivate(req->wb_page);
diff --git a/fs/nfsd/auth.c b/fs/nfsd/auth.c
index f650e47..fdf2aad 100644
--- a/fs/nfsd/auth.c
+++ b/fs/nfsd/auth.c
@@ -60,10 +60,10 @@ int nfsd_setuser(struct svc_rqst *rqstp, struct svc_export *exp)
 				gi->gid[i] = exp->ex_anon_gid;
 			else
 				gi->gid[i] = rqgi->gid[i];
-
-			/* Each thread allocates its own gi, no race */
-			groups_sort(gi);
 		}
+
+		/* Each thread allocates its own gi, no race */
+		groups_sort(gi);
 	} else {
 		gi = get_group_info(rqgi);
 	}
diff --git a/fs/nfsd/nfsfh.h b/fs/nfsd/nfsfh.h
index 43f31cf..b844418 100644
--- a/fs/nfsd/nfsfh.h
+++ b/fs/nfsd/nfsfh.h
@@ -11,6 +11,7 @@
 #include <linux/crc32.h>
 #include <linux/sunrpc/svc.h>
 #include <uapi/linux/nfsd/nfsfh.h>
+#include <linux/iversion.h>
 
 static inline __u32 ino_t_to_u32(ino_t ino)
 {
@@ -259,7 +260,7 @@ static inline u64 nfsd4_change_attribute(struct inode *inode)
 	chattr =  inode->i_ctime.tv_sec;
 	chattr <<= 30;
 	chattr += inode->i_ctime.tv_nsec;
-	chattr += inode->i_version;
+	chattr += inode_query_iversion(inode);
 	return chattr;
 }
 
diff --git a/fs/ntfs/inode.c b/fs/ntfs/inode.c
index 7c410f8..1c1ee48 100644
--- a/fs/ntfs/inode.c
+++ b/fs/ntfs/inode.c
@@ -560,13 +560,6 @@ static int ntfs_read_locked_inode(struct inode *vi)
 	ntfs_debug("Entering for i_ino 0x%lx.", vi->i_ino);
 
 	/* Setup the generic vfs inode parts now. */
-
-	/*
-	 * This is for checking whether an inode has changed w.r.t. a file so
-	 * that the file can be updated if necessary (compare with f_version).
-	 */
-	vi->i_version = 1;
-
 	vi->i_uid = vol->uid;
 	vi->i_gid = vol->gid;
 	vi->i_mode = 0;
@@ -1240,7 +1233,6 @@ static int ntfs_read_locked_attr_inode(struct inode *base_vi, struct inode *vi)
 	base_ni = NTFS_I(base_vi);
 
 	/* Just mirror the values from the base inode. */
-	vi->i_version	= base_vi->i_version;
 	vi->i_uid	= base_vi->i_uid;
 	vi->i_gid	= base_vi->i_gid;
 	set_nlink(vi, base_vi->i_nlink);
@@ -1507,7 +1499,6 @@ static int ntfs_read_locked_index_inode(struct inode *base_vi, struct inode *vi)
 	ni	= NTFS_I(vi);
 	base_ni = NTFS_I(base_vi);
 	/* Just mirror the values from the base inode. */
-	vi->i_version	= base_vi->i_version;
 	vi->i_uid	= base_vi->i_uid;
 	vi->i_gid	= base_vi->i_gid;
 	set_nlink(vi, base_vi->i_nlink);
diff --git a/fs/ntfs/mft.c b/fs/ntfs/mft.c
index ee8392a..2831f49 100644
--- a/fs/ntfs/mft.c
+++ b/fs/ntfs/mft.c
@@ -2641,12 +2641,6 @@ ntfs_inode *ntfs_mft_record_alloc(ntfs_volume *vol, const int mode,
 			goto undo_mftbmp_alloc;
 		}
 		vi->i_ino = bit;
-		/*
-		 * This is for checking whether an inode has changed w.r.t. a
-		 * file so that the file can be updated if necessary (compare
-		 * with f_version).
-		 */
-		vi->i_version = 1;
 
 		/* The owner and group come from the ntfs volume. */
 		vi->i_uid = vol->uid;
diff --git a/fs/ocfs2/dir.c b/fs/ocfs2/dir.c
index febe631..32f9c72 100644
--- a/fs/ocfs2/dir.c
+++ b/fs/ocfs2/dir.c
@@ -42,6 +42,7 @@
 #include <linux/highmem.h>
 #include <linux/quotaops.h>
 #include <linux/sort.h>
+#include <linux/iversion.h>
 
 #include <cluster/masklog.h>
 
@@ -1174,7 +1175,7 @@ static int __ocfs2_delete_entry(handle_t *handle, struct inode *dir,
 				le16_add_cpu(&pde->rec_len,
 						le16_to_cpu(de->rec_len));
 			de->inode = 0;
-			dir->i_version++;
+			inode_inc_iversion(dir);
 			ocfs2_journal_dirty(handle, bh);
 			goto bail;
 		}
@@ -1729,7 +1730,7 @@ int __ocfs2_add_entry(handle_t *handle,
 			if (ocfs2_dir_indexed(dir))
 				ocfs2_recalc_free_list(dir, handle, lookup);
 
-			dir->i_version++;
+			inode_inc_iversion(dir);
 			ocfs2_journal_dirty(handle, insert_bh);
 			retval = 0;
 			goto bail;
@@ -1775,7 +1776,7 @@ static int ocfs2_dir_foreach_blk_id(struct inode *inode,
 		 * readdir(2), then we might be pointing to an invalid
 		 * dirent right now.  Scan from the start of the block
 		 * to make sure. */
-		if (*f_version != inode->i_version) {
+		if (inode_cmp_iversion(inode, *f_version)) {
 			for (i = 0; i < i_size_read(inode) && i < offset; ) {
 				de = (struct ocfs2_dir_entry *)
 					(data->id_data + i);
@@ -1791,7 +1792,7 @@ static int ocfs2_dir_foreach_blk_id(struct inode *inode,
 				i += le16_to_cpu(de->rec_len);
 			}
 			ctx->pos = offset = i;
-			*f_version = inode->i_version;
+			*f_version = inode_query_iversion(inode);
 		}
 
 		de = (struct ocfs2_dir_entry *) (data->id_data + ctx->pos);
@@ -1869,7 +1870,7 @@ static int ocfs2_dir_foreach_blk_el(struct inode *inode,
 		 * readdir(2), then we might be pointing to an invalid
 		 * dirent right now.  Scan from the start of the block
 		 * to make sure. */
-		if (*f_version != inode->i_version) {
+		if (inode_cmp_iversion(inode, *f_version)) {
 			for (i = 0; i < sb->s_blocksize && i < offset; ) {
 				de = (struct ocfs2_dir_entry *) (bh->b_data + i);
 				/* It's too expensive to do a full
@@ -1886,7 +1887,7 @@ static int ocfs2_dir_foreach_blk_el(struct inode *inode,
 			offset = i;
 			ctx->pos = (ctx->pos & ~(sb->s_blocksize - 1))
 				| offset;
-			*f_version = inode->i_version;
+			*f_version = inode_query_iversion(inode);
 		}
 
 		while (ctx->pos < i_size_read(inode)
@@ -1940,7 +1941,7 @@ static int ocfs2_dir_foreach_blk(struct inode *inode, u64 *f_version,
  */
 int ocfs2_dir_foreach(struct inode *inode, struct dir_context *ctx)
 {
-	u64 version = inode->i_version;
+	u64 version = inode_query_iversion(inode);
 	ocfs2_dir_foreach_blk(inode, &version, ctx, true);
 	return 0;
 }
diff --git a/fs/ocfs2/inode.c b/fs/ocfs2/inode.c
index 1a1e007..d51b80e 100644
--- a/fs/ocfs2/inode.c
+++ b/fs/ocfs2/inode.c
@@ -28,6 +28,7 @@
 #include <linux/highmem.h>
 #include <linux/pagemap.h>
 #include <linux/quotaops.h>
+#include <linux/iversion.h>
 
 #include <asm/byteorder.h>
 
@@ -302,7 +303,7 @@ void ocfs2_populate_inode(struct inode *inode, struct ocfs2_dinode *fe,
 	OCFS2_I(inode)->ip_attr = le32_to_cpu(fe->i_attr);
 	OCFS2_I(inode)->ip_dyn_features = le16_to_cpu(fe->i_dyn_features);
 
-	inode->i_version = 1;
+	inode_set_iversion(inode, 1);
 	inode->i_generation = le32_to_cpu(fe->i_generation);
 	inode->i_rdev = huge_decode_dev(le64_to_cpu(fe->id1.dev1.i_rdev));
 	inode->i_mode = le16_to_cpu(fe->i_mode);
diff --git a/fs/ocfs2/namei.c b/fs/ocfs2/namei.c
index 3b0a10d..c801edd 100644
--- a/fs/ocfs2/namei.c
+++ b/fs/ocfs2/namei.c
@@ -41,6 +41,7 @@
 #include <linux/slab.h>
 #include <linux/highmem.h>
 #include <linux/quotaops.h>
+#include <linux/iversion.h>
 
 #include <cluster/masklog.h>
 
@@ -1520,7 +1521,7 @@ static int ocfs2_rename(struct inode *old_dir,
 			mlog_errno(status);
 			goto bail;
 		}
-		new_dir->i_version++;
+		inode_inc_iversion(new_dir);
 
 		if (S_ISDIR(new_inode->i_mode))
 			ocfs2_set_links_count(newfe, 0);
diff --git a/fs/ocfs2/quota_global.c b/fs/ocfs2/quota_global.c
index b39d14c..7a92219 100644
--- a/fs/ocfs2/quota_global.c
+++ b/fs/ocfs2/quota_global.c
@@ -12,6 +12,7 @@
 #include <linux/writeback.h>
 #include <linux/workqueue.h>
 #include <linux/llist.h>
+#include <linux/iversion.h>
 
 #include <cluster/masklog.h>
 
@@ -289,7 +290,7 @@ ssize_t ocfs2_quota_write(struct super_block *sb, int type,
 		mlog_errno(err);
 		return err;
 	}
-	gqinode->i_version++;
+	inode_inc_iversion(gqinode);
 	ocfs2_mark_inode_dirty(handle, gqinode, oinfo->dqi_gqi_bh);
 	return len;
 }
diff --git a/fs/orangefs/devorangefs-req.c b/fs/orangefs/devorangefs-req.c
index ded456f..c584ad8 100644
--- a/fs/orangefs/devorangefs-req.c
+++ b/fs/orangefs/devorangefs-req.c
@@ -162,7 +162,7 @@ static ssize_t orangefs_devreq_read(struct file *file,
 	struct orangefs_kernel_op_s *op, *temp;
 	__s32 proto_ver = ORANGEFS_KERNEL_PROTO_VERSION;
 	static __s32 magic = ORANGEFS_DEVREQ_MAGIC;
-	struct orangefs_kernel_op_s *cur_op = NULL;
+	struct orangefs_kernel_op_s *cur_op;
 	unsigned long ret;
 
 	/* We do not support blocking IO. */
@@ -186,6 +186,7 @@ static ssize_t orangefs_devreq_read(struct file *file,
 		return -EAGAIN;
 
 restart:
+	cur_op = NULL;
 	/* Get next op (if any) from top of list. */
 	spin_lock(&orangefs_request_list_lock);
 	list_for_each_entry_safe(op, temp, &orangefs_request_list, list) {
diff --git a/fs/orangefs/file.c b/fs/orangefs/file.c
index 1668fd6..0d228cd 100644
--- a/fs/orangefs/file.c
+++ b/fs/orangefs/file.c
@@ -452,7 +452,7 @@ ssize_t orangefs_inode_read(struct inode *inode,
 static ssize_t orangefs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
 {
 	struct file *file = iocb->ki_filp;
-	loff_t pos = *(&iocb->ki_pos);
+	loff_t pos = iocb->ki_pos;
 	ssize_t rc = 0;
 
 	BUG_ON(iocb->private);
@@ -492,9 +492,6 @@ static ssize_t orangefs_file_write_iter(struct kiocb *iocb, struct iov_iter *ite
 		}
 	}
 
-	if (file->f_pos > i_size_read(file->f_mapping->host))
-		orangefs_i_size_write(file->f_mapping->host, file->f_pos);
-
 	rc = generic_write_checks(iocb, iter);
 
 	if (rc <= 0) {
@@ -508,7 +505,7 @@ static ssize_t orangefs_file_write_iter(struct kiocb *iocb, struct iov_iter *ite
 	 * pos to the end of the file, so we will wait till now to set
 	 * pos...
 	 */
-	pos = *(&iocb->ki_pos);
+	pos = iocb->ki_pos;
 
 	rc = do_readv_writev(ORANGEFS_IO_WRITE,
 			     file,
diff --git a/fs/orangefs/orangefs-kernel.h b/fs/orangefs/orangefs-kernel.h
index 97adf7d1..2595453 100644
--- a/fs/orangefs/orangefs-kernel.h
+++ b/fs/orangefs/orangefs-kernel.h
@@ -533,17 +533,6 @@ do {									\
 	sys_attr.mask = ORANGEFS_ATTR_SYS_ALL_SETABLE;			\
 } while (0)
 
-static inline void orangefs_i_size_write(struct inode *inode, loff_t i_size)
-{
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
-	inode_lock(inode);
-#endif
-	i_size_write(inode, i_size);
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
-	inode_unlock(inode);
-#endif
-}
-
 static inline void orangefs_set_timeout(struct dentry *dentry)
 {
 	unsigned long time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000;
diff --git a/fs/orangefs/waitqueue.c b/fs/orangefs/waitqueue.c
index 835c6e1..0577d6d 100644
--- a/fs/orangefs/waitqueue.c
+++ b/fs/orangefs/waitqueue.c
@@ -29,10 +29,10 @@ static void orangefs_clean_up_interrupted_operation(struct orangefs_kernel_op_s
  */
 void purge_waiting_ops(void)
 {
-	struct orangefs_kernel_op_s *op;
+	struct orangefs_kernel_op_s *op, *tmp;
 
 	spin_lock(&orangefs_request_list_lock);
-	list_for_each_entry(op, &orangefs_request_list, list) {
+	list_for_each_entry_safe(op, tmp, &orangefs_request_list, list) {
 		gossip_debug(GOSSIP_WAIT_DEBUG,
 			     "pvfs2-client-core: purging op tag %llu %s\n",
 			     llu(op->tag),
diff --git a/fs/proc/array.c b/fs/proc/array.c
index 79375fc..d67a72d 100644
--- a/fs/proc/array.c
+++ b/fs/proc/array.c
@@ -430,8 +430,11 @@ static int do_task_stat(struct seq_file *m, struct pid_namespace *ns,
 		 * safe because the task has stopped executing permanently.
 		 */
 		if (permitted && (task->flags & PF_DUMPCORE)) {
-			eip = KSTK_EIP(task);
-			esp = KSTK_ESP(task);
+			if (try_get_task_stack(task)) {
+				eip = KSTK_EIP(task);
+				esp = KSTK_ESP(task);
+				put_task_stack(task);
+			}
 		}
 	}
 
diff --git a/fs/super.c b/fs/super.c
index 7ff1349..06bd25d 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -517,7 +517,11 @@ struct super_block *sget_userns(struct file_system_type *type,
 	hlist_add_head(&s->s_instances, &type->fs_supers);
 	spin_unlock(&sb_lock);
 	get_filesystem(type);
-	register_shrinker(&s->s_shrink);
+	err = register_shrinker(&s->s_shrink);
+	if (err) {
+		deactivate_locked_super(s);
+		s = ERR_PTR(err);
+	}
 	return s;
 }
 
diff --git a/fs/ubifs/dir.c b/fs/ubifs/dir.c
index 417fe0b..a2ea4856 100644
--- a/fs/ubifs/dir.c
+++ b/fs/ubifs/dir.c
@@ -220,20 +220,9 @@ static struct dentry *ubifs_lookup(struct inode *dir, struct dentry *dentry,
 
 	dbg_gen("'%pd' in dir ino %lu", dentry, dir->i_ino);
 
-	if (ubifs_crypt_is_encrypted(dir)) {
-		err = fscrypt_get_encryption_info(dir);
-
-		/*
-		 * DCACHE_ENCRYPTED_WITH_KEY is set if the dentry is
-		 * created while the directory was encrypted and we
-		 * have access to the key.
-		 */
-		if (fscrypt_has_encryption_key(dir))
-			fscrypt_set_encrypted_dentry(dentry);
-		fscrypt_set_d_op(dentry);
-		if (err && err != -ENOKEY)
-			return ERR_PTR(err);
-	}
+	err = fscrypt_prepare_lookup(dir, dentry, flags);
+	if (err)
+		return ERR_PTR(err);
 
 	err = fscrypt_setup_filename(dir, &dentry->d_name, 1, &nm);
 	if (err)
@@ -743,9 +732,9 @@ static int ubifs_link(struct dentry *old_dentry, struct inode *dir,
 	ubifs_assert(inode_is_locked(dir));
 	ubifs_assert(inode_is_locked(inode));
 
-	if (ubifs_crypt_is_encrypted(dir) &&
-	    !fscrypt_has_permitted_context(dir, inode))
-		return -EPERM;
+	err = fscrypt_prepare_link(old_dentry, dir, dentry);
+	if (err)
+		return err;
 
 	err = fscrypt_setup_filename(dir, &dentry->d_name, 0, &nm);
 	if (err)
@@ -1353,12 +1342,6 @@ static int do_rename(struct inode *old_dir, struct dentry *old_dentry,
 	if (unlink)
 		ubifs_assert(inode_is_locked(new_inode));
 
-	if (old_dir != new_dir) {
-		if (ubifs_crypt_is_encrypted(new_dir) &&
-		    !fscrypt_has_permitted_context(new_dir, old_inode))
-			return -EPERM;
-	}
-
 	if (unlink && is_dir) {
 		err = ubifs_check_dir_empty(new_inode);
 		if (err)
@@ -1573,13 +1556,6 @@ static int ubifs_xrename(struct inode *old_dir, struct dentry *old_dentry,
 
 	ubifs_assert(fst_inode && snd_inode);
 
-	if ((ubifs_crypt_is_encrypted(old_dir) ||
-	    ubifs_crypt_is_encrypted(new_dir)) &&
-	    (old_dir != new_dir) &&
-	    (!fscrypt_has_permitted_context(new_dir, fst_inode) ||
-	     !fscrypt_has_permitted_context(old_dir, snd_inode)))
-		return -EPERM;
-
 	err = fscrypt_setup_filename(old_dir, &old_dentry->d_name, 0, &fst_nm);
 	if (err)
 		return err;
@@ -1624,12 +1600,19 @@ static int ubifs_rename(struct inode *old_dir, struct dentry *old_dentry,
 			struct inode *new_dir, struct dentry *new_dentry,
 			unsigned int flags)
 {
+	int err;
+
 	if (flags & ~(RENAME_NOREPLACE | RENAME_WHITEOUT | RENAME_EXCHANGE))
 		return -EINVAL;
 
 	ubifs_assert(inode_is_locked(old_dir));
 	ubifs_assert(inode_is_locked(new_dir));
 
+	err = fscrypt_prepare_rename(old_dir, old_dentry, new_dir, new_dentry,
+				     flags);
+	if (err)
+		return err;
+
 	if (flags & RENAME_EXCHANGE)
 		return ubifs_xrename(old_dir, old_dentry, new_dir, new_dentry);
 
diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index dfe8506..9fe194a 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -1284,13 +1284,9 @@ int ubifs_setattr(struct dentry *dentry, struct iattr *attr)
 	if (err)
 		return err;
 
-	if (ubifs_crypt_is_encrypted(inode) && (attr->ia_valid & ATTR_SIZE)) {
-		err = fscrypt_get_encryption_info(inode);
-		if (err)
-			return err;
-		if (!fscrypt_has_encryption_key(inode))
-			return -ENOKEY;
-	}
+	err = fscrypt_prepare_setattr(dentry, attr);
+	if (err)
+		return err;
 
 	if ((attr->ia_valid & ATTR_SIZE) && attr->ia_size < inode->i_size)
 		/* Truncation to a smaller size */
@@ -1629,35 +1625,6 @@ static int ubifs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	return 0;
 }
 
-static int ubifs_file_open(struct inode *inode, struct file *filp)
-{
-	int ret;
-	struct dentry *dir;
-	struct ubifs_info *c = inode->i_sb->s_fs_info;
-
-	if (ubifs_crypt_is_encrypted(inode)) {
-		ret = fscrypt_get_encryption_info(inode);
-		if (ret)
-			return -EACCES;
-		if (!fscrypt_has_encryption_key(inode))
-			return -ENOKEY;
-	}
-
-	dir = dget_parent(file_dentry(filp));
-	if (ubifs_crypt_is_encrypted(d_inode(dir)) &&
-			!fscrypt_has_permitted_context(d_inode(dir), inode)) {
-		ubifs_err(c, "Inconsistent encryption contexts: %lu/%lu",
-			  (unsigned long) d_inode(dir)->i_ino,
-			  (unsigned long) inode->i_ino);
-		dput(dir);
-		ubifs_ro_mode(c, -EPERM);
-		return -EPERM;
-	}
-	dput(dir);
-
-	return 0;
-}
-
 static const char *ubifs_get_link(struct dentry *dentry,
 					    struct inode *inode,
 					    struct delayed_call *done)
@@ -1746,7 +1713,7 @@ const struct file_operations ubifs_file_operations = {
 	.unlocked_ioctl = ubifs_ioctl,
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
-	.open		= ubifs_file_open,
+	.open		= fscrypt_file_open,
 #ifdef CONFIG_COMPAT
 	.compat_ioctl   = ubifs_compat_ioctl,
 #endif
diff --git a/fs/ubifs/tnc.c b/fs/ubifs/tnc.c
index 0a213dc..ba3d0e0 100644
--- a/fs/ubifs/tnc.c
+++ b/fs/ubifs/tnc.c
@@ -1890,35 +1890,28 @@ static int search_dh_cookie(struct ubifs_info *c, const union ubifs_key *key,
 	union ubifs_key *dkey;
 
 	for (;;) {
-		if (!err) {
-			err = tnc_next(c, &znode, n);
-			if (err)
-				goto out;
-		}
-
 		zbr = &znode->zbranch[*n];
 		dkey = &zbr->key;
 
 		if (key_inum(c, dkey) != key_inum(c, key) ||
 		    key_type(c, dkey) != key_type(c, key)) {
-			err = -ENOENT;
-			goto out;
+			return -ENOENT;
 		}
 
 		err = tnc_read_hashed_node(c, zbr, dent);
 		if (err)
-			goto out;
+			return err;
 
 		if (key_hash(c, key) == key_hash(c, dkey) &&
 		    le32_to_cpu(dent->cookie) == cookie) {
 			*zn = znode;
-			goto out;
+			return 0;
 		}
+
+		err = tnc_next(c, &znode, n);
+		if (err)
+			return err;
 	}
-
-out:
-
-	return err;
 }
 
 static int do_lookup_dh(struct ubifs_info *c, const union ubifs_key *key,
diff --git a/fs/ubifs/xattr.c b/fs/ubifs/xattr.c
index 5ddc89d..759f1a2 100644
--- a/fs/ubifs/xattr.c
+++ b/fs/ubifs/xattr.c
@@ -381,8 +381,6 @@ ssize_t ubifs_xattr_get(struct inode *host, const char *name, void *buf,
 	if (buf) {
 		/* If @buf is %NULL we are supposed to return the length */
 		if (ui->data_len > size) {
-			ubifs_err(c, "buffer size %zd, xattr len %d",
-				  size, ui->data_len);
 			err = -ERANGE;
 			goto out_iput;
 		}
diff --git a/fs/ufs/dir.c b/fs/ufs/dir.c
index 2edc175..50dfce0 100644
--- a/fs/ufs/dir.c
+++ b/fs/ufs/dir.c
@@ -20,6 +20,7 @@
 #include <linux/time.h>
 #include <linux/fs.h>
 #include <linux/swap.h>
+#include <linux/iversion.h>
 
 #include "ufs_fs.h"
 #include "ufs.h"
@@ -47,7 +48,7 @@ static int ufs_commit_chunk(struct page *page, loff_t pos, unsigned len)
 	struct inode *dir = mapping->host;
 	int err = 0;
 
-	dir->i_version++;
+	inode_inc_iversion(dir);
 	block_write_end(NULL, mapping, pos, len, len, page, NULL);
 	if (pos+len > dir->i_size) {
 		i_size_write(dir, pos+len);
@@ -428,7 +429,7 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
 	unsigned long n = pos >> PAGE_SHIFT;
 	unsigned long npages = dir_pages(inode);
 	unsigned chunk_mask = ~(UFS_SB(sb)->s_uspi->s_dirblksize - 1);
-	int need_revalidate = file->f_version != inode->i_version;
+	bool need_revalidate = inode_cmp_iversion(inode, file->f_version);
 	unsigned flags = UFS_SB(sb)->s_flags;
 
 	UFSD("BEGIN\n");
@@ -455,8 +456,8 @@ ufs_readdir(struct file *file, struct dir_context *ctx)
 				offset = ufs_validate_entry(sb, kaddr, offset, chunk_mask);
 				ctx->pos = (n<<PAGE_SHIFT) + offset;
 			}
-			file->f_version = inode->i_version;
-			need_revalidate = 0;
+			file->f_version = inode_query_iversion(inode);
+			need_revalidate = false;
 		}
 		de = (struct ufs_dir_entry *)(kaddr+offset);
 		limit = kaddr + ufs_last_byte(inode, n) - UFS_DIR_REC_LEN(1);
diff --git a/fs/ufs/inode.c b/fs/ufs/inode.c
index afb601c..c843ec8 100644
--- a/fs/ufs/inode.c
+++ b/fs/ufs/inode.c
@@ -36,6 +36,7 @@
 #include <linux/mm.h>
 #include <linux/buffer_head.h>
 #include <linux/writeback.h>
+#include <linux/iversion.h>
 
 #include "ufs_fs.h"
 #include "ufs.h"
@@ -693,7 +694,7 @@ struct inode *ufs_iget(struct super_block *sb, unsigned long ino)
 	if (err)
 		goto bad_inode;
 
-	inode->i_version++;
+	inode_inc_iversion(inode);
 	ufsi->i_lastfrag =
 		(inode->i_size + uspi->s_fsize - 1) >> uspi->s_fshift;
 	ufsi->i_dir_start_lookup = 0;
diff --git a/fs/ufs/super.c b/fs/ufs/super.c
index 4d497e9..b6ba80e 100644
--- a/fs/ufs/super.c
+++ b/fs/ufs/super.c
@@ -88,6 +88,7 @@
 #include <linux/log2.h>
 #include <linux/mount.h>
 #include <linux/seq_file.h>
+#include <linux/iversion.h>
 
 #include "ufs_fs.h"
 #include "ufs.h"
@@ -1440,7 +1441,7 @@ static struct inode *ufs_alloc_inode(struct super_block *sb)
 	if (!ei)
 		return NULL;
 
-	ei->vfs_inode.i_version = 1;
+	inode_set_iversion(&ei->vfs_inode, 1);
 	seqlock_init(&ei->meta_lock);
 	mutex_init(&ei->truncate_mutex);
 	return &ei->vfs_inode;
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
index ac9a4e6..41a75f9 100644
--- a/fs/userfaultfd.c
+++ b/fs/userfaultfd.c
@@ -570,11 +570,14 @@ int handle_userfault(struct vm_fault *vmf, unsigned long reason)
 static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
 					      struct userfaultfd_wait_queue *ewq)
 {
+	struct userfaultfd_ctx *release_new_ctx;
+
 	if (WARN_ON_ONCE(current->flags & PF_EXITING))
 		goto out;
 
 	ewq->ctx = ctx;
 	init_waitqueue_entry(&ewq->wq, current);
+	release_new_ctx = NULL;
 
 	spin_lock(&ctx->event_wqh.lock);
 	/*
@@ -601,8 +604,7 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
 				new = (struct userfaultfd_ctx *)
 					(unsigned long)
 					ewq->msg.arg.reserved.reserved1;
-
-				userfaultfd_ctx_put(new);
+				release_new_ctx = new;
 			}
 			break;
 		}
@@ -617,6 +619,20 @@ static void userfaultfd_event_wait_completion(struct userfaultfd_ctx *ctx,
 	__set_current_state(TASK_RUNNING);
 	spin_unlock(&ctx->event_wqh.lock);
 
+	if (release_new_ctx) {
+		struct vm_area_struct *vma;
+		struct mm_struct *mm = release_new_ctx->mm;
+
+		/* the various vma->vm_userfaultfd_ctx still points to it */
+		down_write(&mm->mmap_sem);
+		for (vma = mm->mmap; vma; vma = vma->vm_next)
+			if (vma->vm_userfaultfd_ctx.ctx == release_new_ctx)
+				vma->vm_userfaultfd_ctx = NULL_VM_UFFD_CTX;
+		up_write(&mm->mmap_sem);
+
+		userfaultfd_ctx_put(release_new_ctx);
+	}
+
 	/*
 	 * ctx may go away after this if the userfault pseudo fd is
 	 * already released.
diff --git a/fs/xfs/libxfs/xfs_inode_buf.c b/fs/xfs/libxfs/xfs_inode_buf.c
index 6b79890..b9c0bf8 100644
--- a/fs/xfs/libxfs/xfs_inode_buf.c
+++ b/fs/xfs/libxfs/xfs_inode_buf.c
@@ -32,6 +32,8 @@
 #include "xfs_ialloc.h"
 #include "xfs_dir2.h"
 
+#include <linux/iversion.h>
+
 /*
  * Check that none of the inode's in the buffer have a next
  * unlinked field of 0.
@@ -264,7 +266,8 @@ xfs_inode_from_disk(
 	to->di_flags	= be16_to_cpu(from->di_flags);
 
 	if (to->di_version == 3) {
-		inode->i_version = be64_to_cpu(from->di_changecount);
+		inode_set_iversion_queried(inode,
+					   be64_to_cpu(from->di_changecount));
 		to->di_crtime.t_sec = be32_to_cpu(from->di_crtime.t_sec);
 		to->di_crtime.t_nsec = be32_to_cpu(from->di_crtime.t_nsec);
 		to->di_flags2 = be64_to_cpu(from->di_flags2);
@@ -314,7 +317,7 @@ xfs_inode_to_disk(
 	to->di_flags = cpu_to_be16(from->di_flags);
 
 	if (from->di_version == 3) {
-		to->di_changecount = cpu_to_be64(inode->i_version);
+		to->di_changecount = cpu_to_be64(inode_peek_iversion(inode));
 		to->di_crtime.t_sec = cpu_to_be32(from->di_crtime.t_sec);
 		to->di_crtime.t_nsec = cpu_to_be32(from->di_crtime.t_nsec);
 		to->di_flags2 = cpu_to_be64(from->di_flags2);
diff --git a/fs/xfs/xfs_aops.c b/fs/xfs/xfs_aops.c
index 21e2d70..4fc526a 100644
--- a/fs/xfs/xfs_aops.c
+++ b/fs/xfs/xfs_aops.c
@@ -399,7 +399,7 @@ xfs_map_blocks(
 	       (ip->i_df.if_flags & XFS_IFEXTENTS));
 	ASSERT(offset <= mp->m_super->s_maxbytes);
 
-	if ((xfs_ufsize_t)offset + count > mp->m_super->s_maxbytes)
+	if (offset > mp->m_super->s_maxbytes - count)
 		count = mp->m_super->s_maxbytes - offset;
 	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)offset + count);
 	offset_fsb = XFS_B_TO_FSBT(mp, offset);
@@ -1312,7 +1312,7 @@ xfs_get_blocks(
 	lockmode = xfs_ilock_data_map_shared(ip);
 
 	ASSERT(offset <= mp->m_super->s_maxbytes);
-	if ((xfs_ufsize_t)offset + size > mp->m_super->s_maxbytes)
+	if (offset > mp->m_super->s_maxbytes - size)
 		size = mp->m_super->s_maxbytes - offset;
 	end_fsb = XFS_B_TO_FSB(mp, (xfs_ufsize_t)offset + size);
 	offset_fsb = XFS_B_TO_FSBT(mp, offset);
diff --git a/fs/xfs/xfs_icache.c b/fs/xfs/xfs_icache.c
index 3861d61..3bcb8fd 100644
--- a/fs/xfs/xfs_icache.c
+++ b/fs/xfs/xfs_icache.c
@@ -37,6 +37,7 @@
 
 #include <linux/kthread.h>
 #include <linux/freezer.h>
+#include <linux/iversion.h>
 
 /*
  * Allocate and initialise an xfs_inode.
@@ -293,14 +294,14 @@ xfs_reinit_inode(
 	int		error;
 	uint32_t	nlink = inode->i_nlink;
 	uint32_t	generation = inode->i_generation;
-	uint64_t	version = inode->i_version;
+	uint64_t	version = inode_peek_iversion(inode);
 	umode_t		mode = inode->i_mode;
 
 	error = inode_init_always(mp->m_super, inode);
 
 	set_nlink(inode, nlink);
 	inode->i_generation = generation;
-	inode->i_version = version;
+	inode_set_iversion_queried(inode, version);
 	inode->i_mode = mode;
 	return error;
 }
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index 6f95bdb..9f424e0 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -16,6 +16,7 @@
  * Inc.,  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA
  */
 #include <linux/log2.h>
+#include <linux/iversion.h>
 
 #include "xfs.h"
 #include "xfs_fs.h"
@@ -832,7 +833,7 @@ xfs_ialloc(
 	ip->i_d.di_flags = 0;
 
 	if (ip->i_d.di_version == 3) {
-		inode->i_version = 1;
+		inode_set_iversion(inode, 1);
 		ip->i_d.di_flags2 = 0;
 		ip->i_d.di_cowextsize = 0;
 		ip->i_d.di_crtime.t_sec = (int32_t)tv.tv_sec;
diff --git a/fs/xfs/xfs_inode_item.c b/fs/xfs/xfs_inode_item.c
index 6ee5c3b..7571abf 100644
--- a/fs/xfs/xfs_inode_item.c
+++ b/fs/xfs/xfs_inode_item.c
@@ -30,6 +30,7 @@
 #include "xfs_buf_item.h"
 #include "xfs_log.h"
 
+#include <linux/iversion.h>
 
 kmem_zone_t	*xfs_ili_zone;		/* inode log item zone */
 
@@ -354,7 +355,7 @@ xfs_inode_to_log_dinode(
 	to->di_next_unlinked = NULLAGINO;
 
 	if (from->di_version == 3) {
-		to->di_changecount = inode->i_version;
+		to->di_changecount = inode_peek_iversion(inode);
 		to->di_crtime.t_sec = from->di_crtime.t_sec;
 		to->di_crtime.t_nsec = from->di_crtime.t_nsec;
 		to->di_flags2 = from->di_flags2;
diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index 7ab52a8..66e1edb 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -1006,7 +1006,7 @@ xfs_file_iomap_begin(
 	}
 
 	ASSERT(offset <= mp->m_super->s_maxbytes);
-	if ((xfs_fsize_t)offset + length > mp->m_super->s_maxbytes)
+	if (offset > mp->m_super->s_maxbytes - length)
 		length = mp->m_super->s_maxbytes - offset;
 	offset_fsb = XFS_B_TO_FSBT(mp, offset);
 	end_fsb = XFS_B_TO_FSB(mp, offset + length);
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c
index ec952df..b897b11 100644
--- a/fs/xfs/xfs_qm.c
+++ b/fs/xfs/xfs_qm.c
@@ -48,7 +48,7 @@
 STATIC int	xfs_qm_init_quotainos(xfs_mount_t *);
 STATIC int	xfs_qm_init_quotainfo(xfs_mount_t *);
 
-
+STATIC void	xfs_qm_destroy_quotainos(xfs_quotainfo_t *qi);
 STATIC void	xfs_qm_dqfree_one(struct xfs_dquot *dqp);
 /*
  * We use the batch lookup interface to iterate over the dquots as it
@@ -695,9 +695,17 @@ xfs_qm_init_quotainfo(
 	qinf->qi_shrinker.scan_objects = xfs_qm_shrink_scan;
 	qinf->qi_shrinker.seeks = DEFAULT_SEEKS;
 	qinf->qi_shrinker.flags = SHRINKER_NUMA_AWARE;
-	register_shrinker(&qinf->qi_shrinker);
+
+	error = register_shrinker(&qinf->qi_shrinker);
+	if (error)
+		goto out_free_inos;
+
 	return 0;
 
+out_free_inos:
+	mutex_destroy(&qinf->qi_quotaofflock);
+	mutex_destroy(&qinf->qi_tree_lock);
+	xfs_qm_destroy_quotainos(qinf);
 out_free_lru:
 	list_lru_destroy(&qinf->qi_lru);
 out_free_qinf:
@@ -706,7 +714,6 @@ xfs_qm_init_quotainfo(
 	return error;
 }
 
-
 /*
  * Gets called when unmounting a filesystem or when all quotas get
  * turned off.
@@ -723,19 +730,8 @@ xfs_qm_destroy_quotainfo(
 
 	unregister_shrinker(&qi->qi_shrinker);
 	list_lru_destroy(&qi->qi_lru);
-
-	if (qi->qi_uquotaip) {
-		IRELE(qi->qi_uquotaip);
-		qi->qi_uquotaip = NULL; /* paranoia */
-	}
-	if (qi->qi_gquotaip) {
-		IRELE(qi->qi_gquotaip);
-		qi->qi_gquotaip = NULL;
-	}
-	if (qi->qi_pquotaip) {
-		IRELE(qi->qi_pquotaip);
-		qi->qi_pquotaip = NULL;
-	}
+	xfs_qm_destroy_quotainos(qi);
+	mutex_destroy(&qi->qi_tree_lock);
 	mutex_destroy(&qi->qi_quotaofflock);
 	kmem_free(qi);
 	mp->m_quotainfo = NULL;
@@ -1600,6 +1596,24 @@ xfs_qm_init_quotainos(
 }
 
 STATIC void
+xfs_qm_destroy_quotainos(
+	xfs_quotainfo_t	*qi)
+{
+	if (qi->qi_uquotaip) {
+		IRELE(qi->qi_uquotaip);
+		qi->qi_uquotaip = NULL; /* paranoia */
+	}
+	if (qi->qi_gquotaip) {
+		IRELE(qi->qi_gquotaip);
+		qi->qi_gquotaip = NULL;
+	}
+	if (qi->qi_pquotaip) {
+		IRELE(qi->qi_pquotaip);
+		qi->qi_pquotaip = NULL;
+	}
+}
+
+STATIC void
 xfs_qm_dqfree_one(
 	struct xfs_dquot	*dqp)
 {
diff --git a/fs/xfs/xfs_trans_inode.c b/fs/xfs/xfs_trans_inode.c
index daa761549..4a89da4 100644
--- a/fs/xfs/xfs_trans_inode.c
+++ b/fs/xfs/xfs_trans_inode.c
@@ -28,6 +28,8 @@
 #include "xfs_inode_item.h"
 #include "xfs_trace.h"
 
+#include <linux/iversion.h>
+
 /*
  * Add a locked inode to the transaction.
  *
@@ -110,15 +112,17 @@ xfs_trans_log_inode(
 
 	/*
 	 * First time we log the inode in a transaction, bump the inode change
-	 * counter if it is configured for this to occur. We don't use
-	 * inode_inc_version() because there is no need for extra locking around
-	 * i_version as we already hold the inode locked exclusively for
-	 * metadata modification.
+	 * counter if it is configured for this to occur. While we have the
+	 * inode locked exclusively for metadata modification, we can usually
+	 * avoid setting XFS_ILOG_CORE if no one has queried the value since
+	 * the last time it was incremented. If we have XFS_ILOG_CORE already
+	 * set however, then go ahead and bump the i_version counter
+	 * unconditionally.
 	 */
 	if (!(ip->i_itemp->ili_item.li_desc->lid_flags & XFS_LID_DIRTY) &&
 	    IS_I_VERSION(VFS_I(ip))) {
-		VFS_I(ip)->i_version++;
-		flags |= XFS_ILOG_CORE;
+		if (inode_maybe_inc_iversion(VFS_I(ip), flags & XFS_ILOG_CORE))
+			flags |= XFS_ILOG_CORE;
 	}
 
 	tp->t_flags |= XFS_TRANS_DIRTY;
diff --git a/include/acpi/acconfig.h b/include/acpi/acconfig.h
index 6db3b46..ffe364f 100644
--- a/include/acpi/acconfig.h
+++ b/include/acpi/acconfig.h
@@ -145,9 +145,9 @@
 
 #define ACPI_ADDRESS_RANGE_MAX          2
 
-/* Maximum number of While() loops before abort */
+/* Maximum time (default 30s) of While() loops before abort */
 
-#define ACPI_MAX_LOOP_COUNT             0x000FFFFF
+#define ACPI_MAX_LOOP_TIMEOUT           30
 
 /******************************************************************************
  *
diff --git a/include/acpi/acexcep.h b/include/acpi/acexcep.h
index 17d61b1..3c46f0e 100644
--- a/include/acpi/acexcep.h
+++ b/include/acpi/acexcep.h
@@ -130,8 +130,9 @@ struct acpi_exception_info {
 #define AE_HEX_OVERFLOW                 EXCEP_ENV (0x0020)
 #define AE_DECIMAL_OVERFLOW             EXCEP_ENV (0x0021)
 #define AE_OCTAL_OVERFLOW               EXCEP_ENV (0x0022)
+#define AE_END_OF_TABLE                 EXCEP_ENV (0x0023)
 
-#define AE_CODE_ENV_MAX                 0x0022
+#define AE_CODE_ENV_MAX                 0x0023
 
 /*
  * Programmer exceptions
@@ -195,7 +196,7 @@ struct acpi_exception_info {
 #define AE_AML_CIRCULAR_REFERENCE       EXCEP_AML (0x001E)
 #define AE_AML_BAD_RESOURCE_LENGTH      EXCEP_AML (0x001F)
 #define AE_AML_ILLEGAL_ADDRESS          EXCEP_AML (0x0020)
-#define AE_AML_INFINITE_LOOP            EXCEP_AML (0x0021)
+#define AE_AML_LOOP_TIMEOUT             EXCEP_AML (0x0021)
 #define AE_AML_UNINITIALIZED_NODE       EXCEP_AML (0x0022)
 #define AE_AML_TARGET_TYPE              EXCEP_AML (0x0023)
 
@@ -275,7 +276,8 @@ static const struct acpi_exception_info acpi_gbl_exception_names_env[] = {
 	EXCEP_TXT("AE_DECIMAL_OVERFLOW",
 		  "Overflow during ASCII decimal-to-binary conversion"),
 	EXCEP_TXT("AE_OCTAL_OVERFLOW",
-		  "Overflow during ASCII octal-to-binary conversion")
+		  "Overflow during ASCII octal-to-binary conversion"),
+	EXCEP_TXT("AE_END_OF_TABLE", "Reached the end of table")
 };
 
 static const struct acpi_exception_info acpi_gbl_exception_names_pgm[] = {
@@ -368,8 +370,8 @@ static const struct acpi_exception_info acpi_gbl_exception_names_aml[] = {
 		  "The length of a Resource Descriptor in the AML is incorrect"),
 	EXCEP_TXT("AE_AML_ILLEGAL_ADDRESS",
 		  "A memory, I/O, or PCI configuration address is invalid"),
-	EXCEP_TXT("AE_AML_INFINITE_LOOP",
-		  "An apparent infinite AML While loop, method was aborted"),
+	EXCEP_TXT("AE_AML_LOOP_TIMEOUT",
+		  "An AML While loop exceeded the maximum execution time"),
 	EXCEP_TXT("AE_AML_UNINITIALIZED_NODE",
 		  "A namespace node is uninitialized or unresolved"),
 	EXCEP_TXT("AE_AML_TARGET_TYPE",
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h
index 7928762..c9608b0 100644
--- a/include/acpi/acpi_bus.h
+++ b/include/acpi/acpi_bus.h
@@ -91,6 +91,9 @@ acpi_evaluate_dsm_typed(acpi_handle handle, const guid_t *guid, u64 rev,
 bool acpi_dev_found(const char *hid);
 bool acpi_dev_present(const char *hid, const char *uid, s64 hrv);
 
+const char *
+acpi_dev_get_first_match_name(const char *hid, const char *uid, s64 hrv);
+
 #ifdef CONFIG_ACPI
 
 #include <linux/proc_fs.h>
diff --git a/include/acpi/acpixf.h b/include/acpi/acpixf.h
index e1dd1a8..c589c3e 100644
--- a/include/acpi/acpixf.h
+++ b/include/acpi/acpixf.h
@@ -46,7 +46,7 @@
 
 /* Current ACPICA subsystem version in YYYYMMDD format */
 
-#define ACPI_CA_VERSION                 0x20170831
+#define ACPI_CA_VERSION                 0x20171215
 
 #include <acpi/acconfig.h>
 #include <acpi/actypes.h>
@@ -260,11 +260,11 @@ ACPI_INIT_GLOBAL(u8, acpi_gbl_osi_data, 0);
 ACPI_INIT_GLOBAL(u8, acpi_gbl_reduced_hardware, FALSE);
 
 /*
- * Maximum number of While() loop iterations before forced method abort.
+ * Maximum timeout for While() loop iterations before forced method abort.
  * This mechanism is intended to prevent infinite loops during interpreter
  * execution within a host kernel.
  */
-ACPI_INIT_GLOBAL(u32, acpi_gbl_max_loop_iterations, ACPI_MAX_LOOP_COUNT);
+ACPI_INIT_GLOBAL(u32, acpi_gbl_max_loop_iterations, ACPI_MAX_LOOP_TIMEOUT);
 
 /*
  * This mechanism is used to trace a specified AML method. The method is
diff --git a/include/acpi/actbl1.h b/include/acpi/actbl1.h
index 7a89e6d..4c304bf 100644
--- a/include/acpi/actbl1.h
+++ b/include/acpi/actbl1.h
@@ -69,9 +69,10 @@
 #define ACPI_SIG_HEST           "HEST"	/* Hardware Error Source Table */
 #define ACPI_SIG_MADT           "APIC"	/* Multiple APIC Description Table */
 #define ACPI_SIG_MSCT           "MSCT"	/* Maximum System Characteristics Table */
-#define ACPI_SIG_PDTT           "PDTT"	/* Processor Debug Trigger Table */
+#define ACPI_SIG_PDTT           "PDTT"	/* Platform Debug Trigger Table */
 #define ACPI_SIG_PPTT           "PPTT"	/* Processor Properties Topology Table */
 #define ACPI_SIG_SBST           "SBST"	/* Smart Battery Specification Table */
+#define ACPI_SIG_SDEV           "SDEV"	/* Secure Devices table */
 #define ACPI_SIG_SLIT           "SLIT"	/* System Locality Distance Information Table */
 #define ACPI_SIG_SRAT           "SRAT"	/* System Resource Affinity Table */
 #define ACPI_SIG_NFIT           "NFIT"	/* NVDIMM Firmware Interface Table */
@@ -1149,7 +1150,8 @@ enum acpi_nfit_type {
 	ACPI_NFIT_TYPE_CONTROL_REGION = 4,
 	ACPI_NFIT_TYPE_DATA_REGION = 5,
 	ACPI_NFIT_TYPE_FLUSH_ADDRESS = 6,
-	ACPI_NFIT_TYPE_RESERVED = 7	/* 7 and greater are reserved */
+	ACPI_NFIT_TYPE_CAPABILITIES = 7,
+	ACPI_NFIT_TYPE_RESERVED = 8	/* 8 and greater are reserved */
 };
 
 /*
@@ -1162,7 +1164,7 @@ struct acpi_nfit_system_address {
 	struct acpi_nfit_header header;
 	u16 range_index;
 	u16 flags;
-	u32 reserved;		/* Reseved, must be zero */
+	u32 reserved;		/* Reserved, must be zero */
 	u32 proximity_domain;
 	u8 range_guid[16];
 	u64 address;
@@ -1281,9 +1283,72 @@ struct acpi_nfit_flush_address {
 	u64 hint_address[1];	/* Variable length */
 };
 
+/* 7: Platform Capabilities Structure */
+
+struct acpi_nfit_capabilities {
+	struct acpi_nfit_header header;
+	u8 highest_capability;
+	u8 reserved[3];		/* Reserved, must be zero */
+	u32 capabilities;
+	u32 reserved2;
+};
+
+/* Capabilities Flags */
+
+#define ACPI_NFIT_CAPABILITY_CACHE_FLUSH       (1)	/* 00: Cache Flush to NVDIMM capable */
+#define ACPI_NFIT_CAPABILITY_MEM_FLUSH         (1<<1)	/* 01: Memory Flush to NVDIMM capable */
+#define ACPI_NFIT_CAPABILITY_MEM_MIRRORING     (1<<2)	/* 02: Memory Mirroring capable */
+
+/*
+ * NFIT/DVDIMM device handle support - used as the _ADR for each NVDIMM
+ */
+struct nfit_device_handle {
+	u32 handle;
+};
+
+/* Device handle construction and extraction macros */
+
+#define ACPI_NFIT_DIMM_NUMBER_MASK              0x0000000F
+#define ACPI_NFIT_CHANNEL_NUMBER_MASK           0x000000F0
+#define ACPI_NFIT_MEMORY_ID_MASK                0x00000F00
+#define ACPI_NFIT_SOCKET_ID_MASK                0x0000F000
+#define ACPI_NFIT_NODE_ID_MASK                  0x0FFF0000
+
+#define ACPI_NFIT_DIMM_NUMBER_OFFSET            0
+#define ACPI_NFIT_CHANNEL_NUMBER_OFFSET         4
+#define ACPI_NFIT_MEMORY_ID_OFFSET              8
+#define ACPI_NFIT_SOCKET_ID_OFFSET              12
+#define ACPI_NFIT_NODE_ID_OFFSET                16
+
+/* Macro to construct a NFIT/NVDIMM device handle */
+
+#define ACPI_NFIT_BUILD_DEVICE_HANDLE(dimm, channel, memory, socket, node) \
+	((dimm)                                         | \
+	((channel) << ACPI_NFIT_CHANNEL_NUMBER_OFFSET)  | \
+	((memory)  << ACPI_NFIT_MEMORY_ID_OFFSET)       | \
+	((socket)  << ACPI_NFIT_SOCKET_ID_OFFSET)       | \
+	((node)    << ACPI_NFIT_NODE_ID_OFFSET))
+
+/* Macros to extract individual fields from a NFIT/NVDIMM device handle */
+
+#define ACPI_NFIT_GET_DIMM_NUMBER(handle) \
+	((handle) & ACPI_NFIT_DIMM_NUMBER_MASK)
+
+#define ACPI_NFIT_GET_CHANNEL_NUMBER(handle) \
+	(((handle) & ACPI_NFIT_CHANNEL_NUMBER_MASK) >> ACPI_NFIT_CHANNEL_NUMBER_OFFSET)
+
+#define ACPI_NFIT_GET_MEMORY_ID(handle) \
+	(((handle) & ACPI_NFIT_MEMORY_ID_MASK)      >> ACPI_NFIT_MEMORY_ID_OFFSET)
+
+#define ACPI_NFIT_GET_SOCKET_ID(handle) \
+	(((handle) & ACPI_NFIT_SOCKET_ID_MASK)      >> ACPI_NFIT_SOCKET_ID_OFFSET)
+
+#define ACPI_NFIT_GET_NODE_ID(handle) \
+	(((handle) & ACPI_NFIT_NODE_ID_MASK)        >> ACPI_NFIT_NODE_ID_OFFSET)
+
 /*******************************************************************************
  *
- * PDTT - Processor Debug Trigger Table (ACPI 6.2)
+ * PDTT - Platform Debug Trigger Table (ACPI 6.2)
  *        Version 0
  *
  ******************************************************************************/
@@ -1301,14 +1366,14 @@ struct acpi_table_pdtt {
  * starting at array_offset.
  */
 struct acpi_pdtt_channel {
-	u16 sub_channel_id;
+	u8 subchannel_id;
+	u8 flags;
 };
 
-/* Mask and Flags for above */
+/* Flags for above */
 
-#define ACPI_PDTT_SUBCHANNEL_ID_MASK        0x00FF
-#define ACPI_PDTT_RUNTIME_TRIGGER           (1<<8)
-#define ACPI_PPTT_WAIT_COMPLETION           (1<<9)
+#define ACPI_PDTT_RUNTIME_TRIGGER           (1)
+#define ACPI_PDTT_WAIT_COMPLETION           (1<<1)
 
 /*******************************************************************************
  *
@@ -1376,6 +1441,20 @@ struct acpi_pptt_cache {
 #define ACPI_PPTT_MASK_CACHE_TYPE           (0x0C)	/* Cache type */
 #define ACPI_PPTT_MASK_WRITE_POLICY         (0x10)	/* Write policy */
 
+/* Attributes describing cache */
+#define ACPI_PPTT_CACHE_READ_ALLOCATE       (0x0)	/* Cache line is allocated on read */
+#define ACPI_PPTT_CACHE_WRITE_ALLOCATE      (0x01)	/* Cache line is allocated on write */
+#define ACPI_PPTT_CACHE_RW_ALLOCATE         (0x02)	/* Cache line is allocated on read and write */
+#define ACPI_PPTT_CACHE_RW_ALLOCATE_ALT     (0x03)	/* Alternate representation of above */
+
+#define ACPI_PPTT_CACHE_TYPE_DATA           (0x0)	/* Data cache */
+#define ACPI_PPTT_CACHE_TYPE_INSTR          (1<<2)	/* Instruction cache */
+#define ACPI_PPTT_CACHE_TYPE_UNIFIED        (2<<2)	/* Unified I & D cache */
+#define ACPI_PPTT_CACHE_TYPE_UNIFIED_ALT    (3<<2)	/* Alternate representation of above */
+
+#define ACPI_PPTT_CACHE_POLICY_WB           (0x0)	/* Cache is write back */
+#define ACPI_PPTT_CACHE_POLICY_WT           (1<<4)	/* Cache is write through */
+
 /* 2: ID Structure */
 
 struct acpi_pptt_id {
@@ -1405,6 +1484,68 @@ struct acpi_table_sbst {
 
 /*******************************************************************************
  *
+ * SDEV - Secure Devices Table (ACPI 6.2)
+ *        Version 1
+ *
+ ******************************************************************************/
+
+struct acpi_table_sdev {
+	struct acpi_table_header header;	/* Common ACPI table header */
+};
+
+struct acpi_sdev_header {
+	u8 type;
+	u8 flags;
+	u16 length;
+};
+
+/* Values for subtable type above */
+
+enum acpi_sdev_type {
+	ACPI_SDEV_TYPE_NAMESPACE_DEVICE = 0,
+	ACPI_SDEV_TYPE_PCIE_ENDPOINT_DEVICE = 1,
+	ACPI_SDEV_TYPE_RESERVED = 2	/* 2 and greater are reserved */
+};
+
+/* Values for flags above */
+
+#define ACPI_SDEV_HANDOFF_TO_UNSECURE_OS    (1)
+
+/*
+ * SDEV subtables
+ */
+
+/* 0: Namespace Device Based Secure Device Structure */
+
+struct acpi_sdev_namespace {
+	struct acpi_sdev_header header;
+	u16 device_id_offset;
+	u16 device_id_length;
+	u16 vendor_data_offset;
+	u16 vendor_data_length;
+};
+
+/* 1: PCIe Endpoint Device Based Device Structure */
+
+struct acpi_sdev_pcie {
+	struct acpi_sdev_header header;
+	u16 segment;
+	u16 start_bus;
+	u16 path_offset;
+	u16 path_length;
+	u16 vendor_data_offset;
+	u16 vendor_data_length;
+};
+
+/* 1a: PCIe Endpoint path entry */
+
+struct acpi_sdev_pcie_path {
+	u8 device;
+	u8 function;
+};
+
+/*******************************************************************************
+ *
  * SLIT - System Locality Distance Information Table
  *        Version 1
  *
diff --git a/include/acpi/actbl2.h b/include/acpi/actbl2.h
index 686b6f8..0d60d5d 100644
--- a/include/acpi/actbl2.h
+++ b/include/acpi/actbl2.h
@@ -810,6 +810,7 @@ struct acpi_iort_smmu_v3 {
 	u8 pxm;
 	u8 reserved1;
 	u16 reserved2;
+	u32 id_mapping_index;
 };
 
 /* Values for Model field above */
@@ -1246,6 +1247,8 @@ enum acpi_spmi_interface_types {
  * TCPA - Trusted Computing Platform Alliance table
  *        Version 2
  *
+ * TCG Hardware Interface Table for TPM 1.2 Clients and Servers
+ *
  * Conforms to "TCG ACPI Specification, Family 1.2 and 2.0",
  * Version 1.2, Revision 8
  * February 27, 2017
@@ -1310,6 +1313,8 @@ struct acpi_table_tcpa_server {
  * TPM2 - Trusted Platform Module (TPM) 2.0 Hardware Interface Table
  *        Version 4
  *
+ * TCG Hardware Interface Table for TPM 2.0 Clients and Servers
+ *
  * Conforms to "TCG ACPI Specification, Family 1.2 and 2.0",
  * Version 1.2, Revision 8
  * February 27, 2017
@@ -1329,15 +1334,23 @@ struct acpi_table_tpm2 {
 /* Values for start_method above */
 
 #define ACPI_TPM2_NOT_ALLOWED                       0
+#define ACPI_TPM2_RESERVED1                         1
 #define ACPI_TPM2_START_METHOD                      2
+#define ACPI_TPM2_RESERVED3                         3
+#define ACPI_TPM2_RESERVED4                         4
+#define ACPI_TPM2_RESERVED5                         5
 #define ACPI_TPM2_MEMORY_MAPPED                     6
 #define ACPI_TPM2_COMMAND_BUFFER                    7
 #define ACPI_TPM2_COMMAND_BUFFER_WITH_START_METHOD  8
+#define ACPI_TPM2_RESERVED9                         9
+#define ACPI_TPM2_RESERVED10                        10
 #define ACPI_TPM2_COMMAND_BUFFER_WITH_ARM_SMC       11	/* V1.2 Rev 8 */
+#define ACPI_TPM2_RESERVED                          12
 
-/* Trailer appears after any start_method subtables */
+/* Optional trailer appears after any start_method subtables */
 
 struct acpi_tpm2_trailer {
+	u8 method_parameters[12];
 	u32 minimum_log_length;	/* Minimum length for the event log area */
 	u64 log_address;	/* Address of the event log area */
 };
diff --git a/include/acpi/actypes.h b/include/acpi/actypes.h
index 4f077ed..31f1be7 100644
--- a/include/acpi/actypes.h
+++ b/include/acpi/actypes.h
@@ -468,6 +468,8 @@ typedef void *acpi_handle;	/* Actually a ptr to a NS Node */
 #define ACPI_NSEC_PER_MSEC              1000000L
 #define ACPI_NSEC_PER_SEC               1000000000L
 
+#define ACPI_TIME_AFTER(a, b)           ((s64)((b) - (a)) < 0)
+
 /* Owner IDs are used to track namespace nodes for selective deletion */
 
 typedef u8 acpi_owner_id;
@@ -1299,6 +1301,8 @@ typedef enum {
 #define ACPI_OSI_WIN_7                  0x0B
 #define ACPI_OSI_WIN_8                  0x0C
 #define ACPI_OSI_WIN_10                 0x0D
+#define ACPI_OSI_WIN_10_RS1             0x0E
+#define ACPI_OSI_WIN_10_RS2             0x0F
 
 /* Definitions of getopt */
 
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index ee8b707..a564b83b 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -268,7 +268,11 @@
 #define INIT_TASK_DATA(align)						\
 	. = ALIGN(align);						\
 	VMLINUX_SYMBOL(__start_init_task) = .;				\
+	VMLINUX_SYMBOL(init_thread_union) = .;				\
+	VMLINUX_SYMBOL(init_stack) = .;					\
 	*(.data..init_task)						\
+	*(.data..init_thread_info)					\
+	. = VMLINUX_SYMBOL(__start_init_task) + THREAD_SIZE;		\
 	VMLINUX_SYMBOL(__end_init_task) = .;
 
 /*
diff --git a/include/crypto/if_alg.h b/include/crypto/if_alg.h
index 38d9c58..f38227a7 100644
--- a/include/crypto/if_alg.h
+++ b/include/crypto/if_alg.h
@@ -18,6 +18,7 @@
 #include <linux/if_alg.h>
 #include <linux/scatterlist.h>
 #include <linux/types.h>
+#include <linux/atomic.h>
 #include <net/sock.h>
 
 #include <crypto/aead.h>
@@ -150,7 +151,7 @@ struct af_alg_ctx {
 	struct crypto_wait wait;
 
 	size_t used;
-	size_t rcvused;
+	atomic_t rcvused;
 
 	bool more;
 	bool merge;
@@ -215,7 +216,7 @@ static inline int af_alg_rcvbuf(struct sock *sk)
 	struct af_alg_ctx *ctx = ask->private;
 
 	return max_t(int, max_t(int, sk->sk_rcvbuf & PAGE_MASK, PAGE_SIZE) -
-			  ctx->rcvused, 0);
+		     atomic_read(&ctx->rcvused), 0);
 }
 
 /**
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index dc1ebfe..b8f4c3c 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -451,6 +451,7 @@ void __init acpi_no_s4_hw_signature(void);
 void __init acpi_old_suspend_ordering(void);
 void __init acpi_nvs_nosave(void);
 void __init acpi_nvs_nosave_s3(void);
+void __init acpi_sleep_no_blacklist(void);
 #endif /* CONFIG_PM_SLEEP */
 
 struct acpi_osc_context {
@@ -640,6 +641,12 @@ static inline bool acpi_dev_present(const char *hid, const char *uid, s64 hrv)
 	return false;
 }
 
+static inline const char *
+acpi_dev_get_first_match_name(const char *hid, const char *uid, s64 hrv)
+{
+	return NULL;
+}
+
 static inline bool is_acpi_node(struct fwnode_handle *fwnode)
 {
 	return false;
diff --git a/include/linux/arch_topology.h b/include/linux/arch_topology.h
index 3045112..2b70941 100644
--- a/include/linux/arch_topology.h
+++ b/include/linux/arch_topology.h
@@ -27,7 +27,7 @@ void topology_set_cpu_scale(unsigned int cpu, unsigned long capacity);
 DECLARE_PER_CPU(unsigned long, freq_scale);
 
 static inline
-unsigned long topology_get_freq_scale(struct sched_domain *sd, int cpu)
+unsigned long topology_get_freq_scale(int cpu)
 {
 	return per_cpu(freq_scale, cpu);
 }
diff --git a/include/linux/arm_sdei.h b/include/linux/arm_sdei.h
new file mode 100644
index 0000000..942afbd
--- /dev/null
+++ b/include/linux/arm_sdei.h
@@ -0,0 +1,79 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (C) 2017 Arm Ltd.
+#ifndef __LINUX_ARM_SDEI_H
+#define __LINUX_ARM_SDEI_H
+
+#include <uapi/linux/arm_sdei.h>
+
+enum sdei_conduit_types {
+	CONDUIT_INVALID = 0,
+	CONDUIT_SMC,
+	CONDUIT_HVC,
+};
+
+#include <asm/sdei.h>
+
+/* Arch code should override this to set the entry point from firmware... */
+#ifndef sdei_arch_get_entry_point
+#define sdei_arch_get_entry_point(conduit)	(0)
+#endif
+
+/*
+ * When an event occurs sdei_event_handler() will call a user-provided callback
+ * like this in NMI context on the CPU that received the event.
+ */
+typedef int (sdei_event_callback)(u32 event, struct pt_regs *regs, void *arg);
+
+/*
+ * Register your callback to claim an event. The event must be described
+ * by firmware.
+ */
+int sdei_event_register(u32 event_num, sdei_event_callback *cb, void *arg);
+
+/*
+ * Calls to sdei_event_unregister() may return EINPROGRESS. Keep calling
+ * it until it succeeds.
+ */
+int sdei_event_unregister(u32 event_num);
+
+int sdei_event_enable(u32 event_num);
+int sdei_event_disable(u32 event_num);
+
+#ifdef CONFIG_ARM_SDE_INTERFACE
+/* For use by arch code when CPU hotplug notifiers are not appropriate. */
+int sdei_mask_local_cpu(void);
+int sdei_unmask_local_cpu(void);
+#else
+static inline int sdei_mask_local_cpu(void) { return 0; }
+static inline int sdei_unmask_local_cpu(void) { return 0; }
+#endif /* CONFIG_ARM_SDE_INTERFACE */
+
+
+/*
+ * This struct represents an event that has been registered. The driver
+ * maintains a list of all events, and which ones are registered. (Private
+ * events have one entry in the list, but are registered on each CPU).
+ * A pointer to this struct is passed to firmware, and back to the event
+ * handler. The event handler can then use this to invoke the registered
+ * callback, without having to walk the list.
+ *
+ * For CPU private events, this structure is per-cpu.
+ */
+struct sdei_registered_event {
+	/* For use by arch code: */
+	struct pt_regs          interrupted_regs;
+
+	sdei_event_callback	*callback;
+	void			*callback_arg;
+	u32			 event_num;
+	u8			 priority;
+};
+
+/* The arch code entry point should then call this when an event arrives. */
+int notrace sdei_event_handler(struct pt_regs *regs,
+			       struct sdei_registered_event *arg);
+
+/* arch code may use this to retrieve the extra registers. */
+int sdei_api_event_context(u32 query, u64 *result);
+
+#endif /* __LINUX_ARM_SDEI_H */
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index e54e7e0..3e4ce54 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -332,7 +332,7 @@ static inline bool inode_to_wb_is_valid(struct inode *inode)
  * holding either @inode->i_lock, @inode->i_mapping->tree_lock, or the
  * associated wb's list_lock.
  */
-static inline struct bdi_writeback *inode_to_wb(struct inode *inode)
+static inline struct bdi_writeback *inode_to_wb(const struct inode *inode)
 {
 #ifdef CONFIG_LOCKDEP
 	WARN_ON_ONCE(debug_locks &&
diff --git a/include/linux/bio.h b/include/linux/bio.h
index 23d29b3..d0eb659 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -300,6 +300,29 @@ static inline void bio_get_last_bvec(struct bio *bio, struct bio_vec *bv)
 		bv->bv_len = iter.bi_bvec_done;
 }
 
+static inline unsigned bio_pages_all(struct bio *bio)
+{
+	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
+	return bio->bi_vcnt;
+}
+
+static inline struct bio_vec *bio_first_bvec_all(struct bio *bio)
+{
+	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
+	return bio->bi_io_vec;
+}
+
+static inline struct page *bio_first_page_all(struct bio *bio)
+{
+	return bio_first_bvec_all(bio)->bv_page;
+}
+
+static inline struct bio_vec *bio_last_bvec_all(struct bio *bio)
+{
+	WARN_ON_ONCE(bio_flagged(bio, BIO_CLONED));
+	return &bio->bi_io_vec[bio->bi_vcnt - 1];
+}
+
 enum bip_flags {
 	BIP_BLOCK_INTEGRITY	= 1 << 0, /* block layer owns integrity data */
 	BIP_MAPPED_INTEGRITY	= 1 << 1, /* ref tag has been remapped */
@@ -477,7 +500,6 @@ static inline void bio_flush_dcache_pages(struct bio *bi)
 #endif
 
 extern void bio_copy_data(struct bio *dst, struct bio *src);
-extern int bio_alloc_pages(struct bio *bio, gfp_t gfp);
 extern void bio_free_pages(struct bio *bio);
 
 extern struct bio *bio_copy_user_iov(struct request_queue *,
diff --git a/include/linux/blk-cgroup.h b/include/linux/blk-cgroup.h
index e9825ff..69bea82 100644
--- a/include/linux/blk-cgroup.h
+++ b/include/linux/blk-cgroup.h
@@ -660,12 +660,14 @@ static inline void blkg_rwstat_reset(struct blkg_rwstat *rwstat)
 static inline void blkg_rwstat_add_aux(struct blkg_rwstat *to,
 				       struct blkg_rwstat *from)
 {
-	struct blkg_rwstat v = blkg_rwstat_read(from);
+	u64 sum[BLKG_RWSTAT_NR];
 	int i;
 
 	for (i = 0; i < BLKG_RWSTAT_NR; i++)
-		atomic64_add(atomic64_read(&v.aux_cnt[i]) +
-			     atomic64_read(&from->aux_cnt[i]),
+		sum[i] = percpu_counter_sum_positive(&from->cpu_cnt[i]);
+
+	for (i = 0; i < BLKG_RWSTAT_NR; i++)
+		atomic64_add(sum[i] + atomic64_read(&from->aux_cnt[i]),
 			     &to->aux_cnt[i]);
 }
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 95c9a5c..8efcf49 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -51,6 +51,7 @@ struct blk_mq_hw_ctx {
 	unsigned int		queue_num;
 
 	atomic_t		nr_active;
+	unsigned int		nr_expired;
 
 	struct hlist_node	cpuhp_dead;
 	struct kobject		kobj;
@@ -65,7 +66,7 @@ struct blk_mq_hw_ctx {
 #endif
 
 	/* Must be the last member - see also blk_mq_hw_ctx_size(). */
-	struct srcu_struct	queue_rq_srcu[0];
+	struct srcu_struct	srcu[0];
 };
 
 struct blk_mq_tag_set {
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 9e7d8bd..c5d3db0 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -39,6 +39,34 @@ typedef u8 __bitwise blk_status_t;
 
 #define BLK_STS_AGAIN		((__force blk_status_t)12)
 
+/**
+ * blk_path_error - returns true if error may be path related
+ * @error: status the request was completed with
+ *
+ * Description:
+ *     This classifies block error status into non-retryable errors and ones
+ *     that may be successful if retried on a failover path.
+ *
+ * Return:
+ *     %false - retrying failover path will not help
+ *     %true  - may succeed if retried
+ */
+static inline bool blk_path_error(blk_status_t error)
+{
+	switch (error) {
+	case BLK_STS_NOTSUPP:
+	case BLK_STS_NOSPC:
+	case BLK_STS_TARGET:
+	case BLK_STS_NEXUS:
+	case BLK_STS_MEDIUM:
+	case BLK_STS_PROTECTION:
+		return false;
+	}
+
+	/* Anything else could be a path failure, so should be retried */
+	return true;
+}
+
 struct blk_issue_stat {
 	u64 stat;
 };
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0ce8a37..4f3df80 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -27,6 +27,8 @@
 #include <linux/percpu-refcount.h>
 #include <linux/scatterlist.h>
 #include <linux/blkzoned.h>
+#include <linux/seqlock.h>
+#include <linux/u64_stats_sync.h>
 
 struct module;
 struct scsi_ioctl_command;
@@ -121,6 +123,12 @@ typedef __u32 __bitwise req_flags_t;
 /* Look at ->special_vec for the actual data payload instead of the
    bio chain. */
 #define RQF_SPECIAL_PAYLOAD	((__force req_flags_t)(1 << 18))
+/* The per-zone write lock is held for this request */
+#define RQF_ZONE_WRITE_LOCKED	((__force req_flags_t)(1 << 19))
+/* timeout is expired */
+#define RQF_MQ_TIMEOUT_EXPIRED	((__force req_flags_t)(1 << 20))
+/* already slept for hybrid poll */
+#define RQF_MQ_POLL_SLEPT	((__force req_flags_t)(1 << 21))
 
 /* flags that prevent us from merging requests: */
 #define RQF_NOMERGE_FLAGS \
@@ -133,12 +141,6 @@ typedef __u32 __bitwise req_flags_t;
  * especially blk_mq_rq_ctx_init() to take care of the added fields.
  */
 struct request {
-	struct list_head queuelist;
-	union {
-		struct __call_single_data csd;
-		u64 fifo_time;
-	};
-
 	struct request_queue *q;
 	struct blk_mq_ctx *mq_ctx;
 
@@ -148,8 +150,6 @@ struct request {
 
 	int internal_tag;
 
-	unsigned long atomic_flags;
-
 	/* the following two fields are internal, NEVER access directly */
 	unsigned int __data_len;	/* total data len */
 	int tag;
@@ -158,6 +158,8 @@ struct request {
 	struct bio *bio;
 	struct bio *biotail;
 
+	struct list_head queuelist;
+
 	/*
 	 * The hash is used inside the scheduler, and killed once the
 	 * request reaches the dispatch list. The ipi_list is only used
@@ -205,19 +207,16 @@ struct request {
 	struct hd_struct *part;
 	unsigned long start_time;
 	struct blk_issue_stat issue_stat;
-#ifdef CONFIG_BLK_CGROUP
-	struct request_list *rl;		/* rl this rq is alloced from */
-	unsigned long long start_time_ns;
-	unsigned long long io_start_time_ns;    /* when passed to hardware */
-#endif
 	/* Number of scatter-gather DMA addr+len pairs after
 	 * physical address coalescing is performed.
 	 */
 	unsigned short nr_phys_segments;
+
 #if defined(CONFIG_BLK_DEV_INTEGRITY)
 	unsigned short nr_integrity_segments;
 #endif
 
+	unsigned short write_hint;
 	unsigned short ioprio;
 
 	unsigned int timeout;
@@ -226,11 +225,37 @@ struct request {
 
 	unsigned int extra_len;	/* length of alignment and padding */
 
-	unsigned short write_hint;
+	/*
+	 * On blk-mq, the lower bits of ->gstate (generation number and
+	 * state) carry the MQ_RQ_* state value and the upper bits the
+	 * generation number which is monotonically incremented and used to
+	 * distinguish the reuse instances.
+	 *
+	 * ->gstate_seq allows updates to ->gstate and other fields
+	 * (currently ->deadline) during request start to be read
+	 * atomically from the timeout path, so that it can operate on a
+	 * coherent set of information.
+	 */
+	seqcount_t gstate_seq;
+	u64 gstate;
 
-	unsigned long deadline;
+	/*
+	 * ->aborted_gstate is used by the timeout to claim a specific
+	 * recycle instance of this request.  See blk_mq_timeout_work().
+	 */
+	struct u64_stats_sync aborted_gstate_sync;
+	u64 aborted_gstate;
+
+	/* access through blk_rq_set_deadline, blk_rq_deadline */
+	unsigned long __deadline;
+
 	struct list_head timeout_list;
 
+	union {
+		struct __call_single_data csd;
+		u64 fifo_time;
+	};
+
 	/*
 	 * completion callback.
 	 */
@@ -239,6 +264,12 @@ struct request {
 
 	/* for bidi */
 	struct request *next_rq;
+
+#ifdef CONFIG_BLK_CGROUP
+	struct request_list *rl;		/* rl this rq is alloced from */
+	unsigned long long start_time_ns;
+	unsigned long long io_start_time_ns;    /* when passed to hardware */
+#endif
 };
 
 static inline bool blk_op_is_scsi(unsigned int op)
@@ -564,6 +595,22 @@ struct request_queue {
 	struct queue_limits	limits;
 
 	/*
+	 * Zoned block device information for request dispatch control.
+	 * nr_zones is the total number of zones of the device. This is always
+	 * 0 for regular block devices. seq_zones_bitmap is a bitmap of nr_zones
+	 * bits which indicates if a zone is conventional (bit clear) or
+	 * sequential (bit set). seq_zones_wlock is a bitmap of nr_zones
+	 * bits which indicates if a zone is write locked, that is, if a write
+	 * request targeting the zone was dispatched. All three fields are
+	 * initialized by the low level device driver (e.g. scsi/sd.c).
+	 * Stacking drivers (device mappers) may or may not initialize
+	 * these fields.
+	 */
+	unsigned int		nr_zones;
+	unsigned long		*seq_zones_bitmap;
+	unsigned long		*seq_zones_wlock;
+
+	/*
 	 * sg stuff
 	 */
 	unsigned int		sg_timeout;
@@ -807,6 +854,27 @@ static inline unsigned int blk_queue_zone_sectors(struct request_queue *q)
 	return blk_queue_is_zoned(q) ? q->limits.chunk_sectors : 0;
 }
 
+static inline unsigned int blk_queue_nr_zones(struct request_queue *q)
+{
+	return q->nr_zones;
+}
+
+static inline unsigned int blk_queue_zone_no(struct request_queue *q,
+					     sector_t sector)
+{
+	if (!blk_queue_is_zoned(q))
+		return 0;
+	return sector >> ilog2(q->limits.chunk_sectors);
+}
+
+static inline bool blk_queue_zone_is_seq(struct request_queue *q,
+					 sector_t sector)
+{
+	if (!blk_queue_is_zoned(q) || !q->seq_zones_bitmap)
+		return false;
+	return test_bit(blk_queue_zone_no(q, sector), q->seq_zones_bitmap);
+}
+
 static inline bool rq_is_sync(struct request *rq)
 {
 	return op_is_sync(rq->cmd_flags);
@@ -1046,6 +1114,16 @@ static inline unsigned int blk_rq_cur_sectors(const struct request *rq)
 	return blk_rq_cur_bytes(rq) >> 9;
 }
 
+static inline unsigned int blk_rq_zone_no(struct request *rq)
+{
+	return blk_queue_zone_no(rq->q, blk_rq_pos(rq));
+}
+
+static inline unsigned int blk_rq_zone_is_seq(struct request *rq)
+{
+	return blk_queue_zone_is_seq(rq->q, blk_rq_pos(rq));
+}
+
 /*
  * Some commands like WRITE SAME have a payload or data transfer size which
  * is different from the size of the request.  Any driver that supports such
@@ -1595,7 +1673,15 @@ static inline unsigned int bdev_zone_sectors(struct block_device *bdev)
 
 	if (q)
 		return blk_queue_zone_sectors(q);
+	return 0;
+}
 
+static inline unsigned int bdev_nr_zones(struct block_device *bdev)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+
+	if (q)
+		return blk_queue_nr_zones(q);
 	return 0;
 }
 
@@ -1731,8 +1817,6 @@ static inline bool req_gap_front_merge(struct request *req, struct bio *bio)
 
 int kblockd_schedule_work(struct work_struct *work);
 int kblockd_schedule_work_on(int cpu, struct work_struct *work);
-int kblockd_schedule_delayed_work(struct delayed_work *dwork, unsigned long delay);
-int kblockd_schedule_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long delay);
 int kblockd_mod_delayed_work_on(int cpu, struct delayed_work *dwork, unsigned long delay);
 
 #ifdef CONFIG_BLK_CGROUP
@@ -1971,6 +2055,60 @@ extern int __blkdev_driver_ioctl(struct block_device *, fmode_t, unsigned int,
 extern int bdev_read_page(struct block_device *, sector_t, struct page *);
 extern int bdev_write_page(struct block_device *, sector_t, struct page *,
 						struct writeback_control *);
+
+#ifdef CONFIG_BLK_DEV_ZONED
+bool blk_req_needs_zone_write_lock(struct request *rq);
+void __blk_req_zone_write_lock(struct request *rq);
+void __blk_req_zone_write_unlock(struct request *rq);
+
+static inline void blk_req_zone_write_lock(struct request *rq)
+{
+	if (blk_req_needs_zone_write_lock(rq))
+		__blk_req_zone_write_lock(rq);
+}
+
+static inline void blk_req_zone_write_unlock(struct request *rq)
+{
+	if (rq->rq_flags & RQF_ZONE_WRITE_LOCKED)
+		__blk_req_zone_write_unlock(rq);
+}
+
+static inline bool blk_req_zone_is_write_locked(struct request *rq)
+{
+	return rq->q->seq_zones_wlock &&
+		test_bit(blk_rq_zone_no(rq), rq->q->seq_zones_wlock);
+}
+
+static inline bool blk_req_can_dispatch_to_zone(struct request *rq)
+{
+	if (!blk_req_needs_zone_write_lock(rq))
+		return true;
+	return !blk_req_zone_is_write_locked(rq);
+}
+#else
+static inline bool blk_req_needs_zone_write_lock(struct request *rq)
+{
+	return false;
+}
+
+static inline void blk_req_zone_write_lock(struct request *rq)
+{
+}
+
+static inline void blk_req_zone_write_unlock(struct request *rq)
+{
+}
+static inline bool blk_req_zone_is_write_locked(struct request *rq)
+{
+	return false;
+}
+
+static inline bool blk_req_can_dispatch_to_zone(struct request *rq)
+{
+	return true;
+}
+#endif /* CONFIG_BLK_DEV_ZONED */
+
 #else /* CONFIG_BLOCK */
 
 struct block_device;
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index e55e425..0b25cf8 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -43,7 +43,14 @@ struct bpf_map_ops {
 };
 
 struct bpf_map {
-	atomic_t refcnt;
+	/* 1st cacheline with read-mostly members of which some
+	 * are also accessed in fast-path (e.g. ops, max_entries).
+	 */
+	const struct bpf_map_ops *ops ____cacheline_aligned;
+	struct bpf_map *inner_map_meta;
+#ifdef CONFIG_SECURITY
+	void *security;
+#endif
 	enum bpf_map_type map_type;
 	u32 key_size;
 	u32 value_size;
@@ -52,15 +59,17 @@ struct bpf_map {
 	u32 pages;
 	u32 id;
 	int numa_node;
-	struct user_struct *user;
-	const struct bpf_map_ops *ops;
-	struct work_struct work;
+	bool unpriv_array;
+	/* 7 bytes hole */
+
+	/* 2nd cacheline with misc members to avoid false sharing
+	 * particularly with refcounting.
+	 */
+	struct user_struct *user ____cacheline_aligned;
+	atomic_t refcnt;
 	atomic_t usercnt;
-	struct bpf_map *inner_map_meta;
+	struct work_struct work;
 	char name[BPF_OBJ_NAME_LEN];
-#ifdef CONFIG_SECURITY
-	void *security;
-#endif
 };
 
 /* function argument constraints */
@@ -221,6 +230,7 @@ struct bpf_prog_aux {
 struct bpf_array {
 	struct bpf_map map;
 	u32 elem_size;
+	u32 index_mask;
 	/* 'ownership' of prog_array is claimed by the first program that
 	 * is going to use this map or by the first program which FD is stored
 	 * in the map to make sure that all callers and callees have the same
@@ -419,6 +429,8 @@ static inline int bpf_map_attr_numa_node(const union bpf_attr *attr)
 		attr->numa_node : NUMA_NO_NODE;
 }
 
+struct bpf_prog *bpf_prog_get_type_path(const char *name, enum bpf_prog_type type);
+
 #else /* !CONFIG_BPF_SYSCALL */
 static inline struct bpf_prog *bpf_prog_get(u32 ufd)
 {
@@ -506,6 +518,12 @@ static inline int cpu_map_enqueue(struct bpf_cpu_map_entry *rcpu,
 {
 	return 0;
 }
+
+static inline struct bpf_prog *bpf_prog_get_type_path(const char *name,
+				enum bpf_prog_type type)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
 #endif /* CONFIG_BPF_SYSCALL */
 
 static inline struct bpf_prog *bpf_prog_get_type(u32 ufd,
@@ -514,6 +532,8 @@ static inline struct bpf_prog *bpf_prog_get_type(u32 ufd,
 	return bpf_prog_get_type_dev(ufd, type, false);
 }
 
+bool bpf_prog_get_ok(struct bpf_prog *, enum bpf_prog_type *, bool);
+
 int bpf_prog_offload_compile(struct bpf_prog *prog);
 void bpf_prog_offload_destroy(struct bpf_prog *prog);
 
diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index ec8a4d7..fe7a22d 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -125,4 +125,13 @@ static inline bool bvec_iter_rewind(const struct bio_vec *bv,
 		((bvl = bvec_iter_bvec((bio_vec), (iter))), 1);	\
 	     bvec_iter_advance((bio_vec), &(iter), (bvl).bv_len))
 
+/* for iterating one bio from start to end */
+#define BVEC_ITER_ALL_INIT (struct bvec_iter)				\
+{									\
+	.bi_sector	= 0,						\
+	.bi_size	= UINT_MAX,					\
+	.bi_idx		= 0,						\
+	.bi_bvec_done	= 0,						\
+}
+
 #endif /* __LINUX_BVEC_ITER_H */
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 2272ded..631354a 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -219,7 +219,7 @@
 /* Mark a function definition as prohibited from being cloned. */
 #define __noclone	__attribute__((__noclone__, __optimize__("no-tracer")))
 
-#ifdef RANDSTRUCT_PLUGIN
+#if defined(RANDSTRUCT_PLUGIN) && !defined(__CHECKER__)
 #define __randomize_layout __attribute__((randomize_layout))
 #define __no_randomize_layout __attribute__((no_randomize_layout))
 #endif
diff --git a/include/linux/completion.h b/include/linux/completion.h
index 94a59ba..519e949 100644
--- a/include/linux/completion.h
+++ b/include/linux/completion.h
@@ -32,7 +32,6 @@ struct completion {
 #define init_completion(x) __init_completion(x)
 static inline void complete_acquire(struct completion *x) {}
 static inline void complete_release(struct completion *x) {}
-static inline void complete_release_commit(struct completion *x) {}
 
 #define COMPLETION_INITIALIZER(work) \
 	{ 0, __WAIT_QUEUE_HEAD_INITIALIZER((work).wait) }
diff --git a/include/linux/cper.h b/include/linux/cper.h
index 723e952..d14ef4e 100644
--- a/include/linux/cper.h
+++ b/include/linux/cper.h
@@ -275,6 +275,50 @@ enum {
 #define CPER_ARM_INFO_FLAGS_PROPAGATED		BIT(2)
 #define CPER_ARM_INFO_FLAGS_OVERFLOW		BIT(3)
 
+#define CPER_ARM_CACHE_ERROR			0
+#define CPER_ARM_TLB_ERROR			1
+#define CPER_ARM_BUS_ERROR			2
+#define CPER_ARM_VENDOR_ERROR			3
+#define CPER_ARM_MAX_TYPE			CPER_ARM_VENDOR_ERROR
+
+#define CPER_ARM_ERR_VALID_TRANSACTION_TYPE	BIT(0)
+#define CPER_ARM_ERR_VALID_OPERATION_TYPE	BIT(1)
+#define CPER_ARM_ERR_VALID_LEVEL		BIT(2)
+#define CPER_ARM_ERR_VALID_PROC_CONTEXT_CORRUPT	BIT(3)
+#define CPER_ARM_ERR_VALID_CORRECTED		BIT(4)
+#define CPER_ARM_ERR_VALID_PRECISE_PC		BIT(5)
+#define CPER_ARM_ERR_VALID_RESTARTABLE_PC	BIT(6)
+#define CPER_ARM_ERR_VALID_PARTICIPATION_TYPE	BIT(7)
+#define CPER_ARM_ERR_VALID_TIME_OUT		BIT(8)
+#define CPER_ARM_ERR_VALID_ADDRESS_SPACE	BIT(9)
+#define CPER_ARM_ERR_VALID_MEM_ATTRIBUTES	BIT(10)
+#define CPER_ARM_ERR_VALID_ACCESS_MODE		BIT(11)
+
+#define CPER_ARM_ERR_TRANSACTION_SHIFT		16
+#define CPER_ARM_ERR_TRANSACTION_MASK		GENMASK(1,0)
+#define CPER_ARM_ERR_OPERATION_SHIFT		18
+#define CPER_ARM_ERR_OPERATION_MASK		GENMASK(3,0)
+#define CPER_ARM_ERR_LEVEL_SHIFT		22
+#define CPER_ARM_ERR_LEVEL_MASK			GENMASK(2,0)
+#define CPER_ARM_ERR_PC_CORRUPT_SHIFT		25
+#define CPER_ARM_ERR_PC_CORRUPT_MASK		GENMASK(0,0)
+#define CPER_ARM_ERR_CORRECTED_SHIFT		26
+#define CPER_ARM_ERR_CORRECTED_MASK		GENMASK(0,0)
+#define CPER_ARM_ERR_PRECISE_PC_SHIFT		27
+#define CPER_ARM_ERR_PRECISE_PC_MASK		GENMASK(0,0)
+#define CPER_ARM_ERR_RESTARTABLE_PC_SHIFT	28
+#define CPER_ARM_ERR_RESTARTABLE_PC_MASK	GENMASK(0,0)
+#define CPER_ARM_ERR_PARTICIPATION_TYPE_SHIFT	29
+#define CPER_ARM_ERR_PARTICIPATION_TYPE_MASK	GENMASK(1,0)
+#define CPER_ARM_ERR_TIME_OUT_SHIFT		31
+#define CPER_ARM_ERR_TIME_OUT_MASK		GENMASK(0,0)
+#define CPER_ARM_ERR_ADDRESS_SPACE_SHIFT	32
+#define CPER_ARM_ERR_ADDRESS_SPACE_MASK		GENMASK(1,0)
+#define CPER_ARM_ERR_MEM_ATTRIBUTES_SHIFT	34
+#define CPER_ARM_ERR_MEM_ATTRIBUTES_MASK	GENMASK(8,0)
+#define CPER_ARM_ERR_ACCESS_MODE_SHIFT		43
+#define CPER_ARM_ERR_ACCESS_MODE_MASK		GENMASK(0,0)
+
 /*
  * All tables and structs must be byte-packed to match CPER
  * specification, since the tables are provided by the system BIOS
@@ -494,6 +538,8 @@ struct cper_sec_pcie {
 /* Reset to default packing */
 #pragma pack()
 
+extern const char * const cper_proc_error_type_strs[4];
+
 u64 cper_next_record_id(void);
 const char *cper_severity_str(unsigned int);
 const char *cper_mem_err_type_str(unsigned int);
@@ -503,5 +549,7 @@ void cper_mem_err_pack(const struct cper_sec_mem_err *,
 		       struct cper_mem_err_compact *);
 const char *cper_mem_err_unpack(struct trace_seq *,
 				struct cper_mem_err_compact *);
+void cper_print_proc_arm(const char *pfx,
+			 const struct cper_sec_proc_arm *proc);
 
 #endif
diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index a04ef7c1..7b01bc1 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -47,6 +47,13 @@ extern void cpu_remove_dev_attr(struct device_attribute *attr);
 extern int cpu_add_dev_attr_group(struct attribute_group *attrs);
 extern void cpu_remove_dev_attr_group(struct attribute_group *attrs);
 
+extern ssize_t cpu_show_meltdown(struct device *dev,
+				 struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_spectre_v1(struct device *dev,
+				   struct device_attribute *attr, char *buf);
+extern ssize_t cpu_show_spectre_v2(struct device *dev,
+				   struct device_attribute *attr, char *buf);
+
 extern __printf(4, 5)
 struct device *cpu_device_create(struct device *parent, void *drvdata,
 				 const struct attribute_group **groups,
diff --git a/include/linux/cpu_cooling.h b/include/linux/cpu_cooling.h
index d4292eb..de0dafb 100644
--- a/include/linux/cpu_cooling.h
+++ b/include/linux/cpu_cooling.h
@@ -30,9 +30,6 @@
 
 struct cpufreq_policy;
 
-typedef int (*get_static_t)(cpumask_t *cpumask, int interval,
-			    unsigned long voltage, u32 *power);
-
 #ifdef CONFIG_CPU_THERMAL
 /**
  * cpufreq_cooling_register - function to create cpufreq cooling device.
@@ -41,43 +38,6 @@ typedef int (*get_static_t)(cpumask_t *cpumask, int interval,
 struct thermal_cooling_device *
 cpufreq_cooling_register(struct cpufreq_policy *policy);
 
-struct thermal_cooling_device *
-cpufreq_power_cooling_register(struct cpufreq_policy *policy,
-			       u32 capacitance, get_static_t plat_static_func);
-
-/**
- * of_cpufreq_cooling_register - create cpufreq cooling device based on DT.
- * @np: a valid struct device_node to the cooling device device tree node.
- * @policy: cpufreq policy.
- */
-#ifdef CONFIG_THERMAL_OF
-struct thermal_cooling_device *
-of_cpufreq_cooling_register(struct device_node *np,
-			    struct cpufreq_policy *policy);
-
-struct thermal_cooling_device *
-of_cpufreq_power_cooling_register(struct device_node *np,
-				  struct cpufreq_policy *policy,
-				  u32 capacitance,
-				  get_static_t plat_static_func);
-#else
-static inline struct thermal_cooling_device *
-of_cpufreq_cooling_register(struct device_node *np,
-			    struct cpufreq_policy *policy)
-{
-	return ERR_PTR(-ENOSYS);
-}
-
-static inline struct thermal_cooling_device *
-of_cpufreq_power_cooling_register(struct device_node *np,
-				  struct cpufreq_policy *policy,
-				  u32 capacitance,
-				  get_static_t plat_static_func)
-{
-	return NULL;
-}
-#endif
-
 /**
  * cpufreq_cooling_unregister - function to remove cpufreq cooling device.
  * @cdev: thermal cooling device pointer.
@@ -90,28 +50,6 @@ cpufreq_cooling_register(struct cpufreq_policy *policy)
 {
 	return ERR_PTR(-ENOSYS);
 }
-static inline struct thermal_cooling_device *
-cpufreq_power_cooling_register(struct cpufreq_policy *policy,
-			       u32 capacitance, get_static_t plat_static_func)
-{
-	return NULL;
-}
-
-static inline struct thermal_cooling_device *
-of_cpufreq_cooling_register(struct device_node *np,
-			    struct cpufreq_policy *policy)
-{
-	return ERR_PTR(-ENOSYS);
-}
-
-static inline struct thermal_cooling_device *
-of_cpufreq_power_cooling_register(struct device_node *np,
-				  struct cpufreq_policy *policy,
-				  u32 capacitance,
-				  get_static_t plat_static_func)
-{
-	return NULL;
-}
 
 static inline
 void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
@@ -120,4 +58,19 @@ void cpufreq_cooling_unregister(struct thermal_cooling_device *cdev)
 }
 #endif	/* CONFIG_CPU_THERMAL */
 
+#if defined(CONFIG_THERMAL_OF) && defined(CONFIG_CPU_THERMAL)
+/**
+ * of_cpufreq_cooling_register - create cpufreq cooling device based on DT.
+ * @policy: cpufreq policy.
+ */
+struct thermal_cooling_device *
+of_cpufreq_cooling_register(struct cpufreq_policy *policy);
+#else
+static inline struct thermal_cooling_device *
+of_cpufreq_cooling_register(struct cpufreq_policy *policy)
+{
+	return NULL;
+}
+#endif /* defined(CONFIG_THERMAL_OF) && defined(CONFIG_CPU_THERMAL) */
+
 #endif /* __CPU_COOLING_H__ */
diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
index 1a32e55..2c787c5 100644
--- a/include/linux/cpuhotplug.h
+++ b/include/linux/cpuhotplug.h
@@ -109,6 +109,7 @@ enum cpuhp_state {
 	CPUHP_AP_PERF_XTENSA_STARTING,
 	CPUHP_AP_PERF_METAG_STARTING,
 	CPUHP_AP_MIPS_OP_LOONGSON3_STARTING,
+	CPUHP_AP_ARM_SDEI_STARTING,
 	CPUHP_AP_ARM_VFP_STARTING,
 	CPUHP_AP_ARM64_DEBUG_MONITORS_STARTING,
 	CPUHP_AP_PERF_ARM_HW_BREAKPOINT_STARTING,
diff --git a/include/linux/cpuidle.h b/include/linux/cpuidle.h
index 8f7788d..871f9e2 100644
--- a/include/linux/cpuidle.h
+++ b/include/linux/cpuidle.h
@@ -257,22 +257,30 @@ static inline int cpuidle_register_governor(struct cpuidle_governor *gov)
 {return 0;}
 #endif
 
-#define CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx)	\
-({								\
-	int __ret;						\
-								\
-	if (!idx) {						\
-		cpu_do_idle();					\
-		return idx;					\
-	}							\
-								\
-	__ret = cpu_pm_enter();					\
-	if (!__ret) {						\
-		__ret = low_level_idle_enter(idx);		\
-		cpu_pm_exit();					\
-	}							\
-								\
-	__ret ? -1 : idx;					\
+#define __CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx, is_retention) \
+({									\
+	int __ret = 0;							\
+									\
+	if (!idx) {							\
+		cpu_do_idle();						\
+		return idx;						\
+	}								\
+									\
+	if (!is_retention)						\
+		__ret =  cpu_pm_enter();				\
+	if (!__ret) {							\
+		__ret = low_level_idle_enter(idx);			\
+		if (!is_retention)					\
+			cpu_pm_exit();					\
+	}								\
+									\
+	__ret ? -1 : idx;						\
 })
 
+#define CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx)	\
+	__CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx, 0)
+
+#define CPU_PM_CPU_IDLE_ENTER_RETENTION(low_level_idle_enter, idx)	\
+	__CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx, 1)
+
 #endif /* _LINUX_CPUIDLE_H */
diff --git a/include/linux/crash_core.h b/include/linux/crash_core.h
index 06097ef..b511f6d 100644
--- a/include/linux/crash_core.h
+++ b/include/linux/crash_core.h
@@ -42,6 +42,8 @@ phys_addr_t paddr_vmcoreinfo_note(void);
 	vmcoreinfo_append_str("PAGESIZE=%ld\n", value)
 #define VMCOREINFO_SYMBOL(name) \
 	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)&name)
+#define VMCOREINFO_SYMBOL_ARRAY(name) \
+	vmcoreinfo_append_str("SYMBOL(%s)=%lx\n", #name, (unsigned long)name)
 #define VMCOREINFO_SIZE(name) \
 	vmcoreinfo_append_str("SIZE(%s)=%lu\n", #name, \
 			      (unsigned long)sizeof(name))
diff --git a/include/linux/crc-ccitt.h b/include/linux/crc-ccitt.h
index cd4f420..72c92c3 100644
--- a/include/linux/crc-ccitt.h
+++ b/include/linux/crc-ccitt.h
@@ -5,12 +5,19 @@
 #include <linux/types.h>
 
 extern u16 const crc_ccitt_table[256];
+extern u16 const crc_ccitt_false_table[256];
 
 extern u16 crc_ccitt(u16 crc, const u8 *buffer, size_t len);
+extern u16 crc_ccitt_false(u16 crc, const u8 *buffer, size_t len);
 
 static inline u16 crc_ccitt_byte(u16 crc, const u8 c)
 {
 	return (crc >> 8) ^ crc_ccitt_table[(crc ^ c) & 0xff];
 }
 
+static inline u16 crc_ccitt_false_byte(u16 crc, const u8 c)
+{
+    return (crc << 8) ^ crc_ccitt_false_table[(crc >> 8) ^ c];
+}
+
 #endif /* _LINUX_CRC_CCITT_H */
diff --git a/include/linux/delayacct.h b/include/linux/delayacct.h
index 4178d24..5e335b6 100644
--- a/include/linux/delayacct.h
+++ b/include/linux/delayacct.h
@@ -71,7 +71,7 @@ extern void delayacct_init(void);
 extern void __delayacct_tsk_init(struct task_struct *);
 extern void __delayacct_tsk_exit(struct task_struct *);
 extern void __delayacct_blkio_start(void);
-extern void __delayacct_blkio_end(void);
+extern void __delayacct_blkio_end(struct task_struct *);
 extern int __delayacct_add_tsk(struct taskstats *, struct task_struct *);
 extern __u64 __delayacct_blkio_ticks(struct task_struct *);
 extern void __delayacct_freepages_start(void);
@@ -122,10 +122,10 @@ static inline void delayacct_blkio_start(void)
 		__delayacct_blkio_start();
 }
 
-static inline void delayacct_blkio_end(void)
+static inline void delayacct_blkio_end(struct task_struct *p)
 {
 	if (current->delays)
-		__delayacct_blkio_end();
+		__delayacct_blkio_end(p);
 	delayacct_clear_flag(DELAYACCT_PF_BLKIO);
 }
 
@@ -169,7 +169,7 @@ static inline void delayacct_tsk_free(struct task_struct *tsk)
 {}
 static inline void delayacct_blkio_start(void)
 {}
-static inline void delayacct_blkio_end(void)
+static inline void delayacct_blkio_end(struct task_struct *p)
 {}
 static inline int delayacct_add_tsk(struct taskstats *d,
 					struct task_struct *tsk)
diff --git a/include/linux/efi.h b/include/linux/efi.h
index d813f7b..29fdf80 100644
--- a/include/linux/efi.h
+++ b/include/linux/efi.h
@@ -140,11 +140,13 @@ struct efi_boot_memmap {
 
 struct capsule_info {
 	efi_capsule_header_t	header;
+	efi_capsule_header_t	*capsule;
 	int			reset_type;
 	long			index;
 	size_t			count;
 	size_t			total_size;
-	phys_addr_t		*pages;
+	struct page		**pages;
+	phys_addr_t		*phys;
 	size_t			page_bytes_remain;
 };
 
diff --git a/include/linux/elevator.h b/include/linux/elevator.h
index 3d794b3..6d9e230 100644
--- a/include/linux/elevator.h
+++ b/include/linux/elevator.h
@@ -198,8 +198,6 @@ extern bool elv_attempt_insert_merge(struct request_queue *, struct request *);
 extern void elv_requeue_request(struct request_queue *, struct request *);
 extern struct request *elv_former_request(struct request_queue *, struct request *);
 extern struct request *elv_latter_request(struct request_queue *, struct request *);
-extern int elv_register_queue(struct request_queue *q);
-extern void elv_unregister_queue(struct request_queue *q);
 extern int elv_may_queue(struct request_queue *, unsigned int);
 extern void elv_completed_request(struct request_queue *, struct request *);
 extern int elv_set_request(struct request_queue *q, struct request *rq,
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 511fbaa..6804d07 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -639,7 +639,7 @@ struct inode {
 		struct hlist_head	i_dentry;
 		struct rcu_head		i_rcu;
 	};
-	u64			i_version;
+	atomic64_t		i_version;
 	atomic_t		i_count;
 	atomic_t		i_dio_count;
 	atomic_t		i_writecount;
@@ -2036,21 +2036,6 @@ static inline void inode_dec_link_count(struct inode *inode)
 	mark_inode_dirty(inode);
 }
 
-/**
- * inode_inc_iversion - increments i_version
- * @inode: inode that need to be updated
- *
- * Every time the inode is modified, the i_version field will be incremented.
- * The filesystem has to be mounted with i_version flag
- */
-
-static inline void inode_inc_iversion(struct inode *inode)
-{
-       spin_lock(&inode->i_lock);
-       inode->i_version++;
-       spin_unlock(&inode->i_lock);
-}
-
 enum file_time_flags {
 	S_ATIME = 1,
 	S_MTIME = 2,
diff --git a/include/linux/fscache.h b/include/linux/fscache.h
index f4ff47d..fe0c349 100644
--- a/include/linux/fscache.h
+++ b/include/linux/fscache.h
@@ -755,7 +755,7 @@ bool fscache_maybe_release_page(struct fscache_cookie *cookie,
 {
 	if (fscache_cookie_valid(cookie) && PageFsCache(page))
 		return __fscache_maybe_release_page(cookie, page, gfp);
-	return false;
+	return true;
 }
 
 /**
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 2bab819..9c3c9a3 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -332,6 +332,8 @@ extern int ftrace_text_reserved(const void *start, const void *end);
 
 extern int ftrace_nr_registered_ops(void);
 
+struct ftrace_ops *ftrace_ops_trampoline(unsigned long addr);
+
 bool is_ftrace_trampoline(unsigned long addr);
 
 /*
@@ -764,9 +766,6 @@ typedef int (*trace_func_graph_ent_t)(struct ftrace_graph_ent *); /* entry */
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 
-/* for init task */
-#define INIT_FTRACE_GRAPH		.ret_stack = NULL,
-
 /*
  * Stack of return addresses for functions
  * of a thread.
@@ -844,7 +843,6 @@ static inline void unpause_graph_tracing(void)
 #else /* !CONFIG_FUNCTION_GRAPH_TRACER */
 
 #define __notrace_funcgraph
-#define INIT_FTRACE_GRAPH
 
 static inline void ftrace_graph_init_task(struct task_struct *t) { }
 static inline void ftrace_graph_exit_task(struct task_struct *t) { }
@@ -923,10 +921,6 @@ extern int tracepoint_printk;
 extern void disable_trace_on_warning(void);
 extern int __disable_trace_on_warning;
 
-#ifdef CONFIG_PREEMPT
-#define INIT_TRACE_RECURSION		.trace_recursion = 0,
-#endif
-
 int tracepoint_printk_sysctl(struct ctl_table *table, int write,
 			     void __user *buffer, size_t *lenp,
 			     loff_t *ppos);
@@ -935,10 +929,6 @@ int tracepoint_printk_sysctl(struct ctl_table *table, int write,
 static inline void  disable_trace_on_warning(void) { }
 #endif /* CONFIG_TRACING */
 
-#ifndef INIT_TRACE_RECURSION
-#define INIT_TRACE_RECURSION
-#endif
-
 #ifdef CONFIG_FTRACE_SYSCALLS
 
 unsigned long arch_syscall_addr(int nr);
diff --git a/include/linux/genetlink.h b/include/linux/genetlink.h
index ecc2928..bc73850 100644
--- a/include/linux/genetlink.h
+++ b/include/linux/genetlink.h
@@ -31,8 +31,7 @@ extern wait_queue_head_t genl_sk_destructing_waitq;
  * @p: The pointer to read, prior to dereferencing
  *
  * Return the value of the specified RCU-protected pointer, but omit
- * both the smp_read_barrier_depends() and the READ_ONCE(), because
- * caller holds genl mutex.
+ * the READ_ONCE(), because caller holds genl mutex.
  */
 #define genl_dereference(p)					\
 	rcu_dereference_protected(p, lockdep_genl_is_held())
diff --git a/include/linux/genhd.h b/include/linux/genhd.h
index 5144ebe..5e35310 100644
--- a/include/linux/genhd.h
+++ b/include/linux/genhd.h
@@ -395,6 +395,11 @@ static inline void add_disk(struct gendisk *disk)
 {
 	device_add_disk(NULL, disk);
 }
+extern void device_add_disk_no_queue_reg(struct device *parent, struct gendisk *disk);
+static inline void add_disk_no_queue_reg(struct gendisk *disk)
+{
+	device_add_disk_no_queue_reg(NULL, disk);
+}
 
 extern void del_gendisk(struct gendisk *gp);
 extern struct gendisk *get_gendisk(dev_t dev, int *partno);
diff --git a/include/linux/hrtimer.h b/include/linux/hrtimer.h
index 012c37f..c7902ca 100644
--- a/include/linux/hrtimer.h
+++ b/include/linux/hrtimer.h
@@ -28,13 +28,29 @@ struct hrtimer_cpu_base;
 
 /*
  * Mode arguments of xxx_hrtimer functions:
+ *
+ * HRTIMER_MODE_ABS		- Time value is absolute
+ * HRTIMER_MODE_REL		- Time value is relative to now
+ * HRTIMER_MODE_PINNED		- Timer is bound to CPU (is only considered
+ *				  when starting the timer)
+ * HRTIMER_MODE_SOFT		- Timer callback function will be executed in
+ *				  soft irq context
  */
 enum hrtimer_mode {
-	HRTIMER_MODE_ABS = 0x0,		/* Time value is absolute */
-	HRTIMER_MODE_REL = 0x1,		/* Time value is relative to now */
-	HRTIMER_MODE_PINNED = 0x02,	/* Timer is bound to CPU */
-	HRTIMER_MODE_ABS_PINNED = 0x02,
-	HRTIMER_MODE_REL_PINNED = 0x03,
+	HRTIMER_MODE_ABS	= 0x00,
+	HRTIMER_MODE_REL	= 0x01,
+	HRTIMER_MODE_PINNED	= 0x02,
+	HRTIMER_MODE_SOFT	= 0x04,
+
+	HRTIMER_MODE_ABS_PINNED = HRTIMER_MODE_ABS | HRTIMER_MODE_PINNED,
+	HRTIMER_MODE_REL_PINNED = HRTIMER_MODE_REL | HRTIMER_MODE_PINNED,
+
+	HRTIMER_MODE_ABS_SOFT	= HRTIMER_MODE_ABS | HRTIMER_MODE_SOFT,
+	HRTIMER_MODE_REL_SOFT	= HRTIMER_MODE_REL | HRTIMER_MODE_SOFT,
+
+	HRTIMER_MODE_ABS_PINNED_SOFT = HRTIMER_MODE_ABS_PINNED | HRTIMER_MODE_SOFT,
+	HRTIMER_MODE_REL_PINNED_SOFT = HRTIMER_MODE_REL_PINNED | HRTIMER_MODE_SOFT,
+
 };
 
 /*
@@ -87,6 +103,7 @@ enum hrtimer_restart {
  * @base:	pointer to the timer base (per cpu and per clock)
  * @state:	state information (See bit values above)
  * @is_rel:	Set if the timer was armed relative
+ * @is_soft:	Set if hrtimer will be expired in soft interrupt context.
  *
  * The hrtimer structure must be initialized by hrtimer_init()
  */
@@ -97,6 +114,7 @@ struct hrtimer {
 	struct hrtimer_clock_base	*base;
 	u8				state;
 	u8				is_rel;
+	u8				is_soft;
 };
 
 /**
@@ -112,9 +130,9 @@ struct hrtimer_sleeper {
 };
 
 #ifdef CONFIG_64BIT
-# define HRTIMER_CLOCK_BASE_ALIGN	64
+# define __hrtimer_clock_base_align	____cacheline_aligned
 #else
-# define HRTIMER_CLOCK_BASE_ALIGN	32
+# define __hrtimer_clock_base_align
 #endif
 
 /**
@@ -123,48 +141,57 @@ struct hrtimer_sleeper {
  * @index:		clock type index for per_cpu support when moving a
  *			timer to a base on another cpu.
  * @clockid:		clock id for per_cpu support
+ * @seq:		seqcount around __run_hrtimer
+ * @running:		pointer to the currently running hrtimer
  * @active:		red black tree root node for the active timers
  * @get_time:		function to retrieve the current time of the clock
  * @offset:		offset of this clock to the monotonic base
  */
 struct hrtimer_clock_base {
 	struct hrtimer_cpu_base	*cpu_base;
-	int			index;
+	unsigned int		index;
 	clockid_t		clockid;
+	seqcount_t		seq;
+	struct hrtimer		*running;
 	struct timerqueue_head	active;
 	ktime_t			(*get_time)(void);
 	ktime_t			offset;
-} __attribute__((__aligned__(HRTIMER_CLOCK_BASE_ALIGN)));
+} __hrtimer_clock_base_align;
 
 enum  hrtimer_base_type {
 	HRTIMER_BASE_MONOTONIC,
 	HRTIMER_BASE_REALTIME,
 	HRTIMER_BASE_BOOTTIME,
 	HRTIMER_BASE_TAI,
+	HRTIMER_BASE_MONOTONIC_SOFT,
+	HRTIMER_BASE_REALTIME_SOFT,
+	HRTIMER_BASE_BOOTTIME_SOFT,
+	HRTIMER_BASE_TAI_SOFT,
 	HRTIMER_MAX_CLOCK_BASES,
 };
 
-/*
+/**
  * struct hrtimer_cpu_base - the per cpu clock bases
  * @lock:		lock protecting the base and associated clock bases
  *			and timers
- * @seq:		seqcount around __run_hrtimer
- * @running:		pointer to the currently running hrtimer
  * @cpu:		cpu number
  * @active_bases:	Bitfield to mark bases with active timers
  * @clock_was_set_seq:	Sequence counter of clock was set events
- * @migration_enabled:	The migration of hrtimers to other cpus is enabled
- * @nohz_active:	The nohz functionality is enabled
- * @expires_next:	absolute time of the next event which was scheduled
- *			via clock_set_next_event()
- * @next_timer:		Pointer to the first expiring timer
- * @in_hrtirq:		hrtimer_interrupt() is currently executing
  * @hres_active:	State of high resolution mode
+ * @in_hrtirq:		hrtimer_interrupt() is currently executing
  * @hang_detected:	The last hrtimer interrupt detected a hang
+ * @softirq_activated:	displays, if the softirq is raised - update of softirq
+ *			related settings is not required then.
  * @nr_events:		Total number of hrtimer interrupt events
  * @nr_retries:		Total number of hrtimer interrupt retries
  * @nr_hangs:		Total number of hrtimer interrupt hangs
  * @max_hang_time:	Maximum time spent in hrtimer_interrupt
+ * @expires_next:	absolute time of the next event, is required for remote
+ *			hrtimer enqueue; it is the total first expiry time (hard
+ *			and soft hrtimer are taken into account)
+ * @next_timer:		Pointer to the first expiring timer
+ * @softirq_expires_next: Time to check, if soft queues needs also to be expired
+ * @softirq_next_timer: Pointer to the first expiring softirq based timer
  * @clock_base:		array of clock bases for this cpu
  *
  * Note: next_timer is just an optimization for __remove_hrtimer().
@@ -173,31 +200,28 @@ enum  hrtimer_base_type {
  */
 struct hrtimer_cpu_base {
 	raw_spinlock_t			lock;
-	seqcount_t			seq;
-	struct hrtimer			*running;
 	unsigned int			cpu;
 	unsigned int			active_bases;
 	unsigned int			clock_was_set_seq;
-	bool				migration_enabled;
-	bool				nohz_active;
+	unsigned int			hres_active		: 1,
+					in_hrtirq		: 1,
+					hang_detected		: 1,
+					softirq_activated       : 1;
 #ifdef CONFIG_HIGH_RES_TIMERS
-	unsigned int			in_hrtirq	: 1,
-					hres_active	: 1,
-					hang_detected	: 1;
-	ktime_t				expires_next;
-	struct hrtimer			*next_timer;
 	unsigned int			nr_events;
-	unsigned int			nr_retries;
-	unsigned int			nr_hangs;
+	unsigned short			nr_retries;
+	unsigned short			nr_hangs;
 	unsigned int			max_hang_time;
 #endif
+	ktime_t				expires_next;
+	struct hrtimer			*next_timer;
+	ktime_t				softirq_expires_next;
+	struct hrtimer			*softirq_next_timer;
 	struct hrtimer_clock_base	clock_base[HRTIMER_MAX_CLOCK_BASES];
 } ____cacheline_aligned;
 
 static inline void hrtimer_set_expires(struct hrtimer *timer, ktime_t time)
 {
-	BUILD_BUG_ON(sizeof(struct hrtimer_clock_base) > HRTIMER_CLOCK_BASE_ALIGN);
-
 	timer->node.expires = time;
 	timer->_softexpires = time;
 }
@@ -266,16 +290,17 @@ static inline ktime_t hrtimer_cb_get_time(struct hrtimer *timer)
 	return timer->base->get_time();
 }
 
+static inline int hrtimer_is_hres_active(struct hrtimer *timer)
+{
+	return IS_ENABLED(CONFIG_HIGH_RES_TIMERS) ?
+		timer->base->cpu_base->hres_active : 0;
+}
+
 #ifdef CONFIG_HIGH_RES_TIMERS
 struct clock_event_device;
 
 extern void hrtimer_interrupt(struct clock_event_device *dev);
 
-static inline int hrtimer_is_hres_active(struct hrtimer *timer)
-{
-	return timer->base->cpu_base->hres_active;
-}
-
 /*
  * The resolution of the clocks. The resolution value is returned in
  * the clock_getres() system call to give application programmers an
@@ -298,11 +323,6 @@ extern unsigned int hrtimer_resolution;
 
 #define hrtimer_resolution	(unsigned int)LOW_RES_NSEC
 
-static inline int hrtimer_is_hres_active(struct hrtimer *timer)
-{
-	return 0;
-}
-
 static inline void clock_was_set_delayed(void) { }
 
 #endif
@@ -365,11 +385,12 @@ extern void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 				   u64 range_ns, const enum hrtimer_mode mode);
 
 /**
- * hrtimer_start - (re)start an hrtimer on the current CPU
+ * hrtimer_start - (re)start an hrtimer
  * @timer:	the timer to be added
  * @tim:	expiry time
- * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
- *		relative (HRTIMER_MODE_REL)
+ * @mode:	timer mode: absolute (HRTIMER_MODE_ABS) or
+ *		relative (HRTIMER_MODE_REL), and pinned (HRTIMER_MODE_PINNED);
+ *		softirq based mode is considered for debug purpose only!
  */
 static inline void hrtimer_start(struct hrtimer *timer, ktime_t tim,
 				 const enum hrtimer_mode mode)
@@ -422,7 +443,7 @@ static inline int hrtimer_is_queued(struct hrtimer *timer)
  */
 static inline int hrtimer_callback_running(struct hrtimer *timer)
 {
-	return timer->base->cpu_base->running == timer;
+	return timer->base->running == timer;
 }
 
 /* Forward a hrtimer so it expires after now: */
@@ -466,7 +487,7 @@ extern int schedule_hrtimeout_range(ktime_t *expires, u64 delta,
 extern int schedule_hrtimeout_range_clock(ktime_t *expires,
 					  u64 delta,
 					  const enum hrtimer_mode mode,
-					  int clock);
+					  clockid_t clock_id);
 extern int schedule_hrtimeout(ktime_t *expires, const enum hrtimer_mode mode);
 
 /* Soft interrupt function to run the hrtimer queues: */
diff --git a/include/linux/iio/adc/stm32-dfsdm-adc.h b/include/linux/iio/adc/stm32-dfsdm-adc.h
new file mode 100644
index 0000000..e7dc7a5
--- /dev/null
+++ b/include/linux/iio/adc/stm32-dfsdm-adc.h
@@ -0,0 +1,18 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * This file discribe the STM32 DFSDM IIO driver API for audio part
+ *
+ * Copyright (C) 2017, STMicroelectronics - All Rights Reserved
+ * Author(s): Arnaud Pouliquen <arnaud.pouliquen@st.com>.
+ */
+
+#ifndef STM32_DFSDM_ADC_H
+#define STM32_DFSDM_ADC_H
+
+int stm32_dfsdm_get_buff_cb(struct iio_dev *iio_dev,
+			    int (*cb)(const void *data, size_t size,
+				      void *private),
+			    void *private);
+int stm32_dfsdm_release_buff_cb(struct iio_dev *iio_dev);
+
+#endif
diff --git a/include/linux/iio/consumer.h b/include/linux/iio/consumer.h
index 5e347a9..9887f4f 100644
--- a/include/linux/iio/consumer.h
+++ b/include/linux/iio/consumer.h
@@ -134,6 +134,17 @@ struct iio_cb_buffer *iio_channel_get_all_cb(struct device *dev,
 						       void *private),
 					     void *private);
 /**
+ * iio_channel_cb_set_buffer_watermark() - set the buffer watermark.
+ * @cb_buffer:		The callback buffer from whom we want the channel
+ *			information.
+ * @watermark: buffer watermark in bytes.
+ *
+ * This function allows to configure the buffer watermark.
+ */
+int iio_channel_cb_set_buffer_watermark(struct iio_cb_buffer *cb_buffer,
+					size_t watermark);
+
+/**
  * iio_channel_release_all_cb() - release and unregister the callback.
  * @cb_buffer:		The callback buffer that was allocated.
  */
@@ -216,6 +227,32 @@ int iio_read_channel_average_raw(struct iio_channel *chan, int *val);
 int iio_read_channel_processed(struct iio_channel *chan, int *val);
 
 /**
+ * iio_write_channel_attribute() - Write values to the device attribute.
+ * @chan:	The channel being queried.
+ * @val:	Value being written.
+ * @val2:	Value being written.val2 use depends on attribute type.
+ * @attribute:	info attribute to be read.
+ *
+ * Returns an error code or 0.
+ */
+int iio_write_channel_attribute(struct iio_channel *chan, int val,
+				int val2, enum iio_chan_info_enum attribute);
+
+/**
+ * iio_read_channel_attribute() - Read values from the device attribute.
+ * @chan:	The channel being queried.
+ * @val:	Value being written.
+ * @val2:	Value being written.Val2 use depends on attribute type.
+ * @attribute:	info attribute to be written.
+ *
+ * Returns an error code if failed. Else returns a description of what is in val
+ * and val2, such as IIO_VAL_INT_PLUS_MICRO telling us we have a value of val
+ * + val2/1e6
+ */
+int iio_read_channel_attribute(struct iio_channel *chan, int *val,
+			       int *val2, enum iio_chan_info_enum attribute);
+
+/**
  * iio_write_channel_raw() - write to a given channel
  * @chan:		The channel being queried.
  * @val:		Value being written.
diff --git a/include/linux/iio/hw-consumer.h b/include/linux/iio/hw-consumer.h
new file mode 100644
index 0000000..44d48bb
--- /dev/null
+++ b/include/linux/iio/hw-consumer.h
@@ -0,0 +1,21 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Industrial I/O in kernel hardware consumer interface
+ *
+ * Copyright 2017 Analog Devices Inc.
+ *  Author: Lars-Peter Clausen <lars@metafoo.de>
+ */
+
+#ifndef LINUX_IIO_HW_CONSUMER_H
+#define LINUX_IIO_HW_CONSUMER_H
+
+struct iio_hw_consumer;
+
+struct iio_hw_consumer *iio_hw_consumer_alloc(struct device *dev);
+void iio_hw_consumer_free(struct iio_hw_consumer *hwc);
+struct iio_hw_consumer *devm_iio_hw_consumer_alloc(struct device *dev);
+void devm_iio_hw_consumer_free(struct device *dev, struct iio_hw_consumer *hwc);
+int iio_hw_consumer_enable(struct iio_hw_consumer *hwc);
+void iio_hw_consumer_disable(struct iio_hw_consumer *hwc);
+
+#endif
diff --git a/include/linux/iio/iio.h b/include/linux/iio/iio.h
index 20b6134..f12a61b 100644
--- a/include/linux/iio/iio.h
+++ b/include/linux/iio/iio.h
@@ -20,34 +20,6 @@
  * Currently assumes nano seconds.
  */
 
-enum iio_chan_info_enum {
-	IIO_CHAN_INFO_RAW = 0,
-	IIO_CHAN_INFO_PROCESSED,
-	IIO_CHAN_INFO_SCALE,
-	IIO_CHAN_INFO_OFFSET,
-	IIO_CHAN_INFO_CALIBSCALE,
-	IIO_CHAN_INFO_CALIBBIAS,
-	IIO_CHAN_INFO_PEAK,
-	IIO_CHAN_INFO_PEAK_SCALE,
-	IIO_CHAN_INFO_QUADRATURE_CORRECTION_RAW,
-	IIO_CHAN_INFO_AVERAGE_RAW,
-	IIO_CHAN_INFO_LOW_PASS_FILTER_3DB_FREQUENCY,
-	IIO_CHAN_INFO_HIGH_PASS_FILTER_3DB_FREQUENCY,
-	IIO_CHAN_INFO_SAMP_FREQ,
-	IIO_CHAN_INFO_FREQUENCY,
-	IIO_CHAN_INFO_PHASE,
-	IIO_CHAN_INFO_HARDWAREGAIN,
-	IIO_CHAN_INFO_HYSTERESIS,
-	IIO_CHAN_INFO_INT_TIME,
-	IIO_CHAN_INFO_ENABLE,
-	IIO_CHAN_INFO_CALIBHEIGHT,
-	IIO_CHAN_INFO_CALIBWEIGHT,
-	IIO_CHAN_INFO_DEBOUNCE_COUNT,
-	IIO_CHAN_INFO_DEBOUNCE_TIME,
-	IIO_CHAN_INFO_CALIBEMISSIVITY,
-	IIO_CHAN_INFO_OVERSAMPLING_RATIO,
-};
-
 enum iio_shared_by {
 	IIO_SEPARATE,
 	IIO_SHARED_BY_TYPE,
diff --git a/include/linux/iio/types.h b/include/linux/iio/types.h
index 2aa7b63..6eb3d683 100644
--- a/include/linux/iio/types.h
+++ b/include/linux/iio/types.h
@@ -34,4 +34,32 @@ enum iio_available_type {
 	IIO_AVAIL_RANGE,
 };
 
+enum iio_chan_info_enum {
+	IIO_CHAN_INFO_RAW = 0,
+	IIO_CHAN_INFO_PROCESSED,
+	IIO_CHAN_INFO_SCALE,
+	IIO_CHAN_INFO_OFFSET,
+	IIO_CHAN_INFO_CALIBSCALE,
+	IIO_CHAN_INFO_CALIBBIAS,
+	IIO_CHAN_INFO_PEAK,
+	IIO_CHAN_INFO_PEAK_SCALE,
+	IIO_CHAN_INFO_QUADRATURE_CORRECTION_RAW,
+	IIO_CHAN_INFO_AVERAGE_RAW,
+	IIO_CHAN_INFO_LOW_PASS_FILTER_3DB_FREQUENCY,
+	IIO_CHAN_INFO_HIGH_PASS_FILTER_3DB_FREQUENCY,
+	IIO_CHAN_INFO_SAMP_FREQ,
+	IIO_CHAN_INFO_FREQUENCY,
+	IIO_CHAN_INFO_PHASE,
+	IIO_CHAN_INFO_HARDWAREGAIN,
+	IIO_CHAN_INFO_HYSTERESIS,
+	IIO_CHAN_INFO_INT_TIME,
+	IIO_CHAN_INFO_ENABLE,
+	IIO_CHAN_INFO_CALIBHEIGHT,
+	IIO_CHAN_INFO_CALIBWEIGHT,
+	IIO_CHAN_INFO_DEBOUNCE_COUNT,
+	IIO_CHAN_INFO_DEBOUNCE_TIME,
+	IIO_CHAN_INFO_CALIBEMISSIVITY,
+	IIO_CHAN_INFO_OVERSAMPLING_RATIO,
+};
+
 #endif /* _IIO_TYPES_H_ */
diff --git a/include/linux/init_task.h b/include/linux/init_task.h
index 6a53262..a454b8a 100644
--- a/include/linux/init_task.h
+++ b/include/linux/init_task.h
@@ -21,22 +21,11 @@
 
 #include <asm/thread_info.h>
 
-#ifdef CONFIG_SMP
-# define INIT_PUSHABLE_TASKS(tsk)					\
-	.pushable_tasks = PLIST_NODE_INIT(tsk.pushable_tasks, MAX_PRIO),
-#else
-# define INIT_PUSHABLE_TASKS(tsk)
-#endif
-
 extern struct files_struct init_files;
 extern struct fs_struct init_fs;
-
-#ifdef CONFIG_CPUSETS
-#define INIT_CPUSET_SEQ(tsk)							\
-	.mems_allowed_seq = SEQCNT_ZERO(tsk.mems_allowed_seq),
-#else
-#define INIT_CPUSET_SEQ(tsk)
-#endif
+extern struct nsproxy init_nsproxy;
+extern struct group_info init_groups;
+extern struct cred init_cred;
 
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
 #define INIT_PREV_CPUTIME(x)	.prev_cputime = {			\
@@ -47,67 +36,16 @@ extern struct fs_struct init_fs;
 #endif
 
 #ifdef CONFIG_POSIX_TIMERS
-#define INIT_POSIX_TIMERS(s)						\
-	.posix_timers = LIST_HEAD_INIT(s.posix_timers),
 #define INIT_CPU_TIMERS(s)						\
 	.cpu_timers = {							\
 		LIST_HEAD_INIT(s.cpu_timers[0]),			\
 		LIST_HEAD_INIT(s.cpu_timers[1]),			\
-		LIST_HEAD_INIT(s.cpu_timers[2]),								\
-	},
-#define INIT_CPUTIMER(s)						\
-	.cputimer	= { 						\
-		.cputime_atomic	= INIT_CPUTIME_ATOMIC,			\
-		.running	= false,				\
-		.checking_timer = false,				\
+		LIST_HEAD_INIT(s.cpu_timers[2]),			\
 	},
 #else
-#define INIT_POSIX_TIMERS(s)
 #define INIT_CPU_TIMERS(s)
-#define INIT_CPUTIMER(s)
 #endif
 
-#define INIT_SIGNALS(sig) {						\
-	.nr_threads	= 1,						\
-	.thread_head	= LIST_HEAD_INIT(init_task.thread_node),	\
-	.wait_chldexit	= __WAIT_QUEUE_HEAD_INITIALIZER(sig.wait_chldexit),\
-	.shared_pending	= { 						\
-		.list = LIST_HEAD_INIT(sig.shared_pending.list),	\
-		.signal =  {{0}}},					\
-	INIT_POSIX_TIMERS(sig)						\
-	INIT_CPU_TIMERS(sig)						\
-	.rlim		= INIT_RLIMITS,					\
-	INIT_CPUTIMER(sig)						\
-	INIT_PREV_CPUTIME(sig)						\
-	.cred_guard_mutex =						\
-		 __MUTEX_INITIALIZER(sig.cred_guard_mutex),		\
-}
-
-extern struct nsproxy init_nsproxy;
-
-#define INIT_SIGHAND(sighand) {						\
-	.count		= ATOMIC_INIT(1), 				\
-	.action		= { { { .sa_handler = SIG_DFL, } }, },		\
-	.siglock	= __SPIN_LOCK_UNLOCKED(sighand.siglock),	\
-	.signalfd_wqh	= __WAIT_QUEUE_HEAD_INITIALIZER(sighand.signalfd_wqh),	\
-}
-
-extern struct group_info init_groups;
-
-#define INIT_STRUCT_PID {						\
-	.count 		= ATOMIC_INIT(1),				\
-	.tasks		= {						\
-		{ .first = NULL },					\
-		{ .first = NULL },					\
-		{ .first = NULL },					\
-	},								\
-	.level		= 0,						\
-	.numbers	= { {						\
-		.nr		= 0,					\
-		.ns		= &init_pid_ns,				\
-	}, }								\
-}
-
 #define INIT_PID_LINK(type) 					\
 {								\
 	.node = {						\
@@ -117,192 +55,16 @@ extern struct group_info init_groups;
 	.pid = &init_struct_pid,				\
 }
 
-#ifdef CONFIG_AUDITSYSCALL
-#define INIT_IDS \
-	.loginuid = INVALID_UID, \
-	.sessionid = (unsigned int)-1,
-#else
-#define INIT_IDS
-#endif
-
-#ifdef CONFIG_PREEMPT_RCU
-#define INIT_TASK_RCU_PREEMPT(tsk)					\
-	.rcu_read_lock_nesting = 0,					\
-	.rcu_read_unlock_special.s = 0,					\
-	.rcu_node_entry = LIST_HEAD_INIT(tsk.rcu_node_entry),		\
-	.rcu_blocked_node = NULL,
-#else
-#define INIT_TASK_RCU_PREEMPT(tsk)
-#endif
-#ifdef CONFIG_TASKS_RCU
-#define INIT_TASK_RCU_TASKS(tsk)					\
-	.rcu_tasks_holdout = false,					\
-	.rcu_tasks_holdout_list =					\
-		LIST_HEAD_INIT(tsk.rcu_tasks_holdout_list),		\
-	.rcu_tasks_idle_cpu = -1,
-#else
-#define INIT_TASK_RCU_TASKS(tsk)
-#endif
-
-extern struct cred init_cred;
-
-#ifdef CONFIG_CGROUP_SCHED
-# define INIT_CGROUP_SCHED(tsk)						\
-	.sched_task_group = &root_task_group,
-#else
-# define INIT_CGROUP_SCHED(tsk)
-#endif
-
-#ifdef CONFIG_PERF_EVENTS
-# define INIT_PERF_EVENTS(tsk)						\
-	.perf_event_mutex = 						\
-		 __MUTEX_INITIALIZER(tsk.perf_event_mutex),		\
-	.perf_event_list = LIST_HEAD_INIT(tsk.perf_event_list),
-#else
-# define INIT_PERF_EVENTS(tsk)
-#endif
-
-#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
-# define INIT_VTIME(tsk)						\
-	.vtime.seqcount = SEQCNT_ZERO(tsk.vtime.seqcount),		\
-	.vtime.starttime = 0,						\
-	.vtime.state = VTIME_SYS,
-#else
-# define INIT_VTIME(tsk)
-#endif
-
 #define INIT_TASK_COMM "swapper"
 
-#ifdef CONFIG_RT_MUTEXES
-# define INIT_RT_MUTEXES(tsk)						\
-	.pi_waiters = RB_ROOT_CACHED,					\
-	.pi_top_task = NULL,
-#else
-# define INIT_RT_MUTEXES(tsk)
-#endif
-
-#ifdef CONFIG_NUMA_BALANCING
-# define INIT_NUMA_BALANCING(tsk)					\
-	.numa_preferred_nid = -1,					\
-	.numa_group = NULL,						\
-	.numa_faults = NULL,
-#else
-# define INIT_NUMA_BALANCING(tsk)
-#endif
-
-#ifdef CONFIG_KASAN
-# define INIT_KASAN(tsk)						\
-	.kasan_depth = 1,
-#else
-# define INIT_KASAN(tsk)
-#endif
-
-#ifdef CONFIG_LIVEPATCH
-# define INIT_LIVEPATCH(tsk)						\
-	.patch_state = KLP_UNDEFINED,
-#else
-# define INIT_LIVEPATCH(tsk)
-#endif
-
-#ifdef CONFIG_THREAD_INFO_IN_TASK
-# define INIT_TASK_TI(tsk)			\
-	.thread_info = INIT_THREAD_INFO(tsk),	\
-	.stack_refcount = ATOMIC_INIT(1),
-#else
-# define INIT_TASK_TI(tsk)
-#endif
-
-#ifdef CONFIG_SECURITY
-#define INIT_TASK_SECURITY .security = NULL,
-#else
-#define INIT_TASK_SECURITY
-#endif
-
-/*
- *  INIT_TASK is used to set up the first task table, touch at
- * your own risk!. Base=0, limit=0x1fffff (=2MB)
- */
-#define INIT_TASK(tsk)	\
-{									\
-	INIT_TASK_TI(tsk)						\
-	.state		= 0,						\
-	.stack		= init_stack,					\
-	.usage		= ATOMIC_INIT(2),				\
-	.flags		= PF_KTHREAD,					\
-	.prio		= MAX_PRIO-20,					\
-	.static_prio	= MAX_PRIO-20,					\
-	.normal_prio	= MAX_PRIO-20,					\
-	.policy		= SCHED_NORMAL,					\
-	.cpus_allowed	= CPU_MASK_ALL,					\
-	.nr_cpus_allowed= NR_CPUS,					\
-	.mm		= NULL,						\
-	.active_mm	= &init_mm,					\
-	.restart_block = {						\
-		.fn = do_no_restart_syscall,				\
-	},								\
-	.se		= {						\
-		.group_node 	= LIST_HEAD_INIT(tsk.se.group_node),	\
-	},								\
-	.rt		= {						\
-		.run_list	= LIST_HEAD_INIT(tsk.rt.run_list),	\
-		.time_slice	= RR_TIMESLICE,				\
-	},								\
-	.tasks		= LIST_HEAD_INIT(tsk.tasks),			\
-	INIT_PUSHABLE_TASKS(tsk)					\
-	INIT_CGROUP_SCHED(tsk)						\
-	.ptraced	= LIST_HEAD_INIT(tsk.ptraced),			\
-	.ptrace_entry	= LIST_HEAD_INIT(tsk.ptrace_entry),		\
-	.real_parent	= &tsk,						\
-	.parent		= &tsk,						\
-	.children	= LIST_HEAD_INIT(tsk.children),			\
-	.sibling	= LIST_HEAD_INIT(tsk.sibling),			\
-	.group_leader	= &tsk,						\
-	RCU_POINTER_INITIALIZER(real_cred, &init_cred),			\
-	RCU_POINTER_INITIALIZER(cred, &init_cred),			\
-	.comm		= INIT_TASK_COMM,				\
-	.thread		= INIT_THREAD,					\
-	.fs		= &init_fs,					\
-	.files		= &init_files,					\
-	.signal		= &init_signals,				\
-	.sighand	= &init_sighand,				\
-	.nsproxy	= &init_nsproxy,				\
-	.pending	= {						\
-		.list = LIST_HEAD_INIT(tsk.pending.list),		\
-		.signal = {{0}}},					\
-	.blocked	= {{0}},					\
-	.alloc_lock	= __SPIN_LOCK_UNLOCKED(tsk.alloc_lock),		\
-	.journal_info	= NULL,						\
-	INIT_CPU_TIMERS(tsk)						\
-	.pi_lock	= __RAW_SPIN_LOCK_UNLOCKED(tsk.pi_lock),	\
-	.timer_slack_ns = 50000, /* 50 usec default slack */		\
-	.pids = {							\
-		[PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),		\
-		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),		\
-		[PIDTYPE_SID]  = INIT_PID_LINK(PIDTYPE_SID),		\
-	},								\
-	.thread_group	= LIST_HEAD_INIT(tsk.thread_group),		\
-	.thread_node	= LIST_HEAD_INIT(init_signals.thread_head),	\
-	INIT_IDS							\
-	INIT_PERF_EVENTS(tsk)						\
-	INIT_TRACE_IRQFLAGS						\
-	INIT_LOCKDEP							\
-	INIT_FTRACE_GRAPH						\
-	INIT_TRACE_RECURSION						\
-	INIT_TASK_RCU_PREEMPT(tsk)					\
-	INIT_TASK_RCU_TASKS(tsk)					\
-	INIT_CPUSET_SEQ(tsk)						\
-	INIT_RT_MUTEXES(tsk)						\
-	INIT_PREV_CPUTIME(tsk)						\
-	INIT_VTIME(tsk)							\
-	INIT_NUMA_BALANCING(tsk)					\
-	INIT_KASAN(tsk)							\
-	INIT_LIVEPATCH(tsk)						\
-	INIT_TASK_SECURITY						\
-}
-
-
 /* Attach to the init_task data structure for proper alignment */
+#ifdef CONFIG_ARCH_TASK_STRUCT_ON_STACK
 #define __init_task_data __attribute__((__section__(".data..init_task")))
+#else
+#define __init_task_data /**/
+#endif
 
+/* Attach to the thread_info data structure for proper alignment */
+#define __init_thread_info __attribute__((__section__(".data..init_thread_info")))
 
 #endif
diff --git a/include/linux/irq_work.h b/include/linux/irq_work.h
index 0e81035..b11fcdf 100644
--- a/include/linux/irq_work.h
+++ b/include/linux/irq_work.h
@@ -13,10 +13,13 @@
  * busy      NULL, 2 -> {free, claimed} : callback in progress, can be claimed
  */
 
-#define IRQ_WORK_PENDING	1UL
-#define IRQ_WORK_BUSY		2UL
-#define IRQ_WORK_FLAGS		3UL
-#define IRQ_WORK_LAZY		4UL /* Doesn't want IPI, wait for tick */
+#define IRQ_WORK_PENDING	BIT(0)
+#define IRQ_WORK_BUSY		BIT(1)
+
+/* Doesn't want IPI, wait for tick: */
+#define IRQ_WORK_LAZY		BIT(2)
+
+#define IRQ_WORK_CLAIMED	(IRQ_WORK_PENDING | IRQ_WORK_BUSY)
 
 struct irq_work {
 	unsigned long flags;
diff --git a/include/linux/irqflags.h b/include/linux/irqflags.h
index 46cb57d..9700f00 100644
--- a/include/linux/irqflags.h
+++ b/include/linux/irqflags.h
@@ -27,24 +27,19 @@
 # define trace_hardirq_enter()			\
 do {						\
 	current->hardirq_context++;		\
-	crossrelease_hist_start(XHLOCK_HARD);	\
 } while (0)
 # define trace_hardirq_exit()			\
 do {						\
 	current->hardirq_context--;		\
-	crossrelease_hist_end(XHLOCK_HARD);	\
 } while (0)
 # define lockdep_softirq_enter()		\
 do {						\
 	current->softirq_context++;		\
-	crossrelease_hist_start(XHLOCK_SOFT);	\
 } while (0)
 # define lockdep_softirq_exit()			\
 do {						\
 	current->softirq_context--;		\
-	crossrelease_hist_end(XHLOCK_SOFT);	\
 } while (0)
-# define INIT_TRACE_IRQFLAGS	.softirqs_enabled = 1,
 #else
 # define trace_hardirqs_on()		do { } while (0)
 # define trace_hardirqs_off()		do { } while (0)
@@ -58,7 +53,6 @@ do {						\
 # define trace_hardirq_exit()		do { } while (0)
 # define lockdep_softirq_enter()	do { } while (0)
 # define lockdep_softirq_exit()		do { } while (0)
-# define INIT_TRACE_IRQFLAGS
 #endif
 
 #if defined(CONFIG_IRQSOFF_TRACER) || \
diff --git a/include/linux/iversion.h b/include/linux/iversion.h
new file mode 100644
index 0000000..858463f
--- /dev/null
+++ b/include/linux/iversion.h
@@ -0,0 +1,341 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_IVERSION_H
+#define _LINUX_IVERSION_H
+
+#include <linux/fs.h>
+
+/*
+ * The inode->i_version field:
+ * ---------------------------
+ * The change attribute (i_version) is mandated by NFSv4 and is mostly for
+ * knfsd, but is also used for other purposes (e.g. IMA). The i_version must
+ * appear different to observers if there was a change to the inode's data or
+ * metadata since it was last queried.
+ *
+ * Observers see the i_version as a 64-bit number that never decreases. If it
+ * remains the same since it was last checked, then nothing has changed in the
+ * inode. If it's different then something has changed. Observers cannot infer
+ * anything about the nature or magnitude of the changes from the value, only
+ * that the inode has changed in some fashion.
+ *
+ * Not all filesystems properly implement the i_version counter. Subsystems that
+ * want to use i_version field on an inode should first check whether the
+ * filesystem sets the SB_I_VERSION flag (usually via the IS_I_VERSION macro).
+ *
+ * Those that set SB_I_VERSION will automatically have their i_version counter
+ * incremented on writes to normal files. If the SB_I_VERSION is not set, then
+ * the VFS will not touch it on writes, and the filesystem can use it how it
+ * wishes. Note that the filesystem is always responsible for updating the
+ * i_version on namespace changes in directories (mkdir, rmdir, unlink, etc.).
+ * We consider these sorts of filesystems to have a kernel-managed i_version.
+ *
+ * It may be impractical for filesystems to keep i_version updates atomic with
+ * respect to the changes that cause them.  They should, however, guarantee
+ * that i_version updates are never visible before the changes that caused
+ * them.  Also, i_version updates should never be delayed longer than it takes
+ * the original change to reach disk.
+ *
+ * This implementation uses the low bit in the i_version field as a flag to
+ * track when the value has been queried. If it has not been queried since it
+ * was last incremented, we can skip the increment in most cases.
+ *
+ * In the event that we're updating the ctime, we will usually go ahead and
+ * bump the i_version anyway. Since that has to go to stable storage in some
+ * fashion, we might as well increment it as well.
+ *
+ * With this implementation, the value should always appear to observers to
+ * increase over time if the file has changed. It's recommended to use
+ * inode_cmp_iversion() helper to compare values.
+ *
+ * Note that some filesystems (e.g. NFS and AFS) just use the field to store
+ * a server-provided value (for the most part). For that reason, those
+ * filesystems do not set SB_I_VERSION. These filesystems are considered to
+ * have a self-managed i_version.
+ *
+ * Persistently storing the i_version
+ * ----------------------------------
+ * Queries of the i_version field are not gated on them hitting the backing
+ * store. It's always possible that the host could crash after allowing
+ * a query of the value but before it has made it to disk.
+ *
+ * To mitigate this problem, filesystems should always use
+ * inode_set_iversion_queried when loading an existing inode from disk. This
+ * ensures that the next attempted inode increment will result in the value
+ * changing.
+ *
+ * Storing the value to disk therefore does not count as a query, so those
+ * filesystems should use inode_peek_iversion to grab the value to be stored.
+ * There is no need to flag the value as having been queried in that case.
+ */
+
+/*
+ * We borrow the lowest bit in the i_version to use as a flag to tell whether
+ * it has been queried since we last incremented it. If it has, then we must
+ * increment it on the next change. After that, we can clear the flag and
+ * avoid incrementing it again until it has again been queried.
+ */
+#define I_VERSION_QUERIED_SHIFT	(1)
+#define I_VERSION_QUERIED	(1ULL << (I_VERSION_QUERIED_SHIFT - 1))
+#define I_VERSION_INCREMENT	(1ULL << I_VERSION_QUERIED_SHIFT)
+
+/**
+ * inode_set_iversion_raw - set i_version to the specified raw value
+ * @inode: inode to set
+ * @val: new i_version value to set
+ *
+ * Set @inode's i_version field to @val. This function is for use by
+ * filesystems that self-manage the i_version.
+ *
+ * For example, the NFS client stores its NFSv4 change attribute in this way,
+ * and the AFS client stores the data_version from the server here.
+ */
+static inline void
+inode_set_iversion_raw(struct inode *inode, u64 val)
+{
+	atomic64_set(&inode->i_version, val);
+}
+
+/**
+ * inode_peek_iversion_raw - grab a "raw" iversion value
+ * @inode: inode from which i_version should be read
+ *
+ * Grab a "raw" inode->i_version value and return it. The i_version is not
+ * flagged or converted in any way. This is mostly used to access a self-managed
+ * i_version.
+ *
+ * With those filesystems, we want to treat the i_version as an entirely
+ * opaque value.
+ */
+static inline u64
+inode_peek_iversion_raw(const struct inode *inode)
+{
+	return atomic64_read(&inode->i_version);
+}
+
+/**
+ * inode_set_iversion - set i_version to a particular value
+ * @inode: inode to set
+ * @val: new i_version value to set
+ *
+ * Set @inode's i_version field to @val. This function is for filesystems with
+ * a kernel-managed i_version, for initializing a newly-created inode from
+ * scratch.
+ *
+ * In this case, we do not set the QUERIED flag since we know that this value
+ * has never been queried.
+ */
+static inline void
+inode_set_iversion(struct inode *inode, u64 val)
+{
+	inode_set_iversion_raw(inode, val << I_VERSION_QUERIED_SHIFT);
+}
+
+/**
+ * inode_set_iversion_queried - set i_version to a particular value as quereied
+ * @inode: inode to set
+ * @val: new i_version value to set
+ *
+ * Set @inode's i_version field to @val, and flag it for increment on the next
+ * change.
+ *
+ * Filesystems that persistently store the i_version on disk should use this
+ * when loading an existing inode from disk.
+ *
+ * When loading in an i_version value from a backing store, we can't be certain
+ * that it wasn't previously viewed before being stored. Thus, we must assume
+ * that it was, to ensure that we don't end up handing out the same value for
+ * different versions of the same inode.
+ */
+static inline void
+inode_set_iversion_queried(struct inode *inode, u64 val)
+{
+	inode_set_iversion_raw(inode, (val << I_VERSION_QUERIED_SHIFT) |
+				I_VERSION_QUERIED);
+}
+
+/**
+ * inode_maybe_inc_iversion - increments i_version
+ * @inode: inode with the i_version that should be updated
+ * @force: increment the counter even if it's not necessary?
+ *
+ * Every time the inode is modified, the i_version field must be seen to have
+ * changed by any observer.
+ *
+ * If "force" is set or the QUERIED flag is set, then ensure that we increment
+ * the value, and clear the queried flag.
+ *
+ * In the common case where neither is set, then we can return "false" without
+ * updating i_version.
+ *
+ * If this function returns false, and no other metadata has changed, then we
+ * can avoid logging the metadata.
+ */
+static inline bool
+inode_maybe_inc_iversion(struct inode *inode, bool force)
+{
+	u64 cur, old, new;
+
+	/*
+	 * The i_version field is not strictly ordered with any other inode
+	 * information, but the legacy inode_inc_iversion code used a spinlock
+	 * to serialize increments.
+	 *
+	 * Here, we add full memory barriers to ensure that any de-facto
+	 * ordering with other info is preserved.
+	 *
+	 * This barrier pairs with the barrier in inode_query_iversion()
+	 */
+	smp_mb();
+	cur = inode_peek_iversion_raw(inode);
+	for (;;) {
+		/* If flag is clear then we needn't do anything */
+		if (!force && !(cur & I_VERSION_QUERIED))
+			return false;
+
+		/* Since lowest bit is flag, add 2 to avoid it */
+		new = (cur & ~I_VERSION_QUERIED) + I_VERSION_INCREMENT;
+
+		old = atomic64_cmpxchg(&inode->i_version, cur, new);
+		if (likely(old == cur))
+			break;
+		cur = old;
+	}
+	return true;
+}
+
+
+/**
+ * inode_inc_iversion - forcibly increment i_version
+ * @inode: inode that needs to be updated
+ *
+ * Forcbily increment the i_version field. This always results in a change to
+ * the observable value.
+ */
+static inline void
+inode_inc_iversion(struct inode *inode)
+{
+	inode_maybe_inc_iversion(inode, true);
+}
+
+/**
+ * inode_iversion_need_inc - is the i_version in need of being incremented?
+ * @inode: inode to check
+ *
+ * Returns whether the inode->i_version counter needs incrementing on the next
+ * change. Just fetch the value and check the QUERIED flag.
+ */
+static inline bool
+inode_iversion_need_inc(struct inode *inode)
+{
+	return inode_peek_iversion_raw(inode) & I_VERSION_QUERIED;
+}
+
+/**
+ * inode_inc_iversion_raw - forcibly increment raw i_version
+ * @inode: inode that needs to be updated
+ *
+ * Forcbily increment the raw i_version field. This always results in a change
+ * to the raw value.
+ *
+ * NFS will use the i_version field to store the value from the server. It
+ * mostly treats it as opaque, but in the case where it holds a write
+ * delegation, it must increment the value itself. This function does that.
+ */
+static inline void
+inode_inc_iversion_raw(struct inode *inode)
+{
+	atomic64_inc(&inode->i_version);
+}
+
+/**
+ * inode_peek_iversion - read i_version without flagging it to be incremented
+ * @inode: inode from which i_version should be read
+ *
+ * Read the inode i_version counter for an inode without registering it as a
+ * query.
+ *
+ * This is typically used by local filesystems that need to store an i_version
+ * on disk. In that situation, it's not necessary to flag it as having been
+ * viewed, as the result won't be used to gauge changes from that point.
+ */
+static inline u64
+inode_peek_iversion(const struct inode *inode)
+{
+	return inode_peek_iversion_raw(inode) >> I_VERSION_QUERIED_SHIFT;
+}
+
+/**
+ * inode_query_iversion - read i_version for later use
+ * @inode: inode from which i_version should be read
+ *
+ * Read the inode i_version counter. This should be used by callers that wish
+ * to store the returned i_version for later comparison. This will guarantee
+ * that a later query of the i_version will result in a different value if
+ * anything has changed.
+ *
+ * In this implementation, we fetch the current value, set the QUERIED flag and
+ * then try to swap it into place with a cmpxchg, if it wasn't already set. If
+ * that fails, we try again with the newly fetched value from the cmpxchg.
+ */
+static inline u64
+inode_query_iversion(struct inode *inode)
+{
+	u64 cur, old, new;
+
+	cur = inode_peek_iversion_raw(inode);
+	for (;;) {
+		/* If flag is already set, then no need to swap */
+		if (cur & I_VERSION_QUERIED) {
+			/*
+			 * This barrier (and the implicit barrier in the
+			 * cmpxchg below) pairs with the barrier in
+			 * inode_maybe_inc_iversion().
+			 */
+			smp_mb();
+			break;
+		}
+
+		new = cur | I_VERSION_QUERIED;
+		old = atomic64_cmpxchg(&inode->i_version, cur, new);
+		if (likely(old == cur))
+			break;
+		cur = old;
+	}
+	return cur >> I_VERSION_QUERIED_SHIFT;
+}
+
+/**
+ * inode_cmp_iversion_raw - check whether the raw i_version counter has changed
+ * @inode: inode to check
+ * @old: old value to check against its i_version
+ *
+ * Compare the current raw i_version counter with a previous one. Returns 0 if
+ * they are the same or non-zero if they are different.
+ */
+static inline s64
+inode_cmp_iversion_raw(const struct inode *inode, u64 old)
+{
+	return (s64)inode_peek_iversion_raw(inode) - (s64)old;
+}
+
+/**
+ * inode_cmp_iversion - check whether the i_version counter has changed
+ * @inode: inode to check
+ * @old: old value to check against its i_version
+ *
+ * Compare an i_version counter with a previous one. Returns 0 if they are
+ * the same, a positive value if the one in the inode appears newer than @old,
+ * and a negative value if @old appears to be newer than the one in the
+ * inode.
+ *
+ * Note that we don't need to set the QUERIED flag in this case, as the value
+ * in the inode is not being recorded for later use.
+ */
+
+static inline s64
+inode_cmp_iversion(const struct inode *inode, u64 old)
+{
+	return (s64)(inode_peek_iversion_raw(inode) & ~I_VERSION_QUERIED) -
+	       (s64)(old << I_VERSION_QUERIED_SHIFT);
+}
+#endif
diff --git a/include/linux/jump_label.h b/include/linux/jump_label.h
index c7b368c..e0340ca 100644
--- a/include/linux/jump_label.h
+++ b/include/linux/jump_label.h
@@ -160,6 +160,8 @@ extern void arch_jump_label_transform_static(struct jump_entry *entry,
 extern int jump_label_text_reserved(void *start, void *end);
 extern void static_key_slow_inc(struct static_key *key);
 extern void static_key_slow_dec(struct static_key *key);
+extern void static_key_slow_inc_cpuslocked(struct static_key *key);
+extern void static_key_slow_dec_cpuslocked(struct static_key *key);
 extern void jump_label_apply_nops(struct module *mod);
 extern int static_key_count(struct static_key *key);
 extern void static_key_enable(struct static_key *key);
@@ -222,6 +224,9 @@ static inline void static_key_slow_dec(struct static_key *key)
 	atomic_dec(&key->enabled);
 }
 
+#define static_key_slow_inc_cpuslocked(key) static_key_slow_inc(key)
+#define static_key_slow_dec_cpuslocked(key) static_key_slow_dec(key)
+
 static inline int jump_label_text_reserved(void *start, void *end)
 {
 	return 0;
@@ -416,6 +421,8 @@ extern bool ____wrong_branch_error(void);
 
 #define static_branch_inc(x)		static_key_slow_inc(&(x)->key)
 #define static_branch_dec(x)		static_key_slow_dec(&(x)->key)
+#define static_branch_inc_cpuslocked(x)	static_key_slow_inc_cpuslocked(&(x)->key)
+#define static_branch_dec_cpuslocked(x)	static_key_slow_dec_cpuslocked(&(x)->key)
 
 /*
  * Normal usage; boolean enable/disable.
diff --git a/include/linux/lightnvm.h b/include/linux/lightnvm.h
index 2d1d9de..7f4b60a 100644
--- a/include/linux/lightnvm.h
+++ b/include/linux/lightnvm.h
@@ -50,10 +50,7 @@ struct nvm_id;
 struct nvm_dev;
 struct nvm_tgt_dev;
 
-typedef int (nvm_l2p_update_fn)(u64, u32, __le64 *, void *);
 typedef int (nvm_id_fn)(struct nvm_dev *, struct nvm_id *);
-typedef int (nvm_get_l2p_tbl_fn)(struct nvm_dev *, u64, u32,
-				nvm_l2p_update_fn *, void *);
 typedef int (nvm_op_bb_tbl_fn)(struct nvm_dev *, struct ppa_addr, u8 *);
 typedef int (nvm_op_set_bb_fn)(struct nvm_dev *, struct ppa_addr *, int, int);
 typedef int (nvm_submit_io_fn)(struct nvm_dev *, struct nvm_rq *);
@@ -66,7 +63,6 @@ typedef void (nvm_dev_dma_free_fn)(void *, void*, dma_addr_t);
 
 struct nvm_dev_ops {
 	nvm_id_fn		*identity;
-	nvm_get_l2p_tbl_fn	*get_l2p_tbl;
 	nvm_op_bb_tbl_fn	*get_bb_tbl;
 	nvm_op_set_bb_fn	*set_bb_tbl;
 
@@ -112,8 +108,6 @@ enum {
 	NVM_RSP_WARN_HIGHECC	= 0x4700,
 
 	/* Device opcodes */
-	NVM_OP_HBREAD		= 0x02,
-	NVM_OP_HBWRITE		= 0x81,
 	NVM_OP_PWRITE		= 0x91,
 	NVM_OP_PREAD		= 0x92,
 	NVM_OP_ERASE		= 0x90,
@@ -165,12 +159,16 @@ struct nvm_id_group {
 	u8	fmtype;
 	u8	num_ch;
 	u8	num_lun;
-	u8	num_pln;
-	u16	num_blk;
-	u16	num_pg;
-	u16	fpg_sz;
+	u16	num_chk;
+	u16	clba;
 	u16	csecs;
 	u16	sos;
+
+	u16	ws_min;
+	u16	ws_opt;
+	u16	ws_seq;
+	u16	ws_per_chk;
+
 	u32	trdt;
 	u32	trdm;
 	u32	tprt;
@@ -181,7 +179,10 @@ struct nvm_id_group {
 	u32	mccap;
 	u16	cpar;
 
-	struct nvm_id_lp_tbl lptbl;
+	/* 1.2 compatibility */
+	u8	num_pln;
+	u16	num_pg;
+	u16	fpg_sz;
 };
 
 struct nvm_addr_format {
@@ -217,6 +218,10 @@ struct nvm_target {
 
 #define ADDR_EMPTY (~0ULL)
 
+#define NVM_TARGET_DEFAULT_OP (101)
+#define NVM_TARGET_MIN_OP (3)
+#define NVM_TARGET_MAX_OP (80)
+
 #define NVM_VERSION_MAJOR 1
 #define NVM_VERSION_MINOR 0
 #define NVM_VERSION_PATCH 0
@@ -239,7 +244,6 @@ struct nvm_rq {
 	void *meta_list;
 	dma_addr_t dma_meta_list;
 
-	struct completion *wait;
 	nvm_end_io_fn *end_io;
 
 	uint8_t opcode;
@@ -268,31 +272,38 @@ enum {
 	NVM_BLK_ST_BAD =	0x8,	/* Bad block */
 };
 
+
 /* Device generic information */
 struct nvm_geo {
+	/* generic geometry */
 	int nr_chnls;
-	int nr_luns;
-	int luns_per_chnl; /* -1 if channels are not symmetric */
-	int nr_planes;
-	int sec_per_pg; /* only sectors for a single page */
-	int pgs_per_blk;
-	int blks_per_lun;
-	int fpg_size;
-	int pfpg_size; /* size of buffer if all pages are to be read */
+	int all_luns; /* across channels */
+	int nr_luns; /* per channel */
+	int nr_chks; /* per lun */
+
 	int sec_size;
 	int oob_size;
 	int mccap;
+
+	int sec_per_chk;
+	int sec_per_lun;
+
+	int ws_min;
+	int ws_opt;
+	int ws_seq;
+	int ws_per_chk;
+
+	int max_rq_size;
+
+	int op;
+
 	struct nvm_addr_format ppaf;
 
-	/* Calculated/Cached values. These do not reflect the actual usable
-	 * blocks at run-time.
-	 */
-	int max_rq_size;
+	/* Legacy 1.2 specific geometry */
 	int plane_mode; /* drive device in single, double or quad mode */
-
+	int nr_planes;
+	int sec_per_pg; /* only sectors for a single page */
 	int sec_per_pl; /* all sectors across planes */
-	int sec_per_blk;
-	int sec_per_lun;
 };
 
 /* sub-device structure */
@@ -320,10 +331,6 @@ struct nvm_dev {
 	/* Device information */
 	struct nvm_geo geo;
 
-	  /* lower page table */
-	int lps_per_blk;
-	int *lptbl;
-
 	unsigned long total_secs;
 
 	unsigned long *lun_map;
@@ -346,36 +353,6 @@ struct nvm_dev {
 	struct list_head targets;
 };
 
-static inline struct ppa_addr linear_to_generic_addr(struct nvm_geo *geo,
-						     u64 pba)
-{
-	struct ppa_addr l;
-	int secs, pgs, blks, luns;
-	sector_t ppa = pba;
-
-	l.ppa = 0;
-
-	div_u64_rem(ppa, geo->sec_per_pg, &secs);
-	l.g.sec = secs;
-
-	sector_div(ppa, geo->sec_per_pg);
-	div_u64_rem(ppa, geo->pgs_per_blk, &pgs);
-	l.g.pg = pgs;
-
-	sector_div(ppa, geo->pgs_per_blk);
-	div_u64_rem(ppa, geo->blks_per_lun, &blks);
-	l.g.blk = blks;
-
-	sector_div(ppa, geo->blks_per_lun);
-	div_u64_rem(ppa, geo->luns_per_chnl, &luns);
-	l.g.lun = luns;
-
-	sector_div(ppa, geo->luns_per_chnl);
-	l.g.ch = ppa;
-
-	return l;
-}
-
 static inline struct ppa_addr generic_to_dev_addr(struct nvm_tgt_dev *tgt_dev,
 						  struct ppa_addr r)
 {
@@ -418,25 +395,6 @@ static inline struct ppa_addr dev_to_generic_addr(struct nvm_tgt_dev *tgt_dev,
 	return l;
 }
 
-static inline int ppa_empty(struct ppa_addr ppa_addr)
-{
-	return (ppa_addr.ppa == ADDR_EMPTY);
-}
-
-static inline void ppa_set_empty(struct ppa_addr *ppa_addr)
-{
-	ppa_addr->ppa = ADDR_EMPTY;
-}
-
-static inline int ppa_cmp_blk(struct ppa_addr ppa1, struct ppa_addr ppa2)
-{
-	if (ppa_empty(ppa1) || ppa_empty(ppa2))
-		return 0;
-
-	return ((ppa1.g.ch == ppa2.g.ch) && (ppa1.g.lun == ppa2.g.lun) &&
-					(ppa1.g.blk == ppa2.g.blk));
-}
-
 typedef blk_qc_t (nvm_tgt_make_rq_fn)(struct request_queue *, struct bio *);
 typedef sector_t (nvm_tgt_capacity_fn)(void *);
 typedef void *(nvm_tgt_init_fn)(struct nvm_tgt_dev *, struct gendisk *,
@@ -481,17 +439,10 @@ extern int nvm_set_tgt_bb_tbl(struct nvm_tgt_dev *, struct ppa_addr *,
 extern int nvm_max_phys_sects(struct nvm_tgt_dev *);
 extern int nvm_submit_io(struct nvm_tgt_dev *, struct nvm_rq *);
 extern int nvm_submit_io_sync(struct nvm_tgt_dev *, struct nvm_rq *);
-extern int nvm_erase_sync(struct nvm_tgt_dev *, struct ppa_addr *, int);
-extern int nvm_get_l2p_tbl(struct nvm_tgt_dev *, u64, u32, nvm_l2p_update_fn *,
-			   void *);
-extern int nvm_get_area(struct nvm_tgt_dev *, sector_t *, sector_t);
-extern void nvm_put_area(struct nvm_tgt_dev *, sector_t);
 extern void nvm_end_io(struct nvm_rq *);
 extern int nvm_bb_tbl_fold(struct nvm_dev *, u8 *, int);
 extern int nvm_get_tgt_bb_tbl(struct nvm_tgt_dev *, struct ppa_addr, u8 *);
 
-extern void nvm_part_to_tgt(struct nvm_dev *, sector_t *, int);
-
 #else /* CONFIG_NVM */
 struct nvm_dev_ops;
 
diff --git a/include/linux/lockdep.h b/include/linux/lockdep.h
index 2e75dc34b..6fc77d4 100644
--- a/include/linux/lockdep.h
+++ b/include/linux/lockdep.h
@@ -337,9 +337,9 @@ extern void lock_release(struct lockdep_map *lock, int nested,
 /*
  * Same "read" as for lock_acquire(), except -1 means any.
  */
-extern int lock_is_held_type(struct lockdep_map *lock, int read);
+extern int lock_is_held_type(const struct lockdep_map *lock, int read);
 
-static inline int lock_is_held(struct lockdep_map *lock)
+static inline int lock_is_held(const struct lockdep_map *lock)
 {
 	return lock_is_held_type(lock, -1);
 }
@@ -367,8 +367,6 @@ extern struct pin_cookie lock_pin_lock(struct lockdep_map *lock);
 extern void lock_repin_lock(struct lockdep_map *lock, struct pin_cookie);
 extern void lock_unpin_lock(struct lockdep_map *lock, struct pin_cookie);
 
-# define INIT_LOCKDEP				.lockdep_recursion = 0,
-
 #define lockdep_depth(tsk)	(debug_locks ? (tsk)->lockdep_depth : 0)
 
 #define lockdep_assert_held(l)	do {				\
@@ -426,7 +424,6 @@ static inline void lockdep_on(void)
  * #ifdef the call himself.
  */
 
-# define INIT_LOCKDEP
 # define lockdep_reset()		do { debug_locks = 1; } while (0)
 # define lockdep_free_key_range(start, size)	do { } while (0)
 # define lockdep_sys_exit() 			do { } while (0)
@@ -475,8 +472,6 @@ enum xhlock_context_t {
 #define STATIC_LOCKDEP_MAP_INIT(_name, _key) \
 	{ .name = (_name), .key = (void *)(_key), }
 
-static inline void crossrelease_hist_start(enum xhlock_context_t c) {}
-static inline void crossrelease_hist_end(enum xhlock_context_t c) {}
 static inline void lockdep_invariant_state(bool force) {}
 static inline void lockdep_init_task(struct task_struct *task) {}
 static inline void lockdep_free_task(struct task_struct *task) {}
diff --git a/include/linux/mfd/axp20x.h b/include/linux/mfd/axp20x.h
index 78dc853..080798f 100644
--- a/include/linux/mfd/axp20x.h
+++ b/include/linux/mfd/axp20x.h
@@ -645,11 +645,6 @@ struct axp20x_dev {
 	const struct regmap_irq_chip	*regmap_irq_chip;
 };
 
-struct axp288_extcon_pdata {
-	/* GPIO pin control to switch D+/D- lines b/w PMIC and SOC */
-	struct gpio_desc *gpio_mux_cntl;
-};
-
 /* generic helper function for reading 9-16 bit wide regs */
 static inline int axp20x_read_variable_width(struct regmap *regmap,
 	unsigned int reg, unsigned int width)
diff --git a/include/linux/mfd/cros_ec.h b/include/linux/mfd/cros_ec.h
index 4e887ba..c615359 100644
--- a/include/linux/mfd/cros_ec.h
+++ b/include/linux/mfd/cros_ec.h
@@ -322,6 +322,10 @@ extern struct attribute_group cros_ec_attr_group;
 extern struct attribute_group cros_ec_lightbar_attr_group;
 extern struct attribute_group cros_ec_vbc_attr_group;
 
+/* debugfs stuff */
+int cros_ec_debugfs_init(struct cros_ec_dev *ec);
+void cros_ec_debugfs_remove(struct cros_ec_dev *ec);
+
 /* ACPI GPE handler */
 #ifdef CONFIG_ACPI
 
diff --git a/include/linux/mfd/cros_ec_commands.h b/include/linux/mfd/cros_ec_commands.h
index 2b16e95..a83f649 100644
--- a/include/linux/mfd/cros_ec_commands.h
+++ b/include/linux/mfd/cros_ec_commands.h
@@ -2904,16 +2904,33 @@ enum usb_pd_control_mux {
 	USB_PD_CTRL_MUX_AUTO = 5,
 };
 
+enum usb_pd_control_swap {
+	USB_PD_CTRL_SWAP_NONE = 0,
+	USB_PD_CTRL_SWAP_DATA = 1,
+	USB_PD_CTRL_SWAP_POWER = 2,
+	USB_PD_CTRL_SWAP_VCONN = 3,
+	USB_PD_CTRL_SWAP_COUNT
+};
+
 struct ec_params_usb_pd_control {
 	uint8_t port;
 	uint8_t role;
 	uint8_t mux;
+	uint8_t swap;
 } __packed;
 
 #define PD_CTRL_RESP_ENABLED_COMMS      (1 << 0) /* Communication enabled */
 #define PD_CTRL_RESP_ENABLED_CONNECTED  (1 << 1) /* Device connected */
 #define PD_CTRL_RESP_ENABLED_PD_CAPABLE (1 << 2) /* Partner is PD capable */
 
+#define PD_CTRL_RESP_ROLE_POWER         BIT(0) /* 0=SNK/1=SRC */
+#define PD_CTRL_RESP_ROLE_DATA          BIT(1) /* 0=UFP/1=DFP */
+#define PD_CTRL_RESP_ROLE_VCONN         BIT(2) /* Vconn status */
+#define PD_CTRL_RESP_ROLE_DR_POWER      BIT(3) /* Partner is dualrole power */
+#define PD_CTRL_RESP_ROLE_DR_DATA       BIT(4) /* Partner is dualrole data */
+#define PD_CTRL_RESP_ROLE_USB_COMM      BIT(5) /* Partner USB comm capable */
+#define PD_CTRL_RESP_ROLE_EXT_POWERED   BIT(6) /* Partner externally powerd */
+
 struct ec_response_usb_pd_control_v1 {
 	uint8_t enabled;
 	uint8_t role;
diff --git a/include/linux/mfd/palmas.h b/include/linux/mfd/palmas.h
index 3c8568a..75e5c8f 100644
--- a/include/linux/mfd/palmas.h
+++ b/include/linux/mfd/palmas.h
@@ -3733,6 +3733,9 @@ enum usb_irq_events {
 #define TPS65917_REGEN3_CTRL_MODE_ACTIVE			0x01
 #define TPS65917_REGEN3_CTRL_MODE_ACTIVE_SHIFT			0x00
 
+/* POWERHOLD Mask field for PRIMARY_SECONDARY_PAD2 register */
+#define TPS65917_PRIMARY_SECONDARY_PAD2_GPIO_5_MASK		0xC
+
 /* Registers for function RESOURCE */
 #define TPS65917_REGEN1_CTRL					0x2
 #define TPS65917_PLLEN_CTRL					0x3
diff --git a/include/linux/mfd/rave-sp.h b/include/linux/mfd/rave-sp.h
new file mode 100644
index 0000000..796fb979
--- /dev/null
+++ b/include/linux/mfd/rave-sp.h
@@ -0,0 +1,60 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+/*
+ * Core definitions for RAVE SP MFD driver.
+ *
+ * Copyright (C) 2017 Zodiac Inflight Innovations
+ */
+
+#ifndef _LINUX_RAVE_SP_H_
+#define _LINUX_RAVE_SP_H_
+
+#include <linux/notifier.h>
+
+enum rave_sp_command {
+	RAVE_SP_CMD_GET_FIRMWARE_VERSION	= 0x20,
+	RAVE_SP_CMD_GET_BOOTLOADER_VERSION	= 0x21,
+	RAVE_SP_CMD_BOOT_SOURCE			= 0x26,
+	RAVE_SP_CMD_GET_BOARD_COPPER_REV	= 0x2B,
+	RAVE_SP_CMD_GET_GPIO_STATE		= 0x2F,
+
+	RAVE_SP_CMD_STATUS			= 0xA0,
+	RAVE_SP_CMD_SW_WDT			= 0xA1,
+	RAVE_SP_CMD_PET_WDT			= 0xA2,
+	RAVE_SP_CMD_RESET			= 0xA7,
+	RAVE_SP_CMD_RESET_REASON		= 0xA8,
+
+	RAVE_SP_CMD_REQ_COPPER_REV		= 0xB6,
+	RAVE_SP_CMD_GET_I2C_DEVICE_STATUS	= 0xBA,
+	RAVE_SP_CMD_GET_SP_SILICON_REV		= 0xB9,
+	RAVE_SP_CMD_CONTROL_EVENTS		= 0xBB,
+
+	RAVE_SP_EVNT_BASE			= 0xE0,
+};
+
+struct rave_sp;
+
+static inline unsigned long rave_sp_action_pack(u8 event, u8 value)
+{
+	return ((unsigned long)value << 8) | event;
+}
+
+static inline u8 rave_sp_action_unpack_event(unsigned long action)
+{
+	return action;
+}
+
+static inline u8 rave_sp_action_unpack_value(unsigned long action)
+{
+	return action >> 8;
+}
+
+int rave_sp_exec(struct rave_sp *sp,
+		 void *__data,  size_t data_size,
+		 void *reply_data, size_t reply_data_size);
+
+struct device;
+int devm_rave_sp_register_event_notifier(struct device *dev,
+					 struct notifier_block *nb);
+
+#endif /* _LINUX_RAVE_SP_H_ */
diff --git a/include/linux/mfd/stm32-lptimer.h b/include/linux/mfd/stm32-lptimer.h
index 77c7cf4..605f622 100644
--- a/include/linux/mfd/stm32-lptimer.h
+++ b/include/linux/mfd/stm32-lptimer.h
@@ -1,13 +1,9 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * STM32 Low-Power Timer parent driver.
- *
  * Copyright (C) STMicroelectronics 2017
- *
  * Author: Fabrice Gasnier <fabrice.gasnier@st.com>
- *
  * Inspired by Benjamin Gaignard's stm32-timers driver
- *
- * License terms:  GNU General Public License (GPL), version 2
  */
 
 #ifndef _LINUX_STM32_LPTIMER_H_
diff --git a/include/linux/mfd/stm32-timers.h b/include/linux/mfd/stm32-timers.h
index ce7346e..2aadab6 100644
--- a/include/linux/mfd/stm32-timers.h
+++ b/include/linux/mfd/stm32-timers.h
@@ -1,9 +1,7 @@
+/* SPDX-License-Identifier: GPL-2.0 */
 /*
  * Copyright (C) STMicroelectronics 2016
- *
  * Author: Benjamin Gaignard <benjamin.gaignard@st.com>
- *
- * License terms:  GNU General Public License (GPL), version 2
  */
 
 #ifndef _LINUX_STM32_GPTIMER_H_
diff --git a/include/linux/mfd/tmio.h b/include/linux/mfd/tmio.h
index e1cfe91..396a103c 100644
--- a/include/linux/mfd/tmio.h
+++ b/include/linux/mfd/tmio.h
@@ -25,26 +25,6 @@
 		writew((val) >> 16, (addr) + 2); \
 	} while (0)
 
-#define CNF_CMD     0x04
-#define CNF_CTL_BASE   0x10
-#define CNF_INT_PIN  0x3d
-#define CNF_STOP_CLK_CTL 0x40
-#define CNF_GCLK_CTL 0x41
-#define CNF_SD_CLK_MODE 0x42
-#define CNF_PIN_STATUS 0x44
-#define CNF_PWR_CTL_1 0x48
-#define CNF_PWR_CTL_2 0x49
-#define CNF_PWR_CTL_3 0x4a
-#define CNF_CARD_DETECT_MODE 0x4c
-#define CNF_SD_SLOT 0x50
-#define CNF_EXT_GCLK_CTL_1 0xf0
-#define CNF_EXT_GCLK_CTL_2 0xf1
-#define CNF_EXT_GCLK_CTL_3 0xf9
-#define CNF_SD_LED_EN_1 0xfa
-#define CNF_SD_LED_EN_2 0xfe
-
-#define   SDCREN 0x2   /* Enable access to MMC CTL regs. (flag in COMMAND_REG)*/
-
 #define sd_config_write8(base, shift, reg, val) \
 	tmio_iowrite8((val), (base) + ((reg) << (shift)))
 #define sd_config_write16(base, shift, reg, val) \
diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
index 1f509d0..a061042 100644
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -36,6 +36,7 @@
 #include <linux/kernel.h>
 #include <linux/completion.h>
 #include <linux/pci.h>
+#include <linux/irq.h>
 #include <linux/spinlock_types.h>
 #include <linux/semaphore.h>
 #include <linux/slab.h>
@@ -1231,7 +1232,23 @@ enum {
 static inline const struct cpumask *
 mlx5_get_vector_affinity(struct mlx5_core_dev *dev, int vector)
 {
-	return pci_irq_get_affinity(dev->pdev, MLX5_EQ_VEC_COMP_BASE + vector);
+	const struct cpumask *mask;
+	struct irq_desc *desc;
+	unsigned int irq;
+	int eqn;
+	int err;
+
+	err = mlx5_vector2eqn(dev, vector, &eqn, &irq);
+	if (err)
+		return NULL;
+
+	desc = irq_to_desc(irq);
+#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK
+	mask = irq_data_get_effective_affinity_mask(&desc->irq_data);
+#else
+	mask = desc->irq_common_data.affinity;
+#endif
+	return mask;
 }
 
 #endif /* MLX5_DRIVER_H */
diff --git a/include/linux/mlx5/mlx5_ifc.h b/include/linux/mlx5/mlx5_ifc.h
index d44ec5f..1391a82 100644
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1027,8 +1027,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
 	u8         log_max_wq_sz[0x5];
 
 	u8         nic_vport_change_event[0x1];
-	u8         disable_local_lb[0x1];
-	u8         reserved_at_3e2[0x9];
+	u8         disable_local_lb_uc[0x1];
+	u8         disable_local_lb_mc[0x1];
+	u8         reserved_at_3e3[0x8];
 	u8         log_max_vlan_list[0x5];
 	u8         reserved_at_3f0[0x3];
 	u8         log_max_current_mc_list[0x5];
diff --git a/include/linux/mmc/host.h b/include/linux/mmc/host.h
index e7743ec..8514623 100644
--- a/include/linux/mmc/host.h
+++ b/include/linux/mmc/host.h
@@ -324,6 +324,7 @@ struct mmc_host {
 #define MMC_CAP_DRIVER_TYPE_A	(1 << 23)	/* Host supports Driver Type A */
 #define MMC_CAP_DRIVER_TYPE_C	(1 << 24)	/* Host supports Driver Type C */
 #define MMC_CAP_DRIVER_TYPE_D	(1 << 25)	/* Host supports Driver Type D */
+#define MMC_CAP_DONE_COMPLETE	(1 << 27)	/* RW reqs can be completed within mmc_request_done() */
 #define MMC_CAP_CD_WAKE		(1 << 28)	/* Enable card detect wake */
 #define MMC_CAP_CMD_DURING_TFR	(1 << 29)	/* Commands during data transfer */
 #define MMC_CAP_CMD23		(1 << 30)	/* CMD23 supported. */
@@ -380,6 +381,7 @@ struct mmc_host {
 	unsigned int		doing_retune:1;	/* re-tuning in progress */
 	unsigned int		retune_now:1;	/* do re-tuning at next req */
 	unsigned int		retune_paused:1; /* re-tuning is temporarily disabled */
+	unsigned int		use_blk_mq:1;	/* use blk-mq */
 
 	int			rescan_disable;	/* disable card detection */
 	int			rescan_entered;	/* used with nonremovable devices */
@@ -422,9 +424,6 @@ struct mmc_host {
 
 	struct dentry		*debugfs_root;
 
-	struct mmc_async_req	*areq;		/* active async req */
-	struct mmc_context_info	context_info;	/* async synchronization info */
-
 	/* Ongoing data transfer that allows commands during transfer */
 	struct mmc_request	*ongoing_mrq;
 
diff --git a/include/linux/mmc/slot-gpio.h b/include/linux/mmc/slot-gpio.h
index 82f0d28..91f1ba0 100644
--- a/include/linux/mmc/slot-gpio.h
+++ b/include/linux/mmc/slot-gpio.h
@@ -33,5 +33,6 @@ void mmc_gpio_set_cd_isr(struct mmc_host *host,
 			 irqreturn_t (*isr)(int irq, void *dev_id));
 void mmc_gpiod_request_cd_irq(struct mmc_host *host);
 bool mmc_can_gpio_cd(struct mmc_host *host);
+bool mmc_can_gpio_ro(struct mmc_host *host);
 
 #endif
diff --git a/include/linux/module.h b/include/linux/module.h
index c69b49a..1d8f245 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -801,6 +801,15 @@ static inline void module_bug_finalize(const Elf_Ehdr *hdr,
 static inline void module_bug_cleanup(struct module *mod) {}
 #endif	/* CONFIG_GENERIC_BUG */
 
+#ifdef RETPOLINE
+extern bool retpoline_module_ok(bool has_retpoline);
+#else
+static inline bool retpoline_module_ok(bool has_retpoline)
+{
+	return true;
+}
+#endif
+
 #ifdef CONFIG_MODULE_SIG
 static inline bool module_sig_ok(struct module *module)
 {
diff --git a/include/linux/mtd/map.h b/include/linux/mtd/map.h
index 3aa56e3..b5b43f9 100644
--- a/include/linux/mtd/map.h
+++ b/include/linux/mtd/map.h
@@ -270,75 +270,67 @@ void map_destroy(struct mtd_info *mtd);
 #define INVALIDATE_CACHED_RANGE(map, from, size) \
 	do { if (map->inval_cache) map->inval_cache(map, from, size); } while (0)
 
+#define map_word_equal(map, val1, val2)					\
+({									\
+	int i, ret = 1;							\
+	for (i = 0; i < map_words(map); i++)				\
+		if ((val1).x[i] != (val2).x[i]) {			\
+			ret = 0;					\
+			break;						\
+		}							\
+	ret;								\
+})
 
-static inline int map_word_equal(struct map_info *map, map_word val1, map_word val2)
-{
-	int i;
+#define map_word_and(map, val1, val2)					\
+({									\
+	map_word r;							\
+	int i;								\
+	for (i = 0; i < map_words(map); i++)				\
+		r.x[i] = (val1).x[i] & (val2).x[i];			\
+	r;								\
+})
 
-	for (i = 0; i < map_words(map); i++) {
-		if (val1.x[i] != val2.x[i])
-			return 0;
-	}
+#define map_word_clr(map, val1, val2)					\
+({									\
+	map_word r;							\
+	int i;								\
+	for (i = 0; i < map_words(map); i++)				\
+		r.x[i] = (val1).x[i] & ~(val2).x[i];			\
+	r;								\
+})
 
-	return 1;
-}
+#define map_word_or(map, val1, val2)					\
+({									\
+	map_word r;							\
+	int i;								\
+	for (i = 0; i < map_words(map); i++)				\
+		r.x[i] = (val1).x[i] | (val2).x[i];			\
+	r;								\
+})
 
-static inline map_word map_word_and(struct map_info *map, map_word val1, map_word val2)
-{
-	map_word r;
-	int i;
+#define map_word_andequal(map, val1, val2, val3)			\
+({									\
+	int i, ret = 1;							\
+	for (i = 0; i < map_words(map); i++) {				\
+		if (((val1).x[i] & (val2).x[i]) != (val2).x[i]) {	\
+			ret = 0;					\
+			break;						\
+		}							\
+	}								\
+	ret;								\
+})
 
-	for (i = 0; i < map_words(map); i++)
-		r.x[i] = val1.x[i] & val2.x[i];
-
-	return r;
-}
-
-static inline map_word map_word_clr(struct map_info *map, map_word val1, map_word val2)
-{
-	map_word r;
-	int i;
-
-	for (i = 0; i < map_words(map); i++)
-		r.x[i] = val1.x[i] & ~val2.x[i];
-
-	return r;
-}
-
-static inline map_word map_word_or(struct map_info *map, map_word val1, map_word val2)
-{
-	map_word r;
-	int i;
-
-	for (i = 0; i < map_words(map); i++)
-		r.x[i] = val1.x[i] | val2.x[i];
-
-	return r;
-}
-
-static inline int map_word_andequal(struct map_info *map, map_word val1, map_word val2, map_word val3)
-{
-	int i;
-
-	for (i = 0; i < map_words(map); i++) {
-		if ((val1.x[i] & val2.x[i]) != val3.x[i])
-			return 0;
-	}
-
-	return 1;
-}
-
-static inline int map_word_bitsset(struct map_info *map, map_word val1, map_word val2)
-{
-	int i;
-
-	for (i = 0; i < map_words(map); i++) {
-		if (val1.x[i] & val2.x[i])
-			return 1;
-	}
-
-	return 0;
-}
+#define map_word_bitsset(map, val1, val2)				\
+({									\
+	int i, ret = 0;							\
+	for (i = 0; i < map_words(map); i++) {				\
+		if ((val1).x[i] & (val2).x[i]) {			\
+			ret = 1;					\
+			break;						\
+		}							\
+	}								\
+	ret;								\
+})
 
 static inline map_word map_word_load(struct map_info *map, const void *ptr)
 {
diff --git a/include/linux/mtd/mtd.h b/include/linux/mtd/mtd.h
index cd55bf1..205eded 100644
--- a/include/linux/mtd/mtd.h
+++ b/include/linux/mtd/mtd.h
@@ -489,6 +489,34 @@ static inline uint32_t mtd_mod_by_eb(uint64_t sz, struct mtd_info *mtd)
 	return do_div(sz, mtd->erasesize);
 }
 
+/**
+ * mtd_align_erase_req - Adjust an erase request to align things on eraseblock
+ *			 boundaries.
+ * @mtd: the MTD device this erase request applies on
+ * @req: the erase request to adjust
+ *
+ * This function will adjust @req->addr and @req->len to align them on
+ * @mtd->erasesize. Of course we expect @mtd->erasesize to be != 0.
+ */
+static inline void mtd_align_erase_req(struct mtd_info *mtd,
+				       struct erase_info *req)
+{
+	u32 mod;
+
+	if (WARN_ON(!mtd->erasesize))
+		return;
+
+	mod = mtd_mod_by_eb(req->addr, mtd);
+	if (mod) {
+		req->addr -= mod;
+		req->len += mod;
+	}
+
+	mod = mtd_mod_by_eb(req->addr + req->len, mtd);
+	if (mod)
+		req->len += mtd->erasesize - mod;
+}
+
 static inline uint32_t mtd_div_by_ws(uint64_t sz, struct mtd_info *mtd)
 {
 	if (mtd->writesize_shift)
diff --git a/include/linux/mtd/rawnand.h b/include/linux/mtd/rawnand.h
index 749bb08..56c5570 100644
--- a/include/linux/mtd/rawnand.h
+++ b/include/linux/mtd/rawnand.h
@@ -133,12 +133,6 @@ enum nand_ecc_algo {
  */
 #define NAND_ECC_GENERIC_ERASED_CHECK	BIT(0)
 #define NAND_ECC_MAXIMIZE		BIT(1)
-/*
- * If your controller already sends the required NAND commands when
- * reading or writing a page, then the framework is not supposed to
- * send READ0 and SEQIN/PAGEPROG respectively.
- */
-#define NAND_ECC_CUSTOM_PAGE_ACCESS	BIT(2)
 
 /* Bit mask for flags passed to do_nand_read_ecc */
 #define NAND_GET_DEVICE		0x80
@@ -191,11 +185,6 @@ enum nand_ecc_algo {
 /* Non chip related options */
 /* This option skips the bbt scan during initialization. */
 #define NAND_SKIP_BBTSCAN	0x00010000
-/*
- * This option is defined if the board driver allocates its own buffers
- * (e.g. because it needs them DMA-coherent).
- */
-#define NAND_OWN_BUFFERS	0x00020000
 /* Chip may not exist, so silence any errors in scan */
 #define NAND_SCAN_SILENT_NODEV	0x00040000
 /*
@@ -525,6 +514,8 @@ static const struct nand_ecc_caps __name = {			\
  * @postpad:	padding information for syndrome based ECC generators
  * @options:	ECC specific options (see NAND_ECC_XXX flags defined above)
  * @priv:	pointer to private ECC control data
+ * @calc_buf:	buffer for calculated ECC, size is oobsize.
+ * @code_buf:	buffer for ECC read from flash, size is oobsize.
  * @hwctl:	function to control hardware ECC generator. Must only
  *		be provided if an hardware ECC is available
  * @calculate:	function for ECC calculation or readback from ECC hardware
@@ -575,6 +566,8 @@ struct nand_ecc_ctrl {
 	int postpad;
 	unsigned int options;
 	void *priv;
+	u8 *calc_buf;
+	u8 *code_buf;
 	void (*hwctl)(struct mtd_info *mtd, int mode);
 	int (*calculate)(struct mtd_info *mtd, const uint8_t *dat,
 			uint8_t *ecc_code);
@@ -602,26 +595,6 @@ struct nand_ecc_ctrl {
 			int page);
 };
 
-static inline int nand_standard_page_accessors(struct nand_ecc_ctrl *ecc)
-{
-	return !(ecc->options & NAND_ECC_CUSTOM_PAGE_ACCESS);
-}
-
-/**
- * struct nand_buffers - buffer structure for read/write
- * @ecccalc:	buffer pointer for calculated ECC, size is oobsize.
- * @ecccode:	buffer pointer for ECC read from flash, size is oobsize.
- * @databuf:	buffer pointer for data, size is (page size + oobsize).
- *
- * Do not change the order of buffers. databuf and oobrbuf must be in
- * consecutive order.
- */
-struct nand_buffers {
-	uint8_t	*ecccalc;
-	uint8_t	*ecccode;
-	uint8_t *databuf;
-};
-
 /**
  * struct nand_sdr_timings - SDR NAND chip timings
  *
@@ -762,6 +735,350 @@ struct nand_manufacturer_ops {
 };
 
 /**
+ * struct nand_op_cmd_instr - Definition of a command instruction
+ * @opcode: the command to issue in one cycle
+ */
+struct nand_op_cmd_instr {
+	u8 opcode;
+};
+
+/**
+ * struct nand_op_addr_instr - Definition of an address instruction
+ * @naddrs: length of the @addrs array
+ * @addrs: array containing the address cycles to issue
+ */
+struct nand_op_addr_instr {
+	unsigned int naddrs;
+	const u8 *addrs;
+};
+
+/**
+ * struct nand_op_data_instr - Definition of a data instruction
+ * @len: number of data bytes to move
+ * @in: buffer to fill when reading from the NAND chip
+ * @out: buffer to read from when writing to the NAND chip
+ * @force_8bit: force 8-bit access
+ *
+ * Please note that "in" and "out" are inverted from the ONFI specification
+ * and are from the controller perspective, so a "in" is a read from the NAND
+ * chip while a "out" is a write to the NAND chip.
+ */
+struct nand_op_data_instr {
+	unsigned int len;
+	union {
+		void *in;
+		const void *out;
+	} buf;
+	bool force_8bit;
+};
+
+/**
+ * struct nand_op_waitrdy_instr - Definition of a wait ready instruction
+ * @timeout_ms: maximum delay while waiting for the ready/busy pin in ms
+ */
+struct nand_op_waitrdy_instr {
+	unsigned int timeout_ms;
+};
+
+/**
+ * enum nand_op_instr_type - Definition of all instruction types
+ * @NAND_OP_CMD_INSTR: command instruction
+ * @NAND_OP_ADDR_INSTR: address instruction
+ * @NAND_OP_DATA_IN_INSTR: data in instruction
+ * @NAND_OP_DATA_OUT_INSTR: data out instruction
+ * @NAND_OP_WAITRDY_INSTR: wait ready instruction
+ */
+enum nand_op_instr_type {
+	NAND_OP_CMD_INSTR,
+	NAND_OP_ADDR_INSTR,
+	NAND_OP_DATA_IN_INSTR,
+	NAND_OP_DATA_OUT_INSTR,
+	NAND_OP_WAITRDY_INSTR,
+};
+
+/**
+ * struct nand_op_instr - Instruction object
+ * @type: the instruction type
+ * @cmd/@addr/@data/@waitrdy: extra data associated to the instruction.
+ *                            You'll have to use the appropriate element
+ *                            depending on @type
+ * @delay_ns: delay the controller should apply after the instruction has been
+ *	      issued on the bus. Most modern controllers have internal timings
+ *	      control logic, and in this case, the controller driver can ignore
+ *	      this field.
+ */
+struct nand_op_instr {
+	enum nand_op_instr_type type;
+	union {
+		struct nand_op_cmd_instr cmd;
+		struct nand_op_addr_instr addr;
+		struct nand_op_data_instr data;
+		struct nand_op_waitrdy_instr waitrdy;
+	} ctx;
+	unsigned int delay_ns;
+};
+
+/*
+ * Special handling must be done for the WAITRDY timeout parameter as it usually
+ * is either tPROG (after a prog), tR (before a read), tRST (during a reset) or
+ * tBERS (during an erase) which all of them are u64 values that cannot be
+ * divided by usual kernel macros and must be handled with the special
+ * DIV_ROUND_UP_ULL() macro.
+ */
+#define __DIVIDE(dividend, divisor) ({					\
+	sizeof(dividend) == sizeof(u32) ?				\
+		DIV_ROUND_UP(dividend, divisor) :			\
+		DIV_ROUND_UP_ULL(dividend, divisor);			\
+		})
+#define PSEC_TO_NSEC(x) __DIVIDE(x, 1000)
+#define PSEC_TO_MSEC(x) __DIVIDE(x, 1000000000)
+
+#define NAND_OP_CMD(id, ns)						\
+	{								\
+		.type = NAND_OP_CMD_INSTR,				\
+		.ctx.cmd.opcode = id,					\
+		.delay_ns = ns,						\
+	}
+
+#define NAND_OP_ADDR(ncycles, cycles, ns)				\
+	{								\
+		.type = NAND_OP_ADDR_INSTR,				\
+		.ctx.addr = {						\
+			.naddrs = ncycles,				\
+			.addrs = cycles,				\
+		},							\
+		.delay_ns = ns,						\
+	}
+
+#define NAND_OP_DATA_IN(l, b, ns)					\
+	{								\
+		.type = NAND_OP_DATA_IN_INSTR,				\
+		.ctx.data = {						\
+			.len = l,					\
+			.buf.in = b,					\
+			.force_8bit = false,				\
+		},							\
+		.delay_ns = ns,						\
+	}
+
+#define NAND_OP_DATA_OUT(l, b, ns)					\
+	{								\
+		.type = NAND_OP_DATA_OUT_INSTR,				\
+		.ctx.data = {						\
+			.len = l,					\
+			.buf.out = b,					\
+			.force_8bit = false,				\
+		},							\
+		.delay_ns = ns,						\
+	}
+
+#define NAND_OP_8BIT_DATA_IN(l, b, ns)					\
+	{								\
+		.type = NAND_OP_DATA_IN_INSTR,				\
+		.ctx.data = {						\
+			.len = l,					\
+			.buf.in = b,					\
+			.force_8bit = true,				\
+		},							\
+		.delay_ns = ns,						\
+	}
+
+#define NAND_OP_8BIT_DATA_OUT(l, b, ns)					\
+	{								\
+		.type = NAND_OP_DATA_OUT_INSTR,				\
+		.ctx.data = {						\
+			.len = l,					\
+			.buf.out = b,					\
+			.force_8bit = true,				\
+		},							\
+		.delay_ns = ns,						\
+	}
+
+#define NAND_OP_WAIT_RDY(tout_ms, ns)					\
+	{								\
+		.type = NAND_OP_WAITRDY_INSTR,				\
+		.ctx.waitrdy.timeout_ms = tout_ms,			\
+		.delay_ns = ns,						\
+	}
+
+/**
+ * struct nand_subop - a sub operation
+ * @instrs: array of instructions
+ * @ninstrs: length of the @instrs array
+ * @first_instr_start_off: offset to start from for the first instruction
+ *			   of the sub-operation
+ * @last_instr_end_off: offset to end at (excluded) for the last instruction
+ *			of the sub-operation
+ *
+ * Both @first_instr_start_off and @last_instr_end_off only apply to data or
+ * address instructions.
+ *
+ * When an operation cannot be handled as is by the NAND controller, it will
+ * be split by the parser into sub-operations which will be passed to the
+ * controller driver.
+ */
+struct nand_subop {
+	const struct nand_op_instr *instrs;
+	unsigned int ninstrs;
+	unsigned int first_instr_start_off;
+	unsigned int last_instr_end_off;
+};
+
+int nand_subop_get_addr_start_off(const struct nand_subop *subop,
+				  unsigned int op_id);
+int nand_subop_get_num_addr_cyc(const struct nand_subop *subop,
+				unsigned int op_id);
+int nand_subop_get_data_start_off(const struct nand_subop *subop,
+				  unsigned int op_id);
+int nand_subop_get_data_len(const struct nand_subop *subop,
+			    unsigned int op_id);
+
+/**
+ * struct nand_op_parser_addr_constraints - Constraints for address instructions
+ * @maxcycles: maximum number of address cycles the controller can issue in a
+ *	       single step
+ */
+struct nand_op_parser_addr_constraints {
+	unsigned int maxcycles;
+};
+
+/**
+ * struct nand_op_parser_data_constraints - Constraints for data instructions
+ * @maxlen: maximum data length that the controller can handle in a single step
+ */
+struct nand_op_parser_data_constraints {
+	unsigned int maxlen;
+};
+
+/**
+ * struct nand_op_parser_pattern_elem - One element of a pattern
+ * @type: the instructuction type
+ * @optional: whether this element of the pattern is optional or mandatory
+ * @addr/@data: address or data constraint (number of cycles or data length)
+ */
+struct nand_op_parser_pattern_elem {
+	enum nand_op_instr_type type;
+	bool optional;
+	union {
+		struct nand_op_parser_addr_constraints addr;
+		struct nand_op_parser_data_constraints data;
+	} ctx;
+};
+
+#define NAND_OP_PARSER_PAT_CMD_ELEM(_opt)			\
+	{							\
+		.type = NAND_OP_CMD_INSTR,			\
+		.optional = _opt,				\
+	}
+
+#define NAND_OP_PARSER_PAT_ADDR_ELEM(_opt, _maxcycles)		\
+	{							\
+		.type = NAND_OP_ADDR_INSTR,			\
+		.optional = _opt,				\
+		.ctx.addr.maxcycles = _maxcycles,		\
+	}
+
+#define NAND_OP_PARSER_PAT_DATA_IN_ELEM(_opt, _maxlen)		\
+	{							\
+		.type = NAND_OP_DATA_IN_INSTR,			\
+		.optional = _opt,				\
+		.ctx.data.maxlen = _maxlen,			\
+	}
+
+#define NAND_OP_PARSER_PAT_DATA_OUT_ELEM(_opt, _maxlen)		\
+	{							\
+		.type = NAND_OP_DATA_OUT_INSTR,			\
+		.optional = _opt,				\
+		.ctx.data.maxlen = _maxlen,			\
+	}
+
+#define NAND_OP_PARSER_PAT_WAITRDY_ELEM(_opt)			\
+	{							\
+		.type = NAND_OP_WAITRDY_INSTR,			\
+		.optional = _opt,				\
+	}
+
+/**
+ * struct nand_op_parser_pattern - NAND sub-operation pattern descriptor
+ * @elems: array of pattern elements
+ * @nelems: number of pattern elements in @elems array
+ * @exec: the function that will issue a sub-operation
+ *
+ * A pattern is a list of elements, each element reprensenting one instruction
+ * with its constraints. The pattern itself is used by the core to match NAND
+ * chip operation with NAND controller operations.
+ * Once a match between a NAND controller operation pattern and a NAND chip
+ * operation (or a sub-set of a NAND operation) is found, the pattern ->exec()
+ * hook is called so that the controller driver can issue the operation on the
+ * bus.
+ *
+ * Controller drivers should declare as many patterns as they support and pass
+ * this list of patterns (created with the help of the following macro) to
+ * the nand_op_parser_exec_op() helper.
+ */
+struct nand_op_parser_pattern {
+	const struct nand_op_parser_pattern_elem *elems;
+	unsigned int nelems;
+	int (*exec)(struct nand_chip *chip, const struct nand_subop *subop);
+};
+
+#define NAND_OP_PARSER_PATTERN(_exec, ...)							\
+	{											\
+		.exec = _exec,									\
+		.elems = (struct nand_op_parser_pattern_elem[]) { __VA_ARGS__ },		\
+		.nelems = sizeof((struct nand_op_parser_pattern_elem[]) { __VA_ARGS__ }) /	\
+			  sizeof(struct nand_op_parser_pattern_elem),				\
+	}
+
+/**
+ * struct nand_op_parser - NAND controller operation parser descriptor
+ * @patterns: array of supported patterns
+ * @npatterns: length of the @patterns array
+ *
+ * The parser descriptor is just an array of supported patterns which will be
+ * iterated by nand_op_parser_exec_op() everytime it tries to execute an
+ * NAND operation (or tries to determine if a specific operation is supported).
+ *
+ * It is worth mentioning that patterns will be tested in their declaration
+ * order, and the first match will be taken, so it's important to order patterns
+ * appropriately so that simple/inefficient patterns are placed at the end of
+ * the list. Usually, this is where you put single instruction patterns.
+ */
+struct nand_op_parser {
+	const struct nand_op_parser_pattern *patterns;
+	unsigned int npatterns;
+};
+
+#define NAND_OP_PARSER(...)									\
+	{											\
+		.patterns = (struct nand_op_parser_pattern[]) { __VA_ARGS__ },			\
+		.npatterns = sizeof((struct nand_op_parser_pattern[]) { __VA_ARGS__ }) /	\
+			     sizeof(struct nand_op_parser_pattern),				\
+	}
+
+/**
+ * struct nand_operation - NAND operation descriptor
+ * @instrs: array of instructions to execute
+ * @ninstrs: length of the @instrs array
+ *
+ * The actual operation structure that will be passed to chip->exec_op().
+ */
+struct nand_operation {
+	const struct nand_op_instr *instrs;
+	unsigned int ninstrs;
+};
+
+#define NAND_OPERATION(_instrs)					\
+	{							\
+		.instrs = _instrs,				\
+		.ninstrs = ARRAY_SIZE(_instrs),			\
+	}
+
+int nand_op_parser_exec_op(struct nand_chip *chip,
+			   const struct nand_op_parser *parser,
+			   const struct nand_operation *op, bool check_only);
+
+/**
  * struct nand_chip - NAND Private Flash Chip Data
  * @mtd:		MTD device registered to the MTD framework
  * @IO_ADDR_R:		[BOARDSPECIFIC] address to read the 8 I/O lines of the
@@ -787,10 +1104,13 @@ struct nand_manufacturer_ops {
  *			commands to the chip.
  * @waitfunc:		[REPLACEABLE] hardwarespecific function for wait on
  *			ready.
+ * @exec_op:		controller specific method to execute NAND operations.
+ *			This method replaces ->cmdfunc(),
+ *			->{read,write}_{buf,byte,word}(), ->dev_ready() and
+ *			->waifunc().
  * @setup_read_retry:	[FLASHSPECIFIC] flash (vendor) specific function for
  *			setting the read-retry mode. Mostly needed for MLC NAND.
  * @ecc:		[BOARDSPECIFIC] ECC control structure
- * @buffers:		buffer structure for read/write
  * @buf_align:		minimum buffer alignment required by a platform
  * @hwcontrol:		platform-specific hardware control structure
  * @erase:		[REPLACEABLE] erase function
@@ -830,6 +1150,7 @@ struct nand_manufacturer_ops {
  * @numchips:		[INTERN] number of physical chips
  * @chipsize:		[INTERN] the size of one chip for multichip arrays
  * @pagemask:		[INTERN] page number mask = number of (pages / chip) - 1
+ * @data_buf:		[INTERN] buffer for data, size is (page size + oobsize).
  * @pagebuf:		[INTERN] holds the pagenumber which is currently in
  *			data_buf.
  * @pagebuf_bitflips:	[INTERN] holds the bitflip count for the page which is
@@ -886,6 +1207,9 @@ struct nand_chip {
 	void (*cmdfunc)(struct mtd_info *mtd, unsigned command, int column,
 			int page_addr);
 	int(*waitfunc)(struct mtd_info *mtd, struct nand_chip *this);
+	int (*exec_op)(struct nand_chip *chip,
+		       const struct nand_operation *op,
+		       bool check_only);
 	int (*erase)(struct mtd_info *mtd, int page);
 	int (*scan_bbt)(struct mtd_info *mtd);
 	int (*onfi_set_features)(struct mtd_info *mtd, struct nand_chip *chip,
@@ -896,7 +1220,6 @@ struct nand_chip {
 	int (*setup_data_interface)(struct mtd_info *mtd, int chipnr,
 				    const struct nand_data_interface *conf);
 
-
 	int chip_delay;
 	unsigned int options;
 	unsigned int bbt_options;
@@ -908,6 +1231,7 @@ struct nand_chip {
 	int numchips;
 	uint64_t chipsize;
 	int pagemask;
+	u8 *data_buf;
 	int pagebuf;
 	unsigned int pagebuf_bitflips;
 	int subpagesize;
@@ -928,7 +1252,7 @@ struct nand_chip {
 	u16 max_bb_per_die;
 	u32 blocks_per_die;
 
-	struct nand_data_interface *data_interface;
+	struct nand_data_interface data_interface;
 
 	int read_retries;
 
@@ -938,7 +1262,6 @@ struct nand_chip {
 	struct nand_hw_control *controller;
 
 	struct nand_ecc_ctrl ecc;
-	struct nand_buffers *buffers;
 	unsigned long buf_align;
 	struct nand_hw_control hwcontrol;
 
@@ -956,6 +1279,15 @@ struct nand_chip {
 	} manufacturer;
 };
 
+static inline int nand_exec_op(struct nand_chip *chip,
+			       const struct nand_operation *op)
+{
+	if (!chip->exec_op)
+		return -ENOTSUPP;
+
+	return chip->exec_op(chip, op, false);
+}
+
 extern const struct mtd_ooblayout_ops nand_ooblayout_sp_ops;
 extern const struct mtd_ooblayout_ops nand_ooblayout_lp_ops;
 
@@ -1225,8 +1557,7 @@ static inline int onfi_get_sync_timing_mode(struct nand_chip *chip)
 	return le16_to_cpu(chip->onfi_params.src_sync_timing_mode);
 }
 
-int onfi_init_data_interface(struct nand_chip *chip,
-			     struct nand_data_interface *iface,
+int onfi_fill_data_interface(struct nand_chip *chip,
 			     enum nand_data_interface_type type,
 			     int timing_mode);
 
@@ -1269,8 +1600,6 @@ static inline int jedec_feature(struct nand_chip *chip)
 
 /* get timing characteristics from ONFI timing mode. */
 const struct nand_sdr_timings *onfi_async_timing_mode_to_sdr_timings(int mode);
-/* get data interface from ONFI timing mode 0, used after reset. */
-const struct nand_data_interface *nand_get_default_data_interface(void);
 
 int nand_check_erased_ecc_chunk(void *data, int datalen,
 				void *ecc, int ecclen,
@@ -1316,9 +1645,45 @@ int nand_write_page_raw(struct mtd_info *mtd, struct nand_chip *chip,
 /* Reset and initialize a NAND device */
 int nand_reset(struct nand_chip *chip, int chipnr);
 
+/* NAND operation helpers */
+int nand_reset_op(struct nand_chip *chip);
+int nand_readid_op(struct nand_chip *chip, u8 addr, void *buf,
+		   unsigned int len);
+int nand_status_op(struct nand_chip *chip, u8 *status);
+int nand_exit_status_op(struct nand_chip *chip);
+int nand_erase_op(struct nand_chip *chip, unsigned int eraseblock);
+int nand_read_page_op(struct nand_chip *chip, unsigned int page,
+		      unsigned int offset_in_page, void *buf, unsigned int len);
+int nand_change_read_column_op(struct nand_chip *chip,
+			       unsigned int offset_in_page, void *buf,
+			       unsigned int len, bool force_8bit);
+int nand_read_oob_op(struct nand_chip *chip, unsigned int page,
+		     unsigned int offset_in_page, void *buf, unsigned int len);
+int nand_prog_page_begin_op(struct nand_chip *chip, unsigned int page,
+			    unsigned int offset_in_page, const void *buf,
+			    unsigned int len);
+int nand_prog_page_end_op(struct nand_chip *chip);
+int nand_prog_page_op(struct nand_chip *chip, unsigned int page,
+		      unsigned int offset_in_page, const void *buf,
+		      unsigned int len);
+int nand_change_write_column_op(struct nand_chip *chip,
+				unsigned int offset_in_page, const void *buf,
+				unsigned int len, bool force_8bit);
+int nand_read_data_op(struct nand_chip *chip, void *buf, unsigned int len,
+		      bool force_8bit);
+int nand_write_data_op(struct nand_chip *chip, const void *buf,
+		       unsigned int len, bool force_8bit);
+
 /* Free resources held by the NAND device */
 void nand_cleanup(struct nand_chip *chip);
 
 /* Default extended ID decoding function */
 void nand_decode_ext_id(struct nand_chip *chip);
+
+/*
+ * External helper for controller drivers that have to implement the WAITRDY
+ * instruction and have no physical pin to check it.
+ */
+int nand_soft_waitrdy(struct nand_chip *chip, unsigned long timeout_ms);
+
 #endif /* __LINUX_MTD_RAWNAND_H */
diff --git a/include/linux/mtd/spi-nor.h b/include/linux/mtd/spi-nor.h
index d0c66a0..de36969 100644
--- a/include/linux/mtd/spi-nor.h
+++ b/include/linux/mtd/spi-nor.h
@@ -61,6 +61,7 @@
 #define SPINOR_OP_RDSFDP	0x5a	/* Read SFDP */
 #define SPINOR_OP_RDCR		0x35	/* Read configuration register */
 #define SPINOR_OP_RDFSR		0x70	/* Read flag status register */
+#define SPINOR_OP_CLFSR		0x50	/* Clear flag status register */
 
 /* 4-byte address opcodes - used on Spansion and some Macronix flashes. */
 #define SPINOR_OP_READ_4B	0x13	/* Read data bytes (low frequency) */
@@ -130,7 +131,10 @@
 #define EVCR_QUAD_EN_MICRON	BIT(7)	/* Micron Quad I/O */
 
 /* Flag Status Register bits */
-#define FSR_READY		BIT(7)
+#define FSR_READY		BIT(7)	/* Device status, 0 = Busy, 1 = Ready */
+#define FSR_E_ERR		BIT(5)	/* Erase operation status */
+#define FSR_P_ERR		BIT(4)	/* Program operation status */
+#define FSR_PT_ERR		BIT(1)	/* Protection error bit */
 
 /* Configuration Register bits. */
 #define CR_QUAD_EN_SPAN		BIT(1)	/* Spansion Quad I/O */
@@ -399,4 +403,10 @@ struct spi_nor_hwcaps {
 int spi_nor_scan(struct spi_nor *nor, const char *name,
 		 const struct spi_nor_hwcaps *hwcaps);
 
+/**
+ * spi_nor_restore_addr_mode() - restore the status of SPI NOR
+ * @nor:	the spi_nor structure
+ */
+void spi_nor_restore(struct spi_nor *nor);
+
 #endif
diff --git a/include/linux/netfilter/nfnetlink.h b/include/linux/netfilter/nfnetlink.h
index 495ba4d..34551f8 100644
--- a/include/linux/netfilter/nfnetlink.h
+++ b/include/linux/netfilter/nfnetlink.h
@@ -67,8 +67,7 @@ static inline bool lockdep_nfnl_is_held(__u8 subsys_id)
  * @ss: The nfnetlink subsystem ID
  *
  * Return the value of the specified RCU-protected pointer, but omit
- * both the smp_read_barrier_depends() and the READ_ONCE(), because
- * caller holds the NFNL subsystem mutex.
+ * the READ_ONCE(), because caller holds the NFNL subsystem mutex.
  */
 #define nfnl_dereference(p, ss)					\
 	rcu_dereference_protected(p, lockdep_nfnl_is_held(ss))
diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index 49b4257c..f3075d6 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -85,7 +85,7 @@ struct netlink_ext_ack {
  * to the lack of an output buffer.)
  */
 #define NL_SET_ERR_MSG(extack, msg) do {		\
-	static const char __msg[] = (msg);		\
+	static const char __msg[] = msg;		\
 	struct netlink_ext_ack *__extack = (extack);	\
 							\
 	if (__extack)					\
@@ -101,7 +101,7 @@ struct netlink_ext_ack {
 } while (0)
 
 #define NL_SET_ERR_MSG_ATTR(extack, attr, msg) do {	\
-	static const char __msg[] = (msg);		\
+	static const char __msg[] = msg;		\
 	struct netlink_ext_ack *__extack = (extack);	\
 							\
 	if (__extack) {					\
diff --git a/include/linux/nubus.h b/include/linux/nubus.h
index 11ce6b1..6e82002 100644
--- a/include/linux/nubus.h
+++ b/include/linux/nubus.h
@@ -5,20 +5,36 @@
   Originally written by Alan Cox.
 
   Hacked to death by C. Scott Ananian and David Huggins-Daines.
-  
-  Some of the constants in here are from the corresponding
-  NetBSD/OpenBSD header file, by Allen Briggs.  We figured out the
-  rest of them on our own. */
+*/
+
 #ifndef LINUX_NUBUS_H
 #define LINUX_NUBUS_H
 
+#include <linux/device.h>
 #include <asm/nubus.h>
 #include <uapi/linux/nubus.h>
 
+struct proc_dir_entry;
+struct seq_file;
+
+struct nubus_dir {
+	unsigned char *base;
+	unsigned char *ptr;
+	int done;
+	int mask;
+	struct proc_dir_entry *procdir;
+};
+
+struct nubus_dirent {
+	unsigned char *base;
+	unsigned char type;
+	__u32 data;	/* Actually 24 bits used */
+	int mask;
+};
+
 struct nubus_board {
-	struct nubus_board* next;
-	struct nubus_dev* first_dev;
-	
+	struct device dev;
+
 	/* Only 9-E actually exist, though 0-8 are also theoretically
 	   possible, and 0 is a special case which represents the
 	   motherboard and onboard peripherals (Ethernet, video) */
@@ -27,10 +43,10 @@ struct nubus_board {
 	char name[64];
 
 	/* Format block */
-	unsigned char* fblock;
+	unsigned char *fblock;
 	/* Root directory (does *not* always equal fblock + doffset!) */
-	unsigned char* directory;
-	
+	unsigned char *directory;
+
 	unsigned long slot_addr;
 	/* Offset to root directory (sometimes) */
 	unsigned long doffset;
@@ -41,15 +57,15 @@ struct nubus_board {
 	unsigned char rev;
 	unsigned char format;
 	unsigned char lanes;
+
+	/* Directory entry in /proc/bus/nubus */
+	struct proc_dir_entry *procdir;
 };
 
-struct nubus_dev {
-	/* Next link in device list */
-	struct nubus_dev* next;
-	/* Directory entry in /proc/bus/nubus */
-	struct proc_dir_entry* procdir;
+struct nubus_rsrc {
+	struct list_head list;
 
-	/* The functional resource ID of this device */
+	/* The functional resource ID */
 	unsigned char resid;
 	/* These are mostly here for convenience; we could always read
 	   them from the ROMs if we wanted to */
@@ -57,79 +73,116 @@ struct nubus_dev {
 	unsigned short type;
 	unsigned short dr_sw;
 	unsigned short dr_hw;
-	/* This is the device's name rather than the board's.
-	   Sometimes they are different.  Usually the board name is
-	   more correct. */
-	char name[64];
-	/* MacOS driver (I kid you not) */
-	unsigned char* driver;
-	/* Actually this is an offset */
-	unsigned long iobase;
-	unsigned long iosize;
-	unsigned char flags, hwdevid;
-	
+
 	/* Functional directory */
-	unsigned char* directory;
+	unsigned char *directory;
 	/* Much of our info comes from here */
-	struct nubus_board* board;
+	struct nubus_board *board;
 };
 
-/* This is all NuBus devices (used to find devices later on) */
-extern struct nubus_dev* nubus_devices;
-/* This is all NuBus cards */
-extern struct nubus_board* nubus_boards;
+/* This is all NuBus functional resources (used to find devices later on) */
+extern struct list_head nubus_func_rsrcs;
+
+struct nubus_driver {
+	struct device_driver driver;
+	int (*probe)(struct nubus_board *board);
+	int (*remove)(struct nubus_board *board);
+};
+
+extern struct bus_type nubus_bus_type;
 
 /* Generic NuBus interface functions, modelled after the PCI interface */
-void nubus_scan_bus(void);
 #ifdef CONFIG_PROC_FS
-extern void nubus_proc_init(void);
+void nubus_proc_init(void);
+struct proc_dir_entry *nubus_proc_add_board(struct nubus_board *board);
+struct proc_dir_entry *nubus_proc_add_rsrc_dir(struct proc_dir_entry *procdir,
+					       const struct nubus_dirent *ent,
+					       struct nubus_board *board);
+void nubus_proc_add_rsrc_mem(struct proc_dir_entry *procdir,
+			     const struct nubus_dirent *ent,
+			     unsigned int size);
+void nubus_proc_add_rsrc(struct proc_dir_entry *procdir,
+			 const struct nubus_dirent *ent);
 #else
 static inline void nubus_proc_init(void) {}
+static inline
+struct proc_dir_entry *nubus_proc_add_board(struct nubus_board *board)
+{ return NULL; }
+static inline
+struct proc_dir_entry *nubus_proc_add_rsrc_dir(struct proc_dir_entry *procdir,
+					       const struct nubus_dirent *ent,
+					       struct nubus_board *board)
+{ return NULL; }
+static inline void nubus_proc_add_rsrc_mem(struct proc_dir_entry *procdir,
+					   const struct nubus_dirent *ent,
+					   unsigned int size) {}
+static inline void nubus_proc_add_rsrc(struct proc_dir_entry *procdir,
+				       const struct nubus_dirent *ent) {}
 #endif
-int get_nubus_list(char *buf);
-int nubus_proc_attach_device(struct nubus_dev *dev);
-/* If we need more precision we can add some more of these */
-struct nubus_dev* nubus_find_device(unsigned short category,
-				    unsigned short type,
-				    unsigned short dr_hw,
-				    unsigned short dr_sw,
-				    const struct nubus_dev* from);
-struct nubus_dev* nubus_find_type(unsigned short category,
-				  unsigned short type,
-				  const struct nubus_dev* from);
-/* Might have more than one device in a slot, you know... */
-struct nubus_dev* nubus_find_slot(unsigned int slot,
-				  const struct nubus_dev* from);
+
+struct nubus_rsrc *nubus_first_rsrc_or_null(void);
+struct nubus_rsrc *nubus_next_rsrc_or_null(struct nubus_rsrc *from);
+
+#define for_each_func_rsrc(f) \
+	for (f = nubus_first_rsrc_or_null(); f; f = nubus_next_rsrc_or_null(f))
+
+#define for_each_board_func_rsrc(b, f) \
+	for_each_func_rsrc(f) if (f->board != b) {} else
 
 /* These are somewhat more NuBus-specific.  They all return 0 for
    success and -1 for failure, as you'd expect. */
 
 /* The root directory which contains the board and functional
    directories */
-int nubus_get_root_dir(const struct nubus_board* board,
-		       struct nubus_dir* dir);
+int nubus_get_root_dir(const struct nubus_board *board,
+		       struct nubus_dir *dir);
 /* The board directory */
-int nubus_get_board_dir(const struct nubus_board* board,
-			struct nubus_dir* dir);
+int nubus_get_board_dir(const struct nubus_board *board,
+			struct nubus_dir *dir);
 /* The functional directory */
-int nubus_get_func_dir(const struct nubus_dev* dev,
-		       struct nubus_dir* dir);
+int nubus_get_func_dir(const struct nubus_rsrc *fres, struct nubus_dir *dir);
 
 /* These work on any directory gotten via the above */
-int nubus_readdir(struct nubus_dir* dir,
-		  struct nubus_dirent* ent);
-int nubus_find_rsrc(struct nubus_dir* dir,
+int nubus_readdir(struct nubus_dir *dir,
+		  struct nubus_dirent *ent);
+int nubus_find_rsrc(struct nubus_dir *dir,
 		    unsigned char rsrc_type,
-		    struct nubus_dirent* ent);
-int nubus_rewinddir(struct nubus_dir* dir);
+		    struct nubus_dirent *ent);
+int nubus_rewinddir(struct nubus_dir *dir);
 
 /* Things to do with directory entries */
-int nubus_get_subdir(const struct nubus_dirent* ent,
-		     struct nubus_dir* dir);
-void nubus_get_rsrc_mem(void* dest,
-			const struct nubus_dirent *dirent,
-			int len);
-void nubus_get_rsrc_str(void* dest,
-			const struct nubus_dirent *dirent,
-			int maxlen);
+int nubus_get_subdir(const struct nubus_dirent *ent,
+		     struct nubus_dir *dir);
+void nubus_get_rsrc_mem(void *dest, const struct nubus_dirent *dirent,
+			unsigned int len);
+unsigned int nubus_get_rsrc_str(char *dest, const struct nubus_dirent *dirent,
+				unsigned int len);
+void nubus_seq_write_rsrc_mem(struct seq_file *m,
+			      const struct nubus_dirent *dirent,
+			      unsigned int len);
+unsigned char *nubus_dirptr(const struct nubus_dirent *nd);
+
+/* Declarations relating to driver model objects */
+int nubus_bus_register(void);
+int nubus_device_register(struct nubus_board *board);
+int nubus_driver_register(struct nubus_driver *ndrv);
+void nubus_driver_unregister(struct nubus_driver *ndrv);
+int nubus_proc_show(struct seq_file *m, void *data);
+
+static inline void nubus_set_drvdata(struct nubus_board *board, void *data)
+{
+	dev_set_drvdata(&board->dev, data);
+}
+
+static inline void *nubus_get_drvdata(struct nubus_board *board)
+{
+	return dev_get_drvdata(&board->dev);
+}
+
+/* Returns a pointer to the "standard" slot space. */
+static inline void *nubus_slot_addr(int slot)
+{
+	return (void *)(0xF0000000 | (slot << 24));
+}
+
 #endif /* LINUX_NUBUS_H */
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index aea87f0d..4112e2b 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -124,14 +124,20 @@ enum {
 
 #define NVME_CMB_BIR(cmbloc)	((cmbloc) & 0x7)
 #define NVME_CMB_OFST(cmbloc)	(((cmbloc) >> 12) & 0xfffff)
-#define NVME_CMB_SZ(cmbsz)	(((cmbsz) >> 12) & 0xfffff)
-#define NVME_CMB_SZU(cmbsz)	(((cmbsz) >> 8) & 0xf)
 
-#define NVME_CMB_WDS(cmbsz)	((cmbsz) & 0x10)
-#define NVME_CMB_RDS(cmbsz)	((cmbsz) & 0x8)
-#define NVME_CMB_LISTS(cmbsz)	((cmbsz) & 0x4)
-#define NVME_CMB_CQS(cmbsz)	((cmbsz) & 0x2)
-#define NVME_CMB_SQS(cmbsz)	((cmbsz) & 0x1)
+enum {
+	NVME_CMBSZ_SQS		= 1 << 0,
+	NVME_CMBSZ_CQS		= 1 << 1,
+	NVME_CMBSZ_LISTS	= 1 << 2,
+	NVME_CMBSZ_RDS		= 1 << 3,
+	NVME_CMBSZ_WDS		= 1 << 4,
+
+	NVME_CMBSZ_SZ_SHIFT	= 12,
+	NVME_CMBSZ_SZ_MASK	= 0xfffff,
+
+	NVME_CMBSZ_SZU_SHIFT	= 8,
+	NVME_CMBSZ_SZU_MASK	= 0xf,
+};
 
 /*
  * Submission and Completion Queue Entry Sizes for the NVM command set.
diff --git a/include/linux/of.h b/include/linux/of.h
index d3dea1d..173102d 100644
--- a/include/linux/of.h
+++ b/include/linux/of.h
@@ -544,6 +544,8 @@ const char *of_prop_next_string(struct property *prop, const char *cur);
 
 bool of_console_check(struct device_node *dn, char *name, int index);
 
+extern int of_cpu_node_to_id(struct device_node *np);
+
 #else /* CONFIG_OF */
 
 static inline void of_core_init(void)
@@ -916,6 +918,11 @@ static inline void of_property_clear_flag(struct property *p, unsigned long flag
 {
 }
 
+static inline int of_cpu_node_to_id(struct device_node *np)
+{
+	return -ENODEV;
+}
+
 #define of_match_ptr(_ptr)	NULL
 #define of_match_node(_matches, _node)	NULL
 #endif /* CONFIG_OF */
diff --git a/include/linux/omap-gpmc.h b/include/linux/omap-gpmc.h
index edfa280..053feb4 100644
--- a/include/linux/omap-gpmc.h
+++ b/include/linux/omap-gpmc.h
@@ -25,15 +25,43 @@ struct gpmc_nand_ops {
 
 struct gpmc_nand_regs;
 
+struct gpmc_onenand_info {
+	bool sync_read;
+	bool sync_write;
+	int burst_len;
+};
+
 #if IS_ENABLED(CONFIG_OMAP_GPMC)
 struct gpmc_nand_ops *gpmc_omap_get_nand_ops(struct gpmc_nand_regs *regs,
 					     int cs);
+/**
+ * gpmc_omap_onenand_set_timings - set optimized sync timings.
+ * @cs:      Chip Select Region
+ * @freq:    Chip frequency
+ * @latency: Burst latency cycle count
+ * @info:    Structure describing parameters used
+ *
+ * Sets optimized timings for the @cs region based on @freq and @latency.
+ * Updates the @info structure based on the GPMC settings.
+ */
+int gpmc_omap_onenand_set_timings(struct device *dev, int cs, int freq,
+				  int latency,
+				  struct gpmc_onenand_info *info);
+
 #else
 static inline struct gpmc_nand_ops *gpmc_omap_get_nand_ops(struct gpmc_nand_regs *regs,
 							   int cs)
 {
 	return NULL;
 }
+
+static inline
+int gpmc_omap_onenand_set_timings(struct device *dev, int cs, int freq,
+				  int latency,
+				  struct gpmc_onenand_info *info)
+{
+	return -EINVAL;
+}
 #endif /* CONFIG_OMAP_GPMC */
 
 extern int gpmc_calc_timings(struct gpmc_timings *gpmc_t,
diff --git a/include/linux/percpu-refcount.h b/include/linux/percpu-refcount.h
index 6658d9e..864d167 100644
--- a/include/linux/percpu-refcount.h
+++ b/include/linux/percpu-refcount.h
@@ -139,12 +139,12 @@ static inline bool __ref_is_percpu(struct percpu_ref *ref,
 	 * when using it as a pointer, __PERCPU_REF_ATOMIC may be set in
 	 * between contaminating the pointer value, meaning that
 	 * READ_ONCE() is required when fetching it.
+	 *
+	 * The smp_read_barrier_depends() implied by READ_ONCE() pairs
+	 * with smp_store_release() in __percpu_ref_switch_to_percpu().
 	 */
 	percpu_ptr = READ_ONCE(ref->percpu_count_ptr);
 
-	/* paired with smp_store_release() in __percpu_ref_switch_to_percpu() */
-	smp_read_barrier_depends();
-
 	/*
 	 * Theoretically, the following could test just ATOMIC; however,
 	 * then we'd have to mask off DEAD separately as DEAD may be
diff --git a/include/linux/platform_data/mtd-onenand-omap2.h b/include/linux/platform_data/mtd-onenand-omap2.h
deleted file mode 100644
index 56ff0e6..0000000
--- a/include/linux/platform_data/mtd-onenand-omap2.h
+++ /dev/null
@@ -1,34 +0,0 @@
-/*
- * Copyright (C) 2006 Nokia Corporation
- * Author: Juha Yrjola
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- */
-
-#ifndef	__MTD_ONENAND_OMAP2_H
-#define	__MTD_ONENAND_OMAP2_H
-
-#include <linux/mtd/mtd.h>
-#include <linux/mtd/partitions.h>
-
-#define ONENAND_SYNC_READ	(1 << 0)
-#define ONENAND_SYNC_READWRITE	(1 << 1)
-#define	ONENAND_IN_OMAP34XX	(1 << 2)
-
-struct omap_onenand_platform_data {
-	int			cs;
-	int			gpio_irq;
-	struct mtd_partition	*parts;
-	int			nr_parts;
-	int			(*onenand_setup)(void __iomem *, int *freq_ptr);
-	int			dma_channel;
-	u8			flags;
-	u8			regulator_can_sleep;
-	u8			skip_initial_unlocking;
-
-	/* for passing the partitions */
-	struct device_node	*of_node;
-};
-#endif
diff --git a/include/linux/platform_data/spi-s3c64xx.h b/include/linux/platform_data/spi-s3c64xx.h
index da79774..773daf7 100644
--- a/include/linux/platform_data/spi-s3c64xx.h
+++ b/include/linux/platform_data/spi-s3c64xx.h
@@ -1,10 +1,8 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
 /*
  * Copyright (C) 2009 Samsung Electronics Ltd.
  *	Jaswinder Singh <jassi.brar@samsung.com>
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
  */
 
 #ifndef __SPI_S3C64XX_H
diff --git a/include/linux/pm.h b/include/linux/pm.h
index 492ed47..e723b78 100644
--- a/include/linux/pm.h
+++ b/include/linux/pm.h
@@ -556,9 +556,10 @@ struct pm_subsys_data {
  * These flags can be set by device drivers at the probe time.  They need not be
  * cleared by the drivers as the driver core will take care of that.
  *
- * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
+ * NEVER_SKIP: Do not skip all system suspend/resume callbacks for the device.
  * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
  * SMART_SUSPEND: No need to resume the device from runtime suspend.
+ * LEAVE_SUSPENDED: Avoid resuming the device during system resume if possible.
  *
  * Setting SMART_PREPARE instructs bus types and PM domains which may want
  * system suspend/resume callbacks to be skipped for the device to return 0 from
@@ -572,10 +573,14 @@ struct pm_subsys_data {
  * necessary from the driver's perspective.  It also may cause them to skip
  * invocations of the ->suspend_late and ->suspend_noirq callbacks provided by
  * the driver if they decide to leave the device in runtime suspend.
+ *
+ * Setting LEAVE_SUSPENDED informs the PM core and middle-layer code that the
+ * driver prefers the device to be left in suspend after system resume.
  */
-#define DPM_FLAG_NEVER_SKIP	BIT(0)
-#define DPM_FLAG_SMART_PREPARE	BIT(1)
-#define DPM_FLAG_SMART_SUSPEND	BIT(2)
+#define DPM_FLAG_NEVER_SKIP		BIT(0)
+#define DPM_FLAG_SMART_PREPARE		BIT(1)
+#define DPM_FLAG_SMART_SUSPEND		BIT(2)
+#define DPM_FLAG_LEAVE_SUSPENDED	BIT(3)
 
 struct dev_pm_info {
 	pm_message_t		power_state;
@@ -597,6 +602,8 @@ struct dev_pm_info {
 	bool			wakeup_path:1;
 	bool			syscore:1;
 	bool			no_pm_callbacks:1;	/* Owned by the PM core */
+	unsigned int		must_resume:1;	/* Owned by the PM core */
+	unsigned int		may_skip_resume:1;	/* Set by subsystems */
 #else
 	unsigned int		should_wakeup:1;
 #endif
@@ -766,6 +773,7 @@ extern int pm_generic_poweroff(struct device *dev);
 extern void pm_generic_complete(struct device *dev);
 
 extern void dev_pm_skip_next_resume_phases(struct device *dev);
+extern bool dev_pm_may_skip_resume(struct device *dev);
 extern bool dev_pm_smart_suspend_and_suspended(struct device *dev);
 
 #else /* !CONFIG_PM_SLEEP */
diff --git a/include/linux/pm_wakeup.h b/include/linux/pm_wakeup.h
index 4c2cba7..4238dde 100644
--- a/include/linux/pm_wakeup.h
+++ b/include/linux/pm_wakeup.h
@@ -88,6 +88,11 @@ static inline bool device_may_wakeup(struct device *dev)
 	return dev->power.can_wakeup && !!dev->power.wakeup;
 }
 
+static inline void device_set_wakeup_path(struct device *dev)
+{
+	dev->power.wakeup_path = true;
+}
+
 /* drivers/base/power/wakeup.c */
 extern void wakeup_source_prepare(struct wakeup_source *ws, const char *name);
 extern struct wakeup_source *wakeup_source_create(const char *name);
@@ -174,6 +179,8 @@ static inline bool device_may_wakeup(struct device *dev)
 	return dev->power.can_wakeup && dev->power.should_wakeup;
 }
 
+static inline void device_set_wakeup_path(struct device *dev) {}
+
 static inline void __pm_stay_awake(struct wakeup_source *ws) {}
 
 static inline void pm_stay_awake(struct device *dev) {}
diff --git a/include/linux/posix-timers.h b/include/linux/posix-timers.h
index 672c4f3..c85704f 100644
--- a/include/linux/posix-timers.h
+++ b/include/linux/posix-timers.h
@@ -42,13 +42,26 @@ struct cpu_timer_list {
 #define CLOCKFD			CPUCLOCK_MAX
 #define CLOCKFD_MASK		(CPUCLOCK_PERTHREAD_MASK|CPUCLOCK_CLOCK_MASK)
 
-#define MAKE_PROCESS_CPUCLOCK(pid, clock) \
-	((~(clockid_t) (pid) << 3) | (clockid_t) (clock))
-#define MAKE_THREAD_CPUCLOCK(tid, clock) \
-	MAKE_PROCESS_CPUCLOCK((tid), (clock) | CPUCLOCK_PERTHREAD_MASK)
+static inline clockid_t make_process_cpuclock(const unsigned int pid,
+		const clockid_t clock)
+{
+	return ((~pid) << 3) | clock;
+}
+static inline clockid_t make_thread_cpuclock(const unsigned int tid,
+		const clockid_t clock)
+{
+	return make_process_cpuclock(tid, clock | CPUCLOCK_PERTHREAD_MASK);
+}
 
-#define FD_TO_CLOCKID(fd)	((~(clockid_t) (fd) << 3) | CLOCKFD)
-#define CLOCKID_TO_FD(clk)	((unsigned int) ~((clk) >> 3))
+static inline clockid_t fd_to_clockid(const int fd)
+{
+	return make_process_cpuclock((unsigned int) fd, CLOCKFD);
+}
+
+static inline int clockid_to_fd(const clockid_t clk)
+{
+	return ~(clk >> 3);
+}
 
 #define REQUEUE_PENDING 1
 
diff --git a/include/linux/psci.h b/include/linux/psci.h
index bdea1cb..f724fd8 100644
--- a/include/linux/psci.h
+++ b/include/linux/psci.h
@@ -26,6 +26,7 @@ int psci_cpu_init_idle(unsigned int cpu);
 int psci_cpu_suspend_enter(unsigned long index);
 
 struct psci_operations {
+	u32 (*get_version)(void);
 	int (*cpu_suspend)(u32 state, unsigned long entry_point);
 	int (*cpu_off)(u32 state);
 	int (*cpu_on)(unsigned long cpuid, unsigned long entry_point);
@@ -46,10 +47,11 @@ static inline int psci_dt_init(void) { return 0; }
 #if defined(CONFIG_ARM_PSCI_FW) && defined(CONFIG_ACPI)
 int __init psci_acpi_init(void);
 bool __init acpi_psci_present(void);
-bool __init acpi_psci_use_hvc(void);
+bool acpi_psci_use_hvc(void);
 #else
 static inline int psci_acpi_init(void) { return 0; }
 static inline bool acpi_psci_present(void) { return false; }
+static inline bool acpi_psci_use_hvc(void) {return false; }
 #endif
 
 #endif /* __LINUX_PSCI_H */
diff --git a/include/linux/ptr_ring.h b/include/linux/ptr_ring.h
index 6866df4..d72b2e7 100644
--- a/include/linux/ptr_ring.h
+++ b/include/linux/ptr_ring.h
@@ -174,6 +174,15 @@ static inline int ptr_ring_produce_bh(struct ptr_ring *r, void *ptr)
  * if they dereference the pointer - see e.g. PTR_RING_PEEK_CALL.
  * If ring is never resized, and if the pointer is merely
  * tested, there's no need to take the lock - see e.g.  __ptr_ring_empty.
+ * However, if called outside the lock, and if some other CPU
+ * consumes ring entries at the same time, the value returned
+ * is not guaranteed to be correct.
+ * In this case - to avoid incorrectly detecting the ring
+ * as empty - the CPU consuming the ring entries is responsible
+ * for either consuming all ring entries until the ring is empty,
+ * or synchronizing with some other CPU and causing it to
+ * execute __ptr_ring_peek and/or consume the ring enteries
+ * after the synchronization point.
  */
 static inline void *__ptr_ring_peek(struct ptr_ring *r)
 {
@@ -182,10 +191,7 @@ static inline void *__ptr_ring_peek(struct ptr_ring *r)
 	return NULL;
 }
 
-/* Note: callers invoking this in a loop must use a compiler barrier,
- * for example cpu_relax(). Callers must take consumer_lock
- * if the ring is ever resized - see e.g. ptr_ring_empty.
- */
+/* See __ptr_ring_peek above for locking rules. */
 static inline bool __ptr_ring_empty(struct ptr_ring *r)
 {
 	return !__ptr_ring_peek(r);
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index a6ddc42..043d047 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -197,7 +197,7 @@ static inline void exit_tasks_rcu_finish(void) { }
 #define cond_resched_rcu_qs() \
 do { \
 	if (!cond_resched()) \
-		rcu_note_voluntary_context_switch(current); \
+		rcu_note_voluntary_context_switch_lite(current); \
 } while (0)
 
 /*
@@ -433,12 +433,12 @@ static inline void rcu_preempt_sleep_check(void) { }
  * @p: The pointer to read
  *
  * Return the value of the specified RCU-protected pointer, but omit the
- * smp_read_barrier_depends() and keep the READ_ONCE().  This is useful
- * when the value of this pointer is accessed, but the pointer is not
- * dereferenced, for example, when testing an RCU-protected pointer against
- * NULL.  Although rcu_access_pointer() may also be used in cases where
- * update-side locks prevent the value of the pointer from changing, you
- * should instead use rcu_dereference_protected() for this use case.
+ * lockdep checks for being in an RCU read-side critical section.  This is
+ * useful when the value of this pointer is accessed, but the pointer is
+ * not dereferenced, for example, when testing an RCU-protected pointer
+ * against NULL.  Although rcu_access_pointer() may also be used in cases
+ * where update-side locks prevent the value of the pointer from changing,
+ * you should instead use rcu_dereference_protected() for this use case.
  *
  * It is also permissible to use rcu_access_pointer() when read-side
  * access to the pointer was removed at least one grace period ago, as
@@ -521,12 +521,11 @@ static inline void rcu_preempt_sleep_check(void) { }
  * @c: The conditions under which the dereference will take place
  *
  * Return the value of the specified RCU-protected pointer, but omit
- * both the smp_read_barrier_depends() and the READ_ONCE().  This
- * is useful in cases where update-side locks prevent the value of the
- * pointer from changing.  Please note that this primitive does *not*
- * prevent the compiler from repeating this reference or combining it
- * with other references, so it should not be used without protection
- * of appropriate locks.
+ * the READ_ONCE().  This is useful in cases where update-side locks
+ * prevent the value of the pointer from changing.  Please note that this
+ * primitive does *not* prevent the compiler from repeating this reference
+ * or combining it with other references, so it should not be used without
+ * protection of appropriate locks.
  *
  * This function is only for update-side use.  Using this function
  * when protected only by rcu_read_lock() will result in infrequent
diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
index b3dbf95..ce9beec 100644
--- a/include/linux/rcutiny.h
+++ b/include/linux/rcutiny.h
@@ -111,7 +111,6 @@ static inline void rcu_cpu_stall_reset(void) { }
 static inline void rcu_idle_enter(void) { }
 static inline void rcu_idle_exit(void) { }
 static inline void rcu_irq_enter(void) { }
-static inline bool rcu_irq_enter_disabled(void) { return false; }
 static inline void rcu_irq_exit_irqson(void) { }
 static inline void rcu_irq_enter_irqson(void) { }
 static inline void rcu_irq_exit(void) { }
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 37d6fd3..fd996cd 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -85,7 +85,6 @@ void rcu_irq_enter(void);
 void rcu_irq_exit(void);
 void rcu_irq_enter_irqson(void);
 void rcu_irq_exit_irqson(void);
-bool rcu_irq_enter_disabled(void);
 
 void exit_rcu(void);
 
diff --git a/include/linux/regmap.h b/include/linux/regmap.h
index 15eddc1..20268b7 100644
--- a/include/linux/regmap.h
+++ b/include/linux/regmap.h
@@ -30,6 +30,7 @@ struct regmap;
 struct regmap_range_cfg;
 struct regmap_field;
 struct snd_ac97;
+struct sdw_slave;
 
 /* An enum of all the supported cache types */
 enum regcache_type {
@@ -264,6 +265,9 @@ typedef void (*regmap_unlock)(void *);
  *                field is NULL but precious_table (see below) is not, the
  *                check is performed on such table (a register is precious if
  *                it belongs to one of the ranges specified by precious_table).
+ * @disable_locking: This regmap is either protected by external means or
+ *                   is guaranteed not be be accessed from multiple threads.
+ *                   Don't use any locking mechanisms.
  * @lock:	  Optional lock callback (overrides regmap's default lock
  *		  function, based on spinlock or mutex).
  * @unlock:	  As above for unlocking.
@@ -296,7 +300,10 @@ typedef void (*regmap_unlock)(void *);
  *                  a read.
  * @write_flag_mask: Mask to be set in the top bytes of the register when doing
  *                   a write. If both read_flag_mask and write_flag_mask are
- *                   empty the regmap_bus default masks are used.
+ *                   empty and zero_flag_mask is not set the regmap_bus default
+ *                   masks are used.
+ * @zero_flag_mask: If set, read_flag_mask and write_flag_mask are used even
+ *                   if they are both empty.
  * @use_single_rw: If set, converts the bulk read and write operations into
  *		    a series of single read and write operations. This is useful
  *		    for device that does not support bulk read and write.
@@ -317,6 +324,7 @@ typedef void (*regmap_unlock)(void *);
  *
  * @ranges: Array of configuration entries for virtual address ranges.
  * @num_ranges: Number of range configuration entries.
+ * @use_hwlock: Indicate if a hardware spinlock should be used.
  * @hwlock_id: Specify the hardware spinlock id.
  * @hwlock_mode: The hardware spinlock mode, should be HWLOCK_IRQSTATE,
  *		 HWLOCK_IRQ or 0.
@@ -333,6 +341,8 @@ struct regmap_config {
 	bool (*readable_reg)(struct device *dev, unsigned int reg);
 	bool (*volatile_reg)(struct device *dev, unsigned int reg);
 	bool (*precious_reg)(struct device *dev, unsigned int reg);
+
+	bool disable_locking;
 	regmap_lock lock;
 	regmap_unlock unlock;
 	void *lock_arg;
@@ -355,6 +365,7 @@ struct regmap_config {
 
 	unsigned long read_flag_mask;
 	unsigned long write_flag_mask;
+	bool zero_flag_mask;
 
 	bool use_single_rw;
 	bool can_multi_write;
@@ -365,6 +376,7 @@ struct regmap_config {
 	const struct regmap_range_cfg *ranges;
 	unsigned int num_ranges;
 
+	bool use_hwlock;
 	unsigned int hwlock_id;
 	unsigned int hwlock_mode;
 };
@@ -524,6 +536,10 @@ struct regmap *__regmap_init_ac97(struct snd_ac97 *ac97,
 				  const struct regmap_config *config,
 				  struct lock_class_key *lock_key,
 				  const char *lock_name);
+struct regmap *__regmap_init_sdw(struct sdw_slave *sdw,
+				 const struct regmap_config *config,
+				 struct lock_class_key *lock_key,
+				 const char *lock_name);
 
 struct regmap *__devm_regmap_init(struct device *dev,
 				  const struct regmap_bus *bus,
@@ -561,6 +577,10 @@ struct regmap *__devm_regmap_init_ac97(struct snd_ac97 *ac97,
 				       const struct regmap_config *config,
 				       struct lock_class_key *lock_key,
 				       const char *lock_name);
+struct regmap *__devm_regmap_init_sdw(struct sdw_slave *sdw,
+				 const struct regmap_config *config,
+				 struct lock_class_key *lock_key,
+				 const char *lock_name);
 
 /*
  * Wrapper for regmap_init macros to include a unique lockdep key and name
@@ -710,6 +730,20 @@ int regmap_attach_dev(struct device *dev, struct regmap *map,
 bool regmap_ac97_default_volatile(struct device *dev, unsigned int reg);
 
 /**
+ * regmap_init_sdw() - Initialise register map
+ *
+ * @sdw: Device that will be interacted with
+ * @config: Configuration for register map
+ *
+ * The return value will be an ERR_PTR() on error or a valid pointer to
+ * a struct regmap.
+ */
+#define regmap_init_sdw(sdw, config)					\
+	__regmap_lockdep_wrapper(__regmap_init_sdw, #config,		\
+				sdw, config)
+
+
+/**
  * devm_regmap_init() - Initialise managed register map
  *
  * @dev: Device that will be interacted with
@@ -839,6 +873,20 @@ bool regmap_ac97_default_volatile(struct device *dev, unsigned int reg);
 	__regmap_lockdep_wrapper(__devm_regmap_init_ac97, #config,	\
 				ac97, config)
 
+/**
+ * devm_regmap_init_sdw() - Initialise managed register map
+ *
+ * @sdw: Device that will be interacted with
+ * @config: Configuration for register map
+ *
+ * The return value will be an ERR_PTR() on error or a valid pointer
+ * to a struct regmap. The regmap will be automatically freed by the
+ * device management code.
+ */
+#define devm_regmap_init_sdw(sdw, config)				\
+	__regmap_lockdep_wrapper(__devm_regmap_init_sdw, #config,	\
+				sdw, config)
+
 void regmap_exit(struct regmap *map);
 int regmap_reinit_cache(struct regmap *map,
 			const struct regmap_config *config);
diff --git a/include/linux/regulator/driver.h b/include/linux/regulator/driver.h
index 94417b4..4c00486 100644
--- a/include/linux/regulator/driver.h
+++ b/include/linux/regulator/driver.h
@@ -214,6 +214,8 @@ struct regulator_ops {
 	/* set regulator suspend operating mode (defined in consumer.h) */
 	int (*set_suspend_mode) (struct regulator_dev *, unsigned int mode);
 
+	int (*resume_early)(struct regulator_dev *rdev);
+
 	int (*set_pull_down) (struct regulator_dev *);
 };
 
diff --git a/include/linux/regulator/machine.h b/include/linux/regulator/machine.h
index 9cd4fef..93a0489 100644
--- a/include/linux/regulator/machine.h
+++ b/include/linux/regulator/machine.h
@@ -42,6 +42,16 @@ struct regulator;
 #define REGULATOR_CHANGE_DRMS		0x10
 #define REGULATOR_CHANGE_BYPASS		0x20
 
+/*
+ * operations in suspend mode
+ * DO_NOTHING_IN_SUSPEND - the default value
+ * DISABLE_IN_SUSPEND	- turn off regulator in suspend states
+ * ENABLE_IN_SUSPEND	- keep regulator on in suspend states
+ */
+#define DO_NOTHING_IN_SUSPEND	(-1)
+#define DISABLE_IN_SUSPEND	0
+#define ENABLE_IN_SUSPEND	1
+
 /* Regulator active discharge flags */
 enum regulator_active_discharge {
 	REGULATOR_ACTIVE_DISCHARGE_DEFAULT,
@@ -56,16 +66,24 @@ enum regulator_active_discharge {
  * state.  One of enabled or disabled must be set for the
  * configuration to be applied.
  *
- * @uV: Operating voltage during suspend.
+ * @uV: Default operating voltage during suspend, it can be adjusted
+ *	among <min_uV, max_uV>.
+ * @min_uV: Minimum suspend voltage may be set.
+ * @max_uV: Maximum suspend voltage may be set.
  * @mode: Operating mode during suspend.
- * @enabled: Enabled during suspend.
- * @disabled: Disabled during suspend.
+ * @enabled: operations during suspend.
+ *	     - DO_NOTHING_IN_SUSPEND
+ *	     - DISABLE_IN_SUSPEND
+ *	     - ENABLE_IN_SUSPEND
+ * @changeable: Is this state can be switched between enabled/disabled,
  */
 struct regulator_state {
-	int uV;	/* suspend voltage */
-	unsigned int mode; /* suspend regulator operating mode */
-	int enabled; /* is regulator enabled in this suspend state */
-	int disabled; /* is the regulator disabled in this suspend state */
+	int uV;
+	int min_uV;
+	int max_uV;
+	unsigned int mode;
+	int enabled;
+	bool changeable;
 };
 
 /**
@@ -225,12 +243,12 @@ struct regulator_init_data {
 
 #ifdef CONFIG_REGULATOR
 void regulator_has_full_constraints(void);
-int regulator_suspend_prepare(suspend_state_t state);
-int regulator_suspend_finish(void);
 #else
 static inline void regulator_has_full_constraints(void)
 {
 }
+#endif
+
 static inline int regulator_suspend_prepare(suspend_state_t state)
 {
 	return 0;
@@ -239,6 +257,5 @@ static inline int regulator_suspend_finish(void)
 {
 	return 0;
 }
-#endif
 
 #endif
diff --git a/include/linux/rtnetlink.h b/include/linux/rtnetlink.h
index 2032ce2..1eadec3 100644
--- a/include/linux/rtnetlink.h
+++ b/include/linux/rtnetlink.h
@@ -70,8 +70,7 @@ static inline bool lockdep_rtnl_is_held(void)
  * @p: The pointer to read, prior to dereferencing
  *
  * Return the value of the specified RCU-protected pointer, but omit
- * both the smp_read_barrier_depends() and the READ_ONCE(), because
- * caller holds RTNL.
+ * the READ_ONCE(), because caller holds RTNL.
  */
 #define rtnl_dereference(p)					\
 	rcu_dereference_protected(p, lockdep_rtnl_is_held())
diff --git a/include/linux/mfd/rtsx_common.h b/include/linux/rtsx_common.h
similarity index 100%
rename from include/linux/mfd/rtsx_common.h
rename to include/linux/rtsx_common.h
diff --git a/include/linux/mfd/rtsx_pci.h b/include/linux/rtsx_pci.h
similarity index 83%
rename from include/linux/mfd/rtsx_pci.h
rename to include/linux/rtsx_pci.h
index c3d3f04..478acf6 100644
--- a/include/linux/mfd/rtsx_pci.h
+++ b/include/linux/rtsx_pci.h
@@ -24,7 +24,7 @@
 
 #include <linux/sched.h>
 #include <linux/pci.h>
-#include <linux/mfd/rtsx_common.h>
+#include <linux/rtsx_common.h>
 
 #define MAX_RW_REG_CNT			1024
 
@@ -203,6 +203,7 @@
 #define   SD_DDR_MODE			0x04
 #define   SD_30_MODE			0x08
 #define   SD_CLK_DIVIDE_MASK		0xC0
+#define   SD_MODE_SELECT_MASK		0x0C
 #define SD_CFG2				0xFDA1
 #define   SD_CALCULATE_CRC7		0x00
 #define   SD_NO_CALCULATE_CRC7		0x80
@@ -226,6 +227,7 @@
 #define   SD_RSP_TYPE_R6		0x01
 #define   SD_RSP_TYPE_R7		0x01
 #define SD_CFG3				0xFDA2
+#define   SD30_CLK_END_EN		0x10
 #define   SD_RSP_80CLK_TIMEOUT_EN	0x01
 
 #define SD_STAT1			0xFDA3
@@ -309,6 +311,12 @@
 
 #define SD_DATA_STATE			0xFDB6
 #define   SD_DATA_IDLE			0x80
+#define REG_SD_STOP_SDCLK_CFG		0xFDB8
+#define   SD30_CLK_STOP_CFG_EN		0x04
+#define   SD30_CLK_STOP_CFG1		0x02
+#define   SD30_CLK_STOP_CFG0		0x01
+#define REG_PRE_RW_MODE			0xFD70
+#define EN_INFINITE_MODE		0x01
 
 #define SRCTL				0xFC13
 
@@ -434,6 +442,7 @@
 #define CARD_CLK_EN			0xFD69
 #define   SD_CLK_EN			0x04
 #define   MS_CLK_EN			0x08
+#define   SD40_CLK_EN		0x10
 #define SDIO_CTRL			0xFD6B
 #define CD_PAD_CTL			0xFD73
 #define   CD_DISABLE_MASK		0x07
@@ -453,8 +462,8 @@
 #define FPDCTL				0xFC00
 #define   SSC_POWER_DOWN		0x01
 #define   SD_OC_POWER_DOWN		0x02
-#define   ALL_POWER_DOWN		0x07
-#define   OC_POWER_DOWN			0x06
+#define   ALL_POWER_DOWN		0x03
+#define   OC_POWER_DOWN			0x02
 #define PDINFO				0xFC01
 
 #define CLK_CTL				0xFC02
@@ -490,6 +499,9 @@
 
 #define FPGA_PULL_CTL			0xFC1D
 #define OLT_LED_CTL			0xFC1E
+#define   LED_SHINE_MASK		0x08
+#define   LED_SHINE_EN			0x08
+#define   LED_SHINE_DISABLE		0x00
 #define GPIO_CTL			0xFC1F
 
 #define LDO_CTL				0xFC1E
@@ -511,7 +523,11 @@
 #define   BPP_LDO_ON			0x00
 #define   BPP_LDO_SUSPEND		0x02
 #define   BPP_LDO_OFF			0x03
+#define EFUSE_CTL			0xFC30
+#define EFUSE_ADD			0xFC31
 #define SYS_VER				0xFC32
+#define EFUSE_DATAL			0xFC34
+#define EFUSE_DATAH			0xFC35
 
 #define CARD_PULL_CTL1			0xFD60
 #define CARD_PULL_CTL2			0xFD61
@@ -553,6 +569,9 @@
 #define RBBC1				0xFE2F
 #define RBDAT				0xFE30
 #define RBCTL				0xFE34
+#define   U_AUTO_DMA_EN_MASK		0x20
+#define   U_AUTO_DMA_DISABLE		0x00
+#define   RB_FLUSH			0x80
 #define CFGADDR0			0xFE35
 #define CFGADDR1			0xFE36
 #define CFGDATA0			0xFE37
@@ -581,6 +600,8 @@
 #define LTR_LATENCY_MODE_HW		0
 #define LTR_LATENCY_MODE_SW		BIT(6)
 #define OBFF_CFG			0xFE4C
+#define   OBFF_EN_MASK			0x03
+#define   OBFF_DISABLE			0x00
 
 #define CDRESUMECTL			0xFE52
 #define WAKE_SEL_CTL			0xFE54
@@ -595,6 +616,7 @@
 #define   FORCE_ASPM_L0_EN		0x01
 #define   FORCE_ASPM_NO_ASPM		0x00
 #define PM_CLK_FORCE_CTL		0xFE58
+#define   CLK_PM_EN			0x01
 #define FUNC_FORCE_CTL			0xFE59
 #define   FUNC_FORCE_UPME_XMT_DBG	0x02
 #define PERST_GLITCH_WIDTH		0xFE5C
@@ -620,14 +642,23 @@
 #define LDO_PWR_SEL			0xFE78
 
 #define L1SUB_CONFIG1			0xFE8D
+#define   AUX_CLK_ACTIVE_SEL_MASK	0x01
+#define   MAC_CKSW_DONE			0x00
 #define L1SUB_CONFIG2			0xFE8E
 #define   L1SUB_AUTO_CFG		0x02
 #define L1SUB_CONFIG3			0xFE8F
 #define   L1OFF_MBIAS2_EN_5250		BIT(7)
 
 #define DUMMY_REG_RESET_0		0xFE90
+#define   IC_VERSION_MASK		0x0F
 
+#define REG_VREF			0xFE97
+#define   PWD_SUSPND_EN			0x10
+#define RTS5260_DMA_RST_CTL_0		0xFEBF
+#define   RTS5260_DMA_RST		0x80
+#define   RTS5260_ADMA3_RST		0x40
 #define AUTOLOAD_CFG_BASE		0xFF00
+#define RELINK_TIME_MASK		0x01
 #define PETXCFG				0xFF03
 #define FORCE_CLKREQ_DELINK_MASK	BIT(7)
 #define FORCE_CLKREQ_LOW	0x80
@@ -667,15 +698,24 @@
 #define LDO_DV18_CFG			0xFF70
 #define   LDO_DV18_SR_MASK		0xC0
 #define   LDO_DV18_SR_DF		0x40
+#define   DV331812_MASK			0x70
+#define   DV331812_33			0x70
+#define   DV331812_17			0x30
 
 #define LDO_CONFIG2			0xFF71
 #define   LDO_D3318_MASK		0x07
 #define   LDO_D3318_33V			0x07
 #define   LDO_D3318_18V			0x02
+#define   DV331812_VDD1			0x04
+#define   DV331812_POWERON		0x08
+#define   DV331812_POWEROFF		0x00
 
 #define LDO_VCC_CFG0			0xFF72
 #define   LDO_VCC_LMTVTH_MASK		0x30
 #define   LDO_VCC_LMTVTH_2A		0x10
+/*RTS5260*/
+#define   RTS5260_DVCC_TUNE_MASK	0x70
+#define   RTS5260_DVCC_33		0x70
 
 #define LDO_VCC_CFG1			0xFF73
 #define   LDO_VCC_REF_TUNE_MASK		0x30
@@ -684,6 +724,10 @@
 #define   LDO_VCC_1V8			0x04
 #define   LDO_VCC_3V3			0x07
 #define   LDO_VCC_LMT_EN		0x08
+/*RTS5260*/
+#define	  LDO_POW_SDVDD1_MASK		0x08
+#define	  LDO_POW_SDVDD1_ON		0x08
+#define	  LDO_POW_SDVDD1_OFF		0x00
 
 #define LDO_VIO_CFG			0xFF75
 #define   LDO_VIO_SR_MASK		0xC0
@@ -711,6 +755,160 @@
 #define   SD_VIO_LDO_1V8		0x40
 #define   SD_VIO_LDO_3V3		0x70
 
+#define RTS5260_AUTOLOAD_CFG4		0xFF7F
+#define   RTS5260_MIMO_DISABLE		0x8A
+
+#define RTS5260_REG_GPIO_CTL0		0xFC1A
+#define   RTS5260_REG_GPIO_MASK		0x01
+#define   RTS5260_REG_GPIO_ON		0x01
+#define   RTS5260_REG_GPIO_OFF		0x00
+
+#define PWR_GLOBAL_CTRL			0xF200
+#define PCIE_L1_2_EN			0x0C
+#define PCIE_L1_1_EN			0x0A
+#define PCIE_L1_0_EN			0x09
+#define PWR_FE_CTL			0xF201
+#define PCIE_L1_2_PD_FE_EN		0x0C
+#define PCIE_L1_1_PD_FE_EN		0x0A
+#define PCIE_L1_0_PD_FE_EN		0x09
+#define CFG_PCIE_APHY_OFF_0		0xF204
+#define CFG_PCIE_APHY_OFF_0_DEFAULT	0xBF
+#define CFG_PCIE_APHY_OFF_1		0xF205
+#define CFG_PCIE_APHY_OFF_1_DEFAULT	0xFF
+#define CFG_PCIE_APHY_OFF_2		0xF206
+#define CFG_PCIE_APHY_OFF_2_DEFAULT	0x01
+#define CFG_PCIE_APHY_OFF_3		0xF207
+#define CFG_PCIE_APHY_OFF_3_DEFAULT	0x00
+#define CFG_L1_0_PCIE_MAC_RET_VALUE	0xF20C
+#define CFG_L1_0_PCIE_DPHY_RET_VALUE	0xF20E
+#define CFG_L1_0_SYS_RET_VALUE		0xF210
+#define CFG_L1_0_CRC_MISC_RET_VALUE	0xF212
+#define CFG_L1_0_CRC_SD30_RET_VALUE	0xF214
+#define CFG_L1_0_CRC_SD40_RET_VALUE	0xF216
+#define CFG_LP_FPWM_VALUE		0xF219
+#define CFG_LP_FPWM_VALUE_DEFAULT	0x18
+#define PWC_CDR				0xF253
+#define PWC_CDR_DEFAULT			0x03
+#define CFG_L1_0_RET_VALUE_DEFAULT	0x1B
+#define CFG_L1_0_CRC_MISC_RET_VALUE_DEFAULT	0x0C
+
+/* OCPCTL */
+#define SD_DETECT_EN			0x08
+#define SD_OCP_INT_EN			0x04
+#define SD_OCP_INT_CLR			0x02
+#define SD_OC_CLR			0x01
+
+#define SDVIO_DETECT_EN			(1 << 7)
+#define SDVIO_OCP_INT_EN		(1 << 6)
+#define SDVIO_OCP_INT_CLR		(1 << 5)
+#define SDVIO_OC_CLR			(1 << 4)
+
+/* OCPSTAT */
+#define SD_OCP_DETECT			0x08
+#define SD_OC_NOW			0x04
+#define SD_OC_EVER			0x02
+
+#define SDVIO_OC_NOW			(1 << 6)
+#define SDVIO_OC_EVER			(1 << 5)
+
+#define REG_OCPCTL			0xFD6A
+#define REG_OCPSTAT			0xFD6E
+#define REG_OCPGLITCH			0xFD6C
+#define REG_OCPPARA1			0xFD6B
+#define REG_OCPPARA2			0xFD6D
+
+/* rts5260 DV3318 OCP-related registers */
+#define REG_DV3318_OCPCTL		0xFD89
+#define DV3318_OCP_TIME_MASK	0xF0
+#define DV3318_DETECT_EN		0x08
+#define DV3318_OCP_INT_EN		0x04
+#define DV3318_OCP_INT_CLR		0x02
+#define DV3318_OCP_CLR			0x01
+
+#define REG_DV3318_OCPSTAT		0xFD8A
+#define DV3318_OCP_GlITCH_TIME_MASK	0xF0
+#define DV3318_OCP_DETECT		0x08
+#define DV3318_OCP_NOW			0x04
+#define DV3318_OCP_EVER			0x02
+
+#define SD_OCP_GLITCH_MASK		0x0F
+
+/* OCPPARA1 */
+#define SDVIO_OCP_TIME_60		0x00
+#define SDVIO_OCP_TIME_100		0x10
+#define SDVIO_OCP_TIME_200		0x20
+#define SDVIO_OCP_TIME_400		0x30
+#define SDVIO_OCP_TIME_600		0x40
+#define SDVIO_OCP_TIME_800		0x50
+#define SDVIO_OCP_TIME_1100		0x60
+#define SDVIO_OCP_TIME_MASK		0x70
+
+#define SD_OCP_TIME_60			0x00
+#define SD_OCP_TIME_100			0x01
+#define SD_OCP_TIME_200			0x02
+#define SD_OCP_TIME_400			0x03
+#define SD_OCP_TIME_600			0x04
+#define SD_OCP_TIME_800			0x05
+#define SD_OCP_TIME_1100		0x06
+#define SD_OCP_TIME_MASK		0x07
+
+/* OCPPARA2 */
+#define SDVIO_OCP_THD_190		0x00
+#define SDVIO_OCP_THD_250		0x10
+#define SDVIO_OCP_THD_320		0x20
+#define SDVIO_OCP_THD_380		0x30
+#define SDVIO_OCP_THD_440		0x40
+#define SDVIO_OCP_THD_500		0x50
+#define SDVIO_OCP_THD_570		0x60
+#define SDVIO_OCP_THD_630		0x70
+#define SDVIO_OCP_THD_MASK		0x70
+
+#define SD_OCP_THD_450			0x00
+#define SD_OCP_THD_550			0x01
+#define SD_OCP_THD_650			0x02
+#define SD_OCP_THD_750			0x03
+#define SD_OCP_THD_850			0x04
+#define SD_OCP_THD_950			0x05
+#define SD_OCP_THD_1050			0x06
+#define SD_OCP_THD_1150			0x07
+#define SD_OCP_THD_MASK			0x07
+
+#define SDVIO_OCP_GLITCH_MASK		0xF0
+#define SDVIO_OCP_GLITCH_NONE		0x00
+#define SDVIO_OCP_GLITCH_50U		0x10
+#define SDVIO_OCP_GLITCH_100U		0x20
+#define SDVIO_OCP_GLITCH_200U		0x30
+#define SDVIO_OCP_GLITCH_600U		0x40
+#define SDVIO_OCP_GLITCH_800U		0x50
+#define SDVIO_OCP_GLITCH_1M		0x60
+#define SDVIO_OCP_GLITCH_2M		0x70
+#define SDVIO_OCP_GLITCH_3M		0x80
+#define SDVIO_OCP_GLITCH_4M		0x90
+#define SDVIO_OCP_GLIVCH_5M		0xA0
+#define SDVIO_OCP_GLITCH_6M		0xB0
+#define SDVIO_OCP_GLITCH_7M		0xC0
+#define SDVIO_OCP_GLITCH_8M		0xD0
+#define SDVIO_OCP_GLITCH_9M		0xE0
+#define SDVIO_OCP_GLITCH_10M		0xF0
+
+#define SD_OCP_GLITCH_MASK		0x0F
+#define SD_OCP_GLITCH_NONE		0x00
+#define SD_OCP_GLITCH_50U		0x01
+#define SD_OCP_GLITCH_100U		0x02
+#define SD_OCP_GLITCH_200U		0x03
+#define SD_OCP_GLITCH_600U		0x04
+#define SD_OCP_GLITCH_800U		0x05
+#define SD_OCP_GLITCH_1M		0x06
+#define SD_OCP_GLITCH_2M		0x07
+#define SD_OCP_GLITCH_3M		0x08
+#define SD_OCP_GLITCH_4M		0x09
+#define SD_OCP_GLIVCH_5M		0x0A
+#define SD_OCP_GLITCH_6M		0x0B
+#define SD_OCP_GLITCH_7M		0x0C
+#define SD_OCP_GLITCH_8M		0x0D
+#define SD_OCP_GLITCH_9M		0x0E
+#define SD_OCP_GLITCH_10M		0x0F
+
 /* Phy register */
 #define PHY_PCR				0x00
 #define   PHY_PCR_FORCE_CODE		0xB000
@@ -857,6 +1055,7 @@
 
 #define PCR_ASPM_SETTING_REG1		0x160
 #define PCR_ASPM_SETTING_REG2		0x168
+#define PCR_ASPM_SETTING_5260		0x178
 
 #define PCR_SETTING_REG1		0x724
 #define PCR_SETTING_REG2		0x814
@@ -890,6 +1089,7 @@ struct pcr_ops {
 	int		(*conv_clk_and_div_n)(int clk, int dir);
 	void		(*fetch_vendor_settings)(struct rtsx_pcr *pcr);
 	void		(*force_power_down)(struct rtsx_pcr *pcr, u8 pm_state);
+	void		(*stop_cmd)(struct rtsx_pcr *pcr);
 
 	void (*set_aspm)(struct rtsx_pcr *pcr, bool enable);
 	int (*set_ltr_latency)(struct rtsx_pcr *pcr, u32 latency);
@@ -897,6 +1097,12 @@ struct pcr_ops {
 	void (*set_l1off_cfg_sub_d0)(struct rtsx_pcr *pcr, int active);
 	void (*full_on)(struct rtsx_pcr *pcr);
 	void (*power_saving)(struct rtsx_pcr *pcr);
+	void (*enable_ocp)(struct rtsx_pcr *pcr);
+	void (*disable_ocp)(struct rtsx_pcr *pcr);
+	void (*init_ocp)(struct rtsx_pcr *pcr);
+	void (*process_ocp)(struct rtsx_pcr *pcr);
+	int (*get_ocpstat)(struct rtsx_pcr *pcr, u8 *val);
+	void (*clear_ocpstat)(struct rtsx_pcr *pcr);
 };
 
 enum PDEV_STAT  {PDEV_STAT_IDLE, PDEV_STAT_RUN};
@@ -935,6 +1141,9 @@ enum dev_aspm_mode {
  * @l1_snooze_delay: l1 snooze delay
  * @ltr_l1off_sspwrgate: ltr l1off sspwrgate
  * @ltr_l1off_snooze_sspwrgate: ltr l1off snooze sspwrgate
+ * @ocp_en: enable ocp flag
+ * @sd_400mA_ocp_thd: 400mA ocp thd
+ * @sd_800mA_ocp_thd: 800mA ocp thd
  */
 struct rtsx_cr_option {
 	u32 dev_flags;
@@ -949,6 +1158,19 @@ struct rtsx_cr_option {
 	u32 l1_snooze_delay;
 	u8 ltr_l1off_sspwrgate;
 	u8 ltr_l1off_snooze_sspwrgate;
+	bool ocp_en;
+	u8 sd_400mA_ocp_thd;
+	u8 sd_800mA_ocp_thd;
+};
+
+/*
+ * struct rtsx_hw_param  - card reader hardware param
+ * @interrupt_en: indicate which interrutp enable
+ * @ocp_glitch: ocp glitch time
+ */
+struct rtsx_hw_param {
+	u32 interrupt_en;
+	u8 ocp_glitch;
 };
 
 #define rtsx_set_dev_flag(cr, flag) \
@@ -963,6 +1185,7 @@ struct rtsx_pcr {
 	unsigned int			id;
 	int				pcie_cap;
 	struct rtsx_cr_option	option;
+	struct rtsx_hw_param hw_param;
 
 	/* pci resources */
 	unsigned long			addr;
@@ -1042,12 +1265,15 @@ struct rtsx_pcr {
 	struct rtsx_slot		*slots;
 
 	u8				dma_error_count;
+	u8			ocp_stat;
+	u8			ocp_stat2;
 };
 
 #define PID_524A	0x524A
-#define PID_5249		0x5249
-#define PID_5250		0x5250
+#define PID_5249	0x5249
+#define PID_5250	0x5250
 #define PID_525A	0x525A
+#define PID_5260	0x5260
 
 #define CHK_PCI_PID(pcr, pid)		((pcr)->pci->device == (pid))
 #define PCI_VID(pcr)			((pcr)->pci->vendor)
diff --git a/include/linux/mfd/rtsx_usb.h b/include/linux/rtsx_usb.h
similarity index 100%
rename from include/linux/mfd/rtsx_usb.h
rename to include/linux/rtsx_usb.h
diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index b7c8325..22b2131 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -276,6 +276,17 @@ int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
 			      unsigned int n_pages, unsigned int offset,
 			      unsigned long size, gfp_t gfp_mask);
 
+#ifdef CONFIG_SGL_ALLOC
+struct scatterlist *sgl_alloc_order(unsigned long long length,
+				    unsigned int order, bool chainable,
+				    gfp_t gfp, unsigned int *nent_p);
+struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp,
+			      unsigned int *nent_p);
+void sgl_free_n_order(struct scatterlist *sgl, int nents, int order);
+void sgl_free_order(struct scatterlist *sgl, int order);
+void sgl_free(struct scatterlist *sgl);
+#endif /* CONFIG_SGL_ALLOC */
+
 size_t sg_copy_buffer(struct scatterlist *sgl, unsigned int nents, void *buf,
 		      size_t buflen, off_t skip, bool to_buffer);
 
diff --git a/include/linux/sched.h b/include/linux/sched.h
index d258826..166144c 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -472,11 +472,15 @@ struct sched_dl_entity {
 	 * has not been executed yet. This flag is useful to avoid race
 	 * conditions between the inactive timer handler and the wakeup
 	 * code.
+	 *
+	 * @dl_overrun tells if the task asked to be informed about runtime
+	 * overruns.
 	 */
 	unsigned int			dl_throttled      : 1;
 	unsigned int			dl_boosted        : 1;
 	unsigned int			dl_yielded        : 1;
 	unsigned int			dl_non_contending : 1;
+	unsigned int			dl_overrun	  : 1;
 
 	/*
 	 * Bandwidth enforcement timer. Each -deadline task has its
@@ -1427,6 +1431,7 @@ extern int idle_cpu(int cpu);
 extern int sched_setscheduler(struct task_struct *, int, const struct sched_param *);
 extern int sched_setscheduler_nocheck(struct task_struct *, int, const struct sched_param *);
 extern int sched_setattr(struct task_struct *, const struct sched_attr *);
+extern int sched_setattr_nocheck(struct task_struct *, const struct sched_attr *);
 extern struct task_struct *idle_task(int cpu);
 
 /**
@@ -1446,12 +1451,21 @@ extern void ia64_set_curr_task(int cpu, struct task_struct *p);
 void yield(void);
 
 union thread_union {
+#ifndef CONFIG_ARCH_TASK_STRUCT_ON_STACK
+	struct task_struct task;
+#endif
 #ifndef CONFIG_THREAD_INFO_IN_TASK
 	struct thread_info thread_info;
 #endif
 	unsigned long stack[THREAD_SIZE/sizeof(long)];
 };
 
+#ifndef CONFIG_THREAD_INFO_IN_TASK
+extern struct thread_info init_thread_info;
+#endif
+
+extern unsigned long init_stack[THREAD_SIZE / sizeof(unsigned long)];
+
 #ifdef CONFIG_THREAD_INFO_IN_TASK
 static inline struct thread_info *task_thread_info(struct task_struct *task)
 {
diff --git a/include/linux/sched/cpufreq.h b/include/linux/sched/cpufreq.h
index d1ad3d8..0b55834 100644
--- a/include/linux/sched/cpufreq.h
+++ b/include/linux/sched/cpufreq.h
@@ -12,8 +12,6 @@
 #define SCHED_CPUFREQ_DL	(1U << 1)
 #define SCHED_CPUFREQ_IOWAIT	(1U << 2)
 
-#define SCHED_CPUFREQ_RT_DL	(SCHED_CPUFREQ_RT | SCHED_CPUFREQ_DL)
-
 #ifdef CONFIG_CPU_FREQ
 struct update_util_data {
        void (*func)(struct update_util_data *data, u64 time, unsigned int flags);
diff --git a/include/linux/sched/task_stack.h b/include/linux/sched/task_stack.h
index cb4828a..6a84192 100644
--- a/include/linux/sched/task_stack.h
+++ b/include/linux/sched/task_stack.h
@@ -78,7 +78,7 @@ static inline void put_task_stack(struct task_struct *tsk) {}
 #define task_stack_end_corrupted(task) \
 		(*(end_of_stack(task)) != STACK_END_MAGIC)
 
-static inline int object_is_on_stack(void *obj)
+static inline int object_is_on_stack(const void *obj)
 {
 	void *stack = task_stack_page(current);
 
diff --git a/include/linux/sched/topology.h b/include/linux/sched/topology.h
index cf257c2..2634774 100644
--- a/include/linux/sched/topology.h
+++ b/include/linux/sched/topology.h
@@ -7,6 +7,12 @@
 #include <linux/sched/idle.h>
 
 /*
+ * Increase resolution of cpu_capacity calculations
+ */
+#define SCHED_CAPACITY_SHIFT	SCHED_FIXEDPOINT_SHIFT
+#define SCHED_CAPACITY_SCALE	(1L << SCHED_CAPACITY_SHIFT)
+
+/*
  * sched-domains (multiprocessor balancing) declarations:
  */
 #ifdef CONFIG_SMP
@@ -27,12 +33,6 @@
 #define SD_OVERLAP		0x2000	/* sched_domains of this level overlap */
 #define SD_NUMA			0x4000	/* cross-node balancing */
 
-/*
- * Increase resolution of cpu_capacity calculations
- */
-#define SCHED_CAPACITY_SHIFT	SCHED_FIXEDPOINT_SHIFT
-#define SCHED_CAPACITY_SCALE	(1L << SCHED_CAPACITY_SHIFT)
-
 #ifdef CONFIG_SCHED_SMT
 static inline int cpu_smt_flags(void)
 {
diff --git a/include/linux/seqlock.h b/include/linux/seqlock.h
index f189a8a..bcf4cf2 100644
--- a/include/linux/seqlock.h
+++ b/include/linux/seqlock.h
@@ -278,9 +278,8 @@ static inline void raw_write_seqcount_barrier(seqcount_t *s)
 
 static inline int raw_read_seqcount_latch(seqcount_t *s)
 {
-	int seq = READ_ONCE(s->sequence);
 	/* Pairs with the first smp_wmb() in raw_write_seqcount_latch() */
-	smp_read_barrier_depends();
+	int seq = READ_ONCE(s->sequence); /* ^^^ */
 	return seq;
 }
 
diff --git a/include/linux/serdev.h b/include/linux/serdev.h
index d609e6d..d4bb46a 100644
--- a/include/linux/serdev.h
+++ b/include/linux/serdev.h
@@ -193,6 +193,7 @@ static inline int serdev_controller_receive_buf(struct serdev_controller *ctrl,
 
 int serdev_device_open(struct serdev_device *);
 void serdev_device_close(struct serdev_device *);
+int devm_serdev_device_open(struct device *, struct serdev_device *);
 unsigned int serdev_device_set_baudrate(struct serdev_device *, unsigned int);
 void serdev_device_set_flow_control(struct serdev_device *, bool);
 int serdev_device_write_buf(struct serdev_device *, const unsigned char *, size_t);
diff --git a/include/linux/sh_eth.h b/include/linux/sh_eth.h
index ff3642d..94081e9 100644
--- a/include/linux/sh_eth.h
+++ b/include/linux/sh_eth.h
@@ -17,7 +17,6 @@ struct sh_eth_plat_data {
 	unsigned char mac_addr[ETH_ALEN];
 	unsigned no_ether_link:1;
 	unsigned ether_link_active_low:1;
-	unsigned needs_init:1;
 };
 
 #endif
diff --git a/include/linux/sound.h b/include/linux/sound.h
index 3c6d393..ec85b7a 100644
--- a/include/linux/sound.h
+++ b/include/linux/sound.h
@@ -12,11 +12,9 @@ struct device;
 extern int register_sound_special(const struct file_operations *fops, int unit);
 extern int register_sound_special_device(const struct file_operations *fops, int unit, struct device *dev);
 extern int register_sound_mixer(const struct file_operations *fops, int dev);
-extern int register_sound_midi(const struct file_operations *fops, int dev);
 extern int register_sound_dsp(const struct file_operations *fops, int dev);
 
 extern void unregister_sound_special(int unit);
 extern void unregister_sound_mixer(int unit);
-extern void unregister_sound_midi(int unit);
 extern void unregister_sound_dsp(int unit);
 #endif /* _LINUX_SOUND_H */
diff --git a/include/linux/srcu.h b/include/linux/srcu.h
index 62be896..33c1c69 100644
--- a/include/linux/srcu.h
+++ b/include/linux/srcu.h
@@ -92,7 +92,7 @@ void synchronize_srcu(struct srcu_struct *sp);
  * relies on normal RCU, it can be called from the CPU which
  * is in the idle loop from an RCU point of view or offline.
  */
-static inline int srcu_read_lock_held(struct srcu_struct *sp)
+static inline int srcu_read_lock_held(const struct srcu_struct *sp)
 {
 	if (!debug_lockdep_rcu_enabled())
 		return 1;
@@ -101,7 +101,7 @@ static inline int srcu_read_lock_held(struct srcu_struct *sp)
 
 #else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
 
-static inline int srcu_read_lock_held(struct srcu_struct *sp)
+static inline int srcu_read_lock_held(const struct srcu_struct *sp)
 {
 	return 1;
 }
diff --git a/include/linux/srcutree.h b/include/linux/srcutree.h
index a949f4f..4eda108 100644
--- a/include/linux/srcutree.h
+++ b/include/linux/srcutree.h
@@ -40,7 +40,7 @@ struct srcu_data {
 	unsigned long srcu_unlock_count[2];	/* Unlocks per CPU. */
 
 	/* Update-side state. */
-	raw_spinlock_t __private lock ____cacheline_internodealigned_in_smp;
+	spinlock_t __private lock ____cacheline_internodealigned_in_smp;
 	struct rcu_segcblist srcu_cblist;	/* List of callbacks.*/
 	unsigned long srcu_gp_seq_needed;	/* Furthest future GP needed. */
 	unsigned long srcu_gp_seq_needed_exp;	/* Furthest future exp GP. */
@@ -58,7 +58,7 @@ struct srcu_data {
  * Node in SRCU combining tree, similar in function to rcu_data.
  */
 struct srcu_node {
-	raw_spinlock_t __private lock;
+	spinlock_t __private lock;
 	unsigned long srcu_have_cbs[4];		/* GP seq for children */
 						/*  having CBs, but only */
 						/*  is > ->srcu_gq_seq. */
@@ -78,7 +78,7 @@ struct srcu_struct {
 	struct srcu_node *level[RCU_NUM_LVLS + 1];
 						/* First node at each level. */
 	struct mutex srcu_cb_mutex;		/* Serialize CB preparation. */
-	raw_spinlock_t __private lock;		/* Protect counters */
+	spinlock_t __private lock;		/* Protect counters */
 	struct mutex srcu_gp_mutex;		/* Serialize GP work. */
 	unsigned int srcu_idx;			/* Current rdr array element. */
 	unsigned long srcu_gp_seq;		/* Grace-period seq #. */
@@ -107,7 +107,7 @@ struct srcu_struct {
 #define __SRCU_STRUCT_INIT(name)					\
 	{								\
 		.sda = &name##_srcu_data,				\
-		.lock = __RAW_SPIN_LOCK_UNLOCKED(name.lock),		\
+		.lock = __SPIN_LOCK_UNLOCKED(name.lock),		\
 		.srcu_gp_seq_needed = 0 - 1,				\
 		__SRCU_DEP_MAP_INIT(name)				\
 	}
diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index d60b0f5..cc22a24 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -443,32 +443,8 @@ extern bool pm_save_wakeup_count(unsigned int count);
 extern void pm_wakep_autosleep_enabled(bool set);
 extern void pm_print_active_wakeup_sources(void);
 
-static inline void lock_system_sleep(void)
-{
-	current->flags |= PF_FREEZER_SKIP;
-	mutex_lock(&pm_mutex);
-}
-
-static inline void unlock_system_sleep(void)
-{
-	/*
-	 * Don't use freezer_count() because we don't want the call to
-	 * try_to_freeze() here.
-	 *
-	 * Reason:
-	 * Fundamentally, we just don't need it, because freezing condition
-	 * doesn't come into effect until we release the pm_mutex lock,
-	 * since the freezer always works with pm_mutex held.
-	 *
-	 * More importantly, in the case of hibernation,
-	 * unlock_system_sleep() gets called in snapshot_read() and
-	 * snapshot_write() when the freezing condition is still in effect.
-	 * Which means, if we use try_to_freeze() here, it would make them
-	 * enter the refrigerator, thus causing hibernation to lockup.
-	 */
-	current->flags &= ~PF_FREEZER_SKIP;
-	mutex_unlock(&pm_mutex);
-}
+extern void lock_system_sleep(void);
+extern void unlock_system_sleep(void);
 
 #else /* !CONFIG_PM_SLEEP */
 
diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 9c5a262..1d3877c 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -124,6 +124,11 @@ static inline bool is_write_device_private_entry(swp_entry_t entry)
 	return unlikely(swp_type(entry) == SWP_DEVICE_WRITE);
 }
 
+static inline unsigned long device_private_entry_to_pfn(swp_entry_t entry)
+{
+	return swp_offset(entry);
+}
+
 static inline struct page *device_private_entry_to_page(swp_entry_t entry)
 {
 	return pfn_to_page(swp_offset(entry));
@@ -154,6 +159,11 @@ static inline bool is_write_device_private_entry(swp_entry_t entry)
 	return false;
 }
 
+static inline unsigned long device_private_entry_to_pfn(swp_entry_t entry)
+{
+	return 0;
+}
+
 static inline struct page *device_private_entry_to_page(swp_entry_t entry)
 {
 	return NULL;
@@ -189,6 +199,11 @@ static inline int is_write_migration_entry(swp_entry_t entry)
 	return unlikely(swp_type(entry) == SWP_MIGRATION_WRITE);
 }
 
+static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
+{
+	return swp_offset(entry);
+}
+
 static inline struct page *migration_entry_to_page(swp_entry_t entry)
 {
 	struct page *p = pfn_to_page(swp_offset(entry));
@@ -218,6 +233,12 @@ static inline int is_migration_entry(swp_entry_t swp)
 {
 	return 0;
 }
+
+static inline unsigned long migration_entry_to_pfn(swp_entry_t entry)
+{
+	return 0;
+}
+
 static inline struct page *migration_entry_to_page(swp_entry_t entry)
 {
 	return NULL;
diff --git a/include/linux/torture.h b/include/linux/torture.h
index a45702e..6627286 100644
--- a/include/linux/torture.h
+++ b/include/linux/torture.h
@@ -79,7 +79,7 @@ void stutter_wait(const char *title);
 int torture_stutter_init(int s);
 
 /* Initialization and cleanup. */
-bool torture_init_begin(char *ttype, bool v, int *runnable);
+bool torture_init_begin(char *ttype, bool v);
 void torture_init_end(void);
 bool torture_cleanup_begin(void);
 void torture_cleanup_end(void);
@@ -96,4 +96,10 @@ void _torture_stop_kthread(char *m, struct task_struct **tp);
 #define torture_stop_kthread(n, tp) \
 	_torture_stop_kthread("Stopping " #n " task", &(tp))
 
+#ifdef CONFIG_PREEMPT
+#define torture_preempt_schedule() preempt_schedule()
+#else
+#define torture_preempt_schedule()
+#endif
+
 #endif /* __LINUX_TORTURE_H */
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index a26ffbe..c94f466d 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -137,11 +137,8 @@ extern void syscall_unregfunc(void);
 									\
 		if (!(cond))						\
 			return;						\
-		if (rcucheck) {						\
-			if (WARN_ON_ONCE(rcu_irq_enter_disabled()))	\
-				return;					\
+		if (rcucheck)						\
 			rcu_irq_enter_irqson();				\
-		}							\
 		rcu_read_lock_sched_notrace();				\
 		it_func_ptr = rcu_dereference_sched((tp)->funcs);	\
 		if (it_func_ptr) {					\
diff --git a/include/net/arp.h b/include/net/arp.h
index dc8cd47..977aabf 100644
--- a/include/net/arp.h
+++ b/include/net/arp.h
@@ -20,6 +20,9 @@ static inline u32 arp_hashfn(const void *pkey, const struct net_device *dev, u32
 
 static inline struct neighbour *__ipv4_neigh_lookup_noref(struct net_device *dev, u32 key)
 {
+	if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+		key = INADDR_ANY;
+
 	return ___neigh_lookup_noref(&arp_tbl, neigh_key_eq32, arp_hashfn, &key, dev);
 }
 
diff --git a/include/net/cfg80211.h b/include/net/cfg80211.h
index cb4d92b..fb94a8b 100644
--- a/include/net/cfg80211.h
+++ b/include/net/cfg80211.h
@@ -815,6 +815,8 @@ struct cfg80211_csa_settings {
 	u8 count;
 };
 
+#define CFG80211_MAX_NUM_DIFFERENT_CHANNELS 10
+
 /**
  * struct iface_combination_params - input parameters for interface combinations
  *
diff --git a/include/net/dst.h b/include/net/dst.h
index b091fd5..d49d607 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -521,4 +521,12 @@ static inline struct xfrm_state *dst_xfrm(const struct dst_entry *dst)
 }
 #endif
 
+static inline void skb_dst_update_pmtu(struct sk_buff *skb, u32 mtu)
+{
+	struct dst_entry *dst = skb_dst(skb);
+
+	if (dst && dst->ops->update_pmtu)
+		dst->ops->update_pmtu(dst, NULL, skb, mtu);
+}
+
 #endif /* _NET_DST_H */
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index f73797e..2212382 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -331,6 +331,7 @@ int ipv6_flowlabel_opt_get(struct sock *sk, struct in6_flowlabel_req *freq,
 			   int flags);
 int ip6_flowlabel_init(void);
 void ip6_flowlabel_cleanup(void);
+bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np);
 
 static inline void fl6_sock_release(struct ip6_flowlabel *fl)
 {
diff --git a/include/net/net_namespace.h b/include/net/net_namespace.h
index 10f99da..0490084 100644
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -223,6 +223,11 @@ int net_eq(const struct net *net1, const struct net *net2)
 	return net1 == net2;
 }
 
+static inline int check_net(const struct net *net)
+{
+	return atomic_read(&net->count) != 0;
+}
+
 void net_drop_ns(void *);
 
 #else
@@ -247,6 +252,11 @@ int net_eq(const struct net *net1, const struct net *net2)
 	return 1;
 }
 
+static inline int check_net(const struct net *net)
+{
+	return 1;
+}
+
 #define net_drop_ns NULL
 #endif
 
diff --git a/include/net/pkt_cls.h b/include/net/pkt_cls.h
index 8e08b6d..753ac93 100644
--- a/include/net/pkt_cls.h
+++ b/include/net/pkt_cls.h
@@ -522,7 +522,7 @@ static inline unsigned char * tcf_get_base_ptr(struct sk_buff *skb, int layer)
 {
 	switch (layer) {
 		case TCF_LAYER_LINK:
-			return skb->data;
+			return skb_mac_header(skb);
 		case TCF_LAYER_NETWORK:
 			return skb_network_header(skb);
 		case TCF_LAYER_TRANSPORT:
diff --git a/include/net/sch_generic.h b/include/net/sch_generic.h
index 83a3e47..becf86a 100644
--- a/include/net/sch_generic.h
+++ b/include/net/sch_generic.h
@@ -179,6 +179,7 @@ struct Qdisc_ops {
 	const struct Qdisc_class_ops	*cl_ops;
 	char			id[IFNAMSIZ];
 	int			priv_size;
+	unsigned int		static_flags;
 
 	int 			(*enqueue)(struct sk_buff *skb,
 					   struct Qdisc *sch,
@@ -444,6 +445,7 @@ void qdisc_tree_reduce_backlog(struct Qdisc *qdisc, unsigned int n,
 			       unsigned int len);
 struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 			  const struct Qdisc_ops *ops);
+void qdisc_free(struct Qdisc *qdisc);
 struct Qdisc *qdisc_create_dflt(struct netdev_queue *dev_queue,
 				const struct Qdisc_ops *ops, u32 parentid);
 void __qdisc_calculate_pkt_len(struct sk_buff *skb,
diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 2f8f93d..9a5ccf0 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -966,7 +966,7 @@ void sctp_transport_burst_limited(struct sctp_transport *);
 void sctp_transport_burst_reset(struct sctp_transport *);
 unsigned long sctp_transport_timeout(struct sctp_transport *);
 void sctp_transport_reset(struct sctp_transport *t);
-void sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu);
+bool sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu);
 void sctp_transport_immediate_rtx(struct sctp_transport *);
 void sctp_transport_dst_release(struct sctp_transport *t);
 void sctp_transport_dst_confirm(struct sctp_transport *t);
diff --git a/include/net/sock.h b/include/net/sock.h
index 7a7b14e..c4a424f 100644
--- a/include/net/sock.h
+++ b/include/net/sock.h
@@ -1445,10 +1445,8 @@ do {									\
 } while (0)
 
 #ifdef CONFIG_LOCKDEP
-static inline bool lockdep_sock_is_held(const struct sock *csk)
+static inline bool lockdep_sock_is_held(const struct sock *sk)
 {
-	struct sock *sk = (struct sock *)csk;
-
 	return lockdep_is_held(&sk->sk_lock) ||
 	       lockdep_is_held(&sk->sk_lock.slock);
 }
diff --git a/include/net/tls.h b/include/net/tls.h
index 936cfc5..9185e53 100644
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -170,7 +170,7 @@ static inline bool tls_is_pending_open_record(struct tls_context *tls_ctx)
 
 static inline void tls_err_abort(struct sock *sk)
 {
-	sk->sk_err = -EBADMSG;
+	sk->sk_err = EBADMSG;
 	sk->sk_error_report(sk);
 }
 
diff --git a/include/net/vxlan.h b/include/net/vxlan.h
index 1322339..f96391e 100644
--- a/include/net/vxlan.h
+++ b/include/net/vxlan.h
@@ -146,7 +146,7 @@ struct vxlanhdr_gpe {
 		np_applied:1,
 		instance_applied:1,
 		version:2,
-reserved_flags2:2;
+		reserved_flags2:2;
 #elif defined(__BIG_ENDIAN_BITFIELD)
 	u8	reserved_flags2:2,
 		version:2,
diff --git a/include/sound/hdaudio_ext.h b/include/sound/hdaudio_ext.h
index ca00130..9c14e21 100644
--- a/include/sound/hdaudio_ext.h
+++ b/include/sound/hdaudio_ext.h
@@ -193,7 +193,7 @@ struct hda_dai_map {
  * @pvt_data - private data, for asoc contains asoc codec object
  */
 struct hdac_ext_device {
-	struct hdac_device hdac;
+	struct hdac_device hdev;
 	struct hdac_ext_bus *ebus;
 
 	/* soc-dai to nid map */
@@ -213,7 +213,7 @@ struct hdac_ext_dma_params {
 	u8 stream_tag;
 };
 #define to_ehdac_device(dev) (container_of((dev), \
-				 struct hdac_ext_device, hdac))
+				 struct hdac_ext_device, hdev))
 /*
  * HD-audio codec base driver
  */
diff --git a/include/sound/pcm.h b/include/sound/pcm.h
index 24febf9..e054c58 100644
--- a/include/sound/pcm.h
+++ b/include/sound/pcm.h
@@ -169,6 +169,10 @@ struct snd_pcm_ops {
 #define SNDRV_PCM_FMTBIT_IMA_ADPCM	_SNDRV_PCM_FMTBIT(IMA_ADPCM)
 #define SNDRV_PCM_FMTBIT_MPEG		_SNDRV_PCM_FMTBIT(MPEG)
 #define SNDRV_PCM_FMTBIT_GSM		_SNDRV_PCM_FMTBIT(GSM)
+#define SNDRV_PCM_FMTBIT_S20_LE	_SNDRV_PCM_FMTBIT(S20_LE)
+#define SNDRV_PCM_FMTBIT_U20_LE	_SNDRV_PCM_FMTBIT(U20_LE)
+#define SNDRV_PCM_FMTBIT_S20_BE	_SNDRV_PCM_FMTBIT(S20_BE)
+#define SNDRV_PCM_FMTBIT_U20_BE	_SNDRV_PCM_FMTBIT(U20_BE)
 #define SNDRV_PCM_FMTBIT_SPECIAL	_SNDRV_PCM_FMTBIT(SPECIAL)
 #define SNDRV_PCM_FMTBIT_S24_3LE	_SNDRV_PCM_FMTBIT(S24_3LE)
 #define SNDRV_PCM_FMTBIT_U24_3LE	_SNDRV_PCM_FMTBIT(U24_3LE)
@@ -202,6 +206,8 @@ struct snd_pcm_ops {
 #define SNDRV_PCM_FMTBIT_FLOAT		SNDRV_PCM_FMTBIT_FLOAT_LE
 #define SNDRV_PCM_FMTBIT_FLOAT64	SNDRV_PCM_FMTBIT_FLOAT64_LE
 #define SNDRV_PCM_FMTBIT_IEC958_SUBFRAME SNDRV_PCM_FMTBIT_IEC958_SUBFRAME_LE
+#define SNDRV_PCM_FMTBIT_S20		SNDRV_PCM_FMTBIT_S20_LE
+#define SNDRV_PCM_FMTBIT_U20		SNDRV_PCM_FMTBIT_U20_LE
 #endif
 #ifdef SNDRV_BIG_ENDIAN
 #define SNDRV_PCM_FMTBIT_S16		SNDRV_PCM_FMTBIT_S16_BE
@@ -213,6 +219,8 @@ struct snd_pcm_ops {
 #define SNDRV_PCM_FMTBIT_FLOAT		SNDRV_PCM_FMTBIT_FLOAT_BE
 #define SNDRV_PCM_FMTBIT_FLOAT64	SNDRV_PCM_FMTBIT_FLOAT64_BE
 #define SNDRV_PCM_FMTBIT_IEC958_SUBFRAME SNDRV_PCM_FMTBIT_IEC958_SUBFRAME_BE
+#define SNDRV_PCM_FMTBIT_S20		SNDRV_PCM_FMTBIT_S20_BE
+#define SNDRV_PCM_FMTBIT_U20		SNDRV_PCM_FMTBIT_U20_BE
 #endif
 
 struct snd_pcm_file {
diff --git a/include/sound/rt5514.h b/include/sound/rt5514.h
index ef18494..64d027dba 100644
--- a/include/sound/rt5514.h
+++ b/include/sound/rt5514.h
@@ -14,6 +14,8 @@
 
 struct rt5514_platform_data {
 	unsigned int dmic_init_delay;
+	const char *dsp_calib_clk_name;
+	unsigned int dsp_calib_clk_rate;
 };
 
 #endif
diff --git a/include/sound/rt5645.h b/include/sound/rt5645.h
index d0c33a9..f218c74 100644
--- a/include/sound/rt5645.h
+++ b/include/sound/rt5645.h
@@ -25,6 +25,9 @@ struct rt5645_platform_data {
 	bool level_trigger_irq;
 	/* Invert JD1_1 status polarity */
 	bool inv_jd1_1;
+
+	/* Value to asign to snd_soc_card.long_name */
+	const char *long_name;
 };
 
 #endif
diff --git a/include/sound/soc-acpi-intel-match.h b/include/sound/soc-acpi-intel-match.h
index 1a9191c..9da6388 100644
--- a/include/sound/soc-acpi-intel-match.h
+++ b/include/sound/soc-acpi-intel-match.h
@@ -16,6 +16,7 @@
 #ifndef __LINUX_SND_SOC_ACPI_INTEL_MATCH_H
 #define __LINUX_SND_SOC_ACPI_INTEL_MATCH_H
 
+#include <linux/module.h>
 #include <linux/stddef.h>
 #include <linux/acpi.h>
 
diff --git a/include/sound/soc-acpi.h b/include/sound/soc-acpi.h
index a7d8d335..08222427 100644
--- a/include/sound/soc-acpi.h
+++ b/include/sound/soc-acpi.h
@@ -17,6 +17,7 @@
 
 #include <linux/stddef.h>
 #include <linux/acpi.h>
+#include <linux/mod_devicetable.h>
 
 struct snd_soc_acpi_package_context {
 	char *name;           /* package name */
@@ -26,17 +27,13 @@ struct snd_soc_acpi_package_context {
 	bool data_valid;
 };
 
+/* codec name is used in DAIs is i2c-<HID>:00 with HID being 8 chars */
+#define SND_ACPI_I2C_ID_LEN (4 + ACPI_ID_LEN + 3 + 1)
+
 #if IS_ENABLED(CONFIG_ACPI)
-/* translation fron HID to I2C name, needed for DAI codec_name */
-const char *snd_soc_acpi_find_name_from_hid(const u8 hid[ACPI_ID_LEN]);
 bool snd_soc_acpi_find_package_from_hid(const u8 hid[ACPI_ID_LEN],
 				    struct snd_soc_acpi_package_context *ctx);
 #else
-static inline const char *
-snd_soc_acpi_find_name_from_hid(const u8 hid[ACPI_ID_LEN])
-{
-	return NULL;
-}
 static inline bool
 snd_soc_acpi_find_package_from_hid(const u8 hid[ACPI_ID_LEN],
 				   struct snd_soc_acpi_package_context *ctx)
@@ -49,9 +46,6 @@ snd_soc_acpi_find_package_from_hid(const u8 hid[ACPI_ID_LEN],
 struct snd_soc_acpi_mach *
 snd_soc_acpi_find_machine(struct snd_soc_acpi_mach *machines);
 
-/* acpi check hid */
-bool snd_soc_acpi_check_hid(const u8 hid[ACPI_ID_LEN]);
-
 /**
  * snd_soc_acpi_mach: ACPI-based machine descriptor. Most of the fields are
  * related to the hardware, except for the firmware and topology file names.
diff --git a/include/sound/soc-dai.h b/include/sound/soc-dai.h
index 58acd00..8ad1166 100644
--- a/include/sound/soc-dai.h
+++ b/include/sound/soc-dai.h
@@ -102,6 +102,8 @@ struct snd_compr_stream;
 			       SNDRV_PCM_FMTBIT_S16_BE |\
 			       SNDRV_PCM_FMTBIT_S20_3LE |\
 			       SNDRV_PCM_FMTBIT_S20_3BE |\
+			       SNDRV_PCM_FMTBIT_S20_LE |\
+			       SNDRV_PCM_FMTBIT_S20_BE |\
 			       SNDRV_PCM_FMTBIT_S24_3LE |\
 			       SNDRV_PCM_FMTBIT_S24_3BE |\
                                SNDRV_PCM_FMTBIT_S32_LE |\
@@ -294,9 +296,6 @@ struct snd_soc_dai {
 	/* DAI runtime info */
 	unsigned int capture_active:1;		/* stream is in use */
 	unsigned int playback_active:1;		/* stream is in use */
-	unsigned int symmetric_rates:1;
-	unsigned int symmetric_channels:1;
-	unsigned int symmetric_samplebits:1;
 	unsigned int probed:1;
 
 	unsigned int active;
diff --git a/include/sound/soc.h b/include/sound/soc.h
index 1a73232..b655d98 100644
--- a/include/sound/soc.h
+++ b/include/sound/soc.h
@@ -494,6 +494,8 @@ int soc_new_pcm(struct snd_soc_pcm_runtime *rtd, int num);
 int snd_soc_new_compress(struct snd_soc_pcm_runtime *rtd, int num);
 #endif
 
+void snd_soc_disconnect_sync(struct device *dev);
+
 struct snd_pcm_substream *snd_soc_get_dai_substream(struct snd_soc_card *card,
 		const char *dai_link, int stream);
 struct snd_soc_pcm_runtime *snd_soc_get_pcm_runtime(struct snd_soc_card *card,
@@ -802,6 +804,9 @@ struct snd_soc_component_driver {
 	int (*suspend)(struct snd_soc_component *);
 	int (*resume)(struct snd_soc_component *);
 
+	unsigned int (*read)(struct snd_soc_component *, unsigned int);
+	int (*write)(struct snd_soc_component *, unsigned int, unsigned int);
+
 	/* pcm creation and destruction */
 	int (*pcm_new)(struct snd_soc_pcm_runtime *);
 	void (*pcm_free)(struct snd_pcm *);
@@ -858,12 +863,10 @@ struct snd_soc_component {
 	struct list_head card_aux_list; /* for auxiliary bound components */
 	struct list_head card_list;
 
-	struct snd_soc_dai_driver *dai_drv;
-	int num_dai;
-
 	const struct snd_soc_component_driver *driver;
 
 	struct list_head dai_list;
+	int num_dai;
 
 	int (*read)(struct snd_soc_component *, unsigned int, unsigned int *);
 	int (*write)(struct snd_soc_component *, unsigned int, unsigned int);
diff --git a/include/trace/events/btrfs.h b/include/trace/events/btrfs.h
index 4342a32..c3ac5ec 100644
--- a/include/trace/events/btrfs.h
+++ b/include/trace/events/btrfs.h
@@ -193,7 +193,6 @@ DEFINE_EVENT(btrfs__inode, btrfs_inode_evict,
 	__print_flags(flag, "|",					\
 		{ (1 << EXTENT_FLAG_PINNED), 		"PINNED" 	},\
 		{ (1 << EXTENT_FLAG_COMPRESSED), 	"COMPRESSED" 	},\
-		{ (1 << EXTENT_FLAG_VACANCY), 		"VACANCY" 	},\
 		{ (1 << EXTENT_FLAG_PREALLOC), 		"PREALLOC" 	},\
 		{ (1 << EXTENT_FLAG_LOGGING),	 	"LOGGING" 	},\
 		{ (1 << EXTENT_FLAG_FILLING),	 	"FILLING" 	},\
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 59d40c4..0b50fda 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -243,6 +243,7 @@ TRACE_EVENT(rcu_exp_funnel_lock,
 		  __entry->grphi, __entry->gpevent)
 );
 
+#ifdef CONFIG_RCU_NOCB_CPU
 /*
  * Tracepoint for RCU no-CBs CPU callback handoffs.  This event is intended
  * to assist debugging of these handoffs.
@@ -285,6 +286,7 @@ TRACE_EVENT(rcu_nocb_wake,
 
 	TP_printk("%s %d %s", __entry->rcuname, __entry->cpu, __entry->reason)
 );
+#endif
 
 /*
  * Tracepoint for tasks blocking within preemptible-RCU read-side
@@ -421,76 +423,40 @@ TRACE_EVENT(rcu_fqs,
 
 /*
  * Tracepoint for dyntick-idle entry/exit events.  These take a string
- * as argument: "Start" for entering dyntick-idle mode, "End" for
- * leaving it, "--=" for events moving towards idle, and "++=" for events
- * moving away from idle.  "Error on entry: not idle task" and "Error on
- * exit: not idle task" indicate that a non-idle task is erroneously
- * toying with the idle loop.
+ * as argument: "Start" for entering dyntick-idle mode, "Startirq" for
+ * entering it from irq/NMI, "End" for leaving it, "Endirq" for leaving it
+ * to irq/NMI, "--=" for events moving towards idle, and "++=" for events
+ * moving away from idle.
  *
  * These events also take a pair of numbers, which indicate the nesting
- * depth before and after the event of interest.  Note that task-related
- * events use the upper bits of each number, while interrupt-related
- * events use the lower bits.
+ * depth before and after the event of interest, and a third number that is
+ * the ->dynticks counter.  Note that task-related and interrupt-related
+ * events use two separate counters, and that the "++=" and "--=" events
+ * for irq/NMI will change the counter by two, otherwise by one.
  */
 TRACE_EVENT(rcu_dyntick,
 
-	TP_PROTO(const char *polarity, long long oldnesting, long long newnesting),
+	TP_PROTO(const char *polarity, long oldnesting, long newnesting, atomic_t dynticks),
 
-	TP_ARGS(polarity, oldnesting, newnesting),
+	TP_ARGS(polarity, oldnesting, newnesting, dynticks),
 
 	TP_STRUCT__entry(
 		__field(const char *, polarity)
-		__field(long long, oldnesting)
-		__field(long long, newnesting)
+		__field(long, oldnesting)
+		__field(long, newnesting)
+		__field(int, dynticks)
 	),
 
 	TP_fast_assign(
 		__entry->polarity = polarity;
 		__entry->oldnesting = oldnesting;
 		__entry->newnesting = newnesting;
+		__entry->dynticks = atomic_read(&dynticks);
 	),
 
-	TP_printk("%s %llx %llx", __entry->polarity,
-		  __entry->oldnesting, __entry->newnesting)
-);
-
-/*
- * Tracepoint for RCU preparation for idle, the goal being to get RCU
- * processing done so that the current CPU can shut off its scheduling
- * clock and enter dyntick-idle mode.  One way to accomplish this is
- * to drain all RCU callbacks from this CPU, and the other is to have
- * done everything RCU requires for the current grace period.  In this
- * latter case, the CPU will be awakened at the end of the current grace
- * period in order to process the remainder of its callbacks.
- *
- * These tracepoints take a string as argument:
- *
- *	"No callbacks": Nothing to do, no callbacks on this CPU.
- *	"In holdoff": Nothing to do, holding off after unsuccessful attempt.
- *	"Begin holdoff": Attempt failed, don't retry until next jiffy.
- *	"Dyntick with callbacks": Entering dyntick-idle despite callbacks.
- *	"Dyntick with lazy callbacks": Entering dyntick-idle w/lazy callbacks.
- *	"More callbacks": Still more callbacks, try again to clear them out.
- *	"Callbacks drained": All callbacks processed, off to dyntick idle!
- *	"Timer": Timer fired to cause CPU to continue processing callbacks.
- *	"Demigrate": Timer fired on wrong CPU, woke up correct CPU.
- *	"Cleanup after idle": Idle exited, timer canceled.
- */
-TRACE_EVENT(rcu_prep_idle,
-
-	TP_PROTO(const char *reason),
-
-	TP_ARGS(reason),
-
-	TP_STRUCT__entry(
-		__field(const char *, reason)
-	),
-
-	TP_fast_assign(
-		__entry->reason = reason;
-	),
-
-	TP_printk("%s", __entry->reason)
+	TP_printk("%s %lx %lx %#3x", __entry->polarity,
+		  __entry->oldnesting, __entry->newnesting,
+		  __entry->dynticks & 0xfff)
 );
 
 /*
@@ -799,8 +765,7 @@ TRACE_EVENT(rcu_barrier,
 					 grplo, grphi, gp_tasks) do { } \
 	while (0)
 #define trace_rcu_fqs(rcuname, gpnum, cpu, qsevent) do { } while (0)
-#define trace_rcu_dyntick(polarity, oldnesting, newnesting) do { } while (0)
-#define trace_rcu_prep_idle(reason) do { } while (0)
+#define trace_rcu_dyntick(polarity, oldnesting, newnesting, dyntick) do { } while (0)
 #define trace_rcu_callback(rcuname, rhp, qlen_lazy, qlen) do { } while (0)
 #define trace_rcu_kfree_callback(rcuname, rhp, offset, qlen_lazy, qlen) \
 	do { } while (0)
diff --git a/include/trace/events/thermal.h b/include/trace/events/thermal.h
index 7894664..135e542 100644
--- a/include/trace/events/thermal.h
+++ b/include/trace/events/thermal.h
@@ -94,9 +94,9 @@ TRACE_EVENT(thermal_zone_trip,
 #ifdef CONFIG_CPU_THERMAL
 TRACE_EVENT(thermal_power_cpu_get_power,
 	TP_PROTO(const struct cpumask *cpus, unsigned long freq, u32 *load,
-		size_t load_len, u32 dynamic_power, u32 static_power),
+		size_t load_len, u32 dynamic_power),
 
-	TP_ARGS(cpus, freq, load, load_len, dynamic_power, static_power),
+	TP_ARGS(cpus, freq, load, load_len, dynamic_power),
 
 	TP_STRUCT__entry(
 		__bitmask(cpumask, num_possible_cpus())
@@ -104,7 +104,6 @@ TRACE_EVENT(thermal_power_cpu_get_power,
 		__dynamic_array(u32,   load, load_len)
 		__field(size_t,        load_len      )
 		__field(u32,           dynamic_power )
-		__field(u32,           static_power  )
 	),
 
 	TP_fast_assign(
@@ -115,13 +114,12 @@ TRACE_EVENT(thermal_power_cpu_get_power,
 			load_len * sizeof(*load));
 		__entry->load_len = load_len;
 		__entry->dynamic_power = dynamic_power;
-		__entry->static_power = static_power;
 	),
 
-	TP_printk("cpus=%s freq=%lu load={%s} dynamic_power=%d static_power=%d",
+	TP_printk("cpus=%s freq=%lu load={%s} dynamic_power=%d",
 		__get_bitmask(cpumask), __entry->freq,
 		__print_array(__get_dynamic_array(load), __entry->load_len, 4),
-		__entry->dynamic_power, __entry->static_power)
+		__entry->dynamic_power)
 );
 
 TRACE_EVENT(thermal_power_cpu_limit,
diff --git a/include/trace/events/timer.h b/include/trace/events/timer.h
index 16e305e..a57e4ee 100644
--- a/include/trace/events/timer.h
+++ b/include/trace/events/timer.h
@@ -136,6 +136,24 @@ DEFINE_EVENT(timer_class, timer_cancel,
 	TP_ARGS(timer)
 );
 
+#define decode_clockid(type)						\
+	__print_symbolic(type,						\
+		{ CLOCK_REALTIME,	"CLOCK_REALTIME"	},	\
+		{ CLOCK_MONOTONIC,	"CLOCK_MONOTONIC"	},	\
+		{ CLOCK_BOOTTIME,	"CLOCK_BOOTTIME"	},	\
+		{ CLOCK_TAI,		"CLOCK_TAI"		})
+
+#define decode_hrtimer_mode(mode)					\
+	__print_symbolic(mode,						\
+		{ HRTIMER_MODE_ABS,		"ABS"		},	\
+		{ HRTIMER_MODE_REL,		"REL"		},	\
+		{ HRTIMER_MODE_ABS_PINNED,	"ABS|PINNED"	},	\
+		{ HRTIMER_MODE_REL_PINNED,	"REL|PINNED"	},	\
+		{ HRTIMER_MODE_ABS_SOFT,	"ABS|SOFT"	},	\
+		{ HRTIMER_MODE_REL_SOFT,	"REL|SOFT"	},	\
+		{ HRTIMER_MODE_ABS_PINNED_SOFT,	"ABS|PINNED|SOFT" },	\
+		{ HRTIMER_MODE_REL_PINNED_SOFT,	"REL|PINNED|SOFT" })
+
 /**
  * hrtimer_init - called when the hrtimer is initialized
  * @hrtimer:	pointer to struct hrtimer
@@ -162,10 +180,8 @@ TRACE_EVENT(hrtimer_init,
 	),
 
 	TP_printk("hrtimer=%p clockid=%s mode=%s", __entry->hrtimer,
-		  __entry->clockid == CLOCK_REALTIME ?
-			"CLOCK_REALTIME" : "CLOCK_MONOTONIC",
-		  __entry->mode == HRTIMER_MODE_ABS ?
-			"HRTIMER_MODE_ABS" : "HRTIMER_MODE_REL")
+		  decode_clockid(__entry->clockid),
+		  decode_hrtimer_mode(__entry->mode))
 );
 
 /**
@@ -174,15 +190,16 @@ TRACE_EVENT(hrtimer_init,
  */
 TRACE_EVENT(hrtimer_start,
 
-	TP_PROTO(struct hrtimer *hrtimer),
+	TP_PROTO(struct hrtimer *hrtimer, enum hrtimer_mode mode),
 
-	TP_ARGS(hrtimer),
+	TP_ARGS(hrtimer, mode),
 
 	TP_STRUCT__entry(
 		__field( void *,	hrtimer		)
 		__field( void *,	function	)
 		__field( s64,		expires		)
 		__field( s64,		softexpires	)
+		__field( enum hrtimer_mode,	mode	)
 	),
 
 	TP_fast_assign(
@@ -190,12 +207,14 @@ TRACE_EVENT(hrtimer_start,
 		__entry->function	= hrtimer->function;
 		__entry->expires	= hrtimer_get_expires(hrtimer);
 		__entry->softexpires	= hrtimer_get_softexpires(hrtimer);
+		__entry->mode		= mode;
 	),
 
-	TP_printk("hrtimer=%p function=%pf expires=%llu softexpires=%llu",
-		  __entry->hrtimer, __entry->function,
+	TP_printk("hrtimer=%p function=%pf expires=%llu softexpires=%llu "
+		  "mode=%s", __entry->hrtimer, __entry->function,
 		  (unsigned long long) __entry->expires,
-		  (unsigned long long) __entry->softexpires)
+		  (unsigned long long) __entry->softexpires,
+		  decode_hrtimer_mode(__entry->mode))
 );
 
 /**
diff --git a/include/uapi/linux/arm_sdei.h b/include/uapi/linux/arm_sdei.h
new file mode 100644
index 0000000..af0630b
--- /dev/null
+++ b/include/uapi/linux/arm_sdei.h
@@ -0,0 +1,73 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/* Copyright (C) 2017 Arm Ltd. */
+#ifndef _UAPI_LINUX_ARM_SDEI_H
+#define _UAPI_LINUX_ARM_SDEI_H
+
+#define SDEI_1_0_FN_BASE			0xC4000020
+#define SDEI_1_0_MASK				0xFFFFFFE0
+#define SDEI_1_0_FN(n)				(SDEI_1_0_FN_BASE + (n))
+
+#define SDEI_1_0_FN_SDEI_VERSION			SDEI_1_0_FN(0x00)
+#define SDEI_1_0_FN_SDEI_EVENT_REGISTER			SDEI_1_0_FN(0x01)
+#define SDEI_1_0_FN_SDEI_EVENT_ENABLE			SDEI_1_0_FN(0x02)
+#define SDEI_1_0_FN_SDEI_EVENT_DISABLE			SDEI_1_0_FN(0x03)
+#define SDEI_1_0_FN_SDEI_EVENT_CONTEXT			SDEI_1_0_FN(0x04)
+#define SDEI_1_0_FN_SDEI_EVENT_COMPLETE			SDEI_1_0_FN(0x05)
+#define SDEI_1_0_FN_SDEI_EVENT_COMPLETE_AND_RESUME	SDEI_1_0_FN(0x06)
+#define SDEI_1_0_FN_SDEI_EVENT_UNREGISTER		SDEI_1_0_FN(0x07)
+#define SDEI_1_0_FN_SDEI_EVENT_STATUS			SDEI_1_0_FN(0x08)
+#define SDEI_1_0_FN_SDEI_EVENT_GET_INFO			SDEI_1_0_FN(0x09)
+#define SDEI_1_0_FN_SDEI_EVENT_ROUTING_SET		SDEI_1_0_FN(0x0A)
+#define SDEI_1_0_FN_SDEI_PE_MASK			SDEI_1_0_FN(0x0B)
+#define SDEI_1_0_FN_SDEI_PE_UNMASK			SDEI_1_0_FN(0x0C)
+#define SDEI_1_0_FN_SDEI_INTERRUPT_BIND			SDEI_1_0_FN(0x0D)
+#define SDEI_1_0_FN_SDEI_INTERRUPT_RELEASE		SDEI_1_0_FN(0x0E)
+#define SDEI_1_0_FN_SDEI_PRIVATE_RESET			SDEI_1_0_FN(0x11)
+#define SDEI_1_0_FN_SDEI_SHARED_RESET			SDEI_1_0_FN(0x12)
+
+#define SDEI_VERSION_MAJOR_SHIFT			48
+#define SDEI_VERSION_MAJOR_MASK				0x7fff
+#define SDEI_VERSION_MINOR_SHIFT			32
+#define SDEI_VERSION_MINOR_MASK				0xffff
+#define SDEI_VERSION_VENDOR_SHIFT			0
+#define SDEI_VERSION_VENDOR_MASK			0xffffffff
+
+#define SDEI_VERSION_MAJOR(x)	(x>>SDEI_VERSION_MAJOR_SHIFT & SDEI_VERSION_MAJOR_MASK)
+#define SDEI_VERSION_MINOR(x)	(x>>SDEI_VERSION_MINOR_SHIFT & SDEI_VERSION_MINOR_MASK)
+#define SDEI_VERSION_VENDOR(x)	(x>>SDEI_VERSION_VENDOR_SHIFT & SDEI_VERSION_VENDOR_MASK)
+
+/* SDEI return values */
+#define SDEI_SUCCESS		0
+#define SDEI_NOT_SUPPORTED	-1
+#define SDEI_INVALID_PARAMETERS	-2
+#define SDEI_DENIED		-3
+#define SDEI_PENDING		-5
+#define SDEI_OUT_OF_RESOURCE	-10
+
+/* EVENT_REGISTER flags */
+#define SDEI_EVENT_REGISTER_RM_ANY	0
+#define SDEI_EVENT_REGISTER_RM_PE	1
+
+/* EVENT_STATUS return value bits */
+#define SDEI_EVENT_STATUS_RUNNING	2
+#define SDEI_EVENT_STATUS_ENABLED	1
+#define SDEI_EVENT_STATUS_REGISTERED	0
+
+/* EVENT_COMPLETE status values */
+#define SDEI_EV_HANDLED	0
+#define SDEI_EV_FAILED	1
+
+/* GET_INFO values */
+#define SDEI_EVENT_INFO_EV_TYPE			0
+#define SDEI_EVENT_INFO_EV_SIGNALED		1
+#define SDEI_EVENT_INFO_EV_PRIORITY		2
+#define SDEI_EVENT_INFO_EV_ROUTING_MODE		3
+#define SDEI_EVENT_INFO_EV_ROUTING_AFF		4
+
+/* and their results */
+#define SDEI_EVENT_TYPE_PRIVATE			0
+#define SDEI_EVENT_TYPE_SHARED			1
+#define SDEI_EVENT_PRIORITY_NORMAL		0
+#define SDEI_EVENT_PRIORITY_CRITICAL		1
+
+#endif /* _UAPI_LINUX_ARM_SDEI_H */
diff --git a/include/uapi/linux/btrfs.h b/include/uapi/linux/btrfs.h
index ce615b7..c8d99b9 100644
--- a/include/uapi/linux/btrfs.h
+++ b/include/uapi/linux/btrfs.h
@@ -33,7 +33,12 @@ struct btrfs_ioctl_vol_args {
 	char name[BTRFS_PATH_NAME_MAX + 1];
 };
 
-#define BTRFS_DEVICE_PATH_NAME_MAX 1024
+#define BTRFS_DEVICE_PATH_NAME_MAX	1024
+#define BTRFS_SUBVOL_NAME_MAX 		4039
+
+#define BTRFS_SUBVOL_CREATE_ASYNC	(1ULL << 0)
+#define BTRFS_SUBVOL_RDONLY		(1ULL << 1)
+#define BTRFS_SUBVOL_QGROUP_INHERIT	(1ULL << 2)
 
 #define BTRFS_DEVICE_SPEC_BY_ID		(1ULL << 3)
 
@@ -101,11 +106,7 @@ struct btrfs_ioctl_qgroup_limit_args {
  * - BTRFS_IOC_SUBVOL_GETFLAGS
  * - BTRFS_IOC_SUBVOL_SETFLAGS
  */
-#define BTRFS_SUBVOL_CREATE_ASYNC	(1ULL << 0)
-#define BTRFS_SUBVOL_RDONLY		(1ULL << 1)
-#define BTRFS_SUBVOL_QGROUP_INHERIT	(1ULL << 2)
 
-#define BTRFS_SUBVOL_NAME_MAX 4039
 struct btrfs_ioctl_vol_args_v2 {
 	__s64 fd;
 	__u64 transid;
diff --git a/include/uapi/linux/btrfs_tree.h b/include/uapi/linux/btrfs_tree.h
index 6d6e5da..aff1356 100644
--- a/include/uapi/linux/btrfs_tree.h
+++ b/include/uapi/linux/btrfs_tree.h
@@ -456,6 +456,8 @@ struct btrfs_free_space_header {
 
 #define BTRFS_SUPER_FLAG_SEEDING	(1ULL << 32)
 #define BTRFS_SUPER_FLAG_METADUMP	(1ULL << 33)
+#define BTRFS_SUPER_FLAG_METADUMP_V2	(1ULL << 34)
+#define BTRFS_SUPER_FLAG_CHANGING_FSID	(1ULL << 35)
 
 
 /*
diff --git a/include/uapi/linux/if_ether.h b/include/uapi/linux/if_ether.h
index 3ee3bf7..144de4d 100644
--- a/include/uapi/linux/if_ether.h
+++ b/include/uapi/linux/if_ether.h
@@ -23,6 +23,7 @@
 #define _UAPI_LINUX_IF_ETHER_H
 
 #include <linux/types.h>
+#include <linux/libc-compat.h>
 
 /*
  *	IEEE 802.3 Ethernet magic constants.  The frame sizes omit the preamble
@@ -149,11 +150,13 @@
  *	This is an Ethernet frame header.
  */
 
+#if __UAPI_DEF_ETHHDR
 struct ethhdr {
 	unsigned char	h_dest[ETH_ALEN];	/* destination eth addr	*/
 	unsigned char	h_source[ETH_ALEN];	/* source ether addr	*/
 	__be16		h_proto;		/* packet type ID field	*/
 } __attribute__((packed));
+#endif
 
 
 #endif /* _UAPI_LINUX_IF_ETHER_H */
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 496e59a..8fb90a0 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -932,6 +932,8 @@ struct kvm_ppc_resize_hpt {
 #define KVM_CAP_HYPERV_SYNIC2 148
 #define KVM_CAP_HYPERV_VP_INDEX 149
 #define KVM_CAP_S390_AIS_MIGRATION 150
+#define KVM_CAP_PPC_GET_CPU_CHAR 151
+#define KVM_CAP_S390_BPB 152
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
@@ -1261,6 +1263,8 @@ struct kvm_s390_ucas_mapping {
 #define KVM_PPC_CONFIGURE_V3_MMU  _IOW(KVMIO,  0xaf, struct kvm_ppc_mmuv3_cfg)
 /* Available with KVM_CAP_PPC_RADIX_MMU */
 #define KVM_PPC_GET_RMMU_INFO	  _IOW(KVMIO,  0xb0, struct kvm_ppc_rmmu_info)
+/* Available with KVM_CAP_PPC_GET_CPU_CHAR */
+#define KVM_PPC_GET_CPU_CHAR	  _IOR(KVMIO,  0xb1, struct kvm_ppc_cpu_char)
 
 /* ioctl for vm fd */
 #define KVM_CREATE_DEVICE	  _IOWR(KVMIO,  0xe0, struct kvm_create_device)
diff --git a/include/uapi/linux/libc-compat.h b/include/uapi/linux/libc-compat.h
index 282875c..fc29efaa 100644
--- a/include/uapi/linux/libc-compat.h
+++ b/include/uapi/linux/libc-compat.h
@@ -168,47 +168,106 @@
 
 /* If we did not see any headers from any supported C libraries,
  * or we are being included in the kernel, then define everything
- * that we need. */
+ * that we need. Check for previous __UAPI_* definitions to give
+ * unsupported C libraries a way to opt out of any kernel definition. */
 #else /* !defined(__GLIBC__) */
 
 /* Definitions for if.h */
+#ifndef __UAPI_DEF_IF_IFCONF
 #define __UAPI_DEF_IF_IFCONF 1
+#endif
+#ifndef __UAPI_DEF_IF_IFMAP
 #define __UAPI_DEF_IF_IFMAP 1
+#endif
+#ifndef __UAPI_DEF_IF_IFNAMSIZ
 #define __UAPI_DEF_IF_IFNAMSIZ 1
+#endif
+#ifndef __UAPI_DEF_IF_IFREQ
 #define __UAPI_DEF_IF_IFREQ 1
+#endif
 /* Everything up to IFF_DYNAMIC, matches net/if.h until glibc 2.23 */
+#ifndef __UAPI_DEF_IF_NET_DEVICE_FLAGS
 #define __UAPI_DEF_IF_NET_DEVICE_FLAGS 1
+#endif
 /* For the future if glibc adds IFF_LOWER_UP, IFF_DORMANT and IFF_ECHO */
+#ifndef __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO
 #define __UAPI_DEF_IF_NET_DEVICE_FLAGS_LOWER_UP_DORMANT_ECHO 1
+#endif
 
 /* Definitions for in.h */
+#ifndef __UAPI_DEF_IN_ADDR
 #define __UAPI_DEF_IN_ADDR		1
+#endif
+#ifndef __UAPI_DEF_IN_IPPROTO
 #define __UAPI_DEF_IN_IPPROTO		1
+#endif
+#ifndef __UAPI_DEF_IN_PKTINFO
 #define __UAPI_DEF_IN_PKTINFO		1
+#endif
+#ifndef __UAPI_DEF_IP_MREQ
 #define __UAPI_DEF_IP_MREQ		1
+#endif
+#ifndef __UAPI_DEF_SOCKADDR_IN
 #define __UAPI_DEF_SOCKADDR_IN		1
+#endif
+#ifndef __UAPI_DEF_IN_CLASS
 #define __UAPI_DEF_IN_CLASS		1
+#endif
 
 /* Definitions for in6.h */
+#ifndef __UAPI_DEF_IN6_ADDR
 #define __UAPI_DEF_IN6_ADDR		1
+#endif
+#ifndef __UAPI_DEF_IN6_ADDR_ALT
 #define __UAPI_DEF_IN6_ADDR_ALT		1
+#endif
+#ifndef __UAPI_DEF_SOCKADDR_IN6
 #define __UAPI_DEF_SOCKADDR_IN6		1
+#endif
+#ifndef __UAPI_DEF_IPV6_MREQ
 #define __UAPI_DEF_IPV6_MREQ		1
+#endif
+#ifndef __UAPI_DEF_IPPROTO_V6
 #define __UAPI_DEF_IPPROTO_V6		1
+#endif
+#ifndef __UAPI_DEF_IPV6_OPTIONS
 #define __UAPI_DEF_IPV6_OPTIONS		1
+#endif
+#ifndef __UAPI_DEF_IN6_PKTINFO
 #define __UAPI_DEF_IN6_PKTINFO		1
+#endif
+#ifndef __UAPI_DEF_IP6_MTUINFO
 #define __UAPI_DEF_IP6_MTUINFO		1
+#endif
 
 /* Definitions for ipx.h */
+#ifndef __UAPI_DEF_SOCKADDR_IPX
 #define __UAPI_DEF_SOCKADDR_IPX			1
+#endif
+#ifndef __UAPI_DEF_IPX_ROUTE_DEFINITION
 #define __UAPI_DEF_IPX_ROUTE_DEFINITION		1
+#endif
+#ifndef __UAPI_DEF_IPX_INTERFACE_DEFINITION
 #define __UAPI_DEF_IPX_INTERFACE_DEFINITION	1
+#endif
+#ifndef __UAPI_DEF_IPX_CONFIG_DATA
 #define __UAPI_DEF_IPX_CONFIG_DATA		1
+#endif
+#ifndef __UAPI_DEF_IPX_ROUTE_DEF
 #define __UAPI_DEF_IPX_ROUTE_DEF		1
+#endif
 
 /* Definitions for xattr.h */
+#ifndef __UAPI_DEF_XATTR
 #define __UAPI_DEF_XATTR		1
+#endif
 
 #endif /* __GLIBC__ */
 
+/* Definitions for if_ether.h */
+/* allow libcs like musl to deactivate this, glibc does not implement this. */
+#ifndef __UAPI_DEF_ETHHDR
+#define __UAPI_DEF_ETHHDR		1
+#endif
+
 #endif /* _UAPI_LIBC_COMPAT_H */
diff --git a/include/uapi/linux/lightnvm.h b/include/uapi/linux/lightnvm.h
index 42d1a43..f9a1be7 100644
--- a/include/uapi/linux/lightnvm.h
+++ b/include/uapi/linux/lightnvm.h
@@ -75,14 +75,23 @@ struct nvm_ioctl_create_simple {
 	__u32 lun_end;
 };
 
+struct nvm_ioctl_create_extended {
+	__u16 lun_begin;
+	__u16 lun_end;
+	__u16 op;
+	__u16 rsv;
+};
+
 enum {
 	NVM_CONFIG_TYPE_SIMPLE = 0,
+	NVM_CONFIG_TYPE_EXTENDED = 1,
 };
 
 struct nvm_ioctl_create_conf {
 	__u32 type;
 	union {
 		struct nvm_ioctl_create_simple s;
+		struct nvm_ioctl_create_extended e;
 	};
 };
 
diff --git a/include/uapi/linux/netfilter/nf_conntrack_common.h b/include/uapi/linux/netfilter/nf_conntrack_common.h
index 3fea770..57ccfb3 100644
--- a/include/uapi/linux/netfilter/nf_conntrack_common.h
+++ b/include/uapi/linux/netfilter/nf_conntrack_common.h
@@ -36,7 +36,7 @@ enum ip_conntrack_info {
 
 #define NF_CT_STATE_INVALID_BIT			(1 << 0)
 #define NF_CT_STATE_BIT(ctinfo)			(1 << ((ctinfo) % IP_CT_IS_REPLY + 1))
-#define NF_CT_STATE_UNTRACKED_BIT		(1 << (IP_CT_UNTRACKED + 1))
+#define NF_CT_STATE_UNTRACKED_BIT		(1 << 6)
 
 /* Bitset representing status of connection. */
 enum ip_conntrack_status {
diff --git a/include/uapi/linux/nubus.h b/include/uapi/linux/nubus.h
index f3776cc..48031e7 100644
--- a/include/uapi/linux/nubus.h
+++ b/include/uapi/linux/nubus.h
@@ -221,27 +221,4 @@ enum nubus_display_res_id {
 	NUBUS_RESID_SIXTHMODE   = 0x0085
 };
 
-struct nubus_dir
-{
-	unsigned char *base;
-	unsigned char *ptr;
-	int done;
-	int mask;
-};
-
-struct nubus_dirent
-{
-	unsigned char *base;
-	unsigned char type;
-	__u32 data;	/* Actually 24bits used */
-	int mask;
-};
-
-
-/* We'd like to get rid of this eventually.  Only daynaport.c uses it now. */
-static inline void *nubus_slot_addr(int slot)
-{
-	return (void *)(0xF0000000|(slot<<24));
-}
-
 #endif /* _UAPILINUX_NUBUS_H */
diff --git a/include/uapi/linux/openvswitch.h b/include/uapi/linux/openvswitch.h
index 4265d7f..dcfab5e 100644
--- a/include/uapi/linux/openvswitch.h
+++ b/include/uapi/linux/openvswitch.h
@@ -363,7 +363,6 @@ enum ovs_tunnel_key_attr {
 	OVS_TUNNEL_KEY_ATTR_IPV6_SRC,		/* struct in6_addr src IPv6 address. */
 	OVS_TUNNEL_KEY_ATTR_IPV6_DST,		/* struct in6_addr dst IPv6 address. */
 	OVS_TUNNEL_KEY_ATTR_PAD,
-	OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,	/* be32 ERSPAN index. */
 	__OVS_TUNNEL_KEY_ATTR_MAX
 };
 
diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h
index b9a4953..c77c9a2 100644
--- a/include/uapi/linux/perf_event.h
+++ b/include/uapi/linux/perf_event.h
@@ -612,9 +612,12 @@ struct perf_event_mmap_page {
  */
 #define PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT	(1 << 12)
 /*
- * PERF_RECORD_MISC_MMAP_DATA and PERF_RECORD_MISC_COMM_EXEC are used on
- * different events so can reuse the same bit position.
- * Ditto PERF_RECORD_MISC_SWITCH_OUT.
+ * Following PERF_RECORD_MISC_* are used on different
+ * events, so can reuse the same bit position:
+ *
+ *   PERF_RECORD_MISC_MMAP_DATA  - PERF_RECORD_MMAP* events
+ *   PERF_RECORD_MISC_COMM_EXEC  - PERF_RECORD_COMM event
+ *   PERF_RECORD_MISC_SWITCH_OUT - PERF_RECORD_SWITCH* events
  */
 #define PERF_RECORD_MISC_MMAP_DATA		(1 << 13)
 #define PERF_RECORD_MISC_COMM_EXEC		(1 << 13)
@@ -864,6 +867,7 @@ enum perf_event_type {
 	 *	struct perf_event_header	header;
 	 *	u32				pid;
 	 *	u32				tid;
+	 *	struct sample_id		sample_id;
 	 * };
 	 */
 	PERF_RECORD_ITRACE_START		= 12,
diff --git a/include/uapi/linux/sched.h b/include/uapi/linux/sched.h
index 30a9e51..22627f8 100644
--- a/include/uapi/linux/sched.h
+++ b/include/uapi/linux/sched.h
@@ -49,5 +49,10 @@
  */
 #define SCHED_FLAG_RESET_ON_FORK	0x01
 #define SCHED_FLAG_RECLAIM		0x02
+#define SCHED_FLAG_DL_OVERRUN		0x04
+
+#define SCHED_FLAG_ALL	(SCHED_FLAG_RESET_ON_FORK	| \
+			 SCHED_FLAG_RECLAIM		| \
+			 SCHED_FLAG_DL_OVERRUN)
 
 #endif /* _UAPI_LINUX_SCHED_H */
diff --git a/include/uapi/sound/asound.h b/include/uapi/sound/asound.h
index c227ccb..07d6158 100644
--- a/include/uapi/sound/asound.h
+++ b/include/uapi/sound/asound.h
@@ -214,6 +214,11 @@ typedef int __bitwise snd_pcm_format_t;
 #define	SNDRV_PCM_FORMAT_IMA_ADPCM	((__force snd_pcm_format_t) 22)
 #define	SNDRV_PCM_FORMAT_MPEG		((__force snd_pcm_format_t) 23)
 #define	SNDRV_PCM_FORMAT_GSM		((__force snd_pcm_format_t) 24)
+#define	SNDRV_PCM_FORMAT_S20_LE	((__force snd_pcm_format_t) 25) /* in four bytes, LSB justified */
+#define	SNDRV_PCM_FORMAT_S20_BE	((__force snd_pcm_format_t) 26) /* in four bytes, LSB justified */
+#define	SNDRV_PCM_FORMAT_U20_LE	((__force snd_pcm_format_t) 27) /* in four bytes, LSB justified */
+#define	SNDRV_PCM_FORMAT_U20_BE	((__force snd_pcm_format_t) 28) /* in four bytes, LSB justified */
+/* gap in the numbering for a future standard linear format */
 #define	SNDRV_PCM_FORMAT_SPECIAL	((__force snd_pcm_format_t) 31)
 #define	SNDRV_PCM_FORMAT_S24_3LE	((__force snd_pcm_format_t) 32)	/* in three bytes */
 #define	SNDRV_PCM_FORMAT_S24_3BE	((__force snd_pcm_format_t) 33)	/* in three bytes */
@@ -248,6 +253,8 @@ typedef int __bitwise snd_pcm_format_t;
 #define	SNDRV_PCM_FORMAT_FLOAT		SNDRV_PCM_FORMAT_FLOAT_LE
 #define	SNDRV_PCM_FORMAT_FLOAT64	SNDRV_PCM_FORMAT_FLOAT64_LE
 #define	SNDRV_PCM_FORMAT_IEC958_SUBFRAME SNDRV_PCM_FORMAT_IEC958_SUBFRAME_LE
+#define	SNDRV_PCM_FORMAT_S20		SNDRV_PCM_FORMAT_S20_LE
+#define	SNDRV_PCM_FORMAT_U20		SNDRV_PCM_FORMAT_U20_LE
 #endif
 #ifdef SNDRV_BIG_ENDIAN
 #define	SNDRV_PCM_FORMAT_S16		SNDRV_PCM_FORMAT_S16_BE
@@ -259,6 +266,8 @@ typedef int __bitwise snd_pcm_format_t;
 #define	SNDRV_PCM_FORMAT_FLOAT		SNDRV_PCM_FORMAT_FLOAT_BE
 #define	SNDRV_PCM_FORMAT_FLOAT64	SNDRV_PCM_FORMAT_FLOAT64_BE
 #define	SNDRV_PCM_FORMAT_IEC958_SUBFRAME SNDRV_PCM_FORMAT_IEC958_SUBFRAME_BE
+#define	SNDRV_PCM_FORMAT_S20		SNDRV_PCM_FORMAT_S20_BE
+#define	SNDRV_PCM_FORMAT_U20		SNDRV_PCM_FORMAT_U20_BE
 #endif
 
 typedef int __bitwise snd_pcm_subformat_t;
diff --git a/include/uapi/sound/snd_sst_tokens.h b/include/uapi/sound/snd_sst_tokens.h
index 326054a..8ba0112 100644
--- a/include/uapi/sound/snd_sst_tokens.h
+++ b/include/uapi/sound/snd_sst_tokens.h
@@ -222,6 +222,17 @@
  * %SKL_TKN_MM_U32_NUM_IN_FMT:  Number of input formats
  * %SKL_TKN_MM_U32_NUM_OUT_FMT: Number of output formats
  *
+ * %SKL_TKN_U32_ASTATE_IDX:     Table Index for the A-State entry to be filled
+ *                              with kcps and clock source
+ *
+ * %SKL_TKN_U32_ASTATE_COUNT:   Number of valid entries in A-State table
+ *
+ * %SKL_TKN_U32_ASTATE_KCPS:    Specifies the core load threshold (in kilo
+ *                              cycles per second) below which DSP is clocked
+ *                              from source specified by clock source.
+ *
+ * %SKL_TKN_U32_ASTATE_CLK_SRC: Clock source for A-State entry
+ *
  * module_id and loadable flags dont have tokens as these values will be
  * read from the DSP FW manifest
  *
@@ -309,7 +320,11 @@ enum SKL_TKNS {
 	SKL_TKN_MM_U32_NUM_IN_FMT,
 	SKL_TKN_MM_U32_NUM_OUT_FMT,
 
-	SKL_TKN_MAX = SKL_TKN_MM_U32_NUM_OUT_FMT,
+	SKL_TKN_U32_ASTATE_IDX,
+	SKL_TKN_U32_ASTATE_COUNT,
+	SKL_TKN_U32_ASTATE_KCPS,
+	SKL_TKN_U32_ASTATE_CLK_SRC,
+	SKL_TKN_MAX = SKL_TKN_U32_ASTATE_CLK_SRC,
 };
 
 #endif
diff --git a/init/Kconfig b/init/Kconfig
index 690a381..a9a2e2c 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -461,6 +461,7 @@
 
 config CPU_ISOLATION
 	bool "CPU isolation"
+	depends on SMP || COMPILE_TEST
 	default y
 	help
 	  Make sure that CPUs running critical tasks are not disturbed by
@@ -1396,6 +1397,13 @@
 	  Enable the bpf() system call that allows to manipulate eBPF
 	  programs and maps via file descriptors.
 
+config BPF_JIT_ALWAYS_ON
+	bool "Permanently enable BPF JIT and remove BPF interpreter"
+	depends on BPF_SYSCALL && HAVE_EBPF_JIT && BPF_JIT
+	help
+	  Enables BPF JIT and removes BPF interpreter to avoid
+	  speculative execution of BPF instructions by the interpreter
+
 config USERFAULTFD
 	bool "Enable userfaultfd() system call"
 	select ANON_INODES
diff --git a/init/Makefile b/init/Makefile
index 1dbb237..a3e5ce2 100644
--- a/init/Makefile
+++ b/init/Makefile
@@ -13,9 +13,7 @@
 endif
 obj-$(CONFIG_GENERIC_CALIBRATE_DELAY) += calibrate.o
 
-ifneq ($(CONFIG_ARCH_INIT_TASK),y)
 obj-y                          += init_task.o
-endif
 
 mounts-y			:= do_mounts.o
 mounts-$(CONFIG_BLK_DEV_RAM)	+= do_mounts_rd.o
diff --git a/init/init_task.c b/init/init_task.c
index 9325fee..3ac6e75 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -13,19 +13,175 @@
 #include <asm/pgtable.h>
 #include <linux/uaccess.h>
 
-static struct signal_struct init_signals = INIT_SIGNALS(init_signals);
-static struct sighand_struct init_sighand = INIT_SIGHAND(init_sighand);
+static struct signal_struct init_signals = {
+	.nr_threads	= 1,
+	.thread_head	= LIST_HEAD_INIT(init_task.thread_node),
+	.wait_chldexit	= __WAIT_QUEUE_HEAD_INITIALIZER(init_signals.wait_chldexit),
+	.shared_pending	= {
+		.list = LIST_HEAD_INIT(init_signals.shared_pending.list),
+		.signal =  {{0}}
+	},
+	.rlim		= INIT_RLIMITS,
+	.cred_guard_mutex = __MUTEX_INITIALIZER(init_signals.cred_guard_mutex),
+#ifdef CONFIG_POSIX_TIMERS
+	.posix_timers = LIST_HEAD_INIT(init_signals.posix_timers),
+	.cputimer	= {
+		.cputime_atomic	= INIT_CPUTIME_ATOMIC,
+		.running	= false,
+		.checking_timer = false,
+	},
+#endif
+	INIT_CPU_TIMERS(init_signals)
+	INIT_PREV_CPUTIME(init_signals)
+};
 
-/* Initial task structure */
-struct task_struct init_task = INIT_TASK(init_task);
+static struct sighand_struct init_sighand = {
+	.count		= ATOMIC_INIT(1),
+	.action		= { { { .sa_handler = SIG_DFL, } }, },
+	.siglock	= __SPIN_LOCK_UNLOCKED(init_sighand.siglock),
+	.signalfd_wqh	= __WAIT_QUEUE_HEAD_INITIALIZER(init_sighand.signalfd_wqh),
+};
+
+/*
+ * Set up the first task table, touch at your own risk!. Base=0,
+ * limit=0x1fffff (=2MB)
+ */
+struct task_struct init_task
+#ifdef CONFIG_ARCH_TASK_STRUCT_ON_STACK
+	__init_task_data
+#endif
+= {
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	.thread_info	= INIT_THREAD_INFO(init_task),
+	.stack_refcount	= ATOMIC_INIT(1),
+#endif
+	.state		= 0,
+	.stack		= init_stack,
+	.usage		= ATOMIC_INIT(2),
+	.flags		= PF_KTHREAD,
+	.prio		= MAX_PRIO - 20,
+	.static_prio	= MAX_PRIO - 20,
+	.normal_prio	= MAX_PRIO - 20,
+	.policy		= SCHED_NORMAL,
+	.cpus_allowed	= CPU_MASK_ALL,
+	.nr_cpus_allowed= NR_CPUS,
+	.mm		= NULL,
+	.active_mm	= &init_mm,
+	.restart_block	= {
+		.fn = do_no_restart_syscall,
+	},
+	.se		= {
+		.group_node 	= LIST_HEAD_INIT(init_task.se.group_node),
+	},
+	.rt		= {
+		.run_list	= LIST_HEAD_INIT(init_task.rt.run_list),
+		.time_slice	= RR_TIMESLICE,
+	},
+	.tasks		= LIST_HEAD_INIT(init_task.tasks),
+#ifdef CONFIG_SMP
+	.pushable_tasks	= PLIST_NODE_INIT(init_task.pushable_tasks, MAX_PRIO),
+#endif
+#ifdef CONFIG_CGROUP_SCHED
+	.sched_task_group = &root_task_group,
+#endif
+	.ptraced	= LIST_HEAD_INIT(init_task.ptraced),
+	.ptrace_entry	= LIST_HEAD_INIT(init_task.ptrace_entry),
+	.real_parent	= &init_task,
+	.parent		= &init_task,
+	.children	= LIST_HEAD_INIT(init_task.children),
+	.sibling	= LIST_HEAD_INIT(init_task.sibling),
+	.group_leader	= &init_task,
+	RCU_POINTER_INITIALIZER(real_cred, &init_cred),
+	RCU_POINTER_INITIALIZER(cred, &init_cred),
+	.comm		= INIT_TASK_COMM,
+	.thread		= INIT_THREAD,
+	.fs		= &init_fs,
+	.files		= &init_files,
+	.signal		= &init_signals,
+	.sighand	= &init_sighand,
+	.nsproxy	= &init_nsproxy,
+	.pending	= {
+		.list = LIST_HEAD_INIT(init_task.pending.list),
+		.signal = {{0}}
+	},
+	.blocked	= {{0}},
+	.alloc_lock	= __SPIN_LOCK_UNLOCKED(init_task.alloc_lock),
+	.journal_info	= NULL,
+	INIT_CPU_TIMERS(init_task)
+	.pi_lock	= __RAW_SPIN_LOCK_UNLOCKED(init_task.pi_lock),
+	.timer_slack_ns = 50000, /* 50 usec default slack */
+	.pids = {
+		[PIDTYPE_PID]  = INIT_PID_LINK(PIDTYPE_PID),
+		[PIDTYPE_PGID] = INIT_PID_LINK(PIDTYPE_PGID),
+		[PIDTYPE_SID]  = INIT_PID_LINK(PIDTYPE_SID),
+	},
+	.thread_group	= LIST_HEAD_INIT(init_task.thread_group),
+	.thread_node	= LIST_HEAD_INIT(init_signals.thread_head),
+#ifdef CONFIG_AUDITSYSCALL
+	.loginuid	= INVALID_UID,
+	.sessionid	= (unsigned int)-1,
+#endif
+#ifdef CONFIG_PERF_EVENTS
+	.perf_event_mutex = __MUTEX_INITIALIZER(init_task.perf_event_mutex),
+	.perf_event_list = LIST_HEAD_INIT(init_task.perf_event_list),
+#endif
+#ifdef CONFIG_PREEMPT_RCU
+	.rcu_read_lock_nesting = 0,
+	.rcu_read_unlock_special.s = 0,
+	.rcu_node_entry = LIST_HEAD_INIT(init_task.rcu_node_entry),
+	.rcu_blocked_node = NULL,
+#endif
+#ifdef CONFIG_TASKS_RCU
+	.rcu_tasks_holdout = false,
+	.rcu_tasks_holdout_list = LIST_HEAD_INIT(init_task.rcu_tasks_holdout_list),
+	.rcu_tasks_idle_cpu = -1,
+#endif
+#ifdef CONFIG_CPUSETS
+	.mems_allowed_seq = SEQCNT_ZERO(init_task.mems_allowed_seq),
+#endif
+#ifdef CONFIG_RT_MUTEXES
+	.pi_waiters	= RB_ROOT_CACHED,
+	.pi_top_task	= NULL,
+#endif
+	INIT_PREV_CPUTIME(init_task)
+#ifdef CONFIG_VIRT_CPU_ACCOUNTING_GEN
+	.vtime.seqcount	= SEQCNT_ZERO(init_task.vtime_seqcount),
+	.vtime.starttime = 0,
+	.vtime.state	= VTIME_SYS,
+#endif
+#ifdef CONFIG_NUMA_BALANCING
+	.numa_preferred_nid = -1,
+	.numa_group	= NULL,
+	.numa_faults	= NULL,
+#endif
+#ifdef CONFIG_KASAN
+	.kasan_depth	= 1,
+#endif
+#ifdef CONFIG_TRACE_IRQFLAGS
+	.softirqs_enabled = 1,
+#endif
+#ifdef CONFIG_LOCKDEP
+	.lockdep_recursion = 0,
+#endif
+#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+	.ret_stack	= NULL,
+#endif
+#if defined(CONFIG_TRACING) && defined(CONFIG_PREEMPT)
+	.trace_recursion = 0,
+#endif
+#ifdef CONFIG_LIVEPATCH
+	.patch_state	= KLP_UNDEFINED,
+#endif
+#ifdef CONFIG_SECURITY
+	.security	= NULL,
+#endif
+};
 EXPORT_SYMBOL(init_task);
 
 /*
  * Initial thread structure. Alignment of this is handled by a special
  * linker map entry.
  */
-union thread_union init_thread_union __init_task_data = {
 #ifndef CONFIG_THREAD_INFO_IN_TASK
-	INIT_THREAD_INFO(init_task)
+struct thread_info init_thread_info __init_thread_info = INIT_THREAD_INFO(init_task);
 #endif
-};
diff --git a/kernel/acct.c b/kernel/acct.c
index d15c0ee..addf773 100644
--- a/kernel/acct.c
+++ b/kernel/acct.c
@@ -102,7 +102,7 @@ static int check_free_space(struct bsd_acct_struct *acct)
 {
 	struct kstatfs sbuf;
 
-	if (time_is_before_jiffies(acct->needcheck))
+	if (time_is_after_jiffies(acct->needcheck))
 		goto out;
 
 	/* May block */
diff --git a/kernel/bpf/arraymap.c b/kernel/bpf/arraymap.c
index 7c25426..ab94d30 100644
--- a/kernel/bpf/arraymap.c
+++ b/kernel/bpf/arraymap.c
@@ -53,9 +53,10 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 {
 	bool percpu = attr->map_type == BPF_MAP_TYPE_PERCPU_ARRAY;
 	int numa_node = bpf_map_attr_numa_node(attr);
+	u32 elem_size, index_mask, max_entries;
+	bool unpriv = !capable(CAP_SYS_ADMIN);
 	struct bpf_array *array;
-	u64 array_size;
-	u32 elem_size;
+	u64 array_size, mask64;
 
 	/* check sanity of attributes */
 	if (attr->max_entries == 0 || attr->key_size != 4 ||
@@ -72,11 +73,32 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 
 	elem_size = round_up(attr->value_size, 8);
 
+	max_entries = attr->max_entries;
+
+	/* On 32 bit archs roundup_pow_of_two() with max_entries that has
+	 * upper most bit set in u32 space is undefined behavior due to
+	 * resulting 1U << 32, so do it manually here in u64 space.
+	 */
+	mask64 = fls_long(max_entries - 1);
+	mask64 = 1ULL << mask64;
+	mask64 -= 1;
+
+	index_mask = mask64;
+	if (unpriv) {
+		/* round up array size to nearest power of 2,
+		 * since cpu will speculate within index_mask limits
+		 */
+		max_entries = index_mask + 1;
+		/* Check for overflows. */
+		if (max_entries < attr->max_entries)
+			return ERR_PTR(-E2BIG);
+	}
+
 	array_size = sizeof(*array);
 	if (percpu)
-		array_size += (u64) attr->max_entries * sizeof(void *);
+		array_size += (u64) max_entries * sizeof(void *);
 	else
-		array_size += (u64) attr->max_entries * elem_size;
+		array_size += (u64) max_entries * elem_size;
 
 	/* make sure there is no u32 overflow later in round_up() */
 	if (array_size >= U32_MAX - PAGE_SIZE)
@@ -86,6 +108,8 @@ static struct bpf_map *array_map_alloc(union bpf_attr *attr)
 	array = bpf_map_area_alloc(array_size, numa_node);
 	if (!array)
 		return ERR_PTR(-ENOMEM);
+	array->index_mask = index_mask;
+	array->map.unpriv_array = unpriv;
 
 	/* copy mandatory map attributes */
 	array->map.map_type = attr->map_type;
@@ -121,12 +145,13 @@ static void *array_map_lookup_elem(struct bpf_map *map, void *key)
 	if (unlikely(index >= array->map.max_entries))
 		return NULL;
 
-	return array->value + array->elem_size * index;
+	return array->value + array->elem_size * (index & array->index_mask);
 }
 
 /* emit BPF instructions equivalent to C code of array_map_lookup_elem() */
 static u32 array_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
 {
+	struct bpf_array *array = container_of(map, struct bpf_array, map);
 	struct bpf_insn *insn = insn_buf;
 	u32 elem_size = round_up(map->value_size, 8);
 	const int ret = BPF_REG_0;
@@ -135,7 +160,12 @@ static u32 array_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
 
 	*insn++ = BPF_ALU64_IMM(BPF_ADD, map_ptr, offsetof(struct bpf_array, value));
 	*insn++ = BPF_LDX_MEM(BPF_W, ret, index, 0);
-	*insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 3);
+	if (map->unpriv_array) {
+		*insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 4);
+		*insn++ = BPF_ALU32_IMM(BPF_AND, ret, array->index_mask);
+	} else {
+		*insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 3);
+	}
 
 	if (is_power_of_2(elem_size)) {
 		*insn++ = BPF_ALU64_IMM(BPF_LSH, ret, ilog2(elem_size));
@@ -157,7 +187,7 @@ static void *percpu_array_map_lookup_elem(struct bpf_map *map, void *key)
 	if (unlikely(index >= array->map.max_entries))
 		return NULL;
 
-	return this_cpu_ptr(array->pptrs[index]);
+	return this_cpu_ptr(array->pptrs[index & array->index_mask]);
 }
 
 int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
@@ -177,7 +207,7 @@ int bpf_percpu_array_copy(struct bpf_map *map, void *key, void *value)
 	 */
 	size = round_up(map->value_size, 8);
 	rcu_read_lock();
-	pptr = array->pptrs[index];
+	pptr = array->pptrs[index & array->index_mask];
 	for_each_possible_cpu(cpu) {
 		bpf_long_memcpy(value + off, per_cpu_ptr(pptr, cpu), size);
 		off += size;
@@ -225,10 +255,11 @@ static int array_map_update_elem(struct bpf_map *map, void *key, void *value,
 		return -EEXIST;
 
 	if (array->map.map_type == BPF_MAP_TYPE_PERCPU_ARRAY)
-		memcpy(this_cpu_ptr(array->pptrs[index]),
+		memcpy(this_cpu_ptr(array->pptrs[index & array->index_mask]),
 		       value, map->value_size);
 	else
-		memcpy(array->value + array->elem_size * index,
+		memcpy(array->value +
+		       array->elem_size * (index & array->index_mask),
 		       value, map->value_size);
 	return 0;
 }
@@ -262,7 +293,7 @@ int bpf_percpu_array_update(struct bpf_map *map, void *key, void *value,
 	 */
 	size = round_up(map->value_size, 8);
 	rcu_read_lock();
-	pptr = array->pptrs[index];
+	pptr = array->pptrs[index & array->index_mask];
 	for_each_possible_cpu(cpu) {
 		bpf_long_memcpy(per_cpu_ptr(pptr, cpu), value + off, size);
 		off += size;
@@ -613,6 +644,7 @@ static void *array_of_map_lookup_elem(struct bpf_map *map, void *key)
 static u32 array_of_map_gen_lookup(struct bpf_map *map,
 				   struct bpf_insn *insn_buf)
 {
+	struct bpf_array *array = container_of(map, struct bpf_array, map);
 	u32 elem_size = round_up(map->value_size, 8);
 	struct bpf_insn *insn = insn_buf;
 	const int ret = BPF_REG_0;
@@ -621,7 +653,12 @@ static u32 array_of_map_gen_lookup(struct bpf_map *map,
 
 	*insn++ = BPF_ALU64_IMM(BPF_ADD, map_ptr, offsetof(struct bpf_array, value));
 	*insn++ = BPF_LDX_MEM(BPF_W, ret, index, 0);
-	*insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 5);
+	if (map->unpriv_array) {
+		*insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 6);
+		*insn++ = BPF_ALU32_IMM(BPF_AND, ret, array->index_mask);
+	} else {
+		*insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 5);
+	}
 	if (is_power_of_2(elem_size))
 		*insn++ = BPF_ALU64_IMM(BPF_LSH, ret, ilog2(elem_size));
 	else
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 86b50aa..7949e8b 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -767,6 +767,7 @@ noinline u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
 }
 EXPORT_SYMBOL_GPL(__bpf_call_base);
 
+#ifndef CONFIG_BPF_JIT_ALWAYS_ON
 /**
  *	__bpf_prog_run - run eBPF program on a given context
  *	@ctx: is the data we are operating on
@@ -955,7 +956,7 @@ static unsigned int ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn,
 		DST = tmp;
 		CONT;
 	ALU_MOD_X:
-		if (unlikely(SRC == 0))
+		if (unlikely((u32)SRC == 0))
 			return 0;
 		tmp = (u32) DST;
 		DST = do_div(tmp, (u32) SRC);
@@ -974,7 +975,7 @@ static unsigned int ___bpf_prog_run(u64 *regs, const struct bpf_insn *insn,
 		DST = div64_u64(DST, SRC);
 		CONT;
 	ALU_DIV_X:
-		if (unlikely(SRC == 0))
+		if (unlikely((u32)SRC == 0))
 			return 0;
 		tmp = (u32) DST;
 		do_div(tmp, (u32) SRC);
@@ -1317,6 +1318,14 @@ EVAL6(PROG_NAME_LIST, 224, 256, 288, 320, 352, 384)
 EVAL4(PROG_NAME_LIST, 416, 448, 480, 512)
 };
 
+#else
+static unsigned int __bpf_prog_ret0(const void *ctx,
+				    const struct bpf_insn *insn)
+{
+	return 0;
+}
+#endif
+
 bool bpf_prog_array_compatible(struct bpf_array *array,
 			       const struct bpf_prog *fp)
 {
@@ -1364,9 +1373,13 @@ static int bpf_check_tail_call(const struct bpf_prog *fp)
  */
 struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 {
+#ifndef CONFIG_BPF_JIT_ALWAYS_ON
 	u32 stack_depth = max_t(u32, fp->aux->stack_depth, 1);
 
 	fp->bpf_func = interpreters[(round_up(stack_depth, 32) / 32) - 1];
+#else
+	fp->bpf_func = __bpf_prog_ret0;
+#endif
 
 	/* eBPF JITs can rewrite the program in case constant
 	 * blinding is active. However, in case of error during
@@ -1376,6 +1389,12 @@ struct bpf_prog *bpf_prog_select_runtime(struct bpf_prog *fp, int *err)
 	 */
 	if (!bpf_prog_is_dev_bound(fp->aux)) {
 		fp = bpf_int_jit_compile(fp);
+#ifdef CONFIG_BPF_JIT_ALWAYS_ON
+		if (!fp->jited) {
+			*err = -ENOTSUPP;
+			return fp;
+		}
+#endif
 	} else {
 		*err = bpf_prog_offload_compile(fp);
 		if (*err)
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c
index 01aaef1..5bb5e49 100644
--- a/kernel/bpf/inode.c
+++ b/kernel/bpf/inode.c
@@ -368,7 +368,45 @@ int bpf_obj_get_user(const char __user *pathname, int flags)
 	putname(pname);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(bpf_obj_get_user);
+
+static struct bpf_prog *__get_prog_inode(struct inode *inode, enum bpf_prog_type type)
+{
+	struct bpf_prog *prog;
+	int ret = inode_permission(inode, MAY_READ | MAY_WRITE);
+	if (ret)
+		return ERR_PTR(ret);
+
+	if (inode->i_op == &bpf_map_iops)
+		return ERR_PTR(-EINVAL);
+	if (inode->i_op != &bpf_prog_iops)
+		return ERR_PTR(-EACCES);
+
+	prog = inode->i_private;
+
+	ret = security_bpf_prog(prog);
+	if (ret < 0)
+		return ERR_PTR(ret);
+
+	if (!bpf_prog_get_ok(prog, &type, false))
+		return ERR_PTR(-EINVAL);
+
+	return bpf_prog_inc(prog);
+}
+
+struct bpf_prog *bpf_prog_get_type_path(const char *name, enum bpf_prog_type type)
+{
+	struct bpf_prog *prog;
+	struct path path;
+	int ret = kern_path(name, LOOKUP_FOLLOW, &path);
+	if (ret)
+		return ERR_PTR(ret);
+	prog = __get_prog_inode(d_backing_inode(path.dentry), type);
+	if (!IS_ERR(prog))
+		touch_atime(&path);
+	path_put(&path);
+	return prog;
+}
+EXPORT_SYMBOL(bpf_prog_get_type_path);
 
 static void bpf_evict_inode(struct inode *inode)
 {
diff --git a/kernel/bpf/sockmap.c b/kernel/bpf/sockmap.c
index 5ee2e41..1712d31 100644
--- a/kernel/bpf/sockmap.c
+++ b/kernel/bpf/sockmap.c
@@ -591,8 +591,15 @@ static void sock_map_free(struct bpf_map *map)
 
 		write_lock_bh(&sock->sk_callback_lock);
 		psock = smap_psock_sk(sock);
-		smap_list_remove(psock, &stab->sock_map[i]);
-		smap_release_sock(psock, sock);
+		/* This check handles a racing sock event that can get the
+		 * sk_callback_lock before this case but after xchg happens
+		 * causing the refcnt to hit zero and sock user data (psock)
+		 * to be null and queued for garbage collection.
+		 */
+		if (likely(psock)) {
+			smap_list_remove(psock, &stab->sock_map[i]);
+			smap_release_sock(psock, sock);
+		}
 		write_unlock_bh(&sock->sk_callback_lock);
 	}
 	rcu_read_unlock();
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2c4cfea..5cb783f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -1057,7 +1057,7 @@ struct bpf_prog *bpf_prog_inc_not_zero(struct bpf_prog *prog)
 }
 EXPORT_SYMBOL_GPL(bpf_prog_inc_not_zero);
 
-static bool bpf_prog_get_ok(struct bpf_prog *prog,
+bool bpf_prog_get_ok(struct bpf_prog *prog,
 			    enum bpf_prog_type *attach_type, bool attach_drv)
 {
 	/* not an attachment, just a refcount inc, always allow */
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 04b2487..13551e6 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -978,6 +978,13 @@ static bool is_pointer_value(struct bpf_verifier_env *env, int regno)
 	return __is_pointer_value(env->allow_ptr_leaks, cur_regs(env) + regno);
 }
 
+static bool is_ctx_reg(struct bpf_verifier_env *env, int regno)
+{
+	const struct bpf_reg_state *reg = cur_regs(env) + regno;
+
+	return reg->type == PTR_TO_CTX;
+}
+
 static int check_pkt_ptr_alignment(struct bpf_verifier_env *env,
 				   const struct bpf_reg_state *reg,
 				   int off, int size, bool strict)
@@ -1258,6 +1265,12 @@ static int check_xadd(struct bpf_verifier_env *env, int insn_idx, struct bpf_ins
 		return -EACCES;
 	}
 
+	if (is_ctx_reg(env, insn->dst_reg)) {
+		verbose(env, "BPF_XADD stores into R%d context is not allowed\n",
+			insn->dst_reg);
+		return -EACCES;
+	}
+
 	/* check whether atomic_add can read the memory */
 	err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
 			       BPF_SIZE(insn->code), BPF_READ, -1);
@@ -1729,6 +1742,13 @@ static int check_call(struct bpf_verifier_env *env, int func_id, int insn_idx)
 	err = check_func_arg(env, BPF_REG_2, fn->arg2_type, &meta);
 	if (err)
 		return err;
+	if (func_id == BPF_FUNC_tail_call) {
+		if (meta.map_ptr == NULL) {
+			verbose(env, "verifier bug\n");
+			return -EINVAL;
+		}
+		env->insn_aux_data[insn_idx].map_ptr = meta.map_ptr;
+	}
 	err = check_func_arg(env, BPF_REG_3, fn->arg3_type, &meta);
 	if (err)
 		return err;
@@ -1875,17 +1895,13 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
 
 	dst_reg = &regs[dst];
 
-	if (WARN_ON_ONCE(known && (smin_val != smax_val))) {
-		print_verifier_state(env, env->cur_state);
-		verbose(env,
-			"verifier internal error: known but bad sbounds\n");
-		return -EINVAL;
-	}
-	if (WARN_ON_ONCE(known && (umin_val != umax_val))) {
-		print_verifier_state(env, env->cur_state);
-		verbose(env,
-			"verifier internal error: known but bad ubounds\n");
-		return -EINVAL;
+	if ((known && (smin_val != smax_val || umin_val != umax_val)) ||
+	    smin_val > smax_val || umin_val > umax_val) {
+		/* Taint dst register if offset had invalid bounds derived from
+		 * e.g. dead branches.
+		 */
+		__mark_reg_unknown(dst_reg);
+		return 0;
 	}
 
 	if (BPF_CLASS(insn->code) != BPF_ALU64) {
@@ -2077,6 +2093,15 @@ static int adjust_scalar_min_max_vals(struct bpf_verifier_env *env,
 	src_known = tnum_is_const(src_reg.var_off);
 	dst_known = tnum_is_const(dst_reg->var_off);
 
+	if ((src_known && (smin_val != smax_val || umin_val != umax_val)) ||
+	    smin_val > smax_val || umin_val > umax_val) {
+		/* Taint dst register if offset had invalid bounds derived from
+		 * e.g. dead branches.
+		 */
+		__mark_reg_unknown(dst_reg);
+		return 0;
+	}
+
 	if (!src_known &&
 	    opcode != BPF_ADD && opcode != BPF_SUB && opcode != BPF_AND) {
 		__mark_reg_unknown(dst_reg);
@@ -2486,6 +2511,11 @@ static int check_alu_op(struct bpf_verifier_env *env, struct bpf_insn *insn)
 			return -EINVAL;
 		}
 
+		if (opcode == BPF_ARSH && BPF_CLASS(insn->code) != BPF_ALU64) {
+			verbose(env, "BPF_ARSH not supported for 32 bit ALU\n");
+			return -EINVAL;
+		}
+
 		if ((opcode == BPF_LSH || opcode == BPF_RSH ||
 		     opcode == BPF_ARSH) && BPF_SRC(insn->code) == BPF_K) {
 			int size = BPF_CLASS(insn->code) == BPF_ALU64 ? 64 : 32;
@@ -3981,6 +4011,12 @@ static int do_check(struct bpf_verifier_env *env)
 			if (err)
 				return err;
 
+			if (is_ctx_reg(env, insn->dst_reg)) {
+				verbose(env, "BPF_ST stores into R%d context is not allowed\n",
+					insn->dst_reg);
+				return -EACCES;
+			}
+
 			/* check that memory (dst_reg + off) is writeable */
 			err = check_mem_access(env, insn_idx, insn->dst_reg, insn->off,
 					       BPF_SIZE(insn->code), BPF_WRITE,
@@ -4433,6 +4469,24 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 	int i, cnt, delta = 0;
 
 	for (i = 0; i < insn_cnt; i++, insn++) {
+		if (insn->code == (BPF_ALU | BPF_MOD | BPF_X) ||
+		    insn->code == (BPF_ALU | BPF_DIV | BPF_X)) {
+			/* due to JIT bugs clear upper 32-bits of src register
+			 * before div/mod operation
+			 */
+			insn_buf[0] = BPF_MOV32_REG(insn->src_reg, insn->src_reg);
+			insn_buf[1] = *insn;
+			cnt = 2;
+			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta    += cnt - 1;
+			env->prog = prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
+			continue;
+		}
+
 		if (insn->code != (BPF_JMP | BPF_CALL))
 			continue;
 
@@ -4456,6 +4510,35 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
 			 */
 			insn->imm = 0;
 			insn->code = BPF_JMP | BPF_TAIL_CALL;
+
+			/* instead of changing every JIT dealing with tail_call
+			 * emit two extra insns:
+			 * if (index >= max_entries) goto out;
+			 * index &= array->index_mask;
+			 * to avoid out-of-bounds cpu speculation
+			 */
+			map_ptr = env->insn_aux_data[i + delta].map_ptr;
+			if (map_ptr == BPF_MAP_PTR_POISON) {
+				verbose(env, "tail_call abusing map_ptr\n");
+				return -EINVAL;
+			}
+			if (!map_ptr->unpriv_array)
+				continue;
+			insn_buf[0] = BPF_JMP_IMM(BPF_JGE, BPF_REG_3,
+						  map_ptr->max_entries, 2);
+			insn_buf[1] = BPF_ALU32_IMM(BPF_AND, BPF_REG_3,
+						    container_of(map_ptr,
+								 struct bpf_array,
+								 map)->index_mask);
+			insn_buf[2] = *insn;
+			cnt = 3;
+			new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, cnt);
+			if (!new_prog)
+				return -ENOMEM;
+
+			delta    += cnt - 1;
+			env->prog = prog = new_prog;
+			insn      = new_prog->insnsi + i + delta;
 			continue;
 		}
 
diff --git a/kernel/cgroup/cgroup-v1.c b/kernel/cgroup/cgroup-v1.c
index 024085d..a2c05d2 100644
--- a/kernel/cgroup/cgroup-v1.c
+++ b/kernel/cgroup/cgroup-v1.c
@@ -123,7 +123,11 @@ int cgroup_transfer_tasks(struct cgroup *to, struct cgroup *from)
 	 */
 	do {
 		css_task_iter_start(&from->self, 0, &it);
-		task = css_task_iter_next(&it);
+
+		do {
+			task = css_task_iter_next(&it);
+		} while (task && (task->flags & PF_EXITING));
+
 		if (task)
 			get_task_struct(task);
 		css_task_iter_end(&it);
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c
index 0b1ffe1..7e4c445 100644
--- a/kernel/cgroup/cgroup.c
+++ b/kernel/cgroup/cgroup.c
@@ -1397,7 +1397,7 @@ static char *cgroup_file_name(struct cgroup *cgrp, const struct cftype *cft,
 			 cgroup_on_dfl(cgrp) ? ss->name : ss->legacy_name,
 			 cft->name);
 	else
-		strncpy(buf, cft->name, CGROUP_FILE_NAME_MAX);
+		strlcpy(buf, cft->name, CGROUP_FILE_NAME_MAX);
 	return buf;
 }
 
@@ -1864,9 +1864,9 @@ void init_cgroup_root(struct cgroup_root *root, struct cgroup_sb_opts *opts)
 
 	root->flags = opts->flags;
 	if (opts->release_agent)
-		strcpy(root->release_agent_path, opts->release_agent);
+		strlcpy(root->release_agent_path, opts->release_agent, PATH_MAX);
 	if (opts->name)
-		strcpy(root->name, opts->name);
+		strlcpy(root->name, opts->name, MAX_CGROUP_ROOT_NAMELEN);
 	if (opts->cpuset_clone_children)
 		set_bit(CGRP_CPUSET_CLONE_CHILDREN, &root->cgrp.flags);
 }
@@ -4125,26 +4125,24 @@ static void css_task_iter_advance_css_set(struct css_task_iter *it)
 
 static void css_task_iter_advance(struct css_task_iter *it)
 {
-	struct list_head *l = it->task_pos;
+	struct list_head *next;
 
 	lockdep_assert_held(&css_set_lock);
-	WARN_ON_ONCE(!l);
-
 repeat:
 	/*
 	 * Advance iterator to find next entry.  cset->tasks is consumed
 	 * first and then ->mg_tasks.  After ->mg_tasks, we move onto the
 	 * next cset.
 	 */
-	l = l->next;
+	next = it->task_pos->next;
 
-	if (l == it->tasks_head)
-		l = it->mg_tasks_head->next;
+	if (next == it->tasks_head)
+		next = it->mg_tasks_head->next;
 
-	if (l == it->mg_tasks_head)
+	if (next == it->mg_tasks_head)
 		css_task_iter_advance_css_set(it);
 	else
-		it->task_pos = l;
+		it->task_pos = next;
 
 	/* if PROCS, skip over tasks which aren't group leaders */
 	if ((it->flags & CSS_TASK_ITER_PROCS) && it->task_pos &&
@@ -4449,6 +4447,7 @@ static struct cftype cgroup_base_files[] = {
 	},
 	{
 		.name = "cgroup.threads",
+		.flags = CFTYPE_NS_DELEGATABLE,
 		.release = cgroup_procs_release,
 		.seq_start = cgroup_threads_start,
 		.seq_next = cgroup_procs_next,
diff --git a/kernel/configs/nopm.config b/kernel/configs/nopm.config
new file mode 100644
index 0000000..81ff078
--- /dev/null
+++ b/kernel/configs/nopm.config
@@ -0,0 +1,15 @@
+CONFIG_PM=n
+CONFIG_SUSPEND=n
+CONFIG_HIBERNATION=n
+
+# Triggers PM on OMAP
+CONFIG_CPU_IDLE=n
+
+# Triggers enablement via hibernate callbacks
+CONFIG_XEN=n
+
+# ARM/ARM64 architectures that select PM unconditionally
+CONFIG_ARCH_OMAP2PLUS_TYPICAL=n
+CONFIG_ARCH_RENESAS=n
+CONFIG_ARCH_TEGRA=n
+CONFIG_ARCH_VEXPRESS=n
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index b366389..4f63597 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -410,7 +410,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_SYMBOL(contig_page_data);
 #endif
 #ifdef CONFIG_SPARSEMEM
-	VMCOREINFO_SYMBOL(mem_section);
+	VMCOREINFO_SYMBOL_ARRAY(mem_section);
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
diff --git a/kernel/delayacct.c b/kernel/delayacct.c
index 4a1c334..e2764d7 100644
--- a/kernel/delayacct.c
+++ b/kernel/delayacct.c
@@ -51,16 +51,16 @@ void __delayacct_tsk_init(struct task_struct *tsk)
  * Finish delay accounting for a statistic using its timestamps (@start),
  * accumalator (@total) and @count
  */
-static void delayacct_end(u64 *start, u64 *total, u32 *count)
+static void delayacct_end(spinlock_t *lock, u64 *start, u64 *total, u32 *count)
 {
 	s64 ns = ktime_get_ns() - *start;
 	unsigned long flags;
 
 	if (ns > 0) {
-		spin_lock_irqsave(&current->delays->lock, flags);
+		spin_lock_irqsave(lock, flags);
 		*total += ns;
 		(*count)++;
-		spin_unlock_irqrestore(&current->delays->lock, flags);
+		spin_unlock_irqrestore(lock, flags);
 	}
 }
 
@@ -69,17 +69,25 @@ void __delayacct_blkio_start(void)
 	current->delays->blkio_start = ktime_get_ns();
 }
 
-void __delayacct_blkio_end(void)
+/*
+ * We cannot rely on the `current` macro, as we haven't yet switched back to
+ * the process being woken.
+ */
+void __delayacct_blkio_end(struct task_struct *p)
 {
-	if (current->delays->flags & DELAYACCT_PF_SWAPIN)
-		/* Swapin block I/O */
-		delayacct_end(&current->delays->blkio_start,
-			&current->delays->swapin_delay,
-			&current->delays->swapin_count);
-	else	/* Other block I/O */
-		delayacct_end(&current->delays->blkio_start,
-			&current->delays->blkio_delay,
-			&current->delays->blkio_count);
+	struct task_delay_info *delays = p->delays;
+	u64 *total;
+	u32 *count;
+
+	if (p->delays->flags & DELAYACCT_PF_SWAPIN) {
+		total = &delays->swapin_delay;
+		count = &delays->swapin_count;
+	} else {
+		total = &delays->blkio_delay;
+		count = &delays->blkio_count;
+	}
+
+	delayacct_end(&delays->lock, &delays->blkio_start, total, count);
 }
 
 int __delayacct_add_tsk(struct taskstats *d, struct task_struct *tsk)
@@ -153,8 +161,10 @@ void __delayacct_freepages_start(void)
 
 void __delayacct_freepages_end(void)
 {
-	delayacct_end(&current->delays->freepages_start,
-			&current->delays->freepages_delay,
-			&current->delays->freepages_count);
+	delayacct_end(
+		&current->delays->lock,
+		&current->delays->freepages_start,
+		&current->delays->freepages_delay,
+		&current->delays->freepages_count);
 }
 
diff --git a/kernel/events/callchain.c b/kernel/events/callchain.c
index 1b2be63..772a43f 100644
--- a/kernel/events/callchain.c
+++ b/kernel/events/callchain.c
@@ -179,21 +179,6 @@ put_callchain_entry(int rctx)
 }
 
 struct perf_callchain_entry *
-perf_callchain(struct perf_event *event, struct pt_regs *regs)
-{
-	bool kernel = !event->attr.exclude_callchain_kernel;
-	bool user   = !event->attr.exclude_callchain_user;
-	/* Disallow cross-task user callchains. */
-	bool crosstask = event->ctx->task && event->ctx->task != current;
-	const u32 max_stack = event->attr.sample_max_stack;
-
-	if (!kernel && !user)
-		return NULL;
-
-	return get_perf_callchain(regs, 0, kernel, user, max_stack, crosstask, true);
-}
-
-struct perf_callchain_entry *
 get_perf_callchain(struct pt_regs *regs, u32 init_nr, bool kernel, bool user,
 		   u32 max_stack, bool crosstask, bool add_mark)
 {
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 4df5b69..02f7d6e 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -1231,6 +1231,10 @@ static void put_ctx(struct perf_event_context *ctx)
  *	      perf_event_context::lock
  *	    perf_event::mmap_mutex
  *	    mmap_sem
+ *
+ *    cpu_hotplug_lock
+ *      pmus_lock
+ *	  cpuctx->mutex / perf_event_context::mutex
  */
 static struct perf_event_context *
 perf_event_ctx_lock_nested(struct perf_event *event, int nesting)
@@ -4196,6 +4200,7 @@ int perf_event_release_kernel(struct perf_event *event)
 {
 	struct perf_event_context *ctx = event->ctx;
 	struct perf_event *child, *tmp;
+	LIST_HEAD(free_list);
 
 	/*
 	 * If we got here through err_file: fput(event_file); we will not have
@@ -4268,8 +4273,7 @@ int perf_event_release_kernel(struct perf_event *event)
 					       struct perf_event, child_list);
 		if (tmp == child) {
 			perf_remove_from_context(child, DETACH_GROUP);
-			list_del(&child->child_list);
-			free_event(child);
+			list_move(&child->child_list, &free_list);
 			/*
 			 * This matches the refcount bump in inherit_event();
 			 * this can't be the last reference.
@@ -4284,6 +4288,11 @@ int perf_event_release_kernel(struct perf_event *event)
 	}
 	mutex_unlock(&event->child_mutex);
 
+	list_for_each_entry_safe(child, tmp, &free_list, child_list) {
+		list_del(&child->child_list);
+		free_event(child);
+	}
+
 no_ctx:
 	put_event(event); /* Must be the 'last' reference */
 	return 0;
@@ -4904,6 +4913,7 @@ void perf_event_update_userpage(struct perf_event *event)
 unlock:
 	rcu_read_unlock();
 }
+EXPORT_SYMBOL_GPL(perf_event_update_userpage);
 
 static int perf_mmap_fault(struct vm_fault *vmf)
 {
@@ -5815,19 +5825,11 @@ void perf_output_sample(struct perf_output_handle *handle,
 		perf_output_read(handle, event);
 
 	if (sample_type & PERF_SAMPLE_CALLCHAIN) {
-		if (data->callchain) {
-			int size = 1;
+		int size = 1;
 
-			if (data->callchain)
-				size += data->callchain->nr;
-
-			size *= sizeof(u64);
-
-			__output_copy(handle, data->callchain, size);
-		} else {
-			u64 nr = 0;
-			perf_output_put(handle, nr);
-		}
+		size += data->callchain->nr;
+		size *= sizeof(u64);
+		__output_copy(handle, data->callchain, size);
 	}
 
 	if (sample_type & PERF_SAMPLE_RAW) {
@@ -5980,6 +5982,26 @@ static u64 perf_virt_to_phys(u64 virt)
 	return phys_addr;
 }
 
+static struct perf_callchain_entry __empty_callchain = { .nr = 0, };
+
+static struct perf_callchain_entry *
+perf_callchain(struct perf_event *event, struct pt_regs *regs)
+{
+	bool kernel = !event->attr.exclude_callchain_kernel;
+	bool user   = !event->attr.exclude_callchain_user;
+	/* Disallow cross-task user callchains. */
+	bool crosstask = event->ctx->task && event->ctx->task != current;
+	const u32 max_stack = event->attr.sample_max_stack;
+	struct perf_callchain_entry *callchain;
+
+	if (!kernel && !user)
+		return &__empty_callchain;
+
+	callchain = get_perf_callchain(regs, 0, kernel, user,
+				       max_stack, crosstask, true);
+	return callchain ?: &__empty_callchain;
+}
+
 void perf_prepare_sample(struct perf_event_header *header,
 			 struct perf_sample_data *data,
 			 struct perf_event *event,
@@ -6002,9 +6024,7 @@ void perf_prepare_sample(struct perf_event_header *header,
 		int size = 1;
 
 		data->callchain = perf_callchain(event, regs);
-
-		if (data->callchain)
-			size += data->callchain->nr;
+		size += data->callchain->nr;
 
 		header->size += size * sizeof(u64);
 	}
@@ -8516,6 +8536,29 @@ perf_event_set_addr_filter(struct perf_event *event, char *filter_str)
 	return ret;
 }
 
+static int
+perf_tracepoint_set_filter(struct perf_event *event, char *filter_str)
+{
+	struct perf_event_context *ctx = event->ctx;
+	int ret;
+
+	/*
+	 * Beware, here be dragons!!
+	 *
+	 * the tracepoint muck will deadlock against ctx->mutex, but the tracepoint
+	 * stuff does not actually need it. So temporarily drop ctx->mutex. As per
+	 * perf_event_ctx_lock() we already have a reference on ctx.
+	 *
+	 * This can result in event getting moved to a different ctx, but that
+	 * does not affect the tracepoint state.
+	 */
+	mutex_unlock(&ctx->mutex);
+	ret = ftrace_profile_set_filter(event, event->attr.config, filter_str);
+	mutex_lock(&ctx->mutex);
+
+	return ret;
+}
+
 static int perf_event_set_filter(struct perf_event *event, void __user *arg)
 {
 	char *filter_str;
@@ -8532,8 +8575,7 @@ static int perf_event_set_filter(struct perf_event *event, void __user *arg)
 
 	if (IS_ENABLED(CONFIG_EVENT_TRACING) &&
 	    event->attr.type == PERF_TYPE_TRACEPOINT)
-		ret = ftrace_profile_set_filter(event, event->attr.config,
-						filter_str);
+		ret = perf_tracepoint_set_filter(event, filter_str);
 	else if (has_addr_filter(event))
 		ret = perf_event_set_addr_filter(event, filter_str);
 
@@ -9168,7 +9210,13 @@ static int perf_try_init_event(struct pmu *pmu, struct perf_event *event)
 	if (!try_module_get(pmu->module))
 		return -ENODEV;
 
-	if (event->group_leader != event) {
+	/*
+	 * A number of pmu->event_init() methods iterate the sibling_list to,
+	 * for example, validate if the group fits on the PMU. Therefore,
+	 * if this is a sibling event, acquire the ctx->mutex to protect
+	 * the sibling_list.
+	 */
+	if (event->group_leader != event && pmu->task_ctx_nr != perf_sw_context) {
 		/*
 		 * This ctx->mutex can nest when we're called through
 		 * inheritance. See the perf_event_ctx_lock_nested() comment.
@@ -10703,6 +10751,19 @@ inherit_event(struct perf_event *parent_event,
 	if (IS_ERR(child_event))
 		return child_event;
 
+
+	if ((child_event->attach_state & PERF_ATTACH_TASK_DATA) &&
+	    !child_ctx->task_ctx_data) {
+		struct pmu *pmu = child_event->pmu;
+
+		child_ctx->task_ctx_data = kzalloc(pmu->task_ctx_size,
+						   GFP_KERNEL);
+		if (!child_ctx->task_ctx_data) {
+			free_event(child_event);
+			return NULL;
+		}
+	}
+
 	/*
 	 * is_orphaned_event() and list_add_tail(&parent_event->child_list)
 	 * must be under the same lock in order to serialize against
@@ -10713,6 +10774,7 @@ inherit_event(struct perf_event *parent_event,
 	if (is_orphaned_event(parent_event) ||
 	    !atomic_long_inc_not_zero(&parent_event->refcount)) {
 		mutex_unlock(&parent_event->child_mutex);
+		/* task_ctx_data is freed with child_ctx */
 		free_event(child_event);
 		return NULL;
 	}
diff --git a/kernel/events/internal.h b/kernel/events/internal.h
index 09b1537..6dc725a 100644
--- a/kernel/events/internal.h
+++ b/kernel/events/internal.h
@@ -201,10 +201,6 @@ arch_perf_out_copy_user(void *dst, const void *src, unsigned long n)
 
 DEFINE_OUTPUT_COPY(__output_copy_user, arch_perf_out_copy_user)
 
-/* Callchain handling */
-extern struct perf_callchain_entry *
-perf_callchain(struct perf_event *event, struct pt_regs *regs);
-
 static inline int get_recursion_context(int *recursion)
 {
 	int rctx;
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 267f6ef..ce6848e 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1167,8 +1167,8 @@ static int xol_add_vma(struct mm_struct *mm, struct xol_area *area)
 	}
 
 	ret = 0;
-	smp_wmb();	/* pairs with get_xol_area() */
-	mm->uprobes_state.xol_area = area;
+	/* pairs with get_xol_area() */
+	smp_store_release(&mm->uprobes_state.xol_area, area); /* ^^^ */
  fail:
 	up_write(&mm->mmap_sem);
 
@@ -1230,8 +1230,8 @@ static struct xol_area *get_xol_area(void)
 	if (!mm->uprobes_state.xol_area)
 		__create_xol_area(0);
 
-	area = mm->uprobes_state.xol_area;
-	smp_read_barrier_depends();	/* pairs with wmb in xol_add_vma() */
+	/* Pairs with xol_add_vma() smp_store_release() */
+	area = READ_ONCE(mm->uprobes_state.xol_area); /* ^^^ */
 	return area;
 }
 
@@ -1528,8 +1528,8 @@ static unsigned long get_trampoline_vaddr(void)
 	struct xol_area *area;
 	unsigned long trampoline_vaddr = -1;
 
-	area = current->mm->uprobes_state.xol_area;
-	smp_read_barrier_depends();
+	/* Pairs with xol_add_vma() smp_store_release() */
+	area = READ_ONCE(current->mm->uprobes_state.xol_area); /* ^^^ */
 	if (area)
 		trampoline_vaddr = area->vaddr;
 
diff --git a/kernel/exit.c b/kernel/exit.c
index df0c91d..995453d 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1763,3 +1763,4 @@ __weak void abort(void)
 	/* if that doesn't kill us, halt */
 	panic("Oops failed to kill thread");
 }
+EXPORT_SYMBOL(abort);
diff --git a/kernel/futex.c b/kernel/futex.c
index 57d0b36..7f719d1 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -1878,6 +1878,9 @@ static int futex_requeue(u32 __user *uaddr1, unsigned int flags,
 	struct futex_q *this, *next;
 	DEFINE_WAKE_Q(wake_q);
 
+	if (nr_wake < 0 || nr_requeue < 0)
+		return -EINVAL;
+
 	/*
 	 * When PI not supported: return -ENOSYS if requeue_pi is true,
 	 * consequently the compiler knows requeue_pi is always false past
@@ -2294,34 +2297,33 @@ static void unqueue_me_pi(struct futex_q *q)
 	spin_unlock(q->lock_ptr);
 }
 
-/*
- * Fixup the pi_state owner with the new owner.
- *
- * Must be called with hash bucket lock held and mm->sem held for non
- * private futexes.
- */
 static int fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
-				struct task_struct *newowner)
+				struct task_struct *argowner)
 {
-	u32 newtid = task_pid_vnr(newowner) | FUTEX_WAITERS;
 	struct futex_pi_state *pi_state = q->pi_state;
 	u32 uval, uninitialized_var(curval), newval;
-	struct task_struct *oldowner;
+	struct task_struct *oldowner, *newowner;
+	u32 newtid;
 	int ret;
 
+	lockdep_assert_held(q->lock_ptr);
+
 	raw_spin_lock_irq(&pi_state->pi_mutex.wait_lock);
 
 	oldowner = pi_state->owner;
-	/* Owner died? */
-	if (!pi_state->owner)
-		newtid |= FUTEX_OWNER_DIED;
 
 	/*
-	 * We are here either because we stole the rtmutex from the
-	 * previous highest priority waiter or we are the highest priority
-	 * waiter but have failed to get the rtmutex the first time.
+	 * We are here because either:
 	 *
-	 * We have to replace the newowner TID in the user space variable.
+	 *  - we stole the lock and pi_state->owner needs updating to reflect
+	 *    that (@argowner == current),
+	 *
+	 * or:
+	 *
+	 *  - someone stole our lock and we need to fix things to point to the
+	 *    new owner (@argowner == NULL).
+	 *
+	 * Either way, we have to replace the TID in the user space variable.
 	 * This must be atomic as we have to preserve the owner died bit here.
 	 *
 	 * Note: We write the user space value _before_ changing the pi_state
@@ -2334,6 +2336,45 @@ static int fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
 	 * in the PID check in lookup_pi_state.
 	 */
 retry:
+	if (!argowner) {
+		if (oldowner != current) {
+			/*
+			 * We raced against a concurrent self; things are
+			 * already fixed up. Nothing to do.
+			 */
+			ret = 0;
+			goto out_unlock;
+		}
+
+		if (__rt_mutex_futex_trylock(&pi_state->pi_mutex)) {
+			/* We got the lock after all, nothing to fix. */
+			ret = 0;
+			goto out_unlock;
+		}
+
+		/*
+		 * Since we just failed the trylock; there must be an owner.
+		 */
+		newowner = rt_mutex_owner(&pi_state->pi_mutex);
+		BUG_ON(!newowner);
+	} else {
+		WARN_ON_ONCE(argowner != current);
+		if (oldowner == current) {
+			/*
+			 * We raced against a concurrent self; things are
+			 * already fixed up. Nothing to do.
+			 */
+			ret = 0;
+			goto out_unlock;
+		}
+		newowner = argowner;
+	}
+
+	newtid = task_pid_vnr(newowner) | FUTEX_WAITERS;
+	/* Owner died? */
+	if (!pi_state->owner)
+		newtid |= FUTEX_OWNER_DIED;
+
 	if (get_futex_value_locked(&uval, uaddr))
 		goto handle_fault;
 
@@ -2434,9 +2475,9 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
 		 * Got the lock. We might not be the anticipated owner if we
 		 * did a lock-steal - fix up the PI-state in that case:
 		 *
-		 * We can safely read pi_state->owner without holding wait_lock
-		 * because we now own the rt_mutex, only the owner will attempt
-		 * to change it.
+		 * Speculative pi_state->owner read (we don't hold wait_lock);
+		 * since we own the lock pi_state->owner == current is the
+		 * stable state, anything else needs more attention.
 		 */
 		if (q->pi_state->owner != current)
 			ret = fixup_pi_state_owner(uaddr, q, current);
@@ -2444,6 +2485,19 @@ static int fixup_owner(u32 __user *uaddr, struct futex_q *q, int locked)
 	}
 
 	/*
+	 * If we didn't get the lock; check if anybody stole it from us. In
+	 * that case, we need to fix up the uval to point to them instead of
+	 * us, otherwise bad things happen. [10]
+	 *
+	 * Another speculative read; pi_state->owner == current is unstable
+	 * but needs our attention.
+	 */
+	if (q->pi_state->owner == current) {
+		ret = fixup_pi_state_owner(uaddr, q, NULL);
+		goto out;
+	}
+
+	/*
 	 * Paranoia check. If we did not take the lock, then we should not be
 	 * the owner of the rt_mutex.
 	 */
diff --git a/kernel/irq/Kconfig b/kernel/irq/Kconfig
index 89e3558..6fc87cc 100644
--- a/kernel/irq/Kconfig
+++ b/kernel/irq/Kconfig
@@ -103,16 +103,6 @@
 config GENERIC_IRQ_RESERVATION_MODE
 	bool
 
-config IRQ_DOMAIN_DEBUG
-	bool "Expose hardware/virtual IRQ mapping via debugfs"
-	depends on IRQ_DOMAIN && DEBUG_FS
-	help
-	  This option will show the mapping relationship between hardware irq
-	  numbers and Linux irq numbers. The mapping is exposed via debugfs
-	  in the file "irq_domain_mapping".
-
-	  If you don't know what this means you don't need it.
-
 # Support forced irq threading
 config IRQ_FORCED_THREADING
        bool
diff --git a/kernel/irq/affinity.c b/kernel/irq/affinity.c
index e12d351..a37a3b4 100644
--- a/kernel/irq/affinity.c
+++ b/kernel/irq/affinity.c
@@ -39,7 +39,7 @@ static void irq_spread_init_one(struct cpumask *irqmsk, struct cpumask *nmsk,
 	}
 }
 
-static cpumask_var_t *alloc_node_to_present_cpumask(void)
+static cpumask_var_t *alloc_node_to_possible_cpumask(void)
 {
 	cpumask_var_t *masks;
 	int node;
@@ -62,7 +62,7 @@ static cpumask_var_t *alloc_node_to_present_cpumask(void)
 	return NULL;
 }
 
-static void free_node_to_present_cpumask(cpumask_var_t *masks)
+static void free_node_to_possible_cpumask(cpumask_var_t *masks)
 {
 	int node;
 
@@ -71,22 +71,22 @@ static void free_node_to_present_cpumask(cpumask_var_t *masks)
 	kfree(masks);
 }
 
-static void build_node_to_present_cpumask(cpumask_var_t *masks)
+static void build_node_to_possible_cpumask(cpumask_var_t *masks)
 {
 	int cpu;
 
-	for_each_present_cpu(cpu)
+	for_each_possible_cpu(cpu)
 		cpumask_set_cpu(cpu, masks[cpu_to_node(cpu)]);
 }
 
-static int get_nodes_in_cpumask(cpumask_var_t *node_to_present_cpumask,
+static int get_nodes_in_cpumask(cpumask_var_t *node_to_possible_cpumask,
 				const struct cpumask *mask, nodemask_t *nodemsk)
 {
 	int n, nodes = 0;
 
 	/* Calculate the number of nodes in the supplied affinity mask */
 	for_each_node(n) {
-		if (cpumask_intersects(mask, node_to_present_cpumask[n])) {
+		if (cpumask_intersects(mask, node_to_possible_cpumask[n])) {
 			node_set(n, *nodemsk);
 			nodes++;
 		}
@@ -109,7 +109,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	int last_affv = affv + affd->pre_vectors;
 	nodemask_t nodemsk = NODE_MASK_NONE;
 	struct cpumask *masks;
-	cpumask_var_t nmsk, *node_to_present_cpumask;
+	cpumask_var_t nmsk, *node_to_possible_cpumask;
 
 	/*
 	 * If there aren't any vectors left after applying the pre/post
@@ -125,8 +125,8 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	if (!masks)
 		goto out;
 
-	node_to_present_cpumask = alloc_node_to_present_cpumask();
-	if (!node_to_present_cpumask)
+	node_to_possible_cpumask = alloc_node_to_possible_cpumask();
+	if (!node_to_possible_cpumask)
 		goto out;
 
 	/* Fill out vectors at the beginning that don't need affinity */
@@ -135,8 +135,8 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 
 	/* Stabilize the cpumasks */
 	get_online_cpus();
-	build_node_to_present_cpumask(node_to_present_cpumask);
-	nodes = get_nodes_in_cpumask(node_to_present_cpumask, cpu_present_mask,
+	build_node_to_possible_cpumask(node_to_possible_cpumask);
+	nodes = get_nodes_in_cpumask(node_to_possible_cpumask, cpu_possible_mask,
 				     &nodemsk);
 
 	/*
@@ -146,7 +146,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	if (affv <= nodes) {
 		for_each_node_mask(n, nodemsk) {
 			cpumask_copy(masks + curvec,
-				     node_to_present_cpumask[n]);
+				     node_to_possible_cpumask[n]);
 			if (++curvec == last_affv)
 				break;
 		}
@@ -160,7 +160,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 		vecs_per_node = (affv - (curvec - affd->pre_vectors)) / nodes;
 
 		/* Get the cpus on this node which are in the mask */
-		cpumask_and(nmsk, cpu_present_mask, node_to_present_cpumask[n]);
+		cpumask_and(nmsk, cpu_possible_mask, node_to_possible_cpumask[n]);
 
 		/* Calculate the number of cpus per vector */
 		ncpus = cpumask_weight(nmsk);
@@ -192,7 +192,7 @@ irq_create_affinity_masks(int nvecs, const struct irq_affinity *affd)
 	/* Fill out vectors at the end that don't need affinity */
 	for (; curvec < nvecs; curvec++)
 		cpumask_copy(masks + curvec, irq_default_affinity);
-	free_node_to_present_cpumask(node_to_present_cpumask);
+	free_node_to_possible_cpumask(node_to_possible_cpumask);
 out:
 	free_cpumask_var(nmsk);
 	return masks;
@@ -214,7 +214,7 @@ int irq_calc_affinity_vectors(int minvec, int maxvec, const struct irq_affinity
 		return 0;
 
 	get_online_cpus();
-	ret = min_t(int, cpumask_weight(cpu_present_mask), vecs) + resv;
+	ret = min_t(int, cpumask_weight(cpu_possible_mask), vecs) + resv;
 	put_online_cpus();
 	return ret;
 }
diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
index 62068ad..e6a9c36 100644
--- a/kernel/irq/irqdomain.c
+++ b/kernel/irq/irqdomain.c
@@ -897,124 +897,6 @@ unsigned int irq_find_mapping(struct irq_domain *domain,
 }
 EXPORT_SYMBOL_GPL(irq_find_mapping);
 
-#ifdef CONFIG_IRQ_DOMAIN_DEBUG
-static void virq_debug_show_one(struct seq_file *m, struct irq_desc *desc)
-{
-	struct irq_domain *domain;
-	struct irq_data *data;
-
-	domain = desc->irq_data.domain;
-	data = &desc->irq_data;
-
-	while (domain) {
-		unsigned int irq = data->irq;
-		unsigned long hwirq = data->hwirq;
-		struct irq_chip *chip;
-		bool direct;
-
-		if (data == &desc->irq_data)
-			seq_printf(m, "%5d  ", irq);
-		else
-			seq_printf(m, "%5d+ ", irq);
-		seq_printf(m, "0x%05lx  ", hwirq);
-
-		chip = irq_data_get_irq_chip(data);
-		seq_printf(m, "%-15s  ", (chip && chip->name) ? chip->name : "none");
-
-		seq_printf(m, "0x%p  ", irq_data_get_irq_chip_data(data));
-
-		seq_printf(m, "   %c    ", (desc->action && desc->action->handler) ? '*' : ' ');
-		direct = (irq == hwirq) && (irq < domain->revmap_direct_max_irq);
-		seq_printf(m, "%6s%-8s  ",
-			   (hwirq < domain->revmap_size) ? "LINEAR" : "RADIX",
-			   direct ? "(DIRECT)" : "");
-		seq_printf(m, "%s\n", domain->name);
-#ifdef CONFIG_IRQ_DOMAIN_HIERARCHY
-		domain = domain->parent;
-		data = data->parent_data;
-#else
-		domain = NULL;
-#endif
-	}
-}
-
-static int virq_debug_show(struct seq_file *m, void *private)
-{
-	unsigned long flags;
-	struct irq_desc *desc;
-	struct irq_domain *domain;
-	struct radix_tree_iter iter;
-	void __rcu **slot;
-	int i;
-
-	seq_printf(m, " %-16s  %-6s  %-10s  %-10s  %s\n",
-		   "name", "mapped", "linear-max", "direct-max", "devtree-node");
-	mutex_lock(&irq_domain_mutex);
-	list_for_each_entry(domain, &irq_domain_list, link) {
-		struct device_node *of_node;
-		const char *name;
-
-		int count = 0;
-
-		of_node = irq_domain_get_of_node(domain);
-		if (of_node)
-			name = of_node_full_name(of_node);
-		else if (is_fwnode_irqchip(domain->fwnode))
-			name = container_of(domain->fwnode, struct irqchip_fwid,
-					    fwnode)->name;
-		else
-			name = "";
-
-		radix_tree_for_each_slot(slot, &domain->revmap_tree, &iter, 0)
-			count++;
-		seq_printf(m, "%c%-16s  %6u  %10u  %10u  %s\n",
-			   domain == irq_default_domain ? '*' : ' ', domain->name,
-			   domain->revmap_size + count, domain->revmap_size,
-			   domain->revmap_direct_max_irq,
-			   name);
-	}
-	mutex_unlock(&irq_domain_mutex);
-
-	seq_printf(m, "%-5s  %-7s  %-15s  %-*s  %6s  %-14s  %s\n", "irq", "hwirq",
-		      "chip name", (int)(2 * sizeof(void *) + 2), "chip data",
-		      "active", "type", "domain");
-
-	for (i = 1; i < nr_irqs; i++) {
-		desc = irq_to_desc(i);
-		if (!desc)
-			continue;
-
-		raw_spin_lock_irqsave(&desc->lock, flags);
-		virq_debug_show_one(m, desc);
-		raw_spin_unlock_irqrestore(&desc->lock, flags);
-	}
-
-	return 0;
-}
-
-static int virq_debug_open(struct inode *inode, struct file *file)
-{
-	return single_open(file, virq_debug_show, inode->i_private);
-}
-
-static const struct file_operations virq_debug_fops = {
-	.open = virq_debug_open,
-	.read = seq_read,
-	.llseek = seq_lseek,
-	.release = single_release,
-};
-
-static int __init irq_debugfs_init(void)
-{
-	if (debugfs_create_file("irq_domain_mapping", S_IRUGO, NULL,
-				 NULL, &virq_debug_fops) == NULL)
-		return -ENOMEM;
-
-	return 0;
-}
-__initcall(irq_debugfs_init);
-#endif /* CONFIG_IRQ_DOMAIN_DEBUG */
-
 /**
  * irq_domain_xlate_onecell() - Generic xlate for direct one cell bindings
  *
diff --git a/kernel/irq/matrix.c b/kernel/irq/matrix.c
index 0ba0dd8..5187dfe 100644
--- a/kernel/irq/matrix.c
+++ b/kernel/irq/matrix.c
@@ -321,15 +321,23 @@ void irq_matrix_remove_reserved(struct irq_matrix *m)
 int irq_matrix_alloc(struct irq_matrix *m, const struct cpumask *msk,
 		     bool reserved, unsigned int *mapped_cpu)
 {
-	unsigned int cpu;
+	unsigned int cpu, best_cpu, maxavl = 0;
+	struct cpumap *cm;
+	unsigned int bit;
 
+	best_cpu = UINT_MAX;
 	for_each_cpu(cpu, msk) {
-		struct cpumap *cm = per_cpu_ptr(m->maps, cpu);
-		unsigned int bit;
+		cm = per_cpu_ptr(m->maps, cpu);
 
-		if (!cm->online)
+		if (!cm->online || cm->available <= maxavl)
 			continue;
 
+		best_cpu = cpu;
+		maxavl = cm->available;
+	}
+
+	if (maxavl) {
+		cm = per_cpu_ptr(m->maps, best_cpu);
 		bit = matrix_alloc_area(m, cm, 1, false);
 		if (bit < m->alloc_end) {
 			cm->allocated++;
@@ -338,8 +346,8 @@ int irq_matrix_alloc(struct irq_matrix *m, const struct cpumask *msk,
 			m->global_available--;
 			if (reserved)
 				m->global_reserved--;
-			*mapped_cpu = cpu;
-			trace_irq_matrix_alloc(bit, cpu, m, cm);
+			*mapped_cpu = best_cpu;
+			trace_irq_matrix_alloc(bit, best_cpu, m, cm);
 			return bit;
 		}
 	}
diff --git a/kernel/irq_work.c b/kernel/irq_work.c
index 40e9d73..6b7cdf1 100644
--- a/kernel/irq_work.c
+++ b/kernel/irq_work.c
@@ -36,7 +36,7 @@ static bool irq_work_claim(struct irq_work *work)
 	 */
 	flags = work->flags & ~IRQ_WORK_PENDING;
 	for (;;) {
-		nflags = flags | IRQ_WORK_FLAGS;
+		nflags = flags | IRQ_WORK_CLAIMED;
 		oflags = cmpxchg(&work->flags, flags, nflags);
 		if (oflags == flags)
 			break;
diff --git a/kernel/jump_label.c b/kernel/jump_label.c
index 8594d24..b451709 100644
--- a/kernel/jump_label.c
+++ b/kernel/jump_label.c
@@ -79,7 +79,7 @@ int static_key_count(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_count);
 
-static void static_key_slow_inc_cpuslocked(struct static_key *key)
+void static_key_slow_inc_cpuslocked(struct static_key *key)
 {
 	int v, v1;
 
@@ -180,7 +180,7 @@ void static_key_disable(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_disable);
 
-static void static_key_slow_dec_cpuslocked(struct static_key *key,
+static void __static_key_slow_dec_cpuslocked(struct static_key *key,
 					   unsigned long rate_limit,
 					   struct delayed_work *work)
 {
@@ -211,7 +211,7 @@ static void __static_key_slow_dec(struct static_key *key,
 				  struct delayed_work *work)
 {
 	cpus_read_lock();
-	static_key_slow_dec_cpuslocked(key, rate_limit, work);
+	__static_key_slow_dec_cpuslocked(key, rate_limit, work);
 	cpus_read_unlock();
 }
 
@@ -229,6 +229,12 @@ void static_key_slow_dec(struct static_key *key)
 }
 EXPORT_SYMBOL_GPL(static_key_slow_dec);
 
+void static_key_slow_dec_cpuslocked(struct static_key *key)
+{
+	STATIC_KEY_CHECK_USE(key);
+	__static_key_slow_dec_cpuslocked(key, 0, NULL);
+}
+
 void static_key_slow_dec_deferred(struct static_key_deferred *key)
 {
 	STATIC_KEY_CHECK_USE(key);
diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
index 5fa1324..89b5f83 100644
--- a/kernel/locking/lockdep.c
+++ b/kernel/locking/lockdep.c
@@ -49,6 +49,7 @@
 #include <linux/gfp.h>
 #include <linux/random.h>
 #include <linux/jhash.h>
+#include <linux/nmi.h>
 
 #include <asm/sections.h>
 
@@ -647,18 +648,12 @@ static int count_matching_names(struct lock_class *new_class)
 	return count + 1;
 }
 
-/*
- * Register a lock's class in the hash-table, if the class is not present
- * yet. Otherwise we look it up. We cache the result in the lock object
- * itself, so actual lookup of the hash should be once per lock object.
- */
 static inline struct lock_class *
-look_up_lock_class(struct lockdep_map *lock, unsigned int subclass)
+look_up_lock_class(const struct lockdep_map *lock, unsigned int subclass)
 {
 	struct lockdep_subclass_key *key;
 	struct hlist_head *hash_head;
 	struct lock_class *class;
-	bool is_static = false;
 
 	if (unlikely(subclass >= MAX_LOCKDEP_SUBCLASSES)) {
 		debug_locks_off();
@@ -671,24 +666,11 @@ look_up_lock_class(struct lockdep_map *lock, unsigned int subclass)
 	}
 
 	/*
-	 * Static locks do not have their class-keys yet - for them the key
-	 * is the lock object itself. If the lock is in the per cpu area,
-	 * the canonical address of the lock (per cpu offset removed) is
-	 * used.
+	 * If it is not initialised then it has never been locked,
+	 * so it won't be present in the hash table.
 	 */
-	if (unlikely(!lock->key)) {
-		unsigned long can_addr, addr = (unsigned long)lock;
-
-		if (__is_kernel_percpu_address(addr, &can_addr))
-			lock->key = (void *)can_addr;
-		else if (__is_module_percpu_address(addr, &can_addr))
-			lock->key = (void *)can_addr;
-		else if (static_obj(lock))
-			lock->key = (void *)lock;
-		else
-			return ERR_PTR(-EINVAL);
-		is_static = true;
-	}
+	if (unlikely(!lock->key))
+		return NULL;
 
 	/*
 	 * NOTE: the class-key must be unique. For dynamic locks, a static
@@ -720,7 +702,35 @@ look_up_lock_class(struct lockdep_map *lock, unsigned int subclass)
 		}
 	}
 
-	return is_static || static_obj(lock->key) ? NULL : ERR_PTR(-EINVAL);
+	return NULL;
+}
+
+/*
+ * Static locks do not have their class-keys yet - for them the key is
+ * the lock object itself. If the lock is in the per cpu area, the
+ * canonical address of the lock (per cpu offset removed) is used.
+ */
+static bool assign_lock_key(struct lockdep_map *lock)
+{
+	unsigned long can_addr, addr = (unsigned long)lock;
+
+	if (__is_kernel_percpu_address(addr, &can_addr))
+		lock->key = (void *)can_addr;
+	else if (__is_module_percpu_address(addr, &can_addr))
+		lock->key = (void *)can_addr;
+	else if (static_obj(lock))
+		lock->key = (void *)lock;
+	else {
+		/* Debug-check: all keys must be persistent! */
+		debug_locks_off();
+		pr_err("INFO: trying to register non-static key.\n");
+		pr_err("the code is fine but needs lockdep annotation.\n");
+		pr_err("turning off the locking correctness validator.\n");
+		dump_stack();
+		return false;
+	}
+
+	return true;
 }
 
 /*
@@ -738,18 +748,13 @@ register_lock_class(struct lockdep_map *lock, unsigned int subclass, int force)
 	DEBUG_LOCKS_WARN_ON(!irqs_disabled());
 
 	class = look_up_lock_class(lock, subclass);
-	if (likely(!IS_ERR_OR_NULL(class)))
+	if (likely(class))
 		goto out_set_class_cache;
 
-	/*
-	 * Debug-check: all keys must be persistent!
-	 */
-	if (IS_ERR(class)) {
-		debug_locks_off();
-		printk("INFO: trying to register non-static key.\n");
-		printk("the code is fine but needs lockdep annotation.\n");
-		printk("turning off the locking correctness validator.\n");
-		dump_stack();
+	if (!lock->key) {
+		if (!assign_lock_key(lock))
+			return NULL;
+	} else if (!static_obj(lock->key)) {
 		return NULL;
 	}
 
@@ -3272,7 +3277,7 @@ print_lock_nested_lock_not_held(struct task_struct *curr,
 	return 0;
 }
 
-static int __lock_is_held(struct lockdep_map *lock, int read);
+static int __lock_is_held(const struct lockdep_map *lock, int read);
 
 /*
  * This gets called for every mutex_lock*()/spin_lock*() operation.
@@ -3481,13 +3486,14 @@ print_unlock_imbalance_bug(struct task_struct *curr, struct lockdep_map *lock,
 	return 0;
 }
 
-static int match_held_lock(struct held_lock *hlock, struct lockdep_map *lock)
+static int match_held_lock(const struct held_lock *hlock,
+					const struct lockdep_map *lock)
 {
 	if (hlock->instance == lock)
 		return 1;
 
 	if (hlock->references) {
-		struct lock_class *class = lock->class_cache[0];
+		const struct lock_class *class = lock->class_cache[0];
 
 		if (!class)
 			class = look_up_lock_class(lock, 0);
@@ -3498,7 +3504,7 @@ static int match_held_lock(struct held_lock *hlock, struct lockdep_map *lock)
 		 * Clearly if the lock hasn't been acquired _ever_, we're not
 		 * holding it either, so report failure.
 		 */
-		if (IS_ERR_OR_NULL(class))
+		if (!class)
 			return 0;
 
 		/*
@@ -3723,7 +3729,7 @@ __lock_release(struct lockdep_map *lock, int nested, unsigned long ip)
 	return 1;
 }
 
-static int __lock_is_held(struct lockdep_map *lock, int read)
+static int __lock_is_held(const struct lockdep_map *lock, int read)
 {
 	struct task_struct *curr = current;
 	int i;
@@ -3937,7 +3943,7 @@ void lock_release(struct lockdep_map *lock, int nested,
 }
 EXPORT_SYMBOL_GPL(lock_release);
 
-int lock_is_held_type(struct lockdep_map *lock, int read)
+int lock_is_held_type(const struct lockdep_map *lock, int read)
 {
 	unsigned long flags;
 	int ret = 0;
@@ -4294,7 +4300,7 @@ void lockdep_reset_lock(struct lockdep_map *lock)
 		 * If the class exists we look it up and zap it:
 		 */
 		class = look_up_lock_class(lock, j);
-		if (!IS_ERR_OR_NULL(class))
+		if (class)
 			zap_class(class);
 	}
 	/*
@@ -4490,6 +4496,7 @@ void debug_show_all_locks(void)
 		if (!unlock)
 			if (read_trylock(&tasklist_lock))
 				unlock = 1;
+		touch_nmi_watchdog();
 	} while_each_thread(g, p);
 
 	pr_warn("\n");
diff --git a/kernel/locking/locktorture.c b/kernel/locking/locktorture.c
index f24582d..6850ffd 100644
--- a/kernel/locking/locktorture.c
+++ b/kernel/locking/locktorture.c
@@ -77,10 +77,6 @@ struct lock_stress_stats {
 	long n_lock_acquired;
 };
 
-int torture_runnable = IS_ENABLED(MODULE);
-module_param(torture_runnable, int, 0444);
-MODULE_PARM_DESC(torture_runnable, "Start locktorture at module init");
-
 /* Forward reference. */
 static void lock_torture_cleanup(void);
 
@@ -130,10 +126,8 @@ static void torture_lock_busted_write_delay(struct torture_random_state *trsp)
 	if (!(torture_random(trsp) %
 	      (cxt.nrealwriters_stress * 2000 * longdelay_ms)))
 		mdelay(longdelay_ms);
-#ifdef CONFIG_PREEMPT
 	if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
-		preempt_schedule();  /* Allow test to be preempted. */
-#endif
+		torture_preempt_schedule();  /* Allow test to be preempted. */
 }
 
 static void torture_lock_busted_write_unlock(void)
@@ -179,10 +173,8 @@ static void torture_spin_lock_write_delay(struct torture_random_state *trsp)
 	if (!(torture_random(trsp) %
 	      (cxt.nrealwriters_stress * 2 * shortdelay_us)))
 		udelay(shortdelay_us);
-#ifdef CONFIG_PREEMPT
 	if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
-		preempt_schedule();  /* Allow test to be preempted. */
-#endif
+		torture_preempt_schedule();  /* Allow test to be preempted. */
 }
 
 static void torture_spin_lock_write_unlock(void) __releases(torture_spinlock)
@@ -352,10 +344,8 @@ static void torture_mutex_delay(struct torture_random_state *trsp)
 		mdelay(longdelay_ms * 5);
 	else
 		mdelay(longdelay_ms / 5);
-#ifdef CONFIG_PREEMPT
 	if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
-		preempt_schedule();  /* Allow test to be preempted. */
-#endif
+		torture_preempt_schedule();  /* Allow test to be preempted. */
 }
 
 static void torture_mutex_unlock(void) __releases(torture_mutex)
@@ -507,10 +497,8 @@ static void torture_rtmutex_delay(struct torture_random_state *trsp)
 	if (!(torture_random(trsp) %
 	      (cxt.nrealwriters_stress * 2 * shortdelay_us)))
 		udelay(shortdelay_us);
-#ifdef CONFIG_PREEMPT
 	if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
-		preempt_schedule();  /* Allow test to be preempted. */
-#endif
+		torture_preempt_schedule();  /* Allow test to be preempted. */
 }
 
 static void torture_rtmutex_unlock(void) __releases(torture_rtmutex)
@@ -547,10 +535,8 @@ static void torture_rwsem_write_delay(struct torture_random_state *trsp)
 		mdelay(longdelay_ms * 10);
 	else
 		mdelay(longdelay_ms / 10);
-#ifdef CONFIG_PREEMPT
 	if (!(torture_random(trsp) % (cxt.nrealwriters_stress * 20000)))
-		preempt_schedule();  /* Allow test to be preempted. */
-#endif
+		torture_preempt_schedule();  /* Allow test to be preempted. */
 }
 
 static void torture_rwsem_up_write(void) __releases(torture_rwsem)
@@ -570,14 +556,12 @@ static void torture_rwsem_read_delay(struct torture_random_state *trsp)
 
 	/* We want a long delay occasionally to force massive contention.  */
 	if (!(torture_random(trsp) %
-	      (cxt.nrealwriters_stress * 2000 * longdelay_ms)))
+	      (cxt.nrealreaders_stress * 2000 * longdelay_ms)))
 		mdelay(longdelay_ms * 2);
 	else
 		mdelay(longdelay_ms / 2);
-#ifdef CONFIG_PREEMPT
 	if (!(torture_random(trsp) % (cxt.nrealreaders_stress * 20000)))
-		preempt_schedule();  /* Allow test to be preempted. */
-#endif
+		torture_preempt_schedule();  /* Allow test to be preempted. */
 }
 
 static void torture_rwsem_up_read(void) __releases(torture_rwsem)
@@ -715,8 +699,7 @@ static void __torture_print_stats(char *page,
 {
 	bool fail = 0;
 	int i, n_stress;
-	long max = 0;
-	long min = statp[0].n_lock_acquired;
+	long max = 0, min = statp ? statp[0].n_lock_acquired : 0;
 	long long sum = 0;
 
 	n_stress = write ? cxt.nrealwriters_stress : cxt.nrealreaders_stress;
@@ -823,7 +806,7 @@ static void lock_torture_cleanup(void)
 	 * such, only perform the underlying torture-specific cleanups,
 	 * and avoid anything related to locktorture.
 	 */
-	if (!cxt.lwsa)
+	if (!cxt.lwsa && !cxt.lrsa)
 		goto end;
 
 	if (writer_tasks) {
@@ -879,7 +862,7 @@ static int __init lock_torture_init(void)
 		&percpu_rwsem_lock_ops,
 	};
 
-	if (!torture_init_begin(torture_type, verbose, &torture_runnable))
+	if (!torture_init_begin(torture_type, verbose))
 		return -EBUSY;
 
 	/* Process args and tell the world that the torturer is on the job. */
@@ -898,6 +881,13 @@ static int __init lock_torture_init(void)
 		firsterr = -EINVAL;
 		goto unwind;
 	}
+
+	if (nwriters_stress == 0 && nreaders_stress == 0) {
+		pr_alert("lock-torture: must run at least one locking thread\n");
+		firsterr = -EINVAL;
+		goto unwind;
+	}
+
 	if (cxt.cur_ops->init)
 		cxt.cur_ops->init();
 
@@ -921,17 +911,19 @@ static int __init lock_torture_init(void)
 #endif
 
 	/* Initialize the statistics so that each run gets its own numbers. */
+	if (nwriters_stress) {
+		lock_is_write_held = 0;
+		cxt.lwsa = kmalloc(sizeof(*cxt.lwsa) * cxt.nrealwriters_stress, GFP_KERNEL);
+		if (cxt.lwsa == NULL) {
+			VERBOSE_TOROUT_STRING("cxt.lwsa: Out of memory");
+			firsterr = -ENOMEM;
+			goto unwind;
+		}
 
-	lock_is_write_held = 0;
-	cxt.lwsa = kmalloc(sizeof(*cxt.lwsa) * cxt.nrealwriters_stress, GFP_KERNEL);
-	if (cxt.lwsa == NULL) {
-		VERBOSE_TOROUT_STRING("cxt.lwsa: Out of memory");
-		firsterr = -ENOMEM;
-		goto unwind;
-	}
-	for (i = 0; i < cxt.nrealwriters_stress; i++) {
-		cxt.lwsa[i].n_lock_fail = 0;
-		cxt.lwsa[i].n_lock_acquired = 0;
+		for (i = 0; i < cxt.nrealwriters_stress; i++) {
+			cxt.lwsa[i].n_lock_fail = 0;
+			cxt.lwsa[i].n_lock_acquired = 0;
+		}
 	}
 
 	if (cxt.cur_ops->readlock) {
@@ -948,19 +940,21 @@ static int __init lock_torture_init(void)
 			cxt.nrealreaders_stress = cxt.nrealwriters_stress;
 		}
 
-		lock_is_read_held = 0;
-		cxt.lrsa = kmalloc(sizeof(*cxt.lrsa) * cxt.nrealreaders_stress, GFP_KERNEL);
-		if (cxt.lrsa == NULL) {
-			VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory");
-			firsterr = -ENOMEM;
-			kfree(cxt.lwsa);
-			cxt.lwsa = NULL;
-			goto unwind;
-		}
+		if (nreaders_stress) {
+			lock_is_read_held = 0;
+			cxt.lrsa = kmalloc(sizeof(*cxt.lrsa) * cxt.nrealreaders_stress, GFP_KERNEL);
+			if (cxt.lrsa == NULL) {
+				VERBOSE_TOROUT_STRING("cxt.lrsa: Out of memory");
+				firsterr = -ENOMEM;
+				kfree(cxt.lwsa);
+				cxt.lwsa = NULL;
+				goto unwind;
+			}
 
-		for (i = 0; i < cxt.nrealreaders_stress; i++) {
-			cxt.lrsa[i].n_lock_fail = 0;
-			cxt.lrsa[i].n_lock_acquired = 0;
+			for (i = 0; i < cxt.nrealreaders_stress; i++) {
+				cxt.lrsa[i].n_lock_fail = 0;
+				cxt.lrsa[i].n_lock_acquired = 0;
+			}
 		}
 	}
 
@@ -990,12 +984,14 @@ static int __init lock_torture_init(void)
 			goto unwind;
 	}
 
-	writer_tasks = kzalloc(cxt.nrealwriters_stress * sizeof(writer_tasks[0]),
-			       GFP_KERNEL);
-	if (writer_tasks == NULL) {
-		VERBOSE_TOROUT_ERRSTRING("writer_tasks: Out of memory");
-		firsterr = -ENOMEM;
-		goto unwind;
+	if (nwriters_stress) {
+		writer_tasks = kzalloc(cxt.nrealwriters_stress * sizeof(writer_tasks[0]),
+				       GFP_KERNEL);
+		if (writer_tasks == NULL) {
+			VERBOSE_TOROUT_ERRSTRING("writer_tasks: Out of memory");
+			firsterr = -ENOMEM;
+			goto unwind;
+		}
 	}
 
 	if (cxt.cur_ops->readlock) {
diff --git a/kernel/locking/qspinlock.c b/kernel/locking/qspinlock.c
index 294294c..38ece03 100644
--- a/kernel/locking/qspinlock.c
+++ b/kernel/locking/qspinlock.c
@@ -170,7 +170,7 @@ static __always_inline void clear_pending_set_locked(struct qspinlock *lock)
  * @tail : The new queue tail code word
  * Return: The previous queue tail code word
  *
- * xchg(lock, tail)
+ * xchg(lock, tail), which heads an address dependency
  *
  * p,*,* -> n,*,* ; prev = xchg(lock, node)
  */
@@ -409,13 +409,11 @@ void queued_spin_lock_slowpath(struct qspinlock *lock, u32 val)
 	if (old & _Q_TAIL_MASK) {
 		prev = decode_tail(old);
 		/*
-		 * The above xchg_tail() is also a load of @lock which generates,
-		 * through decode_tail(), a pointer.
-		 *
-		 * The address dependency matches the RELEASE of xchg_tail()
-		 * such that the access to @prev must happen after.
+		 * The above xchg_tail() is also a load of @lock which
+		 * generates, through decode_tail(), a pointer.  The address
+		 * dependency matches the RELEASE of xchg_tail() such that
+		 * the subsequent access to @prev happens after.
 		 */
-		smp_read_barrier_depends();
 
 		WRITE_ONCE(prev->next, node);
 
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 6f3dba6..65cc0cb 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1290,6 +1290,19 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 	return ret;
 }
 
+static inline int __rt_mutex_slowtrylock(struct rt_mutex *lock)
+{
+	int ret = try_to_take_rt_mutex(lock, current, NULL);
+
+	/*
+	 * try_to_take_rt_mutex() sets the lock waiters bit
+	 * unconditionally. Clean this up.
+	 */
+	fixup_rt_mutex_waiters(lock);
+
+	return ret;
+}
+
 /*
  * Slow path try-lock function:
  */
@@ -1312,13 +1325,7 @@ static inline int rt_mutex_slowtrylock(struct rt_mutex *lock)
 	 */
 	raw_spin_lock_irqsave(&lock->wait_lock, flags);
 
-	ret = try_to_take_rt_mutex(lock, current, NULL);
-
-	/*
-	 * try_to_take_rt_mutex() sets the lock waiters bit
-	 * unconditionally. Clean this up.
-	 */
-	fixup_rt_mutex_waiters(lock);
+	ret = __rt_mutex_slowtrylock(lock);
 
 	raw_spin_unlock_irqrestore(&lock->wait_lock, flags);
 
@@ -1505,6 +1512,11 @@ int __sched rt_mutex_futex_trylock(struct rt_mutex *lock)
 	return rt_mutex_slowtrylock(lock);
 }
 
+int __sched __rt_mutex_futex_trylock(struct rt_mutex *lock)
+{
+	return __rt_mutex_slowtrylock(lock);
+}
+
 /**
  * rt_mutex_timed_lock - lock a rt_mutex interruptible
  *			the timeout structure is provided
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index 124e98c..68686b3 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -148,6 +148,7 @@ extern bool rt_mutex_cleanup_proxy_lock(struct rt_mutex *lock,
 				 struct rt_mutex_waiter *waiter);
 
 extern int rt_mutex_futex_trylock(struct rt_mutex *l);
+extern int __rt_mutex_futex_trylock(struct rt_mutex *l);
 
 extern void rt_mutex_futex_unlock(struct rt_mutex *lock);
 extern bool __rt_mutex_futex_unlock(struct rt_mutex *lock,
diff --git a/kernel/module.c b/kernel/module.c
index dea01ac..09e48ee 100644
--- a/kernel/module.c
+++ b/kernel/module.c
@@ -2863,6 +2863,15 @@ static int check_modinfo_livepatch(struct module *mod, struct load_info *info)
 }
 #endif /* CONFIG_LIVEPATCH */
 
+static void check_modinfo_retpoline(struct module *mod, struct load_info *info)
+{
+	if (retpoline_module_ok(get_modinfo(info, "retpoline")))
+		return;
+
+	pr_warn("%s: loading module not compiled with retpoline compiler.\n",
+		mod->name);
+}
+
 /* Sets info->hdr and info->len. */
 static int copy_module_from_user(const void __user *umod, unsigned long len,
 				  struct load_info *info)
@@ -3029,6 +3038,8 @@ static int check_modinfo(struct module *mod, struct load_info *info, int flags)
 		add_taint_module(mod, TAINT_OOT_MODULE, LOCKDEP_STILL_OK);
 	}
 
+	check_modinfo_retpoline(mod, info);
+
 	if (get_modinfo(info, "staging")) {
 		add_taint_module(mod, TAINT_CRAP, LOCKDEP_STILL_OK);
 		pr_warn("%s: module is from the staging directory, the quality "
diff --git a/kernel/pid.c b/kernel/pid.c
index b13b624..5d30c87 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -41,7 +41,19 @@
 #include <linux/sched/task.h>
 #include <linux/idr.h>
 
-struct pid init_struct_pid = INIT_STRUCT_PID;
+struct pid init_struct_pid = {
+	.count 		= ATOMIC_INIT(1),
+	.tasks		= {
+		{ .first = NULL },
+		{ .first = NULL },
+		{ .first = NULL },
+	},
+	.level		= 0,
+	.numbers	= { {
+		.nr		= 0,
+		.ns		= &init_pid_ns,
+	}, }
+};
 
 int pid_max = PID_MAX_DEFAULT;
 
@@ -193,10 +205,8 @@ struct pid *alloc_pid(struct pid_namespace *ns)
 	}
 
 	if (unlikely(is_child_reaper(pid))) {
-		if (pid_ns_prepare_proc(ns)) {
-			disable_pid_allocation(ns);
+		if (pid_ns_prepare_proc(ns))
 			goto out_free;
-		}
 	}
 
 	get_pid_ns(ns);
@@ -226,6 +236,10 @@ struct pid *alloc_pid(struct pid_namespace *ns)
 	while (++i <= ns->level)
 		idr_remove(&ns->idr, (pid->numbers + i)->nr);
 
+	/* On failure to allocate the first pid, reset the state */
+	if (ns->pid_allocated == PIDNS_ADDING)
+		idr_set_cursor(&ns->idr, 0);
+
 	spin_unlock_irq(&pidmap_lock);
 
 	kmem_cache_free(ns->pid_cachep, pid);
diff --git a/kernel/power/main.c b/kernel/power/main.c
index 3a2ca90..705c236 100644
--- a/kernel/power/main.c
+++ b/kernel/power/main.c
@@ -22,6 +22,35 @@ DEFINE_MUTEX(pm_mutex);
 
 #ifdef CONFIG_PM_SLEEP
 
+void lock_system_sleep(void)
+{
+	current->flags |= PF_FREEZER_SKIP;
+	mutex_lock(&pm_mutex);
+}
+EXPORT_SYMBOL_GPL(lock_system_sleep);
+
+void unlock_system_sleep(void)
+{
+	/*
+	 * Don't use freezer_count() because we don't want the call to
+	 * try_to_freeze() here.
+	 *
+	 * Reason:
+	 * Fundamentally, we just don't need it, because freezing condition
+	 * doesn't come into effect until we release the pm_mutex lock,
+	 * since the freezer always works with pm_mutex held.
+	 *
+	 * More importantly, in the case of hibernation,
+	 * unlock_system_sleep() gets called in snapshot_read() and
+	 * snapshot_write() when the freezing condition is still in effect.
+	 * Which means, if we use try_to_freeze() here, it would make them
+	 * enter the refrigerator, thus causing hibernation to lockup.
+	 */
+	current->flags &= ~PF_FREEZER_SKIP;
+	mutex_unlock(&pm_mutex);
+}
+EXPORT_SYMBOL_GPL(unlock_system_sleep);
+
 /* Routines for PM-transition notifications */
 
 static BLOCKING_NOTIFIER_HEAD(pm_chain_head);
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index bce0464..3d37c27 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -1645,8 +1645,7 @@ static unsigned long free_unnecessary_pages(void)
  * [number of saveable pages] - [number of pages that can be freed in theory]
  *
  * where the second term is the sum of (1) reclaimable slab pages, (2) active
- * and (3) inactive anonymous pages, (4) active and (5) inactive file pages,
- * minus mapped file pages.
+ * and (3) inactive anonymous pages, (4) active and (5) inactive file pages.
  */
 static unsigned long minimum_image_size(unsigned long saveable)
 {
@@ -1656,8 +1655,7 @@ static unsigned long minimum_image_size(unsigned long saveable)
 		+ global_node_page_state(NR_ACTIVE_ANON)
 		+ global_node_page_state(NR_INACTIVE_ANON)
 		+ global_node_page_state(NR_ACTIVE_FILE)
-		+ global_node_page_state(NR_INACTIVE_FILE)
-		- global_node_page_state(NR_FILE_MAPPED);
+		+ global_node_page_state(NR_INACTIVE_FILE);
 
 	return saveable <= size ? 0 : saveable - size;
 }
diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 293ead5..11b4282 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -240,7 +240,7 @@ static void hib_init_batch(struct hib_bio_batch *hb)
 static void hib_end_io(struct bio *bio)
 {
 	struct hib_bio_batch *hb = bio->bi_private;
-	struct page *page = bio->bi_io_vec[0].bv_page;
+	struct page *page = bio_first_page_all(bio);
 
 	if (bio->bi_status) {
 		pr_alert("Read-error on swap-device (%u:%u:%Lu)\n",
@@ -879,7 +879,7 @@ static int save_image_lzo(struct swap_map_handle *handle,
  *	space avaiable from the resume partition.
  */
 
-static int enough_swap(unsigned int nr_pages, unsigned int flags)
+static int enough_swap(unsigned int nr_pages)
 {
 	unsigned int free_swap = count_swap_pages(root_swap, 1);
 	unsigned int required;
@@ -915,7 +915,7 @@ int swsusp_write(unsigned int flags)
 		return error;
 	}
 	if (flags & SF_NOCOMPRESS_MODE) {
-		if (!enough_swap(pages, flags)) {
+		if (!enough_swap(pages)) {
 			pr_err("Not enough free swap\n");
 			error = -ENOSPC;
 			goto out_finish;
diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
index 59c471d..6334f2c 100644
--- a/kernel/rcu/rcu.h
+++ b/kernel/rcu/rcu.h
@@ -30,31 +30,8 @@
 #define RCU_TRACE(stmt)
 #endif /* #else #ifdef CONFIG_RCU_TRACE */
 
-/*
- * Process-level increment to ->dynticks_nesting field.  This allows for
- * architectures that use half-interrupts and half-exceptions from
- * process context.
- *
- * DYNTICK_TASK_NEST_MASK defines a field of width DYNTICK_TASK_NEST_WIDTH
- * that counts the number of process-based reasons why RCU cannot
- * consider the corresponding CPU to be idle, and DYNTICK_TASK_NEST_VALUE
- * is the value used to increment or decrement this field.
- *
- * The rest of the bits could in principle be used to count interrupts,
- * but this would mean that a negative-one value in the interrupt
- * field could incorrectly zero out the DYNTICK_TASK_NEST_MASK field.
- * We therefore provide a two-bit guard field defined by DYNTICK_TASK_MASK
- * that is set to DYNTICK_TASK_FLAG upon initial exit from idle.
- * The DYNTICK_TASK_EXIT_IDLE value is thus the combined value used upon
- * initial exit from idle.
- */
-#define DYNTICK_TASK_NEST_WIDTH 7
-#define DYNTICK_TASK_NEST_VALUE ((LLONG_MAX >> DYNTICK_TASK_NEST_WIDTH) + 1)
-#define DYNTICK_TASK_NEST_MASK  (LLONG_MAX - DYNTICK_TASK_NEST_VALUE + 1)
-#define DYNTICK_TASK_FLAG	   ((DYNTICK_TASK_NEST_VALUE / 8) * 2)
-#define DYNTICK_TASK_MASK	   ((DYNTICK_TASK_NEST_VALUE / 8) * 3)
-#define DYNTICK_TASK_EXIT_IDLE	   (DYNTICK_TASK_NEST_VALUE + \
-				    DYNTICK_TASK_FLAG)
+/* Offset to allow for unmatched rcu_irq_{enter,exit}(). */
+#define DYNTICK_IRQ_NONIDLE	((LONG_MAX / 2) + 1)
 
 
 /*
diff --git a/kernel/rcu/rcuperf.c b/kernel/rcu/rcuperf.c
index 1f87a02..d1ebdf9 100644
--- a/kernel/rcu/rcuperf.c
+++ b/kernel/rcu/rcuperf.c
@@ -106,10 +106,6 @@ static int rcu_perf_writer_state;
 #define MAX_MEAS 10000
 #define MIN_MEAS 100
 
-static int perf_runnable = IS_ENABLED(MODULE);
-module_param(perf_runnable, int, 0444);
-MODULE_PARM_DESC(perf_runnable, "Start rcuperf at boot");
-
 /*
  * Operations vector for selecting different types of tests.
  */
@@ -646,7 +642,7 @@ rcu_perf_init(void)
 		&tasks_ops,
 	};
 
-	if (!torture_init_begin(perf_type, verbose, &perf_runnable))
+	if (!torture_init_begin(perf_type, verbose))
 		return -EBUSY;
 
 	/* Process args and tell the world that the perf'er is on the job. */
diff --git a/kernel/rcu/rcutorture.c b/kernel/rcu/rcutorture.c
index 74f6b01..308e6fd 100644
--- a/kernel/rcu/rcutorture.c
+++ b/kernel/rcu/rcutorture.c
@@ -187,10 +187,6 @@ static const char *rcu_torture_writer_state_getname(void)
 	return rcu_torture_writer_state_names[i];
 }
 
-static int torture_runnable = IS_ENABLED(MODULE);
-module_param(torture_runnable, int, 0444);
-MODULE_PARM_DESC(torture_runnable, "Start rcutorture at boot");
-
 #if defined(CONFIG_RCU_BOOST) && !defined(CONFIG_HOTPLUG_CPU)
 #define rcu_can_boost() 1
 #else /* #if defined(CONFIG_RCU_BOOST) && !defined(CONFIG_HOTPLUG_CPU) */
@@ -315,11 +311,9 @@ static void rcu_read_delay(struct torture_random_state *rrsp)
 	}
 	if (!(torture_random(rrsp) % (nrealreaders * 2 * shortdelay_us)))
 		udelay(shortdelay_us);
-#ifdef CONFIG_PREEMPT
 	if (!preempt_count() &&
-	    !(torture_random(rrsp) % (nrealreaders * 20000)))
-		preempt_schedule();  /* No QS if preempt_disable() in effect */
-#endif
+	    !(torture_random(rrsp) % (nrealreaders * 500)))
+		torture_preempt_schedule();  /* QS only if preemptible. */
 }
 
 static void rcu_torture_read_unlock(int idx) __releases(RCU)
@@ -1731,7 +1725,7 @@ rcu_torture_init(void)
 		&sched_ops, &tasks_ops,
 	};
 
-	if (!torture_init_begin(torture_type, verbose, &torture_runnable))
+	if (!torture_init_begin(torture_type, verbose))
 		return -EBUSY;
 
 	/* Process args and tell the world that the torturer is on the job. */
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 6d58800..d5cea81 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -53,6 +53,33 @@ static void srcu_invoke_callbacks(struct work_struct *work);
 static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay);
 static void process_srcu(struct work_struct *work);
 
+/* Wrappers for lock acquisition and release, see raw_spin_lock_rcu_node(). */
+#define spin_lock_rcu_node(p)					\
+do {									\
+	spin_lock(&ACCESS_PRIVATE(p, lock));			\
+	smp_mb__after_unlock_lock();					\
+} while (0)
+
+#define spin_unlock_rcu_node(p) spin_unlock(&ACCESS_PRIVATE(p, lock))
+
+#define spin_lock_irq_rcu_node(p)					\
+do {									\
+	spin_lock_irq(&ACCESS_PRIVATE(p, lock));			\
+	smp_mb__after_unlock_lock();					\
+} while (0)
+
+#define spin_unlock_irq_rcu_node(p)					\
+	spin_unlock_irq(&ACCESS_PRIVATE(p, lock))
+
+#define spin_lock_irqsave_rcu_node(p, flags)			\
+do {									\
+	spin_lock_irqsave(&ACCESS_PRIVATE(p, lock), flags);	\
+	smp_mb__after_unlock_lock();					\
+} while (0)
+
+#define spin_unlock_irqrestore_rcu_node(p, flags)			\
+	spin_unlock_irqrestore(&ACCESS_PRIVATE(p, lock), flags)	\
+
 /*
  * Initialize SRCU combining tree.  Note that statically allocated
  * srcu_struct structures might already have srcu_read_lock() and
@@ -77,7 +104,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *sp, bool is_static)
 
 	/* Each pass through this loop initializes one srcu_node structure. */
 	rcu_for_each_node_breadth_first(sp, snp) {
-		raw_spin_lock_init(&ACCESS_PRIVATE(snp, lock));
+		spin_lock_init(&ACCESS_PRIVATE(snp, lock));
 		WARN_ON_ONCE(ARRAY_SIZE(snp->srcu_have_cbs) !=
 			     ARRAY_SIZE(snp->srcu_data_have_cbs));
 		for (i = 0; i < ARRAY_SIZE(snp->srcu_have_cbs); i++) {
@@ -111,7 +138,7 @@ static void init_srcu_struct_nodes(struct srcu_struct *sp, bool is_static)
 	snp_first = sp->level[level];
 	for_each_possible_cpu(cpu) {
 		sdp = per_cpu_ptr(sp->sda, cpu);
-		raw_spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
+		spin_lock_init(&ACCESS_PRIVATE(sdp, lock));
 		rcu_segcblist_init(&sdp->srcu_cblist);
 		sdp->srcu_cblist_invoking = false;
 		sdp->srcu_gp_seq_needed = sp->srcu_gp_seq;
@@ -170,7 +197,7 @@ int __init_srcu_struct(struct srcu_struct *sp, const char *name,
 	/* Don't re-initialize a lock while it is held. */
 	debug_check_no_locks_freed((void *)sp, sizeof(*sp));
 	lockdep_init_map(&sp->dep_map, name, key, 0);
-	raw_spin_lock_init(&ACCESS_PRIVATE(sp, lock));
+	spin_lock_init(&ACCESS_PRIVATE(sp, lock));
 	return init_srcu_struct_fields(sp, false);
 }
 EXPORT_SYMBOL_GPL(__init_srcu_struct);
@@ -187,7 +214,7 @@ EXPORT_SYMBOL_GPL(__init_srcu_struct);
  */
 int init_srcu_struct(struct srcu_struct *sp)
 {
-	raw_spin_lock_init(&ACCESS_PRIVATE(sp, lock));
+	spin_lock_init(&ACCESS_PRIVATE(sp, lock));
 	return init_srcu_struct_fields(sp, false);
 }
 EXPORT_SYMBOL_GPL(init_srcu_struct);
@@ -210,13 +237,13 @@ static void check_init_srcu_struct(struct srcu_struct *sp)
 	/* The smp_load_acquire() pairs with the smp_store_release(). */
 	if (!rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq_needed))) /*^^^*/
 		return; /* Already initialized. */
-	raw_spin_lock_irqsave_rcu_node(sp, flags);
+	spin_lock_irqsave_rcu_node(sp, flags);
 	if (!rcu_seq_state(sp->srcu_gp_seq_needed)) {
-		raw_spin_unlock_irqrestore_rcu_node(sp, flags);
+		spin_unlock_irqrestore_rcu_node(sp, flags);
 		return;
 	}
 	init_srcu_struct_fields(sp, true);
-	raw_spin_unlock_irqrestore_rcu_node(sp, flags);
+	spin_unlock_irqrestore_rcu_node(sp, flags);
 }
 
 /*
@@ -513,7 +540,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
 	mutex_lock(&sp->srcu_cb_mutex);
 
 	/* End the current grace period. */
-	raw_spin_lock_irq_rcu_node(sp);
+	spin_lock_irq_rcu_node(sp);
 	idx = rcu_seq_state(sp->srcu_gp_seq);
 	WARN_ON_ONCE(idx != SRCU_STATE_SCAN2);
 	cbdelay = srcu_get_delay(sp);
@@ -522,7 +549,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
 	gpseq = rcu_seq_current(&sp->srcu_gp_seq);
 	if (ULONG_CMP_LT(sp->srcu_gp_seq_needed_exp, gpseq))
 		sp->srcu_gp_seq_needed_exp = gpseq;
-	raw_spin_unlock_irq_rcu_node(sp);
+	spin_unlock_irq_rcu_node(sp);
 	mutex_unlock(&sp->srcu_gp_mutex);
 	/* A new grace period can start at this point.  But only one. */
 
@@ -530,7 +557,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
 	idx = rcu_seq_ctr(gpseq) % ARRAY_SIZE(snp->srcu_have_cbs);
 	idxnext = (idx + 1) % ARRAY_SIZE(snp->srcu_have_cbs);
 	rcu_for_each_node_breadth_first(sp, snp) {
-		raw_spin_lock_irq_rcu_node(snp);
+		spin_lock_irq_rcu_node(snp);
 		cbs = false;
 		if (snp >= sp->level[rcu_num_lvls - 1])
 			cbs = snp->srcu_have_cbs[idx] == gpseq;
@@ -540,7 +567,7 @@ static void srcu_gp_end(struct srcu_struct *sp)
 			snp->srcu_gp_seq_needed_exp = gpseq;
 		mask = snp->srcu_data_have_cbs[idx];
 		snp->srcu_data_have_cbs[idx] = 0;
-		raw_spin_unlock_irq_rcu_node(snp);
+		spin_unlock_irq_rcu_node(snp);
 		if (cbs)
 			srcu_schedule_cbs_snp(sp, snp, mask, cbdelay);
 
@@ -548,11 +575,11 @@ static void srcu_gp_end(struct srcu_struct *sp)
 		if (!(gpseq & counter_wrap_check))
 			for (cpu = snp->grplo; cpu <= snp->grphi; cpu++) {
 				sdp = per_cpu_ptr(sp->sda, cpu);
-				raw_spin_lock_irqsave_rcu_node(sdp, flags);
+				spin_lock_irqsave_rcu_node(sdp, flags);
 				if (ULONG_CMP_GE(gpseq,
 						 sdp->srcu_gp_seq_needed + 100))
 					sdp->srcu_gp_seq_needed = gpseq;
-				raw_spin_unlock_irqrestore_rcu_node(sdp, flags);
+				spin_unlock_irqrestore_rcu_node(sdp, flags);
 			}
 	}
 
@@ -560,17 +587,17 @@ static void srcu_gp_end(struct srcu_struct *sp)
 	mutex_unlock(&sp->srcu_cb_mutex);
 
 	/* Start a new grace period if needed. */
-	raw_spin_lock_irq_rcu_node(sp);
+	spin_lock_irq_rcu_node(sp);
 	gpseq = rcu_seq_current(&sp->srcu_gp_seq);
 	if (!rcu_seq_state(gpseq) &&
 	    ULONG_CMP_LT(gpseq, sp->srcu_gp_seq_needed)) {
 		srcu_gp_start(sp);
-		raw_spin_unlock_irq_rcu_node(sp);
+		spin_unlock_irq_rcu_node(sp);
 		/* Throttle expedited grace periods: Should be rare! */
 		srcu_reschedule(sp, rcu_seq_ctr(gpseq) & 0x3ff
 				    ? 0 : SRCU_INTERVAL);
 	} else {
-		raw_spin_unlock_irq_rcu_node(sp);
+		spin_unlock_irq_rcu_node(sp);
 	}
 }
 
@@ -590,18 +617,18 @@ static void srcu_funnel_exp_start(struct srcu_struct *sp, struct srcu_node *snp,
 		if (rcu_seq_done(&sp->srcu_gp_seq, s) ||
 		    ULONG_CMP_GE(READ_ONCE(snp->srcu_gp_seq_needed_exp), s))
 			return;
-		raw_spin_lock_irqsave_rcu_node(snp, flags);
+		spin_lock_irqsave_rcu_node(snp, flags);
 		if (ULONG_CMP_GE(snp->srcu_gp_seq_needed_exp, s)) {
-			raw_spin_unlock_irqrestore_rcu_node(snp, flags);
+			spin_unlock_irqrestore_rcu_node(snp, flags);
 			return;
 		}
 		WRITE_ONCE(snp->srcu_gp_seq_needed_exp, s);
-		raw_spin_unlock_irqrestore_rcu_node(snp, flags);
+		spin_unlock_irqrestore_rcu_node(snp, flags);
 	}
-	raw_spin_lock_irqsave_rcu_node(sp, flags);
+	spin_lock_irqsave_rcu_node(sp, flags);
 	if (!ULONG_CMP_LT(sp->srcu_gp_seq_needed_exp, s))
 		sp->srcu_gp_seq_needed_exp = s;
-	raw_spin_unlock_irqrestore_rcu_node(sp, flags);
+	spin_unlock_irqrestore_rcu_node(sp, flags);
 }
 
 /*
@@ -623,12 +650,12 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
 	for (; snp != NULL; snp = snp->srcu_parent) {
 		if (rcu_seq_done(&sp->srcu_gp_seq, s) && snp != sdp->mynode)
 			return; /* GP already done and CBs recorded. */
-		raw_spin_lock_irqsave_rcu_node(snp, flags);
+		spin_lock_irqsave_rcu_node(snp, flags);
 		if (ULONG_CMP_GE(snp->srcu_have_cbs[idx], s)) {
 			snp_seq = snp->srcu_have_cbs[idx];
 			if (snp == sdp->mynode && snp_seq == s)
 				snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
-			raw_spin_unlock_irqrestore_rcu_node(snp, flags);
+			spin_unlock_irqrestore_rcu_node(snp, flags);
 			if (snp == sdp->mynode && snp_seq != s) {
 				srcu_schedule_cbs_sdp(sdp, do_norm
 							   ? SRCU_INTERVAL
@@ -644,11 +671,11 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
 			snp->srcu_data_have_cbs[idx] |= sdp->grpmask;
 		if (!do_norm && ULONG_CMP_LT(snp->srcu_gp_seq_needed_exp, s))
 			snp->srcu_gp_seq_needed_exp = s;
-		raw_spin_unlock_irqrestore_rcu_node(snp, flags);
+		spin_unlock_irqrestore_rcu_node(snp, flags);
 	}
 
 	/* Top of tree, must ensure the grace period will be started. */
-	raw_spin_lock_irqsave_rcu_node(sp, flags);
+	spin_lock_irqsave_rcu_node(sp, flags);
 	if (ULONG_CMP_LT(sp->srcu_gp_seq_needed, s)) {
 		/*
 		 * Record need for grace period s.  Pair with load
@@ -667,7 +694,7 @@ static void srcu_funnel_gp_start(struct srcu_struct *sp, struct srcu_data *sdp,
 		queue_delayed_work(system_power_efficient_wq, &sp->work,
 				   srcu_get_delay(sp));
 	}
-	raw_spin_unlock_irqrestore_rcu_node(sp, flags);
+	spin_unlock_irqrestore_rcu_node(sp, flags);
 }
 
 /*
@@ -830,7 +857,7 @@ void __call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
 	rhp->func = func;
 	local_irq_save(flags);
 	sdp = this_cpu_ptr(sp->sda);
-	raw_spin_lock_rcu_node(sdp);
+	spin_lock_rcu_node(sdp);
 	rcu_segcblist_enqueue(&sdp->srcu_cblist, rhp, false);
 	rcu_segcblist_advance(&sdp->srcu_cblist,
 			      rcu_seq_current(&sp->srcu_gp_seq));
@@ -844,7 +871,7 @@ void __call_srcu(struct srcu_struct *sp, struct rcu_head *rhp,
 		sdp->srcu_gp_seq_needed_exp = s;
 		needexp = true;
 	}
-	raw_spin_unlock_irqrestore_rcu_node(sdp, flags);
+	spin_unlock_irqrestore_rcu_node(sdp, flags);
 	if (needgp)
 		srcu_funnel_gp_start(sp, sdp, s, do_norm);
 	else if (needexp)
@@ -900,7 +927,7 @@ static void __synchronize_srcu(struct srcu_struct *sp, bool do_norm)
 
 	/*
 	 * Make sure that later code is ordered after the SRCU grace
-	 * period.  This pairs with the raw_spin_lock_irq_rcu_node()
+	 * period.  This pairs with the spin_lock_irq_rcu_node()
 	 * in srcu_invoke_callbacks().  Unlike Tree RCU, this is needed
 	 * because the current CPU might have been totally uninvolved with
 	 * (and thus unordered against) that grace period.
@@ -1024,7 +1051,7 @@ void srcu_barrier(struct srcu_struct *sp)
 	 */
 	for_each_possible_cpu(cpu) {
 		sdp = per_cpu_ptr(sp->sda, cpu);
-		raw_spin_lock_irq_rcu_node(sdp);
+		spin_lock_irq_rcu_node(sdp);
 		atomic_inc(&sp->srcu_barrier_cpu_cnt);
 		sdp->srcu_barrier_head.func = srcu_barrier_cb;
 		debug_rcu_head_queue(&sdp->srcu_barrier_head);
@@ -1033,7 +1060,7 @@ void srcu_barrier(struct srcu_struct *sp)
 			debug_rcu_head_unqueue(&sdp->srcu_barrier_head);
 			atomic_dec(&sp->srcu_barrier_cpu_cnt);
 		}
-		raw_spin_unlock_irq_rcu_node(sdp);
+		spin_unlock_irq_rcu_node(sdp);
 	}
 
 	/* Remove the initial count, at which point reaching zero can happen. */
@@ -1082,17 +1109,17 @@ static void srcu_advance_state(struct srcu_struct *sp)
 	 */
 	idx = rcu_seq_state(smp_load_acquire(&sp->srcu_gp_seq)); /* ^^^ */
 	if (idx == SRCU_STATE_IDLE) {
-		raw_spin_lock_irq_rcu_node(sp);
+		spin_lock_irq_rcu_node(sp);
 		if (ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed)) {
 			WARN_ON_ONCE(rcu_seq_state(sp->srcu_gp_seq));
-			raw_spin_unlock_irq_rcu_node(sp);
+			spin_unlock_irq_rcu_node(sp);
 			mutex_unlock(&sp->srcu_gp_mutex);
 			return;
 		}
 		idx = rcu_seq_state(READ_ONCE(sp->srcu_gp_seq));
 		if (idx == SRCU_STATE_IDLE)
 			srcu_gp_start(sp);
-		raw_spin_unlock_irq_rcu_node(sp);
+		spin_unlock_irq_rcu_node(sp);
 		if (idx != SRCU_STATE_IDLE) {
 			mutex_unlock(&sp->srcu_gp_mutex);
 			return; /* Someone else started the grace period. */
@@ -1141,19 +1168,19 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	sdp = container_of(work, struct srcu_data, work.work);
 	sp = sdp->sp;
 	rcu_cblist_init(&ready_cbs);
-	raw_spin_lock_irq_rcu_node(sdp);
+	spin_lock_irq_rcu_node(sdp);
 	rcu_segcblist_advance(&sdp->srcu_cblist,
 			      rcu_seq_current(&sp->srcu_gp_seq));
 	if (sdp->srcu_cblist_invoking ||
 	    !rcu_segcblist_ready_cbs(&sdp->srcu_cblist)) {
-		raw_spin_unlock_irq_rcu_node(sdp);
+		spin_unlock_irq_rcu_node(sdp);
 		return;  /* Someone else on the job or nothing to do. */
 	}
 
 	/* We are on the job!  Extract and invoke ready callbacks. */
 	sdp->srcu_cblist_invoking = true;
 	rcu_segcblist_extract_done_cbs(&sdp->srcu_cblist, &ready_cbs);
-	raw_spin_unlock_irq_rcu_node(sdp);
+	spin_unlock_irq_rcu_node(sdp);
 	rhp = rcu_cblist_dequeue(&ready_cbs);
 	for (; rhp != NULL; rhp = rcu_cblist_dequeue(&ready_cbs)) {
 		debug_rcu_head_unqueue(rhp);
@@ -1166,13 +1193,13 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	 * Update counts, accelerate new callbacks, and if needed,
 	 * schedule another round of callback invocation.
 	 */
-	raw_spin_lock_irq_rcu_node(sdp);
+	spin_lock_irq_rcu_node(sdp);
 	rcu_segcblist_insert_count(&sdp->srcu_cblist, &ready_cbs);
 	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
 				       rcu_seq_snap(&sp->srcu_gp_seq));
 	sdp->srcu_cblist_invoking = false;
 	more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
-	raw_spin_unlock_irq_rcu_node(sdp);
+	spin_unlock_irq_rcu_node(sdp);
 	if (more)
 		srcu_schedule_cbs_sdp(sdp, 0);
 }
@@ -1185,7 +1212,7 @@ static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay)
 {
 	bool pushgp = true;
 
-	raw_spin_lock_irq_rcu_node(sp);
+	spin_lock_irq_rcu_node(sp);
 	if (ULONG_CMP_GE(sp->srcu_gp_seq, sp->srcu_gp_seq_needed)) {
 		if (!WARN_ON_ONCE(rcu_seq_state(sp->srcu_gp_seq))) {
 			/* All requests fulfilled, time to go idle. */
@@ -1195,7 +1222,7 @@ static void srcu_reschedule(struct srcu_struct *sp, unsigned long delay)
 		/* Outstanding request and no GP.  Start one. */
 		srcu_gp_start(sp);
 	}
-	raw_spin_unlock_irq_rcu_node(sp);
+	spin_unlock_irq_rcu_node(sp);
 
 	if (pushgp)
 		queue_delayed_work(system_power_efficient_wq, &sp->work, delay);
diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index f9c0ca2..491bdf3 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -265,25 +265,12 @@ void rcu_bh_qs(void)
 #endif
 
 static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
-	.dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
+	.dynticks_nesting = 1,
+	.dynticks_nmi_nesting = DYNTICK_IRQ_NONIDLE,
 	.dynticks = ATOMIC_INIT(RCU_DYNTICK_CTRL_CTR),
 };
 
 /*
- * There's a few places, currently just in the tracing infrastructure,
- * that uses rcu_irq_enter() to make sure RCU is watching. But there's
- * a small location where that will not even work. In those cases
- * rcu_irq_enter_disabled() needs to be checked to make sure rcu_irq_enter()
- * can be called.
- */
-static DEFINE_PER_CPU(bool, disable_rcu_irq_enter);
-
-bool rcu_irq_enter_disabled(void)
-{
-	return this_cpu_read(disable_rcu_irq_enter);
-}
-
-/*
  * Record entry into an extended quiescent state.  This is only to be
  * called when not already in an extended quiescent state.
  */
@@ -762,68 +749,39 @@ cpu_needs_another_gp(struct rcu_state *rsp, struct rcu_data *rdp)
 }
 
 /*
- * rcu_eqs_enter_common - current CPU is entering an extended quiescent state
+ * Enter an RCU extended quiescent state, which can be either the
+ * idle loop or adaptive-tickless usermode execution.
  *
- * Enter idle, doing appropriate accounting.  The caller must have
- * disabled interrupts.
+ * We crowbar the ->dynticks_nmi_nesting field to zero to allow for
+ * the possibility of usermode upcalls having messed up our count
+ * of interrupt nesting level during the prior busy period.
  */
-static void rcu_eqs_enter_common(bool user)
+static void rcu_eqs_enter(bool user)
 {
 	struct rcu_state *rsp;
 	struct rcu_data *rdp;
-	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
+	struct rcu_dynticks *rdtp;
+
+	rdtp = this_cpu_ptr(&rcu_dynticks);
+	WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
+		     rdtp->dynticks_nesting == 0);
+	if (rdtp->dynticks_nesting != 1) {
+		rdtp->dynticks_nesting--;
+		return;
+	}
 
 	lockdep_assert_irqs_disabled();
-	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0);
-	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-	    !user && !is_idle_task(current)) {
-		struct task_struct *idle __maybe_unused =
-			idle_task(smp_processor_id());
-
-		trace_rcu_dyntick(TPS("Error on entry: not idle task"), rdtp->dynticks_nesting, 0);
-		rcu_ftrace_dump(DUMP_ORIG);
-		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
-			  current->pid, current->comm,
-			  idle->pid, idle->comm); /* must be idle task! */
-	}
+	trace_rcu_dyntick(TPS("Start"), rdtp->dynticks_nesting, 0, rdtp->dynticks);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
 	for_each_rcu_flavor(rsp) {
 		rdp = this_cpu_ptr(rsp->rda);
 		do_nocb_deferred_wakeup(rdp);
 	}
 	rcu_prepare_for_idle();
-	__this_cpu_inc(disable_rcu_irq_enter);
-	rdtp->dynticks_nesting = 0; /* Breaks tracing momentarily. */
-	rcu_dynticks_eqs_enter(); /* After this, tracing works again. */
-	__this_cpu_dec(disable_rcu_irq_enter);
+	WRITE_ONCE(rdtp->dynticks_nesting, 0); /* Avoid irq-access tearing. */
+	rcu_dynticks_eqs_enter();
 	rcu_dynticks_task_enter();
-
-	/*
-	 * It is illegal to enter an extended quiescent state while
-	 * in an RCU read-side critical section.
-	 */
-	RCU_LOCKDEP_WARN(lock_is_held(&rcu_lock_map),
-			 "Illegal idle entry in RCU read-side critical section.");
-	RCU_LOCKDEP_WARN(lock_is_held(&rcu_bh_lock_map),
-			 "Illegal idle entry in RCU-bh read-side critical section.");
-	RCU_LOCKDEP_WARN(lock_is_held(&rcu_sched_lock_map),
-			 "Illegal idle entry in RCU-sched read-side critical section.");
-}
-
-/*
- * Enter an RCU extended quiescent state, which can be either the
- * idle loop or adaptive-tickless usermode execution.
- */
-static void rcu_eqs_enter(bool user)
-{
-	struct rcu_dynticks *rdtp;
-
-	rdtp = this_cpu_ptr(&rcu_dynticks);
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     (rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == 0);
-	if ((rdtp->dynticks_nesting & DYNTICK_TASK_NEST_MASK) == DYNTICK_TASK_NEST_VALUE)
-		rcu_eqs_enter_common(user);
-	else
-		rdtp->dynticks_nesting -= DYNTICK_TASK_NEST_VALUE;
 }
 
 /**
@@ -834,10 +792,6 @@ static void rcu_eqs_enter(bool user)
  * critical sections can occur in irq handlers in idle, a possibility
  * handled by irq_enter() and irq_exit().)
  *
- * We crowbar the ->dynticks_nesting field to zero to allow for
- * the possibility of usermode upcalls having messed up our count
- * of interrupt nesting level during the prior busy period.
- *
  * If you add or remove a call to rcu_idle_enter(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
@@ -867,6 +821,46 @@ void rcu_user_enter(void)
 #endif /* CONFIG_NO_HZ_FULL */
 
 /**
+ * rcu_nmi_exit - inform RCU of exit from NMI context
+ *
+ * If we are returning from the outermost NMI handler that interrupted an
+ * RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting
+ * to let the RCU grace-period handling know that the CPU is back to
+ * being RCU-idle.
+ *
+ * If you add or remove a call to rcu_nmi_exit(), be sure to test
+ * with CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_nmi_exit(void)
+{
+	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
+
+	/*
+	 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
+	 * (We are exiting an NMI handler, so RCU better be paying attention
+	 * to us!)
+	 */
+	WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0);
+	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
+
+	/*
+	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
+	 * leave it in non-RCU-idle state.
+	 */
+	if (rdtp->dynticks_nmi_nesting != 1) {
+		trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nmi_nesting, rdtp->dynticks_nmi_nesting - 2, rdtp->dynticks);
+		WRITE_ONCE(rdtp->dynticks_nmi_nesting, /* No store tearing. */
+			   rdtp->dynticks_nmi_nesting - 2);
+		return;
+	}
+
+	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
+	trace_rcu_dyntick(TPS("Startirq"), rdtp->dynticks_nmi_nesting, 0, rdtp->dynticks);
+	WRITE_ONCE(rdtp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
+	rcu_dynticks_eqs_enter();
+}
+
+/**
  * rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
  *
  * Exit from an interrupt handler, which might possibly result in entering
@@ -875,8 +869,8 @@ void rcu_user_enter(void)
  *
  * This code assumes that the idle loop never does anything that might
  * result in unbalanced calls to irq_enter() and irq_exit().  If your
- * architecture violates this assumption, RCU will give you what you
- * deserve, good and hard.  But very infrequently and irreproducibly.
+ * architecture's idle loop violates this assumption, RCU will give you what
+ * you deserve, good and hard.  But very infrequently and irreproducibly.
  *
  * Use things like work queues to work around this limitation.
  *
@@ -887,23 +881,14 @@ void rcu_user_enter(void)
  */
 void rcu_irq_exit(void)
 {
-	struct rcu_dynticks *rdtp;
+	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
 
 	lockdep_assert_irqs_disabled();
-	rdtp = this_cpu_ptr(&rcu_dynticks);
-
-	/* Page faults can happen in NMI handlers, so check... */
-	if (rdtp->dynticks_nmi_nesting)
-		return;
-
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     rdtp->dynticks_nesting < 1);
-	if (rdtp->dynticks_nesting <= 1) {
-		rcu_eqs_enter_common(true);
-	} else {
-		trace_rcu_dyntick(TPS("--="), rdtp->dynticks_nesting, rdtp->dynticks_nesting - 1);
-		rdtp->dynticks_nesting--;
-	}
+	if (rdtp->dynticks_nmi_nesting == 1)
+		rcu_prepare_for_idle();
+	rcu_nmi_exit();
+	if (rdtp->dynticks_nmi_nesting == 0)
+		rcu_dynticks_task_enter();
 }
 
 /*
@@ -922,55 +907,33 @@ void rcu_irq_exit_irqson(void)
 }
 
 /*
- * rcu_eqs_exit_common - current CPU moving away from extended quiescent state
- *
- * If the new value of the ->dynticks_nesting counter was previously zero,
- * we really have exited idle, and must do the appropriate accounting.
- * The caller must have disabled interrupts.
- */
-static void rcu_eqs_exit_common(long long oldval, int user)
-{
-	RCU_TRACE(struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);)
-
-	rcu_dynticks_task_exit();
-	rcu_dynticks_eqs_exit();
-	rcu_cleanup_after_idle();
-	trace_rcu_dyntick(TPS("End"), oldval, rdtp->dynticks_nesting);
-	if (IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-	    !user && !is_idle_task(current)) {
-		struct task_struct *idle __maybe_unused =
-			idle_task(smp_processor_id());
-
-		trace_rcu_dyntick(TPS("Error on exit: not idle task"),
-				  oldval, rdtp->dynticks_nesting);
-		rcu_ftrace_dump(DUMP_ORIG);
-		WARN_ONCE(1, "Current pid: %d comm: %s / Idle pid: %d comm: %s",
-			  current->pid, current->comm,
-			  idle->pid, idle->comm); /* must be idle task! */
-	}
-}
-
-/*
  * Exit an RCU extended quiescent state, which can be either the
  * idle loop or adaptive-tickless usermode execution.
+ *
+ * We crowbar the ->dynticks_nmi_nesting field to DYNTICK_IRQ_NONIDLE to
+ * allow for the possibility of usermode upcalls messing up our count of
+ * interrupt nesting level during the busy period that is just now starting.
  */
 static void rcu_eqs_exit(bool user)
 {
 	struct rcu_dynticks *rdtp;
-	long long oldval;
+	long oldval;
 
 	lockdep_assert_irqs_disabled();
 	rdtp = this_cpu_ptr(&rcu_dynticks);
 	oldval = rdtp->dynticks_nesting;
 	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
-	if (oldval & DYNTICK_TASK_NEST_MASK) {
-		rdtp->dynticks_nesting += DYNTICK_TASK_NEST_VALUE;
-	} else {
-		__this_cpu_inc(disable_rcu_irq_enter);
-		rdtp->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
-		rcu_eqs_exit_common(oldval, user);
-		__this_cpu_dec(disable_rcu_irq_enter);
+	if (oldval) {
+		rdtp->dynticks_nesting++;
+		return;
 	}
+	rcu_dynticks_task_exit();
+	rcu_dynticks_eqs_exit();
+	rcu_cleanup_after_idle();
+	trace_rcu_dyntick(TPS("End"), rdtp->dynticks_nesting, 1, rdtp->dynticks);
+	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
+	WRITE_ONCE(rdtp->dynticks_nesting, 1);
+	WRITE_ONCE(rdtp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
 }
 
 /**
@@ -979,11 +942,6 @@ static void rcu_eqs_exit(bool user)
  * Exit idle mode, in other words, -enter- the mode in which RCU
  * read-side critical sections can occur.
  *
- * We crowbar the ->dynticks_nesting field to DYNTICK_TASK_NEST to
- * allow for the possibility of usermode upcalls messing up our count
- * of interrupt nesting level during the busy period that is just
- * now starting.
- *
  * If you add or remove a call to rcu_idle_exit(), be sure to test with
  * CONFIG_RCU_EQS_DEBUG=y.
  */
@@ -1013,65 +971,6 @@ void rcu_user_exit(void)
 #endif /* CONFIG_NO_HZ_FULL */
 
 /**
- * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
- *
- * Enter an interrupt handler, which might possibly result in exiting
- * idle mode, in other words, entering the mode in which read-side critical
- * sections can occur.  The caller must have disabled interrupts.
- *
- * Note that the Linux kernel is fully capable of entering an interrupt
- * handler that it never exits, for example when doing upcalls to
- * user mode!  This code assumes that the idle loop never does upcalls to
- * user mode.  If your architecture does do upcalls from the idle loop (or
- * does anything else that results in unbalanced calls to the irq_enter()
- * and irq_exit() functions), RCU will give you what you deserve, good
- * and hard.  But very infrequently and irreproducibly.
- *
- * Use things like work queues to work around this limitation.
- *
- * You have been warned.
- *
- * If you add or remove a call to rcu_irq_enter(), be sure to test with
- * CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_irq_enter(void)
-{
-	struct rcu_dynticks *rdtp;
-	long long oldval;
-
-	lockdep_assert_irqs_disabled();
-	rdtp = this_cpu_ptr(&rcu_dynticks);
-
-	/* Page faults can happen in NMI handlers, so check... */
-	if (rdtp->dynticks_nmi_nesting)
-		return;
-
-	oldval = rdtp->dynticks_nesting;
-	rdtp->dynticks_nesting++;
-	WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
-		     rdtp->dynticks_nesting == 0);
-	if (oldval)
-		trace_rcu_dyntick(TPS("++="), oldval, rdtp->dynticks_nesting);
-	else
-		rcu_eqs_exit_common(oldval, true);
-}
-
-/*
- * Wrapper for rcu_irq_enter() where interrupts are enabled.
- *
- * If you add or remove a call to rcu_irq_enter_irqson(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_irq_enter_irqson(void)
-{
-	unsigned long flags;
-
-	local_irq_save(flags);
-	rcu_irq_enter();
-	local_irq_restore(flags);
-}
-
-/**
  * rcu_nmi_enter - inform RCU of entry to NMI context
  *
  * If the CPU was idle from RCU's viewpoint, update rdtp->dynticks and
@@ -1086,7 +985,7 @@ void rcu_irq_enter_irqson(void)
 void rcu_nmi_enter(void)
 {
 	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
-	int incby = 2;
+	long incby = 2;
 
 	/* Complain about underflow. */
 	WARN_ON_ONCE(rdtp->dynticks_nmi_nesting < 0);
@@ -1103,45 +1002,61 @@ void rcu_nmi_enter(void)
 		rcu_dynticks_eqs_exit();
 		incby = 1;
 	}
-	rdtp->dynticks_nmi_nesting += incby;
+	trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
+			  rdtp->dynticks_nmi_nesting,
+			  rdtp->dynticks_nmi_nesting + incby, rdtp->dynticks);
+	WRITE_ONCE(rdtp->dynticks_nmi_nesting, /* Prevent store tearing. */
+		   rdtp->dynticks_nmi_nesting + incby);
 	barrier();
 }
 
 /**
- * rcu_nmi_exit - inform RCU of exit from NMI context
+ * rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
  *
- * If we are returning from the outermost NMI handler that interrupted an
- * RCU-idle period, update rdtp->dynticks and rdtp->dynticks_nmi_nesting
- * to let the RCU grace-period handling know that the CPU is back to
- * being RCU-idle.
+ * Enter an interrupt handler, which might possibly result in exiting
+ * idle mode, in other words, entering the mode in which read-side critical
+ * sections can occur.  The caller must have disabled interrupts.
  *
- * If you add or remove a call to rcu_nmi_exit(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
+ * Note that the Linux kernel is fully capable of entering an interrupt
+ * handler that it never exits, for example when doing upcalls to user mode!
+ * This code assumes that the idle loop never does upcalls to user mode.
+ * If your architecture's idle loop does do upcalls to user mode (or does
+ * anything else that results in unbalanced calls to the irq_enter() and
+ * irq_exit() functions), RCU will give you what you deserve, good and hard.
+ * But very infrequently and irreproducibly.
+ *
+ * Use things like work queues to work around this limitation.
+ *
+ * You have been warned.
+ *
+ * If you add or remove a call to rcu_irq_enter(), be sure to test with
+ * CONFIG_RCU_EQS_DEBUG=y.
  */
-void rcu_nmi_exit(void)
+void rcu_irq_enter(void)
 {
 	struct rcu_dynticks *rdtp = this_cpu_ptr(&rcu_dynticks);
 
-	/*
-	 * Check for ->dynticks_nmi_nesting underflow and bad ->dynticks.
-	 * (We are exiting an NMI handler, so RCU better be paying attention
-	 * to us!)
-	 */
-	WARN_ON_ONCE(rdtp->dynticks_nmi_nesting <= 0);
-	WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs());
+	lockdep_assert_irqs_disabled();
+	if (rdtp->dynticks_nmi_nesting == 0)
+		rcu_dynticks_task_exit();
+	rcu_nmi_enter();
+	if (rdtp->dynticks_nmi_nesting == 1)
+		rcu_cleanup_after_idle();
+}
 
-	/*
-	 * If the nesting level is not 1, the CPU wasn't RCU-idle, so
-	 * leave it in non-RCU-idle state.
-	 */
-	if (rdtp->dynticks_nmi_nesting != 1) {
-		rdtp->dynticks_nmi_nesting -= 2;
-		return;
-	}
+/*
+ * Wrapper for rcu_irq_enter() where interrupts are enabled.
+ *
+ * If you add or remove a call to rcu_irq_enter_irqson(), be sure to test
+ * with CONFIG_RCU_EQS_DEBUG=y.
+ */
+void rcu_irq_enter_irqson(void)
+{
+	unsigned long flags;
 
-	/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
-	rdtp->dynticks_nmi_nesting = 0;
-	rcu_dynticks_eqs_enter();
+	local_irq_save(flags);
+	rcu_irq_enter();
+	local_irq_restore(flags);
 }
 
 /**
@@ -1233,7 +1148,8 @@ EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
  */
 static int rcu_is_cpu_rrupt_from_idle(void)
 {
-	return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 1;
+	return __this_cpu_read(rcu_dynticks.dynticks_nesting) <= 0 &&
+	       __this_cpu_read(rcu_dynticks.dynticks_nmi_nesting) <= 1;
 }
 
 /*
@@ -2789,6 +2705,11 @@ static void rcu_do_batch(struct rcu_state *rsp, struct rcu_data *rdp)
 		rdp->n_force_qs_snap = rsp->n_force_qs;
 	} else if (count < rdp->qlen_last_fqs_check - qhimark)
 		rdp->qlen_last_fqs_check = count;
+
+	/*
+	 * The following usually indicates a double call_rcu().  To track
+	 * this down, try building with CONFIG_DEBUG_OBJECTS_RCU_HEAD=y.
+	 */
 	WARN_ON_ONCE(rcu_segcblist_empty(&rdp->cblist) != (count == 0));
 
 	local_irq_restore(flags);
@@ -3723,7 +3644,7 @@ rcu_boot_init_percpu_data(int cpu, struct rcu_state *rsp)
 	raw_spin_lock_irqsave_rcu_node(rnp, flags);
 	rdp->grpmask = leaf_node_cpu_bit(rdp->mynode, cpu);
 	rdp->dynticks = &per_cpu(rcu_dynticks, cpu);
-	WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != DYNTICK_TASK_EXIT_IDLE);
+	WARN_ON_ONCE(rdp->dynticks->dynticks_nesting != 1);
 	WARN_ON_ONCE(rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp->dynticks)));
 	rdp->cpu = cpu;
 	rdp->rsp = rsp;
@@ -3752,7 +3673,7 @@ rcu_init_percpu_data(int cpu, struct rcu_state *rsp)
 	if (rcu_segcblist_empty(&rdp->cblist) && /* No early-boot CBs? */
 	    !init_nocb_callback_list(rdp))
 		rcu_segcblist_init(&rdp->cblist);  /* Re-enable callbacks. */
-	rdp->dynticks->dynticks_nesting = DYNTICK_TASK_EXIT_IDLE;
+	rdp->dynticks->dynticks_nesting = 1;	/* CPU not up, no tearing. */
 	rcu_dynticks_eqs_online();
 	raw_spin_unlock_rcu_node(rnp);		/* irqs remain disabled. */
 
diff --git a/kernel/rcu/tree.h b/kernel/rcu/tree.h
index 46a5d19..6488a3b 100644
--- a/kernel/rcu/tree.h
+++ b/kernel/rcu/tree.h
@@ -38,9 +38,8 @@
  * Dynticks per-CPU state.
  */
 struct rcu_dynticks {
-	long long dynticks_nesting; /* Track irq/process nesting level. */
-				    /* Process level is worth LLONG_MAX/2. */
-	int dynticks_nmi_nesting;   /* Track NMI nesting level. */
+	long dynticks_nesting;      /* Track process nesting level. */
+	long dynticks_nmi_nesting;  /* Track irq/NMI nesting level. */
 	atomic_t dynticks;	    /* Even value for idle, else odd. */
 	bool rcu_need_heavy_qs;     /* GP old, need heavy quiescent state. */
 	unsigned long rcu_qs_ctr;   /* Light universal quiescent state ctr. */
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index db85ca3..fb88a02 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -61,7 +61,6 @@ DEFINE_PER_CPU(char, rcu_cpu_has_work);
 
 #ifdef CONFIG_RCU_NOCB_CPU
 static cpumask_var_t rcu_nocb_mask; /* CPUs to have callbacks offloaded. */
-static bool have_rcu_nocb_mask;	    /* Was rcu_nocb_mask allocated? */
 static bool __read_mostly rcu_nocb_poll;    /* Offload kthread are to poll. */
 #endif /* #ifdef CONFIG_RCU_NOCB_CPU */
 
@@ -1687,7 +1686,7 @@ static void print_cpu_stall_info(struct rcu_state *rsp, int cpu)
 	}
 	print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
 	delta = rdp->mynode->gpnum - rdp->rcu_iw_gpnum;
-	pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%llx/%d softirq=%u/%u fqs=%ld %s\n",
+	pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%ld softirq=%u/%u fqs=%ld %s\n",
 	       cpu,
 	       "O."[!!cpu_online(cpu)],
 	       "o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
@@ -1752,7 +1751,6 @@ static void increment_cpu_stall_ticks(void)
 static int __init rcu_nocb_setup(char *str)
 {
 	alloc_bootmem_cpumask_var(&rcu_nocb_mask);
-	have_rcu_nocb_mask = true;
 	cpulist_parse(str, rcu_nocb_mask);
 	return 1;
 }
@@ -1801,7 +1799,7 @@ static void rcu_init_one_nocb(struct rcu_node *rnp)
 /* Is the specified CPU a no-CBs CPU? */
 bool rcu_is_nocb_cpu(int cpu)
 {
-	if (have_rcu_nocb_mask)
+	if (cpumask_available(rcu_nocb_mask))
 		return cpumask_test_cpu(cpu, rcu_nocb_mask);
 	return false;
 }
@@ -2295,14 +2293,13 @@ void __init rcu_init_nohz(void)
 		need_rcu_nocb_mask = true;
 #endif /* #if defined(CONFIG_NO_HZ_FULL) */
 
-	if (!have_rcu_nocb_mask && need_rcu_nocb_mask) {
+	if (!cpumask_available(rcu_nocb_mask) && need_rcu_nocb_mask) {
 		if (!zalloc_cpumask_var(&rcu_nocb_mask, GFP_KERNEL)) {
 			pr_info("rcu_nocb_mask allocation failed, callback offloading disabled.\n");
 			return;
 		}
-		have_rcu_nocb_mask = true;
 	}
-	if (!have_rcu_nocb_mask)
+	if (!cpumask_available(rcu_nocb_mask))
 		return;
 
 #if defined(CONFIG_NO_HZ_FULL)
@@ -2428,7 +2425,7 @@ static void __init rcu_organize_nocb_kthreads(struct rcu_state *rsp)
 	struct rcu_data *rdp_leader = NULL;  /* Suppress misguided gcc warn. */
 	struct rcu_data *rdp_prev = NULL;
 
-	if (!have_rcu_nocb_mask)
+	if (!cpumask_available(rcu_nocb_mask))
 		return;
 	if (ls == -1) {
 		ls = int_sqrt(nr_cpu_ids);
diff --git a/kernel/sched/completion.c b/kernel/sched/completion.c
index 2ddaec4..0926aef 100644
--- a/kernel/sched/completion.c
+++ b/kernel/sched/completion.c
@@ -34,11 +34,6 @@ void complete(struct completion *x)
 
 	spin_lock_irqsave(&x->wait.lock, flags);
 
-	/*
-	 * Perform commit of crossrelease here.
-	 */
-	complete_release_commit(x);
-
 	if (x->done != UINT_MAX)
 		x->done++;
 	__wake_up_locked(&x->wait, TASK_NORMAL, 1);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 644fa2e..3da7a24 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -508,7 +508,8 @@ void resched_cpu(int cpu)
 	unsigned long flags;
 
 	raw_spin_lock_irqsave(&rq->lock, flags);
-	resched_curr(rq);
+	if (cpu_online(cpu) || cpu == smp_processor_id())
+		resched_curr(rq);
 	raw_spin_unlock_irqrestore(&rq->lock, flags);
 }
 
@@ -2045,7 +2046,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	 * If the owning (remote) CPU is still in the middle of schedule() with
 	 * this task as prev, wait until its done referencing the task.
 	 *
-	 * Pairs with the smp_store_release() in finish_lock_switch().
+	 * Pairs with the smp_store_release() in finish_task().
 	 *
 	 * This ensures that tasks getting woken will be fully ordered against
 	 * their previous state and preserve Program Order.
@@ -2056,7 +2057,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 	p->state = TASK_WAKING;
 
 	if (p->in_iowait) {
-		delayacct_blkio_end();
+		delayacct_blkio_end(p);
 		atomic_dec(&task_rq(p)->nr_iowait);
 	}
 
@@ -2069,7 +2070,7 @@ try_to_wake_up(struct task_struct *p, unsigned int state, int wake_flags)
 #else /* CONFIG_SMP */
 
 	if (p->in_iowait) {
-		delayacct_blkio_end();
+		delayacct_blkio_end(p);
 		atomic_dec(&task_rq(p)->nr_iowait);
 	}
 
@@ -2122,7 +2123,7 @@ static void try_to_wake_up_local(struct task_struct *p, struct rq_flags *rf)
 
 	if (!task_on_rq_queued(p)) {
 		if (p->in_iowait) {
-			delayacct_blkio_end();
+			delayacct_blkio_end(p);
 			atomic_dec(&rq->nr_iowait);
 		}
 		ttwu_activate(rq, p, ENQUEUE_WAKEUP | ENQUEUE_NOCLOCK);
@@ -2571,6 +2572,50 @@ fire_sched_out_preempt_notifiers(struct task_struct *curr,
 
 #endif /* CONFIG_PREEMPT_NOTIFIERS */
 
+static inline void prepare_task(struct task_struct *next)
+{
+#ifdef CONFIG_SMP
+	/*
+	 * Claim the task as running, we do this before switching to it
+	 * such that any running task will have this set.
+	 */
+	next->on_cpu = 1;
+#endif
+}
+
+static inline void finish_task(struct task_struct *prev)
+{
+#ifdef CONFIG_SMP
+	/*
+	 * After ->on_cpu is cleared, the task can be moved to a different CPU.
+	 * We must ensure this doesn't happen until the switch is completely
+	 * finished.
+	 *
+	 * In particular, the load of prev->state in finish_task_switch() must
+	 * happen before this.
+	 *
+	 * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
+	 */
+	smp_store_release(&prev->on_cpu, 0);
+#endif
+}
+
+static inline void finish_lock_switch(struct rq *rq)
+{
+#ifdef CONFIG_DEBUG_SPINLOCK
+	/* this is a valid case when another task releases the spinlock */
+	rq->lock.owner = current;
+#endif
+	/*
+	 * If we are tracking spinlock dependencies then we have to
+	 * fix up the runqueue lock - which gets 'carried over' from
+	 * prev into current:
+	 */
+	spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
+
+	raw_spin_unlock_irq(&rq->lock);
+}
+
 /**
  * prepare_task_switch - prepare to switch tasks
  * @rq: the runqueue preparing to switch
@@ -2591,7 +2636,7 @@ prepare_task_switch(struct rq *rq, struct task_struct *prev,
 	sched_info_switch(rq, prev, next);
 	perf_event_task_sched_out(prev, next);
 	fire_sched_out_preempt_notifiers(prev, next);
-	prepare_lock_switch(rq, next);
+	prepare_task(next);
 	prepare_arch_switch(next);
 }
 
@@ -2646,7 +2691,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	 * the scheduled task must drop that reference.
 	 *
 	 * We must observe prev->state before clearing prev->on_cpu (in
-	 * finish_lock_switch), otherwise a concurrent wakeup can get prev
+	 * finish_task), otherwise a concurrent wakeup can get prev
 	 * running on another CPU and we could rave with its RUNNING -> DEAD
 	 * transition, resulting in a double drop.
 	 */
@@ -2663,7 +2708,8 @@ static struct rq *finish_task_switch(struct task_struct *prev)
 	 * to use.
 	 */
 	smp_mb__after_unlock_lock();
-	finish_lock_switch(rq, prev);
+	finish_task(prev);
+	finish_lock_switch(rq);
 	finish_arch_post_lock_switch();
 
 	fire_sched_in_preempt_notifiers(current);
@@ -4040,8 +4086,7 @@ static int __sched_setscheduler(struct task_struct *p,
 			return -EINVAL;
 	}
 
-	if (attr->sched_flags &
-		~(SCHED_FLAG_RESET_ON_FORK | SCHED_FLAG_RECLAIM))
+	if (attr->sched_flags & ~(SCHED_FLAG_ALL | SCHED_FLAG_SUGOV))
 		return -EINVAL;
 
 	/*
@@ -4108,6 +4153,9 @@ static int __sched_setscheduler(struct task_struct *p,
 	}
 
 	if (user) {
+		if (attr->sched_flags & SCHED_FLAG_SUGOV)
+			return -EINVAL;
+
 		retval = security_task_setscheduler(p);
 		if (retval)
 			return retval;
@@ -4163,7 +4211,8 @@ static int __sched_setscheduler(struct task_struct *p,
 		}
 #endif
 #ifdef CONFIG_SMP
-		if (dl_bandwidth_enabled() && dl_policy(policy)) {
+		if (dl_bandwidth_enabled() && dl_policy(policy) &&
+				!(attr->sched_flags & SCHED_FLAG_SUGOV)) {
 			cpumask_t *span = rq->rd->span;
 
 			/*
@@ -4293,6 +4342,11 @@ int sched_setattr(struct task_struct *p, const struct sched_attr *attr)
 }
 EXPORT_SYMBOL_GPL(sched_setattr);
 
+int sched_setattr_nocheck(struct task_struct *p, const struct sched_attr *attr)
+{
+	return __sched_setscheduler(p, attr, false, true);
+}
+
 /**
  * sched_setscheduler_nocheck - change the scheduling policy and/or RT priority of a thread from kernelspace.
  * @p: the task in question.
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index d6717a3..dd062a1 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -60,7 +60,8 @@ struct sugov_cpu {
 	u64 last_update;
 
 	/* The fields below are only needed when sharing a policy. */
-	unsigned long util;
+	unsigned long util_cfs;
+	unsigned long util_dl;
 	unsigned long max;
 	unsigned int flags;
 
@@ -176,21 +177,28 @@ static unsigned int get_next_freq(struct sugov_policy *sg_policy,
 	return cpufreq_driver_resolve_freq(policy, freq);
 }
 
-static void sugov_get_util(unsigned long *util, unsigned long *max, int cpu)
+static void sugov_get_util(struct sugov_cpu *sg_cpu)
 {
-	struct rq *rq = cpu_rq(cpu);
-	unsigned long cfs_max;
+	struct rq *rq = cpu_rq(sg_cpu->cpu);
 
-	cfs_max = arch_scale_cpu_capacity(NULL, cpu);
-
-	*util = min(rq->cfs.avg.util_avg, cfs_max);
-	*max = cfs_max;
+	sg_cpu->max = arch_scale_cpu_capacity(NULL, sg_cpu->cpu);
+	sg_cpu->util_cfs = cpu_util_cfs(rq);
+	sg_cpu->util_dl  = cpu_util_dl(rq);
 }
 
-static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time,
-				   unsigned int flags)
+static unsigned long sugov_aggregate_util(struct sugov_cpu *sg_cpu)
 {
-	if (flags & SCHED_CPUFREQ_IOWAIT) {
+	/*
+	 * Ideally we would like to set util_dl as min/guaranteed freq and
+	 * util_cfs + util_dl as requested freq. However, cpufreq is not yet
+	 * ready for such an interface. So, we only do the latter for now.
+	 */
+	return min(sg_cpu->util_cfs + sg_cpu->util_dl, sg_cpu->max);
+}
+
+static void sugov_set_iowait_boost(struct sugov_cpu *sg_cpu, u64 time)
+{
+	if (sg_cpu->flags & SCHED_CPUFREQ_IOWAIT) {
 		if (sg_cpu->iowait_boost_pending)
 			return;
 
@@ -264,7 +272,7 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 	unsigned int next_f;
 	bool busy;
 
-	sugov_set_iowait_boost(sg_cpu, time, flags);
+	sugov_set_iowait_boost(sg_cpu, time);
 	sg_cpu->last_update = time;
 
 	if (!sugov_should_update_freq(sg_policy, time))
@@ -272,10 +280,12 @@ static void sugov_update_single(struct update_util_data *hook, u64 time,
 
 	busy = sugov_cpu_is_busy(sg_cpu);
 
-	if (flags & SCHED_CPUFREQ_RT_DL) {
+	if (flags & SCHED_CPUFREQ_RT) {
 		next_f = policy->cpuinfo.max_freq;
 	} else {
-		sugov_get_util(&util, &max, sg_cpu->cpu);
+		sugov_get_util(sg_cpu);
+		max = sg_cpu->max;
+		util = sugov_aggregate_util(sg_cpu);
 		sugov_iowait_boost(sg_cpu, &util, &max);
 		next_f = get_next_freq(sg_policy, util, max);
 		/*
@@ -305,23 +315,27 @@ static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
 		s64 delta_ns;
 
 		/*
-		 * If the CPU utilization was last updated before the previous
-		 * frequency update and the time elapsed between the last update
-		 * of the CPU utilization and the last frequency update is long
-		 * enough, don't take the CPU into account as it probably is
-		 * idle now (and clear iowait_boost for it).
+		 * If the CFS CPU utilization was last updated before the
+		 * previous frequency update and the time elapsed between the
+		 * last update of the CPU utilization and the last frequency
+		 * update is long enough, reset iowait_boost and util_cfs, as
+		 * they are now probably stale. However, still consider the
+		 * CPU contribution if it has some DEADLINE utilization
+		 * (util_dl).
 		 */
 		delta_ns = time - j_sg_cpu->last_update;
 		if (delta_ns > TICK_NSEC) {
 			j_sg_cpu->iowait_boost = 0;
 			j_sg_cpu->iowait_boost_pending = false;
-			continue;
+			j_sg_cpu->util_cfs = 0;
+			if (j_sg_cpu->util_dl == 0)
+				continue;
 		}
-		if (j_sg_cpu->flags & SCHED_CPUFREQ_RT_DL)
+		if (j_sg_cpu->flags & SCHED_CPUFREQ_RT)
 			return policy->cpuinfo.max_freq;
 
-		j_util = j_sg_cpu->util;
 		j_max = j_sg_cpu->max;
+		j_util = sugov_aggregate_util(j_sg_cpu);
 		if (j_util * max > j_max * util) {
 			util = j_util;
 			max = j_max;
@@ -338,22 +352,18 @@ static void sugov_update_shared(struct update_util_data *hook, u64 time,
 {
 	struct sugov_cpu *sg_cpu = container_of(hook, struct sugov_cpu, update_util);
 	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
-	unsigned long util, max;
 	unsigned int next_f;
 
-	sugov_get_util(&util, &max, sg_cpu->cpu);
-
 	raw_spin_lock(&sg_policy->update_lock);
 
-	sg_cpu->util = util;
-	sg_cpu->max = max;
+	sugov_get_util(sg_cpu);
 	sg_cpu->flags = flags;
 
-	sugov_set_iowait_boost(sg_cpu, time, flags);
+	sugov_set_iowait_boost(sg_cpu, time);
 	sg_cpu->last_update = time;
 
 	if (sugov_should_update_freq(sg_policy, time)) {
-		if (flags & SCHED_CPUFREQ_RT_DL)
+		if (flags & SCHED_CPUFREQ_RT)
 			next_f = sg_policy->policy->cpuinfo.max_freq;
 		else
 			next_f = sugov_next_freq_shared(sg_cpu, time);
@@ -383,9 +393,9 @@ static void sugov_irq_work(struct irq_work *irq_work)
 	sg_policy = container_of(irq_work, struct sugov_policy, irq_work);
 
 	/*
-	 * For RT and deadline tasks, the schedutil governor shoots the
-	 * frequency to maximum. Special care must be taken to ensure that this
-	 * kthread doesn't result in the same behavior.
+	 * For RT tasks, the schedutil governor shoots the frequency to maximum.
+	 * Special care must be taken to ensure that this kthread doesn't result
+	 * in the same behavior.
 	 *
 	 * This is (mostly) guaranteed by the work_in_progress flag. The flag is
 	 * updated only at the end of the sugov_work() function and before that
@@ -470,7 +480,20 @@ static void sugov_policy_free(struct sugov_policy *sg_policy)
 static int sugov_kthread_create(struct sugov_policy *sg_policy)
 {
 	struct task_struct *thread;
-	struct sched_param param = { .sched_priority = MAX_USER_RT_PRIO / 2 };
+	struct sched_attr attr = {
+		.size = sizeof(struct sched_attr),
+		.sched_policy = SCHED_DEADLINE,
+		.sched_flags = SCHED_FLAG_SUGOV,
+		.sched_nice = 0,
+		.sched_priority = 0,
+		/*
+		 * Fake (unused) bandwidth; workaround to "fix"
+		 * priority inheritance.
+		 */
+		.sched_runtime	=  1000000,
+		.sched_deadline = 10000000,
+		.sched_period	= 10000000,
+	};
 	struct cpufreq_policy *policy = sg_policy->policy;
 	int ret;
 
@@ -488,10 +511,10 @@ static int sugov_kthread_create(struct sugov_policy *sg_policy)
 		return PTR_ERR(thread);
 	}
 
-	ret = sched_setscheduler_nocheck(thread, SCHED_FIFO, &param);
+	ret = sched_setattr_nocheck(thread, &attr);
 	if (ret) {
 		kthread_stop(thread);
-		pr_warn("%s: failed to set SCHED_FIFO\n", __func__);
+		pr_warn("%s: failed to set SCHED_DEADLINE\n", __func__);
 		return ret;
 	}
 
@@ -655,7 +678,7 @@ static int sugov_start(struct cpufreq_policy *policy)
 		memset(sg_cpu, 0, sizeof(*sg_cpu));
 		sg_cpu->cpu = cpu;
 		sg_cpu->sg_policy = sg_policy;
-		sg_cpu->flags = SCHED_CPUFREQ_RT;
+		sg_cpu->flags = 0;
 		sg_cpu->iowait_boost_max = policy->cpuinfo.max_freq;
 	}
 
diff --git a/kernel/sched/deadline.c b/kernel/sched/deadline.c
index 2473736..9bb0e0c 100644
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -78,7 +78,7 @@ static inline int dl_bw_cpus(int i)
 #endif
 
 static inline
-void add_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
+void __add_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
 {
 	u64 old = dl_rq->running_bw;
 
@@ -86,10 +86,12 @@ void add_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
 	dl_rq->running_bw += dl_bw;
 	SCHED_WARN_ON(dl_rq->running_bw < old); /* overflow */
 	SCHED_WARN_ON(dl_rq->running_bw > dl_rq->this_bw);
+	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
+	cpufreq_update_util(rq_of_dl_rq(dl_rq), SCHED_CPUFREQ_DL);
 }
 
 static inline
-void sub_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
+void __sub_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
 {
 	u64 old = dl_rq->running_bw;
 
@@ -98,10 +100,12 @@ void sub_running_bw(u64 dl_bw, struct dl_rq *dl_rq)
 	SCHED_WARN_ON(dl_rq->running_bw > old); /* underflow */
 	if (dl_rq->running_bw > old)
 		dl_rq->running_bw = 0;
+	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
+	cpufreq_update_util(rq_of_dl_rq(dl_rq), SCHED_CPUFREQ_DL);
 }
 
 static inline
-void add_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
+void __add_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
 {
 	u64 old = dl_rq->this_bw;
 
@@ -111,7 +115,7 @@ void add_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
 }
 
 static inline
-void sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
+void __sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
 {
 	u64 old = dl_rq->this_bw;
 
@@ -123,16 +127,46 @@ void sub_rq_bw(u64 dl_bw, struct dl_rq *dl_rq)
 	SCHED_WARN_ON(dl_rq->running_bw > dl_rq->this_bw);
 }
 
+static inline
+void add_rq_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
+{
+	if (!dl_entity_is_special(dl_se))
+		__add_rq_bw(dl_se->dl_bw, dl_rq);
+}
+
+static inline
+void sub_rq_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
+{
+	if (!dl_entity_is_special(dl_se))
+		__sub_rq_bw(dl_se->dl_bw, dl_rq);
+}
+
+static inline
+void add_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
+{
+	if (!dl_entity_is_special(dl_se))
+		__add_running_bw(dl_se->dl_bw, dl_rq);
+}
+
+static inline
+void sub_running_bw(struct sched_dl_entity *dl_se, struct dl_rq *dl_rq)
+{
+	if (!dl_entity_is_special(dl_se))
+		__sub_running_bw(dl_se->dl_bw, dl_rq);
+}
+
 void dl_change_utilization(struct task_struct *p, u64 new_bw)
 {
 	struct rq *rq;
 
+	BUG_ON(p->dl.flags & SCHED_FLAG_SUGOV);
+
 	if (task_on_rq_queued(p))
 		return;
 
 	rq = task_rq(p);
 	if (p->dl.dl_non_contending) {
-		sub_running_bw(p->dl.dl_bw, &rq->dl);
+		sub_running_bw(&p->dl, &rq->dl);
 		p->dl.dl_non_contending = 0;
 		/*
 		 * If the timer handler is currently running and the
@@ -144,8 +178,8 @@ void dl_change_utilization(struct task_struct *p, u64 new_bw)
 		if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
 			put_task_struct(p);
 	}
-	sub_rq_bw(p->dl.dl_bw, &rq->dl);
-	add_rq_bw(new_bw, &rq->dl);
+	__sub_rq_bw(p->dl.dl_bw, &rq->dl);
+	__add_rq_bw(new_bw, &rq->dl);
 }
 
 /*
@@ -217,6 +251,9 @@ static void task_non_contending(struct task_struct *p)
 	if (dl_se->dl_runtime == 0)
 		return;
 
+	if (dl_entity_is_special(dl_se))
+		return;
+
 	WARN_ON(hrtimer_active(&dl_se->inactive_timer));
 	WARN_ON(dl_se->dl_non_contending);
 
@@ -236,12 +273,12 @@ static void task_non_contending(struct task_struct *p)
 	 */
 	if (zerolag_time < 0) {
 		if (dl_task(p))
-			sub_running_bw(dl_se->dl_bw, dl_rq);
+			sub_running_bw(dl_se, dl_rq);
 		if (!dl_task(p) || p->state == TASK_DEAD) {
 			struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
 
 			if (p->state == TASK_DEAD)
-				sub_rq_bw(p->dl.dl_bw, &rq->dl);
+				sub_rq_bw(&p->dl, &rq->dl);
 			raw_spin_lock(&dl_b->lock);
 			__dl_sub(dl_b, p->dl.dl_bw, dl_bw_cpus(task_cpu(p)));
 			__dl_clear_params(p);
@@ -268,7 +305,7 @@ static void task_contending(struct sched_dl_entity *dl_se, int flags)
 		return;
 
 	if (flags & ENQUEUE_MIGRATED)
-		add_rq_bw(dl_se->dl_bw, dl_rq);
+		add_rq_bw(dl_se, dl_rq);
 
 	if (dl_se->dl_non_contending) {
 		dl_se->dl_non_contending = 0;
@@ -289,7 +326,7 @@ static void task_contending(struct sched_dl_entity *dl_se, int flags)
 		 * when the "inactive timer" fired).
 		 * So, add it back.
 		 */
-		add_running_bw(dl_se->dl_bw, dl_rq);
+		add_running_bw(dl_se, dl_rq);
 	}
 }
 
@@ -1114,7 +1151,8 @@ static void update_curr_dl(struct rq *rq)
 {
 	struct task_struct *curr = rq->curr;
 	struct sched_dl_entity *dl_se = &curr->dl;
-	u64 delta_exec;
+	u64 delta_exec, scaled_delta_exec;
+	int cpu = cpu_of(rq);
 
 	if (!dl_task(curr) || !on_dl_rq(dl_se))
 		return;
@@ -1134,9 +1172,6 @@ static void update_curr_dl(struct rq *rq)
 		return;
 	}
 
-	/* kick cpufreq (see the comment in kernel/sched/sched.h). */
-	cpufreq_update_util(rq, SCHED_CPUFREQ_DL);
-
 	schedstat_set(curr->se.statistics.exec_max,
 		      max(curr->se.statistics.exec_max, delta_exec));
 
@@ -1148,13 +1183,39 @@ static void update_curr_dl(struct rq *rq)
 
 	sched_rt_avg_update(rq, delta_exec);
 
-	if (unlikely(dl_se->flags & SCHED_FLAG_RECLAIM))
-		delta_exec = grub_reclaim(delta_exec, rq, &curr->dl);
-	dl_se->runtime -= delta_exec;
+	if (dl_entity_is_special(dl_se))
+		return;
+
+	/*
+	 * For tasks that participate in GRUB, we implement GRUB-PA: the
+	 * spare reclaimed bandwidth is used to clock down frequency.
+	 *
+	 * For the others, we still need to scale reservation parameters
+	 * according to current frequency and CPU maximum capacity.
+	 */
+	if (unlikely(dl_se->flags & SCHED_FLAG_RECLAIM)) {
+		scaled_delta_exec = grub_reclaim(delta_exec,
+						 rq,
+						 &curr->dl);
+	} else {
+		unsigned long scale_freq = arch_scale_freq_capacity(cpu);
+		unsigned long scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
+
+		scaled_delta_exec = cap_scale(delta_exec, scale_freq);
+		scaled_delta_exec = cap_scale(scaled_delta_exec, scale_cpu);
+	}
+
+	dl_se->runtime -= scaled_delta_exec;
 
 throttle:
 	if (dl_runtime_exceeded(dl_se) || dl_se->dl_yielded) {
 		dl_se->dl_throttled = 1;
+
+		/* If requested, inform the user about runtime overruns. */
+		if (dl_runtime_exceeded(dl_se) &&
+		    (dl_se->flags & SCHED_FLAG_DL_OVERRUN))
+			dl_se->dl_overrun = 1;
+
 		__dequeue_task_dl(rq, curr, 0);
 		if (unlikely(dl_se->dl_boosted || !start_dl_timer(curr)))
 			enqueue_task_dl(rq, curr, ENQUEUE_REPLENISH);
@@ -1204,8 +1265,8 @@ static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
 		struct dl_bw *dl_b = dl_bw_of(task_cpu(p));
 
 		if (p->state == TASK_DEAD && dl_se->dl_non_contending) {
-			sub_running_bw(p->dl.dl_bw, dl_rq_of_se(&p->dl));
-			sub_rq_bw(p->dl.dl_bw, dl_rq_of_se(&p->dl));
+			sub_running_bw(&p->dl, dl_rq_of_se(&p->dl));
+			sub_rq_bw(&p->dl, dl_rq_of_se(&p->dl));
 			dl_se->dl_non_contending = 0;
 		}
 
@@ -1222,7 +1283,7 @@ static enum hrtimer_restart inactive_task_timer(struct hrtimer *timer)
 	sched_clock_tick();
 	update_rq_clock(rq);
 
-	sub_running_bw(dl_se->dl_bw, &rq->dl);
+	sub_running_bw(dl_se, &rq->dl);
 	dl_se->dl_non_contending = 0;
 unlock:
 	task_rq_unlock(rq, p, &rf);
@@ -1416,8 +1477,8 @@ static void enqueue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 		dl_check_constrained_dl(&p->dl);
 
 	if (p->on_rq == TASK_ON_RQ_MIGRATING || flags & ENQUEUE_RESTORE) {
-		add_rq_bw(p->dl.dl_bw, &rq->dl);
-		add_running_bw(p->dl.dl_bw, &rq->dl);
+		add_rq_bw(&p->dl, &rq->dl);
+		add_running_bw(&p->dl, &rq->dl);
 	}
 
 	/*
@@ -1457,8 +1518,8 @@ static void dequeue_task_dl(struct rq *rq, struct task_struct *p, int flags)
 	__dequeue_task_dl(rq, p, flags);
 
 	if (p->on_rq == TASK_ON_RQ_MIGRATING || flags & DEQUEUE_SAVE) {
-		sub_running_bw(p->dl.dl_bw, &rq->dl);
-		sub_rq_bw(p->dl.dl_bw, &rq->dl);
+		sub_running_bw(&p->dl, &rq->dl);
+		sub_rq_bw(&p->dl, &rq->dl);
 	}
 
 	/*
@@ -1564,7 +1625,7 @@ static void migrate_task_rq_dl(struct task_struct *p)
 	 */
 	raw_spin_lock(&rq->lock);
 	if (p->dl.dl_non_contending) {
-		sub_running_bw(p->dl.dl_bw, &rq->dl);
+		sub_running_bw(&p->dl, &rq->dl);
 		p->dl.dl_non_contending = 0;
 		/*
 		 * If the timer handler is currently running and the
@@ -1576,7 +1637,7 @@ static void migrate_task_rq_dl(struct task_struct *p)
 		if (hrtimer_try_to_cancel(&p->dl.inactive_timer) == 1)
 			put_task_struct(p);
 	}
-	sub_rq_bw(p->dl.dl_bw, &rq->dl);
+	sub_rq_bw(&p->dl, &rq->dl);
 	raw_spin_unlock(&rq->lock);
 }
 
@@ -2019,11 +2080,11 @@ static int push_dl_task(struct rq *rq)
 	}
 
 	deactivate_task(rq, next_task, 0);
-	sub_running_bw(next_task->dl.dl_bw, &rq->dl);
-	sub_rq_bw(next_task->dl.dl_bw, &rq->dl);
+	sub_running_bw(&next_task->dl, &rq->dl);
+	sub_rq_bw(&next_task->dl, &rq->dl);
 	set_task_cpu(next_task, later_rq->cpu);
-	add_rq_bw(next_task->dl.dl_bw, &later_rq->dl);
-	add_running_bw(next_task->dl.dl_bw, &later_rq->dl);
+	add_rq_bw(&next_task->dl, &later_rq->dl);
+	add_running_bw(&next_task->dl, &later_rq->dl);
 	activate_task(later_rq, next_task, 0);
 	ret = 1;
 
@@ -2111,11 +2172,11 @@ static void pull_dl_task(struct rq *this_rq)
 			resched = true;
 
 			deactivate_task(src_rq, p, 0);
-			sub_running_bw(p->dl.dl_bw, &src_rq->dl);
-			sub_rq_bw(p->dl.dl_bw, &src_rq->dl);
+			sub_running_bw(&p->dl, &src_rq->dl);
+			sub_rq_bw(&p->dl, &src_rq->dl);
 			set_task_cpu(p, this_cpu);
-			add_rq_bw(p->dl.dl_bw, &this_rq->dl);
-			add_running_bw(p->dl.dl_bw, &this_rq->dl);
+			add_rq_bw(&p->dl, &this_rq->dl);
+			add_running_bw(&p->dl, &this_rq->dl);
 			activate_task(this_rq, p, 0);
 			dmin = p->dl.deadline;
 
@@ -2224,7 +2285,7 @@ static void switched_from_dl(struct rq *rq, struct task_struct *p)
 		task_non_contending(p);
 
 	if (!task_on_rq_queued(p))
-		sub_rq_bw(p->dl.dl_bw, &rq->dl);
+		sub_rq_bw(&p->dl, &rq->dl);
 
 	/*
 	 * We cannot use inactive_task_timer() to invoke sub_running_bw()
@@ -2256,7 +2317,7 @@ static void switched_to_dl(struct rq *rq, struct task_struct *p)
 
 	/* If p is not queued we will update its parameters at next wakeup. */
 	if (!task_on_rq_queued(p)) {
-		add_rq_bw(p->dl.dl_bw, &rq->dl);
+		add_rq_bw(&p->dl, &rq->dl);
 
 		return;
 	}
@@ -2435,6 +2496,9 @@ int sched_dl_overflow(struct task_struct *p, int policy,
 	u64 new_bw = dl_policy(policy) ? to_ratio(period, runtime) : 0;
 	int cpus, err = -1;
 
+	if (attr->sched_flags & SCHED_FLAG_SUGOV)
+		return 0;
+
 	/* !deadline task may carry old deadline bandwidth */
 	if (new_bw == p->dl.dl_bw && task_has_dl_policy(p))
 		return 0;
@@ -2521,6 +2585,10 @@ void __getparam_dl(struct task_struct *p, struct sched_attr *attr)
  */
 bool __checkparam_dl(const struct sched_attr *attr)
 {
+	/* special dl tasks don't actually use any parameter */
+	if (attr->sched_flags & SCHED_FLAG_SUGOV)
+		return true;
+
 	/* deadline != 0 */
 	if (attr->sched_deadline == 0)
 		return false;
@@ -2566,6 +2634,7 @@ void __dl_clear_params(struct task_struct *p)
 	dl_se->dl_throttled = 0;
 	dl_se->dl_yielded = 0;
 	dl_se->dl_non_contending = 0;
+	dl_se->dl_overrun = 0;
 }
 
 bool dl_param_changed(struct task_struct *p, const struct sched_attr *attr)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2fe3aa8..7b65359 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3020,9 +3020,7 @@ static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq)
 		/*
 		 * There are a few boundary cases this might miss but it should
 		 * get called often enough that that should (hopefully) not be
-		 * a real problem -- added to that it only calls on the local
-		 * CPU, so if we enqueue remotely we'll miss an update, but
-		 * the next tick/schedule should update.
+		 * a real problem.
 		 *
 		 * It will not get called when we go idle, because the idle
 		 * thread is a different class (!fair), nor will the utilization
@@ -3091,8 +3089,6 @@ static u32 __accumulate_pelt_segments(u64 periods, u32 d1, u32 d3)
 	return c1 + c2 + c3;
 }
 
-#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
-
 /*
  * Accumulate the three separate parts of the sum; d1 the remainder
  * of the last (incomplete) period, d2 the span of full periods and d3
@@ -3122,7 +3118,7 @@ accumulate_sum(u64 delta, int cpu, struct sched_avg *sa,
 	u32 contrib = (u32)delta; /* p == 0 -> delta < 1024 */
 	u64 periods;
 
-	scale_freq = arch_scale_freq_capacity(NULL, cpu);
+	scale_freq = arch_scale_freq_capacity(cpu);
 	scale_cpu = arch_scale_cpu_capacity(NULL, cpu);
 
 	delta += sa->period_contrib;
@@ -4365,12 +4361,12 @@ static inline bool cfs_bandwidth_used(void)
 
 void cfs_bandwidth_usage_inc(void)
 {
-	static_key_slow_inc(&__cfs_bandwidth_used);
+	static_key_slow_inc_cpuslocked(&__cfs_bandwidth_used);
 }
 
 void cfs_bandwidth_usage_dec(void)
 {
-	static_key_slow_dec(&__cfs_bandwidth_used);
+	static_key_slow_dec_cpuslocked(&__cfs_bandwidth_used);
 }
 #else /* HAVE_JUMP_LABEL */
 static bool cfs_bandwidth_used(void)
@@ -5689,8 +5685,8 @@ static int wake_wide(struct task_struct *p)
  * soonest. For the purpose of speed we only consider the waking and previous
  * CPU.
  *
- * wake_affine_idle() - only considers 'now', it check if the waking CPU is (or
- *			will be) idle.
+ * wake_affine_idle() - only considers 'now', it check if the waking CPU is
+ *			cache-affine and is (or	will be) idle.
  *
  * wake_affine_weight() - considers the weight to reflect the average
  *			  scheduling latency of the CPUs. This seems to work
@@ -5701,7 +5697,13 @@ static bool
 wake_affine_idle(struct sched_domain *sd, struct task_struct *p,
 		 int this_cpu, int prev_cpu, int sync)
 {
-	if (idle_cpu(this_cpu))
+	/*
+	 * If this_cpu is idle, it implies the wakeup is from interrupt
+	 * context. Only allow the move if cache is shared. Otherwise an
+	 * interrupt intensive workload could force all tasks onto one
+	 * node depending on the IO topology or IRQ affinity settings.
+	 */
+	if (idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
 		return true;
 
 	if (sync && cpu_rq(this_cpu)->nr_running == 1)
@@ -5765,12 +5767,12 @@ static int wake_affine(struct sched_domain *sd, struct task_struct *p,
 	return affine;
 }
 
-static inline int task_util(struct task_struct *p);
-static int cpu_util_wake(int cpu, struct task_struct *p);
+static inline unsigned long task_util(struct task_struct *p);
+static unsigned long cpu_util_wake(int cpu, struct task_struct *p);
 
 static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
 {
-	return capacity_orig_of(cpu) - cpu_util_wake(cpu, p);
+	return max_t(long, capacity_of(cpu) - cpu_util_wake(cpu, p), 0);
 }
 
 /*
@@ -5950,7 +5952,7 @@ find_idlest_group_cpu(struct sched_group *group, struct task_struct *p, int this
 			}
 		} else if (shallowest_idle_cpu == -1) {
 			load = weighted_cpuload(cpu_rq(i));
-			if (load < min_load || (load == min_load && i == this_cpu)) {
+			if (load < min_load) {
 				min_load = load;
 				least_loaded_cpu = i;
 			}
@@ -6247,7 +6249,7 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
  * capacity_orig) as it useful for predicting the capacity required after task
  * migrations (scheduler-driven DVFS).
  */
-static int cpu_util(int cpu)
+static unsigned long cpu_util(int cpu)
 {
 	unsigned long util = cpu_rq(cpu)->cfs.avg.util_avg;
 	unsigned long capacity = capacity_orig_of(cpu);
@@ -6255,7 +6257,7 @@ static int cpu_util(int cpu)
 	return (util >= capacity) ? capacity : util;
 }
 
-static inline int task_util(struct task_struct *p)
+static inline unsigned long task_util(struct task_struct *p)
 {
 	return p->se.avg.util_avg;
 }
@@ -6264,7 +6266,7 @@ static inline int task_util(struct task_struct *p)
  * cpu_util_wake: Compute cpu utilization with any contributions from
  * the waking task p removed.
  */
-static int cpu_util_wake(int cpu, struct task_struct *p)
+static unsigned long cpu_util_wake(int cpu, struct task_struct *p)
 {
 	unsigned long util, capacity;
 
@@ -6449,8 +6451,7 @@ static void task_dead_fair(struct task_struct *p)
 }
 #endif /* CONFIG_SMP */
 
-static unsigned long
-wakeup_gran(struct sched_entity *curr, struct sched_entity *se)
+static unsigned long wakeup_gran(struct sched_entity *se)
 {
 	unsigned long gran = sysctl_sched_wakeup_granularity;
 
@@ -6492,7 +6493,7 @@ wakeup_preempt_entity(struct sched_entity *curr, struct sched_entity *se)
 	if (vdiff <= 0)
 		return -1;
 
-	gran = wakeup_gran(curr, se);
+	gran = wakeup_gran(se);
 	if (vdiff > gran)
 		return 1;
 
diff --git a/kernel/sched/membarrier.c b/kernel/sched/membarrier.c
index dd79087..9bcbacba 100644
--- a/kernel/sched/membarrier.c
+++ b/kernel/sched/membarrier.c
@@ -89,7 +89,9 @@ static int membarrier_private_expedited(void)
 		rcu_read_unlock();
 	}
 	if (!fallback) {
+		preempt_disable();
 		smp_call_function_many(tmpmask, ipi_mb, NULL, 1);
+		preempt_enable();
 		free_cpumask_var(tmpmask);
 	}
 	cpus_read_unlock();
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 665ace2..862a513 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2212,7 +2212,7 @@ static void switched_to_rt(struct rq *rq, struct task_struct *p)
 		if (p->nr_cpus_allowed > 1 && rq->rt.overloaded)
 			queue_push_tasks(rq);
 #endif /* CONFIG_SMP */
-		if (p->prio < rq->curr->prio)
+		if (p->prio < rq->curr->prio && cpu_online(cpu_of(rq)))
 			resched_curr(rq);
 	}
 }
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index b19552a2..2e95505 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -156,13 +156,39 @@ static inline int task_has_dl_policy(struct task_struct *p)
 	return dl_policy(p->policy);
 }
 
+#define cap_scale(v, s) ((v)*(s) >> SCHED_CAPACITY_SHIFT)
+
+/*
+ * !! For sched_setattr_nocheck() (kernel) only !!
+ *
+ * This is actually gross. :(
+ *
+ * It is used to make schedutil kworker(s) higher priority than SCHED_DEADLINE
+ * tasks, but still be able to sleep. We need this on platforms that cannot
+ * atomically change clock frequency. Remove once fast switching will be
+ * available on such platforms.
+ *
+ * SUGOV stands for SchedUtil GOVernor.
+ */
+#define SCHED_FLAG_SUGOV	0x10000000
+
+static inline bool dl_entity_is_special(struct sched_dl_entity *dl_se)
+{
+#ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL
+	return unlikely(dl_se->flags & SCHED_FLAG_SUGOV);
+#else
+	return false;
+#endif
+}
+
 /*
  * Tells if entity @a should preempt entity @b.
  */
 static inline bool
 dl_entity_preempt(struct sched_dl_entity *a, struct sched_dl_entity *b)
 {
-	return dl_time_before(a->deadline, b->deadline);
+	return dl_entity_is_special(a) ||
+	       dl_time_before(a->deadline, b->deadline);
 }
 
 /*
@@ -1328,47 +1354,6 @@ static inline int task_on_rq_migrating(struct task_struct *p)
 # define finish_arch_post_lock_switch()	do { } while (0)
 #endif
 
-static inline void prepare_lock_switch(struct rq *rq, struct task_struct *next)
-{
-#ifdef CONFIG_SMP
-	/*
-	 * We can optimise this out completely for !SMP, because the
-	 * SMP rebalancing from interrupt is the only thing that cares
-	 * here.
-	 */
-	next->on_cpu = 1;
-#endif
-}
-
-static inline void finish_lock_switch(struct rq *rq, struct task_struct *prev)
-{
-#ifdef CONFIG_SMP
-	/*
-	 * After ->on_cpu is cleared, the task can be moved to a different CPU.
-	 * We must ensure this doesn't happen until the switch is completely
-	 * finished.
-	 *
-	 * In particular, the load of prev->state in finish_task_switch() must
-	 * happen before this.
-	 *
-	 * Pairs with the smp_cond_load_acquire() in try_to_wake_up().
-	 */
-	smp_store_release(&prev->on_cpu, 0);
-#endif
-#ifdef CONFIG_DEBUG_SPINLOCK
-	/* this is a valid case when another task releases the spinlock */
-	rq->lock.owner = current;
-#endif
-	/*
-	 * If we are tracking spinlock dependencies then we have to
-	 * fix up the runqueue lock - which gets 'carried over' from
-	 * prev into current:
-	 */
-	spin_acquire(&rq->lock.dep_map, 0, 0, _THIS_IP_);
-
-	raw_spin_unlock_irq(&rq->lock);
-}
-
 /*
  * wake flags
  */
@@ -1687,17 +1672,17 @@ static inline int hrtick_enabled(struct rq *rq)
 
 #endif /* CONFIG_SCHED_HRTICK */
 
-#ifdef CONFIG_SMP
-extern void sched_avg_update(struct rq *rq);
-
 #ifndef arch_scale_freq_capacity
 static __always_inline
-unsigned long arch_scale_freq_capacity(struct sched_domain *sd, int cpu)
+unsigned long arch_scale_freq_capacity(int cpu)
 {
 	return SCHED_CAPACITY_SCALE;
 }
 #endif
 
+#ifdef CONFIG_SMP
+extern void sched_avg_update(struct rq *rq);
+
 #ifndef arch_scale_cpu_capacity
 static __always_inline
 unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
@@ -1711,10 +1696,17 @@ unsigned long arch_scale_cpu_capacity(struct sched_domain *sd, int cpu)
 
 static inline void sched_rt_avg_update(struct rq *rq, u64 rt_delta)
 {
-	rq->rt_avg += rt_delta * arch_scale_freq_capacity(NULL, cpu_of(rq));
+	rq->rt_avg += rt_delta * arch_scale_freq_capacity(cpu_of(rq));
 	sched_avg_update(rq);
 }
 #else
+#ifndef arch_scale_cpu_capacity
+static __always_inline
+unsigned long arch_scale_cpu_capacity(void __always_unused *sd, int cpu)
+{
+	return SCHED_CAPACITY_SCALE;
+}
+#endif
 static inline void sched_rt_avg_update(struct rq *rq, u64 rt_delta) { }
 static inline void sched_avg_update(struct rq *rq) { }
 #endif
@@ -2096,14 +2088,14 @@ DECLARE_PER_CPU(struct update_util_data *, cpufreq_update_util_data);
  * The way cpufreq is currently arranged requires it to evaluate the CPU
  * performance state (frequency/voltage) on a regular basis to prevent it from
  * being stuck in a completely inadequate performance level for too long.
- * That is not guaranteed to happen if the updates are only triggered from CFS,
- * though, because they may not be coming in if RT or deadline tasks are active
- * all the time (or there are RT and DL tasks only).
+ * That is not guaranteed to happen if the updates are only triggered from CFS
+ * and DL, though, because they may not be coming in if only RT tasks are
+ * active all the time (or there are RT tasks only).
  *
- * As a workaround for that issue, this function is called by the RT and DL
- * sched classes to trigger extra cpufreq updates to prevent it from stalling,
+ * As a workaround for that issue, this function is called periodically by the
+ * RT sched class to trigger extra cpufreq updates to prevent it from stalling,
  * but that really is a band-aid.  Going forward it should be replaced with
- * solutions targeted more specifically at RT and DL tasks.
+ * solutions targeted more specifically at RT tasks.
  */
 static inline void cpufreq_update_util(struct rq *rq, unsigned int flags)
 {
@@ -2125,3 +2117,17 @@ static inline void cpufreq_update_util(struct rq *rq, unsigned int flags) {}
 #else /* arch_scale_freq_capacity */
 #define arch_scale_freq_invariant()	(false)
 #endif
+
+#ifdef CONFIG_CPU_FREQ_GOV_SCHEDUTIL
+
+static inline unsigned long cpu_util_dl(struct rq *rq)
+{
+	return (rq->dl.running_bw * SCHED_CAPACITY_SCALE) >> BW_SHIFT;
+}
+
+static inline unsigned long cpu_util_cfs(struct rq *rq)
+{
+	return rq->cfs.avg.util_avg;
+}
+
+#endif
diff --git a/kernel/softirq.c b/kernel/softirq.c
index 2f5e87f..24d243e 100644
--- a/kernel/softirq.c
+++ b/kernel/softirq.c
@@ -665,7 +665,7 @@ static void run_ksoftirqd(unsigned int cpu)
 		 */
 		__do_softirq();
 		local_irq_enable();
-		cond_resched_rcu_qs();
+		cond_resched();
 		return;
 	}
 	local_irq_enable();
diff --git a/kernel/time/hrtimer.c b/kernel/time/hrtimer.c
index d325208..ae0c8a4 100644
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -60,6 +60,15 @@
 #include "tick-internal.h"
 
 /*
+ * Masks for selecting the soft and hard context timers from
+ * cpu_base->active
+ */
+#define MASK_SHIFT		(HRTIMER_BASE_MONOTONIC_SOFT)
+#define HRTIMER_ACTIVE_HARD	((1U << MASK_SHIFT) - 1)
+#define HRTIMER_ACTIVE_SOFT	(HRTIMER_ACTIVE_HARD << MASK_SHIFT)
+#define HRTIMER_ACTIVE_ALL	(HRTIMER_ACTIVE_SOFT | HRTIMER_ACTIVE_HARD)
+
+/*
  * The timer bases:
  *
  * There are more clockids than hrtimer bases. Thus, we index
@@ -70,7 +79,6 @@
 DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
 {
 	.lock = __RAW_SPIN_LOCK_UNLOCKED(hrtimer_bases.lock),
-	.seq = SEQCNT_ZERO(hrtimer_bases.seq),
 	.clock_base =
 	{
 		{
@@ -93,6 +101,26 @@ DEFINE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases) =
 			.clockid = CLOCK_TAI,
 			.get_time = &ktime_get_clocktai,
 		},
+		{
+			.index = HRTIMER_BASE_MONOTONIC_SOFT,
+			.clockid = CLOCK_MONOTONIC,
+			.get_time = &ktime_get,
+		},
+		{
+			.index = HRTIMER_BASE_REALTIME_SOFT,
+			.clockid = CLOCK_REALTIME,
+			.get_time = &ktime_get_real,
+		},
+		{
+			.index = HRTIMER_BASE_BOOTTIME_SOFT,
+			.clockid = CLOCK_BOOTTIME,
+			.get_time = &ktime_get_boottime,
+		},
+		{
+			.index = HRTIMER_BASE_TAI_SOFT,
+			.clockid = CLOCK_TAI,
+			.get_time = &ktime_get_clocktai,
+		},
 	}
 };
 
@@ -118,7 +146,6 @@ static const int hrtimer_clock_to_base_table[MAX_CLOCKS] = {
  * timer->base->cpu_base
  */
 static struct hrtimer_cpu_base migration_cpu_base = {
-	.seq = SEQCNT_ZERO(migration_cpu_base),
 	.clock_base = { { .cpu_base = &migration_cpu_base, }, },
 };
 
@@ -156,45 +183,33 @@ struct hrtimer_clock_base *lock_hrtimer_base(const struct hrtimer *timer,
 }
 
 /*
- * With HIGHRES=y we do not migrate the timer when it is expiring
- * before the next event on the target cpu because we cannot reprogram
- * the target cpu hardware and we would cause it to fire late.
+ * We do not migrate the timer when it is expiring before the next
+ * event on the target cpu. When high resolution is enabled, we cannot
+ * reprogram the target cpu hardware and we would cause it to fire
+ * late. To keep it simple, we handle the high resolution enabled and
+ * disabled case similar.
  *
  * Called with cpu_base->lock of target cpu held.
  */
 static int
 hrtimer_check_target(struct hrtimer *timer, struct hrtimer_clock_base *new_base)
 {
-#ifdef CONFIG_HIGH_RES_TIMERS
 	ktime_t expires;
 
-	if (!new_base->cpu_base->hres_active)
-		return 0;
-
 	expires = ktime_sub(hrtimer_get_expires(timer), new_base->offset);
-	return expires <= new_base->cpu_base->expires_next;
-#else
-	return 0;
-#endif
+	return expires < new_base->cpu_base->expires_next;
 }
 
-#ifdef CONFIG_NO_HZ_COMMON
 static inline
 struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base,
 					 int pinned)
 {
-	if (pinned || !base->migration_enabled)
-		return base;
-	return &per_cpu(hrtimer_bases, get_nohz_timer_target());
-}
-#else
-static inline
-struct hrtimer_cpu_base *get_target_base(struct hrtimer_cpu_base *base,
-					 int pinned)
-{
+#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
+	if (static_branch_likely(&timers_migration_enabled) && !pinned)
+		return &per_cpu(hrtimer_bases, get_nohz_timer_target());
+#endif
 	return base;
 }
-#endif
 
 /*
  * We switch the timer base to a power-optimized selected CPU target,
@@ -396,7 +411,8 @@ static inline void debug_hrtimer_init(struct hrtimer *timer)
 	debug_object_init(timer, &hrtimer_debug_descr);
 }
 
-static inline void debug_hrtimer_activate(struct hrtimer *timer)
+static inline void debug_hrtimer_activate(struct hrtimer *timer,
+					  enum hrtimer_mode mode)
 {
 	debug_object_activate(timer, &hrtimer_debug_descr);
 }
@@ -429,8 +445,10 @@ void destroy_hrtimer_on_stack(struct hrtimer *timer)
 EXPORT_SYMBOL_GPL(destroy_hrtimer_on_stack);
 
 #else
+
 static inline void debug_hrtimer_init(struct hrtimer *timer) { }
-static inline void debug_hrtimer_activate(struct hrtimer *timer) { }
+static inline void debug_hrtimer_activate(struct hrtimer *timer,
+					  enum hrtimer_mode mode) { }
 static inline void debug_hrtimer_deactivate(struct hrtimer *timer) { }
 #endif
 
@@ -442,10 +460,11 @@ debug_init(struct hrtimer *timer, clockid_t clockid,
 	trace_hrtimer_init(timer, clockid, mode);
 }
 
-static inline void debug_activate(struct hrtimer *timer)
+static inline void debug_activate(struct hrtimer *timer,
+				  enum hrtimer_mode mode)
 {
-	debug_hrtimer_activate(timer);
-	trace_hrtimer_start(timer);
+	debug_hrtimer_activate(timer, mode);
+	trace_hrtimer_start(timer, mode);
 }
 
 static inline void debug_deactivate(struct hrtimer *timer)
@@ -454,35 +473,43 @@ static inline void debug_deactivate(struct hrtimer *timer)
 	trace_hrtimer_cancel(timer);
 }
 
-#if defined(CONFIG_NO_HZ_COMMON) || defined(CONFIG_HIGH_RES_TIMERS)
-static inline void hrtimer_update_next_timer(struct hrtimer_cpu_base *cpu_base,
-					     struct hrtimer *timer)
+static struct hrtimer_clock_base *
+__next_base(struct hrtimer_cpu_base *cpu_base, unsigned int *active)
 {
-#ifdef CONFIG_HIGH_RES_TIMERS
-	cpu_base->next_timer = timer;
-#endif
+	unsigned int idx;
+
+	if (!*active)
+		return NULL;
+
+	idx = __ffs(*active);
+	*active &= ~(1U << idx);
+
+	return &cpu_base->clock_base[idx];
 }
 
-static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
-{
-	struct hrtimer_clock_base *base = cpu_base->clock_base;
-	unsigned int active = cpu_base->active_bases;
-	ktime_t expires, expires_next = KTIME_MAX;
+#define for_each_active_base(base, cpu_base, active)	\
+	while ((base = __next_base((cpu_base), &(active))))
 
-	hrtimer_update_next_timer(cpu_base, NULL);
-	for (; active; base++, active >>= 1) {
+static ktime_t __hrtimer_next_event_base(struct hrtimer_cpu_base *cpu_base,
+					 unsigned int active,
+					 ktime_t expires_next)
+{
+	struct hrtimer_clock_base *base;
+	ktime_t expires;
+
+	for_each_active_base(base, cpu_base, active) {
 		struct timerqueue_node *next;
 		struct hrtimer *timer;
 
-		if (!(active & 0x01))
-			continue;
-
 		next = timerqueue_getnext(&base->active);
 		timer = container_of(next, struct hrtimer, node);
 		expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
 		if (expires < expires_next) {
 			expires_next = expires;
-			hrtimer_update_next_timer(cpu_base, timer);
+			if (timer->is_soft)
+				cpu_base->softirq_next_timer = timer;
+			else
+				cpu_base->next_timer = timer;
 		}
 	}
 	/*
@@ -494,7 +521,47 @@ static ktime_t __hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base)
 		expires_next = 0;
 	return expires_next;
 }
-#endif
+
+/*
+ * Recomputes cpu_base::*next_timer and returns the earliest expires_next but
+ * does not set cpu_base::*expires_next, that is done by hrtimer_reprogram.
+ *
+ * When a softirq is pending, we can ignore the HRTIMER_ACTIVE_SOFT bases,
+ * those timers will get run whenever the softirq gets handled, at the end of
+ * hrtimer_run_softirq(), hrtimer_update_softirq_timer() will re-add these bases.
+ *
+ * Therefore softirq values are those from the HRTIMER_ACTIVE_SOFT clock bases.
+ * The !softirq values are the minima across HRTIMER_ACTIVE_ALL, unless an actual
+ * softirq is pending, in which case they're the minima of HRTIMER_ACTIVE_HARD.
+ *
+ * @active_mask must be one of:
+ *  - HRTIMER_ACTIVE_ALL,
+ *  - HRTIMER_ACTIVE_SOFT, or
+ *  - HRTIMER_ACTIVE_HARD.
+ */
+static ktime_t
+__hrtimer_get_next_event(struct hrtimer_cpu_base *cpu_base, unsigned int active_mask)
+{
+	unsigned int active;
+	struct hrtimer *next_timer = NULL;
+	ktime_t expires_next = KTIME_MAX;
+
+	if (!cpu_base->softirq_activated && (active_mask & HRTIMER_ACTIVE_SOFT)) {
+		active = cpu_base->active_bases & HRTIMER_ACTIVE_SOFT;
+		cpu_base->softirq_next_timer = NULL;
+		expires_next = __hrtimer_next_event_base(cpu_base, active, KTIME_MAX);
+
+		next_timer = cpu_base->softirq_next_timer;
+	}
+
+	if (active_mask & HRTIMER_ACTIVE_HARD) {
+		active = cpu_base->active_bases & HRTIMER_ACTIVE_HARD;
+		cpu_base->next_timer = next_timer;
+		expires_next = __hrtimer_next_event_base(cpu_base, active, expires_next);
+	}
+
+	return expires_next;
+}
 
 static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
 {
@@ -502,8 +569,84 @@ static inline ktime_t hrtimer_update_base(struct hrtimer_cpu_base *base)
 	ktime_t *offs_boot = &base->clock_base[HRTIMER_BASE_BOOTTIME].offset;
 	ktime_t *offs_tai = &base->clock_base[HRTIMER_BASE_TAI].offset;
 
-	return ktime_get_update_offsets_now(&base->clock_was_set_seq,
+	ktime_t now = ktime_get_update_offsets_now(&base->clock_was_set_seq,
 					    offs_real, offs_boot, offs_tai);
+
+	base->clock_base[HRTIMER_BASE_REALTIME_SOFT].offset = *offs_real;
+	base->clock_base[HRTIMER_BASE_BOOTTIME_SOFT].offset = *offs_boot;
+	base->clock_base[HRTIMER_BASE_TAI_SOFT].offset = *offs_tai;
+
+	return now;
+}
+
+/*
+ * Is the high resolution mode active ?
+ */
+static inline int __hrtimer_hres_active(struct hrtimer_cpu_base *cpu_base)
+{
+	return IS_ENABLED(CONFIG_HIGH_RES_TIMERS) ?
+		cpu_base->hres_active : 0;
+}
+
+static inline int hrtimer_hres_active(void)
+{
+	return __hrtimer_hres_active(this_cpu_ptr(&hrtimer_bases));
+}
+
+/*
+ * Reprogram the event source with checking both queues for the
+ * next event
+ * Called with interrupts disabled and base->lock held
+ */
+static void
+hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
+{
+	ktime_t expires_next;
+
+	/*
+	 * Find the current next expiration time.
+	 */
+	expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL);
+
+	if (cpu_base->next_timer && cpu_base->next_timer->is_soft) {
+		/*
+		 * When the softirq is activated, hrtimer has to be
+		 * programmed with the first hard hrtimer because soft
+		 * timer interrupt could occur too late.
+		 */
+		if (cpu_base->softirq_activated)
+			expires_next = __hrtimer_get_next_event(cpu_base,
+								HRTIMER_ACTIVE_HARD);
+		else
+			cpu_base->softirq_expires_next = expires_next;
+	}
+
+	if (skip_equal && expires_next == cpu_base->expires_next)
+		return;
+
+	cpu_base->expires_next = expires_next;
+
+	/*
+	 * If hres is not active, hardware does not have to be
+	 * reprogrammed yet.
+	 *
+	 * If a hang was detected in the last timer interrupt then we
+	 * leave the hang delay active in the hardware. We want the
+	 * system to make progress. That also prevents the following
+	 * scenario:
+	 * T1 expires 50ms from now
+	 * T2 expires 5s from now
+	 *
+	 * T1 is removed, so this code is called and would reprogram
+	 * the hardware to 5s from now. Any hrtimer_start after that
+	 * will not reprogram the hardware due to hang_detected being
+	 * set. So we'd effectivly block all timers until the T2 event
+	 * fires.
+	 */
+	if (!__hrtimer_hres_active(cpu_base) || cpu_base->hang_detected)
+		return;
+
+	tick_program_event(cpu_base->expires_next, 1);
 }
 
 /* High resolution timer related functions */
@@ -535,130 +678,6 @@ static inline int hrtimer_is_hres_enabled(void)
 }
 
 /*
- * Is the high resolution mode active ?
- */
-static inline int __hrtimer_hres_active(struct hrtimer_cpu_base *cpu_base)
-{
-	return cpu_base->hres_active;
-}
-
-static inline int hrtimer_hres_active(void)
-{
-	return __hrtimer_hres_active(this_cpu_ptr(&hrtimer_bases));
-}
-
-/*
- * Reprogram the event source with checking both queues for the
- * next event
- * Called with interrupts disabled and base->lock held
- */
-static void
-hrtimer_force_reprogram(struct hrtimer_cpu_base *cpu_base, int skip_equal)
-{
-	ktime_t expires_next;
-
-	if (!cpu_base->hres_active)
-		return;
-
-	expires_next = __hrtimer_get_next_event(cpu_base);
-
-	if (skip_equal && expires_next == cpu_base->expires_next)
-		return;
-
-	cpu_base->expires_next = expires_next;
-
-	/*
-	 * If a hang was detected in the last timer interrupt then we
-	 * leave the hang delay active in the hardware. We want the
-	 * system to make progress. That also prevents the following
-	 * scenario:
-	 * T1 expires 50ms from now
-	 * T2 expires 5s from now
-	 *
-	 * T1 is removed, so this code is called and would reprogram
-	 * the hardware to 5s from now. Any hrtimer_start after that
-	 * will not reprogram the hardware due to hang_detected being
-	 * set. So we'd effectivly block all timers until the T2 event
-	 * fires.
-	 */
-	if (cpu_base->hang_detected)
-		return;
-
-	tick_program_event(cpu_base->expires_next, 1);
-}
-
-/*
- * When a timer is enqueued and expires earlier than the already enqueued
- * timers, we have to check, whether it expires earlier than the timer for
- * which the clock event device was armed.
- *
- * Called with interrupts disabled and base->cpu_base.lock held
- */
-static void hrtimer_reprogram(struct hrtimer *timer,
-			      struct hrtimer_clock_base *base)
-{
-	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
-	ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
-
-	WARN_ON_ONCE(hrtimer_get_expires_tv64(timer) < 0);
-
-	/*
-	 * If the timer is not on the current cpu, we cannot reprogram
-	 * the other cpus clock event device.
-	 */
-	if (base->cpu_base != cpu_base)
-		return;
-
-	/*
-	 * If the hrtimer interrupt is running, then it will
-	 * reevaluate the clock bases and reprogram the clock event
-	 * device. The callbacks are always executed in hard interrupt
-	 * context so we don't need an extra check for a running
-	 * callback.
-	 */
-	if (cpu_base->in_hrtirq)
-		return;
-
-	/*
-	 * CLOCK_REALTIME timer might be requested with an absolute
-	 * expiry time which is less than base->offset. Set it to 0.
-	 */
-	if (expires < 0)
-		expires = 0;
-
-	if (expires >= cpu_base->expires_next)
-		return;
-
-	/* Update the pointer to the next expiring timer */
-	cpu_base->next_timer = timer;
-
-	/*
-	 * If a hang was detected in the last timer interrupt then we
-	 * do not schedule a timer which is earlier than the expiry
-	 * which we enforced in the hang detection. We want the system
-	 * to make progress.
-	 */
-	if (cpu_base->hang_detected)
-		return;
-
-	/*
-	 * Program the timer hardware. We enforce the expiry for
-	 * events which are already in the past.
-	 */
-	cpu_base->expires_next = expires;
-	tick_program_event(expires, 1);
-}
-
-/*
- * Initialize the high resolution related parts of cpu_base
- */
-static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
-{
-	base->expires_next = KTIME_MAX;
-	base->hres_active = 0;
-}
-
-/*
  * Retrigger next event is called after clock was set
  *
  * Called with interrupts disabled via on_each_cpu()
@@ -667,7 +686,7 @@ static void retrigger_next_event(void *arg)
 {
 	struct hrtimer_cpu_base *base = this_cpu_ptr(&hrtimer_bases);
 
-	if (!base->hres_active)
+	if (!__hrtimer_hres_active(base))
 		return;
 
 	raw_spin_lock(&base->lock);
@@ -714,23 +733,102 @@ void clock_was_set_delayed(void)
 
 #else
 
-static inline int __hrtimer_hres_active(struct hrtimer_cpu_base *b) { return 0; }
-static inline int hrtimer_hres_active(void) { return 0; }
 static inline int hrtimer_is_hres_enabled(void) { return 0; }
 static inline void hrtimer_switch_to_hres(void) { }
-static inline void
-hrtimer_force_reprogram(struct hrtimer_cpu_base *base, int skip_equal) { }
-static inline int hrtimer_reprogram(struct hrtimer *timer,
-				    struct hrtimer_clock_base *base)
-{
-	return 0;
-}
-static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base) { }
 static inline void retrigger_next_event(void *arg) { }
 
 #endif /* CONFIG_HIGH_RES_TIMERS */
 
 /*
+ * When a timer is enqueued and expires earlier than the already enqueued
+ * timers, we have to check, whether it expires earlier than the timer for
+ * which the clock event device was armed.
+ *
+ * Called with interrupts disabled and base->cpu_base.lock held
+ */
+static void hrtimer_reprogram(struct hrtimer *timer, bool reprogram)
+{
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
+	struct hrtimer_clock_base *base = timer->base;
+	ktime_t expires = ktime_sub(hrtimer_get_expires(timer), base->offset);
+
+	WARN_ON_ONCE(hrtimer_get_expires_tv64(timer) < 0);
+
+	/*
+	 * CLOCK_REALTIME timer might be requested with an absolute
+	 * expiry time which is less than base->offset. Set it to 0.
+	 */
+	if (expires < 0)
+		expires = 0;
+
+	if (timer->is_soft) {
+		/*
+		 * soft hrtimer could be started on a remote CPU. In this
+		 * case softirq_expires_next needs to be updated on the
+		 * remote CPU. The soft hrtimer will not expire before the
+		 * first hard hrtimer on the remote CPU -
+		 * hrtimer_check_target() prevents this case.
+		 */
+		struct hrtimer_cpu_base *timer_cpu_base = base->cpu_base;
+
+		if (timer_cpu_base->softirq_activated)
+			return;
+
+		if (!ktime_before(expires, timer_cpu_base->softirq_expires_next))
+			return;
+
+		timer_cpu_base->softirq_next_timer = timer;
+		timer_cpu_base->softirq_expires_next = expires;
+
+		if (!ktime_before(expires, timer_cpu_base->expires_next) ||
+		    !reprogram)
+			return;
+	}
+
+	/*
+	 * If the timer is not on the current cpu, we cannot reprogram
+	 * the other cpus clock event device.
+	 */
+	if (base->cpu_base != cpu_base)
+		return;
+
+	/*
+	 * If the hrtimer interrupt is running, then it will
+	 * reevaluate the clock bases and reprogram the clock event
+	 * device. The callbacks are always executed in hard interrupt
+	 * context so we don't need an extra check for a running
+	 * callback.
+	 */
+	if (cpu_base->in_hrtirq)
+		return;
+
+	if (expires >= cpu_base->expires_next)
+		return;
+
+	/* Update the pointer to the next expiring timer */
+	cpu_base->next_timer = timer;
+	cpu_base->expires_next = expires;
+
+	/*
+	 * If hres is not active, hardware does not have to be
+	 * programmed yet.
+	 *
+	 * If a hang was detected in the last timer interrupt then we
+	 * do not schedule a timer which is earlier than the expiry
+	 * which we enforced in the hang detection. We want the system
+	 * to make progress.
+	 */
+	if (!__hrtimer_hres_active(cpu_base) || cpu_base->hang_detected)
+		return;
+
+	/*
+	 * Program the timer hardware. We enforce the expiry for
+	 * events which are already in the past.
+	 */
+	tick_program_event(expires, 1);
+}
+
+/*
  * Clock realtime was set
  *
  * Change the offset of the realtime clock vs. the monotonic
@@ -835,9 +933,10 @@ EXPORT_SYMBOL_GPL(hrtimer_forward);
  * Returns 1 when the new timer is the leftmost timer in the tree.
  */
 static int enqueue_hrtimer(struct hrtimer *timer,
-			   struct hrtimer_clock_base *base)
+			   struct hrtimer_clock_base *base,
+			   enum hrtimer_mode mode)
 {
-	debug_activate(timer);
+	debug_activate(timer, mode);
 
 	base->cpu_base->active_bases |= 1 << base->index;
 
@@ -870,7 +969,6 @@ static void __remove_hrtimer(struct hrtimer *timer,
 	if (!timerqueue_del(&base->active, &timer->node))
 		cpu_base->active_bases &= ~(1 << base->index);
 
-#ifdef CONFIG_HIGH_RES_TIMERS
 	/*
 	 * Note: If reprogram is false we do not update
 	 * cpu_base->next_timer. This happens when we remove the first
@@ -881,7 +979,6 @@ static void __remove_hrtimer(struct hrtimer *timer,
 	 */
 	if (reprogram && timer == cpu_base->next_timer)
 		hrtimer_force_reprogram(cpu_base, 1);
-#endif
 }
 
 /*
@@ -930,22 +1027,36 @@ static inline ktime_t hrtimer_update_lowres(struct hrtimer *timer, ktime_t tim,
 	return tim;
 }
 
-/**
- * hrtimer_start_range_ns - (re)start an hrtimer on the current CPU
- * @timer:	the timer to be added
- * @tim:	expiry time
- * @delta_ns:	"slack" range for the timer
- * @mode:	expiry mode: absolute (HRTIMER_MODE_ABS) or
- *		relative (HRTIMER_MODE_REL)
- */
-void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
-			    u64 delta_ns, const enum hrtimer_mode mode)
+static void
+hrtimer_update_softirq_timer(struct hrtimer_cpu_base *cpu_base, bool reprogram)
 {
-	struct hrtimer_clock_base *base, *new_base;
-	unsigned long flags;
-	int leftmost;
+	ktime_t expires;
 
-	base = lock_hrtimer_base(timer, &flags);
+	/*
+	 * Find the next SOFT expiration.
+	 */
+	expires = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_SOFT);
+
+	/*
+	 * reprogramming needs to be triggered, even if the next soft
+	 * hrtimer expires at the same time than the next hard
+	 * hrtimer. cpu_base->softirq_expires_next needs to be updated!
+	 */
+	if (expires == KTIME_MAX)
+		return;
+
+	/*
+	 * cpu_base->*next_timer is recomputed by __hrtimer_get_next_event()
+	 * cpu_base->*expires_next is only set by hrtimer_reprogram()
+	 */
+	hrtimer_reprogram(cpu_base->softirq_next_timer, reprogram);
+}
+
+static int __hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+				    u64 delta_ns, const enum hrtimer_mode mode,
+				    struct hrtimer_clock_base *base)
+{
+	struct hrtimer_clock_base *new_base;
 
 	/* Remove an active timer from the queue: */
 	remove_hrtimer(timer, base, true);
@@ -960,21 +1071,35 @@ void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
 	/* Switch the timer base, if necessary: */
 	new_base = switch_hrtimer_base(timer, base, mode & HRTIMER_MODE_PINNED);
 
-	leftmost = enqueue_hrtimer(timer, new_base);
-	if (!leftmost)
-		goto unlock;
+	return enqueue_hrtimer(timer, new_base, mode);
+}
 
-	if (!hrtimer_is_hres_active(timer)) {
-		/*
-		 * Kick to reschedule the next tick to handle the new timer
-		 * on dynticks target.
-		 */
-		if (new_base->cpu_base->nohz_active)
-			wake_up_nohz_cpu(new_base->cpu_base->cpu);
-	} else {
-		hrtimer_reprogram(timer, new_base);
-	}
-unlock:
+/**
+ * hrtimer_start_range_ns - (re)start an hrtimer
+ * @timer:	the timer to be added
+ * @tim:	expiry time
+ * @delta_ns:	"slack" range for the timer
+ * @mode:	timer mode: absolute (HRTIMER_MODE_ABS) or
+ *		relative (HRTIMER_MODE_REL), and pinned (HRTIMER_MODE_PINNED);
+ *		softirq based mode is considered for debug purpose only!
+ */
+void hrtimer_start_range_ns(struct hrtimer *timer, ktime_t tim,
+			    u64 delta_ns, const enum hrtimer_mode mode)
+{
+	struct hrtimer_clock_base *base;
+	unsigned long flags;
+
+	/*
+	 * Check whether the HRTIMER_MODE_SOFT bit and hrtimer.is_soft
+	 * match.
+	 */
+	WARN_ON_ONCE(!(mode & HRTIMER_MODE_SOFT) ^ !timer->is_soft);
+
+	base = lock_hrtimer_base(timer, &flags);
+
+	if (__hrtimer_start_range_ns(timer, tim, delta_ns, mode, base))
+		hrtimer_reprogram(timer, true);
+
 	unlock_hrtimer_base(timer, &flags);
 }
 EXPORT_SYMBOL_GPL(hrtimer_start_range_ns);
@@ -1072,7 +1197,7 @@ u64 hrtimer_get_next_event(void)
 	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 
 	if (!__hrtimer_hres_active(cpu_base))
-		expires = __hrtimer_get_next_event(cpu_base);
+		expires = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL);
 
 	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
 
@@ -1095,17 +1220,24 @@ static inline int hrtimer_clockid_to_base(clockid_t clock_id)
 static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 			   enum hrtimer_mode mode)
 {
+	bool softtimer = !!(mode & HRTIMER_MODE_SOFT);
+	int base = softtimer ? HRTIMER_MAX_CLOCK_BASES / 2 : 0;
 	struct hrtimer_cpu_base *cpu_base;
-	int base;
 
 	memset(timer, 0, sizeof(struct hrtimer));
 
 	cpu_base = raw_cpu_ptr(&hrtimer_bases);
 
-	if (clock_id == CLOCK_REALTIME && mode != HRTIMER_MODE_ABS)
+	/*
+	 * POSIX magic: Relative CLOCK_REALTIME timers are not affected by
+	 * clock modifications, so they needs to become CLOCK_MONOTONIC to
+	 * ensure POSIX compliance.
+	 */
+	if (clock_id == CLOCK_REALTIME && mode & HRTIMER_MODE_REL)
 		clock_id = CLOCK_MONOTONIC;
 
-	base = hrtimer_clockid_to_base(clock_id);
+	base += hrtimer_clockid_to_base(clock_id);
+	timer->is_soft = softtimer;
 	timer->base = &cpu_base->clock_base[base];
 	timerqueue_init(&timer->node);
 }
@@ -1114,7 +1246,13 @@ static void __hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
  * hrtimer_init - initialize a timer to the given clock
  * @timer:	the timer to be initialized
  * @clock_id:	the clock to be used
- * @mode:	timer mode abs/rel
+ * @mode:       The modes which are relevant for intitialization:
+ *              HRTIMER_MODE_ABS, HRTIMER_MODE_REL, HRTIMER_MODE_ABS_SOFT,
+ *              HRTIMER_MODE_REL_SOFT
+ *
+ *              The PINNED variants of the above can be handed in,
+ *              but the PINNED bit is ignored as pinning happens
+ *              when the hrtimer is started
  */
 void hrtimer_init(struct hrtimer *timer, clockid_t clock_id,
 		  enum hrtimer_mode mode)
@@ -1133,19 +1271,19 @@ EXPORT_SYMBOL_GPL(hrtimer_init);
  */
 bool hrtimer_active(const struct hrtimer *timer)
 {
-	struct hrtimer_cpu_base *cpu_base;
+	struct hrtimer_clock_base *base;
 	unsigned int seq;
 
 	do {
-		cpu_base = READ_ONCE(timer->base->cpu_base);
-		seq = raw_read_seqcount_begin(&cpu_base->seq);
+		base = READ_ONCE(timer->base);
+		seq = raw_read_seqcount_begin(&base->seq);
 
 		if (timer->state != HRTIMER_STATE_INACTIVE ||
-		    cpu_base->running == timer)
+		    base->running == timer)
 			return true;
 
-	} while (read_seqcount_retry(&cpu_base->seq, seq) ||
-		 cpu_base != READ_ONCE(timer->base->cpu_base));
+	} while (read_seqcount_retry(&base->seq, seq) ||
+		 base != READ_ONCE(timer->base));
 
 	return false;
 }
@@ -1171,7 +1309,8 @@ EXPORT_SYMBOL_GPL(hrtimer_active);
 
 static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
 			  struct hrtimer_clock_base *base,
-			  struct hrtimer *timer, ktime_t *now)
+			  struct hrtimer *timer, ktime_t *now,
+			  unsigned long flags)
 {
 	enum hrtimer_restart (*fn)(struct hrtimer *);
 	int restart;
@@ -1179,16 +1318,16 @@ static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
 	lockdep_assert_held(&cpu_base->lock);
 
 	debug_deactivate(timer);
-	cpu_base->running = timer;
+	base->running = timer;
 
 	/*
 	 * Separate the ->running assignment from the ->state assignment.
 	 *
 	 * As with a regular write barrier, this ensures the read side in
-	 * hrtimer_active() cannot observe cpu_base->running == NULL &&
+	 * hrtimer_active() cannot observe base->running == NULL &&
 	 * timer->state == INACTIVE.
 	 */
-	raw_write_seqcount_barrier(&cpu_base->seq);
+	raw_write_seqcount_barrier(&base->seq);
 
 	__remove_hrtimer(timer, base, HRTIMER_STATE_INACTIVE, 0);
 	fn = timer->function;
@@ -1202,15 +1341,15 @@ static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
 		timer->is_rel = false;
 
 	/*
-	 * Because we run timers from hardirq context, there is no chance
-	 * they get migrated to another cpu, therefore its safe to unlock
-	 * the timer base.
+	 * The timer is marked as running in the CPU base, so it is
+	 * protected against migration to a different CPU even if the lock
+	 * is dropped.
 	 */
-	raw_spin_unlock(&cpu_base->lock);
+	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
 	trace_hrtimer_expire_entry(timer, now);
 	restart = fn(timer);
 	trace_hrtimer_expire_exit(timer);
-	raw_spin_lock(&cpu_base->lock);
+	raw_spin_lock_irq(&cpu_base->lock);
 
 	/*
 	 * Note: We clear the running state after enqueue_hrtimer and
@@ -1223,33 +1362,31 @@ static void __run_hrtimer(struct hrtimer_cpu_base *cpu_base,
 	 */
 	if (restart != HRTIMER_NORESTART &&
 	    !(timer->state & HRTIMER_STATE_ENQUEUED))
-		enqueue_hrtimer(timer, base);
+		enqueue_hrtimer(timer, base, HRTIMER_MODE_ABS);
 
 	/*
 	 * Separate the ->running assignment from the ->state assignment.
 	 *
 	 * As with a regular write barrier, this ensures the read side in
-	 * hrtimer_active() cannot observe cpu_base->running == NULL &&
+	 * hrtimer_active() cannot observe base->running.timer == NULL &&
 	 * timer->state == INACTIVE.
 	 */
-	raw_write_seqcount_barrier(&cpu_base->seq);
+	raw_write_seqcount_barrier(&base->seq);
 
-	WARN_ON_ONCE(cpu_base->running != timer);
-	cpu_base->running = NULL;
+	WARN_ON_ONCE(base->running != timer);
+	base->running = NULL;
 }
 
-static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now)
+static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now,
+				 unsigned long flags, unsigned int active_mask)
 {
-	struct hrtimer_clock_base *base = cpu_base->clock_base;
-	unsigned int active = cpu_base->active_bases;
+	struct hrtimer_clock_base *base;
+	unsigned int active = cpu_base->active_bases & active_mask;
 
-	for (; active; base++, active >>= 1) {
+	for_each_active_base(base, cpu_base, active) {
 		struct timerqueue_node *node;
 		ktime_t basenow;
 
-		if (!(active & 0x01))
-			continue;
-
 		basenow = ktime_add(now, base->offset);
 
 		while ((node = timerqueue_getnext(&base->active))) {
@@ -1272,11 +1409,28 @@ static void __hrtimer_run_queues(struct hrtimer_cpu_base *cpu_base, ktime_t now)
 			if (basenow < hrtimer_get_softexpires_tv64(timer))
 				break;
 
-			__run_hrtimer(cpu_base, base, timer, &basenow);
+			__run_hrtimer(cpu_base, base, timer, &basenow, flags);
 		}
 	}
 }
 
+static __latent_entropy void hrtimer_run_softirq(struct softirq_action *h)
+{
+	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
+	unsigned long flags;
+	ktime_t now;
+
+	raw_spin_lock_irqsave(&cpu_base->lock, flags);
+
+	now = hrtimer_update_base(cpu_base);
+	__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_SOFT);
+
+	cpu_base->softirq_activated = 0;
+	hrtimer_update_softirq_timer(cpu_base, true);
+
+	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
+}
+
 #ifdef CONFIG_HIGH_RES_TIMERS
 
 /*
@@ -1287,13 +1441,14 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 {
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
 	ktime_t expires_next, now, entry_time, delta;
+	unsigned long flags;
 	int retries = 0;
 
 	BUG_ON(!cpu_base->hres_active);
 	cpu_base->nr_events++;
 	dev->next_event = KTIME_MAX;
 
-	raw_spin_lock(&cpu_base->lock);
+	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 	entry_time = now = hrtimer_update_base(cpu_base);
 retry:
 	cpu_base->in_hrtirq = 1;
@@ -1306,17 +1461,23 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	 */
 	cpu_base->expires_next = KTIME_MAX;
 
-	__hrtimer_run_queues(cpu_base, now);
+	if (!ktime_before(now, cpu_base->softirq_expires_next)) {
+		cpu_base->softirq_expires_next = KTIME_MAX;
+		cpu_base->softirq_activated = 1;
+		raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+	}
+
+	__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD);
 
 	/* Reevaluate the clock bases for the next expiry */
-	expires_next = __hrtimer_get_next_event(cpu_base);
+	expires_next = __hrtimer_get_next_event(cpu_base, HRTIMER_ACTIVE_ALL);
 	/*
 	 * Store the new expiry value so the migration code can verify
 	 * against it.
 	 */
 	cpu_base->expires_next = expires_next;
 	cpu_base->in_hrtirq = 0;
-	raw_spin_unlock(&cpu_base->lock);
+	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
 
 	/* Reprogramming necessary ? */
 	if (!tick_program_event(expires_next, 0)) {
@@ -1337,7 +1498,7 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	 * Acquire base lock for updating the offsets and retrieving
 	 * the current time.
 	 */
-	raw_spin_lock(&cpu_base->lock);
+	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 	now = hrtimer_update_base(cpu_base);
 	cpu_base->nr_retries++;
 	if (++retries < 3)
@@ -1350,7 +1511,8 @@ void hrtimer_interrupt(struct clock_event_device *dev)
 	 */
 	cpu_base->nr_hangs++;
 	cpu_base->hang_detected = 1;
-	raw_spin_unlock(&cpu_base->lock);
+	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
+
 	delta = ktime_sub(now, entry_time);
 	if ((unsigned int)delta > cpu_base->max_hang_time)
 		cpu_base->max_hang_time = (unsigned int) delta;
@@ -1392,6 +1554,7 @@ static inline void __hrtimer_peek_ahead_timers(void) { }
 void hrtimer_run_queues(void)
 {
 	struct hrtimer_cpu_base *cpu_base = this_cpu_ptr(&hrtimer_bases);
+	unsigned long flags;
 	ktime_t now;
 
 	if (__hrtimer_hres_active(cpu_base))
@@ -1409,10 +1572,17 @@ void hrtimer_run_queues(void)
 		return;
 	}
 
-	raw_spin_lock(&cpu_base->lock);
+	raw_spin_lock_irqsave(&cpu_base->lock, flags);
 	now = hrtimer_update_base(cpu_base);
-	__hrtimer_run_queues(cpu_base, now);
-	raw_spin_unlock(&cpu_base->lock);
+
+	if (!ktime_before(now, cpu_base->softirq_expires_next)) {
+		cpu_base->softirq_expires_next = KTIME_MAX;
+		cpu_base->softirq_activated = 1;
+		raise_softirq_irqoff(HRTIMER_SOFTIRQ);
+	}
+
+	__hrtimer_run_queues(cpu_base, now, flags, HRTIMER_ACTIVE_HARD);
+	raw_spin_unlock_irqrestore(&cpu_base->lock, flags);
 }
 
 /*
@@ -1590,7 +1760,13 @@ int hrtimers_prepare_cpu(unsigned int cpu)
 	}
 
 	cpu_base->cpu = cpu;
-	hrtimer_init_hres(cpu_base);
+	cpu_base->active_bases = 0;
+	cpu_base->hres_active = 0;
+	cpu_base->hang_detected = 0;
+	cpu_base->next_timer = NULL;
+	cpu_base->softirq_next_timer = NULL;
+	cpu_base->expires_next = KTIME_MAX;
+	cpu_base->softirq_expires_next = KTIME_MAX;
 	return 0;
 }
 
@@ -1622,7 +1798,7 @@ static void migrate_hrtimer_list(struct hrtimer_clock_base *old_base,
 		 * sort out already expired timers and reprogram the
 		 * event device.
 		 */
-		enqueue_hrtimer(timer, new_base);
+		enqueue_hrtimer(timer, new_base, HRTIMER_MODE_ABS);
 	}
 }
 
@@ -1634,6 +1810,12 @@ int hrtimers_dead_cpu(unsigned int scpu)
 	BUG_ON(cpu_online(scpu));
 	tick_cancel_sched_timer(scpu);
 
+	/*
+	 * this BH disable ensures that raise_softirq_irqoff() does
+	 * not wakeup ksoftirqd (and acquire the pi-lock) while
+	 * holding the cpu_base lock
+	 */
+	local_bh_disable();
 	local_irq_disable();
 	old_base = &per_cpu(hrtimer_bases, scpu);
 	new_base = this_cpu_ptr(&hrtimer_bases);
@@ -1649,12 +1831,19 @@ int hrtimers_dead_cpu(unsigned int scpu)
 				     &new_base->clock_base[i]);
 	}
 
+	/*
+	 * The migration might have changed the first expiring softirq
+	 * timer on this CPU. Update it.
+	 */
+	hrtimer_update_softirq_timer(new_base, false);
+
 	raw_spin_unlock(&old_base->lock);
 	raw_spin_unlock(&new_base->lock);
 
 	/* Check, if we got expired work to do */
 	__hrtimer_peek_ahead_timers();
 	local_irq_enable();
+	local_bh_enable();
 	return 0;
 }
 
@@ -1663,18 +1852,19 @@ int hrtimers_dead_cpu(unsigned int scpu)
 void __init hrtimers_init(void)
 {
 	hrtimers_prepare_cpu(smp_processor_id());
+	open_softirq(HRTIMER_SOFTIRQ, hrtimer_run_softirq);
 }
 
 /**
  * schedule_hrtimeout_range_clock - sleep until timeout
  * @expires:	timeout value (ktime_t)
  * @delta:	slack in expires timeout (ktime_t)
- * @mode:	timer mode, HRTIMER_MODE_ABS or HRTIMER_MODE_REL
- * @clock:	timer clock, CLOCK_MONOTONIC or CLOCK_REALTIME
+ * @mode:	timer mode
+ * @clock_id:	timer clock to be used
  */
 int __sched
 schedule_hrtimeout_range_clock(ktime_t *expires, u64 delta,
-			       const enum hrtimer_mode mode, int clock)
+			       const enum hrtimer_mode mode, clockid_t clock_id)
 {
 	struct hrtimer_sleeper t;
 
@@ -1695,7 +1885,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 delta,
 		return -EINTR;
 	}
 
-	hrtimer_init_on_stack(&t.timer, clock, mode);
+	hrtimer_init_on_stack(&t.timer, clock_id, mode);
 	hrtimer_set_expires_range_ns(&t.timer, *expires, delta);
 
 	hrtimer_init_sleeper(&t, current);
@@ -1717,7 +1907,7 @@ schedule_hrtimeout_range_clock(ktime_t *expires, u64 delta,
  * schedule_hrtimeout_range - sleep until timeout
  * @expires:	timeout value (ktime_t)
  * @delta:	slack in expires timeout (ktime_t)
- * @mode:	timer mode, HRTIMER_MODE_ABS or HRTIMER_MODE_REL
+ * @mode:	timer mode
  *
  * Make the current task sleep until the given expiry time has
  * elapsed. The routine will return immediately unless
@@ -1756,7 +1946,7 @@ EXPORT_SYMBOL_GPL(schedule_hrtimeout_range);
 /**
  * schedule_hrtimeout - sleep until timeout
  * @expires:	timeout value (ktime_t)
- * @mode:	timer mode, HRTIMER_MODE_ABS or HRTIMER_MODE_REL
+ * @mode:	timer mode
  *
  * Make the current task sleep until the given expiry time has
  * elapsed. The routine will return immediately unless
diff --git a/kernel/time/posix-clock.c b/kernel/time/posix-clock.c
index 17cdc55..cc91d90 100644
--- a/kernel/time/posix-clock.c
+++ b/kernel/time/posix-clock.c
@@ -216,7 +216,7 @@ struct posix_clock_desc {
 
 static int get_clock_desc(const clockid_t id, struct posix_clock_desc *cd)
 {
-	struct file *fp = fget(CLOCKID_TO_FD(id));
+	struct file *fp = fget(clockid_to_fd(id));
 	int err = -EINVAL;
 
 	if (!fp)
diff --git a/kernel/time/posix-cpu-timers.c b/kernel/time/posix-cpu-timers.c
index 1f27887a..2541bd8 100644
--- a/kernel/time/posix-cpu-timers.c
+++ b/kernel/time/posix-cpu-timers.c
@@ -14,6 +14,7 @@
 #include <linux/tick.h>
 #include <linux/workqueue.h>
 #include <linux/compat.h>
+#include <linux/sched/deadline.h>
 
 #include "posix-timers.h"
 
@@ -791,6 +792,14 @@ check_timers_list(struct list_head *timers,
 	return 0;
 }
 
+static inline void check_dl_overrun(struct task_struct *tsk)
+{
+	if (tsk->dl.dl_overrun) {
+		tsk->dl.dl_overrun = 0;
+		__group_send_sig_info(SIGXCPU, SEND_SIG_PRIV, tsk);
+	}
+}
+
 /*
  * Check for any per-thread CPU timers that have fired and move them off
  * the tsk->cpu_timers[N] list onto the firing list.  Here we update the
@@ -804,6 +813,9 @@ static void check_thread_timers(struct task_struct *tsk,
 	u64 expires;
 	unsigned long soft;
 
+	if (dl_task(tsk))
+		check_dl_overrun(tsk);
+
 	/*
 	 * If cputime_expires is zero, then there are no active
 	 * per thread CPU timers.
@@ -906,6 +918,9 @@ static void check_process_timers(struct task_struct *tsk,
 	struct task_cputime cputime;
 	unsigned long soft;
 
+	if (dl_task(tsk))
+		check_dl_overrun(tsk);
+
 	/*
 	 * If cputimer is not running, then there are no active
 	 * process wide timers (POSIX 1.b, itimers, RLIMIT_CPU).
@@ -1111,6 +1126,9 @@ static inline int fastpath_timer_check(struct task_struct *tsk)
 			return 1;
 	}
 
+	if (dl_task(tsk) && tsk->dl.dl_overrun)
+		return 1;
+
 	return 0;
 }
 
@@ -1189,9 +1207,8 @@ void set_process_cpu_timer(struct task_struct *tsk, unsigned int clock_idx,
 	u64 now;
 
 	WARN_ON_ONCE(clock_idx == CPUCLOCK_SCHED);
-	cpu_timer_sample_group(clock_idx, tsk, &now);
 
-	if (oldval) {
+	if (oldval && cpu_timer_sample_group(clock_idx, tsk, &now) != -EINVAL) {
 		/*
 		 * We are setting itimer. The *oldval is absolute and we update
 		 * it to be relative, *newval argument is relative and we update
@@ -1363,8 +1380,8 @@ static long posix_cpu_nsleep_restart(struct restart_block *restart_block)
 	return do_cpu_nanosleep(which_clock, TIMER_ABSTIME, &t);
 }
 
-#define PROCESS_CLOCK	MAKE_PROCESS_CPUCLOCK(0, CPUCLOCK_SCHED)
-#define THREAD_CLOCK	MAKE_THREAD_CPUCLOCK(0, CPUCLOCK_SCHED)
+#define PROCESS_CLOCK	make_process_cpuclock(0, CPUCLOCK_SCHED)
+#define THREAD_CLOCK	make_thread_cpuclock(0, CPUCLOCK_SCHED)
 
 static int process_cpu_clock_getres(const clockid_t which_clock,
 				    struct timespec64 *tp)
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index f8e1845..e277284 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -150,16 +150,15 @@ static inline void tick_nohz_init(void) { }
 
 #ifdef CONFIG_NO_HZ_COMMON
 extern unsigned long tick_nohz_active;
-#else
+extern void timers_update_nohz(void);
+# ifdef CONFIG_SMP
+extern struct static_key_false timers_migration_enabled;
+# endif
+#else /* CONFIG_NO_HZ_COMMON */
+static inline void timers_update_nohz(void) { }
 #define tick_nohz_active (0)
 #endif
 
-#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
-extern void timers_update_migration(bool update_nohz);
-#else
-static inline void timers_update_migration(bool update_nohz) { }
-#endif
-
 DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);
 
 extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index f7cc7ab..29a5733 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -1107,7 +1107,7 @@ static inline void tick_nohz_activate(struct tick_sched *ts, int mode)
 	ts->nohz_mode = mode;
 	/* One update is enough */
 	if (!test_and_set_bit(0, &tick_nohz_active))
-		timers_update_migration(true);
+		timers_update_nohz();
 }
 
 /**
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 89a9e1b..48150ab 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -200,8 +200,6 @@ struct timer_base {
 	unsigned long		clk;
 	unsigned long		next_expiry;
 	unsigned int		cpu;
-	bool			migration_enabled;
-	bool			nohz_active;
 	bool			is_idle;
 	bool			must_forward_clk;
 	DECLARE_BITMAP(pending_map, WHEEL_SIZE);
@@ -210,45 +208,64 @@ struct timer_base {
 
 static DEFINE_PER_CPU(struct timer_base, timer_bases[NR_BASES]);
 
-#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
+#ifdef CONFIG_NO_HZ_COMMON
+
+static DEFINE_STATIC_KEY_FALSE(timers_nohz_active);
+static DEFINE_MUTEX(timer_keys_mutex);
+
+static void timer_update_keys(struct work_struct *work);
+static DECLARE_WORK(timer_update_work, timer_update_keys);
+
+#ifdef CONFIG_SMP
 unsigned int sysctl_timer_migration = 1;
 
-void timers_update_migration(bool update_nohz)
+DEFINE_STATIC_KEY_FALSE(timers_migration_enabled);
+
+static void timers_update_migration(void)
 {
-	bool on = sysctl_timer_migration && tick_nohz_active;
-	unsigned int cpu;
+	if (sysctl_timer_migration && tick_nohz_active)
+		static_branch_enable(&timers_migration_enabled);
+	else
+		static_branch_disable(&timers_migration_enabled);
+}
+#else
+static inline void timers_update_migration(void) { }
+#endif /* !CONFIG_SMP */
 
-	/* Avoid the loop, if nothing to update */
-	if (this_cpu_read(timer_bases[BASE_STD].migration_enabled) == on)
-		return;
+static void timer_update_keys(struct work_struct *work)
+{
+	mutex_lock(&timer_keys_mutex);
+	timers_update_migration();
+	static_branch_enable(&timers_nohz_active);
+	mutex_unlock(&timer_keys_mutex);
+}
 
-	for_each_possible_cpu(cpu) {
-		per_cpu(timer_bases[BASE_STD].migration_enabled, cpu) = on;
-		per_cpu(timer_bases[BASE_DEF].migration_enabled, cpu) = on;
-		per_cpu(hrtimer_bases.migration_enabled, cpu) = on;
-		if (!update_nohz)
-			continue;
-		per_cpu(timer_bases[BASE_STD].nohz_active, cpu) = true;
-		per_cpu(timer_bases[BASE_DEF].nohz_active, cpu) = true;
-		per_cpu(hrtimer_bases.nohz_active, cpu) = true;
-	}
+void timers_update_nohz(void)
+{
+	schedule_work(&timer_update_work);
 }
 
 int timer_migration_handler(struct ctl_table *table, int write,
 			    void __user *buffer, size_t *lenp,
 			    loff_t *ppos)
 {
-	static DEFINE_MUTEX(mutex);
 	int ret;
 
-	mutex_lock(&mutex);
+	mutex_lock(&timer_keys_mutex);
 	ret = proc_dointvec_minmax(table, write, buffer, lenp, ppos);
 	if (!ret && write)
-		timers_update_migration(false);
-	mutex_unlock(&mutex);
+		timers_update_migration();
+	mutex_unlock(&timer_keys_mutex);
 	return ret;
 }
-#endif
+
+static inline bool is_timers_nohz_active(void)
+{
+	return static_branch_unlikely(&timers_nohz_active);
+}
+#else
+static inline bool is_timers_nohz_active(void) { return false; }
+#endif /* NO_HZ_COMMON */
 
 static unsigned long round_jiffies_common(unsigned long j, int cpu,
 		bool force_up)
@@ -534,7 +551,7 @@ __internal_add_timer(struct timer_base *base, struct timer_list *timer)
 static void
 trigger_dyntick_cpu(struct timer_base *base, struct timer_list *timer)
 {
-	if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+	if (!is_timers_nohz_active())
 		return;
 
 	/*
@@ -849,21 +866,20 @@ static inline struct timer_base *get_timer_base(u32 tflags)
 	return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
 }
 
-#ifdef CONFIG_NO_HZ_COMMON
 static inline struct timer_base *
 get_target_base(struct timer_base *base, unsigned tflags)
 {
-#ifdef CONFIG_SMP
-	if ((tflags & TIMER_PINNED) || !base->migration_enabled)
-		return get_timer_this_cpu_base(tflags);
-	return get_timer_cpu_base(tflags, get_nohz_timer_target());
-#else
-	return get_timer_this_cpu_base(tflags);
+#if defined(CONFIG_SMP) && defined(CONFIG_NO_HZ_COMMON)
+	if (static_branch_likely(&timers_migration_enabled) &&
+	    !(tflags & TIMER_PINNED))
+		return get_timer_cpu_base(tflags, get_nohz_timer_target());
 #endif
+	return get_timer_this_cpu_base(tflags);
 }
 
 static inline void forward_timer_base(struct timer_base *base)
 {
+#ifdef CONFIG_NO_HZ_COMMON
 	unsigned long jnow;
 
 	/*
@@ -887,16 +903,8 @@ static inline void forward_timer_base(struct timer_base *base)
 		base->clk = jnow;
 	else
 		base->clk = base->next_expiry;
-}
-#else
-static inline struct timer_base *
-get_target_base(struct timer_base *base, unsigned tflags)
-{
-	return get_timer_this_cpu_base(tflags);
-}
-
-static inline void forward_timer_base(struct timer_base *base) { }
 #endif
+}
 
 
 /*
@@ -1696,7 +1704,7 @@ void run_local_timers(void)
 	hrtimer_run_queues();
 	/* Raise the softirq only if required. */
 	if (time_before(jiffies, base->clk)) {
-		if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+		if (!IS_ENABLED(CONFIG_NO_HZ_COMMON))
 			return;
 		/* CPU is awake, so check the deferrable base. */
 		base++;
diff --git a/kernel/torture.c b/kernel/torture.c
index 637e172..37b94012 100644
--- a/kernel/torture.c
+++ b/kernel/torture.c
@@ -47,6 +47,7 @@
 #include <linux/ktime.h>
 #include <asm/byteorder.h>
 #include <linux/torture.h>
+#include "rcu/rcu.h"
 
 MODULE_LICENSE("GPL");
 MODULE_AUTHOR("Paul E. McKenney <paulmck@us.ibm.com>");
@@ -60,7 +61,6 @@ static bool verbose;
 #define FULLSTOP_RMMOD    2	/* Normal rmmod of torture. */
 static int fullstop = FULLSTOP_RMMOD;
 static DEFINE_MUTEX(fullstop_mutex);
-static int *torture_runnable;
 
 #ifdef CONFIG_HOTPLUG_CPU
 
@@ -500,7 +500,7 @@ static int torture_shutdown(void *arg)
 		torture_shutdown_hook();
 	else
 		VERBOSE_TOROUT_STRING("No torture_shutdown_hook(), skipping.");
-	ftrace_dump(DUMP_ALL);
+	rcu_ftrace_dump(DUMP_ALL);
 	kernel_power_off();	/* Shut down the system. */
 	return 0;
 }
@@ -572,17 +572,19 @@ static int stutter;
  */
 void stutter_wait(const char *title)
 {
+	int spt;
+
 	cond_resched_rcu_qs();
-	while (READ_ONCE(stutter_pause_test) ||
-	       (torture_runnable && !READ_ONCE(*torture_runnable))) {
-		if (stutter_pause_test)
-			if (READ_ONCE(stutter_pause_test) == 1)
-				schedule_timeout_interruptible(1);
-			else
-				while (READ_ONCE(stutter_pause_test))
-					cond_resched();
-		else
+	spt = READ_ONCE(stutter_pause_test);
+	for (; spt; spt = READ_ONCE(stutter_pause_test)) {
+		if (spt == 1) {
+			schedule_timeout_interruptible(1);
+		} else if (spt == 2) {
+			while (READ_ONCE(stutter_pause_test))
+				cond_resched();
+		} else {
 			schedule_timeout_interruptible(round_jiffies_relative(HZ));
+		}
 		torture_shutdown_absorb(title);
 	}
 }
@@ -596,17 +598,15 @@ static int torture_stutter(void *arg)
 {
 	VERBOSE_TOROUT_STRING("torture_stutter task started");
 	do {
-		if (!torture_must_stop()) {
-			if (stutter > 1) {
-				schedule_timeout_interruptible(stutter - 1);
-				WRITE_ONCE(stutter_pause_test, 2);
-			}
-			schedule_timeout_interruptible(1);
+		if (!torture_must_stop() && stutter > 1) {
 			WRITE_ONCE(stutter_pause_test, 1);
+			schedule_timeout_interruptible(stutter - 1);
+			WRITE_ONCE(stutter_pause_test, 2);
+			schedule_timeout_interruptible(1);
 		}
+		WRITE_ONCE(stutter_pause_test, 0);
 		if (!torture_must_stop())
 			schedule_timeout_interruptible(stutter);
-		WRITE_ONCE(stutter_pause_test, 0);
 		torture_shutdown_absorb("torture_stutter");
 	} while (!torture_must_stop());
 	torture_kthread_stopping("torture_stutter");
@@ -647,7 +647,7 @@ static void torture_stutter_cleanup(void)
  * The runnable parameter points to a flag that controls whether or not
  * the test is currently runnable.  If there is no such flag, pass in NULL.
  */
-bool torture_init_begin(char *ttype, bool v, int *runnable)
+bool torture_init_begin(char *ttype, bool v)
 {
 	mutex_lock(&fullstop_mutex);
 	if (torture_type != NULL) {
@@ -659,7 +659,6 @@ bool torture_init_begin(char *ttype, bool v, int *runnable)
 	}
 	torture_type = ttype;
 	verbose = v;
-	torture_runnable = runnable;
 	fullstop = FULLSTOP_DONTSTOP;
 	return true;
 }
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 904c952..f54dc62 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -355,7 +355,7 @@
 	  on if you need to profile the system's use of these macros.
 
 config PROFILE_ALL_BRANCHES
-	bool "Profile all if conditionals"
+	bool "Profile all if conditionals" if !FORTIFY_SOURCE
 	select TRACE_BRANCH_PROFILING
 	help
 	  This tracer profiles all branch conditions. Every if ()
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index ccdf366..554b517 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -1119,15 +1119,11 @@ static struct ftrace_ops global_ops = {
 };
 
 /*
- * This is used by __kernel_text_address() to return true if the
- * address is on a dynamically allocated trampoline that would
- * not return true for either core_kernel_text() or
- * is_module_text_address().
+ * Used by the stack undwinder to know about dynamic ftrace trampolines.
  */
-bool is_ftrace_trampoline(unsigned long addr)
+struct ftrace_ops *ftrace_ops_trampoline(unsigned long addr)
 {
-	struct ftrace_ops *op;
-	bool ret = false;
+	struct ftrace_ops *op = NULL;
 
 	/*
 	 * Some of the ops may be dynamically allocated,
@@ -1144,15 +1140,24 @@ bool is_ftrace_trampoline(unsigned long addr)
 		if (op->trampoline && op->trampoline_size)
 			if (addr >= op->trampoline &&
 			    addr < op->trampoline + op->trampoline_size) {
-				ret = true;
-				goto out;
+				preempt_enable_notrace();
+				return op;
 			}
 	} while_for_each_ftrace_op(op);
-
- out:
 	preempt_enable_notrace();
 
-	return ret;
+	return NULL;
+}
+
+/*
+ * This is used by __kernel_text_address() to return true if the
+ * address is on a dynamically allocated trampoline that would
+ * not return true for either core_kernel_text() or
+ * is_module_text_address().
+ */
+bool is_ftrace_trampoline(unsigned long addr)
+{
+	return ftrace_ops_trampoline(addr) != NULL;
 }
 
 struct ftrace_page {
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 9ab1899..5af2842 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -2534,29 +2534,58 @@ rb_wakeups(struct ring_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
  * The lock and unlock are done within a preempt disable section.
  * The current_context per_cpu variable can only be modified
  * by the current task between lock and unlock. But it can
- * be modified more than once via an interrupt. There are four
- * different contexts that we need to consider.
+ * be modified more than once via an interrupt. To pass this
+ * information from the lock to the unlock without having to
+ * access the 'in_interrupt()' functions again (which do show
+ * a bit of overhead in something as critical as function tracing,
+ * we use a bitmask trick.
  *
- *  Normal context.
- *  SoftIRQ context
- *  IRQ context
- *  NMI context
+ *  bit 0 =  NMI context
+ *  bit 1 =  IRQ context
+ *  bit 2 =  SoftIRQ context
+ *  bit 3 =  normal context.
  *
- * If for some reason the ring buffer starts to recurse, we
- * only allow that to happen at most 4 times (one for each
- * context). If it happens 5 times, then we consider this a
- * recusive loop and do not let it go further.
+ * This works because this is the order of contexts that can
+ * preempt other contexts. A SoftIRQ never preempts an IRQ
+ * context.
+ *
+ * When the context is determined, the corresponding bit is
+ * checked and set (if it was set, then a recursion of that context
+ * happened).
+ *
+ * On unlock, we need to clear this bit. To do so, just subtract
+ * 1 from the current_context and AND it to itself.
+ *
+ * (binary)
+ *  101 - 1 = 100
+ *  101 & 100 = 100 (clearing bit zero)
+ *
+ *  1010 - 1 = 1001
+ *  1010 & 1001 = 1000 (clearing bit 1)
+ *
+ * The least significant bit can be cleared this way, and it
+ * just so happens that it is the same bit corresponding to
+ * the current context.
  */
 
 static __always_inline int
 trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
 {
-	if (cpu_buffer->current_context >= 4)
+	unsigned int val = cpu_buffer->current_context;
+	unsigned long pc = preempt_count();
+	int bit;
+
+	if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
+		bit = RB_CTX_NORMAL;
+	else
+		bit = pc & NMI_MASK ? RB_CTX_NMI :
+			pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ;
+
+	if (unlikely(val & (1 << bit)))
 		return 1;
 
-	cpu_buffer->current_context++;
-	/* Interrupts must see this update */
-	barrier();
+	val |= (1 << bit);
+	cpu_buffer->current_context = val;
 
 	return 0;
 }
@@ -2564,9 +2593,7 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
 static __always_inline void
 trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer)
 {
-	/* Don't let the dec leak out */
-	barrier();
-	cpu_buffer->current_context--;
+	cpu_buffer->current_context &= cpu_buffer->current_context - 1;
 }
 
 /**
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 2a8d8a29..4f3a8e2 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2374,6 +2374,15 @@ void trace_event_buffer_commit(struct trace_event_buffer *fbuffer)
 }
 EXPORT_SYMBOL_GPL(trace_event_buffer_commit);
 
+/*
+ * Skip 3:
+ *
+ *   trace_buffer_unlock_commit_regs()
+ *   trace_event_buffer_commit()
+ *   trace_event_raw_event_xxx()
+*/
+# define STACK_SKIP 3
+
 void trace_buffer_unlock_commit_regs(struct trace_array *tr,
 				     struct ring_buffer *buffer,
 				     struct ring_buffer_event *event,
@@ -2383,16 +2392,12 @@ void trace_buffer_unlock_commit_regs(struct trace_array *tr,
 	__buffer_unlock_commit(buffer, event);
 
 	/*
-	 * If regs is not set, then skip the following callers:
-	 *   trace_buffer_unlock_commit_regs
-	 *   event_trigger_unlock_commit
-	 *   trace_event_buffer_commit
-	 *   trace_event_raw_event_sched_switch
+	 * If regs is not set, then skip the necessary functions.
 	 * Note, we can still get here via blktrace, wakeup tracer
 	 * and mmiotrace, but that's ok if they lose a function or
-	 * two. They are that meaningful.
+	 * two. They are not that meaningful.
 	 */
-	ftrace_trace_stack(tr, buffer, flags, regs ? 0 : 4, pc, regs);
+	ftrace_trace_stack(tr, buffer, flags, regs ? 0 : STACK_SKIP, pc, regs);
 	ftrace_trace_userstack(buffer, flags, pc);
 }
 
@@ -2579,11 +2584,13 @@ static void __ftrace_trace_stack(struct ring_buffer *buffer,
 	trace.skip		= skip;
 
 	/*
-	 * Add two, for this function and the call to save_stack_trace()
+	 * Add one, for this function and the call to save_stack_trace()
 	 * If regs is set, then these functions will not be in the way.
 	 */
+#ifndef CONFIG_UNWINDER_ORC
 	if (!regs)
-		trace.skip += 2;
+		trace.skip++;
+#endif
 
 	/*
 	 * Since events can happen in NMIs there's no safe way to
@@ -2682,17 +2689,6 @@ void __trace_stack(struct trace_array *tr, unsigned long flags, int skip,
 	if (unlikely(in_nmi()))
 		return;
 
-	/*
-	 * It is possible that a function is being traced in a
-	 * location that RCU is not watching. A call to
-	 * rcu_irq_enter() will make sure that it is, but there's
-	 * a few internal rcu functions that could be traced
-	 * where that wont work either. In those cases, we just
-	 * do nothing.
-	 */
-	if (unlikely(rcu_irq_enter_disabled()))
-		return;
-
 	rcu_irq_enter_irqson();
 	__ftrace_trace_stack(buffer, flags, skip, pc, NULL);
 	rcu_irq_exit_irqson();
@@ -2711,11 +2707,10 @@ void trace_dump_stack(int skip)
 
 	local_save_flags(flags);
 
-	/*
-	 * Skip 3 more, seems to get us at the caller of
-	 * this function.
-	 */
-	skip += 3;
+#ifndef CONFIG_UNWINDER_ORC
+	/* Skip 1 to skip this function. */
+	skip++;
+#endif
 	__ftrace_trace_stack(global_trace.trace_buffer.buffer,
 			     flags, skip, preempt_count(), NULL);
 }
diff --git a/kernel/trace/trace_benchmark.c b/kernel/trace/trace_benchmark.c
index 79f838a..22fee76 100644
--- a/kernel/trace/trace_benchmark.c
+++ b/kernel/trace/trace_benchmark.c
@@ -165,7 +165,7 @@ static int benchmark_event_kthread(void *arg)
 		 * this thread will never voluntarily schedule which would
 		 * block synchronize_rcu_tasks() indefinitely.
 		 */
-		cond_resched_rcu_qs();
+		cond_resched();
 	}
 
 	return 0;
diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c
index ec0f9aa..1b87157e 100644
--- a/kernel/trace/trace_events.c
+++ b/kernel/trace/trace_events.c
@@ -2213,6 +2213,7 @@ void trace_event_eval_update(struct trace_eval_map **map, int len)
 {
 	struct trace_event_call *call, *p;
 	const char *last_system = NULL;
+	bool first = false;
 	int last_i;
 	int i;
 
@@ -2220,15 +2221,28 @@ void trace_event_eval_update(struct trace_eval_map **map, int len)
 	list_for_each_entry_safe(call, p, &ftrace_events, list) {
 		/* events are usually grouped together with systems */
 		if (!last_system || call->class->system != last_system) {
+			first = true;
 			last_i = 0;
 			last_system = call->class->system;
 		}
 
+		/*
+		 * Since calls are grouped by systems, the likelyhood that the
+		 * next call in the iteration belongs to the same system as the
+		 * previous call is high. As an optimization, we skip seaching
+		 * for a map[] that matches the call's system if the last call
+		 * was from the same system. That's what last_i is for. If the
+		 * call has the same system as the previous call, then last_i
+		 * will be the index of the first map[] that has a matching
+		 * system.
+		 */
 		for (i = last_i; i < len; i++) {
 			if (call->class->system == map[i]->system) {
 				/* Save the first system if need be */
-				if (!last_i)
+				if (first) {
 					last_i = i;
+					first = false;
+				}
 				update_event_printk(call, map[i]);
 			}
 		}
diff --git a/kernel/trace/trace_events_trigger.c b/kernel/trace/trace_events_trigger.c
index f2ac9d4..8741148 100644
--- a/kernel/trace/trace_events_trigger.c
+++ b/kernel/trace/trace_events_trigger.c
@@ -1123,13 +1123,22 @@ static __init int register_trigger_snapshot_cmd(void) { return 0; }
 #endif /* CONFIG_TRACER_SNAPSHOT */
 
 #ifdef CONFIG_STACKTRACE
-/*
- * Skip 3:
- *   stacktrace_trigger()
+#ifdef CONFIG_UNWINDER_ORC
+/* Skip 2:
  *   event_triggers_post_call()
  *   trace_event_raw_event_xxx()
  */
-#define STACK_SKIP 3
+# define STACK_SKIP 2
+#else
+/*
+ * Skip 4:
+ *   stacktrace_trigger()
+ *   event_triggers_post_call()
+ *   trace_event_buffer_commit()
+ *   trace_event_raw_event_xxx()
+ */
+#define STACK_SKIP 4
+#endif
 
 static void
 stacktrace_trigger(struct event_trigger_data *data, void *rec)
diff --git a/kernel/trace/trace_functions.c b/kernel/trace/trace_functions.c
index 27f7ad1..b611cd3 100644
--- a/kernel/trace/trace_functions.c
+++ b/kernel/trace/trace_functions.c
@@ -154,6 +154,24 @@ function_trace_call(unsigned long ip, unsigned long parent_ip,
 	preempt_enable_notrace();
 }
 
+#ifdef CONFIG_UNWINDER_ORC
+/*
+ * Skip 2:
+ *
+ *   function_stack_trace_call()
+ *   ftrace_call()
+ */
+#define STACK_SKIP 2
+#else
+/*
+ * Skip 3:
+ *   __trace_stack()
+ *   function_stack_trace_call()
+ *   ftrace_call()
+ */
+#define STACK_SKIP 3
+#endif
+
 static void
 function_stack_trace_call(unsigned long ip, unsigned long parent_ip,
 			  struct ftrace_ops *op, struct pt_regs *pt_regs)
@@ -180,15 +198,7 @@ function_stack_trace_call(unsigned long ip, unsigned long parent_ip,
 	if (likely(disabled == 1)) {
 		pc = preempt_count();
 		trace_function(tr, ip, parent_ip, flags, pc);
-		/*
-		 * skip over 5 funcs:
-		 *    __ftrace_trace_stack,
-		 *    __trace_stack,
-		 *    function_stack_trace_call
-		 *    ftrace_list_func
-		 *    ftrace_call
-		 */
-		__trace_stack(tr, flags, 5, pc);
+		__trace_stack(tr, flags, STACK_SKIP, pc);
 	}
 
 	atomic_dec(&data->disabled);
@@ -367,14 +377,27 @@ ftrace_traceoff(unsigned long ip, unsigned long parent_ip,
 	tracer_tracing_off(tr);
 }
 
+#ifdef CONFIG_UNWINDER_ORC
 /*
- * Skip 4:
- *   ftrace_stacktrace()
+ * Skip 3:
+ *
  *   function_trace_probe_call()
- *   ftrace_ops_list_func()
+ *   ftrace_ops_assist_func()
  *   ftrace_call()
  */
-#define STACK_SKIP 4
+#define FTRACE_STACK_SKIP 3
+#else
+/*
+ * Skip 5:
+ *
+ *   __trace_stack()
+ *   ftrace_stacktrace()
+ *   function_trace_probe_call()
+ *   ftrace_ops_assist_func()
+ *   ftrace_call()
+ */
+#define FTRACE_STACK_SKIP 5
+#endif
 
 static __always_inline void trace_stack(struct trace_array *tr)
 {
@@ -384,7 +407,7 @@ static __always_inline void trace_stack(struct trace_array *tr)
 	local_save_flags(flags);
 	pc = preempt_count();
 
-	__trace_stack(tr, flags, STACK_SKIP, pc);
+	__trace_stack(tr, flags, FTRACE_STACK_SKIP, pc);
 }
 
 static void
diff --git a/kernel/tracepoint.c b/kernel/tracepoint.c
index 685c50a..671b134 100644
--- a/kernel/tracepoint.c
+++ b/kernel/tracepoint.c
@@ -212,11 +212,10 @@ static int tracepoint_add_func(struct tracepoint *tp,
 	}
 
 	/*
-	 * rcu_assign_pointer has a smp_wmb() which makes sure that the new
-	 * probe callbacks array is consistent before setting a pointer to it.
-	 * This array is referenced by __DO_TRACE from
-	 * include/linux/tracepoints.h. A matching smp_read_barrier_depends()
-	 * is used.
+	 * rcu_assign_pointer has as smp_store_release() which makes sure
+	 * that the new probe callbacks array is consistent before setting
+	 * a pointer to it.  This array is referenced by __DO_TRACE from
+	 * include/linux/tracepoint.h using rcu_dereference_sched().
 	 */
 	rcu_assign_pointer(tp->funcs, tp_funcs);
 	if (!static_key_enabled(&tp->key))
diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 43d18cb..8c34981 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -48,6 +48,7 @@
 #include <linux/moduleparam.h>
 #include <linux/uaccess.h>
 #include <linux/sched/isolation.h>
+#include <linux/nmi.h>
 
 #include "workqueue_internal.h"
 
@@ -2135,7 +2136,7 @@ __acquires(&pool->lock)
 	 * stop_machine. At the same time, report a quiescent RCU state so
 	 * the same condition doesn't freeze RCU.
 	 */
-	cond_resched_rcu_qs();
+	cond_resched();
 
 	spin_lock_irq(&pool->lock);
 
@@ -4463,6 +4464,12 @@ void show_workqueue_state(void)
 			if (pwq->nr_active || !list_empty(&pwq->delayed_works))
 				show_pwq(pwq);
 			spin_unlock_irqrestore(&pwq->pool->lock, flags);
+			/*
+			 * We could be printing a lot from atomic context, e.g.
+			 * sysrq-t -> show_workqueue_state(). Avoid triggering
+			 * hard lockup.
+			 */
+			touch_nmi_watchdog();
 		}
 	}
 
@@ -4490,6 +4497,12 @@ void show_workqueue_state(void)
 		pr_cont("\n");
 	next_pool:
 		spin_unlock_irqrestore(&pool->lock, flags);
+		/*
+		 * We could be printing a lot from atomic context, e.g.
+		 * sysrq-t -> show_workqueue_state(). Avoid triggering
+		 * hard lockup.
+		 */
+		touch_nmi_watchdog();
 	}
 
 	rcu_read_unlock_sched();
diff --git a/lib/Kconfig b/lib/Kconfig
index c5e84fb..4dd5c11 100644
--- a/lib/Kconfig
+++ b/lib/Kconfig
@@ -409,6 +409,10 @@
 	depends on !NO_DMA
 	default y
 
+config SGL_ALLOC
+	bool
+	default n
+
 config DMA_NOOP_OPS
 	bool
 	depends on HAS_DMA && (!64BIT || ARCH_DMA_ADDR_T_64BIT)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 9d5b78a..2a1f28c 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1952,7 +1952,7 @@
 	bool "Filter access to /dev/mem"
 	depends on MMU && DEVMEM
 	depends on ARCH_HAS_DEVMEM_IS_ALLOWED
-	default y if TILE || PPC
+	default y if TILE || PPC || X86 || ARM64
 	---help---
 	  If this option is disabled, you allow userspace (root) access to all
 	  of memory, including kernel and userspace memory. Accidental
diff --git a/lib/assoc_array.c b/lib/assoc_array.c
index b77d51d..c6659cb 100644
--- a/lib/assoc_array.c
+++ b/lib/assoc_array.c
@@ -38,12 +38,10 @@ static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
 	if (assoc_array_ptr_is_shortcut(cursor)) {
 		/* Descend through a shortcut */
 		shortcut = assoc_array_ptr_to_shortcut(cursor);
-		smp_read_barrier_depends();
-		cursor = READ_ONCE(shortcut->next_node);
+		cursor = READ_ONCE(shortcut->next_node); /* Address dependency. */
 	}
 
 	node = assoc_array_ptr_to_node(cursor);
-	smp_read_barrier_depends();
 	slot = 0;
 
 	/* We perform two passes of each node.
@@ -55,15 +53,12 @@ static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
 	 */
 	has_meta = 0;
 	for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
-		ptr = READ_ONCE(node->slots[slot]);
+		ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
 		has_meta |= (unsigned long)ptr;
 		if (ptr && assoc_array_ptr_is_leaf(ptr)) {
-			/* We need a barrier between the read of the pointer
-			 * and dereferencing the pointer - but only if we are
-			 * actually going to dereference it.
+			/* We need a barrier between the read of the pointer,
+			 * which is supplied by the above READ_ONCE().
 			 */
-			smp_read_barrier_depends();
-
 			/* Invoke the callback */
 			ret = iterator(assoc_array_ptr_to_leaf(ptr),
 				       iterator_data);
@@ -86,10 +81,8 @@ static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
 
 continue_node:
 	node = assoc_array_ptr_to_node(cursor);
-	smp_read_barrier_depends();
-
 	for (; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
-		ptr = READ_ONCE(node->slots[slot]);
+		ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
 		if (assoc_array_ptr_is_meta(ptr)) {
 			cursor = ptr;
 			goto begin_node;
@@ -98,16 +91,15 @@ static int assoc_array_subtree_iterate(const struct assoc_array_ptr *root,
 
 finished_node:
 	/* Move up to the parent (may need to skip back over a shortcut) */
-	parent = READ_ONCE(node->back_pointer);
+	parent = READ_ONCE(node->back_pointer); /* Address dependency. */
 	slot = node->parent_slot;
 	if (parent == stop)
 		return 0;
 
 	if (assoc_array_ptr_is_shortcut(parent)) {
 		shortcut = assoc_array_ptr_to_shortcut(parent);
-		smp_read_barrier_depends();
 		cursor = parent;
-		parent = READ_ONCE(shortcut->back_pointer);
+		parent = READ_ONCE(shortcut->back_pointer); /* Address dependency. */
 		slot = shortcut->parent_slot;
 		if (parent == stop)
 			return 0;
@@ -147,7 +139,7 @@ int assoc_array_iterate(const struct assoc_array *array,
 					void *iterator_data),
 			void *iterator_data)
 {
-	struct assoc_array_ptr *root = READ_ONCE(array->root);
+	struct assoc_array_ptr *root = READ_ONCE(array->root); /* Address dependency. */
 
 	if (!root)
 		return 0;
@@ -194,7 +186,7 @@ assoc_array_walk(const struct assoc_array *array,
 
 	pr_devel("-->%s()\n", __func__);
 
-	cursor = READ_ONCE(array->root);
+	cursor = READ_ONCE(array->root);  /* Address dependency. */
 	if (!cursor)
 		return assoc_array_walk_tree_empty;
 
@@ -216,11 +208,9 @@ assoc_array_walk(const struct assoc_array *array,
 
 consider_node:
 	node = assoc_array_ptr_to_node(cursor);
-	smp_read_barrier_depends();
-
 	slot = segments >> (level & ASSOC_ARRAY_KEY_CHUNK_MASK);
 	slot &= ASSOC_ARRAY_FAN_MASK;
-	ptr = READ_ONCE(node->slots[slot]);
+	ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
 
 	pr_devel("consider slot %x [ix=%d type=%lu]\n",
 		 slot, level, (unsigned long)ptr & 3);
@@ -254,7 +244,6 @@ assoc_array_walk(const struct assoc_array *array,
 	cursor = ptr;
 follow_shortcut:
 	shortcut = assoc_array_ptr_to_shortcut(cursor);
-	smp_read_barrier_depends();
 	pr_devel("shortcut to %d\n", shortcut->skip_to_level);
 	sc_level = level + ASSOC_ARRAY_LEVEL_STEP;
 	BUG_ON(sc_level > shortcut->skip_to_level);
@@ -294,7 +283,7 @@ assoc_array_walk(const struct assoc_array *array,
 	} while (sc_level < shortcut->skip_to_level);
 
 	/* The shortcut matches the leaf's index to this point. */
-	cursor = READ_ONCE(shortcut->next_node);
+	cursor = READ_ONCE(shortcut->next_node); /* Address dependency. */
 	if (((level ^ sc_level) & ~ASSOC_ARRAY_KEY_CHUNK_MASK) != 0) {
 		level = sc_level;
 		goto jumped;
@@ -331,20 +320,18 @@ void *assoc_array_find(const struct assoc_array *array,
 		return NULL;
 
 	node = result.terminal_node.node;
-	smp_read_barrier_depends();
 
 	/* If the target key is available to us, it's has to be pointed to by
 	 * the terminal node.
 	 */
 	for (slot = 0; slot < ASSOC_ARRAY_FAN_OUT; slot++) {
-		ptr = READ_ONCE(node->slots[slot]);
+		ptr = READ_ONCE(node->slots[slot]); /* Address dependency. */
 		if (ptr && assoc_array_ptr_is_leaf(ptr)) {
 			/* We need a barrier between the read of the pointer
 			 * and dereferencing the pointer - but only if we are
 			 * actually going to dereference it.
 			 */
 			leaf = assoc_array_ptr_to_leaf(ptr);
-			smp_read_barrier_depends();
 			if (ops->compare_object(leaf, index_key))
 				return (void *)leaf;
 		}
diff --git a/lib/crc-ccitt.c b/lib/crc-ccitt.c
index 7f6dd68..d873b34 100644
--- a/lib/crc-ccitt.c
+++ b/lib/crc-ccitt.c
@@ -51,8 +51,49 @@ u16 const crc_ccitt_table[256] = {
 };
 EXPORT_SYMBOL(crc_ccitt_table);
 
+/*
+ * Similar table to calculate CRC16 variant known as CRC-CCITT-FALSE
+ * Reflected bits order, does not augment final value.
+ */
+u16 const crc_ccitt_false_table[256] = {
+    0x0000, 0x1021, 0x2042, 0x3063, 0x4084, 0x50A5, 0x60C6, 0x70E7,
+    0x8108, 0x9129, 0xA14A, 0xB16B, 0xC18C, 0xD1AD, 0xE1CE, 0xF1EF,
+    0x1231, 0x0210, 0x3273, 0x2252, 0x52B5, 0x4294, 0x72F7, 0x62D6,
+    0x9339, 0x8318, 0xB37B, 0xA35A, 0xD3BD, 0xC39C, 0xF3FF, 0xE3DE,
+    0x2462, 0x3443, 0x0420, 0x1401, 0x64E6, 0x74C7, 0x44A4, 0x5485,
+    0xA56A, 0xB54B, 0x8528, 0x9509, 0xE5EE, 0xF5CF, 0xC5AC, 0xD58D,
+    0x3653, 0x2672, 0x1611, 0x0630, 0x76D7, 0x66F6, 0x5695, 0x46B4,
+    0xB75B, 0xA77A, 0x9719, 0x8738, 0xF7DF, 0xE7FE, 0xD79D, 0xC7BC,
+    0x48C4, 0x58E5, 0x6886, 0x78A7, 0x0840, 0x1861, 0x2802, 0x3823,
+    0xC9CC, 0xD9ED, 0xE98E, 0xF9AF, 0x8948, 0x9969, 0xA90A, 0xB92B,
+    0x5AF5, 0x4AD4, 0x7AB7, 0x6A96, 0x1A71, 0x0A50, 0x3A33, 0x2A12,
+    0xDBFD, 0xCBDC, 0xFBBF, 0xEB9E, 0x9B79, 0x8B58, 0xBB3B, 0xAB1A,
+    0x6CA6, 0x7C87, 0x4CE4, 0x5CC5, 0x2C22, 0x3C03, 0x0C60, 0x1C41,
+    0xEDAE, 0xFD8F, 0xCDEC, 0xDDCD, 0xAD2A, 0xBD0B, 0x8D68, 0x9D49,
+    0x7E97, 0x6EB6, 0x5ED5, 0x4EF4, 0x3E13, 0x2E32, 0x1E51, 0x0E70,
+    0xFF9F, 0xEFBE, 0xDFDD, 0xCFFC, 0xBF1B, 0xAF3A, 0x9F59, 0x8F78,
+    0x9188, 0x81A9, 0xB1CA, 0xA1EB, 0xD10C, 0xC12D, 0xF14E, 0xE16F,
+    0x1080, 0x00A1, 0x30C2, 0x20E3, 0x5004, 0x4025, 0x7046, 0x6067,
+    0x83B9, 0x9398, 0xA3FB, 0xB3DA, 0xC33D, 0xD31C, 0xE37F, 0xF35E,
+    0x02B1, 0x1290, 0x22F3, 0x32D2, 0x4235, 0x5214, 0x6277, 0x7256,
+    0xB5EA, 0xA5CB, 0x95A8, 0x8589, 0xF56E, 0xE54F, 0xD52C, 0xC50D,
+    0x34E2, 0x24C3, 0x14A0, 0x0481, 0x7466, 0x6447, 0x5424, 0x4405,
+    0xA7DB, 0xB7FA, 0x8799, 0x97B8, 0xE75F, 0xF77E, 0xC71D, 0xD73C,
+    0x26D3, 0x36F2, 0x0691, 0x16B0, 0x6657, 0x7676, 0x4615, 0x5634,
+    0xD94C, 0xC96D, 0xF90E, 0xE92F, 0x99C8, 0x89E9, 0xB98A, 0xA9AB,
+    0x5844, 0x4865, 0x7806, 0x6827, 0x18C0, 0x08E1, 0x3882, 0x28A3,
+    0xCB7D, 0xDB5C, 0xEB3F, 0xFB1E, 0x8BF9, 0x9BD8, 0xABBB, 0xBB9A,
+    0x4A75, 0x5A54, 0x6A37, 0x7A16, 0x0AF1, 0x1AD0, 0x2AB3, 0x3A92,
+    0xFD2E, 0xED0F, 0xDD6C, 0xCD4D, 0xBDAA, 0xAD8B, 0x9DE8, 0x8DC9,
+    0x7C26, 0x6C07, 0x5C64, 0x4C45, 0x3CA2, 0x2C83, 0x1CE0, 0x0CC1,
+    0xEF1F, 0xFF3E, 0xCF5D, 0xDF7C, 0xAF9B, 0xBFBA, 0x8FD9, 0x9FF8,
+    0x6E17, 0x7E36, 0x4E55, 0x5E74, 0x2E93, 0x3EB2, 0x0ED1, 0x1EF0
+};
+EXPORT_SYMBOL(crc_ccitt_false_table);
+
 /**
- *	crc_ccitt - recompute the CRC for the data buffer
+ *	crc_ccitt - recompute the CRC (CRC-CCITT variant) for the data
+ *	buffer
  *	@crc: previous CRC value
  *	@buffer: data pointer
  *	@len: number of bytes in the buffer
@@ -65,5 +106,20 @@ u16 crc_ccitt(u16 crc, u8 const *buffer, size_t len)
 }
 EXPORT_SYMBOL(crc_ccitt);
 
+/**
+ *	crc_ccitt_false - recompute the CRC (CRC-CCITT-FALSE variant)
+ *	for the data buffer
+ *	@crc: previous CRC value
+ *	@buffer: data pointer
+ *	@len: number of bytes in the buffer
+ */
+u16 crc_ccitt_false(u16 crc, u8 const *buffer, size_t len)
+{
+	while (len--)
+		crc = crc_ccitt_false_byte(crc, *buffer++);
+	return crc;
+}
+EXPORT_SYMBOL(crc_ccitt_false);
+
 MODULE_DESCRIPTION("CRC-CCITT calculations");
 MODULE_LICENSE("GPL");
diff --git a/lib/mpi/longlong.h b/lib/mpi/longlong.h
index 57fd45a..08c60d1 100644
--- a/lib/mpi/longlong.h
+++ b/lib/mpi/longlong.h
@@ -671,7 +671,23 @@ do {						\
 	**************  MIPS/64  **************
 	***************************************/
 #if (defined(__mips) && __mips >= 3) && W_TYPE_SIZE == 64
-#if (__GNUC__ >= 5) || (__GNUC__ >= 4 && __GNUC_MINOR__ >= 4)
+#if defined(__mips_isa_rev) && __mips_isa_rev >= 6
+/*
+ * GCC ends up emitting a __multi3 intrinsic call for MIPS64r6 with the plain C
+ * code below, so we special case MIPS64r6 until the compiler can do better.
+ */
+#define umul_ppmm(w1, w0, u, v)						\
+do {									\
+	__asm__ ("dmulu %0,%1,%2"					\
+		 : "=d" ((UDItype)(w0))					\
+		 : "d" ((UDItype)(u)),					\
+		   "d" ((UDItype)(v)));					\
+	__asm__ ("dmuhu %0,%1,%2"					\
+		 : "=d" ((UDItype)(w1))					\
+		 : "d" ((UDItype)(u)),					\
+		   "d" ((UDItype)(v)));					\
+} while (0)
+#elif (__GNUC__ >= 5) || (__GNUC__ >= 4 && __GNUC_MINOR__ >= 4)
 #define umul_ppmm(w1, w0, u, v) \
 do {									\
 	typedef unsigned int __ll_UTItype __attribute__((mode(TI)));	\
diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index fe03c6d..30e7dd8 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -197,10 +197,10 @@ static void __percpu_ref_switch_to_percpu(struct percpu_ref *ref)
 	atomic_long_add(PERCPU_COUNT_BIAS, &ref->count);
 
 	/*
-	 * Restore per-cpu operation.  smp_store_release() is paired with
-	 * smp_read_barrier_depends() in __ref_is_percpu() and guarantees
-	 * that the zeroing is visible to all percpu accesses which can see
-	 * the following __PERCPU_REF_ATOMIC clearing.
+	 * Restore per-cpu operation.  smp_store_release() is paired
+	 * with READ_ONCE() in __ref_is_percpu() and guarantees that the
+	 * zeroing is visible to all percpu accesses which can see the
+	 * following __PERCPU_REF_ATOMIC clearing.
 	 */
 	for_each_possible_cpu(cpu)
 		*per_cpu_ptr(percpu_count, cpu) = 0;
diff --git a/lib/sbitmap.c b/lib/sbitmap.c
index 80aa8d5..42b5ca0 100644
--- a/lib/sbitmap.c
+++ b/lib/sbitmap.c
@@ -462,7 +462,7 @@ static void sbq_wake_up(struct sbitmap_queue *sbq)
 		 */
 		atomic_cmpxchg(&ws->wait_cnt, wait_cnt, wait_cnt + wake_batch);
 		sbq_index_atomic_inc(&sbq->wake_index);
-		wake_up(&ws->wait);
+		wake_up_nr(&ws->wait, wake_batch);
 	}
 }
 
diff --git a/lib/scatterlist.c b/lib/scatterlist.c
index 7c1c55f..53728d3 100644
--- a/lib/scatterlist.c
+++ b/lib/scatterlist.c
@@ -474,6 +474,133 @@ int sg_alloc_table_from_pages(struct sg_table *sgt, struct page **pages,
 }
 EXPORT_SYMBOL(sg_alloc_table_from_pages);
 
+#ifdef CONFIG_SGL_ALLOC
+
+/**
+ * sgl_alloc_order - allocate a scatterlist and its pages
+ * @length: Length in bytes of the scatterlist. Must be at least one
+ * @order: Second argument for alloc_pages()
+ * @chainable: Whether or not to allocate an extra element in the scatterlist
+ *	for scatterlist chaining purposes
+ * @gfp: Memory allocation flags
+ * @nent_p: [out] Number of entries in the scatterlist that have pages
+ *
+ * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
+ */
+struct scatterlist *sgl_alloc_order(unsigned long long length,
+				    unsigned int order, bool chainable,
+				    gfp_t gfp, unsigned int *nent_p)
+{
+	struct scatterlist *sgl, *sg;
+	struct page *page;
+	unsigned int nent, nalloc;
+	u32 elem_len;
+
+	nent = round_up(length, PAGE_SIZE << order) >> (PAGE_SHIFT + order);
+	/* Check for integer overflow */
+	if (length > (nent << (PAGE_SHIFT + order)))
+		return NULL;
+	nalloc = nent;
+	if (chainable) {
+		/* Check for integer overflow */
+		if (nalloc + 1 < nalloc)
+			return NULL;
+		nalloc++;
+	}
+	sgl = kmalloc_array(nalloc, sizeof(struct scatterlist),
+			    (gfp & ~GFP_DMA) | __GFP_ZERO);
+	if (!sgl)
+		return NULL;
+
+	sg_init_table(sgl, nalloc);
+	sg = sgl;
+	while (length) {
+		elem_len = min_t(u64, length, PAGE_SIZE << order);
+		page = alloc_pages(gfp, order);
+		if (!page) {
+			sgl_free(sgl);
+			return NULL;
+		}
+
+		sg_set_page(sg, page, elem_len, 0);
+		length -= elem_len;
+		sg = sg_next(sg);
+	}
+	WARN_ONCE(length, "length = %lld\n", length);
+	if (nent_p)
+		*nent_p = nent;
+	return sgl;
+}
+EXPORT_SYMBOL(sgl_alloc_order);
+
+/**
+ * sgl_alloc - allocate a scatterlist and its pages
+ * @length: Length in bytes of the scatterlist
+ * @gfp: Memory allocation flags
+ * @nent_p: [out] Number of entries in the scatterlist
+ *
+ * Returns: A pointer to an initialized scatterlist or %NULL upon failure.
+ */
+struct scatterlist *sgl_alloc(unsigned long long length, gfp_t gfp,
+			      unsigned int *nent_p)
+{
+	return sgl_alloc_order(length, 0, false, gfp, nent_p);
+}
+EXPORT_SYMBOL(sgl_alloc);
+
+/**
+ * sgl_free_n_order - free a scatterlist and its pages
+ * @sgl: Scatterlist with one or more elements
+ * @nents: Maximum number of elements to free
+ * @order: Second argument for __free_pages()
+ *
+ * Notes:
+ * - If several scatterlists have been chained and each chain element is
+ *   freed separately then it's essential to set nents correctly to avoid that a
+ *   page would get freed twice.
+ * - All pages in a chained scatterlist can be freed at once by setting @nents
+ *   to a high number.
+ */
+void sgl_free_n_order(struct scatterlist *sgl, int nents, int order)
+{
+	struct scatterlist *sg;
+	struct page *page;
+	int i;
+
+	for_each_sg(sgl, sg, nents, i) {
+		if (!sg)
+			break;
+		page = sg_page(sg);
+		if (page)
+			__free_pages(page, order);
+	}
+	kfree(sgl);
+}
+EXPORT_SYMBOL(sgl_free_n_order);
+
+/**
+ * sgl_free_order - free a scatterlist and its pages
+ * @sgl: Scatterlist with one or more elements
+ * @order: Second argument for __free_pages()
+ */
+void sgl_free_order(struct scatterlist *sgl, int order)
+{
+	sgl_free_n_order(sgl, INT_MAX, order);
+}
+EXPORT_SYMBOL(sgl_free_order);
+
+/**
+ * sgl_free - free a scatterlist and its pages
+ * @sgl: Scatterlist with one or more elements
+ */
+void sgl_free(struct scatterlist *sgl)
+{
+	sgl_free_order(sgl, 0);
+}
+EXPORT_SYMBOL(sgl_free);
+
+#endif /* CONFIG_SGL_ALLOC */
+
 void __sg_page_iter_start(struct sg_page_iter *piter,
 			  struct scatterlist *sglist, unsigned int nents,
 			  unsigned long pgoffset)
diff --git a/lib/test_bpf.c b/lib/test_bpf.c
index 9e97480..f369889 100644
--- a/lib/test_bpf.c
+++ b/lib/test_bpf.c
@@ -6250,9 +6250,8 @@ static struct bpf_prog *generate_filter(int which, int *err)
 				return NULL;
 			}
 		}
-		/* We don't expect to fail. */
 		if (*err) {
-			pr_cont("FAIL to attach err=%d len=%d\n",
+			pr_cont("FAIL to prog_create err=%d len=%d\n",
 				*err, fprog.len);
 			return NULL;
 		}
@@ -6276,6 +6275,10 @@ static struct bpf_prog *generate_filter(int which, int *err)
 		 * checks.
 		 */
 		fp = bpf_prog_select_runtime(fp, err);
+		if (*err) {
+			pr_cont("FAIL to select_runtime err=%d\n", *err);
+			return NULL;
+		}
 		break;
 	}
 
@@ -6461,8 +6464,8 @@ static __init int test_bpf(void)
 				pass_cnt++;
 				continue;
 			}
-
-			return err;
+			err_cnt++;
+			continue;
 		}
 
 		pr_cont("jited:%u ", fp->jited);
diff --git a/mm/debug.c b/mm/debug.c
index d947f3e..56e2d91 100644
--- a/mm/debug.c
+++ b/mm/debug.c
@@ -50,7 +50,7 @@ void __dump_page(struct page *page, const char *reason)
 	 */
 	int mapcount = PageSlab(page) ? 0 : page_mapcount(page);
 
-	pr_emerg("page:%p count:%d mapcount:%d mapping:%p index:%#lx",
+	pr_emerg("page:%px count:%d mapcount:%d mapping:%px index:%#lx",
 		  page, page_ref_count(page), mapcount,
 		  page->mapping, page_to_pgoff(page));
 	if (PageCompound(page))
@@ -69,7 +69,7 @@ void __dump_page(struct page *page, const char *reason)
 
 #ifdef CONFIG_MEMCG
 	if (page->mem_cgroup)
-		pr_alert("page->mem_cgroup:%p\n", page->mem_cgroup);
+		pr_alert("page->mem_cgroup:%px\n", page->mem_cgroup);
 #endif
 }
 
@@ -84,10 +84,10 @@ EXPORT_SYMBOL(dump_page);
 
 void dump_vma(const struct vm_area_struct *vma)
 {
-	pr_emerg("vma %p start %p end %p\n"
-		"next %p prev %p mm %p\n"
-		"prot %lx anon_vma %p vm_ops %p\n"
-		"pgoff %lx file %p private_data %p\n"
+	pr_emerg("vma %px start %px end %px\n"
+		"next %px prev %px mm %px\n"
+		"prot %lx anon_vma %px vm_ops %px\n"
+		"pgoff %lx file %px private_data %px\n"
 		"flags: %#lx(%pGv)\n",
 		vma, (void *)vma->vm_start, (void *)vma->vm_end, vma->vm_next,
 		vma->vm_prev, vma->vm_mm,
@@ -100,27 +100,27 @@ EXPORT_SYMBOL(dump_vma);
 
 void dump_mm(const struct mm_struct *mm)
 {
-	pr_emerg("mm %p mmap %p seqnum %d task_size %lu\n"
+	pr_emerg("mm %px mmap %px seqnum %d task_size %lu\n"
 #ifdef CONFIG_MMU
-		"get_unmapped_area %p\n"
+		"get_unmapped_area %px\n"
 #endif
 		"mmap_base %lu mmap_legacy_base %lu highest_vm_end %lu\n"
-		"pgd %p mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n"
+		"pgd %px mm_users %d mm_count %d pgtables_bytes %lu map_count %d\n"
 		"hiwater_rss %lx hiwater_vm %lx total_vm %lx locked_vm %lx\n"
 		"pinned_vm %lx data_vm %lx exec_vm %lx stack_vm %lx\n"
 		"start_code %lx end_code %lx start_data %lx end_data %lx\n"
 		"start_brk %lx brk %lx start_stack %lx\n"
 		"arg_start %lx arg_end %lx env_start %lx env_end %lx\n"
-		"binfmt %p flags %lx core_state %p\n"
+		"binfmt %px flags %lx core_state %px\n"
 #ifdef CONFIG_AIO
-		"ioctx_table %p\n"
+		"ioctx_table %px\n"
 #endif
 #ifdef CONFIG_MEMCG
-		"owner %p "
+		"owner %px "
 #endif
-		"exe_file %p\n"
+		"exe_file %px\n"
 #ifdef CONFIG_MMU_NOTIFIER
-		"mmu_notifier_mm %p\n"
+		"mmu_notifier_mm %px\n"
 #endif
 #ifdef CONFIG_NUMA_BALANCING
 		"numa_next_scan %lu numa_scan_offset %lu numa_scan_seq %d\n"
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index d73c142..f656ca2 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -127,7 +127,7 @@
 /* GFP bitmask for kmemleak internal allocations */
 #define gfp_kmemleak_mask(gfp)	(((gfp) & (GFP_KERNEL | GFP_ATOMIC)) | \
 				 __GFP_NORETRY | __GFP_NOMEMALLOC | \
-				 __GFP_NOWARN)
+				 __GFP_NOWARN | __GFP_NOFAIL)
 
 /* scanning area inside a memory block */
 struct kmemleak_scan_area {
diff --git a/mm/ksm.c b/mm/ksm.c
index be8f457..c406f75 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -675,15 +675,8 @@ static struct page *get_ksm_page(struct stable_node *stable_node, bool lock_it)
 	expected_mapping = (void *)((unsigned long)stable_node |
 					PAGE_MAPPING_KSM);
 again:
-	kpfn = READ_ONCE(stable_node->kpfn);
+	kpfn = READ_ONCE(stable_node->kpfn); /* Address dependency. */
 	page = pfn_to_page(kpfn);
-
-	/*
-	 * page is computed from kpfn, so on most architectures reading
-	 * page->mapping is naturally ordered after reading node->kpfn,
-	 * but on Alpha we need to be more careful.
-	 */
-	smp_read_barrier_depends();
 	if (READ_ONCE(page->mapping) != expected_mapping)
 		goto stale;
 
diff --git a/mm/memory.c b/mm/memory.c
index ca5674c..7930046 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2857,8 +2857,11 @@ int do_swap_page(struct vm_fault *vmf)
 	int ret = 0;
 	bool vma_readahead = swap_use_vma_readahead();
 
-	if (vma_readahead)
+	if (vma_readahead) {
 		page = swap_readahead_detect(vmf, &swap_ra);
+		swapcache = page;
+	}
+
 	if (!pte_unmap_same(vma->vm_mm, vmf->pmd, vmf->pte, vmf->orig_pte)) {
 		if (page)
 			put_page(page);
@@ -2889,9 +2892,12 @@ int do_swap_page(struct vm_fault *vmf)
 
 
 	delayacct_set_flag(DELAYACCT_PF_SWAPIN);
-	if (!page)
+	if (!page) {
 		page = lookup_swap_cache(entry, vma_readahead ? vma : NULL,
 					 vmf->address);
+		swapcache = page;
+	}
+
 	if (!page) {
 		struct swap_info_struct *si = swp_swap_info(entry);
 
diff --git a/mm/mlock.c b/mm/mlock.c
index 30472d4..f7f54fd 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -779,7 +779,7 @@ static int apply_mlockall_flags(int flags)
 
 		/* Ignore errors */
 		mlock_fixup(vma, &prev, vma->vm_start, vma->vm_end, newflags);
-		cond_resched_rcu_qs();
+		cond_resched();
 	}
 out:
 	return 0;
diff --git a/mm/mprotect.c b/mm/mprotect.c
index ec39f730..58b629b 100644
--- a/mm/mprotect.c
+++ b/mm/mprotect.c
@@ -166,7 +166,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		next = pmd_addr_end(addr, end);
 		if (!is_swap_pmd(*pmd) && !pmd_trans_huge(*pmd) && !pmd_devmap(*pmd)
 				&& pmd_none_or_clear_bad(pmd))
-			continue;
+			goto next;
 
 		/* invoke the mmu notifier if the pmd is populated */
 		if (!mni_start) {
@@ -188,7 +188,7 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 					}
 
 					/* huge pmd was handled */
-					continue;
+					goto next;
 				}
 			}
 			/* fall through, the trans huge pmd just split */
@@ -196,6 +196,8 @@ static inline unsigned long change_pmd_range(struct vm_area_struct *vma,
 		this_pages = change_pte_range(vma, pmd, addr, next, newprot,
 				 dirty_accountable, prot_numa);
 		pages += this_pages;
+next:
+		cond_resched();
 	} while (pmd++, addr = next, addr != end);
 
 	if (mni_start)
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 7e5e775..76c9688 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6260,6 +6260,8 @@ void __paginginit zero_resv_unavail(void)
 	pgcnt = 0;
 	for_each_resv_unavail_range(i, &start, &end) {
 		for (pfn = PFN_DOWN(start); pfn < PFN_UP(end); pfn++) {
+			if (!pfn_valid(ALIGN_DOWN(pfn, pageblock_nr_pages)))
+				continue;
 			mm_zero_struct_page(pfn_to_page(pfn));
 			pgcnt++;
 		}
diff --git a/mm/page_io.c b/mm/page_io.c
index e93f1a4..b41cf96 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -50,7 +50,7 @@ static struct bio *get_swap_bio(gfp_t gfp_flags,
 
 void end_swap_bio_write(struct bio *bio)
 {
-	struct page *page = bio->bi_io_vec[0].bv_page;
+	struct page *page = bio_first_page_all(bio);
 
 	if (bio->bi_status) {
 		SetPageError(page);
@@ -122,7 +122,7 @@ static void swap_slot_free_notify(struct page *page)
 
 static void end_swap_bio_read(struct bio *bio)
 {
-	struct page *page = bio->bi_io_vec[0].bv_page;
+	struct page *page = bio_first_page_all(bio);
 	struct task_struct *waiter = bio->bi_private;
 
 	if (bio->bi_status) {
diff --git a/mm/page_owner.c b/mm/page_owner.c
index 8592543..270a821 100644
--- a/mm/page_owner.c
+++ b/mm/page_owner.c
@@ -616,7 +616,6 @@ static void init_early_allocated_pages(void)
 {
 	pg_data_t *pgdat;
 
-	drain_all_pages(NULL);
 	for_each_online_pgdat(pgdat)
 		init_zones_in_node(pgdat);
 }
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index d22b843..ae3c2a3 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -30,10 +30,37 @@ static bool map_pte(struct page_vma_mapped_walk *pvmw)
 	return true;
 }
 
+static inline bool pfn_in_hpage(struct page *hpage, unsigned long pfn)
+{
+	unsigned long hpage_pfn = page_to_pfn(hpage);
+
+	/* THP can be referenced by any subpage */
+	return pfn >= hpage_pfn && pfn - hpage_pfn < hpage_nr_pages(hpage);
+}
+
+/**
+ * check_pte - check if @pvmw->page is mapped at the @pvmw->pte
+ *
+ * page_vma_mapped_walk() found a place where @pvmw->page is *potentially*
+ * mapped. check_pte() has to validate this.
+ *
+ * @pvmw->pte may point to empty PTE, swap PTE or PTE pointing to arbitrary
+ * page.
+ *
+ * If PVMW_MIGRATION flag is set, returns true if @pvmw->pte contains migration
+ * entry that points to @pvmw->page or any subpage in case of THP.
+ *
+ * If PVMW_MIGRATION flag is not set, returns true if @pvmw->pte points to
+ * @pvmw->page or any subpage in case of THP.
+ *
+ * Otherwise, return false.
+ *
+ */
 static bool check_pte(struct page_vma_mapped_walk *pvmw)
 {
+	unsigned long pfn;
+
 	if (pvmw->flags & PVMW_MIGRATION) {
-#ifdef CONFIG_MIGRATION
 		swp_entry_t entry;
 		if (!is_swap_pte(*pvmw->pte))
 			return false;
@@ -41,38 +68,25 @@ static bool check_pte(struct page_vma_mapped_walk *pvmw)
 
 		if (!is_migration_entry(entry))
 			return false;
-		if (migration_entry_to_page(entry) - pvmw->page >=
-				hpage_nr_pages(pvmw->page)) {
+
+		pfn = migration_entry_to_pfn(entry);
+	} else if (is_swap_pte(*pvmw->pte)) {
+		swp_entry_t entry;
+
+		/* Handle un-addressable ZONE_DEVICE memory */
+		entry = pte_to_swp_entry(*pvmw->pte);
+		if (!is_device_private_entry(entry))
 			return false;
-		}
-		if (migration_entry_to_page(entry) < pvmw->page)
-			return false;
-#else
-		WARN_ON_ONCE(1);
-#endif
+
+		pfn = device_private_entry_to_pfn(entry);
 	} else {
-		if (is_swap_pte(*pvmw->pte)) {
-			swp_entry_t entry;
-
-			entry = pte_to_swp_entry(*pvmw->pte);
-			if (is_device_private_entry(entry) &&
-			    device_private_entry_to_page(entry) == pvmw->page)
-				return true;
-		}
-
 		if (!pte_present(*pvmw->pte))
 			return false;
 
-		/* THP can be referenced by any subpage */
-		if (pte_page(*pvmw->pte) - pvmw->page >=
-				hpage_nr_pages(pvmw->page)) {
-			return false;
-		}
-		if (pte_page(*pvmw->pte) < pvmw->page)
-			return false;
+		pfn = pte_pfn(*pvmw->pte);
 	}
 
-	return true;
+	return pfn_in_hpage(pvmw->page, pfn);
 }
 
 /**
diff --git a/mm/sparse.c b/mm/sparse.c
index 7a5daca..2609aba 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -211,7 +211,7 @@ void __init memory_present(int nid, unsigned long start, unsigned long end)
 	if (unlikely(!mem_section)) {
 		unsigned long size, align;
 
-		size = sizeof(struct mem_section) * NR_SECTION_ROOTS;
+		size = sizeof(struct mem_section*) * NR_SECTION_ROOTS;
 		align = 1 << (INTERNODE_CACHE_SHIFT);
 		mem_section = memblock_virt_alloc(size, align);
 	}
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c02c850..47d5ced 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -297,10 +297,13 @@ EXPORT_SYMBOL(register_shrinker);
  */
 void unregister_shrinker(struct shrinker *shrinker)
 {
+	if (!shrinker->nr_deferred)
+		return;
 	down_write(&shrinker_rwsem);
 	list_del(&shrinker->list);
 	up_write(&shrinker_rwsem);
 	kfree(shrinker->nr_deferred);
+	shrinker->nr_deferred = NULL;
 }
 EXPORT_SYMBOL(unregister_shrinker);
 
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 685049a..683c065 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -53,6 +53,7 @@
 #include <linux/mount.h>
 #include <linux/migrate.h>
 #include <linux/pagemap.h>
+#include <linux/fs.h>
 
 #define ZSPAGE_MAGIC	0x58
 
diff --git a/net/8021q/vlan.c b/net/8021q/vlan.c
index 8dfdd94..bad01b1 100644
--- a/net/8021q/vlan.c
+++ b/net/8021q/vlan.c
@@ -111,12 +111,7 @@ void unregister_vlan_dev(struct net_device *dev, struct list_head *head)
 		vlan_gvrp_uninit_applicant(real_dev);
 	}
 
-	/* Take it out of our own structures, but be sure to interlock with
-	 * HW accelerating devices or SW vlan input packet processing if
-	 * VLAN is not 0 (leave it there for 802.1p).
-	 */
-	if (vlan_id)
-		vlan_vid_del(real_dev, vlan->vlan_proto, vlan_id);
+	vlan_vid_del(real_dev, vlan->vlan_proto, vlan_id);
 
 	/* Get rid of the vlan's reference to real_dev */
 	dev_put(real_dev);
diff --git a/net/9p/trans_xen.c b/net/9p/trans_xen.c
index 325c560..086a4ab 100644
--- a/net/9p/trans_xen.c
+++ b/net/9p/trans_xen.c
@@ -543,3 +543,7 @@ static void p9_trans_xen_exit(void)
 	return xenbus_unregister_driver(&xen_9pfs_front_driver);
 }
 module_exit(p9_trans_xen_exit);
+
+MODULE_AUTHOR("Stefano Stabellini <stefano@aporeto.com>");
+MODULE_DESCRIPTION("Xen Transport for 9P");
+MODULE_LICENSE("GPL");
diff --git a/net/bluetooth/l2cap_core.c b/net/bluetooth/l2cap_core.c
index 43ba91c..fc6615d 100644
--- a/net/bluetooth/l2cap_core.c
+++ b/net/bluetooth/l2cap_core.c
@@ -3363,9 +3363,10 @@ static int l2cap_parse_conf_req(struct l2cap_chan *chan, void *data, size_t data
 			break;
 
 		case L2CAP_CONF_EFS:
-			remote_efs = 1;
-			if (olen == sizeof(efs))
+			if (olen == sizeof(efs)) {
+				remote_efs = 1;
 				memcpy(&efs, (void *) val, olen);
+			}
 			break;
 
 		case L2CAP_CONF_EWS:
@@ -3584,16 +3585,17 @@ static int l2cap_parse_conf_rsp(struct l2cap_chan *chan, void *rsp, int len,
 			break;
 
 		case L2CAP_CONF_EFS:
-			if (olen == sizeof(efs))
+			if (olen == sizeof(efs)) {
 				memcpy(&efs, (void *)val, olen);
 
-			if (chan->local_stype != L2CAP_SERV_NOTRAFIC &&
-			    efs.stype != L2CAP_SERV_NOTRAFIC &&
-			    efs.stype != chan->local_stype)
-				return -ECONNREFUSED;
+				if (chan->local_stype != L2CAP_SERV_NOTRAFIC &&
+				    efs.stype != L2CAP_SERV_NOTRAFIC &&
+				    efs.stype != chan->local_stype)
+					return -ECONNREFUSED;
 
-			l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs),
-					   (unsigned long) &efs, endptr - ptr);
+				l2cap_add_conf_opt(&ptr, L2CAP_CONF_EFS, sizeof(efs),
+						   (unsigned long) &efs, endptr - ptr);
+			}
 			break;
 
 		case L2CAP_CONF_FCS:
diff --git a/net/caif/caif_dev.c b/net/caif/caif_dev.c
index 2d38b6e..e0adcd1 100644
--- a/net/caif/caif_dev.c
+++ b/net/caif/caif_dev.c
@@ -334,9 +334,8 @@ void caif_enroll_dev(struct net_device *dev, struct caif_dev_common *caifdev,
 	mutex_lock(&caifdevs->lock);
 	list_add_rcu(&caifd->list, &caifdevs->list);
 
-	strncpy(caifd->layer.name, dev->name,
-		sizeof(caifd->layer.name) - 1);
-	caifd->layer.name[sizeof(caifd->layer.name) - 1] = 0;
+	strlcpy(caifd->layer.name, dev->name,
+		sizeof(caifd->layer.name));
 	caifd->layer.transmit = transmit;
 	cfcnfg_add_phy_layer(cfg,
 				dev,
diff --git a/net/caif/caif_usb.c b/net/caif/caif_usb.c
index 5cd44f0..1a082a9 100644
--- a/net/caif/caif_usb.c
+++ b/net/caif/caif_usb.c
@@ -176,9 +176,7 @@ static int cfusbl_device_notify(struct notifier_block *me, unsigned long what,
 		dev_add_pack(&caif_usb_type);
 	pack_added = true;
 
-	strncpy(layer->name, dev->name,
-			sizeof(layer->name) - 1);
-	layer->name[sizeof(layer->name) - 1] = 0;
+	strlcpy(layer->name, dev->name, sizeof(layer->name));
 
 	return 0;
 }
diff --git a/net/caif/cfcnfg.c b/net/caif/cfcnfg.c
index 273cb07..8f00bea 100644
--- a/net/caif/cfcnfg.c
+++ b/net/caif/cfcnfg.c
@@ -268,17 +268,15 @@ static int caif_connect_req_to_link_param(struct cfcnfg *cnfg,
 	case CAIFPROTO_RFM:
 		l->linktype = CFCTRL_SRV_RFM;
 		l->u.datagram.connid = s->sockaddr.u.rfm.connection_id;
-		strncpy(l->u.rfm.volume, s->sockaddr.u.rfm.volume,
-			sizeof(l->u.rfm.volume)-1);
-		l->u.rfm.volume[sizeof(l->u.rfm.volume)-1] = 0;
+		strlcpy(l->u.rfm.volume, s->sockaddr.u.rfm.volume,
+			sizeof(l->u.rfm.volume));
 		break;
 	case CAIFPROTO_UTIL:
 		l->linktype = CFCTRL_SRV_UTIL;
 		l->endpoint = 0x00;
 		l->chtype = 0x00;
-		strncpy(l->u.utility.name, s->sockaddr.u.util.service,
-			sizeof(l->u.utility.name)-1);
-		l->u.utility.name[sizeof(l->u.utility.name)-1] = 0;
+		strlcpy(l->u.utility.name, s->sockaddr.u.util.service,
+			sizeof(l->u.utility.name));
 		caif_assert(sizeof(l->u.utility.name) > 10);
 		l->u.utility.paramlen = s->param.size;
 		if (l->u.utility.paramlen > sizeof(l->u.utility.params))
diff --git a/net/caif/cfctrl.c b/net/caif/cfctrl.c
index f5afda1..655ed70 100644
--- a/net/caif/cfctrl.c
+++ b/net/caif/cfctrl.c
@@ -258,8 +258,8 @@ int cfctrl_linkup_request(struct cflayer *layer,
 		tmp16 = cpu_to_le16(param->u.utility.fifosize_bufs);
 		cfpkt_add_body(pkt, &tmp16, 2);
 		memset(utility_name, 0, sizeof(utility_name));
-		strncpy(utility_name, param->u.utility.name,
-			UTILITY_NAME_LENGTH - 1);
+		strlcpy(utility_name, param->u.utility.name,
+			UTILITY_NAME_LENGTH);
 		cfpkt_add_body(pkt, utility_name, UTILITY_NAME_LENGTH);
 		tmp8 = param->u.utility.paramlen;
 		cfpkt_add_body(pkt, &tmp8, 1);
diff --git a/net/can/af_can.c b/net/can/af_can.c
index 003b2d6..4d7f988 100644
--- a/net/can/af_can.c
+++ b/net/can/af_can.c
@@ -721,20 +721,16 @@ static int can_rcv(struct sk_buff *skb, struct net_device *dev,
 {
 	struct canfd_frame *cfd = (struct canfd_frame *)skb->data;
 
-	if (WARN_ONCE(dev->type != ARPHRD_CAN ||
-		      skb->len != CAN_MTU ||
-		      cfd->len > CAN_MAX_DLEN,
-		      "PF_CAN: dropped non conform CAN skbuf: "
-		      "dev type %d, len %d, datalen %d\n",
-		      dev->type, skb->len, cfd->len))
-		goto drop;
+	if (unlikely(dev->type != ARPHRD_CAN || skb->len != CAN_MTU ||
+		     cfd->len > CAN_MAX_DLEN)) {
+		pr_warn_once("PF_CAN: dropped non conform CAN skbuf: dev type %d, len %d, datalen %d\n",
+			     dev->type, skb->len, cfd->len);
+		kfree_skb(skb);
+		return NET_RX_DROP;
+	}
 
 	can_receive(skb, dev);
 	return NET_RX_SUCCESS;
-
-drop:
-	kfree_skb(skb);
-	return NET_RX_DROP;
 }
 
 static int canfd_rcv(struct sk_buff *skb, struct net_device *dev,
@@ -742,20 +738,16 @@ static int canfd_rcv(struct sk_buff *skb, struct net_device *dev,
 {
 	struct canfd_frame *cfd = (struct canfd_frame *)skb->data;
 
-	if (WARN_ONCE(dev->type != ARPHRD_CAN ||
-		      skb->len != CANFD_MTU ||
-		      cfd->len > CANFD_MAX_DLEN,
-		      "PF_CAN: dropped non conform CAN FD skbuf: "
-		      "dev type %d, len %d, datalen %d\n",
-		      dev->type, skb->len, cfd->len))
-		goto drop;
+	if (unlikely(dev->type != ARPHRD_CAN || skb->len != CANFD_MTU ||
+		     cfd->len > CANFD_MAX_DLEN)) {
+		pr_warn_once("PF_CAN: dropped non conform CAN FD skbuf: dev type %d, len %d, datalen %d\n",
+			     dev->type, skb->len, cfd->len);
+		kfree_skb(skb);
+		return NET_RX_DROP;
+	}
 
 	can_receive(skb, dev);
 	return NET_RX_SUCCESS;
-
-drop:
-	kfree_skb(skb);
-	return NET_RX_DROP;
 }
 
 /*
diff --git a/net/core/dev.c b/net/core/dev.c
index 01ee854..613fb40 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1146,7 +1146,19 @@ EXPORT_SYMBOL(dev_alloc_name);
 int dev_get_valid_name(struct net *net, struct net_device *dev,
 		       const char *name)
 {
-	return dev_alloc_name_ns(net, dev, name);
+	BUG_ON(!net);
+
+	if (!dev_valid_name(name))
+		return -EINVAL;
+
+	if (strchr(name, '%'))
+		return dev_alloc_name_ns(net, dev, name);
+	else if (__dev_get_by_name(net, name))
+		return -EEXIST;
+	else if (dev->name != name)
+		strlcpy(dev->name, name, IFNAMSIZ);
+
+	return 0;
 }
 EXPORT_SYMBOL(dev_get_valid_name);
 
@@ -3139,10 +3151,21 @@ static void qdisc_pkt_len_init(struct sk_buff *skb)
 		hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
 
 		/* + transport layer */
-		if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
-			hdr_len += tcp_hdrlen(skb);
-		else
-			hdr_len += sizeof(struct udphdr);
+		if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) {
+			const struct tcphdr *th;
+			struct tcphdr _tcphdr;
+
+			th = skb_header_pointer(skb, skb_transport_offset(skb),
+						sizeof(_tcphdr), &_tcphdr);
+			if (likely(th))
+				hdr_len += __tcp_hdrlen(th);
+		} else {
+			struct udphdr _udphdr;
+
+			if (skb_header_pointer(skb, skb_transport_offset(skb),
+					       sizeof(_udphdr), &_udphdr))
+				hdr_len += sizeof(struct udphdr);
+		}
 
 		if (shinfo->gso_type & SKB_GSO_DODGY)
 			gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
diff --git a/net/core/ethtool.c b/net/core/ethtool.c
index f8fcf45..8225416 100644
--- a/net/core/ethtool.c
+++ b/net/core/ethtool.c
@@ -770,15 +770,6 @@ static int ethtool_set_link_ksettings(struct net_device *dev,
 	return dev->ethtool_ops->set_link_ksettings(dev, &link_ksettings);
 }
 
-static void
-warn_incomplete_ethtool_legacy_settings_conversion(const char *details)
-{
-	char name[sizeof(current->comm)];
-
-	pr_info_once("warning: `%s' uses legacy ethtool link settings API, %s\n",
-		     get_task_comm(name, current), details);
-}
-
 /* Query device for its ethtool_cmd settings.
  *
  * Backward compatibility note: for compatibility with legacy ethtool,
@@ -805,10 +796,8 @@ static int ethtool_get_settings(struct net_device *dev, void __user *useraddr)
 							   &link_ksettings);
 		if (err < 0)
 			return err;
-		if (!convert_link_ksettings_to_legacy_settings(&cmd,
-							       &link_ksettings))
-			warn_incomplete_ethtool_legacy_settings_conversion(
-				"link modes are only partially reported");
+		convert_link_ksettings_to_legacy_settings(&cmd,
+							  &link_ksettings);
 
 		/* send a sensible cmd tag back to user */
 		cmd.cmd = ETHTOOL_GSET;
diff --git a/net/core/filter.c b/net/core/filter.c
index 6a85e67..1c0eb43 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -458,6 +458,10 @@ static int bpf_convert_filter(struct sock_filter *prog, int len,
 			    convert_bpf_extensions(fp, &insn))
 				break;
 
+			if (fp->code == (BPF_ALU | BPF_DIV | BPF_X) ||
+			    fp->code == (BPF_ALU | BPF_MOD | BPF_X))
+				*insn++ = BPF_MOV32_REG(BPF_REG_X, BPF_REG_X);
+
 			*insn = BPF_RAW_INSN(fp->code, BPF_REG_A, BPF_REG_X, 0, fp->k);
 			break;
 
@@ -1054,11 +1058,9 @@ static struct bpf_prog *bpf_migrate_filter(struct bpf_prog *fp)
 		 */
 		goto out_err_free;
 
-	/* We are guaranteed to never error here with cBPF to eBPF
-	 * transitions, since there's no issue with type compatibility
-	 * checks on program arrays.
-	 */
 	fp = bpf_prog_select_runtime(fp, &err);
+	if (err)
+		goto out_err_free;
 
 	kfree(old_prog);
 	return fp;
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 15ce300..544bddf 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -976,8 +976,8 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 out_good:
 	ret = true;
 
-	key_control->thoff = (u16)nhoff;
 out:
+	key_control->thoff = min_t(u16, nhoff, skb ? skb->len : hlen);
 	key_basic->n_proto = proto;
 	key_basic->ip_proto = ip_proto;
 
@@ -985,7 +985,6 @@ bool __skb_flow_dissect(const struct sk_buff *skb,
 
 out_bad:
 	ret = false;
-	key_control->thoff = min_t(u16, nhoff, skb ? skb->len : hlen);
 	goto out;
 }
 EXPORT_SYMBOL(__skb_flow_dissect);
diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index d1f5fe9..7f83171 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -532,7 +532,7 @@ struct neighbour *__neigh_create(struct neigh_table *tbl, const void *pkey,
 	if (atomic_read(&tbl->entries) > (1 << nht->hash_shift))
 		nht = neigh_hash_grow(tbl, nht->hash_shift + 1);
 
-	hash_val = tbl->hash(pkey, dev, nht->hash_rnd) >> (32 - nht->hash_shift);
+	hash_val = tbl->hash(n->primary_key, dev, nht->hash_rnd) >> (32 - nht->hash_shift);
 
 	if (n->parms->dead) {
 		rc = ERR_PTR(-EINVAL);
@@ -544,7 +544,7 @@ struct neighbour *__neigh_create(struct neigh_table *tbl, const void *pkey,
 	     n1 != NULL;
 	     n1 = rcu_dereference_protected(n1->next,
 			lockdep_is_held(&tbl->lock))) {
-		if (dev == n1->dev && !memcmp(n1->primary_key, pkey, key_len)) {
+		if (dev == n1->dev && !memcmp(n1->primary_key, n->primary_key, key_len)) {
 			if (want_ref)
 				neigh_hold(n1);
 			rc = n1;
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index dabba2a..778d7f0 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1681,18 +1681,18 @@ static bool link_dump_filtered(struct net_device *dev,
 	return false;
 }
 
-static struct net *get_target_net(struct sk_buff *skb, int netnsid)
+static struct net *get_target_net(struct sock *sk, int netnsid)
 {
 	struct net *net;
 
-	net = get_net_ns_by_id(sock_net(skb->sk), netnsid);
+	net = get_net_ns_by_id(sock_net(sk), netnsid);
 	if (!net)
 		return ERR_PTR(-EINVAL);
 
 	/* For now, the caller is required to have CAP_NET_ADMIN in
 	 * the user namespace owning the target net ns.
 	 */
-	if (!netlink_ns_capable(skb, net->user_ns, CAP_NET_ADMIN)) {
+	if (!sk_ns_capable(sk, net->user_ns, CAP_NET_ADMIN)) {
 		put_net(net);
 		return ERR_PTR(-EACCES);
 	}
@@ -1733,7 +1733,7 @@ static int rtnl_dump_ifinfo(struct sk_buff *skb, struct netlink_callback *cb)
 			ifla_policy, NULL) >= 0) {
 		if (tb[IFLA_IF_NETNSID]) {
 			netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]);
-			tgt_net = get_target_net(skb, netnsid);
+			tgt_net = get_target_net(skb->sk, netnsid);
 			if (IS_ERR(tgt_net)) {
 				tgt_net = net;
 				netnsid = -1;
@@ -2883,7 +2883,7 @@ static int rtnl_getlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	if (tb[IFLA_IF_NETNSID]) {
 		netnsid = nla_get_s32(tb[IFLA_IF_NETNSID]);
-		tgt_net = get_target_net(skb, netnsid);
+		tgt_net = get_target_net(NETLINK_CB(skb).sk, netnsid);
 		if (IS_ERR(tgt_net))
 			return PTR_ERR(tgt_net);
 	}
diff --git a/net/core/sock_diag.c b/net/core/sock_diag.c
index 217f4e3..146b50e 100644
--- a/net/core/sock_diag.c
+++ b/net/core/sock_diag.c
@@ -288,7 +288,7 @@ static int sock_diag_bind(struct net *net, int group)
 	case SKNLGRP_INET6_UDP_DESTROY:
 		if (!sock_diag_handlers[AF_INET6])
 			request_module("net-pf-%d-proto-%d-type-%d", PF_NETLINK,
-				       NETLINK_SOCK_DIAG, AF_INET);
+				       NETLINK_SOCK_DIAG, AF_INET6);
 		break;
 	}
 	return 0;
diff --git a/net/core/sysctl_net_core.c b/net/core/sysctl_net_core.c
index cbc3dde..a47ad6c 100644
--- a/net/core/sysctl_net_core.c
+++ b/net/core/sysctl_net_core.c
@@ -325,7 +325,13 @@ static struct ctl_table net_core_table[] = {
 		.data		= &bpf_jit_enable,
 		.maxlen		= sizeof(int),
 		.mode		= 0644,
+#ifndef CONFIG_BPF_JIT_ALWAYS_ON
 		.proc_handler	= proc_dointvec
+#else
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &one,
+		.extra2		= &one,
+#endif
 	},
 # ifdef CONFIG_HAVE_EBPF_JIT
 	{
diff --git a/net/dccp/ccids/ccid2.c b/net/dccp/ccids/ccid2.c
index 1c75cd1..92d016e 100644
--- a/net/dccp/ccids/ccid2.c
+++ b/net/dccp/ccids/ccid2.c
@@ -140,6 +140,9 @@ static void ccid2_hc_tx_rto_expire(struct timer_list *t)
 
 	ccid2_pr_debug("RTO_EXPIRE\n");
 
+	if (sk->sk_state == DCCP_CLOSED)
+		goto out;
+
 	/* back-off timer */
 	hc->tx_rto <<= 1;
 	if (hc->tx_rto > DCCP_RTO_MAX)
diff --git a/net/ipv4/arp.c b/net/ipv4/arp.c
index a8d7c5a..6c231b4 100644
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -223,11 +223,16 @@ static bool arp_key_eq(const struct neighbour *neigh, const void *pkey)
 
 static int arp_constructor(struct neighbour *neigh)
 {
-	__be32 addr = *(__be32 *)neigh->primary_key;
+	__be32 addr;
 	struct net_device *dev = neigh->dev;
 	struct in_device *in_dev;
 	struct neigh_parms *parms;
+	u32 inaddr_any = INADDR_ANY;
 
+	if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+		memcpy(neigh->primary_key, &inaddr_any, arp_tbl.key_len);
+
+	addr = *(__be32 *)neigh->primary_key;
 	rcu_read_lock();
 	in_dev = __in_dev_get_rcu(dev);
 	if (!in_dev) {
diff --git a/net/ipv4/esp4.c b/net/ipv4/esp4.c
index d57aa64..61fe6e4 100644
--- a/net/ipv4/esp4.c
+++ b/net/ipv4/esp4.c
@@ -981,6 +981,7 @@ static int esp_init_state(struct xfrm_state *x)
 
 		switch (encap->encap_type) {
 		default:
+			err = -EINVAL;
 			goto error;
 		case UDP_ENCAP_ESPINUDP:
 			x->props.header_len += sizeof(struct udphdr);
diff --git a/net/ipv4/esp4_offload.c b/net/ipv4/esp4_offload.c
index f8b918c..29b333a 100644
--- a/net/ipv4/esp4_offload.c
+++ b/net/ipv4/esp4_offload.c
@@ -38,7 +38,8 @@ static struct sk_buff **esp4_gro_receive(struct sk_buff **head,
 	__be32 spi;
 	int err;
 
-	skb_pull(skb, offset);
+	if (!pskb_pull(skb, offset))
+		return NULL;
 
 	if ((err = xfrm_parse_spi(skb, IPPROTO_ESP, &spi, &seq)) != 0)
 		goto out;
@@ -121,6 +122,9 @@ static struct sk_buff *esp4_gso_segment(struct sk_buff *skb,
 	if (!xo)
 		goto out;
 
+	if (!(skb_shinfo(skb)->gso_type & SKB_GSO_ESP))
+		goto out;
+
 	seq = xo->seq.low;
 
 	x = skb->sp->xvec[skb->sp->len - 1];
diff --git a/net/ipv4/igmp.c b/net/ipv4/igmp.c
index 726f6b6..2d49717 100644
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -332,7 +332,7 @@ static __be32 igmpv3_get_srcaddr(struct net_device *dev,
 		return htonl(INADDR_ANY);
 
 	for_ifa(in_dev) {
-		if (inet_ifa_match(fl4->saddr, ifa))
+		if (fl4->saddr == ifa->ifa_local)
 			return fl4->saddr;
 	} endfor_ifa(in_dev);
 
diff --git a/net/ipv4/ip_tunnel.c b/net/ipv4/ip_tunnel.c
index 5ddb1cb..6d21068 100644
--- a/net/ipv4/ip_tunnel.c
+++ b/net/ipv4/ip_tunnel.c
@@ -520,8 +520,7 @@ static int tnl_update_pmtu(struct net_device *dev, struct sk_buff *skb,
 	else
 		mtu = skb_dst(skb) ? dst_mtu(skb_dst(skb)) : dev->mtu;
 
-	if (skb_dst(skb))
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+	skb_dst_update_pmtu(skb, mtu);
 
 	if (skb->protocol == htons(ETH_P_IP)) {
 		if (!skb_is_gso(skb) &&
diff --git a/net/ipv4/ip_vti.c b/net/ipv4/ip_vti.c
index 949f432..51b1669 100644
--- a/net/ipv4/ip_vti.c
+++ b/net/ipv4/ip_vti.c
@@ -200,7 +200,7 @@ static netdev_tx_t vti_xmit(struct sk_buff *skb, struct net_device *dev,
 
 	mtu = dst_mtu(dst);
 	if (skb->len > mtu) {
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+		skb_dst_update_pmtu(skb, mtu);
 		if (skb->protocol == htons(ETH_P_IP)) {
 			icmp_send(skb, ICMP_DEST_UNREACH, ICMP_FRAG_NEEDED,
 				  htonl(mtu));
diff --git a/net/ipv4/netfilter/arp_tables.c b/net/ipv4/netfilter/arp_tables.c
index 0c3c944..eb8246c 100644
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -202,13 +202,8 @@ unsigned int arpt_do_table(struct sk_buff *skb,
 
 	local_bh_disable();
 	addend = xt_write_recseq_begin();
-	private = table->private;
+	private = READ_ONCE(table->private); /* Address dependency. */
 	cpu     = smp_processor_id();
-	/*
-	 * Ensure we load private-> members after we've fetched the base
-	 * pointer.
-	 */
-	smp_read_barrier_depends();
 	table_base = private->entries;
 	jumpstack  = (struct arpt_entry **)private->jumpstack[cpu];
 
diff --git a/net/ipv4/netfilter/ip_tables.c b/net/ipv4/netfilter/ip_tables.c
index 2e0d339..cc984d0 100644
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -260,13 +260,8 @@ ipt_do_table(struct sk_buff *skb,
 	WARN_ON(!(table->valid_hooks & (1 << hook)));
 	local_bh_disable();
 	addend = xt_write_recseq_begin();
-	private = table->private;
+	private = READ_ONCE(table->private); /* Address dependency. */
 	cpu        = smp_processor_id();
-	/*
-	 * Ensure we load private-> members after we've fetched the base
-	 * pointer.
-	 */
-	smp_read_barrier_depends();
 	table_base = private->entries;
 	jumpstack  = (struct ipt_entry **)private->jumpstack[cpu];
 
diff --git a/net/ipv4/raw.c b/net/ipv4/raw.c
index 125c1ea..5e570aa 100644
--- a/net/ipv4/raw.c
+++ b/net/ipv4/raw.c
@@ -520,9 +520,11 @@ static int raw_sendmsg(struct sock *sk, struct msghdr *msg, size_t len)
 		goto out;
 
 	/* hdrincl should be READ_ONCE(inet->hdrincl)
-	 * but READ_ONCE() doesn't work with bit fields
+	 * but READ_ONCE() doesn't work with bit fields.
+	 * Doing this indirectly yields the same result.
 	 */
 	hdrincl = inet->hdrincl;
+	hdrincl = READ_ONCE(hdrincl);
 	/*
 	 *	Check the flags.
 	 */
diff --git a/net/ipv4/route.c b/net/ipv4/route.c
index 43b69af..4e153b2 100644
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2762,6 +2762,7 @@ static int inet_rtm_getroute(struct sk_buff *in_skb, struct nlmsghdr *nlh,
 		if (err == 0 && rt->dst.error)
 			err = -rt->dst.error;
 	} else {
+		fl4.flowi4_iif = LOOPBACK_IFINDEX;
 		rt = ip_route_output_key_hash_rcu(net, &fl4, &res, skb);
 		err = 0;
 		if (IS_ERR(rt))
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index f08eebe..8e053ad 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2298,6 +2298,9 @@ void tcp_close(struct sock *sk, long timeout)
 			tcp_send_active_reset(sk, GFP_ATOMIC);
 			__NET_INC_STATS(sock_net(sk),
 					LINUX_MIB_TCPABORTONMEMORY);
+		} else if (!check_net(sock_net(sk))) {
+			/* Not possible to send reset; just close */
+			tcp_set_state(sk, TCP_CLOSE);
 		}
 	}
 
diff --git a/net/ipv4/tcp_offload.c b/net/ipv4/tcp_offload.c
index b6a2aa1..4d58e2c 100644
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -32,6 +32,9 @@ static void tcp_gso_tstamp(struct sk_buff *skb, unsigned int ts_seq,
 static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb,
 					netdev_features_t features)
 {
+	if (!(skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4))
+		return ERR_PTR(-EINVAL);
+
 	if (!pskb_may_pull(skb, sizeof(struct tcphdr)))
 		return ERR_PTR(-EINVAL);
 
diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
index 968fda1..388158c 100644
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -48,11 +48,19 @@ static void tcp_write_err(struct sock *sk)
  *  to prevent DoS attacks. It is called when a retransmission timeout
  *  or zero probe timeout occurs on orphaned socket.
  *
+ *  Also close if our net namespace is exiting; in that case there is no
+ *  hope of ever communicating again since all netns interfaces are already
+ *  down (or about to be down), and we need to release our dst references,
+ *  which have been moved to the netns loopback interface, so the namespace
+ *  can finish exiting.  This condition is only possible if we are a kernel
+ *  socket, as those do not hold references to the namespace.
+ *
  *  Criteria is still not confirmed experimentally and may change.
  *  We kill the socket, if:
  *  1. If number of orphaned sockets exceeds an administratively configured
  *     limit.
  *  2. If we have strong memory pressure.
+ *  3. If our net namespace is exiting.
  */
 static int tcp_out_of_resources(struct sock *sk, bool do_reset)
 {
@@ -81,6 +89,13 @@ static int tcp_out_of_resources(struct sock *sk, bool do_reset)
 		__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY);
 		return 1;
 	}
+
+	if (!check_net(sock_net(sk))) {
+		/* Not possible to send reset; just close */
+		tcp_done(sk);
+		return 1;
+	}
+
 	return 0;
 }
 
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c
index 01801b7..ea6e6e7 100644
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -203,6 +203,9 @@ static struct sk_buff *udp4_ufo_fragment(struct sk_buff *skb,
 		goto out;
 	}
 
+	if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP))
+		goto out;
+
 	if (!pskb_may_pull(skb, sizeof(struct udphdr)))
 		goto out;
 
diff --git a/net/ipv4/xfrm4_mode_tunnel.c b/net/ipv4/xfrm4_mode_tunnel.c
index e6265e2..20ca486 100644
--- a/net/ipv4/xfrm4_mode_tunnel.c
+++ b/net/ipv4/xfrm4_mode_tunnel.c
@@ -92,6 +92,7 @@ static int xfrm4_mode_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
 
 	skb_reset_network_header(skb);
 	skb_mac_header_rebuild(skb);
+	eth_hdr(skb)->h_proto = skb->protocol;
 
 	err = 0;
 
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index a902ff8..1a7f00c 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -890,13 +890,12 @@ static int esp6_init_state(struct xfrm_state *x)
 			x->props.header_len += IPV4_BEET_PHMAXLEN +
 					       (sizeof(struct ipv6hdr) - sizeof(struct iphdr));
 		break;
+	default:
 	case XFRM_MODE_TRANSPORT:
 		break;
 	case XFRM_MODE_TUNNEL:
 		x->props.header_len += sizeof(struct ipv6hdr);
 		break;
-	default:
-		goto error;
 	}
 
 	align = ALIGN(crypto_aead_blocksize(aead), 4);
diff --git a/net/ipv6/esp6_offload.c b/net/ipv6/esp6_offload.c
index 333a478..f52c314 100644
--- a/net/ipv6/esp6_offload.c
+++ b/net/ipv6/esp6_offload.c
@@ -60,7 +60,8 @@ static struct sk_buff **esp6_gro_receive(struct sk_buff **head,
 	int nhoff;
 	int err;
 
-	skb_pull(skb, offset);
+	if (!pskb_pull(skb, offset))
+		return NULL;
 
 	if ((err = xfrm_parse_spi(skb, IPPROTO_ESP, &spi, &seq)) != 0)
 		goto out;
@@ -148,6 +149,9 @@ static struct sk_buff *esp6_gso_segment(struct sk_buff *skb,
 	if (!xo)
 		goto out;
 
+	if (!(skb_shinfo(skb)->gso_type & SKB_GSO_ESP))
+		goto out;
+
 	seq = xo->seq.low;
 
 	x = skb->sp->xvec[skb->sp->len - 1];
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index 83bd757..bc68eb6 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -925,6 +925,15 @@ static void ipv6_push_rthdr4(struct sk_buff *skb, u8 *proto,
 	sr_phdr->segments[0] = **addr_p;
 	*addr_p = &sr_ihdr->segments[sr_ihdr->segments_left];
 
+	if (sr_ihdr->hdrlen > hops * 2) {
+		int tlvs_offset, tlvs_length;
+
+		tlvs_offset = (1 + hops * 2) << 3;
+		tlvs_length = (sr_ihdr->hdrlen - hops * 2) << 3;
+		memcpy((char *)sr_phdr + tlvs_offset,
+		       (char *)sr_ihdr + tlvs_offset, tlvs_length);
+	}
+
 #ifdef CONFIG_IPV6_SEG6_HMAC
 	if (sr_has_hmac(sr_phdr)) {
 		struct net *net = NULL;
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index f5285f4..217683d 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -640,6 +640,11 @@ static struct fib6_node *fib6_add_1(struct net *net,
 			if (!(fn->fn_flags & RTN_RTINFO)) {
 				RCU_INIT_POINTER(fn->leaf, NULL);
 				rt6_release(leaf);
+			/* remove null_entry in the root node */
+			} else if (fn->fn_flags & RTN_TL_ROOT &&
+				   rcu_access_pointer(fn->leaf) ==
+				   net->ipv6.ip6_null_entry) {
+				RCU_INIT_POINTER(fn->leaf, NULL);
 			}
 
 			return fn;
@@ -1221,8 +1226,14 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 		}
 
 		if (!rcu_access_pointer(fn->leaf)) {
-			atomic_inc(&rt->rt6i_ref);
-			rcu_assign_pointer(fn->leaf, rt);
+			if (fn->fn_flags & RTN_TL_ROOT) {
+				/* put back null_entry for root node */
+				rcu_assign_pointer(fn->leaf,
+					    info->nl_net->ipv6.ip6_null_entry);
+			} else {
+				atomic_inc(&rt->rt6i_ref);
+				rcu_assign_pointer(fn->leaf, rt);
+			}
 		}
 		fn = sn;
 	}
@@ -1241,23 +1252,28 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 		 * If fib6_add_1 has cleared the old leaf pointer in the
 		 * super-tree leaf node we have to find a new one for it.
 		 */
-		struct rt6_info *pn_leaf = rcu_dereference_protected(pn->leaf,
-					    lockdep_is_held(&table->tb6_lock));
-		if (pn != fn && pn_leaf == rt) {
-			pn_leaf = NULL;
-			RCU_INIT_POINTER(pn->leaf, NULL);
-			atomic_dec(&rt->rt6i_ref);
-		}
-		if (pn != fn && !pn_leaf && !(pn->fn_flags & RTN_RTINFO)) {
-			pn_leaf = fib6_find_prefix(info->nl_net, table, pn);
-#if RT6_DEBUG >= 2
-			if (!pn_leaf) {
-				WARN_ON(!pn_leaf);
-				pn_leaf = info->nl_net->ipv6.ip6_null_entry;
+		if (pn != fn) {
+			struct rt6_info *pn_leaf =
+				rcu_dereference_protected(pn->leaf,
+				    lockdep_is_held(&table->tb6_lock));
+			if (pn_leaf == rt) {
+				pn_leaf = NULL;
+				RCU_INIT_POINTER(pn->leaf, NULL);
+				atomic_dec(&rt->rt6i_ref);
 			}
+			if (!pn_leaf && !(pn->fn_flags & RTN_RTINFO)) {
+				pn_leaf = fib6_find_prefix(info->nl_net, table,
+							   pn);
+#if RT6_DEBUG >= 2
+				if (!pn_leaf) {
+					WARN_ON(!pn_leaf);
+					pn_leaf =
+					    info->nl_net->ipv6.ip6_null_entry;
+				}
 #endif
-			atomic_inc(&pn_leaf->rt6i_ref);
-			rcu_assign_pointer(pn->leaf, pn_leaf);
+				atomic_inc(&pn_leaf->rt6i_ref);
+				rcu_assign_pointer(pn->leaf, pn_leaf);
+			}
 		}
 #endif
 		goto failure;
@@ -1265,13 +1281,17 @@ int fib6_add(struct fib6_node *root, struct rt6_info *rt,
 	return err;
 
 failure:
-	/* fn->leaf could be NULL if fn is an intermediate node and we
-	 * failed to add the new route to it in both subtree creation
-	 * failure and fib6_add_rt2node() failure case.
-	 * In both cases, fib6_repair_tree() should be called to fix
-	 * fn->leaf.
+	/* fn->leaf could be NULL and fib6_repair_tree() needs to be called if:
+	 * 1. fn is an intermediate node and we failed to add the new
+	 * route to it in both subtree creation failure and fib6_add_rt2node()
+	 * failure case.
+	 * 2. fn is the root node in the table and we fail to add the first
+	 * default route to it.
 	 */
-	if (fn && !(fn->fn_flags & (RTN_RTINFO|RTN_ROOT)))
+	if (fn &&
+	    (!(fn->fn_flags & (RTN_RTINFO|RTN_ROOT)) ||
+	     (fn->fn_flags & RTN_TL_ROOT &&
+	      !rcu_access_pointer(fn->leaf))))
 		fib6_repair_tree(info->nl_net, table, fn);
 	/* Always release dst as dst->__refcnt is guaranteed
 	 * to be taken before entering this function
@@ -1526,6 +1546,12 @@ static struct fib6_node *fib6_repair_tree(struct net *net,
 	struct fib6_walker *w;
 	int iter = 0;
 
+	/* Set fn->leaf to null_entry for root node. */
+	if (fn->fn_flags & RTN_TL_ROOT) {
+		rcu_assign_pointer(fn->leaf, net->ipv6.ip6_null_entry);
+		return fn;
+	}
+
 	for (;;) {
 		struct fib6_node *fn_r = rcu_dereference_protected(fn->right,
 					    lockdep_is_held(&table->tb6_lock));
@@ -1680,10 +1706,15 @@ static void fib6_del_route(struct fib6_table *table, struct fib6_node *fn,
 	}
 	read_unlock(&net->ipv6.fib6_walker_lock);
 
-	/* If it was last route, expunge its radix tree node */
+	/* If it was last route, call fib6_repair_tree() to:
+	 * 1. For root node, put back null_entry as how the table was created.
+	 * 2. For other nodes, expunge its radix tree node.
+	 */
 	if (!rcu_access_pointer(fn->leaf)) {
-		fn->fn_flags &= ~RTN_RTINFO;
-		net->ipv6.rt6_stats->fib_route_nodes--;
+		if (!(fn->fn_flags & RTN_TL_ROOT)) {
+			fn->fn_flags &= ~RTN_RTINFO;
+			net->ipv6.rt6_stats->fib_route_nodes--;
+		}
 		fn = fib6_repair_tree(net, table, fn);
 	}
 
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index 7726959..8735492 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -337,11 +337,12 @@ static struct ip6_tnl *ip6gre_tunnel_locate(struct net *net,
 
 	nt->dev = dev;
 	nt->net = dev_net(dev);
-	ip6gre_tnl_link_config(nt, 1);
 
 	if (register_netdevice(dev) < 0)
 		goto failed_free;
 
+	ip6gre_tnl_link_config(nt, 1);
+
 	/* Can use a lockless transmit, unless we generate output sequences */
 	if (!(nt->parms.o_flags & TUNNEL_SEQ))
 		dev->features |= NETIF_F_LLTX;
@@ -1303,7 +1304,6 @@ static void ip6gre_netlink_parms(struct nlattr *data[],
 
 static int ip6gre_tap_init(struct net_device *dev)
 {
-	struct ip6_tnl *tunnel;
 	int ret;
 
 	ret = ip6gre_tunnel_init_common(dev);
@@ -1312,10 +1312,6 @@ static int ip6gre_tap_init(struct net_device *dev)
 
 	dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
 
-	tunnel = netdev_priv(dev);
-
-	ip6gre_tnl_link_config(tunnel, 1);
-
 	return 0;
 }
 
@@ -1408,12 +1404,16 @@ static int ip6gre_newlink(struct net *src_net, struct net_device *dev,
 
 	nt->dev = dev;
 	nt->net = dev_net(dev);
-	ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]);
 
 	err = register_netdevice(dev);
 	if (err)
 		goto out;
 
+	ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]);
+
+	if (tb[IFLA_MTU])
+		ip6_tnl_change_mtu(dev, nla_get_u32(tb[IFLA_MTU]));
+
 	dev_hold(dev);
 	ip6gre_tunnel_link(ign, nt);
 
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f7dd51c..3763dc0 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -166,7 +166,7 @@ int ip6_output(struct net *net, struct sock *sk, struct sk_buff *skb)
 			    !(IP6CB(skb)->flags & IP6SKB_REROUTED));
 }
 
-static bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
+bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
 {
 	if (!np->autoflowlabel_set)
 		return ip6_default_np_autolabel(net);
@@ -1206,14 +1206,16 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
 	v6_cork->tclass = ipc6->tclass;
 	if (rt->dst.flags & DST_XFRM_TUNNEL)
 		mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
-		      rt->dst.dev->mtu : dst_mtu(&rt->dst);
+		      READ_ONCE(rt->dst.dev->mtu) : dst_mtu(&rt->dst);
 	else
 		mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
-		      rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+		      READ_ONCE(rt->dst.dev->mtu) : dst_mtu(rt->dst.path);
 	if (np->frag_size < mtu) {
 		if (np->frag_size)
 			mtu = np->frag_size;
 	}
+	if (mtu < IPV6_MIN_MTU)
+		return -EINVAL;
 	cork->base.fragsize = mtu;
 	if (dst_allfrag(rt->dst.path))
 		cork->base.flags |= IPCORK_ALLFRAG;
@@ -1733,11 +1735,13 @@ struct sk_buff *ip6_make_skb(struct sock *sk,
 	cork.base.flags = 0;
 	cork.base.addr = 0;
 	cork.base.opt = NULL;
+	cork.base.dst = NULL;
 	v6_cork.opt = NULL;
 	err = ip6_setup_cork(sk, &cork, &v6_cork, ipc6, rt, fl6);
-	if (err)
+	if (err) {
+		ip6_cork_release(&cork, &v6_cork);
 		return ERR_PTR(err);
-
+	}
 	if (ipc6->dontfrag < 0)
 		ipc6->dontfrag = inet6_sk(sk)->dontfrag;
 
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index 931c38f..1ee5584 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -642,8 +642,7 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
 		if (rel_info > dst_mtu(skb_dst(skb2)))
 			goto out;
 
-		skb_dst(skb2)->ops->update_pmtu(skb_dst(skb2), NULL, skb2,
-						rel_info);
+		skb_dst_update_pmtu(skb2, rel_info);
 	}
 
 	icmp_send(skb2, rel_type, rel_code, htonl(rel_info));
@@ -1074,10 +1073,11 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 			memcpy(&fl6->daddr, addr6, sizeof(fl6->daddr));
 			neigh_release(neigh);
 		}
-	} else if (!(t->parms.flags &
-		     (IP6_TNL_F_USE_ORIG_TCLASS | IP6_TNL_F_USE_ORIG_FWMARK))) {
-		/* enable the cache only only if the routing decision does
-		 * not depend on the current inner header value
+	} else if (t->parms.proto != 0 && !(t->parms.flags &
+					    (IP6_TNL_F_USE_ORIG_TCLASS |
+					     IP6_TNL_F_USE_ORIG_FWMARK))) {
+		/* enable the cache only if neither the outer protocol nor the
+		 * routing decision depends on the current inner header value
 		 */
 		use_cache = true;
 	}
@@ -1130,8 +1130,7 @@ int ip6_tnl_xmit(struct sk_buff *skb, struct net_device *dev, __u8 dsfield,
 		mtu = 576;
 	}
 
-	if (skb_dst(skb) && !t->parms.collect_md)
-		skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+	skb_dst_update_pmtu(skb, mtu);
 	if (skb->len - t->tun_hlen - eth_hlen > mtu && !skb_is_gso(skb)) {
 		*pmtu = mtu;
 		err = -EMSGSIZE;
@@ -1676,11 +1675,11 @@ int ip6_tnl_change_mtu(struct net_device *dev, int new_mtu)
 {
 	struct ip6_tnl *tnl = netdev_priv(dev);
 
-	if (tnl->parms.proto == IPPROTO_IPIP) {
-		if (new_mtu < ETH_MIN_MTU)
+	if (tnl->parms.proto == IPPROTO_IPV6) {
+		if (new_mtu < IPV6_MIN_MTU)
 			return -EINVAL;
 	} else {
-		if (new_mtu < IPV6_MIN_MTU)
+		if (new_mtu < ETH_MIN_MTU)
 			return -EINVAL;
 	}
 	if (new_mtu > 0xFFF8 - dev->hard_header_len)
diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index dbb74f3..8c184f8 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -483,7 +483,7 @@ vti6_xmit(struct sk_buff *skb, struct net_device *dev, struct flowi *fl)
 
 	mtu = dst_mtu(dst);
 	if (!skb->ignore_df && skb->len > mtu) {
-		skb_dst(skb)->ops->update_pmtu(dst, NULL, skb, mtu);
+		skb_dst_update_pmtu(skb, mtu);
 
 		if (skb->protocol == htons(ETH_P_IPV6)) {
 			if (mtu < IPV6_MIN_MTU)
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index 2d4680e..e8ffb5b 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -1336,7 +1336,7 @@ static int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
 		break;
 
 	case IPV6_AUTOFLOWLABEL:
-		val = np->autoflowlabel;
+		val = ip6_autoflowlabel(sock_net(sk), np);
 		break;
 
 	case IPV6_RECVFRAGSIZE:
diff --git a/net/ipv6/netfilter/ip6_tables.c b/net/ipv6/netfilter/ip6_tables.c
index 1d7ae93..66a8c69 100644
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -282,12 +282,7 @@ ip6t_do_table(struct sk_buff *skb,
 
 	local_bh_disable();
 	addend = xt_write_recseq_begin();
-	private = table->private;
-	/*
-	 * Ensure we load private-> members after we've fetched the base
-	 * pointer.
-	 */
-	smp_read_barrier_depends();
+	private = READ_ONCE(table->private); /* Address dependency. */
 	cpu        = smp_processor_id();
 	table_base = private->entries;
 	jumpstack  = (struct ip6t_entry **)private->jumpstack[cpu];
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index d7dc23c..3873d38 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -934,8 +934,8 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
 			df = 0;
 		}
 
-		if (tunnel->parms.iph.daddr && skb_dst(skb))
-			skb_dst(skb)->ops->update_pmtu(skb_dst(skb), NULL, skb, mtu);
+		if (tunnel->parms.iph.daddr)
+			skb_dst_update_pmtu(skb, mtu);
 
 		if (skb->len > mtu && !skb_is_gso(skb)) {
 			icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index d883c92..278e49c 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -46,6 +46,9 @@ static struct sk_buff *tcp6_gso_segment(struct sk_buff *skb,
 {
 	struct tcphdr *th;
 
+	if (!(skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6))
+		return ERR_PTR(-EINVAL);
+
 	if (!pskb_may_pull(skb, sizeof(*th)))
 		return ERR_PTR(-EINVAL);
 
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index a0f89ad..2a04dc9 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -42,6 +42,9 @@ static struct sk_buff *udp6_ufo_fragment(struct sk_buff *skb,
 		const struct ipv6hdr *ipv6h;
 		struct udphdr *uh;
 
+		if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP))
+			goto out;
+
 		if (!pskb_may_pull(skb, sizeof(struct udphdr)))
 			goto out;
 
diff --git a/net/ipv6/xfrm6_mode_tunnel.c b/net/ipv6/xfrm6_mode_tunnel.c
index 02556e3..dc93002 100644
--- a/net/ipv6/xfrm6_mode_tunnel.c
+++ b/net/ipv6/xfrm6_mode_tunnel.c
@@ -92,6 +92,7 @@ static int xfrm6_mode_tunnel_input(struct xfrm_state *x, struct sk_buff *skb)
 
 	skb_reset_network_header(skb);
 	skb_mac_header_rebuild(skb);
+	eth_hdr(skb)->h_proto = skb->protocol;
 
 	err = 0;
 
diff --git a/net/kcm/kcmsock.c b/net/kcm/kcmsock.c
index d4e98f2..4a8d407 100644
--- a/net/kcm/kcmsock.c
+++ b/net/kcm/kcmsock.c
@@ -1387,8 +1387,13 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
 	if (!csk)
 		return -EINVAL;
 
-	/* We must prevent loops or risk deadlock ! */
-	if (csk->sk_family == PF_KCM)
+	/* Only allow TCP sockets to be attached for now */
+	if ((csk->sk_family != AF_INET && csk->sk_family != AF_INET6) ||
+	    csk->sk_protocol != IPPROTO_TCP)
+		return -EOPNOTSUPP;
+
+	/* Don't allow listeners or closed sockets */
+	if (csk->sk_state == TCP_LISTEN || csk->sk_state == TCP_CLOSE)
 		return -EOPNOTSUPP;
 
 	psock = kmem_cache_zalloc(kcm_psockp, GFP_KERNEL);
@@ -1405,9 +1410,18 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
 		return err;
 	}
 
-	sock_hold(csk);
-
 	write_lock_bh(&csk->sk_callback_lock);
+
+	/* Check if sk_user_data is aready by KCM or someone else.
+	 * Must be done under lock to prevent race conditions.
+	 */
+	if (csk->sk_user_data) {
+		write_unlock_bh(&csk->sk_callback_lock);
+		strp_done(&psock->strp);
+		kmem_cache_free(kcm_psockp, psock);
+		return -EALREADY;
+	}
+
 	psock->save_data_ready = csk->sk_data_ready;
 	psock->save_write_space = csk->sk_write_space;
 	psock->save_state_change = csk->sk_state_change;
@@ -1415,8 +1429,11 @@ static int kcm_attach(struct socket *sock, struct socket *csock,
 	csk->sk_data_ready = psock_data_ready;
 	csk->sk_write_space = psock_write_space;
 	csk->sk_state_change = psock_state_change;
+
 	write_unlock_bh(&csk->sk_callback_lock);
 
+	sock_hold(csk);
+
 	/* Finished initialization, now add the psock to the MUX. */
 	spin_lock_bh(&mux->lock);
 	head = &mux->psocks;
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 3dffb89..7e2e718 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -401,6 +401,11 @@ static int verify_address_len(const void *p)
 #endif
 	int len;
 
+	if (sp->sadb_address_len <
+	    DIV_ROUND_UP(sizeof(*sp) + offsetofend(typeof(*addr), sa_family),
+			 sizeof(uint64_t)))
+		return -EINVAL;
+
 	switch (addr->sa_family) {
 	case AF_INET:
 		len = DIV_ROUND_UP(sizeof(*sp) + sizeof(*sin), sizeof(uint64_t));
@@ -511,6 +516,9 @@ static int parse_exthdrs(struct sk_buff *skb, const struct sadb_msg *hdr, void *
 		uint16_t ext_type;
 		int ext_len;
 
+		if (len < sizeof(*ehdr))
+			return -EINVAL;
+
 		ext_len  = ehdr->sadb_ext_len;
 		ext_len *= sizeof(uint64_t);
 		ext_type = ehdr->sadb_ext_type;
@@ -2194,8 +2202,10 @@ static int key_notify_policy(struct xfrm_policy *xp, int dir, const struct km_ev
 		return PTR_ERR(out_skb);
 
 	err = pfkey_xfrm_policy2msg(out_skb, xp, dir);
-	if (err < 0)
+	if (err < 0) {
+		kfree_skb(out_skb);
 		return err;
+	}
 
 	out_hdr = (struct sadb_msg *) out_skb->data;
 	out_hdr->sadb_msg_version = PF_KEY_V2;
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c
index 70e9d2c..4daafb0 100644
--- a/net/mac80211/rx.c
+++ b/net/mac80211/rx.c
@@ -3632,6 +3632,8 @@ static bool ieee80211_accept_frame(struct ieee80211_rx_data *rx)
 		}
 		return true;
 	case NL80211_IFTYPE_MESH_POINT:
+		if (ether_addr_equal(sdata->vif.addr, hdr->addr2))
+			return false;
 		if (multicast)
 			return true;
 		return ether_addr_equal(sdata->vif.addr, hdr->addr1);
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 85f643c..4efaa30 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -1044,7 +1044,7 @@ static void gc_worker(struct work_struct *work)
 		 * we will just continue with next hash slot.
 		 */
 		rcu_read_unlock();
-		cond_resched_rcu_qs();
+		cond_resched();
 	} while (++buckets < goal);
 
 	if (gc_work->exiting)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 10798b3..07bd413 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2072,7 +2072,7 @@ static int nf_tables_dump_rules(struct sk_buff *skb,
 				continue;
 
 			list_for_each_entry_rcu(chain, &table->chains, list) {
-				if (ctx && ctx->chain[0] &&
+				if (ctx && ctx->chain &&
 				    strcmp(ctx->chain, chain->name) != 0)
 					continue;
 
@@ -4665,8 +4665,10 @@ static int nf_tables_dump_obj_done(struct netlink_callback *cb)
 {
 	struct nft_obj_filter *filter = cb->data;
 
-	kfree(filter->table);
-	kfree(filter);
+	if (filter) {
+		kfree(filter->table);
+		kfree(filter);
+	}
 
 	return 0;
 }
diff --git a/net/netfilter/xt_bpf.c b/net/netfilter/xt_bpf.c
index 1f7fbd3..06b090d 100644
--- a/net/netfilter/xt_bpf.c
+++ b/net/netfilter/xt_bpf.c
@@ -55,21 +55,11 @@ static int __bpf_mt_check_fd(int fd, struct bpf_prog **ret)
 
 static int __bpf_mt_check_path(const char *path, struct bpf_prog **ret)
 {
-	mm_segment_t oldfs = get_fs();
-	int retval, fd;
-
 	if (strnlen(path, XT_BPF_PATH_MAX) == XT_BPF_PATH_MAX)
 		return -EINVAL;
 
-	set_fs(KERNEL_DS);
-	fd = bpf_obj_get_user(path, 0);
-	set_fs(oldfs);
-	if (fd < 0)
-		return fd;
-
-	retval = __bpf_mt_check_fd(fd, ret);
-	sys_close(fd);
-	return retval;
+	*ret = bpf_prog_get_type_path(path, BPF_PROG_TYPE_SOCKET_FILTER);
+	return PTR_ERR_OR_ZERO(*ret);
 }
 
 static int bpf_mt_check(const struct xt_mtchk_param *par)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 79cc1bf..84a4e4c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2384,13 +2384,14 @@ int netlink_rcv_skb(struct sk_buff *skb, int (*cb)(struct sk_buff *,
 						   struct nlmsghdr *,
 						   struct netlink_ext_ack *))
 {
-	struct netlink_ext_ack extack = {};
+	struct netlink_ext_ack extack;
 	struct nlmsghdr *nlh;
 	int err;
 
 	while (skb->len >= nlmsg_total_size(0)) {
 		int msglen;
 
+		memset(&extack, 0, sizeof(extack));
 		nlh = nlmsg_hdr(skb);
 		err = 0;
 
diff --git a/net/openvswitch/flow_netlink.c b/net/openvswitch/flow_netlink.c
index 624ea74..f143908 100644
--- a/net/openvswitch/flow_netlink.c
+++ b/net/openvswitch/flow_netlink.c
@@ -49,7 +49,6 @@
 #include <net/mpls.h>
 #include <net/vxlan.h>
 #include <net/tun_proto.h>
-#include <net/erspan.h>
 
 #include "flow_netlink.h"
 
@@ -334,8 +333,7 @@ size_t ovs_tun_key_attr_size(void)
 		 * OVS_TUNNEL_KEY_ATTR_GENEVE_OPTS and covered by it.
 		 */
 		+ nla_total_size(2)    /* OVS_TUNNEL_KEY_ATTR_TP_SRC */
-		+ nla_total_size(2)    /* OVS_TUNNEL_KEY_ATTR_TP_DST */
-		+ nla_total_size(4);   /* OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS */
+		+ nla_total_size(2);   /* OVS_TUNNEL_KEY_ATTR_TP_DST */
 }
 
 static size_t ovs_nsh_key_attr_size(void)
@@ -402,7 +400,6 @@ static const struct ovs_len_tbl ovs_tunnel_key_lens[OVS_TUNNEL_KEY_ATTR_MAX + 1]
 						.next = ovs_vxlan_ext_key_lens },
 	[OVS_TUNNEL_KEY_ATTR_IPV6_SRC]      = { .len = sizeof(struct in6_addr) },
 	[OVS_TUNNEL_KEY_ATTR_IPV6_DST]      = { .len = sizeof(struct in6_addr) },
-	[OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS]   = { .len = sizeof(u32) },
 };
 
 static const struct ovs_len_tbl
@@ -634,33 +631,6 @@ static int vxlan_tun_opt_from_nlattr(const struct nlattr *attr,
 	return 0;
 }
 
-static int erspan_tun_opt_from_nlattr(const struct nlattr *attr,
-				      struct sw_flow_match *match, bool is_mask,
-				      bool log)
-{
-	unsigned long opt_key_offset;
-	struct erspan_metadata opts;
-
-	BUILD_BUG_ON(sizeof(opts) > sizeof(match->key->tun_opts));
-
-	memset(&opts, 0, sizeof(opts));
-	opts.index = nla_get_be32(attr);
-
-	/* Index has only 20-bit */
-	if (ntohl(opts.index) & ~INDEX_MASK) {
-		OVS_NLERR(log, "ERSPAN index number %x too large.",
-			  ntohl(opts.index));
-		return -EINVAL;
-	}
-
-	SW_FLOW_KEY_PUT(match, tun_opts_len, sizeof(opts), is_mask);
-	opt_key_offset = TUN_METADATA_OFFSET(sizeof(opts));
-	SW_FLOW_KEY_MEMCPY_OFFSET(match, opt_key_offset, &opts, sizeof(opts),
-				  is_mask);
-
-	return 0;
-}
-
 static int ip_tun_from_nlattr(const struct nlattr *attr,
 			      struct sw_flow_match *match, bool is_mask,
 			      bool log)
@@ -768,19 +738,6 @@ static int ip_tun_from_nlattr(const struct nlattr *attr,
 			break;
 		case OVS_TUNNEL_KEY_ATTR_PAD:
 			break;
-		case OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS:
-			if (opts_type) {
-				OVS_NLERR(log, "Multiple metadata blocks provided");
-				return -EINVAL;
-			}
-
-			err = erspan_tun_opt_from_nlattr(a, match, is_mask, log);
-			if (err)
-				return err;
-
-			tun_flags |= TUNNEL_ERSPAN_OPT;
-			opts_type = type;
-			break;
 		default:
 			OVS_NLERR(log, "Unknown IP tunnel attribute %d",
 				  type);
@@ -905,10 +862,6 @@ static int __ip_tun_to_nlattr(struct sk_buff *skb,
 		else if (output->tun_flags & TUNNEL_VXLAN_OPT &&
 			 vxlan_opt_to_nlattr(skb, tun_opts, swkey_tun_opts_len))
 			return -EMSGSIZE;
-		else if (output->tun_flags & TUNNEL_ERSPAN_OPT &&
-			 nla_put_be32(skb, OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS,
-				      ((struct erspan_metadata *)tun_opts)->index))
-			return -EMSGSIZE;
 	}
 
 	return 0;
@@ -2533,8 +2486,6 @@ static int validate_and_copy_set_tun(const struct nlattr *attr,
 			break;
 		case OVS_TUNNEL_KEY_ATTR_VXLAN_OPTS:
 			break;
-		case OVS_TUNNEL_KEY_ATTR_ERSPAN_OPTS:
-			break;
 		}
 	};
 
diff --git a/net/rds/rdma.c b/net/rds/rdma.c
index bc2f1e0..634cfcb 100644
--- a/net/rds/rdma.c
+++ b/net/rds/rdma.c
@@ -525,6 +525,9 @@ int rds_rdma_extra_size(struct rds_rdma_args *args)
 
 	local_vec = (struct rds_iovec __user *)(unsigned long) args->local_vec_addr;
 
+	if (args->nr_local == 0)
+		return -EINVAL;
+
 	/* figure out the number of pages in the vector */
 	for (i = 0; i < args->nr_local; i++) {
 		if (copy_from_user(&vec, &local_vec[i],
@@ -874,6 +877,7 @@ int rds_cmsg_atomic(struct rds_sock *rs, struct rds_message *rm,
 err:
 	if (page)
 		put_page(page);
+	rm->atomic.op_active = 0;
 	kfree(rm->atomic.op_notifier);
 
 	return ret;
diff --git a/net/rds/tcp.c b/net/rds/tcp.c
index 6b7ee71..ab7356e 100644
--- a/net/rds/tcp.c
+++ b/net/rds/tcp.c
@@ -90,9 +90,10 @@ void rds_tcp_nonagle(struct socket *sock)
 			      sizeof(val));
 }
 
-u32 rds_tcp_snd_nxt(struct rds_tcp_connection *tc)
+u32 rds_tcp_write_seq(struct rds_tcp_connection *tc)
 {
-	return tcp_sk(tc->t_sock->sk)->snd_nxt;
+	/* seq# of the last byte of data in tcp send buffer */
+	return tcp_sk(tc->t_sock->sk)->write_seq;
 }
 
 u32 rds_tcp_snd_una(struct rds_tcp_connection *tc)
diff --git a/net/rds/tcp.h b/net/rds/tcp.h
index 1aafbf7..864ca7d 100644
--- a/net/rds/tcp.h
+++ b/net/rds/tcp.h
@@ -54,7 +54,7 @@ void rds_tcp_set_callbacks(struct socket *sock, struct rds_conn_path *cp);
 void rds_tcp_reset_callbacks(struct socket *sock, struct rds_conn_path *cp);
 void rds_tcp_restore_callbacks(struct socket *sock,
 			       struct rds_tcp_connection *tc);
-u32 rds_tcp_snd_nxt(struct rds_tcp_connection *tc);
+u32 rds_tcp_write_seq(struct rds_tcp_connection *tc);
 u32 rds_tcp_snd_una(struct rds_tcp_connection *tc);
 u64 rds_tcp_map_seq(struct rds_tcp_connection *tc, u32 seq);
 extern struct rds_transport rds_tcp_transport;
diff --git a/net/rds/tcp_send.c b/net/rds/tcp_send.c
index dc860d1..9b76e0f 100644
--- a/net/rds/tcp_send.c
+++ b/net/rds/tcp_send.c
@@ -86,7 +86,7 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 		 * m_ack_seq is set to the sequence number of the last byte of
 		 * header and data.  see rds_tcp_is_acked().
 		 */
-		tc->t_last_sent_nxt = rds_tcp_snd_nxt(tc);
+		tc->t_last_sent_nxt = rds_tcp_write_seq(tc);
 		rm->m_ack_seq = tc->t_last_sent_nxt +
 				sizeof(struct rds_header) +
 				be32_to_cpu(rm->m_inc.i_hdr.h_len) - 1;
@@ -98,7 +98,7 @@ int rds_tcp_xmit(struct rds_connection *conn, struct rds_message *rm,
 			rm->m_inc.i_hdr.h_flags |= RDS_FLAG_RETRANSMITTED;
 
 		rdsdebug("rm %p tcp nxt %u ack_seq %llu\n",
-			 rm, rds_tcp_snd_nxt(tc),
+			 rm, rds_tcp_write_seq(tc),
 			 (unsigned long long)rm->m_ack_seq);
 	}
 
diff --git a/net/sched/act_gact.c b/net/sched/act_gact.c
index e29a48e..a0ac42b 100644
--- a/net/sched/act_gact.c
+++ b/net/sched/act_gact.c
@@ -159,7 +159,7 @@ static void tcf_gact_stats_update(struct tc_action *a, u64 bytes, u32 packets,
 	if (action == TC_ACT_SHOT)
 		this_cpu_ptr(gact->common.cpu_qstats)->drops += packets;
 
-	tm->lastuse = lastuse;
+	tm->lastuse = max_t(u64, tm->lastuse, lastuse);
 }
 
 static int tcf_gact_dump(struct sk_buff *skb, struct tc_action *a,
diff --git a/net/sched/act_mirred.c b/net/sched/act_mirred.c
index 8b3e5938..08b6184 100644
--- a/net/sched/act_mirred.c
+++ b/net/sched/act_mirred.c
@@ -239,7 +239,7 @@ static void tcf_stats_update(struct tc_action *a, u64 bytes, u32 packets,
 	struct tcf_t *tm = &m->tcf_tm;
 
 	_bstats_cpu_update(this_cpu_ptr(a->cpu_bstats), bytes, packets);
-	tm->lastuse = lastuse;
+	tm->lastuse = max_t(u64, tm->lastuse, lastuse);
 }
 
 static int tcf_mirred_dump(struct sk_buff *skb, struct tc_action *a, int bind,
diff --git a/net/sched/cls_bpf.c b/net/sched/cls_bpf.c
index 8d78e7f..a62586e 100644
--- a/net/sched/cls_bpf.c
+++ b/net/sched/cls_bpf.c
@@ -183,10 +183,17 @@ static int cls_bpf_offload_cmd(struct tcf_proto *tp, struct cls_bpf_prog *prog,
 	return 0;
 }
 
+static u32 cls_bpf_flags(u32 flags)
+{
+	return flags & CLS_BPF_SUPPORTED_GEN_FLAGS;
+}
+
 static int cls_bpf_offload(struct tcf_proto *tp, struct cls_bpf_prog *prog,
 			   struct cls_bpf_prog *oldprog)
 {
-	if (prog && oldprog && prog->gen_flags != oldprog->gen_flags)
+	if (prog && oldprog &&
+	    cls_bpf_flags(prog->gen_flags) !=
+	    cls_bpf_flags(oldprog->gen_flags))
 		return -EINVAL;
 
 	if (prog && tc_skip_hw(prog->gen_flags))
diff --git a/net/sched/em_nbyte.c b/net/sched/em_nbyte.c
index df3110d..07c10ba 100644
--- a/net/sched/em_nbyte.c
+++ b/net/sched/em_nbyte.c
@@ -51,7 +51,7 @@ static int em_nbyte_match(struct sk_buff *skb, struct tcf_ematch *em,
 	if (!tcf_valid_offset(skb, ptr, nbyte->hdr.len))
 		return 0;
 
-	return !memcmp(ptr + nbyte->hdr.off, nbyte->pattern, nbyte->hdr.len);
+	return !memcmp(ptr, nbyte->pattern, nbyte->hdr.len);
 }
 
 static struct tcf_ematch_ops em_nbyte_ops = {
diff --git a/net/sched/sch_api.c b/net/sched/sch_api.c
index 0f1eab9..52529b7 100644
--- a/net/sched/sch_api.c
+++ b/net/sched/sch_api.c
@@ -1063,17 +1063,6 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 	}
 
 	if (!ops->init || (err = ops->init(sch, tca[TCA_OPTIONS])) == 0) {
-		if (qdisc_is_percpu_stats(sch)) {
-			sch->cpu_bstats =
-				netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu);
-			if (!sch->cpu_bstats)
-				goto err_out4;
-
-			sch->cpu_qstats = alloc_percpu(struct gnet_stats_queue);
-			if (!sch->cpu_qstats)
-				goto err_out4;
-		}
-
 		if (tca[TCA_STAB]) {
 			stab = qdisc_get_stab(tca[TCA_STAB]);
 			if (IS_ERR(stab)) {
@@ -1115,7 +1104,7 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 		ops->destroy(sch);
 err_out3:
 	dev_put(dev);
-	kfree((char *) sch - sch->padded);
+	qdisc_free(sch);
 err_out2:
 	module_put(ops->owner);
 err_out:
@@ -1123,8 +1112,6 @@ static struct Qdisc *qdisc_create(struct net_device *dev,
 	return NULL;
 
 err_out4:
-	free_percpu(sch->cpu_bstats);
-	free_percpu(sch->cpu_qstats);
 	/*
 	 * Any broken qdiscs that would require a ops->reset() here?
 	 * The qdisc was never in action so it shouldn't be necessary.
diff --git a/net/sched/sch_generic.c b/net/sched/sch_generic.c
index 661c714..cac003f 100644
--- a/net/sched/sch_generic.c
+++ b/net/sched/sch_generic.c
@@ -633,6 +633,19 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 	qdisc_skb_head_init(&sch->q);
 	spin_lock_init(&sch->q.lock);
 
+	if (ops->static_flags & TCQ_F_CPUSTATS) {
+		sch->cpu_bstats =
+			netdev_alloc_pcpu_stats(struct gnet_stats_basic_cpu);
+		if (!sch->cpu_bstats)
+			goto errout1;
+
+		sch->cpu_qstats = alloc_percpu(struct gnet_stats_queue);
+		if (!sch->cpu_qstats) {
+			free_percpu(sch->cpu_bstats);
+			goto errout1;
+		}
+	}
+
 	spin_lock_init(&sch->busylock);
 	lockdep_set_class(&sch->busylock,
 			  dev->qdisc_tx_busylock ?: &qdisc_tx_busylock);
@@ -642,6 +655,7 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 			  dev->qdisc_running_key ?: &qdisc_running_key);
 
 	sch->ops = ops;
+	sch->flags = ops->static_flags;
 	sch->enqueue = ops->enqueue;
 	sch->dequeue = ops->dequeue;
 	sch->dev_queue = dev_queue;
@@ -649,6 +663,8 @@ struct Qdisc *qdisc_alloc(struct netdev_queue *dev_queue,
 	refcount_set(&sch->refcnt, 1);
 
 	return sch;
+errout1:
+	kfree(p);
 errout:
 	return ERR_PTR(err);
 }
@@ -698,7 +714,7 @@ void qdisc_reset(struct Qdisc *qdisc)
 }
 EXPORT_SYMBOL(qdisc_reset);
 
-static void qdisc_free(struct Qdisc *qdisc)
+void qdisc_free(struct Qdisc *qdisc)
 {
 	if (qdisc_is_percpu_stats(qdisc)) {
 		free_percpu(qdisc->cpu_bstats);
diff --git a/net/sched/sch_ingress.c b/net/sched/sch_ingress.c
index fc1286f..003e1b0 100644
--- a/net/sched/sch_ingress.c
+++ b/net/sched/sch_ingress.c
@@ -66,7 +66,6 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 {
 	struct ingress_sched_data *q = qdisc_priv(sch);
 	struct net_device *dev = qdisc_dev(sch);
-	int err;
 
 	net_inc_ingress_queue();
 
@@ -76,13 +75,7 @@ static int ingress_init(struct Qdisc *sch, struct nlattr *opt)
 	q->block_info.chain_head_change = clsact_chain_head_change;
 	q->block_info.chain_head_change_priv = &q->miniqp;
 
-	err = tcf_block_get_ext(&q->block, sch, &q->block_info);
-	if (err)
-		return err;
-
-	sch->flags |= TCQ_F_CPUSTATS;
-
-	return 0;
+	return tcf_block_get_ext(&q->block, sch, &q->block_info);
 }
 
 static void ingress_destroy(struct Qdisc *sch)
@@ -121,6 +114,7 @@ static struct Qdisc_ops ingress_qdisc_ops __read_mostly = {
 	.cl_ops		=	&ingress_class_ops,
 	.id		=	"ingress",
 	.priv_size	=	sizeof(struct ingress_sched_data),
+	.static_flags	=	TCQ_F_CPUSTATS,
 	.init		=	ingress_init,
 	.destroy	=	ingress_destroy,
 	.dump		=	ingress_dump,
@@ -192,13 +186,7 @@ static int clsact_init(struct Qdisc *sch, struct nlattr *opt)
 	q->egress_block_info.chain_head_change = clsact_chain_head_change;
 	q->egress_block_info.chain_head_change_priv = &q->miniqp_egress;
 
-	err = tcf_block_get_ext(&q->egress_block, sch, &q->egress_block_info);
-	if (err)
-		return err;
-
-	sch->flags |= TCQ_F_CPUSTATS;
-
-	return 0;
+	return tcf_block_get_ext(&q->egress_block, sch, &q->egress_block_info);
 }
 
 static void clsact_destroy(struct Qdisc *sch)
@@ -225,6 +213,7 @@ static struct Qdisc_ops clsact_qdisc_ops __read_mostly = {
 	.cl_ops		=	&clsact_class_ops,
 	.id		=	"clsact",
 	.priv_size	=	sizeof(struct clsact_sched_data),
+	.static_flags	=	TCQ_F_CPUSTATS,
 	.init		=	clsact_init,
 	.destroy	=	clsact_destroy,
 	.dump		=	ingress_dump,
diff --git a/net/sctp/input.c b/net/sctp/input.c
index 621b5ca..141c9c4 100644
--- a/net/sctp/input.c
+++ b/net/sctp/input.c
@@ -399,20 +399,24 @@ void sctp_icmp_frag_needed(struct sock *sk, struct sctp_association *asoc,
 		return;
 	}
 
-	if (t->param_flags & SPP_PMTUD_ENABLE) {
-		/* Update transports view of the MTU */
-		sctp_transport_update_pmtu(t, pmtu);
+	if (!(t->param_flags & SPP_PMTUD_ENABLE))
+		/* We can't allow retransmitting in such case, as the
+		 * retransmission would be sized just as before, and thus we
+		 * would get another icmp, and retransmit again.
+		 */
+		return;
 
-		/* Update association pmtu. */
-		sctp_assoc_sync_pmtu(asoc);
-	}
-
-	/* Retransmit with the new pmtu setting.
-	 * Normally, if PMTU discovery is disabled, an ICMP Fragmentation
-	 * Needed will never be sent, but if a message was sent before
-	 * PMTU discovery was disabled that was larger than the PMTU, it
-	 * would not be fragmented, so it must be re-transmitted fragmented.
+	/* Update transports view of the MTU. Return if no update was needed.
+	 * If an update wasn't needed/possible, it also doesn't make sense to
+	 * try to retransmit now.
 	 */
+	if (!sctp_transport_update_pmtu(t, pmtu))
+		return;
+
+	/* Update association pmtu. */
+	sctp_assoc_sync_pmtu(asoc);
+
+	/* Retransmit with the new pmtu setting. */
 	sctp_retransmit(&asoc->outqueue, t, SCTP_RTXR_PMTUD);
 }
 
diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
index 3b18085e..5d4c15b 100644
--- a/net/sctp/ipv6.c
+++ b/net/sctp/ipv6.c
@@ -826,6 +826,7 @@ static int sctp_inet6_af_supported(sa_family_t family, struct sctp_sock *sp)
 	case AF_INET:
 		if (!__ipv6_only_sock(sctp_opt2sk(sp)))
 			return 1;
+		/* fallthru */
 	default:
 		return 0;
 	}
diff --git a/net/sctp/offload.c b/net/sctp/offload.c
index 275925b..35bc710 100644
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -45,6 +45,9 @@ static struct sk_buff *sctp_gso_segment(struct sk_buff *skb,
 	struct sk_buff *segs = ERR_PTR(-EINVAL);
 	struct sctphdr *sh;
 
+	if (!(skb_shinfo(skb)->gso_type & SKB_GSO_SCTP))
+		goto out;
+
 	sh = sctp_hdr(skb);
 	if (!pskb_may_pull(skb, sizeof(*sh)))
 		goto out;
diff --git a/net/sctp/outqueue.c b/net/sctp/outqueue.c
index 7d67fee..c4ec99b 100644
--- a/net/sctp/outqueue.c
+++ b/net/sctp/outqueue.c
@@ -918,9 +918,9 @@ static void sctp_outq_flush(struct sctp_outq *q, int rtx_timeout, gfp_t gfp)
 			break;
 
 		case SCTP_CID_ABORT:
-			if (sctp_test_T_bit(chunk)) {
+			if (sctp_test_T_bit(chunk))
 				packet->vtag = asoc->c.my_vtag;
-			}
+			/* fallthru */
 		/* The following chunks are "response" chunks, i.e.
 		 * they are generated in response to something we
 		 * received.  If we are sending these, then we can
diff --git a/net/sctp/socket.c b/net/sctp/socket.c
index b4fb6e4..039fcb6 100644
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -85,7 +85,7 @@
 static int sctp_writeable(struct sock *sk);
 static void sctp_wfree(struct sk_buff *skb);
 static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
-				size_t msg_len, struct sock **orig_sk);
+				size_t msg_len);
 static int sctp_wait_for_packet(struct sock *sk, int *err, long *timeo_p);
 static int sctp_wait_for_connect(struct sctp_association *, long *timeo_p);
 static int sctp_wait_for_accept(struct sock *sk, long timeo);
@@ -335,16 +335,14 @@ static struct sctp_af *sctp_sockaddr_af(struct sctp_sock *opt,
 	if (len < sizeof (struct sockaddr))
 		return NULL;
 
+	if (!opt->pf->af_supported(addr->sa.sa_family, opt))
+		return NULL;
+
 	/* V4 mapped address are really of AF_INET family */
 	if (addr->sa.sa_family == AF_INET6 &&
-	    ipv6_addr_v4mapped(&addr->v6.sin6_addr)) {
-		if (!opt->pf->af_supported(AF_INET, opt))
-			return NULL;
-	} else {
-		/* Does this PF support this AF? */
-		if (!opt->pf->af_supported(addr->sa.sa_family, opt))
-			return NULL;
-	}
+	    ipv6_addr_v4mapped(&addr->v6.sin6_addr) &&
+	    !opt->pf->af_supported(AF_INET, opt))
+		return NULL;
 
 	/* If we get this far, af is valid. */
 	af = sctp_get_af_specific(addr->sa.sa_family);
@@ -1883,8 +1881,14 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
 		 */
 		if (sinit) {
 			if (sinit->sinit_num_ostreams) {
-				asoc->c.sinit_num_ostreams =
-					sinit->sinit_num_ostreams;
+				__u16 outcnt = sinit->sinit_num_ostreams;
+
+				asoc->c.sinit_num_ostreams = outcnt;
+				/* outcnt has been changed, so re-init stream */
+				err = sctp_stream_init(&asoc->stream, outcnt, 0,
+						       GFP_KERNEL);
+				if (err)
+					goto out_free;
 			}
 			if (sinit->sinit_max_instreams) {
 				asoc->c.sinit_max_instreams =
@@ -1971,7 +1975,7 @@ static int sctp_sendmsg(struct sock *sk, struct msghdr *msg, size_t msg_len)
 	timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
 	if (!sctp_wspace(asoc)) {
 		/* sk can be changed by peel off when waiting for buf. */
-		err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len, &sk);
+		err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len);
 		if (err) {
 			if (err == -ESRCH) {
 				/* asoc is already dead. */
@@ -2277,7 +2281,7 @@ static int sctp_setsockopt_events(struct sock *sk, char __user *optval,
 
 		if (asoc && sctp_outq_is_empty(&asoc->outqueue)) {
 			event = sctp_ulpevent_make_sender_dry_event(asoc,
-					GFP_ATOMIC);
+					GFP_USER | __GFP_NOWARN);
 			if (!event)
 				return -ENOMEM;
 
@@ -3498,6 +3502,8 @@ static int sctp_setsockopt_hmac_ident(struct sock *sk,
 
 	if (optlen < sizeof(struct sctp_hmacalgo))
 		return -EINVAL;
+	optlen = min_t(unsigned int, optlen, sizeof(struct sctp_hmacalgo) +
+					     SCTP_AUTH_NUM_HMACS * sizeof(u16));
 
 	hmacs = memdup_user(optval, optlen);
 	if (IS_ERR(hmacs))
@@ -3536,6 +3542,11 @@ static int sctp_setsockopt_auth_key(struct sock *sk,
 
 	if (optlen <= sizeof(struct sctp_authkey))
 		return -EINVAL;
+	/* authkey->sca_keylength is u16, so optlen can't be bigger than
+	 * this.
+	 */
+	optlen = min_t(unsigned int, optlen, USHRT_MAX +
+					     sizeof(struct sctp_authkey));
 
 	authkey = memdup_user(optval, optlen);
 	if (IS_ERR(authkey))
@@ -3893,6 +3904,9 @@ static int sctp_setsockopt_reset_streams(struct sock *sk,
 
 	if (optlen < sizeof(*params))
 		return -EINVAL;
+	/* srs_number_streams is u16, so optlen can't be bigger than this. */
+	optlen = min_t(unsigned int, optlen, USHRT_MAX +
+					     sizeof(__u16) * sizeof(*params));
 
 	params = memdup_user(optval, optlen);
 	if (IS_ERR(params))
@@ -5015,7 +5029,7 @@ static int sctp_getsockopt_autoclose(struct sock *sk, int len, char __user *optv
 	len = sizeof(int);
 	if (put_user(len, optlen))
 		return -EFAULT;
-	if (copy_to_user(optval, &sctp_sk(sk)->autoclose, sizeof(int)))
+	if (copy_to_user(optval, &sctp_sk(sk)->autoclose, len))
 		return -EFAULT;
 	return 0;
 }
@@ -5645,6 +5659,9 @@ static int sctp_getsockopt_local_addrs(struct sock *sk, int len,
 		err = -EFAULT;
 		goto out;
 	}
+	/* XXX: We should have accounted for sizeof(struct sctp_getaddrs) too,
+	 * but we can't change it anymore.
+	 */
 	if (put_user(bytes_copied, optlen))
 		err = -EFAULT;
 out:
@@ -6081,7 +6098,7 @@ static int sctp_getsockopt_maxseg(struct sock *sk, int len,
 		params.assoc_id = 0;
 	} else if (len >= sizeof(struct sctp_assoc_value)) {
 		len = sizeof(struct sctp_assoc_value);
-		if (copy_from_user(&params, optval, sizeof(params)))
+		if (copy_from_user(&params, optval, len))
 			return -EFAULT;
 	} else
 		return -EINVAL;
@@ -6251,7 +6268,9 @@ static int sctp_getsockopt_active_key(struct sock *sk, int len,
 
 	if (len < sizeof(struct sctp_authkeyid))
 		return -EINVAL;
-	if (copy_from_user(&val, optval, sizeof(struct sctp_authkeyid)))
+
+	len = sizeof(struct sctp_authkeyid);
+	if (copy_from_user(&val, optval, len))
 		return -EFAULT;
 
 	asoc = sctp_id2assoc(sk, val.scact_assoc_id);
@@ -6263,7 +6282,6 @@ static int sctp_getsockopt_active_key(struct sock *sk, int len,
 	else
 		val.scact_keynumber = ep->active_key_id;
 
-	len = sizeof(struct sctp_authkeyid);
 	if (put_user(len, optlen))
 		return -EFAULT;
 	if (copy_to_user(optval, &val, len))
@@ -6289,7 +6307,7 @@ static int sctp_getsockopt_peer_auth_chunks(struct sock *sk, int len,
 	if (len < sizeof(struct sctp_authchunks))
 		return -EINVAL;
 
-	if (copy_from_user(&val, optval, sizeof(struct sctp_authchunks)))
+	if (copy_from_user(&val, optval, sizeof(val)))
 		return -EFAULT;
 
 	to = p->gauth_chunks;
@@ -6334,7 +6352,7 @@ static int sctp_getsockopt_local_auth_chunks(struct sock *sk, int len,
 	if (len < sizeof(struct sctp_authchunks))
 		return -EINVAL;
 
-	if (copy_from_user(&val, optval, sizeof(struct sctp_authchunks)))
+	if (copy_from_user(&val, optval, sizeof(val)))
 		return -EFAULT;
 
 	to = p->gauth_chunks;
@@ -8002,12 +8020,12 @@ void sctp_sock_rfree(struct sk_buff *skb)
 
 /* Helper function to wait for space in the sndbuf.  */
 static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
-				size_t msg_len, struct sock **orig_sk)
+				size_t msg_len)
 {
 	struct sock *sk = asoc->base.sk;
-	int err = 0;
 	long current_timeo = *timeo_p;
 	DEFINE_WAIT(wait);
+	int err = 0;
 
 	pr_debug("%s: asoc:%p, timeo:%ld, msg_len:%zu\n", __func__, asoc,
 		 *timeo_p, msg_len);
@@ -8036,17 +8054,13 @@ static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
 		release_sock(sk);
 		current_timeo = schedule_timeout(current_timeo);
 		lock_sock(sk);
-		if (sk != asoc->base.sk) {
-			release_sock(sk);
-			sk = asoc->base.sk;
-			lock_sock(sk);
-		}
+		if (sk != asoc->base.sk)
+			goto do_error;
 
 		*timeo_p = current_timeo;
 	}
 
 out:
-	*orig_sk = sk;
 	finish_wait(&asoc->wait, &wait);
 
 	/* Release the association's refcnt.  */
diff --git a/net/sctp/stream.c b/net/sctp/stream.c
index 76ea66b..524dfeb 100644
--- a/net/sctp/stream.c
+++ b/net/sctp/stream.c
@@ -156,9 +156,9 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	sctp_stream_outq_migrate(stream, NULL, outcnt);
 	sched->sched_all(stream);
 
-	i = sctp_stream_alloc_out(stream, outcnt, gfp);
-	if (i)
-		return i;
+	ret = sctp_stream_alloc_out(stream, outcnt, gfp);
+	if (ret)
+		goto out;
 
 	stream->outcnt = outcnt;
 	for (i = 0; i < stream->outcnt; i++)
@@ -170,19 +170,17 @@ int sctp_stream_init(struct sctp_stream *stream, __u16 outcnt, __u16 incnt,
 	if (!incnt)
 		goto out;
 
-	i = sctp_stream_alloc_in(stream, incnt, gfp);
-	if (i) {
-		ret = -ENOMEM;
-		goto free;
+	ret = sctp_stream_alloc_in(stream, incnt, gfp);
+	if (ret) {
+		sched->free(stream);
+		kfree(stream->out);
+		stream->out = NULL;
+		stream->outcnt = 0;
+		goto out;
 	}
 
 	stream->incnt = incnt;
-	goto out;
 
-free:
-	sched->free(stream);
-	kfree(stream->out);
-	stream->out = NULL;
 out:
 	return ret;
 }
diff --git a/net/sctp/transport.c b/net/sctp/transport.c
index 1e5a224..47f82bd 100644
--- a/net/sctp/transport.c
+++ b/net/sctp/transport.c
@@ -248,28 +248,37 @@ void sctp_transport_pmtu(struct sctp_transport *transport, struct sock *sk)
 		transport->pathmtu = SCTP_DEFAULT_MAXSEGMENT;
 }
 
-void sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu)
+bool sctp_transport_update_pmtu(struct sctp_transport *t, u32 pmtu)
 {
 	struct dst_entry *dst = sctp_transport_dst_check(t);
+	bool change = true;
 
 	if (unlikely(pmtu < SCTP_DEFAULT_MINSEGMENT)) {
-		pr_warn("%s: Reported pmtu %d too low, using default minimum of %d\n",
-			__func__, pmtu, SCTP_DEFAULT_MINSEGMENT);
-		/* Use default minimum segment size and disable
-		 * pmtu discovery on this transport.
-		 */
-		t->pathmtu = SCTP_DEFAULT_MINSEGMENT;
-	} else {
-		t->pathmtu = pmtu;
+		pr_warn_ratelimited("%s: Reported pmtu %d too low, using default minimum of %d\n",
+				    __func__, pmtu, SCTP_DEFAULT_MINSEGMENT);
+		/* Use default minimum segment instead */
+		pmtu = SCTP_DEFAULT_MINSEGMENT;
 	}
+	pmtu = SCTP_TRUNC4(pmtu);
 
 	if (dst) {
 		dst->ops->update_pmtu(dst, t->asoc->base.sk, NULL, pmtu);
 		dst = sctp_transport_dst_check(t);
 	}
 
-	if (!dst)
+	if (!dst) {
 		t->af_specific->get_dst(t, &t->saddr, &t->fl, t->asoc->base.sk);
+		dst = t->dst;
+	}
+
+	if (dst) {
+		/* Re-fetch, as under layers may have a higher minimum size */
+		pmtu = SCTP_TRUNC4(dst_mtu(dst));
+		change = t->pathmtu != pmtu;
+	}
+	t->pathmtu = pmtu;
+
+	return change;
 }
 
 /* Caches the dst entry and source address for a transport's destination
diff --git a/net/socket.c b/net/socket.c
index 05f361f..6f05d5c 100644
--- a/net/socket.c
+++ b/net/socket.c
@@ -436,8 +436,10 @@ static int sock_map_fd(struct socket *sock, int flags)
 {
 	struct file *newfile;
 	int fd = get_unused_fd_flags(flags);
-	if (unlikely(fd < 0))
+	if (unlikely(fd < 0)) {
+		sock_release(sock);
 		return fd;
+	}
 
 	newfile = sock_alloc_file(sock, flags, NULL);
 	if (likely(!IS_ERR(newfile))) {
@@ -2619,6 +2621,15 @@ static int __init sock_init(void)
 
 core_initcall(sock_init);	/* early initcall */
 
+static int __init jit_init(void)
+{
+#ifdef CONFIG_BPF_JIT_ALWAYS_ON
+	bpf_jit_enable = 1;
+#endif
+	return 0;
+}
+pure_initcall(jit_init);
+
 #ifdef CONFIG_PROC_FS
 void socket_seq_show(struct seq_file *seq)
 {
diff --git a/net/tipc/group.c b/net/tipc/group.c
index 8e12ab5..5f4ffae 100644
--- a/net/tipc/group.c
+++ b/net/tipc/group.c
@@ -109,7 +109,8 @@ static void tipc_group_proto_xmit(struct tipc_group *grp, struct tipc_member *m,
 static void tipc_group_decr_active(struct tipc_group *grp,
 				   struct tipc_member *m)
 {
-	if (m->state == MBR_ACTIVE || m->state == MBR_RECLAIMING)
+	if (m->state == MBR_ACTIVE || m->state == MBR_RECLAIMING ||
+	    m->state == MBR_REMITTED)
 		grp->active_cnt--;
 }
 
@@ -562,7 +563,7 @@ void tipc_group_update_rcv_win(struct tipc_group *grp, int blks, u32 node,
 	int max_active = grp->max_active;
 	int reclaim_limit = max_active * 3 / 4;
 	int active_cnt = grp->active_cnt;
-	struct tipc_member *m, *rm;
+	struct tipc_member *m, *rm, *pm;
 
 	m = tipc_group_find_member(grp, node, port);
 	if (!m)
@@ -605,6 +606,17 @@ void tipc_group_update_rcv_win(struct tipc_group *grp, int blks, u32 node,
 			pr_warn_ratelimited("Rcv unexpected msg after REMIT\n");
 			tipc_group_proto_xmit(grp, m, GRP_ADV_MSG, xmitq);
 		}
+		grp->active_cnt--;
+		list_del_init(&m->list);
+		if (list_empty(&grp->pending))
+			return;
+
+		/* Set oldest pending member to active and advertise */
+		pm = list_first_entry(&grp->pending, struct tipc_member, list);
+		pm->state = MBR_ACTIVE;
+		list_move_tail(&pm->list, &grp->active);
+		grp->active_cnt++;
+		tipc_group_proto_xmit(grp, pm, GRP_ADV_MSG, xmitq);
 		break;
 	case MBR_RECLAIMING:
 	case MBR_DISCOVERED:
@@ -742,14 +754,14 @@ void tipc_group_proto_rcv(struct tipc_group *grp, bool *usr_wakeup,
 		if (!m || m->state != MBR_RECLAIMING)
 			return;
 
-		list_del_init(&m->list);
-		grp->active_cnt--;
 		remitted = msg_grp_remitted(hdr);
 
 		/* Messages preceding the REMIT still in receive queue */
 		if (m->advertised > remitted) {
 			m->state = MBR_REMITTED;
 			in_flight = m->advertised - remitted;
+			m->advertised = ADV_IDLE + in_flight;
+			return;
 		}
 		/* All messages preceding the REMIT have been read */
 		if (m->advertised <= remitted) {
@@ -761,6 +773,8 @@ void tipc_group_proto_rcv(struct tipc_group *grp, bool *usr_wakeup,
 			tipc_group_proto_xmit(grp, m, GRP_ADV_MSG, xmitq);
 
 		m->advertised = ADV_IDLE + in_flight;
+		grp->active_cnt--;
+		list_del_init(&m->list);
 
 		/* Set oldest pending member to active and advertise */
 		if (list_empty(&grp->pending))
diff --git a/net/tipc/node.c b/net/tipc/node.c
index 507017f..9036d87 100644
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1880,36 +1880,38 @@ int tipc_nl_node_get_link(struct sk_buff *skb, struct genl_info *info)
 
 	if (strcmp(name, tipc_bclink_name) == 0) {
 		err = tipc_nl_add_bc_link(net, &msg);
-		if (err) {
-			nlmsg_free(msg.skb);
-			return err;
-		}
+		if (err)
+			goto err_free;
 	} else {
 		int bearer_id;
 		struct tipc_node *node;
 		struct tipc_link *link;
 
 		node = tipc_node_find_by_name(net, name, &bearer_id);
-		if (!node)
-			return -EINVAL;
+		if (!node) {
+			err = -EINVAL;
+			goto err_free;
+		}
 
 		tipc_node_read_lock(node);
 		link = node->links[bearer_id].link;
 		if (!link) {
 			tipc_node_read_unlock(node);
-			nlmsg_free(msg.skb);
-			return -EINVAL;
+			err = -EINVAL;
+			goto err_free;
 		}
 
 		err = __tipc_nl_add_link(net, &msg, link, 0);
 		tipc_node_read_unlock(node);
-		if (err) {
-			nlmsg_free(msg.skb);
-			return err;
-		}
+		if (err)
+			goto err_free;
 	}
 
 	return genlmsg_reply(msg.skb, info);
+
+err_free:
+	nlmsg_free(msg.skb);
+	return err;
 }
 
 int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info)
diff --git a/net/tls/tls_main.c b/net/tls/tls_main.c
index e07ee3a..736719c 100644
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -367,8 +367,10 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 
 	crypto_info = &ctx->crypto_send;
 	/* Currently we don't support set crypto info more than one time */
-	if (TLS_CRYPTO_INFO_READY(crypto_info))
+	if (TLS_CRYPTO_INFO_READY(crypto_info)) {
+		rc = -EBUSY;
 		goto out;
+	}
 
 	rc = copy_from_user(crypto_info, optval, sizeof(*crypto_info));
 	if (rc) {
@@ -386,7 +388,7 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 	case TLS_CIPHER_AES_GCM_128: {
 		if (optlen != sizeof(struct tls12_crypto_info_aes_gcm_128)) {
 			rc = -EINVAL;
-			goto out;
+			goto err_crypto_info;
 		}
 		rc = copy_from_user(crypto_info + 1, optval + sizeof(*crypto_info),
 				    optlen - sizeof(*crypto_info));
@@ -398,7 +400,7 @@ static int do_tls_setsockopt_tx(struct sock *sk, char __user *optval,
 	}
 	default:
 		rc = -EINVAL;
-		goto out;
+		goto err_crypto_info;
 	}
 
 	/* currently SW is default, we will have ethtool in future */
@@ -454,6 +456,15 @@ static int tls_init(struct sock *sk)
 	struct tls_context *ctx;
 	int rc = 0;
 
+	/* The TLS ulp is currently supported only for TCP sockets
+	 * in ESTABLISHED state.
+	 * Supporting sockets in LISTEN state will require us
+	 * to modify the accept implementation to clone rather then
+	 * share the ulp context.
+	 */
+	if (sk->sk_state != TCP_ESTABLISHED)
+		return -ENOTSUPP;
+
 	/* allocate tls context */
 	ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
 	if (!ctx) {
diff --git a/net/tls/tls_sw.c b/net/tls/tls_sw.c
index 73d1921..0a9b72f 100644
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -391,7 +391,7 @@ int tls_sw_sendmsg(struct sock *sk, struct msghdr *msg, size_t size)
 
 	while (msg_data_left(msg)) {
 		if (sk->sk_err) {
-			ret = sk->sk_err;
+			ret = -sk->sk_err;
 			goto send_end;
 		}
 
@@ -544,7 +544,7 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 		size_t copy, required_size;
 
 		if (sk->sk_err) {
-			ret = sk->sk_err;
+			ret = -sk->sk_err;
 			goto sendpage_end;
 		}
 
@@ -577,6 +577,8 @@ int tls_sw_sendpage(struct sock *sk, struct page *page,
 		get_page(page);
 		sg = ctx->sg_plaintext_data + ctx->sg_plaintext_num_elem;
 		sg_set_page(sg, page, copy, offset);
+		sg_unmark_end(sg);
+
 		ctx->sg_plaintext_num_elem++;
 
 		sk_mem_charge(sk, copy);
@@ -681,18 +683,17 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx)
 	}
 	default:
 		rc = -EINVAL;
-		goto out;
+		goto free_priv;
 	}
 
 	ctx->prepend_size = TLS_HEADER_SIZE + nonce_size;
 	ctx->tag_size = tag_size;
 	ctx->overhead_size = ctx->prepend_size + ctx->tag_size;
 	ctx->iv_size = iv_size;
-	ctx->iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE,
-			  GFP_KERNEL);
+	ctx->iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE, GFP_KERNEL);
 	if (!ctx->iv) {
 		rc = -ENOMEM;
-		goto out;
+		goto free_priv;
 	}
 	memcpy(ctx->iv, gcm_128_info->salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
 	memcpy(ctx->iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, iv, iv_size);
@@ -740,7 +741,7 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx)
 
 	rc = crypto_aead_setauthsize(sw_ctx->aead_send, ctx->tag_size);
 	if (!rc)
-		goto out;
+		return 0;
 
 free_aead:
 	crypto_free_aead(sw_ctx->aead_send);
@@ -751,6 +752,9 @@ int tls_set_sw_offload(struct sock *sk, struct tls_context *ctx)
 free_iv:
 	kfree(ctx->iv);
 	ctx->iv = NULL;
+free_priv:
+	kfree(ctx->priv_ctx);
+	ctx->priv_ctx = NULL;
 out:
 	return rc;
 }
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 5d28abf..c9473d6 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -951,7 +951,7 @@ static unsigned int vsock_poll(struct file *file, struct socket *sock,
 		 * POLLOUT|POLLWRNORM when peer is closed and nothing to read,
 		 * but local send is not shutdown.
 		 */
-		if (sk->sk_state == TCP_CLOSE) {
+		if (sk->sk_state == TCP_CLOSE || sk->sk_state == TCP_CLOSING) {
 			if (!(sk->sk_shutdown & SEND_SHUTDOWN))
 				mask |= POLLOUT | POLLWRNORM;
 
diff --git a/net/wireless/core.c b/net/wireless/core.c
index fdde0d9..a6f3cac 100644
--- a/net/wireless/core.c
+++ b/net/wireless/core.c
@@ -439,6 +439,8 @@ struct wiphy *wiphy_new_nm(const struct cfg80211_ops *ops, int sizeof_priv,
 		if (rv)
 			goto use_default_name;
 	} else {
+		int rv;
+
 use_default_name:
 		/* NOTE:  This is *probably* safe w/out holding rtnl because of
 		 * the restrictions on phy names.  Probably this call could
@@ -446,7 +448,11 @@ struct wiphy *wiphy_new_nm(const struct cfg80211_ops *ops, int sizeof_priv,
 		 * phyX.  But, might should add some locking and check return
 		 * value, and use a different name if this one exists?
 		 */
-		dev_set_name(&rdev->wiphy.dev, PHY_NAME "%d", rdev->wiphy_idx);
+		rv = dev_set_name(&rdev->wiphy.dev, PHY_NAME "%d", rdev->wiphy_idx);
+		if (rv < 0) {
+			kfree(rdev);
+			return NULL;
+		}
 	}
 
 	INIT_LIST_HEAD(&rdev->wiphy.wdev_list);
diff --git a/net/wireless/core.h b/net/wireless/core.h
index d2f7e8b..eaff636 100644
--- a/net/wireless/core.h
+++ b/net/wireless/core.h
@@ -507,8 +507,6 @@ void cfg80211_stop_p2p_device(struct cfg80211_registered_device *rdev,
 void cfg80211_stop_nan(struct cfg80211_registered_device *rdev,
 		       struct wireless_dev *wdev);
 
-#define CFG80211_MAX_NUM_DIFFERENT_CHANNELS 10
-
 #ifdef CONFIG_CFG80211_DEVELOPER_WARNINGS
 #define CFG80211_DEV_WARN_ON(cond)	WARN_ON(cond)
 #else
diff --git a/net/wireless/nl80211.c b/net/wireless/nl80211.c
index 213d0c4..542a4fc 100644
--- a/net/wireless/nl80211.c
+++ b/net/wireless/nl80211.c
@@ -2618,12 +2618,13 @@ static int nl80211_send_iface(struct sk_buff *msg, u32 portid, u32 seq, int flag
 		const u8 *ssid_ie;
 		if (!wdev->current_bss)
 			break;
+		rcu_read_lock();
 		ssid_ie = ieee80211_bss_get_ie(&wdev->current_bss->pub,
 					       WLAN_EID_SSID);
-		if (!ssid_ie)
-			break;
-		if (nla_put(msg, NL80211_ATTR_SSID, ssid_ie[1], ssid_ie + 2))
-			goto nla_put_failure_locked;
+		if (ssid_ie &&
+		    nla_put(msg, NL80211_ATTR_SSID, ssid_ie[1], ssid_ie + 2))
+			goto nla_put_failure_rcu_locked;
+		rcu_read_unlock();
 		break;
 		}
 	default:
@@ -2635,6 +2636,8 @@ static int nl80211_send_iface(struct sk_buff *msg, u32 portid, u32 seq, int flag
 	genlmsg_end(msg, hdr);
 	return 0;
 
+ nla_put_failure_rcu_locked:
+	rcu_read_unlock();
  nla_put_failure_locked:
 	wdev_unlock(wdev);
  nla_put_failure:
@@ -9806,7 +9809,7 @@ static int cfg80211_cqm_rssi_update(struct cfg80211_registered_device *rdev,
 	 */
 	if (!wdev->cqm_config->last_rssi_event_value && wdev->current_bss &&
 	    rdev->ops->get_station) {
-		struct station_info sinfo;
+		struct station_info sinfo = {};
 		u8 *mac_addr;
 
 		mac_addr = wdev->current_bss->pub.bssid;
@@ -11361,7 +11364,8 @@ static int nl80211_nan_add_func(struct sk_buff *skb,
 		break;
 	case NL80211_NAN_FUNC_FOLLOW_UP:
 		if (!tb[NL80211_NAN_FUNC_FOLLOW_UP_ID] ||
-		    !tb[NL80211_NAN_FUNC_FOLLOW_UP_REQ_ID]) {
+		    !tb[NL80211_NAN_FUNC_FOLLOW_UP_REQ_ID] ||
+		    !tb[NL80211_NAN_FUNC_FOLLOW_UP_DEST]) {
 			err = -EINVAL;
 			goto out;
 		}
diff --git a/net/wireless/reg.c b/net/wireless/reg.c
index 78e71b0..7b42f0b 100644
--- a/net/wireless/reg.c
+++ b/net/wireless/reg.c
@@ -1769,8 +1769,7 @@ static void handle_reg_beacon(struct wiphy *wiphy, unsigned int chan_idx,
 	if (wiphy->regulatory_flags & REGULATORY_DISABLE_BEACON_HINTS)
 		return;
 
-	chan_before.center_freq = chan->center_freq;
-	chan_before.flags = chan->flags;
+	chan_before = *chan;
 
 	if (chan->flags & IEEE80211_CHAN_NO_IR) {
 		chan->flags &= ~IEEE80211_CHAN_NO_IR;
diff --git a/net/wireless/wext-compat.c b/net/wireless/wext-compat.c
index 7ca04a7..05186a4 100644
--- a/net/wireless/wext-compat.c
+++ b/net/wireless/wext-compat.c
@@ -1254,8 +1254,7 @@ static int cfg80211_wext_giwrate(struct net_device *dev,
 {
 	struct wireless_dev *wdev = dev->ieee80211_ptr;
 	struct cfg80211_registered_device *rdev = wiphy_to_rdev(wdev->wiphy);
-	/* we are under RTNL - globally locked - so can use a static struct */
-	static struct station_info sinfo;
+	struct station_info sinfo = {};
 	u8 addr[ETH_ALEN];
 	int err;
 
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 30e5746..ac947718 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -102,6 +102,7 @@ int xfrm_dev_state_add(struct net *net, struct xfrm_state *x,
 
 	err = dev->xfrmdev_ops->xdo_dev_state_add(x);
 	if (err) {
+		xso->dev = NULL;
 		dev_put(dev);
 		return err;
 	}
diff --git a/net/xfrm/xfrm_input.c b/net/xfrm/xfrm_input.c
index 3f6f6f8..5b24097 100644
--- a/net/xfrm/xfrm_input.c
+++ b/net/xfrm/xfrm_input.c
@@ -518,7 +518,7 @@ int xfrm_trans_queue(struct sk_buff *skb,
 		return -ENOBUFS;
 
 	XFRM_TRANS_SKB_CB(skb)->finish = finish;
-	skb_queue_tail(&trans->queue, skb);
+	__skb_queue_tail(&trans->queue, skb);
 	tasklet_schedule(&trans->tasklet);
 	return 0;
 }
diff --git a/net/xfrm/xfrm_policy.c b/net/xfrm/xfrm_policy.c
index 70aa5cb..bd6b0e7 100644
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -609,7 +609,8 @@ static void xfrm_hash_rebuild(struct work_struct *work)
 
 	/* re-insert all policies by order of creation */
 	list_for_each_entry_reverse(policy, &net->xfrm.policy_all, walk.all) {
-		if (xfrm_policy_id2dir(policy->index) >= XFRM_POLICY_MAX) {
+		if (policy->walk.dead ||
+		    xfrm_policy_id2dir(policy->index) >= XFRM_POLICY_MAX) {
 			/* skip socket policies */
 			continue;
 		}
@@ -974,8 +975,6 @@ int xfrm_policy_flush(struct net *net, u8 type, bool task_valid)
 	}
 	if (!cnt)
 		err = -ESRCH;
-	else
-		xfrm_policy_cache_flush();
 out:
 	spin_unlock_bh(&net->xfrm.xfrm_policy_lock);
 	return err;
@@ -1743,6 +1742,8 @@ void xfrm_policy_cache_flush(void)
 	bool found = 0;
 	int cpu;
 
+	might_sleep();
+
 	local_bh_disable();
 	rcu_read_lock();
 	for_each_possible_cpu(cpu) {
@@ -2062,8 +2063,11 @@ xfrm_bundle_lookup(struct net *net, const struct flowi *fl, u16 family, u8 dir,
 	if (num_xfrms <= 0)
 		goto make_dummy_bundle;
 
+	local_bh_disable();
 	xdst = xfrm_resolve_and_create_bundle(pols, num_pols, fl, family,
-						  xflo->dst_orig);
+					      xflo->dst_orig);
+	local_bh_enable();
+
 	if (IS_ERR(xdst)) {
 		err = PTR_ERR(xdst);
 		if (err != -EAGAIN)
@@ -2150,9 +2154,12 @@ struct dst_entry *xfrm_lookup(struct net *net, struct dst_entry *dst_orig,
 				goto no_transform;
 			}
 
+			local_bh_disable();
 			xdst = xfrm_resolve_and_create_bundle(
 					pols, num_pols, fl,
 					family, dst_orig);
+			local_bh_enable();
+
 			if (IS_ERR(xdst)) {
 				xfrm_pols_put(pols, num_pols);
 				err = PTR_ERR(xdst);
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 500b339..a3785f5 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -313,13 +313,14 @@ xfrm_get_type_offload(u8 proto, unsigned short family, bool try_load)
 	if ((type && !try_module_get(type->owner)))
 		type = NULL;
 
+	rcu_read_unlock();
+
 	if (!type && try_load) {
 		request_module("xfrm-offload-%d-%d", family, proto);
-		try_load = 0;
+		try_load = false;
 		goto retry;
 	}
 
-	rcu_read_unlock();
 	return type;
 }
 
@@ -1534,8 +1535,12 @@ int xfrm_state_update(struct xfrm_state *x)
 	err = -EINVAL;
 	spin_lock_bh(&x1->lock);
 	if (likely(x1->km.state == XFRM_STATE_VALID)) {
-		if (x->encap && x1->encap)
+		if (x->encap && x1->encap &&
+		    x->encap->encap_type == x1->encap->encap_type)
 			memcpy(x1->encap, x->encap, sizeof(*x1->encap));
+		else if (x->encap || x1->encap)
+			goto fail;
+
 		if (x->coaddr && x1->coaddr) {
 			memcpy(x1->coaddr, x->coaddr, sizeof(*x1->coaddr));
 		}
@@ -1552,6 +1557,8 @@ int xfrm_state_update(struct xfrm_state *x)
 		x->km.state = XFRM_STATE_DEAD;
 		__xfrm_state_put(x);
 	}
+
+fail:
 	spin_unlock_bh(&x1->lock);
 
 	xfrm_state_put(x1);
@@ -2265,8 +2272,6 @@ int __xfrm_init_state(struct xfrm_state *x, bool init_replay, bool offload)
 			goto error;
 	}
 
-	x->km.state = XFRM_STATE_VALID;
-
 error:
 	return err;
 }
@@ -2275,7 +2280,13 @@ EXPORT_SYMBOL(__xfrm_init_state);
 
 int xfrm_init_state(struct xfrm_state *x)
 {
-	return __xfrm_init_state(x, true, false);
+	int err;
+
+	err = __xfrm_init_state(x, true, false);
+	if (!err)
+		x->km.state = XFRM_STATE_VALID;
+
+	return err;
 }
 
 EXPORT_SYMBOL(xfrm_init_state);
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index bdb48e5..7f52b8e 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -598,13 +598,6 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
 			goto error;
 	}
 
-	if (attrs[XFRMA_OFFLOAD_DEV]) {
-		err = xfrm_dev_state_add(net, x,
-					 nla_data(attrs[XFRMA_OFFLOAD_DEV]));
-		if (err)
-			goto error;
-	}
-
 	if ((err = xfrm_alloc_replay_state_esn(&x->replay_esn, &x->preplay_esn,
 					       attrs[XFRMA_REPLAY_ESN_VAL])))
 		goto error;
@@ -620,6 +613,14 @@ static struct xfrm_state *xfrm_state_construct(struct net *net,
 	/* override default values from above */
 	xfrm_update_ae_params(x, attrs, 0);
 
+	/* configure the hardware if offload is requested */
+	if (attrs[XFRMA_OFFLOAD_DEV]) {
+		err = xfrm_dev_state_add(net, x,
+					 nla_data(attrs[XFRMA_OFFLOAD_DEV]));
+		if (err)
+			goto error;
+	}
+
 	return x;
 
 error:
@@ -662,6 +663,9 @@ static int xfrm_add_sa(struct sk_buff *skb, struct nlmsghdr *nlh,
 		goto out;
 	}
 
+	if (x->km.state == XFRM_STATE_VOID)
+		x->km.state = XFRM_STATE_VALID;
+
 	c.seq = nlh->nlmsg_seq;
 	c.portid = nlh->nlmsg_pid;
 	c.event = nlh->nlmsg_type;
diff --git a/scripts/Makefile.build b/scripts/Makefile.build
index cb8997e..47cddf3 100644
--- a/scripts/Makefile.build
+++ b/scripts/Makefile.build
@@ -265,12 +265,18 @@
 objtool_args += $(call cc-ifversion, -lt, 0405, --no-unreachable)
 endif
 
+ifdef CONFIG_MODVERSIONS
+objtool_o = $(@D)/.tmp_$(@F)
+else
+objtool_o = $(@)
+endif
+
 # 'OBJECT_FILES_NON_STANDARD := y': skip objtool checking for a directory
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'y': skip objtool checking for a file
 # 'OBJECT_FILES_NON_STANDARD_foo.o := 'n': override directory skip for a file
 cmd_objtool = $(if $(patsubst y%,, \
 	$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n), \
-	$(__objtool_obj) $(objtool_args) "$(@)";)
+	$(__objtool_obj) $(objtool_args) "$(objtool_o)";)
 objtool_obj = $(if $(patsubst y%,, \
 	$(OBJECT_FILES_NON_STANDARD_$(basetarget).o)$(OBJECT_FILES_NON_STANDARD)n), \
 	$(__objtool_obj))
@@ -286,16 +292,16 @@
 define rule_cc_o_c
 	$(call echo-cmd,checksrc) $(cmd_checksrc)			  \
 	$(call cmd_and_fixdep,cc_o_c)					  \
-	$(cmd_modversions_c)						  \
 	$(cmd_checkdoc)							  \
 	$(call echo-cmd,objtool) $(cmd_objtool)				  \
+	$(cmd_modversions_c)						  \
 	$(call echo-cmd,record_mcount) $(cmd_record_mcount)
 endef
 
 define rule_as_o_S
 	$(call cmd_and_fixdep,as_o_S)					  \
-	$(cmd_modversions_S)						  \
-	$(call echo-cmd,objtool) $(cmd_objtool)
+	$(call echo-cmd,objtool) $(cmd_objtool)				  \
+	$(cmd_modversions_S)
 endef
 
 # List module undefined symbols (or empty line if not enabled)
diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
index 31031f1..ba03f17 100755
--- a/scripts/checkpatch.pl
+++ b/scripts/checkpatch.pl
@@ -5586,6 +5586,12 @@
 			}
 		}
 
+# check for smp_read_barrier_depends and read_barrier_depends
+		if (!$file && $line =~ /\b(smp_|)read_barrier_depends\s*\(/) {
+			WARN("READ_BARRIER_DEPENDS",
+			     "$1read_barrier_depends should only be used in READ_ONCE or DEC Alpha code\n" . $herecurr);
+		}
+
 # check of hardware specific defines
 		if ($line =~ m@^.\s*\#\s*if.*\b(__i386__|__powerpc64__|__sun__|__s390x__)\b@ && $realfile !~ m@include/asm-@) {
 			CHK("ARCH_DEFINES",
diff --git a/scripts/decodecode b/scripts/decodecode
index 438120d..5ea0710 100755
--- a/scripts/decodecode
+++ b/scripts/decodecode
@@ -59,6 +59,14 @@
 		${CROSS_COMPILE}strip $1.o
 	fi
 
+	if [ "$ARCH" = "arm64" ]; then
+		if [ $width -eq 4 ]; then
+			type=inst
+		fi
+
+		${CROSS_COMPILE}strip $1.o
+	fi
+
 	${CROSS_COMPILE}objdump $OBJDUMPFLAGS -S $1.o | \
 		grep -v "/tmp\|Disassembly\|\.text\|^$" > $1.dis 2>&1
 }
diff --git a/scripts/gdb/linux/tasks.py b/scripts/gdb/linux/tasks.py
index 1bf949c..f6ab3cc 100644
--- a/scripts/gdb/linux/tasks.py
+++ b/scripts/gdb/linux/tasks.py
@@ -96,6 +96,8 @@
         thread_info_addr = task.address + ia64_task_size
         thread_info = thread_info_addr.cast(thread_info_ptr_type)
     else:
+        if task.type.fields()[0].type == thread_info_type.get_type():
+            return task['thread_info']
         thread_info = task['stack'].cast(thread_info_ptr_type)
     return thread_info.dereference()
 
diff --git a/scripts/genksyms/.gitignore b/scripts/genksyms/.gitignore
index 86dc07a..e7836b4 100644
--- a/scripts/genksyms/.gitignore
+++ b/scripts/genksyms/.gitignore
@@ -1,4 +1,3 @@
-*.hash.c
 *.lex.c
 *.tab.c
 *.tab.h
diff --git a/scripts/kconfig/expr.c b/scripts/kconfig/expr.c
index cbf4996..8cee597 100644
--- a/scripts/kconfig/expr.c
+++ b/scripts/kconfig/expr.c
@@ -893,7 +893,10 @@ static enum string_value_kind expr_parse_string(const char *str,
 	switch (type) {
 	case S_BOOLEAN:
 	case S_TRISTATE:
-		return k_string;
+		val->s = !strcmp(str, "n") ? 0 :
+			 !strcmp(str, "m") ? 1 :
+			 !strcmp(str, "y") ? 2 : -1;
+		return k_signed;
 	case S_INT:
 		val->s = strtoll(str, &tail, 10);
 		kind = k_signed;
diff --git a/scripts/mod/modpost.c b/scripts/mod/modpost.c
index f51cf97..6510536 100644
--- a/scripts/mod/modpost.c
+++ b/scripts/mod/modpost.c
@@ -2165,6 +2165,14 @@ static void add_intree_flag(struct buffer *b, int is_intree)
 		buf_printf(b, "\nMODULE_INFO(intree, \"Y\");\n");
 }
 
+/* Cannot check for assembler */
+static void add_retpoline(struct buffer *b)
+{
+	buf_printf(b, "\n#ifdef RETPOLINE\n");
+	buf_printf(b, "MODULE_INFO(retpoline, \"Y\");\n");
+	buf_printf(b, "#endif\n");
+}
+
 static void add_staging_flag(struct buffer *b, const char *name)
 {
 	static const char *staging_dir = "drivers/staging";
@@ -2506,6 +2514,7 @@ int main(int argc, char **argv)
 		err |= check_modname_len(mod);
 		add_header(&buf, mod);
 		add_intree_flag(&buf, !external_module);
+		add_retpoline(&buf);
 		add_staging_flag(&buf, mod->name);
 		err |= add_versions(&buf, mod);
 		add_depends(&buf, mod, modules);
diff --git a/security/Kconfig b/security/Kconfig
index a623d13..b0cb9a5 100644
--- a/security/Kconfig
+++ b/security/Kconfig
@@ -56,13 +56,14 @@
 
 config PAGE_TABLE_ISOLATION
 	bool "Remove the kernel mapping in user mode"
+	default y
 	depends on X86_64 && !UML
 	help
 	  This feature reduces the number of hardware side channels by
 	  ensuring that the majority of kernel addresses are not mapped
 	  into userspace.
 
-	  See Documentation/x86/pagetable-isolation.txt for more details.
+	  See Documentation/x86/pti.txt for more details.
 
 config SECURITY_INFINIBAND
 	bool "Infiniband Security Hooks"
diff --git a/security/apparmor/domain.c b/security/apparmor/domain.c
index 04ba9d0..6a54d2f 100644
--- a/security/apparmor/domain.c
+++ b/security/apparmor/domain.c
@@ -330,10 +330,7 @@ static struct aa_profile *__attach_match(const char *name,
 			continue;
 
 		if (profile->xmatch) {
-			if (profile->xmatch_len == len) {
-				conflict = true;
-				continue;
-			} else if (profile->xmatch_len > len) {
+			if (profile->xmatch_len >= len) {
 				unsigned int state;
 				u32 perm;
 
@@ -342,6 +339,10 @@ static struct aa_profile *__attach_match(const char *name,
 				perm = dfa_user_allow(profile->xmatch, state);
 				/* any accepting state means a valid match. */
 				if (perm & MAY_EXEC) {
+					if (profile->xmatch_len == len) {
+						conflict = true;
+						continue;
+					}
 					candidate = profile;
 					len = profile->xmatch_len;
 					conflict = false;
diff --git a/security/apparmor/include/perms.h b/security/apparmor/include/perms.h
index 2b27bb7..d7b7e71 100644
--- a/security/apparmor/include/perms.h
+++ b/security/apparmor/include/perms.h
@@ -133,6 +133,9 @@ extern struct aa_perms allperms;
 #define xcheck_labels_profiles(L1, L2, FN, args...)		\
 	xcheck_ns_labels((L1), (L2), xcheck_ns_profile_label, (FN), args)
 
+#define xcheck_labels(L1, L2, P, FN1, FN2)			\
+	xcheck(fn_for_each((L1), (P), (FN1)), fn_for_each((L2), (P), (FN2)))
+
 
 void aa_perm_mask_to_str(char *str, const char *chrs, u32 mask);
 void aa_audit_perm_names(struct audit_buffer *ab, const char **names, u32 mask);
diff --git a/security/apparmor/ipc.c b/security/apparmor/ipc.c
index 7ca0032..b40678f 100644
--- a/security/apparmor/ipc.c
+++ b/security/apparmor/ipc.c
@@ -64,40 +64,48 @@ static void audit_ptrace_cb(struct audit_buffer *ab, void *va)
 			FLAGS_NONE, GFP_ATOMIC);
 }
 
+/* assumes check for PROFILE_MEDIATES is already done */
 /* TODO: conditionals */
 static int profile_ptrace_perm(struct aa_profile *profile,
-			       struct aa_profile *peer, u32 request,
-			       struct common_audit_data *sa)
+			     struct aa_label *peer, u32 request,
+			     struct common_audit_data *sa)
 {
 	struct aa_perms perms = { };
 
-	/* need because of peer in cross check */
-	if (profile_unconfined(profile) ||
-	    !PROFILE_MEDIATES(profile, AA_CLASS_PTRACE))
-		return 0;
-
-	aad(sa)->peer = &peer->label;
-	aa_profile_match_label(profile, &peer->label, AA_CLASS_PTRACE, request,
+	aad(sa)->peer = peer;
+	aa_profile_match_label(profile, peer, AA_CLASS_PTRACE, request,
 			       &perms);
 	aa_apply_modes_to_perms(profile, &perms);
 	return aa_check_perms(profile, &perms, request, sa, audit_ptrace_cb);
 }
 
-static int cross_ptrace_perm(struct aa_profile *tracer,
-			     struct aa_profile *tracee, u32 request,
-			     struct common_audit_data *sa)
+static int profile_tracee_perm(struct aa_profile *tracee,
+			       struct aa_label *tracer, u32 request,
+			       struct common_audit_data *sa)
 {
+	if (profile_unconfined(tracee) || unconfined(tracer) ||
+	    !PROFILE_MEDIATES(tracee, AA_CLASS_PTRACE))
+		return 0;
+
+	return profile_ptrace_perm(tracee, tracer, request, sa);
+}
+
+static int profile_tracer_perm(struct aa_profile *tracer,
+			       struct aa_label *tracee, u32 request,
+			       struct common_audit_data *sa)
+{
+	if (profile_unconfined(tracer))
+		return 0;
+
 	if (PROFILE_MEDIATES(tracer, AA_CLASS_PTRACE))
-		return xcheck(profile_ptrace_perm(tracer, tracee, request, sa),
-			      profile_ptrace_perm(tracee, tracer,
-						  request << PTRACE_PERM_SHIFT,
-						  sa));
-	/* policy uses the old style capability check for ptrace */
-	if (profile_unconfined(tracer) || tracer == tracee)
+		return profile_ptrace_perm(tracer, tracee, request, sa);
+
+	/* profile uses the old style capability check for ptrace */
+	if (&tracer->label == tracee)
 		return 0;
 
 	aad(sa)->label = &tracer->label;
-	aad(sa)->peer = &tracee->label;
+	aad(sa)->peer = tracee;
 	aad(sa)->request = 0;
 	aad(sa)->error = aa_capable(&tracer->label, CAP_SYS_PTRACE, 1);
 
@@ -115,10 +123,13 @@ static int cross_ptrace_perm(struct aa_profile *tracer,
 int aa_may_ptrace(struct aa_label *tracer, struct aa_label *tracee,
 		  u32 request)
 {
+	struct aa_profile *profile;
+	u32 xrequest = request << PTRACE_PERM_SHIFT;
 	DEFINE_AUDIT_DATA(sa, LSM_AUDIT_DATA_NONE, OP_PTRACE);
 
-	return xcheck_labels_profiles(tracer, tracee, cross_ptrace_perm,
-				      request, &sa);
+	return xcheck_labels(tracer, tracee, profile,
+			profile_tracer_perm(profile, tracee, request, &sa),
+			profile_tracee_perm(profile, tracer, xrequest, &sa));
 }
 
 
diff --git a/security/apparmor/mount.c b/security/apparmor/mount.c
index ed9b4d0..8c558cb 100644
--- a/security/apparmor/mount.c
+++ b/security/apparmor/mount.c
@@ -329,6 +329,9 @@ static int match_mnt_path_str(struct aa_profile *profile,
 	AA_BUG(!mntpath);
 	AA_BUG(!buffer);
 
+	if (!PROFILE_MEDIATES(profile, AA_CLASS_MOUNT))
+		return 0;
+
 	error = aa_path_name(mntpath, path_flags(profile, mntpath), buffer,
 			     &mntpnt, &info, profile->disconnected);
 	if (error)
@@ -380,6 +383,9 @@ static int match_mnt(struct aa_profile *profile, const struct path *path,
 	AA_BUG(!profile);
 	AA_BUG(devpath && !devbuffer);
 
+	if (!PROFILE_MEDIATES(profile, AA_CLASS_MOUNT))
+		return 0;
+
 	if (devpath) {
 		error = aa_path_name(devpath, path_flags(profile, devpath),
 				     devbuffer, &devname, &info,
@@ -558,6 +564,9 @@ static int profile_umount(struct aa_profile *profile, struct path *path,
 	AA_BUG(!profile);
 	AA_BUG(!path);
 
+	if (!PROFILE_MEDIATES(profile, AA_CLASS_MOUNT))
+		return 0;
+
 	error = aa_path_name(path, path_flags(profile, path), buffer, &name,
 			     &info, profile->disconnected);
 	if (error)
@@ -613,7 +622,8 @@ static struct aa_label *build_pivotroot(struct aa_profile *profile,
 	AA_BUG(!new_path);
 	AA_BUG(!old_path);
 
-	if (profile_unconfined(profile))
+	if (profile_unconfined(profile) ||
+	    !PROFILE_MEDIATES(profile, AA_CLASS_MOUNT))
 		return aa_get_newest_label(&profile->label);
 
 	error = aa_path_name(old_path, path_flags(profile, old_path),
diff --git a/security/commoncap.c b/security/commoncap.c
index 4f8e093..48620c9 100644
--- a/security/commoncap.c
+++ b/security/commoncap.c
@@ -348,21 +348,18 @@ static __u32 sansflags(__u32 m)
 	return m & ~VFS_CAP_FLAGS_EFFECTIVE;
 }
 
-static bool is_v2header(size_t size, __le32 magic)
+static bool is_v2header(size_t size, const struct vfs_cap_data *cap)
 {
-	__u32 m = le32_to_cpu(magic);
 	if (size != XATTR_CAPS_SZ_2)
 		return false;
-	return sansflags(m) == VFS_CAP_REVISION_2;
+	return sansflags(le32_to_cpu(cap->magic_etc)) == VFS_CAP_REVISION_2;
 }
 
-static bool is_v3header(size_t size, __le32 magic)
+static bool is_v3header(size_t size, const struct vfs_cap_data *cap)
 {
-	__u32 m = le32_to_cpu(magic);
-
 	if (size != XATTR_CAPS_SZ_3)
 		return false;
-	return sansflags(m) == VFS_CAP_REVISION_3;
+	return sansflags(le32_to_cpu(cap->magic_etc)) == VFS_CAP_REVISION_3;
 }
 
 /*
@@ -405,7 +402,7 @@ int cap_inode_getsecurity(struct inode *inode, const char *name, void **buffer,
 
 	fs_ns = inode->i_sb->s_user_ns;
 	cap = (struct vfs_cap_data *) tmpbuf;
-	if (is_v2header((size_t) ret, cap->magic_etc)) {
+	if (is_v2header((size_t) ret, cap)) {
 		/* If this is sizeof(vfs_cap_data) then we're ok with the
 		 * on-disk value, so return that.  */
 		if (alloc)
@@ -413,7 +410,7 @@ int cap_inode_getsecurity(struct inode *inode, const char *name, void **buffer,
 		else
 			kfree(tmpbuf);
 		return ret;
-	} else if (!is_v3header((size_t) ret, cap->magic_etc)) {
+	} else if (!is_v3header((size_t) ret, cap)) {
 		kfree(tmpbuf);
 		return -EINVAL;
 	}
@@ -470,9 +467,9 @@ static kuid_t rootid_from_xattr(const void *value, size_t size,
 	return make_kuid(task_ns, rootid);
 }
 
-static bool validheader(size_t size, __le32 magic)
+static bool validheader(size_t size, const struct vfs_cap_data *cap)
 {
-	return is_v2header(size, magic) || is_v3header(size, magic);
+	return is_v2header(size, cap) || is_v3header(size, cap);
 }
 
 /*
@@ -495,7 +492,7 @@ int cap_convert_nscap(struct dentry *dentry, void **ivalue, size_t size)
 
 	if (!*ivalue)
 		return -EINVAL;
-	if (!validheader(size, cap->magic_etc))
+	if (!validheader(size, cap))
 		return -EINVAL;
 	if (!capable_wrt_inode_uidgid(inode, CAP_SETFCAP))
 		return -EPERM;
diff --git a/security/integrity/ima/ima_api.c b/security/integrity/ima/ima_api.c
index c7e8db0..c6ae422 100644
--- a/security/integrity/ima/ima_api.c
+++ b/security/integrity/ima/ima_api.c
@@ -18,6 +18,7 @@
 #include <linux/fs.h>
 #include <linux/xattr.h>
 #include <linux/evm.h>
+#include <linux/iversion.h>
 
 #include "ima.h"
 
@@ -215,7 +216,7 @@ int ima_collect_measurement(struct integrity_iint_cache *iint,
 	 * which do not support i_version, support is limited to an initial
 	 * measurement/appraisal/audit.
 	 */
-	i_version = file_inode(file)->i_version;
+	i_version = inode_query_iversion(inode);
 	hash.hdr.algo = algo;
 
 	/* Initialize hash digest to 0's in case of failure */
diff --git a/security/integrity/ima/ima_main.c b/security/integrity/ima/ima_main.c
index 7706546..06a70c5 100644
--- a/security/integrity/ima/ima_main.c
+++ b/security/integrity/ima/ima_main.c
@@ -24,6 +24,7 @@
 #include <linux/slab.h>
 #include <linux/xattr.h>
 #include <linux/ima.h>
+#include <linux/iversion.h>
 
 #include "ima.h"
 
@@ -127,7 +128,8 @@ static void ima_check_last_writer(struct integrity_iint_cache *iint,
 
 	inode_lock(inode);
 	if (atomic_read(&inode->i_writecount) == 1) {
-		if ((iint->version != inode->i_version) ||
+		if (!IS_I_VERSION(inode) ||
+		    inode_cmp_iversion(inode, iint->version) ||
 		    (iint->flags & IMA_NEW_FILE)) {
 			iint->flags &= ~(IMA_DONE_MASK | IMA_NEW_FILE);
 			iint->measured_pcrs = 0;
diff --git a/security/keys/keyring.c b/security/keys/keyring.c
index d0bcceb..41bcf57 100644
--- a/security/keys/keyring.c
+++ b/security/keys/keyring.c
@@ -713,7 +713,6 @@ static bool search_nested_keyrings(struct key *keyring,
 		 * doesn't contain any keyring pointers.
 		 */
 		shortcut = assoc_array_ptr_to_shortcut(ptr);
-		smp_read_barrier_depends();
 		if ((shortcut->index_key[0] & ASSOC_ARRAY_FAN_MASK) != 0)
 			goto not_this_keyring;
 
@@ -723,8 +722,6 @@ static bool search_nested_keyrings(struct key *keyring,
 	}
 
 	node = assoc_array_ptr_to_node(ptr);
-	smp_read_barrier_depends();
-
 	ptr = node->slots[0];
 	if (!assoc_array_ptr_is_meta(ptr))
 		goto begin_node;
@@ -736,7 +733,6 @@ static bool search_nested_keyrings(struct key *keyring,
 	kdebug("descend");
 	if (assoc_array_ptr_is_shortcut(ptr)) {
 		shortcut = assoc_array_ptr_to_shortcut(ptr);
-		smp_read_barrier_depends();
 		ptr = READ_ONCE(shortcut->next_node);
 		BUG_ON(!assoc_array_ptr_is_node(ptr));
 	}
@@ -744,7 +740,6 @@ static bool search_nested_keyrings(struct key *keyring,
 
 begin_node:
 	kdebug("begin_node");
-	smp_read_barrier_depends();
 	slot = 0;
 ascend_to_node:
 	/* Go through the slots in a node */
@@ -792,14 +787,12 @@ static bool search_nested_keyrings(struct key *keyring,
 
 	if (ptr && assoc_array_ptr_is_shortcut(ptr)) {
 		shortcut = assoc_array_ptr_to_shortcut(ptr);
-		smp_read_barrier_depends();
 		ptr = READ_ONCE(shortcut->back_pointer);
 		slot = shortcut->parent_slot;
 	}
 	if (!ptr)
 		goto not_this_keyring;
 	node = assoc_array_ptr_to_node(ptr);
-	smp_read_barrier_depends();
 	slot++;
 
 	/* If we've ascended to the root (zero backpointer), we must have just
diff --git a/sound/core/oss/pcm_oss.c b/sound/core/oss/pcm_oss.c
index e49f448..e8b19876 100644
--- a/sound/core/oss/pcm_oss.c
+++ b/sound/core/oss/pcm_oss.c
@@ -186,7 +186,7 @@ static int _snd_pcm_hw_param_mask(struct snd_pcm_hw_params *params,
 {
 	int changed;
 	changed = snd_mask_refine(hw_param_mask(params, var), val);
-	if (changed) {
+	if (changed > 0) {
 		params->cmask |= 1 << var;
 		params->rmask |= 1 << var;
 	}
@@ -233,7 +233,7 @@ static int _snd_pcm_hw_param_min(struct snd_pcm_hw_params *params,
 						  val, open);
 	else
 		return -EINVAL;
-	if (changed) {
+	if (changed > 0) {
 		params->cmask |= 1 << var;
 		params->rmask |= 1 << var;
 	}
@@ -294,7 +294,7 @@ static int _snd_pcm_hw_param_max(struct snd_pcm_hw_params *params,
 						  val, open);
 	else
 		return -EINVAL;
-	if (changed) {
+	if (changed > 0) {
 		params->cmask |= 1 << var;
 		params->rmask |= 1 << var;
 	}
@@ -455,7 +455,6 @@ static int snd_pcm_hw_param_near(struct snd_pcm_substream *pcm,
 		v = snd_pcm_hw_param_last(pcm, params, var, dir);
 	else
 		v = snd_pcm_hw_param_first(pcm, params, var, dir);
-	snd_BUG_ON(v < 0);
 	return v;
 }
 
@@ -500,7 +499,7 @@ static int _snd_pcm_hw_param_set(struct snd_pcm_hw_params *params,
 		}
 	} else
 		return -EINVAL;
-	if (changed) {
+	if (changed > 0) {
 		params->cmask |= 1 << var;
 		params->rmask |= 1 << var;
 	}
@@ -540,7 +539,7 @@ static int _snd_pcm_hw_param_setinteger(struct snd_pcm_hw_params *params,
 {
 	int changed;
 	changed = snd_interval_setinteger(hw_param_interval(params, var));
-	if (changed) {
+	if (changed > 0) {
 		params->cmask |= 1 << var;
 		params->rmask |= 1 << var;
 	}
@@ -843,7 +842,7 @@ static int snd_pcm_oss_change_params(struct snd_pcm_substream *substream,
 		if (!(mutex_trylock(&runtime->oss.params_lock)))
 			return -EAGAIN;
 	} else if (mutex_lock_interruptible(&runtime->oss.params_lock))
-		return -EINTR;
+		return -ERESTARTSYS;
 	sw_params = kzalloc(sizeof(*sw_params), GFP_KERNEL);
 	params = kmalloc(sizeof(*params), GFP_KERNEL);
 	sparams = kmalloc(sizeof(*sparams), GFP_KERNEL);
@@ -1335,8 +1334,11 @@ static ssize_t snd_pcm_oss_write1(struct snd_pcm_substream *substream, const cha
 
 	if ((tmp = snd_pcm_oss_make_ready(substream)) < 0)
 		return tmp;
-	mutex_lock(&runtime->oss.params_lock);
 	while (bytes > 0) {
+		if (mutex_lock_interruptible(&runtime->oss.params_lock)) {
+			tmp = -ERESTARTSYS;
+			break;
+		}
 		if (bytes < runtime->oss.period_bytes || runtime->oss.buffer_used > 0) {
 			tmp = bytes;
 			if (tmp + runtime->oss.buffer_used > runtime->oss.period_bytes)
@@ -1380,14 +1382,18 @@ static ssize_t snd_pcm_oss_write1(struct snd_pcm_substream *substream, const cha
 			xfer += tmp;
 			if ((substream->f_flags & O_NONBLOCK) != 0 &&
 			    tmp != runtime->oss.period_bytes)
-				break;
+				tmp = -EAGAIN;
 		}
-	}
-	mutex_unlock(&runtime->oss.params_lock);
-	return xfer;
-
  err:
-	mutex_unlock(&runtime->oss.params_lock);
+		mutex_unlock(&runtime->oss.params_lock);
+		if (tmp < 0)
+			break;
+		if (signal_pending(current)) {
+			tmp = -ERESTARTSYS;
+			break;
+		}
+		tmp = 0;
+	}
 	return xfer > 0 ? (snd_pcm_sframes_t)xfer : tmp;
 }
 
@@ -1435,8 +1441,11 @@ static ssize_t snd_pcm_oss_read1(struct snd_pcm_substream *substream, char __use
 
 	if ((tmp = snd_pcm_oss_make_ready(substream)) < 0)
 		return tmp;
-	mutex_lock(&runtime->oss.params_lock);
 	while (bytes > 0) {
+		if (mutex_lock_interruptible(&runtime->oss.params_lock)) {
+			tmp = -ERESTARTSYS;
+			break;
+		}
 		if (bytes < runtime->oss.period_bytes || runtime->oss.buffer_used > 0) {
 			if (runtime->oss.buffer_used == 0) {
 				tmp = snd_pcm_oss_read2(substream, runtime->oss.buffer, runtime->oss.period_bytes, 1);
@@ -1467,12 +1476,16 @@ static ssize_t snd_pcm_oss_read1(struct snd_pcm_substream *substream, char __use
 			bytes -= tmp;
 			xfer += tmp;
 		}
-	}
-	mutex_unlock(&runtime->oss.params_lock);
-	return xfer;
-
  err:
-	mutex_unlock(&runtime->oss.params_lock);
+		mutex_unlock(&runtime->oss.params_lock);
+		if (tmp < 0)
+			break;
+		if (signal_pending(current)) {
+			tmp = -ERESTARTSYS;
+			break;
+		}
+		tmp = 0;
+	}
 	return xfer > 0 ? (snd_pcm_sframes_t)xfer : tmp;
 }
 
diff --git a/sound/core/oss/pcm_plugin.c b/sound/core/oss/pcm_plugin.c
index cadc937..85a56af 100644
--- a/sound/core/oss/pcm_plugin.c
+++ b/sound/core/oss/pcm_plugin.c
@@ -592,18 +592,26 @@ snd_pcm_sframes_t snd_pcm_plug_write_transfer(struct snd_pcm_substream *plug, st
 	snd_pcm_sframes_t frames = size;
 
 	plugin = snd_pcm_plug_first(plug);
-	while (plugin && frames > 0) {
+	while (plugin) {
+		if (frames <= 0)
+			return frames;
 		if ((next = plugin->next) != NULL) {
 			snd_pcm_sframes_t frames1 = frames;
-			if (plugin->dst_frames)
+			if (plugin->dst_frames) {
 				frames1 = plugin->dst_frames(plugin, frames);
+				if (frames1 <= 0)
+					return frames1;
+			}
 			if ((err = next->client_channels(next, frames1, &dst_channels)) < 0) {
 				return err;
 			}
 			if (err != frames1) {
 				frames = err;
-				if (plugin->src_frames)
+				if (plugin->src_frames) {
 					frames = plugin->src_frames(plugin, frames1);
+					if (frames <= 0)
+						return frames;
+				}
 			}
 		} else
 			dst_channels = NULL;
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c
index 10e7ef7..a83152e 100644
--- a/sound/core/pcm_lib.c
+++ b/sound/core/pcm_lib.c
@@ -560,7 +560,6 @@ static inline unsigned int muldiv32(unsigned int a, unsigned int b,
 {
 	u_int64_t n = (u_int64_t) a * b;
 	if (c == 0) {
-		snd_BUG_ON(!n);
 		*r = 0;
 		return UINT_MAX;
 	}
@@ -1603,7 +1602,7 @@ static int _snd_pcm_hw_param_first(struct snd_pcm_hw_params *params,
 		changed = snd_interval_refine_first(hw_param_interval(params, var));
 	else
 		return -EINVAL;
-	if (changed) {
+	if (changed > 0) {
 		params->cmask |= 1 << var;
 		params->rmask |= 1 << var;
 	}
@@ -1632,7 +1631,7 @@ int snd_pcm_hw_param_first(struct snd_pcm_substream *pcm,
 		return changed;
 	if (params->rmask) {
 		int err = snd_pcm_hw_refine(pcm, params);
-		if (snd_BUG_ON(err < 0))
+		if (err < 0)
 			return err;
 	}
 	return snd_pcm_hw_param_value(params, var, dir);
@@ -1649,7 +1648,7 @@ static int _snd_pcm_hw_param_last(struct snd_pcm_hw_params *params,
 		changed = snd_interval_refine_last(hw_param_interval(params, var));
 	else
 		return -EINVAL;
-	if (changed) {
+	if (changed > 0) {
 		params->cmask |= 1 << var;
 		params->rmask |= 1 << var;
 	}
@@ -1678,7 +1677,7 @@ int snd_pcm_hw_param_last(struct snd_pcm_substream *pcm,
 		return changed;
 	if (params->rmask) {
 		int err = snd_pcm_hw_refine(pcm, params);
-		if (snd_BUG_ON(err < 0))
+		if (err < 0)
 			return err;
 	}
 	return snd_pcm_hw_param_value(params, var, dir);
diff --git a/sound/core/pcm_misc.c b/sound/core/pcm_misc.c
index 9be8102..c4eb561 100644
--- a/sound/core/pcm_misc.c
+++ b/sound/core/pcm_misc.c
@@ -163,13 +163,30 @@ static struct pcm_format_data pcm_formats[(INT)SNDRV_PCM_FORMAT_LAST+1] = {
 		.width = 32, .phys = 32, .le = 0, .signd = 0,
 		.silence = { 0x69, 0x69, 0x69, 0x69 },
 	},
-	/* FIXME: the following three formats are not defined properly yet */
+	/* FIXME: the following two formats are not defined properly yet */
 	[SNDRV_PCM_FORMAT_MPEG] = {
 		.le = -1, .signd = -1,
 	},
 	[SNDRV_PCM_FORMAT_GSM] = {
 		.le = -1, .signd = -1,
 	},
+	[SNDRV_PCM_FORMAT_S20_LE] = {
+		.width = 20, .phys = 32, .le = 1, .signd = 1,
+		.silence = {},
+	},
+	[SNDRV_PCM_FORMAT_S20_BE] = {
+		.width = 20, .phys = 32, .le = 0, .signd = 1,
+		.silence = {},
+	},
+	[SNDRV_PCM_FORMAT_U20_LE] = {
+		.width = 20, .phys = 32, .le = 1, .signd = 0,
+		.silence = { 0x00, 0x00, 0x08, 0x00 },
+	},
+	[SNDRV_PCM_FORMAT_U20_BE] = {
+		.width = 20, .phys = 32, .le = 0, .signd = 0,
+		.silence = { 0x00, 0x08, 0x00, 0x00 },
+	},
+	/* FIXME: the following format is not defined properly yet */
 	[SNDRV_PCM_FORMAT_SPECIAL] = {
 		.le = -1, .signd = -1,
 	},
diff --git a/sound/core/pcm_native.c b/sound/core/pcm_native.c
index a4d92e4..484a18d 100644
--- a/sound/core/pcm_native.c
+++ b/sound/core/pcm_native.c
@@ -2580,7 +2580,7 @@ static snd_pcm_sframes_t forward_appl_ptr(struct snd_pcm_substream *substream,
 	return ret < 0 ? ret : frames;
 }
 
-/* decrease the appl_ptr; returns the processed frames or a negative error */
+/* decrease the appl_ptr; returns the processed frames or zero for error */
 static snd_pcm_sframes_t rewind_appl_ptr(struct snd_pcm_substream *substream,
 					 snd_pcm_uframes_t frames,
 					 snd_pcm_sframes_t avail)
@@ -2597,7 +2597,12 @@ static snd_pcm_sframes_t rewind_appl_ptr(struct snd_pcm_substream *substream,
 	if (appl_ptr < 0)
 		appl_ptr += runtime->boundary;
 	ret = pcm_lib_apply_appl_ptr(substream, appl_ptr);
-	return ret < 0 ? ret : frames;
+	/* NOTE: we return zero for errors because PulseAudio gets depressed
+	 * upon receiving an error from rewind ioctl and stops processing
+	 * any longer.  Returning zero means that no rewind is done, so
+	 * it's not absolutely wrong to answer like that.
+	 */
+	return ret < 0 ? 0 : frames;
 }
 
 static snd_pcm_sframes_t snd_pcm_playback_rewind(struct snd_pcm_substream *substream,
@@ -3441,7 +3446,7 @@ EXPORT_SYMBOL_GPL(snd_pcm_lib_default_mmap);
 int snd_pcm_lib_mmap_iomem(struct snd_pcm_substream *substream,
 			   struct vm_area_struct *area)
 {
-	struct snd_pcm_runtime *runtime = substream->runtime;;
+	struct snd_pcm_runtime *runtime = substream->runtime;
 
 	area->vm_page_prot = pgprot_noncached(area->vm_page_prot);
 	return vm_iomap_memory(area, runtime->dma_addr, runtime->dma_bytes);
diff --git a/sound/core/seq/seq_clientmgr.c b/sound/core/seq/seq_clientmgr.c
index 6e22eea..d019134 100644
--- a/sound/core/seq/seq_clientmgr.c
+++ b/sound/core/seq/seq_clientmgr.c
@@ -221,6 +221,7 @@ static struct snd_seq_client *seq_create_client1(int client_index, int poolsize)
 	rwlock_init(&client->ports_lock);
 	mutex_init(&client->ports_mutex);
 	INIT_LIST_HEAD(&client->ports_list_head);
+	mutex_init(&client->ioctl_mutex);
 
 	/* find free slot in the client table */
 	spin_lock_irqsave(&clients_lock, flags);
@@ -2130,7 +2131,9 @@ static long snd_seq_ioctl(struct file *file, unsigned int cmd,
 			return -EFAULT;
 	}
 
+	mutex_lock(&client->ioctl_mutex);
 	err = handler->func(client, &buf);
+	mutex_unlock(&client->ioctl_mutex);
 	if (err >= 0) {
 		/* Some commands includes a bug in 'dir' field. */
 		if (handler->cmd == SNDRV_SEQ_IOCTL_SET_QUEUE_CLIENT ||
diff --git a/sound/core/seq/seq_clientmgr.h b/sound/core/seq/seq_clientmgr.h
index c661425..0611e1e 100644
--- a/sound/core/seq/seq_clientmgr.h
+++ b/sound/core/seq/seq_clientmgr.h
@@ -61,6 +61,7 @@ struct snd_seq_client {
 	struct list_head ports_list_head;
 	rwlock_t ports_lock;
 	struct mutex ports_mutex;
+	struct mutex ioctl_mutex;
 	int convert32;		/* convert 32->64bit */
 
 	/* output pool */
diff --git a/sound/core/seq/seq_queue.c b/sound/core/seq/seq_queue.c
index 79e0c56..0428e90 100644
--- a/sound/core/seq/seq_queue.c
+++ b/sound/core/seq/seq_queue.c
@@ -497,9 +497,7 @@ int snd_seq_queue_timer_set_tempo(int queueid, int client,
 		return -EPERM;
 	}
 
-	result = snd_seq_timer_set_tempo(q->timer, info->tempo);
-	if (result >= 0)
-		result = snd_seq_timer_set_ppq(q->timer, info->ppq);
+	result = snd_seq_timer_set_tempo_ppq(q->timer, info->tempo, info->ppq);
 	if (result >= 0 && info->skew_base > 0)
 		result = snd_seq_timer_set_skew(q->timer, info->skew_value,
 						info->skew_base);
diff --git a/sound/core/seq/seq_timer.c b/sound/core/seq/seq_timer.c
index b80985f..2316757 100644
--- a/sound/core/seq/seq_timer.c
+++ b/sound/core/seq/seq_timer.c
@@ -191,14 +191,15 @@ int snd_seq_timer_set_tempo(struct snd_seq_timer * tmr, int tempo)
 	return 0;
 }
 
-/* set current ppq */
-int snd_seq_timer_set_ppq(struct snd_seq_timer * tmr, int ppq)
+/* set current tempo and ppq in a shot */
+int snd_seq_timer_set_tempo_ppq(struct snd_seq_timer *tmr, int tempo, int ppq)
 {
+	int changed;
 	unsigned long flags;
 
 	if (snd_BUG_ON(!tmr))
 		return -EINVAL;
-	if (ppq <= 0)
+	if (tempo <= 0 || ppq <= 0)
 		return -EINVAL;
 	spin_lock_irqsave(&tmr->lock, flags);
 	if (tmr->running && (ppq != tmr->ppq)) {
@@ -208,9 +209,11 @@ int snd_seq_timer_set_ppq(struct snd_seq_timer * tmr, int ppq)
 		pr_debug("ALSA: seq: cannot change ppq of a running timer\n");
 		return -EBUSY;
 	}
-
+	changed = (tempo != tmr->tempo) || (ppq != tmr->ppq);
+	tmr->tempo = tempo;
 	tmr->ppq = ppq;
-	snd_seq_timer_set_tick_resolution(tmr);
+	if (changed)
+		snd_seq_timer_set_tick_resolution(tmr);
 	spin_unlock_irqrestore(&tmr->lock, flags);
 	return 0;
 }
diff --git a/sound/core/seq/seq_timer.h b/sound/core/seq/seq_timer.h
index 9506b66..62f3906 100644
--- a/sound/core/seq/seq_timer.h
+++ b/sound/core/seq/seq_timer.h
@@ -131,7 +131,7 @@ int snd_seq_timer_stop(struct snd_seq_timer *tmr);
 int snd_seq_timer_start(struct snd_seq_timer *tmr);
 int snd_seq_timer_continue(struct snd_seq_timer *tmr);
 int snd_seq_timer_set_tempo(struct snd_seq_timer *tmr, int tempo);
-int snd_seq_timer_set_ppq(struct snd_seq_timer *tmr, int ppq);
+int snd_seq_timer_set_tempo_ppq(struct snd_seq_timer *tmr, int tempo, int ppq);
 int snd_seq_timer_set_position_tick(struct snd_seq_timer *tmr, snd_seq_tick_time_t position);
 int snd_seq_timer_set_position_time(struct snd_seq_timer *tmr, snd_seq_real_time_t position);
 int snd_seq_timer_set_skew(struct snd_seq_timer *tmr, unsigned int skew, unsigned int base);
diff --git a/sound/drivers/aloop.c b/sound/drivers/aloop.c
index afac886..0333143 100644
--- a/sound/drivers/aloop.c
+++ b/sound/drivers/aloop.c
@@ -39,6 +39,7 @@
 #include <sound/core.h>
 #include <sound/control.h>
 #include <sound/pcm.h>
+#include <sound/pcm_params.h>
 #include <sound/info.h>
 #include <sound/initval.h>
 
@@ -305,19 +306,6 @@ static int loopback_trigger(struct snd_pcm_substream *substream, int cmd)
 	return 0;
 }
 
-static void params_change_substream(struct loopback_pcm *dpcm,
-				    struct snd_pcm_runtime *runtime)
-{
-	struct snd_pcm_runtime *dst_runtime;
-
-	if (dpcm == NULL || dpcm->substream == NULL)
-		return;
-	dst_runtime = dpcm->substream->runtime;
-	if (dst_runtime == NULL)
-		return;
-	dst_runtime->hw = dpcm->cable->hw;
-}
-
 static void params_change(struct snd_pcm_substream *substream)
 {
 	struct snd_pcm_runtime *runtime = substream->runtime;
@@ -329,10 +317,6 @@ static void params_change(struct snd_pcm_substream *substream)
 	cable->hw.rate_max = runtime->rate;
 	cable->hw.channels_min = runtime->channels;
 	cable->hw.channels_max = runtime->channels;
-	params_change_substream(cable->streams[SNDRV_PCM_STREAM_PLAYBACK],
-				runtime);
-	params_change_substream(cable->streams[SNDRV_PCM_STREAM_CAPTURE],
-				runtime);
 }
 
 static int loopback_prepare(struct snd_pcm_substream *substream)
@@ -620,26 +604,29 @@ static unsigned int get_cable_index(struct snd_pcm_substream *substream)
 static int rule_format(struct snd_pcm_hw_params *params,
 		       struct snd_pcm_hw_rule *rule)
 {
+	struct loopback_pcm *dpcm = rule->private;
+	struct loopback_cable *cable = dpcm->cable;
+	struct snd_mask m;
 
-	struct snd_pcm_hardware *hw = rule->private;
-	struct snd_mask *maskp = hw_param_mask(params, rule->var);
-
-	maskp->bits[0] &= (u_int32_t)hw->formats;
-	maskp->bits[1] &= (u_int32_t)(hw->formats >> 32);
-	memset(maskp->bits + 2, 0, (SNDRV_MASK_MAX-64) / 8); /* clear rest */
-	if (! maskp->bits[0] && ! maskp->bits[1])
-		return -EINVAL;
-	return 0;
+	snd_mask_none(&m);
+	mutex_lock(&dpcm->loopback->cable_lock);
+	m.bits[0] = (u_int32_t)cable->hw.formats;
+	m.bits[1] = (u_int32_t)(cable->hw.formats >> 32);
+	mutex_unlock(&dpcm->loopback->cable_lock);
+	return snd_mask_refine(hw_param_mask(params, rule->var), &m);
 }
 
 static int rule_rate(struct snd_pcm_hw_params *params,
 		     struct snd_pcm_hw_rule *rule)
 {
-	struct snd_pcm_hardware *hw = rule->private;
+	struct loopback_pcm *dpcm = rule->private;
+	struct loopback_cable *cable = dpcm->cable;
 	struct snd_interval t;
 
-        t.min = hw->rate_min;
-        t.max = hw->rate_max;
+	mutex_lock(&dpcm->loopback->cable_lock);
+	t.min = cable->hw.rate_min;
+	t.max = cable->hw.rate_max;
+	mutex_unlock(&dpcm->loopback->cable_lock);
         t.openmin = t.openmax = 0;
         t.integer = 0;
 	return snd_interval_refine(hw_param_interval(params, rule->var), &t);
@@ -648,22 +635,44 @@ static int rule_rate(struct snd_pcm_hw_params *params,
 static int rule_channels(struct snd_pcm_hw_params *params,
 			 struct snd_pcm_hw_rule *rule)
 {
-	struct snd_pcm_hardware *hw = rule->private;
+	struct loopback_pcm *dpcm = rule->private;
+	struct loopback_cable *cable = dpcm->cable;
 	struct snd_interval t;
 
-        t.min = hw->channels_min;
-        t.max = hw->channels_max;
+	mutex_lock(&dpcm->loopback->cable_lock);
+	t.min = cable->hw.channels_min;
+	t.max = cable->hw.channels_max;
+	mutex_unlock(&dpcm->loopback->cable_lock);
         t.openmin = t.openmax = 0;
         t.integer = 0;
 	return snd_interval_refine(hw_param_interval(params, rule->var), &t);
 }
 
+static void free_cable(struct snd_pcm_substream *substream)
+{
+	struct loopback *loopback = substream->private_data;
+	int dev = get_cable_index(substream);
+	struct loopback_cable *cable;
+
+	cable = loopback->cables[substream->number][dev];
+	if (!cable)
+		return;
+	if (cable->streams[!substream->stream]) {
+		/* other stream is still alive */
+		cable->streams[substream->stream] = NULL;
+	} else {
+		/* free the cable */
+		loopback->cables[substream->number][dev] = NULL;
+		kfree(cable);
+	}
+}
+
 static int loopback_open(struct snd_pcm_substream *substream)
 {
 	struct snd_pcm_runtime *runtime = substream->runtime;
 	struct loopback *loopback = substream->private_data;
 	struct loopback_pcm *dpcm;
-	struct loopback_cable *cable;
+	struct loopback_cable *cable = NULL;
 	int err = 0;
 	int dev = get_cable_index(substream);
 
@@ -681,7 +690,6 @@ static int loopback_open(struct snd_pcm_substream *substream)
 	if (!cable) {
 		cable = kzalloc(sizeof(*cable), GFP_KERNEL);
 		if (!cable) {
-			kfree(dpcm);
 			err = -ENOMEM;
 			goto unlock;
 		}
@@ -699,19 +707,19 @@ static int loopback_open(struct snd_pcm_substream *substream)
 	/* are cached -> they do not reflect the actual state */
 	err = snd_pcm_hw_rule_add(runtime, 0,
 				  SNDRV_PCM_HW_PARAM_FORMAT,
-				  rule_format, &runtime->hw,
+				  rule_format, dpcm,
 				  SNDRV_PCM_HW_PARAM_FORMAT, -1);
 	if (err < 0)
 		goto unlock;
 	err = snd_pcm_hw_rule_add(runtime, 0,
 				  SNDRV_PCM_HW_PARAM_RATE,
-				  rule_rate, &runtime->hw,
+				  rule_rate, dpcm,
 				  SNDRV_PCM_HW_PARAM_RATE, -1);
 	if (err < 0)
 		goto unlock;
 	err = snd_pcm_hw_rule_add(runtime, 0,
 				  SNDRV_PCM_HW_PARAM_CHANNELS,
-				  rule_channels, &runtime->hw,
+				  rule_channels, dpcm,
 				  SNDRV_PCM_HW_PARAM_CHANNELS, -1);
 	if (err < 0)
 		goto unlock;
@@ -723,6 +731,10 @@ static int loopback_open(struct snd_pcm_substream *substream)
 	else
 		runtime->hw = cable->hw;
  unlock:
+	if (err < 0) {
+		free_cable(substream);
+		kfree(dpcm);
+	}
 	mutex_unlock(&loopback->cable_lock);
 	return err;
 }
@@ -731,20 +743,10 @@ static int loopback_close(struct snd_pcm_substream *substream)
 {
 	struct loopback *loopback = substream->private_data;
 	struct loopback_pcm *dpcm = substream->runtime->private_data;
-	struct loopback_cable *cable;
-	int dev = get_cable_index(substream);
 
 	loopback_timer_stop(dpcm);
 	mutex_lock(&loopback->cable_lock);
-	cable = loopback->cables[substream->number][dev];
-	if (cable->streams[!substream->stream]) {
-		/* other stream is still alive */
-		cable->streams[substream->stream] = NULL;
-	} else {
-		/* free the cable */
-		loopback->cables[substream->number][dev] = NULL;
-		kfree(cable);
-	}
+	free_cable(substream);
 	mutex_unlock(&loopback->cable_lock);
 	return 0;
 }
diff --git a/sound/drivers/dummy.c b/sound/drivers/dummy.c
index 7b2b1f7..8fb9a54 100644
--- a/sound/drivers/dummy.c
+++ b/sound/drivers/dummy.c
@@ -375,17 +375,9 @@ struct dummy_hrtimer_pcm {
 	ktime_t period_time;
 	atomic_t running;
 	struct hrtimer timer;
-	struct tasklet_struct tasklet;
 	struct snd_pcm_substream *substream;
 };
 
-static void dummy_hrtimer_pcm_elapsed(unsigned long priv)
-{
-	struct dummy_hrtimer_pcm *dpcm = (struct dummy_hrtimer_pcm *)priv;
-	if (atomic_read(&dpcm->running))
-		snd_pcm_period_elapsed(dpcm->substream);
-}
-
 static enum hrtimer_restart dummy_hrtimer_callback(struct hrtimer *timer)
 {
 	struct dummy_hrtimer_pcm *dpcm;
@@ -393,7 +385,14 @@ static enum hrtimer_restart dummy_hrtimer_callback(struct hrtimer *timer)
 	dpcm = container_of(timer, struct dummy_hrtimer_pcm, timer);
 	if (!atomic_read(&dpcm->running))
 		return HRTIMER_NORESTART;
-	tasklet_schedule(&dpcm->tasklet);
+	/*
+	 * In cases of XRUN and draining, this calls .trigger to stop PCM
+	 * substream.
+	 */
+	snd_pcm_period_elapsed(dpcm->substream);
+	if (!atomic_read(&dpcm->running))
+		return HRTIMER_NORESTART;
+
 	hrtimer_forward_now(timer, dpcm->period_time);
 	return HRTIMER_RESTART;
 }
@@ -403,7 +402,7 @@ static int dummy_hrtimer_start(struct snd_pcm_substream *substream)
 	struct dummy_hrtimer_pcm *dpcm = substream->runtime->private_data;
 
 	dpcm->base_time = hrtimer_cb_get_time(&dpcm->timer);
-	hrtimer_start(&dpcm->timer, dpcm->period_time, HRTIMER_MODE_REL);
+	hrtimer_start(&dpcm->timer, dpcm->period_time, HRTIMER_MODE_REL_SOFT);
 	atomic_set(&dpcm->running, 1);
 	return 0;
 }
@@ -413,14 +412,14 @@ static int dummy_hrtimer_stop(struct snd_pcm_substream *substream)
 	struct dummy_hrtimer_pcm *dpcm = substream->runtime->private_data;
 
 	atomic_set(&dpcm->running, 0);
-	hrtimer_cancel(&dpcm->timer);
+	if (!hrtimer_callback_running(&dpcm->timer))
+		hrtimer_cancel(&dpcm->timer);
 	return 0;
 }
 
 static inline void dummy_hrtimer_sync(struct dummy_hrtimer_pcm *dpcm)
 {
 	hrtimer_cancel(&dpcm->timer);
-	tasklet_kill(&dpcm->tasklet);
 }
 
 static snd_pcm_uframes_t
@@ -465,12 +464,10 @@ static int dummy_hrtimer_create(struct snd_pcm_substream *substream)
 	if (!dpcm)
 		return -ENOMEM;
 	substream->runtime->private_data = dpcm;
-	hrtimer_init(&dpcm->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer_init(&dpcm->timer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_SOFT);
 	dpcm->timer.function = dummy_hrtimer_callback;
 	dpcm->substream = substream;
 	atomic_set(&dpcm->running, 0);
-	tasklet_init(&dpcm->tasklet, dummy_hrtimer_pcm_elapsed,
-		     (unsigned long)dpcm);
 	return 0;
 }
 
@@ -830,7 +827,7 @@ static int snd_dummy_capsrc_put(struct snd_kcontrol *kcontrol, struct snd_ctl_el
 static int snd_dummy_iobox_info(struct snd_kcontrol *kcontrol,
 				struct snd_ctl_elem_info *info)
 {
-	const char *const names[] = { "None", "CD Player" };
+	static const char *const names[] = { "None", "CD Player" };
 
 	return snd_ctl_enum_info(info, 1, 2, names);
 }
diff --git a/sound/hda/ext/hdac_ext_bus.c b/sound/hda/ext/hdac_ext_bus.c
index 31b510c..0daf313 100644
--- a/sound/hda/ext/hdac_ext_bus.c
+++ b/sound/hda/ext/hdac_ext_bus.c
@@ -146,7 +146,7 @@ int snd_hdac_ext_bus_device_init(struct hdac_ext_bus *ebus, int addr)
 	edev = kzalloc(sizeof(*edev), GFP_KERNEL);
 	if (!edev)
 		return -ENOMEM;
-	hdev = &edev->hdac;
+	hdev = &edev->hdev;
 	edev->ebus = ebus;
 
 	snprintf(name, sizeof(name), "ehdaudio%dD%d", ebus->idx, addr);
diff --git a/sound/isa/gus/gus_dma.c b/sound/isa/gus/gus_dma.c
index 36c27c8..7f95f45 100644
--- a/sound/isa/gus/gus_dma.c
+++ b/sound/isa/gus/gus_dma.c
@@ -201,10 +201,9 @@ int snd_gf1_dma_transfer_block(struct snd_gus_card * gus,
 	struct snd_gf1_dma_block *block;
 
 	block = kmalloc(sizeof(*block), atomic ? GFP_ATOMIC : GFP_KERNEL);
-	if (block == NULL) {
-		snd_printk(KERN_ERR "gf1: DMA transfer failure; not enough memory\n");
+	if (!block)
 		return -ENOMEM;
-	}
+
 	*block = *__block;
 	block->next = NULL;
 
diff --git a/sound/mips/hal2.c b/sound/mips/hal2.c
index 37d378a..c8904e7 100644
--- a/sound/mips/hal2.c
+++ b/sound/mips/hal2.c
@@ -814,7 +814,7 @@ static int hal2_create(struct snd_card *card, struct snd_hal2 **rchip)
 	struct hpc3_regs *hpc3 = hpc3c0;
 	int err;
 
-	hal2 = kzalloc(sizeof(struct snd_hal2), GFP_KERNEL);
+	hal2 = kzalloc(sizeof(*hal2), GFP_KERNEL);
 	if (!hal2)
 		return -ENOMEM;
 
diff --git a/sound/mips/sgio2audio.c b/sound/mips/sgio2audio.c
index 71c9421..9fb68b3 100644
--- a/sound/mips/sgio2audio.c
+++ b/sound/mips/sgio2audio.c
@@ -840,7 +840,7 @@ static int snd_sgio2audio_create(struct snd_card *card,
 	if (!(readq(&mace->perif.audio.control) & AUDIO_CONTROL_CODEC_PRESENT))
 		return -ENOENT;
 
-	chip = kzalloc(sizeof(struct snd_sgio2audio), GFP_KERNEL);
+	chip = kzalloc(sizeof(*chip), GFP_KERNEL);
 	if (chip == NULL)
 		return -ENOMEM;
 
diff --git a/sound/pci/hda/Kconfig b/sound/pci/hda/Kconfig
index 7f3b5ed..f7a492c 100644
--- a/sound/pci/hda/Kconfig
+++ b/sound/pci/hda/Kconfig
@@ -88,7 +88,6 @@
 config SND_HDA_CODEC_REALTEK
 	tristate "Build Realtek HD-audio codec support"
 	select SND_HDA_GENERIC
-	select INPUT
 	help
 	  Say Y or M here to include Realtek HD-audio codec support in
 	  snd-hda-intel driver, such as ALC880.
diff --git a/sound/pci/hda/patch_cirrus.c b/sound/pci/hda/patch_cirrus.c
index 80bbadc..d6e079f 100644
--- a/sound/pci/hda/patch_cirrus.c
+++ b/sound/pci/hda/patch_cirrus.c
@@ -408,6 +408,7 @@ static const struct snd_pci_quirk cs420x_fixup_tbl[] = {
 	/*SND_PCI_QUIRK(0x8086, 0x7270, "IMac 27 Inch", CS420X_IMAC27),*/
 
 	/* codec SSID */
+	SND_PCI_QUIRK(0x106b, 0x0600, "iMac 14,1", CS420X_IMAC27_122),
 	SND_PCI_QUIRK(0x106b, 0x1c00, "MacBookPro 8,1", CS420X_MBP81),
 	SND_PCI_QUIRK(0x106b, 0x2000, "iMac 12,2", CS420X_IMAC27_122),
 	SND_PCI_QUIRK(0x106b, 0x2800, "MacBookPro 10,1", CS420X_MBP101),
diff --git a/sound/pci/hda/patch_realtek.c b/sound/pci/hda/patch_realtek.c
index 8fd2d9c..2347588 100644
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -3154,11 +3154,13 @@ static void alc256_shutup(struct hda_codec *codec)
 	if (hp_pin_sense)
 		msleep(85);
 
+	/* 3k pull low control for Headset jack. */
+	/* NOTE: call this before clearing the pin, otherwise codec stalls */
+	alc_update_coef_idx(codec, 0x46, 0, 3 << 12);
+
 	snd_hda_codec_write(codec, hp_pin, 0,
 			    AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0);
 
-	alc_update_coef_idx(codec, 0x46, 0, 3 << 12); /* 3k pull low control for Headset jack. */
-
 	if (hp_pin_sense)
 		msleep(100);
 
@@ -3166,6 +3168,93 @@ static void alc256_shutup(struct hda_codec *codec)
 	snd_hda_shutup_pins(codec);
 }
 
+static void alc225_init(struct hda_codec *codec)
+{
+	struct alc_spec *spec = codec->spec;
+	hda_nid_t hp_pin = spec->gen.autocfg.hp_pins[0];
+	bool hp1_pin_sense, hp2_pin_sense;
+
+	if (!hp_pin)
+		return;
+
+	msleep(30);
+
+	hp1_pin_sense = snd_hda_jack_detect(codec, hp_pin);
+	hp2_pin_sense = snd_hda_jack_detect(codec, 0x16);
+
+	if (hp1_pin_sense || hp2_pin_sense)
+		msleep(2);
+
+	alc_update_coefex_idx(codec, 0x57, 0x04, 0x0007, 0x1); /* Low power */
+
+	if (hp1_pin_sense)
+		snd_hda_codec_write(codec, hp_pin, 0,
+			    AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE);
+	if (hp2_pin_sense)
+		snd_hda_codec_write(codec, 0x16, 0,
+			    AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE);
+
+	if (hp1_pin_sense || hp2_pin_sense)
+		msleep(85);
+
+	if (hp1_pin_sense)
+		snd_hda_codec_write(codec, hp_pin, 0,
+			    AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_OUT);
+	if (hp2_pin_sense)
+		snd_hda_codec_write(codec, 0x16, 0,
+			    AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_OUT);
+
+	if (hp1_pin_sense || hp2_pin_sense)
+		msleep(100);
+
+	alc_update_coef_idx(codec, 0x4a, 3 << 10, 0);
+	alc_update_coefex_idx(codec, 0x57, 0x04, 0x0007, 0x4); /* Hight power */
+}
+
+static void alc225_shutup(struct hda_codec *codec)
+{
+	struct alc_spec *spec = codec->spec;
+	hda_nid_t hp_pin = spec->gen.autocfg.hp_pins[0];
+	bool hp1_pin_sense, hp2_pin_sense;
+
+	if (!hp_pin) {
+		alc269_shutup(codec);
+		return;
+	}
+
+	/* 3k pull low control for Headset jack. */
+	alc_update_coef_idx(codec, 0x4a, 0, 3 << 10);
+
+	hp1_pin_sense = snd_hda_jack_detect(codec, hp_pin);
+	hp2_pin_sense = snd_hda_jack_detect(codec, 0x16);
+
+	if (hp1_pin_sense || hp2_pin_sense)
+		msleep(2);
+
+	if (hp1_pin_sense)
+		snd_hda_codec_write(codec, hp_pin, 0,
+			    AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE);
+	if (hp2_pin_sense)
+		snd_hda_codec_write(codec, 0x16, 0,
+			    AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE);
+
+	if (hp1_pin_sense || hp2_pin_sense)
+		msleep(85);
+
+	if (hp1_pin_sense)
+		snd_hda_codec_write(codec, hp_pin, 0,
+			    AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0);
+	if (hp2_pin_sense)
+		snd_hda_codec_write(codec, 0x16, 0,
+			    AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0);
+
+	if (hp1_pin_sense || hp2_pin_sense)
+		msleep(100);
+
+	alc_auto_setup_eapd(codec, false);
+	snd_hda_shutup_pins(codec);
+}
+
 static void alc_default_init(struct hda_codec *codec)
 {
 	struct alc_spec *spec = codec->spec;
@@ -3723,6 +3812,7 @@ static void alc280_fixup_hp_gpio4(struct hda_codec *codec,
 	}
 }
 
+#if IS_REACHABLE(INPUT)
 static void gpio2_mic_hotkey_event(struct hda_codec *codec,
 				   struct hda_jack_callback *event)
 {
@@ -3855,6 +3945,10 @@ static void alc233_fixup_lenovo_line2_mic_hotkey(struct hda_codec *codec,
 		spec->kb_dev = NULL;
 	}
 }
+#else /* INPUT */
+#define alc280_fixup_hp_gpio2_mic_hotkey	NULL
+#define alc233_fixup_lenovo_line2_mic_hotkey	NULL
+#endif /* INPUT */
 
 static void alc269_fixup_hp_line1_mic1_led(struct hda_codec *codec,
 				const struct hda_fixup *fix, int action)
@@ -3994,8 +4088,11 @@ static void alc_headset_mode_unplugged(struct hda_codec *codec)
 	case 0x10ec0668:
 		alc_process_coef_fw(codec, coef0668);
 		break;
+	case 0x10ec0215:
 	case 0x10ec0225:
+	case 0x10ec0285:
 	case 0x10ec0295:
+	case 0x10ec0289:
 	case 0x10ec0299:
 		alc_process_coef_fw(codec, coef0225);
 		break;
@@ -4117,8 +4214,11 @@ static void alc_headset_mode_mic_in(struct hda_codec *codec, hda_nid_t hp_pin,
 		alc_process_coef_fw(codec, coef0688);
 		snd_hda_set_pin_ctl_cache(codec, mic_pin, PIN_VREF50);
 		break;
+	case 0x10ec0215:
 	case 0x10ec0225:
+	case 0x10ec0285:
 	case 0x10ec0295:
+	case 0x10ec0289:
 	case 0x10ec0299:
 		alc_process_coef_fw(codec, alc225_pre_hsmode);
 		alc_update_coef_idx(codec, 0x45, 0x3f<<10, 0x31<<10);
@@ -4189,8 +4289,11 @@ static void alc_headset_mode_default(struct hda_codec *codec)
 	};
 
 	switch (codec->core.vendor_id) {
+	case 0x10ec0215:
 	case 0x10ec0225:
+	case 0x10ec0285:
 	case 0x10ec0295:
+	case 0x10ec0289:
 	case 0x10ec0299:
 		alc_process_coef_fw(codec, alc225_pre_hsmode);
 		alc_process_coef_fw(codec, coef0225);
@@ -4332,8 +4435,11 @@ static void alc_headset_mode_ctia(struct hda_codec *codec)
 	case 0x10ec0668:
 		alc_process_coef_fw(codec, coef0688);
 		break;
+	case 0x10ec0215:
 	case 0x10ec0225:
+	case 0x10ec0285:
 	case 0x10ec0295:
+	case 0x10ec0289:
 	case 0x10ec0299:
 		val = alc_read_coef_idx(codec, 0x45);
 		if (val & (1 << 9))
@@ -4436,8 +4542,11 @@ static void alc_headset_mode_omtp(struct hda_codec *codec)
 	case 0x10ec0668:
 		alc_process_coef_fw(codec, coef0688);
 		break;
+	case 0x10ec0215:
 	case 0x10ec0225:
+	case 0x10ec0285:
 	case 0x10ec0295:
+	case 0x10ec0289:
 	case 0x10ec0299:
 		alc_process_coef_fw(codec, coef0225);
 		break;
@@ -4566,9 +4675,18 @@ static void alc_determine_headset_type(struct hda_codec *codec)
 		val = alc_read_coef_idx(codec, 0xbe);
 		is_ctia = (val & 0x1c02) == 0x1c02;
 		break;
+	case 0x10ec0215:
 	case 0x10ec0225:
+	case 0x10ec0285:
 	case 0x10ec0295:
+	case 0x10ec0289:
 	case 0x10ec0299:
+		snd_hda_codec_write(codec, 0x21, 0,
+			    AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_MUTE);
+		msleep(80);
+		snd_hda_codec_write(codec, 0x21, 0,
+			    AC_VERB_SET_PIN_WIDGET_CONTROL, 0x0);
+
 		alc_process_coef_fw(codec, alc225_pre_hsmode);
 		alc_update_coef_idx(codec, 0x67, 0xf000, 0x1000);
 		val = alc_read_coef_idx(codec, 0x45);
@@ -4588,6 +4706,12 @@ static void alc_determine_headset_type(struct hda_codec *codec)
 		alc_update_coef_idx(codec, 0x4a, 7<<6, 7<<6);
 		alc_update_coef_idx(codec, 0x4a, 3<<4, 3<<4);
 		alc_update_coef_idx(codec, 0x67, 0xf000, 0x3000);
+
+		snd_hda_codec_write(codec, 0x21, 0,
+			    AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_OUT);
+		msleep(80);
+		snd_hda_codec_write(codec, 0x21, 0,
+			    AC_VERB_SET_AMP_GAIN_MUTE, AMP_OUT_UNMUTE);
 		break;
 	case 0x10ec0867:
 		is_ctia = true;
@@ -6196,6 +6320,7 @@ static const struct snd_pci_quirk alc269_fixup_tbl[] = {
 	SND_PCI_QUIRK(0x1028, 0x075b, "Dell XPS 13 9360", ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE),
 	SND_PCI_QUIRK(0x1028, 0x075d, "Dell AIO", ALC298_FIXUP_SPK_VOLUME),
 	SND_PCI_QUIRK(0x1028, 0x0798, "Dell Inspiron 17 7000 Gaming", ALC256_FIXUP_DELL_INSPIRON_7559_SUBWOOFER),
+	SND_PCI_QUIRK(0x1028, 0x082a, "Dell XPS 13 9360", ALC256_FIXUP_DELL_XPS_13_HEADPHONE_NOISE),
 	SND_PCI_QUIRK(0x1028, 0x164a, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x1028, 0x164b, "Dell", ALC293_FIXUP_DELL1_MIC_NO_PRESENCE),
 	SND_PCI_QUIRK(0x103c, 0x1586, "HP", ALC269_FIXUP_HP_MUTE_LED_MIC2),
@@ -6919,16 +7044,17 @@ static int patch_alc269(struct hda_codec *codec)
 	case 0x10ec0285:
 	case 0x10ec0289:
 		spec->codec_variant = ALC269_TYPE_ALC215;
+		spec->shutup = alc225_shutup;
+		spec->init_hook = alc225_init;
 		spec->gen.mixer_nid = 0;
 		break;
 	case 0x10ec0225:
 	case 0x10ec0295:
-		spec->codec_variant = ALC269_TYPE_ALC225;
-		spec->gen.mixer_nid = 0; /* no loopback on ALC225 ALC295 */
-		break;
 	case 0x10ec0299:
 		spec->codec_variant = ALC269_TYPE_ALC225;
-		spec->gen.mixer_nid = 0; /* no loopback on ALC299 */
+		spec->shutup = alc225_shutup;
+		spec->init_hook = alc225_init;
+		spec->gen.mixer_nid = 0; /* no loopback on ALC225, ALC295 and ALC299 */
 		break;
 	case 0x10ec0234:
 	case 0x10ec0274:
diff --git a/sound/pci/ice1712/prodigy_hifi.c b/sound/pci/ice1712/prodigy_hifi.c
index 2697402..8dabd4d 100644
--- a/sound/pci/ice1712/prodigy_hifi.c
+++ b/sound/pci/ice1712/prodigy_hifi.c
@@ -965,13 +965,32 @@ static int prodigy_hd2_add_controls(struct snd_ice1712 *ice)
 	return 0;
 }
 
-
-/*
- * initialize the chip
- */
-static int prodigy_hifi_init(struct snd_ice1712 *ice)
+static void wm8766_init(struct snd_ice1712 *ice)
 {
-	static unsigned short wm_inits[] = {
+	static unsigned short wm8766_inits[] = {
+		WM8766_RESET,	   0x0000,
+		WM8766_DAC_CTRL,	0x0120,
+		WM8766_INT_CTRL,	0x0022, /* I2S Normal Mode, 24 bit */
+		WM8766_DAC_CTRL2,       0x0001,
+		WM8766_DAC_CTRL3,       0x0080,
+		WM8766_LDA1,	    0x0100,
+		WM8766_LDA2,	    0x0100,
+		WM8766_LDA3,	    0x0100,
+		WM8766_RDA1,	    0x0100,
+		WM8766_RDA2,	    0x0100,
+		WM8766_RDA3,	    0x0100,
+		WM8766_MUTE1,	   0x0000,
+		WM8766_MUTE2,	   0x0000,
+	};
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(wm8766_inits); i += 2)
+		wm8766_spi_write(ice, wm8766_inits[i], wm8766_inits[i + 1]);
+}
+
+static void wm8776_init(struct snd_ice1712 *ice)
+{
+	static unsigned short wm8776_inits[] = {
 		/* These come first to reduce init pop noise */
 		WM_ADC_MUX,	0x0003,	/* ADC mute */
 		/* 0x00c0 replaced by 0x0003 */
@@ -982,7 +1001,76 @@ static int prodigy_hifi_init(struct snd_ice1712 *ice)
 		WM_POWERDOWN,	0x0008,	/* All power-up except HP */
 		WM_RESET,	0x0000,	/* reset */
 	};
-	static unsigned short wm_inits2[] = {
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(wm8776_inits); i += 2)
+		wm_put(ice, wm8776_inits[i], wm8776_inits[i + 1]);
+}
+
+#ifdef CONFIG_PM_SLEEP
+static int prodigy_hifi_resume(struct snd_ice1712 *ice)
+{
+	static unsigned short wm8776_reinit_registers[] = {
+		WM_MASTER_CTRL,
+		WM_DAC_INT,
+		WM_ADC_INT,
+		WM_OUT_MUX,
+		WM_HP_ATTEN_L,
+		WM_HP_ATTEN_R,
+		WM_PHASE_SWAP,
+		WM_DAC_CTRL2,
+		WM_ADC_ATTEN_L,
+		WM_ADC_ATTEN_R,
+		WM_ALC_CTRL1,
+		WM_ALC_CTRL2,
+		WM_ALC_CTRL3,
+		WM_NOISE_GATE,
+		WM_ADC_MUX,
+		/* no DAC attenuation here */
+	};
+	struct prodigy_hifi_spec *spec = ice->spec;
+	int i, ch;
+
+	mutex_lock(&ice->gpio_mutex);
+
+	/* reinitialize WM8776 and re-apply old register values */
+	wm8776_init(ice);
+	schedule_timeout_uninterruptible(1);
+	for (i = 0; i < ARRAY_SIZE(wm8776_reinit_registers); i++)
+		wm_put(ice, wm8776_reinit_registers[i],
+		       wm_get(ice, wm8776_reinit_registers[i]));
+
+	/* reinitialize WM8766 and re-apply volumes for all DACs */
+	wm8766_init(ice);
+	for (ch = 0; ch < 2; ch++) {
+		wm_set_vol(ice, WM_DAC_ATTEN_L + ch,
+			   spec->vol[2 + ch], spec->master[ch]);
+
+		wm8766_set_vol(ice, WM8766_LDA1 + ch,
+			       spec->vol[0 + ch], spec->master[ch]);
+
+		wm8766_set_vol(ice, WM8766_LDA2 + ch,
+			       spec->vol[4 + ch], spec->master[ch]);
+
+		wm8766_set_vol(ice, WM8766_LDA3 + ch,
+			       spec->vol[6 + ch], spec->master[ch]);
+	}
+
+	/* unmute WM8776 DAC */
+	wm_put(ice, WM_DAC_MUTE, 0x00);
+	wm_put(ice, WM_DAC_CTRL1, 0x90);
+
+	mutex_unlock(&ice->gpio_mutex);
+	return 0;
+}
+#endif
+
+/*
+ * initialize the chip
+ */
+static int prodigy_hifi_init(struct snd_ice1712 *ice)
+{
+	static unsigned short wm8776_defaults[] = {
 		WM_MASTER_CTRL,  0x0022, /* 256fs, slave mode */
 		WM_DAC_INT,	0x0022,	/* I2S, normal polarity, 24bit */
 		WM_ADC_INT,	0x0022,	/* I2S, normal polarity, 24bit */
@@ -1010,22 +1098,6 @@ static int prodigy_hifi_init(struct snd_ice1712 *ice)
 		WM_DAC_MUTE,	0x0000,	/* DAC unmute */
 		WM_ADC_MUX,	0x0003,	/* ADC unmute, both CD/Line On */
 	};
-	static unsigned short wm8766_inits[] = {
-		WM8766_RESET,	   0x0000,
-		WM8766_DAC_CTRL,	0x0120,
-		WM8766_INT_CTRL,	0x0022, /* I2S Normal Mode, 24 bit */
-		WM8766_DAC_CTRL2,       0x0001,
-		WM8766_DAC_CTRL3,       0x0080,
-		WM8766_LDA1,	    0x0100,
-		WM8766_LDA2,	    0x0100,
-		WM8766_LDA3,	    0x0100,
-		WM8766_RDA1,	    0x0100,
-		WM8766_RDA2,	    0x0100,
-		WM8766_RDA3,	    0x0100,
-		WM8766_MUTE1,	   0x0000,
-		WM8766_MUTE2,	   0x0000,
-	};
-
 	struct prodigy_hifi_spec *spec;
 	unsigned int i;
 
@@ -1052,16 +1124,17 @@ static int prodigy_hifi_init(struct snd_ice1712 *ice)
 	ice->spec = spec;
 
 	/* initialize WM8776 codec */
-	for (i = 0; i < ARRAY_SIZE(wm_inits); i += 2)
-		wm_put(ice, wm_inits[i], wm_inits[i+1]);
+	wm8776_init(ice);
 	schedule_timeout_uninterruptible(1);
-	for (i = 0; i < ARRAY_SIZE(wm_inits2); i += 2)
-		wm_put(ice, wm_inits2[i], wm_inits2[i+1]);
+	for (i = 0; i < ARRAY_SIZE(wm8776_defaults); i += 2)
+		wm_put(ice, wm8776_defaults[i], wm8776_defaults[i + 1]);
 
-	/* initialize WM8766 codec */
-	for (i = 0; i < ARRAY_SIZE(wm8766_inits); i += 2)
-		wm8766_spi_write(ice, wm8766_inits[i], wm8766_inits[i+1]);
+	wm8766_init(ice);
 
+#ifdef CONFIG_PM_SLEEP
+	ice->pm_resume = &prodigy_hifi_resume;
+	ice->pm_suspend_enabled = 1;
+#endif
 
 	return 0;
 }
diff --git a/sound/pci/korg1212/korg1212.c b/sound/pci/korg1212/korg1212.c
index c7b0071..4206ba4 100644
--- a/sound/pci/korg1212/korg1212.c
+++ b/sound/pci/korg1212/korg1212.c
@@ -2348,7 +2348,6 @@ static int snd_korg1212_create(struct snd_card *card, struct pci_dev *pci,
 
 	err = request_firmware(&dsp_code, "korg/k1212.dsp", &pci->dev);
 	if (err < 0) {
-		release_firmware(dsp_code);
 		snd_printk(KERN_ERR "firmware not available\n");
 		snd_korg1212_free(korg1212);
 		return err;
diff --git a/sound/soc/Kconfig b/sound/soc/Kconfig
index d227581..84c3582 100644
--- a/sound/soc/Kconfig
+++ b/sound/soc/Kconfig
@@ -71,6 +71,7 @@
 source "sound/soc/sunxi/Kconfig"
 source "sound/soc/tegra/Kconfig"
 source "sound/soc/txx9/Kconfig"
+source "sound/soc/uniphier/Kconfig"
 source "sound/soc/ux500/Kconfig"
 source "sound/soc/xtensa/Kconfig"
 source "sound/soc/zte/Kconfig"
diff --git a/sound/soc/Makefile b/sound/soc/Makefile
index 5327f4d..74cd185 100644
--- a/sound/soc/Makefile
+++ b/sound/soc/Makefile
@@ -55,6 +55,7 @@
 obj-$(CONFIG_SND_SOC)	+= sunxi/
 obj-$(CONFIG_SND_SOC)	+= tegra/
 obj-$(CONFIG_SND_SOC)	+= txx9/
+obj-$(CONFIG_SND_SOC)	+= uniphier/
 obj-$(CONFIG_SND_SOC)	+= ux500/
 obj-$(CONFIG_SND_SOC)	+= xtensa/
 obj-$(CONFIG_SND_SOC)	+= zte/
diff --git a/sound/soc/amd/acp-pcm-dma.c b/sound/soc/amd/acp-pcm-dma.c
index b5e41df..c33a512 100644
--- a/sound/soc/amd/acp-pcm-dma.c
+++ b/sound/soc/amd/acp-pcm-dma.c
@@ -850,6 +850,9 @@ static snd_pcm_uframes_t acp_dma_pointer(struct snd_pcm_substream *substream)
 	struct snd_pcm_runtime *runtime = substream->runtime;
 	struct audio_substream_data *rtd = runtime->private_data;
 
+	if (!rtd)
+		return -EINVAL;
+
 	buffersize = frames_to_bytes(runtime, runtime->buffer_size);
 	bytescount = acp_get_byte_count(rtd->acp_mmio, substream->stream);
 
@@ -875,6 +878,8 @@ static int acp_dma_prepare(struct snd_pcm_substream *substream)
 	struct snd_pcm_runtime *runtime = substream->runtime;
 	struct audio_substream_data *rtd = runtime->private_data;
 
+	if (!rtd)
+		return -EINVAL;
 	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
 		config_acp_dma_channel(rtd->acp_mmio, SYSRAM_TO_ACP_CH_NUM,
 					PLAYBACK_START_DMA_DESCR_CH12,
@@ -1091,7 +1096,11 @@ static int acp_audio_probe(struct platform_device *pdev)
 	dev_set_drvdata(&pdev->dev, audio_drv_data);
 
 	/* Initialize the ACP */
-	acp_init(audio_drv_data->acp_mmio, audio_drv_data->asic_type);
+	status = acp_init(audio_drv_data->acp_mmio, audio_drv_data->asic_type);
+	if (status) {
+		dev_err(&pdev->dev, "ACP Init failed status:%d\n", status);
+		return status;
+	}
 
 	status = snd_soc_register_platform(&pdev->dev, &acp_asoc_platform);
 	if (status != 0) {
@@ -1108,9 +1117,12 @@ static int acp_audio_probe(struct platform_device *pdev)
 
 static int acp_audio_remove(struct platform_device *pdev)
 {
+	int status;
 	struct audio_drv_data *adata = dev_get_drvdata(&pdev->dev);
 
-	acp_deinit(adata->acp_mmio);
+	status = acp_deinit(adata->acp_mmio);
+	if (status)
+		dev_err(&pdev->dev, "ACP Deinit failed status:%d\n", status);
 	snd_soc_unregister_platform(&pdev->dev);
 	pm_runtime_disable(&pdev->dev);
 
@@ -1120,9 +1132,14 @@ static int acp_audio_remove(struct platform_device *pdev)
 static int acp_pcm_resume(struct device *dev)
 {
 	u16 bank;
+	int status;
 	struct audio_drv_data *adata = dev_get_drvdata(dev);
 
-	acp_init(adata->acp_mmio, adata->asic_type);
+	status = acp_init(adata->acp_mmio, adata->asic_type);
+	if (status) {
+		dev_err(dev, "ACP Init failed status:%d\n", status);
+		return status;
+	}
 
 	if (adata->play_stream && adata->play_stream->runtime) {
 		/* For Stoney, Memory gating is disabled,i.e SRAM Banks
@@ -1154,18 +1171,26 @@ static int acp_pcm_resume(struct device *dev)
 
 static int acp_pcm_runtime_suspend(struct device *dev)
 {
+	int status;
 	struct audio_drv_data *adata = dev_get_drvdata(dev);
 
-	acp_deinit(adata->acp_mmio);
+	status = acp_deinit(adata->acp_mmio);
+	if (status)
+		dev_err(dev, "ACP Deinit failed status:%d\n", status);
 	acp_reg_write(0, adata->acp_mmio, mmACP_EXTERNAL_INTR_ENB);
 	return 0;
 }
 
 static int acp_pcm_runtime_resume(struct device *dev)
 {
+	int status;
 	struct audio_drv_data *adata = dev_get_drvdata(dev);
 
-	acp_init(adata->acp_mmio, adata->asic_type);
+	status = acp_init(adata->acp_mmio, adata->asic_type);
+	if (status) {
+		dev_err(dev, "ACP Init failed status:%d\n", status);
+		return status;
+	}
 	acp_reg_write(1, adata->acp_mmio, mmACP_EXTERNAL_INTR_ENB);
 	return 0;
 }
diff --git a/sound/soc/atmel/atmel-classd.c b/sound/soc/atmel/atmel-classd.c
index 8445edd..ebabed6 100644
--- a/sound/soc/atmel/atmel-classd.c
+++ b/sound/soc/atmel/atmel-classd.c
@@ -308,15 +308,9 @@ static int atmel_classd_codec_resume(struct snd_soc_codec *codec)
 	return regcache_sync(dd->regmap);
 }
 
-static struct regmap *atmel_classd_codec_get_remap(struct device *dev)
-{
-	return dev_get_regmap(dev, NULL);
-}
-
 static struct snd_soc_codec_driver soc_codec_dev_classd = {
 	.probe		= atmel_classd_codec_probe,
 	.resume		= atmel_classd_codec_resume,
-	.get_regmap	= atmel_classd_codec_get_remap,
 	.component_driver = {
 		.controls		= atmel_classd_snd_controls,
 		.num_controls		= ARRAY_SIZE(atmel_classd_snd_controls),
diff --git a/sound/soc/au1x/ac97c.c b/sound/soc/au1x/ac97c.c
index 29a97d5..66d6c52 100644
--- a/sound/soc/au1x/ac97c.c
+++ b/sound/soc/au1x/ac97c.c
@@ -91,8 +91,8 @@ static unsigned short au1xac97c_ac97_read(struct snd_ac97 *ac97,
 	do {
 		mutex_lock(&ctx->lock);
 
-		tmo = 5;
-		while ((RD(ctx, AC97_STATUS) & STAT_CP) && tmo--)
+		tmo = 6;
+		while ((RD(ctx, AC97_STATUS) & STAT_CP) && --tmo)
 			udelay(21);	/* wait an ac97 frame time */
 		if (!tmo) {
 			pr_debug("ac97rd timeout #1\n");
@@ -105,7 +105,7 @@ static unsigned short au1xac97c_ac97_read(struct snd_ac97 *ac97,
 		 * poll, Forrest, poll...
 		 */
 		tmo = 0x10000;
-		while ((RD(ctx, AC97_STATUS) & STAT_CP) && tmo--)
+		while ((RD(ctx, AC97_STATUS) & STAT_CP) && --tmo)
 			asm volatile ("nop");
 		data = RD(ctx, AC97_CMDRESP);
 
diff --git a/sound/soc/bcm/bcm2835-i2s.c b/sound/soc/bcm/bcm2835-i2s.c
index 2e449d7..d5f73a8 100644
--- a/sound/soc/bcm/bcm2835-i2s.c
+++ b/sound/soc/bcm/bcm2835-i2s.c
@@ -130,6 +130,7 @@ struct bcm2835_i2s_dev {
 	struct regmap				*i2s_regmap;
 	struct clk				*clk;
 	bool					clk_prepared;
+	int					clk_rate;
 };
 
 static void bcm2835_i2s_start_clock(struct bcm2835_i2s_dev *dev)
@@ -419,10 +420,19 @@ static int bcm2835_i2s_hw_params(struct snd_pcm_substream *substream,
 	}
 
 	/* Clock should only be set up here if CPU is clock master */
-	if (bit_clock_master) {
-		ret = clk_set_rate(dev->clk, bclk_rate);
-		if (ret)
-			return ret;
+	if (bit_clock_master &&
+	    (!dev->clk_prepared || dev->clk_rate != bclk_rate)) {
+		if (dev->clk_prepared)
+			bcm2835_i2s_stop_clock(dev);
+
+		if (dev->clk_rate != bclk_rate) {
+			ret = clk_set_rate(dev->clk, bclk_rate);
+			if (ret)
+				return ret;
+			dev->clk_rate = bclk_rate;
+		}
+
+		bcm2835_i2s_start_clock(dev);
 	}
 
 	/* Setup the frame format */
@@ -618,8 +628,6 @@ static int bcm2835_i2s_prepare(struct snd_pcm_substream *substream,
 	struct bcm2835_i2s_dev *dev = snd_soc_dai_get_drvdata(dai);
 	uint32_t cs_reg;
 
-	bcm2835_i2s_start_clock(dev);
-
 	/*
 	 * Clear both FIFOs if the one that should be started
 	 * is not empty at the moment. This should only happen
diff --git a/sound/soc/cirrus/ep93xx-ac97.c b/sound/soc/cirrus/ep93xx-ac97.c
index bbf7a92..cd5a939 100644
--- a/sound/soc/cirrus/ep93xx-ac97.c
+++ b/sound/soc/cirrus/ep93xx-ac97.c
@@ -365,7 +365,7 @@ static int ep93xx_ac97_probe(struct platform_device *pdev)
 {
 	struct ep93xx_ac97_info *info;
 	struct resource *res;
-	unsigned int irq;
+	int irq;
 	int ret;
 
 	info = devm_kzalloc(&pdev->dev, sizeof(*info), GFP_KERNEL);
@@ -378,8 +378,8 @@ static int ep93xx_ac97_probe(struct platform_device *pdev)
 		return PTR_ERR(info->regs);
 
 	irq = platform_get_irq(pdev, 0);
-	if (!irq)
-		return -ENODEV;
+	if (irq <= 0)
+		return irq < 0 ? irq : -ENODEV;
 
 	ret = devm_request_irq(&pdev->dev, irq, ep93xx_ac97_interrupt,
 			       IRQF_TRIGGER_HIGH, pdev->name, info);
diff --git a/sound/soc/codecs/88pm860x-codec.c b/sound/soc/codecs/88pm860x-codec.c
index 848c5fe..be8ea72 100644
--- a/sound/soc/codecs/88pm860x-codec.c
+++ b/sound/soc/codecs/88pm860x-codec.c
@@ -1319,6 +1319,7 @@ static int pm860x_probe(struct snd_soc_codec *codec)
 	int i, ret;
 
 	pm860x->codec = codec;
+	snd_soc_codec_init_regmap(codec,  pm860x->regmap);
 
 	for (i = 0; i < 4; i++) {
 		ret = request_threaded_irq(pm860x->irq[i], NULL,
@@ -1348,18 +1349,10 @@ static int pm860x_remove(struct snd_soc_codec *codec)
 	return 0;
 }
 
-static struct regmap *pm860x_get_regmap(struct device *dev)
-{
-	struct pm860x_priv *pm860x = dev_get_drvdata(dev);
-
-	return pm860x->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_pm860x = {
 	.probe		= pm860x_probe,
 	.remove		= pm860x_remove,
 	.set_bias_level	= pm860x_set_bias_level,
-	.get_regmap	= pm860x_get_regmap,
 
 	.component_driver = {
 		.controls		= pm860x_snd_controls,
diff --git a/sound/soc/codecs/Kconfig b/sound/soc/codecs/Kconfig
index a42ddbc..2b331f7 100644
--- a/sound/soc/codecs/Kconfig
+++ b/sound/soc/codecs/Kconfig
@@ -95,6 +95,7 @@
 	select SND_SOC_MAX98925 if I2C
 	select SND_SOC_MAX98926 if I2C
 	select SND_SOC_MAX98927 if I2C
+	select SND_SOC_MAX98373 if I2C
 	select SND_SOC_MAX9850 if I2C
 	select SND_SOC_MAX9860 if I2C
 	select SND_SOC_MAX9768 if I2C
@@ -109,6 +110,8 @@
 	select SND_SOC_PCM1681 if I2C
 	select SND_SOC_PCM179X_I2C if I2C
 	select SND_SOC_PCM179X_SPI if SPI_MASTER
+	select SND_SOC_PCM186X_I2C if I2C
+	select SND_SOC_PCM186X_SPI if SPI_MASTER
 	select SND_SOC_PCM3008
 	select SND_SOC_PCM3168A_I2C if I2C
 	select SND_SOC_PCM3168A_SPI if SPI_MASTER
@@ -133,7 +136,6 @@
 	select SND_SOC_SGTL5000 if I2C
 	select SND_SOC_SI476X if MFD_SI476X_CORE
 	select SND_SOC_SIRF_AUDIO_CODEC
-	select SND_SOC_SN95031 if INTEL_SCU_IPC
 	select SND_SOC_SPDIF
 	select SND_SOC_SSM2518 if I2C
 	select SND_SOC_SSM2602_SPI if SPI_MASTER
@@ -148,6 +150,7 @@
 	select SND_SOC_TAS5086 if I2C
 	select SND_SOC_TAS571X if I2C
 	select SND_SOC_TAS5720 if I2C
+	select SND_SOC_TAS6424 if I2C
 	select SND_SOC_TFA9879 if I2C
 	select SND_SOC_TLV320AIC23_I2C if I2C
 	select SND_SOC_TLV320AIC23_SPI if SPI_MASTER
@@ -158,6 +161,7 @@
 	select SND_SOC_TLV320AIC3X if I2C
 	select SND_SOC_TPA6130A2 if I2C
 	select SND_SOC_TLV320DAC33 if I2C
+	select SND_SOC_TSCS42XX if I2C
 	select SND_SOC_TS3A227E if I2C
 	select SND_SOC_TWL4030 if TWL4030_CORE
 	select SND_SOC_TWL6040 if TWL6040_CORE
@@ -623,6 +627,10 @@
 	tristate "Maxim Integrated MAX98927 Speaker Amplifier"
 	depends on I2C
 
+config SND_SOC_MAX98373
+	tristate "Maxim Integrated MAX98373 Speaker Amplifier"
+	depends on I2C
+
 config SND_SOC_MAX9850
 	tristate
 
@@ -661,6 +669,21 @@
 	  Enable support for Texas Instruments PCM179x CODEC.
 	  Select this if your PCM179x is connected via an SPI bus.
 
+config SND_SOC_PCM186X
+	tristate
+
+config SND_SOC_PCM186X_I2C
+	tristate "Texas Instruments PCM186x CODECs - I2C"
+	depends on I2C
+	select SND_SOC_PCM186X
+	select REGMAP_I2C
+
+config SND_SOC_PCM186X_SPI
+	tristate "Texas Instruments PCM186x CODECs - SPI"
+	depends on SPI_MASTER
+	select SND_SOC_PCM186X
+	select REGMAP_SPI
+
 config SND_SOC_PCM3008
        tristate
 
@@ -818,9 +841,6 @@
 	tristate "SiRF SoC internal audio codec"
 	select REGMAP_MMIO
 
-config SND_SOC_SN95031
-	tristate
-
 config SND_SOC_SPDIF
 	tristate "S/PDIF CODEC"
 
@@ -883,6 +903,13 @@
 	  Enable support for Texas Instruments TAS5720L/M high-efficiency mono
 	  Class-D audio power amplifiers.
 
+config SND_SOC_TAS6424
+	tristate "Texas Instruments TAS6424 Quad-Channel Audio amplifier"
+	depends on I2C
+	help
+	  Enable support for Texas Instruments TAS6424 high-efficiency
+	  digital input quad-channel Class-D audio power amplifiers.
+
 config SND_SOC_TFA9879
 	tristate "NXP Semiconductors TFA9879 amplifier"
 	depends on I2C
@@ -913,12 +940,12 @@
 	tristate
 
 config SND_SOC_TLV320AIC32X4_I2C
-	tristate
+	tristate "Texas Instruments TLV320AIC32x4 audio CODECs - I2C"
 	depends on I2C
 	select SND_SOC_TLV320AIC32X4
 
 config SND_SOC_TLV320AIC32X4_SPI
-	tristate
+	tristate "Texas Instruments TLV320AIC32x4 audio CODECs - SPI"
 	depends on SPI_MASTER
 	select SND_SOC_TLV320AIC32X4
 
@@ -933,6 +960,13 @@
 	tristate "TI Headset/Mic detect and keypress chip"
 	depends on I2C
 
+config SND_SOC_TSCS42XX
+	tristate "Tempo Semiconductor TSCS42xx CODEC"
+	depends on I2C
+	select REGMAP_I2C
+	help
+	  Add support for Tempo Semiconductor's TSCS42xx audio CODEC.
+
 config SND_SOC_TWL4030
 	select MFD_TWL4030_AUDIO
 	tristate
diff --git a/sound/soc/codecs/Makefile b/sound/soc/codecs/Makefile
index 0001069..da15713 100644
--- a/sound/soc/codecs/Makefile
+++ b/sound/soc/codecs/Makefile
@@ -90,6 +90,7 @@
 snd-soc-max98925-objs := max98925.o
 snd-soc-max98926-objs := max98926.o
 snd-soc-max98927-objs := max98927.o
+snd-soc-max98373-objs := max98373.o
 snd-soc-max9850-objs := max9850.o
 snd-soc-max9860-objs := max9860.o
 snd-soc-mc13783-objs := mc13783.o
@@ -105,6 +106,9 @@
 snd-soc-pcm179x-codec-objs := pcm179x.o
 snd-soc-pcm179x-i2c-objs := pcm179x-i2c.o
 snd-soc-pcm179x-spi-objs := pcm179x-spi.o
+snd-soc-pcm186x-objs := pcm186x.o
+snd-soc-pcm186x-i2c-objs := pcm186x-i2c.o
+snd-soc-pcm186x-spi-objs := pcm186x-spi.o
 snd-soc-pcm3008-objs := pcm3008.o
 snd-soc-pcm3168a-objs := pcm3168a.o
 snd-soc-pcm3168a-i2c-objs := pcm3168a-i2c.o
@@ -140,7 +144,6 @@
 snd-soc-sigmadsp-regmap-objs := sigmadsp-regmap.o
 snd-soc-si476x-objs := si476x.o
 snd-soc-sirf-audio-codec-objs := sirf-audio-codec.o
-snd-soc-sn95031-objs := sn95031.o
 snd-soc-spdif-tx-objs := spdif_transmitter.o
 snd-soc-spdif-rx-objs := spdif_receiver.o
 snd-soc-ssm2518-objs := ssm2518.o
@@ -156,6 +159,7 @@
 snd-soc-tas5086-objs := tas5086.o
 snd-soc-tas571x-objs := tas571x.o
 snd-soc-tas5720-objs := tas5720.o
+snd-soc-tas6424-objs := tas6424.o
 snd-soc-tfa9879-objs := tfa9879.o
 snd-soc-tlv320aic23-objs := tlv320aic23.o
 snd-soc-tlv320aic23-i2c-objs := tlv320aic23-i2c.o
@@ -167,6 +171,7 @@
 snd-soc-tlv320aic32x4-spi-objs := tlv320aic32x4-spi.o
 snd-soc-tlv320aic3x-objs := tlv320aic3x.o
 snd-soc-tlv320dac33-objs := tlv320dac33.o
+snd-soc-tscs42xx-objs := tscs42xx.o
 snd-soc-ts3a227e-objs := ts3a227e.o
 snd-soc-twl4030-objs := twl4030.o
 snd-soc-twl6040-objs := twl6040.o
@@ -330,6 +335,7 @@
 obj-$(CONFIG_SND_SOC_MAX98925)	+= snd-soc-max98925.o
 obj-$(CONFIG_SND_SOC_MAX98926)	+= snd-soc-max98926.o
 obj-$(CONFIG_SND_SOC_MAX98927)	+= snd-soc-max98927.o
+obj-$(CONFIG_SND_SOC_MAX98373)	+= snd-soc-max98373.o
 obj-$(CONFIG_SND_SOC_MAX9850)	+= snd-soc-max9850.o
 obj-$(CONFIG_SND_SOC_MAX9860)	+= snd-soc-max9860.o
 obj-$(CONFIG_SND_SOC_MC13783)	+= snd-soc-mc13783.o
@@ -345,6 +351,9 @@
 obj-$(CONFIG_SND_SOC_PCM179X)	+= snd-soc-pcm179x-codec.o
 obj-$(CONFIG_SND_SOC_PCM179X_I2C)	+= snd-soc-pcm179x-i2c.o
 obj-$(CONFIG_SND_SOC_PCM179X_SPI)	+= snd-soc-pcm179x-spi.o
+obj-$(CONFIG_SND_SOC_PCM186X)	+= snd-soc-pcm186x.o
+obj-$(CONFIG_SND_SOC_PCM186X_I2C)	+= snd-soc-pcm186x-i2c.o
+obj-$(CONFIG_SND_SOC_PCM186X_SPI)	+= snd-soc-pcm186x-spi.o
 obj-$(CONFIG_SND_SOC_PCM3008)	+= snd-soc-pcm3008.o
 obj-$(CONFIG_SND_SOC_PCM3168A)	+= snd-soc-pcm3168a.o
 obj-$(CONFIG_SND_SOC_PCM3168A_I2C)	+= snd-soc-pcm3168a-i2c.o
@@ -395,6 +404,7 @@
 obj-$(CONFIG_SND_SOC_TAS5086)	+= snd-soc-tas5086.o
 obj-$(CONFIG_SND_SOC_TAS571X)	+= snd-soc-tas571x.o
 obj-$(CONFIG_SND_SOC_TAS5720)	+= snd-soc-tas5720.o
+obj-$(CONFIG_SND_SOC_TAS6424)	+= snd-soc-tas6424.o
 obj-$(CONFIG_SND_SOC_TFA9879)	+= snd-soc-tfa9879.o
 obj-$(CONFIG_SND_SOC_TLV320AIC23)	+= snd-soc-tlv320aic23.o
 obj-$(CONFIG_SND_SOC_TLV320AIC23_I2C)	+= snd-soc-tlv320aic23-i2c.o
@@ -406,6 +416,7 @@
 obj-$(CONFIG_SND_SOC_TLV320AIC32X4_SPI)	+= snd-soc-tlv320aic32x4-spi.o
 obj-$(CONFIG_SND_SOC_TLV320AIC3X)	+= snd-soc-tlv320aic3x.o
 obj-$(CONFIG_SND_SOC_TLV320DAC33)	+= snd-soc-tlv320dac33.o
+obj-$(CONFIG_SND_SOC_TSCS42XX)	+= snd-soc-tscs42xx.o
 obj-$(CONFIG_SND_SOC_TS3A227E)	+= snd-soc-ts3a227e.o
 obj-$(CONFIG_SND_SOC_TWL4030)	+= snd-soc-twl4030.o
 obj-$(CONFIG_SND_SOC_TWL6040)	+= snd-soc-twl6040.o
diff --git a/sound/soc/codecs/cq93vc.c b/sound/soc/codecs/cq93vc.c
index 6ed2cc3..3bf9365 100644
--- a/sound/soc/codecs/cq93vc.c
+++ b/sound/soc/codecs/cq93vc.c
@@ -121,17 +121,19 @@ static struct snd_soc_dai_driver cq93vc_dai = {
 	.ops = &cq93vc_dai_ops,
 };
 
-static struct regmap *cq93vc_get_regmap(struct device *dev)
+static int cq93vc_probe(struct snd_soc_component *component)
 {
-	struct davinci_vc *davinci_vc = dev->platform_data;
+	struct davinci_vc *davinci_vc = component->dev->platform_data;
 
-	return davinci_vc->regmap;
+	snd_soc_component_init_regmap(component, davinci_vc->regmap);
+
+	return 0;
 }
 
 static const struct snd_soc_codec_driver soc_codec_dev_cq93vc = {
 	.set_bias_level = cq93vc_set_bias_level,
-	.get_regmap = cq93vc_get_regmap,
 	.component_driver = {
+		.probe = cq93vc_probe,
 		.controls = cq93vc_snd_controls,
 		.num_controls = ARRAY_SIZE(cq93vc_snd_controls),
 	},
diff --git a/sound/soc/codecs/cs35l32.c b/sound/soc/codecs/cs35l32.c
index 7e98062..bc3a72e 100644
--- a/sound/soc/codecs/cs35l32.c
+++ b/sound/soc/codecs/cs35l32.c
@@ -355,13 +355,9 @@ static int cs35l32_i2c_probe(struct i2c_client *i2c_client,
 	unsigned int devid = 0;
 	unsigned int reg;
 
-
-	cs35l32 = devm_kzalloc(&i2c_client->dev, sizeof(struct cs35l32_private),
-			       GFP_KERNEL);
-	if (!cs35l32) {
-		dev_err(&i2c_client->dev, "could not allocate codec\n");
+	cs35l32 = devm_kzalloc(&i2c_client->dev, sizeof(*cs35l32), GFP_KERNEL);
+	if (!cs35l32)
 		return -ENOMEM;
-	}
 
 	i2c_set_clientdata(i2c_client, cs35l32);
 
@@ -375,13 +371,11 @@ static int cs35l32_i2c_probe(struct i2c_client *i2c_client,
 	if (pdata) {
 		cs35l32->pdata = *pdata;
 	} else {
-		pdata = devm_kzalloc(&i2c_client->dev,
-				     sizeof(struct cs35l32_platform_data),
-				GFP_KERNEL);
-		if (!pdata) {
-			dev_err(&i2c_client->dev, "could not allocate pdata\n");
+		pdata = devm_kzalloc(&i2c_client->dev, sizeof(*pdata),
+				     GFP_KERNEL);
+		if (!pdata)
 			return -ENOMEM;
-		}
+
 		if (i2c_client->dev.of_node) {
 			ret = cs35l32_handle_of_data(i2c_client,
 						     &cs35l32->pdata);
diff --git a/sound/soc/codecs/cs35l34.c b/sound/soc/codecs/cs35l34.c
index 1e05026..0600d52 100644
--- a/sound/soc/codecs/cs35l34.c
+++ b/sound/soc/codecs/cs35l34.c
@@ -1004,13 +1004,9 @@ static int cs35l34_i2c_probe(struct i2c_client *i2c_client,
 	unsigned int devid = 0;
 	unsigned int reg;
 
-	cs35l34 = devm_kzalloc(&i2c_client->dev,
-			       sizeof(struct cs35l34_private),
-			       GFP_KERNEL);
-	if (!cs35l34) {
-		dev_err(&i2c_client->dev, "could not allocate codec\n");
+	cs35l34 = devm_kzalloc(&i2c_client->dev, sizeof(*cs35l34), GFP_KERNEL);
+	if (!cs35l34)
 		return -ENOMEM;
-	}
 
 	i2c_set_clientdata(i2c_client, cs35l34);
 	cs35l34->regmap = devm_regmap_init_i2c(i2c_client, &cs35l34_regmap);
@@ -1044,14 +1040,11 @@ static int cs35l34_i2c_probe(struct i2c_client *i2c_client,
 	if (pdata) {
 		cs35l34->pdata = *pdata;
 	} else {
-		pdata = devm_kzalloc(&i2c_client->dev,
-				sizeof(struct cs35l34_platform_data),
-				GFP_KERNEL);
-		if (!pdata) {
-			dev_err(&i2c_client->dev,
-				"could not allocate pdata\n");
+		pdata = devm_kzalloc(&i2c_client->dev, sizeof(*pdata),
+				     GFP_KERNEL);
+		if (!pdata)
 			return -ENOMEM;
-		}
+
 		if (i2c_client->dev.of_node) {
 			ret = cs35l34_handle_of_data(i2c_client, pdata);
 			if (ret != 0)
diff --git a/sound/soc/codecs/cs42l52.c b/sound/soc/codecs/cs42l52.c
index 0d9c4a5..9731e5d 100644
--- a/sound/soc/codecs/cs42l52.c
+++ b/sound/soc/codecs/cs42l52.c
@@ -1100,8 +1100,7 @@ static int cs42l52_i2c_probe(struct i2c_client *i2c_client,
 	unsigned int reg;
 	u32 val32;
 
-	cs42l52 = devm_kzalloc(&i2c_client->dev, sizeof(struct cs42l52_private),
-			       GFP_KERNEL);
+	cs42l52 = devm_kzalloc(&i2c_client->dev, sizeof(*cs42l52), GFP_KERNEL);
 	if (cs42l52 == NULL)
 		return -ENOMEM;
 	cs42l52->dev = &i2c_client->dev;
@@ -1115,13 +1114,11 @@ static int cs42l52_i2c_probe(struct i2c_client *i2c_client,
 	if (pdata) {
 		cs42l52->pdata = *pdata;
 	} else {
-		pdata = devm_kzalloc(&i2c_client->dev,
-				     sizeof(struct cs42l52_platform_data),
-				GFP_KERNEL);
-		if (!pdata) {
-			dev_err(&i2c_client->dev, "could not allocate pdata\n");
+		pdata = devm_kzalloc(&i2c_client->dev, sizeof(*pdata),
+				     GFP_KERNEL);
+		if (!pdata)
 			return -ENOMEM;
-		}
+
 		if (i2c_client->dev.of_node) {
 			if (of_property_read_bool(i2c_client->dev.of_node,
 				"cirrus,mica-differential-cfg"))
diff --git a/sound/soc/codecs/cs42l56.c b/sound/soc/codecs/cs42l56.c
index cb6ca85..fd7b8d3 100644
--- a/sound/soc/codecs/cs42l56.c
+++ b/sound/soc/codecs/cs42l56.c
@@ -1190,9 +1190,7 @@ static int cs42l56_i2c_probe(struct i2c_client *i2c_client,
 	unsigned int alpha_rev, metal_rev;
 	unsigned int reg;
 
-	cs42l56 = devm_kzalloc(&i2c_client->dev,
-			       sizeof(struct cs42l56_private),
-			       GFP_KERNEL);
+	cs42l56 = devm_kzalloc(&i2c_client->dev, sizeof(*cs42l56), GFP_KERNEL);
 	if (cs42l56 == NULL)
 		return -ENOMEM;
 	cs42l56->dev = &i2c_client->dev;
@@ -1207,14 +1205,11 @@ static int cs42l56_i2c_probe(struct i2c_client *i2c_client,
 	if (pdata) {
 		cs42l56->pdata = *pdata;
 	} else {
-		pdata = devm_kzalloc(&i2c_client->dev,
-				     sizeof(struct cs42l56_platform_data),
+		pdata = devm_kzalloc(&i2c_client->dev, sizeof(*pdata),
 				     GFP_KERNEL);
-		if (!pdata) {
-			dev_err(&i2c_client->dev,
-				"could not allocate pdata\n");
+		if (!pdata)
 			return -ENOMEM;
-		}
+
 		if (i2c_client->dev.of_node) {
 			ret = cs42l56_handle_of_data(i2c_client,
 						     &cs42l56->pdata);
diff --git a/sound/soc/codecs/cs42l73.c b/sound/soc/codecs/cs42l73.c
index 3df2c47..aebaa97 100644
--- a/sound/soc/codecs/cs42l73.c
+++ b/sound/soc/codecs/cs42l73.c
@@ -1289,8 +1289,7 @@ static int cs42l73_i2c_probe(struct i2c_client *i2c_client,
 	unsigned int reg;
 	u32 val32;
 
-	cs42l73 = devm_kzalloc(&i2c_client->dev, sizeof(struct cs42l73_private),
-			       GFP_KERNEL);
+	cs42l73 = devm_kzalloc(&i2c_client->dev, sizeof(*cs42l73), GFP_KERNEL);
 	if (!cs42l73)
 		return -ENOMEM;
 
@@ -1304,13 +1303,11 @@ static int cs42l73_i2c_probe(struct i2c_client *i2c_client,
 	if (pdata) {
 		cs42l73->pdata = *pdata;
 	} else {
-		pdata = devm_kzalloc(&i2c_client->dev,
-				     sizeof(struct cs42l73_platform_data),
-				GFP_KERNEL);
-		if (!pdata) {
-			dev_err(&i2c_client->dev, "could not allocate pdata\n");
+		pdata = devm_kzalloc(&i2c_client->dev, sizeof(*pdata),
+				     GFP_KERNEL);
+		if (!pdata)
 			return -ENOMEM;
-		}
+
 		if (i2c_client->dev.of_node) {
 			if (of_property_read_u32(i2c_client->dev.of_node,
 				"chgfreq", &val32) >= 0)
@@ -1358,7 +1355,7 @@ static int cs42l73_i2c_probe(struct i2c_client *i2c_client,
 	ret = regmap_read(cs42l73->regmap, CS42L73_REVID, &reg);
 	if (ret < 0) {
 		dev_err(&i2c_client->dev, "Get Revision ID failed\n");
-		return ret;;
+		return ret;
 	}
 
 	dev_info(&i2c_client->dev,
diff --git a/sound/soc/codecs/cs47l24.c b/sound/soc/codecs/cs47l24.c
index 94c0209..be27506 100644
--- a/sound/soc/codecs/cs47l24.c
+++ b/sound/soc/codecs/cs47l24.c
@@ -1120,9 +1120,11 @@ static int cs47l24_codec_probe(struct snd_soc_codec *codec)
 	struct snd_soc_dapm_context *dapm = snd_soc_codec_get_dapm(codec);
 	struct snd_soc_component *component = snd_soc_dapm_to_component(dapm);
 	struct cs47l24_priv *priv = snd_soc_codec_get_drvdata(codec);
+	struct arizona *arizona = priv->core.arizona;
 	int ret;
 
-	priv->core.arizona->dapm = dapm;
+	arizona->dapm = dapm;
+	snd_soc_codec_init_regmap(codec, arizona->regmap);
 
 	ret = arizona_init_spk(codec);
 	if (ret < 0)
@@ -1175,17 +1177,9 @@ static unsigned int cs47l24_digital_vu[] = {
 	ARIZONA_DAC_DIGITAL_VOLUME_4L,
 };
 
-static struct regmap *cs47l24_get_regmap(struct device *dev)
-{
-	struct cs47l24_priv *priv = dev_get_drvdata(dev);
-
-	return priv->core.arizona->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_cs47l24 = {
 	.probe = cs47l24_codec_probe,
 	.remove = cs47l24_codec_remove,
-	.get_regmap = cs47l24_get_regmap,
 
 	.idle_bias_off = true,
 
diff --git a/sound/soc/codecs/cx20442.c b/sound/soc/codecs/cx20442.c
index 46b1fbb..95bb10b 100644
--- a/sound/soc/codecs/cx20442.c
+++ b/sound/soc/codecs/cx20442.c
@@ -26,8 +26,9 @@
 
 
 struct cx20442_priv {
-	void *control_data;
+	struct tty_struct *tty;
 	struct regulator *por;
+	u8 reg_cache;
 };
 
 #define CX20442_PM		0x0
@@ -89,14 +90,14 @@ static const struct snd_soc_dapm_route cx20442_audio_map[] = {
 };
 
 static unsigned int cx20442_read_reg_cache(struct snd_soc_codec *codec,
-							unsigned int reg)
+					   unsigned int reg)
 {
-	u8 *reg_cache = codec->reg_cache;
+	struct cx20442_priv *cx20442 = snd_soc_codec_get_drvdata(codec);
 
-	if (reg >= codec->driver->reg_cache_size)
+	if (reg >= 1)
 		return -EINVAL;
 
-	return reg_cache[reg];
+	return cx20442->reg_cache;
 }
 
 enum v253_vls {
@@ -156,20 +157,19 @@ static int cx20442_write(struct snd_soc_codec *codec, unsigned int reg,
 							unsigned int value)
 {
 	struct cx20442_priv *cx20442 = snd_soc_codec_get_drvdata(codec);
-	u8 *reg_cache = codec->reg_cache;
 	int vls, vsp, old, len;
 	char buf[18];
 
-	if (reg >= codec->driver->reg_cache_size)
+	if (reg >= 1)
 		return -EINVAL;
 
-	/* hw_write and control_data pointers required for talking to the modem
+	/* tty and write pointers required for talking to the modem
 	 * are expected to be set by the line discipline initialization code */
-	if (!codec->hw_write || !cx20442->control_data)
+	if (!cx20442->tty || !cx20442->tty->ops->write)
 		return -EIO;
 
-	old = reg_cache[reg];
-	reg_cache[reg] = value;
+	old = cx20442->reg_cache;
+	cx20442->reg_cache = value;
 
 	vls = cx20442_pm_to_v253_vls(value);
 	if (vls < 0)
@@ -194,13 +194,12 @@ static int cx20442_write(struct snd_soc_codec *codec, unsigned int reg,
 		return -ENOMEM;
 
 	dev_dbg(codec->dev, "%s: %s\n", __func__, buf);
-	if (codec->hw_write(cx20442->control_data, buf, len) != len)
+	if (cx20442->tty->ops->write(cx20442->tty, buf, len) != len)
 		return -EIO;
 
 	return 0;
 }
 
-
 /*
  * Line discpline related code
  *
@@ -252,8 +251,7 @@ static void v253_close(struct tty_struct *tty)
 	cx20442 = snd_soc_codec_get_drvdata(codec);
 
 	/* Prevent the codec driver from further accessing the modem */
-	codec->hw_write = NULL;
-	cx20442->control_data = NULL;
+	cx20442->tty = NULL;
 	codec->component.card->pop_time = 0;
 }
 
@@ -276,12 +274,11 @@ static void v253_receive(struct tty_struct *tty,
 
 	cx20442 = snd_soc_codec_get_drvdata(codec);
 
-	if (!cx20442->control_data) {
+	if (!cx20442->tty) {
 		/* First modem response, complete setup procedure */
 
 		/* Set up codec driver access to modem controls */
-		cx20442->control_data = tty;
-		codec->hw_write = (hw_write_t)tty->ops->write;
+		cx20442->tty = tty;
 		codec->component.card->pop_time = 1;
 	}
 }
@@ -367,10 +364,9 @@ static int cx20442_codec_probe(struct snd_soc_codec *codec)
 	cx20442->por = regulator_get(codec->dev, "POR");
 	if (IS_ERR(cx20442->por))
 		dev_warn(codec->dev, "failed to get the regulator");
-	cx20442->control_data = NULL;
+	cx20442->tty = NULL;
 
 	snd_soc_codec_set_drvdata(codec, cx20442);
-	codec->hw_write = NULL;
 	codec->component.card->pop_time = 0;
 
 	return 0;
@@ -381,8 +377,8 @@ static int cx20442_codec_remove(struct snd_soc_codec *codec)
 {
 	struct cx20442_priv *cx20442 = snd_soc_codec_get_drvdata(codec);
 
-	if (cx20442->control_data) {
-		struct tty_struct *tty = cx20442->control_data;
+	if (cx20442->tty) {
+		struct tty_struct *tty = cx20442->tty;
 		tty_hangup(tty);
 	}
 
@@ -396,17 +392,13 @@ static int cx20442_codec_remove(struct snd_soc_codec *codec)
 	return 0;
 }
 
-static const u8 cx20442_reg;
-
 static const struct snd_soc_codec_driver cx20442_codec_dev = {
 	.probe = 	cx20442_codec_probe,
 	.remove = 	cx20442_codec_remove,
 	.set_bias_level = cx20442_set_bias_level,
-	.reg_cache_default = &cx20442_reg,
-	.reg_cache_size = 1,
-	.reg_word_size = sizeof(u8),
 	.read = cx20442_read_reg_cache,
 	.write = cx20442_write,
+
 	.component_driver = {
 		.dapm_widgets		= cx20442_dapm_widgets,
 		.num_dapm_widgets	= ARRAY_SIZE(cx20442_dapm_widgets),
diff --git a/sound/soc/codecs/da7213.c b/sound/soc/codecs/da7213.c
index 41d9b1d..b2b4e90 100644
--- a/sound/soc/codecs/da7213.c
+++ b/sound/soc/codecs/da7213.c
@@ -1654,10 +1654,8 @@ static struct da7213_platform_data
 	u32 fw_val32;
 
 	pdata = devm_kzalloc(codec->dev, sizeof(*pdata), GFP_KERNEL);
-	if (!pdata) {
-		dev_warn(codec->dev, "Failed to allocate memory for pdata\n");
+	if (!pdata)
 		return NULL;
-	}
 
 	if (device_property_read_u32(dev, "dlg,micbias1-lvl", &fw_val32) >= 0)
 		pdata->micbias1_lvl = da7213_of_micbias_lvl(codec, fw_val32);
@@ -1855,8 +1853,7 @@ static int da7213_i2c_probe(struct i2c_client *i2c,
 	struct da7213_priv *da7213;
 	int ret;
 
-	da7213 = devm_kzalloc(&i2c->dev, sizeof(struct da7213_priv),
-			      GFP_KERNEL);
+	da7213 = devm_kzalloc(&i2c->dev, sizeof(*da7213), GFP_KERNEL);
 	if (!da7213)
 		return -ENOMEM;
 
diff --git a/sound/soc/codecs/da7218.c b/sound/soc/codecs/da7218.c
index 56564ce..96c644a 100644
--- a/sound/soc/codecs/da7218.c
+++ b/sound/soc/codecs/da7218.c
@@ -2455,10 +2455,8 @@ static struct da7218_pdata *da7218_of_to_pdata(struct snd_soc_codec *codec)
 	u32 of_val32;
 
 	pdata = devm_kzalloc(codec->dev, sizeof(*pdata), GFP_KERNEL);
-	if (!pdata) {
-		dev_warn(codec->dev, "Failed to allocate memory for pdata\n");
+	if (!pdata)
 		return NULL;
-	}
 
 	if (of_property_read_u32(np, "dlg,micbias1-lvl-millivolt", &of_val32) >= 0)
 		pdata->micbias1_lvl = da7218_of_micbias_lvl(codec, of_val32);
@@ -2527,8 +2525,6 @@ static struct da7218_pdata *da7218_of_to_pdata(struct snd_soc_codec *codec)
 		hpldet_pdata = devm_kzalloc(codec->dev, sizeof(*hpldet_pdata),
 					    GFP_KERNEL);
 		if (!hpldet_pdata) {
-			dev_warn(codec->dev,
-				 "Failed to allocate memory for hpldet pdata\n");
 			of_node_put(hpldet_np);
 			return pdata;
 		}
@@ -3273,8 +3269,7 @@ static int da7218_i2c_probe(struct i2c_client *i2c,
 	struct da7218_priv *da7218;
 	int ret;
 
-	da7218 = devm_kzalloc(&i2c->dev, sizeof(struct da7218_priv),
-			      GFP_KERNEL);
+	da7218 = devm_kzalloc(&i2c->dev, sizeof(*da7218), GFP_KERNEL);
 	if (!da7218)
 		return -ENOMEM;
 
diff --git a/sound/soc/codecs/dmic.c b/sound/soc/codecs/dmic.c
index b88a1ee..c88f974 100644
--- a/sound/soc/codecs/dmic.c
+++ b/sound/soc/codecs/dmic.c
@@ -107,8 +107,30 @@ static const struct snd_soc_codec_driver soc_dmic = {
 
 static int dmic_dev_probe(struct platform_device *pdev)
 {
+	int err;
+	u32 chans;
+	struct snd_soc_dai_driver *dai_drv = &dmic_dai;
+
+	if (pdev->dev.of_node) {
+		err = of_property_read_u32(pdev->dev.of_node, "num-channels", &chans);
+		if (err && (err != -ENOENT))
+			return err;
+
+		if (!err) {
+			if (chans < 1 || chans > 8)
+				return -EINVAL;
+
+			dai_drv = devm_kzalloc(&pdev->dev, sizeof(*dai_drv), GFP_KERNEL);
+			if (!dai_drv)
+				return -ENOMEM;
+
+			memcpy(dai_drv, &dmic_dai, sizeof(*dai_drv));
+			dai_drv->capture.channels_max = chans;
+		}
+	}
+
 	return snd_soc_register_codec(&pdev->dev,
-			&soc_dmic, &dmic_dai, 1);
+			&soc_dmic, dai_drv, 1);
 }
 
 static int dmic_dev_remove(struct platform_device *pdev)
diff --git a/sound/soc/codecs/hdac_hdmi.c b/sound/soc/codecs/hdac_hdmi.c
index f3b4f4d..dba6f4c 100644
--- a/sound/soc/codecs/hdac_hdmi.c
+++ b/sound/soc/codecs/hdac_hdmi.c
@@ -136,8 +136,11 @@ struct hdac_hdmi_priv {
 	struct mutex pin_mutex;
 	struct hdac_chmap chmap;
 	struct hdac_hdmi_drv_data *drv_data;
+	struct snd_soc_dai_driver *dai_drv;
 };
 
+#define hdev_to_hdmi_priv(_hdev) ((to_ehdac_device(_hdev))->private_data)
+
 static struct hdac_hdmi_pcm *
 hdac_hdmi_get_pcm_from_cvt(struct hdac_hdmi_priv *hdmi,
 			   struct hdac_hdmi_cvt *cvt)
@@ -169,7 +172,7 @@ static void hdac_hdmi_jack_report(struct hdac_hdmi_pcm *pcm,
 		 * ports.
 		 */
 		if (pcm->jack_event == 0) {
-			dev_dbg(&edev->hdac.dev,
+			dev_dbg(&edev->hdev.dev,
 					"jack report for pcm=%d\n",
 					pcm->pcm_id);
 			snd_soc_jack_report(pcm->jack, SND_JACK_AVOUT,
@@ -195,18 +198,18 @@ static void hdac_hdmi_jack_report(struct hdac_hdmi_pcm *pcm,
 /*
  * Get the no devices that can be connected to a port on the Pin widget.
  */
-static int hdac_hdmi_get_port_len(struct hdac_ext_device *hdac, hda_nid_t nid)
+static int hdac_hdmi_get_port_len(struct hdac_ext_device *edev, hda_nid_t nid)
 {
 	unsigned int caps;
 	unsigned int type, param;
 
-	caps = get_wcaps(&hdac->hdac, nid);
+	caps = get_wcaps(&edev->hdev, nid);
 	type = get_wcaps_type(caps);
 
 	if (!(caps & AC_WCAP_DIGITAL) || (type != AC_WID_PIN))
 		return 0;
 
-	param = snd_hdac_read_parm_uncached(&hdac->hdac, nid,
+	param = snd_hdac_read_parm_uncached(&edev->hdev, nid,
 					AC_PAR_DEVLIST_LEN);
 	if (param == -1)
 		return param;
@@ -219,10 +222,10 @@ static int hdac_hdmi_get_port_len(struct hdac_ext_device *hdac, hda_nid_t nid)
  * id selected on the pin. Return 0 means the first port entry
  * is selected or MST is not supported.
  */
-static int hdac_hdmi_port_select_get(struct hdac_ext_device *hdac,
+static int hdac_hdmi_port_select_get(struct hdac_ext_device *edev,
 					struct hdac_hdmi_port *port)
 {
-	return snd_hdac_codec_read(&hdac->hdac, port->pin->nid,
+	return snd_hdac_codec_read(&edev->hdev, port->pin->nid,
 				0, AC_VERB_GET_DEVICE_SEL, 0);
 }
 
@@ -230,7 +233,7 @@ static int hdac_hdmi_port_select_get(struct hdac_ext_device *hdac,
  * Sets the selected port entry for the configuring Pin widget verb.
  * returns error if port set is not equal to port get otherwise success
  */
-static int hdac_hdmi_port_select_set(struct hdac_ext_device *hdac,
+static int hdac_hdmi_port_select_set(struct hdac_ext_device *edev,
 					struct hdac_hdmi_port *port)
 {
 	int num_ports;
@@ -239,7 +242,7 @@ static int hdac_hdmi_port_select_set(struct hdac_ext_device *hdac,
 		return 0;
 
 	/* AC_PAR_DEVLIST_LEN is 0 based. */
-	num_ports = hdac_hdmi_get_port_len(hdac, port->pin->nid);
+	num_ports = hdac_hdmi_get_port_len(edev, port->pin->nid);
 
 	if (num_ports < 0)
 		return -EIO;
@@ -250,13 +253,13 @@ static int hdac_hdmi_port_select_set(struct hdac_ext_device *hdac,
 	if (num_ports + 1  < port->id)
 		return 0;
 
-	snd_hdac_codec_write(&hdac->hdac, port->pin->nid, 0,
+	snd_hdac_codec_write(&edev->hdev, port->pin->nid, 0,
 			AC_VERB_SET_DEVICE_SEL, port->id);
 
-	if (port->id != hdac_hdmi_port_select_get(hdac, port))
+	if (port->id != hdac_hdmi_port_select_get(edev, port))
 		return -EIO;
 
-	dev_dbg(&hdac->hdac.dev, "Selected the port=%d\n", port->id);
+	dev_dbg(&edev->hdev.dev, "Selected the port=%d\n", port->id);
 
 	return 0;
 }
@@ -276,9 +279,9 @@ static struct hdac_hdmi_pcm *get_hdmi_pcm_from_id(struct hdac_hdmi_priv *hdmi,
 
 static inline struct hdac_ext_device *to_hda_ext_device(struct device *dev)
 {
-	struct hdac_device *hdac = dev_to_hdac_dev(dev);
+	struct hdac_device *hdev = dev_to_hdac_dev(dev);
 
-	return to_ehdac_device(hdac);
+	return to_ehdac_device(hdev);
 }
 
 static unsigned int sad_format(const u8 *sad)
@@ -321,14 +324,14 @@ static int hdac_hdmi_eld_limit_formats(struct snd_pcm_runtime *runtime,
 }
 
 static void
-hdac_hdmi_set_dip_index(struct hdac_ext_device *hdac, hda_nid_t pin_nid,
+hdac_hdmi_set_dip_index(struct hdac_ext_device *edev, hda_nid_t pin_nid,
 				int packet_index, int byte_index)
 {
 	int val;
 
 	val = (packet_index << 5) | (byte_index & 0x1f);
 
-	snd_hdac_codec_write(&hdac->hdac, pin_nid, 0,
+	snd_hdac_codec_write(&edev->hdev, pin_nid, 0,
 				AC_VERB_SET_HDMI_DIP_INDEX, val);
 }
 
@@ -344,14 +347,14 @@ struct dp_audio_infoframe {
 	u8 LFEPBL01_LSV36_DM_INH7;
 };
 
-static int hdac_hdmi_setup_audio_infoframe(struct hdac_ext_device *hdac,
+static int hdac_hdmi_setup_audio_infoframe(struct hdac_ext_device *edev,
 		   struct hdac_hdmi_pcm *pcm, struct hdac_hdmi_port *port)
 {
 	uint8_t buffer[HDMI_INFOFRAME_HEADER_SIZE + HDMI_AUDIO_INFOFRAME_SIZE];
 	struct hdmi_audio_infoframe frame;
 	struct hdac_hdmi_pin *pin = port->pin;
 	struct dp_audio_infoframe dp_ai;
-	struct hdac_hdmi_priv *hdmi = hdac->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_cvt *cvt = pcm->cvt;
 	u8 *dip;
 	int ret;
@@ -360,11 +363,11 @@ static int hdac_hdmi_setup_audio_infoframe(struct hdac_ext_device *hdac,
 	u8 conn_type;
 	int channels, ca;
 
-	ca = snd_hdac_channel_allocation(&hdac->hdac, port->eld.info.spk_alloc,
+	ca = snd_hdac_channel_allocation(&edev->hdev, port->eld.info.spk_alloc,
 			pcm->channels, pcm->chmap_set, true, pcm->chmap);
 
 	channels = snd_hdac_get_active_channels(ca);
-	hdmi->chmap.ops.set_channel_count(&hdac->hdac, cvt->nid, channels);
+	hdmi->chmap.ops.set_channel_count(&edev->hdev, cvt->nid, channels);
 
 	snd_hdac_setup_channel_mapping(&hdmi->chmap, pin->nid, false, ca,
 				pcm->channels, pcm->chmap, pcm->chmap_set);
@@ -397,32 +400,32 @@ static int hdac_hdmi_setup_audio_infoframe(struct hdac_ext_device *hdac,
 		break;
 
 	default:
-		dev_err(&hdac->hdac.dev, "Invalid connection type: %d\n",
+		dev_err(&edev->hdev.dev, "Invalid connection type: %d\n",
 						conn_type);
 		return -EIO;
 	}
 
 	/* stop infoframe transmission */
-	hdac_hdmi_set_dip_index(hdac, pin->nid, 0x0, 0x0);
-	snd_hdac_codec_write(&hdac->hdac, pin->nid, 0,
+	hdac_hdmi_set_dip_index(edev, pin->nid, 0x0, 0x0);
+	snd_hdac_codec_write(&edev->hdev, pin->nid, 0,
 			AC_VERB_SET_HDMI_DIP_XMIT, AC_DIPXMIT_DISABLE);
 
 
 	/*  Fill infoframe. Index auto-incremented */
-	hdac_hdmi_set_dip_index(hdac, pin->nid, 0x0, 0x0);
+	hdac_hdmi_set_dip_index(edev, pin->nid, 0x0, 0x0);
 	if (conn_type == DRM_ELD_CONN_TYPE_HDMI) {
 		for (i = 0; i < sizeof(buffer); i++)
-			snd_hdac_codec_write(&hdac->hdac, pin->nid, 0,
+			snd_hdac_codec_write(&edev->hdev, pin->nid, 0,
 				AC_VERB_SET_HDMI_DIP_DATA, buffer[i]);
 	} else {
 		for (i = 0; i < sizeof(dp_ai); i++)
-			snd_hdac_codec_write(&hdac->hdac, pin->nid, 0,
+			snd_hdac_codec_write(&edev->hdev, pin->nid, 0,
 				AC_VERB_SET_HDMI_DIP_DATA, dip[i]);
 	}
 
 	/* Start infoframe */
-	hdac_hdmi_set_dip_index(hdac, pin->nid, 0x0, 0x0);
-	snd_hdac_codec_write(&hdac->hdac, pin->nid, 0,
+	hdac_hdmi_set_dip_index(edev, pin->nid, 0x0, 0x0);
+	snd_hdac_codec_write(&edev->hdev, pin->nid, 0,
 			AC_VERB_SET_HDMI_DIP_XMIT, AC_DIPXMIT_BEST);
 
 	return 0;
@@ -433,11 +436,11 @@ static int hdac_hdmi_set_tdm_slot(struct snd_soc_dai *dai,
 		int slots, int slot_width)
 {
 	struct hdac_ext_device *edev = snd_soc_dai_get_drvdata(dai);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_dai_port_map *dai_map;
 	struct hdac_hdmi_pcm *pcm;
 
-	dev_dbg(&edev->hdac.dev, "%s: strm_tag: %d\n", __func__, tx_mask);
+	dev_dbg(&edev->hdev.dev, "%s: strm_tag: %d\n", __func__, tx_mask);
 
 	dai_map = &hdmi->dai_map[dai->id];
 
@@ -452,8 +455,8 @@ static int hdac_hdmi_set_tdm_slot(struct snd_soc_dai *dai,
 static int hdac_hdmi_set_hw_params(struct snd_pcm_substream *substream,
 	struct snd_pcm_hw_params *hparams, struct snd_soc_dai *dai)
 {
-	struct hdac_ext_device *hdac = snd_soc_dai_get_drvdata(dai);
-	struct hdac_hdmi_priv *hdmi = hdac->private_data;
+	struct hdac_ext_device *edev = snd_soc_dai_get_drvdata(dai);
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_dai_port_map *dai_map;
 	struct hdac_hdmi_port *port;
 	struct hdac_hdmi_pcm *pcm;
@@ -466,7 +469,7 @@ static int hdac_hdmi_set_hw_params(struct snd_pcm_substream *substream,
 		return -ENODEV;
 
 	if ((!port->eld.monitor_present) || (!port->eld.eld_valid)) {
-		dev_err(&hdac->hdac.dev,
+		dev_err(&edev->hdev.dev,
 			"device is not configured for this pin:port%d:%d\n",
 					port->pin->nid, port->id);
 		return -ENODEV;
@@ -486,28 +489,28 @@ static int hdac_hdmi_set_hw_params(struct snd_pcm_substream *substream,
 	return 0;
 }
 
-static int hdac_hdmi_query_port_connlist(struct hdac_ext_device *hdac,
+static int hdac_hdmi_query_port_connlist(struct hdac_ext_device *edev,
 					struct hdac_hdmi_pin *pin,
 					struct hdac_hdmi_port *port)
 {
-	if (!(get_wcaps(&hdac->hdac, pin->nid) & AC_WCAP_CONN_LIST)) {
-		dev_warn(&hdac->hdac.dev,
+	if (!(get_wcaps(&edev->hdev, pin->nid) & AC_WCAP_CONN_LIST)) {
+		dev_warn(&edev->hdev.dev,
 			"HDMI: pin %d wcaps %#x does not support connection list\n",
-			pin->nid, get_wcaps(&hdac->hdac, pin->nid));
+			pin->nid, get_wcaps(&edev->hdev, pin->nid));
 		return -EINVAL;
 	}
 
-	if (hdac_hdmi_port_select_set(hdac, port) < 0)
+	if (hdac_hdmi_port_select_set(edev, port) < 0)
 		return -EIO;
 
-	port->num_mux_nids = snd_hdac_get_connections(&hdac->hdac, pin->nid,
+	port->num_mux_nids = snd_hdac_get_connections(&edev->hdev, pin->nid,
 			port->mux_nids, HDA_MAX_CONNECTIONS);
 	if (port->num_mux_nids == 0)
-		dev_warn(&hdac->hdac.dev,
+		dev_warn(&edev->hdev.dev,
 			"No connections found for pin:port %d:%d\n",
 						pin->nid, port->id);
 
-	dev_dbg(&hdac->hdac.dev, "num_mux_nids %d for pin:port %d:%d\n",
+	dev_dbg(&edev->hdev.dev, "num_mux_nids %d for pin:port %d:%d\n",
 			port->num_mux_nids, pin->nid, port->id);
 
 	return port->num_mux_nids;
@@ -565,8 +568,8 @@ static struct hdac_hdmi_port *hdac_hdmi_get_port_from_cvt(
 static int hdac_hdmi_pcm_open(struct snd_pcm_substream *substream,
 			struct snd_soc_dai *dai)
 {
-	struct hdac_ext_device *hdac = snd_soc_dai_get_drvdata(dai);
-	struct hdac_hdmi_priv *hdmi = hdac->private_data;
+	struct hdac_ext_device *edev = snd_soc_dai_get_drvdata(dai);
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_dai_port_map *dai_map;
 	struct hdac_hdmi_cvt *cvt;
 	struct hdac_hdmi_port *port;
@@ -575,7 +578,7 @@ static int hdac_hdmi_pcm_open(struct snd_pcm_substream *substream,
 	dai_map = &hdmi->dai_map[dai->id];
 
 	cvt = dai_map->cvt;
-	port = hdac_hdmi_get_port_from_cvt(hdac, hdmi, cvt);
+	port = hdac_hdmi_get_port_from_cvt(edev, hdmi, cvt);
 
 	/*
 	 * To make PA and other userland happy.
@@ -586,7 +589,7 @@ static int hdac_hdmi_pcm_open(struct snd_pcm_substream *substream,
 	if ((!port->eld.monitor_present) ||
 			(!port->eld.eld_valid)) {
 
-		dev_warn(&hdac->hdac.dev,
+		dev_warn(&edev->hdev.dev,
 			"Failed: present?:%d ELD valid?:%d pin:port: %d:%d\n",
 			port->eld.monitor_present, port->eld.eld_valid,
 			port->pin->nid, port->id);
@@ -608,8 +611,8 @@ static int hdac_hdmi_pcm_open(struct snd_pcm_substream *substream,
 static void hdac_hdmi_pcm_close(struct snd_pcm_substream *substream,
 		struct snd_soc_dai *dai)
 {
-	struct hdac_ext_device *hdac = snd_soc_dai_get_drvdata(dai);
-	struct hdac_hdmi_priv *hdmi = hdac->private_data;
+	struct hdac_ext_device *edev = snd_soc_dai_get_drvdata(dai);
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_dai_port_map *dai_map;
 	struct hdac_hdmi_pcm *pcm;
 
@@ -630,14 +633,13 @@ static void hdac_hdmi_pcm_close(struct snd_pcm_substream *substream,
 }
 
 static int
-hdac_hdmi_query_cvt_params(struct hdac_device *hdac, struct hdac_hdmi_cvt *cvt)
+hdac_hdmi_query_cvt_params(struct hdac_device *hdev, struct hdac_hdmi_cvt *cvt)
 {
 	unsigned int chans;
-	struct hdac_ext_device *edev = to_ehdac_device(hdac);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(hdev);
 	int err;
 
-	chans = get_wcaps(hdac, cvt->nid);
+	chans = get_wcaps(hdev, cvt->nid);
 	chans = get_wcaps_channels(chans);
 
 	cvt->params.channels_min = 2;
@@ -646,12 +648,12 @@ hdac_hdmi_query_cvt_params(struct hdac_device *hdac, struct hdac_hdmi_cvt *cvt)
 	if (chans > hdmi->chmap.channels_max)
 		hdmi->chmap.channels_max = chans;
 
-	err = snd_hdac_query_supported_pcm(hdac, cvt->nid,
+	err = snd_hdac_query_supported_pcm(hdev, cvt->nid,
 			&cvt->params.rates,
 			&cvt->params.formats,
 			&cvt->params.maxbps);
 	if (err < 0)
-		dev_err(&hdac->dev,
+		dev_err(&hdev->dev,
 			"Failed to query pcm params for nid %d: %d\n",
 			cvt->nid, err);
 
@@ -696,7 +698,7 @@ static void hdac_hdmi_fill_route(struct snd_soc_dapm_route *route,
 static struct hdac_hdmi_pcm *hdac_hdmi_get_pcm(struct hdac_ext_device *edev,
 					struct hdac_hdmi_port *port)
 {
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pcm *pcm = NULL;
 	struct hdac_hdmi_port *p;
 
@@ -716,9 +718,9 @@ static struct hdac_hdmi_pcm *hdac_hdmi_get_pcm(struct hdac_ext_device *edev,
 static void hdac_hdmi_set_power_state(struct hdac_ext_device *edev,
 			     hda_nid_t nid, unsigned int pwr_state)
 {
-	if (get_wcaps(&edev->hdac, nid) & AC_WCAP_POWER) {
-		if (!snd_hdac_check_power_state(&edev->hdac, nid, pwr_state))
-			snd_hdac_codec_write(&edev->hdac, nid, 0,
+	if (get_wcaps(&edev->hdev, nid) & AC_WCAP_POWER) {
+		if (!snd_hdac_check_power_state(&edev->hdev, nid, pwr_state))
+			snd_hdac_codec_write(&edev->hdev, nid, 0,
 				AC_VERB_SET_POWER_STATE, pwr_state);
 	}
 }
@@ -726,8 +728,8 @@ static void hdac_hdmi_set_power_state(struct hdac_ext_device *edev,
 static void hdac_hdmi_set_amp(struct hdac_ext_device *edev,
 				   hda_nid_t nid, int val)
 {
-	if (get_wcaps(&edev->hdac, nid) & AC_WCAP_OUT_AMP)
-		snd_hdac_codec_write(&edev->hdac, nid, 0,
+	if (get_wcaps(&edev->hdev, nid) & AC_WCAP_OUT_AMP)
+		snd_hdac_codec_write(&edev->hdev, nid, 0,
 					AC_VERB_SET_AMP_GAIN_MUTE, val);
 }
 
@@ -739,7 +741,7 @@ static int hdac_hdmi_pin_output_widget_event(struct snd_soc_dapm_widget *w,
 	struct hdac_ext_device *edev = to_hda_ext_device(w->dapm->dev);
 	struct hdac_hdmi_pcm *pcm;
 
-	dev_dbg(&edev->hdac.dev, "%s: widget: %s event: %x\n",
+	dev_dbg(&edev->hdev.dev, "%s: widget: %s event: %x\n",
 			__func__, w->name, event);
 
 	pcm = hdac_hdmi_get_pcm(edev, port);
@@ -755,7 +757,7 @@ static int hdac_hdmi_pin_output_widget_event(struct snd_soc_dapm_widget *w,
 		hdac_hdmi_set_power_state(edev, port->pin->nid, AC_PWRST_D0);
 
 		/* Enable out path for this pin widget */
-		snd_hdac_codec_write(&edev->hdac, port->pin->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, port->pin->nid, 0,
 				AC_VERB_SET_PIN_WIDGET_CONTROL, PIN_OUT);
 
 		hdac_hdmi_set_amp(edev, port->pin->nid, AMP_OUT_UNMUTE);
@@ -766,7 +768,7 @@ static int hdac_hdmi_pin_output_widget_event(struct snd_soc_dapm_widget *w,
 		hdac_hdmi_set_amp(edev, port->pin->nid, AMP_OUT_MUTE);
 
 		/* Disable out path for this pin widget */
-		snd_hdac_codec_write(&edev->hdac, port->pin->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, port->pin->nid, 0,
 				AC_VERB_SET_PIN_WIDGET_CONTROL, 0);
 
 		hdac_hdmi_set_power_state(edev, port->pin->nid, AC_PWRST_D3);
@@ -782,10 +784,10 @@ static int hdac_hdmi_cvt_output_widget_event(struct snd_soc_dapm_widget *w,
 {
 	struct hdac_hdmi_cvt *cvt = w->priv;
 	struct hdac_ext_device *edev = to_hda_ext_device(w->dapm->dev);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pcm *pcm;
 
-	dev_dbg(&edev->hdac.dev, "%s: widget: %s event: %x\n",
+	dev_dbg(&edev->hdev.dev, "%s: widget: %s event: %x\n",
 			__func__, w->name, event);
 
 	pcm = hdac_hdmi_get_pcm_from_cvt(hdmi, cvt);
@@ -797,23 +799,23 @@ static int hdac_hdmi_cvt_output_widget_event(struct snd_soc_dapm_widget *w,
 		hdac_hdmi_set_power_state(edev, cvt->nid, AC_PWRST_D0);
 
 		/* Enable transmission */
-		snd_hdac_codec_write(&edev->hdac, cvt->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, cvt->nid, 0,
 			AC_VERB_SET_DIGI_CONVERT_1, 1);
 
 		/* Category Code (CC) to zero */
-		snd_hdac_codec_write(&edev->hdac, cvt->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, cvt->nid, 0,
 			AC_VERB_SET_DIGI_CONVERT_2, 0);
 
-		snd_hdac_codec_write(&edev->hdac, cvt->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, cvt->nid, 0,
 				AC_VERB_SET_CHANNEL_STREAMID, pcm->stream_tag);
-		snd_hdac_codec_write(&edev->hdac, cvt->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, cvt->nid, 0,
 				AC_VERB_SET_STREAM_FORMAT, pcm->format);
 		break;
 
 	case SND_SOC_DAPM_POST_PMD:
-		snd_hdac_codec_write(&edev->hdac, cvt->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, cvt->nid, 0,
 				AC_VERB_SET_CHANNEL_STREAMID, 0);
-		snd_hdac_codec_write(&edev->hdac, cvt->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, cvt->nid, 0,
 				AC_VERB_SET_STREAM_FORMAT, 0);
 
 		hdac_hdmi_set_power_state(edev, cvt->nid, AC_PWRST_D3);
@@ -831,7 +833,7 @@ static int hdac_hdmi_pin_mux_widget_event(struct snd_soc_dapm_widget *w,
 	struct hdac_ext_device *edev = to_hda_ext_device(w->dapm->dev);
 	int mux_idx;
 
-	dev_dbg(&edev->hdac.dev, "%s: widget: %s event: %x\n",
+	dev_dbg(&edev->hdev.dev, "%s: widget: %s event: %x\n",
 			__func__, w->name, event);
 
 	if (!kc)
@@ -844,7 +846,7 @@ static int hdac_hdmi_pin_mux_widget_event(struct snd_soc_dapm_widget *w,
 		return -EIO;
 
 	if (mux_idx > 0) {
-		snd_hdac_codec_write(&edev->hdac, port->pin->nid, 0,
+		snd_hdac_codec_write(&edev->hdev, port->pin->nid, 0,
 			AC_VERB_SET_CONNECT_SEL, (mux_idx - 1));
 	}
 
@@ -864,7 +866,7 @@ static int hdac_hdmi_set_pin_port_mux(struct snd_kcontrol *kcontrol,
 	struct snd_soc_dapm_context *dapm = w->dapm;
 	struct hdac_hdmi_port *port = w->priv;
 	struct hdac_ext_device *edev = to_hda_ext_device(dapm->dev);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pcm *pcm = NULL;
 	const char *cvt_name =  e->texts[ucontrol->value.enumerated.item[0]];
 
@@ -922,7 +924,7 @@ static int hdac_hdmi_create_pin_port_muxs(struct hdac_ext_device *edev,
 				struct snd_soc_dapm_widget *widget,
 				const char *widget_name)
 {
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pin *pin = port->pin;
 	struct snd_kcontrol_new *kc;
 	struct hdac_hdmi_cvt *cvt;
@@ -934,17 +936,17 @@ static int hdac_hdmi_create_pin_port_muxs(struct hdac_ext_device *edev,
 	int i = 0;
 	int num_items = hdmi->num_cvt + 1;
 
-	kc = devm_kzalloc(&edev->hdac.dev, sizeof(*kc), GFP_KERNEL);
+	kc = devm_kzalloc(&edev->hdev.dev, sizeof(*kc), GFP_KERNEL);
 	if (!kc)
 		return -ENOMEM;
 
-	se = devm_kzalloc(&edev->hdac.dev, sizeof(*se), GFP_KERNEL);
+	se = devm_kzalloc(&edev->hdev.dev, sizeof(*se), GFP_KERNEL);
 	if (!se)
 		return -ENOMEM;
 
 	snprintf(kc_name, NAME_SIZE, "Pin %d port %d Input",
 						pin->nid, port->id);
-	kc->name = devm_kstrdup(&edev->hdac.dev, kc_name, GFP_KERNEL);
+	kc->name = devm_kstrdup(&edev->hdev.dev, kc_name, GFP_KERNEL);
 	if (!kc->name)
 		return -ENOMEM;
 
@@ -962,24 +964,24 @@ static int hdac_hdmi_create_pin_port_muxs(struct hdac_ext_device *edev,
 	se->mask = roundup_pow_of_two(se->items) - 1;
 
 	sprintf(mux_items, "NONE");
-	items[i] = devm_kstrdup(&edev->hdac.dev, mux_items, GFP_KERNEL);
+	items[i] = devm_kstrdup(&edev->hdev.dev, mux_items, GFP_KERNEL);
 	if (!items[i])
 		return -ENOMEM;
 
 	list_for_each_entry(cvt, &hdmi->cvt_list, head) {
 		i++;
 		sprintf(mux_items, "cvt %d", cvt->nid);
-		items[i] = devm_kstrdup(&edev->hdac.dev, mux_items, GFP_KERNEL);
+		items[i] = devm_kstrdup(&edev->hdev.dev, mux_items, GFP_KERNEL);
 		if (!items[i])
 			return -ENOMEM;
 	}
 
-	se->texts = devm_kmemdup(&edev->hdac.dev, items,
+	se->texts = devm_kmemdup(&edev->hdev.dev, items,
 			(num_items  * sizeof(char *)), GFP_KERNEL);
 	if (!se->texts)
 		return -ENOMEM;
 
-	return hdac_hdmi_fill_widget_info(&edev->hdac.dev, widget,
+	return hdac_hdmi_fill_widget_info(&edev->hdev.dev, widget,
 			snd_soc_dapm_mux, port, widget_name, NULL, kc, 1,
 			hdac_hdmi_pin_mux_widget_event,
 			SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_REG);
@@ -990,7 +992,7 @@ static void hdac_hdmi_add_pinmux_cvt_route(struct hdac_ext_device *edev,
 			struct snd_soc_dapm_widget *widgets,
 			struct snd_soc_dapm_route *route, int rindex)
 {
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	const struct snd_kcontrol_new *kc;
 	struct soc_enum *se;
 	int mux_index = hdmi->num_cvt + hdmi->num_ports;
@@ -1033,8 +1035,8 @@ static int create_fill_widget_route_map(struct snd_soc_dapm_context *dapm)
 	struct snd_soc_dapm_widget *widgets;
 	struct snd_soc_dapm_route *route;
 	struct hdac_ext_device *edev = to_hda_ext_device(dapm->dev);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
-	struct snd_soc_dai_driver *dai_drv = dapm->component->dai_drv;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
+	struct snd_soc_dai_driver *dai_drv = hdmi->dai_drv;
 	char widget_name[NAME_SIZE];
 	struct hdac_hdmi_cvt *cvt;
 	struct hdac_hdmi_pin *pin;
@@ -1134,7 +1136,7 @@ static int create_fill_widget_route_map(struct snd_soc_dapm_context *dapm)
 
 static int hdac_hdmi_init_dai_map(struct hdac_ext_device *edev)
 {
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_dai_port_map *dai_map;
 	struct hdac_hdmi_cvt *cvt;
 	int dai_id = 0;
@@ -1150,7 +1152,7 @@ static int hdac_hdmi_init_dai_map(struct hdac_ext_device *edev)
 		dai_id++;
 
 		if (dai_id == HDA_MAX_CVTS) {
-			dev_warn(&edev->hdac.dev,
+			dev_warn(&edev->hdev.dev,
 				"Max dais supported: %d\n", dai_id);
 			break;
 		}
@@ -1161,7 +1163,7 @@ static int hdac_hdmi_init_dai_map(struct hdac_ext_device *edev)
 
 static int hdac_hdmi_add_cvt(struct hdac_ext_device *edev, hda_nid_t nid)
 {
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_cvt *cvt;
 	char name[NAME_SIZE];
 
@@ -1176,7 +1178,7 @@ static int hdac_hdmi_add_cvt(struct hdac_ext_device *edev, hda_nid_t nid)
 	list_add_tail(&cvt->head, &hdmi->cvt_list);
 	hdmi->num_cvt++;
 
-	return hdac_hdmi_query_cvt_params(&edev->hdac, cvt);
+	return hdac_hdmi_query_cvt_params(&edev->hdev, cvt);
 }
 
 static int hdac_hdmi_parse_eld(struct hdac_ext_device *edev,
@@ -1188,7 +1190,7 @@ static int hdac_hdmi_parse_eld(struct hdac_ext_device *edev,
 						>> DRM_ELD_VER_SHIFT;
 
 	if (ver != ELD_VER_CEA_861D && ver != ELD_VER_PARTIAL) {
-		dev_err(&edev->hdac.dev, "HDMI: Unknown ELD version %d\n", ver);
+		dev_err(&edev->hdev.dev, "HDMI: Unknown ELD version %d\n", ver);
 		return -EINVAL;
 	}
 
@@ -1196,7 +1198,7 @@ static int hdac_hdmi_parse_eld(struct hdac_ext_device *edev,
 		DRM_ELD_MNL_MASK) >> DRM_ELD_MNL_SHIFT;
 
 	if (mnl > ELD_MAX_MNL) {
-		dev_err(&edev->hdac.dev, "HDMI: MNL Invalid %d\n", mnl);
+		dev_err(&edev->hdev.dev, "HDMI: MNL Invalid %d\n", mnl);
 		return -EINVAL;
 	}
 
@@ -1209,7 +1211,7 @@ static void hdac_hdmi_present_sense(struct hdac_hdmi_pin *pin,
 				    struct hdac_hdmi_port *port)
 {
 	struct hdac_ext_device *edev = pin->edev;
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pcm *pcm;
 	int size = 0;
 	int port_id = -1;
@@ -1227,7 +1229,7 @@ static void hdac_hdmi_present_sense(struct hdac_hdmi_pin *pin,
 	if (pin->mst_capable)
 		port_id = port->id;
 
-	size = snd_hdac_acomp_get_eld(&edev->hdac, pin->nid, port_id,
+	size = snd_hdac_acomp_get_eld(&edev->hdev, pin->nid, port_id,
 				&port->eld.monitor_present,
 				port->eld.eld_buffer,
 				ELD_MAX_SIZE);
@@ -1250,7 +1252,7 @@ static void hdac_hdmi_present_sense(struct hdac_hdmi_pin *pin,
 
 	if (!port->eld.monitor_present || !port->eld.eld_valid) {
 
-		dev_err(&edev->hdac.dev, "%s: disconnect for pin:port %d:%d\n",
+		dev_err(&edev->hdev.dev, "%s: disconnect for pin:port %d:%d\n",
 						__func__, pin->nid, port->id);
 
 		/*
@@ -1304,7 +1306,7 @@ static int hdac_hdmi_add_ports(struct hdac_hdmi_priv *hdmi,
 
 static int hdac_hdmi_add_pin(struct hdac_ext_device *edev, hda_nid_t nid)
 {
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pin *pin;
 	int ret;
 
@@ -1333,40 +1335,38 @@ static int hdac_hdmi_add_pin(struct hdac_ext_device *edev, hda_nid_t nid)
 #define INTEL_EN_DP12			0x02 /* enable DP 1.2 features */
 #define INTEL_EN_ALL_PIN_CVTS	0x01 /* enable 2nd & 3rd pins and convertors */
 
-static void hdac_hdmi_skl_enable_all_pins(struct hdac_device *hdac)
+static void hdac_hdmi_skl_enable_all_pins(struct hdac_device *hdev)
 {
 	unsigned int vendor_param;
-	struct hdac_ext_device *edev = to_ehdac_device(hdac);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(hdev);
 	unsigned int vendor_nid = hdmi->drv_data->vendor_nid;
 
-	vendor_param = snd_hdac_codec_read(hdac, vendor_nid, 0,
+	vendor_param = snd_hdac_codec_read(hdev, vendor_nid, 0,
 				INTEL_GET_VENDOR_VERB, 0);
 	if (vendor_param == -1 || vendor_param & INTEL_EN_ALL_PIN_CVTS)
 		return;
 
 	vendor_param |= INTEL_EN_ALL_PIN_CVTS;
-	vendor_param = snd_hdac_codec_read(hdac, vendor_nid, 0,
+	vendor_param = snd_hdac_codec_read(hdev, vendor_nid, 0,
 				INTEL_SET_VENDOR_VERB, vendor_param);
 	if (vendor_param == -1)
 		return;
 }
 
-static void hdac_hdmi_skl_enable_dp12(struct hdac_device *hdac)
+static void hdac_hdmi_skl_enable_dp12(struct hdac_device *hdev)
 {
 	unsigned int vendor_param;
-	struct hdac_ext_device *edev = to_ehdac_device(hdac);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(hdev);
 	unsigned int vendor_nid = hdmi->drv_data->vendor_nid;
 
-	vendor_param = snd_hdac_codec_read(hdac, vendor_nid, 0,
+	vendor_param = snd_hdac_codec_read(hdev, vendor_nid, 0,
 				INTEL_GET_VENDOR_VERB, 0);
 	if (vendor_param == -1 || vendor_param & INTEL_EN_DP12)
 		return;
 
 	/* enable DP1.2 mode */
 	vendor_param |= INTEL_EN_DP12;
-	vendor_param = snd_hdac_codec_read(hdac, vendor_nid, 0,
+	vendor_param = snd_hdac_codec_read(hdev, vendor_nid, 0,
 				INTEL_SET_VENDOR_VERB, vendor_param);
 	if (vendor_param == -1)
 		return;
@@ -1384,7 +1384,7 @@ static const struct snd_soc_dai_ops hdmi_dai_ops = {
  * Each converter can support a stream independently. So a dai is created
  * based on the number of converter queried.
  */
-static int hdac_hdmi_create_dais(struct hdac_device *hdac,
+static int hdac_hdmi_create_dais(struct hdac_device *hdev,
 		struct snd_soc_dai_driver **dais,
 		struct hdac_hdmi_priv *hdmi, int num_dais)
 {
@@ -1397,20 +1397,20 @@ static int hdac_hdmi_create_dais(struct hdac_device *hdac,
 	u64 formats;
 	int ret;
 
-	hdmi_dais = devm_kzalloc(&hdac->dev,
+	hdmi_dais = devm_kzalloc(&hdev->dev,
 			(sizeof(*hdmi_dais) * num_dais),
 			GFP_KERNEL);
 	if (!hdmi_dais)
 		return -ENOMEM;
 
 	list_for_each_entry(cvt, &hdmi->cvt_list, head) {
-		ret = snd_hdac_query_supported_pcm(hdac, cvt->nid,
+		ret = snd_hdac_query_supported_pcm(hdev, cvt->nid,
 					&rates,	&formats, &bps);
 		if (ret)
 			return ret;
 
 		sprintf(dai_name, "intel-hdmi-hifi%d", i+1);
-		hdmi_dais[i].name = devm_kstrdup(&hdac->dev,
+		hdmi_dais[i].name = devm_kstrdup(&hdev->dev,
 					dai_name, GFP_KERNEL);
 
 		if (!hdmi_dais[i].name)
@@ -1418,7 +1418,7 @@ static int hdac_hdmi_create_dais(struct hdac_device *hdac,
 
 		snprintf(name, sizeof(name), "hifi%d", i+1);
 		hdmi_dais[i].playback.stream_name =
-				devm_kstrdup(&hdac->dev, name, GFP_KERNEL);
+				devm_kstrdup(&hdev->dev, name, GFP_KERNEL);
 		if (!hdmi_dais[i].playback.stream_name)
 			return -ENOMEM;
 
@@ -1438,6 +1438,7 @@ static int hdac_hdmi_create_dais(struct hdac_device *hdac,
 	}
 
 	*dais = hdmi_dais;
+	hdmi->dai_drv = hdmi_dais;
 
 	return 0;
 }
@@ -1451,29 +1452,26 @@ static int hdac_hdmi_parse_and_map_nid(struct hdac_ext_device *edev,
 {
 	hda_nid_t nid;
 	int i, num_nodes;
-	struct hdac_device *hdac = &edev->hdac;
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
 	struct hdac_hdmi_cvt *temp_cvt, *cvt_next;
 	struct hdac_hdmi_pin *temp_pin, *pin_next;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
+	struct hdac_device *hdev = &edev->hdev;
 	int ret;
 
-	hdac_hdmi_skl_enable_all_pins(hdac);
-	hdac_hdmi_skl_enable_dp12(hdac);
+	hdac_hdmi_skl_enable_all_pins(hdev);
+	hdac_hdmi_skl_enable_dp12(hdev);
 
-	num_nodes = snd_hdac_get_sub_nodes(hdac, hdac->afg, &nid);
+	num_nodes = snd_hdac_get_sub_nodes(hdev, hdev->afg, &nid);
 	if (!nid || num_nodes <= 0) {
-		dev_warn(&hdac->dev, "HDMI: failed to get afg sub nodes\n");
+		dev_warn(&hdev->dev, "HDMI: failed to get afg sub nodes\n");
 		return -EINVAL;
 	}
 
-	hdac->num_nodes = num_nodes;
-	hdac->start_nid = nid;
-
-	for (i = 0; i < hdac->num_nodes; i++, nid++) {
+	for (i = 0; i < num_nodes; i++, nid++) {
 		unsigned int caps;
 		unsigned int type;
 
-		caps = get_wcaps(hdac, nid);
+		caps = get_wcaps(hdev, nid);
 		type = get_wcaps_type(caps);
 
 		if (!(caps & AC_WCAP_DIGITAL))
@@ -1495,16 +1493,14 @@ static int hdac_hdmi_parse_and_map_nid(struct hdac_ext_device *edev,
 		}
 	}
 
-	hdac->end_nid = nid;
-
 	if (!hdmi->num_pin || !hdmi->num_cvt) {
 		ret = -EIO;
 		goto free_widgets;
 	}
 
-	ret = hdac_hdmi_create_dais(hdac, dais, hdmi, hdmi->num_cvt);
+	ret = hdac_hdmi_create_dais(hdev, dais, hdmi, hdmi->num_cvt);
 	if (ret) {
-		dev_err(&hdac->dev, "Failed to create dais with err: %d\n",
+		dev_err(&hdev->dev, "Failed to create dais with err: %d\n",
 							ret);
 		goto free_widgets;
 	}
@@ -1537,7 +1533,7 @@ static int hdac_hdmi_parse_and_map_nid(struct hdac_ext_device *edev,
 static void hdac_hdmi_eld_notify_cb(void *aptr, int port, int pipe)
 {
 	struct hdac_ext_device *edev = aptr;
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pin *pin = NULL;
 	struct hdac_hdmi_port *hport = NULL;
 	struct snd_soc_codec *codec = edev->scodec;
@@ -1546,7 +1542,7 @@ static void hdac_hdmi_eld_notify_cb(void *aptr, int port, int pipe)
 	/* Don't know how this mapping is derived */
 	hda_nid_t pin_nid = port + 0x04;
 
-	dev_dbg(&edev->hdac.dev, "%s: for pin:%d port=%d\n", __func__,
+	dev_dbg(&edev->hdev.dev, "%s: for pin:%d port=%d\n", __func__,
 							pin_nid, pipe);
 
 	/*
@@ -1559,7 +1555,7 @@ static void hdac_hdmi_eld_notify_cb(void *aptr, int port, int pipe)
 			SNDRV_CTL_POWER_D0)
 		return;
 
-	if (atomic_read(&edev->hdac.in_pm))
+	if (atomic_read(&edev->hdev.in_pm))
 		return;
 
 	list_for_each_entry(pin, &hdmi->pin_list, head) {
@@ -1614,7 +1610,7 @@ static int create_fill_jack_kcontrols(struct snd_soc_card *card,
 	char *name;
 	int i = 0, j;
 	struct snd_soc_codec *codec = edev->scodec;
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 
 	kc = devm_kcalloc(codec->dev, hdmi->num_ports,
 				sizeof(*kc), GFP_KERNEL);
@@ -1652,7 +1648,7 @@ int hdac_hdmi_jack_port_init(struct snd_soc_codec *codec,
 			struct snd_soc_dapm_context *dapm)
 {
 	struct hdac_ext_device *edev = snd_soc_codec_get_drvdata(codec);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pin *pin;
 	struct snd_soc_dapm_widget *widgets;
 	struct snd_soc_dapm_route *route;
@@ -1728,7 +1724,7 @@ int hdac_hdmi_jack_init(struct snd_soc_dai *dai, int device,
 {
 	struct snd_soc_codec *codec = dai->codec;
 	struct hdac_ext_device *edev = snd_soc_codec_get_drvdata(codec);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pcm *pcm;
 	struct snd_pcm *snd_pcm;
 	int err;
@@ -1750,7 +1746,7 @@ int hdac_hdmi_jack_init(struct snd_soc_dai *dai, int device,
 	if (snd_pcm) {
 		err = snd_hdac_add_chmap_ctls(snd_pcm, device, &hdmi->chmap);
 		if (err < 0) {
-			dev_err(&edev->hdac.dev,
+			dev_err(&edev->hdev.dev,
 				"chmap control add failed with err: %d for pcm: %d\n",
 				err, device);
 			kfree(pcm);
@@ -1791,7 +1787,7 @@ static void hdac_hdmi_present_sense_all_pins(struct hdac_ext_device *edev,
 static int hdmi_codec_probe(struct snd_soc_codec *codec)
 {
 	struct hdac_ext_device *edev = snd_soc_codec_get_drvdata(codec);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct snd_soc_dapm_context *dapm =
 		snd_soc_component_get_dapm(&codec->component);
 	struct hdac_ext_link *hlink = NULL;
@@ -1803,9 +1799,9 @@ static int hdmi_codec_probe(struct snd_soc_codec *codec)
 	 * hold the ref while we probe, also no need to drop the ref on
 	 * exit, we call pm_runtime_suspend() so that will do for us
 	 */
-	hlink = snd_hdac_ext_bus_get_link(edev->ebus, dev_name(&edev->hdac.dev));
+	hlink = snd_hdac_ext_bus_get_link(edev->ebus, dev_name(&edev->hdev.dev));
 	if (!hlink) {
-		dev_err(&edev->hdac.dev, "hdac link not found\n");
+		dev_err(&edev->hdev.dev, "hdac link not found\n");
 		return -EIO;
 	}
 
@@ -1818,7 +1814,7 @@ static int hdmi_codec_probe(struct snd_soc_codec *codec)
 	aops.audio_ptr = edev;
 	ret = snd_hdac_i915_register_notifier(&aops);
 	if (ret < 0) {
-		dev_err(&edev->hdac.dev, "notifier register failed: err: %d\n",
+		dev_err(&edev->hdev.dev, "notifier register failed: err: %d\n",
 				ret);
 		return ret;
 	}
@@ -1831,9 +1827,9 @@ static int hdmi_codec_probe(struct snd_soc_codec *codec)
 	 * hdac_device core already sets the state to active and calls
 	 * get_noresume. So enable runtime and set the device to suspend.
 	 */
-	pm_runtime_enable(&edev->hdac.dev);
-	pm_runtime_put(&edev->hdac.dev);
-	pm_runtime_suspend(&edev->hdac.dev);
+	pm_runtime_enable(&edev->hdev.dev);
+	pm_runtime_put(&edev->hdev.dev);
+	pm_runtime_suspend(&edev->hdev.dev);
 
 	return 0;
 }
@@ -1842,7 +1838,7 @@ static int hdmi_codec_remove(struct snd_soc_codec *codec)
 {
 	struct hdac_ext_device *edev = snd_soc_codec_get_drvdata(codec);
 
-	pm_runtime_disable(&edev->hdac.dev);
+	pm_runtime_disable(&edev->hdev.dev);
 	return 0;
 }
 
@@ -1850,9 +1846,9 @@ static int hdmi_codec_remove(struct snd_soc_codec *codec)
 static int hdmi_codec_prepare(struct device *dev)
 {
 	struct hdac_ext_device *edev = to_hda_ext_device(dev);
-	struct hdac_device *hdac = &edev->hdac;
+	struct hdac_device *hdev = &edev->hdev;
 
-	pm_runtime_get_sync(&edev->hdac.dev);
+	pm_runtime_get_sync(&edev->hdev.dev);
 
 	/*
 	 * Power down afg.
@@ -1861,7 +1857,7 @@ static int hdmi_codec_prepare(struct device *dev)
 	 * is received. So setting power state is ensured without using loop
 	 * to read the state.
 	 */
-	snd_hdac_codec_read(hdac, hdac->afg, 0,	AC_VERB_SET_POWER_STATE,
+	snd_hdac_codec_read(hdev, hdev->afg, 0,	AC_VERB_SET_POWER_STATE,
 							AC_PWRST_D3);
 
 	return 0;
@@ -1870,15 +1866,15 @@ static int hdmi_codec_prepare(struct device *dev)
 static void hdmi_codec_complete(struct device *dev)
 {
 	struct hdac_ext_device *edev = to_hda_ext_device(dev);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
-	struct hdac_device *hdac = &edev->hdac;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
+	struct hdac_device *hdev = &edev->hdev;
 
 	/* Power up afg */
-	snd_hdac_codec_read(hdac, hdac->afg, 0,	AC_VERB_SET_POWER_STATE,
+	snd_hdac_codec_read(hdev, hdev->afg, 0,	AC_VERB_SET_POWER_STATE,
 							AC_PWRST_D0);
 
-	hdac_hdmi_skl_enable_all_pins(&edev->hdac);
-	hdac_hdmi_skl_enable_dp12(&edev->hdac);
+	hdac_hdmi_skl_enable_all_pins(&edev->hdev);
+	hdac_hdmi_skl_enable_dp12(&edev->hdev);
 
 	/*
 	 * As the ELD notify callback request is not entertained while the
@@ -1888,7 +1884,7 @@ static void hdmi_codec_complete(struct device *dev)
 	 */
 	hdac_hdmi_present_sense_all_pins(edev, hdmi, false);
 
-	pm_runtime_put_sync(&edev->hdac.dev);
+	pm_runtime_put_sync(&edev->hdev.dev);
 }
 #else
 #define hdmi_codec_prepare NULL
@@ -1901,21 +1897,20 @@ static const struct snd_soc_codec_driver hdmi_hda_codec = {
 	.idle_bias_off	= true,
 };
 
-static void hdac_hdmi_get_chmap(struct hdac_device *hdac, int pcm_idx,
+static void hdac_hdmi_get_chmap(struct hdac_device *hdev, int pcm_idx,
 					unsigned char *chmap)
 {
-	struct hdac_ext_device *edev = to_ehdac_device(hdac);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(hdev);
 	struct hdac_hdmi_pcm *pcm = get_hdmi_pcm_from_id(hdmi, pcm_idx);
 
 	memcpy(chmap, pcm->chmap, ARRAY_SIZE(pcm->chmap));
 }
 
-static void hdac_hdmi_set_chmap(struct hdac_device *hdac, int pcm_idx,
+static void hdac_hdmi_set_chmap(struct hdac_device *hdev, int pcm_idx,
 				unsigned char *chmap, int prepared)
 {
-	struct hdac_ext_device *edev = to_ehdac_device(hdac);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_ext_device *edev = to_ehdac_device(hdev);
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(hdev);
 	struct hdac_hdmi_pcm *pcm = get_hdmi_pcm_from_id(hdmi, pcm_idx);
 	struct hdac_hdmi_port *port;
 
@@ -1934,10 +1929,9 @@ static void hdac_hdmi_set_chmap(struct hdac_device *hdac, int pcm_idx,
 	mutex_unlock(&pcm->lock);
 }
 
-static bool is_hdac_hdmi_pcm_attached(struct hdac_device *hdac, int pcm_idx)
+static bool is_hdac_hdmi_pcm_attached(struct hdac_device *hdev, int pcm_idx)
 {
-	struct hdac_ext_device *edev = to_ehdac_device(hdac);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(hdev);
 	struct hdac_hdmi_pcm *pcm = get_hdmi_pcm_from_id(hdmi, pcm_idx);
 
 	if (!pcm)
@@ -1949,10 +1943,9 @@ static bool is_hdac_hdmi_pcm_attached(struct hdac_device *hdac, int pcm_idx)
 	return true;
 }
 
-static int hdac_hdmi_get_spk_alloc(struct hdac_device *hdac, int pcm_idx)
+static int hdac_hdmi_get_spk_alloc(struct hdac_device *hdev, int pcm_idx)
 {
-	struct hdac_ext_device *edev = to_ehdac_device(hdac);
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(hdev);
 	struct hdac_hdmi_pcm *pcm = get_hdmi_pcm_from_id(hdmi, pcm_idx);
 	struct hdac_hdmi_port *port;
 
@@ -1983,30 +1976,30 @@ static struct hdac_hdmi_drv_data intel_drv_data  = {
 
 static int hdac_hdmi_dev_probe(struct hdac_ext_device *edev)
 {
-	struct hdac_device *codec = &edev->hdac;
+	struct hdac_device *hdev = &edev->hdev;
 	struct hdac_hdmi_priv *hdmi_priv;
 	struct snd_soc_dai_driver *hdmi_dais = NULL;
 	struct hdac_ext_link *hlink = NULL;
 	int num_dais = 0;
 	int ret = 0;
-	struct hdac_driver *hdrv = drv_to_hdac_driver(codec->dev.driver);
-	const struct hda_device_id *hdac_id = hdac_get_device_id(codec, hdrv);
+	struct hdac_driver *hdrv = drv_to_hdac_driver(hdev->dev.driver);
+	const struct hda_device_id *hdac_id = hdac_get_device_id(hdev, hdrv);
 
 	/* hold the ref while we probe */
-	hlink = snd_hdac_ext_bus_get_link(edev->ebus, dev_name(&edev->hdac.dev));
+	hlink = snd_hdac_ext_bus_get_link(edev->ebus, dev_name(&edev->hdev.dev));
 	if (!hlink) {
-		dev_err(&edev->hdac.dev, "hdac link not found\n");
+		dev_err(&edev->hdev.dev, "hdac link not found\n");
 		return -EIO;
 	}
 
 	snd_hdac_ext_bus_link_get(edev->ebus, hlink);
 
-	hdmi_priv = devm_kzalloc(&codec->dev, sizeof(*hdmi_priv), GFP_KERNEL);
+	hdmi_priv = devm_kzalloc(&hdev->dev, sizeof(*hdmi_priv), GFP_KERNEL);
 	if (hdmi_priv == NULL)
 		return -ENOMEM;
 
 	edev->private_data = hdmi_priv;
-	snd_hdac_register_chmap_ops(codec, &hdmi_priv->chmap);
+	snd_hdac_register_chmap_ops(hdev, &hdmi_priv->chmap);
 	hdmi_priv->chmap.ops.get_chmap = hdac_hdmi_get_chmap;
 	hdmi_priv->chmap.ops.set_chmap = hdac_hdmi_set_chmap;
 	hdmi_priv->chmap.ops.is_pcm_attached = is_hdac_hdmi_pcm_attached;
@@ -2021,7 +2014,7 @@ static int hdac_hdmi_dev_probe(struct hdac_ext_device *edev)
 	else
 		hdmi_priv->drv_data = &intel_drv_data;
 
-	dev_set_drvdata(&codec->dev, edev);
+	dev_set_drvdata(&hdev->dev, edev);
 
 	INIT_LIST_HEAD(&hdmi_priv->pin_list);
 	INIT_LIST_HEAD(&hdmi_priv->cvt_list);
@@ -2032,9 +2025,9 @@ static int hdac_hdmi_dev_probe(struct hdac_ext_device *edev)
 	 * Turned off in the runtime_suspend during the first explicit
 	 * pm_runtime_suspend call.
 	 */
-	ret = snd_hdac_display_power(edev->hdac.bus, true);
+	ret = snd_hdac_display_power(edev->hdev.bus, true);
 	if (ret < 0) {
-		dev_err(&edev->hdac.dev,
+		dev_err(&edev->hdev.dev,
 			"Cannot turn on display power on i915 err: %d\n",
 			ret);
 		return ret;
@@ -2042,13 +2035,14 @@ static int hdac_hdmi_dev_probe(struct hdac_ext_device *edev)
 
 	ret = hdac_hdmi_parse_and_map_nid(edev, &hdmi_dais, &num_dais);
 	if (ret < 0) {
-		dev_err(&codec->dev,
+		dev_err(&hdev->dev,
 			"Failed in parse and map nid with err: %d\n", ret);
 		return ret;
 	}
+	snd_hdac_refresh_widgets(hdev, true);
 
 	/* ASoC specific initialization */
-	ret = snd_soc_register_codec(&codec->dev, &hdmi_hda_codec,
+	ret = snd_soc_register_codec(&hdev->dev, &hdmi_hda_codec,
 					hdmi_dais, num_dais);
 
 	snd_hdac_ext_bus_link_put(edev->ebus, hlink);
@@ -2058,14 +2052,14 @@ static int hdac_hdmi_dev_probe(struct hdac_ext_device *edev)
 
 static int hdac_hdmi_dev_remove(struct hdac_ext_device *edev)
 {
-	struct hdac_hdmi_priv *hdmi = edev->private_data;
+	struct hdac_hdmi_priv *hdmi = hdev_to_hdmi_priv(&edev->hdev);
 	struct hdac_hdmi_pin *pin, *pin_next;
 	struct hdac_hdmi_cvt *cvt, *cvt_next;
 	struct hdac_hdmi_pcm *pcm, *pcm_next;
 	struct hdac_hdmi_port *port, *port_next;
 	int i;
 
-	snd_soc_unregister_codec(&edev->hdac.dev);
+	snd_soc_unregister_codec(&edev->hdev.dev);
 
 	list_for_each_entry_safe(pcm, pcm_next, &hdmi->pcm_list, head) {
 		pcm->cvt = NULL;
@@ -2101,8 +2095,8 @@ static int hdac_hdmi_dev_remove(struct hdac_ext_device *edev)
 static int hdac_hdmi_runtime_suspend(struct device *dev)
 {
 	struct hdac_ext_device *edev = to_hda_ext_device(dev);
-	struct hdac_device *hdac = &edev->hdac;
-	struct hdac_bus *bus = hdac->bus;
+	struct hdac_device *hdev = &edev->hdev;
+	struct hdac_bus *bus = hdev->bus;
 	struct hdac_ext_bus *ebus = hbus_to_ebus(bus);
 	struct hdac_ext_link *hlink = NULL;
 	int err;
@@ -2120,7 +2114,7 @@ static int hdac_hdmi_runtime_suspend(struct device *dev)
 	 * is received. So setting power state is ensured without using loop
 	 * to read the state.
 	 */
-	snd_hdac_codec_read(hdac, hdac->afg, 0,	AC_VERB_SET_POWER_STATE,
+	snd_hdac_codec_read(hdev, hdev->afg, 0,	AC_VERB_SET_POWER_STATE,
 							AC_PWRST_D3);
 	err = snd_hdac_display_power(bus, false);
 	if (err < 0) {
@@ -2142,8 +2136,8 @@ static int hdac_hdmi_runtime_suspend(struct device *dev)
 static int hdac_hdmi_runtime_resume(struct device *dev)
 {
 	struct hdac_ext_device *edev = to_hda_ext_device(dev);
-	struct hdac_device *hdac = &edev->hdac;
-	struct hdac_bus *bus = hdac->bus;
+	struct hdac_device *hdev = &edev->hdev;
+	struct hdac_bus *bus = hdev->bus;
 	struct hdac_ext_bus *ebus = hbus_to_ebus(bus);
 	struct hdac_ext_link *hlink = NULL;
 	int err;
@@ -2168,11 +2162,11 @@ static int hdac_hdmi_runtime_resume(struct device *dev)
 		return err;
 	}
 
-	hdac_hdmi_skl_enable_all_pins(&edev->hdac);
-	hdac_hdmi_skl_enable_dp12(&edev->hdac);
+	hdac_hdmi_skl_enable_all_pins(&edev->hdev);
+	hdac_hdmi_skl_enable_dp12(&edev->hdev);
 
 	/* Power up afg */
-	snd_hdac_codec_read(hdac, hdac->afg, 0,	AC_VERB_SET_POWER_STATE,
+	snd_hdac_codec_read(hdev, hdev->afg, 0,	AC_VERB_SET_POWER_STATE,
 							AC_PWRST_D0);
 
 	return 0;
@@ -2192,6 +2186,8 @@ static const struct hda_device_id hdmi_list[] = {
 	HDA_CODEC_EXT_ENTRY(0x80862809, 0x100000, "Skylake HDMI", 0),
 	HDA_CODEC_EXT_ENTRY(0x8086280a, 0x100000, "Broxton HDMI", 0),
 	HDA_CODEC_EXT_ENTRY(0x8086280b, 0x100000, "Kabylake HDMI", 0),
+	HDA_CODEC_EXT_ENTRY(0x8086280c, 0x100000, "Cannonlake HDMI",
+						   &intel_glk_drv_data),
 	HDA_CODEC_EXT_ENTRY(0x8086280d, 0x100000, "Geminilake HDMI",
 						   &intel_glk_drv_data),
 	{}
diff --git a/sound/soc/codecs/max98373.c b/sound/soc/codecs/max98373.c
new file mode 100644
index 0000000..31b0864
--- /dev/null
+++ b/sound/soc/codecs/max98373.c
@@ -0,0 +1,976 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2017, Maxim Integrated */
+
+#include <linux/acpi.h>
+#include <linux/i2c.h>
+#include <linux/module.h>
+#include <linux/regmap.h>
+#include <linux/slab.h>
+#include <linux/cdev.h>
+#include <sound/pcm.h>
+#include <sound/pcm_params.h>
+#include <sound/soc.h>
+#include <linux/gpio.h>
+#include <linux/of_gpio.h>
+#include <sound/tlv.h>
+#include "max98373.h"
+
+static struct reg_default max98373_reg[] = {
+	{MAX98373_R2000_SW_RESET, 0x00},
+	{MAX98373_R2001_INT_RAW1, 0x00},
+	{MAX98373_R2002_INT_RAW2, 0x00},
+	{MAX98373_R2003_INT_RAW3, 0x00},
+	{MAX98373_R2004_INT_STATE1, 0x00},
+	{MAX98373_R2005_INT_STATE2, 0x00},
+	{MAX98373_R2006_INT_STATE3, 0x00},
+	{MAX98373_R2007_INT_FLAG1, 0x00},
+	{MAX98373_R2008_INT_FLAG2, 0x00},
+	{MAX98373_R2009_INT_FLAG3, 0x00},
+	{MAX98373_R200A_INT_EN1, 0x00},
+	{MAX98373_R200B_INT_EN2, 0x00},
+	{MAX98373_R200C_INT_EN3, 0x00},
+	{MAX98373_R200D_INT_FLAG_CLR1, 0x00},
+	{MAX98373_R200E_INT_FLAG_CLR2, 0x00},
+	{MAX98373_R200F_INT_FLAG_CLR3, 0x00},
+	{MAX98373_R2010_IRQ_CTRL, 0x00},
+	{MAX98373_R2014_THERM_WARN_THRESH, 0x10},
+	{MAX98373_R2015_THERM_SHDN_THRESH, 0x27},
+	{MAX98373_R2016_THERM_HYSTERESIS, 0x01},
+	{MAX98373_R2017_THERM_FOLDBACK_SET, 0xC0},
+	{MAX98373_R2018_THERM_FOLDBACK_EN, 0x00},
+	{MAX98373_R201E_PIN_DRIVE_STRENGTH, 0x55},
+	{MAX98373_R2020_PCM_TX_HIZ_EN_1, 0xFE},
+	{MAX98373_R2021_PCM_TX_HIZ_EN_2, 0xFF},
+	{MAX98373_R2022_PCM_TX_SRC_1, 0x00},
+	{MAX98373_R2023_PCM_TX_SRC_2, 0x00},
+	{MAX98373_R2024_PCM_DATA_FMT_CFG, 0xC0},
+	{MAX98373_R2025_AUDIO_IF_MODE, 0x00},
+	{MAX98373_R2026_PCM_CLOCK_RATIO, 0x04},
+	{MAX98373_R2027_PCM_SR_SETUP_1, 0x08},
+	{MAX98373_R2028_PCM_SR_SETUP_2, 0x88},
+	{MAX98373_R2029_PCM_TO_SPK_MONO_MIX_1, 0x00},
+	{MAX98373_R202A_PCM_TO_SPK_MONO_MIX_2, 0x00},
+	{MAX98373_R202B_PCM_RX_EN, 0x00},
+	{MAX98373_R202C_PCM_TX_EN, 0x00},
+	{MAX98373_R202E_ICC_RX_CH_EN_1, 0x00},
+	{MAX98373_R202F_ICC_RX_CH_EN_2, 0x00},
+	{MAX98373_R2030_ICC_TX_HIZ_EN_1, 0xFF},
+	{MAX98373_R2031_ICC_TX_HIZ_EN_2, 0xFF},
+	{MAX98373_R2032_ICC_LINK_EN_CFG, 0x30},
+	{MAX98373_R2034_ICC_TX_CNTL, 0x00},
+	{MAX98373_R2035_ICC_TX_EN, 0x00},
+	{MAX98373_R2036_SOUNDWIRE_CTRL, 0x05},
+	{MAX98373_R203D_AMP_DIG_VOL_CTRL, 0x00},
+	{MAX98373_R203E_AMP_PATH_GAIN, 0x08},
+	{MAX98373_R203F_AMP_DSP_CFG, 0x02},
+	{MAX98373_R2040_TONE_GEN_CFG, 0x00},
+	{MAX98373_R2041_AMP_CFG, 0x03},
+	{MAX98373_R2042_AMP_EDGE_RATE_CFG, 0x00},
+	{MAX98373_R2043_AMP_EN, 0x00},
+	{MAX98373_R2046_IV_SENSE_ADC_DSP_CFG, 0x04},
+	{MAX98373_R2047_IV_SENSE_ADC_EN, 0x00},
+	{MAX98373_R2051_MEAS_ADC_SAMPLING_RATE, 0x00},
+	{MAX98373_R2052_MEAS_ADC_PVDD_FLT_CFG, 0x00},
+	{MAX98373_R2053_MEAS_ADC_THERM_FLT_CFG, 0x00},
+	{MAX98373_R2054_MEAS_ADC_PVDD_CH_READBACK, 0x00},
+	{MAX98373_R2055_MEAS_ADC_THERM_CH_READBACK, 0x00},
+	{MAX98373_R2056_MEAS_ADC_PVDD_CH_EN, 0x00},
+	{MAX98373_R2090_BDE_LVL_HOLD, 0x00},
+	{MAX98373_R2091_BDE_GAIN_ATK_REL_RATE, 0x00},
+	{MAX98373_R2092_BDE_CLIPPER_MODE, 0x00},
+	{MAX98373_R2097_BDE_L1_THRESH, 0x00},
+	{MAX98373_R2098_BDE_L2_THRESH, 0x00},
+	{MAX98373_R2099_BDE_L3_THRESH, 0x00},
+	{MAX98373_R209A_BDE_L4_THRESH, 0x00},
+	{MAX98373_R209B_BDE_THRESH_HYST, 0x00},
+	{MAX98373_R20A8_BDE_L1_CFG_1, 0x00},
+	{MAX98373_R20A9_BDE_L1_CFG_2, 0x00},
+	{MAX98373_R20AA_BDE_L1_CFG_3, 0x00},
+	{MAX98373_R20AB_BDE_L2_CFG_1, 0x00},
+	{MAX98373_R20AC_BDE_L2_CFG_2, 0x00},
+	{MAX98373_R20AD_BDE_L2_CFG_3, 0x00},
+	{MAX98373_R20AE_BDE_L3_CFG_1, 0x00},
+	{MAX98373_R20AF_BDE_L3_CFG_2, 0x00},
+	{MAX98373_R20B0_BDE_L3_CFG_3, 0x00},
+	{MAX98373_R20B1_BDE_L4_CFG_1, 0x00},
+	{MAX98373_R20B2_BDE_L4_CFG_2, 0x00},
+	{MAX98373_R20B3_BDE_L4_CFG_3, 0x00},
+	{MAX98373_R20B4_BDE_INFINITE_HOLD_RELEASE, 0x00},
+	{MAX98373_R20B5_BDE_EN, 0x00},
+	{MAX98373_R20B6_BDE_CUR_STATE_READBACK, 0x00},
+	{MAX98373_R20D1_DHT_CFG, 0x01},
+	{MAX98373_R20D2_DHT_ATTACK_CFG, 0x02},
+	{MAX98373_R20D3_DHT_RELEASE_CFG, 0x03},
+	{MAX98373_R20D4_DHT_EN, 0x00},
+	{MAX98373_R20E0_LIMITER_THRESH_CFG, 0x00},
+	{MAX98373_R20E1_LIMITER_ATK_REL_RATES, 0x00},
+	{MAX98373_R20E2_LIMITER_EN, 0x00},
+	{MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG, 0x00},
+	{MAX98373_R20FF_GLOBAL_SHDN, 0x00},
+	{MAX98373_R21FF_REV_ID, 0x42},
+};
+
+static int max98373_dai_set_fmt(struct snd_soc_dai *codec_dai, unsigned int fmt)
+{
+	struct snd_soc_codec *codec = codec_dai->codec;
+	struct max98373_priv *max98373 = snd_soc_codec_get_drvdata(codec);
+	unsigned int format = 0;
+	unsigned int invert = 0;
+
+	dev_dbg(codec->dev, "%s: fmt 0x%08X\n", __func__, fmt);
+
+	switch (fmt & SND_SOC_DAIFMT_INV_MASK) {
+	case SND_SOC_DAIFMT_NB_NF:
+		break;
+	case SND_SOC_DAIFMT_IB_NF:
+		invert = MAX98373_PCM_MODE_CFG_PCM_BCLKEDGE;
+		break;
+	default:
+		dev_err(codec->dev, "DAI invert mode unsupported\n");
+		return -EINVAL;
+	}
+
+	regmap_update_bits(max98373->regmap,
+		MAX98373_R2026_PCM_CLOCK_RATIO,
+		MAX98373_PCM_MODE_CFG_PCM_BCLKEDGE,
+		invert);
+
+	/* interface format */
+	switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) {
+	case SND_SOC_DAIFMT_I2S:
+		format = MAX98373_PCM_FORMAT_I2S;
+		break;
+	case SND_SOC_DAIFMT_LEFT_J:
+		format = MAX98373_PCM_FORMAT_LJ;
+		break;
+	case SND_SOC_DAIFMT_DSP_A:
+		format = MAX98373_PCM_FORMAT_TDM_MODE1;
+		break;
+	case SND_SOC_DAIFMT_DSP_B:
+		format = MAX98373_PCM_FORMAT_TDM_MODE0;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	regmap_update_bits(max98373->regmap,
+		MAX98373_R2024_PCM_DATA_FMT_CFG,
+		MAX98373_PCM_MODE_CFG_FORMAT_MASK,
+		format << MAX98373_PCM_MODE_CFG_FORMAT_SHIFT);
+
+	return 0;
+}
+
+/* BCLKs per LRCLK */
+static const int bclk_sel_table[] = {
+	32, 48, 64, 96, 128, 192, 256, 384, 512, 320,
+};
+
+static int max98373_get_bclk_sel(int bclk)
+{
+	int i;
+	/* match BCLKs per LRCLK */
+	for (i = 0; i < ARRAY_SIZE(bclk_sel_table); i++) {
+		if (bclk_sel_table[i] == bclk)
+			return i + 2;
+	}
+	return 0;
+}
+
+static int max98373_set_clock(struct snd_soc_codec *codec,
+	struct snd_pcm_hw_params *params)
+{
+	struct max98373_priv *max98373 = snd_soc_codec_get_drvdata(codec);
+	/* BCLK/LRCLK ratio calculation */
+	int blr_clk_ratio = params_channels(params) * max98373->ch_size;
+	int value;
+
+	if (!max98373->tdm_mode) {
+		/* BCLK configuration */
+		value = max98373_get_bclk_sel(blr_clk_ratio);
+		if (!value) {
+			dev_err(codec->dev, "format unsupported %d\n",
+				params_format(params));
+			return -EINVAL;
+		}
+
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R2026_PCM_CLOCK_RATIO,
+			MAX98373_PCM_CLK_SETUP_BSEL_MASK,
+			value);
+	}
+	return 0;
+}
+
+static int max98373_dai_hw_params(struct snd_pcm_substream *substream,
+	struct snd_pcm_hw_params *params,
+	struct snd_soc_dai *dai)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	struct max98373_priv *max98373 = snd_soc_codec_get_drvdata(codec);
+	unsigned int sampling_rate = 0;
+	unsigned int chan_sz = 0;
+
+	/* pcm mode configuration */
+	switch (snd_pcm_format_width(params_format(params))) {
+	case 16:
+		chan_sz = MAX98373_PCM_MODE_CFG_CHANSZ_16;
+		break;
+	case 24:
+		chan_sz = MAX98373_PCM_MODE_CFG_CHANSZ_24;
+		break;
+	case 32:
+		chan_sz = MAX98373_PCM_MODE_CFG_CHANSZ_32;
+		break;
+	default:
+		dev_err(codec->dev, "format unsupported %d\n",
+			params_format(params));
+		goto err;
+	}
+
+	max98373->ch_size = snd_pcm_format_width(params_format(params));
+
+	regmap_update_bits(max98373->regmap,
+		MAX98373_R2024_PCM_DATA_FMT_CFG,
+		MAX98373_PCM_MODE_CFG_CHANSZ_MASK, chan_sz);
+
+	dev_dbg(codec->dev, "format supported %d",
+		params_format(params));
+
+	/* sampling rate configuration */
+	switch (params_rate(params)) {
+	case 8000:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_8000;
+		break;
+	case 11025:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_11025;
+		break;
+	case 12000:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_12000;
+		break;
+	case 16000:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_16000;
+		break;
+	case 22050:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_22050;
+		break;
+	case 24000:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_24000;
+		break;
+	case 32000:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_32000;
+		break;
+	case 44100:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_44100;
+		break;
+	case 48000:
+		sampling_rate = MAX98373_PCM_SR_SET1_SR_48000;
+		break;
+	default:
+		dev_err(codec->dev, "rate %d not supported\n",
+			params_rate(params));
+		goto err;
+	}
+
+	/* set DAI_SR to correct LRCLK frequency */
+	regmap_update_bits(max98373->regmap,
+		MAX98373_R2027_PCM_SR_SETUP_1,
+		MAX98373_PCM_SR_SET1_SR_MASK,
+		sampling_rate);
+	regmap_update_bits(max98373->regmap,
+		MAX98373_R2028_PCM_SR_SETUP_2,
+		MAX98373_PCM_SR_SET2_SR_MASK,
+		sampling_rate << MAX98373_PCM_SR_SET2_SR_SHIFT);
+
+	/* set sampling rate of IV */
+	if (max98373->interleave_mode &&
+	    sampling_rate > MAX98373_PCM_SR_SET1_SR_16000)
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R2028_PCM_SR_SETUP_2,
+			MAX98373_PCM_SR_SET2_IVADC_SR_MASK,
+			sampling_rate - 3);
+	else
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R2028_PCM_SR_SETUP_2,
+			MAX98373_PCM_SR_SET2_IVADC_SR_MASK,
+			sampling_rate);
+
+	return max98373_set_clock(codec, params);
+err:
+	return -EINVAL;
+}
+
+static int max98373_dai_tdm_slot(struct snd_soc_dai *dai,
+	unsigned int tx_mask, unsigned int rx_mask,
+	int slots, int slot_width)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	struct max98373_priv *max98373 = snd_soc_codec_get_drvdata(codec);
+	int bsel = 0;
+	unsigned int chan_sz = 0;
+	unsigned int mask;
+	int x, slot_found;
+
+	if (!tx_mask && !rx_mask && !slots && !slot_width)
+		max98373->tdm_mode = false;
+	else
+		max98373->tdm_mode = true;
+
+	/* BCLK configuration */
+	bsel = max98373_get_bclk_sel(slots * slot_width);
+	if (bsel == 0) {
+		dev_err(codec->dev, "BCLK %d not supported\n",
+			slots * slot_width);
+		return -EINVAL;
+	}
+
+	regmap_update_bits(max98373->regmap,
+		MAX98373_R2026_PCM_CLOCK_RATIO,
+		MAX98373_PCM_CLK_SETUP_BSEL_MASK,
+		bsel);
+
+	/* Channel size configuration */
+	switch (slot_width) {
+	case 16:
+		chan_sz = MAX98373_PCM_MODE_CFG_CHANSZ_16;
+		break;
+	case 24:
+		chan_sz = MAX98373_PCM_MODE_CFG_CHANSZ_24;
+		break;
+	case 32:
+		chan_sz = MAX98373_PCM_MODE_CFG_CHANSZ_32;
+		break;
+	default:
+		dev_err(codec->dev, "format unsupported %d\n",
+			slot_width);
+		return -EINVAL;
+	}
+
+	regmap_update_bits(max98373->regmap,
+		MAX98373_R2024_PCM_DATA_FMT_CFG,
+		MAX98373_PCM_MODE_CFG_CHANSZ_MASK, chan_sz);
+
+	/* Rx slot configuration */
+	slot_found = 0;
+	mask = rx_mask;
+	for (x = 0 ; x < 16 ; x++, mask >>= 1) {
+		if (mask & 0x1) {
+			if (slot_found == 0)
+				regmap_update_bits(max98373->regmap,
+					MAX98373_R2029_PCM_TO_SPK_MONO_MIX_1,
+					MAX98373_PCM_TO_SPK_CH0_SRC_MASK, x);
+			else
+				regmap_write(max98373->regmap,
+					MAX98373_R202A_PCM_TO_SPK_MONO_MIX_2,
+					x);
+			slot_found++;
+			if (slot_found > 1)
+				break;
+		}
+	}
+
+	/* Tx slot Hi-Z configuration */
+	regmap_write(max98373->regmap,
+		MAX98373_R2020_PCM_TX_HIZ_EN_1,
+		~tx_mask & 0xFF);
+	regmap_write(max98373->regmap,
+		MAX98373_R2021_PCM_TX_HIZ_EN_2,
+		(~tx_mask & 0xFF00) >> 8);
+
+	return 0;
+}
+
+#define MAX98373_RATES SNDRV_PCM_RATE_8000_96000
+
+#define MAX98373_FORMATS (SNDRV_PCM_FMTBIT_S16_LE | \
+	SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S32_LE)
+
+static const struct snd_soc_dai_ops max98373_dai_ops = {
+	.set_fmt = max98373_dai_set_fmt,
+	.hw_params = max98373_dai_hw_params,
+	.set_tdm_slot = max98373_dai_tdm_slot,
+};
+
+static int max98373_dac_event(struct snd_soc_dapm_widget *w,
+	struct snd_kcontrol *kcontrol, int event)
+{
+	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
+	struct max98373_priv *max98373 = snd_soc_codec_get_drvdata(codec);
+
+	switch (event) {
+	case SND_SOC_DAPM_POST_PMU:
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R20FF_GLOBAL_SHDN,
+			MAX98373_GLOBAL_EN_MASK, 1);
+		break;
+	case SND_SOC_DAPM_POST_PMD:
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R20FF_GLOBAL_SHDN,
+			MAX98373_GLOBAL_EN_MASK, 0);
+		max98373->tdm_mode = 0;
+		break;
+	default:
+		return 0;
+	}
+	return 0;
+}
+
+static const char * const max98373_switch_text[] = {
+	"Left", "Right", "LeftRight"};
+
+static const struct soc_enum dai_sel_enum =
+	SOC_ENUM_SINGLE(MAX98373_R2029_PCM_TO_SPK_MONO_MIX_1,
+		MAX98373_PCM_TO_SPK_MONOMIX_CFG_SHIFT,
+		3, max98373_switch_text);
+
+static const struct snd_kcontrol_new max98373_dai_controls =
+	SOC_DAPM_ENUM("DAI Sel", dai_sel_enum);
+
+static const struct snd_kcontrol_new max98373_vi_control =
+	SOC_DAPM_SINGLE("Switch", MAX98373_R202C_PCM_TX_EN, 0, 1, 0);
+
+static const struct snd_kcontrol_new max98373_spkfb_control =
+	SOC_DAPM_SINGLE("Switch", MAX98373_R2043_AMP_EN, 1, 1, 0);
+
+static const struct snd_soc_dapm_widget max98373_dapm_widgets[] = {
+SND_SOC_DAPM_DAC_E("Amp Enable", "HiFi Playback",
+	MAX98373_R202B_PCM_RX_EN, 0, 0, max98373_dac_event,
+	SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_POST_PMD),
+SND_SOC_DAPM_MUX("DAI Sel Mux", SND_SOC_NOPM, 0, 0,
+	&max98373_dai_controls),
+SND_SOC_DAPM_OUTPUT("BE_OUT"),
+SND_SOC_DAPM_AIF_OUT("Voltage Sense", "HiFi Capture", 0,
+	MAX98373_R2047_IV_SENSE_ADC_EN, 0, 0),
+SND_SOC_DAPM_AIF_OUT("Current Sense", "HiFi Capture", 0,
+	MAX98373_R2047_IV_SENSE_ADC_EN, 1, 0),
+SND_SOC_DAPM_AIF_OUT("Speaker FB Sense", "HiFi Capture", 0,
+	SND_SOC_NOPM, 0, 0),
+SND_SOC_DAPM_SWITCH("VI Sense", SND_SOC_NOPM, 0, 0,
+	&max98373_vi_control),
+SND_SOC_DAPM_SWITCH("SpkFB Sense", SND_SOC_NOPM, 0, 0,
+	&max98373_spkfb_control),
+SND_SOC_DAPM_SIGGEN("VMON"),
+SND_SOC_DAPM_SIGGEN("IMON"),
+SND_SOC_DAPM_SIGGEN("FBMON"),
+};
+
+static DECLARE_TLV_DB_SCALE(max98373_digital_tlv, 0, -50, 0);
+static const DECLARE_TLV_DB_RANGE(max98373_spk_tlv,
+	0, 8, TLV_DB_SCALE_ITEM(0, 50, 0),
+	9, 10, TLV_DB_SCALE_ITEM(500, 100, 0),
+);
+static const DECLARE_TLV_DB_RANGE(max98373_spkgain_max_tlv,
+	0, 9, TLV_DB_SCALE_ITEM(800, 100, 0),
+);
+static const DECLARE_TLV_DB_RANGE(max98373_dht_step_size_tlv,
+	0, 1, TLV_DB_SCALE_ITEM(25, 25, 0),
+	2, 4, TLV_DB_SCALE_ITEM(100, 100, 0),
+);
+static const DECLARE_TLV_DB_RANGE(max98373_dht_spkgain_min_tlv,
+	0, 9, TLV_DB_SCALE_ITEM(800, 100, 0),
+);
+static const DECLARE_TLV_DB_RANGE(max98373_dht_rotation_point_tlv,
+	0, 1, TLV_DB_SCALE_ITEM(-50, -50, 0),
+	2, 7, TLV_DB_SCALE_ITEM(-200, -100, 0),
+	8, 9, TLV_DB_SCALE_ITEM(-1000, -200, 0),
+	10, 11, TLV_DB_SCALE_ITEM(-1500, -300, 0),
+	12, 13, TLV_DB_SCALE_ITEM(-2000, -200, 0),
+	14, 15, TLV_DB_SCALE_ITEM(-2500, -500, 0),
+);
+static const DECLARE_TLV_DB_RANGE(max98373_limiter_thresh_tlv,
+	0, 15, TLV_DB_SCALE_ITEM(0, -100, 0),
+);
+
+static const DECLARE_TLV_DB_RANGE(max98373_bde_gain_tlv,
+	0, 60, TLV_DB_SCALE_ITEM(0, -25, 0),
+);
+
+static bool max98373_readable_register(struct device *dev, unsigned int reg)
+{
+	switch (reg) {
+	case MAX98373_R2001_INT_RAW1 ... MAX98373_R200C_INT_EN3:
+	case MAX98373_R2010_IRQ_CTRL:
+	case MAX98373_R2014_THERM_WARN_THRESH
+		... MAX98373_R2018_THERM_FOLDBACK_EN:
+	case MAX98373_R201E_PIN_DRIVE_STRENGTH
+		... MAX98373_R2036_SOUNDWIRE_CTRL:
+	case MAX98373_R203D_AMP_DIG_VOL_CTRL ... MAX98373_R2043_AMP_EN:
+	case MAX98373_R2046_IV_SENSE_ADC_DSP_CFG
+		... MAX98373_R2047_IV_SENSE_ADC_EN:
+	case MAX98373_R2051_MEAS_ADC_SAMPLING_RATE
+		... MAX98373_R2056_MEAS_ADC_PVDD_CH_EN:
+	case MAX98373_R2090_BDE_LVL_HOLD ... MAX98373_R2092_BDE_CLIPPER_MODE:
+	case MAX98373_R2097_BDE_L1_THRESH
+		... MAX98373_R209B_BDE_THRESH_HYST:
+	case MAX98373_R20A8_BDE_L1_CFG_1 ... MAX98373_R20B3_BDE_L4_CFG_3:
+	case MAX98373_R20B5_BDE_EN ... MAX98373_R20B6_BDE_CUR_STATE_READBACK:
+	case MAX98373_R20D1_DHT_CFG ... MAX98373_R20D4_DHT_EN:
+	case MAX98373_R20E0_LIMITER_THRESH_CFG ... MAX98373_R20E2_LIMITER_EN:
+	case MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG
+		... MAX98373_R20FF_GLOBAL_SHDN:
+	case MAX98373_R21FF_REV_ID:
+		return true;
+	default:
+		return false;
+	}
+};
+
+static bool max98373_volatile_reg(struct device *dev, unsigned int reg)
+{
+	switch (reg) {
+	case MAX98373_R2000_SW_RESET ... MAX98373_R2009_INT_FLAG3:
+	case MAX98373_R2054_MEAS_ADC_PVDD_CH_READBACK:
+	case MAX98373_R2055_MEAS_ADC_THERM_CH_READBACK:
+	case MAX98373_R20B6_BDE_CUR_STATE_READBACK:
+	case MAX98373_R21FF_REV_ID:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static const char * const max98373_output_voltage_lvl_text[] = {
+	"5.43V", "6.09V", "6.83V", "7.67V", "8.60V",
+	"9.65V", "10.83V", "12.15V", "13.63V", "15.29V"
+};
+
+static SOC_ENUM_SINGLE_DECL(max98373_out_volt_enum,
+			    MAX98373_R203E_AMP_PATH_GAIN, 0,
+			    max98373_output_voltage_lvl_text);
+
+static const char * const max98373_dht_attack_rate_text[] = {
+	"17.5us", "35us", "70us", "140us",
+	"280us", "560us", "1120us", "2240us"
+};
+
+static SOC_ENUM_SINGLE_DECL(max98373_dht_attack_rate_enum,
+			    MAX98373_R20D2_DHT_ATTACK_CFG, 0,
+			    max98373_dht_attack_rate_text);
+
+static const char * const max98373_dht_release_rate_text[] = {
+	"45ms", "225ms", "450ms", "1150ms",
+	"2250ms", "3100ms", "4500ms", "6750ms"
+};
+
+static SOC_ENUM_SINGLE_DECL(max98373_dht_release_rate_enum,
+			    MAX98373_R20D3_DHT_RELEASE_CFG, 0,
+			    max98373_dht_release_rate_text);
+
+static const char * const max98373_limiter_attack_rate_text[] = {
+	"10us", "20us", "40us", "80us",
+	"160us", "320us", "640us", "1.28ms",
+	"2.56ms", "5.12ms", "10.24ms", "20.48ms",
+	"40.96ms", "81.92ms", "16.384ms", "32.768ms"
+};
+
+static SOC_ENUM_SINGLE_DECL(max98373_limiter_attack_rate_enum,
+			    MAX98373_R20E1_LIMITER_ATK_REL_RATES, 4,
+			    max98373_limiter_attack_rate_text);
+
+static const char * const max98373_limiter_release_rate_text[] = {
+	"40us", "80us", "160us", "320us",
+	"640us", "1.28ms", "2.56ms", "5.120ms",
+	"10.24ms", "20.48ms", "40.96ms", "81.92ms",
+	"163.84ms", "327.68ms", "655.36ms", "1310.72ms"
+};
+
+static SOC_ENUM_SINGLE_DECL(max98373_limiter_release_rate_enum,
+			    MAX98373_R20E1_LIMITER_ATK_REL_RATES, 0,
+			    max98373_limiter_release_rate_text);
+
+static const char * const max98373_ADC_samplerate_text[] = {
+	"333kHz", "192kHz", "64kHz", "48kHz"
+};
+
+static SOC_ENUM_SINGLE_DECL(max98373_adc_samplerate_enum,
+			    MAX98373_R2051_MEAS_ADC_SAMPLING_RATE, 0,
+			    max98373_ADC_samplerate_text);
+
+static const struct snd_kcontrol_new max98373_snd_controls[] = {
+SOC_SINGLE("Digital Vol Sel Switch", MAX98373_R203F_AMP_DSP_CFG,
+	MAX98373_AMP_VOL_SEL_SHIFT, 1, 0),
+SOC_SINGLE("Volume Location Switch", MAX98373_R203F_AMP_DSP_CFG,
+	MAX98373_AMP_VOL_SEL_SHIFT, 1, 0),
+SOC_SINGLE("Ramp Up Switch", MAX98373_R203F_AMP_DSP_CFG,
+	MAX98373_AMP_DSP_CFG_RMP_UP_SHIFT, 1, 0),
+SOC_SINGLE("Ramp Down Switch", MAX98373_R203F_AMP_DSP_CFG,
+	MAX98373_AMP_DSP_CFG_RMP_DN_SHIFT, 1, 0),
+SOC_SINGLE("CLK Monitor Switch", MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG,
+	MAX98373_CLOCK_MON_SHIFT, 1, 0),
+SOC_SINGLE("Dither Switch", MAX98373_R203F_AMP_DSP_CFG,
+	MAX98373_AMP_DSP_CFG_DITH_SHIFT, 1, 0),
+SOC_SINGLE("DC Blocker Switch", MAX98373_R203F_AMP_DSP_CFG,
+	MAX98373_AMP_DSP_CFG_DCBLK_SHIFT, 1, 0),
+SOC_SINGLE_TLV("Digital Volume", MAX98373_R203D_AMP_DIG_VOL_CTRL,
+	0, 0x7F, 0, max98373_digital_tlv),
+SOC_SINGLE_TLV("Speaker Volume", MAX98373_R203E_AMP_PATH_GAIN,
+	MAX98373_SPK_DIGI_GAIN_SHIFT, 10, 0, max98373_spk_tlv),
+SOC_SINGLE_TLV("FS Max Volume", MAX98373_R203E_AMP_PATH_GAIN,
+	MAX98373_FS_GAIN_MAX_SHIFT, 9, 0, max98373_spkgain_max_tlv),
+SOC_ENUM("Output Voltage", max98373_out_volt_enum),
+/* Dynamic Headroom Tracking */
+SOC_SINGLE("DHT Switch", MAX98373_R20D4_DHT_EN,
+	MAX98373_DHT_EN_SHIFT, 1, 0),
+SOC_SINGLE_TLV("DHT Min Volume", MAX98373_R20D1_DHT_CFG,
+	MAX98373_DHT_SPK_GAIN_MIN_SHIFT, 9, 0, max98373_dht_spkgain_min_tlv),
+SOC_SINGLE_TLV("DHT Rot Pnt Volume", MAX98373_R20D1_DHT_CFG,
+	MAX98373_DHT_ROT_PNT_SHIFT, 15, 0, max98373_dht_rotation_point_tlv),
+SOC_SINGLE_TLV("DHT Attack Step Volume", MAX98373_R20D2_DHT_ATTACK_CFG,
+	MAX98373_DHT_ATTACK_STEP_SHIFT, 4, 0, max98373_dht_step_size_tlv),
+SOC_SINGLE_TLV("DHT Release Step Volume", MAX98373_R20D3_DHT_RELEASE_CFG,
+	MAX98373_DHT_RELEASE_STEP_SHIFT, 4, 0, max98373_dht_step_size_tlv),
+SOC_ENUM("DHT Attack Rate", max98373_dht_attack_rate_enum),
+SOC_ENUM("DHT Release Rate", max98373_dht_release_rate_enum),
+/* ADC configuration */
+SOC_SINGLE("ADC PVDD CH Switch", MAX98373_R2056_MEAS_ADC_PVDD_CH_EN, 0, 1, 0),
+SOC_SINGLE("ADC PVDD FLT Switch", MAX98373_R2052_MEAS_ADC_PVDD_FLT_CFG,
+	MAX98373_FLT_EN_SHIFT, 1, 0),
+SOC_SINGLE("ADC TEMP FLT Switch", MAX98373_R2053_MEAS_ADC_THERM_FLT_CFG,
+	MAX98373_FLT_EN_SHIFT, 1, 0),
+SOC_SINGLE("ADC PVDD", MAX98373_R2054_MEAS_ADC_PVDD_CH_READBACK, 0, 0xFF, 0),
+SOC_SINGLE("ADC TEMP", MAX98373_R2055_MEAS_ADC_THERM_CH_READBACK, 0, 0xFF, 0),
+SOC_SINGLE("ADC PVDD FLT Coeff", MAX98373_R2052_MEAS_ADC_PVDD_FLT_CFG,
+	0, 0x3, 0),
+SOC_SINGLE("ADC TEMP FLT Coeff", MAX98373_R2053_MEAS_ADC_THERM_FLT_CFG,
+	0, 0x3, 0),
+SOC_ENUM("ADC SampleRate", max98373_adc_samplerate_enum),
+/* Brownout Detection Engine */
+SOC_SINGLE("BDE Switch", MAX98373_R20B5_BDE_EN, MAX98373_BDE_EN_SHIFT, 1, 0),
+SOC_SINGLE("BDE LVL4 Mute Switch", MAX98373_R20B2_BDE_L4_CFG_2,
+	MAX98373_LVL4_MUTE_EN_SHIFT, 1, 0),
+SOC_SINGLE("BDE LVL4 Hold Switch", MAX98373_R20B2_BDE_L4_CFG_2,
+	MAX98373_LVL4_HOLD_EN_SHIFT, 1, 0),
+SOC_SINGLE("BDE LVL1 Thresh", MAX98373_R2097_BDE_L1_THRESH, 0, 0xFF, 0),
+SOC_SINGLE("BDE LVL2 Thresh", MAX98373_R2098_BDE_L2_THRESH, 0, 0xFF, 0),
+SOC_SINGLE("BDE LVL3 Thresh", MAX98373_R2099_BDE_L3_THRESH, 0, 0xFF, 0),
+SOC_SINGLE("BDE LVL4 Thresh", MAX98373_R209A_BDE_L4_THRESH, 0, 0xFF, 0),
+SOC_SINGLE("BDE Active Level", MAX98373_R20B6_BDE_CUR_STATE_READBACK, 0, 8, 0),
+SOC_SINGLE("BDE Clip Mode Switch", MAX98373_R2092_BDE_CLIPPER_MODE, 0, 1, 0),
+SOC_SINGLE("BDE Thresh Hysteresis", MAX98373_R209B_BDE_THRESH_HYST, 0, 0xFF, 0),
+SOC_SINGLE("BDE Hold Time", MAX98373_R2090_BDE_LVL_HOLD, 0, 0xFF, 0),
+SOC_SINGLE("BDE Attack Rate", MAX98373_R2091_BDE_GAIN_ATK_REL_RATE, 4, 0xF, 0),
+SOC_SINGLE("BDE Release Rate", MAX98373_R2091_BDE_GAIN_ATK_REL_RATE, 0, 0xF, 0),
+SOC_SINGLE_TLV("BDE LVL1 Clip Thresh Volume", MAX98373_R20A9_BDE_L1_CFG_2,
+	0, 0x3C, 0, max98373_bde_gain_tlv),
+SOC_SINGLE_TLV("BDE LVL2 Clip Thresh Volume", MAX98373_R20AC_BDE_L2_CFG_2,
+	0, 0x3C, 0, max98373_bde_gain_tlv),
+SOC_SINGLE_TLV("BDE LVL3 Clip Thresh Volume", MAX98373_R20AF_BDE_L3_CFG_2,
+	0, 0x3C, 0, max98373_bde_gain_tlv),
+SOC_SINGLE_TLV("BDE LVL4 Clip Thresh Volume", MAX98373_R20B2_BDE_L4_CFG_2,
+	0, 0x3C, 0, max98373_bde_gain_tlv),
+SOC_SINGLE_TLV("BDE LVL1 Clip Reduction Volume", MAX98373_R20AA_BDE_L1_CFG_3,
+	0, 0x3C, 0, max98373_bde_gain_tlv),
+SOC_SINGLE_TLV("BDE LVL2 Clip Reduction Volume", MAX98373_R20AD_BDE_L2_CFG_3,
+	0, 0x3C, 0, max98373_bde_gain_tlv),
+SOC_SINGLE_TLV("BDE LVL3 Clip Reduction Volume", MAX98373_R20B0_BDE_L3_CFG_3,
+	0, 0x3C, 0, max98373_bde_gain_tlv),
+SOC_SINGLE_TLV("BDE LVL4 Clip Reduction Volume", MAX98373_R20B3_BDE_L4_CFG_3,
+	0, 0x3C, 0, max98373_bde_gain_tlv),
+SOC_SINGLE_TLV("BDE LVL1 Limiter Thresh Volume", MAX98373_R20A8_BDE_L1_CFG_1,
+	0, 0xF, 0, max98373_limiter_thresh_tlv),
+SOC_SINGLE_TLV("BDE LVL2 Limiter Thresh Volume", MAX98373_R20AB_BDE_L2_CFG_1,
+	0, 0xF, 0, max98373_limiter_thresh_tlv),
+SOC_SINGLE_TLV("BDE LVL3 Limiter Thresh Volume", MAX98373_R20AE_BDE_L3_CFG_1,
+	0, 0xF, 0, max98373_limiter_thresh_tlv),
+SOC_SINGLE_TLV("BDE LVL4 Limiter Thresh Volume", MAX98373_R20B1_BDE_L4_CFG_1,
+	0, 0xF, 0, max98373_limiter_thresh_tlv),
+/* Limiter */
+SOC_SINGLE("Limiter Switch", MAX98373_R20E2_LIMITER_EN,
+	MAX98373_LIMITER_EN_SHIFT, 1, 0),
+SOC_SINGLE("Limiter Src Switch", MAX98373_R20E0_LIMITER_THRESH_CFG,
+	MAX98373_LIMITER_THRESH_SRC_SHIFT, 1, 0),
+SOC_SINGLE_TLV("Limiter Thresh Volume", MAX98373_R20E0_LIMITER_THRESH_CFG,
+	MAX98373_LIMITER_THRESH_SHIFT, 15, 0, max98373_limiter_thresh_tlv),
+SOC_ENUM("Limiter Attack Rate", max98373_limiter_attack_rate_enum),
+SOC_ENUM("Limiter Release Rate", max98373_limiter_release_rate_enum),
+};
+
+static const struct snd_soc_dapm_route max98373_audio_map[] = {
+	/* Plabyack */
+	{"DAI Sel Mux", "Left", "Amp Enable"},
+	{"DAI Sel Mux", "Right", "Amp Enable"},
+	{"DAI Sel Mux", "LeftRight", "Amp Enable"},
+	{"BE_OUT", NULL, "DAI Sel Mux"},
+	/* Capture */
+	{ "VI Sense", "Switch", "VMON" },
+	{ "VI Sense", "Switch", "IMON" },
+	{ "SpkFB Sense", "Switch", "FBMON" },
+	{ "Voltage Sense", NULL, "VI Sense" },
+	{ "Current Sense", NULL, "VI Sense" },
+	{ "Speaker FB Sense", NULL, "SpkFB Sense" },
+};
+
+static struct snd_soc_dai_driver max98373_dai[] = {
+	{
+		.name = "max98373-aif1",
+		.playback = {
+			.stream_name = "HiFi Playback",
+			.channels_min = 1,
+			.channels_max = 2,
+			.rates = MAX98373_RATES,
+			.formats = MAX98373_FORMATS,
+		},
+		.capture = {
+			.stream_name = "HiFi Capture",
+			.channels_min = 1,
+			.channels_max = 2,
+			.rates = MAX98373_RATES,
+			.formats = MAX98373_FORMATS,
+		},
+		.ops = &max98373_dai_ops,
+	}
+};
+
+static int max98373_probe(struct snd_soc_codec *codec)
+{
+	struct max98373_priv *max98373 = snd_soc_codec_get_drvdata(codec);
+
+	codec->control_data = max98373->regmap;
+
+	/* Software Reset */
+	regmap_write(max98373->regmap,
+		MAX98373_R2000_SW_RESET, MAX98373_SOFT_RESET);
+
+	/* IV default slot configuration */
+	regmap_write(max98373->regmap,
+		MAX98373_R2020_PCM_TX_HIZ_EN_1,
+		0xFF);
+	regmap_write(max98373->regmap,
+		MAX98373_R2021_PCM_TX_HIZ_EN_2,
+		0xFF);
+	/* L/R mix configuration */
+	regmap_write(max98373->regmap,
+		MAX98373_R2029_PCM_TO_SPK_MONO_MIX_1,
+		0x80);
+	regmap_write(max98373->regmap,
+		MAX98373_R202A_PCM_TO_SPK_MONO_MIX_2,
+		0x1);
+	/* Set inital volume (0dB) */
+	regmap_write(max98373->regmap,
+		MAX98373_R203D_AMP_DIG_VOL_CTRL,
+		0x00);
+	regmap_write(max98373->regmap,
+		MAX98373_R203E_AMP_PATH_GAIN,
+		0x00);
+	/* Enable DC blocker */
+	regmap_write(max98373->regmap,
+		MAX98373_R203F_AMP_DSP_CFG,
+		0x3);
+	/* Enable IMON VMON DC blocker */
+	regmap_write(max98373->regmap,
+		MAX98373_R2046_IV_SENSE_ADC_DSP_CFG,
+		0x7);
+	/* voltage, current slot configuration */
+	regmap_write(max98373->regmap,
+		MAX98373_R2022_PCM_TX_SRC_1,
+		(max98373->i_slot << MAX98373_PCM_TX_CH_SRC_A_I_SHIFT |
+		max98373->v_slot) & 0xFF);
+	if (max98373->v_slot < 8)
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R2020_PCM_TX_HIZ_EN_1,
+			1 << max98373->v_slot, 0);
+	else
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R2021_PCM_TX_HIZ_EN_2,
+			1 << (max98373->v_slot - 8), 0);
+
+	if (max98373->i_slot < 8)
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R2020_PCM_TX_HIZ_EN_1,
+			1 << max98373->i_slot, 0);
+	else
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R2021_PCM_TX_HIZ_EN_2,
+			1 << (max98373->i_slot - 8), 0);
+
+	/* speaker feedback slot configuration */
+	regmap_write(max98373->regmap,
+		MAX98373_R2023_PCM_TX_SRC_2,
+		max98373->spkfb_slot & 0xFF);
+
+	/* Set interleave mode */
+	if (max98373->interleave_mode)
+		regmap_update_bits(max98373->regmap,
+			MAX98373_R2024_PCM_DATA_FMT_CFG,
+			MAX98373_PCM_TX_CH_INTERLEAVE_MASK,
+			MAX98373_PCM_TX_CH_INTERLEAVE_MASK);
+
+	/* Speaker enable */
+	regmap_update_bits(max98373->regmap,
+		MAX98373_R2043_AMP_EN,
+		MAX98373_SPK_EN_MASK, 1);
+
+	return 0;
+}
+
+#ifdef CONFIG_PM_SLEEP
+static int max98373_suspend(struct device *dev)
+{
+	struct max98373_priv *max98373 = dev_get_drvdata(dev);
+
+	regcache_cache_only(max98373->regmap, true);
+	regcache_mark_dirty(max98373->regmap);
+	return 0;
+}
+static int max98373_resume(struct device *dev)
+{
+	struct max98373_priv *max98373 = dev_get_drvdata(dev);
+
+	regmap_write(max98373->regmap,
+		MAX98373_R2000_SW_RESET, MAX98373_SOFT_RESET);
+	regcache_cache_only(max98373->regmap, false);
+	regcache_sync(max98373->regmap);
+	return 0;
+}
+#endif
+
+static const struct dev_pm_ops max98373_pm = {
+	SET_SYSTEM_SLEEP_PM_OPS(max98373_suspend, max98373_resume)
+};
+
+static const struct snd_soc_codec_driver soc_codec_dev_max98373 = {
+	.probe = max98373_probe,
+	.component_driver = {
+		.controls = max98373_snd_controls,
+		.num_controls = ARRAY_SIZE(max98373_snd_controls),
+		.dapm_widgets = max98373_dapm_widgets,
+		.num_dapm_widgets = ARRAY_SIZE(max98373_dapm_widgets),
+		.dapm_routes = max98373_audio_map,
+		.num_dapm_routes = ARRAY_SIZE(max98373_audio_map),
+	},
+};
+
+static const struct regmap_config max98373_regmap = {
+	.reg_bits = 16,
+	.val_bits = 8,
+	.max_register = MAX98373_R21FF_REV_ID,
+	.reg_defaults  = max98373_reg,
+	.num_reg_defaults = ARRAY_SIZE(max98373_reg),
+	.readable_reg = max98373_readable_register,
+	.volatile_reg = max98373_volatile_reg,
+	.cache_type = REGCACHE_RBTREE,
+};
+
+static void max98373_slot_config(struct i2c_client *i2c,
+	struct max98373_priv *max98373)
+{
+	int value;
+	struct device *dev = &i2c->dev;
+
+	if (!device_property_read_u32(dev, "maxim,vmon-slot-no", &value))
+		max98373->v_slot = value & 0xF;
+	else
+		max98373->v_slot = 0;
+
+	if (!device_property_read_u32(dev, "maxim,imon-slot-no", &value))
+		max98373->i_slot = value & 0xF;
+	else
+		max98373->i_slot = 1;
+
+	if (!device_property_read_u32(dev, "maxim,spkfb-slot-no", &value))
+		max98373->spkfb_slot = value & 0xF;
+	else
+		max98373->spkfb_slot = 2;
+}
+
+static int max98373_i2c_probe(struct i2c_client *i2c,
+	const struct i2c_device_id *id)
+{
+
+	int ret = 0;
+	int reg = 0;
+	struct max98373_priv *max98373 = NULL;
+
+	max98373 = devm_kzalloc(&i2c->dev, sizeof(*max98373), GFP_KERNEL);
+
+	if (!max98373) {
+		ret = -ENOMEM;
+		return ret;
+	}
+	i2c_set_clientdata(i2c, max98373);
+
+	/* update interleave mode info */
+	if (device_property_read_bool(&i2c->dev, "maxim,interleave_mode"))
+		max98373->interleave_mode = 1;
+	else
+		max98373->interleave_mode = 0;
+
+
+	/* regmap initialization */
+	max98373->regmap
+		= devm_regmap_init_i2c(i2c, &max98373_regmap);
+	if (IS_ERR(max98373->regmap)) {
+		ret = PTR_ERR(max98373->regmap);
+		dev_err(&i2c->dev,
+			"Failed to allocate regmap: %d\n", ret);
+		return ret;
+	}
+
+	/* Check Revision ID */
+	ret = regmap_read(max98373->regmap,
+		MAX98373_R21FF_REV_ID, &reg);
+	if (ret < 0) {
+		dev_err(&i2c->dev,
+			"Failed to read: 0x%02X\n", MAX98373_R21FF_REV_ID);
+		return ret;
+	}
+	dev_info(&i2c->dev, "MAX98373 revisionID: 0x%02X\n", reg);
+
+	/* voltage/current slot configuration */
+	max98373_slot_config(i2c, max98373);
+
+	/* codec registeration */
+	ret = snd_soc_register_codec(&i2c->dev, &soc_codec_dev_max98373,
+		max98373_dai, ARRAY_SIZE(max98373_dai));
+	if (ret < 0)
+		dev_err(&i2c->dev, "Failed to register codec: %d\n", ret);
+
+	return ret;
+}
+
+static int max98373_i2c_remove(struct i2c_client *client)
+{
+	snd_soc_unregister_codec(&client->dev);
+	return 0;
+}
+
+static const struct i2c_device_id max98373_i2c_id[] = {
+	{ "max98373", 0},
+	{ },
+};
+
+MODULE_DEVICE_TABLE(i2c, max98373_i2c_id);
+
+#if defined(CONFIG_OF)
+static const struct of_device_id max98373_of_match[] = {
+	{ .compatible = "maxim,max98373", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, max98373_of_match);
+#endif
+
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id max98373_acpi_match[] = {
+	{ "MX98373", 0 },
+	{},
+};
+MODULE_DEVICE_TABLE(acpi, max98373_acpi_match);
+#endif
+
+static struct i2c_driver max98373_i2c_driver = {
+	.driver = {
+		.name = "max98373",
+		.of_match_table = of_match_ptr(max98373_of_match),
+		.acpi_match_table = ACPI_PTR(max98373_acpi_match),
+		.pm = &max98373_pm,
+	},
+	.probe = max98373_i2c_probe,
+	.remove = max98373_i2c_remove,
+	.id_table = max98373_i2c_id,
+};
+
+module_i2c_driver(max98373_i2c_driver)
+
+MODULE_DESCRIPTION("ALSA SoC MAX98373 driver");
+MODULE_AUTHOR("Ryan Lee <ryans.lee@maximintegrated.com>");
+MODULE_LICENSE("GPL");
diff --git a/sound/soc/codecs/max98373.h b/sound/soc/codecs/max98373.h
new file mode 100644
index 0000000..d0b359d
--- /dev/null
+++ b/sound/soc/codecs/max98373.h
@@ -0,0 +1,212 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2017, Maxim Integrated */
+#ifndef _MAX98373_H
+#define _MAX98373_H
+
+#define MAX98373_R2000_SW_RESET 0x2000
+#define MAX98373_R2001_INT_RAW1 0x2001
+#define MAX98373_R2002_INT_RAW2 0x2002
+#define MAX98373_R2003_INT_RAW3 0x2003
+#define MAX98373_R2004_INT_STATE1 0x2004
+#define MAX98373_R2005_INT_STATE2 0x2005
+#define MAX98373_R2006_INT_STATE3 0x2006
+#define MAX98373_R2007_INT_FLAG1 0x2007
+#define MAX98373_R2008_INT_FLAG2 0x2008
+#define MAX98373_R2009_INT_FLAG3 0x2009
+#define MAX98373_R200A_INT_EN1 0x200A
+#define MAX98373_R200B_INT_EN2 0x200B
+#define MAX98373_R200C_INT_EN3 0x200C
+#define MAX98373_R200D_INT_FLAG_CLR1 0x200D
+#define MAX98373_R200E_INT_FLAG_CLR2 0x200E
+#define MAX98373_R200F_INT_FLAG_CLR3 0x200F
+#define MAX98373_R2010_IRQ_CTRL 0x2010
+#define MAX98373_R2014_THERM_WARN_THRESH 0x2014
+#define MAX98373_R2015_THERM_SHDN_THRESH 0x2015
+#define MAX98373_R2016_THERM_HYSTERESIS 0x2016
+#define MAX98373_R2017_THERM_FOLDBACK_SET 0x2017
+#define MAX98373_R2018_THERM_FOLDBACK_EN 0x2018
+#define MAX98373_R201E_PIN_DRIVE_STRENGTH 0x201E
+#define MAX98373_R2020_PCM_TX_HIZ_EN_1 0x2020
+#define MAX98373_R2021_PCM_TX_HIZ_EN_2 0x2021
+#define MAX98373_R2022_PCM_TX_SRC_1 0x2022
+#define MAX98373_R2023_PCM_TX_SRC_2 0x2023
+#define MAX98373_R2024_PCM_DATA_FMT_CFG	0x2024
+#define MAX98373_R2025_AUDIO_IF_MODE 0x2025
+#define MAX98373_R2026_PCM_CLOCK_RATIO 0x2026
+#define MAX98373_R2027_PCM_SR_SETUP_1 0x2027
+#define MAX98373_R2028_PCM_SR_SETUP_2 0x2028
+#define MAX98373_R2029_PCM_TO_SPK_MONO_MIX_1 0x2029
+#define MAX98373_R202A_PCM_TO_SPK_MONO_MIX_2 0x202A
+#define MAX98373_R202B_PCM_RX_EN 0x202B
+#define MAX98373_R202C_PCM_TX_EN 0x202C
+#define MAX98373_R202E_ICC_RX_CH_EN_1 0x202E
+#define MAX98373_R202F_ICC_RX_CH_EN_2 0x202F
+#define MAX98373_R2030_ICC_TX_HIZ_EN_1 0x2030
+#define MAX98373_R2031_ICC_TX_HIZ_EN_2 0x2031
+#define MAX98373_R2032_ICC_LINK_EN_CFG 0x2032
+#define MAX98373_R2034_ICC_TX_CNTL 0x2034
+#define MAX98373_R2035_ICC_TX_EN 0x2035
+#define MAX98373_R2036_SOUNDWIRE_CTRL 0x2036
+#define MAX98373_R203D_AMP_DIG_VOL_CTRL 0x203D
+#define MAX98373_R203E_AMP_PATH_GAIN 0x203E
+#define MAX98373_R203F_AMP_DSP_CFG 0x203F
+#define MAX98373_R2040_TONE_GEN_CFG 0x2040
+#define MAX98373_R2041_AMP_CFG 0x2041
+#define MAX98373_R2042_AMP_EDGE_RATE_CFG 0x2042
+#define MAX98373_R2043_AMP_EN 0x2043
+#define MAX98373_R2046_IV_SENSE_ADC_DSP_CFG 0x2046
+#define MAX98373_R2047_IV_SENSE_ADC_EN 0x2047
+#define MAX98373_R2051_MEAS_ADC_SAMPLING_RATE 0x2051
+#define MAX98373_R2052_MEAS_ADC_PVDD_FLT_CFG 0x2052
+#define MAX98373_R2053_MEAS_ADC_THERM_FLT_CFG 0x2053
+#define MAX98373_R2054_MEAS_ADC_PVDD_CH_READBACK 0x2054
+#define MAX98373_R2055_MEAS_ADC_THERM_CH_READBACK 0x2055
+#define MAX98373_R2056_MEAS_ADC_PVDD_CH_EN 0x2056
+#define MAX98373_R2090_BDE_LVL_HOLD 0x2090
+#define MAX98373_R2091_BDE_GAIN_ATK_REL_RATE 0x2091
+#define MAX98373_R2092_BDE_CLIPPER_MODE 0x2092
+#define MAX98373_R2097_BDE_L1_THRESH 0x2097
+#define MAX98373_R2098_BDE_L2_THRESH 0x2098
+#define MAX98373_R2099_BDE_L3_THRESH 0x2099
+#define MAX98373_R209A_BDE_L4_THRESH 0x209A
+#define MAX98373_R209B_BDE_THRESH_HYST 0x209B
+#define MAX98373_R20A8_BDE_L1_CFG_1 0x20A8
+#define MAX98373_R20A9_BDE_L1_CFG_2 0x20A9
+#define MAX98373_R20AA_BDE_L1_CFG_3 0x20AA
+#define MAX98373_R20AB_BDE_L2_CFG_1 0x20AB
+#define MAX98373_R20AC_BDE_L2_CFG_2 0x20AC
+#define MAX98373_R20AD_BDE_L2_CFG_3 0x20AD
+#define MAX98373_R20AE_BDE_L3_CFG_1 0x20AE
+#define MAX98373_R20AF_BDE_L3_CFG_2 0x20AF
+#define MAX98373_R20B0_BDE_L3_CFG_3 0x20B0
+#define MAX98373_R20B1_BDE_L4_CFG_1 0x20B1
+#define MAX98373_R20B2_BDE_L4_CFG_2 0x20B2
+#define MAX98373_R20B3_BDE_L4_CFG_3 0x20B3
+#define MAX98373_R20B4_BDE_INFINITE_HOLD_RELEASE 0x20B4
+#define MAX98373_R20B5_BDE_EN 0x20B5
+#define MAX98373_R20B6_BDE_CUR_STATE_READBACK 0x20B6
+#define MAX98373_R20D1_DHT_CFG 0x20D1
+#define MAX98373_R20D2_DHT_ATTACK_CFG 0x20D2
+#define MAX98373_R20D3_DHT_RELEASE_CFG 0x20D3
+#define MAX98373_R20D4_DHT_EN 0x20D4
+#define MAX98373_R20E0_LIMITER_THRESH_CFG 0x20E0
+#define MAX98373_R20E1_LIMITER_ATK_REL_RATES 0x20E1
+#define MAX98373_R20E2_LIMITER_EN 0x20E2
+#define MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG 0x20FE
+#define MAX98373_R20FF_GLOBAL_SHDN 0x20FF
+#define MAX98373_R21FF_REV_ID 0x21FF
+
+/* MAX98373_R2022_PCM_TX_SRC_1 */
+#define MAX98373_PCM_TX_CH_SRC_A_V_SHIFT (0)
+#define MAX98373_PCM_TX_CH_SRC_A_I_SHIFT (4)
+
+/* MAX98373_R2024_PCM_DATA_FMT_CFG */
+#define MAX98373_PCM_MODE_CFG_FORMAT_MASK (0x7 << 3)
+#define MAX98373_PCM_MODE_CFG_FORMAT_SHIFT (3)
+#define MAX98373_PCM_TX_CH_INTERLEAVE_MASK (0x1 << 2)
+#define MAX98373_PCM_FORMAT_I2S (0x0 << 0)
+#define MAX98373_PCM_FORMAT_LJ (0x1 << 0)
+#define MAX98373_PCM_FORMAT_TDM_MODE0 (0x3 << 0)
+#define MAX98373_PCM_FORMAT_TDM_MODE1 (0x4 << 0)
+#define MAX98373_PCM_FORMAT_TDM_MODE2 (0x5 << 0)
+#define MAX98373_PCM_MODE_CFG_CHANSZ_MASK (0x3 << 6)
+#define MAX98373_PCM_MODE_CFG_CHANSZ_16 (0x1 << 6)
+#define MAX98373_PCM_MODE_CFG_CHANSZ_24 (0x2 << 6)
+#define MAX98373_PCM_MODE_CFG_CHANSZ_32 (0x3 << 6)
+
+/* MAX98373_R2026_PCM_CLOCK_RATIO */
+#define MAX98373_PCM_MODE_CFG_PCM_BCLKEDGE (0x1 << 4)
+#define MAX98373_PCM_CLK_SETUP_BSEL_MASK (0xF << 0)
+
+/* MAX98373_R2027_PCM_SR_SETUP_1 */
+#define MAX98373_PCM_SR_SET1_SR_MASK (0xF << 0)
+#define MAX98373_PCM_SR_SET1_SR_8000 (0x0 << 0)
+#define MAX98373_PCM_SR_SET1_SR_11025 (0x1 << 0)
+#define MAX98373_PCM_SR_SET1_SR_12000 (0x2 << 0)
+#define MAX98373_PCM_SR_SET1_SR_16000 (0x3 << 0)
+#define MAX98373_PCM_SR_SET1_SR_22050 (0x4 << 0)
+#define MAX98373_PCM_SR_SET1_SR_24000 (0x5 << 0)
+#define MAX98373_PCM_SR_SET1_SR_32000 (0x6 << 0)
+#define MAX98373_PCM_SR_SET1_SR_44100 (0x7 << 0)
+#define MAX98373_PCM_SR_SET1_SR_48000 (0x8 << 0)
+
+/* MAX98373_R2028_PCM_SR_SETUP_2 */
+#define MAX98373_PCM_SR_SET2_SR_MASK (0xF << 4)
+#define MAX98373_PCM_SR_SET2_SR_SHIFT (4)
+#define MAX98373_PCM_SR_SET2_IVADC_SR_MASK (0xF << 0)
+
+/* MAX98373_R2029_PCM_TO_SPK_MONO_MIX_1 */
+#define MAX98373_PCM_TO_SPK_MONOMIX_CFG_MASK (0x3 << 6)
+#define MAX98373_PCM_TO_SPK_MONOMIX_CFG_SHIFT (6)
+#define MAX98373_PCM_TO_SPK_CH0_SRC_MASK (0xF << 0)
+
+/* MAX98373_R203E_AMP_PATH_GAIN */
+#define MAX98373_SPK_DIGI_GAIN_MASK (0xF << 4)
+#define MAX98373_SPK_DIGI_GAIN_SHIFT (4)
+#define MAX98373_FS_GAIN_MAX_MASK (0xF << 0)
+#define MAX98373_FS_GAIN_MAX_SHIFT (0)
+
+/* MAX98373_R203F_AMP_DSP_CFG */
+#define MAX98373_AMP_DSP_CFG_DCBLK_SHIFT (0)
+#define MAX98373_AMP_DSP_CFG_DITH_SHIFT (1)
+#define MAX98373_AMP_DSP_CFG_RMP_UP_SHIFT (2)
+#define MAX98373_AMP_DSP_CFG_RMP_DN_SHIFT (3)
+#define MAX98373_AMP_DSP_CFG_DAC_INV_SHIFT (5)
+#define MAX98373_AMP_VOL_SEL_SHIFT (7)
+
+/* MAX98373_R2043_AMP_EN */
+#define MAX98373_SPKFB_EN_MASK (0x1 << 1)
+#define MAX98373_SPK_EN_MASK (0x1 << 0)
+#define MAX98373_SPKFB_EN_SHIFT (1)
+
+/*MAX98373_R2052_MEAS_ADC_PVDD_FLT_CFG */
+#define MAX98373_FLT_EN_SHIFT (4)
+
+/* MAX98373_R20B2_BDE_L4_CFG_2 */
+#define MAX98373_LVL4_MUTE_EN_SHIFT (7)
+#define MAX98373_LVL4_HOLD_EN_SHIFT (6)
+
+/* MAX98373_R20B5_BDE_EN */
+#define MAX98373_BDE_EN_SHIFT (0)
+
+/* MAX98373_R20D1_DHT_CFG */
+#define MAX98373_DHT_SPK_GAIN_MIN_SHIFT	(4)
+#define MAX98373_DHT_ROT_PNT_SHIFT	(0)
+
+/* MAX98373_R20D2_DHT_ATTACK_CFG */
+#define MAX98373_DHT_ATTACK_STEP_SHIFT (3)
+#define MAX98373_DHT_ATTACK_RATE_SHIFT (0)
+
+/* MAX98373_R20D3_DHT_RELEASE_CFG */
+#define MAX98373_DHT_RELEASE_STEP_SHIFT (3)
+#define MAX98373_DHT_RELEASE_RATE_SHIFT (0)
+
+/* MAX98373_R20D4_DHT_EN */
+#define MAX98373_DHT_EN_SHIFT (0)
+
+/* MAX98373_R20E0_LIMITER_THRESH_CFG */
+#define MAX98373_LIMITER_THRESH_SHIFT (2)
+#define MAX98373_LIMITER_THRESH_SRC_SHIFT (0)
+
+/* MAX98373_R20E2_LIMITER_EN */
+#define MAX98373_LIMITER_EN_SHIFT (0)
+
+/* MAX98373_R20FE_DEVICE_AUTO_RESTART_CFG */
+#define MAX98373_CLOCK_MON_SHIFT (0)
+
+/* MAX98373_R20FF_GLOBAL_SHDN */
+#define MAX98373_GLOBAL_EN_MASK (0x1 << 0)
+
+/* MAX98373_R2000_SW_RESET */
+#define MAX98373_SOFT_RESET (0x1 << 0)
+
+struct max98373_priv {
+	struct regmap *regmap;
+	unsigned int v_slot;
+	unsigned int i_slot;
+	unsigned int spkfb_slot;
+	bool interleave_mode;
+	unsigned int ch_size;
+	bool tdm_mode;
+};
+#endif
diff --git a/sound/soc/codecs/max98926.c b/sound/soc/codecs/max98926.c
index 03d07bf..7b1d1b0 100644
--- a/sound/soc/codecs/max98926.c
+++ b/sound/soc/codecs/max98926.c
@@ -490,7 +490,7 @@ static int max98926_probe(struct snd_soc_codec *codec)
 	struct max98926_priv *max98926 = snd_soc_codec_get_drvdata(codec);
 
 	max98926->codec = codec;
-	codec->control_data = max98926->regmap;
+
 	/* Hi-Z all the slots */
 	regmap_write(max98926->regmap, MAX98926_DOUT_HIZ_CFG4, 0xF0);
 	return 0;
diff --git a/sound/soc/codecs/max98927.c b/sound/soc/codecs/max98927.c
index a1d3935..f701fdc8 100644
--- a/sound/soc/codecs/max98927.c
+++ b/sound/soc/codecs/max98927.c
@@ -682,7 +682,6 @@ static int max98927_probe(struct snd_soc_codec *codec)
 	struct max98927_priv *max98927 = snd_soc_codec_get_drvdata(codec);
 
 	max98927->codec = codec;
-	codec->control_data = max98927->regmap;
 
 	/* Software Reset */
 	regmap_write(max98927->regmap,
diff --git a/sound/soc/codecs/mc13783.c b/sound/soc/codecs/mc13783.c
index 4fd8d1d..be7a45f 100644
--- a/sound/soc/codecs/mc13783.c
+++ b/sound/soc/codecs/mc13783.c
@@ -610,6 +610,9 @@ static int mc13783_probe(struct snd_soc_codec *codec)
 {
 	struct mc13783_priv *priv = snd_soc_codec_get_drvdata(codec);
 
+	snd_soc_codec_init_regmap(codec,
+				  dev_get_regmap(codec->dev->parent, NULL));
+
 	/* these are the reset values */
 	mc13xxx_reg_write(priv->mc13xxx, MC13783_AUDIO_RX0, 0x25893);
 	mc13xxx_reg_write(priv->mc13xxx, MC13783_AUDIO_RX1, 0x00d35A);
@@ -728,15 +731,9 @@ static struct snd_soc_dai_driver mc13783_dai_sync[] = {
 	}
 };
 
-static struct regmap *mc13783_get_regmap(struct device *dev)
-{
-	return dev_get_regmap(dev->parent, NULL);
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_mc13783 = {
 	.probe		= mc13783_probe,
 	.remove		= mc13783_remove,
-	.get_regmap	= mc13783_get_regmap,
 	.component_driver = {
 		.controls		= mc13783_control_list,
 		.num_controls		= ARRAY_SIZE(mc13783_control_list),
diff --git a/sound/soc/codecs/msm8916-wcd-analog.c b/sound/soc/codecs/msm8916-wcd-analog.c
index 066ea2f..44062bb 100644
--- a/sound/soc/codecs/msm8916-wcd-analog.c
+++ b/sound/soc/codecs/msm8916-wcd-analog.c
@@ -712,6 +712,8 @@ static int pm8916_wcd_analog_probe(struct snd_soc_codec *codec)
 		return err;
 	}
 
+	snd_soc_codec_init_regmap(codec,
+				  dev_get_regmap(codec->dev->parent, NULL));
 	snd_soc_codec_set_drvdata(codec, priv);
 	priv->pmic_rev = snd_soc_read(codec, CDC_D_REVISION1);
 	priv->codec_version = snd_soc_read(codec, CDC_D_PERPH_SUBTYPE);
@@ -943,11 +945,6 @@ static int pm8916_wcd_analog_set_jack(struct snd_soc_codec *codec,
 	return 0;
 }
 
-static struct regmap *pm8916_get_regmap(struct device *dev)
-{
-	return dev_get_regmap(dev->parent, NULL);
-}
-
 static irqreturn_t mbhc_btn_release_irq_handler(int irq, void *arg)
 {
 	struct pm8916_wcd_analog_priv *priv = arg;
@@ -1082,7 +1079,6 @@ static const struct snd_soc_codec_driver pm8916_wcd_analog = {
 	.probe = pm8916_wcd_analog_probe,
 	.remove = pm8916_wcd_analog_remove,
 	.set_jack = pm8916_wcd_analog_set_jack,
-	.get_regmap = pm8916_get_regmap,
 	.component_driver = {
 		.controls = pm8916_wcd_analog_snd_controls,
 		.num_controls = ARRAY_SIZE(pm8916_wcd_analog_snd_controls),
diff --git a/sound/soc/codecs/nau8540.c b/sound/soc/codecs/nau8540.c
index f9c9933..b08fb7e 100644
--- a/sound/soc/codecs/nau8540.c
+++ b/sound/soc/codecs/nau8540.c
@@ -233,6 +233,41 @@ static SOC_ENUM_SINGLE_DECL(
 static const struct snd_kcontrol_new digital_ch1_mux =
 	SOC_DAPM_ENUM("Digital CH1 Select", digital_ch1_enum);
 
+static int adc_power_control(struct snd_soc_dapm_widget *w,
+		struct snd_kcontrol *k, int  event)
+{
+	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
+	struct nau8540 *nau8540 = snd_soc_codec_get_drvdata(codec);
+
+	if (SND_SOC_DAPM_EVENT_ON(event)) {
+		msleep(300);
+		/* DO12 and DO34 pad output enable */
+		regmap_update_bits(nau8540->regmap, NAU8540_REG_PCM_CTRL1,
+			NAU8540_I2S_DO12_TRI, 0);
+		regmap_update_bits(nau8540->regmap, NAU8540_REG_PCM_CTRL2,
+			NAU8540_I2S_DO34_TRI, 0);
+	} else if (SND_SOC_DAPM_EVENT_OFF(event)) {
+		regmap_update_bits(nau8540->regmap, NAU8540_REG_PCM_CTRL1,
+			NAU8540_I2S_DO12_TRI, NAU8540_I2S_DO12_TRI);
+		regmap_update_bits(nau8540->regmap, NAU8540_REG_PCM_CTRL2,
+			NAU8540_I2S_DO34_TRI, NAU8540_I2S_DO34_TRI);
+	}
+	return 0;
+}
+
+static int aiftx_power_control(struct snd_soc_dapm_widget *w,
+		struct snd_kcontrol *k, int  event)
+{
+	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
+	struct nau8540 *nau8540 = snd_soc_codec_get_drvdata(codec);
+
+	if (SND_SOC_DAPM_EVENT_OFF(event)) {
+		regmap_write(nau8540->regmap, NAU8540_REG_RST, 0x0001);
+		regmap_write(nau8540->regmap, NAU8540_REG_RST, 0x0000);
+	}
+	return 0;
+}
+
 static const struct snd_soc_dapm_widget nau8540_dapm_widgets[] = {
 	SND_SOC_DAPM_SUPPLY("MICBIAS2", NAU8540_REG_MIC_BIAS, 11, 0, NULL, 0),
 	SND_SOC_DAPM_SUPPLY("MICBIAS1", NAU8540_REG_MIC_BIAS, 10, 0, NULL, 0),
@@ -247,14 +282,18 @@ static const struct snd_soc_dapm_widget nau8540_dapm_widgets[] = {
 	SND_SOC_DAPM_PGA("Frontend PGA3", NAU8540_REG_PWR, 14, 0, NULL, 0),
 	SND_SOC_DAPM_PGA("Frontend PGA4", NAU8540_REG_PWR, 15, 0, NULL, 0),
 
-	SND_SOC_DAPM_ADC("ADC1", NULL,
-		NAU8540_REG_POWER_MANAGEMENT, 0, 0),
-	SND_SOC_DAPM_ADC("ADC2", NULL,
-		NAU8540_REG_POWER_MANAGEMENT, 1, 0),
-	SND_SOC_DAPM_ADC("ADC3", NULL,
-		NAU8540_REG_POWER_MANAGEMENT, 2, 0),
-	SND_SOC_DAPM_ADC("ADC4", NULL,
-		NAU8540_REG_POWER_MANAGEMENT, 3, 0),
+	SND_SOC_DAPM_ADC_E("ADC1", NULL,
+		NAU8540_REG_POWER_MANAGEMENT, 0, 0, adc_power_control,
+		SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_PRE_PMD),
+	SND_SOC_DAPM_ADC_E("ADC2", NULL,
+		NAU8540_REG_POWER_MANAGEMENT, 1, 0, adc_power_control,
+		SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_PRE_PMD),
+	SND_SOC_DAPM_ADC_E("ADC3", NULL,
+		NAU8540_REG_POWER_MANAGEMENT, 2, 0, adc_power_control,
+		SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_PRE_PMD),
+	SND_SOC_DAPM_ADC_E("ADC4", NULL,
+		NAU8540_REG_POWER_MANAGEMENT, 3, 0, adc_power_control,
+		SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_PRE_PMD),
 
 	SND_SOC_DAPM_PGA("ADC CH1", NAU8540_REG_ANALOG_PWR, 0, 0, NULL, 0),
 	SND_SOC_DAPM_PGA("ADC CH2", NAU8540_REG_ANALOG_PWR, 1, 0, NULL, 0),
@@ -270,7 +309,8 @@ static const struct snd_soc_dapm_widget nau8540_dapm_widgets[] = {
 	SND_SOC_DAPM_MUX("Digital CH1 Mux",
 		SND_SOC_NOPM, 0, 0, &digital_ch1_mux),
 
-	SND_SOC_DAPM_AIF_OUT("AIFTX", "Capture", 0, SND_SOC_NOPM, 0, 0),
+	SND_SOC_DAPM_AIF_OUT_E("AIFTX", "Capture", 0, SND_SOC_NOPM, 0, 0,
+		aiftx_power_control, SND_SOC_DAPM_POST_PMD),
 };
 
 static const struct snd_soc_dapm_route nau8540_dapm_routes[] = {
@@ -575,7 +615,8 @@ static void nau8540_fll_apply(struct regmap *regmap,
 		NAU8540_CLK_SRC_MASK | NAU8540_CLK_MCLK_SRC_MASK,
 		NAU8540_CLK_SRC_MCLK | fll_param->mclk_src);
 	regmap_update_bits(regmap, NAU8540_REG_FLL1,
-		NAU8540_FLL_RATIO_MASK, fll_param->ratio);
+		NAU8540_FLL_RATIO_MASK | NAU8540_ICTRL_LATCH_MASK,
+		fll_param->ratio | (0x6 << NAU8540_ICTRL_LATCH_SFT));
 	/* FLL 16-bit fractional input */
 	regmap_write(regmap, NAU8540_REG_FLL2, fll_param->fll_frac);
 	/* FLL 10-bit integer input */
@@ -596,13 +637,14 @@ static void nau8540_fll_apply(struct regmap *regmap,
 			NAU8540_FLL_PDB_DAC_EN | NAU8540_FLL_LOOP_FTR_EN |
 			NAU8540_FLL_FTR_SW_FILTER);
 		regmap_update_bits(regmap, NAU8540_REG_FLL6,
-			NAU8540_SDM_EN, NAU8540_SDM_EN);
+			NAU8540_SDM_EN | NAU8540_CUTOFF500,
+			NAU8540_SDM_EN | NAU8540_CUTOFF500);
 	} else {
 		regmap_update_bits(regmap, NAU8540_REG_FLL5,
 			NAU8540_FLL_PDB_DAC_EN | NAU8540_FLL_LOOP_FTR_EN |
 			NAU8540_FLL_FTR_SW_MASK, NAU8540_FLL_FTR_SW_ACCU);
-		regmap_update_bits(regmap,
-			NAU8540_REG_FLL6, NAU8540_SDM_EN, 0);
+		regmap_update_bits(regmap, NAU8540_REG_FLL6,
+			NAU8540_SDM_EN | NAU8540_CUTOFF500, 0);
 	}
 }
 
@@ -617,17 +659,22 @@ static int nau8540_set_pll(struct snd_soc_codec *codec, int pll_id, int source,
 	switch (pll_id) {
 	case NAU8540_CLK_FLL_MCLK:
 		regmap_update_bits(nau8540->regmap, NAU8540_REG_FLL3,
-			NAU8540_FLL_CLK_SRC_MASK, NAU8540_FLL_CLK_SRC_MCLK);
+			NAU8540_FLL_CLK_SRC_MASK | NAU8540_GAIN_ERR_MASK,
+			NAU8540_FLL_CLK_SRC_MCLK | 0);
 		break;
 
 	case NAU8540_CLK_FLL_BLK:
 		regmap_update_bits(nau8540->regmap, NAU8540_REG_FLL3,
-			NAU8540_FLL_CLK_SRC_MASK, NAU8540_FLL_CLK_SRC_BLK);
+			NAU8540_FLL_CLK_SRC_MASK | NAU8540_GAIN_ERR_MASK,
+			NAU8540_FLL_CLK_SRC_BLK |
+			(0xf << NAU8540_GAIN_ERR_SFT));
 		break;
 
 	case NAU8540_CLK_FLL_FS:
 		regmap_update_bits(nau8540->regmap, NAU8540_REG_FLL3,
-			NAU8540_FLL_CLK_SRC_MASK, NAU8540_FLL_CLK_SRC_FS);
+			NAU8540_FLL_CLK_SRC_MASK | NAU8540_GAIN_ERR_MASK,
+			NAU8540_FLL_CLK_SRC_FS |
+			(0xf << NAU8540_GAIN_ERR_SFT));
 		break;
 
 	default:
@@ -710,9 +757,24 @@ static void nau8540_init_regs(struct nau8540 *nau8540)
 	regmap_update_bits(regmap, NAU8540_REG_CLOCK_CTRL,
 		NAU8540_CLK_ADC_EN | NAU8540_CLK_I2S_EN,
 		NAU8540_CLK_ADC_EN | NAU8540_CLK_I2S_EN);
-	/* ADC OSR selection, CLK_ADC = Fs * OSR */
+	/* ADC OSR selection, CLK_ADC = Fs * OSR;
+	 * Channel time alignment enable.
+	 */
 	regmap_update_bits(regmap, NAU8540_REG_ADC_SAMPLE_RATE,
-		NAU8540_ADC_OSR_MASK, NAU8540_ADC_OSR_64);
+		NAU8540_CH_SYNC | NAU8540_ADC_OSR_MASK,
+		NAU8540_CH_SYNC | NAU8540_ADC_OSR_64);
+	/* PGA input mode selection */
+	regmap_update_bits(regmap, NAU8540_REG_FEPGA1,
+		NAU8540_FEPGA1_MODCH2_SHT | NAU8540_FEPGA1_MODCH1_SHT,
+		NAU8540_FEPGA1_MODCH2_SHT | NAU8540_FEPGA1_MODCH1_SHT);
+	regmap_update_bits(regmap, NAU8540_REG_FEPGA2,
+		NAU8540_FEPGA2_MODCH4_SHT | NAU8540_FEPGA2_MODCH3_SHT,
+		NAU8540_FEPGA2_MODCH4_SHT | NAU8540_FEPGA2_MODCH3_SHT);
+	/* DO12 and DO34 pad output disable */
+	regmap_update_bits(regmap, NAU8540_REG_PCM_CTRL1,
+		NAU8540_I2S_DO12_TRI, NAU8540_I2S_DO12_TRI);
+	regmap_update_bits(regmap, NAU8540_REG_PCM_CTRL2,
+		NAU8540_I2S_DO34_TRI, NAU8540_I2S_DO34_TRI);
 }
 
 static int __maybe_unused nau8540_suspend(struct snd_soc_codec *codec)
diff --git a/sound/soc/codecs/nau8540.h b/sound/soc/codecs/nau8540.h
index 5db5b22..732b490 100644
--- a/sound/soc/codecs/nau8540.h
+++ b/sound/soc/codecs/nau8540.h
@@ -100,9 +100,13 @@
 #define NAU8540_CLK_MCLK_SRC_MASK	0xf
 
 /* FLL1 (0x04) */
+#define NAU8540_ICTRL_LATCH_SFT	10
+#define NAU8540_ICTRL_LATCH_MASK	(0x7 << NAU8540_ICTRL_LATCH_SFT)
 #define NAU8540_FLL_RATIO_MASK	0x7f
 
 /* FLL3 (0x06) */
+#define NAU8540_GAIN_ERR_SFT		12
+#define NAU8540_GAIN_ERR_MASK		(0xf << NAU8540_GAIN_ERR_SFT)
 #define NAU8540_FLL_CLK_SRC_SFT	10
 #define NAU8540_FLL_CLK_SRC_MASK	(0x3 << NAU8540_FLL_CLK_SRC_SFT)
 #define NAU8540_FLL_CLK_SRC_MCLK	(0 << NAU8540_FLL_CLK_SRC_SFT)
@@ -127,6 +131,7 @@
 /* FLL6 (0x9) */
 #define NAU8540_DCO_EN			(0x1 << 15)
 #define NAU8540_SDM_EN			(0x1 << 14)
+#define NAU8540_CUTOFF500		(0x1 << 13)
 
 /* PCM_CTRL0 (0x10) */
 #define NAU8540_I2S_BP_SFT		7
@@ -146,6 +151,7 @@
 #define NAU8540_I2S_DF_PCM_AB		0x3
 
 /* PCM_CTRL1 (0x11) */
+#define NAU8540_I2S_DO12_TRI		(0x1 << 15)
 #define NAU8540_I2S_LRC_DIV_SFT	12
 #define NAU8540_I2S_LRC_DIV_MASK	(0x3 << NAU8540_I2S_LRC_DIV_SFT)
 #define NAU8540_I2S_DO12_OE		(0x1 << 4)
@@ -156,6 +162,7 @@
 #define NAU8540_I2S_BLK_DIV_MASK	0x7
 
 /* PCM_CTRL1 (0x12) */
+#define NAU8540_I2S_DO34_TRI		(0x1 << 15)
 #define NAU8540_I2S_DO34_OE		(0x1 << 11)
 #define NAU8540_I2S_TSLOT_L_MASK	0x3ff
 
@@ -165,6 +172,7 @@
 #define NAU8540_TDM_TX_MASK		0xf
 
 /* ADC_SAMPLE_RATE (0x3A) */
+#define NAU8540_CH_SYNC		(0x1 << 14)
 #define NAU8540_ADC_OSR_MASK		0x3
 #define NAU8540_ADC_OSR_256		0x3
 #define NAU8540_ADC_OSR_128		0x2
@@ -183,6 +191,18 @@
 #define NAU8540_PRECHARGE_DIS		(0x1 << 13)
 #define NAU8540_GLOBAL_BIAS_EN	(0x1 << 12)
 
+/* FEPGA1 (0x69) */
+#define NAU8540_FEPGA1_MODCH2_SHT_SFT	7
+#define NAU8540_FEPGA1_MODCH2_SHT	(0x1 << NAU8540_FEPGA1_MODCH2_SHT_SFT)
+#define NAU8540_FEPGA1_MODCH1_SHT_SFT	3
+#define NAU8540_FEPGA1_MODCH1_SHT	(0x1 << NAU8540_FEPGA1_MODCH1_SHT_SFT)
+
+/* FEPGA2 (0x6A) */
+#define NAU8540_FEPGA2_MODCH4_SHT_SFT	7
+#define NAU8540_FEPGA2_MODCH4_SHT	(0x1 << NAU8540_FEPGA2_MODCH4_SHT_SFT)
+#define NAU8540_FEPGA2_MODCH3_SHT_SFT	3
+#define NAU8540_FEPGA2_MODCH3_SHT	(0x1 << NAU8540_FEPGA2_MODCH3_SHT_SFT)
+
 
 /* System Clock Source */
 enum {
diff --git a/sound/soc/codecs/nau8824.c b/sound/soc/codecs/nau8824.c
index 0240759..088e0ce 100644
--- a/sound/soc/codecs/nau8824.c
+++ b/sound/soc/codecs/nau8824.c
@@ -43,7 +43,7 @@ static bool nau8824_is_jack_inserted(struct nau8824 *nau8824);
 
 /* the parameter threshold of FLL */
 #define NAU_FREF_MAX 13500000
-#define NAU_FVCO_MAX 124000000
+#define NAU_FVCO_MAX 100000000
 #define NAU_FVCO_MIN 90000000
 
 /* scaling for mclk from sysclk_src output */
@@ -811,7 +811,8 @@ static void nau8824_eject_jack(struct nau8824 *nau8824)
 		NAU8824_JD_SLEEP_MODE, NAU8824_JD_SLEEP_MODE);
 
 	/* Close clock for jack type detection at manual mode */
-	nau8824_config_sysclk(nau8824, NAU8824_CLK_DIS, 0);
+	if (dapm->bias_level < SND_SOC_BIAS_PREPARE)
+		nau8824_config_sysclk(nau8824, NAU8824_CLK_DIS, 0);
 }
 
 static void nau8824_jdet_work(struct work_struct *work)
@@ -843,6 +844,11 @@ static void nau8824_jdet_work(struct work_struct *work)
 	event_mask |= SND_JACK_HEADSET;
 	snd_soc_jack_report(nau8824->jack, event, event_mask);
 
+	/* Enable short key press and release interruption. */
+	regmap_update_bits(regmap, NAU8824_REG_INTERRUPT_SETTING,
+		NAU8824_IRQ_KEY_RELEASE_DIS |
+		NAU8824_IRQ_KEY_SHORT_PRESS_DIS, 0);
+
 	nau8824_sema_release(nau8824);
 }
 
@@ -850,15 +856,15 @@ static void nau8824_setup_auto_irq(struct nau8824 *nau8824)
 {
 	struct regmap *regmap = nau8824->regmap;
 
-	/* Enable jack ejection, short key press and release interruption. */
+	/* Enable jack ejection interruption. */
 	regmap_update_bits(regmap, NAU8824_REG_INTERRUPT_SETTING_1,
 		NAU8824_IRQ_INSERT_EN | NAU8824_IRQ_EJECT_EN,
 		NAU8824_IRQ_EJECT_EN);
 	regmap_update_bits(regmap, NAU8824_REG_INTERRUPT_SETTING,
-		NAU8824_IRQ_EJECT_DIS | NAU8824_IRQ_KEY_RELEASE_DIS |
-		NAU8824_IRQ_KEY_SHORT_PRESS_DIS, 0);
+		NAU8824_IRQ_EJECT_DIS, 0);
 	/* Enable internal VCO needed for interruptions */
-	nau8824_config_sysclk(nau8824, NAU8824_CLK_INTERNAL, 0);
+	if (nau8824->dapm->bias_level < SND_SOC_BIAS_PREPARE)
+		nau8824_config_sysclk(nau8824, NAU8824_CLK_INTERNAL, 0);
 	regmap_update_bits(regmap, NAU8824_REG_ENA_CTRL,
 		NAU8824_JD_SLEEP_MODE, 0);
 }
diff --git a/sound/soc/codecs/nau8825.c b/sound/soc/codecs/nau8825.c
index e853a6d..a1b697b 100644
--- a/sound/soc/codecs/nau8825.c
+++ b/sound/soc/codecs/nau8825.c
@@ -194,10 +194,10 @@ static const struct reg_default nau8825_reg_defaults[] = {
 
 /* register backup table when cross talk detection */
 static struct reg_default nau8825_xtalk_baktab[] = {
-	{ NAU8825_REG_ADC_DGAIN_CTRL, 0 },
+	{ NAU8825_REG_ADC_DGAIN_CTRL, 0x00cf },
 	{ NAU8825_REG_HSVOL_CTRL, 0 },
-	{ NAU8825_REG_DACL_CTRL, 0 },
-	{ NAU8825_REG_DACR_CTRL, 0 },
+	{ NAU8825_REG_DACL_CTRL, 0x00cf },
+	{ NAU8825_REG_DACR_CTRL, 0x02cf },
 };
 
 static const unsigned short logtable[256] = {
@@ -245,13 +245,14 @@ static const unsigned short logtable[256] = {
  * tasks are allowed to acquire the semaphore, calling this function will
  * put the task to sleep. If the semaphore is not released within the
  * specified number of jiffies, this function returns.
- * Acquires the semaphore without jiffies. If no more tasks are allowed
- * to acquire the semaphore, calling this function will put the task to
- * sleep until the semaphore is released.
  * If the semaphore is not released within the specified number of jiffies,
- * this function returns -ETIME.
- * If the sleep is interrupted by a signal, this function will return -EINTR.
- * It returns 0 if the semaphore was acquired successfully.
+ * this function returns -ETIME. If the sleep is interrupted by a signal,
+ * this function will return -EINTR. It returns 0 if the semaphore was
+ * acquired successfully.
+ *
+ * Acquires the semaphore without jiffies. Try to acquire the semaphore
+ * atomically. Returns 0 if the semaphore has been acquired successfully
+ * or 1 if it it cannot be acquired.
  */
 static int nau8825_sema_acquire(struct nau8825 *nau8825, long timeout)
 {
@@ -262,8 +263,8 @@ static int nau8825_sema_acquire(struct nau8825 *nau8825, long timeout)
 		if (ret < 0)
 			dev_warn(nau8825->dev, "Acquire semaphore timeout\n");
 	} else {
-		ret = down_interruptible(&nau8825->xtalk_sem);
-		if (ret < 0)
+		ret = down_trylock(&nau8825->xtalk_sem);
+		if (ret)
 			dev_warn(nau8825->dev, "Acquire semaphore fail\n");
 	}
 
@@ -454,22 +455,32 @@ static void nau8825_xtalk_backup(struct nau8825 *nau8825)
 {
 	int i;
 
+	if (nau8825->xtalk_baktab_initialized)
+		return;
+
 	/* Backup some register values to backup table */
 	for (i = 0; i < ARRAY_SIZE(nau8825_xtalk_baktab); i++)
 		regmap_read(nau8825->regmap, nau8825_xtalk_baktab[i].reg,
 				&nau8825_xtalk_baktab[i].def);
+
+	nau8825->xtalk_baktab_initialized = true;
 }
 
-static void nau8825_xtalk_restore(struct nau8825 *nau8825)
+static void nau8825_xtalk_restore(struct nau8825 *nau8825, bool cause_cancel)
 {
 	int i, volume;
 
+	if (!nau8825->xtalk_baktab_initialized)
+		return;
+
 	/* Restore register values from backup table; When the driver restores
-	 * the headphone volumem, it needs recover to original level gradually
-	 * with 3dB per step for less pop noise.
+	 * the headphone volume in XTALK_DONE state, it needs recover to
+	 * original level gradually with 3dB per step for less pop noise.
+	 * Otherwise, the restore should do ASAP.
 	 */
 	for (i = 0; i < ARRAY_SIZE(nau8825_xtalk_baktab); i++) {
-		if (nau8825_xtalk_baktab[i].reg == NAU8825_REG_HSVOL_CTRL) {
+		if (!cause_cancel && nau8825_xtalk_baktab[i].reg ==
+			NAU8825_REG_HSVOL_CTRL) {
 			/* Ramping up the volume change to reduce pop noise */
 			volume = nau8825_xtalk_baktab[i].def &
 				NAU8825_HPR_VOL_MASK;
@@ -479,6 +490,8 @@ static void nau8825_xtalk_restore(struct nau8825 *nau8825)
 		regmap_write(nau8825->regmap, nau8825_xtalk_baktab[i].reg,
 				nau8825_xtalk_baktab[i].def);
 	}
+
+	nau8825->xtalk_baktab_initialized = false;
 }
 
 static void nau8825_xtalk_prepare_dac(struct nau8825 *nau8825)
@@ -644,7 +657,7 @@ static void nau8825_xtalk_clean_adc(struct nau8825 *nau8825)
 		NAU8825_POWERUP_ADCL | NAU8825_ADC_VREFSEL_MASK, 0);
 }
 
-static void nau8825_xtalk_clean(struct nau8825 *nau8825)
+static void nau8825_xtalk_clean(struct nau8825 *nau8825, bool cause_cancel)
 {
 	/* Enable internal VCO needed for interruptions */
 	nau8825_configure_sysclk(nau8825, NAU8825_CLK_INTERNAL, 0);
@@ -660,7 +673,7 @@ static void nau8825_xtalk_clean(struct nau8825 *nau8825)
 		NAU8825_I2S_MS_MASK | NAU8825_I2S_LRC_DIV_MASK |
 		NAU8825_I2S_BLK_DIV_MASK, NAU8825_I2S_MS_SLAVE);
 	/* Restore value of specific register for cross talk */
-	nau8825_xtalk_restore(nau8825);
+	nau8825_xtalk_restore(nau8825, cause_cancel);
 }
 
 static void nau8825_xtalk_imm_start(struct nau8825 *nau8825, int vol)
@@ -779,7 +792,7 @@ static void nau8825_xtalk_measure(struct nau8825 *nau8825)
 		dev_dbg(nau8825->dev, "cross talk sidetone: %x\n", sidetone);
 		regmap_write(nau8825->regmap, NAU8825_REG_DAC_DGAIN_CTRL,
 					(sidetone << 8) | sidetone);
-		nau8825_xtalk_clean(nau8825);
+		nau8825_xtalk_clean(nau8825, false);
 		nau8825->xtalk_state = NAU8825_XTALK_DONE;
 		break;
 	default:
@@ -815,13 +828,14 @@ static void nau8825_xtalk_work(struct work_struct *work)
 
 static void nau8825_xtalk_cancel(struct nau8825 *nau8825)
 {
-	/* If the xtalk_protect is true, that means the process is still
-	 * on going. The driver forces to cancel the cross talk task and
+	/* If the crosstalk is eanbled and the process is on going,
+	 * the driver forces to cancel the crosstalk task and
 	 * restores the configuration to original status.
 	 */
-	if (nau8825->xtalk_protect) {
+	if (nau8825->xtalk_enable && nau8825->xtalk_state !=
+		NAU8825_XTALK_DONE) {
 		cancel_work_sync(&nau8825->xtalk_work);
-		nau8825_xtalk_clean(nau8825);
+		nau8825_xtalk_clean(nau8825, true);
 	}
 	/* Reset parameters for cross talk suppression function */
 	nau8825_sema_reset(nau8825);
@@ -1246,8 +1260,10 @@ static int nau8825_hw_params(struct snd_pcm_substream *substream,
 		regmap_read(nau8825->regmap, NAU8825_REG_DAC_CTRL1, &osr);
 		osr &= NAU8825_DAC_OVERSAMPLE_MASK;
 		if (nau8825_clock_check(nau8825, substream->stream,
-			params_rate(params), osr))
+			params_rate(params), osr)) {
+			nau8825_sema_release(nau8825);
 			return -EINVAL;
+		}
 		regmap_update_bits(nau8825->regmap, NAU8825_REG_CLK_DIVIDER,
 			NAU8825_CLK_DAC_SRC_MASK,
 			osr_dac_sel[osr].clk_src << NAU8825_CLK_DAC_SRC_SFT);
@@ -1255,8 +1271,10 @@ static int nau8825_hw_params(struct snd_pcm_substream *substream,
 		regmap_read(nau8825->regmap, NAU8825_REG_ADC_RATE, &osr);
 		osr &= NAU8825_ADC_SYNC_DOWN_MASK;
 		if (nau8825_clock_check(nau8825, substream->stream,
-			params_rate(params), osr))
+			params_rate(params), osr)) {
+			nau8825_sema_release(nau8825);
 			return -EINVAL;
+		}
 		regmap_update_bits(nau8825->regmap, NAU8825_REG_CLK_DIVIDER,
 			NAU8825_CLK_ADC_SRC_MASK,
 			osr_adc_sel[osr].clk_src << NAU8825_CLK_ADC_SRC_SFT);
@@ -1273,8 +1291,10 @@ static int nau8825_hw_params(struct snd_pcm_substream *substream,
 			bclk_div = 1;
 		else if (bclk_fs <= 128)
 			bclk_div = 0;
-		else
+		else {
+			nau8825_sema_release(nau8825);
 			return -EINVAL;
+		}
 		regmap_update_bits(nau8825->regmap, NAU8825_REG_I2S_PCM_CTRL2,
 			NAU8825_I2S_LRC_DIV_MASK | NAU8825_I2S_BLK_DIV_MASK,
 			((bclk_div + 1) << NAU8825_I2S_LRC_DIV_SFT) | bclk_div);
@@ -1294,6 +1314,7 @@ static int nau8825_hw_params(struct snd_pcm_substream *substream,
 		val_len |= NAU8825_I2S_DL_32;
 		break;
 	default:
+		nau8825_sema_release(nau8825);
 		return -EINVAL;
 	}
 
@@ -1312,8 +1333,6 @@ static int nau8825_set_dai_fmt(struct snd_soc_dai *codec_dai, unsigned int fmt)
 	struct nau8825 *nau8825 = snd_soc_codec_get_drvdata(codec);
 	unsigned int ctrl1_val = 0, ctrl2_val = 0;
 
-	nau8825_sema_acquire(nau8825, 3 * HZ);
-
 	switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
 	case SND_SOC_DAIFMT_CBM_CFM:
 		ctrl2_val |= NAU8825_I2S_MS_MASTER;
@@ -1355,6 +1374,8 @@ static int nau8825_set_dai_fmt(struct snd_soc_dai *codec_dai, unsigned int fmt)
 		return -EINVAL;
 	}
 
+	nau8825_sema_acquire(nau8825, 3 * HZ);
+
 	regmap_update_bits(nau8825->regmap, NAU8825_REG_I2S_PCM_CTRL1,
 		NAU8825_I2S_DL_MASK | NAU8825_I2S_DF_MASK |
 		NAU8825_I2S_BP_MASK | NAU8825_I2S_PCMB_MASK,
@@ -1687,7 +1708,7 @@ static irqreturn_t nau8825_interrupt(int irq, void *data)
 	} else if (active_irq & NAU8825_HEADSET_COMPLETION_IRQ) {
 		if (nau8825_is_jack_inserted(regmap)) {
 			event |= nau8825_jack_insert(nau8825);
-			if (!nau8825->xtalk_bypass && !nau8825->high_imped) {
+			if (nau8825->xtalk_enable && !nau8825->high_imped) {
 				/* Apply the cross talk suppression in the
 				 * headset without high impedance.
 				 */
@@ -1701,12 +1722,15 @@ static irqreturn_t nau8825_interrupt(int irq, void *data)
 					int ret;
 					nau8825->xtalk_protect = true;
 					ret = nau8825_sema_acquire(nau8825, 0);
-					if (ret < 0)
+					if (ret)
 						nau8825->xtalk_protect = false;
 				}
 				/* Startup cross talk detection process */
-				nau8825->xtalk_state = NAU8825_XTALK_PREPARE;
-				schedule_work(&nau8825->xtalk_work);
+				if (nau8825->xtalk_protect) {
+					nau8825->xtalk_state =
+						NAU8825_XTALK_PREPARE;
+					schedule_work(&nau8825->xtalk_work);
+				}
 			} else {
 				/* The cross talk suppression shouldn't apply
 				 * in the headset with high impedance. Thus,
@@ -1733,7 +1757,9 @@ static irqreturn_t nau8825_interrupt(int irq, void *data)
 			nau8825->xtalk_event_mask = event_mask;
 		}
 	} else if (active_irq & NAU8825_IMPEDANCE_MEAS_IRQ) {
-		schedule_work(&nau8825->xtalk_work);
+		/* crosstalk detection enable and process on going */
+		if (nau8825->xtalk_enable && nau8825->xtalk_protect)
+			schedule_work(&nau8825->xtalk_work);
 		clear_irq = NAU8825_IMPEDANCE_MEAS_IRQ;
 	} else if ((active_irq & NAU8825_JACK_INSERTION_IRQ_MASK) ==
 		NAU8825_JACK_INSERTION_DETECTED) {
@@ -2382,7 +2408,7 @@ static int __maybe_unused nau8825_resume(struct snd_soc_codec *codec)
 	regcache_sync(nau8825->regmap);
 	nau8825->xtalk_protect = true;
 	ret = nau8825_sema_acquire(nau8825, 0);
-	if (ret < 0)
+	if (ret)
 		nau8825->xtalk_protect = false;
 	enable_irq(nau8825->irq);
 
@@ -2441,8 +2467,8 @@ static void nau8825_print_device_properties(struct nau8825 *nau8825)
 			nau8825->jack_insert_debounce);
 	dev_dbg(dev, "jack-eject-debounce:  %d\n",
 			nau8825->jack_eject_debounce);
-	dev_dbg(dev, "crosstalk-bypass:     %d\n",
-			nau8825->xtalk_bypass);
+	dev_dbg(dev, "crosstalk-enable:     %d\n",
+			nau8825->xtalk_enable);
 }
 
 static int nau8825_read_device_properties(struct device *dev,
@@ -2507,8 +2533,8 @@ static int nau8825_read_device_properties(struct device *dev,
 		&nau8825->jack_eject_debounce);
 	if (ret)
 		nau8825->jack_eject_debounce = 0;
-	nau8825->xtalk_bypass = device_property_read_bool(dev,
-		"nuvoton,crosstalk-bypass");
+	nau8825->xtalk_enable = device_property_read_bool(dev,
+		"nuvoton,crosstalk-enable");
 
 	nau8825->mclk = devm_clk_get(dev, "mclk");
 	if (PTR_ERR(nau8825->mclk) == -EPROBE_DEFER) {
@@ -2569,6 +2595,7 @@ static int nau8825_i2c_probe(struct i2c_client *i2c,
 	 */
 	nau8825->xtalk_state = NAU8825_XTALK_DONE;
 	nau8825->xtalk_protect = false;
+	nau8825->xtalk_baktab_initialized = false;
 	sema_init(&nau8825->xtalk_sem, 1);
 	INIT_WORK(&nau8825->xtalk_work, nau8825_xtalk_work);
 
diff --git a/sound/soc/codecs/nau8825.h b/sound/soc/codecs/nau8825.h
index 8aee5c86..f7e7321 100644
--- a/sound/soc/codecs/nau8825.h
+++ b/sound/soc/codecs/nau8825.h
@@ -476,7 +476,8 @@ struct nau8825 {
 	int xtalk_event_mask;
 	bool xtalk_protect;
 	int imp_rms[NAU8825_XTALK_IMM];
-	int xtalk_bypass;
+	int xtalk_enable;
+	bool xtalk_baktab_initialized; /* True if initialized. */
 };
 
 int nau8825_enable_jack_detect(struct snd_soc_codec *codec,
diff --git a/sound/soc/codecs/pcm186x-i2c.c b/sound/soc/codecs/pcm186x-i2c.c
new file mode 100644
index 0000000..5436212
--- /dev/null
+++ b/sound/soc/codecs/pcm186x-i2c.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Texas Instruments PCM186x Universal Audio ADC - I2C
+ *
+ * Copyright (C) 2015-2017 Texas Instruments Incorporated - http://www.ti.com
+ *	Andreas Dannenberg <dannenberg@ti.com>
+ *	Andrew F. Davis <afd@ti.com>
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/i2c.h>
+
+#include "pcm186x.h"
+
+static const struct of_device_id pcm186x_of_match[] = {
+	{ .compatible = "ti,pcm1862", .data = (void *)PCM1862 },
+	{ .compatible = "ti,pcm1863", .data = (void *)PCM1863 },
+	{ .compatible = "ti,pcm1864", .data = (void *)PCM1864 },
+	{ .compatible = "ti,pcm1865", .data = (void *)PCM1865 },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, pcm186x_of_match);
+
+static int pcm186x_i2c_probe(struct i2c_client *i2c,
+			     const struct i2c_device_id *id)
+{
+	const enum pcm186x_type type = (enum pcm186x_type)id->driver_data;
+	int irq = i2c->irq;
+	struct regmap *regmap;
+
+	regmap = devm_regmap_init_i2c(i2c, &pcm186x_regmap);
+	if (IS_ERR(regmap))
+		return PTR_ERR(regmap);
+
+	return pcm186x_probe(&i2c->dev, type, irq, regmap);
+}
+
+static int pcm186x_i2c_remove(struct i2c_client *i2c)
+{
+	pcm186x_remove(&i2c->dev);
+
+	return 0;
+}
+
+static const struct i2c_device_id pcm186x_i2c_id[] = {
+	{ "pcm1862", PCM1862 },
+	{ "pcm1863", PCM1863 },
+	{ "pcm1864", PCM1864 },
+	{ "pcm1865", PCM1865 },
+	{ }
+};
+MODULE_DEVICE_TABLE(i2c, pcm186x_i2c_id);
+
+static struct i2c_driver pcm186x_i2c_driver = {
+	.probe		= pcm186x_i2c_probe,
+	.remove		= pcm186x_i2c_remove,
+	.id_table	= pcm186x_i2c_id,
+	.driver		= {
+		.name	= "pcm186x",
+		.of_match_table = pcm186x_of_match,
+	},
+};
+module_i2c_driver(pcm186x_i2c_driver);
+
+MODULE_AUTHOR("Andreas Dannenberg <dannenberg@ti.com>");
+MODULE_AUTHOR("Andrew F. Davis <afd@ti.com>");
+MODULE_DESCRIPTION("PCM186x Universal Audio ADC I2C Interface Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/sound/soc/codecs/pcm186x-spi.c b/sound/soc/codecs/pcm186x-spi.c
new file mode 100644
index 0000000..2366f8e
--- /dev/null
+++ b/sound/soc/codecs/pcm186x-spi.c
@@ -0,0 +1,69 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Texas Instruments PCM186x Universal Audio ADC - SPI
+ *
+ * Copyright (C) 2015-2017 Texas Instruments Incorporated - http://www.ti.com
+ *	Andreas Dannenberg <dannenberg@ti.com>
+ *	Andrew F. Davis <afd@ti.com>
+ */
+
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/spi/spi.h>
+
+#include "pcm186x.h"
+
+static const struct of_device_id pcm186x_of_match[] = {
+	{ .compatible = "ti,pcm1862", .data = (void *)PCM1862 },
+	{ .compatible = "ti,pcm1863", .data = (void *)PCM1863 },
+	{ .compatible = "ti,pcm1864", .data = (void *)PCM1864 },
+	{ .compatible = "ti,pcm1865", .data = (void *)PCM1865 },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, pcm186x_of_match);
+
+static int pcm186x_spi_probe(struct spi_device *spi)
+{
+	const enum pcm186x_type type =
+			 (enum pcm186x_type)spi_get_device_id(spi)->driver_data;
+	int irq = spi->irq;
+	struct regmap *regmap;
+
+	regmap = devm_regmap_init_spi(spi, &pcm186x_regmap);
+	if (IS_ERR(regmap))
+		return PTR_ERR(regmap);
+
+	return pcm186x_probe(&spi->dev, type, irq, regmap);
+}
+
+static int pcm186x_spi_remove(struct spi_device *spi)
+{
+	pcm186x_remove(&spi->dev);
+
+	return 0;
+}
+
+static const struct spi_device_id pcm186x_spi_id[] = {
+	{ "pcm1862", PCM1862 },
+	{ "pcm1863", PCM1863 },
+	{ "pcm1864", PCM1864 },
+	{ "pcm1865", PCM1865 },
+	{ }
+};
+MODULE_DEVICE_TABLE(spi, pcm186x_spi_id);
+
+static struct spi_driver pcm186x_spi_driver = {
+	.probe		= pcm186x_spi_probe,
+	.remove		= pcm186x_spi_remove,
+	.id_table	= pcm186x_spi_id,
+	.driver		= {
+		.name	= "pcm186x",
+		.of_match_table = pcm186x_of_match,
+	},
+};
+module_spi_driver(pcm186x_spi_driver);
+
+MODULE_AUTHOR("Andreas Dannenberg <dannenberg@ti.com>");
+MODULE_AUTHOR("Andrew F. Davis <afd@ti.com>");
+MODULE_DESCRIPTION("PCM186x Universal Audio ADC SPI Interface Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/sound/soc/codecs/pcm186x.c b/sound/soc/codecs/pcm186x.c
new file mode 100644
index 0000000..cdb5142
--- /dev/null
+++ b/sound/soc/codecs/pcm186x.c
@@ -0,0 +1,719 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Texas Instruments PCM186x Universal Audio ADC
+ *
+ * Copyright (C) 2015-2017 Texas Instruments Incorporated - http://www.ti.com
+ *	Andreas Dannenberg <dannenberg@ti.com>
+ *	Andrew F. Davis <afd@ti.com>
+ */
+
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/init.h>
+#include <linux/delay.h>
+#include <linux/pm.h>
+#include <linux/pm_runtime.h>
+#include <linux/regulator/consumer.h>
+#include <linux/regmap.h>
+#include <linux/slab.h>
+#include <sound/core.h>
+#include <sound/pcm.h>
+#include <sound/pcm_params.h>
+#include <sound/soc.h>
+#include <sound/jack.h>
+#include <sound/initval.h>
+#include <sound/tlv.h>
+
+#include "pcm186x.h"
+
+static const char * const pcm186x_supply_names[] = {
+	"avdd",		/* Analog power supply. Connect to 3.3-V supply. */
+	"dvdd",		/* Digital power supply. Connect to 3.3-V supply. */
+	"iovdd",	/* I/O power supply. Connect to 3.3-V or 1.8-V. */
+};
+#define PCM186x_NUM_SUPPLIES ARRAY_SIZE(pcm186x_supply_names)
+
+struct pcm186x_priv {
+	struct regmap *regmap;
+	struct regulator_bulk_data supplies[PCM186x_NUM_SUPPLIES];
+	unsigned int sysclk;
+	unsigned int tdm_offset;
+	bool is_tdm_mode;
+	bool is_master_mode;
+};
+
+static const DECLARE_TLV_DB_SCALE(pcm186x_pga_tlv, -1200, 4000, 50);
+
+static const struct snd_kcontrol_new pcm1863_snd_controls[] = {
+	SOC_DOUBLE_R_S_TLV("ADC Capture Volume", PCM186X_PGA_VAL_CH1_L,
+			   PCM186X_PGA_VAL_CH1_R, 0, -24, 80, 7, 0,
+			   pcm186x_pga_tlv),
+};
+
+static const struct snd_kcontrol_new pcm1865_snd_controls[] = {
+	SOC_DOUBLE_R_S_TLV("ADC1 Capture Volume", PCM186X_PGA_VAL_CH1_L,
+			   PCM186X_PGA_VAL_CH1_R, 0, -24, 80, 7, 0,
+			   pcm186x_pga_tlv),
+	SOC_DOUBLE_R_S_TLV("ADC2 Capture Volume", PCM186X_PGA_VAL_CH2_L,
+			   PCM186X_PGA_VAL_CH2_R, 0, -24, 80, 7, 0,
+			   pcm186x_pga_tlv),
+};
+
+static const unsigned int pcm186x_adc_input_channel_sel_value[] = {
+	0x00, 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07,
+	0x08, 0x09, 0x0a, 0x0b, 0x0c, 0x0d, 0x0e, 0x0f,
+	0x10, 0x20, 0x30
+};
+
+static const char * const pcm186x_adcl_input_channel_sel_text[] = {
+	"No Select",
+	"VINL1[SE]",					/* Default for ADC1L */
+	"VINL2[SE]",					/* Default for ADC2L */
+	"VINL2[SE] + VINL1[SE]",
+	"VINL3[SE]",
+	"VINL3[SE] + VINL1[SE]",
+	"VINL3[SE] + VINL2[SE]",
+	"VINL3[SE] + VINL2[SE] + VINL1[SE]",
+	"VINL4[SE]",
+	"VINL4[SE] + VINL1[SE]",
+	"VINL4[SE] + VINL2[SE]",
+	"VINL4[SE] + VINL2[SE] + VINL1[SE]",
+	"VINL4[SE] + VINL3[SE]",
+	"VINL4[SE] + VINL3[SE] + VINL1[SE]",
+	"VINL4[SE] + VINL3[SE] + VINL2[SE]",
+	"VINL4[SE] + VINL3[SE] + VINL2[SE] + VINL1[SE]",
+	"{VIN1P, VIN1M}[DIFF]",
+	"{VIN4P, VIN4M}[DIFF]",
+	"{VIN1P, VIN1M}[DIFF] + {VIN4P, VIN4M}[DIFF]"
+};
+
+static const char * const pcm186x_adcr_input_channel_sel_text[] = {
+	"No Select",
+	"VINR1[SE]",					/* Default for ADC1R */
+	"VINR2[SE]",					/* Default for ADC2R */
+	"VINR2[SE] + VINR1[SE]",
+	"VINR3[SE]",
+	"VINR3[SE] + VINR1[SE]",
+	"VINR3[SE] + VINR2[SE]",
+	"VINR3[SE] + VINR2[SE] + VINR1[SE]",
+	"VINR4[SE]",
+	"VINR4[SE] + VINR1[SE]",
+	"VINR4[SE] + VINR2[SE]",
+	"VINR4[SE] + VINR2[SE] + VINR1[SE]",
+	"VINR4[SE] + VINR3[SE]",
+	"VINR4[SE] + VINR3[SE] + VINR1[SE]",
+	"VINR4[SE] + VINR3[SE] + VINR2[SE]",
+	"VINR4[SE] + VINR3[SE] + VINR2[SE] + VINR1[SE]",
+	"{VIN2P, VIN2M}[DIFF]",
+	"{VIN3P, VIN3M}[DIFF]",
+	"{VIN2P, VIN2M}[DIFF] + {VIN3P, VIN3M}[DIFF]"
+};
+
+static const struct soc_enum pcm186x_adc_input_channel_sel[] = {
+	SOC_VALUE_ENUM_SINGLE(PCM186X_ADC1_INPUT_SEL_L, 0,
+			      PCM186X_ADC_INPUT_SEL_MASK,
+			      ARRAY_SIZE(pcm186x_adcl_input_channel_sel_text),
+			      pcm186x_adcl_input_channel_sel_text,
+			      pcm186x_adc_input_channel_sel_value),
+	SOC_VALUE_ENUM_SINGLE(PCM186X_ADC1_INPUT_SEL_R, 0,
+			      PCM186X_ADC_INPUT_SEL_MASK,
+			      ARRAY_SIZE(pcm186x_adcr_input_channel_sel_text),
+			      pcm186x_adcr_input_channel_sel_text,
+			      pcm186x_adc_input_channel_sel_value),
+	SOC_VALUE_ENUM_SINGLE(PCM186X_ADC2_INPUT_SEL_L, 0,
+			      PCM186X_ADC_INPUT_SEL_MASK,
+			      ARRAY_SIZE(pcm186x_adcl_input_channel_sel_text),
+			      pcm186x_adcl_input_channel_sel_text,
+			      pcm186x_adc_input_channel_sel_value),
+	SOC_VALUE_ENUM_SINGLE(PCM186X_ADC2_INPUT_SEL_R, 0,
+			      PCM186X_ADC_INPUT_SEL_MASK,
+			      ARRAY_SIZE(pcm186x_adcr_input_channel_sel_text),
+			      pcm186x_adcr_input_channel_sel_text,
+			      pcm186x_adc_input_channel_sel_value),
+};
+
+static const struct snd_kcontrol_new pcm186x_adc_mux_controls[] = {
+	SOC_DAPM_ENUM("ADC1 Left Input", pcm186x_adc_input_channel_sel[0]),
+	SOC_DAPM_ENUM("ADC1 Right Input", pcm186x_adc_input_channel_sel[1]),
+	SOC_DAPM_ENUM("ADC2 Left Input", pcm186x_adc_input_channel_sel[2]),
+	SOC_DAPM_ENUM("ADC2 Right Input", pcm186x_adc_input_channel_sel[3]),
+};
+
+static const struct snd_soc_dapm_widget pcm1863_dapm_widgets[] = {
+	SND_SOC_DAPM_INPUT("VINL1"),
+	SND_SOC_DAPM_INPUT("VINR1"),
+	SND_SOC_DAPM_INPUT("VINL2"),
+	SND_SOC_DAPM_INPUT("VINR2"),
+	SND_SOC_DAPM_INPUT("VINL3"),
+	SND_SOC_DAPM_INPUT("VINR3"),
+	SND_SOC_DAPM_INPUT("VINL4"),
+	SND_SOC_DAPM_INPUT("VINR4"),
+
+	SND_SOC_DAPM_MUX("ADC Left Capture Source", SND_SOC_NOPM, 0, 0,
+			 &pcm186x_adc_mux_controls[0]),
+	SND_SOC_DAPM_MUX("ADC Right Capture Source", SND_SOC_NOPM, 0, 0,
+			 &pcm186x_adc_mux_controls[1]),
+
+	/*
+	 * Put the codec into SLEEP mode when not in use, allowing the
+	 * Energysense mechanism to operate.
+	 */
+	SND_SOC_DAPM_ADC("ADC", "HiFi Capture", PCM186X_POWER_CTRL, 1,  0),
+};
+
+static const struct snd_soc_dapm_widget pcm1865_dapm_widgets[] = {
+	SND_SOC_DAPM_INPUT("VINL1"),
+	SND_SOC_DAPM_INPUT("VINR1"),
+	SND_SOC_DAPM_INPUT("VINL2"),
+	SND_SOC_DAPM_INPUT("VINR2"),
+	SND_SOC_DAPM_INPUT("VINL3"),
+	SND_SOC_DAPM_INPUT("VINR3"),
+	SND_SOC_DAPM_INPUT("VINL4"),
+	SND_SOC_DAPM_INPUT("VINR4"),
+
+	SND_SOC_DAPM_MUX("ADC1 Left Capture Source", SND_SOC_NOPM, 0, 0,
+			 &pcm186x_adc_mux_controls[0]),
+	SND_SOC_DAPM_MUX("ADC1 Right Capture Source", SND_SOC_NOPM, 0, 0,
+			 &pcm186x_adc_mux_controls[1]),
+	SND_SOC_DAPM_MUX("ADC2 Left Capture Source", SND_SOC_NOPM, 0, 0,
+			 &pcm186x_adc_mux_controls[2]),
+	SND_SOC_DAPM_MUX("ADC2 Right Capture Source", SND_SOC_NOPM, 0, 0,
+			 &pcm186x_adc_mux_controls[3]),
+
+	/*
+	 * Put the codec into SLEEP mode when not in use, allowing the
+	 * Energysense mechanism to operate.
+	 */
+	SND_SOC_DAPM_ADC("ADC1", "HiFi Capture 1", PCM186X_POWER_CTRL, 1,  0),
+	SND_SOC_DAPM_ADC("ADC2", "HiFi Capture 2", PCM186X_POWER_CTRL, 1,  0),
+};
+
+static const struct snd_soc_dapm_route pcm1863_dapm_routes[] = {
+	{ "ADC Left Capture Source", NULL, "VINL1" },
+	{ "ADC Left Capture Source", NULL, "VINR1" },
+	{ "ADC Left Capture Source", NULL, "VINL2" },
+	{ "ADC Left Capture Source", NULL, "VINR2" },
+	{ "ADC Left Capture Source", NULL, "VINL3" },
+	{ "ADC Left Capture Source", NULL, "VINR3" },
+	{ "ADC Left Capture Source", NULL, "VINL4" },
+	{ "ADC Left Capture Source", NULL, "VINR4" },
+
+	{ "ADC", NULL, "ADC Left Capture Source" },
+
+	{ "ADC Right Capture Source", NULL, "VINL1" },
+	{ "ADC Right Capture Source", NULL, "VINR1" },
+	{ "ADC Right Capture Source", NULL, "VINL2" },
+	{ "ADC Right Capture Source", NULL, "VINR2" },
+	{ "ADC Right Capture Source", NULL, "VINL3" },
+	{ "ADC Right Capture Source", NULL, "VINR3" },
+	{ "ADC Right Capture Source", NULL, "VINL4" },
+	{ "ADC Right Capture Source", NULL, "VINR4" },
+
+	{ "ADC", NULL, "ADC Right Capture Source" },
+};
+
+static const struct snd_soc_dapm_route pcm1865_dapm_routes[] = {
+	{ "ADC1 Left Capture Source", NULL, "VINL1" },
+	{ "ADC1 Left Capture Source", NULL, "VINR1" },
+	{ "ADC1 Left Capture Source", NULL, "VINL2" },
+	{ "ADC1 Left Capture Source", NULL, "VINR2" },
+	{ "ADC1 Left Capture Source", NULL, "VINL3" },
+	{ "ADC1 Left Capture Source", NULL, "VINR3" },
+	{ "ADC1 Left Capture Source", NULL, "VINL4" },
+	{ "ADC1 Left Capture Source", NULL, "VINR4" },
+
+	{ "ADC1", NULL, "ADC1 Left Capture Source" },
+
+	{ "ADC1 Right Capture Source", NULL, "VINL1" },
+	{ "ADC1 Right Capture Source", NULL, "VINR1" },
+	{ "ADC1 Right Capture Source", NULL, "VINL2" },
+	{ "ADC1 Right Capture Source", NULL, "VINR2" },
+	{ "ADC1 Right Capture Source", NULL, "VINL3" },
+	{ "ADC1 Right Capture Source", NULL, "VINR3" },
+	{ "ADC1 Right Capture Source", NULL, "VINL4" },
+	{ "ADC1 Right Capture Source", NULL, "VINR4" },
+
+	{ "ADC1", NULL, "ADC1 Right Capture Source" },
+
+	{ "ADC2 Left Capture Source", NULL, "VINL1" },
+	{ "ADC2 Left Capture Source", NULL, "VINR1" },
+	{ "ADC2 Left Capture Source", NULL, "VINL2" },
+	{ "ADC2 Left Capture Source", NULL, "VINR2" },
+	{ "ADC2 Left Capture Source", NULL, "VINL3" },
+	{ "ADC2 Left Capture Source", NULL, "VINR3" },
+	{ "ADC2 Left Capture Source", NULL, "VINL4" },
+	{ "ADC2 Left Capture Source", NULL, "VINR4" },
+
+	{ "ADC2", NULL, "ADC2 Left Capture Source" },
+
+	{ "ADC2 Right Capture Source", NULL, "VINL1" },
+	{ "ADC2 Right Capture Source", NULL, "VINR1" },
+	{ "ADC2 Right Capture Source", NULL, "VINL2" },
+	{ "ADC2 Right Capture Source", NULL, "VINR2" },
+	{ "ADC2 Right Capture Source", NULL, "VINL3" },
+	{ "ADC2 Right Capture Source", NULL, "VINR3" },
+	{ "ADC2 Right Capture Source", NULL, "VINL4" },
+	{ "ADC2 Right Capture Source", NULL, "VINR4" },
+
+	{ "ADC2", NULL, "ADC2 Right Capture Source" },
+};
+
+static int pcm186x_hw_params(struct snd_pcm_substream *substream,
+			     struct snd_pcm_hw_params *params,
+			     struct snd_soc_dai *dai)
+{
+	struct snd_soc_codec *codec = dai->codec;
+
+	struct pcm186x_priv *priv = snd_soc_codec_get_drvdata(codec);
+	unsigned int rate = params_rate(params);
+	unsigned int format = params_format(params);
+	unsigned int width = params_width(params);
+	unsigned int channels = params_channels(params);
+	unsigned int div_lrck;
+	unsigned int div_bck;
+	u8 tdm_tx_sel = 0;
+	u8 pcm_cfg = 0;
+
+	dev_dbg(codec->dev, "%s() rate=%u format=0x%x width=%u channels=%u\n",
+		__func__, rate, format, width, channels);
+
+	switch (width) {
+	case 16:
+		pcm_cfg = PCM186X_PCM_CFG_RX_WLEN_16 <<
+			  PCM186X_PCM_CFG_RX_WLEN_SHIFT |
+			  PCM186X_PCM_CFG_TX_WLEN_16 <<
+			  PCM186X_PCM_CFG_TX_WLEN_SHIFT;
+		break;
+	case 20:
+		pcm_cfg = PCM186X_PCM_CFG_RX_WLEN_20 <<
+			  PCM186X_PCM_CFG_RX_WLEN_SHIFT |
+			  PCM186X_PCM_CFG_TX_WLEN_20 <<
+			  PCM186X_PCM_CFG_TX_WLEN_SHIFT;
+		break;
+	case 24:
+		pcm_cfg = PCM186X_PCM_CFG_RX_WLEN_24 <<
+			  PCM186X_PCM_CFG_RX_WLEN_SHIFT |
+			  PCM186X_PCM_CFG_TX_WLEN_24 <<
+			  PCM186X_PCM_CFG_TX_WLEN_SHIFT;
+		break;
+	case 32:
+		pcm_cfg = PCM186X_PCM_CFG_RX_WLEN_32 <<
+			  PCM186X_PCM_CFG_RX_WLEN_SHIFT |
+			  PCM186X_PCM_CFG_TX_WLEN_32 <<
+			  PCM186X_PCM_CFG_TX_WLEN_SHIFT;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	snd_soc_update_bits(codec, PCM186X_PCM_CFG,
+			    PCM186X_PCM_CFG_RX_WLEN_MASK |
+			    PCM186X_PCM_CFG_TX_WLEN_MASK,
+			    pcm_cfg);
+
+	div_lrck = width * channels;
+
+	if (priv->is_tdm_mode) {
+		/* Select TDM transmission data */
+		switch (channels) {
+		case 2:
+			tdm_tx_sel = PCM186X_TDM_TX_SEL_2CH;
+			break;
+		case 4:
+			tdm_tx_sel = PCM186X_TDM_TX_SEL_4CH;
+			break;
+		case 6:
+			tdm_tx_sel = PCM186X_TDM_TX_SEL_6CH;
+			break;
+		default:
+			return -EINVAL;
+		}
+
+		snd_soc_update_bits(codec, PCM186X_TDM_TX_SEL,
+				    PCM186X_TDM_TX_SEL_MASK, tdm_tx_sel);
+
+		/* In DSP/TDM mode, the LRCLK divider must be 256 */
+		div_lrck = 256;
+
+		/* Configure 1/256 duty cycle for LRCK */
+		snd_soc_update_bits(codec, PCM186X_PCM_CFG,
+				    PCM186X_PCM_CFG_TDM_LRCK_MODE,
+				    PCM186X_PCM_CFG_TDM_LRCK_MODE);
+	}
+
+	/* Only configure clock dividers in master mode. */
+	if (priv->is_master_mode) {
+		div_bck = priv->sysclk / (div_lrck * rate);
+
+		dev_dbg(codec->dev,
+			"%s() master_clk=%u div_bck=%u div_lrck=%u\n",
+			__func__, priv->sysclk, div_bck, div_lrck);
+
+		snd_soc_write(codec, PCM186X_BCK_DIV, div_bck - 1);
+		snd_soc_write(codec, PCM186X_LRK_DIV, div_lrck - 1);
+	}
+
+	return 0;
+}
+
+static int pcm186x_set_fmt(struct snd_soc_dai *dai, unsigned int format)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	struct pcm186x_priv *priv = snd_soc_codec_get_drvdata(codec);
+	u8 clk_ctrl = 0;
+	u8 pcm_cfg = 0;
+
+	dev_dbg(codec->dev, "%s() format=0x%x\n", __func__, format);
+
+	/* set master/slave audio interface */
+	switch (format & SND_SOC_DAIFMT_MASTER_MASK) {
+	case SND_SOC_DAIFMT_CBM_CFM:
+		if (!priv->sysclk) {
+			dev_err(codec->dev, "operating in master mode requires sysclock to be configured\n");
+			return -EINVAL;
+		}
+		clk_ctrl |= PCM186X_CLK_CTRL_MST_MODE;
+		priv->is_master_mode = true;
+		break;
+	case SND_SOC_DAIFMT_CBS_CFS:
+		priv->is_master_mode = false;
+		break;
+	default:
+		dev_err(codec->dev, "Invalid DAI master/slave interface\n");
+		return -EINVAL;
+	}
+
+	/* set interface polarity */
+	switch (format & SND_SOC_DAIFMT_INV_MASK) {
+	case SND_SOC_DAIFMT_NB_NF:
+		break;
+	default:
+		dev_err(codec->dev, "Inverted DAI clocks not supported\n");
+		return -EINVAL;
+	}
+
+	/* set interface format */
+	switch (format & SND_SOC_DAIFMT_FORMAT_MASK) {
+	case SND_SOC_DAIFMT_I2S:
+		pcm_cfg = PCM186X_PCM_CFG_FMT_I2S;
+		break;
+	case SND_SOC_DAIFMT_LEFT_J:
+		pcm_cfg = PCM186X_PCM_CFG_FMT_LEFTJ;
+		break;
+	case SND_SOC_DAIFMT_DSP_A:
+		priv->tdm_offset += 1;
+		/* Fall through... DSP_A uses the same basic config as DSP_B
+		 * except we need to shift the TDM output by one BCK cycle
+		 */
+	case SND_SOC_DAIFMT_DSP_B:
+		priv->is_tdm_mode = true;
+		pcm_cfg = PCM186X_PCM_CFG_FMT_TDM;
+		break;
+	default:
+		dev_err(codec->dev, "Invalid DAI format\n");
+		return -EINVAL;
+	}
+
+	snd_soc_update_bits(codec, PCM186X_CLK_CTRL,
+			    PCM186X_CLK_CTRL_MST_MODE, clk_ctrl);
+
+	snd_soc_write(codec, PCM186X_TDM_TX_OFFSET, priv->tdm_offset);
+
+	snd_soc_update_bits(codec, PCM186X_PCM_CFG,
+			    PCM186X_PCM_CFG_FMT_MASK, pcm_cfg);
+
+	return 0;
+}
+
+static int pcm186x_set_tdm_slot(struct snd_soc_dai *dai, unsigned int tx_mask,
+				unsigned int rx_mask, int slots, int slot_width)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	struct pcm186x_priv *priv = snd_soc_codec_get_drvdata(codec);
+	unsigned int first_slot, last_slot, tdm_offset;
+
+	dev_dbg(codec->dev,
+		"%s() tx_mask=0x%x rx_mask=0x%x slots=%d slot_width=%d\n",
+		__func__, tx_mask, rx_mask, slots, slot_width);
+
+	if (!tx_mask) {
+		dev_err(codec->dev, "tdm tx mask must not be 0\n");
+		return -EINVAL;
+	}
+
+	first_slot = __ffs(tx_mask);
+	last_slot = __fls(tx_mask);
+
+	if (last_slot - first_slot != hweight32(tx_mask) - 1) {
+		dev_err(codec->dev, "tdm tx mask must be contiguous\n");
+		return -EINVAL;
+	}
+
+	tdm_offset = first_slot * slot_width;
+
+	if (tdm_offset > 255) {
+		dev_err(codec->dev, "tdm tx slot selection out of bounds\n");
+		return -EINVAL;
+	}
+
+	priv->tdm_offset = tdm_offset;
+
+	return 0;
+}
+
+static int pcm186x_set_dai_sysclk(struct snd_soc_dai *dai, int clk_id,
+				  unsigned int freq, int dir)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	struct pcm186x_priv *priv = snd_soc_codec_get_drvdata(codec);
+
+	dev_dbg(codec->dev, "%s() clk_id=%d freq=%u dir=%d\n",
+		__func__, clk_id, freq, dir);
+
+	priv->sysclk = freq;
+
+	return 0;
+}
+
+static const struct snd_soc_dai_ops pcm186x_dai_ops = {
+	.set_sysclk = pcm186x_set_dai_sysclk,
+	.set_tdm_slot = pcm186x_set_tdm_slot,
+	.set_fmt = pcm186x_set_fmt,
+	.hw_params = pcm186x_hw_params,
+};
+
+static struct snd_soc_dai_driver pcm1863_dai = {
+	.name = "pcm1863-aif",
+	.capture = {
+		 .stream_name = "Capture",
+		 .channels_min = 1,
+		 .channels_max = 2,
+		 .rates = PCM186X_RATES,
+		 .formats = PCM186X_FORMATS,
+	 },
+	.ops = &pcm186x_dai_ops,
+};
+
+static struct snd_soc_dai_driver pcm1865_dai = {
+	.name = "pcm1865-aif",
+	.capture = {
+		 .stream_name = "Capture",
+		 .channels_min = 1,
+		 .channels_max = 4,
+		 .rates = PCM186X_RATES,
+		 .formats = PCM186X_FORMATS,
+	 },
+	.ops = &pcm186x_dai_ops,
+};
+
+static int pcm186x_power_on(struct snd_soc_codec *codec)
+{
+	struct pcm186x_priv *priv = snd_soc_codec_get_drvdata(codec);
+	int ret = 0;
+
+	ret = regulator_bulk_enable(ARRAY_SIZE(priv->supplies),
+				    priv->supplies);
+	if (ret)
+		return ret;
+
+	regcache_cache_only(priv->regmap, false);
+	ret = regcache_sync(priv->regmap);
+	if (ret) {
+		dev_err(codec->dev, "Failed to restore cache\n");
+		regcache_cache_only(priv->regmap, true);
+		regulator_bulk_disable(ARRAY_SIZE(priv->supplies),
+				       priv->supplies);
+		return ret;
+	}
+
+	snd_soc_update_bits(codec, PCM186X_POWER_CTRL,
+			    PCM186X_PWR_CTRL_PWRDN, 0);
+
+	return 0;
+}
+
+static int pcm186x_power_off(struct snd_soc_codec *codec)
+{
+	struct pcm186x_priv *priv = snd_soc_codec_get_drvdata(codec);
+	int ret;
+
+	snd_soc_update_bits(codec, PCM186X_POWER_CTRL,
+			    PCM186X_PWR_CTRL_PWRDN, PCM186X_PWR_CTRL_PWRDN);
+
+	regcache_cache_only(priv->regmap, true);
+
+	ret = regulator_bulk_disable(ARRAY_SIZE(priv->supplies),
+				     priv->supplies);
+	if (ret)
+		return ret;
+
+	return 0;
+}
+
+static int pcm186x_set_bias_level(struct snd_soc_codec *codec,
+				  enum snd_soc_bias_level level)
+{
+	dev_dbg(codec->dev, "## %s: %d -> %d\n", __func__,
+		snd_soc_codec_get_bias_level(codec), level);
+
+	switch (level) {
+	case SND_SOC_BIAS_ON:
+		break;
+	case SND_SOC_BIAS_PREPARE:
+		break;
+	case SND_SOC_BIAS_STANDBY:
+		if (snd_soc_codec_get_bias_level(codec) == SND_SOC_BIAS_OFF)
+			pcm186x_power_on(codec);
+		break;
+	case SND_SOC_BIAS_OFF:
+		pcm186x_power_off(codec);
+		break;
+	}
+
+	return 0;
+}
+
+static struct snd_soc_codec_driver soc_codec_dev_pcm1863 = {
+	.set_bias_level = pcm186x_set_bias_level,
+
+	.component_driver = {
+		.controls = pcm1863_snd_controls,
+		.num_controls = ARRAY_SIZE(pcm1863_snd_controls),
+		.dapm_widgets = pcm1863_dapm_widgets,
+		.num_dapm_widgets = ARRAY_SIZE(pcm1863_dapm_widgets),
+		.dapm_routes = pcm1863_dapm_routes,
+		.num_dapm_routes = ARRAY_SIZE(pcm1863_dapm_routes),
+	},
+};
+
+static struct snd_soc_codec_driver soc_codec_dev_pcm1865 = {
+	.set_bias_level = pcm186x_set_bias_level,
+	.suspend_bias_off = true,
+
+	.component_driver = {
+		.controls = pcm1865_snd_controls,
+		.num_controls = ARRAY_SIZE(pcm1865_snd_controls),
+		.dapm_widgets = pcm1865_dapm_widgets,
+		.num_dapm_widgets = ARRAY_SIZE(pcm1865_dapm_widgets),
+		.dapm_routes = pcm1865_dapm_routes,
+		.num_dapm_routes = ARRAY_SIZE(pcm1865_dapm_routes),
+	},
+};
+
+static bool pcm186x_volatile(struct device *dev, unsigned int reg)
+{
+	switch (reg) {
+	case PCM186X_PAGE:
+	case PCM186X_DEVICE_STATUS:
+	case PCM186X_FSAMPLE_STATUS:
+	case PCM186X_DIV_STATUS:
+	case PCM186X_CLK_STATUS:
+	case PCM186X_SUPPLY_STATUS:
+	case PCM186X_MMAP_STAT_CTRL:
+	case PCM186X_MMAP_ADDRESS:
+		return true;
+	}
+
+	return false;
+}
+
+static const struct regmap_range_cfg pcm186x_range = {
+	.name = "Pages",
+	.range_max = PCM186X_MAX_REGISTER,
+	.selector_reg = PCM186X_PAGE,
+	.selector_mask = 0xff,
+	.window_len = PCM186X_PAGE_LEN,
+};
+
+const struct regmap_config pcm186x_regmap = {
+	.reg_bits = 8,
+	.val_bits = 8,
+
+	.volatile_reg = pcm186x_volatile,
+
+	.ranges = &pcm186x_range,
+	.num_ranges = 1,
+
+	.max_register = PCM186X_MAX_REGISTER,
+
+	.cache_type = REGCACHE_RBTREE,
+};
+EXPORT_SYMBOL_GPL(pcm186x_regmap);
+
+int pcm186x_probe(struct device *dev, enum pcm186x_type type, int irq,
+		  struct regmap *regmap)
+{
+	struct pcm186x_priv *priv;
+	int i, ret;
+
+	priv = devm_kzalloc(dev, sizeof(struct pcm186x_priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	dev_set_drvdata(dev, priv);
+	priv->regmap = regmap;
+
+	for (i = 0; i < ARRAY_SIZE(priv->supplies); i++)
+		priv->supplies[i].supply = pcm186x_supply_names[i];
+
+	ret = devm_regulator_bulk_get(dev, ARRAY_SIZE(priv->supplies),
+				      priv->supplies);
+	if (ret) {
+		dev_err(dev, "failed to request supplies: %d\n", ret);
+		return ret;
+	}
+
+	ret = regulator_bulk_enable(ARRAY_SIZE(priv->supplies),
+				    priv->supplies);
+	if (ret) {
+		dev_err(dev, "failed enable supplies: %d\n", ret);
+		return ret;
+	}
+
+	/* Reset device registers for a consistent power-on like state */
+	ret = regmap_write(regmap, PCM186X_PAGE, PCM186X_RESET);
+	if (ret) {
+		dev_err(dev, "failed to write device: %d\n", ret);
+		return ret;
+	}
+
+	ret = regulator_bulk_disable(ARRAY_SIZE(priv->supplies),
+				     priv->supplies);
+	if (ret) {
+		dev_err(dev, "failed disable supplies: %d\n", ret);
+		return ret;
+	}
+
+	switch (type) {
+	case PCM1865:
+	case PCM1864:
+		ret = snd_soc_register_codec(dev, &soc_codec_dev_pcm1865,
+					     &pcm1865_dai, 1);
+		break;
+	case PCM1863:
+	case PCM1862:
+	default:
+		ret = snd_soc_register_codec(dev, &soc_codec_dev_pcm1863,
+					     &pcm1863_dai, 1);
+	}
+	if (ret) {
+		dev_err(dev, "failed to register CODEC: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pcm186x_probe);
+
+int pcm186x_remove(struct device *dev)
+{
+	snd_soc_unregister_codec(dev);
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(pcm186x_remove);
+
+MODULE_AUTHOR("Andreas Dannenberg <dannenberg@ti.com>");
+MODULE_AUTHOR("Andrew F. Davis <afd@ti.com>");
+MODULE_DESCRIPTION("PCM186x Universal Audio ADC driver");
+MODULE_LICENSE("GPL v2");
diff --git a/sound/soc/codecs/pcm186x.h b/sound/soc/codecs/pcm186x.h
new file mode 100644
index 0000000..b630111
--- /dev/null
+++ b/sound/soc/codecs/pcm186x.h
@@ -0,0 +1,220 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Texas Instruments PCM186x Universal Audio ADC
+ *
+ * Copyright (C) 2015-2017 Texas Instruments Incorporated - http://www.ti.com
+ *	Andreas Dannenberg <dannenberg@ti.com>
+ *	Andrew F. Davis <afd@ti.com>
+ */
+
+#ifndef _PCM186X_H_
+#define _PCM186X_H_
+
+#include <linux/pm.h>
+#include <linux/regmap.h>
+
+enum pcm186x_type {
+	PCM1862,
+	PCM1863,
+	PCM1864,
+	PCM1865,
+};
+
+#define PCM186X_RATES	SNDRV_PCM_RATE_8000_192000
+#define PCM186X_FORMATS	(SNDRV_PCM_FMTBIT_S16_LE | \
+			 SNDRV_PCM_FMTBIT_S20_3LE |\
+			 SNDRV_PCM_FMTBIT_S24_LE | \
+			 SNDRV_PCM_FMTBIT_S32_LE)
+
+#define PCM186X_PAGE_LEN		0x0100
+#define PCM186X_PAGE_BASE(n)		(PCM186X_PAGE_LEN * n)
+
+/* The page selection register address is the same on all pages */
+#define PCM186X_PAGE			0
+
+/* Register Definitions - Page 0 */
+#define PCM186X_PGA_VAL_CH1_L		(PCM186X_PAGE_BASE(0) +   1)
+#define PCM186X_PGA_VAL_CH1_R		(PCM186X_PAGE_BASE(0) +   2)
+#define PCM186X_PGA_VAL_CH2_L		(PCM186X_PAGE_BASE(0) +   3)
+#define PCM186X_PGA_VAL_CH2_R		(PCM186X_PAGE_BASE(0) +   4)
+#define PCM186X_PGA_CTRL		(PCM186X_PAGE_BASE(0) +   5)
+#define PCM186X_ADC1_INPUT_SEL_L	(PCM186X_PAGE_BASE(0) +   6)
+#define PCM186X_ADC1_INPUT_SEL_R	(PCM186X_PAGE_BASE(0) +   7)
+#define PCM186X_ADC2_INPUT_SEL_L	(PCM186X_PAGE_BASE(0) +   8)
+#define PCM186X_ADC2_INPUT_SEL_R	(PCM186X_PAGE_BASE(0) +   9)
+#define PCM186X_AUXADC_INPUT_SEL	(PCM186X_PAGE_BASE(0) +  10)
+#define PCM186X_PCM_CFG			(PCM186X_PAGE_BASE(0) +  11)
+#define PCM186X_TDM_TX_SEL		(PCM186X_PAGE_BASE(0) +  12)
+#define PCM186X_TDM_TX_OFFSET		(PCM186X_PAGE_BASE(0) +  13)
+#define PCM186X_TDM_RX_OFFSET		(PCM186X_PAGE_BASE(0) +  14)
+#define PCM186X_DPGA_VAL_CH1_L		(PCM186X_PAGE_BASE(0) +  15)
+#define PCM186X_GPIO1_0_CTRL		(PCM186X_PAGE_BASE(0) +  16)
+#define PCM186X_GPIO3_2_CTRL		(PCM186X_PAGE_BASE(0) +  17)
+#define PCM186X_GPIO1_0_DIR_CTRL	(PCM186X_PAGE_BASE(0) +  18)
+#define PCM186X_GPIO3_2_DIR_CTRL	(PCM186X_PAGE_BASE(0) +  19)
+#define PCM186X_GPIO_IN_OUT		(PCM186X_PAGE_BASE(0) +  20)
+#define PCM186X_GPIO_PULL_CTRL		(PCM186X_PAGE_BASE(0) +  21)
+#define PCM186X_DPGA_VAL_CH1_R		(PCM186X_PAGE_BASE(0) +  22)
+#define PCM186X_DPGA_VAL_CH2_L		(PCM186X_PAGE_BASE(0) +  23)
+#define PCM186X_DPGA_VAL_CH2_R		(PCM186X_PAGE_BASE(0) +  24)
+#define PCM186X_DPGA_GAIN_CTRL		(PCM186X_PAGE_BASE(0) +  25)
+#define PCM186X_DPGA_MIC_CTRL		(PCM186X_PAGE_BASE(0) +  26)
+#define PCM186X_DIN_RESAMP_CTRL		(PCM186X_PAGE_BASE(0) +  27)
+#define PCM186X_CLK_CTRL		(PCM186X_PAGE_BASE(0) +  32)
+#define PCM186X_DSP1_CLK_DIV		(PCM186X_PAGE_BASE(0) +  33)
+#define PCM186X_DSP2_CLK_DIV		(PCM186X_PAGE_BASE(0) +  34)
+#define PCM186X_ADC_CLK_DIV		(PCM186X_PAGE_BASE(0) +  35)
+#define PCM186X_PLL_SCK_DIV		(PCM186X_PAGE_BASE(0) +  37)
+#define PCM186X_BCK_DIV			(PCM186X_PAGE_BASE(0) +  38)
+#define PCM186X_LRK_DIV			(PCM186X_PAGE_BASE(0) +  39)
+#define PCM186X_PLL_CTRL		(PCM186X_PAGE_BASE(0) +  40)
+#define PCM186X_PLL_P_DIV		(PCM186X_PAGE_BASE(0) +  41)
+#define PCM186X_PLL_R_DIV		(PCM186X_PAGE_BASE(0) +  42)
+#define PCM186X_PLL_J_DIV		(PCM186X_PAGE_BASE(0) +  43)
+#define PCM186X_PLL_D_DIV_LSB		(PCM186X_PAGE_BASE(0) +  44)
+#define PCM186X_PLL_D_DIV_MSB		(PCM186X_PAGE_BASE(0) +  45)
+#define PCM186X_SIGDET_MODE		(PCM186X_PAGE_BASE(0) +  48)
+#define PCM186X_SIGDET_MASK		(PCM186X_PAGE_BASE(0) +  49)
+#define PCM186X_SIGDET_STAT		(PCM186X_PAGE_BASE(0) +  50)
+#define PCM186X_SIGDET_LOSS_TIME	(PCM186X_PAGE_BASE(0) +  52)
+#define PCM186X_SIGDET_SCAN_TIME	(PCM186X_PAGE_BASE(0) +  53)
+#define PCM186X_SIGDET_INT_INTVL	(PCM186X_PAGE_BASE(0) +  54)
+#define PCM186X_SIGDET_DC_REF_CH1_L	(PCM186X_PAGE_BASE(0) +  64)
+#define PCM186X_SIGDET_DC_DIFF_CH1_L	(PCM186X_PAGE_BASE(0) +  65)
+#define PCM186X_SIGDET_DC_LEV_CH1_L	(PCM186X_PAGE_BASE(0) +  66)
+#define PCM186X_SIGDET_DC_REF_CH1_R	(PCM186X_PAGE_BASE(0) +  67)
+#define PCM186X_SIGDET_DC_DIFF_CH1_R	(PCM186X_PAGE_BASE(0) +  68)
+#define PCM186X_SIGDET_DC_LEV_CH1_R	(PCM186X_PAGE_BASE(0) +  69)
+#define PCM186X_SIGDET_DC_REF_CH2_L	(PCM186X_PAGE_BASE(0) +  70)
+#define PCM186X_SIGDET_DC_DIFF_CH2_L	(PCM186X_PAGE_BASE(0) +  71)
+#define PCM186X_SIGDET_DC_LEV_CH2_L	(PCM186X_PAGE_BASE(0) +  72)
+#define PCM186X_SIGDET_DC_REF_CH2_R	(PCM186X_PAGE_BASE(0) +  73)
+#define PCM186X_SIGDET_DC_DIFF_CH2_R	(PCM186X_PAGE_BASE(0) +  74)
+#define PCM186X_SIGDET_DC_LEV_CH2_R	(PCM186X_PAGE_BASE(0) +  75)
+#define PCM186X_SIGDET_DC_REF_CH3_L	(PCM186X_PAGE_BASE(0) +  76)
+#define PCM186X_SIGDET_DC_DIFF_CH3_L	(PCM186X_PAGE_BASE(0) +  77)
+#define PCM186X_SIGDET_DC_LEV_CH3_L	(PCM186X_PAGE_BASE(0) +  78)
+#define PCM186X_SIGDET_DC_REF_CH3_R	(PCM186X_PAGE_BASE(0) +  79)
+#define PCM186X_SIGDET_DC_DIFF_CH3_R	(PCM186X_PAGE_BASE(0) +  80)
+#define PCM186X_SIGDET_DC_LEV_CH3_R	(PCM186X_PAGE_BASE(0) +  81)
+#define PCM186X_SIGDET_DC_REF_CH4_L	(PCM186X_PAGE_BASE(0) +  82)
+#define PCM186X_SIGDET_DC_DIFF_CH4_L	(PCM186X_PAGE_BASE(0) +  83)
+#define PCM186X_SIGDET_DC_LEV_CH4_L	(PCM186X_PAGE_BASE(0) +  84)
+#define PCM186X_SIGDET_DC_REF_CH4_R	(PCM186X_PAGE_BASE(0) +  85)
+#define PCM186X_SIGDET_DC_DIFF_CH4_R	(PCM186X_PAGE_BASE(0) +  86)
+#define PCM186X_SIGDET_DC_LEV_CH4_R	(PCM186X_PAGE_BASE(0) +  87)
+#define PCM186X_AUXADC_DATA_CTRL	(PCM186X_PAGE_BASE(0) +  88)
+#define PCM186X_AUXADC_DATA_LSB		(PCM186X_PAGE_BASE(0) +  89)
+#define PCM186X_AUXADC_DATA_MSB		(PCM186X_PAGE_BASE(0) +  90)
+#define PCM186X_INT_ENABLE		(PCM186X_PAGE_BASE(0) +  96)
+#define PCM186X_INT_FLAG		(PCM186X_PAGE_BASE(0) +  97)
+#define PCM186X_INT_POL_WIDTH		(PCM186X_PAGE_BASE(0) +  98)
+#define PCM186X_POWER_CTRL		(PCM186X_PAGE_BASE(0) + 112)
+#define PCM186X_FILTER_MUTE_CTRL	(PCM186X_PAGE_BASE(0) + 113)
+#define PCM186X_DEVICE_STATUS		(PCM186X_PAGE_BASE(0) + 114)
+#define PCM186X_FSAMPLE_STATUS		(PCM186X_PAGE_BASE(0) + 115)
+#define PCM186X_DIV_STATUS		(PCM186X_PAGE_BASE(0) + 116)
+#define PCM186X_CLK_STATUS		(PCM186X_PAGE_BASE(0) + 117)
+#define PCM186X_SUPPLY_STATUS		(PCM186X_PAGE_BASE(0) + 120)
+
+/* Register Definitions - Page 1 */
+#define PCM186X_MMAP_STAT_CTRL		(PCM186X_PAGE_BASE(1) +   1)
+#define PCM186X_MMAP_ADDRESS		(PCM186X_PAGE_BASE(1) +   2)
+#define PCM186X_MEM_WDATA0		(PCM186X_PAGE_BASE(1) +   4)
+#define PCM186X_MEM_WDATA1		(PCM186X_PAGE_BASE(1) +   5)
+#define PCM186X_MEM_WDATA2		(PCM186X_PAGE_BASE(1) +   6)
+#define PCM186X_MEM_WDATA3		(PCM186X_PAGE_BASE(1) +   7)
+#define PCM186X_MEM_RDATA0		(PCM186X_PAGE_BASE(1) +   8)
+#define PCM186X_MEM_RDATA1		(PCM186X_PAGE_BASE(1) +   9)
+#define PCM186X_MEM_RDATA2		(PCM186X_PAGE_BASE(1) +  10)
+#define PCM186X_MEM_RDATA3		(PCM186X_PAGE_BASE(1) +  11)
+
+/* Register Definitions - Page 3 */
+#define PCM186X_OSC_PWR_DOWN_CTRL	(PCM186X_PAGE_BASE(3) +  18)
+#define PCM186X_MIC_BIAS_CTRL		(PCM186X_PAGE_BASE(3) +  21)
+
+/* Register Definitions - Page 253 */
+#define PCM186X_CURR_TRIM_CTRL		(PCM186X_PAGE_BASE(253) +  20)
+
+#define PCM186X_MAX_REGISTER		PCM186X_CURR_TRIM_CTRL
+
+/* PCM186X_PAGE */
+#define PCM186X_RESET			0xff
+
+/* PCM186X_ADCX_INPUT_SEL_X */
+#define PCM186X_ADC_INPUT_SEL_POL	BIT(7)
+#define PCM186X_ADC_INPUT_SEL_MASK	GENMASK(5, 0)
+
+/* PCM186X_PCM_CFG */
+#define PCM186X_PCM_CFG_RX_WLEN_MASK	GENMASK(7, 6)
+#define PCM186X_PCM_CFG_RX_WLEN_SHIFT	6
+#define PCM186X_PCM_CFG_RX_WLEN_32	0x00
+#define PCM186X_PCM_CFG_RX_WLEN_24	0x01
+#define PCM186X_PCM_CFG_RX_WLEN_20	0x02
+#define PCM186X_PCM_CFG_RX_WLEN_16	0x03
+#define PCM186X_PCM_CFG_TDM_LRCK_MODE	BIT(4)
+#define PCM186X_PCM_CFG_TX_WLEN_MASK	GENMASK(3, 2)
+#define PCM186X_PCM_CFG_TX_WLEN_SHIFT	2
+#define PCM186X_PCM_CFG_TX_WLEN_32	0x00
+#define PCM186X_PCM_CFG_TX_WLEN_24	0x01
+#define PCM186X_PCM_CFG_TX_WLEN_20	0x02
+#define PCM186X_PCM_CFG_TX_WLEN_16	0x03
+#define PCM186X_PCM_CFG_FMT_MASK	GENMASK(1, 0)
+#define PCM186X_PCM_CFG_FMT_SHIFT	0
+#define PCM186X_PCM_CFG_FMT_I2S		0x00
+#define PCM186X_PCM_CFG_FMT_LEFTJ	0x01
+#define PCM186X_PCM_CFG_FMT_RIGHTJ	0x02
+#define PCM186X_PCM_CFG_FMT_TDM		0x03
+
+/* PCM186X_TDM_TX_SEL */
+#define PCM186X_TDM_TX_SEL_2CH		0x00
+#define PCM186X_TDM_TX_SEL_4CH		0x01
+#define PCM186X_TDM_TX_SEL_6CH		0x02
+#define PCM186X_TDM_TX_SEL_MASK		0x03
+
+/* PCM186X_CLK_CTRL */
+#define PCM186X_CLK_CTRL_SCK_XI_SEL1	BIT(7)
+#define PCM186X_CLK_CTRL_SCK_XI_SEL0	BIT(6)
+#define PCM186X_CLK_CTRL_SCK_SRC_PLL	BIT(5)
+#define PCM186X_CLK_CTRL_MST_MODE	BIT(4)
+#define PCM186X_CLK_CTRL_ADC_SRC_PLL	BIT(3)
+#define PCM186X_CLK_CTRL_DSP2_SRC_PLL	BIT(2)
+#define PCM186X_CLK_CTRL_DSP1_SRC_PLL	BIT(1)
+#define PCM186X_CLK_CTRL_CLKDET_EN	BIT(0)
+
+/* PCM186X_PLL_CTRL */
+#define PCM186X_PLL_CTRL_LOCK		BIT(4)
+#define PCM186X_PLL_CTRL_REF_SEL	BIT(1)
+#define PCM186X_PLL_CTRL_EN		BIT(0)
+
+/* PCM186X_POWER_CTRL */
+#define PCM186X_PWR_CTRL_PWRDN		BIT(2)
+#define PCM186X_PWR_CTRL_SLEEP		BIT(1)
+#define PCM186X_PWR_CTRL_STBY		BIT(0)
+
+/* PCM186X_CLK_STATUS */
+#define PCM186X_CLK_STATUS_LRCKHLT	BIT(6)
+#define PCM186X_CLK_STATUS_BCKHLT	BIT(5)
+#define PCM186X_CLK_STATUS_SCKHLT	BIT(4)
+#define PCM186X_CLK_STATUS_LRCKERR	BIT(2)
+#define PCM186X_CLK_STATUS_BCKERR	BIT(1)
+#define PCM186X_CLK_STATUS_SCKERR	BIT(0)
+
+/* PCM186X_SUPPLY_STATUS */
+#define PCM186X_SUPPLY_STATUS_DVDD	BIT(2)
+#define PCM186X_SUPPLY_STATUS_AVDD	BIT(1)
+#define PCM186X_SUPPLY_STATUS_LDO	BIT(0)
+
+/* PCM186X_MMAP_STAT_CTRL */
+#define PCM186X_MMAP_STAT_DONE		BIT(4)
+#define PCM186X_MMAP_STAT_BUSY		BIT(2)
+#define PCM186X_MMAP_STAT_R_REQ		BIT(1)
+#define PCM186X_MMAP_STAT_W_REQ		BIT(0)
+
+extern const struct regmap_config pcm186x_regmap;
+
+int pcm186x_probe(struct device *dev, enum pcm186x_type type, int irq,
+		  struct regmap *regmap);
+int pcm186x_remove(struct device *dev);
+
+#endif /* _PCM186X_H_ */
diff --git a/sound/soc/codecs/pcm512x-spi.c b/sound/soc/codecs/pcm512x-spi.c
index 25c6351..7cdd2dc 100644
--- a/sound/soc/codecs/pcm512x-spi.c
+++ b/sound/soc/codecs/pcm512x-spi.c
@@ -70,3 +70,7 @@ static struct spi_driver pcm512x_spi_driver = {
 };
 
 module_spi_driver(pcm512x_spi_driver);
+
+MODULE_DESCRIPTION("ASoC PCM512x codec driver - SPI");
+MODULE_AUTHOR("Mark Brown <broonie@kernel.org>");
+MODULE_LICENSE("GPL v2");
diff --git a/sound/soc/codecs/rl6231.c b/sound/soc/codecs/rl6231.c
index 974a904..7ef3b54 100644
--- a/sound/soc/codecs/rl6231.c
+++ b/sound/soc/codecs/rl6231.c
@@ -13,6 +13,7 @@
 #include <linux/module.h>
 #include <linux/regmap.h>
 
+#include <linux/gcd.h>
 #include "rl6231.h"
 
 /**
@@ -106,6 +107,25 @@ static const struct pll_calc_map pll_preset_table[] = {
 	{19200000,  24576000,  3, 30, 3, false},
 };
 
+static unsigned int find_best_div(unsigned int in,
+	unsigned int max, unsigned int div)
+{
+	unsigned int d;
+
+	if (in <= max)
+		return 1;
+
+	d = in / max;
+	if (in % max)
+		d++;
+
+	while (div % d != 0)
+		d++;
+
+
+	return d;
+}
+
 /**
  * rl6231_pll_calc - Calcualte PLL M/N/K code.
  * @freq_in: external clock provided to codec.
@@ -120,9 +140,11 @@ int rl6231_pll_calc(const unsigned int freq_in,
 	const unsigned int freq_out, struct rl6231_pll_code *pll_code)
 {
 	int max_n = RL6231_PLL_N_MAX, max_m = RL6231_PLL_M_MAX;
-	int i, k, red, n_t, pll_out, in_t, out_t;
-	int n = 0, m = 0, m_t = 0;
-	int red_t = abs(freq_out - freq_in);
+	int i, k, n_t;
+	int k_t, min_k, max_k, n = 0, m = 0, m_t = 0;
+	unsigned int red, pll_out, in_t, out_t, div, div_t;
+	unsigned int red_t = abs(freq_out - freq_in);
+	unsigned int f_in, f_out, f_max;
 	bool bypass = false;
 
 	if (RL6231_PLL_INP_MAX < freq_in || RL6231_PLL_INP_MIN > freq_in)
@@ -140,39 +162,52 @@ int rl6231_pll_calc(const unsigned int freq_in,
 		}
 	}
 
-	k = 100000000 / freq_out - 2;
-	if (k > RL6231_PLL_K_MAX)
-		k = RL6231_PLL_K_MAX;
-	for (n_t = 0; n_t <= max_n; n_t++) {
-		in_t = freq_in / (k + 2);
-		pll_out = freq_out / (n_t + 2);
-		if (in_t < 0)
-			continue;
-		if (in_t == pll_out) {
-			bypass = true;
-			n = n_t;
-			goto code_find;
-		}
-		red = abs(in_t - pll_out);
-		if (red < red_t) {
-			bypass = true;
-			n = n_t;
-			m = m_t;
-			if (red == 0)
-				goto code_find;
-			red_t = red;
-		}
-		for (m_t = 0; m_t <= max_m; m_t++) {
-			out_t = in_t / (m_t + 2);
-			red = abs(out_t - pll_out);
-			if (red < red_t) {
-				bypass = false;
+	min_k = 80000000 / freq_out - 2;
+	max_k = 150000000 / freq_out - 2;
+	if (max_k > RL6231_PLL_K_MAX)
+		max_k = RL6231_PLL_K_MAX;
+	if (min_k > RL6231_PLL_K_MAX)
+		min_k = max_k = RL6231_PLL_K_MAX;
+	div_t = gcd(freq_in, freq_out);
+	f_max = 0xffffffff / RL6231_PLL_N_MAX;
+	div = find_best_div(freq_in, f_max, div_t);
+	f_in = freq_in / div;
+	f_out = freq_out / div;
+	k = min_k;
+	for (k_t = min_k; k_t <= max_k; k_t++) {
+		for (n_t = 0; n_t <= max_n; n_t++) {
+			in_t = f_in * (n_t + 2);
+			pll_out = f_out * (k_t + 2);
+			if (in_t == pll_out) {
+				bypass = true;
 				n = n_t;
-				m = m_t;
+				k = k_t;
+				goto code_find;
+			}
+			out_t = in_t / (k_t + 2);
+			red = abs(f_out - out_t);
+			if (red < red_t) {
+				bypass = true;
+				n = n_t;
+				m = 0;
+				k = k_t;
 				if (red == 0)
 					goto code_find;
 				red_t = red;
 			}
+			for (m_t = 0; m_t <= max_m; m_t++) {
+				out_t = in_t / ((m_t + 2) * (k_t + 2));
+				red = abs(f_out - out_t);
+				if (red < red_t) {
+					bypass = false;
+					n = n_t;
+					m = m_t;
+					k = k_t;
+					if (red == 0)
+						goto code_find;
+					red_t = red;
+				}
+			}
 		}
 	}
 	pr_debug("Only get approximation about PLL\n");
diff --git a/sound/soc/codecs/rt5514-spi.c b/sound/soc/codecs/rt5514-spi.c
index 64bf26c..2144edc 100644
--- a/sound/soc/codecs/rt5514-spi.c
+++ b/sound/soc/codecs/rt5514-spi.c
@@ -381,6 +381,7 @@ int rt5514_spi_burst_read(unsigned int addr, u8 *rxbuf, size_t len)
 
 	return true;
 }
+EXPORT_SYMBOL_GPL(rt5514_spi_burst_read);
 
 /**
  * rt5514_spi_burst_write - Write data to SPI by rt5514 address.
diff --git a/sound/soc/codecs/rt5514.c b/sound/soc/codecs/rt5514.c
index 2dd6e9f..198df01 100644
--- a/sound/soc/codecs/rt5514.c
+++ b/sound/soc/codecs/rt5514.c
@@ -295,6 +295,33 @@ static int rt5514_dsp_voice_wake_up_get(struct snd_kcontrol *kcontrol,
 	return 0;
 }
 
+static int rt5514_calibration(struct rt5514_priv *rt5514, bool on)
+{
+	if (on) {
+		regmap_write(rt5514->regmap, RT5514_ANA_CTRL_PLL3, 0x0000000a);
+		regmap_update_bits(rt5514->regmap, RT5514_PLL_SOURCE_CTRL, 0xf,
+			0xa);
+		regmap_update_bits(rt5514->regmap, RT5514_PWR_ANA1, 0x301,
+			0x301);
+		regmap_write(rt5514->regmap, RT5514_PLL3_CALIB_CTRL4,
+			0x80000000 | rt5514->pll3_cal_value);
+		regmap_write(rt5514->regmap, RT5514_PLL3_CALIB_CTRL1,
+			0x8bb80800);
+		regmap_update_bits(rt5514->regmap, RT5514_PLL3_CALIB_CTRL5,
+			0xc0000000, 0x80000000);
+		regmap_update_bits(rt5514->regmap, RT5514_PLL3_CALIB_CTRL5,
+			0xc0000000, 0xc0000000);
+	} else {
+		regmap_update_bits(rt5514->regmap, RT5514_PLL3_CALIB_CTRL5,
+			0xc0000000, 0x40000000);
+		regmap_update_bits(rt5514->regmap, RT5514_PWR_ANA1, 0x301, 0);
+		regmap_update_bits(rt5514->regmap, RT5514_PLL_SOURCE_CTRL, 0xf,
+			0x4);
+	}
+
+	return 0;
+}
+
 static int rt5514_dsp_voice_wake_up_put(struct snd_kcontrol *kcontrol,
 		struct snd_ctl_elem_value *ucontrol)
 {
@@ -302,6 +329,7 @@ static int rt5514_dsp_voice_wake_up_put(struct snd_kcontrol *kcontrol,
 	struct rt5514_priv *rt5514 = snd_soc_component_get_drvdata(component);
 	struct snd_soc_codec *codec = rt5514->codec;
 	const struct firmware *fw = NULL;
+	u8 buf[8];
 
 	if (ucontrol->value.integer.value[0] == rt5514->dsp_enabled)
 		return 0;
@@ -310,6 +338,35 @@ static int rt5514_dsp_voice_wake_up_put(struct snd_kcontrol *kcontrol,
 		rt5514->dsp_enabled = ucontrol->value.integer.value[0];
 
 		if (rt5514->dsp_enabled) {
+			if (rt5514->pdata.dsp_calib_clk_name &&
+				!IS_ERR(rt5514->dsp_calib_clk)) {
+				if (clk_set_rate(rt5514->dsp_calib_clk,
+					rt5514->pdata.dsp_calib_clk_rate))
+					dev_err(codec->dev,
+						"Can't set rate for mclk");
+
+				if (clk_prepare_enable(rt5514->dsp_calib_clk))
+					dev_err(codec->dev,
+						"Can't enable dsp_calib_clk");
+
+				rt5514_calibration(rt5514, true);
+
+				msleep(20);
+#if IS_ENABLED(CONFIG_SND_SOC_RT5514_SPI)
+				rt5514_spi_burst_read(RT5514_PLL3_CALIB_CTRL6 |
+					RT5514_DSP_MAPPING,
+					(u8 *)&buf, sizeof(buf));
+#else
+				dev_err(codec->dev, "There is no SPI driver for"
+					" loading the firmware\n");
+#endif
+				rt5514->pll3_cal_value = buf[0] | buf[1] << 8 |
+					buf[2] << 16 | buf[3] << 24;
+
+				rt5514_calibration(rt5514, false);
+				clk_disable_unprepare(rt5514->dsp_calib_clk);
+			}
+
 			rt5514_enable_dsp_prepare(rt5514);
 
 			request_firmware(&fw, RT5514_FIRMWARE1, codec->dev);
@@ -341,6 +398,20 @@ static int rt5514_dsp_voice_wake_up_put(struct snd_kcontrol *kcontrol,
 			/* DSP run */
 			regmap_write(rt5514->i2c_regmap, 0x18002f00,
 				0x00055148);
+
+			if (rt5514->pdata.dsp_calib_clk_name &&
+				!IS_ERR(rt5514->dsp_calib_clk)) {
+				msleep(20);
+
+				regmap_write(rt5514->i2c_regmap, 0x1800211c,
+					rt5514->pll3_cal_value);
+				regmap_write(rt5514->i2c_regmap, 0x18002124,
+					0x00220012);
+				regmap_write(rt5514->i2c_regmap, 0x18002124,
+					0x80220042);
+				regmap_write(rt5514->i2c_regmap, 0x18002124,
+					0xe0220042);
+			}
 		} else {
 			regmap_multi_reg_write(rt5514->i2c_regmap,
 				rt5514_i2c_patch, ARRAY_SIZE(rt5514_i2c_patch));
@@ -1024,12 +1095,22 @@ static int rt5514_set_bias_level(struct snd_soc_codec *codec,
 static int rt5514_probe(struct snd_soc_codec *codec)
 {
 	struct rt5514_priv *rt5514 = snd_soc_codec_get_drvdata(codec);
+	struct platform_device *pdev = container_of(codec->dev,
+						   struct platform_device, dev);
 
 	rt5514->mclk = devm_clk_get(codec->dev, "mclk");
 	if (PTR_ERR(rt5514->mclk) == -EPROBE_DEFER)
 		return -EPROBE_DEFER;
 
+	if (rt5514->pdata.dsp_calib_clk_name) {
+		rt5514->dsp_calib_clk = devm_clk_get(&pdev->dev,
+				rt5514->pdata.dsp_calib_clk_name);
+		if (PTR_ERR(rt5514->dsp_calib_clk) == -EPROBE_DEFER)
+			return -EPROBE_DEFER;
+	}
+
 	rt5514->codec = codec;
+	rt5514->pll3_cal_value = 0x0078b000;
 
 	return 0;
 }
@@ -1147,6 +1228,10 @@ static int rt5514_parse_dp(struct rt5514_priv *rt5514, struct device *dev)
 {
 	device_property_read_u32(dev, "realtek,dmic-init-delay-ms",
 		&rt5514->pdata.dmic_init_delay);
+	device_property_read_string(dev, "realtek,dsp-calib-clk-name",
+		&rt5514->pdata.dsp_calib_clk_name);
+	device_property_read_u32(dev, "realtek,dsp-calib-clk-rate",
+		&rt5514->pdata.dsp_calib_clk_rate);
 
 	return 0;
 }
diff --git a/sound/soc/codecs/rt5514.h b/sound/soc/codecs/rt5514.h
index 2dc40e6..f0f3400 100644
--- a/sound/soc/codecs/rt5514.h
+++ b/sound/soc/codecs/rt5514.h
@@ -34,7 +34,9 @@
 #define RT5514_CLK_CTRL1			0x2104
 #define RT5514_CLK_CTRL2			0x2108
 #define RT5514_PLL3_CALIB_CTRL1			0x2110
+#define RT5514_PLL3_CALIB_CTRL4			0x2120
 #define RT5514_PLL3_CALIB_CTRL5			0x2124
+#define RT5514_PLL3_CALIB_CTRL6			0x2128
 #define RT5514_DELAY_BUF_CTRL1			0x2140
 #define RT5514_DELAY_BUF_CTRL3			0x2148
 #define RT5514_ASRC_IN_CTRL1			0x2180
@@ -272,7 +274,7 @@ struct rt5514_priv {
 	struct rt5514_platform_data pdata;
 	struct snd_soc_codec *codec;
 	struct regmap *i2c_regmap, *regmap;
-	struct clk *mclk;
+	struct clk *mclk, *dsp_calib_clk;
 	int sysclk;
 	int sysclk_src;
 	int lrck;
@@ -281,6 +283,7 @@ struct rt5514_priv {
 	int pll_in;
 	int pll_out;
 	int dsp_enabled;
+	unsigned int pll3_cal_value;
 };
 
 #endif /* __RT5514_H__ */
diff --git a/sound/soc/codecs/rt5645.c b/sound/soc/codecs/rt5645.c
index edc152c..8f140c8 100644
--- a/sound/soc/codecs/rt5645.c
+++ b/sound/soc/codecs/rt5645.c
@@ -1943,6 +1943,56 @@ static int rt5650_hp_event(struct snd_soc_dapm_widget *w,
 	return 0;
 }
 
+static int rt5645_set_micbias1_event(struct snd_soc_dapm_widget *w,
+		struct snd_kcontrol *k, int  event)
+{
+	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
+
+	switch (event) {
+	case SND_SOC_DAPM_PRE_PMU:
+		snd_soc_update_bits(codec, RT5645_GEN_CTRL2,
+			RT5645_MICBIAS1_POW_CTRL_SEL_MASK,
+			RT5645_MICBIAS1_POW_CTRL_SEL_M);
+		break;
+
+	case SND_SOC_DAPM_POST_PMD:
+		snd_soc_update_bits(codec, RT5645_GEN_CTRL2,
+			RT5645_MICBIAS1_POW_CTRL_SEL_MASK,
+			RT5645_MICBIAS1_POW_CTRL_SEL_A);
+		break;
+
+	default:
+		return 0;
+	}
+
+	return 0;
+}
+
+static int rt5645_set_micbias2_event(struct snd_soc_dapm_widget *w,
+		struct snd_kcontrol *k, int  event)
+{
+	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
+
+	switch (event) {
+	case SND_SOC_DAPM_PRE_PMU:
+		snd_soc_update_bits(codec, RT5645_GEN_CTRL2,
+			RT5645_MICBIAS2_POW_CTRL_SEL_MASK,
+			RT5645_MICBIAS2_POW_CTRL_SEL_M);
+		break;
+
+	case SND_SOC_DAPM_POST_PMD:
+		snd_soc_update_bits(codec, RT5645_GEN_CTRL2,
+			RT5645_MICBIAS2_POW_CTRL_SEL_MASK,
+			RT5645_MICBIAS2_POW_CTRL_SEL_A);
+		break;
+
+	default:
+		return 0;
+	}
+
+	return 0;
+}
+
 static const struct snd_soc_dapm_widget rt5645_dapm_widgets[] = {
 	SND_SOC_DAPM_SUPPLY("LDO2", RT5645_PWR_MIXER,
 		RT5645_PWR_LDO2_BIT, 0, NULL, 0),
@@ -1980,10 +2030,12 @@ static const struct snd_soc_dapm_widget rt5645_dapm_widgets[] = {
 
 	/* Input Side */
 	/* micbias */
-	SND_SOC_DAPM_MICBIAS("micbias1", RT5645_PWR_ANLG2,
-			RT5645_PWR_MB1_BIT, 0),
-	SND_SOC_DAPM_MICBIAS("micbias2", RT5645_PWR_ANLG2,
-			RT5645_PWR_MB2_BIT, 0),
+	SND_SOC_DAPM_SUPPLY("micbias1", RT5645_PWR_ANLG2,
+			RT5645_PWR_MB1_BIT, 0, rt5645_set_micbias1_event,
+			SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
+	SND_SOC_DAPM_SUPPLY("micbias2", RT5645_PWR_ANLG2,
+			RT5645_PWR_MB2_BIT, 0, rt5645_set_micbias2_event,
+			SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
 	/* Input Lines */
 	SND_SOC_DAPM_INPUT("DMIC L1"),
 	SND_SOC_DAPM_INPUT("DMIC R1"),
@@ -3394,6 +3446,9 @@ static int rt5645_probe(struct snd_soc_codec *codec)
 		snd_soc_dapm_sync(dapm);
 	}
 
+	if (rt5645->pdata.long_name)
+		codec->component.card->long_name = rt5645->pdata.long_name;
+
 	rt5645->eq_param = devm_kzalloc(codec->dev,
 		RT5645_HWEQ_NUM * sizeof(struct rt5645_eq_param_s), GFP_KERNEL);
 
@@ -3570,40 +3625,12 @@ static const struct acpi_device_id rt5645_acpi_match[] = {
 MODULE_DEVICE_TABLE(acpi, rt5645_acpi_match);
 #endif
 
-static const struct rt5645_platform_data general_platform_data = {
+static const struct rt5645_platform_data intel_braswell_platform_data = {
 	.dmic1_data_pin = RT5645_DMIC1_DISABLE,
 	.dmic2_data_pin = RT5645_DMIC_DATA_IN2P,
 	.jd_mode = 3,
 };
 
-static const struct dmi_system_id dmi_platform_intel_braswell[] = {
-	{
-		.ident = "Intel Strago",
-		.matches = {
-			DMI_MATCH(DMI_PRODUCT_NAME, "Strago"),
-		},
-	},
-	{
-		.ident = "Google Chrome",
-		.matches = {
-			DMI_MATCH(DMI_SYS_VENDOR, "GOOGLE"),
-		},
-	},
-	{
-		.ident = "Google Setzer",
-		.matches = {
-			DMI_MATCH(DMI_PRODUCT_NAME, "Setzer"),
-		},
-	},
-	{
-		.ident = "Microsoft Surface 3",
-		.matches = {
-			DMI_MATCH(DMI_PRODUCT_NAME, "Surface 3"),
-		},
-	},
-	{ }
-};
-
 static const struct rt5645_platform_data buddy_platform_data = {
 	.dmic1_data_pin = RT5645_DMIC_DATA_GPIO5,
 	.dmic2_data_pin = RT5645_DMIC_DATA_IN2P,
@@ -3611,22 +3638,61 @@ static const struct rt5645_platform_data buddy_platform_data = {
 	.level_trigger_irq = true,
 };
 
-static const struct dmi_system_id dmi_platform_intel_broadwell[] = {
+static const struct rt5645_platform_data gpd_win_platform_data = {
+	.jd_mode = 3,
+	.inv_jd1_1 = true,
+	.long_name = "gpd-win-pocket-rt5645",
+	/* The GPD pocket has a diff. mic, for the win this does not matter. */
+	.in2_diff = true,
+};
+
+static const struct rt5645_platform_data asus_t100ha_platform_data = {
+	.dmic1_data_pin = RT5645_DMIC_DATA_IN2N,
+	.dmic2_data_pin = RT5645_DMIC2_DISABLE,
+	.jd_mode = 3,
+	.inv_jd1_1 = true,
+};
+
+static const struct rt5645_platform_data jd_mode3_platform_data = {
+	.jd_mode = 3,
+};
+
+static const struct dmi_system_id dmi_platform_data[] = {
 	{
 		.ident = "Chrome Buddy",
 		.matches = {
 			DMI_MATCH(DMI_PRODUCT_NAME, "Buddy"),
 		},
+		.driver_data = (void *)&buddy_platform_data,
 	},
-	{ }
-};
-
-static const struct rt5645_platform_data gpd_win_platform_data = {
-	.jd_mode = 3,
-	.inv_jd1_1 = true,
-};
-
-static const struct dmi_system_id dmi_platform_gpd_win[] = {
+	{
+		.ident = "Intel Strago",
+		.matches = {
+			DMI_MATCH(DMI_PRODUCT_NAME, "Strago"),
+		},
+		.driver_data = (void *)&intel_braswell_platform_data,
+	},
+	{
+		.ident = "Google Chrome",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "GOOGLE"),
+		},
+		.driver_data = (void *)&intel_braswell_platform_data,
+	},
+	{
+		.ident = "Google Setzer",
+		.matches = {
+			DMI_MATCH(DMI_PRODUCT_NAME, "Setzer"),
+		},
+		.driver_data = (void *)&intel_braswell_platform_data,
+	},
+	{
+		.ident = "Microsoft Surface 3",
+		.matches = {
+			DMI_MATCH(DMI_PRODUCT_NAME, "Surface 3"),
+		},
+		.driver_data = (void *)&intel_braswell_platform_data,
+	},
 	{
 		/*
 		 * Match for the GPDwin which unfortunately uses somewhat
@@ -3637,46 +3703,38 @@ static const struct dmi_system_id dmi_platform_gpd_win[] = {
 		 * the same default product_name. Also the GPDwin is the
 		 * only device to have both board_ and product_name not set.
 		 */
-		.ident = "GPD Win",
+		.ident = "GPD Win / Pocket",
 		.matches = {
 			DMI_MATCH(DMI_BOARD_VENDOR, "AMI Corporation"),
 			DMI_MATCH(DMI_BOARD_NAME, "Default string"),
 			DMI_MATCH(DMI_BOARD_SERIAL, "Default string"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "Default string"),
 		},
+		.driver_data = (void *)&gpd_win_platform_data,
 	},
-	{}
-};
-
-static const struct rt5645_platform_data general_platform_data2 = {
-	.dmic1_data_pin = RT5645_DMIC_DATA_IN2N,
-	.dmic2_data_pin = RT5645_DMIC2_DISABLE,
-	.jd_mode = 3,
-	.inv_jd1_1 = true,
-};
-
-static const struct dmi_system_id dmi_platform_asus_t100ha[] = {
 	{
 		.ident = "ASUS T100HAN",
 		.matches = {
 			DMI_EXACT_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
 			DMI_MATCH(DMI_PRODUCT_NAME, "T100HAN"),
 		},
+		.driver_data = (void *)&asus_t100ha_platform_data,
 	},
-	{ }
-};
-
-static const struct rt5645_platform_data minix_z83_4_platform_data = {
-	.jd_mode = 3,
-};
-
-static const struct dmi_system_id dmi_platform_minix_z83_4[] = {
 	{
 		.ident = "MINIX Z83-4",
 		.matches = {
 			DMI_EXACT_MATCH(DMI_SYS_VENDOR, "MINIX"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "Z83-4"),
 		},
+		.driver_data = (void *)&jd_mode3_platform_data,
+	},
+	{
+		.ident = "Teclast X80 Pro",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "TECLAST"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "X80 Pro"),
+		},
+		.driver_data = (void *)&jd_mode3_platform_data,
 	},
 	{ }
 };
@@ -3684,9 +3742,9 @@ static const struct dmi_system_id dmi_platform_minix_z83_4[] = {
 static bool rt5645_check_dp(struct device *dev)
 {
 	if (device_property_present(dev, "realtek,in2-differential") ||
-		device_property_present(dev, "realtek,dmic1-data-pin") ||
-		device_property_present(dev, "realtek,dmic2-data-pin") ||
-		device_property_present(dev, "realtek,jd-mode"))
+	    device_property_present(dev, "realtek,dmic1-data-pin") ||
+	    device_property_present(dev, "realtek,dmic2-data-pin") ||
+	    device_property_present(dev, "realtek,jd-mode"))
 		return true;
 
 	return false;
@@ -3710,6 +3768,7 @@ static int rt5645_i2c_probe(struct i2c_client *i2c,
 		    const struct i2c_device_id *id)
 {
 	struct rt5645_platform_data *pdata = dev_get_platdata(&i2c->dev);
+	const struct dmi_system_id *dmi_data;
 	struct rt5645_priv *rt5645;
 	int ret, i;
 	unsigned int val;
@@ -3723,20 +3782,18 @@ static int rt5645_i2c_probe(struct i2c_client *i2c,
 	rt5645->i2c = i2c;
 	i2c_set_clientdata(i2c, rt5645);
 
+	dmi_data = dmi_first_match(dmi_platform_data);
+	if (dmi_data) {
+		dev_info(&i2c->dev, "Detected %s platform\n", dmi_data->ident);
+		pdata = dmi_data->driver_data;
+	}
+
 	if (pdata)
 		rt5645->pdata = *pdata;
-	else if (dmi_check_system(dmi_platform_intel_broadwell))
-		rt5645->pdata = buddy_platform_data;
 	else if (rt5645_check_dp(&i2c->dev))
 		rt5645_parse_dt(rt5645, &i2c->dev);
-	else if (dmi_check_system(dmi_platform_intel_braswell))
-		rt5645->pdata = general_platform_data;
-	else if (dmi_check_system(dmi_platform_gpd_win))
-		rt5645->pdata = gpd_win_platform_data;
-	else if (dmi_check_system(dmi_platform_asus_t100ha))
-		rt5645->pdata = general_platform_data2;
-	else if (dmi_check_system(dmi_platform_minix_z83_4))
-		rt5645->pdata = minix_z83_4_platform_data;
+	else
+		rt5645->pdata = jd_mode3_platform_data;
 
 	if (quirk != -1) {
 		rt5645->pdata.in2_diff = QUIRK_IN2_DIFF(quirk);
diff --git a/sound/soc/codecs/rt5645.h b/sound/soc/codecs/rt5645.h
index cfc5f975..940325b 100644
--- a/sound/soc/codecs/rt5645.h
+++ b/sound/soc/codecs/rt5645.h
@@ -2117,6 +2117,12 @@ enum {
 #define RT5645_RXDC_SRC_STO			(0x0 << 7)
 #define RT5645_RXDC_SRC_MONO			(0x1 << 7)
 #define RT5645_RXDC_SRC_SFT			(7)
+#define RT5645_MICBIAS1_POW_CTRL_SEL_MASK	(0x1 << 5)
+#define RT5645_MICBIAS1_POW_CTRL_SEL_A		(0x0 << 5)
+#define RT5645_MICBIAS1_POW_CTRL_SEL_M		(0x1 << 5)
+#define RT5645_MICBIAS2_POW_CTRL_SEL_MASK	(0x1 << 4)
+#define RT5645_MICBIAS2_POW_CTRL_SEL_A		(0x0 << 4)
+#define RT5645_MICBIAS2_POW_CTRL_SEL_M		(0x1 << 4)
 #define RT5645_RXDP2_SEL_MASK			(0x1 << 3)
 #define RT5645_RXDP2_SEL_IF2			(0x0 << 3)
 #define RT5645_RXDP2_SEL_ADC			(0x1 << 3)
diff --git a/sound/soc/codecs/sgtl5000.c b/sound/soc/codecs/sgtl5000.c
index f2bb4fe..633cdcf 100644
--- a/sound/soc/codecs/sgtl5000.c
+++ b/sound/soc/codecs/sgtl5000.c
@@ -1332,10 +1332,13 @@ static int sgtl5000_i2c_probe(struct i2c_client *client,
 	sgtl5000->mclk = devm_clk_get(&client->dev, NULL);
 	if (IS_ERR(sgtl5000->mclk)) {
 		ret = PTR_ERR(sgtl5000->mclk);
-		dev_err(&client->dev, "Failed to get mclock: %d\n", ret);
 		/* Defer the probe to see if the clk will be provided later */
 		if (ret == -ENOENT)
 			ret = -EPROBE_DEFER;
+
+		if (ret != -EPROBE_DEFER)
+			dev_err(&client->dev, "Failed to get mclock: %d\n",
+				ret);
 		goto disable_regs;
 	}
 
diff --git a/sound/soc/codecs/si476x.c b/sound/soc/codecs/si476x.c
index 354dc0d..7b91ee2 100644
--- a/sound/soc/codecs/si476x.c
+++ b/sound/soc/codecs/si476x.c
@@ -231,14 +231,17 @@ static struct snd_soc_dai_driver si476x_dai = {
 	.ops		= &si476x_dai_ops,
 };
 
-static struct regmap *si476x_get_regmap(struct device *dev)
+static int si476x_probe(struct snd_soc_component *component)
 {
-	return dev_get_regmap(dev->parent, NULL);
+	snd_soc_component_init_regmap(component,
+				dev_get_regmap(component->dev->parent, NULL));
+
+	return 0;
 }
 
 static const struct snd_soc_codec_driver soc_codec_dev_si476x = {
-	.get_regmap = si476x_get_regmap,
 	.component_driver = {
+		.probe			= si476x_probe,
 		.dapm_widgets		= si476x_dapm_widgets,
 		.num_dapm_widgets	= ARRAY_SIZE(si476x_dapm_widgets),
 		.dapm_routes		= si476x_dapm_routes,
diff --git a/sound/soc/codecs/sn95031.c b/sound/soc/codecs/sn95031.c
deleted file mode 100644
index 887923e..0000000
--- a/sound/soc/codecs/sn95031.c
+++ /dev/null
@@ -1,936 +0,0 @@
-/*
- *  sn95031.c -  TI sn95031 Codec driver
- *
- *  Copyright (C) 2010 Intel Corp
- *  Author: Vinod Koul <vinod.koul@intel.com>
- *  Author: Harsha Priya <priya.harsha@intel.com>
- *  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; version 2 of the License.
- *
- *  This program is distributed in the hope that it will be useful, but
- *  WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- *  General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License along
- *  with this program; if not, write to the Free Software Foundation, Inc.,
- *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
- *
- * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- *
- *
- */
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/platform_device.h>
-#include <linux/delay.h>
-#include <linux/slab.h>
-#include <linux/module.h>
-
-#include <asm/intel_scu_ipc.h>
-#include <sound/pcm.h>
-#include <sound/pcm_params.h>
-#include <sound/soc.h>
-#include <sound/soc-dapm.h>
-#include <sound/initval.h>
-#include <sound/tlv.h>
-#include <sound/jack.h>
-#include "sn95031.h"
-
-#define SN95031_RATES (SNDRV_PCM_RATE_48000 | SNDRV_PCM_RATE_44100)
-#define SN95031_FORMATS (SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S16_LE)
-
-/* adc helper functions */
-
-/* enables mic bias voltage */
-static void sn95031_enable_mic_bias(struct snd_soc_codec *codec)
-{
-	snd_soc_write(codec, SN95031_VAUD, BIT(2)|BIT(1)|BIT(0));
-	snd_soc_update_bits(codec, SN95031_MICBIAS, BIT(2), BIT(2));
-}
-
-/* Enable/Disable the ADC depending on the argument */
-static void configure_adc(struct snd_soc_codec *sn95031_codec, int val)
-{
-	int value = snd_soc_read(sn95031_codec, SN95031_ADC1CNTL1);
-
-	if (val) {
-		/* Enable and start the ADC */
-		value |= (SN95031_ADC_ENBL | SN95031_ADC_START);
-		value &= (~SN95031_ADC_NO_LOOP);
-	} else {
-		/* Just stop the ADC */
-		value &= (~SN95031_ADC_START);
-	}
-	snd_soc_write(sn95031_codec, SN95031_ADC1CNTL1, value);
-}
-
-/*
- * finds an empty channel for conversion
- * If the ADC is not enabled then start using 0th channel
- * itself. Otherwise find an empty channel by looking for a
- * channel in which the stopbit is set to 1. returns the index
- * of the first free channel if succeeds or an error code.
- *
- * Context: can sleep
- *
- */
-static int find_free_channel(struct snd_soc_codec *sn95031_codec)
-{
-	int i, value;
-
-	/* check whether ADC is enabled */
-	value = snd_soc_read(sn95031_codec, SN95031_ADC1CNTL1);
-
-	if ((value & SN95031_ADC_ENBL) == 0)
-		return 0;
-
-	/* ADC is already enabled; Looking for an empty channel */
-	for (i = 0; i <	SN95031_ADC_CHANLS_MAX; i++) {
-		value = snd_soc_read(sn95031_codec,
-				SN95031_ADC_CHNL_START_ADDR + i);
-		if (value & SN95031_STOPBIT_MASK)
-			break;
-	}
-	return (i == SN95031_ADC_CHANLS_MAX) ? (-EINVAL) : i;
-}
-
-/* Initialize the ADC for reading micbias values. Can sleep. */
-static int sn95031_initialize_adc(struct snd_soc_codec *sn95031_codec)
-{
-	int base_addr, chnl_addr;
-	int value;
-	int channel_index;
-
-	/* Index of the first channel in which the stop bit is set */
-	channel_index = find_free_channel(sn95031_codec);
-	if (channel_index < 0) {
-		pr_err("No free ADC channels");
-		return channel_index;
-	}
-
-	base_addr = SN95031_ADC_CHNL_START_ADDR + channel_index;
-
-	if (!(channel_index == 0 || channel_index ==  SN95031_ADC_LOOP_MAX)) {
-		/* Reset stop bit for channels other than 0 and 12 */
-		value = snd_soc_read(sn95031_codec, base_addr);
-		/* Set the stop bit to zero */
-		snd_soc_write(sn95031_codec, base_addr, value & 0xEF);
-		/* Index of the first free channel */
-		base_addr++;
-		channel_index++;
-	}
-
-	/* Since this is the last channel, set the stop bit
-	   to 1 by ORing the DIE_SENSOR_CODE with 0x10 */
-	snd_soc_write(sn95031_codec, base_addr,
-				SN95031_AUDIO_DETECT_CODE | 0x10);
-
-	chnl_addr = SN95031_ADC_DATA_START_ADDR + 2 * channel_index;
-	pr_debug("mid_initialize : %x", chnl_addr);
-	configure_adc(sn95031_codec, 1);
-	return chnl_addr;
-}
-
-
-/* reads the ADC registers and gets the mic bias value in mV. */
-static unsigned int sn95031_get_mic_bias(struct snd_soc_codec *codec)
-{
-	u16 adc_adr = sn95031_initialize_adc(codec);
-	u16 adc_val1, adc_val2;
-	unsigned int mic_bias;
-
-	sn95031_enable_mic_bias(codec);
-
-	/* Enable the sound card for conversion before reading */
-	snd_soc_write(codec, SN95031_ADC1CNTL3, 0x05);
-	/* Re-toggle the RRDATARD bit */
-	snd_soc_write(codec, SN95031_ADC1CNTL3, 0x04);
-
-	/* Read the higher bits of data */
-	msleep(1000);
-	adc_val1 = snd_soc_read(codec, adc_adr);
-	adc_adr++;
-	adc_val2 = snd_soc_read(codec, adc_adr);
-
-	/* Adding lower two bits to the higher bits */
-	mic_bias = (adc_val1 << 2) + (adc_val2 & 3);
-	mic_bias = (mic_bias * SN95031_ADC_ONE_LSB_MULTIPLIER) / 1000;
-	pr_debug("mic bias = %dmV\n", mic_bias);
-	return mic_bias;
-}
-/*end - adc helper functions */
-
-static int sn95031_read(void *ctx, unsigned int reg, unsigned int *val)
-{
-	u8 value = 0;
-	int ret;
-
-	ret = intel_scu_ipc_ioread8(reg, &value);
-	if (ret == 0)
-		*val = value;
-
-	return ret;
-}
-
-static int sn95031_write(void *ctx, unsigned int reg, unsigned int value)
-{
-	return intel_scu_ipc_iowrite8(reg, value);
-}
-
-static const struct regmap_config sn95031_regmap = {
-	.reg_read = sn95031_read,
-	.reg_write = sn95031_write,
-};
-
-static int sn95031_set_vaud_bias(struct snd_soc_codec *codec,
-		enum snd_soc_bias_level level)
-{
-	switch (level) {
-	case SND_SOC_BIAS_ON:
-		break;
-
-	case SND_SOC_BIAS_PREPARE:
-		if (snd_soc_codec_get_bias_level(codec) == SND_SOC_BIAS_STANDBY) {
-			pr_debug("vaud_bias powering up pll\n");
-			/* power up the pll */
-			snd_soc_write(codec, SN95031_AUDPLLCTRL, BIT(5));
-			/* enable pcm 2 */
-			snd_soc_update_bits(codec, SN95031_PCM2C2,
-					BIT(0), BIT(0));
-		}
-		break;
-
-	case SND_SOC_BIAS_STANDBY:
-		switch (snd_soc_codec_get_bias_level(codec)) {
-		case SND_SOC_BIAS_OFF:
-			pr_debug("vaud_bias power up rail\n");
-			/* power up the rail */
-			snd_soc_write(codec, SN95031_VAUD,
-					BIT(2)|BIT(1)|BIT(0));
-			msleep(1);
-			break;
-		case SND_SOC_BIAS_PREPARE:
-			/* turn off pcm */
-			pr_debug("vaud_bias power dn pcm\n");
-			snd_soc_update_bits(codec, SN95031_PCM2C2, BIT(0), 0);
-			snd_soc_write(codec, SN95031_AUDPLLCTRL, 0);
-			break;
-		default:
-			break;
-		}
-		break;
-
-
-	case SND_SOC_BIAS_OFF:
-		pr_debug("vaud_bias _OFF doing rail shutdown\n");
-		snd_soc_write(codec, SN95031_VAUD, BIT(3));
-		break;
-	}
-
-	return 0;
-}
-
-static int sn95031_vhs_event(struct snd_soc_dapm_widget *w,
-		    struct snd_kcontrol *kcontrol, int event)
-{
-	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
-
-	if (SND_SOC_DAPM_EVENT_ON(event)) {
-		pr_debug("VHS SND_SOC_DAPM_EVENT_ON doing rail startup now\n");
-		/* power up the rail */
-		snd_soc_write(codec, SN95031_VHSP, 0x3D);
-		snd_soc_write(codec, SN95031_VHSN, 0x3F);
-		msleep(1);
-	} else if (SND_SOC_DAPM_EVENT_OFF(event)) {
-		pr_debug("VHS SND_SOC_DAPM_EVENT_OFF doing rail shutdown\n");
-		snd_soc_write(codec, SN95031_VHSP, 0xC4);
-		snd_soc_write(codec, SN95031_VHSN, 0x04);
-	}
-	return 0;
-}
-
-static int sn95031_vihf_event(struct snd_soc_dapm_widget *w,
-		    struct snd_kcontrol *kcontrol, int event)
-{
-	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
-
-	if (SND_SOC_DAPM_EVENT_ON(event)) {
-		pr_debug("VIHF SND_SOC_DAPM_EVENT_ON doing rail startup now\n");
-		/* power up the rail */
-		snd_soc_write(codec, SN95031_VIHF, 0x27);
-		msleep(1);
-	} else if (SND_SOC_DAPM_EVENT_OFF(event)) {
-		pr_debug("VIHF SND_SOC_DAPM_EVENT_OFF doing rail shutdown\n");
-		snd_soc_write(codec, SN95031_VIHF, 0x24);
-	}
-	return 0;
-}
-
-static int sn95031_dmic12_event(struct snd_soc_dapm_widget *w,
-			struct snd_kcontrol *k, int event)
-{
-	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
-	unsigned int ldo = 0, clk_dir = 0, data_dir = 0;
-
-	if (SND_SOC_DAPM_EVENT_ON(event)) {
-		ldo = BIT(5)|BIT(4);
-		clk_dir = BIT(0);
-		data_dir = BIT(7);
-	}
-	/* program DMIC LDO, clock and set clock */
-	snd_soc_update_bits(codec, SN95031_MICBIAS, BIT(5)|BIT(4), ldo);
-	snd_soc_update_bits(codec, SN95031_DMICBUF0123, BIT(0), clk_dir);
-	snd_soc_update_bits(codec, SN95031_DMICBUF0123, BIT(7), data_dir);
-	return 0;
-}
-
-static int sn95031_dmic34_event(struct snd_soc_dapm_widget *w,
-			struct snd_kcontrol *k, int event)
-{
-	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
-	unsigned int ldo = 0, clk_dir = 0, data_dir = 0;
-
-	if (SND_SOC_DAPM_EVENT_ON(event)) {
-		ldo = BIT(5)|BIT(4);
-		clk_dir = BIT(2);
-		data_dir = BIT(1);
-	}
-	/* program DMIC LDO, clock and set clock */
-	snd_soc_update_bits(codec, SN95031_MICBIAS, BIT(5)|BIT(4), ldo);
-	snd_soc_update_bits(codec, SN95031_DMICBUF0123, BIT(2), clk_dir);
-	snd_soc_update_bits(codec, SN95031_DMICBUF45, BIT(1), data_dir);
-	return 0;
-}
-
-static int sn95031_dmic56_event(struct snd_soc_dapm_widget *w,
-			struct snd_kcontrol *k, int event)
-{
-	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
-	unsigned int ldo = 0;
-
-	if (SND_SOC_DAPM_EVENT_ON(event))
-		ldo = BIT(7)|BIT(6);
-
-	/* program DMIC LDO */
-	snd_soc_update_bits(codec, SN95031_MICBIAS, BIT(7)|BIT(6), ldo);
-	return 0;
-}
-
-/* mux controls */
-static const char *sn95031_mic_texts[] = { "AMIC", "LineIn" };
-
-static SOC_ENUM_SINGLE_DECL(sn95031_micl_enum,
-			    SN95031_ADCCONFIG, 1, sn95031_mic_texts);
-
-static const struct snd_kcontrol_new sn95031_micl_mux_control =
-	SOC_DAPM_ENUM("Route", sn95031_micl_enum);
-
-static SOC_ENUM_SINGLE_DECL(sn95031_micr_enum,
-			    SN95031_ADCCONFIG, 3, sn95031_mic_texts);
-
-static const struct snd_kcontrol_new sn95031_micr_mux_control =
-	SOC_DAPM_ENUM("Route", sn95031_micr_enum);
-
-static const char *sn95031_input_texts[] = {	"DMIC1", "DMIC2", "DMIC3",
-						"DMIC4", "DMIC5", "DMIC6",
-						"ADC Left", "ADC Right" };
-
-static SOC_ENUM_SINGLE_DECL(sn95031_input1_enum,
-			    SN95031_AUDIOMUX12, 0, sn95031_input_texts);
-
-static const struct snd_kcontrol_new sn95031_input1_mux_control =
-	SOC_DAPM_ENUM("Route", sn95031_input1_enum);
-
-static SOC_ENUM_SINGLE_DECL(sn95031_input2_enum,
-			    SN95031_AUDIOMUX12, 4, sn95031_input_texts);
-
-static const struct snd_kcontrol_new sn95031_input2_mux_control =
-	SOC_DAPM_ENUM("Route", sn95031_input2_enum);
-
-static SOC_ENUM_SINGLE_DECL(sn95031_input3_enum,
-			    SN95031_AUDIOMUX34, 0, sn95031_input_texts);
-
-static const struct snd_kcontrol_new sn95031_input3_mux_control =
-	SOC_DAPM_ENUM("Route", sn95031_input3_enum);
-
-static SOC_ENUM_SINGLE_DECL(sn95031_input4_enum,
-			    SN95031_AUDIOMUX34, 4, sn95031_input_texts);
-
-static const struct snd_kcontrol_new sn95031_input4_mux_control =
-	SOC_DAPM_ENUM("Route", sn95031_input4_enum);
-
-/* capture path controls */
-
-static const char *sn95031_micmode_text[] = {"Single Ended", "Differential"};
-
-/* 0dB to 30dB in 10dB steps */
-static const DECLARE_TLV_DB_SCALE(mic_tlv, 0, 10, 0);
-
-static SOC_ENUM_SINGLE_DECL(sn95031_micmode1_enum,
-			    SN95031_MICAMP1, 1, sn95031_micmode_text);
-static SOC_ENUM_SINGLE_DECL(sn95031_micmode2_enum,
-			    SN95031_MICAMP2, 1, sn95031_micmode_text);
-
-static const char *sn95031_dmic_cfg_text[] = {"GPO", "DMIC"};
-
-static SOC_ENUM_SINGLE_DECL(sn95031_dmic12_cfg_enum,
-			    SN95031_DMICMUX, 0, sn95031_dmic_cfg_text);
-static SOC_ENUM_SINGLE_DECL(sn95031_dmic34_cfg_enum,
-			    SN95031_DMICMUX, 1, sn95031_dmic_cfg_text);
-static SOC_ENUM_SINGLE_DECL(sn95031_dmic56_cfg_enum,
-			    SN95031_DMICMUX, 2, sn95031_dmic_cfg_text);
-
-static const struct snd_kcontrol_new sn95031_snd_controls[] = {
-	SOC_ENUM("Mic1Mode Capture Route", sn95031_micmode1_enum),
-	SOC_ENUM("Mic2Mode Capture Route", sn95031_micmode2_enum),
-	SOC_ENUM("DMIC12 Capture Route", sn95031_dmic12_cfg_enum),
-	SOC_ENUM("DMIC34 Capture Route", sn95031_dmic34_cfg_enum),
-	SOC_ENUM("DMIC56 Capture Route", sn95031_dmic56_cfg_enum),
-	SOC_SINGLE_TLV("Mic1 Capture Volume", SN95031_MICAMP1,
-			2, 4, 0, mic_tlv),
-	SOC_SINGLE_TLV("Mic2 Capture Volume", SN95031_MICAMP2,
-			2, 4, 0, mic_tlv),
-};
-
-/* DAPM widgets */
-static const struct snd_soc_dapm_widget sn95031_dapm_widgets[] = {
-
-	/* all end points mic, hs etc */
-	SND_SOC_DAPM_OUTPUT("HPOUTL"),
-	SND_SOC_DAPM_OUTPUT("HPOUTR"),
-	SND_SOC_DAPM_OUTPUT("EPOUT"),
-	SND_SOC_DAPM_OUTPUT("IHFOUTL"),
-	SND_SOC_DAPM_OUTPUT("IHFOUTR"),
-	SND_SOC_DAPM_OUTPUT("LINEOUTL"),
-	SND_SOC_DAPM_OUTPUT("LINEOUTR"),
-	SND_SOC_DAPM_OUTPUT("VIB1OUT"),
-	SND_SOC_DAPM_OUTPUT("VIB2OUT"),
-
-	SND_SOC_DAPM_INPUT("AMIC1"), /* headset mic */
-	SND_SOC_DAPM_INPUT("AMIC2"),
-	SND_SOC_DAPM_INPUT("DMIC1"),
-	SND_SOC_DAPM_INPUT("DMIC2"),
-	SND_SOC_DAPM_INPUT("DMIC3"),
-	SND_SOC_DAPM_INPUT("DMIC4"),
-	SND_SOC_DAPM_INPUT("DMIC5"),
-	SND_SOC_DAPM_INPUT("DMIC6"),
-	SND_SOC_DAPM_INPUT("LINEINL"),
-	SND_SOC_DAPM_INPUT("LINEINR"),
-
-	SND_SOC_DAPM_MICBIAS("AMIC1Bias", SN95031_MICBIAS, 2, 0),
-	SND_SOC_DAPM_MICBIAS("AMIC2Bias", SN95031_MICBIAS, 3, 0),
-	SND_SOC_DAPM_MICBIAS("DMIC12Bias", SN95031_DMICMUX, 3, 0),
-	SND_SOC_DAPM_MICBIAS("DMIC34Bias", SN95031_DMICMUX, 4, 0),
-	SND_SOC_DAPM_MICBIAS("DMIC56Bias", SN95031_DMICMUX, 5, 0),
-
-	SND_SOC_DAPM_SUPPLY("DMIC12supply", SN95031_DMICLK, 0, 0,
-				sn95031_dmic12_event,
-				SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
-	SND_SOC_DAPM_SUPPLY("DMIC34supply", SN95031_DMICLK, 1, 0,
-				sn95031_dmic34_event,
-				SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
-	SND_SOC_DAPM_SUPPLY("DMIC56supply", SN95031_DMICLK, 2, 0,
-				sn95031_dmic56_event,
-				SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
-
-	SND_SOC_DAPM_AIF_OUT("PCM_Out", "Capture", 0,
-			SND_SOC_NOPM, 0, 0),
-
-	SND_SOC_DAPM_SUPPLY("Headset Rail", SND_SOC_NOPM, 0, 0,
-			sn95031_vhs_event,
-			SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
-	SND_SOC_DAPM_SUPPLY("Speaker Rail", SND_SOC_NOPM, 0, 0,
-			sn95031_vihf_event,
-			SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
-
-	/* playback path driver enables */
-	SND_SOC_DAPM_PGA("Headset Left Playback",
-			SN95031_DRIVEREN, 0, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Headset Right Playback",
-			SN95031_DRIVEREN, 1, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Speaker Left Playback",
-			SN95031_DRIVEREN, 2, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Speaker Right Playback",
-			SN95031_DRIVEREN, 3, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Vibra1 Playback",
-			SN95031_DRIVEREN, 4, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Vibra2 Playback",
-			SN95031_DRIVEREN, 5, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Earpiece Playback",
-			SN95031_DRIVEREN, 6, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Lineout Left Playback",
-			SN95031_LOCTL, 0, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Lineout Right Playback",
-			SN95031_LOCTL, 4, 0, NULL, 0),
-
-	/* playback path filter enable */
-	SND_SOC_DAPM_PGA("Headset Left Filter",
-			SN95031_HSEPRXCTRL, 4, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("Headset Right Filter",
-			SN95031_HSEPRXCTRL, 5, 0,  NULL, 0),
-	SND_SOC_DAPM_PGA("Speaker Left Filter",
-			SN95031_IHFRXCTRL, 0, 0,  NULL, 0),
-	SND_SOC_DAPM_PGA("Speaker Right Filter",
-			SN95031_IHFRXCTRL, 1, 0,  NULL, 0),
-
-	/* DACs */
-	SND_SOC_DAPM_DAC("HSDAC Left", "Headset",
-			SN95031_DACCONFIG, 0, 0),
-	SND_SOC_DAPM_DAC("HSDAC Right", "Headset",
-			SN95031_DACCONFIG, 1, 0),
-	SND_SOC_DAPM_DAC("IHFDAC Left", "Speaker",
-			SN95031_DACCONFIG, 2, 0),
-	SND_SOC_DAPM_DAC("IHFDAC Right", "Speaker",
-			SN95031_DACCONFIG, 3, 0),
-	SND_SOC_DAPM_DAC("Vibra1 DAC", "Vibra1",
-			SN95031_VIB1C5, 1, 0),
-	SND_SOC_DAPM_DAC("Vibra2 DAC", "Vibra2",
-			SN95031_VIB2C5, 1, 0),
-
-	/* capture widgets */
-	SND_SOC_DAPM_PGA("LineIn Enable Left", SN95031_MICAMP1,
-				7, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("LineIn Enable Right", SN95031_MICAMP2,
-				7, 0, NULL, 0),
-
-	SND_SOC_DAPM_PGA("MIC1 Enable", SN95031_MICAMP1, 0, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("MIC2 Enable", SN95031_MICAMP2, 0, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("TX1 Enable", SN95031_AUDIOTXEN, 2, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("TX2 Enable", SN95031_AUDIOTXEN, 3, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("TX3 Enable", SN95031_AUDIOTXEN, 4, 0, NULL, 0),
-	SND_SOC_DAPM_PGA("TX4 Enable", SN95031_AUDIOTXEN, 5, 0, NULL, 0),
-
-	/* ADC have null stream as they will be turned ON by TX path */
-	SND_SOC_DAPM_ADC("ADC Left", NULL,
-			SN95031_ADCCONFIG, 0, 0),
-	SND_SOC_DAPM_ADC("ADC Right", NULL,
-			SN95031_ADCCONFIG, 2, 0),
-
-	SND_SOC_DAPM_MUX("Mic_InputL Capture Route",
-			SND_SOC_NOPM, 0, 0, &sn95031_micl_mux_control),
-	SND_SOC_DAPM_MUX("Mic_InputR Capture Route",
-			SND_SOC_NOPM, 0, 0, &sn95031_micr_mux_control),
-
-	SND_SOC_DAPM_MUX("Txpath1 Capture Route",
-			SND_SOC_NOPM, 0, 0, &sn95031_input1_mux_control),
-	SND_SOC_DAPM_MUX("Txpath2 Capture Route",
-			SND_SOC_NOPM, 0, 0, &sn95031_input2_mux_control),
-	SND_SOC_DAPM_MUX("Txpath3 Capture Route",
-			SND_SOC_NOPM, 0, 0, &sn95031_input3_mux_control),
-	SND_SOC_DAPM_MUX("Txpath4 Capture Route",
-			SND_SOC_NOPM, 0, 0, &sn95031_input4_mux_control),
-
-};
-
-static const struct snd_soc_dapm_route sn95031_audio_map[] = {
-	/* headset and earpiece map */
-	{ "HPOUTL", NULL, "Headset Rail"},
-	{ "HPOUTR", NULL, "Headset Rail"},
-	{ "HPOUTL", NULL, "Headset Left Playback" },
-	{ "HPOUTR", NULL, "Headset Right Playback" },
-	{ "EPOUT", NULL, "Earpiece Playback" },
-	{ "Headset Left Playback", NULL, "Headset Left Filter"},
-	{ "Headset Right Playback", NULL, "Headset Right Filter"},
-	{ "Earpiece Playback", NULL, "Headset Left Filter"},
-	{ "Headset Left Filter", NULL, "HSDAC Left"},
-	{ "Headset Right Filter", NULL, "HSDAC Right"},
-
-	/* speaker map */
-	{ "IHFOUTL", NULL, "Speaker Rail"},
-	{ "IHFOUTR", NULL, "Speaker Rail"},
-	{ "IHFOUTL", NULL, "Speaker Left Playback"},
-	{ "IHFOUTR", NULL, "Speaker Right Playback"},
-	{ "Speaker Left Playback", NULL, "Speaker Left Filter"},
-	{ "Speaker Right Playback", NULL, "Speaker Right Filter"},
-	{ "Speaker Left Filter", NULL, "IHFDAC Left"},
-	{ "Speaker Right Filter", NULL, "IHFDAC Right"},
-
-	/* vibra map */
-	{ "VIB1OUT", NULL, "Vibra1 Playback"},
-	{ "Vibra1 Playback", NULL, "Vibra1 DAC"},
-
-	{ "VIB2OUT", NULL, "Vibra2 Playback"},
-	{ "Vibra2 Playback", NULL, "Vibra2 DAC"},
-
-	/* lineout */
-	{ "LINEOUTL", NULL, "Lineout Left Playback"},
-	{ "LINEOUTR", NULL, "Lineout Right Playback"},
-	{ "Lineout Left Playback", NULL, "Headset Left Filter"},
-	{ "Lineout Left Playback", NULL, "Speaker Left Filter"},
-	{ "Lineout Left Playback", NULL, "Vibra1 DAC"},
-	{ "Lineout Right Playback", NULL, "Headset Right Filter"},
-	{ "Lineout Right Playback", NULL, "Speaker Right Filter"},
-	{ "Lineout Right Playback", NULL, "Vibra2 DAC"},
-
-	/* Headset (AMIC1) mic */
-	{ "AMIC1Bias", NULL, "AMIC1"},
-	{ "MIC1 Enable", NULL, "AMIC1Bias"},
-	{ "Mic_InputL Capture Route", "AMIC", "MIC1 Enable"},
-
-	/* AMIC2 */
-	{ "AMIC2Bias", NULL, "AMIC2"},
-	{ "MIC2 Enable", NULL, "AMIC2Bias"},
-	{ "Mic_InputR Capture Route", "AMIC", "MIC2 Enable"},
-
-
-	/* Linein */
-	{ "LineIn Enable Left", NULL, "LINEINL"},
-	{ "LineIn Enable Right", NULL, "LINEINR"},
-	{ "Mic_InputL Capture Route", "LineIn", "LineIn Enable Left"},
-	{ "Mic_InputR Capture Route", "LineIn", "LineIn Enable Right"},
-
-	/* ADC connection */
-	{ "ADC Left", NULL, "Mic_InputL Capture Route"},
-	{ "ADC Right", NULL, "Mic_InputR Capture Route"},
-
-	/*DMIC connections */
-	{ "DMIC1", NULL, "DMIC12supply"},
-	{ "DMIC2", NULL, "DMIC12supply"},
-	{ "DMIC3", NULL, "DMIC34supply"},
-	{ "DMIC4", NULL, "DMIC34supply"},
-	{ "DMIC5", NULL, "DMIC56supply"},
-	{ "DMIC6", NULL, "DMIC56supply"},
-
-	{ "DMIC12Bias", NULL, "DMIC1"},
-	{ "DMIC12Bias", NULL, "DMIC2"},
-	{ "DMIC34Bias", NULL, "DMIC3"},
-	{ "DMIC34Bias", NULL, "DMIC4"},
-	{ "DMIC56Bias", NULL, "DMIC5"},
-	{ "DMIC56Bias", NULL, "DMIC6"},
-
-	/*TX path inputs*/
-	{ "Txpath1 Capture Route", "ADC Left", "ADC Left"},
-	{ "Txpath2 Capture Route", "ADC Left", "ADC Left"},
-	{ "Txpath3 Capture Route", "ADC Left", "ADC Left"},
-	{ "Txpath4 Capture Route", "ADC Left", "ADC Left"},
-	{ "Txpath1 Capture Route", "ADC Right", "ADC Right"},
-	{ "Txpath2 Capture Route", "ADC Right", "ADC Right"},
-	{ "Txpath3 Capture Route", "ADC Right", "ADC Right"},
-	{ "Txpath4 Capture Route", "ADC Right", "ADC Right"},
-	{ "Txpath1 Capture Route", "DMIC1", "DMIC1"},
-	{ "Txpath2 Capture Route", "DMIC1", "DMIC1"},
-	{ "Txpath3 Capture Route", "DMIC1", "DMIC1"},
-	{ "Txpath4 Capture Route", "DMIC1", "DMIC1"},
-	{ "Txpath1 Capture Route", "DMIC2", "DMIC2"},
-	{ "Txpath2 Capture Route", "DMIC2", "DMIC2"},
-	{ "Txpath3 Capture Route", "DMIC2", "DMIC2"},
-	{ "Txpath4 Capture Route", "DMIC2", "DMIC2"},
-	{ "Txpath1 Capture Route", "DMIC3", "DMIC3"},
-	{ "Txpath2 Capture Route", "DMIC3", "DMIC3"},
-	{ "Txpath3 Capture Route", "DMIC3", "DMIC3"},
-	{ "Txpath4 Capture Route", "DMIC3", "DMIC3"},
-	{ "Txpath1 Capture Route", "DMIC4", "DMIC4"},
-	{ "Txpath2 Capture Route", "DMIC4", "DMIC4"},
-	{ "Txpath3 Capture Route", "DMIC4", "DMIC4"},
-	{ "Txpath4 Capture Route", "DMIC4", "DMIC4"},
-	{ "Txpath1 Capture Route", "DMIC5", "DMIC5"},
-	{ "Txpath2 Capture Route", "DMIC5", "DMIC5"},
-	{ "Txpath3 Capture Route", "DMIC5", "DMIC5"},
-	{ "Txpath4 Capture Route", "DMIC5", "DMIC5"},
-	{ "Txpath1 Capture Route", "DMIC6", "DMIC6"},
-	{ "Txpath2 Capture Route", "DMIC6", "DMIC6"},
-	{ "Txpath3 Capture Route", "DMIC6", "DMIC6"},
-	{ "Txpath4 Capture Route", "DMIC6", "DMIC6"},
-
-	/* tx path */
-	{ "TX1 Enable", NULL, "Txpath1 Capture Route"},
-	{ "TX2 Enable", NULL, "Txpath2 Capture Route"},
-	{ "TX3 Enable", NULL, "Txpath3 Capture Route"},
-	{ "TX4 Enable", NULL, "Txpath4 Capture Route"},
-	{ "PCM_Out", NULL, "TX1 Enable"},
-	{ "PCM_Out", NULL, "TX2 Enable"},
-	{ "PCM_Out", NULL, "TX3 Enable"},
-	{ "PCM_Out", NULL, "TX4 Enable"},
-
-};
-
-/* speaker and headset mutes, for audio pops and clicks */
-static int sn95031_pcm_hs_mute(struct snd_soc_dai *dai, int mute)
-{
-	snd_soc_update_bits(dai->codec,
-			SN95031_HSLVOLCTRL, BIT(7), (!mute << 7));
-	snd_soc_update_bits(dai->codec,
-			SN95031_HSRVOLCTRL, BIT(7), (!mute << 7));
-	return 0;
-}
-
-static int sn95031_pcm_spkr_mute(struct snd_soc_dai *dai, int mute)
-{
-	snd_soc_update_bits(dai->codec,
-			SN95031_IHFLVOLCTRL, BIT(7), (!mute << 7));
-	snd_soc_update_bits(dai->codec,
-			SN95031_IHFRVOLCTRL, BIT(7), (!mute << 7));
-	return 0;
-}
-
-static int sn95031_pcm_hw_params(struct snd_pcm_substream *substream,
-		struct snd_pcm_hw_params *params, struct snd_soc_dai *dai)
-{
-	unsigned int format, rate;
-
-	switch (params_width(params)) {
-	case 16:
-		format = BIT(4)|BIT(5);
-		break;
-
-	case 24:
-		format = 0;
-		break;
-	default:
-		return -EINVAL;
-	}
-	snd_soc_update_bits(dai->codec, SN95031_PCM2C2,
-			BIT(4)|BIT(5), format);
-
-	switch (params_rate(params)) {
-	case 48000:
-		pr_debug("RATE_48000\n");
-		rate = 0;
-		break;
-
-	case 44100:
-		pr_debug("RATE_44100\n");
-		rate = BIT(7);
-		break;
-
-	default:
-		pr_err("ERR rate %d\n", params_rate(params));
-		return -EINVAL;
-	}
-	snd_soc_update_bits(dai->codec, SN95031_PCM1C1, BIT(7), rate);
-
-	return 0;
-}
-
-/* Codec DAI section */
-static const struct snd_soc_dai_ops sn95031_headset_dai_ops = {
-	.digital_mute	= sn95031_pcm_hs_mute,
-	.hw_params	= sn95031_pcm_hw_params,
-};
-
-static const struct snd_soc_dai_ops sn95031_speaker_dai_ops = {
-	.digital_mute	= sn95031_pcm_spkr_mute,
-	.hw_params	= sn95031_pcm_hw_params,
-};
-
-static const struct snd_soc_dai_ops sn95031_vib1_dai_ops = {
-	.hw_params	= sn95031_pcm_hw_params,
-};
-
-static const struct snd_soc_dai_ops sn95031_vib2_dai_ops = {
-	.hw_params	= sn95031_pcm_hw_params,
-};
-
-static struct snd_soc_dai_driver sn95031_dais[] = {
-{
-	.name = "SN95031 Headset",
-	.playback = {
-		.stream_name = "Headset",
-		.channels_min = 2,
-		.channels_max = 2,
-		.rates = SN95031_RATES,
-		.formats = SN95031_FORMATS,
-	},
-	.capture = {
-		.stream_name = "Capture",
-		.channels_min = 1,
-		.channels_max = 5,
-		.rates = SN95031_RATES,
-		.formats = SN95031_FORMATS,
-	},
-	.ops = &sn95031_headset_dai_ops,
-},
-{	.name = "SN95031 Speaker",
-	.playback = {
-		.stream_name = "Speaker",
-		.channels_min = 2,
-		.channels_max = 2,
-		.rates = SN95031_RATES,
-		.formats = SN95031_FORMATS,
-	},
-	.ops = &sn95031_speaker_dai_ops,
-},
-{	.name = "SN95031 Vibra1",
-	.playback = {
-		.stream_name = "Vibra1",
-		.channels_min = 1,
-		.channels_max = 1,
-		.rates = SN95031_RATES,
-		.formats = SN95031_FORMATS,
-	},
-	.ops = &sn95031_vib1_dai_ops,
-},
-{	.name = "SN95031 Vibra2",
-	.playback = {
-		.stream_name = "Vibra2",
-		.channels_min = 1,
-		.channels_max = 1,
-		.rates = SN95031_RATES,
-		.formats = SN95031_FORMATS,
-	},
-	.ops = &sn95031_vib2_dai_ops,
-},
-};
-
-static inline void sn95031_disable_jack_btn(struct snd_soc_codec *codec)
-{
-	snd_soc_write(codec, SN95031_BTNCTRL2, 0x00);
-}
-
-static inline void sn95031_enable_jack_btn(struct snd_soc_codec *codec)
-{
-	snd_soc_write(codec, SN95031_BTNCTRL1, 0x77);
-	snd_soc_write(codec, SN95031_BTNCTRL2, 0x01);
-}
-
-static int sn95031_get_headset_state(struct snd_soc_codec *codec,
-	struct snd_soc_jack *mfld_jack)
-{
-	int micbias = sn95031_get_mic_bias(codec);
-
-	int jack_type = snd_soc_jack_get_type(mfld_jack, micbias);
-
-	pr_debug("jack type detected = %d\n", jack_type);
-	if (jack_type == SND_JACK_HEADSET)
-		sn95031_enable_jack_btn(codec);
-	return jack_type;
-}
-
-void sn95031_jack_detection(struct snd_soc_codec *codec,
-	struct mfld_jack_data *jack_data)
-{
-	unsigned int status;
-	unsigned int mask = SND_JACK_BTN_0 | SND_JACK_BTN_1 | SND_JACK_HEADSET;
-
-	pr_debug("interrupt id read in sram = 0x%x\n", jack_data->intr_id);
-	if (jack_data->intr_id & 0x1) {
-		pr_debug("short_push detected\n");
-		status = SND_JACK_HEADSET | SND_JACK_BTN_0;
-	} else if (jack_data->intr_id & 0x2) {
-		pr_debug("long_push detected\n");
-		status = SND_JACK_HEADSET | SND_JACK_BTN_1;
-	} else if (jack_data->intr_id & 0x4) {
-		pr_debug("headset or headphones inserted\n");
-		status = sn95031_get_headset_state(codec, jack_data->mfld_jack);
-	} else if (jack_data->intr_id & 0x8) {
-		pr_debug("headset or headphones removed\n");
-		status = 0;
-		sn95031_disable_jack_btn(codec);
-	} else {
-		pr_err("unidentified interrupt\n");
-		return;
-	}
-
-	snd_soc_jack_report(jack_data->mfld_jack, status, mask);
-	/*button pressed and released so we send explicit button release */
-	if ((status & SND_JACK_BTN_0) | (status & SND_JACK_BTN_1))
-		snd_soc_jack_report(jack_data->mfld_jack,
-				SND_JACK_HEADSET, mask);
-}
-EXPORT_SYMBOL_GPL(sn95031_jack_detection);
-
-/* codec registration */
-static int sn95031_codec_probe(struct snd_soc_codec *codec)
-{
-	pr_debug("codec_probe called\n");
-
-	/* PCM interface config
-	 * This sets the pcm rx slot conguration to max 6 slots
-	 * for max 4 dais (2 stereo and 2 mono)
-	 */
-	snd_soc_write(codec, SN95031_PCM2RXSLOT01, 0x10);
-	snd_soc_write(codec, SN95031_PCM2RXSLOT23, 0x32);
-	snd_soc_write(codec, SN95031_PCM2RXSLOT45, 0x54);
-	snd_soc_write(codec, SN95031_PCM2TXSLOT01, 0x10);
-	snd_soc_write(codec, SN95031_PCM2TXSLOT23, 0x32);
-	/* pcm port setting
-	 * This sets the pcm port to slave and clock at 19.2Mhz which
-	 * can support 6slots, sampling rate set per stream in hw-params
-	 */
-	snd_soc_write(codec, SN95031_PCM1C1, 0x00);
-	snd_soc_write(codec, SN95031_PCM2C1, 0x01);
-	snd_soc_write(codec, SN95031_PCM2C2, 0x0A);
-	snd_soc_write(codec, SN95031_HSMIXER, BIT(0)|BIT(4));
-	/* vendor vibra workround, the vibras are muted by
-	 * custom register so unmute them
-	 */
-	snd_soc_write(codec, SN95031_SSR5, 0x80);
-	snd_soc_write(codec, SN95031_SSR6, 0x80);
-	snd_soc_write(codec, SN95031_VIB1C5, 0x00);
-	snd_soc_write(codec, SN95031_VIB2C5, 0x00);
-	/* configure vibras for pcm port */
-	snd_soc_write(codec, SN95031_VIB1C3, 0x00);
-	snd_soc_write(codec, SN95031_VIB2C3, 0x00);
-
-	/* soft mute ramp time */
-	snd_soc_write(codec, SN95031_SOFTMUTE, 0x3);
-	/* fix the initial volume at 1dB,
-	 * default in +9dB,
-	 * 1dB give optimal swing on DAC, amps
-	 */
-	snd_soc_write(codec, SN95031_HSLVOLCTRL, 0x08);
-	snd_soc_write(codec, SN95031_HSRVOLCTRL, 0x08);
-	snd_soc_write(codec, SN95031_IHFLVOLCTRL, 0x08);
-	snd_soc_write(codec, SN95031_IHFRVOLCTRL, 0x08);
-	/* dac mode and lineout workaround */
-	snd_soc_write(codec, SN95031_SSR2, 0x10);
-	snd_soc_write(codec, SN95031_SSR3, 0x40);
-
-	return 0;
-}
-
-static const struct snd_soc_codec_driver sn95031_codec = {
-	.probe		= sn95031_codec_probe,
-	.set_bias_level	= sn95031_set_vaud_bias,
-	.idle_bias_off	= true,
-
-	.component_driver = {
-		.controls		= sn95031_snd_controls,
-		.num_controls		= ARRAY_SIZE(sn95031_snd_controls),
-		.dapm_widgets		= sn95031_dapm_widgets,
-		.num_dapm_widgets	= ARRAY_SIZE(sn95031_dapm_widgets),
-		.dapm_routes		= sn95031_audio_map,
-		.num_dapm_routes	= ARRAY_SIZE(sn95031_audio_map),
-	},
-};
-
-static int sn95031_device_probe(struct platform_device *pdev)
-{
-	struct regmap *regmap;
-
-	pr_debug("codec device probe called for %s\n", dev_name(&pdev->dev));
-
-	regmap = devm_regmap_init(&pdev->dev, NULL, NULL, &sn95031_regmap);
-	if (IS_ERR(regmap))
-		return PTR_ERR(regmap);
-
-	return snd_soc_register_codec(&pdev->dev, &sn95031_codec,
-			sn95031_dais, ARRAY_SIZE(sn95031_dais));
-}
-
-static int sn95031_device_remove(struct platform_device *pdev)
-{
-	pr_debug("codec device remove called\n");
-	snd_soc_unregister_codec(&pdev->dev);
-	return 0;
-}
-
-static struct platform_driver sn95031_codec_driver = {
-	.driver		= {
-		.name		= "sn95031",
-	},
-	.probe		= sn95031_device_probe,
-	.remove		= sn95031_device_remove,
-};
-
-module_platform_driver(sn95031_codec_driver);
-
-MODULE_DESCRIPTION("ASoC TI SN95031 codec driver");
-MODULE_AUTHOR("Vinod Koul <vinod.koul@intel.com>");
-MODULE_AUTHOR("Harsha Priya <priya.harsha@intel.com>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS("platform:sn95031");
diff --git a/sound/soc/codecs/sn95031.h b/sound/soc/codecs/sn95031.h
deleted file mode 100644
index 7651fe4..0000000
--- a/sound/soc/codecs/sn95031.h
+++ /dev/null
@@ -1,133 +0,0 @@
-/*
- *  sn95031.h - TI sn95031 Codec driver
- *
- *  Copyright (C) 2010 Intel Corp
- *  Author: Vinod Koul <vinod.koul@intel.com>
- *  Author: Harsha Priya <priya.harsha@intel.com>
- *  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; version 2 of the License.
- *
- *  This program is distributed in the hope that it will be useful, but
- *  WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- *  General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License along
- *  with this program; if not, write to the Free Software Foundation, Inc.,
- *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
- *
- * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- *
- *
- */
-#ifndef _SN95031_H
-#define _SN95031_H
-
-/*register map*/
-#define SN95031_VAUD			0xDB
-#define SN95031_VHSP			0xDC
-#define SN95031_VHSN			0xDD
-#define SN95031_VIHF			0xC9
-
-#define SN95031_AUDPLLCTRL		0x240
-#define SN95031_DMICBUF0123		0x241
-#define SN95031_DMICBUF45		0x242
-#define SN95031_DMICGPO			0x244
-#define SN95031_DMICMUX			0x245
-#define SN95031_DMICLK			0x246
-#define SN95031_MICBIAS			0x247
-#define SN95031_ADCCONFIG		0x248
-#define SN95031_MICAMP1			0x249
-#define SN95031_MICAMP2			0x24A
-#define SN95031_NOISEMUX		0x24B
-#define SN95031_AUDIOMUX12		0x24C
-#define SN95031_AUDIOMUX34		0x24D
-#define SN95031_AUDIOSINC		0x24E
-#define SN95031_AUDIOTXEN		0x24F
-#define SN95031_HSEPRXCTRL		0x250
-#define SN95031_IHFRXCTRL		0x251
-#define SN95031_HSMIXER			0x256
-#define SN95031_DACCONFIG		0x257
-#define SN95031_SOFTMUTE		0x258
-#define SN95031_HSLVOLCTRL		0x259
-#define SN95031_HSRVOLCTRL		0x25A
-#define SN95031_IHFLVOLCTRL		0x25B
-#define SN95031_IHFRVOLCTRL		0x25C
-#define SN95031_DRIVEREN		0x25D
-#define SN95031_LOCTL			0x25E
-#define SN95031_VIB1C1			0x25F
-#define SN95031_VIB1C2			0x260
-#define SN95031_VIB1C3			0x261
-#define SN95031_VIB1SPIPCM1		0x262
-#define SN95031_VIB1SPIPCM2		0x263
-#define SN95031_VIB1C5			0x264
-#define SN95031_VIB2C1			0x265
-#define SN95031_VIB2C2			0x266
-#define SN95031_VIB2C3			0x267
-#define SN95031_VIB2SPIPCM1		0x268
-#define SN95031_VIB2SPIPCM2		0x269
-#define SN95031_VIB2C5			0x26A
-#define SN95031_BTNCTRL1		0x26B
-#define SN95031_BTNCTRL2		0x26C
-#define SN95031_PCM1TXSLOT01		0x26D
-#define SN95031_PCM1TXSLOT23		0x26E
-#define SN95031_PCM1TXSLOT45		0x26F
-#define SN95031_PCM1RXSLOT0_3		0x270
-#define SN95031_PCM1RXSLOT45		0x271
-#define SN95031_PCM2TXSLOT01		0x272
-#define SN95031_PCM2TXSLOT23		0x273
-#define SN95031_PCM2TXSLOT45		0x274
-#define SN95031_PCM2RXSLOT01		0x275
-#define SN95031_PCM2RXSLOT23		0x276
-#define SN95031_PCM2RXSLOT45		0x277
-#define SN95031_PCM1C1			0x278
-#define SN95031_PCM1C2			0x279
-#define SN95031_PCM1C3			0x27A
-#define SN95031_PCM2C1			0x27B
-#define SN95031_PCM2C2			0x27C
-/*end codec register defn*/
-
-/*vendor defn these are not part of avp*/
-#define SN95031_SSR2			0x381
-#define SN95031_SSR3			0x382
-#define SN95031_SSR5			0x384
-#define SN95031_SSR6			0x385
-
-/* ADC registers */
-
-#define SN95031_ADC1CNTL1 0x1C0
-#define SN95031_ADC_ENBL 0x10
-#define SN95031_ADC_START 0x08
-#define SN95031_ADC1CNTL3 0x1C2
-#define SN95031_ADCTHERM_ENBL 0x04
-#define SN95031_ADCRRDATA_ENBL 0x05
-#define SN95031_STOPBIT_MASK 16
-#define SN95031_ADCTHERM_MASK 4
-#define SN95031_ADC_CHANLS_MAX 15 /* Number of ADC channels */
-#define SN95031_ADC_LOOP_MAX (SN95031_ADC_CHANLS_MAX - 1)
-#define SN95031_ADC_NO_LOOP 0x07
-#define SN95031_AUDIO_GPIO_CTRL 0x070
-
-/* ADC channel code values */
-#define SN95031_AUDIO_DETECT_CODE 0x06
-
-/* ADC base addresses */
-#define SN95031_ADC_CHNL_START_ADDR 0x1C5 /* increments by 1 */
-#define SN95031_ADC_DATA_START_ADDR 0x1D4  /* increments by 2 */
-/* multipier to convert to mV */
-#define SN95031_ADC_ONE_LSB_MULTIPLIER 2346
-
-
-struct mfld_jack_data {
-	int intr_id;
-	int micbias_vol;
-	struct snd_soc_jack *mfld_jack;
-};
-
-extern void sn95031_jack_detection(struct snd_soc_codec *codec,
-	struct mfld_jack_data *jack_data);
-
-#endif
diff --git a/sound/soc/codecs/spdif_receiver.c b/sound/soc/codecs/spdif_receiver.c
index 7acd051..c8fd636 100644
--- a/sound/soc/codecs/spdif_receiver.c
+++ b/sound/soc/codecs/spdif_receiver.c
@@ -34,10 +34,11 @@ static const struct snd_soc_dapm_route dir_routes[] = {
 #define STUB_RATES	SNDRV_PCM_RATE_8000_192000
 #define STUB_FORMATS	(SNDRV_PCM_FMTBIT_S16_LE | \
 			SNDRV_PCM_FMTBIT_S20_3LE | \
-			SNDRV_PCM_FMTBIT_S24_LE | \
+			SNDRV_PCM_FMTBIT_S24_LE  | \
+			SNDRV_PCM_FMTBIT_S32_LE | \
 			SNDRV_PCM_FMTBIT_IEC958_SUBFRAME_LE)
 
-static const struct snd_soc_codec_driver soc_codec_spdif_dir = {
+static struct snd_soc_codec_driver soc_codec_spdif_dir = {
 	.component_driver = {
 		.dapm_widgets		= dir_widgets,
 		.num_dapm_widgets	= ARRAY_SIZE(dir_widgets),
diff --git a/sound/soc/codecs/spdif_transmitter.c b/sound/soc/codecs/spdif_transmitter.c
index 063a64f..037aa1d 100644
--- a/sound/soc/codecs/spdif_transmitter.c
+++ b/sound/soc/codecs/spdif_transmitter.c
@@ -27,7 +27,8 @@
 #define STUB_RATES	SNDRV_PCM_RATE_8000_192000
 #define STUB_FORMATS	(SNDRV_PCM_FMTBIT_S16_LE | \
 			SNDRV_PCM_FMTBIT_S20_3LE | \
-			SNDRV_PCM_FMTBIT_S24_LE)
+			SNDRV_PCM_FMTBIT_S24_LE  | \
+			SNDRV_PCM_FMTBIT_S32_LE)
 
 static const struct snd_soc_dapm_widget dit_widgets[] = {
 	SND_SOC_DAPM_OUTPUT("spdif-out"),
@@ -37,7 +38,7 @@ static const struct snd_soc_dapm_route dit_routes[] = {
 	{ "spdif-out", NULL, "Playback" },
 };
 
-static const struct snd_soc_codec_driver soc_codec_spdif_dit = {
+static struct snd_soc_codec_driver soc_codec_spdif_dit = {
 	.component_driver = {
 		.dapm_widgets		= dit_widgets,
 		.num_dapm_widgets	= ARRAY_SIZE(dit_widgets),
diff --git a/sound/soc/codecs/tas5720.c b/sound/soc/codecs/tas5720.c
index a736a2a6..f3006f3 100644
--- a/sound/soc/codecs/tas5720.c
+++ b/sound/soc/codecs/tas5720.c
@@ -36,6 +36,11 @@
 /* Define how often to check (and clear) the fault status register (in ms) */
 #define TAS5720_FAULT_CHECK_INTERVAL		200
 
+enum tas572x_type {
+	TAS5720,
+	TAS5722,
+};
+
 static const char * const tas5720_supply_names[] = {
 	"dvdd",		/* Digital power supply. Connect to 3.3-V supply. */
 	"pvdd",		/* Class-D amp and analog power supply (connected). */
@@ -47,6 +52,7 @@ struct tas5720_data {
 	struct snd_soc_codec *codec;
 	struct regmap *regmap;
 	struct i2c_client *tas5720_client;
+	enum tas572x_type devtype;
 	struct regulator_bulk_data supplies[TAS5720_NUM_SUPPLIES];
 	struct delayed_work fault_check_work;
 	unsigned int last_fault;
@@ -264,7 +270,7 @@ static void tas5720_fault_check_work(struct work_struct *work)
 static int tas5720_codec_probe(struct snd_soc_codec *codec)
 {
 	struct tas5720_data *tas5720 = snd_soc_codec_get_drvdata(codec);
-	unsigned int device_id;
+	unsigned int device_id, expected_device_id;
 	int ret;
 
 	tas5720->codec = codec;
@@ -276,6 +282,11 @@ static int tas5720_codec_probe(struct snd_soc_codec *codec)
 		return ret;
 	}
 
+	/*
+	 * Take a liberal approach to checking the device ID to allow the
+	 * driver to be used even if the device ID does not match, however
+	 * issue a warning if there is a mismatch.
+	 */
 	ret = regmap_read(tas5720->regmap, TAS5720_DEVICE_ID_REG, &device_id);
 	if (ret < 0) {
 		dev_err(codec->dev, "failed to read device ID register: %d\n",
@@ -283,13 +294,22 @@ static int tas5720_codec_probe(struct snd_soc_codec *codec)
 		goto probe_fail;
 	}
 
-	if (device_id != TAS5720_DEVICE_ID) {
-		dev_err(codec->dev, "wrong device ID. expected: %u read: %u\n",
-			TAS5720_DEVICE_ID, device_id);
-		ret = -ENODEV;
-		goto probe_fail;
+	switch (tas5720->devtype) {
+	case TAS5720:
+		expected_device_id = TAS5720_DEVICE_ID;
+		break;
+	case TAS5722:
+		expected_device_id = TAS5722_DEVICE_ID;
+		break;
+	default:
+		dev_err(codec->dev, "unexpected private driver data\n");
+		return -EINVAL;
 	}
 
+	if (device_id != expected_device_id)
+		dev_warn(codec->dev, "wrong device ID. expected: %u read: %u\n",
+			 expected_device_id, device_id);
+
 	/* Set device to mute */
 	ret = snd_soc_update_bits(codec, TAS5720_DIGITAL_CTRL2_REG,
 				  TAS5720_MUTE, TAS5720_MUTE);
@@ -446,6 +466,15 @@ static const struct regmap_config tas5720_regmap_config = {
 	.volatile_reg = tas5720_is_volatile_reg,
 };
 
+static const struct regmap_config tas5722_regmap_config = {
+	.reg_bits = 8,
+	.val_bits = 8,
+
+	.max_register = TAS5722_MAX_REG,
+	.cache_type = REGCACHE_RBTREE,
+	.volatile_reg = tas5720_is_volatile_reg,
+};
+
 /*
  * DAC analog gain. There are four discrete values to select from, ranging
  * from 19.2 dB to 26.3dB.
@@ -544,6 +573,7 @@ static int tas5720_probe(struct i2c_client *client,
 {
 	struct device *dev = &client->dev;
 	struct tas5720_data *data;
+	const struct regmap_config *regmap_config;
 	int ret;
 	int i;
 
@@ -552,7 +582,20 @@ static int tas5720_probe(struct i2c_client *client,
 		return -ENOMEM;
 
 	data->tas5720_client = client;
-	data->regmap = devm_regmap_init_i2c(client, &tas5720_regmap_config);
+	data->devtype = id->driver_data;
+
+	switch (id->driver_data) {
+	case TAS5720:
+		regmap_config = &tas5720_regmap_config;
+		break;
+	case TAS5722:
+		regmap_config = &tas5722_regmap_config;
+		break;
+	default:
+		dev_err(dev, "unexpected private driver data\n");
+		return -EINVAL;
+	}
+	data->regmap = devm_regmap_init_i2c(client, regmap_config);
 	if (IS_ERR(data->regmap)) {
 		ret = PTR_ERR(data->regmap);
 		dev_err(dev, "failed to allocate register map: %d\n", ret);
@@ -592,7 +635,8 @@ static int tas5720_remove(struct i2c_client *client)
 }
 
 static const struct i2c_device_id tas5720_id[] = {
-	{ "tas5720", 0 },
+	{ "tas5720", TAS5720 },
+	{ "tas5722", TAS5722 },
 	{ }
 };
 MODULE_DEVICE_TABLE(i2c, tas5720_id);
@@ -600,6 +644,7 @@ MODULE_DEVICE_TABLE(i2c, tas5720_id);
 #if IS_ENABLED(CONFIG_OF)
 static const struct of_device_id tas5720_of_match[] = {
 	{ .compatible = "ti,tas5720", },
+	{ .compatible = "ti,tas5722", },
 	{ },
 };
 MODULE_DEVICE_TABLE(of, tas5720_of_match);
diff --git a/sound/soc/codecs/tas5720.h b/sound/soc/codecs/tas5720.h
index 3d077c7..1dda309 100644
--- a/sound/soc/codecs/tas5720.h
+++ b/sound/soc/codecs/tas5720.h
@@ -30,8 +30,14 @@
 #define TAS5720_DIGITAL_CLIP1_REG	0x11
 #define TAS5720_MAX_REG			TAS5720_DIGITAL_CLIP1_REG
 
+/* Additional TAS5722-specific Registers */
+#define TAS5722_DIGITAL_CTRL2_REG	0x13
+#define TAS5722_ANALOG_CTRL2_REG	0x14
+#define TAS5722_MAX_REG			TAS5722_ANALOG_CTRL2_REG
+
 /* TAS5720_DEVICE_ID_REG */
 #define TAS5720_DEVICE_ID		0x01
+#define TAS5722_DEVICE_ID		0x12
 
 /* TAS5720_POWER_CTRL_REG */
 #define TAS5720_DIG_CLIP_MASK		GENMASK(7, 2)
@@ -51,6 +57,7 @@
 #define TAS5720_SAIF_FORMAT_MASK	GENMASK(2, 0)
 
 /* TAS5720_DIGITAL_CTRL2_REG */
+#define TAS5722_VOL_RAMP_RATE		BIT(6)
 #define TAS5720_MUTE			BIT(4)
 #define TAS5720_TDM_SLOT_SEL_MASK	GENMASK(2, 0)
 
@@ -87,4 +94,28 @@
 #define TAS5720_CLIP1_MASK		GENMASK(7, 2)
 #define TAS5720_CLIP1_SHIFT		(0x2)
 
+/* TAS5722_DIGITAL_CTRL2_REG */
+#define TAS5722_HPF_3_7HZ		(0x0 << 5)
+#define TAS5722_HPF_7_4HZ		(0x1 << 5)
+#define TAS5722_HPF_14_9HZ		(0x2 << 5)
+#define TAS5722_HPF_29_7HZ		(0x3 << 5)
+#define TAS5722_HPF_59_4HZ		(0x4 << 5)
+#define TAS5722_HPF_118_4HZ		(0x5 << 5)
+#define TAS5722_HPF_235_0HZ		(0x6 << 5)
+#define TAS5722_HPF_463_2HZ		(0x7 << 5)
+#define TAS5722_HPF_MASK		GENMASK(7, 5)
+#define TAS5722_AUTO_SLEEP_OFF		(0x0 << 3)
+#define TAS5722_AUTO_SLEEP_1024LR	(0x1 << 3)
+#define TAS5722_AUTO_SLEEP_65536LR	(0x2 << 3)
+#define TAS5722_AUTO_SLEEP_262144LR	(0x3 << 3)
+#define TAS5722_AUTO_SLEEP_MASK		GENMASK(4, 3)
+#define TAS5722_TDM_SLOT_16B		BIT(2)
+#define TAS5722_MCLK_PIN_CFG		BIT(1)
+#define TAS5722_VOL_CONTROL_LSB		BIT(0)
+
+/* TAS5722_ANALOG_CTRL2_REG */
+#define TAS5722_FAULTZ_PU		BIT(3)
+#define TAS5722_VREG_LVL		BIT(2)
+#define TAS5722_PWR_TUNE		BIT(0)
+
 #endif /* __TAS5720_H__ */
diff --git a/sound/soc/codecs/tas6424.c b/sound/soc/codecs/tas6424.c
new file mode 100644
index 0000000..49b87f6
--- /dev/null
+++ b/sound/soc/codecs/tas6424.c
@@ -0,0 +1,707 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ALSA SoC Texas Instruments TAS6424 Quad-Channel Audio Amplifier
+ *
+ * Copyright (C) 2016-2017 Texas Instruments Incorporated - http://www.ti.com/
+ *	Author: Andreas Dannenberg <dannenberg@ti.com>
+ *	Andrew F. Davis <afd@ti.com>
+ */
+
+#include <linux/module.h>
+#include <linux/errno.h>
+#include <linux/device.h>
+#include <linux/i2c.h>
+#include <linux/pm_runtime.h>
+#include <linux/regmap.h>
+#include <linux/slab.h>
+#include <linux/regulator/consumer.h>
+#include <linux/delay.h>
+
+#include <sound/pcm.h>
+#include <sound/pcm_params.h>
+#include <sound/soc.h>
+#include <sound/soc-dapm.h>
+#include <sound/tlv.h>
+
+#include "tas6424.h"
+
+/* Define how often to check (and clear) the fault status register (in ms) */
+#define TAS6424_FAULT_CHECK_INTERVAL 200
+
+static const char * const tas6424_supply_names[] = {
+	"dvdd", /* Digital power supply. Connect to 3.3-V supply. */
+	"vbat", /* Supply used for higher voltage analog circuits. */
+	"pvdd", /* Class-D amp output FETs supply. */
+};
+#define TAS6424_NUM_SUPPLIES ARRAY_SIZE(tas6424_supply_names)
+
+struct tas6424_data {
+	struct device *dev;
+	struct regmap *regmap;
+	struct regulator_bulk_data supplies[TAS6424_NUM_SUPPLIES];
+	struct delayed_work fault_check_work;
+	unsigned int last_fault1;
+	unsigned int last_fault2;
+	unsigned int last_warn;
+};
+
+/*
+ * DAC digital volumes. From -103.5 to 24 dB in 0.5 dB steps. Note that
+ * setting the gain below -100 dB (register value <0x7) is effectively a MUTE
+ * as per device datasheet.
+ */
+static DECLARE_TLV_DB_SCALE(dac_tlv, -10350, 50, 0);
+
+static const struct snd_kcontrol_new tas6424_snd_controls[] = {
+	SOC_SINGLE_TLV("Speaker Driver CH1 Playback Volume",
+		       TAS6424_CH1_VOL_CTRL, 0, 0xff, 0, dac_tlv),
+	SOC_SINGLE_TLV("Speaker Driver CH2 Playback Volume",
+		       TAS6424_CH2_VOL_CTRL, 0, 0xff, 0, dac_tlv),
+	SOC_SINGLE_TLV("Speaker Driver CH3 Playback Volume",
+		       TAS6424_CH3_VOL_CTRL, 0, 0xff, 0, dac_tlv),
+	SOC_SINGLE_TLV("Speaker Driver CH4 Playback Volume",
+		       TAS6424_CH4_VOL_CTRL, 0, 0xff, 0, dac_tlv),
+};
+
+static int tas6424_dac_event(struct snd_soc_dapm_widget *w,
+			     struct snd_kcontrol *kcontrol, int event)
+{
+	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
+	struct tas6424_data *tas6424 = snd_soc_codec_get_drvdata(codec);
+
+	dev_dbg(codec->dev, "%s() event=0x%0x\n", __func__, event);
+
+	if (event & SND_SOC_DAPM_POST_PMU) {
+		/* Observe codec shutdown-to-active time */
+		msleep(12);
+
+		/* Turn on TAS6424 periodic fault checking/handling */
+		tas6424->last_fault1 = 0;
+		tas6424->last_fault2 = 0;
+		tas6424->last_warn = 0;
+		schedule_delayed_work(&tas6424->fault_check_work,
+				      msecs_to_jiffies(TAS6424_FAULT_CHECK_INTERVAL));
+	} else if (event & SND_SOC_DAPM_PRE_PMD) {
+		/* Disable TAS6424 periodic fault checking/handling */
+		cancel_delayed_work_sync(&tas6424->fault_check_work);
+	}
+
+	return 0;
+}
+
+static const struct snd_soc_dapm_widget tas6424_dapm_widgets[] = {
+	SND_SOC_DAPM_AIF_IN("DAC IN", "Playback", 0, SND_SOC_NOPM, 0, 0),
+	SND_SOC_DAPM_DAC_E("DAC", NULL, SND_SOC_NOPM, 0, 0, tas6424_dac_event,
+			   SND_SOC_DAPM_POST_PMU | SND_SOC_DAPM_PRE_PMD),
+	SND_SOC_DAPM_OUTPUT("OUT")
+};
+
+static const struct snd_soc_dapm_route tas6424_audio_map[] = {
+	{ "DAC", NULL, "DAC IN" },
+	{ "OUT", NULL, "DAC" },
+};
+
+static int tas6424_hw_params(struct snd_pcm_substream *substream,
+			     struct snd_pcm_hw_params *params,
+			     struct snd_soc_dai *dai)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	unsigned int rate = params_rate(params);
+	unsigned int width = params_width(params);
+	u8 sap_ctrl = 0;
+
+	dev_dbg(codec->dev, "%s() rate=%u width=%u\n", __func__, rate, width);
+
+	switch (rate) {
+	case 44100:
+		sap_ctrl |= TAS6424_SAP_RATE_44100;
+		break;
+	case 48000:
+		sap_ctrl |= TAS6424_SAP_RATE_48000;
+		break;
+	case 96000:
+		sap_ctrl |= TAS6424_SAP_RATE_96000;
+		break;
+	default:
+		dev_err(codec->dev, "unsupported sample rate: %u\n", rate);
+		return -EINVAL;
+	}
+
+	switch (width) {
+	case 16:
+		sap_ctrl |= TAS6424_SAP_TDM_SLOT_SZ_16;
+		break;
+	case 24:
+		break;
+	default:
+		dev_err(codec->dev, "unsupported sample width: %u\n", width);
+		return -EINVAL;
+	}
+
+	snd_soc_update_bits(codec, TAS6424_SAP_CTRL,
+			    TAS6424_SAP_RATE_MASK |
+			    TAS6424_SAP_TDM_SLOT_SZ_16,
+			    sap_ctrl);
+
+	return 0;
+}
+
+static int tas6424_set_dai_fmt(struct snd_soc_dai *dai, unsigned int fmt)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	u8 serial_format = 0;
+
+	dev_dbg(codec->dev, "%s() fmt=0x%0x\n", __func__, fmt);
+
+	/* clock masters */
+	switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
+	case SND_SOC_DAIFMT_CBS_CFS:
+		break;
+	default:
+		dev_err(codec->dev, "Invalid DAI master/slave interface\n");
+		return -EINVAL;
+	}
+
+	/* signal polarity */
+	switch (fmt & SND_SOC_DAIFMT_INV_MASK) {
+	case SND_SOC_DAIFMT_NB_NF:
+		break;
+	default:
+		dev_err(codec->dev, "Invalid DAI clock signal polarity\n");
+		return -EINVAL;
+	}
+
+	/* interface format */
+	switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) {
+	case SND_SOC_DAIFMT_I2S:
+		serial_format |= TAS6424_SAP_I2S;
+		break;
+	case SND_SOC_DAIFMT_DSP_A:
+		serial_format |= TAS6424_SAP_DSP;
+		break;
+	case SND_SOC_DAIFMT_DSP_B:
+		/*
+		 * We can use the fact that the TAS6424 does not care about the
+		 * LRCLK duty cycle during TDM to receive DSP_B formatted data
+		 * in LEFTJ mode (no delaying of the 1st data bit).
+		 */
+		serial_format |= TAS6424_SAP_LEFTJ;
+		break;
+	case SND_SOC_DAIFMT_LEFT_J:
+		serial_format |= TAS6424_SAP_LEFTJ;
+		break;
+	default:
+		dev_err(codec->dev, "Invalid DAI interface format\n");
+		return -EINVAL;
+	}
+
+	snd_soc_update_bits(codec, TAS6424_SAP_CTRL,
+			    TAS6424_SAP_FMT_MASK, serial_format);
+
+	return 0;
+}
+
+static int tas6424_set_dai_tdm_slot(struct snd_soc_dai *dai,
+				    unsigned int tx_mask, unsigned int rx_mask,
+				    int slots, int slot_width)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	unsigned int first_slot, last_slot;
+	bool sap_tdm_slot_last;
+
+	dev_dbg(codec->dev, "%s() tx_mask=%d rx_mask=%d\n", __func__,
+		tx_mask, rx_mask);
+
+	if (!tx_mask || !rx_mask)
+		return 0; /* nothing needed to disable TDM mode */
+
+	/*
+	 * Determine the first slot and last slot that is being requested so
+	 * we'll be able to more easily enforce certain constraints as the
+	 * TAS6424's TDM interface is not fully configurable.
+	 */
+	first_slot = __ffs(tx_mask);
+	last_slot = __fls(rx_mask);
+
+	if (last_slot - first_slot != 4) {
+		dev_err(codec->dev, "tdm mask must cover 4 contiguous slots\n");
+		return -EINVAL;
+	}
+
+	switch (first_slot) {
+	case 0:
+		sap_tdm_slot_last = false;
+		break;
+	case 4:
+		sap_tdm_slot_last = true;
+		break;
+	default:
+		dev_err(codec->dev, "tdm mask must start at slot 0 or 4\n");
+		return -EINVAL;
+	}
+
+	snd_soc_update_bits(codec, TAS6424_SAP_CTRL, TAS6424_SAP_TDM_SLOT_LAST,
+			    sap_tdm_slot_last ? TAS6424_SAP_TDM_SLOT_LAST : 0);
+
+	return 0;
+}
+
+static int tas6424_mute(struct snd_soc_dai *dai, int mute)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	unsigned int val;
+
+	dev_dbg(codec->dev, "%s() mute=%d\n", __func__, mute);
+
+	if (mute)
+		val = TAS6424_ALL_STATE_MUTE;
+	else
+		val = TAS6424_ALL_STATE_PLAY;
+
+	snd_soc_write(codec, TAS6424_CH_STATE_CTRL, val);
+
+	return 0;
+}
+
+static int tas6424_power_off(struct snd_soc_codec *codec)
+{
+	struct tas6424_data *tas6424 = snd_soc_codec_get_drvdata(codec);
+	int ret;
+
+	snd_soc_write(codec, TAS6424_CH_STATE_CTRL, TAS6424_ALL_STATE_HIZ);
+
+	regcache_cache_only(tas6424->regmap, true);
+	regcache_mark_dirty(tas6424->regmap);
+
+	ret = regulator_bulk_disable(ARRAY_SIZE(tas6424->supplies),
+				     tas6424->supplies);
+	if (ret < 0) {
+		dev_err(codec->dev, "failed to disable supplies: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int tas6424_power_on(struct snd_soc_codec *codec)
+{
+	struct tas6424_data *tas6424 = snd_soc_codec_get_drvdata(codec);
+	int ret;
+
+	ret = regulator_bulk_enable(ARRAY_SIZE(tas6424->supplies),
+				    tas6424->supplies);
+	if (ret < 0) {
+		dev_err(codec->dev, "failed to enable supplies: %d\n", ret);
+		return ret;
+	}
+
+	regcache_cache_only(tas6424->regmap, false);
+
+	ret = regcache_sync(tas6424->regmap);
+	if (ret < 0) {
+		dev_err(codec->dev, "failed to sync regcache: %d\n", ret);
+		return ret;
+	}
+
+	snd_soc_write(codec, TAS6424_CH_STATE_CTRL, TAS6424_ALL_STATE_MUTE);
+
+	/* any time we come out of HIZ, the output channels automatically run DC
+	 * load diagnostics, wait here until this completes
+	 */
+	msleep(230);
+
+	return 0;
+}
+
+static int tas6424_set_bias_level(struct snd_soc_codec *codec,
+				  enum snd_soc_bias_level level)
+{
+	dev_dbg(codec->dev, "%s() level=%d\n", __func__, level);
+
+	switch (level) {
+	case SND_SOC_BIAS_ON:
+	case SND_SOC_BIAS_PREPARE:
+		break;
+	case SND_SOC_BIAS_STANDBY:
+		if (snd_soc_codec_get_bias_level(codec) == SND_SOC_BIAS_OFF)
+			tas6424_power_on(codec);
+		break;
+	case SND_SOC_BIAS_OFF:
+		tas6424_power_off(codec);
+		break;
+	}
+
+	return 0;
+}
+
+static struct snd_soc_codec_driver soc_codec_dev_tas6424 = {
+	.set_bias_level = tas6424_set_bias_level,
+	.idle_bias_off = true,
+
+	.component_driver = {
+		.controls = tas6424_snd_controls,
+		.num_controls = ARRAY_SIZE(tas6424_snd_controls),
+		.dapm_widgets = tas6424_dapm_widgets,
+		.num_dapm_widgets = ARRAY_SIZE(tas6424_dapm_widgets),
+		.dapm_routes = tas6424_audio_map,
+		.num_dapm_routes = ARRAY_SIZE(tas6424_audio_map),
+	},
+};
+
+static struct snd_soc_dai_ops tas6424_speaker_dai_ops = {
+	.hw_params	= tas6424_hw_params,
+	.set_fmt	= tas6424_set_dai_fmt,
+	.set_tdm_slot	= tas6424_set_dai_tdm_slot,
+	.digital_mute	= tas6424_mute,
+};
+
+static struct snd_soc_dai_driver tas6424_dai[] = {
+	{
+		.name = "tas6424-amplifier",
+		.playback = {
+			.stream_name = "Playback",
+			.channels_min = 1,
+			.channels_max = 4,
+			.rates = TAS6424_RATES,
+			.formats = TAS6424_FORMATS,
+		},
+		.ops = &tas6424_speaker_dai_ops,
+	},
+};
+
+static void tas6424_fault_check_work(struct work_struct *work)
+{
+	struct tas6424_data *tas6424 = container_of(work, struct tas6424_data,
+						    fault_check_work.work);
+	struct device *dev = tas6424->dev;
+	unsigned int reg;
+	int ret;
+
+	ret = regmap_read(tas6424->regmap, TAS6424_GLOB_FAULT1, &reg);
+	if (ret < 0) {
+		dev_err(dev, "failed to read FAULT1 register: %d\n", ret);
+		goto out;
+	}
+
+	/*
+	 * Ignore any clock faults as there is no clean way to check for them.
+	 * We would need to start checking for those faults *after* the SAIF
+	 * stream has been setup, and stop checking *before* the stream is
+	 * stopped to avoid any false-positives. However there are no
+	 * appropriate hooks to monitor these events.
+	 */
+	reg &= TAS6424_FAULT_PVDD_OV |
+	       TAS6424_FAULT_VBAT_OV |
+	       TAS6424_FAULT_PVDD_UV |
+	       TAS6424_FAULT_VBAT_UV;
+
+	if (reg)
+		goto check_global_fault2_reg;
+
+	/*
+	 * Only flag errors once for a given occurrence. This is needed as
+	 * the TAS6424 will take time clearing the fault condition internally
+	 * during which we don't want to bombard the system with the same
+	 * error message over and over.
+	 */
+	if ((reg & TAS6424_FAULT_PVDD_OV) && !(tas6424->last_fault1 & TAS6424_FAULT_PVDD_OV))
+		dev_crit(dev, "experienced a PVDD overvoltage fault\n");
+
+	if ((reg & TAS6424_FAULT_VBAT_OV) && !(tas6424->last_fault1 & TAS6424_FAULT_VBAT_OV))
+		dev_crit(dev, "experienced a VBAT overvoltage fault\n");
+
+	if ((reg & TAS6424_FAULT_PVDD_UV) && !(tas6424->last_fault1 & TAS6424_FAULT_PVDD_UV))
+		dev_crit(dev, "experienced a PVDD undervoltage fault\n");
+
+	if ((reg & TAS6424_FAULT_VBAT_UV) && !(tas6424->last_fault1 & TAS6424_FAULT_VBAT_UV))
+		dev_crit(dev, "experienced a VBAT undervoltage fault\n");
+
+	/* Store current fault1 value so we can detect any changes next time */
+	tas6424->last_fault1 = reg;
+
+check_global_fault2_reg:
+	ret = regmap_read(tas6424->regmap, TAS6424_GLOB_FAULT2, &reg);
+	if (ret < 0) {
+		dev_err(dev, "failed to read FAULT2 register: %d\n", ret);
+		goto out;
+	}
+
+	reg &= TAS6424_FAULT_OTSD |
+	       TAS6424_FAULT_OTSD_CH1 |
+	       TAS6424_FAULT_OTSD_CH2 |
+	       TAS6424_FAULT_OTSD_CH3 |
+	       TAS6424_FAULT_OTSD_CH4;
+
+	if (!reg)
+		goto check_warn_reg;
+
+	if ((reg & TAS6424_FAULT_OTSD) && !(tas6424->last_fault2 & TAS6424_FAULT_OTSD))
+		dev_crit(dev, "experienced a global overtemp shutdown\n");
+
+	if ((reg & TAS6424_FAULT_OTSD_CH1) && !(tas6424->last_fault2 & TAS6424_FAULT_OTSD_CH1))
+		dev_crit(dev, "experienced an overtemp shutdown on CH1\n");
+
+	if ((reg & TAS6424_FAULT_OTSD_CH2) && !(tas6424->last_fault2 & TAS6424_FAULT_OTSD_CH2))
+		dev_crit(dev, "experienced an overtemp shutdown on CH2\n");
+
+	if ((reg & TAS6424_FAULT_OTSD_CH3) && !(tas6424->last_fault2 & TAS6424_FAULT_OTSD_CH3))
+		dev_crit(dev, "experienced an overtemp shutdown on CH3\n");
+
+	if ((reg & TAS6424_FAULT_OTSD_CH4) && !(tas6424->last_fault2 & TAS6424_FAULT_OTSD_CH4))
+		dev_crit(dev, "experienced an overtemp shutdown on CH4\n");
+
+	/* Store current fault2 value so we can detect any changes next time */
+	tas6424->last_fault2 = reg;
+
+check_warn_reg:
+	ret = regmap_read(tas6424->regmap, TAS6424_WARN, &reg);
+	if (ret < 0) {
+		dev_err(dev, "failed to read WARN register: %d\n", ret);
+		goto out;
+	}
+
+	reg &= TAS6424_WARN_VDD_UV |
+	       TAS6424_WARN_VDD_POR |
+	       TAS6424_WARN_VDD_OTW |
+	       TAS6424_WARN_VDD_OTW_CH1 |
+	       TAS6424_WARN_VDD_OTW_CH2 |
+	       TAS6424_WARN_VDD_OTW_CH3 |
+	       TAS6424_WARN_VDD_OTW_CH4;
+
+	if (!reg)
+		goto out;
+
+	if ((reg & TAS6424_WARN_VDD_UV) && !(tas6424->last_warn & TAS6424_WARN_VDD_UV))
+		dev_warn(dev, "experienced a VDD under voltage condition\n");
+
+	if ((reg & TAS6424_WARN_VDD_POR) && !(tas6424->last_warn & TAS6424_WARN_VDD_POR))
+		dev_warn(dev, "experienced a VDD POR condition\n");
+
+	if ((reg & TAS6424_WARN_VDD_OTW) && !(tas6424->last_warn & TAS6424_WARN_VDD_OTW))
+		dev_warn(dev, "experienced a global overtemp warning\n");
+
+	if ((reg & TAS6424_WARN_VDD_OTW_CH1) && !(tas6424->last_warn & TAS6424_WARN_VDD_OTW_CH1))
+		dev_warn(dev, "experienced an overtemp warning on CH1\n");
+
+	if ((reg & TAS6424_WARN_VDD_OTW_CH2) && !(tas6424->last_warn & TAS6424_WARN_VDD_OTW_CH2))
+		dev_warn(dev, "experienced an overtemp warning on CH2\n");
+
+	if ((reg & TAS6424_WARN_VDD_OTW_CH3) && !(tas6424->last_warn & TAS6424_WARN_VDD_OTW_CH3))
+		dev_warn(dev, "experienced an overtemp warning on CH3\n");
+
+	if ((reg & TAS6424_WARN_VDD_OTW_CH4) && !(tas6424->last_warn & TAS6424_WARN_VDD_OTW_CH4))
+		dev_warn(dev, "experienced an overtemp warning on CH4\n");
+
+	/* Store current warn value so we can detect any changes next time */
+	tas6424->last_warn = reg;
+
+	/* Clear any faults by toggling the CLEAR_FAULT control bit */
+	ret = regmap_write_bits(tas6424->regmap, TAS6424_MISC_CTRL3,
+				TAS6424_CLEAR_FAULT, TAS6424_CLEAR_FAULT);
+	if (ret < 0)
+		dev_err(dev, "failed to write MISC_CTRL3 register: %d\n", ret);
+
+	ret = regmap_write_bits(tas6424->regmap, TAS6424_MISC_CTRL3,
+				TAS6424_CLEAR_FAULT, 0);
+	if (ret < 0)
+		dev_err(dev, "failed to write MISC_CTRL3 register: %d\n", ret);
+
+out:
+	/* Schedule the next fault check at the specified interval */
+	schedule_delayed_work(&tas6424->fault_check_work,
+			      msecs_to_jiffies(TAS6424_FAULT_CHECK_INTERVAL));
+}
+
+static const struct reg_default tas6424_reg_defaults[] = {
+	{ TAS6424_MODE_CTRL,		0x00 },
+	{ TAS6424_MISC_CTRL1,		0x32 },
+	{ TAS6424_MISC_CTRL2,		0x62 },
+	{ TAS6424_SAP_CTRL,		0x04 },
+	{ TAS6424_CH_STATE_CTRL,	0x55 },
+	{ TAS6424_CH1_VOL_CTRL,		0xcf },
+	{ TAS6424_CH2_VOL_CTRL,		0xcf },
+	{ TAS6424_CH3_VOL_CTRL,		0xcf },
+	{ TAS6424_CH4_VOL_CTRL,		0xcf },
+	{ TAS6424_DC_DIAG_CTRL1,	0x00 },
+	{ TAS6424_DC_DIAG_CTRL2,	0x11 },
+	{ TAS6424_DC_DIAG_CTRL3,	0x11 },
+	{ TAS6424_PIN_CTRL,		0xff },
+	{ TAS6424_AC_DIAG_CTRL1,	0x00 },
+	{ TAS6424_MISC_CTRL3,		0x00 },
+	{ TAS6424_CLIP_CTRL,		0x01 },
+	{ TAS6424_CLIP_WINDOW,		0x14 },
+	{ TAS6424_CLIP_WARN,		0x00 },
+	{ TAS6424_CBC_STAT,		0x00 },
+	{ TAS6424_MISC_CTRL4,		0x40 },
+};
+
+static bool tas6424_is_writable_reg(struct device *dev, unsigned int reg)
+{
+	switch (reg) {
+	case TAS6424_MODE_CTRL:
+	case TAS6424_MISC_CTRL1:
+	case TAS6424_MISC_CTRL2:
+	case TAS6424_SAP_CTRL:
+	case TAS6424_CH_STATE_CTRL:
+	case TAS6424_CH1_VOL_CTRL:
+	case TAS6424_CH2_VOL_CTRL:
+	case TAS6424_CH3_VOL_CTRL:
+	case TAS6424_CH4_VOL_CTRL:
+	case TAS6424_DC_DIAG_CTRL1:
+	case TAS6424_DC_DIAG_CTRL2:
+	case TAS6424_DC_DIAG_CTRL3:
+	case TAS6424_PIN_CTRL:
+	case TAS6424_AC_DIAG_CTRL1:
+	case TAS6424_MISC_CTRL3:
+	case TAS6424_CLIP_CTRL:
+	case TAS6424_CLIP_WINDOW:
+	case TAS6424_CLIP_WARN:
+	case TAS6424_CBC_STAT:
+	case TAS6424_MISC_CTRL4:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static bool tas6424_is_volatile_reg(struct device *dev, unsigned int reg)
+{
+	switch (reg) {
+	case TAS6424_DC_LOAD_DIAG_REP12:
+	case TAS6424_DC_LOAD_DIAG_REP34:
+	case TAS6424_DC_LOAD_DIAG_REPLO:
+	case TAS6424_CHANNEL_STATE:
+	case TAS6424_CHANNEL_FAULT:
+	case TAS6424_GLOB_FAULT1:
+	case TAS6424_GLOB_FAULT2:
+	case TAS6424_WARN:
+	case TAS6424_AC_LOAD_DIAG_REP1:
+	case TAS6424_AC_LOAD_DIAG_REP2:
+	case TAS6424_AC_LOAD_DIAG_REP3:
+	case TAS6424_AC_LOAD_DIAG_REP4:
+		return true;
+	default:
+		return false;
+	}
+}
+
+static const struct regmap_config tas6424_regmap_config = {
+	.reg_bits = 8,
+	.val_bits = 8,
+
+	.writeable_reg = tas6424_is_writable_reg,
+	.volatile_reg = tas6424_is_volatile_reg,
+
+	.max_register = TAS6424_MAX,
+	.reg_defaults = tas6424_reg_defaults,
+	.num_reg_defaults = ARRAY_SIZE(tas6424_reg_defaults),
+	.cache_type = REGCACHE_RBTREE,
+};
+
+#if IS_ENABLED(CONFIG_OF)
+static const struct of_device_id tas6424_of_ids[] = {
+	{ .compatible = "ti,tas6424", },
+	{ },
+};
+MODULE_DEVICE_TABLE(of, tas6424_of_ids);
+#endif
+
+static int tas6424_i2c_probe(struct i2c_client *client,
+			     const struct i2c_device_id *id)
+{
+	struct device *dev = &client->dev;
+	struct tas6424_data *tas6424;
+	int ret;
+	int i;
+
+	tas6424 = devm_kzalloc(dev, sizeof(*tas6424), GFP_KERNEL);
+	if (!tas6424)
+		return -ENOMEM;
+	dev_set_drvdata(dev, tas6424);
+
+	tas6424->dev = dev;
+
+	tas6424->regmap = devm_regmap_init_i2c(client, &tas6424_regmap_config);
+	if (IS_ERR(tas6424->regmap)) {
+		ret = PTR_ERR(tas6424->regmap);
+		dev_err(dev, "unable to allocate register map: %d\n", ret);
+		return ret;
+	}
+
+	for (i = 0; i < ARRAY_SIZE(tas6424->supplies); i++)
+		tas6424->supplies[i].supply = tas6424_supply_names[i];
+	ret = devm_regulator_bulk_get(dev, ARRAY_SIZE(tas6424->supplies),
+				      tas6424->supplies);
+	if (ret) {
+		dev_err(dev, "unable to request supplies: %d\n", ret);
+		return ret;
+	}
+
+	ret = regulator_bulk_enable(ARRAY_SIZE(tas6424->supplies),
+				    tas6424->supplies);
+	if (ret) {
+		dev_err(dev, "unable to enable supplies: %d\n", ret);
+		return ret;
+	}
+
+	/* Reset device to establish well-defined startup state */
+	ret = regmap_update_bits(tas6424->regmap, TAS6424_MODE_CTRL,
+				 TAS6424_RESET, TAS6424_RESET);
+	if (ret) {
+		dev_err(dev, "unable to reset device: %d\n", ret);
+		return ret;
+	}
+
+	INIT_DELAYED_WORK(&tas6424->fault_check_work, tas6424_fault_check_work);
+
+	ret = snd_soc_register_codec(dev, &soc_codec_dev_tas6424,
+				     tas6424_dai, ARRAY_SIZE(tas6424_dai));
+	if (ret < 0) {
+		dev_err(dev, "unable to register codec: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int tas6424_i2c_remove(struct i2c_client *client)
+{
+	struct device *dev = &client->dev;
+	struct tas6424_data *tas6424 = dev_get_drvdata(dev);
+	int ret;
+
+	snd_soc_unregister_codec(dev);
+
+	cancel_delayed_work_sync(&tas6424->fault_check_work);
+
+	ret = regulator_bulk_disable(ARRAY_SIZE(tas6424->supplies),
+				     tas6424->supplies);
+	if (ret < 0) {
+		dev_err(dev, "unable to disable supplies: %d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static const struct i2c_device_id tas6424_i2c_ids[] = {
+	{ "tas6424", 0 },
+	{ }
+};
+MODULE_DEVICE_TABLE(i2c, tas6424_i2c_ids);
+
+static struct i2c_driver tas6424_i2c_driver = {
+	.driver = {
+		.name = "tas6424",
+		.of_match_table = of_match_ptr(tas6424_of_ids),
+	},
+	.probe = tas6424_i2c_probe,
+	.remove = tas6424_i2c_remove,
+	.id_table = tas6424_i2c_ids,
+};
+module_i2c_driver(tas6424_i2c_driver);
+
+MODULE_AUTHOR("Andreas Dannenberg <dannenberg@ti.com>");
+MODULE_AUTHOR("Andrew F. Davis <afd@ti.com>");
+MODULE_DESCRIPTION("TAS6424 Audio amplifier driver");
+MODULE_LICENSE("GPL v2");
diff --git a/sound/soc/codecs/tas6424.h b/sound/soc/codecs/tas6424.h
new file mode 100644
index 0000000..4305883
--- /dev/null
+++ b/sound/soc/codecs/tas6424.h
@@ -0,0 +1,144 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ALSA SoC Texas Instruments TAS6424 Quad-Channel Audio Amplifier
+ *
+ * Copyright (C) 2016-2017 Texas Instruments Incorporated - http://www.ti.com/
+ *	Author: Andreas Dannenberg <dannenberg@ti.com>
+ *	Andrew F. Davis <afd@ti.com>
+ */
+
+#ifndef __TAS6424_H__
+#define __TAS6424_H__
+
+#define TAS6424_RATES (SNDRV_PCM_RATE_44100 | \
+		       SNDRV_PCM_RATE_48000 | \
+		       SNDRV_PCM_RATE_96000)
+
+#define TAS6424_FORMATS (SNDRV_PCM_FMTBIT_S16_LE | \
+			 SNDRV_PCM_FMTBIT_S24_LE)
+
+/* Register Address Map */
+#define TAS6424_MODE_CTRL		0x00
+#define TAS6424_MISC_CTRL1		0x01
+#define TAS6424_MISC_CTRL2		0x02
+#define TAS6424_SAP_CTRL		0x03
+#define TAS6424_CH_STATE_CTRL		0x04
+#define TAS6424_CH1_VOL_CTRL		0x05
+#define TAS6424_CH2_VOL_CTRL		0x06
+#define TAS6424_CH3_VOL_CTRL		0x07
+#define TAS6424_CH4_VOL_CTRL		0x08
+#define TAS6424_DC_DIAG_CTRL1		0x09
+#define TAS6424_DC_DIAG_CTRL2		0x0a
+#define TAS6424_DC_DIAG_CTRL3		0x0b
+#define TAS6424_DC_LOAD_DIAG_REP12	0x0c
+#define TAS6424_DC_LOAD_DIAG_REP34	0x0d
+#define TAS6424_DC_LOAD_DIAG_REPLO	0x0e
+#define TAS6424_CHANNEL_STATE		0x0f
+#define TAS6424_CHANNEL_FAULT		0x10
+#define TAS6424_GLOB_FAULT1		0x11
+#define TAS6424_GLOB_FAULT2		0x12
+#define TAS6424_WARN			0x13
+#define TAS6424_PIN_CTRL		0x14
+#define TAS6424_AC_DIAG_CTRL1		0x15
+#define TAS6424_AC_DIAG_CTRL2		0x16
+#define TAS6424_AC_LOAD_DIAG_REP1	0x17
+#define TAS6424_AC_LOAD_DIAG_REP2	0x18
+#define TAS6424_AC_LOAD_DIAG_REP3	0x19
+#define TAS6424_AC_LOAD_DIAG_REP4	0x1a
+#define TAS6424_MISC_CTRL3		0x21
+#define TAS6424_CLIP_CTRL		0x22
+#define TAS6424_CLIP_WINDOW		0x23
+#define TAS6424_CLIP_WARN		0x24
+#define TAS6424_CBC_STAT		0x25
+#define TAS6424_MISC_CTRL4		0x26
+#define TAS6424_MAX			TAS6424_MISC_CTRL4
+
+/* TAS6424_MODE_CTRL_REG */
+#define TAS6424_RESET			BIT(7)
+
+/* TAS6424_SAP_CTRL_REG */
+#define TAS6424_SAP_RATE_MASK		GENMASK(7, 6)
+#define TAS6424_SAP_RATE_44100		(0x00 << 6)
+#define TAS6424_SAP_RATE_48000		(0x01 << 6)
+#define TAS6424_SAP_RATE_96000		(0x02 << 6)
+#define TAS6424_SAP_TDM_SLOT_LAST	BIT(5)
+#define TAS6424_SAP_TDM_SLOT_SZ_16	BIT(4)
+#define TAS6424_SAP_TDM_SLOT_SWAP	BIT(3)
+#define TAS6424_SAP_FMT_MASK		GENMASK(2, 0)
+#define TAS6424_SAP_RIGHTJ_24		(0x00 << 0)
+#define TAS6424_SAP_RIGHTJ_20		(0x01 << 0)
+#define TAS6424_SAP_RIGHTJ_18		(0x02 << 0)
+#define TAS6424_SAP_RIGHTJ_16		(0x03 << 0)
+#define TAS6424_SAP_I2S			(0x04 << 0)
+#define TAS6424_SAP_LEFTJ		(0x05 << 0)
+#define TAS6424_SAP_DSP			(0x06 << 0)
+
+/* TAS6424_CH_STATE_CTRL_REG */
+#define TAS6424_CH1_STATE_MASK		GENMASK(7, 6)
+#define TAS6424_CH1_STATE_PLAY		(0x00 << 6)
+#define TAS6424_CH1_STATE_HIZ		(0x01 << 6)
+#define TAS6424_CH1_STATE_MUTE		(0x02 << 6)
+#define TAS6424_CH1_STATE_DIAG		(0x03 << 6)
+#define TAS6424_CH2_STATE_MASK		GENMASK(5, 4)
+#define TAS6424_CH2_STATE_PLAY		(0x00 << 4)
+#define TAS6424_CH2_STATE_HIZ		(0x01 << 4)
+#define TAS6424_CH2_STATE_MUTE		(0x02 << 4)
+#define TAS6424_CH2_STATE_DIAG		(0x03 << 4)
+#define TAS6424_CH3_STATE_MASK		GENMASK(3, 2)
+#define TAS6424_CH3_STATE_PLAY		(0x00 << 2)
+#define TAS6424_CH3_STATE_HIZ		(0x01 << 2)
+#define TAS6424_CH3_STATE_MUTE		(0x02 << 2)
+#define TAS6424_CH3_STATE_DIAG		(0x03 << 2)
+#define TAS6424_CH4_STATE_MASK		GENMASK(1, 0)
+#define TAS6424_CH4_STATE_PLAY		(0x00 << 0)
+#define TAS6424_CH4_STATE_HIZ		(0x01 << 0)
+#define TAS6424_CH4_STATE_MUTE		(0x02 << 0)
+#define TAS6424_CH4_STATE_DIAG		(0x03 << 0)
+#define TAS6424_ALL_STATE_PLAY		(TAS6424_CH1_STATE_PLAY | \
+					 TAS6424_CH2_STATE_PLAY | \
+					 TAS6424_CH3_STATE_PLAY | \
+					 TAS6424_CH4_STATE_PLAY)
+#define TAS6424_ALL_STATE_HIZ		(TAS6424_CH1_STATE_HIZ | \
+					 TAS6424_CH2_STATE_HIZ | \
+					 TAS6424_CH3_STATE_HIZ | \
+					 TAS6424_CH4_STATE_HIZ)
+#define TAS6424_ALL_STATE_MUTE		(TAS6424_CH1_STATE_MUTE | \
+					 TAS6424_CH2_STATE_MUTE | \
+					 TAS6424_CH3_STATE_MUTE | \
+					 TAS6424_CH4_STATE_MUTE)
+#define TAS6424_ALL_STATE_DIAG		(TAS6424_CH1_STATE_DIAG | \
+					 TAS6424_CH2_STATE_DIAG | \
+					 TAS6424_CH3_STATE_DIAG | \
+					 TAS6424_CH4_STATE_DIAG)
+
+/* TAS6424_GLOB_FAULT1_REG */
+#define TAS6424_FAULT_CLOCK		BIT(4)
+#define TAS6424_FAULT_PVDD_OV		BIT(3)
+#define TAS6424_FAULT_VBAT_OV		BIT(2)
+#define TAS6424_FAULT_PVDD_UV		BIT(1)
+#define TAS6424_FAULT_VBAT_UV		BIT(0)
+
+/* TAS6424_GLOB_FAULT2_REG */
+#define TAS6424_FAULT_OTSD		BIT(4)
+#define TAS6424_FAULT_OTSD_CH1		BIT(3)
+#define TAS6424_FAULT_OTSD_CH2		BIT(2)
+#define TAS6424_FAULT_OTSD_CH3		BIT(1)
+#define TAS6424_FAULT_OTSD_CH4		BIT(0)
+
+/* TAS6424_WARN_REG */
+#define TAS6424_WARN_VDD_UV		BIT(6)
+#define TAS6424_WARN_VDD_POR		BIT(5)
+#define TAS6424_WARN_VDD_OTW		BIT(4)
+#define TAS6424_WARN_VDD_OTW_CH1	BIT(3)
+#define TAS6424_WARN_VDD_OTW_CH2	BIT(2)
+#define TAS6424_WARN_VDD_OTW_CH3	BIT(1)
+#define TAS6424_WARN_VDD_OTW_CH4	BIT(0)
+
+/* TAS6424_MISC_CTRL3_REG */
+#define TAS6424_CLEAR_FAULT		BIT(7)
+#define TAS6424_PBTL_CH_SEL		BIT(6)
+#define TAS6424_MASK_CBC_WARN		BIT(5)
+#define TAS6424_MASK_VDD_UV		BIT(4)
+#define TAS6424_OTSD_AUTO_RECOVERY	BIT(3)
+
+#endif /* __TAS6424_H__ */
diff --git a/sound/soc/codecs/tfa9879.c b/sound/soc/codecs/tfa9879.c
index f8dd67c..e7ca764 100644
--- a/sound/soc/codecs/tfa9879.c
+++ b/sound/soc/codecs/tfa9879.c
@@ -316,6 +316,7 @@ static const struct of_device_id tfa9879_of_match[] = {
 	{ .compatible = "nxp,tfa9879", },
 	{ }
 };
+MODULE_DEVICE_TABLE(of, tfa9879_of_match);
 
 static struct i2c_driver tfa9879_i2c_driver = {
 	.driver = {
diff --git a/sound/soc/codecs/tlv320aic31xx.c b/sound/soc/codecs/tlv320aic31xx.c
index e286237..858cb8b 100644
--- a/sound/soc/codecs/tlv320aic31xx.c
+++ b/sound/soc/codecs/tlv320aic31xx.c
@@ -1,22 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
- * ALSA SoC TLV320AIC31XX codec driver
+ * ALSA SoC TLV320AIC31xx CODEC Driver
  *
- * Copyright (C) 2014 Texas Instruments, Inc.
- *
- * Author: Jyri Sarha <jsarha@ti.com>
+ * Copyright (C) 2014-2017 Texas Instruments Incorporated - http://www.ti.com/
+ *	Jyri Sarha <jsarha@ti.com>
  *
  * Based on ground work by: Ajit Kulkarni <x0175765@ti.com>
  *
- * This package is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * THIS PACKAGE IS PROVIDED AS IS AND WITHOUT ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
- * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
- *
- * The TLV320AIC31xx series of audio codec is a low-power, highly integrated
- * high performance codec which provides a stereo DAC, a mono ADC,
+ * The TLV320AIC31xx series of audio codecs are low-power, highly integrated
+ * high performance codecs which provides a stereo DAC, a mono ADC,
  * and mono/stereo Class-D speaker driver.
  */
 
@@ -26,7 +18,7 @@
 #include <linux/delay.h>
 #include <linux/pm.h>
 #include <linux/i2c.h>
-#include <linux/gpio.h>
+#include <linux/gpio/consumer.h>
 #include <linux/regulator/consumer.h>
 #include <linux/acpi.h>
 #include <linux/of.h>
@@ -144,8 +136,7 @@ static const struct regmap_config aic31xx_i2c_regmap = {
 	.max_register = 12 * 128,
 };
 
-#define AIC31XX_NUM_SUPPLIES	6
-static const char * const aic31xx_supply_names[AIC31XX_NUM_SUPPLIES] = {
+static const char * const aic31xx_supply_names[] = {
 	"HPVDD",
 	"SPRVDD",
 	"SPLVDD",
@@ -154,6 +145,8 @@ static const char * const aic31xx_supply_names[AIC31XX_NUM_SUPPLIES] = {
 	"DVDD",
 };
 
+#define AIC31XX_NUM_SUPPLIES ARRAY_SIZE(aic31xx_supply_names)
+
 struct aic31xx_disable_nb {
 	struct notifier_block nb;
 	struct aic31xx_priv *aic31xx;
@@ -164,6 +157,9 @@ struct aic31xx_priv {
 	u8 i2c_regs_status;
 	struct device *dev;
 	struct regmap *regmap;
+	enum aic31xx_type codec_type;
+	struct gpio_desc *gpio_reset;
+	int micbias_vg;
 	struct aic31xx_pdata pdata;
 	struct regulator_bulk_data supplies[AIC31XX_NUM_SUPPLIES];
 	struct aic31xx_disable_nb disable_nb[AIC31XX_NUM_SUPPLIES];
@@ -185,7 +181,7 @@ struct aic31xx_rate_divs {
 	u8 madc;
 };
 
-/* ADC dividers can be disabled by cofiguring them to 0 */
+/* ADC dividers can be disabled by configuring them to 0 */
 static const struct aic31xx_rate_divs aic31xx_divs[] = {
 	/* mclk/p    rate  pll: j     d        dosr ndac mdac  aors nadc madc */
 	/* 8k rate */
@@ -456,7 +452,7 @@ static int mic_bias_event(struct snd_soc_dapm_widget *w,
 		/* change mic bias voltage to user defined */
 		snd_soc_update_bits(codec, AIC31XX_MICBIAS,
 				    AIC31XX_MICBIAS_MASK,
-				    aic31xx->pdata.micbias_vg <<
+				    aic31xx->micbias_vg <<
 				    AIC31XX_MICBIAS_SHIFT);
 		dev_dbg(codec->dev, "%s: turned on\n", __func__);
 		break;
@@ -679,14 +675,14 @@ static int aic31xx_add_controls(struct snd_soc_codec *codec)
 	int ret = 0;
 	struct aic31xx_priv *aic31xx = snd_soc_codec_get_drvdata(codec);
 
-	if (!(aic31xx->pdata.codec_type & DAC31XX_BIT))
+	if (!(aic31xx->codec_type & DAC31XX_BIT))
 		ret = snd_soc_add_codec_controls(
 			codec, aic31xx_snd_controls,
 			ARRAY_SIZE(aic31xx_snd_controls));
 	if (ret)
 		return ret;
 
-	if (aic31xx->pdata.codec_type & AIC31XX_STEREO_CLASS_D_BIT)
+	if (aic31xx->codec_type & AIC31XX_STEREO_CLASS_D_BIT)
 		ret = snd_soc_add_codec_controls(
 			codec, aic311x_snd_controls,
 			ARRAY_SIZE(aic311x_snd_controls));
@@ -704,7 +700,7 @@ static int aic31xx_add_widgets(struct snd_soc_codec *codec)
 	struct aic31xx_priv *aic31xx = snd_soc_codec_get_drvdata(codec);
 	int ret = 0;
 
-	if (aic31xx->pdata.codec_type & DAC31XX_BIT) {
+	if (aic31xx->codec_type & DAC31XX_BIT) {
 		ret = snd_soc_dapm_new_controls(
 			dapm, dac31xx_dapm_widgets,
 			ARRAY_SIZE(dac31xx_dapm_widgets));
@@ -728,7 +724,7 @@ static int aic31xx_add_widgets(struct snd_soc_codec *codec)
 			return ret;
 	}
 
-	if (aic31xx->pdata.codec_type & AIC31XX_STEREO_CLASS_D_BIT) {
+	if (aic31xx->codec_type & AIC31XX_STEREO_CLASS_D_BIT) {
 		ret = snd_soc_dapm_new_controls(
 			dapm, aic311x_dapm_widgets,
 			ARRAY_SIZE(aic311x_dapm_widgets));
@@ -760,11 +756,17 @@ static int aic31xx_setup_pll(struct snd_soc_codec *codec,
 {
 	struct aic31xx_priv *aic31xx = snd_soc_codec_get_drvdata(codec);
 	int bclk_score = snd_soc_params_to_frame_size(params);
-	int mclk_p = aic31xx->sysclk / aic31xx->p_div;
+	int mclk_p;
 	int bclk_n = 0;
 	int match = -1;
 	int i;
 
+	if (!aic31xx->sysclk || !aic31xx->p_div) {
+		dev_err(codec->dev, "Master clock not supplied\n");
+		return -EINVAL;
+	}
+	mclk_p = aic31xx->sysclk / aic31xx->p_div;
+
 	/* Use PLL as CODEC_CLKIN and DAC_CLK as BDIV_CLKIN */
 	snd_soc_update_bits(codec, AIC31XX_CLKMUX,
 			    AIC31XX_CODEC_CLKIN_MASK, AIC31XX_CODEC_CLKIN_PLL);
@@ -840,11 +842,17 @@ static int aic31xx_setup_pll(struct snd_soc_codec *codec,
 
 	dev_dbg(codec->dev,
 		"pll %d.%04d/%d dosr %d n %d m %d aosr %d n %d m %d bclk_n %d\n",
-		aic31xx_divs[i].pll_j, aic31xx_divs[i].pll_d,
-		aic31xx->p_div, aic31xx_divs[i].dosr,
-		aic31xx_divs[i].ndac, aic31xx_divs[i].mdac,
-		aic31xx_divs[i].aosr, aic31xx_divs[i].nadc,
-		aic31xx_divs[i].madc, bclk_n);
+		aic31xx_divs[i].pll_j,
+		aic31xx_divs[i].pll_d,
+		aic31xx->p_div,
+		aic31xx_divs[i].dosr,
+		aic31xx_divs[i].ndac,
+		aic31xx_divs[i].mdac,
+		aic31xx_divs[i].aosr,
+		aic31xx_divs[i].nadc,
+		aic31xx_divs[i].madc,
+		bclk_n
+	);
 
 	return 0;
 }
@@ -919,8 +927,28 @@ static int aic31xx_set_dai_fmt(struct snd_soc_dai *codec_dai,
 	case SND_SOC_DAIFMT_CBM_CFM:
 		iface_reg1 |= AIC31XX_BCLK_MASTER | AIC31XX_WCLK_MASTER;
 		break;
+	case SND_SOC_DAIFMT_CBS_CFM:
+		iface_reg1 |= AIC31XX_WCLK_MASTER;
+		break;
+	case SND_SOC_DAIFMT_CBM_CFS:
+		iface_reg1 |= AIC31XX_BCLK_MASTER;
+		break;
+	case SND_SOC_DAIFMT_CBS_CFS:
+		break;
 	default:
-		dev_alert(codec->dev, "Invalid DAI master/slave interface\n");
+		dev_err(codec->dev, "Invalid DAI master/slave interface\n");
+		return -EINVAL;
+	}
+
+	/* signal polarity */
+	switch (fmt & SND_SOC_DAIFMT_INV_MASK) {
+	case SND_SOC_DAIFMT_NB_NF:
+		break;
+	case SND_SOC_DAIFMT_IB_NF:
+		iface_reg2 |= AIC31XX_BCLKINV_MASK;
+		break;
+	default:
+		dev_err(codec->dev, "Invalid DAI clock signal polarity\n");
 		return -EINVAL;
 	}
 
@@ -931,16 +959,12 @@ static int aic31xx_set_dai_fmt(struct snd_soc_dai *codec_dai,
 	case SND_SOC_DAIFMT_DSP_A:
 		dsp_a_val = 0x1; /* fall through */
 	case SND_SOC_DAIFMT_DSP_B:
-		/* NOTE: BCLKINV bit value 1 equas NB and 0 equals IB */
-		switch (fmt & SND_SOC_DAIFMT_INV_MASK) {
-		case SND_SOC_DAIFMT_NB_NF:
-			iface_reg2 |= AIC31XX_BCLKINV_MASK;
-			break;
-		case SND_SOC_DAIFMT_IB_NF:
-			break;
-		default:
-			return -EINVAL;
-		}
+		/*
+		 * NOTE: This CODEC samples on the falling edge of BCLK in
+		 * DSP mode, this is inverted compared to what most DAIs
+		 * expect, so we invert for this mode
+		 */
+		iface_reg2 ^= AIC31XX_BCLKINV_MASK;
 		iface_reg1 |= (AIC31XX_DSP_MODE <<
 			       AIC31XX_IFACE1_DATATYPE_SHIFT);
 		break;
@@ -981,8 +1005,9 @@ static int aic31xx_set_dai_sysclk(struct snd_soc_dai *codec_dai,
 	dev_dbg(codec->dev, "## %s: clk_id = %d, freq = %d, dir = %d\n",
 		__func__, clk_id, freq, dir);
 
-	for (i = 1; freq/i > 20000000 && i < 8; i++)
-		;
+	for (i = 1; i < 8; i++)
+		if (freq / i <= 20000000)
+			break;
 	if (freq/i > 20000000) {
 		dev_err(aic31xx->dev, "%s: Too high mclk frequency %u\n",
 			__func__, freq);
@@ -990,9 +1015,9 @@ static int aic31xx_set_dai_sysclk(struct snd_soc_dai *codec_dai,
 	}
 	aic31xx->p_div = i;
 
-	for (i = 0; i < ARRAY_SIZE(aic31xx_divs) &&
-		     aic31xx_divs[i].mclk_p != freq/aic31xx->p_div; i++)
-		;
+	for (i = 0; i < ARRAY_SIZE(aic31xx_divs); i++)
+		if (aic31xx_divs[i].mclk_p == freq / aic31xx->p_div)
+			break;
 	if (i == ARRAY_SIZE(aic31xx_divs)) {
 		dev_err(aic31xx->dev, "%s: Unsupported frequency %d\n",
 			__func__, freq);
@@ -1004,6 +1029,7 @@ static int aic31xx_set_dai_sysclk(struct snd_soc_dai *codec_dai,
 			    clk_id << AIC31XX_PLL_CLKIN_SHIFT);
 
 	aic31xx->sysclk = freq;
+
 	return 0;
 }
 
@@ -1019,8 +1045,8 @@ static int aic31xx_regulator_event(struct notifier_block *nb,
 		 * Put codec to reset and as at least one of the
 		 * supplies was disabled.
 		 */
-		if (gpio_is_valid(aic31xx->pdata.gpio_reset))
-			gpio_set_value(aic31xx->pdata.gpio_reset, 0);
+		if (aic31xx->gpio_reset)
+			gpiod_set_value(aic31xx->gpio_reset, 1);
 
 		regcache_mark_dirty(aic31xx->regmap);
 		dev_dbg(aic31xx->dev, "## %s: DISABLE received\n", __func__);
@@ -1029,6 +1055,22 @@ static int aic31xx_regulator_event(struct notifier_block *nb,
 	return 0;
 }
 
+static int aic31xx_reset(struct aic31xx_priv *aic31xx)
+{
+	int ret = 0;
+
+	if (aic31xx->gpio_reset) {
+		gpiod_set_value(aic31xx->gpio_reset, 1);
+		ndelay(10); /* At least 10ns */
+		gpiod_set_value(aic31xx->gpio_reset, 0);
+	} else {
+		ret = regmap_write(aic31xx->regmap, AIC31XX_RESET, 1);
+	}
+	mdelay(1); /* At least 1ms */
+
+	return ret;
+}
+
 static void aic31xx_clk_on(struct snd_soc_codec *codec)
 {
 	struct aic31xx_priv *aic31xx = snd_soc_codec_get_drvdata(codec);
@@ -1065,20 +1107,22 @@ static void aic31xx_clk_off(struct snd_soc_codec *codec)
 static int aic31xx_power_on(struct snd_soc_codec *codec)
 {
 	struct aic31xx_priv *aic31xx = snd_soc_codec_get_drvdata(codec);
-	int ret = 0;
+	int ret;
 
 	ret = regulator_bulk_enable(ARRAY_SIZE(aic31xx->supplies),
 				    aic31xx->supplies);
 	if (ret)
 		return ret;
 
-	if (gpio_is_valid(aic31xx->pdata.gpio_reset)) {
-		gpio_set_value(aic31xx->pdata.gpio_reset, 1);
-		udelay(100);
-	}
 	regcache_cache_only(aic31xx->regmap, false);
+
+	/* Reset device registers for a consistent power-on like state */
+	ret = aic31xx_reset(aic31xx);
+	if (ret < 0)
+		dev_err(aic31xx->dev, "Could not reset device: %d\n", ret);
+
 	ret = regcache_sync(aic31xx->regmap);
-	if (ret != 0) {
+	if (ret) {
 		dev_err(codec->dev,
 			"Failed to restore cache: %d\n", ret);
 		regcache_cache_only(aic31xx->regmap, true);
@@ -1086,19 +1130,17 @@ static int aic31xx_power_on(struct snd_soc_codec *codec)
 				       aic31xx->supplies);
 		return ret;
 	}
+
 	return 0;
 }
 
-static int aic31xx_power_off(struct snd_soc_codec *codec)
+static void aic31xx_power_off(struct snd_soc_codec *codec)
 {
 	struct aic31xx_priv *aic31xx = snd_soc_codec_get_drvdata(codec);
-	int ret = 0;
 
 	regcache_cache_only(aic31xx->regmap, true);
-	ret = regulator_bulk_disable(ARRAY_SIZE(aic31xx->supplies),
-				     aic31xx->supplies);
-
-	return ret;
+	regulator_bulk_disable(ARRAY_SIZE(aic31xx->supplies),
+			       aic31xx->supplies);
 }
 
 static int aic31xx_set_bias_level(struct snd_soc_codec *codec,
@@ -1137,14 +1179,11 @@ static int aic31xx_set_bias_level(struct snd_soc_codec *codec,
 
 static int aic31xx_codec_probe(struct snd_soc_codec *codec)
 {
-	int ret = 0;
 	struct aic31xx_priv *aic31xx = snd_soc_codec_get_drvdata(codec);
-	int i;
+	int i, ret;
 
 	dev_dbg(aic31xx->dev, "## %s\n", __func__);
 
-	aic31xx = snd_soc_codec_get_drvdata(codec);
-
 	aic31xx->codec = codec;
 
 	for (i = 0; i < ARRAY_SIZE(aic31xx->supplies); i++) {
@@ -1169,8 +1208,10 @@ static int aic31xx_codec_probe(struct snd_soc_codec *codec)
 		return ret;
 
 	ret = aic31xx_add_widgets(codec);
+	if (ret)
+		return ret;
 
-	return ret;
+	return 0;
 }
 
 static int aic31xx_codec_remove(struct snd_soc_codec *codec)
@@ -1258,89 +1299,31 @@ static const struct of_device_id tlv320aic31xx_of_match[] = {
 	{},
 };
 MODULE_DEVICE_TABLE(of, tlv320aic31xx_of_match);
-
-static void aic31xx_pdata_from_of(struct aic31xx_priv *aic31xx)
-{
-	struct device_node *np = aic31xx->dev->of_node;
-	unsigned int value = MICBIAS_2_0V;
-	int ret;
-
-	of_property_read_u32(np, "ai31xx-micbias-vg", &value);
-	switch (value) {
-	case MICBIAS_2_0V:
-	case MICBIAS_2_5V:
-	case MICBIAS_AVDDV:
-		aic31xx->pdata.micbias_vg = value;
-		break;
-	default:
-		dev_err(aic31xx->dev,
-			"Bad ai31xx-micbias-vg value %d DT\n",
-			value);
-		aic31xx->pdata.micbias_vg = MICBIAS_2_0V;
-	}
-
-	ret = of_get_named_gpio(np, "gpio-reset", 0);
-	if (ret > 0)
-		aic31xx->pdata.gpio_reset = ret;
-}
-#else /* CONFIG_OF */
-static void aic31xx_pdata_from_of(struct aic31xx_priv *aic31xx)
-{
-}
 #endif /* CONFIG_OF */
 
-static int aic31xx_device_init(struct aic31xx_priv *aic31xx)
-{
-	int ret, i;
-
-	dev_set_drvdata(aic31xx->dev, aic31xx);
-
-	if (dev_get_platdata(aic31xx->dev))
-		memcpy(&aic31xx->pdata, dev_get_platdata(aic31xx->dev),
-		       sizeof(aic31xx->pdata));
-	else if (aic31xx->dev->of_node)
-		aic31xx_pdata_from_of(aic31xx);
-
-	if (aic31xx->pdata.gpio_reset) {
-		ret = devm_gpio_request_one(aic31xx->dev,
-					    aic31xx->pdata.gpio_reset,
-					    GPIOF_OUT_INIT_HIGH,
-					    "aic31xx-reset-pin");
-		if (ret < 0) {
-			dev_err(aic31xx->dev, "not able to acquire gpio\n");
-			return ret;
-		}
-	}
-
-	for (i = 0; i < ARRAY_SIZE(aic31xx->supplies); i++)
-		aic31xx->supplies[i].supply = aic31xx_supply_names[i];
-
-	ret = devm_regulator_bulk_get(aic31xx->dev,
-				      ARRAY_SIZE(aic31xx->supplies),
-				      aic31xx->supplies);
-	if (ret != 0)
-		dev_err(aic31xx->dev, "Failed to request supplies: %d\n", ret);
-
-	return ret;
-}
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id aic31xx_acpi_match[] = {
+	{ "10TI3100", 0 },
+	{ }
+};
+MODULE_DEVICE_TABLE(acpi, aic31xx_acpi_match);
+#endif
 
 static int aic31xx_i2c_probe(struct i2c_client *i2c,
 			     const struct i2c_device_id *id)
 {
 	struct aic31xx_priv *aic31xx;
-	int ret;
-	const struct regmap_config *regmap_config;
+	unsigned int micbias_value = MICBIAS_2_0V;
+	int i, ret;
 
 	dev_dbg(&i2c->dev, "## %s: %s codec_type = %d\n", __func__,
-		id->name, (int) id->driver_data);
-
-	regmap_config = &aic31xx_i2c_regmap;
+		id->name, (int)id->driver_data);
 
 	aic31xx = devm_kzalloc(&i2c->dev, sizeof(*aic31xx), GFP_KERNEL);
-	if (aic31xx == NULL)
+	if (!aic31xx)
 		return -ENOMEM;
 
-	aic31xx->regmap = devm_regmap_init_i2c(i2c, regmap_config);
+	aic31xx->regmap = devm_regmap_init_i2c(i2c, &aic31xx_i2c_regmap);
 	if (IS_ERR(aic31xx->regmap)) {
 		ret = PTR_ERR(aic31xx->regmap);
 		dev_err(&i2c->dev, "Failed to allocate register map: %d\n",
@@ -1349,13 +1332,49 @@ static int aic31xx_i2c_probe(struct i2c_client *i2c,
 	}
 	aic31xx->dev = &i2c->dev;
 
-	aic31xx->pdata.codec_type = id->driver_data;
+	aic31xx->codec_type = id->driver_data;
 
-	ret = aic31xx_device_init(aic31xx);
-	if (ret)
+	dev_set_drvdata(aic31xx->dev, aic31xx);
+
+	fwnode_property_read_u32(aic31xx->dev->fwnode, "ai31xx-micbias-vg",
+				 &micbias_value);
+	switch (micbias_value) {
+	case MICBIAS_2_0V:
+	case MICBIAS_2_5V:
+	case MICBIAS_AVDDV:
+		aic31xx->micbias_vg = micbias_value;
+		break;
+	default:
+		dev_err(aic31xx->dev, "Bad ai31xx-micbias-vg value %d\n",
+			micbias_value);
+		aic31xx->micbias_vg = MICBIAS_2_0V;
+	}
+
+	if (dev_get_platdata(aic31xx->dev)) {
+		memcpy(&aic31xx->pdata, dev_get_platdata(aic31xx->dev), sizeof(aic31xx->pdata));
+		aic31xx->codec_type = aic31xx->pdata.codec_type;
+		aic31xx->micbias_vg = aic31xx->pdata.micbias_vg;
+	}
+
+	aic31xx->gpio_reset = devm_gpiod_get_optional(aic31xx->dev, "reset",
+						      GPIOD_OUT_LOW);
+	if (IS_ERR(aic31xx->gpio_reset)) {
+		dev_err(aic31xx->dev, "not able to acquire gpio\n");
+		return PTR_ERR(aic31xx->gpio_reset);
+	}
+
+	for (i = 0; i < ARRAY_SIZE(aic31xx->supplies); i++)
+		aic31xx->supplies[i].supply = aic31xx_supply_names[i];
+
+	ret = devm_regulator_bulk_get(aic31xx->dev,
+				      ARRAY_SIZE(aic31xx->supplies),
+				      aic31xx->supplies);
+	if (ret) {
+		dev_err(aic31xx->dev, "Failed to request supplies: %d\n", ret);
 		return ret;
+	}
 
-	if (aic31xx->pdata.codec_type & DAC31XX_BIT)
+	if (aic31xx->codec_type & DAC31XX_BIT)
 		return snd_soc_register_codec(&i2c->dev,
 				&soc_codec_driver_aic31xx,
 				dac31xx_dai_driver,
@@ -1386,14 +1405,6 @@ static const struct i2c_device_id aic31xx_i2c_id[] = {
 };
 MODULE_DEVICE_TABLE(i2c, aic31xx_i2c_id);
 
-#ifdef CONFIG_ACPI
-static const struct acpi_device_id aic31xx_acpi_match[] = {
-	{ "10TI3100", 0 },
-	{ }
-};
-MODULE_DEVICE_TABLE(acpi, aic31xx_acpi_match);
-#endif
-
 static struct i2c_driver aic31xx_i2c_driver = {
 	.driver = {
 		.name	= "tlv320aic31xx-codec",
@@ -1404,9 +1415,8 @@ static struct i2c_driver aic31xx_i2c_driver = {
 	.remove		= aic31xx_i2c_remove,
 	.id_table	= aic31xx_i2c_id,
 };
-
 module_i2c_driver(aic31xx_i2c_driver);
 
-MODULE_DESCRIPTION("ASoC TLV320AIC3111 codec driver");
-MODULE_AUTHOR("Jyri Sarha");
-MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Jyri Sarha <jsarha@ti.com>");
+MODULE_DESCRIPTION("ASoC TLV320AIC31xx CODEC Driver");
+MODULE_LICENSE("GPL v2");
diff --git a/sound/soc/codecs/tlv320aic31xx.h b/sound/soc/codecs/tlv320aic31xx.h
index 1ff3edb..15ac7cb 100644
--- a/sound/soc/codecs/tlv320aic31xx.h
+++ b/sound/soc/codecs/tlv320aic31xx.h
@@ -1,36 +1,30 @@
+// SPDX-License-Identifier: GPL-2.0
 /*
- * ALSA SoC TLV320AIC31XX codec driver
+ * ALSA SoC TLV320AIC31xx CODEC Driver Definitions
  *
- * Copyright (C) 2013 Texas Instruments, Inc.
- *
- * This package is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * THIS PACKAGE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
- * IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
- * WARRANTIES OF MERCHANTIBILITY AND FITNESS FOR A PARTICULAR PURPOSE.
- *
+ * Copyright (C) 2014-2017 Texas Instruments Incorporated - http://www.ti.com/
  */
+
 #ifndef _TLV320AIC31XX_H
 #define _TLV320AIC31XX_H
 
 #define AIC31XX_RATES	SNDRV_PCM_RATE_8000_192000
 
-#define AIC31XX_FORMATS	(SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FMTBIT_S20_3LE \
-			 | SNDRV_PCM_FMTBIT_S24_3LE | SNDRV_PCM_FMTBIT_S24_LE \
-			 | SNDRV_PCM_FMTBIT_S32_LE)
+#define AIC31XX_FORMATS	(SNDRV_PCM_FMTBIT_S16_LE | \
+			 SNDRV_PCM_FMTBIT_S20_3LE | \
+			 SNDRV_PCM_FMTBIT_S24_3LE | \
+			 SNDRV_PCM_FMTBIT_S24_LE | \
+			 SNDRV_PCM_FMTBIT_S32_LE)
 
-
-#define AIC31XX_STEREO_CLASS_D_BIT	0x1
-#define AIC31XX_MINIDSP_BIT		0x2
-#define DAC31XX_BIT			0x4
+#define AIC31XX_STEREO_CLASS_D_BIT	BIT(1)
+#define AIC31XX_MINIDSP_BIT		BIT(2)
+#define DAC31XX_BIT			BIT(3)
 
 enum aic31xx_type {
 	AIC3100	= 0,
 	AIC3110 = AIC31XX_STEREO_CLASS_D_BIT,
 	AIC3120 = AIC31XX_MINIDSP_BIT,
-	AIC3111 = (AIC31XX_STEREO_CLASS_D_BIT | AIC31XX_MINIDSP_BIT),
+	AIC3111 = AIC31XX_STEREO_CLASS_D_BIT | AIC31XX_MINIDSP_BIT,
 	DAC3100 = DAC31XX_BIT,
 	DAC3101 = DAC31XX_BIT | AIC31XX_STEREO_CLASS_D_BIT,
 };
@@ -43,222 +37,167 @@ struct aic31xx_pdata {
 
 #define AIC31XX_REG(page, reg)	((page * 128) + reg)
 
-/* Page Control Register */
-#define AIC31XX_PAGECTL		AIC31XX_REG(0, 0)
+#define AIC31XX_PAGECTL		AIC31XX_REG(0, 0) /* Page Control Register */
 
 /* Page 0 Registers */
-/* Software reset register */
-#define AIC31XX_RESET		AIC31XX_REG(0, 1)
-/* OT FLAG register */
-#define AIC31XX_OT_FLAG		AIC31XX_REG(0, 3)
-/* Clock clock Gen muxing, Multiplexers*/
-#define AIC31XX_CLKMUX		AIC31XX_REG(0, 4)
-/* PLL P and R-VAL register */
-#define AIC31XX_PLLPR		AIC31XX_REG(0, 5)
-/* PLL J-VAL register */
-#define AIC31XX_PLLJ		AIC31XX_REG(0, 6)
-/* PLL D-VAL MSB register */
-#define AIC31XX_PLLDMSB		AIC31XX_REG(0, 7)
-/* PLL D-VAL LSB register */
-#define AIC31XX_PLLDLSB		AIC31XX_REG(0, 8)
-/* DAC NDAC_VAL register*/
-#define AIC31XX_NDAC		AIC31XX_REG(0, 11)
-/* DAC MDAC_VAL register */
-#define AIC31XX_MDAC		AIC31XX_REG(0, 12)
-/* DAC OSR setting register 1, MSB value */
-#define AIC31XX_DOSRMSB		AIC31XX_REG(0, 13)
-/* DAC OSR setting register 2, LSB value */
-#define AIC31XX_DOSRLSB		AIC31XX_REG(0, 14)
+#define AIC31XX_RESET		AIC31XX_REG(0, 1) /* Software reset register */
+#define AIC31XX_OT_FLAG		AIC31XX_REG(0, 3) /* OT FLAG register */
+#define AIC31XX_CLKMUX		AIC31XX_REG(0, 4) /* Clock clock Gen muxing, Multiplexers*/
+#define AIC31XX_PLLPR		AIC31XX_REG(0, 5) /* PLL P and R-VAL register */
+#define AIC31XX_PLLJ		AIC31XX_REG(0, 6) /* PLL J-VAL register */
+#define AIC31XX_PLLDMSB		AIC31XX_REG(0, 7) /* PLL D-VAL MSB register */
+#define AIC31XX_PLLDLSB		AIC31XX_REG(0, 8) /* PLL D-VAL LSB register */
+#define AIC31XX_NDAC		AIC31XX_REG(0, 11) /* DAC NDAC_VAL register*/
+#define AIC31XX_MDAC		AIC31XX_REG(0, 12) /* DAC MDAC_VAL register */
+#define AIC31XX_DOSRMSB		AIC31XX_REG(0, 13) /* DAC OSR setting register 1, MSB value */
+#define AIC31XX_DOSRLSB		AIC31XX_REG(0, 14) /* DAC OSR setting register 2, LSB value */
 #define AIC31XX_MINI_DSP_INPOL	AIC31XX_REG(0, 16)
-/* Clock setting register 8, PLL */
-#define AIC31XX_NADC		AIC31XX_REG(0, 18)
-/* Clock setting register 9, PLL */
-#define AIC31XX_MADC		AIC31XX_REG(0, 19)
-/* ADC Oversampling (AOSR) Register */
-#define AIC31XX_AOSR		AIC31XX_REG(0, 20)
-/* Clock setting register 9, Multiplexers */
-#define AIC31XX_CLKOUTMUX	AIC31XX_REG(0, 25)
-/* Clock setting register 10, CLOCKOUT M divider value */
-#define AIC31XX_CLKOUTMVAL	AIC31XX_REG(0, 26)
-/* Audio Interface Setting Register 1 */
-#define AIC31XX_IFACE1		AIC31XX_REG(0, 27)
-/* Audio Data Slot Offset Programming */
-#define AIC31XX_DATA_OFFSET	AIC31XX_REG(0, 28)
-/* Audio Interface Setting Register 2 */
-#define AIC31XX_IFACE2		AIC31XX_REG(0, 29)
-/* Clock setting register 11, BCLK N Divider */
-#define AIC31XX_BCLKN		AIC31XX_REG(0, 30)
-/* Audio Interface Setting Register 3, Secondary Audio Interface */
-#define AIC31XX_IFACESEC1	AIC31XX_REG(0, 31)
-/* Audio Interface Setting Register 4 */
-#define AIC31XX_IFACESEC2	AIC31XX_REG(0, 32)
-/* Audio Interface Setting Register 5 */
-#define AIC31XX_IFACESEC3	AIC31XX_REG(0, 33)
-/* I2C Bus Condition */
-#define AIC31XX_I2C		AIC31XX_REG(0, 34)
-/* ADC FLAG */
-#define AIC31XX_ADCFLAG		AIC31XX_REG(0, 36)
-/* DAC Flag Registers */
-#define AIC31XX_DACFLAG1	AIC31XX_REG(0, 37)
+#define AIC31XX_NADC		AIC31XX_REG(0, 18) /* Clock setting register 8, PLL */
+#define AIC31XX_MADC		AIC31XX_REG(0, 19) /* Clock setting register 9, PLL */
+#define AIC31XX_AOSR		AIC31XX_REG(0, 20) /* ADC Oversampling (AOSR) Register */
+#define AIC31XX_CLKOUTMUX	AIC31XX_REG(0, 25) /* Clock setting register 9, Multiplexers */
+#define AIC31XX_CLKOUTMVAL	AIC31XX_REG(0, 26) /* Clock setting register 10, CLOCKOUT M divider value */
+#define AIC31XX_IFACE1		AIC31XX_REG(0, 27) /* Audio Interface Setting Register 1 */
+#define AIC31XX_DATA_OFFSET	AIC31XX_REG(0, 28) /* Audio Data Slot Offset Programming */
+#define AIC31XX_IFACE2		AIC31XX_REG(0, 29) /* Audio Interface Setting Register 2 */
+#define AIC31XX_BCLKN		AIC31XX_REG(0, 30) /* Clock setting register 11, BCLK N Divider */
+#define AIC31XX_IFACESEC1	AIC31XX_REG(0, 31) /* Audio Interface Setting Register 3, Secondary Audio Interface */
+#define AIC31XX_IFACESEC2	AIC31XX_REG(0, 32) /* Audio Interface Setting Register 4 */
+#define AIC31XX_IFACESEC3	AIC31XX_REG(0, 33) /* Audio Interface Setting Register 5 */
+#define AIC31XX_I2C		AIC31XX_REG(0, 34) /* I2C Bus Condition */
+#define AIC31XX_ADCFLAG		AIC31XX_REG(0, 36) /* ADC FLAG */
+#define AIC31XX_DACFLAG1	AIC31XX_REG(0, 37) /* DAC Flag Registers */
 #define AIC31XX_DACFLAG2	AIC31XX_REG(0, 38)
-/* Sticky Interrupt flag (overflow) */
-#define AIC31XX_OFFLAG		AIC31XX_REG(0, 39)
-/* Sticy DAC Interrupt flags */
-#define AIC31XX_INTRDACFLAG	AIC31XX_REG(0, 44)
-/* Sticy ADC Interrupt flags */
-#define AIC31XX_INTRADCFLAG	AIC31XX_REG(0, 45)
-/* DAC Interrupt flags 2 */
-#define AIC31XX_INTRDACFLAG2	AIC31XX_REG(0, 46)
-/* ADC Interrupt flags 2 */
-#define AIC31XX_INTRADCFLAG2	AIC31XX_REG(0, 47)
-/* INT1 interrupt control */
-#define AIC31XX_INT1CTRL	AIC31XX_REG(0, 48)
-/* INT2 interrupt control */
-#define AIC31XX_INT2CTRL	AIC31XX_REG(0, 49)
-/* GPIO1 control */
-#define AIC31XX_GPIO1		AIC31XX_REG(0, 51)
-
+#define AIC31XX_OFFLAG		AIC31XX_REG(0, 39) /* Sticky Interrupt flag (overflow) */
+#define AIC31XX_INTRDACFLAG	AIC31XX_REG(0, 44) /* Sticy DAC Interrupt flags */
+#define AIC31XX_INTRADCFLAG	AIC31XX_REG(0, 45) /* Sticy ADC Interrupt flags */
+#define AIC31XX_INTRDACFLAG2	AIC31XX_REG(0, 46) /* DAC Interrupt flags 2 */
+#define AIC31XX_INTRADCFLAG2	AIC31XX_REG(0, 47) /* ADC Interrupt flags 2 */
+#define AIC31XX_INT1CTRL	AIC31XX_REG(0, 48) /* INT1 interrupt control */
+#define AIC31XX_INT2CTRL	AIC31XX_REG(0, 49) /* INT2 interrupt control */
+#define AIC31XX_GPIO1		AIC31XX_REG(0, 51) /* GPIO1 control */
 #define AIC31XX_DACPRB		AIC31XX_REG(0, 60)
-/* ADC Instruction Set Register */
-#define AIC31XX_ADCPRB		AIC31XX_REG(0, 61)
-/* DAC channel setup register */
-#define AIC31XX_DACSETUP	AIC31XX_REG(0, 63)
-/* DAC Mute and volume control register */
-#define AIC31XX_DACMUTE		AIC31XX_REG(0, 64)
-/* Left DAC channel digital volume control */
-#define AIC31XX_LDACVOL		AIC31XX_REG(0, 65)
-/* Right DAC channel digital volume control */
-#define AIC31XX_RDACVOL		AIC31XX_REG(0, 66)
-/* Headset detection */
-#define AIC31XX_HSDETECT	AIC31XX_REG(0, 67)
-/* ADC Digital Mic */
-#define AIC31XX_ADCSETUP	AIC31XX_REG(0, 81)
-/* ADC Digital Volume Control Fine Adjust */
-#define AIC31XX_ADCFGA		AIC31XX_REG(0, 82)
-/* ADC Digital Volume Control Coarse Adjust */
-#define AIC31XX_ADCVOL		AIC31XX_REG(0, 83)
-
+#define AIC31XX_ADCPRB		AIC31XX_REG(0, 61) /* ADC Instruction Set Register */
+#define AIC31XX_DACSETUP	AIC31XX_REG(0, 63) /* DAC channel setup register */
+#define AIC31XX_DACMUTE		AIC31XX_REG(0, 64) /* DAC Mute and volume control register */
+#define AIC31XX_LDACVOL		AIC31XX_REG(0, 65) /* Left DAC channel digital volume control */
+#define AIC31XX_RDACVOL		AIC31XX_REG(0, 66) /* Right DAC channel digital volume control */
+#define AIC31XX_HSDETECT	AIC31XX_REG(0, 67) /* Headset detection */
+#define AIC31XX_ADCSETUP	AIC31XX_REG(0, 81) /* ADC Digital Mic */
+#define AIC31XX_ADCFGA		AIC31XX_REG(0, 82) /* ADC Digital Volume Control Fine Adjust */
+#define AIC31XX_ADCVOL		AIC31XX_REG(0, 83) /* ADC Digital Volume Control Coarse Adjust */
 
 /* Page 1 Registers */
-/* Headphone drivers */
-#define AIC31XX_HPDRIVER	AIC31XX_REG(1, 31)
-/* Class-D Speakear Amplifier */
-#define AIC31XX_SPKAMP		AIC31XX_REG(1, 32)
-/* HP Output Drivers POP Removal Settings */
-#define AIC31XX_HPPOP		AIC31XX_REG(1, 33)
-/* Output Driver PGA Ramp-Down Period Control */
-#define AIC31XX_SPPGARAMP	AIC31XX_REG(1, 34)
-/* DAC_L and DAC_R Output Mixer Routing */
-#define AIC31XX_DACMIXERROUTE	AIC31XX_REG(1, 35)
-/* Left Analog Vol to HPL */
-#define AIC31XX_LANALOGHPL	AIC31XX_REG(1, 36)
-/* Right Analog Vol to HPR */
-#define AIC31XX_RANALOGHPR	AIC31XX_REG(1, 37)
-/* Left Analog Vol to SPL */
-#define AIC31XX_LANALOGSPL	AIC31XX_REG(1, 38)
-/* Right Analog Vol to SPR */
-#define AIC31XX_RANALOGSPR	AIC31XX_REG(1, 39)
-/* HPL Driver */
-#define AIC31XX_HPLGAIN		AIC31XX_REG(1, 40)
-/* HPR Driver */
-#define AIC31XX_HPRGAIN		AIC31XX_REG(1, 41)
-/* SPL Driver */
-#define AIC31XX_SPLGAIN		AIC31XX_REG(1, 42)
-/* SPR Driver */
-#define AIC31XX_SPRGAIN		AIC31XX_REG(1, 43)
-/* HP Driver Control */
-#define AIC31XX_HPCONTROL	AIC31XX_REG(1, 44)
-/* MIC Bias Control */
-#define AIC31XX_MICBIAS		AIC31XX_REG(1, 46)
-/* MIC PGA*/
-#define AIC31XX_MICPGA		AIC31XX_REG(1, 47)
-/* Delta-Sigma Mono ADC Channel Fine-Gain Input Selection for P-Terminal */
-#define AIC31XX_MICPGAPI	AIC31XX_REG(1, 48)
-/* ADC Input Selection for M-Terminal */
-#define AIC31XX_MICPGAMI	AIC31XX_REG(1, 49)
-/* Input CM Settings */
-#define AIC31XX_MICPGACM	AIC31XX_REG(1, 50)
+#define AIC31XX_HPDRIVER	AIC31XX_REG(1, 31) /* Headphone drivers */
+#define AIC31XX_SPKAMP		AIC31XX_REG(1, 32) /* Class-D Speakear Amplifier */
+#define AIC31XX_HPPOP		AIC31XX_REG(1, 33) /* HP Output Drivers POP Removal Settings */
+#define AIC31XX_SPPGARAMP	AIC31XX_REG(1, 34) /* Output Driver PGA Ramp-Down Period Control */
+#define AIC31XX_DACMIXERROUTE	AIC31XX_REG(1, 35) /* DAC_L and DAC_R Output Mixer Routing */
+#define AIC31XX_LANALOGHPL	AIC31XX_REG(1, 36) /* Left Analog Vol to HPL */
+#define AIC31XX_RANALOGHPR	AIC31XX_REG(1, 37) /* Right Analog Vol to HPR */
+#define AIC31XX_LANALOGSPL	AIC31XX_REG(1, 38) /* Left Analog Vol to SPL */
+#define AIC31XX_RANALOGSPR	AIC31XX_REG(1, 39) /* Right Analog Vol to SPR */
+#define AIC31XX_HPLGAIN		AIC31XX_REG(1, 40) /* HPL Driver */
+#define AIC31XX_HPRGAIN		AIC31XX_REG(1, 41) /* HPR Driver */
+#define AIC31XX_SPLGAIN		AIC31XX_REG(1, 42) /* SPL Driver */
+#define AIC31XX_SPRGAIN		AIC31XX_REG(1, 43) /* SPR Driver */
+#define AIC31XX_HPCONTROL	AIC31XX_REG(1, 44) /* HP Driver Control */
+#define AIC31XX_MICBIAS		AIC31XX_REG(1, 46) /* MIC Bias Control */
+#define AIC31XX_MICPGA		AIC31XX_REG(1, 47) /* MIC PGA*/
+#define AIC31XX_MICPGAPI	AIC31XX_REG(1, 48) /* Delta-Sigma Mono ADC Channel Fine-Gain Input Selection for P-Terminal */
+#define AIC31XX_MICPGAMI	AIC31XX_REG(1, 49) /* ADC Input Selection for M-Terminal */
+#define AIC31XX_MICPGACM	AIC31XX_REG(1, 50) /* Input CM Settings */
 
-/* Bits, masks and shifts */
+/* Bits, masks, and shifts */
 
 /* AIC31XX_CLKMUX */
-#define AIC31XX_PLL_CLKIN_MASK			0x0c
-#define AIC31XX_PLL_CLKIN_SHIFT			2
-#define AIC31XX_PLL_CLKIN_MCLK			0
-#define AIC31XX_CODEC_CLKIN_MASK		0x03
-#define AIC31XX_CODEC_CLKIN_SHIFT		0
-#define AIC31XX_CODEC_CLKIN_PLL			3
-#define AIC31XX_CODEC_CLKIN_BCLK		1
+#define AIC31XX_PLL_CLKIN_MASK		GENMASK(3, 2)
+#define AIC31XX_PLL_CLKIN_SHIFT		(2)
+#define AIC31XX_PLL_CLKIN_MCLK		0x00
+#define AIC31XX_PLL_CLKIN_BCKL		0x01
+#define AIC31XX_PLL_CLKIN_GPIO1		0x02
+#define AIC31XX_PLL_CLKIN_DIN		0x03
+#define AIC31XX_CODEC_CLKIN_MASK	GENMASK(1, 0)
+#define AIC31XX_CODEC_CLKIN_SHIFT	(0)
+#define AIC31XX_CODEC_CLKIN_MCLK	0x00
+#define AIC31XX_CODEC_CLKIN_BCLK	0x01
+#define AIC31XX_CODEC_CLKIN_GPIO1	0x02
+#define AIC31XX_CODEC_CLKIN_PLL		0x03
 
-/* AIC31XX_PLLPR, AIC31XX_NDAC, AIC31XX_MDAC, AIC31XX_NADC, AIC31XX_MADC,
-   AIC31XX_BCLKN */
-#define AIC31XX_PLL_MASK		0x7f
-#define AIC31XX_PM_MASK			0x80
+/* AIC31XX_PLLPR */
+/* AIC31XX_NDAC */
+/* AIC31XX_MDAC */
+/* AIC31XX_NADC */
+/* AIC31XX_MADC */
+/* AIC31XX_BCLKN */
+#define AIC31XX_PLL_MASK		GENMASK(6, 0)
+#define AIC31XX_PM_MASK			BIT(7)
 
 /* AIC31XX_IFACE1 */
-#define AIC31XX_WORD_LEN_16BITS		0x00
-#define AIC31XX_WORD_LEN_20BITS		0x01
-#define AIC31XX_WORD_LEN_24BITS		0x02
-#define AIC31XX_WORD_LEN_32BITS		0x03
-#define AIC31XX_IFACE1_DATALEN_MASK	0x30
-#define AIC31XX_IFACE1_DATALEN_SHIFT	(4)
-#define AIC31XX_IFACE1_DATATYPE_MASK	0xC0
+#define AIC31XX_IFACE1_DATATYPE_MASK	GENMASK(7, 6)
 #define AIC31XX_IFACE1_DATATYPE_SHIFT	(6)
 #define AIC31XX_I2S_MODE		0x00
 #define AIC31XX_DSP_MODE		0x01
 #define AIC31XX_RIGHT_JUSTIFIED_MODE	0x02
 #define AIC31XX_LEFT_JUSTIFIED_MODE	0x03
-#define AIC31XX_IFACE1_MASTER_MASK	0x0C
-#define AIC31XX_BCLK_MASTER		0x08
-#define AIC31XX_WCLK_MASTER		0x04
+#define AIC31XX_IFACE1_DATALEN_MASK	GENMASK(5, 4)
+#define AIC31XX_IFACE1_DATALEN_SHIFT	(4)
+#define AIC31XX_WORD_LEN_16BITS		0x00
+#define AIC31XX_WORD_LEN_20BITS		0x01
+#define AIC31XX_WORD_LEN_24BITS		0x02
+#define AIC31XX_WORD_LEN_32BITS		0x03
+#define AIC31XX_IFACE1_MASTER_MASK	GENMASK(3, 2)
+#define AIC31XX_BCLK_MASTER		BIT(2)
+#define AIC31XX_WCLK_MASTER		BIT(3)
 
 /* AIC31XX_DATA_OFFSET */
-#define AIC31XX_DATA_OFFSET_MASK	0xFF
+#define AIC31XX_DATA_OFFSET_MASK	GENMASK(7, 0)
 
 /* AIC31XX_IFACE2 */
-#define AIC31XX_BCLKINV_MASK		0x08
-#define AIC31XX_BDIVCLK_MASK		0x03
+#define AIC31XX_BCLKINV_MASK		BIT(3)
+#define AIC31XX_BDIVCLK_MASK		GENMASK(1, 0)
 #define AIC31XX_DAC2BCLK		0x00
 #define AIC31XX_DACMOD2BCLK		0x01
 #define AIC31XX_ADC2BCLK		0x02
 #define AIC31XX_ADCMOD2BCLK		0x03
 
 /* AIC31XX_ADCFLAG */
-#define AIC31XX_ADCPWRSTATUS_MASK		0x40
+#define AIC31XX_ADCPWRSTATUS_MASK	BIT(6)
 
 /* AIC31XX_DACFLAG1 */
-#define AIC31XX_LDACPWRSTATUS_MASK		0x80
-#define AIC31XX_RDACPWRSTATUS_MASK		0x08
-#define AIC31XX_HPLDRVPWRSTATUS_MASK		0x20
-#define AIC31XX_HPRDRVPWRSTATUS_MASK		0x02
-#define AIC31XX_SPLDRVPWRSTATUS_MASK		0x10
-#define AIC31XX_SPRDRVPWRSTATUS_MASK		0x01
+#define AIC31XX_LDACPWRSTATUS_MASK	BIT(7)
+#define AIC31XX_HPLDRVPWRSTATUS_MASK	BIT(5)
+#define AIC31XX_SPLDRVPWRSTATUS_MASK	BIT(4)
+#define AIC31XX_RDACPWRSTATUS_MASK	BIT(3)
+#define AIC31XX_HPRDRVPWRSTATUS_MASK	BIT(1)
+#define AIC31XX_SPRDRVPWRSTATUS_MASK	BIT(0)
 
 /* AIC31XX_INTRDACFLAG */
-#define AIC31XX_HPSCDETECT_MASK			0x80
-#define AIC31XX_BUTTONPRESS_MASK		0x20
-#define AIC31XX_HSPLUG_MASK			0x10
-#define AIC31XX_LDRCTHRES_MASK			0x08
-#define AIC31XX_RDRCTHRES_MASK			0x04
-#define AIC31XX_DACSINT_MASK			0x02
-#define AIC31XX_DACAINT_MASK			0x01
+#define AIC31XX_HPLSCDETECT		BIT(7)
+#define AIC31XX_HPRSCDETECT		BIT(6)
+#define AIC31XX_BUTTONPRESS		BIT(5)
+#define AIC31XX_HSPLUG			BIT(4)
+#define AIC31XX_LDRCTHRES		BIT(3)
+#define AIC31XX_RDRCTHRES		BIT(2)
+#define AIC31XX_DACSINT			BIT(1)
+#define AIC31XX_DACAINT			BIT(0)
 
 /* AIC31XX_INT1CTRL */
-#define AIC31XX_HSPLUGDET_MASK			0x80
-#define AIC31XX_BUTTONPRESSDET_MASK		0x40
-#define AIC31XX_DRCTHRES_MASK			0x20
-#define AIC31XX_AGCNOISE_MASK			0x10
-#define AIC31XX_OC_MASK				0x08
-#define AIC31XX_ENGINE_MASK			0x04
+#define AIC31XX_HSPLUGDET		BIT(7)
+#define AIC31XX_BUTTONPRESSDET		BIT(6)
+#define AIC31XX_DRCTHRES		BIT(5)
+#define AIC31XX_AGCNOISE		BIT(4)
+#define AIC31XX_SC			BIT(3)
+#define AIC31XX_ENGINE			BIT(2)
 
 /* AIC31XX_DACSETUP */
-#define AIC31XX_SOFTSTEP_MASK			0x03
+#define AIC31XX_SOFTSTEP_MASK		GENMASK(1, 0)
 
 /* AIC31XX_DACMUTE */
-#define AIC31XX_DACMUTE_MASK			0x0C
+#define AIC31XX_DACMUTE_MASK		GENMASK(3, 2)
 
 /* AIC31XX_MICBIAS */
-#define AIC31XX_MICBIAS_MASK			0x03
-#define AIC31XX_MICBIAS_SHIFT			0
+#define AIC31XX_MICBIAS_MASK		GENMASK(1, 0)
+#define AIC31XX_MICBIAS_SHIFT		0
 
 #endif	/* _TLV320AIC31XX_H */
diff --git a/sound/soc/codecs/tlv320aic32x4.c b/sound/soc/codecs/tlv320aic32x4.c
index e694f5f..fea0193 100644
--- a/sound/soc/codecs/tlv320aic32x4.c
+++ b/sound/soc/codecs/tlv320aic32x4.c
@@ -281,34 +281,34 @@ static const struct snd_kcontrol_new aic32x4_snd_controls[] = {
 
 static const struct aic32x4_rate_divs aic32x4_divs[] = {
 	/* 8k rate */
-	{AIC32X4_FREQ_12000000, 8000, 1, 7, 6800, 768, 5, 3, 128, 5, 18, 24},
-	{AIC32X4_FREQ_24000000, 8000, 2, 7, 6800, 768, 15, 1, 64, 45, 4, 24},
-	{AIC32X4_FREQ_25000000, 8000, 2, 7, 3728, 768, 15, 1, 64, 45, 4, 24},
+	{12000000, 8000, 1, 7, 6800, 768, 5, 3, 128, 5, 18, 24},
+	{24000000, 8000, 2, 7, 6800, 768, 15, 1, 64, 45, 4, 24},
+	{25000000, 8000, 2, 7, 3728, 768, 15, 1, 64, 45, 4, 24},
 	/* 11.025k rate */
-	{AIC32X4_FREQ_12000000, 11025, 1, 7, 5264, 512, 8, 2, 128, 8, 8, 16},
-	{AIC32X4_FREQ_24000000, 11025, 2, 7, 5264, 512, 16, 1, 64, 32, 4, 16},
+	{12000000, 11025, 1, 7, 5264, 512, 8, 2, 128, 8, 8, 16},
+	{24000000, 11025, 2, 7, 5264, 512, 16, 1, 64, 32, 4, 16},
 	/* 16k rate */
-	{AIC32X4_FREQ_12000000, 16000, 1, 7, 6800, 384, 5, 3, 128, 5, 9, 12},
-	{AIC32X4_FREQ_24000000, 16000, 2, 7, 6800, 384, 15, 1, 64, 18, 5, 12},
-	{AIC32X4_FREQ_25000000, 16000, 2, 7, 3728, 384, 15, 1, 64, 18, 5, 12},
+	{12000000, 16000, 1, 7, 6800, 384, 5, 3, 128, 5, 9, 12},
+	{24000000, 16000, 2, 7, 6800, 384, 15, 1, 64, 18, 5, 12},
+	{25000000, 16000, 2, 7, 3728, 384, 15, 1, 64, 18, 5, 12},
 	/* 22.05k rate */
-	{AIC32X4_FREQ_12000000, 22050, 1, 7, 5264, 256, 4, 4, 128, 4, 8, 8},
-	{AIC32X4_FREQ_24000000, 22050, 2, 7, 5264, 256, 16, 1, 64, 16, 4, 8},
-	{AIC32X4_FREQ_25000000, 22050, 2, 7, 2253, 256, 16, 1, 64, 16, 4, 8},
+	{12000000, 22050, 1, 7, 5264, 256, 4, 4, 128, 4, 8, 8},
+	{24000000, 22050, 2, 7, 5264, 256, 16, 1, 64, 16, 4, 8},
+	{25000000, 22050, 2, 7, 2253, 256, 16, 1, 64, 16, 4, 8},
 	/* 32k rate */
-	{AIC32X4_FREQ_12000000, 32000, 1, 7, 1680, 192, 2, 7, 64, 2, 21, 6},
-	{AIC32X4_FREQ_24000000, 32000, 2, 7, 1680, 192, 7, 2, 64, 7, 6, 6},
+	{12000000, 32000, 1, 7, 1680, 192, 2, 7, 64, 2, 21, 6},
+	{24000000, 32000, 2, 7, 1680, 192, 7, 2, 64, 7, 6, 6},
 	/* 44.1k rate */
-	{AIC32X4_FREQ_12000000, 44100, 1, 7, 5264, 128, 2, 8, 128, 2, 8, 4},
-	{AIC32X4_FREQ_24000000, 44100, 2, 7, 5264, 128, 8, 2, 64, 8, 4, 4},
-	{AIC32X4_FREQ_25000000, 44100, 2, 7, 2253, 128, 8, 2, 64, 8, 4, 4},
+	{12000000, 44100, 1, 7, 5264, 128, 2, 8, 128, 2, 8, 4},
+	{24000000, 44100, 2, 7, 5264, 128, 8, 2, 64, 8, 4, 4},
+	{25000000, 44100, 2, 7, 2253, 128, 8, 2, 64, 8, 4, 4},
 	/* 48k rate */
-	{AIC32X4_FREQ_12000000, 48000, 1, 8, 1920, 128, 2, 8, 128, 2, 8, 4},
-	{AIC32X4_FREQ_24000000, 48000, 2, 8, 1920, 128, 8, 2, 64, 8, 4, 4},
-	{AIC32X4_FREQ_25000000, 48000, 2, 7, 8643, 128, 8, 2, 64, 8, 4, 4},
+	{12000000, 48000, 1, 8, 1920, 128, 2, 8, 128, 2, 8, 4},
+	{24000000, 48000, 2, 8, 1920, 128, 8, 2, 64, 8, 4, 4},
+	{25000000, 48000, 2, 7, 8643, 128, 8, 2, 64, 8, 4, 4},
 
 	/* 96k rate */
-	{AIC32X4_FREQ_25000000, 96000, 2, 7, 8643, 64, 4, 4, 64, 4, 4, 1},
+	{25000000, 96000, 2, 7, 8643, 64, 4, 4, 64, 4, 4, 1},
 };
 
 static const struct snd_kcontrol_new hpl_output_mixer_controls[] = {
@@ -601,9 +601,9 @@ static int aic32x4_set_dai_sysclk(struct snd_soc_dai *codec_dai,
 	struct aic32x4_priv *aic32x4 = snd_soc_codec_get_drvdata(codec);
 
 	switch (freq) {
-	case AIC32X4_FREQ_12000000:
-	case AIC32X4_FREQ_24000000:
-	case AIC32X4_FREQ_25000000:
+	case 12000000:
+	case 24000000:
+	case 25000000:
 		aic32x4->sysclk = freq;
 		return 0;
 	}
@@ -614,16 +614,9 @@ static int aic32x4_set_dai_sysclk(struct snd_soc_dai *codec_dai,
 static int aic32x4_set_dai_fmt(struct snd_soc_dai *codec_dai, unsigned int fmt)
 {
 	struct snd_soc_codec *codec = codec_dai->codec;
-	u8 iface_reg_1;
-	u8 iface_reg_2;
-	u8 iface_reg_3;
-
-	iface_reg_1 = snd_soc_read(codec, AIC32X4_IFACE1);
-	iface_reg_1 = iface_reg_1 & ~(3 << 6 | 3 << 2);
-	iface_reg_2 = snd_soc_read(codec, AIC32X4_IFACE2);
-	iface_reg_2 = 0;
-	iface_reg_3 = snd_soc_read(codec, AIC32X4_IFACE3);
-	iface_reg_3 = iface_reg_3 & ~(1 << 3);
+	u8 iface_reg_1 = 0;
+	u8 iface_reg_2 = 0;
+	u8 iface_reg_3 = 0;
 
 	/* set master/slave audio interface */
 	switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
@@ -641,30 +634,37 @@ static int aic32x4_set_dai_fmt(struct snd_soc_dai *codec_dai, unsigned int fmt)
 	case SND_SOC_DAIFMT_I2S:
 		break;
 	case SND_SOC_DAIFMT_DSP_A:
-		iface_reg_1 |= (AIC32X4_DSP_MODE << AIC32X4_PLLJ_SHIFT);
-		iface_reg_3 |= (1 << 3); /* invert bit clock */
+		iface_reg_1 |= (AIC32X4_DSP_MODE <<
+				AIC32X4_IFACE1_DATATYPE_SHIFT);
+		iface_reg_3 |= AIC32X4_BCLKINV_MASK; /* invert bit clock */
 		iface_reg_2 = 0x01; /* add offset 1 */
 		break;
 	case SND_SOC_DAIFMT_DSP_B:
-		iface_reg_1 |= (AIC32X4_DSP_MODE << AIC32X4_PLLJ_SHIFT);
-		iface_reg_3 |= (1 << 3); /* invert bit clock */
+		iface_reg_1 |= (AIC32X4_DSP_MODE <<
+				AIC32X4_IFACE1_DATATYPE_SHIFT);
+		iface_reg_3 |= AIC32X4_BCLKINV_MASK; /* invert bit clock */
 		break;
 	case SND_SOC_DAIFMT_RIGHT_J:
-		iface_reg_1 |=
-			(AIC32X4_RIGHT_JUSTIFIED_MODE << AIC32X4_PLLJ_SHIFT);
+		iface_reg_1 |= (AIC32X4_RIGHT_JUSTIFIED_MODE <<
+				AIC32X4_IFACE1_DATATYPE_SHIFT);
 		break;
 	case SND_SOC_DAIFMT_LEFT_J:
-		iface_reg_1 |=
-			(AIC32X4_LEFT_JUSTIFIED_MODE << AIC32X4_PLLJ_SHIFT);
+		iface_reg_1 |= (AIC32X4_LEFT_JUSTIFIED_MODE <<
+				AIC32X4_IFACE1_DATATYPE_SHIFT);
 		break;
 	default:
 		printk(KERN_ERR "aic32x4: invalid DAI interface format\n");
 		return -EINVAL;
 	}
 
-	snd_soc_write(codec, AIC32X4_IFACE1, iface_reg_1);
-	snd_soc_write(codec, AIC32X4_IFACE2, iface_reg_2);
-	snd_soc_write(codec, AIC32X4_IFACE3, iface_reg_3);
+	snd_soc_update_bits(codec, AIC32X4_IFACE1,
+			    AIC32X4_IFACE1_DATATYPE_MASK |
+			    AIC32X4_IFACE1_MASTER_MASK, iface_reg_1);
+	snd_soc_update_bits(codec, AIC32X4_IFACE2,
+			    AIC32X4_DATA_OFFSET_MASK, iface_reg_2);
+	snd_soc_update_bits(codec, AIC32X4_IFACE3,
+			    AIC32X4_BCLKINV_MASK, iface_reg_3);
+
 	return 0;
 }
 
@@ -674,7 +674,8 @@ static int aic32x4_hw_params(struct snd_pcm_substream *substream,
 {
 	struct snd_soc_codec *codec = dai->codec;
 	struct aic32x4_priv *aic32x4 = snd_soc_codec_get_drvdata(codec);
-	u8 data;
+	u8 iface1_reg = 0;
+	u8 dacsetup_reg = 0;
 	int i;
 
 	i = aic32x4_get_divs(aic32x4->sysclk, params_rate(params));
@@ -683,82 +684,88 @@ static int aic32x4_hw_params(struct snd_pcm_substream *substream,
 		return i;
 	}
 
-	/* Use PLL as CODEC_CLKIN and DAC_MOD_CLK as BDIV_CLKIN */
-	snd_soc_write(codec, AIC32X4_CLKMUX, AIC32X4_PLLCLKIN);
-	snd_soc_write(codec, AIC32X4_IFACE3, AIC32X4_DACMOD2BCLK);
+	/* MCLK as PLL_CLKIN */
+	snd_soc_update_bits(codec, AIC32X4_CLKMUX, AIC32X4_PLL_CLKIN_MASK,
+			    AIC32X4_PLL_CLKIN_MCLK << AIC32X4_PLL_CLKIN_SHIFT);
+	/* PLL as CODEC_CLKIN */
+	snd_soc_update_bits(codec, AIC32X4_CLKMUX, AIC32X4_CODEC_CLKIN_MASK,
+			    AIC32X4_CODEC_CLKIN_PLL << AIC32X4_CODEC_CLKIN_SHIFT);
+	/* DAC_MOD_CLK as BDIV_CLKIN */
+	snd_soc_update_bits(codec, AIC32X4_IFACE3, AIC32X4_BDIVCLK_MASK,
+			    AIC32X4_DACMOD2BCLK << AIC32X4_BDIVCLK_SHIFT);
 
-	/* We will fix R value to 1 and will make P & J=K.D as varialble */
-	data = snd_soc_read(codec, AIC32X4_PLLPR);
-	data &= ~(7 << 4);
-	snd_soc_write(codec, AIC32X4_PLLPR,
-		      (data | (aic32x4_divs[i].p_val << 4) | 0x01));
+	/* We will fix R value to 1 and will make P & J=K.D as variable */
+	snd_soc_update_bits(codec, AIC32X4_PLLPR, AIC32X4_PLL_R_MASK, 0x01);
 
+	/* PLL P value */
+	snd_soc_update_bits(codec, AIC32X4_PLLPR, AIC32X4_PLL_P_MASK,
+			    aic32x4_divs[i].p_val << AIC32X4_PLL_P_SHIFT);
+
+	/* PLL J value */
 	snd_soc_write(codec, AIC32X4_PLLJ, aic32x4_divs[i].pll_j);
 
+	/* PLL D value */
 	snd_soc_write(codec, AIC32X4_PLLDMSB, (aic32x4_divs[i].pll_d >> 8));
-	snd_soc_write(codec, AIC32X4_PLLDLSB,
-		      (aic32x4_divs[i].pll_d & 0xff));
+	snd_soc_write(codec, AIC32X4_PLLDLSB, (aic32x4_divs[i].pll_d & 0xff));
 
 	/* NDAC divider value */
-	data = snd_soc_read(codec, AIC32X4_NDAC);
-	data &= ~(0x7f);
-	snd_soc_write(codec, AIC32X4_NDAC, data | aic32x4_divs[i].ndac);
+	snd_soc_update_bits(codec, AIC32X4_NDAC,
+			    AIC32X4_NDAC_MASK, aic32x4_divs[i].ndac);
 
 	/* MDAC divider value */
-	data = snd_soc_read(codec, AIC32X4_MDAC);
-	data &= ~(0x7f);
-	snd_soc_write(codec, AIC32X4_MDAC, data | aic32x4_divs[i].mdac);
+	snd_soc_update_bits(codec, AIC32X4_MDAC,
+			    AIC32X4_MDAC_MASK, aic32x4_divs[i].mdac);
 
 	/* DOSR MSB & LSB values */
 	snd_soc_write(codec, AIC32X4_DOSRMSB, aic32x4_divs[i].dosr >> 8);
-	snd_soc_write(codec, AIC32X4_DOSRLSB,
-		      (aic32x4_divs[i].dosr & 0xff));
+	snd_soc_write(codec, AIC32X4_DOSRLSB, (aic32x4_divs[i].dosr & 0xff));
 
 	/* NADC divider value */
-	data = snd_soc_read(codec, AIC32X4_NADC);
-	data &= ~(0x7f);
-	snd_soc_write(codec, AIC32X4_NADC, data | aic32x4_divs[i].nadc);
+	snd_soc_update_bits(codec, AIC32X4_NADC,
+			    AIC32X4_NADC_MASK, aic32x4_divs[i].nadc);
 
 	/* MADC divider value */
-	data = snd_soc_read(codec, AIC32X4_MADC);
-	data &= ~(0x7f);
-	snd_soc_write(codec, AIC32X4_MADC, data | aic32x4_divs[i].madc);
+	snd_soc_update_bits(codec, AIC32X4_MADC,
+			    AIC32X4_MADC_MASK, aic32x4_divs[i].madc);
 
 	/* AOSR value */
 	snd_soc_write(codec, AIC32X4_AOSR, aic32x4_divs[i].aosr);
 
 	/* BCLK N divider */
-	data = snd_soc_read(codec, AIC32X4_BCLKN);
-	data &= ~(0x7f);
-	snd_soc_write(codec, AIC32X4_BCLKN, data | aic32x4_divs[i].blck_N);
+	snd_soc_update_bits(codec, AIC32X4_BCLKN,
+			    AIC32X4_BCLK_MASK, aic32x4_divs[i].blck_N);
 
-	data = snd_soc_read(codec, AIC32X4_IFACE1);
-	data = data & ~(3 << 4);
 	switch (params_width(params)) {
 	case 16:
+		iface1_reg |= (AIC32X4_WORD_LEN_16BITS <<
+			       AIC32X4_IFACE1_DATALEN_SHIFT);
 		break;
 	case 20:
-		data |= (AIC32X4_WORD_LEN_20BITS << AIC32X4_DOSRMSB_SHIFT);
+		iface1_reg |= (AIC32X4_WORD_LEN_20BITS <<
+			       AIC32X4_IFACE1_DATALEN_SHIFT);
 		break;
 	case 24:
-		data |= (AIC32X4_WORD_LEN_24BITS << AIC32X4_DOSRMSB_SHIFT);
+		iface1_reg |= (AIC32X4_WORD_LEN_24BITS <<
+			       AIC32X4_IFACE1_DATALEN_SHIFT);
 		break;
 	case 32:
-		data |= (AIC32X4_WORD_LEN_32BITS << AIC32X4_DOSRMSB_SHIFT);
+		iface1_reg |= (AIC32X4_WORD_LEN_32BITS <<
+			       AIC32X4_IFACE1_DATALEN_SHIFT);
 		break;
 	}
-	snd_soc_write(codec, AIC32X4_IFACE1, data);
+	snd_soc_update_bits(codec, AIC32X4_IFACE1,
+			    AIC32X4_IFACE1_DATALEN_MASK, iface1_reg);
 
 	if (params_channels(params) == 1) {
-		data = AIC32X4_RDAC2LCHN | AIC32X4_LDAC2LCHN;
+		dacsetup_reg = AIC32X4_RDAC2LCHN | AIC32X4_LDAC2LCHN;
 	} else {
 		if (aic32x4->swapdacs)
-			data = AIC32X4_RDAC2LCHN | AIC32X4_LDAC2RCHN;
+			dacsetup_reg = AIC32X4_RDAC2LCHN | AIC32X4_LDAC2RCHN;
 		else
-			data = AIC32X4_LDAC2LCHN | AIC32X4_RDAC2RCHN;
+			dacsetup_reg = AIC32X4_LDAC2LCHN | AIC32X4_RDAC2RCHN;
 	}
-	snd_soc_update_bits(codec, AIC32X4_DACSETUP, AIC32X4_DAC_CHAN_MASK,
-			data);
+	snd_soc_update_bits(codec, AIC32X4_DACSETUP,
+			    AIC32X4_DAC_CHAN_MASK, dacsetup_reg);
 
 	return 0;
 }
@@ -766,13 +773,10 @@ static int aic32x4_hw_params(struct snd_pcm_substream *substream,
 static int aic32x4_mute(struct snd_soc_dai *dai, int mute)
 {
 	struct snd_soc_codec *codec = dai->codec;
-	u8 dac_reg;
 
-	dac_reg = snd_soc_read(codec, AIC32X4_DACMUTE) & ~AIC32X4_MUTEON;
-	if (mute)
-		snd_soc_write(codec, AIC32X4_DACMUTE, dac_reg | AIC32X4_MUTEON);
-	else
-		snd_soc_write(codec, AIC32X4_DACMUTE, dac_reg);
+	snd_soc_update_bits(codec, AIC32X4_DACMUTE,
+			    AIC32X4_MUTEON, mute ? AIC32X4_MUTEON : 0);
+
 	return 0;
 }
 
diff --git a/sound/soc/codecs/tlv320aic32x4.h b/sound/soc/codecs/tlv320aic32x4.h
index da7cec4..e9df49e 100644
--- a/sound/soc/codecs/tlv320aic32x4.h
+++ b/sound/soc/codecs/tlv320aic32x4.h
@@ -19,141 +19,189 @@ int aic32x4_remove(struct device *dev);
 
 /* tlv320aic32x4 register space (in decimal to match datasheet) */
 
-#define AIC32X4_PAGE1		128
+#define AIC32X4_REG(page, reg)	((page * 128) + reg)
 
-#define	AIC32X4_PSEL		0
-#define	AIC32X4_RESET		1
-#define	AIC32X4_CLKMUX		4
-#define	AIC32X4_PLLPR		5
-#define	AIC32X4_PLLJ		6
-#define	AIC32X4_PLLDMSB		7
-#define	AIC32X4_PLLDLSB		8
-#define	AIC32X4_NDAC		11
-#define	AIC32X4_MDAC		12
-#define AIC32X4_DOSRMSB		13
-#define AIC32X4_DOSRLSB		14
-#define	AIC32X4_NADC		18
-#define	AIC32X4_MADC		19
-#define AIC32X4_AOSR		20
-#define AIC32X4_CLKMUX2		25
-#define AIC32X4_CLKOUTM		26
-#define AIC32X4_IFACE1		27
-#define AIC32X4_IFACE2		28
-#define AIC32X4_IFACE3		29
-#define AIC32X4_BCLKN		30
-#define AIC32X4_IFACE4		31
-#define AIC32X4_IFACE5		32
-#define AIC32X4_IFACE6		33
-#define AIC32X4_GPIOCTL		52
-#define AIC32X4_DOUTCTL		53
-#define AIC32X4_DINCTL		54
-#define AIC32X4_MISOCTL		55
-#define AIC32X4_SCLKCTL		56
-#define AIC32X4_DACSPB		60
-#define AIC32X4_ADCSPB		61
-#define AIC32X4_DACSETUP	63
-#define AIC32X4_DACMUTE		64
-#define AIC32X4_LDACVOL		65
-#define AIC32X4_RDACVOL		66
-#define AIC32X4_ADCSETUP	81
-#define	AIC32X4_ADCFGA		82
-#define AIC32X4_LADCVOL		83
-#define AIC32X4_RADCVOL		84
-#define AIC32X4_LAGC1		86
-#define AIC32X4_LAGC2		87
-#define AIC32X4_LAGC3		88
-#define AIC32X4_LAGC4		89
-#define AIC32X4_LAGC5		90
-#define AIC32X4_LAGC6		91
-#define AIC32X4_LAGC7		92
-#define AIC32X4_RAGC1		94
-#define AIC32X4_RAGC2		95
-#define AIC32X4_RAGC3		96
-#define AIC32X4_RAGC4		97
-#define AIC32X4_RAGC5		98
-#define AIC32X4_RAGC6		99
-#define AIC32X4_RAGC7		100
-#define AIC32X4_PWRCFG		(AIC32X4_PAGE1 + 1)
-#define AIC32X4_LDOCTL		(AIC32X4_PAGE1 + 2)
-#define AIC32X4_OUTPWRCTL	(AIC32X4_PAGE1 + 9)
-#define AIC32X4_CMMODE		(AIC32X4_PAGE1 + 10)
-#define AIC32X4_HPLROUTE	(AIC32X4_PAGE1 + 12)
-#define AIC32X4_HPRROUTE	(AIC32X4_PAGE1 + 13)
-#define AIC32X4_LOLROUTE	(AIC32X4_PAGE1 + 14)
-#define AIC32X4_LORROUTE	(AIC32X4_PAGE1 + 15)
-#define	AIC32X4_HPLGAIN		(AIC32X4_PAGE1 + 16)
-#define	AIC32X4_HPRGAIN		(AIC32X4_PAGE1 + 17)
-#define	AIC32X4_LOLGAIN		(AIC32X4_PAGE1 + 18)
-#define	AIC32X4_LORGAIN		(AIC32X4_PAGE1 + 19)
-#define AIC32X4_HEADSTART	(AIC32X4_PAGE1 + 20)
-#define AIC32X4_MICBIAS		(AIC32X4_PAGE1 + 51)
-#define AIC32X4_LMICPGAPIN	(AIC32X4_PAGE1 + 52)
-#define AIC32X4_LMICPGANIN	(AIC32X4_PAGE1 + 54)
-#define AIC32X4_RMICPGAPIN	(AIC32X4_PAGE1 + 55)
-#define AIC32X4_RMICPGANIN	(AIC32X4_PAGE1 + 57)
-#define AIC32X4_FLOATINGINPUT	(AIC32X4_PAGE1 + 58)
-#define AIC32X4_LMICPGAVOL	(AIC32X4_PAGE1 + 59)
-#define AIC32X4_RMICPGAVOL	(AIC32X4_PAGE1 + 60)
+#define	AIC32X4_PSEL		AIC32X4_REG(0, 0)
 
-#define AIC32X4_FREQ_12000000 12000000
-#define AIC32X4_FREQ_24000000 24000000
-#define AIC32X4_FREQ_25000000 25000000
+#define	AIC32X4_RESET		AIC32X4_REG(0, 1)
+#define	AIC32X4_CLKMUX		AIC32X4_REG(0, 4)
+#define	AIC32X4_PLLPR		AIC32X4_REG(0, 5)
+#define	AIC32X4_PLLJ		AIC32X4_REG(0, 6)
+#define	AIC32X4_PLLDMSB		AIC32X4_REG(0, 7)
+#define	AIC32X4_PLLDLSB		AIC32X4_REG(0, 8)
+#define	AIC32X4_NDAC		AIC32X4_REG(0, 11)
+#define	AIC32X4_MDAC		AIC32X4_REG(0, 12)
+#define AIC32X4_DOSRMSB		AIC32X4_REG(0, 13)
+#define AIC32X4_DOSRLSB		AIC32X4_REG(0, 14)
+#define	AIC32X4_NADC		AIC32X4_REG(0, 18)
+#define	AIC32X4_MADC		AIC32X4_REG(0, 19)
+#define AIC32X4_AOSR		AIC32X4_REG(0, 20)
+#define AIC32X4_CLKMUX2		AIC32X4_REG(0, 25)
+#define AIC32X4_CLKOUTM		AIC32X4_REG(0, 26)
+#define AIC32X4_IFACE1		AIC32X4_REG(0, 27)
+#define AIC32X4_IFACE2		AIC32X4_REG(0, 28)
+#define AIC32X4_IFACE3		AIC32X4_REG(0, 29)
+#define AIC32X4_BCLKN		AIC32X4_REG(0, 30)
+#define AIC32X4_IFACE4		AIC32X4_REG(0, 31)
+#define AIC32X4_IFACE5		AIC32X4_REG(0, 32)
+#define AIC32X4_IFACE6		AIC32X4_REG(0, 33)
+#define AIC32X4_GPIOCTL		AIC32X4_REG(0, 52)
+#define AIC32X4_DOUTCTL		AIC32X4_REG(0, 53)
+#define AIC32X4_DINCTL		AIC32X4_REG(0, 54)
+#define AIC32X4_MISOCTL		AIC32X4_REG(0, 55)
+#define AIC32X4_SCLKCTL		AIC32X4_REG(0, 56)
+#define AIC32X4_DACSPB		AIC32X4_REG(0, 60)
+#define AIC32X4_ADCSPB		AIC32X4_REG(0, 61)
+#define AIC32X4_DACSETUP	AIC32X4_REG(0, 63)
+#define AIC32X4_DACMUTE		AIC32X4_REG(0, 64)
+#define AIC32X4_LDACVOL		AIC32X4_REG(0, 65)
+#define AIC32X4_RDACVOL		AIC32X4_REG(0, 66)
+#define AIC32X4_ADCSETUP	AIC32X4_REG(0, 81)
+#define	AIC32X4_ADCFGA		AIC32X4_REG(0, 82)
+#define AIC32X4_LADCVOL		AIC32X4_REG(0, 83)
+#define AIC32X4_RADCVOL		AIC32X4_REG(0, 84)
+#define AIC32X4_LAGC1		AIC32X4_REG(0, 86)
+#define AIC32X4_LAGC2		AIC32X4_REG(0, 87)
+#define AIC32X4_LAGC3		AIC32X4_REG(0, 88)
+#define AIC32X4_LAGC4		AIC32X4_REG(0, 89)
+#define AIC32X4_LAGC5		AIC32X4_REG(0, 90)
+#define AIC32X4_LAGC6		AIC32X4_REG(0, 91)
+#define AIC32X4_LAGC7		AIC32X4_REG(0, 92)
+#define AIC32X4_RAGC1		AIC32X4_REG(0, 94)
+#define AIC32X4_RAGC2		AIC32X4_REG(0, 95)
+#define AIC32X4_RAGC3		AIC32X4_REG(0, 96)
+#define AIC32X4_RAGC4		AIC32X4_REG(0, 97)
+#define AIC32X4_RAGC5		AIC32X4_REG(0, 98)
+#define AIC32X4_RAGC6		AIC32X4_REG(0, 99)
+#define AIC32X4_RAGC7		AIC32X4_REG(0, 100)
 
-#define AIC32X4_WORD_LEN_16BITS		0x00
-#define AIC32X4_WORD_LEN_20BITS		0x01
-#define AIC32X4_WORD_LEN_24BITS		0x02
-#define AIC32X4_WORD_LEN_32BITS		0x03
+#define AIC32X4_PWRCFG		AIC32X4_REG(1, 1)
+#define AIC32X4_LDOCTL		AIC32X4_REG(1, 2)
+#define AIC32X4_OUTPWRCTL	AIC32X4_REG(1, 9)
+#define AIC32X4_CMMODE		AIC32X4_REG(1, 10)
+#define AIC32X4_HPLROUTE	AIC32X4_REG(1, 12)
+#define AIC32X4_HPRROUTE	AIC32X4_REG(1, 13)
+#define AIC32X4_LOLROUTE	AIC32X4_REG(1, 14)
+#define AIC32X4_LORROUTE	AIC32X4_REG(1, 15)
+#define	AIC32X4_HPLGAIN		AIC32X4_REG(1, 16)
+#define	AIC32X4_HPRGAIN		AIC32X4_REG(1, 17)
+#define	AIC32X4_LOLGAIN		AIC32X4_REG(1, 18)
+#define	AIC32X4_LORGAIN		AIC32X4_REG(1, 19)
+#define AIC32X4_HEADSTART	AIC32X4_REG(1, 20)
+#define AIC32X4_MICBIAS		AIC32X4_REG(1, 51)
+#define AIC32X4_LMICPGAPIN	AIC32X4_REG(1, 52)
+#define AIC32X4_LMICPGANIN	AIC32X4_REG(1, 54)
+#define AIC32X4_RMICPGAPIN	AIC32X4_REG(1, 55)
+#define AIC32X4_RMICPGANIN	AIC32X4_REG(1, 57)
+#define AIC32X4_FLOATINGINPUT	AIC32X4_REG(1, 58)
+#define AIC32X4_LMICPGAVOL	AIC32X4_REG(1, 59)
+#define AIC32X4_RMICPGAVOL	AIC32X4_REG(1, 60)
 
-#define AIC32X4_LADC_EN			(1 << 7)
-#define AIC32X4_RADC_EN			(1 << 6)
+/* Bits, masks, and shifts */
 
-#define AIC32X4_I2S_MODE		0x00
-#define AIC32X4_DSP_MODE		0x01
-#define AIC32X4_RIGHT_JUSTIFIED_MODE	0x02
-#define AIC32X4_LEFT_JUSTIFIED_MODE	0x03
+/* AIC32X4_CLKMUX */
+#define AIC32X4_PLL_CLKIN_MASK		GENMASK(3, 2)
+#define AIC32X4_PLL_CLKIN_SHIFT		(2)
+#define AIC32X4_PLL_CLKIN_MCLK		(0x00)
+#define AIC32X4_PLL_CLKIN_BCKL		(0x01)
+#define AIC32X4_PLL_CLKIN_GPIO1		(0x02)
+#define AIC32X4_PLL_CLKIN_DIN		(0x03)
+#define AIC32X4_CODEC_CLKIN_MASK	GENMASK(1, 0)
+#define AIC32X4_CODEC_CLKIN_SHIFT	(0)
+#define AIC32X4_CODEC_CLKIN_MCLK	(0x00)
+#define AIC32X4_CODEC_CLKIN_BCLK	(0x01)
+#define AIC32X4_CODEC_CLKIN_GPIO1	(0x02)
+#define AIC32X4_CODEC_CLKIN_PLL		(0x03)
 
-#define AIC32X4_AVDDWEAKDISABLE		0x08
-#define AIC32X4_LDOCTLEN		0x01
+/* AIC32X4_PLLPR */
+#define AIC32X4_PLLEN			BIT(7)
+#define AIC32X4_PLL_P_MASK		GENMASK(6, 4)
+#define AIC32X4_PLL_P_SHIFT		(4)
+#define AIC32X4_PLL_R_MASK		GENMASK(3, 0)
 
-#define AIC32X4_LDOIN_18_36		0x01
-#define AIC32X4_LDOIN2HP		0x02
+/* AIC32X4_NDAC */
+#define AIC32X4_NDACEN			BIT(7)
+#define AIC32X4_NDAC_MASK		GENMASK(6, 0)
 
-#define AIC32X4_DACSPBLOCK_MASK		0x1f
-#define AIC32X4_ADCSPBLOCK_MASK		0x1f
+/* AIC32X4_MDAC */
+#define AIC32X4_MDACEN			BIT(7)
+#define AIC32X4_MDAC_MASK		GENMASK(6, 0)
 
-#define AIC32X4_PLLJ_SHIFT		6
-#define AIC32X4_DOSRMSB_SHIFT		4
+/* AIC32X4_NADC */
+#define AIC32X4_NADCEN			BIT(7)
+#define AIC32X4_NADC_MASK		GENMASK(6, 0)
 
-#define AIC32X4_PLLCLKIN		0x03
+/* AIC32X4_MADC */
+#define AIC32X4_MADCEN			BIT(7)
+#define AIC32X4_MADC_MASK		GENMASK(6, 0)
 
-#define AIC32X4_MICBIAS_LDOIN		0x08
+/* AIC32X4_BCLKN */
+#define AIC32X4_BCLKEN			BIT(7)
+#define AIC32X4_BCLK_MASK		GENMASK(6, 0)
+
+/* AIC32X4_IFACE1 */
+#define AIC32X4_IFACE1_DATATYPE_MASK	GENMASK(7, 6)
+#define AIC32X4_IFACE1_DATATYPE_SHIFT	(6)
+#define AIC32X4_I2S_MODE		(0x00)
+#define AIC32X4_DSP_MODE		(0x01)
+#define AIC32X4_RIGHT_JUSTIFIED_MODE	(0x02)
+#define AIC32X4_LEFT_JUSTIFIED_MODE	(0x03)
+#define AIC32X4_IFACE1_DATALEN_MASK	GENMASK(5, 4)
+#define AIC32X4_IFACE1_DATALEN_SHIFT	(4)
+#define AIC32X4_WORD_LEN_16BITS		(0x00)
+#define AIC32X4_WORD_LEN_20BITS		(0x01)
+#define AIC32X4_WORD_LEN_24BITS		(0x02)
+#define AIC32X4_WORD_LEN_32BITS		(0x03)
+#define AIC32X4_IFACE1_MASTER_MASK	GENMASK(3, 2)
+#define AIC32X4_BCLKMASTER		BIT(2)
+#define AIC32X4_WCLKMASTER		BIT(3)
+
+/* AIC32X4_IFACE2 */
+#define AIC32X4_DATA_OFFSET_MASK	GENMASK(7, 0)
+
+/* AIC32X4_IFACE3 */
+#define AIC32X4_BCLKINV_MASK		BIT(3)
+#define AIC32X4_BDIVCLK_MASK		GENMASK(1, 0)
+#define AIC32X4_BDIVCLK_SHIFT		(0)
+#define AIC32X4_DAC2BCLK		(0x00)
+#define AIC32X4_DACMOD2BCLK		(0x01)
+#define AIC32X4_ADC2BCLK		(0x02)
+#define AIC32X4_ADCMOD2BCLK		(0x03)
+
+/* AIC32X4_DACSETUP */
+#define AIC32X4_DAC_CHAN_MASK		GENMASK(5, 2)
+#define AIC32X4_LDAC2RCHN		BIT(5)
+#define AIC32X4_LDAC2LCHN		BIT(4)
+#define AIC32X4_RDAC2LCHN		BIT(3)
+#define AIC32X4_RDAC2RCHN		BIT(2)
+
+/* AIC32X4_DACMUTE */
+#define AIC32X4_MUTEON			0x0C
+
+/* AIC32X4_ADCSETUP */
+#define AIC32X4_LADC_EN			BIT(7)
+#define AIC32X4_RADC_EN			BIT(6)
+
+/* AIC32X4_PWRCFG */
+#define AIC32X4_AVDDWEAKDISABLE		BIT(3)
+
+/* AIC32X4_LDOCTL */
+#define AIC32X4_LDOCTLEN		BIT(0)
+
+/* AIC32X4_CMMODE */
+#define AIC32X4_LDOIN_18_36		BIT(0)
+#define AIC32X4_LDOIN2HP		BIT(1)
+
+/* AIC32X4_MICBIAS */
+#define AIC32X4_MICBIAS_LDOIN		BIT(3)
 #define AIC32X4_MICBIAS_2075V		0x60
 
+/* AIC32X4_LMICPGANIN */
 #define AIC32X4_LMICPGANIN_IN2R_10K	0x10
 #define AIC32X4_LMICPGANIN_CM1L_10K	0x40
+
+/* AIC32X4_RMICPGANIN */
 #define AIC32X4_RMICPGANIN_IN1L_10K	0x10
 #define AIC32X4_RMICPGANIN_CM1R_10K	0x40
 
-#define AIC32X4_LMICPGAVOL_NOGAIN	0x80
-#define AIC32X4_RMICPGAVOL_NOGAIN	0x80
-
-#define AIC32X4_BCLKMASTER		0x08
-#define AIC32X4_WCLKMASTER		0x04
-#define AIC32X4_PLLEN			(0x01 << 7)
-#define AIC32X4_NDACEN			(0x01 << 7)
-#define AIC32X4_MDACEN			(0x01 << 7)
-#define AIC32X4_NADCEN			(0x01 << 7)
-#define AIC32X4_MADCEN			(0x01 << 7)
-#define AIC32X4_BCLKEN			(0x01 << 7)
-#define AIC32X4_DACEN			(0x03 << 6)
-#define AIC32X4_RDAC2LCHN		(0x02 << 2)
-#define AIC32X4_LDAC2RCHN		(0x02 << 4)
-#define AIC32X4_LDAC2LCHN		(0x01 << 4)
-#define AIC32X4_RDAC2RCHN		(0x01 << 2)
-#define AIC32X4_DAC_CHAN_MASK		0x3c
-
-#define AIC32X4_SSTEP2WCLK		0x01
-#define AIC32X4_MUTEON			0x0C
-#define	AIC32X4_DACMOD2BCLK		0x01
-
 #endif				/* _TLV320AIC32X4_H */
diff --git a/sound/soc/codecs/tlv320aic3x.c b/sound/soc/codecs/tlv320aic3x.c
index 06f9257..b751cad 100644
--- a/sound/soc/codecs/tlv320aic3x.c
+++ b/sound/soc/codecs/tlv320aic3x.c
@@ -1804,11 +1804,18 @@ static int aic3x_i2c_probe(struct i2c_client *i2c,
 		if (!ai3x_setup)
 			return -ENOMEM;
 
-		ret = of_get_named_gpio(np, "gpio-reset", 0);
-		if (ret >= 0)
+		ret = of_get_named_gpio(np, "reset-gpios", 0);
+		if (ret >= 0) {
 			aic3x->gpio_reset = ret;
-		else
-			aic3x->gpio_reset = -1;
+		} else {
+			ret = of_get_named_gpio(np, "gpio-reset", 0);
+			if (ret > 0) {
+				dev_warn(&i2c->dev, "Using deprecated property \"gpio-reset\", please update your DT");
+				aic3x->gpio_reset = ret;
+			} else {
+				aic3x->gpio_reset = -1;
+			}
+		}
 
 		if (of_property_read_u32_array(np, "ai3x-gpio-func",
 					ai3x_setup->gpio_func, 2) >= 0) {
diff --git a/sound/soc/codecs/tlv320dac33.c b/sound/soc/codecs/tlv320dac33.c
index 5b94a15..8c71d2f 100644
--- a/sound/soc/codecs/tlv320dac33.c
+++ b/sound/soc/codecs/tlv320dac33.c
@@ -106,6 +106,7 @@ struct tlv320dac33_priv {
 	int mode1_latency;		/* latency caused by the i2c writes in
 					 * us */
 	u8 burst_bclkdiv;		/* BCLK divider value in burst mode */
+	u8 *reg_cache;
 	unsigned int burst_rate;	/* Interface speed in Burst modes */
 
 	int keep_bclk;			/* Keep the BCLK continuously running
@@ -121,7 +122,7 @@ struct tlv320dac33_priv {
 	unsigned int uthr;
 
 	enum dac33_state state;
-	void *control_data;
+	struct i2c_client *i2c;
 };
 
 static const u8 dac33_reg[DAC33_CACHEREGNUM] = {
@@ -173,7 +174,8 @@ static const u8 dac33_reg[DAC33_CACHEREGNUM] = {
 static inline unsigned int dac33_read_reg_cache(struct snd_soc_codec *codec,
 						unsigned reg)
 {
-	u8 *cache = codec->reg_cache;
+	struct tlv320dac33_priv *dac33 = snd_soc_codec_get_drvdata(codec);
+	u8 *cache = dac33->reg_cache;
 	if (reg >= DAC33_CACHEREGNUM)
 		return 0;
 
@@ -183,7 +185,8 @@ static inline unsigned int dac33_read_reg_cache(struct snd_soc_codec *codec,
 static inline void dac33_write_reg_cache(struct snd_soc_codec *codec,
 					 u8 reg, u8 value)
 {
-	u8 *cache = codec->reg_cache;
+	struct tlv320dac33_priv *dac33 = snd_soc_codec_get_drvdata(codec);
+	u8 *cache = dac33->reg_cache;
 	if (reg >= DAC33_CACHEREGNUM)
 		return;
 
@@ -200,7 +203,7 @@ static int dac33_read(struct snd_soc_codec *codec, unsigned int reg,
 
 	/* If powered off, return the cached value */
 	if (dac33->chip_power) {
-		val = i2c_smbus_read_byte_data(codec->control_data, value[0]);
+		val = i2c_smbus_read_byte_data(dac33->i2c, value[0]);
 		if (val < 0) {
 			dev_err(codec->dev, "Read failed (%d)\n", val);
 			value[0] = dac33_read_reg_cache(codec, reg);
@@ -233,7 +236,7 @@ static int dac33_write(struct snd_soc_codec *codec, unsigned int reg,
 
 	dac33_write_reg_cache(codec, data[0], data[1]);
 	if (dac33->chip_power) {
-		ret = codec->hw_write(codec->control_data, data, 2);
+		ret = i2c_master_send(dac33->i2c, data, 2);
 		if (ret != 2)
 			dev_err(codec->dev, "Write failed (%d)\n", ret);
 		else
@@ -244,7 +247,7 @@ static int dac33_write(struct snd_soc_codec *codec, unsigned int reg,
 }
 
 static int dac33_write_locked(struct snd_soc_codec *codec, unsigned int reg,
-		       unsigned int value)
+			      unsigned int value)
 {
 	struct tlv320dac33_priv *dac33 = snd_soc_codec_get_drvdata(codec);
 	int ret;
@@ -280,7 +283,7 @@ static int dac33_write16(struct snd_soc_codec *codec, unsigned int reg,
 	if (dac33->chip_power) {
 		/* We need to set autoincrement mode for 16 bit writes */
 		data[0] |= DAC33_I2C_ADDR_AUTOINC;
-		ret = codec->hw_write(codec->control_data, data, 3);
+		ret = i2c_master_send(dac33->i2c, data, 3);
 		if (ret != 3)
 			dev_err(codec->dev, "Write failed (%d)\n", ret);
 		else
@@ -1379,8 +1382,6 @@ static int dac33_soc_probe(struct snd_soc_codec *codec)
 	struct tlv320dac33_priv *dac33 = snd_soc_codec_get_drvdata(codec);
 	int ret = 0;
 
-	codec->control_data = dac33->control_data;
-	codec->hw_write = (hw_write_t) i2c_master_send;
 	dac33->codec = codec;
 
 	/* Read the tlv320dac33 ID registers */
@@ -1438,9 +1439,7 @@ static const struct snd_soc_codec_driver soc_codec_dev_tlv320dac33 = {
 	.write = dac33_write_locked,
 	.set_bias_level = dac33_set_bias_level,
 	.idle_bias_off = true,
-	.reg_cache_size = ARRAY_SIZE(dac33_reg),
-	.reg_word_size = sizeof(u8),
-	.reg_cache_default = dac33_reg,
+
 	.probe = dac33_soc_probe,
 	.remove = dac33_soc_remove,
 
@@ -1499,7 +1498,14 @@ static int dac33_i2c_probe(struct i2c_client *client,
 	if (dac33 == NULL)
 		return -ENOMEM;
 
-	dac33->control_data = client;
+	dac33->reg_cache = devm_kmemdup(&client->dev,
+					dac33_reg,
+					ARRAY_SIZE(dac33_reg) * sizeof(u8),
+					GFP_KERNEL);
+	if (!dac33->reg_cache)
+		return -ENOMEM;
+
+	dac33->i2c = client;
 	mutex_init(&dac33->mutex);
 	spin_lock_init(&dac33->lock);
 
diff --git a/sound/soc/codecs/ts3a227e.c b/sound/soc/codecs/ts3a227e.c
index 738e04b..1271e7e 100644
--- a/sound/soc/codecs/ts3a227e.c
+++ b/sound/soc/codecs/ts3a227e.c
@@ -241,7 +241,7 @@ int ts3a227e_enable_jack_detect(struct snd_soc_component *component,
 {
 	struct ts3a227e *ts3a227e = snd_soc_component_get_drvdata(component);
 
-	snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_MEDIA);
+	snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_PLAYPAUSE);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_1, KEY_VOICECOMMAND);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_2, KEY_VOLUMEUP);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_3, KEY_VOLUMEDOWN);
diff --git a/sound/soc/codecs/tscs42xx.c b/sound/soc/codecs/tscs42xx.c
new file mode 100644
index 0000000..e7661d0
--- /dev/null
+++ b/sound/soc/codecs/tscs42xx.c
@@ -0,0 +1,1456 @@
+// SPDX-License-Identifier: GPL-2.0
+// tscs42xx.c -- TSCS42xx ALSA SoC Audio driver
+// Copyright 2017 Tempo Semiconductor, Inc.
+// Author: Steven Eckhoff <steven.eckhoff.opensource@gmail.com>
+
+#include <linux/i2c.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/regmap.h>
+#include <sound/pcm_params.h>
+#include <sound/soc.h>
+#include <sound/soc-dapm.h>
+#include <sound/tlv.h>
+
+#include "tscs42xx.h"
+
+#define COEFF_SIZE 3
+#define BIQUAD_COEFF_COUNT 5
+#define BIQUAD_SIZE (COEFF_SIZE * BIQUAD_COEFF_COUNT)
+
+#define COEFF_RAM_MAX_ADDR 0xcd
+#define COEFF_RAM_COEFF_COUNT (COEFF_RAM_MAX_ADDR + 1)
+#define COEFF_RAM_SIZE (COEFF_SIZE * COEFF_RAM_COEFF_COUNT)
+
+struct tscs42xx {
+
+	int bclk_ratio;
+	int samplerate;
+	unsigned int blrcm;
+	struct mutex audio_params_lock;
+
+	u8 coeff_ram[COEFF_RAM_SIZE];
+	bool coeff_ram_synced;
+	struct mutex coeff_ram_lock;
+
+	struct mutex pll_lock;
+
+	struct regmap *regmap;
+
+	struct device *dev;
+};
+
+struct coeff_ram_ctl {
+	unsigned int addr;
+	struct soc_bytes_ext bytes_ext;
+};
+
+static bool tscs42xx_volatile(struct device *dev, unsigned int reg)
+{
+	switch (reg) {
+	case R_DACCRWRL:
+	case R_DACCRWRM:
+	case R_DACCRWRH:
+	case R_DACCRRDL:
+	case R_DACCRRDM:
+	case R_DACCRRDH:
+	case R_DACCRSTAT:
+	case R_DACCRADDR:
+	case R_PLLCTL0:
+		return true;
+	default:
+		return false;
+	};
+}
+
+static bool tscs42xx_precious(struct device *dev, unsigned int reg)
+{
+	switch (reg) {
+	case R_DACCRWRL:
+	case R_DACCRWRM:
+	case R_DACCRWRH:
+	case R_DACCRRDL:
+	case R_DACCRRDM:
+	case R_DACCRRDH:
+		return true;
+	default:
+		return false;
+	};
+}
+
+static const struct regmap_config tscs42xx_regmap = {
+	.reg_bits = 8,
+	.val_bits = 8,
+
+	.volatile_reg = tscs42xx_volatile,
+	.precious_reg = tscs42xx_precious,
+	.max_register = R_DACMBCREL3H,
+
+	.cache_type = REGCACHE_RBTREE,
+	.can_multi_write = true,
+};
+
+#define MAX_PLL_LOCK_20MS_WAITS 1
+static bool plls_locked(struct snd_soc_codec *codec)
+{
+	int ret;
+	int count = MAX_PLL_LOCK_20MS_WAITS;
+
+	do {
+		ret = snd_soc_read(codec, R_PLLCTL0);
+		if (ret < 0) {
+			dev_err(codec->dev,
+				"Failed to read PLL lock status (%d)\n", ret);
+			return false;
+		} else if (ret > 0) {
+			return true;
+		}
+		msleep(20);
+	} while (count--);
+
+	return false;
+}
+
+static int sample_rate_to_pll_freq_out(int sample_rate)
+{
+	switch (sample_rate) {
+	case 11025:
+	case 22050:
+	case 44100:
+	case 88200:
+		return 112896000;
+	case 8000:
+	case 16000:
+	case 32000:
+	case 48000:
+	case 96000:
+		return 122880000;
+	default:
+		return -EINVAL;
+	}
+}
+
+#define DACCRSTAT_MAX_TRYS 10
+static int write_coeff_ram(struct snd_soc_codec *codec, u8 *coeff_ram,
+	unsigned int addr, unsigned int coeff_cnt)
+{
+	struct tscs42xx *tscs42xx = snd_soc_codec_get_drvdata(codec);
+	int cnt;
+	int trys;
+	int ret;
+
+	for (cnt = 0; cnt < coeff_cnt; cnt++, addr++) {
+
+		for (trys = 0; trys < DACCRSTAT_MAX_TRYS; trys++) {
+			ret = snd_soc_read(codec, R_DACCRSTAT);
+			if (ret < 0) {
+				dev_err(codec->dev,
+					"Failed to read stat (%d)\n", ret);
+				return ret;
+			}
+			if (!ret)
+				break;
+		}
+
+		if (trys == DACCRSTAT_MAX_TRYS) {
+			ret = -EIO;
+			dev_err(codec->dev,
+				"dac coefficient write error (%d)\n", ret);
+			return ret;
+		}
+
+		ret = regmap_write(tscs42xx->regmap, R_DACCRADDR, addr);
+		if (ret < 0) {
+			dev_err(codec->dev,
+				"Failed to write dac ram address (%d)\n", ret);
+			return ret;
+		}
+
+		ret = regmap_bulk_write(tscs42xx->regmap, R_DACCRWRL,
+			&coeff_ram[addr * COEFF_SIZE],
+			COEFF_SIZE);
+		if (ret < 0) {
+			dev_err(codec->dev,
+				"Failed to write dac ram (%d)\n", ret);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int power_up_audio_plls(struct snd_soc_codec *codec)
+{
+	struct tscs42xx *tscs42xx = snd_soc_codec_get_drvdata(codec);
+	int freq_out;
+	int ret;
+	unsigned int mask;
+	unsigned int val;
+
+	freq_out = sample_rate_to_pll_freq_out(tscs42xx->samplerate);
+	switch (freq_out) {
+	case 122880000: /* 48k */
+		mask = RM_PLLCTL1C_PDB_PLL1;
+		val = RV_PLLCTL1C_PDB_PLL1_ENABLE;
+		break;
+	case 112896000: /* 44.1k */
+		mask = RM_PLLCTL1C_PDB_PLL2;
+		val = RV_PLLCTL1C_PDB_PLL2_ENABLE;
+		break;
+	default:
+		ret = -EINVAL;
+		dev_err(codec->dev, "Unrecognized PLL output freq (%d)\n", ret);
+		return ret;
+	}
+
+	mutex_lock(&tscs42xx->pll_lock);
+
+	ret = snd_soc_update_bits(codec, R_PLLCTL1C, mask, val);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to turn PLL on (%d)\n", ret);
+		goto exit;
+	}
+
+	if (!plls_locked(codec)) {
+		dev_err(codec->dev, "Failed to lock plls\n");
+		ret = -ENOMSG;
+		goto exit;
+	}
+
+	ret = 0;
+exit:
+	mutex_unlock(&tscs42xx->pll_lock);
+
+	return ret;
+}
+
+static int power_down_audio_plls(struct snd_soc_codec *codec)
+{
+	struct tscs42xx *tscs42xx = snd_soc_codec_get_drvdata(codec);
+	int ret;
+
+	mutex_lock(&tscs42xx->pll_lock);
+
+	ret = snd_soc_update_bits(codec, R_PLLCTL1C,
+			RM_PLLCTL1C_PDB_PLL1,
+			RV_PLLCTL1C_PDB_PLL1_DISABLE);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to turn PLL off (%d)\n", ret);
+		goto exit;
+	}
+	ret = snd_soc_update_bits(codec, R_PLLCTL1C,
+			RM_PLLCTL1C_PDB_PLL2,
+			RV_PLLCTL1C_PDB_PLL2_DISABLE);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to turn PLL off (%d)\n", ret);
+		goto exit;
+	}
+
+	ret = 0;
+exit:
+	mutex_unlock(&tscs42xx->pll_lock);
+
+	return ret;
+}
+
+static int coeff_ram_get(struct snd_kcontrol *kcontrol,
+	struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_codec *codec = snd_soc_kcontrol_codec(kcontrol);
+	struct tscs42xx *tscs42xx = snd_soc_codec_get_drvdata(codec);
+	struct coeff_ram_ctl *ctl =
+		(struct coeff_ram_ctl *)kcontrol->private_value;
+	struct soc_bytes_ext *params = &ctl->bytes_ext;
+
+	mutex_lock(&tscs42xx->coeff_ram_lock);
+
+	memcpy(ucontrol->value.bytes.data,
+		&tscs42xx->coeff_ram[ctl->addr * COEFF_SIZE], params->max);
+
+	mutex_unlock(&tscs42xx->coeff_ram_lock);
+
+	return 0;
+}
+
+static int coeff_ram_put(struct snd_kcontrol *kcontrol,
+	struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_codec *codec = snd_soc_kcontrol_codec(kcontrol);
+	struct tscs42xx *tscs42xx = snd_soc_codec_get_drvdata(codec);
+	struct coeff_ram_ctl *ctl =
+		(struct coeff_ram_ctl *)kcontrol->private_value;
+	struct soc_bytes_ext *params = &ctl->bytes_ext;
+	unsigned int coeff_cnt = params->max / COEFF_SIZE;
+	int ret;
+
+	mutex_lock(&tscs42xx->coeff_ram_lock);
+
+	tscs42xx->coeff_ram_synced = false;
+
+	memcpy(&tscs42xx->coeff_ram[ctl->addr * COEFF_SIZE],
+		ucontrol->value.bytes.data, params->max);
+
+	mutex_lock(&tscs42xx->pll_lock);
+
+	if (plls_locked(codec)) {
+		ret = write_coeff_ram(codec, tscs42xx->coeff_ram,
+			ctl->addr, coeff_cnt);
+		if (ret < 0) {
+			dev_err(codec->dev,
+				"Failed to flush coeff ram cache (%d)\n", ret);
+			goto exit;
+		}
+		tscs42xx->coeff_ram_synced = true;
+	}
+
+	ret = 0;
+exit:
+	mutex_unlock(&tscs42xx->pll_lock);
+
+	mutex_unlock(&tscs42xx->coeff_ram_lock);
+
+	return ret;
+}
+
+/* Input L Capture Route */
+static char const * const input_select_text[] = {
+	"Line 1", "Line 2", "Line 3", "D2S"
+};
+
+static const struct soc_enum left_input_select_enum =
+SOC_ENUM_SINGLE(R_INSELL, FB_INSELL, ARRAY_SIZE(input_select_text),
+		input_select_text);
+
+static const struct snd_kcontrol_new left_input_select =
+SOC_DAPM_ENUM("LEFT_INPUT_SELECT_ENUM", left_input_select_enum);
+
+/* Input R Capture Route */
+static const struct soc_enum right_input_select_enum =
+SOC_ENUM_SINGLE(R_INSELR, FB_INSELR, ARRAY_SIZE(input_select_text),
+		input_select_text);
+
+static const struct snd_kcontrol_new right_input_select =
+SOC_DAPM_ENUM("RIGHT_INPUT_SELECT_ENUM", right_input_select_enum);
+
+/* Input Channel Mapping */
+static char const * const ch_map_select_text[] = {
+	"Normal", "Left to Right", "Right to Left", "Swap"
+};
+
+static const struct soc_enum ch_map_select_enum =
+SOC_ENUM_SINGLE(R_AIC2, FB_AIC2_ADCDSEL, ARRAY_SIZE(ch_map_select_text),
+		ch_map_select_text);
+
+static int dapm_vref_event(struct snd_soc_dapm_widget *w,
+			 struct snd_kcontrol *kcontrol, int event)
+{
+	msleep(20);
+	return 0;
+}
+
+static int dapm_micb_event(struct snd_soc_dapm_widget *w,
+			 struct snd_kcontrol *kcontrol, int event)
+{
+	msleep(20);
+	return 0;
+}
+
+static int pll_event(struct snd_soc_dapm_widget *w,
+		     struct snd_kcontrol *kcontrol, int event)
+{
+	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
+	int ret;
+
+	if (SND_SOC_DAPM_EVENT_ON(event))
+		ret = power_up_audio_plls(codec);
+	else
+		ret = power_down_audio_plls(codec);
+
+	return ret;
+}
+
+static int dac_event(struct snd_soc_dapm_widget *w,
+		     struct snd_kcontrol *kcontrol, int event)
+{
+	struct snd_soc_codec *codec = snd_soc_dapm_to_codec(w->dapm);
+	struct tscs42xx *tscs42xx = snd_soc_codec_get_drvdata(codec);
+	int ret;
+
+	mutex_lock(&tscs42xx->coeff_ram_lock);
+
+	if (tscs42xx->coeff_ram_synced == false) {
+		ret = write_coeff_ram(codec, tscs42xx->coeff_ram, 0x00,
+			COEFF_RAM_COEFF_COUNT);
+		if (ret < 0)
+			goto exit;
+		tscs42xx->coeff_ram_synced = true;
+	}
+
+	ret = 0;
+exit:
+	mutex_unlock(&tscs42xx->coeff_ram_lock);
+
+	return ret;
+}
+
+static const struct snd_soc_dapm_widget tscs42xx_dapm_widgets[] = {
+	/* Vref */
+	SND_SOC_DAPM_SUPPLY_S("Vref", 1, R_PWRM2, FB_PWRM2_VREF, 0,
+		dapm_vref_event, SND_SOC_DAPM_POST_PMU|SND_SOC_DAPM_PRE_PMD),
+
+	/* PLL */
+	SND_SOC_DAPM_SUPPLY("PLL", SND_SOC_NOPM, 0, 0, pll_event,
+		SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
+
+	/* Headphone */
+	SND_SOC_DAPM_DAC_E("DAC L", "HiFi Playback", R_PWRM2, FB_PWRM2_HPL, 0,
+			dac_event, SND_SOC_DAPM_POST_PMU),
+	SND_SOC_DAPM_DAC_E("DAC R", "HiFi Playback", R_PWRM2, FB_PWRM2_HPR, 0,
+			dac_event, SND_SOC_DAPM_POST_PMU),
+	SND_SOC_DAPM_OUTPUT("Headphone L"),
+	SND_SOC_DAPM_OUTPUT("Headphone R"),
+
+	/* Speaker */
+	SND_SOC_DAPM_DAC_E("ClassD L", "HiFi Playback",
+		R_PWRM2, FB_PWRM2_SPKL, 0,
+		dac_event, SND_SOC_DAPM_POST_PMU),
+	SND_SOC_DAPM_DAC_E("ClassD R", "HiFi Playback",
+		R_PWRM2, FB_PWRM2_SPKR, 0,
+		dac_event, SND_SOC_DAPM_POST_PMU),
+	SND_SOC_DAPM_OUTPUT("Speaker L"),
+	SND_SOC_DAPM_OUTPUT("Speaker R"),
+
+	/* Capture */
+	SND_SOC_DAPM_PGA("Analog In PGA L", R_PWRM1, FB_PWRM1_PGAL, 0, NULL, 0),
+	SND_SOC_DAPM_PGA("Analog In PGA R", R_PWRM1, FB_PWRM1_PGAR, 0, NULL, 0),
+	SND_SOC_DAPM_PGA("Analog Boost L", R_PWRM1, FB_PWRM1_BSTL, 0, NULL, 0),
+	SND_SOC_DAPM_PGA("Analog Boost R", R_PWRM1, FB_PWRM1_BSTR, 0, NULL, 0),
+	SND_SOC_DAPM_PGA("ADC Mute", R_CNVRTR0, FB_CNVRTR0_HPOR, true, NULL, 0),
+	SND_SOC_DAPM_ADC("ADC L", "HiFi Capture", R_PWRM1, FB_PWRM1_ADCL, 0),
+	SND_SOC_DAPM_ADC("ADC R", "HiFi Capture", R_PWRM1, FB_PWRM1_ADCR, 0),
+
+	/* Capture Input */
+	SND_SOC_DAPM_MUX("Input L Capture Route", R_PWRM2,
+			FB_PWRM2_INSELL, 0, &left_input_select),
+	SND_SOC_DAPM_MUX("Input R Capture Route", R_PWRM2,
+			FB_PWRM2_INSELR, 0, &right_input_select),
+
+	/* Digital Mic */
+	SND_SOC_DAPM_SUPPLY_S("Digital Mic Enable", 2, R_DMICCTL,
+		FB_DMICCTL_DMICEN, 0, NULL,
+		SND_SOC_DAPM_POST_PMU|SND_SOC_DAPM_PRE_PMD),
+
+	/* Analog Mic */
+	SND_SOC_DAPM_SUPPLY_S("Mic Bias", 2, R_PWRM1, FB_PWRM1_MICB,
+		0, dapm_micb_event, SND_SOC_DAPM_POST_PMU|SND_SOC_DAPM_PRE_PMD),
+
+	/* Line In */
+	SND_SOC_DAPM_INPUT("Line In 1 L"),
+	SND_SOC_DAPM_INPUT("Line In 1 R"),
+	SND_SOC_DAPM_INPUT("Line In 2 L"),
+	SND_SOC_DAPM_INPUT("Line In 2 R"),
+	SND_SOC_DAPM_INPUT("Line In 3 L"),
+	SND_SOC_DAPM_INPUT("Line In 3 R"),
+};
+
+static const struct snd_soc_dapm_route tscs42xx_intercon[] = {
+	{"DAC L", NULL, "PLL"},
+	{"DAC R", NULL, "PLL"},
+	{"DAC L", NULL, "Vref"},
+	{"DAC R", NULL, "Vref"},
+	{"Headphone L", NULL, "DAC L"},
+	{"Headphone R", NULL, "DAC R"},
+
+	{"ClassD L", NULL, "PLL"},
+	{"ClassD R", NULL, "PLL"},
+	{"ClassD L", NULL, "Vref"},
+	{"ClassD R", NULL, "Vref"},
+	{"Speaker L", NULL, "ClassD L"},
+	{"Speaker R", NULL, "ClassD R"},
+
+	{"Input L Capture Route", NULL, "Vref"},
+	{"Input R Capture Route", NULL, "Vref"},
+
+	{"Mic Bias", NULL, "Vref"},
+
+	{"Input L Capture Route", "Line 1", "Line In 1 L"},
+	{"Input R Capture Route", "Line 1", "Line In 1 R"},
+	{"Input L Capture Route", "Line 2", "Line In 2 L"},
+	{"Input R Capture Route", "Line 2", "Line In 2 R"},
+	{"Input L Capture Route", "Line 3", "Line In 3 L"},
+	{"Input R Capture Route", "Line 3", "Line In 3 R"},
+
+	{"Analog In PGA L", NULL, "Input L Capture Route"},
+	{"Analog In PGA R", NULL, "Input R Capture Route"},
+	{"Analog Boost L", NULL, "Analog In PGA L"},
+	{"Analog Boost R", NULL, "Analog In PGA R"},
+	{"ADC Mute", NULL, "Analog Boost L"},
+	{"ADC Mute", NULL, "Analog Boost R"},
+	{"ADC L", NULL, "PLL"},
+	{"ADC R", NULL, "PLL"},
+	{"ADC L", NULL, "ADC Mute"},
+	{"ADC R", NULL, "ADC Mute"},
+};
+
+/************
+ * CONTROLS *
+ ************/
+
+static char const * const eq_band_enable_text[] = {
+	"Prescale only",
+	"Band1",
+	"Band1:2",
+	"Band1:3",
+	"Band1:4",
+	"Band1:5",
+	"Band1:6",
+};
+
+static char const * const level_detection_text[] = {
+	"Average",
+	"Peak",
+};
+
+static char const * const level_detection_window_text[] = {
+	"512 Samples",
+	"64 Samples",
+};
+
+static char const * const compressor_ratio_text[] = {
+	"Reserved", "1.5:1", "2:1", "3:1", "4:1", "5:1", "6:1",
+	"7:1", "8:1", "9:1", "10:1", "11:1", "12:1", "13:1", "14:1",
+	"15:1", "16:1", "17:1", "18:1", "19:1", "20:1",
+};
+
+static DECLARE_TLV_DB_SCALE(hpvol_scale, -8850, 75, 0);
+static DECLARE_TLV_DB_SCALE(spkvol_scale, -7725, 75, 0);
+static DECLARE_TLV_DB_SCALE(dacvol_scale, -9563, 38, 0);
+static DECLARE_TLV_DB_SCALE(adcvol_scale, -7125, 38, 0);
+static DECLARE_TLV_DB_SCALE(invol_scale, -1725, 75, 0);
+static DECLARE_TLV_DB_SCALE(mic_boost_scale, 0, 1000, 0);
+static DECLARE_TLV_DB_MINMAX(mugain_scale, 0, 4650);
+static DECLARE_TLV_DB_MINMAX(compth_scale, -9562, 0);
+
+static const struct soc_enum eq1_band_enable_enum =
+	SOC_ENUM_SINGLE(R_CONFIG1, FB_CONFIG1_EQ1_BE,
+		ARRAY_SIZE(eq_band_enable_text), eq_band_enable_text);
+
+static const struct soc_enum eq2_band_enable_enum =
+	SOC_ENUM_SINGLE(R_CONFIG1, FB_CONFIG1_EQ2_BE,
+		ARRAY_SIZE(eq_band_enable_text), eq_band_enable_text);
+
+static const struct soc_enum cle_level_detection_enum =
+	SOC_ENUM_SINGLE(R_CLECTL, FB_CLECTL_LVL_MODE,
+		ARRAY_SIZE(level_detection_text),
+		level_detection_text);
+
+static const struct soc_enum cle_level_detection_window_enum =
+	SOC_ENUM_SINGLE(R_CLECTL, FB_CLECTL_WINDOWSEL,
+		ARRAY_SIZE(level_detection_window_text),
+		level_detection_window_text);
+
+static const struct soc_enum mbc_level_detection_enums[] = {
+	SOC_ENUM_SINGLE(R_DACMBCCTL, FB_DACMBCCTL_LVLMODE1,
+		ARRAY_SIZE(level_detection_text),
+			level_detection_text),
+	SOC_ENUM_SINGLE(R_DACMBCCTL, FB_DACMBCCTL_LVLMODE2,
+		ARRAY_SIZE(level_detection_text),
+			level_detection_text),
+	SOC_ENUM_SINGLE(R_DACMBCCTL, FB_DACMBCCTL_LVLMODE3,
+		ARRAY_SIZE(level_detection_text),
+			level_detection_text),
+};
+
+static const struct soc_enum mbc_level_detection_window_enums[] = {
+	SOC_ENUM_SINGLE(R_DACMBCCTL, FB_DACMBCCTL_WINSEL1,
+		ARRAY_SIZE(level_detection_window_text),
+			level_detection_window_text),
+	SOC_ENUM_SINGLE(R_DACMBCCTL, FB_DACMBCCTL_WINSEL2,
+		ARRAY_SIZE(level_detection_window_text),
+			level_detection_window_text),
+	SOC_ENUM_SINGLE(R_DACMBCCTL, FB_DACMBCCTL_WINSEL3,
+		ARRAY_SIZE(level_detection_window_text),
+			level_detection_window_text),
+};
+
+static const struct soc_enum compressor_ratio_enum =
+	SOC_ENUM_SINGLE(R_CMPRAT, FB_CMPRAT,
+		ARRAY_SIZE(compressor_ratio_text), compressor_ratio_text);
+
+static const struct soc_enum dac_mbc1_compressor_ratio_enum =
+	SOC_ENUM_SINGLE(R_DACMBCRAT1, FB_DACMBCRAT1_RATIO,
+		ARRAY_SIZE(compressor_ratio_text), compressor_ratio_text);
+
+static const struct soc_enum dac_mbc2_compressor_ratio_enum =
+	SOC_ENUM_SINGLE(R_DACMBCRAT2, FB_DACMBCRAT2_RATIO,
+		ARRAY_SIZE(compressor_ratio_text), compressor_ratio_text);
+
+static const struct soc_enum dac_mbc3_compressor_ratio_enum =
+	SOC_ENUM_SINGLE(R_DACMBCRAT3, FB_DACMBCRAT3_RATIO,
+		ARRAY_SIZE(compressor_ratio_text), compressor_ratio_text);
+
+static int bytes_info_ext(struct snd_kcontrol *kcontrol,
+	struct snd_ctl_elem_info *ucontrol)
+{
+	struct coeff_ram_ctl *ctl =
+		(struct coeff_ram_ctl *)kcontrol->private_value;
+	struct soc_bytes_ext *params = &ctl->bytes_ext;
+
+	ucontrol->type = SNDRV_CTL_ELEM_TYPE_BYTES;
+	ucontrol->count = params->max;
+
+	return 0;
+}
+
+#define COEFF_RAM_CTL(xname, xcount, xaddr) \
+{	.iface = SNDRV_CTL_ELEM_IFACE_MIXER, .name = xname, \
+	.info = bytes_info_ext, \
+	.get = coeff_ram_get, .put = coeff_ram_put, \
+	.private_value = (unsigned long)&(struct coeff_ram_ctl) { \
+		.addr = xaddr, \
+		.bytes_ext = {.max = xcount, }, \
+	} \
+}
+
+static const struct snd_kcontrol_new tscs42xx_snd_controls[] = {
+	/* Volumes */
+	SOC_DOUBLE_R_TLV("Headphone Playback Volume", R_HPVOLL, R_HPVOLR,
+			FB_HPVOLL, 0x7F, 0, hpvol_scale),
+	SOC_DOUBLE_R_TLV("Speaker Playback Volume", R_SPKVOLL, R_SPKVOLR,
+			FB_SPKVOLL, 0x7F, 0, spkvol_scale),
+	SOC_DOUBLE_R_TLV("Master Playback Volume", R_DACVOLL, R_DACVOLR,
+			FB_DACVOLL, 0xFF, 0, dacvol_scale),
+	SOC_DOUBLE_R_TLV("PCM Capture Volume", R_ADCVOLL, R_ADCVOLR,
+			FB_ADCVOLL, 0xFF, 0, adcvol_scale),
+	SOC_DOUBLE_R_TLV("Master Capture Volume", R_INVOLL, R_INVOLR,
+			FB_INVOLL, 0x3F, 0, invol_scale),
+
+	/* INSEL */
+	SOC_DOUBLE_R_TLV("Mic Boost Capture Volume", R_INSELL, R_INSELR,
+			FB_INSELL_MICBSTL, FV_INSELL_MICBSTL_30DB,
+			0, mic_boost_scale),
+
+	/* Input Channel Map */
+	SOC_ENUM("Input Channel Map", ch_map_select_enum),
+
+	/* Coefficient Ram */
+	COEFF_RAM_CTL("Cascade1L BiQuad1", BIQUAD_SIZE, 0x00),
+	COEFF_RAM_CTL("Cascade1L BiQuad2", BIQUAD_SIZE, 0x05),
+	COEFF_RAM_CTL("Cascade1L BiQuad3", BIQUAD_SIZE, 0x0a),
+	COEFF_RAM_CTL("Cascade1L BiQuad4", BIQUAD_SIZE, 0x0f),
+	COEFF_RAM_CTL("Cascade1L BiQuad5", BIQUAD_SIZE, 0x14),
+	COEFF_RAM_CTL("Cascade1L BiQuad6", BIQUAD_SIZE, 0x19),
+
+	COEFF_RAM_CTL("Cascade1R BiQuad1", BIQUAD_SIZE, 0x20),
+	COEFF_RAM_CTL("Cascade1R BiQuad2", BIQUAD_SIZE, 0x25),
+	COEFF_RAM_CTL("Cascade1R BiQuad3", BIQUAD_SIZE, 0x2a),
+	COEFF_RAM_CTL("Cascade1R BiQuad4", BIQUAD_SIZE, 0x2f),
+	COEFF_RAM_CTL("Cascade1R BiQuad5", BIQUAD_SIZE, 0x34),
+	COEFF_RAM_CTL("Cascade1R BiQuad6", BIQUAD_SIZE, 0x39),
+
+	COEFF_RAM_CTL("Cascade1L Prescale", COEFF_SIZE, 0x1f),
+	COEFF_RAM_CTL("Cascade1R Prescale", COEFF_SIZE, 0x3f),
+
+	COEFF_RAM_CTL("Cascade2L BiQuad1", BIQUAD_SIZE, 0x40),
+	COEFF_RAM_CTL("Cascade2L BiQuad2", BIQUAD_SIZE, 0x45),
+	COEFF_RAM_CTL("Cascade2L BiQuad3", BIQUAD_SIZE, 0x4a),
+	COEFF_RAM_CTL("Cascade2L BiQuad4", BIQUAD_SIZE, 0x4f),
+	COEFF_RAM_CTL("Cascade2L BiQuad5", BIQUAD_SIZE, 0x54),
+	COEFF_RAM_CTL("Cascade2L BiQuad6", BIQUAD_SIZE, 0x59),
+
+	COEFF_RAM_CTL("Cascade2R BiQuad1", BIQUAD_SIZE, 0x60),
+	COEFF_RAM_CTL("Cascade2R BiQuad2", BIQUAD_SIZE, 0x65),
+	COEFF_RAM_CTL("Cascade2R BiQuad3", BIQUAD_SIZE, 0x6a),
+	COEFF_RAM_CTL("Cascade2R BiQuad4", BIQUAD_SIZE, 0x6f),
+	COEFF_RAM_CTL("Cascade2R BiQuad5", BIQUAD_SIZE, 0x74),
+	COEFF_RAM_CTL("Cascade2R BiQuad6", BIQUAD_SIZE, 0x79),
+
+	COEFF_RAM_CTL("Cascade2L Prescale", COEFF_SIZE, 0x5f),
+	COEFF_RAM_CTL("Cascade2R Prescale", COEFF_SIZE, 0x7f),
+
+	COEFF_RAM_CTL("Bass Extraction BiQuad1", BIQUAD_SIZE, 0x80),
+	COEFF_RAM_CTL("Bass Extraction BiQuad2", BIQUAD_SIZE, 0x85),
+
+	COEFF_RAM_CTL("Bass Non Linear Function 1", COEFF_SIZE, 0x8a),
+	COEFF_RAM_CTL("Bass Non Linear Function 2", COEFF_SIZE, 0x8b),
+
+	COEFF_RAM_CTL("Bass Limiter BiQuad", BIQUAD_SIZE, 0x8c),
+
+	COEFF_RAM_CTL("Bass Cut Off BiQuad", BIQUAD_SIZE, 0x91),
+
+	COEFF_RAM_CTL("Bass Mix", COEFF_SIZE, 0x96),
+
+	COEFF_RAM_CTL("Treb Extraction BiQuad1", BIQUAD_SIZE, 0x97),
+	COEFF_RAM_CTL("Treb Extraction BiQuad2", BIQUAD_SIZE, 0x9c),
+
+	COEFF_RAM_CTL("Treb Non Linear Function 1", COEFF_SIZE, 0xa1),
+	COEFF_RAM_CTL("Treb Non Linear Function 2", COEFF_SIZE, 0xa2),
+
+	COEFF_RAM_CTL("Treb Limiter BiQuad", BIQUAD_SIZE, 0xa3),
+
+	COEFF_RAM_CTL("Treb Cut Off BiQuad", BIQUAD_SIZE, 0xa8),
+
+	COEFF_RAM_CTL("Treb Mix", COEFF_SIZE, 0xad),
+
+	COEFF_RAM_CTL("3D", COEFF_SIZE, 0xae),
+
+	COEFF_RAM_CTL("3D Mix", COEFF_SIZE, 0xaf),
+
+	COEFF_RAM_CTL("MBC1 BiQuad1", BIQUAD_SIZE, 0xb0),
+	COEFF_RAM_CTL("MBC1 BiQuad2", BIQUAD_SIZE, 0xb5),
+
+	COEFF_RAM_CTL("MBC2 BiQuad1", BIQUAD_SIZE, 0xba),
+	COEFF_RAM_CTL("MBC2 BiQuad2", BIQUAD_SIZE, 0xbf),
+
+	COEFF_RAM_CTL("MBC3 BiQuad1", BIQUAD_SIZE, 0xc4),
+	COEFF_RAM_CTL("MBC3 BiQuad2", BIQUAD_SIZE, 0xc9),
+
+	/* EQ */
+	SOC_SINGLE("EQ1 Switch", R_CONFIG1, FB_CONFIG1_EQ1_EN, 1, 0),
+	SOC_SINGLE("EQ2 Switch", R_CONFIG1, FB_CONFIG1_EQ2_EN, 1, 0),
+	SOC_ENUM("EQ1 Band Enable", eq1_band_enable_enum),
+	SOC_ENUM("EQ2 Band Enable", eq2_band_enable_enum),
+
+	/* CLE */
+	SOC_ENUM("CLE Level Detect",
+		cle_level_detection_enum),
+	SOC_ENUM("CLE Level Detect Win",
+		cle_level_detection_window_enum),
+	SOC_SINGLE("Expander Switch",
+		R_CLECTL, FB_CLECTL_EXP_EN, 1, 0),
+	SOC_SINGLE("Limiter Switch",
+		R_CLECTL, FB_CLECTL_LIMIT_EN, 1, 0),
+	SOC_SINGLE("Comp Switch",
+		R_CLECTL, FB_CLECTL_COMP_EN, 1, 0),
+	SOC_SINGLE_TLV("CLE Make-Up Gain Playback Volume",
+		R_MUGAIN, FB_MUGAIN_CLEMUG, 0x1f, 0, mugain_scale),
+	SOC_SINGLE_TLV("Comp Thresh Playback Volume",
+		R_COMPTH, FB_COMPTH, 0xff, 0, compth_scale),
+	SOC_ENUM("Comp Ratio", compressor_ratio_enum),
+	SND_SOC_BYTES("Comp Atk Time", R_CATKTCL, 2),
+
+	/* Effects */
+	SOC_SINGLE("3D Switch", R_FXCTL, FB_FXCTL_3DEN, 1, 0),
+	SOC_SINGLE("Treble Switch", R_FXCTL, FB_FXCTL_TEEN, 1, 0),
+	SOC_SINGLE("Treble Bypass Switch", R_FXCTL, FB_FXCTL_TNLFBYPASS, 1, 0),
+	SOC_SINGLE("Bass Switch", R_FXCTL, FB_FXCTL_BEEN, 1, 0),
+	SOC_SINGLE("Bass Bypass Switch", R_FXCTL, FB_FXCTL_BNLFBYPASS, 1, 0),
+
+	/* MBC */
+	SOC_SINGLE("MBC Band1 Switch", R_DACMBCEN, FB_DACMBCEN_MBCEN1, 1, 0),
+	SOC_SINGLE("MBC Band2 Switch", R_DACMBCEN, FB_DACMBCEN_MBCEN2, 1, 0),
+	SOC_SINGLE("MBC Band3 Switch", R_DACMBCEN, FB_DACMBCEN_MBCEN3, 1, 0),
+	SOC_ENUM("MBC Band1 Level Detect",
+		mbc_level_detection_enums[0]),
+	SOC_ENUM("MBC Band2 Level Detect",
+		mbc_level_detection_enums[1]),
+	SOC_ENUM("MBC Band3 Level Detect",
+		mbc_level_detection_enums[2]),
+	SOC_ENUM("MBC Band1 Level Detect Win",
+		mbc_level_detection_window_enums[0]),
+	SOC_ENUM("MBC Band2 Level Detect Win",
+		mbc_level_detection_window_enums[1]),
+	SOC_ENUM("MBC Band3 Level Detect Win",
+		mbc_level_detection_window_enums[2]),
+
+	SOC_SINGLE("MBC1 Phase Invert Switch",
+		R_DACMBCMUG1, FB_DACMBCMUG1_PHASE, 1, 0),
+	SOC_SINGLE_TLV("DAC MBC1 Make-Up Gain Playback Volume",
+		R_DACMBCMUG1, FB_DACMBCMUG1_MUGAIN, 0x1f, 0, mugain_scale),
+	SOC_SINGLE_TLV("DAC MBC1 Comp Thresh Playback Volume",
+		R_DACMBCTHR1, FB_DACMBCTHR1_THRESH, 0xff, 0, compth_scale),
+	SOC_ENUM("DAC MBC1 Comp Ratio",
+		dac_mbc1_compressor_ratio_enum),
+	SND_SOC_BYTES("DAC MBC1 Comp Atk Time", R_DACMBCATK1L, 2),
+	SND_SOC_BYTES("DAC MBC1 Comp Rel Time Const",
+		R_DACMBCREL1L, 2),
+
+	SOC_SINGLE("MBC2 Phase Invert Switch",
+		R_DACMBCMUG2, FB_DACMBCMUG2_PHASE, 1, 0),
+	SOC_SINGLE_TLV("DAC MBC2 Make-Up Gain Playback Volume",
+		R_DACMBCMUG2, FB_DACMBCMUG2_MUGAIN, 0x1f, 0, mugain_scale),
+	SOC_SINGLE_TLV("DAC MBC2 Comp Thresh Playback Volume",
+		R_DACMBCTHR2, FB_DACMBCTHR2_THRESH, 0xff, 0, compth_scale),
+	SOC_ENUM("DAC MBC2 Comp Ratio",
+		dac_mbc2_compressor_ratio_enum),
+	SND_SOC_BYTES("DAC MBC2 Comp Atk Time", R_DACMBCATK2L, 2),
+	SND_SOC_BYTES("DAC MBC2 Comp Rel Time Const",
+		R_DACMBCREL2L, 2),
+
+	SOC_SINGLE("MBC3 Phase Invert Switch",
+		R_DACMBCMUG3, FB_DACMBCMUG3_PHASE, 1, 0),
+	SOC_SINGLE_TLV("DAC MBC3 Make-Up Gain Playback Volume",
+		R_DACMBCMUG3, FB_DACMBCMUG3_MUGAIN, 0x1f, 0, mugain_scale),
+	SOC_SINGLE_TLV("DAC MBC3 Comp Thresh Playback Volume",
+		R_DACMBCTHR3, FB_DACMBCTHR3_THRESH, 0xff, 0, compth_scale),
+	SOC_ENUM("DAC MBC3 Comp Ratio",
+		dac_mbc3_compressor_ratio_enum),
+	SND_SOC_BYTES("DAC MBC3 Comp Atk Time", R_DACMBCATK3L, 2),
+	SND_SOC_BYTES("DAC MBC3 Comp Rel Time Const",
+		R_DACMBCREL3L, 2),
+};
+
+static int setup_sample_format(struct snd_soc_codec *codec,
+		snd_pcm_format_t format)
+{
+	unsigned int width;
+	int ret;
+
+	switch (format) {
+	case SNDRV_PCM_FORMAT_S16_LE:
+		width = RV_AIC1_WL_16;
+		break;
+	case SNDRV_PCM_FORMAT_S20_3LE:
+		width = RV_AIC1_WL_20;
+		break;
+	case SNDRV_PCM_FORMAT_S24_LE:
+		width = RV_AIC1_WL_24;
+		break;
+	case SNDRV_PCM_FORMAT_S32_LE:
+		width = RV_AIC1_WL_32;
+		break;
+	default:
+		ret = -EINVAL;
+		dev_err(codec->dev, "Unsupported format width (%d)\n", ret);
+		return ret;
+	}
+	ret = snd_soc_update_bits(codec, R_AIC1, RM_AIC1_WL, width);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to set sample width (%d)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int setup_sample_rate(struct snd_soc_codec *codec, unsigned int rate)
+{
+	struct tscs42xx *tscs42xx = snd_soc_codec_get_drvdata(codec);
+	unsigned int br, bm;
+	int ret;
+
+	switch (rate) {
+	case 8000:
+		br = RV_DACSR_DBR_32;
+		bm = RV_DACSR_DBM_PT25;
+		break;
+	case 16000:
+		br = RV_DACSR_DBR_32;
+		bm = RV_DACSR_DBM_PT5;
+		break;
+	case 24000:
+		br = RV_DACSR_DBR_48;
+		bm = RV_DACSR_DBM_PT5;
+		break;
+	case 32000:
+		br = RV_DACSR_DBR_32;
+		bm = RV_DACSR_DBM_1;
+		break;
+	case 48000:
+		br = RV_DACSR_DBR_48;
+		bm = RV_DACSR_DBM_1;
+		break;
+	case 96000:
+		br = RV_DACSR_DBR_48;
+		bm = RV_DACSR_DBM_2;
+		break;
+	case 11025:
+		br = RV_DACSR_DBR_44_1;
+		bm = RV_DACSR_DBM_PT25;
+		break;
+	case 22050:
+		br = RV_DACSR_DBR_44_1;
+		bm = RV_DACSR_DBM_PT5;
+		break;
+	case 44100:
+		br = RV_DACSR_DBR_44_1;
+		bm = RV_DACSR_DBM_1;
+		break;
+	case 88200:
+		br = RV_DACSR_DBR_44_1;
+		bm = RV_DACSR_DBM_2;
+		break;
+	default:
+		dev_err(codec->dev, "Unsupported sample rate %d\n", rate);
+		return -EINVAL;
+	}
+
+	/* DAC and ADC share bit and frame clock */
+	ret = snd_soc_update_bits(codec, R_DACSR, RM_DACSR_DBR, br);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to update register (%d)\n", ret);
+		return ret;
+	}
+	ret = snd_soc_update_bits(codec, R_DACSR, RM_DACSR_DBM, bm);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to update register (%d)\n", ret);
+		return ret;
+	}
+	ret = snd_soc_update_bits(codec, R_ADCSR, RM_DACSR_DBR, br);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to update register (%d)\n", ret);
+		return ret;
+	}
+	ret = snd_soc_update_bits(codec, R_ADCSR, RM_DACSR_DBM, bm);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to update register (%d)\n", ret);
+		return ret;
+	}
+
+	mutex_lock(&tscs42xx->audio_params_lock);
+
+	tscs42xx->samplerate = rate;
+
+	mutex_unlock(&tscs42xx->audio_params_lock);
+
+	return 0;
+}
+
+struct reg_setting {
+	unsigned int addr;
+	unsigned int val;
+	unsigned int mask;
+};
+
+#define PLL_REG_SETTINGS_COUNT 13
+struct pll_ctl {
+	int input_freq;
+	struct reg_setting settings[PLL_REG_SETTINGS_COUNT];
+};
+
+#define PLL_CTL(f, rt, rd, r1b_l, r9, ra, rb,		\
+		rc, r12, r1b_h, re, rf, r10, r11)	\
+	{						\
+		.input_freq = f,			\
+		.settings = {				\
+			{R_TIMEBASE,  rt,   0xFF},	\
+			{R_PLLCTLD,   rd,   0xFF},	\
+			{R_PLLCTL1B, r1b_l, 0x0F},	\
+			{R_PLLCTL9,   r9,   0xFF},	\
+			{R_PLLCTLA,   ra,   0xFF},	\
+			{R_PLLCTLB,   rb,   0xFF},	\
+			{R_PLLCTLC,   rc,   0xFF},	\
+			{R_PLLCTL12, r12,   0xFF},	\
+			{R_PLLCTL1B, r1b_h, 0xF0},	\
+			{R_PLLCTLE,   re,   0xFF},	\
+			{R_PLLCTLF,   rf,   0xFF},	\
+			{R_PLLCTL10, r10,   0xFF},	\
+			{R_PLLCTL11, r11,   0xFF},	\
+		},					\
+	}
+
+static const struct pll_ctl pll_ctls[] = {
+	PLL_CTL(1411200, 0x05,
+		0x39, 0x04, 0x07, 0x02, 0xC3, 0x04,
+		0x1B, 0x10, 0x03, 0x03, 0xD0, 0x02),
+	PLL_CTL(1536000, 0x05,
+		0x1A, 0x04, 0x02, 0x03, 0xE0, 0x01,
+		0x1A, 0x10, 0x02, 0x03, 0xB9, 0x01),
+	PLL_CTL(2822400, 0x0A,
+		0x23, 0x04, 0x07, 0x04, 0xC3, 0x04,
+		0x22, 0x10, 0x05, 0x03, 0x58, 0x02),
+	PLL_CTL(3072000, 0x0B,
+		0x22, 0x04, 0x07, 0x03, 0x48, 0x03,
+		0x1A, 0x10, 0x04, 0x03, 0xB9, 0x01),
+	PLL_CTL(5644800, 0x15,
+		0x23, 0x04, 0x0E, 0x04, 0xC3, 0x04,
+		0x1A, 0x10, 0x08, 0x03, 0xE0, 0x01),
+	PLL_CTL(6144000, 0x17,
+		0x1A, 0x04, 0x08, 0x03, 0xE0, 0x01,
+		0x1A, 0x10, 0x08, 0x03, 0xB9, 0x01),
+	PLL_CTL(12000000, 0x2E,
+		0x1B, 0x04, 0x19, 0x03, 0x00, 0x03,
+		0x2A, 0x10, 0x19, 0x05, 0x98, 0x04),
+	PLL_CTL(19200000, 0x4A,
+		0x13, 0x04, 0x14, 0x03, 0x80, 0x01,
+		0x1A, 0x10, 0x19, 0x03, 0xB9, 0x01),
+	PLL_CTL(22000000, 0x55,
+		0x2A, 0x04, 0x37, 0x05, 0x00, 0x06,
+		0x22, 0x10, 0x26, 0x03, 0x49, 0x02),
+	PLL_CTL(22579200, 0x57,
+		0x22, 0x04, 0x31, 0x03, 0x20, 0x03,
+		0x1A, 0x10, 0x1D, 0x03, 0xB3, 0x01),
+	PLL_CTL(24000000, 0x5D,
+		0x13, 0x04, 0x19, 0x03, 0x80, 0x01,
+		0x1B, 0x10, 0x19, 0x05, 0x4C, 0x02),
+	PLL_CTL(24576000, 0x5F,
+		0x13, 0x04, 0x1D, 0x03, 0xB3, 0x01,
+		0x22, 0x10, 0x40, 0x03, 0x72, 0x03),
+	PLL_CTL(27000000, 0x68,
+		0x22, 0x04, 0x4B, 0x03, 0x00, 0x04,
+		0x2A, 0x10, 0x7D, 0x03, 0x20, 0x06),
+	PLL_CTL(36000000, 0x8C,
+		0x1B, 0x04, 0x4B, 0x03, 0x00, 0x03,
+		0x2A, 0x10, 0x7D, 0x03, 0x98, 0x04),
+	PLL_CTL(25000000, 0x61,
+		0x1B, 0x04, 0x37, 0x03, 0x2B, 0x03,
+		0x1A, 0x10, 0x2A, 0x03, 0x39, 0x02),
+	PLL_CTL(26000000, 0x65,
+		0x23, 0x04, 0x41, 0x05, 0x00, 0x06,
+		0x1A, 0x10, 0x26, 0x03, 0xEF, 0x01),
+	PLL_CTL(12288000, 0x2F,
+		0x1A, 0x04, 0x12, 0x03, 0x1C, 0x02,
+		0x22, 0x10, 0x20, 0x03, 0x72, 0x03),
+	PLL_CTL(40000000, 0x9B,
+		0x22, 0x08, 0x7D, 0x03, 0x80, 0x04,
+		0x23, 0x10, 0x7D, 0x05, 0xE4, 0x06),
+	PLL_CTL(512000, 0x01,
+		0x22, 0x04, 0x01, 0x03, 0xD0, 0x02,
+		0x1B, 0x10, 0x01, 0x04, 0x72, 0x03),
+	PLL_CTL(705600, 0x02,
+		0x22, 0x04, 0x02, 0x03, 0x15, 0x04,
+		0x22, 0x10, 0x01, 0x04, 0x80, 0x02),
+	PLL_CTL(1024000, 0x03,
+		0x22, 0x04, 0x02, 0x03, 0xD0, 0x02,
+		0x1B, 0x10, 0x02, 0x04, 0x72, 0x03),
+	PLL_CTL(2048000, 0x07,
+		0x22, 0x04, 0x04, 0x03, 0xD0, 0x02,
+		0x1B, 0x10, 0x04, 0x04, 0x72, 0x03),
+	PLL_CTL(2400000, 0x08,
+		0x22, 0x04, 0x05, 0x03, 0x00, 0x03,
+		0x23, 0x10, 0x05, 0x05, 0x98, 0x04),
+};
+
+static const struct pll_ctl *get_pll_ctl(int input_freq)
+{
+	int i;
+	const struct pll_ctl *pll_ctl = NULL;
+
+	for (i = 0; i < ARRAY_SIZE(pll_ctls); ++i)
+		if (input_freq == pll_ctls[i].input_freq) {
+			pll_ctl = &pll_ctls[i];
+			break;
+		}
+
+	return pll_ctl;
+}
+
+static int set_pll_ctl_from_input_freq(struct snd_soc_codec *codec,
+		const int input_freq)
+{
+	int ret;
+	int i;
+	const struct pll_ctl *pll_ctl;
+
+	pll_ctl = get_pll_ctl(input_freq);
+	if (!pll_ctl) {
+		ret = -EINVAL;
+		dev_err(codec->dev, "No PLL input entry for %d (%d)\n",
+			input_freq, ret);
+		return ret;
+	}
+
+	for (i = 0; i < PLL_REG_SETTINGS_COUNT; ++i) {
+		ret = snd_soc_update_bits(codec,
+			pll_ctl->settings[i].addr,
+			pll_ctl->settings[i].mask,
+			pll_ctl->settings[i].val);
+		if (ret < 0) {
+			dev_err(codec->dev, "Failed to set pll ctl (%d)\n",
+				ret);
+			return ret;
+		}
+	}
+
+	return 0;
+}
+
+static int tscs42xx_hw_params(struct snd_pcm_substream *substream,
+		struct snd_pcm_hw_params *params,
+		struct snd_soc_dai *codec_dai)
+{
+	struct snd_soc_codec *codec = codec_dai->codec;
+	int ret;
+
+	ret = setup_sample_format(codec, params_format(params));
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to setup sample format (%d)\n",
+			ret);
+		return ret;
+	}
+
+	ret = setup_sample_rate(codec, params_rate(params));
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to setup sample rate (%d)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static inline int dac_mute(struct snd_soc_codec *codec)
+{
+	int ret;
+
+	ret = snd_soc_update_bits(codec, R_CNVRTR1, RM_CNVRTR1_DACMU,
+		RV_CNVRTR1_DACMU_ENABLE);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to mute DAC (%d)\n",
+				ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static inline int dac_unmute(struct snd_soc_codec *codec)
+{
+	int ret;
+
+	ret = snd_soc_update_bits(codec, R_CNVRTR1, RM_CNVRTR1_DACMU,
+		RV_CNVRTR1_DACMU_DISABLE);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to unmute DAC (%d)\n",
+				ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static inline int adc_mute(struct snd_soc_codec *codec)
+{
+	int ret;
+
+	ret = snd_soc_update_bits(codec, R_CNVRTR0, RM_CNVRTR0_ADCMU,
+		RV_CNVRTR0_ADCMU_ENABLE);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to mute ADC (%d)\n",
+				ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static inline int adc_unmute(struct snd_soc_codec *codec)
+{
+	int ret;
+
+	ret = snd_soc_update_bits(codec, R_CNVRTR0, RM_CNVRTR0_ADCMU,
+		RV_CNVRTR0_ADCMU_DISABLE);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to unmute ADC (%d)\n",
+				ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int tscs42xx_mute_stream(struct snd_soc_dai *dai, int mute, int stream)
+{
+	struct snd_soc_codec *codec = dai->codec;
+	int ret;
+
+	if (mute)
+		if (stream == SNDRV_PCM_STREAM_PLAYBACK)
+			ret = dac_mute(codec);
+		else
+			ret = adc_mute(codec);
+	else
+		if (stream == SNDRV_PCM_STREAM_PLAYBACK)
+			ret = dac_unmute(codec);
+		else
+			ret = adc_unmute(codec);
+
+	return ret;
+}
+
+static int tscs42xx_set_dai_fmt(struct snd_soc_dai *codec_dai,
+		unsigned int fmt)
+{
+	struct snd_soc_codec *codec = codec_dai->codec;
+	int ret;
+
+	/* Slave mode not supported since it needs always-on frame clock */
+	switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
+	case SND_SOC_DAIFMT_CBM_CFM:
+		ret = snd_soc_update_bits(codec, R_AIC1, RM_AIC1_MS,
+				RV_AIC1_MS_MASTER);
+		if (ret < 0) {
+			dev_err(codec->dev,
+				"Failed to set codec DAI master (%d)\n", ret);
+			return ret;
+		}
+		break;
+	default:
+		ret = -EINVAL;
+		dev_err(codec->dev, "Unsupported format (%d)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int tscs42xx_set_dai_bclk_ratio(struct snd_soc_dai *codec_dai,
+		unsigned int ratio)
+{
+	struct snd_soc_codec *codec = codec_dai->codec;
+	struct tscs42xx *tscs42xx = snd_soc_codec_get_drvdata(codec);
+	unsigned int value;
+	int ret = 0;
+
+	switch (ratio) {
+	case 32:
+		value = RV_DACSR_DBCM_32;
+		break;
+	case 40:
+		value = RV_DACSR_DBCM_40;
+		break;
+	case 64:
+		value = RV_DACSR_DBCM_64;
+		break;
+	default:
+		dev_err(codec->dev, "Unsupported bclk ratio (%d)\n", ret);
+		return -EINVAL;
+	}
+
+	ret = snd_soc_update_bits(codec, R_DACSR, RM_DACSR_DBCM, value);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to set DAC BCLK ratio (%d)\n", ret);
+		return ret;
+	}
+	ret = snd_soc_update_bits(codec, R_ADCSR, RM_ADCSR_ABCM, value);
+	if (ret < 0) {
+		dev_err(codec->dev, "Failed to set ADC BCLK ratio (%d)\n", ret);
+		return ret;
+	}
+
+	mutex_lock(&tscs42xx->audio_params_lock);
+
+	tscs42xx->bclk_ratio = ratio;
+
+	mutex_unlock(&tscs42xx->audio_params_lock);
+
+	return 0;
+}
+
+static int tscs42xx_set_dai_sysclk(struct snd_soc_dai *codec_dai,
+	int clk_id, unsigned int freq, int dir)
+{
+	struct snd_soc_codec *codec = codec_dai->codec;
+	int ret;
+
+	switch (clk_id) {
+	case TSCS42XX_PLL_SRC_XTAL:
+	case TSCS42XX_PLL_SRC_MCLK1:
+		ret = snd_soc_write(codec, R_PLLREFSEL,
+				RV_PLLREFSEL_PLL1_REF_SEL_XTAL_MCLK1 |
+				RV_PLLREFSEL_PLL2_REF_SEL_XTAL_MCLK1);
+		if (ret < 0) {
+			dev_err(codec->dev,
+				"Failed to set pll reference input (%d)\n",
+				ret);
+			return ret;
+		}
+		break;
+	case TSCS42XX_PLL_SRC_MCLK2:
+		ret = snd_soc_write(codec, R_PLLREFSEL,
+				RV_PLLREFSEL_PLL1_REF_SEL_MCLK2 |
+				RV_PLLREFSEL_PLL2_REF_SEL_MCLK2);
+		if (ret < 0) {
+			dev_err(codec->dev,
+				"Failed to set PLL reference (%d)\n", ret);
+			return ret;
+		}
+		break;
+	default:
+		dev_err(codec->dev, "pll src is unsupported\n");
+		return -EINVAL;
+	}
+
+	ret = set_pll_ctl_from_input_freq(codec, freq);
+	if (ret < 0) {
+		dev_err(codec->dev,
+			"Failed to setup PLL input freq (%d)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static const struct snd_soc_dai_ops tscs42xx_dai_ops = {
+	.hw_params	= tscs42xx_hw_params,
+	.mute_stream	= tscs42xx_mute_stream,
+	.set_fmt	= tscs42xx_set_dai_fmt,
+	.set_bclk_ratio = tscs42xx_set_dai_bclk_ratio,
+	.set_sysclk	= tscs42xx_set_dai_sysclk,
+};
+
+static int part_is_valid(struct tscs42xx *tscs42xx)
+{
+	int val;
+	int ret;
+	unsigned int reg;
+
+	ret = regmap_read(tscs42xx->regmap, R_DEVIDH, &reg);
+	if (ret < 0)
+		return ret;
+
+	val = reg << 8;
+	ret = regmap_read(tscs42xx->regmap, R_DEVIDL, &reg);
+	if (ret < 0)
+		return ret;
+
+	val |= reg;
+
+	switch (val) {
+	case 0x4A74:
+	case 0x4A73:
+		return true;
+	default:
+		return false;
+	};
+}
+
+static struct snd_soc_codec_driver soc_codec_dev_tscs42xx = {
+	.component_driver = {
+		.dapm_widgets = tscs42xx_dapm_widgets,
+		.num_dapm_widgets = ARRAY_SIZE(tscs42xx_dapm_widgets),
+		.dapm_routes = tscs42xx_intercon,
+		.num_dapm_routes = ARRAY_SIZE(tscs42xx_intercon),
+		.controls =	tscs42xx_snd_controls,
+		.num_controls = ARRAY_SIZE(tscs42xx_snd_controls),
+	},
+};
+
+static inline void init_coeff_ram_cache(struct tscs42xx *tscs42xx)
+{
+	const u8 norm_addrs[] = { 0x00, 0x05, 0x0a, 0x0f, 0x14, 0x19, 0x1f,
+		0x20, 0x25, 0x2a, 0x2f, 0x34, 0x39, 0x3f, 0x40, 0x45, 0x4a,
+		0x4f, 0x54, 0x59, 0x5f, 0x60, 0x65, 0x6a, 0x6f, 0x74, 0x79,
+		0x7f, 0x80, 0x85, 0x8c, 0x91, 0x96, 0x97, 0x9c, 0xa3, 0xa8,
+		0xad, 0xaf, 0xb0, 0xb5, 0xba, 0xbf, 0xc4, 0xc9, };
+	u8 *coeff_ram = tscs42xx->coeff_ram;
+	int i;
+
+	for (i = 0; i < ARRAY_SIZE(norm_addrs); i++)
+		coeff_ram[((norm_addrs[i] + 1) * COEFF_SIZE) - 1] = 0x40;
+}
+
+#define TSCS42XX_RATES SNDRV_PCM_RATE_8000_96000
+
+#define TSCS42XX_FORMATS (SNDRV_PCM_FMTBIT_S16_LE | SNDRV_PCM_FMTBIT_S20_3LE \
+	| SNDRV_PCM_FMTBIT_S24_LE | SNDRV_PCM_FMTBIT_S32_LE)
+
+static struct snd_soc_dai_driver tscs42xx_dai = {
+	.name = "tscs42xx-HiFi",
+	.playback = {
+		.stream_name = "HiFi Playback",
+		.channels_min = 2,
+		.channels_max = 2,
+		.rates = TSCS42XX_RATES,
+		.formats = TSCS42XX_FORMATS,},
+	.capture = {
+		.stream_name = "HiFi Capture",
+		.channels_min = 2,
+		.channels_max = 2,
+		.rates = TSCS42XX_RATES,
+		.formats = TSCS42XX_FORMATS,},
+	.ops = &tscs42xx_dai_ops,
+	.symmetric_rates = 1,
+	.symmetric_channels = 1,
+	.symmetric_samplebits = 1,
+};
+
+static const struct reg_sequence tscs42xx_patch[] = {
+	{ R_AIC2, RV_AIC2_BLRCM_DAC_BCLK_LRCLK_SHARED },
+};
+
+static int tscs42xx_i2c_probe(struct i2c_client *i2c,
+		const struct i2c_device_id *id)
+{
+	struct tscs42xx *tscs42xx;
+	int ret = 0;
+
+	tscs42xx = devm_kzalloc(&i2c->dev, sizeof(*tscs42xx), GFP_KERNEL);
+	if (!tscs42xx) {
+		ret = -ENOMEM;
+		dev_err(&i2c->dev,
+			"Failed to allocate memory for data (%d)\n", ret);
+		return ret;
+	}
+	i2c_set_clientdata(i2c, tscs42xx);
+	tscs42xx->dev = &i2c->dev;
+
+	tscs42xx->regmap = devm_regmap_init_i2c(i2c, &tscs42xx_regmap);
+	if (IS_ERR(tscs42xx->regmap)) {
+		ret = PTR_ERR(tscs42xx->regmap);
+		dev_err(tscs42xx->dev, "Failed to allocate regmap (%d)\n", ret);
+		return ret;
+	}
+
+	init_coeff_ram_cache(tscs42xx);
+
+	ret = part_is_valid(tscs42xx);
+	if (ret <= 0) {
+		dev_err(tscs42xx->dev, "No valid part (%d)\n", ret);
+		ret = -ENODEV;
+		return ret;
+	}
+
+	ret = regmap_write(tscs42xx->regmap, R_RESET, RV_RESET_ENABLE);
+	if (ret < 0) {
+		dev_err(tscs42xx->dev, "Failed to reset device (%d)\n", ret);
+		return ret;
+	}
+
+	ret = regmap_register_patch(tscs42xx->regmap, tscs42xx_patch,
+			ARRAY_SIZE(tscs42xx_patch));
+	if (ret < 0) {
+		dev_err(tscs42xx->dev, "Failed to apply patch (%d)\n", ret);
+		return ret;
+	}
+
+	mutex_init(&tscs42xx->audio_params_lock);
+	mutex_init(&tscs42xx->coeff_ram_lock);
+	mutex_init(&tscs42xx->pll_lock);
+
+	ret = snd_soc_register_codec(tscs42xx->dev, &soc_codec_dev_tscs42xx,
+			&tscs42xx_dai, 1);
+	if (ret) {
+		dev_err(tscs42xx->dev, "Failed to register codec (%d)\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+static int tscs42xx_i2c_remove(struct i2c_client *client)
+{
+	snd_soc_unregister_codec(&client->dev);
+
+	return 0;
+}
+
+static const struct i2c_device_id tscs42xx_i2c_id[] = {
+	{ "tscs42A1", 0 },
+	{ "tscs42A2", 0 },
+	{ }
+};
+MODULE_DEVICE_TABLE(i2c, tscs42xx_i2c_id);
+
+static const struct of_device_id tscs42xx_of_match[] = {
+	{ .compatible = "tempo,tscs42A1", },
+	{ .compatible = "tempo,tscs42A2", },
+	{ }
+};
+MODULE_DEVICE_TABLE(of, tscs42xx_of_match);
+
+static struct i2c_driver tscs42xx_i2c_driver = {
+	.driver = {
+		.name = "tscs42xx",
+		.owner = THIS_MODULE,
+		.of_match_table = tscs42xx_of_match,
+	},
+	.probe =    tscs42xx_i2c_probe,
+	.remove =   tscs42xx_i2c_remove,
+	.id_table = tscs42xx_i2c_id,
+};
+
+module_i2c_driver(tscs42xx_i2c_driver);
+
+MODULE_AUTHOR("Tempo Semiconductor <steven.eckhoff.opensource@gmail.com");
+MODULE_DESCRIPTION("ASoC TSCS42xx driver");
+MODULE_LICENSE("GPL");
diff --git a/sound/soc/codecs/tscs42xx.h b/sound/soc/codecs/tscs42xx.h
new file mode 100644
index 0000000..d4a30bc
--- /dev/null
+++ b/sound/soc/codecs/tscs42xx.h
@@ -0,0 +1,2693 @@
+// SPDX-License-Identifier: GPL-2.0
+// tscs42xx.h -- TSCS42xx ALSA SoC Audio driver
+// Copyright 2017 Tempo Semiconductor, Inc.
+// Author: Steven Eckhoff <steven.eckhoff.opensource@gmail.com>
+
+#ifndef __WOOKIE_H__
+#define __WOOKIE_H__
+
+enum {
+	TSCS42XX_PLL_SRC_NONE,
+	TSCS42XX_PLL_SRC_XTAL,
+	TSCS42XX_PLL_SRC_MCLK1,
+	TSCS42XX_PLL_SRC_MCLK2,
+};
+
+#define R_HPVOLL        0x0
+#define R_HPVOLR        0x1
+#define R_SPKVOLL       0x2
+#define R_SPKVOLR       0x3
+#define R_DACVOLL       0x4
+#define R_DACVOLR       0x5
+#define R_ADCVOLL       0x6
+#define R_ADCVOLR       0x7
+#define R_INVOLL        0x8
+#define R_INVOLR        0x9
+#define R_INMODE        0x0B
+#define R_INSELL        0x0C
+#define R_INSELR        0x0D
+#define R_AIC1          0x13
+#define R_AIC2          0x14
+#define R_CNVRTR0       0x16
+#define R_ADCSR         0x17
+#define R_CNVRTR1       0x18
+#define R_DACSR         0x19
+#define R_PWRM1         0x1A
+#define R_PWRM2         0x1B
+#define R_CONFIG0       0x1F
+#define R_CONFIG1       0x20
+#define R_DMICCTL       0x24
+#define R_CLECTL        0x25
+#define R_MUGAIN        0x26
+#define R_COMPTH        0x27
+#define R_CMPRAT        0x28
+#define R_CATKTCL       0x29
+#define R_CATKTCH       0x2A
+#define R_CRELTCL       0x2B
+#define R_CRELTCH       0x2C
+#define R_LIMTH         0x2D
+#define R_LIMTGT        0x2E
+#define R_LATKTCL       0x2F
+#define R_LATKTCH       0x30
+#define R_LRELTCL       0x31
+#define R_LRELTCH       0x32
+#define R_EXPTH         0x33
+#define R_EXPRAT        0x34
+#define R_XATKTCL       0x35
+#define R_XATKTCH       0x36
+#define R_XRELTCL       0x37
+#define R_XRELTCH       0x38
+#define R_FXCTL         0x39
+#define R_DACCRWRL      0x3A
+#define R_DACCRWRM      0x3B
+#define R_DACCRWRH      0x3C
+#define R_DACCRRDL      0x3D
+#define R_DACCRRDM      0x3E
+#define R_DACCRRDH      0x3F
+#define R_DACCRADDR     0x40
+#define R_DCOFSEL       0x41
+#define R_PLLCTL9       0x4E
+#define R_PLLCTLA       0x4F
+#define R_PLLCTLB       0x50
+#define R_PLLCTLC       0x51
+#define R_PLLCTLD       0x52
+#define R_PLLCTLE       0x53
+#define R_PLLCTLF       0x54
+#define R_PLLCTL10      0x55
+#define R_PLLCTL11      0x56
+#define R_PLLCTL12      0x57
+#define R_PLLCTL1B      0x60
+#define R_PLLCTL1C      0x61
+#define R_TIMEBASE      0x77
+#define R_DEVIDL        0x7D
+#define R_DEVIDH        0x7E
+#define R_RESET         0x80
+#define R_DACCRSTAT     0x8A
+#define R_PLLCTL0       0x8E
+#define R_PLLREFSEL     0x8F
+#define R_DACMBCEN      0xC7
+#define R_DACMBCCTL     0xC8
+#define R_DACMBCMUG1    0xC9
+#define R_DACMBCTHR1    0xCA
+#define R_DACMBCRAT1    0xCB
+#define R_DACMBCATK1L   0xCC
+#define R_DACMBCATK1H   0xCD
+#define R_DACMBCREL1L   0xCE
+#define R_DACMBCREL1H   0xCF
+#define R_DACMBCMUG2    0xD0
+#define R_DACMBCTHR2    0xD1
+#define R_DACMBCRAT2    0xD2
+#define R_DACMBCATK2L   0xD3
+#define R_DACMBCATK2H   0xD4
+#define R_DACMBCREL2L   0xD5
+#define R_DACMBCREL2H   0xD6
+#define R_DACMBCMUG3    0xD7
+#define R_DACMBCTHR3    0xD8
+#define R_DACMBCRAT3    0xD9
+#define R_DACMBCATK3L   0xDA
+#define R_DACMBCATK3H   0xDB
+#define R_DACMBCREL3L   0xDC
+#define R_DACMBCREL3H   0xDD
+
+/* Helpers */
+#define RM(m, b) ((m)<<(b))
+#define RV(v, b) ((v)<<(b))
+
+/****************************
+ *      R_HPVOLL (0x0)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_HPVOLL                            0
+
+/* Field Masks */
+#define FM_HPVOLL                            0X7F
+
+/* Field Values */
+#define FV_HPVOLL_P6DB                       0x7F
+#define FV_HPVOLL_N88PT5DB                   0x1
+#define FV_HPVOLL_MUTE                       0x0
+
+/* Register Masks */
+#define RM_HPVOLL                            RM(FM_HPVOLL, FB_HPVOLL)
+
+/* Register Values */
+#define RV_HPVOLL_P6DB                       RV(FV_HPVOLL_P6DB, FB_HPVOLL)
+#define RV_HPVOLL_N88PT5DB                   RV(FV_HPVOLL_N88PT5DB, FB_HPVOLL)
+#define RV_HPVOLL_MUTE                       RV(FV_HPVOLL_MUTE, FB_HPVOLL)
+
+/****************************
+ *      R_HPVOLR (0x1)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_HPVOLR                            0
+
+/* Field Masks */
+#define FM_HPVOLR                            0X7F
+
+/* Field Values */
+#define FV_HPVOLR_P6DB                       0x7F
+#define FV_HPVOLR_N88PT5DB                   0x1
+#define FV_HPVOLR_MUTE                       0x0
+
+/* Register Masks */
+#define RM_HPVOLR                            RM(FM_HPVOLR, FB_HPVOLR)
+
+/* Register Values */
+#define RV_HPVOLR_P6DB                       RV(FV_HPVOLR_P6DB, FB_HPVOLR)
+#define RV_HPVOLR_N88PT5DB                   RV(FV_HPVOLR_N88PT5DB, FB_HPVOLR)
+#define RV_HPVOLR_MUTE                       RV(FV_HPVOLR_MUTE, FB_HPVOLR)
+
+/*****************************
+ *      R_SPKVOLL (0x2)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_SPKVOLL                           0
+
+/* Field Masks */
+#define FM_SPKVOLL                           0X7F
+
+/* Field Values */
+#define FV_SPKVOLL_P12DB                     0x7F
+#define FV_SPKVOLL_N77PT25DB                 0x8
+#define FV_SPKVOLL_MUTE                      0x0
+
+/* Register Masks */
+#define RM_SPKVOLL                           RM(FM_SPKVOLL, FB_SPKVOLL)
+
+/* Register Values */
+#define RV_SPKVOLL_P12DB                     RV(FV_SPKVOLL_P12DB, FB_SPKVOLL)
+#define RV_SPKVOLL_N77PT25DB \
+	 RV(FV_SPKVOLL_N77PT25DB, FB_SPKVOLL)
+
+#define RV_SPKVOLL_MUTE                      RV(FV_SPKVOLL_MUTE, FB_SPKVOLL)
+
+/*****************************
+ *      R_SPKVOLR (0x3)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_SPKVOLR                           0
+
+/* Field Masks */
+#define FM_SPKVOLR                           0X7F
+
+/* Field Values */
+#define FV_SPKVOLR_P12DB                     0x7F
+#define FV_SPKVOLR_N77PT25DB                 0x8
+#define FV_SPKVOLR_MUTE                      0x0
+
+/* Register Masks */
+#define RM_SPKVOLR                           RM(FM_SPKVOLR, FB_SPKVOLR)
+
+/* Register Values */
+#define RV_SPKVOLR_P12DB                     RV(FV_SPKVOLR_P12DB, FB_SPKVOLR)
+#define RV_SPKVOLR_N77PT25DB \
+	 RV(FV_SPKVOLR_N77PT25DB, FB_SPKVOLR)
+
+#define RV_SPKVOLR_MUTE                      RV(FV_SPKVOLR_MUTE, FB_SPKVOLR)
+
+/*****************************
+ *      R_DACVOLL (0x4)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_DACVOLL                           0
+
+/* Field Masks */
+#define FM_DACVOLL                           0XFF
+
+/* Field Values */
+#define FV_DACVOLL_0DB                       0xFF
+#define FV_DACVOLL_N95PT625DB                0x1
+#define FV_DACVOLL_MUTE                      0x0
+
+/* Register Masks */
+#define RM_DACVOLL                           RM(FM_DACVOLL, FB_DACVOLL)
+
+/* Register Values */
+#define RV_DACVOLL_0DB                       RV(FV_DACVOLL_0DB, FB_DACVOLL)
+#define RV_DACVOLL_N95PT625DB \
+	 RV(FV_DACVOLL_N95PT625DB, FB_DACVOLL)
+
+#define RV_DACVOLL_MUTE                      RV(FV_DACVOLL_MUTE, FB_DACVOLL)
+
+/*****************************
+ *      R_DACVOLR (0x5)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_DACVOLR                           0
+
+/* Field Masks */
+#define FM_DACVOLR                           0XFF
+
+/* Field Values */
+#define FV_DACVOLR_0DB                       0xFF
+#define FV_DACVOLR_N95PT625DB                0x1
+#define FV_DACVOLR_MUTE                      0x0
+
+/* Register Masks */
+#define RM_DACVOLR                           RM(FM_DACVOLR, FB_DACVOLR)
+
+/* Register Values */
+#define RV_DACVOLR_0DB                       RV(FV_DACVOLR_0DB, FB_DACVOLR)
+#define RV_DACVOLR_N95PT625DB \
+	 RV(FV_DACVOLR_N95PT625DB, FB_DACVOLR)
+
+#define RV_DACVOLR_MUTE                      RV(FV_DACVOLR_MUTE, FB_DACVOLR)
+
+/*****************************
+ *      R_ADCVOLL (0x6)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_ADCVOLL                           0
+
+/* Field Masks */
+#define FM_ADCVOLL                           0XFF
+
+/* Field Values */
+#define FV_ADCVOLL_P24DB                     0xFF
+#define FV_ADCVOLL_N71PT25DB                 0x1
+#define FV_ADCVOLL_MUTE                      0x0
+
+/* Register Masks */
+#define RM_ADCVOLL                           RM(FM_ADCVOLL, FB_ADCVOLL)
+
+/* Register Values */
+#define RV_ADCVOLL_P24DB                     RV(FV_ADCVOLL_P24DB, FB_ADCVOLL)
+#define RV_ADCVOLL_N71PT25DB \
+	 RV(FV_ADCVOLL_N71PT25DB, FB_ADCVOLL)
+
+#define RV_ADCVOLL_MUTE                      RV(FV_ADCVOLL_MUTE, FB_ADCVOLL)
+
+/*****************************
+ *      R_ADCVOLR (0x7)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_ADCVOLR                           0
+
+/* Field Masks */
+#define FM_ADCVOLR                           0XFF
+
+/* Field Values */
+#define FV_ADCVOLR_P24DB                     0xFF
+#define FV_ADCVOLR_N71PT25DB                 0x1
+#define FV_ADCVOLR_MUTE                      0x0
+
+/* Register Masks */
+#define RM_ADCVOLR                           RM(FM_ADCVOLR, FB_ADCVOLR)
+
+/* Register Values */
+#define RV_ADCVOLR_P24DB                     RV(FV_ADCVOLR_P24DB, FB_ADCVOLR)
+#define RV_ADCVOLR_N71PT25DB \
+	 RV(FV_ADCVOLR_N71PT25DB, FB_ADCVOLR)
+
+#define RV_ADCVOLR_MUTE                      RV(FV_ADCVOLR_MUTE, FB_ADCVOLR)
+
+/****************************
+ *      R_INVOLL (0x8)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_INVOLL_INMUTEL                    7
+#define FB_INVOLL_IZCL                       6
+#define FB_INVOLL                            0
+
+/* Field Masks */
+#define FM_INVOLL_INMUTEL                    0X1
+#define FM_INVOLL_IZCL                       0X1
+#define FM_INVOLL                            0X3F
+
+/* Field Values */
+#define FV_INVOLL_INMUTEL_ENABLE             0x1
+#define FV_INVOLL_INMUTEL_DISABLE            0x0
+#define FV_INVOLL_IZCL_ENABLE                0x1
+#define FV_INVOLL_IZCL_DISABLE               0x0
+#define FV_INVOLL_P30DB                      0x3F
+#define FV_INVOLL_N17PT25DB                  0x0
+
+/* Register Masks */
+#define RM_INVOLL_INMUTEL \
+	 RM(FM_INVOLL_INMUTEL, FB_INVOLL_INMUTEL)
+
+#define RM_INVOLL_IZCL                       RM(FM_INVOLL_IZCL, FB_INVOLL_IZCL)
+#define RM_INVOLL                            RM(FM_INVOLL, FB_INVOLL)
+
+/* Register Values */
+#define RV_INVOLL_INMUTEL_ENABLE \
+	 RV(FV_INVOLL_INMUTEL_ENABLE, FB_INVOLL_INMUTEL)
+
+#define RV_INVOLL_INMUTEL_DISABLE \
+	 RV(FV_INVOLL_INMUTEL_DISABLE, FB_INVOLL_INMUTEL)
+
+#define RV_INVOLL_IZCL_ENABLE \
+	 RV(FV_INVOLL_IZCL_ENABLE, FB_INVOLL_IZCL)
+
+#define RV_INVOLL_IZCL_DISABLE \
+	 RV(FV_INVOLL_IZCL_DISABLE, FB_INVOLL_IZCL)
+
+#define RV_INVOLL_P30DB                      RV(FV_INVOLL_P30DB, FB_INVOLL)
+#define RV_INVOLL_N17PT25DB                  RV(FV_INVOLL_N17PT25DB, FB_INVOLL)
+
+/****************************
+ *      R_INVOLR (0x9)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_INVOLR_INMUTER                    7
+#define FB_INVOLR_IZCR                       6
+#define FB_INVOLR                            0
+
+/* Field Masks */
+#define FM_INVOLR_INMUTER                    0X1
+#define FM_INVOLR_IZCR                       0X1
+#define FM_INVOLR                            0X3F
+
+/* Field Values */
+#define FV_INVOLR_INMUTER_ENABLE             0x1
+#define FV_INVOLR_INMUTER_DISABLE            0x0
+#define FV_INVOLR_IZCR_ENABLE                0x1
+#define FV_INVOLR_IZCR_DISABLE               0x0
+#define FV_INVOLR_P30DB                      0x3F
+#define FV_INVOLR_N17PT25DB                  0x0
+
+/* Register Masks */
+#define RM_INVOLR_INMUTER \
+	 RM(FM_INVOLR_INMUTER, FB_INVOLR_INMUTER)
+
+#define RM_INVOLR_IZCR                       RM(FM_INVOLR_IZCR, FB_INVOLR_IZCR)
+#define RM_INVOLR                            RM(FM_INVOLR, FB_INVOLR)
+
+/* Register Values */
+#define RV_INVOLR_INMUTER_ENABLE \
+	 RV(FV_INVOLR_INMUTER_ENABLE, FB_INVOLR_INMUTER)
+
+#define RV_INVOLR_INMUTER_DISABLE \
+	 RV(FV_INVOLR_INMUTER_DISABLE, FB_INVOLR_INMUTER)
+
+#define RV_INVOLR_IZCR_ENABLE \
+	 RV(FV_INVOLR_IZCR_ENABLE, FB_INVOLR_IZCR)
+
+#define RV_INVOLR_IZCR_DISABLE \
+	 RV(FV_INVOLR_IZCR_DISABLE, FB_INVOLR_IZCR)
+
+#define RV_INVOLR_P30DB                      RV(FV_INVOLR_P30DB, FB_INVOLR)
+#define RV_INVOLR_N17PT25DB                  RV(FV_INVOLR_N17PT25DB, FB_INVOLR)
+
+/*****************************
+ *      R_INMODE (0x0B)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_INMODE_DS                         0
+
+/* Field Masks */
+#define FM_INMODE_DS                         0X1
+
+/* Field Values */
+#define FV_INMODE_DS_LRIN1                   0x0
+#define FV_INMODE_DS_LRIN2                   0x1
+
+/* Register Masks */
+#define RM_INMODE_DS                         RM(FM_INMODE_DS, FB_INMODE_DS)
+
+/* Register Values */
+#define RV_INMODE_DS_LRIN1 \
+	 RV(FV_INMODE_DS_LRIN1, FB_INMODE_DS)
+
+#define RV_INMODE_DS_LRIN2 \
+	 RV(FV_INMODE_DS_LRIN2, FB_INMODE_DS)
+
+
+/*****************************
+ *      R_INSELL (0x0C)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_INSELL                            6
+#define FB_INSELL_MICBSTL                    4
+
+/* Field Masks */
+#define FM_INSELL                            0X3
+#define FM_INSELL_MICBSTL                    0X3
+
+/* Field Values */
+#define FV_INSELL_IN1                        0x0
+#define FV_INSELL_IN2                        0x1
+#define FV_INSELL_IN3                        0x2
+#define FV_INSELL_D2S                        0x3
+#define FV_INSELL_MICBSTL_OFF                0x0
+#define FV_INSELL_MICBSTL_10DB               0x1
+#define FV_INSELL_MICBSTL_20DB               0x2
+#define FV_INSELL_MICBSTL_30DB               0x3
+
+/* Register Masks */
+#define RM_INSELL                            RM(FM_INSELL, FB_INSELL)
+#define RM_INSELL_MICBSTL \
+	 RM(FM_INSELL_MICBSTL, FB_INSELL_MICBSTL)
+
+
+/* Register Values */
+#define RV_INSELL_IN1                        RV(FV_INSELL_IN1, FB_INSELL)
+#define RV_INSELL_IN2                        RV(FV_INSELL_IN2, FB_INSELL)
+#define RV_INSELL_IN3                        RV(FV_INSELL_IN3, FB_INSELL)
+#define RV_INSELL_D2S                        RV(FV_INSELL_D2S, FB_INSELL)
+#define RV_INSELL_MICBSTL_OFF \
+	 RV(FV_INSELL_MICBSTL_OFF, FB_INSELL_MICBSTL)
+
+#define RV_INSELL_MICBSTL_10DB \
+	 RV(FV_INSELL_MICBSTL_10DB, FB_INSELL_MICBSTL)
+
+#define RV_INSELL_MICBSTL_20DB \
+	 RV(FV_INSELL_MICBSTL_20DB, FB_INSELL_MICBSTL)
+
+#define RV_INSELL_MICBSTL_30DB \
+	 RV(FV_INSELL_MICBSTL_30DB, FB_INSELL_MICBSTL)
+
+
+/*****************************
+ *      R_INSELR (0x0D)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_INSELR                            6
+#define FB_INSELR_MICBSTR                    4
+
+/* Field Masks */
+#define FM_INSELR                            0X3
+#define FM_INSELR_MICBSTR                    0X3
+
+/* Field Values */
+#define FV_INSELR_IN1                        0x0
+#define FV_INSELR_IN2                        0x1
+#define FV_INSELR_IN3                        0x2
+#define FV_INSELR_D2S                        0x3
+#define FV_INSELR_MICBSTR_OFF                0x0
+#define FV_INSELR_MICBSTR_10DB               0x1
+#define FV_INSELR_MICBSTR_20DB               0x2
+#define FV_INSELR_MICBSTR_30DB               0x3
+
+/* Register Masks */
+#define RM_INSELR                            RM(FM_INSELR, FB_INSELR)
+#define RM_INSELR_MICBSTR \
+	 RM(FM_INSELR_MICBSTR, FB_INSELR_MICBSTR)
+
+
+/* Register Values */
+#define RV_INSELR_IN1                        RV(FV_INSELR_IN1, FB_INSELR)
+#define RV_INSELR_IN2                        RV(FV_INSELR_IN2, FB_INSELR)
+#define RV_INSELR_IN3                        RV(FV_INSELR_IN3, FB_INSELR)
+#define RV_INSELR_D2S                        RV(FV_INSELR_D2S, FB_INSELR)
+#define RV_INSELR_MICBSTR_OFF \
+	 RV(FV_INSELR_MICBSTR_OFF, FB_INSELR_MICBSTR)
+
+#define RV_INSELR_MICBSTR_10DB \
+	 RV(FV_INSELR_MICBSTR_10DB, FB_INSELR_MICBSTR)
+
+#define RV_INSELR_MICBSTR_20DB \
+	 RV(FV_INSELR_MICBSTR_20DB, FB_INSELR_MICBSTR)
+
+#define RV_INSELR_MICBSTR_30DB \
+	 RV(FV_INSELR_MICBSTR_30DB, FB_INSELR_MICBSTR)
+
+
+/***************************
+ *      R_AIC1 (0x13)      *
+ ***************************/
+
+/* Field Offsets */
+#define FB_AIC1_BCLKINV                      6
+#define FB_AIC1_MS                           5
+#define FB_AIC1_LRP                          4
+#define FB_AIC1_WL                           2
+#define FB_AIC1_FORMAT                       0
+
+/* Field Masks */
+#define FM_AIC1_BCLKINV                      0X1
+#define FM_AIC1_MS                           0X1
+#define FM_AIC1_LRP                          0X1
+#define FM_AIC1_WL                           0X3
+#define FM_AIC1_FORMAT                       0X3
+
+/* Field Values */
+#define FV_AIC1_BCLKINV_ENABLE               0x1
+#define FV_AIC1_BCLKINV_DISABLE              0x0
+#define FV_AIC1_MS_MASTER                    0x1
+#define FV_AIC1_MS_SLAVE                     0x0
+#define FV_AIC1_LRP_INVERT                   0x1
+#define FV_AIC1_LRP_NORMAL                   0x0
+#define FV_AIC1_WL_16                        0x0
+#define FV_AIC1_WL_20                        0x1
+#define FV_AIC1_WL_24                        0x2
+#define FV_AIC1_WL_32                        0x3
+#define FV_AIC1_FORMAT_RIGHT                 0x0
+#define FV_AIC1_FORMAT_LEFT                  0x1
+#define FV_AIC1_FORMAT_I2S                   0x2
+
+/* Register Masks */
+#define RM_AIC1_BCLKINV \
+	 RM(FM_AIC1_BCLKINV, FB_AIC1_BCLKINV)
+
+#define RM_AIC1_MS                           RM(FM_AIC1_MS, FB_AIC1_MS)
+#define RM_AIC1_LRP                          RM(FM_AIC1_LRP, FB_AIC1_LRP)
+#define RM_AIC1_WL                           RM(FM_AIC1_WL, FB_AIC1_WL)
+#define RM_AIC1_FORMAT                       RM(FM_AIC1_FORMAT, FB_AIC1_FORMAT)
+
+/* Register Values */
+#define RV_AIC1_BCLKINV_ENABLE \
+	 RV(FV_AIC1_BCLKINV_ENABLE, FB_AIC1_BCLKINV)
+
+#define RV_AIC1_BCLKINV_DISABLE \
+	 RV(FV_AIC1_BCLKINV_DISABLE, FB_AIC1_BCLKINV)
+
+#define RV_AIC1_MS_MASTER                    RV(FV_AIC1_MS_MASTER, FB_AIC1_MS)
+#define RV_AIC1_MS_SLAVE                     RV(FV_AIC1_MS_SLAVE, FB_AIC1_MS)
+#define RV_AIC1_LRP_INVERT \
+	 RV(FV_AIC1_LRP_INVERT, FB_AIC1_LRP)
+
+#define RV_AIC1_LRP_NORMAL \
+	 RV(FV_AIC1_LRP_NORMAL, FB_AIC1_LRP)
+
+#define RV_AIC1_WL_16                        RV(FV_AIC1_WL_16, FB_AIC1_WL)
+#define RV_AIC1_WL_20                        RV(FV_AIC1_WL_20, FB_AIC1_WL)
+#define RV_AIC1_WL_24                        RV(FV_AIC1_WL_24, FB_AIC1_WL)
+#define RV_AIC1_WL_32                        RV(FV_AIC1_WL_32, FB_AIC1_WL)
+#define RV_AIC1_FORMAT_RIGHT \
+	 RV(FV_AIC1_FORMAT_RIGHT, FB_AIC1_FORMAT)
+
+#define RV_AIC1_FORMAT_LEFT \
+	 RV(FV_AIC1_FORMAT_LEFT, FB_AIC1_FORMAT)
+
+#define RV_AIC1_FORMAT_I2S \
+	 RV(FV_AIC1_FORMAT_I2S, FB_AIC1_FORMAT)
+
+
+/***************************
+ *      R_AIC2 (0x14)      *
+ ***************************/
+
+/* Field Offsets */
+#define FB_AIC2_DACDSEL                      6
+#define FB_AIC2_ADCDSEL                      4
+#define FB_AIC2_TRI                          3
+#define FB_AIC2_BLRCM                        0
+
+/* Field Masks */
+#define FM_AIC2_DACDSEL                      0X3
+#define FM_AIC2_ADCDSEL                      0X3
+#define FM_AIC2_TRI                          0X1
+#define FM_AIC2_BLRCM                        0X7
+
+/* Field Values */
+#define FV_AIC2_BLRCM_DAC_BCLK_LRCLK_SHARED  0x3
+
+/* Register Masks */
+#define RM_AIC2_DACDSEL \
+	 RM(FM_AIC2_DACDSEL, FB_AIC2_DACDSEL)
+
+#define RM_AIC2_ADCDSEL \
+	 RM(FM_AIC2_ADCDSEL, FB_AIC2_ADCDSEL)
+
+#define RM_AIC2_TRI                          RM(FM_AIC2_TRI, FB_AIC2_TRI)
+#define RM_AIC2_BLRCM                        RM(FM_AIC2_BLRCM, FB_AIC2_BLRCM)
+
+/* Register Values */
+#define RV_AIC2_BLRCM_DAC_BCLK_LRCLK_SHARED \
+	 RV(FV_AIC2_BLRCM_DAC_BCLK_LRCLK_SHARED, FB_AIC2_BLRCM)
+
+
+/******************************
+ *      R_CNVRTR0 (0x16)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_CNVRTR0_ADCPOLR                   7
+#define FB_CNVRTR0_ADCPOLL                   6
+#define FB_CNVRTR0_AMONOMIX                  4
+#define FB_CNVRTR0_ADCMU                     3
+#define FB_CNVRTR0_HPOR                      2
+#define FB_CNVRTR0_ADCHPDR                   1
+#define FB_CNVRTR0_ADCHPDL                   0
+
+/* Field Masks */
+#define FM_CNVRTR0_ADCPOLR                   0X1
+#define FM_CNVRTR0_ADCPOLL                   0X1
+#define FM_CNVRTR0_AMONOMIX                  0X3
+#define FM_CNVRTR0_ADCMU                     0X1
+#define FM_CNVRTR0_HPOR                      0X1
+#define FM_CNVRTR0_ADCHPDR                   0X1
+#define FM_CNVRTR0_ADCHPDL                   0X1
+
+/* Field Values */
+#define FV_CNVRTR0_ADCPOLR_INVERT            0x1
+#define FV_CNVRTR0_ADCPOLR_NORMAL            0x0
+#define FV_CNVRTR0_ADCPOLL_INVERT            0x1
+#define FV_CNVRTR0_ADCPOLL_NORMAL            0x0
+#define FV_CNVRTR0_ADCMU_ENABLE              0x1
+#define FV_CNVRTR0_ADCMU_DISABLE             0x0
+#define FV_CNVRTR0_ADCHPDR_ENABLE            0x1
+#define FV_CNVRTR0_ADCHPDR_DISABLE           0x0
+#define FV_CNVRTR0_ADCHPDL_ENABLE            0x1
+#define FV_CNVRTR0_ADCHPDL_DISABLE           0x0
+
+/* Register Masks */
+#define RM_CNVRTR0_ADCPOLR \
+	 RM(FM_CNVRTR0_ADCPOLR, FB_CNVRTR0_ADCPOLR)
+
+#define RM_CNVRTR0_ADCPOLL \
+	 RM(FM_CNVRTR0_ADCPOLL, FB_CNVRTR0_ADCPOLL)
+
+#define RM_CNVRTR0_AMONOMIX \
+	 RM(FM_CNVRTR0_AMONOMIX, FB_CNVRTR0_AMONOMIX)
+
+#define RM_CNVRTR0_ADCMU \
+	 RM(FM_CNVRTR0_ADCMU, FB_CNVRTR0_ADCMU)
+
+#define RM_CNVRTR0_HPOR \
+	 RM(FM_CNVRTR0_HPOR, FB_CNVRTR0_HPOR)
+
+#define RM_CNVRTR0_ADCHPDR \
+	 RM(FM_CNVRTR0_ADCHPDR, FB_CNVRTR0_ADCHPDR)
+
+#define RM_CNVRTR0_ADCHPDL \
+	 RM(FM_CNVRTR0_ADCHPDL, FB_CNVRTR0_ADCHPDL)
+
+
+/* Register Values */
+#define RV_CNVRTR0_ADCPOLR_INVERT \
+	 RV(FV_CNVRTR0_ADCPOLR_INVERT, FB_CNVRTR0_ADCPOLR)
+
+#define RV_CNVRTR0_ADCPOLR_NORMAL \
+	 RV(FV_CNVRTR0_ADCPOLR_NORMAL, FB_CNVRTR0_ADCPOLR)
+
+#define RV_CNVRTR0_ADCPOLL_INVERT \
+	 RV(FV_CNVRTR0_ADCPOLL_INVERT, FB_CNVRTR0_ADCPOLL)
+
+#define RV_CNVRTR0_ADCPOLL_NORMAL \
+	 RV(FV_CNVRTR0_ADCPOLL_NORMAL, FB_CNVRTR0_ADCPOLL)
+
+#define RV_CNVRTR0_ADCMU_ENABLE \
+	 RV(FV_CNVRTR0_ADCMU_ENABLE, FB_CNVRTR0_ADCMU)
+
+#define RV_CNVRTR0_ADCMU_DISABLE \
+	 RV(FV_CNVRTR0_ADCMU_DISABLE, FB_CNVRTR0_ADCMU)
+
+#define RV_CNVRTR0_ADCHPDR_ENABLE \
+	 RV(FV_CNVRTR0_ADCHPDR_ENABLE, FB_CNVRTR0_ADCHPDR)
+
+#define RV_CNVRTR0_ADCHPDR_DISABLE \
+	 RV(FV_CNVRTR0_ADCHPDR_DISABLE, FB_CNVRTR0_ADCHPDR)
+
+#define RV_CNVRTR0_ADCHPDL_ENABLE \
+	 RV(FV_CNVRTR0_ADCHPDL_ENABLE, FB_CNVRTR0_ADCHPDL)
+
+#define RV_CNVRTR0_ADCHPDL_DISABLE \
+	 RV(FV_CNVRTR0_ADCHPDL_DISABLE, FB_CNVRTR0_ADCHPDL)
+
+
+/****************************
+ *      R_ADCSR (0x17)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_ADCSR_ABCM                        6
+#define FB_ADCSR_ABR                         3
+#define FB_ADCSR_ABM                         0
+
+/* Field Masks */
+#define FM_ADCSR_ABCM                        0X3
+#define FM_ADCSR_ABR                         0X3
+#define FM_ADCSR_ABM                         0X7
+
+/* Field Values */
+#define FV_ADCSR_ABCM_AUTO                   0x0
+#define FV_ADCSR_ABCM_32                     0x1
+#define FV_ADCSR_ABCM_40                     0x2
+#define FV_ADCSR_ABCM_64                     0x3
+#define FV_ADCSR_ABR_32                      0x0
+#define FV_ADCSR_ABR_44_1                    0x1
+#define FV_ADCSR_ABR_48                      0x2
+#define FV_ADCSR_ABM_PT25                    0x0
+#define FV_ADCSR_ABM_PT5                     0x1
+#define FV_ADCSR_ABM_1                       0x2
+#define FV_ADCSR_ABM_2                       0x3
+
+/* Register Masks */
+#define RM_ADCSR_ABCM                        RM(FM_ADCSR_ABCM, FB_ADCSR_ABCM)
+#define RM_ADCSR_ABR                         RM(FM_ADCSR_ABR, FB_ADCSR_ABR)
+#define RM_ADCSR_ABM                         RM(FM_ADCSR_ABM, FB_ADCSR_ABM)
+
+/* Register Values */
+#define RV_ADCSR_ABCM_AUTO \
+	 RV(FV_ADCSR_ABCM_AUTO, FB_ADCSR_ABCM)
+
+#define RV_ADCSR_ABCM_32 \
+	 RV(FV_ADCSR_ABCM_32, FB_ADCSR_ABCM)
+
+#define RV_ADCSR_ABCM_40 \
+	 RV(FV_ADCSR_ABCM_40, FB_ADCSR_ABCM)
+
+#define RV_ADCSR_ABCM_64 \
+	 RV(FV_ADCSR_ABCM_64, FB_ADCSR_ABCM)
+
+#define RV_ADCSR_ABR_32                      RV(FV_ADCSR_ABR_32, FB_ADCSR_ABR)
+#define RV_ADCSR_ABR_44_1 \
+	 RV(FV_ADCSR_ABR_44_1, FB_ADCSR_ABR)
+
+#define RV_ADCSR_ABR_48                      RV(FV_ADCSR_ABR_48, FB_ADCSR_ABR)
+#define RV_ADCSR_ABR_                        RV(FV_ADCSR_ABR_, FB_ADCSR_ABR)
+#define RV_ADCSR_ABM_PT25 \
+	 RV(FV_ADCSR_ABM_PT25, FB_ADCSR_ABM)
+
+#define RV_ADCSR_ABM_PT5                     RV(FV_ADCSR_ABM_PT5, FB_ADCSR_ABM)
+#define RV_ADCSR_ABM_1                       RV(FV_ADCSR_ABM_1, FB_ADCSR_ABM)
+#define RV_ADCSR_ABM_2                       RV(FV_ADCSR_ABM_2, FB_ADCSR_ABM)
+
+/******************************
+ *      R_CNVRTR1 (0x18)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_CNVRTR1_DACPOLR                   7
+#define FB_CNVRTR1_DACPOLL                   6
+#define FB_CNVRTR1_DMONOMIX                  4
+#define FB_CNVRTR1_DACMU                     3
+#define FB_CNVRTR1_DEEMPH                    2
+#define FB_CNVRTR1_DACDITH                   0
+
+/* Field Masks */
+#define FM_CNVRTR1_DACPOLR                   0X1
+#define FM_CNVRTR1_DACPOLL                   0X1
+#define FM_CNVRTR1_DMONOMIX                  0X3
+#define FM_CNVRTR1_DACMU                     0X1
+#define FM_CNVRTR1_DEEMPH                    0X1
+#define FM_CNVRTR1_DACDITH                   0X3
+
+/* Field Values */
+#define FV_CNVRTR1_DACPOLR_INVERT            0x1
+#define FV_CNVRTR1_DACPOLR_NORMAL            0x0
+#define FV_CNVRTR1_DACPOLL_INVERT            0x1
+#define FV_CNVRTR1_DACPOLL_NORMAL            0x0
+#define FV_CNVRTR1_DMONOMIX_ENABLE           0x1
+#define FV_CNVRTR1_DMONOMIX_DISABLE          0x0
+#define FV_CNVRTR1_DACMU_ENABLE              0x1
+#define FV_CNVRTR1_DACMU_DISABLE             0x0
+
+/* Register Masks */
+#define RM_CNVRTR1_DACPOLR \
+	 RM(FM_CNVRTR1_DACPOLR, FB_CNVRTR1_DACPOLR)
+
+#define RM_CNVRTR1_DACPOLL \
+	 RM(FM_CNVRTR1_DACPOLL, FB_CNVRTR1_DACPOLL)
+
+#define RM_CNVRTR1_DMONOMIX \
+	 RM(FM_CNVRTR1_DMONOMIX, FB_CNVRTR1_DMONOMIX)
+
+#define RM_CNVRTR1_DACMU \
+	 RM(FM_CNVRTR1_DACMU, FB_CNVRTR1_DACMU)
+
+#define RM_CNVRTR1_DEEMPH \
+	 RM(FM_CNVRTR1_DEEMPH, FB_CNVRTR1_DEEMPH)
+
+#define RM_CNVRTR1_DACDITH \
+	 RM(FM_CNVRTR1_DACDITH, FB_CNVRTR1_DACDITH)
+
+
+/* Register Values */
+#define RV_CNVRTR1_DACPOLR_INVERT \
+	 RV(FV_CNVRTR1_DACPOLR_INVERT, FB_CNVRTR1_DACPOLR)
+
+#define RV_CNVRTR1_DACPOLR_NORMAL \
+	 RV(FV_CNVRTR1_DACPOLR_NORMAL, FB_CNVRTR1_DACPOLR)
+
+#define RV_CNVRTR1_DACPOLL_INVERT \
+	 RV(FV_CNVRTR1_DACPOLL_INVERT, FB_CNVRTR1_DACPOLL)
+
+#define RV_CNVRTR1_DACPOLL_NORMAL \
+	 RV(FV_CNVRTR1_DACPOLL_NORMAL, FB_CNVRTR1_DACPOLL)
+
+#define RV_CNVRTR1_DMONOMIX_ENABLE \
+	 RV(FV_CNVRTR1_DMONOMIX_ENABLE, FB_CNVRTR1_DMONOMIX)
+
+#define RV_CNVRTR1_DMONOMIX_DISABLE \
+	 RV(FV_CNVRTR1_DMONOMIX_DISABLE, FB_CNVRTR1_DMONOMIX)
+
+#define RV_CNVRTR1_DACMU_ENABLE \
+	 RV(FV_CNVRTR1_DACMU_ENABLE, FB_CNVRTR1_DACMU)
+
+#define RV_CNVRTR1_DACMU_DISABLE \
+	 RV(FV_CNVRTR1_DACMU_DISABLE, FB_CNVRTR1_DACMU)
+
+
+/****************************
+ *      R_DACSR (0x19)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_DACSR_DBCM                        6
+#define FB_DACSR_DBR                         3
+#define FB_DACSR_DBM                         0
+
+/* Field Masks */
+#define FM_DACSR_DBCM                        0X3
+#define FM_DACSR_DBR                         0X3
+#define FM_DACSR_DBM                         0X7
+
+/* Field Values */
+#define FV_DACSR_DBCM_AUTO                   0x0
+#define FV_DACSR_DBCM_32                     0x1
+#define FV_DACSR_DBCM_40                     0x2
+#define FV_DACSR_DBCM_64                     0x3
+#define FV_DACSR_DBR_32                      0x0
+#define FV_DACSR_DBR_44_1                    0x1
+#define FV_DACSR_DBR_48                      0x2
+#define FV_DACSR_DBM_PT25                    0x0
+#define FV_DACSR_DBM_PT5                     0x1
+#define FV_DACSR_DBM_1                       0x2
+#define FV_DACSR_DBM_2                       0x3
+
+/* Register Masks */
+#define RM_DACSR_DBCM                        RM(FM_DACSR_DBCM, FB_DACSR_DBCM)
+#define RM_DACSR_DBR                         RM(FM_DACSR_DBR, FB_DACSR_DBR)
+#define RM_DACSR_DBM                         RM(FM_DACSR_DBM, FB_DACSR_DBM)
+
+/* Register Values */
+#define RV_DACSR_DBCM_AUTO \
+	 RV(FV_DACSR_DBCM_AUTO, FB_DACSR_DBCM)
+
+#define RV_DACSR_DBCM_32 \
+	 RV(FV_DACSR_DBCM_32, FB_DACSR_DBCM)
+
+#define RV_DACSR_DBCM_40 \
+	 RV(FV_DACSR_DBCM_40, FB_DACSR_DBCM)
+
+#define RV_DACSR_DBCM_64 \
+	 RV(FV_DACSR_DBCM_64, FB_DACSR_DBCM)
+
+#define RV_DACSR_DBR_32                      RV(FV_DACSR_DBR_32, FB_DACSR_DBR)
+#define RV_DACSR_DBR_44_1 \
+	 RV(FV_DACSR_DBR_44_1, FB_DACSR_DBR)
+
+#define RV_DACSR_DBR_48                      RV(FV_DACSR_DBR_48, FB_DACSR_DBR)
+#define RV_DACSR_DBM_PT25 \
+	 RV(FV_DACSR_DBM_PT25, FB_DACSR_DBM)
+
+#define RV_DACSR_DBM_PT5                     RV(FV_DACSR_DBM_PT5, FB_DACSR_DBM)
+#define RV_DACSR_DBM_1                       RV(FV_DACSR_DBM_1, FB_DACSR_DBM)
+#define RV_DACSR_DBM_2                       RV(FV_DACSR_DBM_2, FB_DACSR_DBM)
+
+/****************************
+ *      R_PWRM1 (0x1A)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_PWRM1_BSTL                        7
+#define FB_PWRM1_BSTR                        6
+#define FB_PWRM1_PGAL                        5
+#define FB_PWRM1_PGAR                        4
+#define FB_PWRM1_ADCL                        3
+#define FB_PWRM1_ADCR                        2
+#define FB_PWRM1_MICB                        1
+#define FB_PWRM1_DIGENB                      0
+
+/* Field Masks */
+#define FM_PWRM1_BSTL                        0X1
+#define FM_PWRM1_BSTR                        0X1
+#define FM_PWRM1_PGAL                        0X1
+#define FM_PWRM1_PGAR                        0X1
+#define FM_PWRM1_ADCL                        0X1
+#define FM_PWRM1_ADCR                        0X1
+#define FM_PWRM1_MICB                        0X1
+#define FM_PWRM1_DIGENB                      0X1
+
+/* Field Values */
+#define FV_PWRM1_BSTL_ENABLE                 0x1
+#define FV_PWRM1_BSTL_DISABLE                0x0
+#define FV_PWRM1_BSTR_ENABLE                 0x1
+#define FV_PWRM1_BSTR_DISABLE                0x0
+#define FV_PWRM1_PGAL_ENABLE                 0x1
+#define FV_PWRM1_PGAL_DISABLE                0x0
+#define FV_PWRM1_PGAR_ENABLE                 0x1
+#define FV_PWRM1_PGAR_DISABLE                0x0
+#define FV_PWRM1_ADCL_ENABLE                 0x1
+#define FV_PWRM1_ADCL_DISABLE                0x0
+#define FV_PWRM1_ADCR_ENABLE                 0x1
+#define FV_PWRM1_ADCR_DISABLE                0x0
+#define FV_PWRM1_MICB_ENABLE                 0x1
+#define FV_PWRM1_MICB_DISABLE                0x0
+#define FV_PWRM1_DIGENB_DISABLE              0x1
+#define FV_PWRM1_DIGENB_ENABLE               0x0
+
+/* Register Masks */
+#define RM_PWRM1_BSTL                        RM(FM_PWRM1_BSTL, FB_PWRM1_BSTL)
+#define RM_PWRM1_BSTR                        RM(FM_PWRM1_BSTR, FB_PWRM1_BSTR)
+#define RM_PWRM1_PGAL                        RM(FM_PWRM1_PGAL, FB_PWRM1_PGAL)
+#define RM_PWRM1_PGAR                        RM(FM_PWRM1_PGAR, FB_PWRM1_PGAR)
+#define RM_PWRM1_ADCL                        RM(FM_PWRM1_ADCL, FB_PWRM1_ADCL)
+#define RM_PWRM1_ADCR                        RM(FM_PWRM1_ADCR, FB_PWRM1_ADCR)
+#define RM_PWRM1_MICB                        RM(FM_PWRM1_MICB, FB_PWRM1_MICB)
+#define RM_PWRM1_DIGENB \
+	 RM(FM_PWRM1_DIGENB, FB_PWRM1_DIGENB)
+
+
+/* Register Values */
+#define RV_PWRM1_BSTL_ENABLE \
+	 RV(FV_PWRM1_BSTL_ENABLE, FB_PWRM1_BSTL)
+
+#define RV_PWRM1_BSTL_DISABLE \
+	 RV(FV_PWRM1_BSTL_DISABLE, FB_PWRM1_BSTL)
+
+#define RV_PWRM1_BSTR_ENABLE \
+	 RV(FV_PWRM1_BSTR_ENABLE, FB_PWRM1_BSTR)
+
+#define RV_PWRM1_BSTR_DISABLE \
+	 RV(FV_PWRM1_BSTR_DISABLE, FB_PWRM1_BSTR)
+
+#define RV_PWRM1_PGAL_ENABLE \
+	 RV(FV_PWRM1_PGAL_ENABLE, FB_PWRM1_PGAL)
+
+#define RV_PWRM1_PGAL_DISABLE \
+	 RV(FV_PWRM1_PGAL_DISABLE, FB_PWRM1_PGAL)
+
+#define RV_PWRM1_PGAR_ENABLE \
+	 RV(FV_PWRM1_PGAR_ENABLE, FB_PWRM1_PGAR)
+
+#define RV_PWRM1_PGAR_DISABLE \
+	 RV(FV_PWRM1_PGAR_DISABLE, FB_PWRM1_PGAR)
+
+#define RV_PWRM1_ADCL_ENABLE \
+	 RV(FV_PWRM1_ADCL_ENABLE, FB_PWRM1_ADCL)
+
+#define RV_PWRM1_ADCL_DISABLE \
+	 RV(FV_PWRM1_ADCL_DISABLE, FB_PWRM1_ADCL)
+
+#define RV_PWRM1_ADCR_ENABLE \
+	 RV(FV_PWRM1_ADCR_ENABLE, FB_PWRM1_ADCR)
+
+#define RV_PWRM1_ADCR_DISABLE \
+	 RV(FV_PWRM1_ADCR_DISABLE, FB_PWRM1_ADCR)
+
+#define RV_PWRM1_MICB_ENABLE \
+	 RV(FV_PWRM1_MICB_ENABLE, FB_PWRM1_MICB)
+
+#define RV_PWRM1_MICB_DISABLE \
+	 RV(FV_PWRM1_MICB_DISABLE, FB_PWRM1_MICB)
+
+#define RV_PWRM1_DIGENB_DISABLE \
+	 RV(FV_PWRM1_DIGENB_DISABLE, FB_PWRM1_DIGENB)
+
+#define RV_PWRM1_DIGENB_ENABLE \
+	 RV(FV_PWRM1_DIGENB_ENABLE, FB_PWRM1_DIGENB)
+
+
+/****************************
+ *      R_PWRM2 (0x1B)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_PWRM2_D2S                         7
+#define FB_PWRM2_HPL                         6
+#define FB_PWRM2_HPR                         5
+#define FB_PWRM2_SPKL                        4
+#define FB_PWRM2_SPKR                        3
+#define FB_PWRM2_INSELL                      2
+#define FB_PWRM2_INSELR                      1
+#define FB_PWRM2_VREF                        0
+
+/* Field Masks */
+#define FM_PWRM2_D2S                         0X1
+#define FM_PWRM2_HPL                         0X1
+#define FM_PWRM2_HPR                         0X1
+#define FM_PWRM2_SPKL                        0X1
+#define FM_PWRM2_SPKR                        0X1
+#define FM_PWRM2_INSELL                      0X1
+#define FM_PWRM2_INSELR                      0X1
+#define FM_PWRM2_VREF                        0X1
+
+/* Field Values */
+#define FV_PWRM2_D2S_ENABLE                  0x1
+#define FV_PWRM2_D2S_DISABLE                 0x0
+#define FV_PWRM2_HPL_ENABLE                  0x1
+#define FV_PWRM2_HPL_DISABLE                 0x0
+#define FV_PWRM2_HPR_ENABLE                  0x1
+#define FV_PWRM2_HPR_DISABLE                 0x0
+#define FV_PWRM2_SPKL_ENABLE                 0x1
+#define FV_PWRM2_SPKL_DISABLE                0x0
+#define FV_PWRM2_SPKR_ENABLE                 0x1
+#define FV_PWRM2_SPKR_DISABLE                0x0
+#define FV_PWRM2_INSELL_ENABLE               0x1
+#define FV_PWRM2_INSELL_DISABLE              0x0
+#define FV_PWRM2_INSELR_ENABLE               0x1
+#define FV_PWRM2_INSELR_DISABLE              0x0
+#define FV_PWRM2_VREF_ENABLE                 0x1
+#define FV_PWRM2_VREF_DISABLE                0x0
+
+/* Register Masks */
+#define RM_PWRM2_D2S                         RM(FM_PWRM2_D2S, FB_PWRM2_D2S)
+#define RM_PWRM2_HPL                         RM(FM_PWRM2_HPL, FB_PWRM2_HPL)
+#define RM_PWRM2_HPR                         RM(FM_PWRM2_HPR, FB_PWRM2_HPR)
+#define RM_PWRM2_SPKL                        RM(FM_PWRM2_SPKL, FB_PWRM2_SPKL)
+#define RM_PWRM2_SPKR                        RM(FM_PWRM2_SPKR, FB_PWRM2_SPKR)
+#define RM_PWRM2_INSELL \
+	 RM(FM_PWRM2_INSELL, FB_PWRM2_INSELL)
+
+#define RM_PWRM2_INSELR \
+	 RM(FM_PWRM2_INSELR, FB_PWRM2_INSELR)
+
+#define RM_PWRM2_VREF                        RM(FM_PWRM2_VREF, FB_PWRM2_VREF)
+
+/* Register Values */
+#define RV_PWRM2_D2S_ENABLE \
+	 RV(FV_PWRM2_D2S_ENABLE, FB_PWRM2_D2S)
+
+#define RV_PWRM2_D2S_DISABLE \
+	 RV(FV_PWRM2_D2S_DISABLE, FB_PWRM2_D2S)
+
+#define RV_PWRM2_HPL_ENABLE \
+	 RV(FV_PWRM2_HPL_ENABLE, FB_PWRM2_HPL)
+
+#define RV_PWRM2_HPL_DISABLE \
+	 RV(FV_PWRM2_HPL_DISABLE, FB_PWRM2_HPL)
+
+#define RV_PWRM2_HPR_ENABLE \
+	 RV(FV_PWRM2_HPR_ENABLE, FB_PWRM2_HPR)
+
+#define RV_PWRM2_HPR_DISABLE \
+	 RV(FV_PWRM2_HPR_DISABLE, FB_PWRM2_HPR)
+
+#define RV_PWRM2_SPKL_ENABLE \
+	 RV(FV_PWRM2_SPKL_ENABLE, FB_PWRM2_SPKL)
+
+#define RV_PWRM2_SPKL_DISABLE \
+	 RV(FV_PWRM2_SPKL_DISABLE, FB_PWRM2_SPKL)
+
+#define RV_PWRM2_SPKR_ENABLE \
+	 RV(FV_PWRM2_SPKR_ENABLE, FB_PWRM2_SPKR)
+
+#define RV_PWRM2_SPKR_DISABLE \
+	 RV(FV_PWRM2_SPKR_DISABLE, FB_PWRM2_SPKR)
+
+#define RV_PWRM2_INSELL_ENABLE \
+	 RV(FV_PWRM2_INSELL_ENABLE, FB_PWRM2_INSELL)
+
+#define RV_PWRM2_INSELL_DISABLE \
+	 RV(FV_PWRM2_INSELL_DISABLE, FB_PWRM2_INSELL)
+
+#define RV_PWRM2_INSELR_ENABLE \
+	 RV(FV_PWRM2_INSELR_ENABLE, FB_PWRM2_INSELR)
+
+#define RV_PWRM2_INSELR_DISABLE \
+	 RV(FV_PWRM2_INSELR_DISABLE, FB_PWRM2_INSELR)
+
+#define RV_PWRM2_VREF_ENABLE \
+	 RV(FV_PWRM2_VREF_ENABLE, FB_PWRM2_VREF)
+
+#define RV_PWRM2_VREF_DISABLE \
+	 RV(FV_PWRM2_VREF_DISABLE, FB_PWRM2_VREF)
+
+
+/******************************
+ *      R_CONFIG0 (0x1F)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_CONFIG0_ASDM                      6
+#define FB_CONFIG0_DSDM                      4
+#define FB_CONFIG0_DC_BYPASS                 1
+#define FB_CONFIG0_SD_FORCE_ON               0
+
+/* Field Masks */
+#define FM_CONFIG0_ASDM                      0X3
+#define FM_CONFIG0_DSDM                      0X3
+#define FM_CONFIG0_DC_BYPASS                 0X1
+#define FM_CONFIG0_SD_FORCE_ON               0X1
+
+/* Field Values */
+#define FV_CONFIG0_ASDM_HALF                 0x1
+#define FV_CONFIG0_ASDM_FULL                 0x2
+#define FV_CONFIG0_ASDM_AUTO                 0x3
+#define FV_CONFIG0_DSDM_HALF                 0x1
+#define FV_CONFIG0_DSDM_FULL                 0x2
+#define FV_CONFIG0_DSDM_AUTO                 0x3
+#define FV_CONFIG0_DC_BYPASS_ENABLE          0x1
+#define FV_CONFIG0_DC_BYPASS_DISABLE         0x0
+#define FV_CONFIG0_SD_FORCE_ON_ENABLE        0x1
+#define FV_CONFIG0_SD_FORCE_ON_DISABLE       0x0
+
+/* Register Masks */
+#define RM_CONFIG0_ASDM \
+	 RM(FM_CONFIG0_ASDM, FB_CONFIG0_ASDM)
+
+#define RM_CONFIG0_DSDM \
+	 RM(FM_CONFIG0_DSDM, FB_CONFIG0_DSDM)
+
+#define RM_CONFIG0_DC_BYPASS \
+	 RM(FM_CONFIG0_DC_BYPASS, FB_CONFIG0_DC_BYPASS)
+
+#define RM_CONFIG0_SD_FORCE_ON \
+	 RM(FM_CONFIG0_SD_FORCE_ON, FB_CONFIG0_SD_FORCE_ON)
+
+
+/* Register Values */
+#define RV_CONFIG0_ASDM_HALF \
+	 RV(FV_CONFIG0_ASDM_HALF, FB_CONFIG0_ASDM)
+
+#define RV_CONFIG0_ASDM_FULL \
+	 RV(FV_CONFIG0_ASDM_FULL, FB_CONFIG0_ASDM)
+
+#define RV_CONFIG0_ASDM_AUTO \
+	 RV(FV_CONFIG0_ASDM_AUTO, FB_CONFIG0_ASDM)
+
+#define RV_CONFIG0_DSDM_HALF \
+	 RV(FV_CONFIG0_DSDM_HALF, FB_CONFIG0_DSDM)
+
+#define RV_CONFIG0_DSDM_FULL \
+	 RV(FV_CONFIG0_DSDM_FULL, FB_CONFIG0_DSDM)
+
+#define RV_CONFIG0_DSDM_AUTO \
+	 RV(FV_CONFIG0_DSDM_AUTO, FB_CONFIG0_DSDM)
+
+#define RV_CONFIG0_DC_BYPASS_ENABLE \
+	 RV(FV_CONFIG0_DC_BYPASS_ENABLE, FB_CONFIG0_DC_BYPASS)
+
+#define RV_CONFIG0_DC_BYPASS_DISABLE \
+	 RV(FV_CONFIG0_DC_BYPASS_DISABLE, FB_CONFIG0_DC_BYPASS)
+
+#define RV_CONFIG0_SD_FORCE_ON_ENABLE \
+	 RV(FV_CONFIG0_SD_FORCE_ON_ENABLE, FB_CONFIG0_SD_FORCE_ON)
+
+#define RV_CONFIG0_SD_FORCE_ON_DISABLE \
+	 RV(FV_CONFIG0_SD_FORCE_ON_DISABLE, FB_CONFIG0_SD_FORCE_ON)
+
+
+/******************************
+ *      R_CONFIG1 (0x20)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_CONFIG1_EQ2_EN                    7
+#define FB_CONFIG1_EQ2_BE                    4
+#define FB_CONFIG1_EQ1_EN                    3
+#define FB_CONFIG1_EQ1_BE                    0
+
+/* Field Masks */
+#define FM_CONFIG1_EQ2_EN                    0X1
+#define FM_CONFIG1_EQ2_BE                    0X7
+#define FM_CONFIG1_EQ1_EN                    0X1
+#define FM_CONFIG1_EQ1_BE                    0X7
+
+/* Field Values */
+#define FV_CONFIG1_EQ2_EN_ENABLE             0x1
+#define FV_CONFIG1_EQ2_EN_DISABLE            0x0
+#define FV_CONFIG1_EQ2_BE_PRE                0x0
+#define FV_CONFIG1_EQ2_BE_PRE_EQ_0           0x1
+#define FV_CONFIG1_EQ2_BE_PRE_EQ0_1          0x2
+#define FV_CONFIG1_EQ2_BE_PRE_EQ0_2          0x3
+#define FV_CONFIG1_EQ2_BE_PRE_EQ0_3          0x4
+#define FV_CONFIG1_EQ2_BE_PRE_EQ0_4          0x5
+#define FV_CONFIG1_EQ2_BE_PRE_EQ0_5          0x6
+#define FV_CONFIG1_EQ1_EN_ENABLE             0x1
+#define FV_CONFIG1_EQ1_EN_DISABLE            0x0
+#define FV_CONFIG1_EQ1_BE_PRE                0x0
+#define FV_CONFIG1_EQ1_BE_PRE_EQ_0           0x1
+#define FV_CONFIG1_EQ1_BE_PRE_EQ0_1          0x2
+#define FV_CONFIG1_EQ1_BE_PRE_EQ0_2          0x3
+#define FV_CONFIG1_EQ1_BE_PRE_EQ0_3          0x4
+#define FV_CONFIG1_EQ1_BE_PRE_EQ0_4          0x5
+#define FV_CONFIG1_EQ1_BE_PRE_EQ0_5          0x6
+
+/* Register Masks */
+#define RM_CONFIG1_EQ2_EN \
+	 RM(FM_CONFIG1_EQ2_EN, FB_CONFIG1_EQ2_EN)
+
+#define RM_CONFIG1_EQ2_BE \
+	 RM(FM_CONFIG1_EQ2_BE, FB_CONFIG1_EQ2_BE)
+
+#define RM_CONFIG1_EQ1_EN \
+	 RM(FM_CONFIG1_EQ1_EN, FB_CONFIG1_EQ1_EN)
+
+#define RM_CONFIG1_EQ1_BE \
+	 RM(FM_CONFIG1_EQ1_BE, FB_CONFIG1_EQ1_BE)
+
+
+/* Register Values */
+#define RV_CONFIG1_EQ2_EN_ENABLE \
+	 RV(FV_CONFIG1_EQ2_EN_ENABLE, FB_CONFIG1_EQ2_EN)
+
+#define RV_CONFIG1_EQ2_EN_DISABLE \
+	 RV(FV_CONFIG1_EQ2_EN_DISABLE, FB_CONFIG1_EQ2_EN)
+
+#define RV_CONFIG1_EQ2_BE_PRE \
+	 RV(FV_CONFIG1_EQ2_BE_PRE, FB_CONFIG1_EQ2_BE)
+
+#define RV_CONFIG1_EQ2_BE_PRE_EQ_0 \
+	 RV(FV_CONFIG1_EQ2_BE_PRE_EQ_0, FB_CONFIG1_EQ2_BE)
+
+#define RV_CONFIG1_EQ2_BE_PRE_EQ0_1 \
+	 RV(FV_CONFIG1_EQ2_BE_PRE_EQ0_1, FB_CONFIG1_EQ2_BE)
+
+#define RV_CONFIG1_EQ2_BE_PRE_EQ0_2 \
+	 RV(FV_CONFIG1_EQ2_BE_PRE_EQ0_2, FB_CONFIG1_EQ2_BE)
+
+#define RV_CONFIG1_EQ2_BE_PRE_EQ0_3 \
+	 RV(FV_CONFIG1_EQ2_BE_PRE_EQ0_3, FB_CONFIG1_EQ2_BE)
+
+#define RV_CONFIG1_EQ2_BE_PRE_EQ0_4 \
+	 RV(FV_CONFIG1_EQ2_BE_PRE_EQ0_4, FB_CONFIG1_EQ2_BE)
+
+#define RV_CONFIG1_EQ2_BE_PRE_EQ0_5 \
+	 RV(FV_CONFIG1_EQ2_BE_PRE_EQ0_5, FB_CONFIG1_EQ2_BE)
+
+#define RV_CONFIG1_EQ1_EN_ENABLE \
+	 RV(FV_CONFIG1_EQ1_EN_ENABLE, FB_CONFIG1_EQ1_EN)
+
+#define RV_CONFIG1_EQ1_EN_DISABLE \
+	 RV(FV_CONFIG1_EQ1_EN_DISABLE, FB_CONFIG1_EQ1_EN)
+
+#define RV_CONFIG1_EQ1_BE_PRE \
+	 RV(FV_CONFIG1_EQ1_BE_PRE, FB_CONFIG1_EQ1_BE)
+
+#define RV_CONFIG1_EQ1_BE_PRE_EQ_0 \
+	 RV(FV_CONFIG1_EQ1_BE_PRE_EQ_0, FB_CONFIG1_EQ1_BE)
+
+#define RV_CONFIG1_EQ1_BE_PRE_EQ0_1 \
+	 RV(FV_CONFIG1_EQ1_BE_PRE_EQ0_1, FB_CONFIG1_EQ1_BE)
+
+#define RV_CONFIG1_EQ1_BE_PRE_EQ0_2 \
+	 RV(FV_CONFIG1_EQ1_BE_PRE_EQ0_2, FB_CONFIG1_EQ1_BE)
+
+#define RV_CONFIG1_EQ1_BE_PRE_EQ0_3 \
+	 RV(FV_CONFIG1_EQ1_BE_PRE_EQ0_3, FB_CONFIG1_EQ1_BE)
+
+#define RV_CONFIG1_EQ1_BE_PRE_EQ0_4 \
+	 RV(FV_CONFIG1_EQ1_BE_PRE_EQ0_4, FB_CONFIG1_EQ1_BE)
+
+#define RV_CONFIG1_EQ1_BE_PRE_EQ0_5 \
+	 RV(FV_CONFIG1_EQ1_BE_PRE_EQ0_5, FB_CONFIG1_EQ1_BE)
+
+
+/******************************
+ *      R_DMICCTL (0x24)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_DMICCTL_DMICEN                    7
+#define FB_DMICCTL_DMONO                     4
+#define FB_DMICCTL_DMPHADJ                   2
+#define FB_DMICCTL_DMRATE                    0
+
+/* Field Masks */
+#define FM_DMICCTL_DMICEN                    0X1
+#define FM_DMICCTL_DMONO                     0X1
+#define FM_DMICCTL_DMPHADJ                   0X3
+#define FM_DMICCTL_DMRATE                    0X3
+
+/* Field Values */
+#define FV_DMICCTL_DMICEN_ENABLE             0x1
+#define FV_DMICCTL_DMICEN_DISABLE            0x0
+#define FV_DMICCTL_DMONO_STEREO              0x0
+#define FV_DMICCTL_DMONO_MONO                0x1
+
+/* Register Masks */
+#define RM_DMICCTL_DMICEN \
+	 RM(FM_DMICCTL_DMICEN, FB_DMICCTL_DMICEN)
+
+#define RM_DMICCTL_DMONO \
+	 RM(FM_DMICCTL_DMONO, FB_DMICCTL_DMONO)
+
+#define RM_DMICCTL_DMPHADJ \
+	 RM(FM_DMICCTL_DMPHADJ, FB_DMICCTL_DMPHADJ)
+
+#define RM_DMICCTL_DMRATE \
+	 RM(FM_DMICCTL_DMRATE, FB_DMICCTL_DMRATE)
+
+
+/* Register Values */
+#define RV_DMICCTL_DMICEN_ENABLE \
+	 RV(FV_DMICCTL_DMICEN_ENABLE, FB_DMICCTL_DMICEN)
+
+#define RV_DMICCTL_DMICEN_DISABLE \
+	 RV(FV_DMICCTL_DMICEN_DISABLE, FB_DMICCTL_DMICEN)
+
+#define RV_DMICCTL_DMONO_STEREO \
+	 RV(FV_DMICCTL_DMONO_STEREO, FB_DMICCTL_DMONO)
+
+#define RV_DMICCTL_DMONO_MONO \
+	 RV(FV_DMICCTL_DMONO_MONO, FB_DMICCTL_DMONO)
+
+
+/*****************************
+ *      R_CLECTL (0x25)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_CLECTL_LVL_MODE                   4
+#define FB_CLECTL_WINDOWSEL                  3
+#define FB_CLECTL_EXP_EN                     2
+#define FB_CLECTL_LIMIT_EN                   1
+#define FB_CLECTL_COMP_EN                    0
+
+/* Field Masks */
+#define FM_CLECTL_LVL_MODE                   0X1
+#define FM_CLECTL_WINDOWSEL                  0X1
+#define FM_CLECTL_EXP_EN                     0X1
+#define FM_CLECTL_LIMIT_EN                   0X1
+#define FM_CLECTL_COMP_EN                    0X1
+
+/* Field Values */
+#define FV_CLECTL_LVL_MODE_AVG               0x0
+#define FV_CLECTL_LVL_MODE_PEAK              0x1
+#define FV_CLECTL_WINDOWSEL_512              0x0
+#define FV_CLECTL_WINDOWSEL_64               0x1
+#define FV_CLECTL_EXP_EN_ENABLE              0x1
+#define FV_CLECTL_EXP_EN_DISABLE             0x0
+#define FV_CLECTL_LIMIT_EN_ENABLE            0x1
+#define FV_CLECTL_LIMIT_EN_DISABLE           0x0
+#define FV_CLECTL_COMP_EN_ENABLE             0x1
+#define FV_CLECTL_COMP_EN_DISABLE            0x0
+
+/* Register Masks */
+#define RM_CLECTL_LVL_MODE \
+	 RM(FM_CLECTL_LVL_MODE, FB_CLECTL_LVL_MODE)
+
+#define RM_CLECTL_WINDOWSEL \
+	 RM(FM_CLECTL_WINDOWSEL, FB_CLECTL_WINDOWSEL)
+
+#define RM_CLECTL_EXP_EN \
+	 RM(FM_CLECTL_EXP_EN, FB_CLECTL_EXP_EN)
+
+#define RM_CLECTL_LIMIT_EN \
+	 RM(FM_CLECTL_LIMIT_EN, FB_CLECTL_LIMIT_EN)
+
+#define RM_CLECTL_COMP_EN \
+	 RM(FM_CLECTL_COMP_EN, FB_CLECTL_COMP_EN)
+
+
+/* Register Values */
+#define RV_CLECTL_LVL_MODE_AVG \
+	 RV(FV_CLECTL_LVL_MODE_AVG, FB_CLECTL_LVL_MODE)
+
+#define RV_CLECTL_LVL_MODE_PEAK \
+	 RV(FV_CLECTL_LVL_MODE_PEAK, FB_CLECTL_LVL_MODE)
+
+#define RV_CLECTL_WINDOWSEL_512 \
+	 RV(FV_CLECTL_WINDOWSEL_512, FB_CLECTL_WINDOWSEL)
+
+#define RV_CLECTL_WINDOWSEL_64 \
+	 RV(FV_CLECTL_WINDOWSEL_64, FB_CLECTL_WINDOWSEL)
+
+#define RV_CLECTL_EXP_EN_ENABLE \
+	 RV(FV_CLECTL_EXP_EN_ENABLE, FB_CLECTL_EXP_EN)
+
+#define RV_CLECTL_EXP_EN_DISABLE \
+	 RV(FV_CLECTL_EXP_EN_DISABLE, FB_CLECTL_EXP_EN)
+
+#define RV_CLECTL_LIMIT_EN_ENABLE \
+	 RV(FV_CLECTL_LIMIT_EN_ENABLE, FB_CLECTL_LIMIT_EN)
+
+#define RV_CLECTL_LIMIT_EN_DISABLE \
+	 RV(FV_CLECTL_LIMIT_EN_DISABLE, FB_CLECTL_LIMIT_EN)
+
+#define RV_CLECTL_COMP_EN_ENABLE \
+	 RV(FV_CLECTL_COMP_EN_ENABLE, FB_CLECTL_COMP_EN)
+
+#define RV_CLECTL_COMP_EN_DISABLE \
+	 RV(FV_CLECTL_COMP_EN_DISABLE, FB_CLECTL_COMP_EN)
+
+
+/*****************************
+ *      R_MUGAIN (0x26)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_MUGAIN_CLEMUG                     0
+
+/* Field Masks */
+#define FM_MUGAIN_CLEMUG                     0X1F
+
+/* Field Values */
+#define FV_MUGAIN_CLEMUG_46PT5DB             0x1F
+#define FV_MUGAIN_CLEMUG_0DB                 0x0
+
+/* Register Masks */
+#define RM_MUGAIN_CLEMUG \
+	 RM(FM_MUGAIN_CLEMUG, FB_MUGAIN_CLEMUG)
+
+
+/* Register Values */
+#define RV_MUGAIN_CLEMUG_46PT5DB \
+	 RV(FV_MUGAIN_CLEMUG_46PT5DB, FB_MUGAIN_CLEMUG)
+
+#define RV_MUGAIN_CLEMUG_0DB \
+	 RV(FV_MUGAIN_CLEMUG_0DB, FB_MUGAIN_CLEMUG)
+
+
+/*****************************
+ *      R_COMPTH (0x27)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_COMPTH                            0
+
+/* Field Masks */
+#define FM_COMPTH                            0XFF
+
+/* Field Values */
+#define FV_COMPTH_0DB                        0xFF
+#define FV_COMPTH_N95PT625DB                 0x0
+
+/* Register Masks */
+#define RM_COMPTH                            RM(FM_COMPTH, FB_COMPTH)
+
+/* Register Values */
+#define RV_COMPTH_0DB                        RV(FV_COMPTH_0DB, FB_COMPTH)
+#define RV_COMPTH_N95PT625DB \
+	 RV(FV_COMPTH_N95PT625DB, FB_COMPTH)
+
+
+/*****************************
+ *      R_CMPRAT (0x28)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_CMPRAT                            0
+
+/* Field Masks */
+#define FM_CMPRAT                            0X1F
+
+/* Register Masks */
+#define RM_CMPRAT                            RM(FM_CMPRAT, FB_CMPRAT)
+
+/******************************
+ *      R_CATKTCL (0x29)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_CATKTCL                           0
+
+/* Field Masks */
+#define FM_CATKTCL                           0XFF
+
+/* Register Masks */
+#define RM_CATKTCL                           RM(FM_CATKTCL, FB_CATKTCL)
+
+/******************************
+ *      R_CATKTCH (0x2A)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_CATKTCH                           0
+
+/* Field Masks */
+#define FM_CATKTCH                           0XFF
+
+/* Register Masks */
+#define RM_CATKTCH                           RM(FM_CATKTCH, FB_CATKTCH)
+
+/******************************
+ *      R_CRELTCL (0x2B)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_CRELTCL                           0
+
+/* Field Masks */
+#define FM_CRELTCL                           0XFF
+
+/* Register Masks */
+#define RM_CRELTCL                           RM(FM_CRELTCL, FB_CRELTCL)
+
+/******************************
+ *      R_CRELTCH (0x2C)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_CRELTCH                           0
+
+/* Field Masks */
+#define FM_CRELTCH                           0XFF
+
+/* Register Masks */
+#define RM_CRELTCH                           RM(FM_CRELTCH, FB_CRELTCH)
+
+/****************************
+ *      R_LIMTH (0x2D)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_LIMTH                             0
+
+/* Field Masks */
+#define FM_LIMTH                             0XFF
+
+/* Field Values */
+#define FV_LIMTH_0DB                         0xFF
+#define FV_LIMTH_N95PT625DB                  0x0
+
+/* Register Masks */
+#define RM_LIMTH                             RM(FM_LIMTH, FB_LIMTH)
+
+/* Register Values */
+#define RV_LIMTH_0DB                         RV(FV_LIMTH_0DB, FB_LIMTH)
+#define RV_LIMTH_N95PT625DB                  RV(FV_LIMTH_N95PT625DB, FB_LIMTH)
+
+/*****************************
+ *      R_LIMTGT (0x2E)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_LIMTGT                            0
+
+/* Field Masks */
+#define FM_LIMTGT                            0XFF
+
+/* Field Values */
+#define FV_LIMTGT_0DB                        0xFF
+#define FV_LIMTGT_N95PT625DB                 0x0
+
+/* Register Masks */
+#define RM_LIMTGT                            RM(FM_LIMTGT, FB_LIMTGT)
+
+/* Register Values */
+#define RV_LIMTGT_0DB                        RV(FV_LIMTGT_0DB, FB_LIMTGT)
+#define RV_LIMTGT_N95PT625DB \
+	 RV(FV_LIMTGT_N95PT625DB, FB_LIMTGT)
+
+
+/******************************
+ *      R_LATKTCL (0x2F)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_LATKTCL                           0
+
+/* Field Masks */
+#define FM_LATKTCL                           0XFF
+
+/* Register Masks */
+#define RM_LATKTCL                           RM(FM_LATKTCL, FB_LATKTCL)
+
+/******************************
+ *      R_LATKTCH (0x30)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_LATKTCH                           0
+
+/* Field Masks */
+#define FM_LATKTCH                           0XFF
+
+/* Register Masks */
+#define RM_LATKTCH                           RM(FM_LATKTCH, FB_LATKTCH)
+
+/******************************
+ *      R_LRELTCL (0x31)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_LRELTCL                           0
+
+/* Field Masks */
+#define FM_LRELTCL                           0XFF
+
+/* Register Masks */
+#define RM_LRELTCL                           RM(FM_LRELTCL, FB_LRELTCL)
+
+/******************************
+ *      R_LRELTCH (0x32)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_LRELTCH                           0
+
+/* Field Masks */
+#define FM_LRELTCH                           0XFF
+
+/* Register Masks */
+#define RM_LRELTCH                           RM(FM_LRELTCH, FB_LRELTCH)
+
+/****************************
+ *      R_EXPTH (0x33)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_EXPTH                             0
+
+/* Field Masks */
+#define FM_EXPTH                             0XFF
+
+/* Field Values */
+#define FV_EXPTH_0DB                         0xFF
+#define FV_EXPTH_N95PT625DB                  0x0
+
+/* Register Masks */
+#define RM_EXPTH                             RM(FM_EXPTH, FB_EXPTH)
+
+/* Register Values */
+#define RV_EXPTH_0DB                         RV(FV_EXPTH_0DB, FB_EXPTH)
+#define RV_EXPTH_N95PT625DB                  RV(FV_EXPTH_N95PT625DB, FB_EXPTH)
+
+/*****************************
+ *      R_EXPRAT (0x34)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_EXPRAT                            0
+
+/* Field Masks */
+#define FM_EXPRAT                            0X7
+
+/* Register Masks */
+#define RM_EXPRAT                            RM(FM_EXPRAT, FB_EXPRAT)
+
+/******************************
+ *      R_XATKTCL (0x35)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_XATKTCL                           0
+
+/* Field Masks */
+#define FM_XATKTCL                           0XFF
+
+/* Register Masks */
+#define RM_XATKTCL                           RM(FM_XATKTCL, FB_XATKTCL)
+
+/******************************
+ *      R_XATKTCH (0x36)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_XATKTCH                           0
+
+/* Field Masks */
+#define FM_XATKTCH                           0XFF
+
+/* Register Masks */
+#define RM_XATKTCH                           RM(FM_XATKTCH, FB_XATKTCH)
+
+/******************************
+ *      R_XRELTCL (0x37)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_XRELTCL                           0
+
+/* Field Masks */
+#define FM_XRELTCL                           0XFF
+
+/* Register Masks */
+#define RM_XRELTCL                           RM(FM_XRELTCL, FB_XRELTCL)
+
+/******************************
+ *      R_XRELTCH (0x38)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_XRELTCH                           0
+
+/* Field Masks */
+#define FM_XRELTCH                           0XFF
+
+/* Register Masks */
+#define RM_XRELTCH                           RM(FM_XRELTCH, FB_XRELTCH)
+
+/****************************
+ *      R_FXCTL (0x39)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_FXCTL_3DEN                        4
+#define FB_FXCTL_TEEN                        3
+#define FB_FXCTL_TNLFBYPASS                  2
+#define FB_FXCTL_BEEN                        1
+#define FB_FXCTL_BNLFBYPASS                  0
+
+/* Field Masks */
+#define FM_FXCTL_3DEN                        0X1
+#define FM_FXCTL_TEEN                        0X1
+#define FM_FXCTL_TNLFBYPASS                  0X1
+#define FM_FXCTL_BEEN                        0X1
+#define FM_FXCTL_BNLFBYPASS                  0X1
+
+/* Field Values */
+#define FV_FXCTL_3DEN_ENABLE                 0x1
+#define FV_FXCTL_3DEN_DISABLE                0x0
+#define FV_FXCTL_TEEN_ENABLE                 0x1
+#define FV_FXCTL_TEEN_DISABLE                0x0
+#define FV_FXCTL_TNLFBYPASS_ENABLE           0x1
+#define FV_FXCTL_TNLFBYPASS_DISABLE          0x0
+#define FV_FXCTL_BEEN_ENABLE                 0x1
+#define FV_FXCTL_BEEN_DISABLE                0x0
+#define FV_FXCTL_BNLFBYPASS_ENABLE           0x1
+#define FV_FXCTL_BNLFBYPASS_DISABLE          0x0
+
+/* Register Masks */
+#define RM_FXCTL_3DEN                        RM(FM_FXCTL_3DEN, FB_FXCTL_3DEN)
+#define RM_FXCTL_TEEN                        RM(FM_FXCTL_TEEN, FB_FXCTL_TEEN)
+#define RM_FXCTL_TNLFBYPASS \
+	 RM(FM_FXCTL_TNLFBYPASS, FB_FXCTL_TNLFBYPASS)
+
+#define RM_FXCTL_BEEN                        RM(FM_FXCTL_BEEN, FB_FXCTL_BEEN)
+#define RM_FXCTL_BNLFBYPASS \
+	 RM(FM_FXCTL_BNLFBYPASS, FB_FXCTL_BNLFBYPASS)
+
+
+/* Register Values */
+#define RV_FXCTL_3DEN_ENABLE \
+	 RV(FV_FXCTL_3DEN_ENABLE, FB_FXCTL_3DEN)
+
+#define RV_FXCTL_3DEN_DISABLE \
+	 RV(FV_FXCTL_3DEN_DISABLE, FB_FXCTL_3DEN)
+
+#define RV_FXCTL_TEEN_ENABLE \
+	 RV(FV_FXCTL_TEEN_ENABLE, FB_FXCTL_TEEN)
+
+#define RV_FXCTL_TEEN_DISABLE \
+	 RV(FV_FXCTL_TEEN_DISABLE, FB_FXCTL_TEEN)
+
+#define RV_FXCTL_TNLFBYPASS_ENABLE \
+	 RV(FV_FXCTL_TNLFBYPASS_ENABLE, FB_FXCTL_TNLFBYPASS)
+
+#define RV_FXCTL_TNLFBYPASS_DISABLE \
+	 RV(FV_FXCTL_TNLFBYPASS_DISABLE, FB_FXCTL_TNLFBYPASS)
+
+#define RV_FXCTL_BEEN_ENABLE \
+	 RV(FV_FXCTL_BEEN_ENABLE, FB_FXCTL_BEEN)
+
+#define RV_FXCTL_BEEN_DISABLE \
+	 RV(FV_FXCTL_BEEN_DISABLE, FB_FXCTL_BEEN)
+
+#define RV_FXCTL_BNLFBYPASS_ENABLE \
+	 RV(FV_FXCTL_BNLFBYPASS_ENABLE, FB_FXCTL_BNLFBYPASS)
+
+#define RV_FXCTL_BNLFBYPASS_DISABLE \
+	 RV(FV_FXCTL_BNLFBYPASS_DISABLE, FB_FXCTL_BNLFBYPASS)
+
+
+/*******************************
+ *      R_DACCRWRL (0x3A)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_DACCRWRL_DACCRWDL                 0
+
+/* Field Masks */
+#define FM_DACCRWRL_DACCRWDL                 0XFF
+
+/* Register Masks */
+#define RM_DACCRWRL_DACCRWDL \
+	 RM(FM_DACCRWRL_DACCRWDL, FB_DACCRWRL_DACCRWDL)
+
+
+/*******************************
+ *      R_DACCRWRM (0x3B)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_DACCRWRM_DACCRWDM                 0
+
+/* Field Masks */
+#define FM_DACCRWRM_DACCRWDM                 0XFF
+
+/* Register Masks */
+#define RM_DACCRWRM_DACCRWDM \
+	 RM(FM_DACCRWRM_DACCRWDM, FB_DACCRWRM_DACCRWDM)
+
+
+/*******************************
+ *      R_DACCRWRH (0x3C)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_DACCRWRH_DACCRWDH                 0
+
+/* Field Masks */
+#define FM_DACCRWRH_DACCRWDH                 0XFF
+
+/* Register Masks */
+#define RM_DACCRWRH_DACCRWDH \
+	 RM(FM_DACCRWRH_DACCRWDH, FB_DACCRWRH_DACCRWDH)
+
+
+/*******************************
+ *      R_DACCRRDL (0x3D)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_DACCRRDL                          0
+
+/* Field Masks */
+#define FM_DACCRRDL                          0XFF
+
+/* Register Masks */
+#define RM_DACCRRDL                          RM(FM_DACCRRDL, FB_DACCRRDL)
+
+/*******************************
+ *      R_DACCRRDM (0x3E)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_DACCRRDM                          0
+
+/* Field Masks */
+#define FM_DACCRRDM                          0XFF
+
+/* Register Masks */
+#define RM_DACCRRDM                          RM(FM_DACCRRDM, FB_DACCRRDM)
+
+/*******************************
+ *      R_DACCRRDH (0x3F)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_DACCRRDH                          0
+
+/* Field Masks */
+#define FM_DACCRRDH                          0XFF
+
+/* Register Masks */
+#define RM_DACCRRDH                          RM(FM_DACCRRDH, FB_DACCRRDH)
+
+/********************************
+ *      R_DACCRADDR (0x40)      *
+ ********************************/
+
+/* Field Offsets */
+#define FB_DACCRADDR_DACCRADD                0
+
+/* Field Masks */
+#define FM_DACCRADDR_DACCRADD                0XFF
+
+/* Register Masks */
+#define RM_DACCRADDR_DACCRADD \
+	 RM(FM_DACCRADDR_DACCRADD, FB_DACCRADDR_DACCRADD)
+
+
+/******************************
+ *      R_DCOFSEL (0x41)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_DCOFSEL_DC_COEF_SEL               0
+
+/* Field Masks */
+#define FM_DCOFSEL_DC_COEF_SEL               0X7
+
+/* Field Values */
+#define FV_DCOFSEL_DC_COEF_SEL_2_N8          0x0
+#define FV_DCOFSEL_DC_COEF_SEL_2_N9          0x1
+#define FV_DCOFSEL_DC_COEF_SEL_2_N10         0x2
+#define FV_DCOFSEL_DC_COEF_SEL_2_N11         0x3
+#define FV_DCOFSEL_DC_COEF_SEL_2_N12         0x4
+#define FV_DCOFSEL_DC_COEF_SEL_2_N13         0x5
+#define FV_DCOFSEL_DC_COEF_SEL_2_N14         0x6
+#define FV_DCOFSEL_DC_COEF_SEL_2_N15         0x7
+
+/* Register Masks */
+#define RM_DCOFSEL_DC_COEF_SEL \
+	 RM(FM_DCOFSEL_DC_COEF_SEL, FB_DCOFSEL_DC_COEF_SEL)
+
+
+/* Register Values */
+#define RV_DCOFSEL_DC_COEF_SEL_2_N8 \
+	 RV(FV_DCOFSEL_DC_COEF_SEL_2_N8, FB_DCOFSEL_DC_COEF_SEL)
+
+#define RV_DCOFSEL_DC_COEF_SEL_2_N9 \
+	 RV(FV_DCOFSEL_DC_COEF_SEL_2_N9, FB_DCOFSEL_DC_COEF_SEL)
+
+#define RV_DCOFSEL_DC_COEF_SEL_2_N10 \
+	 RV(FV_DCOFSEL_DC_COEF_SEL_2_N10, FB_DCOFSEL_DC_COEF_SEL)
+
+#define RV_DCOFSEL_DC_COEF_SEL_2_N11 \
+	 RV(FV_DCOFSEL_DC_COEF_SEL_2_N11, FB_DCOFSEL_DC_COEF_SEL)
+
+#define RV_DCOFSEL_DC_COEF_SEL_2_N12 \
+	 RV(FV_DCOFSEL_DC_COEF_SEL_2_N12, FB_DCOFSEL_DC_COEF_SEL)
+
+#define RV_DCOFSEL_DC_COEF_SEL_2_N13 \
+	 RV(FV_DCOFSEL_DC_COEF_SEL_2_N13, FB_DCOFSEL_DC_COEF_SEL)
+
+#define RV_DCOFSEL_DC_COEF_SEL_2_N14 \
+	 RV(FV_DCOFSEL_DC_COEF_SEL_2_N14, FB_DCOFSEL_DC_COEF_SEL)
+
+#define RV_DCOFSEL_DC_COEF_SEL_2_N15 \
+	 RV(FV_DCOFSEL_DC_COEF_SEL_2_N15, FB_DCOFSEL_DC_COEF_SEL)
+
+
+/******************************
+ *      R_PLLCTL9 (0x4E)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_PLLCTL9_REFDIV_PLL1               0
+
+/* Field Masks */
+#define FM_PLLCTL9_REFDIV_PLL1               0XFF
+
+/* Register Masks */
+#define RM_PLLCTL9_REFDIV_PLL1 \
+	 RM(FM_PLLCTL9_REFDIV_PLL1, FB_PLLCTL9_REFDIV_PLL1)
+
+
+/******************************
+ *      R_PLLCTLA (0x4F)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_PLLCTLA_OUTDIV_PLL1               0
+
+/* Field Masks */
+#define FM_PLLCTLA_OUTDIV_PLL1               0XFF
+
+/* Register Masks */
+#define RM_PLLCTLA_OUTDIV_PLL1 \
+	 RM(FM_PLLCTLA_OUTDIV_PLL1, FB_PLLCTLA_OUTDIV_PLL1)
+
+
+/******************************
+ *      R_PLLCTLB (0x50)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_PLLCTLB_FBDIV_PLL1L               0
+
+/* Field Masks */
+#define FM_PLLCTLB_FBDIV_PLL1L               0XFF
+
+/* Register Masks */
+#define RM_PLLCTLB_FBDIV_PLL1L \
+	 RM(FM_PLLCTLB_FBDIV_PLL1L, FB_PLLCTLB_FBDIV_PLL1L)
+
+
+/******************************
+ *      R_PLLCTLC (0x51)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_PLLCTLC_FBDIV_PLL1H               0
+
+/* Field Masks */
+#define FM_PLLCTLC_FBDIV_PLL1H               0X7
+
+/* Register Masks */
+#define RM_PLLCTLC_FBDIV_PLL1H \
+	 RM(FM_PLLCTLC_FBDIV_PLL1H, FB_PLLCTLC_FBDIV_PLL1H)
+
+
+/******************************
+ *      R_PLLCTLD (0x52)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_PLLCTLD_RZ_PLL1                   3
+#define FB_PLLCTLD_CP_PLL1                   0
+
+/* Field Masks */
+#define FM_PLLCTLD_RZ_PLL1                   0X7
+#define FM_PLLCTLD_CP_PLL1                   0X7
+
+/* Register Masks */
+#define RM_PLLCTLD_RZ_PLL1 \
+	 RM(FM_PLLCTLD_RZ_PLL1, FB_PLLCTLD_RZ_PLL1)
+
+#define RM_PLLCTLD_CP_PLL1 \
+	 RM(FM_PLLCTLD_CP_PLL1, FB_PLLCTLD_CP_PLL1)
+
+
+/******************************
+ *      R_PLLCTLE (0x53)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_PLLCTLE_REFDIV_PLL2               0
+
+/* Field Masks */
+#define FM_PLLCTLE_REFDIV_PLL2               0XFF
+
+/* Register Masks */
+#define RM_PLLCTLE_REFDIV_PLL2 \
+	 RM(FM_PLLCTLE_REFDIV_PLL2, FB_PLLCTLE_REFDIV_PLL2)
+
+
+/******************************
+ *      R_PLLCTLF (0x54)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_PLLCTLF_OUTDIV_PLL2               0
+
+/* Field Masks */
+#define FM_PLLCTLF_OUTDIV_PLL2               0XFF
+
+/* Register Masks */
+#define RM_PLLCTLF_OUTDIV_PLL2 \
+	 RM(FM_PLLCTLF_OUTDIV_PLL2, FB_PLLCTLF_OUTDIV_PLL2)
+
+
+/*******************************
+ *      R_PLLCTL10 (0x55)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_PLLCTL10_FBDIV_PLL2L              0
+
+/* Field Masks */
+#define FM_PLLCTL10_FBDIV_PLL2L              0XFF
+
+/* Register Masks */
+#define RM_PLLCTL10_FBDIV_PLL2L \
+	 RM(FM_PLLCTL10_FBDIV_PLL2L, FB_PLLCTL10_FBDIV_PLL2L)
+
+
+/*******************************
+ *      R_PLLCTL11 (0x56)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_PLLCTL11_FBDIV_PLL2H              0
+
+/* Field Masks */
+#define FM_PLLCTL11_FBDIV_PLL2H              0X7
+
+/* Register Masks */
+#define RM_PLLCTL11_FBDIV_PLL2H \
+	 RM(FM_PLLCTL11_FBDIV_PLL2H, FB_PLLCTL11_FBDIV_PLL2H)
+
+
+/*******************************
+ *      R_PLLCTL12 (0x57)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_PLLCTL12_RZ_PLL2                  3
+#define FB_PLLCTL12_CP_PLL2                  0
+
+/* Field Masks */
+#define FM_PLLCTL12_RZ_PLL2                  0X7
+#define FM_PLLCTL12_CP_PLL2                  0X7
+
+/* Register Masks */
+#define RM_PLLCTL12_RZ_PLL2 \
+	 RM(FM_PLLCTL12_RZ_PLL2, FB_PLLCTL12_RZ_PLL2)
+
+#define RM_PLLCTL12_CP_PLL2 \
+	 RM(FM_PLLCTL12_CP_PLL2, FB_PLLCTL12_CP_PLL2)
+
+
+/*******************************
+ *      R_PLLCTL1B (0x60)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_PLLCTL1B_VCOI_PLL2                4
+#define FB_PLLCTL1B_VCOI_PLL1                2
+
+/* Field Masks */
+#define FM_PLLCTL1B_VCOI_PLL2                0X3
+#define FM_PLLCTL1B_VCOI_PLL1                0X3
+
+/* Register Masks */
+#define RM_PLLCTL1B_VCOI_PLL2 \
+	 RM(FM_PLLCTL1B_VCOI_PLL2, FB_PLLCTL1B_VCOI_PLL2)
+
+#define RM_PLLCTL1B_VCOI_PLL1 \
+	 RM(FM_PLLCTL1B_VCOI_PLL1, FB_PLLCTL1B_VCOI_PLL1)
+
+
+/*******************************
+ *      R_PLLCTL1C (0x61)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_PLLCTL1C_PDB_PLL2                 2
+#define FB_PLLCTL1C_PDB_PLL1                 1
+
+/* Field Masks */
+#define FM_PLLCTL1C_PDB_PLL2                 0X1
+#define FM_PLLCTL1C_PDB_PLL1                 0X1
+
+/* Field Values */
+#define FV_PLLCTL1C_PDB_PLL2_ENABLE          0x1
+#define FV_PLLCTL1C_PDB_PLL2_DISABLE         0x0
+#define FV_PLLCTL1C_PDB_PLL1_ENABLE          0x1
+#define FV_PLLCTL1C_PDB_PLL1_DISABLE         0x0
+
+/* Register Masks */
+#define RM_PLLCTL1C_PDB_PLL2 \
+	 RM(FM_PLLCTL1C_PDB_PLL2, FB_PLLCTL1C_PDB_PLL2)
+
+#define RM_PLLCTL1C_PDB_PLL1 \
+	 RM(FM_PLLCTL1C_PDB_PLL1, FB_PLLCTL1C_PDB_PLL1)
+
+
+/* Register Values */
+#define RV_PLLCTL1C_PDB_PLL2_ENABLE \
+	 RV(FV_PLLCTL1C_PDB_PLL2_ENABLE, FB_PLLCTL1C_PDB_PLL2)
+
+#define RV_PLLCTL1C_PDB_PLL2_DISABLE \
+	 RV(FV_PLLCTL1C_PDB_PLL2_DISABLE, FB_PLLCTL1C_PDB_PLL2)
+
+#define RV_PLLCTL1C_PDB_PLL1_ENABLE \
+	 RV(FV_PLLCTL1C_PDB_PLL1_ENABLE, FB_PLLCTL1C_PDB_PLL1)
+
+#define RV_PLLCTL1C_PDB_PLL1_DISABLE \
+	 RV(FV_PLLCTL1C_PDB_PLL1_DISABLE, FB_PLLCTL1C_PDB_PLL1)
+
+
+/*******************************
+ *      R_TIMEBASE (0x77)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_TIMEBASE_DIVIDER                  0
+
+/* Field Masks */
+#define FM_TIMEBASE_DIVIDER                  0XFF
+
+/* Register Masks */
+#define RM_TIMEBASE_DIVIDER \
+	 RM(FM_TIMEBASE_DIVIDER, FB_TIMEBASE_DIVIDER)
+
+
+/*****************************
+ *      R_DEVIDL (0x7D)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_DEVIDL_DIDL                       0
+
+/* Field Masks */
+#define FM_DEVIDL_DIDL                       0XFF
+
+/* Register Masks */
+#define RM_DEVIDL_DIDL                       RM(FM_DEVIDL_DIDL, FB_DEVIDL_DIDL)
+
+/*****************************
+ *      R_DEVIDH (0x7E)      *
+ *****************************/
+
+/* Field Offsets */
+#define FB_DEVIDH_DIDH                       0
+
+/* Field Masks */
+#define FM_DEVIDH_DIDH                       0XFF
+
+/* Register Masks */
+#define RM_DEVIDH_DIDH                       RM(FM_DEVIDH_DIDH, FB_DEVIDH_DIDH)
+
+/****************************
+ *      R_RESET (0x80)      *
+ ****************************/
+
+/* Field Offsets */
+#define FB_RESET                             0
+
+/* Field Masks */
+#define FM_RESET                             0XFF
+
+/* Field Values */
+#define FV_RESET_ENABLE                      0x85
+
+/* Register Masks */
+#define RM_RESET                             RM(FM_RESET, FB_RESET)
+
+/* Register Values */
+#define RV_RESET_ENABLE                      RV(FV_RESET_ENABLE, FB_RESET)
+
+/********************************
+ *      R_DACCRSTAT (0x8A)      *
+ ********************************/
+
+/* Field Offsets */
+#define FB_DACCRSTAT_DACCR_BUSY              7
+
+/* Field Masks */
+#define FM_DACCRSTAT_DACCR_BUSY              0X1
+
+/* Register Masks */
+#define RM_DACCRSTAT_DACCR_BUSY \
+	 RM(FM_DACCRSTAT_DACCR_BUSY, FB_DACCRSTAT_DACCR_BUSY)
+
+
+/******************************
+ *      R_PLLCTL0 (0x8E)      *
+ ******************************/
+
+/* Field Offsets */
+#define FB_PLLCTL0_PLL2_LOCK                 1
+#define FB_PLLCTL0_PLL1_LOCK                 0
+
+/* Field Masks */
+#define FM_PLLCTL0_PLL2_LOCK                 0X1
+#define FM_PLLCTL0_PLL1_LOCK                 0X1
+
+/* Register Masks */
+#define RM_PLLCTL0_PLL2_LOCK \
+	 RM(FM_PLLCTL0_PLL2_LOCK, FB_PLLCTL0_PLL2_LOCK)
+
+#define RM_PLLCTL0_PLL1_LOCK \
+	 RM(FM_PLLCTL0_PLL1_LOCK, FB_PLLCTL0_PLL1_LOCK)
+
+
+/********************************
+ *      R_PLLREFSEL (0x8F)      *
+ ********************************/
+
+/* Field Offsets */
+#define FB_PLLREFSEL_PLL2_REF_SEL            4
+#define FB_PLLREFSEL_PLL1_REF_SEL            0
+
+/* Field Masks */
+#define FM_PLLREFSEL_PLL2_REF_SEL            0X7
+#define FM_PLLREFSEL_PLL1_REF_SEL            0X7
+
+/* Field Values */
+#define FV_PLLREFSEL_PLL2_REF_SEL_XTAL_MCLK1 0x0
+#define FV_PLLREFSEL_PLL2_REF_SEL_MCLK2      0x1
+#define FV_PLLREFSEL_PLL1_REF_SEL_XTAL_MCLK1 0x0
+#define FV_PLLREFSEL_PLL1_REF_SEL_MCLK2      0x1
+
+/* Register Masks */
+#define RM_PLLREFSEL_PLL2_REF_SEL \
+	 RM(FM_PLLREFSEL_PLL2_REF_SEL, FB_PLLREFSEL_PLL2_REF_SEL)
+
+#define RM_PLLREFSEL_PLL1_REF_SEL \
+	 RM(FM_PLLREFSEL_PLL1_REF_SEL, FB_PLLREFSEL_PLL1_REF_SEL)
+
+
+/* Register Values */
+#define RV_PLLREFSEL_PLL2_REF_SEL_XTAL_MCLK1 \
+	 RV(FV_PLLREFSEL_PLL2_REF_SEL_XTAL_MCLK1, FB_PLLREFSEL_PLL2_REF_SEL)
+
+#define RV_PLLREFSEL_PLL2_REF_SEL_MCLK2 \
+	 RV(FV_PLLREFSEL_PLL2_REF_SEL_MCLK2, FB_PLLREFSEL_PLL2_REF_SEL)
+
+#define RV_PLLREFSEL_PLL1_REF_SEL_XTAL_MCLK1 \
+	 RV(FV_PLLREFSEL_PLL1_REF_SEL_XTAL_MCLK1, FB_PLLREFSEL_PLL1_REF_SEL)
+
+#define RV_PLLREFSEL_PLL1_REF_SEL_MCLK2 \
+	 RV(FV_PLLREFSEL_PLL1_REF_SEL_MCLK2, FB_PLLREFSEL_PLL1_REF_SEL)
+
+
+/*******************************
+ *      R_DACMBCEN (0xC7)      *
+ *******************************/
+
+/* Field Offsets */
+#define FB_DACMBCEN_MBCEN3                   2
+#define FB_DACMBCEN_MBCEN2                   1
+#define FB_DACMBCEN_MBCEN1                   0
+
+/* Field Masks */
+#define FM_DACMBCEN_MBCEN3                   0X1
+#define FM_DACMBCEN_MBCEN2                   0X1
+#define FM_DACMBCEN_MBCEN1                   0X1
+
+/* Register Masks */
+#define RM_DACMBCEN_MBCEN3 \
+	 RM(FM_DACMBCEN_MBCEN3, FB_DACMBCEN_MBCEN3)
+
+#define RM_DACMBCEN_MBCEN2 \
+	 RM(FM_DACMBCEN_MBCEN2, FB_DACMBCEN_MBCEN2)
+
+#define RM_DACMBCEN_MBCEN1 \
+	 RM(FM_DACMBCEN_MBCEN1, FB_DACMBCEN_MBCEN1)
+
+
+/********************************
+ *      R_DACMBCCTL (0xC8)      *
+ ********************************/
+
+/* Field Offsets */
+#define FB_DACMBCCTL_LVLMODE3                5
+#define FB_DACMBCCTL_WINSEL3                 4
+#define FB_DACMBCCTL_LVLMODE2                3
+#define FB_DACMBCCTL_WINSEL2                 2
+#define FB_DACMBCCTL_LVLMODE1                1
+#define FB_DACMBCCTL_WINSEL1                 0
+
+/* Field Masks */
+#define FM_DACMBCCTL_LVLMODE3                0X1
+#define FM_DACMBCCTL_WINSEL3                 0X1
+#define FM_DACMBCCTL_LVLMODE2                0X1
+#define FM_DACMBCCTL_WINSEL2                 0X1
+#define FM_DACMBCCTL_LVLMODE1                0X1
+#define FM_DACMBCCTL_WINSEL1                 0X1
+
+/* Register Masks */
+#define RM_DACMBCCTL_LVLMODE3 \
+	 RM(FM_DACMBCCTL_LVLMODE3, FB_DACMBCCTL_LVLMODE3)
+
+#define RM_DACMBCCTL_WINSEL3 \
+	 RM(FM_DACMBCCTL_WINSEL3, FB_DACMBCCTL_WINSEL3)
+
+#define RM_DACMBCCTL_LVLMODE2 \
+	 RM(FM_DACMBCCTL_LVLMODE2, FB_DACMBCCTL_LVLMODE2)
+
+#define RM_DACMBCCTL_WINSEL2 \
+	 RM(FM_DACMBCCTL_WINSEL2, FB_DACMBCCTL_WINSEL2)
+
+#define RM_DACMBCCTL_LVLMODE1 \
+	 RM(FM_DACMBCCTL_LVLMODE1, FB_DACMBCCTL_LVLMODE1)
+
+#define RM_DACMBCCTL_WINSEL1 \
+	 RM(FM_DACMBCCTL_WINSEL1, FB_DACMBCCTL_WINSEL1)
+
+
+/*********************************
+ *      R_DACMBCMUG1 (0xC9)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCMUG1_PHASE                  5
+#define FB_DACMBCMUG1_MUGAIN                 0
+
+/* Field Masks */
+#define FM_DACMBCMUG1_PHASE                  0X1
+#define FM_DACMBCMUG1_MUGAIN                 0X1F
+
+/* Register Masks */
+#define RM_DACMBCMUG1_PHASE \
+	 RM(FM_DACMBCMUG1_PHASE, FB_DACMBCMUG1_PHASE)
+
+#define RM_DACMBCMUG1_MUGAIN \
+	 RM(FM_DACMBCMUG1_MUGAIN, FB_DACMBCMUG1_MUGAIN)
+
+
+/*********************************
+ *      R_DACMBCTHR1 (0xCA)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCTHR1_THRESH                 0
+
+/* Field Masks */
+#define FM_DACMBCTHR1_THRESH                 0XFF
+
+/* Register Masks */
+#define RM_DACMBCTHR1_THRESH \
+	 RM(FM_DACMBCTHR1_THRESH, FB_DACMBCTHR1_THRESH)
+
+
+/*********************************
+ *      R_DACMBCRAT1 (0xCB)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCRAT1_RATIO                  0
+
+/* Field Masks */
+#define FM_DACMBCRAT1_RATIO                  0X1F
+
+/* Register Masks */
+#define RM_DACMBCRAT1_RATIO \
+	 RM(FM_DACMBCRAT1_RATIO, FB_DACMBCRAT1_RATIO)
+
+
+/**********************************
+ *      R_DACMBCATK1L (0xCC)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCATK1L_TCATKL                0
+
+/* Field Masks */
+#define FM_DACMBCATK1L_TCATKL                0XFF
+
+/* Register Masks */
+#define RM_DACMBCATK1L_TCATKL \
+	 RM(FM_DACMBCATK1L_TCATKL, FB_DACMBCATK1L_TCATKL)
+
+
+/**********************************
+ *      R_DACMBCATK1H (0xCD)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCATK1H_TCATKH                0
+
+/* Field Masks */
+#define FM_DACMBCATK1H_TCATKH                0XFF
+
+/* Register Masks */
+#define RM_DACMBCATK1H_TCATKH \
+	 RM(FM_DACMBCATK1H_TCATKH, FB_DACMBCATK1H_TCATKH)
+
+
+/**********************************
+ *      R_DACMBCREL1L (0xCE)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCREL1L_TCRELL                0
+
+/* Field Masks */
+#define FM_DACMBCREL1L_TCRELL                0XFF
+
+/* Register Masks */
+#define RM_DACMBCREL1L_TCRELL \
+	 RM(FM_DACMBCREL1L_TCRELL, FB_DACMBCREL1L_TCRELL)
+
+
+/**********************************
+ *      R_DACMBCREL1H (0xCF)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCREL1H_TCRELH                0
+
+/* Field Masks */
+#define FM_DACMBCREL1H_TCRELH                0XFF
+
+/* Register Masks */
+#define RM_DACMBCREL1H_TCRELH \
+	 RM(FM_DACMBCREL1H_TCRELH, FB_DACMBCREL1H_TCRELH)
+
+
+/*********************************
+ *      R_DACMBCMUG2 (0xD0)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCMUG2_PHASE                  5
+#define FB_DACMBCMUG2_MUGAIN                 0
+
+/* Field Masks */
+#define FM_DACMBCMUG2_PHASE                  0X1
+#define FM_DACMBCMUG2_MUGAIN                 0X1F
+
+/* Register Masks */
+#define RM_DACMBCMUG2_PHASE \
+	 RM(FM_DACMBCMUG2_PHASE, FB_DACMBCMUG2_PHASE)
+
+#define RM_DACMBCMUG2_MUGAIN \
+	 RM(FM_DACMBCMUG2_MUGAIN, FB_DACMBCMUG2_MUGAIN)
+
+
+/*********************************
+ *      R_DACMBCTHR2 (0xD1)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCTHR2_THRESH                 0
+
+/* Field Masks */
+#define FM_DACMBCTHR2_THRESH                 0XFF
+
+/* Register Masks */
+#define RM_DACMBCTHR2_THRESH \
+	 RM(FM_DACMBCTHR2_THRESH, FB_DACMBCTHR2_THRESH)
+
+
+/*********************************
+ *      R_DACMBCRAT2 (0xD2)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCRAT2_RATIO                  0
+
+/* Field Masks */
+#define FM_DACMBCRAT2_RATIO                  0X1F
+
+/* Register Masks */
+#define RM_DACMBCRAT2_RATIO \
+	 RM(FM_DACMBCRAT2_RATIO, FB_DACMBCRAT2_RATIO)
+
+
+/**********************************
+ *      R_DACMBCATK2L (0xD3)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCATK2L_TCATKL                0
+
+/* Field Masks */
+#define FM_DACMBCATK2L_TCATKL                0XFF
+
+/* Register Masks */
+#define RM_DACMBCATK2L_TCATKL \
+	 RM(FM_DACMBCATK2L_TCATKL, FB_DACMBCATK2L_TCATKL)
+
+
+/**********************************
+ *      R_DACMBCATK2H (0xD4)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCATK2H_TCATKH                0
+
+/* Field Masks */
+#define FM_DACMBCATK2H_TCATKH                0XFF
+
+/* Register Masks */
+#define RM_DACMBCATK2H_TCATKH \
+	 RM(FM_DACMBCATK2H_TCATKH, FB_DACMBCATK2H_TCATKH)
+
+
+/**********************************
+ *      R_DACMBCREL2L (0xD5)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCREL2L_TCRELL                0
+
+/* Field Masks */
+#define FM_DACMBCREL2L_TCRELL                0XFF
+
+/* Register Masks */
+#define RM_DACMBCREL2L_TCRELL \
+	 RM(FM_DACMBCREL2L_TCRELL, FB_DACMBCREL2L_TCRELL)
+
+
+/**********************************
+ *      R_DACMBCREL2H (0xD6)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCREL2H_TCRELH                0
+
+/* Field Masks */
+#define FM_DACMBCREL2H_TCRELH                0XFF
+
+/* Register Masks */
+#define RM_DACMBCREL2H_TCRELH \
+	 RM(FM_DACMBCREL2H_TCRELH, FB_DACMBCREL2H_TCRELH)
+
+
+/*********************************
+ *      R_DACMBCMUG3 (0xD7)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCMUG3_PHASE                  5
+#define FB_DACMBCMUG3_MUGAIN                 0
+
+/* Field Masks */
+#define FM_DACMBCMUG3_PHASE                  0X1
+#define FM_DACMBCMUG3_MUGAIN                 0X1F
+
+/* Register Masks */
+#define RM_DACMBCMUG3_PHASE \
+	 RM(FM_DACMBCMUG3_PHASE, FB_DACMBCMUG3_PHASE)
+
+#define RM_DACMBCMUG3_MUGAIN \
+	 RM(FM_DACMBCMUG3_MUGAIN, FB_DACMBCMUG3_MUGAIN)
+
+
+/*********************************
+ *      R_DACMBCTHR3 (0xD8)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCTHR3_THRESH                 0
+
+/* Field Masks */
+#define FM_DACMBCTHR3_THRESH                 0XFF
+
+/* Register Masks */
+#define RM_DACMBCTHR3_THRESH \
+	 RM(FM_DACMBCTHR3_THRESH, FB_DACMBCTHR3_THRESH)
+
+
+/*********************************
+ *      R_DACMBCRAT3 (0xD9)      *
+ *********************************/
+
+/* Field Offsets */
+#define FB_DACMBCRAT3_RATIO                  0
+
+/* Field Masks */
+#define FM_DACMBCRAT3_RATIO                  0X1F
+
+/* Register Masks */
+#define RM_DACMBCRAT3_RATIO \
+	 RM(FM_DACMBCRAT3_RATIO, FB_DACMBCRAT3_RATIO)
+
+
+/**********************************
+ *      R_DACMBCATK3L (0xDA)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCATK3L_TCATKL                0
+
+/* Field Masks */
+#define FM_DACMBCATK3L_TCATKL                0XFF
+
+/* Register Masks */
+#define RM_DACMBCATK3L_TCATKL \
+	 RM(FM_DACMBCATK3L_TCATKL, FB_DACMBCATK3L_TCATKL)
+
+
+/**********************************
+ *      R_DACMBCATK3H (0xDB)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCATK3H_TCATKH                0
+
+/* Field Masks */
+#define FM_DACMBCATK3H_TCATKH                0XFF
+
+/* Register Masks */
+#define RM_DACMBCATK3H_TCATKH \
+	 RM(FM_DACMBCATK3H_TCATKH, FB_DACMBCATK3H_TCATKH)
+
+
+/**********************************
+ *      R_DACMBCREL3L (0xDC)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCREL3L_TCRELL                0
+
+/* Field Masks */
+#define FM_DACMBCREL3L_TCRELL                0XFF
+
+/* Register Masks */
+#define RM_DACMBCREL3L_TCRELL \
+	 RM(FM_DACMBCREL3L_TCRELL, FB_DACMBCREL3L_TCRELL)
+
+
+/**********************************
+ *      R_DACMBCREL3H (0xDD)      *
+ **********************************/
+
+/* Field Offsets */
+#define FB_DACMBCREL3H_TCRELH                0
+
+/* Field Masks */
+#define FM_DACMBCREL3H_TCRELH                0XFF
+
+/* Register Masks */
+#define RM_DACMBCREL3H_TCRELH \
+	 RM(FM_DACMBCREL3H_TCRELH, FB_DACMBCREL3H_TCRELH)
+
+
+#endif /* __WOOKIE_H__ */
diff --git a/sound/soc/codecs/twl4030.c b/sound/soc/codecs/twl4030.c
index cfe72b9..8798182 100644
--- a/sound/soc/codecs/twl4030.c
+++ b/sound/soc/codecs/twl4030.c
@@ -240,7 +240,6 @@ static struct twl4030_codec_data *twl4030_get_pdata(struct snd_soc_codec *codec)
 				     sizeof(struct twl4030_codec_data),
 				     GFP_KERNEL);
 		if (!pdata) {
-			dev_err(codec->dev, "Can not allocate memory\n");
 			of_node_put(twl4030_codec_node);
 			return NULL;
 		}
@@ -851,14 +850,14 @@ static int snd_soc_get_volsw_twl4030(struct snd_kcontrol *kcontrol,
 	int mask = (1 << fls(max)) - 1;
 
 	ucontrol->value.integer.value[0] =
-		(snd_soc_read(codec, reg) >> shift) & mask;
+		(twl4030_read(codec, reg) >> shift) & mask;
 	if (ucontrol->value.integer.value[0])
 		ucontrol->value.integer.value[0] =
 			max + 1 - ucontrol->value.integer.value[0];
 
 	if (shift != rshift) {
 		ucontrol->value.integer.value[1] =
-			(snd_soc_read(codec, reg) >> rshift) & mask;
+			(twl4030_read(codec, reg) >> rshift) & mask;
 		if (ucontrol->value.integer.value[1])
 			ucontrol->value.integer.value[1] =
 				max + 1 - ucontrol->value.integer.value[1];
@@ -909,9 +908,9 @@ static int snd_soc_get_volsw_r2_twl4030(struct snd_kcontrol *kcontrol,
 	int mask = (1<<fls(max))-1;
 
 	ucontrol->value.integer.value[0] =
-		(snd_soc_read(codec, reg) >> shift) & mask;
+		(twl4030_read(codec, reg) >> shift) & mask;
 	ucontrol->value.integer.value[1] =
-		(snd_soc_read(codec, reg2) >> shift) & mask;
+		(twl4030_read(codec, reg2) >> shift) & mask;
 
 	if (ucontrol->value.integer.value[0])
 		ucontrol->value.integer.value[0] =
@@ -2196,8 +2195,6 @@ static int twl4030_soc_remove(struct snd_soc_codec *codec)
 static const struct snd_soc_codec_driver soc_codec_dev_twl4030 = {
 	.probe = twl4030_soc_probe,
 	.remove = twl4030_soc_remove,
-	.read = twl4030_read,
-	.write = twl4030_write,
 	.set_bias_level = twl4030_set_bias_level,
 	.idle_bias_off = true,
 
diff --git a/sound/soc/codecs/twl6040.c b/sound/soc/codecs/twl6040.c
index 1773ff8..3b895b4 100644
--- a/sound/soc/codecs/twl6040.c
+++ b/sound/soc/codecs/twl6040.c
@@ -106,10 +106,12 @@ static const struct snd_pcm_hw_constraint_list sysclk_constraints[] = {
 	{ .count = ARRAY_SIZE(hp_rates), .list = hp_rates, },
 };
 
+#define to_twl6040(codec)	dev_get_drvdata((codec)->dev->parent)
+
 static unsigned int twl6040_read(struct snd_soc_codec *codec, unsigned int reg)
 {
 	struct twl6040_data *priv = snd_soc_codec_get_drvdata(codec);
-	struct twl6040 *twl6040 = codec->control_data;
+	struct twl6040 *twl6040 = to_twl6040(codec);
 	u8 value;
 
 	if (reg >= TWL6040_CACHEREGNUM)
@@ -171,7 +173,7 @@ static inline void twl6040_update_dl12_cache(struct snd_soc_codec *codec,
 static int twl6040_write(struct snd_soc_codec *codec,
 			unsigned int reg, unsigned int value)
 {
-	struct twl6040 *twl6040 = codec->control_data;
+	struct twl6040 *twl6040 = to_twl6040(codec);
 
 	if (reg >= TWL6040_CACHEREGNUM)
 		return -EIO;
@@ -541,7 +543,7 @@ int twl6040_get_dl1_gain(struct snd_soc_codec *codec)
 	if (snd_soc_dapm_get_pin_status(dapm, "HSOR") ||
 		snd_soc_dapm_get_pin_status(dapm, "HSOL")) {
 
-		u8 val = snd_soc_read(codec, TWL6040_REG_HSLCTL);
+		u8 val = twl6040_read(codec, TWL6040_REG_HSLCTL);
 		if (val & TWL6040_HSDACMODE)
 			/* HSDACL in LP mode */
 			return -8; /* -8dB */
@@ -572,7 +574,7 @@ EXPORT_SYMBOL_GPL(twl6040_get_trim_value);
 
 int twl6040_get_hs_step_size(struct snd_soc_codec *codec)
 {
-	struct twl6040 *twl6040 = codec->control_data;
+	struct twl6040 *twl6040 = to_twl6040(codec);
 
 	if (twl6040_get_revid(twl6040) < TWL6040_REV_ES1_3)
 		/* For ES under ES_1.3 HS step is 2 mV */
@@ -830,7 +832,7 @@ static const struct snd_soc_dapm_route intercon[] = {
 static int twl6040_set_bias_level(struct snd_soc_codec *codec,
 				enum snd_soc_bias_level level)
 {
-	struct twl6040 *twl6040 = codec->control_data;
+	struct twl6040 *twl6040 = to_twl6040(codec);
 	struct twl6040_data *priv = snd_soc_codec_get_drvdata(codec);
 	int ret = 0;
 
@@ -922,7 +924,7 @@ static int twl6040_prepare(struct snd_pcm_substream *substream,
 			struct snd_soc_dai *dai)
 {
 	struct snd_soc_codec *codec = dai->codec;
-	struct twl6040 *twl6040 = codec->control_data;
+	struct twl6040 *twl6040 = to_twl6040(codec);
 	struct twl6040_data *priv = snd_soc_codec_get_drvdata(codec);
 	int ret;
 
@@ -964,7 +966,7 @@ static int twl6040_set_dai_sysclk(struct snd_soc_dai *codec_dai,
 static void twl6040_mute_path(struct snd_soc_codec *codec, enum twl6040_dai_id id,
 			     int mute)
 {
-	struct twl6040 *twl6040 = codec->control_data;
+	struct twl6040 *twl6040 = to_twl6040(codec);
 	struct twl6040_data *priv = snd_soc_codec_get_drvdata(codec);
 	int hslctl, hsrctl, earctl;
 	int hflctl, hfrctl;
@@ -1108,7 +1110,6 @@ static struct snd_soc_dai_driver twl6040_dai[] = {
 static int twl6040_probe(struct snd_soc_codec *codec)
 {
 	struct twl6040_data *priv;
-	struct twl6040 *twl6040 = dev_get_drvdata(codec->dev->parent);
 	struct platform_device *pdev = to_platform_device(codec->dev);
 	int ret = 0;
 
@@ -1119,7 +1120,6 @@ static int twl6040_probe(struct snd_soc_codec *codec)
 	snd_soc_codec_set_drvdata(codec, priv);
 
 	priv->codec = codec;
-	codec->control_data = twl6040;
 
 	priv->plug_irq = platform_get_irq(pdev, 0);
 	if (priv->plug_irq < 0) {
@@ -1158,8 +1158,6 @@ static int twl6040_remove(struct snd_soc_codec *codec)
 static const struct snd_soc_codec_driver soc_codec_dev_twl6040 = {
 	.probe = twl6040_probe,
 	.remove = twl6040_remove,
-	.read = twl6040_read,
-	.write = twl6040_write,
 	.set_bias_level = twl6040_set_bias_level,
 	.suspend_bias_off = true,
 	.ignore_pmdown_time = true,
diff --git a/sound/soc/codecs/uda1380.c b/sound/soc/codecs/uda1380.c
index 926c81a..c73e6a1 100644
--- a/sound/soc/codecs/uda1380.c
+++ b/sound/soc/codecs/uda1380.c
@@ -37,7 +37,8 @@ struct uda1380_priv {
 	struct snd_soc_codec *codec;
 	unsigned int dac_clk;
 	struct work_struct work;
-	void *control_data;
+	struct i2c_client *i2c;
+	u16 *reg_cache;
 };
 
 /*
@@ -63,7 +64,9 @@ static unsigned long uda1380_cache_dirty;
 static inline unsigned int uda1380_read_reg_cache(struct snd_soc_codec *codec,
 	unsigned int reg)
 {
-	u16 *cache = codec->reg_cache;
+	struct uda1380_priv *uda1380 = snd_soc_codec_get_drvdata(codec);
+	u16 *cache = uda1380->reg_cache;
+
 	if (reg == UDA1380_RESET)
 		return 0;
 	if (reg >= UDA1380_CACHEREGNUM)
@@ -77,7 +80,8 @@ static inline unsigned int uda1380_read_reg_cache(struct snd_soc_codec *codec,
 static inline void uda1380_write_reg_cache(struct snd_soc_codec *codec,
 	u16 reg, unsigned int value)
 {
-	u16 *cache = codec->reg_cache;
+	struct uda1380_priv *uda1380 = snd_soc_codec_get_drvdata(codec);
+	u16 *cache = uda1380->reg_cache;
 
 	if (reg >= UDA1380_CACHEREGNUM)
 		return;
@@ -92,6 +96,7 @@ static inline void uda1380_write_reg_cache(struct snd_soc_codec *codec,
 static int uda1380_write(struct snd_soc_codec *codec, unsigned int reg,
 	unsigned int value)
 {
+	struct uda1380_priv *uda1380 = snd_soc_codec_get_drvdata(codec);
 	u8 data[3];
 
 	/* data is
@@ -111,10 +116,10 @@ static int uda1380_write(struct snd_soc_codec *codec, unsigned int reg,
 	if (!snd_soc_codec_is_active(codec) && (reg >= UDA1380_MVOL))
 		return 0;
 	pr_debug("uda1380: hw write %x val %x\n", reg, value);
-	if (codec->hw_write(codec->control_data, data, 3) == 3) {
+	if (i2c_master_send(uda1380->i2c, data, 3) == 3) {
 		unsigned int val;
-		i2c_master_send(codec->control_data, data, 1);
-		i2c_master_recv(codec->control_data, data, 2);
+		i2c_master_send(uda1380->i2c, data, 1);
+		i2c_master_recv(uda1380->i2c, data, 2);
 		val = (data[0]<<8) | data[1];
 		if (val != value) {
 			pr_debug("uda1380: READ BACK VAL %x\n",
@@ -130,16 +135,17 @@ static int uda1380_write(struct snd_soc_codec *codec, unsigned int reg,
 
 static void uda1380_sync_cache(struct snd_soc_codec *codec)
 {
+	struct uda1380_priv *uda1380 = snd_soc_codec_get_drvdata(codec);
 	int reg;
 	u8 data[3];
-	u16 *cache = codec->reg_cache;
+	u16 *cache = uda1380->reg_cache;
 
 	/* Sync reg_cache with the hardware */
 	for (reg = 0; reg < UDA1380_MVOL; reg++) {
 		data[0] = reg;
 		data[1] = (cache[reg] & 0xff00) >> 8;
 		data[2] = cache[reg] & 0x00ff;
-		if (codec->hw_write(codec->control_data, data, 3) != 3)
+		if (i2c_master_send(uda1380->i2c, data, 3) != 3)
 			dev_err(codec->dev, "%s: write to reg 0x%x failed\n",
 				__func__, reg);
 	}
@@ -148,6 +154,7 @@ static void uda1380_sync_cache(struct snd_soc_codec *codec)
 static int uda1380_reset(struct snd_soc_codec *codec)
 {
 	struct uda1380_platform_data *pdata = codec->dev->platform_data;
+	struct uda1380_priv *uda1380 = snd_soc_codec_get_drvdata(codec);
 
 	if (gpio_is_valid(pdata->gpio_reset)) {
 		gpio_set_value(pdata->gpio_reset, 1);
@@ -160,7 +167,7 @@ static int uda1380_reset(struct snd_soc_codec *codec)
 		data[1] = 0;
 		data[2] = 0;
 
-		if (codec->hw_write(codec->control_data, data, 3) != 3) {
+		if (i2c_master_send(uda1380->i2c, data, 3) != 3) {
 			dev_err(codec->dev, "%s: failed\n", __func__);
 			return -EIO;
 		}
@@ -695,9 +702,6 @@ static int uda1380_probe(struct snd_soc_codec *codec)
 
 	uda1380->codec = codec;
 
-	codec->hw_write = (hw_write_t)i2c_master_send;
-	codec->control_data = uda1380->control_data;
-
 	if (!gpio_is_valid(pdata->gpio_power)) {
 		ret = uda1380_reset(codec);
 		if (ret)
@@ -727,11 +731,6 @@ static const struct snd_soc_codec_driver soc_codec_dev_uda1380 = {
 	.set_bias_level = uda1380_set_bias_level,
 	.suspend_bias_off = true,
 
-	.reg_cache_size = ARRAY_SIZE(uda1380_reg),
-	.reg_word_size = sizeof(u16),
-	.reg_cache_default = uda1380_reg,
-	.reg_cache_step = 1,
-
 	.component_driver = {
 		.controls		= uda1380_snd_controls,
 		.num_controls		= ARRAY_SIZE(uda1380_snd_controls),
@@ -771,8 +770,15 @@ static int uda1380_i2c_probe(struct i2c_client *i2c,
 			return ret;
 	}
 
+	uda1380->reg_cache = devm_kmemdup(&i2c->dev,
+					uda1380_reg,
+					ARRAY_SIZE(uda1380_reg) * sizeof(u16),
+					GFP_KERNEL);
+	if (!uda1380->reg_cache)
+		return -ENOMEM;
+
 	i2c_set_clientdata(i2c, uda1380);
-	uda1380->control_data = i2c;
+	uda1380->i2c = i2c;
 
 	ret =  snd_soc_register_codec(&i2c->dev,
 			&soc_codec_dev_uda1380, uda1380_dai, ARRAY_SIZE(uda1380_dai));
diff --git a/sound/soc/codecs/wm0010.c b/sound/soc/codecs/wm0010.c
index 4f5f571..0147d2f 100644
--- a/sound/soc/codecs/wm0010.c
+++ b/sound/soc/codecs/wm0010.c
@@ -655,11 +655,8 @@ static int wm0010_boot(struct snd_soc_codec *codec)
 		ret = -ENOMEM;
 		len = pll_rec.length + 8;
 		out = kzalloc(len, GFP_KERNEL | GFP_DMA);
-		if (!out) {
-			dev_err(codec->dev,
-				"Failed to allocate RX buffer\n");
+		if (!out)
 			goto abort;
-		}
 
 		img_swap = kzalloc(len, GFP_KERNEL | GFP_DMA);
 		if (!img_swap)
diff --git a/sound/soc/codecs/wm2000.c b/sound/soc/codecs/wm2000.c
index 23cde3a..abfa052 100644
--- a/sound/soc/codecs/wm2000.c
+++ b/sound/soc/codecs/wm2000.c
@@ -13,7 +13,7 @@
  * 'wm2000_anc.bin' by default (overridable via platform data) at
  * runtime and is expected to be in flat binary format.  This is
  * generated by Wolfson configuration tools and includes
- * system-specific callibration information.  If supplied as a
+ * system-specific calibration information.  If supplied as a
  * sequence of ASCII-encoded hexidecimal bytes this can be converted
  * into a flat binary with a command such as this on the command line:
  *
@@ -826,8 +826,7 @@ static int wm2000_i2c_probe(struct i2c_client *i2c,
 	int reg;
 	u16 id;
 
-	wm2000 = devm_kzalloc(&i2c->dev, sizeof(struct wm2000_priv),
-			      GFP_KERNEL);
+	wm2000 = devm_kzalloc(&i2c->dev, sizeof(*wm2000), GFP_KERNEL);
 	if (!wm2000)
 		return -ENOMEM;
 
@@ -902,7 +901,6 @@ static int wm2000_i2c_probe(struct i2c_client *i2c,
 					    wm2000->anc_download_size,
 					    GFP_KERNEL);
 	if (wm2000->anc_download == NULL) {
-		dev_err(&i2c->dev, "Out of memory\n");
 		ret = -ENOMEM;
 		goto err_supplies;
 	}
diff --git a/sound/soc/codecs/wm2200.c b/sound/soc/codecs/wm2200.c
index d83dab5..5c2f572 100644
--- a/sound/soc/codecs/wm2200.c
+++ b/sound/soc/codecs/wm2200.c
@@ -98,6 +98,8 @@ struct wm2200_priv {
 
 	int rev;
 	int sysclk;
+
+	unsigned int symmetric_rates:1;
 };
 
 #define WM2200_DSP_RANGE_BASE (WM2200_MAX_REGISTER + 1)
@@ -1550,7 +1552,7 @@ static const struct snd_soc_dapm_route wm2200_dapm_routes[] = {
 
 static int wm2200_probe(struct snd_soc_codec *codec)
 {
-	struct wm2200_priv *wm2200 = dev_get_drvdata(codec->dev);
+	struct wm2200_priv *wm2200 = snd_soc_codec_get_drvdata(codec);
 	int ret;
 
 	wm2200->codec = codec;
@@ -1758,7 +1760,7 @@ static int wm2200_hw_params(struct snd_pcm_substream *substream,
 	lrclk = bclk_rates[bclk] / params_rate(params);
 	dev_dbg(codec->dev, "Setting %dHz LRCLK\n", bclk_rates[bclk] / lrclk);
 	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK ||
-	    dai->symmetric_rates)
+	    wm2200->symmetric_rates)
 		snd_soc_update_bits(codec, WM2200_AUDIO_IF_1_7,
 				    WM2200_AIF1RX_BCPF_MASK, lrclk);
 	else
@@ -2059,13 +2061,14 @@ static int wm2200_set_fll(struct snd_soc_codec *codec, int fll_id, int source,
 static int wm2200_dai_probe(struct snd_soc_dai *dai)
 {
 	struct snd_soc_codec *codec = dai->codec;
+	struct wm2200_priv *wm2200 = snd_soc_codec_get_drvdata(codec);
 	unsigned int val = 0;
 	int ret;
 
 	ret = snd_soc_read(codec, WM2200_GPIO_CTRL_1);
 	if (ret >= 0) {
 		if ((ret & WM2200_GP1_FN_MASK) != 0) {
-			dai->symmetric_rates = true;
+			wm2200->symmetric_rates = true;
 			val = WM2200_AIF1TX_LRCLK_SRC;
 		}
 	} else {
diff --git a/sound/soc/codecs/wm5102.c b/sound/soc/codecs/wm5102.c
index 4f0481d3..fc066ca 100644
--- a/sound/soc/codecs/wm5102.c
+++ b/sound/soc/codecs/wm5102.c
@@ -1935,8 +1935,11 @@ static int wm5102_codec_probe(struct snd_soc_codec *codec)
 	struct snd_soc_dapm_context *dapm = snd_soc_codec_get_dapm(codec);
 	struct snd_soc_component *component = snd_soc_dapm_to_component(dapm);
 	struct wm5102_priv *priv = snd_soc_codec_get_drvdata(codec);
+	struct arizona *arizona = priv->core.arizona;
 	int ret;
 
+	snd_soc_codec_init_regmap(codec, arizona->regmap);
+
 	ret = wm_adsp2_codec_probe(&priv->core.adsp[0], codec);
 	if (ret)
 		return ret;
@@ -1989,17 +1992,9 @@ static unsigned int wm5102_digital_vu[] = {
 	ARIZONA_DAC_DIGITAL_VOLUME_5R,
 };
 
-static struct regmap *wm5102_get_regmap(struct device *dev)
-{
-	struct wm5102_priv *priv = dev_get_drvdata(dev);
-
-	return priv->core.arizona->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_wm5102 = {
 	.probe = wm5102_codec_probe,
 	.remove = wm5102_codec_remove,
-	.get_regmap = wm5102_get_regmap,
 
 	.idle_bias_off = true,
 
diff --git a/sound/soc/codecs/wm5110.c b/sound/soc/codecs/wm5110.c
index 6ed1e1f..fb0cf9c 100644
--- a/sound/soc/codecs/wm5110.c
+++ b/sound/soc/codecs/wm5110.c
@@ -2280,9 +2280,11 @@ static int wm5110_codec_probe(struct snd_soc_codec *codec)
 	struct snd_soc_dapm_context *dapm = snd_soc_codec_get_dapm(codec);
 	struct snd_soc_component *component = snd_soc_dapm_to_component(dapm);
 	struct wm5110_priv *priv = snd_soc_codec_get_drvdata(codec);
+	struct arizona *arizona = priv->core.arizona;
 	int i, ret;
 
-	priv->core.arizona->dapm = dapm;
+	arizona->dapm = dapm;
+	snd_soc_codec_init_regmap(codec, arizona->regmap);
 
 	ret = arizona_init_spk(codec);
 	if (ret < 0)
@@ -2344,17 +2346,9 @@ static unsigned int wm5110_digital_vu[] = {
 	ARIZONA_DAC_DIGITAL_VOLUME_6R,
 };
 
-static struct regmap *wm5110_get_regmap(struct device *dev)
-{
-	struct wm5110_priv *priv = dev_get_drvdata(dev);
-
-	return priv->core.arizona->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_wm5110 = {
 	.probe = wm5110_codec_probe,
 	.remove = wm5110_codec_remove,
-	.get_regmap = wm5110_get_regmap,
 
 	.idle_bias_off = true,
 
diff --git a/sound/soc/codecs/wm8350.c b/sound/soc/codecs/wm8350.c
index 2efc5b4..fc79c67 100644
--- a/sound/soc/codecs/wm8350.c
+++ b/sound/soc/codecs/wm8350.c
@@ -1472,6 +1472,8 @@ static  int wm8350_codec_probe(struct snd_soc_codec *codec)
 			    GFP_KERNEL);
 	if (priv == NULL)
 		return -ENOMEM;
+
+	snd_soc_codec_init_regmap(codec, wm8350->regmap);
 	snd_soc_codec_set_drvdata(codec, priv);
 
 	priv->wm8350 = wm8350;
@@ -1580,17 +1582,9 @@ static int  wm8350_codec_remove(struct snd_soc_codec *codec)
 	return 0;
 }
 
-static struct regmap *wm8350_get_regmap(struct device *dev)
-{
-	struct wm8350 *wm8350 = dev_get_platdata(dev);
-
-	return wm8350->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_wm8350 = {
 	.probe =	wm8350_codec_probe,
 	.remove =	wm8350_codec_remove,
-	.get_regmap =	wm8350_get_regmap,
 	.set_bias_level = wm8350_set_bias_level,
 	.suspend_bias_off = true,
 
diff --git a/sound/soc/codecs/wm8400.c b/sound/soc/codecs/wm8400.c
index 6c59fb9..a36adf8 100644
--- a/sound/soc/codecs/wm8400.c
+++ b/sound/soc/codecs/wm8400.c
@@ -1285,6 +1285,7 @@ static int wm8400_codec_probe(struct snd_soc_codec *codec)
 	if (priv == NULL)
 		return -ENOMEM;
 
+	snd_soc_codec_init_regmap(codec, wm8400->regmap);
 	snd_soc_codec_set_drvdata(codec, priv);
 	priv->wm8400 = wm8400;
 
@@ -1325,17 +1326,9 @@ static int  wm8400_codec_remove(struct snd_soc_codec *codec)
 	return 0;
 }
 
-static struct regmap *wm8400_get_regmap(struct device *dev)
-{
-	struct wm8400 *wm8400 = dev_get_platdata(dev);
-
-	return wm8400->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_wm8400 = {
 	.probe =	wm8400_codec_probe,
 	.remove =	wm8400_codec_remove,
-	.get_regmap =	wm8400_get_regmap,
 	.set_bias_level = wm8400_set_bias_level,
 	.suspend_bias_off = true,
 
diff --git a/sound/soc/codecs/wm8903.c b/sound/soc/codecs/wm8903.c
index 237eeb9..cba90f2 100644
--- a/sound/soc/codecs/wm8903.c
+++ b/sound/soc/codecs/wm8903.c
@@ -1995,8 +1995,7 @@ static int wm8903_i2c_probe(struct i2c_client *i2c,
 	unsigned int val, irq_pol;
 	int ret, i;
 
-	wm8903 = devm_kzalloc(&i2c->dev,  sizeof(struct wm8903_priv),
-			      GFP_KERNEL);
+	wm8903 = devm_kzalloc(&i2c->dev, sizeof(*wm8903), GFP_KERNEL);
 	if (wm8903 == NULL)
 		return -ENOMEM;
 
@@ -2017,13 +2016,10 @@ static int wm8903_i2c_probe(struct i2c_client *i2c,
 	if (pdata) {
 		wm8903->pdata = pdata;
 	} else {
-		wm8903->pdata = devm_kzalloc(&i2c->dev,
-					sizeof(struct wm8903_platform_data),
-					GFP_KERNEL);
-		if (wm8903->pdata == NULL) {
-			dev_err(&i2c->dev, "Failed to allocate pdata\n");
+		wm8903->pdata = devm_kzalloc(&i2c->dev, sizeof(*wm8903->pdata),
+					     GFP_KERNEL);
+		if (!wm8903->pdata)
 			return -ENOMEM;
-		}
 
 		if (i2c->irq) {
 			ret = wm8903_set_pdata_irq_trigger(i2c, wm8903->pdata);
diff --git a/sound/soc/codecs/wm8994.c b/sound/soc/codecs/wm8994.c
index f91b49e..21ffd64 100644
--- a/sound/soc/codecs/wm8994.c
+++ b/sound/soc/codecs/wm8994.c
@@ -3993,6 +3993,8 @@ static int wm8994_codec_probe(struct snd_soc_codec *codec)
 	unsigned int reg;
 	int ret, i;
 
+	snd_soc_codec_init_regmap(codec, control->regmap);
+
 	wm8994->hubs.codec = codec;
 
 	mutex_init(&wm8994->accdet_lock);
@@ -4434,19 +4436,11 @@ static int wm8994_codec_remove(struct snd_soc_codec *codec)
 	return 0;
 }
 
-static struct regmap *wm8994_get_regmap(struct device *dev)
-{
-	struct wm8994 *control = dev_get_drvdata(dev->parent);
-
-	return control->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_wm8994 = {
 	.probe =	wm8994_codec_probe,
 	.remove =	wm8994_codec_remove,
 	.suspend =	wm8994_codec_suspend,
 	.resume =	wm8994_codec_resume,
-	.get_regmap =   wm8994_get_regmap,
 	.set_bias_level = wm8994_set_bias_level,
 };
 
diff --git a/sound/soc/codecs/wm8997.c b/sound/soc/codecs/wm8997.c
index 77f5127..cac9b3e 100644
--- a/sound/soc/codecs/wm8997.c
+++ b/sound/soc/codecs/wm8997.c
@@ -1062,8 +1062,11 @@ static int wm8997_codec_probe(struct snd_soc_codec *codec)
 	struct snd_soc_dapm_context *dapm = snd_soc_codec_get_dapm(codec);
 	struct snd_soc_component *component = snd_soc_dapm_to_component(dapm);
 	struct wm8997_priv *priv = snd_soc_codec_get_drvdata(codec);
+	struct arizona *arizona = priv->core.arizona;
 	int ret;
 
+	snd_soc_codec_init_regmap(codec, arizona->regmap);
+
 	ret = arizona_init_spk(codec);
 	if (ret < 0)
 		return ret;
@@ -1095,17 +1098,9 @@ static unsigned int wm8997_digital_vu[] = {
 	ARIZONA_DAC_DIGITAL_VOLUME_5R,
 };
 
-static struct regmap *wm8997_get_regmap(struct device *dev)
-{
-	struct wm8997_priv *priv = dev_get_drvdata(dev);
-
-	return priv->core.arizona->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_wm8997 = {
 	.probe = wm8997_codec_probe,
 	.remove = wm8997_codec_remove,
-	.get_regmap =   wm8997_get_regmap,
 
 	.idle_bias_off = true,
 
diff --git a/sound/soc/codecs/wm8998.c b/sound/soc/codecs/wm8998.c
index 2d211db..1288e1f 100644
--- a/sound/soc/codecs/wm8998.c
+++ b/sound/soc/codecs/wm8998.c
@@ -1275,9 +1275,11 @@ static int wm8998_codec_probe(struct snd_soc_codec *codec)
 	struct wm8998_priv *priv = snd_soc_codec_get_drvdata(codec);
 	struct snd_soc_dapm_context *dapm = snd_soc_codec_get_dapm(codec);
 	struct snd_soc_component *component = snd_soc_dapm_to_component(dapm);
+	struct arizona *arizona = priv->core.arizona;
 	int ret;
 
-	priv->core.arizona->dapm = dapm;
+	arizona->dapm = dapm;
+	snd_soc_codec_init_regmap(codec, arizona->regmap);
 
 	ret = arizona_init_spk(codec);
 	if (ret < 0)
@@ -1313,17 +1315,9 @@ static unsigned int wm8998_digital_vu[] = {
 	ARIZONA_DAC_DIGITAL_VOLUME_5R,
 };
 
-static struct regmap *wm8998_get_regmap(struct device *dev)
-{
-	struct wm8998_priv *priv = dev_get_drvdata(dev);
-
-	return priv->core.arizona->regmap;
-}
-
 static const struct snd_soc_codec_driver soc_codec_dev_wm8998 = {
 	.probe = wm8998_codec_probe,
 	.remove = wm8998_codec_remove,
-	.get_regmap = wm8998_get_regmap,
 
 	.idle_bias_off = true,
 
diff --git a/sound/soc/davinci/davinci-mcasp.c b/sound/soc/davinci/davinci-mcasp.c
index 804c6f2..03ba218 100644
--- a/sound/soc/davinci/davinci-mcasp.c
+++ b/sound/soc/davinci/davinci-mcasp.c
@@ -1242,6 +1242,20 @@ static int davinci_mcasp_hw_rule_format(struct snd_pcm_hw_params *params,
 	return snd_mask_refine(fmt, &nfmt);
 }
 
+static int davinci_mcasp_hw_rule_min_periodsize(
+		struct snd_pcm_hw_params *params, struct snd_pcm_hw_rule *rule)
+{
+	struct snd_interval *period_size = hw_param_interval(params,
+						SNDRV_PCM_HW_PARAM_PERIOD_SIZE);
+	struct snd_interval frames;
+
+	snd_interval_any(&frames);
+	frames.min = 64;
+	frames.integer = 1;
+
+	return snd_interval_refine(period_size, &frames);
+}
+
 static int davinci_mcasp_startup(struct snd_pcm_substream *substream,
 				 struct snd_soc_dai *cpu_dai)
 {
@@ -1333,6 +1347,11 @@ static int davinci_mcasp_startup(struct snd_pcm_substream *substream,
 			return ret;
 	}
 
+	snd_pcm_hw_rule_add(substream->runtime, 0,
+			    SNDRV_PCM_HW_PARAM_PERIOD_SIZE,
+			    davinci_mcasp_hw_rule_min_periodsize, NULL,
+			    SNDRV_PCM_HW_PARAM_PERIOD_SIZE, -1);
+
 	return 0;
 }
 
diff --git a/sound/soc/fsl/eukrea-tlv320.c b/sound/soc/fsl/eukrea-tlv320.c
index 84ef638..191426a 100644
--- a/sound/soc/fsl/eukrea-tlv320.c
+++ b/sound/soc/fsl/eukrea-tlv320.c
@@ -29,7 +29,6 @@
 
 #include "../codecs/tlv320aic23.h"
 #include "imx-ssi.h"
-#include "fsl_ssi.h"
 #include "imx-audmux.h"
 
 #define CODEC_CLOCK 12000000
diff --git a/sound/soc/fsl/fsl-asoc-card.c b/sound/soc/fsl/fsl-asoc-card.c
index 1225e03..989be51 100644
--- a/sound/soc/fsl/fsl-asoc-card.c
+++ b/sound/soc/fsl/fsl-asoc-card.c
@@ -442,8 +442,8 @@ static int fsl_asoc_card_late_probe(struct snd_soc_card *card)
 
 	if (fsl_asoc_card_is_ac97(priv)) {
 #if IS_ENABLED(CONFIG_SND_AC97_CODEC)
-		struct snd_soc_codec *codec = rtd->codec;
-		struct snd_ac97 *ac97 = snd_soc_codec_get_drvdata(codec);
+		struct snd_soc_component *component = rtd->codec_dai->component;
+		struct snd_ac97 *ac97 = snd_soc_component_get_drvdata(component);
 
 		/*
 		 * Use slots 3/4 for S/PDIF so SSI won't try to enable
diff --git a/sound/soc/fsl/fsl_asrc.h b/sound/soc/fsl/fsl_asrc.h
index 52c27a35..2c5856a 100644
--- a/sound/soc/fsl/fsl_asrc.h
+++ b/sound/soc/fsl/fsl_asrc.h
@@ -57,7 +57,7 @@
 #define REG_ASRDOC			0x74
 #define REG_ASRDI(i)			(REG_ASRDIA + (i << 3))
 #define REG_ASRDO(i)			(REG_ASRDOA + (i << 3))
-#define REG_ASRDx(x, i)			(x == IN ? REG_ASRDI(i) : REG_ASRDO(i))
+#define REG_ASRDx(x, i)			((x) == IN ? REG_ASRDI(i) : REG_ASRDO(i))
 
 #define REG_ASRIDRHA			0x80
 #define REG_ASRIDRLA			0x84
diff --git a/sound/soc/fsl/fsl_dma.c b/sound/soc/fsl/fsl_dma.c
index 0c11f43..8c2981b 100644
--- a/sound/soc/fsl/fsl_dma.c
+++ b/sound/soc/fsl/fsl_dma.c
@@ -913,8 +913,8 @@ static int fsl_soc_dma_probe(struct platform_device *pdev)
 	dma->dai.pcm_free = fsl_dma_free_dma_buffers;
 
 	/* Store the SSI-specific information that we need */
-	dma->ssi_stx_phys = res.start + CCSR_SSI_STX0;
-	dma->ssi_srx_phys = res.start + CCSR_SSI_SRX0;
+	dma->ssi_stx_phys = res.start + REG_SSI_STX0;
+	dma->ssi_srx_phys = res.start + REG_SSI_SRX0;
 
 	iprop = of_get_property(ssi_np, "fsl,fifo-depth", NULL);
 	if (iprop)
diff --git a/sound/soc/fsl/fsl_ssi.c b/sound/soc/fsl/fsl_ssi.c
index 424bafa..aecd00f 100644
--- a/sound/soc/fsl/fsl_ssi.c
+++ b/sound/soc/fsl/fsl_ssi.c
@@ -69,21 +69,35 @@
  * samples will be written to STX properly.
  */
 #ifdef __BIG_ENDIAN
-#define FSLSSI_I2S_FORMATS (SNDRV_PCM_FMTBIT_S8 | SNDRV_PCM_FMTBIT_S16_BE | \
-	 SNDRV_PCM_FMTBIT_S18_3BE | SNDRV_PCM_FMTBIT_S20_3BE | \
-	 SNDRV_PCM_FMTBIT_S24_3BE | SNDRV_PCM_FMTBIT_S24_BE)
+#define FSLSSI_I2S_FORMATS \
+	(SNDRV_PCM_FMTBIT_S8 | \
+	 SNDRV_PCM_FMTBIT_S16_BE | \
+	 SNDRV_PCM_FMTBIT_S18_3BE | \
+	 SNDRV_PCM_FMTBIT_S20_3BE | \
+	 SNDRV_PCM_FMTBIT_S24_3BE | \
+	 SNDRV_PCM_FMTBIT_S24_BE)
 #else
-#define FSLSSI_I2S_FORMATS (SNDRV_PCM_FMTBIT_S8 | SNDRV_PCM_FMTBIT_S16_LE | \
-	 SNDRV_PCM_FMTBIT_S18_3LE | SNDRV_PCM_FMTBIT_S20_3LE | \
-	 SNDRV_PCM_FMTBIT_S24_3LE | SNDRV_PCM_FMTBIT_S24_LE)
+#define FSLSSI_I2S_FORMATS \
+	(SNDRV_PCM_FMTBIT_S8 | \
+	 SNDRV_PCM_FMTBIT_S16_LE | \
+	 SNDRV_PCM_FMTBIT_S18_3LE | \
+	 SNDRV_PCM_FMTBIT_S20_3LE | \
+	 SNDRV_PCM_FMTBIT_S24_3LE | \
+	 SNDRV_PCM_FMTBIT_S24_LE)
 #endif
 
-#define FSLSSI_SIER_DBG_RX_FLAGS (CCSR_SSI_SIER_RFF0_EN | \
-		CCSR_SSI_SIER_RLS_EN | CCSR_SSI_SIER_RFS_EN | \
-		CCSR_SSI_SIER_ROE0_EN | CCSR_SSI_SIER_RFRC_EN)
-#define FSLSSI_SIER_DBG_TX_FLAGS (CCSR_SSI_SIER_TFE0_EN | \
-		CCSR_SSI_SIER_TLS_EN | CCSR_SSI_SIER_TFS_EN | \
-		CCSR_SSI_SIER_TUE0_EN | CCSR_SSI_SIER_TFRC_EN)
+#define FSLSSI_SIER_DBG_RX_FLAGS \
+	(SSI_SIER_RFF0_EN | \
+	 SSI_SIER_RLS_EN | \
+	 SSI_SIER_RFS_EN | \
+	 SSI_SIER_ROE0_EN | \
+	 SSI_SIER_RFRC_EN)
+#define FSLSSI_SIER_DBG_TX_FLAGS \
+	(SSI_SIER_TFE0_EN | \
+	 SSI_SIER_TLS_EN | \
+	 SSI_SIER_TFS_EN | \
+	 SSI_SIER_TUE0_EN | \
+	 SSI_SIER_TFRC_EN)
 
 enum fsl_ssi_type {
 	FSL_SSI_MCP8610,
@@ -92,23 +106,18 @@ enum fsl_ssi_type {
 	FSL_SSI_MX51,
 };
 
-struct fsl_ssi_reg_val {
+struct fsl_ssi_regvals {
 	u32 sier;
 	u32 srcr;
 	u32 stcr;
 	u32 scr;
 };
 
-struct fsl_ssi_rxtx_reg_val {
-	struct fsl_ssi_reg_val rx;
-	struct fsl_ssi_reg_val tx;
-};
-
 static bool fsl_ssi_readable_reg(struct device *dev, unsigned int reg)
 {
 	switch (reg) {
-	case CCSR_SSI_SACCEN:
-	case CCSR_SSI_SACCDIS:
+	case REG_SSI_SACCEN:
+	case REG_SSI_SACCDIS:
 		return false;
 	default:
 		return true;
@@ -118,18 +127,18 @@ static bool fsl_ssi_readable_reg(struct device *dev, unsigned int reg)
 static bool fsl_ssi_volatile_reg(struct device *dev, unsigned int reg)
 {
 	switch (reg) {
-	case CCSR_SSI_STX0:
-	case CCSR_SSI_STX1:
-	case CCSR_SSI_SRX0:
-	case CCSR_SSI_SRX1:
-	case CCSR_SSI_SISR:
-	case CCSR_SSI_SFCSR:
-	case CCSR_SSI_SACNT:
-	case CCSR_SSI_SACADD:
-	case CCSR_SSI_SACDAT:
-	case CCSR_SSI_SATAG:
-	case CCSR_SSI_SACCST:
-	case CCSR_SSI_SOR:
+	case REG_SSI_STX0:
+	case REG_SSI_STX1:
+	case REG_SSI_SRX0:
+	case REG_SSI_SRX1:
+	case REG_SSI_SISR:
+	case REG_SSI_SFCSR:
+	case REG_SSI_SACNT:
+	case REG_SSI_SACADD:
+	case REG_SSI_SACDAT:
+	case REG_SSI_SATAG:
+	case REG_SSI_SACCST:
+	case REG_SSI_SOR:
 		return true;
 	default:
 		return false;
@@ -139,12 +148,12 @@ static bool fsl_ssi_volatile_reg(struct device *dev, unsigned int reg)
 static bool fsl_ssi_precious_reg(struct device *dev, unsigned int reg)
 {
 	switch (reg) {
-	case CCSR_SSI_SRX0:
-	case CCSR_SSI_SRX1:
-	case CCSR_SSI_SISR:
-	case CCSR_SSI_SACADD:
-	case CCSR_SSI_SACDAT:
-	case CCSR_SSI_SATAG:
+	case REG_SSI_SRX0:
+	case REG_SSI_SRX1:
+	case REG_SSI_SISR:
+	case REG_SSI_SACADD:
+	case REG_SSI_SACDAT:
+	case REG_SSI_SATAG:
 		return true;
 	default:
 		return false;
@@ -154,9 +163,9 @@ static bool fsl_ssi_precious_reg(struct device *dev, unsigned int reg)
 static bool fsl_ssi_writeable_reg(struct device *dev, unsigned int reg)
 {
 	switch (reg) {
-	case CCSR_SSI_SRX0:
-	case CCSR_SSI_SRX1:
-	case CCSR_SSI_SACCST:
+	case REG_SSI_SRX0:
+	case REG_SSI_SRX1:
+	case REG_SSI_SACCST:
 		return false;
 	default:
 		return true;
@@ -164,12 +173,12 @@ static bool fsl_ssi_writeable_reg(struct device *dev, unsigned int reg)
 }
 
 static const struct regmap_config fsl_ssi_regconfig = {
-	.max_register = CCSR_SSI_SACCDIS,
+	.max_register = REG_SSI_SACCDIS,
 	.reg_bits = 32,
 	.val_bits = 32,
 	.reg_stride = 4,
 	.val_format_endian = REGMAP_ENDIAN_NATIVE,
-	.num_reg_defaults_raw = CCSR_SSI_SACCDIS / sizeof(uint32_t) + 1,
+	.num_reg_defaults_raw = REG_SSI_SACCDIS / sizeof(uint32_t) + 1,
 	.readable_reg = fsl_ssi_readable_reg,
 	.volatile_reg = fsl_ssi_volatile_reg,
 	.precious_reg = fsl_ssi_precious_reg,
@@ -185,78 +194,79 @@ struct fsl_ssi_soc_data {
 };
 
 /**
- * fsl_ssi_private: per-SSI private data
+ * fsl_ssi: per-SSI private data
  *
- * @reg: Pointer to the regmap registers
+ * @regs: Pointer to the regmap registers
  * @irq: IRQ of this SSI
  * @cpu_dai_drv: CPU DAI driver for this device
  *
  * @dai_fmt: DAI configuration this device is currently used with
- * @i2s_mode: i2s and network mode configuration of the device. Is used to
- * switch between normal and i2s/network mode
- * mode depending on the number of channels
+ * @i2s_net: I2S and Network mode configurations of SCR register
  * @use_dma: DMA is used or FIQ with stream filter
- * @use_dual_fifo: DMA with support for both FIFOs used
- * @fifo_deph: Depth of the SSI FIFOs
- * @slot_width: width of each DAI slot
- * @slots: number of slots
- * @rxtx_reg_val: Specific register settings for receive/transmit configuration
+ * @use_dual_fifo: DMA with support for dual FIFO mode
+ * @has_ipg_clk_name: If "ipg" is in the clock name list of device tree
+ * @fifo_depth: Depth of the SSI FIFOs
+ * @slot_width: Width of each DAI slot
+ * @slots: Number of slots
+ * @regvals: Specific RX/TX register settings
  *
- * @clk: SSI clock
- * @baudclk: SSI baud clock for master mode
+ * @clk: Clock source to access register
+ * @baudclk: Clock source to generate bit and frame-sync clocks
  * @baudclk_streams: Active streams that are using baudclk
  *
+ * @regcache_sfcsr: Cache sfcsr register value during suspend and resume
+ * @regcache_sacnt: Cache sacnt register value during suspend and resume
+ *
  * @dma_params_tx: DMA transmit parameters
  * @dma_params_rx: DMA receive parameters
  * @ssi_phys: physical address of the SSI registers
  *
  * @fiq_params: FIQ stream filtering parameters
  *
- * @pdev: Pointer to pdev used for deprecated fsl-ssi sound card
+ * @pdev: Pointer to pdev when using fsl-ssi as sound card (ppc only)
+ *        TODO: Should be replaced with simple-sound-card
  *
  * @dbg_stats: Debugging statistics
  *
  * @soc: SoC specific data
+ * @dev: Pointer to &pdev->dev
  *
- * @fifo_watermark: the FIFO watermark setting.  Notifies DMA when
- *             there are @fifo_watermark or fewer words in TX fifo or
- *             @fifo_watermark or more empty words in RX fifo.
- * @dma_maxburst: max number of words to transfer in one go.  So far,
- *             this is always the same as fifo_watermark.
+ * @fifo_watermark: The FIFO watermark setting. Notifies DMA when there are
+ *                  @fifo_watermark or fewer words in TX fifo or
+ *                  @fifo_watermark or more empty words in RX fifo.
+ * @dma_maxburst: Max number of words to transfer in one go. So far,
+ *                this is always the same as fifo_watermark.
+ *
+ * @ac97_reg_lock: Mutex lock to serialize AC97 register access operations
  */
-struct fsl_ssi_private {
+struct fsl_ssi {
 	struct regmap *regs;
 	int irq;
 	struct snd_soc_dai_driver cpu_dai_drv;
 
 	unsigned int dai_fmt;
-	u8 i2s_mode;
+	u8 i2s_net;
 	bool use_dma;
 	bool use_dual_fifo;
 	bool has_ipg_clk_name;
 	unsigned int fifo_depth;
 	unsigned int slot_width;
 	unsigned int slots;
-	struct fsl_ssi_rxtx_reg_val rxtx_reg_val;
+	struct fsl_ssi_regvals regvals[2];
 
 	struct clk *clk;
 	struct clk *baudclk;
 	unsigned int baudclk_streams;
 
-	/* regcache for volatile regs */
 	u32 regcache_sfcsr;
 	u32 regcache_sacnt;
 
-	/* DMA params */
 	struct snd_dmaengine_dai_dma_data dma_params_tx;
 	struct snd_dmaengine_dai_dma_data dma_params_rx;
 	dma_addr_t ssi_phys;
 
-	/* params for non-dma FIQ stream filtered mode */
 	struct imx_pcm_fiq_params fiq_params;
 
-	/* Used when using fsl-ssi as sound-card. This is only used by ppc and
-	 * should be replaced with simple-sound-card. */
 	struct platform_device *pdev;
 
 	struct fsl_ssi_dbg dbg_stats;
@@ -271,27 +281,27 @@ struct fsl_ssi_private {
 };
 
 /*
- * imx51 and later SoCs have a slightly different IP that allows the
- * SSI configuration while the SSI unit is running.
+ * SoC specific data
  *
- * More important, it is necessary on those SoCs to configure the
- * sperate TX/RX DMA bits just before starting the stream
- * (fsl_ssi_trigger). The SDMA unit has to be configured before fsl_ssi
- * sends any DMA requests to the SDMA unit, otherwise it is not defined
- * how the SDMA unit handles the DMA request.
- *
- * SDMA units are present on devices starting at imx35 but the imx35
- * reference manual states that the DMA bits should not be changed
- * while the SSI unit is running (SSIEN). So we support the necessary
- * online configuration of fsl-ssi starting at imx51.
+ * Notes:
+ * 1) SSI in earlier SoCS has critical bits in control registers that
+ *    cannot be changed after SSI starts running -- a software reset
+ *    (set SSIEN to 0) is required to change their values. So adding
+ *    an offline_config flag for these SoCs.
+ * 2) SDMA is available since imx35. However, imx35 does not support
+ *    DMA bits changing when SSI is running, so set offline_config.
+ * 3) imx51 and later versions support register configurations when
+ *    SSI is running (SSIEN); For these versions, DMA needs to be
+ *    configured before SSI sends DMA request to avoid an undefined
+ *    DMA request on the SDMA side.
  */
 
 static struct fsl_ssi_soc_data fsl_ssi_mpc8610 = {
 	.imx = false,
 	.offline_config = true,
-	.sisr_write_mask = CCSR_SSI_SISR_RFRC | CCSR_SSI_SISR_TFRC |
-			CCSR_SSI_SISR_ROE0 | CCSR_SSI_SISR_ROE1 |
-			CCSR_SSI_SISR_TUE0 | CCSR_SSI_SISR_TUE1,
+	.sisr_write_mask = SSI_SISR_RFRC | SSI_SISR_TFRC |
+			   SSI_SISR_ROE0 | SSI_SISR_ROE1 |
+			   SSI_SISR_TUE0 | SSI_SISR_TUE1,
 };
 
 static struct fsl_ssi_soc_data fsl_ssi_imx21 = {
@@ -304,16 +314,16 @@ static struct fsl_ssi_soc_data fsl_ssi_imx21 = {
 static struct fsl_ssi_soc_data fsl_ssi_imx35 = {
 	.imx = true,
 	.offline_config = true,
-	.sisr_write_mask = CCSR_SSI_SISR_RFRC | CCSR_SSI_SISR_TFRC |
-			CCSR_SSI_SISR_ROE0 | CCSR_SSI_SISR_ROE1 |
-			CCSR_SSI_SISR_TUE0 | CCSR_SSI_SISR_TUE1,
+	.sisr_write_mask = SSI_SISR_RFRC | SSI_SISR_TFRC |
+			   SSI_SISR_ROE0 | SSI_SISR_ROE1 |
+			   SSI_SISR_TUE0 | SSI_SISR_TUE1,
 };
 
 static struct fsl_ssi_soc_data fsl_ssi_imx51 = {
 	.imx = true,
 	.offline_config = false,
-	.sisr_write_mask = CCSR_SSI_SISR_ROE0 | CCSR_SSI_SISR_ROE1 |
-		CCSR_SSI_SISR_TUE0 | CCSR_SSI_SISR_TUE1,
+	.sisr_write_mask = SSI_SISR_ROE0 | SSI_SISR_ROE1 |
+			   SSI_SISR_TUE0 | SSI_SISR_TUE1,
 };
 
 static const struct of_device_id fsl_ssi_ids[] = {
@@ -325,108 +335,86 @@ static const struct of_device_id fsl_ssi_ids[] = {
 };
 MODULE_DEVICE_TABLE(of, fsl_ssi_ids);
 
-static bool fsl_ssi_is_ac97(struct fsl_ssi_private *ssi_private)
+static bool fsl_ssi_is_ac97(struct fsl_ssi *ssi)
 {
-	return (ssi_private->dai_fmt & SND_SOC_DAIFMT_FORMAT_MASK) ==
+	return (ssi->dai_fmt & SND_SOC_DAIFMT_FORMAT_MASK) ==
 		SND_SOC_DAIFMT_AC97;
 }
 
-static bool fsl_ssi_is_i2s_master(struct fsl_ssi_private *ssi_private)
+static bool fsl_ssi_is_i2s_master(struct fsl_ssi *ssi)
 {
-	return (ssi_private->dai_fmt & SND_SOC_DAIFMT_MASTER_MASK) ==
+	return (ssi->dai_fmt & SND_SOC_DAIFMT_MASTER_MASK) ==
 		SND_SOC_DAIFMT_CBS_CFS;
 }
 
-static bool fsl_ssi_is_i2s_cbm_cfs(struct fsl_ssi_private *ssi_private)
+static bool fsl_ssi_is_i2s_cbm_cfs(struct fsl_ssi *ssi)
 {
-	return (ssi_private->dai_fmt & SND_SOC_DAIFMT_MASTER_MASK) ==
+	return (ssi->dai_fmt & SND_SOC_DAIFMT_MASTER_MASK) ==
 		SND_SOC_DAIFMT_CBM_CFS;
 }
+
 /**
- * fsl_ssi_isr: SSI interrupt handler
- *
- * Although it's possible to use the interrupt handler to send and receive
- * data to/from the SSI, we use the DMA instead.  Programming is more
- * complicated, but the performance is much better.
- *
- * This interrupt handler is used only to gather statistics.
- *
- * @irq: IRQ of the SSI device
- * @dev_id: pointer to the ssi_private structure for this SSI device
+ * Interrupt handler to gather states
  */
 static irqreturn_t fsl_ssi_isr(int irq, void *dev_id)
 {
-	struct fsl_ssi_private *ssi_private = dev_id;
-	struct regmap *regs = ssi_private->regs;
+	struct fsl_ssi *ssi = dev_id;
+	struct regmap *regs = ssi->regs;
 	__be32 sisr;
 	__be32 sisr2;
 
-	/* We got an interrupt, so read the status register to see what we
-	   were interrupted for.  We mask it with the Interrupt Enable register
-	   so that we only check for events that we're interested in.
-	 */
-	regmap_read(regs, CCSR_SSI_SISR, &sisr);
+	regmap_read(regs, REG_SSI_SISR, &sisr);
 
-	sisr2 = sisr & ssi_private->soc->sisr_write_mask;
+	sisr2 = sisr & ssi->soc->sisr_write_mask;
 	/* Clear the bits that we set */
 	if (sisr2)
-		regmap_write(regs, CCSR_SSI_SISR, sisr2);
+		regmap_write(regs, REG_SSI_SISR, sisr2);
 
-	fsl_ssi_dbg_isr(&ssi_private->dbg_stats, sisr);
+	fsl_ssi_dbg_isr(&ssi->dbg_stats, sisr);
 
 	return IRQ_HANDLED;
 }
 
-/*
- * Enable/Disable all rx/tx config flags at once.
+/**
+ * Enable or disable all rx/tx config flags at once
  */
-static void fsl_ssi_rxtx_config(struct fsl_ssi_private *ssi_private,
-		bool enable)
+static void fsl_ssi_rxtx_config(struct fsl_ssi *ssi, bool enable)
 {
-	struct regmap *regs = ssi_private->regs;
-	struct fsl_ssi_rxtx_reg_val *vals = &ssi_private->rxtx_reg_val;
+	struct regmap *regs = ssi->regs;
+	struct fsl_ssi_regvals *vals = ssi->regvals;
 
 	if (enable) {
-		regmap_update_bits(regs, CCSR_SSI_SIER,
-				vals->rx.sier | vals->tx.sier,
-				vals->rx.sier | vals->tx.sier);
-		regmap_update_bits(regs, CCSR_SSI_SRCR,
-				vals->rx.srcr | vals->tx.srcr,
-				vals->rx.srcr | vals->tx.srcr);
-		regmap_update_bits(regs, CCSR_SSI_STCR,
-				vals->rx.stcr | vals->tx.stcr,
-				vals->rx.stcr | vals->tx.stcr);
+		regmap_update_bits(regs, REG_SSI_SIER,
+				   vals[RX].sier | vals[TX].sier,
+				   vals[RX].sier | vals[TX].sier);
+		regmap_update_bits(regs, REG_SSI_SRCR,
+				   vals[RX].srcr | vals[TX].srcr,
+				   vals[RX].srcr | vals[TX].srcr);
+		regmap_update_bits(regs, REG_SSI_STCR,
+				   vals[RX].stcr | vals[TX].stcr,
+				   vals[RX].stcr | vals[TX].stcr);
 	} else {
-		regmap_update_bits(regs, CCSR_SSI_SRCR,
-				vals->rx.srcr | vals->tx.srcr, 0);
-		regmap_update_bits(regs, CCSR_SSI_STCR,
-				vals->rx.stcr | vals->tx.stcr, 0);
-		regmap_update_bits(regs, CCSR_SSI_SIER,
-				vals->rx.sier | vals->tx.sier, 0);
+		regmap_update_bits(regs, REG_SSI_SRCR,
+				   vals[RX].srcr | vals[TX].srcr, 0);
+		regmap_update_bits(regs, REG_SSI_STCR,
+				   vals[RX].stcr | vals[TX].stcr, 0);
+		regmap_update_bits(regs, REG_SSI_SIER,
+				   vals[RX].sier | vals[TX].sier, 0);
 	}
 }
 
-/*
- * Clear RX or TX FIFO to remove samples from the previous
- * stream session which may be still present in the FIFO and
- * may introduce bad samples and/or channel slipping.
- *
- * Note: The SOR is not documented in recent IMX datasheet, but
- * is described in IMX51 reference manual at section 56.3.3.15.
+/**
+ * Clear remaining data in the FIFO to avoid dirty data or channel slipping
  */
-static void fsl_ssi_fifo_clear(struct fsl_ssi_private *ssi_private,
-		bool is_rx)
+static void fsl_ssi_fifo_clear(struct fsl_ssi *ssi, bool is_rx)
 {
-	if (is_rx) {
-		regmap_update_bits(ssi_private->regs, CCSR_SSI_SOR,
-			CCSR_SSI_SOR_RX_CLR, CCSR_SSI_SOR_RX_CLR);
-	} else {
-		regmap_update_bits(ssi_private->regs, CCSR_SSI_SOR,
-			CCSR_SSI_SOR_TX_CLR, CCSR_SSI_SOR_TX_CLR);
-	}
+	bool tx = !is_rx;
+
+	regmap_update_bits(ssi->regs, REG_SSI_SOR,
+			   SSI_SOR_xX_CLR(tx), SSI_SOR_xX_CLR(tx));
 }
 
-/*
+/**
  * Calculate the bits that have to be disabled for the current stream that is
  * getting disabled. This keeps the bits enabled that are necessary for the
  * second stream to work if 'stream_active' is true.
@@ -446,261 +434,239 @@ static void fsl_ssi_fifo_clear(struct fsl_ssi_private *ssi_private,
 	((vals_disable) & \
 	 ((vals_disable) ^ ((vals_stream) * (u32)!!(stream_active))))
 
-/*
- * Enable/Disable a ssi configuration. You have to pass either
- * ssi_private->rxtx_reg_val.rx or tx as vals parameter.
+/**
+ * Enable or disable SSI configuration.
  */
-static void fsl_ssi_config(struct fsl_ssi_private *ssi_private, bool enable,
-		struct fsl_ssi_reg_val *vals)
+static void fsl_ssi_config(struct fsl_ssi *ssi, bool enable,
+			   struct fsl_ssi_regvals *vals)
 {
-	struct regmap *regs = ssi_private->regs;
-	struct fsl_ssi_reg_val *avals;
+	struct regmap *regs = ssi->regs;
+	struct fsl_ssi_regvals *avals;
 	int nr_active_streams;
-	u32 scr_val;
+	u32 scr;
 	int keep_active;
 
-	regmap_read(regs, CCSR_SSI_SCR, &scr_val);
+	regmap_read(regs, REG_SSI_SCR, &scr);
 
-	nr_active_streams = !!(scr_val & CCSR_SSI_SCR_TE) +
-				!!(scr_val & CCSR_SSI_SCR_RE);
+	nr_active_streams = !!(scr & SSI_SCR_TE) + !!(scr & SSI_SCR_RE);
 
 	if (nr_active_streams - 1 > 0)
 		keep_active = 1;
 	else
 		keep_active = 0;
 
-	/* Find the other direction values rx or tx which we do not want to
-	 * modify */
-	if (&ssi_private->rxtx_reg_val.rx == vals)
-		avals = &ssi_private->rxtx_reg_val.tx;
+	/* Get the opposite direction to keep its values untouched */
+	if (&ssi->regvals[RX] == vals)
+		avals = &ssi->regvals[TX];
 	else
-		avals = &ssi_private->rxtx_reg_val.rx;
+		avals = &ssi->regvals[RX];
 
-	/* If vals should be disabled, start with disabling the unit */
 	if (!enable) {
+		/*
+		 * To keep the other stream safe, exclude shared bits between
+		 * both streams, and get safe bits to disable current stream
+		 */
 		u32 scr = fsl_ssi_disable_val(vals->scr, avals->scr,
-				keep_active);
-		regmap_update_bits(regs, CCSR_SSI_SCR, scr, 0);
+					      keep_active);
+		/* Safely disable SCR register for the stream */
+		regmap_update_bits(regs, REG_SSI_SCR, scr, 0);
 	}
 
 	/*
-	 * We are running on a SoC which does not support online SSI
-	 * reconfiguration, so we have to enable all necessary flags at once
-	 * even if we do not use them later (capture and playback configuration)
+	 * For cases where online configuration is not supported,
+	 * 1) Enable all necessary bits of both streams when 1st stream starts
+	 *    even if the opposite stream will not start
+	 * 2) Disable all remaining bits of both streams when last stream ends
 	 */
-	if (ssi_private->soc->offline_config) {
-		if ((enable && !nr_active_streams) ||
-				(!enable && !keep_active))
-			fsl_ssi_rxtx_config(ssi_private, enable);
+	if (ssi->soc->offline_config) {
+		if ((enable && !nr_active_streams) || (!enable && !keep_active))
+			fsl_ssi_rxtx_config(ssi, enable);
 
 		goto config_done;
 	}
 
-	/*
-	 * Configure single direction units while the SSI unit is running
-	 * (online configuration)
-	 */
+	/* Online configure single direction while SSI is running */
 	if (enable) {
-		fsl_ssi_fifo_clear(ssi_private, vals->scr & CCSR_SSI_SCR_RE);
+		fsl_ssi_fifo_clear(ssi, vals->scr & SSI_SCR_RE);
 
-		regmap_update_bits(regs, CCSR_SSI_SRCR, vals->srcr, vals->srcr);
-		regmap_update_bits(regs, CCSR_SSI_STCR, vals->stcr, vals->stcr);
-		regmap_update_bits(regs, CCSR_SSI_SIER, vals->sier, vals->sier);
+		regmap_update_bits(regs, REG_SSI_SRCR, vals->srcr, vals->srcr);
+		regmap_update_bits(regs, REG_SSI_STCR, vals->stcr, vals->stcr);
+		regmap_update_bits(regs, REG_SSI_SIER, vals->sier, vals->sier);
 	} else {
 		u32 sier;
 		u32 srcr;
 		u32 stcr;
 
 		/*
-		 * Disabling the necessary flags for one of rx/tx while the
-		 * other stream is active is a little bit more difficult. We
-		 * have to disable only those flags that differ between both
-		 * streams (rx XOR tx) and that are set in the stream that is
-		 * disabled now. Otherwise we could alter flags of the other
-		 * stream
+		 * To keep the other stream safe, exclude shared bits between
+		 * both streams, and get safe bits to disable current stream
 		 */
-
-		/* These assignments are simply vals without bits set in avals*/
 		sier = fsl_ssi_disable_val(vals->sier, avals->sier,
-				keep_active);
+					   keep_active);
 		srcr = fsl_ssi_disable_val(vals->srcr, avals->srcr,
-				keep_active);
+					   keep_active);
 		stcr = fsl_ssi_disable_val(vals->stcr, avals->stcr,
-				keep_active);
+					   keep_active);
 
-		regmap_update_bits(regs, CCSR_SSI_SRCR, srcr, 0);
-		regmap_update_bits(regs, CCSR_SSI_STCR, stcr, 0);
-		regmap_update_bits(regs, CCSR_SSI_SIER, sier, 0);
+		/* Safely disable other control registers for the stream */
+		regmap_update_bits(regs, REG_SSI_SRCR, srcr, 0);
+		regmap_update_bits(regs, REG_SSI_STCR, stcr, 0);
+		regmap_update_bits(regs, REG_SSI_SIER, sier, 0);
 	}
 
 config_done:
 	/* Enabling of subunits is done after configuration */
 	if (enable) {
-		if (ssi_private->use_dma && (vals->scr & CCSR_SSI_SCR_TE)) {
-			/*
-			 * Be sure the Tx FIFO is filled when TE is set.
-			 * Otherwise, there are some chances to start the
-			 * playback with some void samples inserted first,
-			 * generating a channel slip.
-			 *
-			 * First, SSIEN must be set, to let the FIFO be filled.
-			 *
-			 * Notes:
-			 * - Limit this fix to the DMA case until FIQ cases can
-			 *   be tested.
-			 * - Limit the length of the busy loop to not lock the
-			 *   system too long, even if 1-2 loops are sufficient
-			 *   in general.
-			 */
+		/*
+		 * Start DMA before setting TE to avoid FIFO underrun
+		 * which may cause a channel slip or a channel swap
+		 *
+		 * TODO: FIQ cases might also need this upon testing
+		 */
+		if (ssi->use_dma && (vals->scr & SSI_SCR_TE)) {
 			int i;
 			int max_loop = 100;
-			regmap_update_bits(regs, CCSR_SSI_SCR,
-					CCSR_SSI_SCR_SSIEN, CCSR_SSI_SCR_SSIEN);
+
+			/* Enable SSI first to send TX DMA request */
+			regmap_update_bits(regs, REG_SSI_SCR,
+					   SSI_SCR_SSIEN, SSI_SCR_SSIEN);
+
+			/* Busy wait until TX FIFO not empty -- DMA working */
 			for (i = 0; i < max_loop; i++) {
 				u32 sfcsr;
-				regmap_read(regs, CCSR_SSI_SFCSR, &sfcsr);
-				if (CCSR_SSI_SFCSR_TFCNT0(sfcsr))
+				regmap_read(regs, REG_SSI_SFCSR, &sfcsr);
+				if (SSI_SFCSR_TFCNT0(sfcsr))
 					break;
 			}
 			if (i == max_loop) {
-				dev_err(ssi_private->dev,
+				dev_err(ssi->dev,
 					"Timeout waiting TX FIFO filling\n");
 			}
 		}
-		regmap_update_bits(regs, CCSR_SSI_SCR, vals->scr, vals->scr);
+		/* Enable all remaining bits */
+		regmap_update_bits(regs, REG_SSI_SCR, vals->scr, vals->scr);
 	}
 }
 
-
-static void fsl_ssi_rx_config(struct fsl_ssi_private *ssi_private, bool enable)
+static void fsl_ssi_rx_config(struct fsl_ssi *ssi, bool enable)
 {
-	fsl_ssi_config(ssi_private, enable, &ssi_private->rxtx_reg_val.rx);
+	fsl_ssi_config(ssi, enable, &ssi->regvals[RX]);
 }
 
-static void fsl_ssi_tx_config(struct fsl_ssi_private *ssi_private, bool enable)
+static void fsl_ssi_tx_ac97_saccst_setup(struct fsl_ssi *ssi)
 {
-	fsl_ssi_config(ssi_private, enable, &ssi_private->rxtx_reg_val.tx);
-}
-
-/*
- * Setup rx/tx register values used to enable/disable the streams. These will
- * be used later in fsl_ssi_config to setup the streams without the need to
- * check for all different SSI modes.
- */
-static void fsl_ssi_setup_reg_vals(struct fsl_ssi_private *ssi_private)
-{
-	struct fsl_ssi_rxtx_reg_val *reg = &ssi_private->rxtx_reg_val;
-
-	reg->rx.sier = CCSR_SSI_SIER_RFF0_EN;
-	reg->rx.srcr = CCSR_SSI_SRCR_RFEN0;
-	reg->rx.scr = 0;
-	reg->tx.sier = CCSR_SSI_SIER_TFE0_EN;
-	reg->tx.stcr = CCSR_SSI_STCR_TFEN0;
-	reg->tx.scr = 0;
-
-	if (!fsl_ssi_is_ac97(ssi_private)) {
-		reg->rx.scr = CCSR_SSI_SCR_SSIEN | CCSR_SSI_SCR_RE;
-		reg->rx.sier |= CCSR_SSI_SIER_RFF0_EN;
-		reg->tx.scr = CCSR_SSI_SCR_SSIEN | CCSR_SSI_SCR_TE;
-		reg->tx.sier |= CCSR_SSI_SIER_TFE0_EN;
-	}
-
-	if (ssi_private->use_dma) {
-		reg->rx.sier |= CCSR_SSI_SIER_RDMAE;
-		reg->tx.sier |= CCSR_SSI_SIER_TDMAE;
-	} else {
-		reg->rx.sier |= CCSR_SSI_SIER_RIE;
-		reg->tx.sier |= CCSR_SSI_SIER_TIE;
-	}
-
-	reg->rx.sier |= FSLSSI_SIER_DBG_RX_FLAGS;
-	reg->tx.sier |= FSLSSI_SIER_DBG_TX_FLAGS;
-}
-
-static void fsl_ssi_setup_ac97(struct fsl_ssi_private *ssi_private)
-{
-	struct regmap *regs = ssi_private->regs;
-
-	/*
-	 * Setup the clock control register
-	 */
-	regmap_write(regs, CCSR_SSI_STCCR,
-			CCSR_SSI_SxCCR_WL(17) | CCSR_SSI_SxCCR_DC(13));
-	regmap_write(regs, CCSR_SSI_SRCCR,
-			CCSR_SSI_SxCCR_WL(17) | CCSR_SSI_SxCCR_DC(13));
-
-	/*
-	 * Enable AC97 mode and startup the SSI
-	 */
-	regmap_write(regs, CCSR_SSI_SACNT,
-			CCSR_SSI_SACNT_AC97EN | CCSR_SSI_SACNT_FV);
+	struct regmap *regs = ssi->regs;
 
 	/* no SACC{ST,EN,DIS} regs on imx21-class SSI */
-	if (!ssi_private->soc->imx21regs) {
-		regmap_write(regs, CCSR_SSI_SACCDIS, 0xff);
-		regmap_write(regs, CCSR_SSI_SACCEN, 0x300);
+	if (!ssi->soc->imx21regs) {
+		/* Disable all channel slots */
+		regmap_write(regs, REG_SSI_SACCDIS, 0xff);
+		/* Enable slots 3 & 4 -- PCM Playback Left & Right channels */
+		regmap_write(regs, REG_SSI_SACCEN, 0x300);
 	}
+}
 
+static void fsl_ssi_tx_config(struct fsl_ssi *ssi, bool enable)
+{
 	/*
-	 * Enable SSI, Transmit and Receive. AC97 has to communicate with the
-	 * codec before a stream is started.
+	 * SACCST might be modified via AC Link by a CODEC if it sends
+	 * extra bits in their SLOTREQ requests, which'll accidentally
+	 * send valid data to slots other than normal playback slots.
+	 *
+	 * To be safe, configure SACCST right before TX starts.
 	 */
-	regmap_update_bits(regs, CCSR_SSI_SCR,
-			CCSR_SSI_SCR_SSIEN | CCSR_SSI_SCR_TE | CCSR_SSI_SCR_RE,
-			CCSR_SSI_SCR_SSIEN | CCSR_SSI_SCR_TE | CCSR_SSI_SCR_RE);
+	if (enable && fsl_ssi_is_ac97(ssi))
+		fsl_ssi_tx_ac97_saccst_setup(ssi);
 
-	regmap_write(regs, CCSR_SSI_SOR, CCSR_SSI_SOR_WAIT(3));
+	fsl_ssi_config(ssi, enable, &ssi->regvals[TX]);
 }
 
 /**
- * fsl_ssi_startup: create a new substream
- *
- * This is the first function called when a stream is opened.
- *
- * If this is the first stream open, then grab the IRQ and program most of
- * the SSI registers.
+ * Cache critical bits of SIER, SRCR, STCR and SCR to later set them safely
  */
+static void fsl_ssi_setup_regvals(struct fsl_ssi *ssi)
+{
+	struct fsl_ssi_regvals *vals = ssi->regvals;
+
+	vals[RX].sier = SSI_SIER_RFF0_EN;
+	vals[RX].srcr = SSI_SRCR_RFEN0;
+	vals[RX].scr = 0;
+	vals[TX].sier = SSI_SIER_TFE0_EN;
+	vals[TX].stcr = SSI_STCR_TFEN0;
+	vals[TX].scr = 0;
+
+	/* AC97 has already enabled SSIEN, RE and TE, so ignore them */
+	if (!fsl_ssi_is_ac97(ssi)) {
+		vals[RX].scr = SSI_SCR_SSIEN | SSI_SCR_RE;
+		vals[TX].scr = SSI_SCR_SSIEN | SSI_SCR_TE;
+	}
+
+	if (ssi->use_dma) {
+		vals[RX].sier |= SSI_SIER_RDMAE;
+		vals[TX].sier |= SSI_SIER_TDMAE;
+	} else {
+		vals[RX].sier |= SSI_SIER_RIE;
+		vals[TX].sier |= SSI_SIER_TIE;
+	}
+
+	vals[RX].sier |= FSLSSI_SIER_DBG_RX_FLAGS;
+	vals[TX].sier |= FSLSSI_SIER_DBG_TX_FLAGS;
+}
+
+static void fsl_ssi_setup_ac97(struct fsl_ssi *ssi)
+{
+	struct regmap *regs = ssi->regs;
+
+	/* Setup the clock control register */
+	regmap_write(regs, REG_SSI_STCCR, SSI_SxCCR_WL(17) | SSI_SxCCR_DC(13));
+	regmap_write(regs, REG_SSI_SRCCR, SSI_SxCCR_WL(17) | SSI_SxCCR_DC(13));
+
+	/* Enable AC97 mode and startup the SSI */
+	regmap_write(regs, REG_SSI_SACNT, SSI_SACNT_AC97EN | SSI_SACNT_FV);
+
+	/* AC97 has to communicate with codec before starting a stream */
+	regmap_update_bits(regs, REG_SSI_SCR,
+			   SSI_SCR_SSIEN | SSI_SCR_TE | SSI_SCR_RE,
+			   SSI_SCR_SSIEN | SSI_SCR_TE | SSI_SCR_RE);
+
+	regmap_write(regs, REG_SSI_SOR, SSI_SOR_WAIT(3));
+}
+
 static int fsl_ssi_startup(struct snd_pcm_substream *substream,
 			   struct snd_soc_dai *dai)
 {
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
-	struct fsl_ssi_private *ssi_private =
-		snd_soc_dai_get_drvdata(rtd->cpu_dai);
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(rtd->cpu_dai);
 	int ret;
 
-	ret = clk_prepare_enable(ssi_private->clk);
+	ret = clk_prepare_enable(ssi->clk);
 	if (ret)
 		return ret;
 
-	/* When using dual fifo mode, it is safer to ensure an even period
+	/*
+	 * When using dual fifo mode, it is safer to ensure an even period
 	 * size. If appearing to an odd number while DMA always starts its
 	 * task from fifo0, fifo1 would be neglected at the end of each
 	 * period. But SSI would still access fifo1 with an invalid data.
 	 */
-	if (ssi_private->use_dual_fifo)
+	if (ssi->use_dual_fifo)
 		snd_pcm_hw_constraint_step(substream->runtime, 0,
-				SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 2);
+					   SNDRV_PCM_HW_PARAM_PERIOD_SIZE, 2);
 
 	return 0;
 }
 
-/**
- * fsl_ssi_shutdown: shutdown the SSI
- *
- */
 static void fsl_ssi_shutdown(struct snd_pcm_substream *substream,
-				struct snd_soc_dai *dai)
+			     struct snd_soc_dai *dai)
 {
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
-	struct fsl_ssi_private *ssi_private =
-		snd_soc_dai_get_drvdata(rtd->cpu_dai);
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(rtd->cpu_dai);
 
-	clk_disable_unprepare(ssi_private->clk);
-
+	clk_disable_unprepare(ssi->clk);
 }
 
 /**
- * fsl_ssi_set_bclk - configure Digital Audio Interface bit clock
+ * Configure Digital Audio Interface bit clock
  *
  * Note: This function can be only called when using SSI as DAI master
  *
@@ -709,12 +675,13 @@ static void fsl_ssi_shutdown(struct snd_pcm_substream *substream,
  *       (In 2-channel I2S Master mode, slot_width is fixed 32)
  */
 static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream,
-		struct snd_soc_dai *cpu_dai,
-		struct snd_pcm_hw_params *hw_params)
+			    struct snd_soc_dai *dai,
+			    struct snd_pcm_hw_params *hw_params)
 {
-	struct fsl_ssi_private *ssi_private = snd_soc_dai_get_drvdata(cpu_dai);
-	struct regmap *regs = ssi_private->regs;
-	int synchronous = ssi_private->cpu_dai_drv.symmetric_rates, ret;
+	bool tx2, tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(dai);
+	struct regmap *regs = ssi->regs;
+	int synchronous = ssi->cpu_dai_drv.symmetric_rates, ret;
 	u32 pm = 999, div2, psr, stccr, mask, afreq, factor, i;
 	unsigned long clkrate, baudrate, tmprate;
 	unsigned int slots = params_channels(hw_params);
@@ -724,29 +691,29 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream,
 	bool baudclk_is_used;
 
 	/* Override slots and slot_width if being specifically set... */
-	if (ssi_private->slots)
-		slots = ssi_private->slots;
+	if (ssi->slots)
+		slots = ssi->slots;
 	/* ...but keep 32 bits if slots is 2 -- I2S Master mode */
-	if (ssi_private->slot_width && slots != 2)
-		slot_width = ssi_private->slot_width;
+	if (ssi->slot_width && slots != 2)
+		slot_width = ssi->slot_width;
 
 	/* Generate bit clock based on the slot number and slot width */
 	freq = slots * slot_width * params_rate(hw_params);
 
 	/* Don't apply it to any non-baudclk circumstance */
-	if (IS_ERR(ssi_private->baudclk))
+	if (IS_ERR(ssi->baudclk))
 		return -EINVAL;
 
 	/*
 	 * Hardware limitation: The bclk rate must be
 	 * never greater than 1/5 IPG clock rate
 	 */
-	if (freq * 5 > clk_get_rate(ssi_private->clk)) {
-		dev_err(cpu_dai->dev, "bitclk > ipgclk/5\n");
+	if (freq * 5 > clk_get_rate(ssi->clk)) {
+		dev_err(dai->dev, "bitclk > ipgclk / 5\n");
 		return -EINVAL;
 	}
 
-	baudclk_is_used = ssi_private->baudclk_streams & ~(BIT(substream->stream));
+	baudclk_is_used = ssi->baudclk_streams & ~(BIT(substream->stream));
 
 	/* It should be already enough to divide clock by setting pm alone */
 	psr = 0;
@@ -758,9 +725,9 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream,
 		tmprate = freq * factor * (i + 1);
 
 		if (baudclk_is_used)
-			clkrate = clk_get_rate(ssi_private->baudclk);
+			clkrate = clk_get_rate(ssi->baudclk);
 		else
-			clkrate = clk_round_rate(ssi_private->baudclk, tmprate);
+			clkrate = clk_round_rate(ssi->baudclk, tmprate);
 
 		clkrate /= factor;
 		afreq = clkrate / (i + 1);
@@ -791,24 +758,22 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream,
 
 	/* No proper pm found if it is still remaining the initial value */
 	if (pm == 999) {
-		dev_err(cpu_dai->dev, "failed to handle the required sysclk\n");
+		dev_err(dai->dev, "failed to handle the required sysclk\n");
 		return -EINVAL;
 	}
 
-	stccr = CCSR_SSI_SxCCR_PM(pm + 1) | (div2 ? CCSR_SSI_SxCCR_DIV2 : 0) |
-		(psr ? CCSR_SSI_SxCCR_PSR : 0);
-	mask = CCSR_SSI_SxCCR_PM_MASK | CCSR_SSI_SxCCR_DIV2 |
-		CCSR_SSI_SxCCR_PSR;
+	stccr = SSI_SxCCR_PM(pm + 1) | (div2 ? SSI_SxCCR_DIV2 : 0) |
+		(psr ? SSI_SxCCR_PSR : 0);
+	mask = SSI_SxCCR_PM_MASK | SSI_SxCCR_DIV2 | SSI_SxCCR_PSR;
 
-	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK || synchronous)
-		regmap_update_bits(regs, CCSR_SSI_STCCR, mask, stccr);
-	else
-		regmap_update_bits(regs, CCSR_SSI_SRCCR, mask, stccr);
+	/* STCCR is used for RX in synchronous mode */
+	tx2 = tx || synchronous;
+	regmap_update_bits(regs, REG_SSI_SxCCR(tx2), mask, stccr);
 
 	if (!baudclk_is_used) {
-		ret = clk_set_rate(ssi_private->baudclk, baudrate);
+		ret = clk_set_rate(ssi->baudclk, baudrate);
 		if (ret) {
-			dev_err(cpu_dai->dev, "failed to set baudclk rate\n");
+			dev_err(dai->dev, "failed to set baudclk rate\n");
 			return -EINVAL;
 		}
 	}
@@ -817,185 +782,165 @@ static int fsl_ssi_set_bclk(struct snd_pcm_substream *substream,
 }
 
 /**
- * fsl_ssi_hw_params - program the sample size
+ * Configure SSI based on PCM hardware parameters
  *
- * Most of the SSI registers have been programmed in the startup function,
- * but the word length must be programmed here.  Unfortunately, programming
- * the SxCCR.WL bits requires the SSI to be temporarily disabled.  This can
- * cause a problem with supporting simultaneous playback and capture.  If
- * the SSI is already playing a stream, then that stream may be temporarily
- * stopped when you start capture.
- *
- * Note: The SxCCR.DC and SxCCR.PM bits are only used if the SSI is the
- * clock master.
+ * Notes:
+ * 1) SxCCR.WL bits are critical bits that require SSI to be temporarily
+ *    disabled on offline_config SoCs. Even for online configurable SoCs
+ *    running in synchronous mode (both TX and RX use STCCR), it is not
+ *    safe to re-configure them when both two streams start running.
+ * 2) SxCCR.PM, SxCCR.DIV2 and SxCCR.PSR bits will be configured in the
+ *    fsl_ssi_set_bclk() if SSI is the DAI clock master.
  */
 static int fsl_ssi_hw_params(struct snd_pcm_substream *substream,
-	struct snd_pcm_hw_params *hw_params, struct snd_soc_dai *cpu_dai)
+			     struct snd_pcm_hw_params *hw_params,
+			     struct snd_soc_dai *dai)
 {
-	struct fsl_ssi_private *ssi_private = snd_soc_dai_get_drvdata(cpu_dai);
-	struct regmap *regs = ssi_private->regs;
+	bool tx2, tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(dai);
+	struct regmap *regs = ssi->regs;
 	unsigned int channels = params_channels(hw_params);
 	unsigned int sample_size = params_width(hw_params);
-	u32 wl = CCSR_SSI_SxCCR_WL(sample_size);
+	u32 wl = SSI_SxCCR_WL(sample_size);
 	int ret;
-	u32 scr_val;
+	u32 scr;
 	int enabled;
 
-	regmap_read(regs, CCSR_SSI_SCR, &scr_val);
-	enabled = scr_val & CCSR_SSI_SCR_SSIEN;
+	regmap_read(regs, REG_SSI_SCR, &scr);
+	enabled = scr & SSI_SCR_SSIEN;
 
 	/*
-	 * If we're in synchronous mode, and the SSI is already enabled,
-	 * then STCCR is already set properly.
+	 * SSI is properly configured if it is enabled and running in
+	 * the synchronous mode; Note that AC97 mode is an exception
+	 * that should set separate configurations for STCCR and SRCCR
+	 * despite running in the synchronous mode.
 	 */
-	if (enabled && ssi_private->cpu_dai_drv.symmetric_rates)
+	if (enabled && ssi->cpu_dai_drv.symmetric_rates)
 		return 0;
 
-	if (fsl_ssi_is_i2s_master(ssi_private)) {
-		ret = fsl_ssi_set_bclk(substream, cpu_dai, hw_params);
+	if (fsl_ssi_is_i2s_master(ssi)) {
+		ret = fsl_ssi_set_bclk(substream, dai, hw_params);
 		if (ret)
 			return ret;
 
 		/* Do not enable the clock if it is already enabled */
-		if (!(ssi_private->baudclk_streams & BIT(substream->stream))) {
-			ret = clk_prepare_enable(ssi_private->baudclk);
+		if (!(ssi->baudclk_streams & BIT(substream->stream))) {
+			ret = clk_prepare_enable(ssi->baudclk);
 			if (ret)
 				return ret;
 
-			ssi_private->baudclk_streams |= BIT(substream->stream);
+			ssi->baudclk_streams |= BIT(substream->stream);
 		}
 	}
 
-	if (!fsl_ssi_is_ac97(ssi_private)) {
-		u8 i2smode;
-		/*
-		 * Switch to normal net mode in order to have a frame sync
-		 * signal every 32 bits instead of 16 bits
-		 */
-		if (fsl_ssi_is_i2s_cbm_cfs(ssi_private) && sample_size == 16)
-			i2smode = CCSR_SSI_SCR_I2S_MODE_NORMAL |
-				CCSR_SSI_SCR_NET;
+	if (!fsl_ssi_is_ac97(ssi)) {
+		u8 i2s_net;
+		/* Normal + Network mode to send 16-bit data in 32-bit frames */
+		if (fsl_ssi_is_i2s_cbm_cfs(ssi) && sample_size == 16)
+			i2s_net = SSI_SCR_I2S_MODE_NORMAL | SSI_SCR_NET;
 		else
-			i2smode = ssi_private->i2s_mode;
+			i2s_net = ssi->i2s_net;
 
-		regmap_update_bits(regs, CCSR_SSI_SCR,
-				CCSR_SSI_SCR_NET | CCSR_SSI_SCR_I2S_MODE_MASK,
-				channels == 1 ? 0 : i2smode);
+		regmap_update_bits(regs, REG_SSI_SCR,
+				   SSI_SCR_I2S_NET_MASK,
+				   channels == 1 ? 0 : i2s_net);
 	}
 
-	/*
-	 * FIXME: The documentation says that SxCCR[WL] should not be
-	 * modified while the SSI is enabled.  The only time this can
-	 * happen is if we're trying to do simultaneous playback and
-	 * capture in asynchronous mode.  Unfortunately, I have been enable
-	 * to get that to work at all on the P1022DS.  Therefore, we don't
-	 * bother to disable/enable the SSI when setting SxCCR[WL], because
-	 * the SSI will stop anyway.  Maybe one day, this will get fixed.
-	 */
-
 	/* In synchronous mode, the SSI uses STCCR for capture */
-	if ((substream->stream == SNDRV_PCM_STREAM_PLAYBACK) ||
-	    ssi_private->cpu_dai_drv.symmetric_rates)
-		regmap_update_bits(regs, CCSR_SSI_STCCR, CCSR_SSI_SxCCR_WL_MASK,
-				wl);
-	else
-		regmap_update_bits(regs, CCSR_SSI_SRCCR, CCSR_SSI_SxCCR_WL_MASK,
-				wl);
+	tx2 = tx || ssi->cpu_dai_drv.symmetric_rates;
+	regmap_update_bits(regs, REG_SSI_SxCCR(tx2), SSI_SxCCR_WL_MASK, wl);
 
 	return 0;
 }
 
 static int fsl_ssi_hw_free(struct snd_pcm_substream *substream,
-		struct snd_soc_dai *cpu_dai)
+			   struct snd_soc_dai *dai)
 {
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
-	struct fsl_ssi_private *ssi_private =
-		snd_soc_dai_get_drvdata(rtd->cpu_dai);
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(rtd->cpu_dai);
 
-	if (fsl_ssi_is_i2s_master(ssi_private) &&
-			ssi_private->baudclk_streams & BIT(substream->stream)) {
-		clk_disable_unprepare(ssi_private->baudclk);
-		ssi_private->baudclk_streams &= ~BIT(substream->stream);
+	if (fsl_ssi_is_i2s_master(ssi) &&
+	    ssi->baudclk_streams & BIT(substream->stream)) {
+		clk_disable_unprepare(ssi->baudclk);
+		ssi->baudclk_streams &= ~BIT(substream->stream);
 	}
 
 	return 0;
 }
 
 static int _fsl_ssi_set_dai_fmt(struct device *dev,
-				struct fsl_ssi_private *ssi_private,
-				unsigned int fmt)
+				struct fsl_ssi *ssi, unsigned int fmt)
 {
-	struct regmap *regs = ssi_private->regs;
+	struct regmap *regs = ssi->regs;
 	u32 strcr = 0, stcr, srcr, scr, mask;
 	u8 wm;
 
-	ssi_private->dai_fmt = fmt;
+	ssi->dai_fmt = fmt;
 
-	if (fsl_ssi_is_i2s_master(ssi_private) && IS_ERR(ssi_private->baudclk)) {
-		dev_err(dev, "baudclk is missing which is necessary for master mode\n");
+	if (fsl_ssi_is_i2s_master(ssi) && IS_ERR(ssi->baudclk)) {
+		dev_err(dev, "missing baudclk for master mode\n");
 		return -EINVAL;
 	}
 
-	fsl_ssi_setup_reg_vals(ssi_private);
+	fsl_ssi_setup_regvals(ssi);
 
-	regmap_read(regs, CCSR_SSI_SCR, &scr);
-	scr &= ~(CCSR_SSI_SCR_SYN | CCSR_SSI_SCR_I2S_MODE_MASK);
-	scr |= CCSR_SSI_SCR_SYNC_TX_FS;
+	regmap_read(regs, REG_SSI_SCR, &scr);
+	scr &= ~(SSI_SCR_SYN | SSI_SCR_I2S_MODE_MASK);
+	/* Synchronize frame sync clock for TE to avoid data slipping */
+	scr |= SSI_SCR_SYNC_TX_FS;
 
-	mask = CCSR_SSI_STCR_TXBIT0 | CCSR_SSI_STCR_TFDIR | CCSR_SSI_STCR_TXDIR |
-		CCSR_SSI_STCR_TSCKP | CCSR_SSI_STCR_TFSI | CCSR_SSI_STCR_TFSL |
-		CCSR_SSI_STCR_TEFS;
-	regmap_read(regs, CCSR_SSI_STCR, &stcr);
-	regmap_read(regs, CCSR_SSI_SRCR, &srcr);
+	mask = SSI_STCR_TXBIT0 | SSI_STCR_TFDIR | SSI_STCR_TXDIR |
+	       SSI_STCR_TSCKP | SSI_STCR_TFSI | SSI_STCR_TFSL | SSI_STCR_TEFS;
+	regmap_read(regs, REG_SSI_STCR, &stcr);
+	regmap_read(regs, REG_SSI_SRCR, &srcr);
 	stcr &= ~mask;
 	srcr &= ~mask;
 
-	ssi_private->i2s_mode = CCSR_SSI_SCR_NET;
+	/* Use Network mode as default */
+	ssi->i2s_net = SSI_SCR_NET;
 	switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) {
 	case SND_SOC_DAIFMT_I2S:
-		regmap_update_bits(regs, CCSR_SSI_STCCR,
-				   CCSR_SSI_SxCCR_DC_MASK,
-				   CCSR_SSI_SxCCR_DC(2));
-		regmap_update_bits(regs, CCSR_SSI_SRCCR,
-				   CCSR_SSI_SxCCR_DC_MASK,
-				   CCSR_SSI_SxCCR_DC(2));
+		regmap_update_bits(regs, REG_SSI_STCCR,
+				   SSI_SxCCR_DC_MASK, SSI_SxCCR_DC(2));
+		regmap_update_bits(regs, REG_SSI_SRCCR,
+				   SSI_SxCCR_DC_MASK, SSI_SxCCR_DC(2));
 		switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
 		case SND_SOC_DAIFMT_CBM_CFS:
 		case SND_SOC_DAIFMT_CBS_CFS:
-			ssi_private->i2s_mode |= CCSR_SSI_SCR_I2S_MODE_MASTER;
+			ssi->i2s_net |= SSI_SCR_I2S_MODE_MASTER;
 			break;
 		case SND_SOC_DAIFMT_CBM_CFM:
-			ssi_private->i2s_mode |= CCSR_SSI_SCR_I2S_MODE_SLAVE;
+			ssi->i2s_net |= SSI_SCR_I2S_MODE_SLAVE;
 			break;
 		default:
 			return -EINVAL;
 		}
 
 		/* Data on rising edge of bclk, frame low, 1clk before data */
-		strcr |= CCSR_SSI_STCR_TFSI | CCSR_SSI_STCR_TSCKP |
-			CCSR_SSI_STCR_TXBIT0 | CCSR_SSI_STCR_TEFS;
+		strcr |= SSI_STCR_TFSI | SSI_STCR_TSCKP |
+			 SSI_STCR_TXBIT0 | SSI_STCR_TEFS;
 		break;
 	case SND_SOC_DAIFMT_LEFT_J:
 		/* Data on rising edge of bclk, frame high */
-		strcr |= CCSR_SSI_STCR_TXBIT0 | CCSR_SSI_STCR_TSCKP;
+		strcr |= SSI_STCR_TXBIT0 | SSI_STCR_TSCKP;
 		break;
 	case SND_SOC_DAIFMT_DSP_A:
 		/* Data on rising edge of bclk, frame high, 1clk before data */
-		strcr |= CCSR_SSI_STCR_TFSL | CCSR_SSI_STCR_TSCKP |
-			CCSR_SSI_STCR_TXBIT0 | CCSR_SSI_STCR_TEFS;
+		strcr |= SSI_STCR_TFSL | SSI_STCR_TSCKP |
+			 SSI_STCR_TXBIT0 | SSI_STCR_TEFS;
 		break;
 	case SND_SOC_DAIFMT_DSP_B:
 		/* Data on rising edge of bclk, frame high */
-		strcr |= CCSR_SSI_STCR_TFSL | CCSR_SSI_STCR_TSCKP |
-			CCSR_SSI_STCR_TXBIT0;
+		strcr |= SSI_STCR_TFSL | SSI_STCR_TSCKP | SSI_STCR_TXBIT0;
 		break;
 	case SND_SOC_DAIFMT_AC97:
-		ssi_private->i2s_mode |= CCSR_SSI_SCR_I2S_MODE_NORMAL;
+		/* Data on falling edge of bclk, frame high, 1clk before data */
+		ssi->i2s_net |= SSI_SCR_I2S_MODE_NORMAL;
 		break;
 	default:
 		return -EINVAL;
 	}
-	scr |= ssi_private->i2s_mode;
+	scr |= ssi->i2s_net;
 
 	/* DAI clock inversion */
 	switch (fmt & SND_SOC_DAIFMT_INV_MASK) {
@@ -1004,16 +949,16 @@ static int _fsl_ssi_set_dai_fmt(struct device *dev,
 		break;
 	case SND_SOC_DAIFMT_IB_NF:
 		/* Invert bit clock */
-		strcr ^= CCSR_SSI_STCR_TSCKP;
+		strcr ^= SSI_STCR_TSCKP;
 		break;
 	case SND_SOC_DAIFMT_NB_IF:
 		/* Invert frame clock */
-		strcr ^= CCSR_SSI_STCR_TFSI;
+		strcr ^= SSI_STCR_TFSI;
 		break;
 	case SND_SOC_DAIFMT_IB_IF:
 		/* Invert both clocks */
-		strcr ^= CCSR_SSI_STCR_TSCKP;
-		strcr ^= CCSR_SSI_STCR_TFSI;
+		strcr ^= SSI_STCR_TSCKP;
+		strcr ^= SSI_STCR_TFSI;
 		break;
 	default:
 		return -EINVAL;
@@ -1022,123 +967,122 @@ static int _fsl_ssi_set_dai_fmt(struct device *dev,
 	/* DAI clock master masks */
 	switch (fmt & SND_SOC_DAIFMT_MASTER_MASK) {
 	case SND_SOC_DAIFMT_CBS_CFS:
-		strcr |= CCSR_SSI_STCR_TFDIR | CCSR_SSI_STCR_TXDIR;
-		scr |= CCSR_SSI_SCR_SYS_CLK_EN;
+		/* Output bit and frame sync clocks */
+		strcr |= SSI_STCR_TFDIR | SSI_STCR_TXDIR;
+		scr |= SSI_SCR_SYS_CLK_EN;
 		break;
 	case SND_SOC_DAIFMT_CBM_CFM:
-		scr &= ~CCSR_SSI_SCR_SYS_CLK_EN;
+		/* Input bit or frame sync clocks */
+		scr &= ~SSI_SCR_SYS_CLK_EN;
 		break;
 	case SND_SOC_DAIFMT_CBM_CFS:
-		strcr &= ~CCSR_SSI_STCR_TXDIR;
-		strcr |= CCSR_SSI_STCR_TFDIR;
-		scr &= ~CCSR_SSI_SCR_SYS_CLK_EN;
+		/* Input bit clock but output frame sync clock */
+		strcr &= ~SSI_STCR_TXDIR;
+		strcr |= SSI_STCR_TFDIR;
+		scr &= ~SSI_SCR_SYS_CLK_EN;
 		break;
 	default:
-		if (!fsl_ssi_is_ac97(ssi_private))
+		if (!fsl_ssi_is_ac97(ssi))
 			return -EINVAL;
 	}
 
 	stcr |= strcr;
 	srcr |= strcr;
 
-	if (ssi_private->cpu_dai_drv.symmetric_rates
-			|| fsl_ssi_is_ac97(ssi_private)) {
-		/* Need to clear RXDIR when using SYNC or AC97 mode */
-		srcr &= ~CCSR_SSI_SRCR_RXDIR;
-		scr |= CCSR_SSI_SCR_SYN;
+	/* Set SYN mode and clear RXDIR bit when using SYN or AC97 mode */
+	if (ssi->cpu_dai_drv.symmetric_rates || fsl_ssi_is_ac97(ssi)) {
+		srcr &= ~SSI_SRCR_RXDIR;
+		scr |= SSI_SCR_SYN;
 	}
 
-	regmap_write(regs, CCSR_SSI_STCR, stcr);
-	regmap_write(regs, CCSR_SSI_SRCR, srcr);
-	regmap_write(regs, CCSR_SSI_SCR, scr);
+	regmap_write(regs, REG_SSI_STCR, stcr);
+	regmap_write(regs, REG_SSI_SRCR, srcr);
+	regmap_write(regs, REG_SSI_SCR, scr);
 
-	wm = ssi_private->fifo_watermark;
+	wm = ssi->fifo_watermark;
 
-	regmap_write(regs, CCSR_SSI_SFCSR,
-			CCSR_SSI_SFCSR_TFWM0(wm) | CCSR_SSI_SFCSR_RFWM0(wm) |
-			CCSR_SSI_SFCSR_TFWM1(wm) | CCSR_SSI_SFCSR_RFWM1(wm));
+	regmap_write(regs, REG_SSI_SFCSR,
+		     SSI_SFCSR_TFWM0(wm) | SSI_SFCSR_RFWM0(wm) |
+		     SSI_SFCSR_TFWM1(wm) | SSI_SFCSR_RFWM1(wm));
 
-	if (ssi_private->use_dual_fifo) {
-		regmap_update_bits(regs, CCSR_SSI_SRCR, CCSR_SSI_SRCR_RFEN1,
-				CCSR_SSI_SRCR_RFEN1);
-		regmap_update_bits(regs, CCSR_SSI_STCR, CCSR_SSI_STCR_TFEN1,
-				CCSR_SSI_STCR_TFEN1);
-		regmap_update_bits(regs, CCSR_SSI_SCR, CCSR_SSI_SCR_TCH_EN,
-				CCSR_SSI_SCR_TCH_EN);
+	if (ssi->use_dual_fifo) {
+		regmap_update_bits(regs, REG_SSI_SRCR,
+				   SSI_SRCR_RFEN1, SSI_SRCR_RFEN1);
+		regmap_update_bits(regs, REG_SSI_STCR,
+				   SSI_STCR_TFEN1, SSI_STCR_TFEN1);
+		regmap_update_bits(regs, REG_SSI_SCR,
+				   SSI_SCR_TCH_EN, SSI_SCR_TCH_EN);
 	}
 
 	if ((fmt & SND_SOC_DAIFMT_FORMAT_MASK) == SND_SOC_DAIFMT_AC97)
-		fsl_ssi_setup_ac97(ssi_private);
+		fsl_ssi_setup_ac97(ssi);
 
 	return 0;
-
 }
 
 /**
- * fsl_ssi_set_dai_fmt - configure Digital Audio Interface Format.
+ * Configure Digital Audio Interface (DAI) Format
  */
-static int fsl_ssi_set_dai_fmt(struct snd_soc_dai *cpu_dai, unsigned int fmt)
+static int fsl_ssi_set_dai_fmt(struct snd_soc_dai *dai, unsigned int fmt)
 {
-	struct fsl_ssi_private *ssi_private = snd_soc_dai_get_drvdata(cpu_dai);
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(dai);
 
-	return _fsl_ssi_set_dai_fmt(cpu_dai->dev, ssi_private, fmt);
+	/* AC97 configured DAIFMT earlier in the probe() */
+	if (fsl_ssi_is_ac97(ssi))
+		return 0;
+
+	return _fsl_ssi_set_dai_fmt(dai->dev, ssi, fmt);
 }
 
 /**
- * fsl_ssi_set_dai_tdm_slot - set TDM slot number
- *
- * Note: This function can be only called when using SSI as DAI master
+ * Set TDM slot number and slot width
  */
-static int fsl_ssi_set_dai_tdm_slot(struct snd_soc_dai *cpu_dai, u32 tx_mask,
-				u32 rx_mask, int slots, int slot_width)
+static int fsl_ssi_set_dai_tdm_slot(struct snd_soc_dai *dai, u32 tx_mask,
+				    u32 rx_mask, int slots, int slot_width)
 {
-	struct fsl_ssi_private *ssi_private = snd_soc_dai_get_drvdata(cpu_dai);
-	struct regmap *regs = ssi_private->regs;
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(dai);
+	struct regmap *regs = ssi->regs;
 	u32 val;
 
 	/* The word length should be 8, 10, 12, 16, 18, 20, 22 or 24 */
 	if (slot_width & 1 || slot_width < 8 || slot_width > 24) {
-		dev_err(cpu_dai->dev, "invalid slot width: %d\n", slot_width);
+		dev_err(dai->dev, "invalid slot width: %d\n", slot_width);
 		return -EINVAL;
 	}
 
 	/* The slot number should be >= 2 if using Network mode or I2S mode */
-	regmap_read(regs, CCSR_SSI_SCR, &val);
-	val &= CCSR_SSI_SCR_I2S_MODE_MASK | CCSR_SSI_SCR_NET;
+	regmap_read(regs, REG_SSI_SCR, &val);
+	val &= SSI_SCR_I2S_MODE_MASK | SSI_SCR_NET;
 	if (val && slots < 2) {
-		dev_err(cpu_dai->dev, "slot number should be >= 2 in I2S or NET\n");
+		dev_err(dai->dev, "slot number should be >= 2 in I2S or NET\n");
 		return -EINVAL;
 	}
 
-	regmap_update_bits(regs, CCSR_SSI_STCCR, CCSR_SSI_SxCCR_DC_MASK,
-			CCSR_SSI_SxCCR_DC(slots));
-	regmap_update_bits(regs, CCSR_SSI_SRCCR, CCSR_SSI_SxCCR_DC_MASK,
-			CCSR_SSI_SxCCR_DC(slots));
+	regmap_update_bits(regs, REG_SSI_STCCR,
+			   SSI_SxCCR_DC_MASK, SSI_SxCCR_DC(slots));
+	regmap_update_bits(regs, REG_SSI_SRCCR,
+			   SSI_SxCCR_DC_MASK, SSI_SxCCR_DC(slots));
 
-	/* The register SxMSKs needs SSI to provide essential clock due to
-	 * hardware design. So we here temporarily enable SSI to set them.
-	 */
-	regmap_read(regs, CCSR_SSI_SCR, &val);
-	val &= CCSR_SSI_SCR_SSIEN;
-	regmap_update_bits(regs, CCSR_SSI_SCR, CCSR_SSI_SCR_SSIEN,
-			CCSR_SSI_SCR_SSIEN);
+	/* Save SSIEN bit of the SCR register */
+	regmap_read(regs, REG_SSI_SCR, &val);
+	val &= SSI_SCR_SSIEN;
+	/* Temporarily enable SSI to allow SxMSKs to be configurable */
+	regmap_update_bits(regs, REG_SSI_SCR, SSI_SCR_SSIEN, SSI_SCR_SSIEN);
 
-	regmap_write(regs, CCSR_SSI_STMSK, ~tx_mask);
-	regmap_write(regs, CCSR_SSI_SRMSK, ~rx_mask);
+	regmap_write(regs, REG_SSI_STMSK, ~tx_mask);
+	regmap_write(regs, REG_SSI_SRMSK, ~rx_mask);
 
-	regmap_update_bits(regs, CCSR_SSI_SCR, CCSR_SSI_SCR_SSIEN, val);
+	/* Restore the value of SSIEN bit */
+	regmap_update_bits(regs, REG_SSI_SCR, SSI_SCR_SSIEN, val);
 
-	ssi_private->slot_width = slot_width;
-	ssi_private->slots = slots;
+	ssi->slot_width = slot_width;
+	ssi->slots = slots;
 
 	return 0;
 }
 
 /**
- * fsl_ssi_trigger: start and stop the DMA transfer.
- *
- * This function is called by ALSA to start, stop, pause, and resume the DMA
- * transfer of data.
+ * Start or stop SSI and corresponding DMA transaction.
  *
  * The DMA channel is in external master start and pause mode, which
  * means the SSI completely controls the flow of data.
@@ -1147,37 +1091,38 @@ static int fsl_ssi_trigger(struct snd_pcm_substream *substream, int cmd,
 			   struct snd_soc_dai *dai)
 {
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
-	struct fsl_ssi_private *ssi_private = snd_soc_dai_get_drvdata(rtd->cpu_dai);
-	struct regmap *regs = ssi_private->regs;
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(rtd->cpu_dai);
+	struct regmap *regs = ssi->regs;
 
 	switch (cmd) {
 	case SNDRV_PCM_TRIGGER_START:
 	case SNDRV_PCM_TRIGGER_RESUME:
 	case SNDRV_PCM_TRIGGER_PAUSE_RELEASE:
 		if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
-			fsl_ssi_tx_config(ssi_private, true);
+			fsl_ssi_tx_config(ssi, true);
 		else
-			fsl_ssi_rx_config(ssi_private, true);
+			fsl_ssi_rx_config(ssi, true);
 		break;
 
 	case SNDRV_PCM_TRIGGER_STOP:
 	case SNDRV_PCM_TRIGGER_SUSPEND:
 	case SNDRV_PCM_TRIGGER_PAUSE_PUSH:
 		if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
-			fsl_ssi_tx_config(ssi_private, false);
+			fsl_ssi_tx_config(ssi, false);
 		else
-			fsl_ssi_rx_config(ssi_private, false);
+			fsl_ssi_rx_config(ssi, false);
 		break;
 
 	default:
 		return -EINVAL;
 	}
 
-	if (fsl_ssi_is_ac97(ssi_private)) {
+	/* Clear corresponding FIFO */
+	if (fsl_ssi_is_ac97(ssi)) {
 		if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK)
-			regmap_write(regs, CCSR_SSI_SOR, CCSR_SSI_SOR_TX_CLR);
+			regmap_write(regs, REG_SSI_SOR, SSI_SOR_TX_CLR);
 		else
-			regmap_write(regs, CCSR_SSI_SOR, CCSR_SSI_SOR_RX_CLR);
+			regmap_write(regs, REG_SSI_SOR, SSI_SOR_RX_CLR);
 	}
 
 	return 0;
@@ -1185,27 +1130,26 @@ static int fsl_ssi_trigger(struct snd_pcm_substream *substream, int cmd,
 
 static int fsl_ssi_dai_probe(struct snd_soc_dai *dai)
 {
-	struct fsl_ssi_private *ssi_private = snd_soc_dai_get_drvdata(dai);
+	struct fsl_ssi *ssi = snd_soc_dai_get_drvdata(dai);
 
-	if (ssi_private->soc->imx && ssi_private->use_dma) {
-		dai->playback_dma_data = &ssi_private->dma_params_tx;
-		dai->capture_dma_data = &ssi_private->dma_params_rx;
+	if (ssi->soc->imx && ssi->use_dma) {
+		dai->playback_dma_data = &ssi->dma_params_tx;
+		dai->capture_dma_data = &ssi->dma_params_rx;
 	}
 
 	return 0;
 }
 
 static const struct snd_soc_dai_ops fsl_ssi_dai_ops = {
-	.startup	= fsl_ssi_startup,
-	.shutdown       = fsl_ssi_shutdown,
-	.hw_params	= fsl_ssi_hw_params,
-	.hw_free	= fsl_ssi_hw_free,
-	.set_fmt	= fsl_ssi_set_dai_fmt,
-	.set_tdm_slot	= fsl_ssi_set_dai_tdm_slot,
-	.trigger	= fsl_ssi_trigger,
+	.startup = fsl_ssi_startup,
+	.shutdown = fsl_ssi_shutdown,
+	.hw_params = fsl_ssi_hw_params,
+	.hw_free = fsl_ssi_hw_free,
+	.set_fmt = fsl_ssi_set_dai_fmt,
+	.set_tdm_slot = fsl_ssi_set_dai_tdm_slot,
+	.trigger = fsl_ssi_trigger,
 };
 
-/* Template for the CPU dai driver structure */
 static struct snd_soc_dai_driver fsl_ssi_dai_template = {
 	.probe = fsl_ssi_dai_probe,
 	.playback = {
@@ -1226,7 +1170,7 @@ static struct snd_soc_dai_driver fsl_ssi_dai_template = {
 };
 
 static const struct snd_soc_component_driver fsl_ssi_component = {
-	.name		= "fsl-ssi",
+	.name = "fsl-ssi",
 };
 
 static struct snd_soc_dai_driver fsl_ssi_ac97_dai = {
@@ -1237,23 +1181,23 @@ static struct snd_soc_dai_driver fsl_ssi_ac97_dai = {
 		.channels_min = 2,
 		.channels_max = 2,
 		.rates = SNDRV_PCM_RATE_8000_48000,
-		.formats = SNDRV_PCM_FMTBIT_S16_LE,
+		.formats = SNDRV_PCM_FMTBIT_S16 | SNDRV_PCM_FMTBIT_S20,
 	},
 	.capture = {
 		.stream_name = "AC97 Capture",
 		.channels_min = 2,
 		.channels_max = 2,
 		.rates = SNDRV_PCM_RATE_48000,
-		.formats = SNDRV_PCM_FMTBIT_S16_LE,
+		/* 16-bit capture is broken (errata ERR003778) */
+		.formats = SNDRV_PCM_FMTBIT_S20,
 	},
 	.ops = &fsl_ssi_dai_ops,
 };
 
-
-static struct fsl_ssi_private *fsl_ac97_data;
+static struct fsl_ssi *fsl_ac97_data;
 
 static void fsl_ssi_ac97_write(struct snd_ac97 *ac97, unsigned short reg,
-		unsigned short val)
+			       unsigned short val)
 {
 	struct regmap *regs = fsl_ac97_data->regs;
 	unsigned int lreg;
@@ -1273,13 +1217,13 @@ static void fsl_ssi_ac97_write(struct snd_ac97 *ac97, unsigned short reg,
 	}
 
 	lreg = reg <<  12;
-	regmap_write(regs, CCSR_SSI_SACADD, lreg);
+	regmap_write(regs, REG_SSI_SACADD, lreg);
 
 	lval = val << 4;
-	regmap_write(regs, CCSR_SSI_SACDAT, lval);
+	regmap_write(regs, REG_SSI_SACDAT, lval);
 
-	regmap_update_bits(regs, CCSR_SSI_SACNT, CCSR_SSI_SACNT_RDWR_MASK,
-			CCSR_SSI_SACNT_WR);
+	regmap_update_bits(regs, REG_SSI_SACNT,
+			   SSI_SACNT_RDWR_MASK, SSI_SACNT_WR);
 	udelay(100);
 
 	clk_disable_unprepare(fsl_ac97_data->clk);
@@ -1289,10 +1233,9 @@ static void fsl_ssi_ac97_write(struct snd_ac97 *ac97, unsigned short reg,
 }
 
 static unsigned short fsl_ssi_ac97_read(struct snd_ac97 *ac97,
-		unsigned short reg)
+					unsigned short reg)
 {
 	struct regmap *regs = fsl_ac97_data->regs;
-
 	unsigned short val = 0;
 	u32 reg_val;
 	unsigned int lreg;
@@ -1302,19 +1245,18 @@ static unsigned short fsl_ssi_ac97_read(struct snd_ac97 *ac97,
 
 	ret = clk_prepare_enable(fsl_ac97_data->clk);
 	if (ret) {
-		pr_err("ac97 read clk_prepare_enable failed: %d\n",
-			ret);
+		pr_err("ac97 read clk_prepare_enable failed: %d\n", ret);
 		goto ret_unlock;
 	}
 
 	lreg = (reg & 0x7f) <<  12;
-	regmap_write(regs, CCSR_SSI_SACADD, lreg);
-	regmap_update_bits(regs, CCSR_SSI_SACNT, CCSR_SSI_SACNT_RDWR_MASK,
-			CCSR_SSI_SACNT_RD);
+	regmap_write(regs, REG_SSI_SACADD, lreg);
+	regmap_update_bits(regs, REG_SSI_SACNT,
+			   SSI_SACNT_RDWR_MASK, SSI_SACNT_RD);
 
 	udelay(100);
 
-	regmap_read(regs, CCSR_SSI_SACDAT, &reg_val);
+	regmap_read(regs, REG_SSI_SACDAT, &reg_val);
 	val = (reg_val >> 4) & 0xffff;
 
 	clk_disable_unprepare(fsl_ac97_data->clk);
@@ -1325,8 +1267,8 @@ static unsigned short fsl_ssi_ac97_read(struct snd_ac97 *ac97,
 }
 
 static struct snd_ac97_bus_ops fsl_ssi_ac97_ops = {
-	.read		= fsl_ssi_ac97_read,
-	.write		= fsl_ssi_ac97_write,
+	.read = fsl_ssi_ac97_read,
+	.write = fsl_ssi_ac97_write,
 };
 
 /**
@@ -1341,70 +1283,67 @@ static void make_lowercase(char *s)
 }
 
 static int fsl_ssi_imx_probe(struct platform_device *pdev,
-		struct fsl_ssi_private *ssi_private, void __iomem *iomem)
+			     struct fsl_ssi *ssi, void __iomem *iomem)
 {
 	struct device_node *np = pdev->dev.of_node;
+	struct device *dev = &pdev->dev;
 	u32 dmas[4];
 	int ret;
 
-	if (ssi_private->has_ipg_clk_name)
-		ssi_private->clk = devm_clk_get(&pdev->dev, "ipg");
+	/* Backward compatible for a DT without ipg clock name assigned */
+	if (ssi->has_ipg_clk_name)
+		ssi->clk = devm_clk_get(dev, "ipg");
 	else
-		ssi_private->clk = devm_clk_get(&pdev->dev, NULL);
-	if (IS_ERR(ssi_private->clk)) {
-		ret = PTR_ERR(ssi_private->clk);
-		dev_err(&pdev->dev, "could not get clock: %d\n", ret);
+		ssi->clk = devm_clk_get(dev, NULL);
+	if (IS_ERR(ssi->clk)) {
+		ret = PTR_ERR(ssi->clk);
+		dev_err(dev, "failed to get clock: %d\n", ret);
 		return ret;
 	}
 
-	if (!ssi_private->has_ipg_clk_name) {
-		ret = clk_prepare_enable(ssi_private->clk);
+	/* Enable the clock since regmap will not handle it in this case */
+	if (!ssi->has_ipg_clk_name) {
+		ret = clk_prepare_enable(ssi->clk);
 		if (ret) {
-			dev_err(&pdev->dev, "clk_prepare_enable failed: %d\n", ret);
+			dev_err(dev, "clk_prepare_enable failed: %d\n", ret);
 			return ret;
 		}
 	}
 
-	/* For those SLAVE implementations, we ignore non-baudclk cases
-	 * and, instead, abandon MASTER mode that needs baud clock.
-	 */
-	ssi_private->baudclk = devm_clk_get(&pdev->dev, "baud");
-	if (IS_ERR(ssi_private->baudclk))
-		dev_dbg(&pdev->dev, "could not get baud clock: %ld\n",
-			 PTR_ERR(ssi_private->baudclk));
+	/* Do not error out for slave cases that live without a baud clock */
+	ssi->baudclk = devm_clk_get(dev, "baud");
+	if (IS_ERR(ssi->baudclk))
+		dev_dbg(dev, "failed to get baud clock: %ld\n",
+			 PTR_ERR(ssi->baudclk));
 
-	ssi_private->dma_params_tx.maxburst = ssi_private->dma_maxburst;
-	ssi_private->dma_params_rx.maxburst = ssi_private->dma_maxburst;
-	ssi_private->dma_params_tx.addr = ssi_private->ssi_phys + CCSR_SSI_STX0;
-	ssi_private->dma_params_rx.addr = ssi_private->ssi_phys + CCSR_SSI_SRX0;
+	ssi->dma_params_tx.maxburst = ssi->dma_maxburst;
+	ssi->dma_params_rx.maxburst = ssi->dma_maxburst;
+	ssi->dma_params_tx.addr = ssi->ssi_phys + REG_SSI_STX0;
+	ssi->dma_params_rx.addr = ssi->ssi_phys + REG_SSI_SRX0;
 
+	/* Set to dual FIFO mode according to the SDMA sciprt */
 	ret = of_property_read_u32_array(np, "dmas", dmas, 4);
-	if (ssi_private->use_dma && !ret && dmas[2] == IMX_DMATYPE_SSI_DUAL) {
-		ssi_private->use_dual_fifo = true;
-		/* When using dual fifo mode, we need to keep watermark
-		 * as even numbers due to dma script limitation.
+	if (ssi->use_dma && !ret && dmas[2] == IMX_DMATYPE_SSI_DUAL) {
+		ssi->use_dual_fifo = true;
+		/*
+		 * Use even numbers to avoid channel swap due to SDMA
+		 * script design
 		 */
-		ssi_private->dma_params_tx.maxburst &= ~0x1;
-		ssi_private->dma_params_rx.maxburst &= ~0x1;
+		ssi->dma_params_tx.maxburst &= ~0x1;
+		ssi->dma_params_rx.maxburst &= ~0x1;
 	}
 
-	if (!ssi_private->use_dma) {
-
+	if (!ssi->use_dma) {
 		/*
-		 * Some boards use an incompatible codec. To get it
-		 * working, we are using imx-fiq-pcm-audio, that
-		 * can handle those codecs. DMA is not possible in this
-		 * situation.
+		 * Some boards use an incompatible codec. Use imx-fiq-pcm-audio
+		 * to get it working, as DMA is not possible in this situation.
 		 */
+		ssi->fiq_params.irq = ssi->irq;
+		ssi->fiq_params.base = iomem;
+		ssi->fiq_params.dma_params_rx = &ssi->dma_params_rx;
+		ssi->fiq_params.dma_params_tx = &ssi->dma_params_tx;
 
-		ssi_private->fiq_params.irq = ssi_private->irq;
-		ssi_private->fiq_params.base = iomem;
-		ssi_private->fiq_params.dma_params_rx =
-			&ssi_private->dma_params_rx;
-		ssi_private->fiq_params.dma_params_tx =
-			&ssi_private->dma_params_tx;
-
-		ret = imx_pcm_fiq_init(pdev, &ssi_private->fiq_params);
+		ret = imx_pcm_fiq_init(pdev, &ssi->fiq_params);
 		if (ret)
 			goto error_pcm;
 	} else {
@@ -1416,26 +1355,26 @@ static int fsl_ssi_imx_probe(struct platform_device *pdev,
 	return 0;
 
 error_pcm:
+	if (!ssi->has_ipg_clk_name)
+		clk_disable_unprepare(ssi->clk);
 
-	if (!ssi_private->has_ipg_clk_name)
-		clk_disable_unprepare(ssi_private->clk);
 	return ret;
 }
 
-static void fsl_ssi_imx_clean(struct platform_device *pdev,
-		struct fsl_ssi_private *ssi_private)
+static void fsl_ssi_imx_clean(struct platform_device *pdev, struct fsl_ssi *ssi)
 {
-	if (!ssi_private->use_dma)
+	if (!ssi->use_dma)
 		imx_pcm_fiq_exit(pdev);
-	if (!ssi_private->has_ipg_clk_name)
-		clk_disable_unprepare(ssi_private->clk);
+	if (!ssi->has_ipg_clk_name)
+		clk_disable_unprepare(ssi->clk);
 }
 
 static int fsl_ssi_probe(struct platform_device *pdev)
 {
-	struct fsl_ssi_private *ssi_private;
+	struct fsl_ssi *ssi;
 	int ret = 0;
 	struct device_node *np = pdev->dev.of_node;
+	struct device *dev = &pdev->dev;
 	const struct of_device_id *of_id;
 	const char *p, *sprop;
 	const uint32_t *iprop;
@@ -1444,185 +1383,159 @@ static int fsl_ssi_probe(struct platform_device *pdev)
 	char name[64];
 	struct regmap_config regconfig = fsl_ssi_regconfig;
 
-	of_id = of_match_device(fsl_ssi_ids, &pdev->dev);
+	of_id = of_match_device(fsl_ssi_ids, dev);
 	if (!of_id || !of_id->data)
 		return -EINVAL;
 
-	ssi_private = devm_kzalloc(&pdev->dev, sizeof(*ssi_private),
-			GFP_KERNEL);
-	if (!ssi_private)
+	ssi = devm_kzalloc(dev, sizeof(*ssi), GFP_KERNEL);
+	if (!ssi)
 		return -ENOMEM;
 
-	ssi_private->soc = of_id->data;
-	ssi_private->dev = &pdev->dev;
+	ssi->soc = of_id->data;
+	ssi->dev = dev;
 
+	/* Check if being used in AC97 mode */
 	sprop = of_get_property(np, "fsl,mode", NULL);
 	if (sprop) {
 		if (!strcmp(sprop, "ac97-slave"))
-			ssi_private->dai_fmt = SND_SOC_DAIFMT_AC97;
+			ssi->dai_fmt = SND_SOC_DAIFMT_AC97;
 	}
 
-	ssi_private->use_dma = !of_property_read_bool(np,
-			"fsl,fiq-stream-filter");
+	/* Select DMA or FIQ */
+	ssi->use_dma = !of_property_read_bool(np, "fsl,fiq-stream-filter");
 
-	if (fsl_ssi_is_ac97(ssi_private)) {
-		memcpy(&ssi_private->cpu_dai_drv, &fsl_ssi_ac97_dai,
-				sizeof(fsl_ssi_ac97_dai));
-
-		fsl_ac97_data = ssi_private;
+	if (fsl_ssi_is_ac97(ssi)) {
+		memcpy(&ssi->cpu_dai_drv, &fsl_ssi_ac97_dai,
+		       sizeof(fsl_ssi_ac97_dai));
+		fsl_ac97_data = ssi;
 	} else {
-		/* Initialize this copy of the CPU DAI driver structure */
-		memcpy(&ssi_private->cpu_dai_drv, &fsl_ssi_dai_template,
+		memcpy(&ssi->cpu_dai_drv, &fsl_ssi_dai_template,
 		       sizeof(fsl_ssi_dai_template));
 	}
-	ssi_private->cpu_dai_drv.name = dev_name(&pdev->dev);
+	ssi->cpu_dai_drv.name = dev_name(dev);
 
 	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-	iomem = devm_ioremap_resource(&pdev->dev, res);
+	iomem = devm_ioremap_resource(dev, res);
 	if (IS_ERR(iomem))
 		return PTR_ERR(iomem);
-	ssi_private->ssi_phys = res->start;
+	ssi->ssi_phys = res->start;
 
-	if (ssi_private->soc->imx21regs) {
-		/*
-		 * According to datasheet imx21-class SSI
-		 * don't have SACC{ST,EN,DIS} regs.
-		 */
-		regconfig.max_register = CCSR_SSI_SRMSK;
+	if (ssi->soc->imx21regs) {
+		/* No SACC{ST,EN,DIS} regs in imx21-class SSI */
+		regconfig.max_register = REG_SSI_SRMSK;
 		regconfig.num_reg_defaults_raw =
-			CCSR_SSI_SRMSK / sizeof(uint32_t) + 1;
+			REG_SSI_SRMSK / sizeof(uint32_t) + 1;
 	}
 
 	ret = of_property_match_string(np, "clock-names", "ipg");
 	if (ret < 0) {
-		ssi_private->has_ipg_clk_name = false;
-		ssi_private->regs = devm_regmap_init_mmio(&pdev->dev, iomem,
-			&regconfig);
+		ssi->has_ipg_clk_name = false;
+		ssi->regs = devm_regmap_init_mmio(dev, iomem, &regconfig);
 	} else {
-		ssi_private->has_ipg_clk_name = true;
-		ssi_private->regs = devm_regmap_init_mmio_clk(&pdev->dev,
-			"ipg", iomem, &regconfig);
+		ssi->has_ipg_clk_name = true;
+		ssi->regs = devm_regmap_init_mmio_clk(dev, "ipg", iomem,
+						      &regconfig);
 	}
-	if (IS_ERR(ssi_private->regs)) {
-		dev_err(&pdev->dev, "Failed to init register map\n");
-		return PTR_ERR(ssi_private->regs);
+	if (IS_ERR(ssi->regs)) {
+		dev_err(dev, "failed to init register map\n");
+		return PTR_ERR(ssi->regs);
 	}
 
-	ssi_private->irq = platform_get_irq(pdev, 0);
-	if (ssi_private->irq < 0) {
-		dev_err(&pdev->dev, "no irq for node %s\n", pdev->name);
-		return ssi_private->irq;
+	ssi->irq = platform_get_irq(pdev, 0);
+	if (ssi->irq < 0) {
+		dev_err(dev, "no irq for node %s\n", pdev->name);
+		return ssi->irq;
 	}
 
-	/* Are the RX and the TX clocks locked? */
+	/* Set software limitations for synchronous mode */
 	if (!of_find_property(np, "fsl,ssi-asynchronous", NULL)) {
-		if (!fsl_ssi_is_ac97(ssi_private))
-			ssi_private->cpu_dai_drv.symmetric_rates = 1;
+		if (!fsl_ssi_is_ac97(ssi)) {
+			ssi->cpu_dai_drv.symmetric_rates = 1;
+			ssi->cpu_dai_drv.symmetric_samplebits = 1;
+		}
 
-		ssi_private->cpu_dai_drv.symmetric_channels = 1;
-		ssi_private->cpu_dai_drv.symmetric_samplebits = 1;
+		ssi->cpu_dai_drv.symmetric_channels = 1;
 	}
 
-	/* Determine the FIFO depth. */
+	/* Fetch FIFO depth; Set to 8 for older DT without this property */
 	iprop = of_get_property(np, "fsl,fifo-depth", NULL);
 	if (iprop)
-		ssi_private->fifo_depth = be32_to_cpup(iprop);
+		ssi->fifo_depth = be32_to_cpup(iprop);
 	else
-                /* Older 8610 DTs didn't have the fifo-depth property */
-		ssi_private->fifo_depth = 8;
+		ssi->fifo_depth = 8;
 
 	/*
-	 * Set the watermark for transmit FIFO 0 and receive FIFO 0. We don't
-	 * use FIFO 1 but set the watermark appropriately nontheless.
-	 * We program the transmit water to signal a DMA transfer
-	 * if there are N elements left in the FIFO. For chips with 15-deep
-	 * FIFOs, set watermark to 8.  This allows the SSI to operate at a
-	 * high data rate without channel slipping. Behavior is unchanged
-	 * for the older chips with a fifo depth of only 8.  A value of 4
-	 * might be appropriate for the older chips, but is left at
-	 * fifo_depth-2 until sombody has a chance to test.
+	 * Configure TX and RX DMA watermarks -- when to send a DMA request
 	 *
-	 * We set the watermark on the same level as the DMA burstsize.  For
-	 * fiq it is probably better to use the biggest possible watermark
-	 * size.
+	 * Values should be tested to avoid FIFO under/over run. Set maxburst
+	 * to fifo_watermark to maxiumize DMA transaction to reduce overhead.
 	 */
-	switch (ssi_private->fifo_depth) {
+	switch (ssi->fifo_depth) {
 	case 15:
 		/*
-		 * 2 samples is not enough when running at high data
-		 * rates (like 48kHz @ 16 bits/channel, 16 channels)
-		 * 8 seems to split things evenly and leave enough time
-		 * for the DMA to fill the FIFO before it's over/under
-		 * run.
+		 * Set to 8 as a balanced configuration -- When TX FIFO has 8
+		 * empty slots, send a DMA request to fill these 8 slots. The
+		 * remaining 7 slots should be able to allow DMA to finish the
+		 * transaction before TX FIFO underruns; Same applies to RX.
+		 *
+		 * Tested with cases running at 48kHz @ 16 bits x 16 channels
 		 */
-		ssi_private->fifo_watermark = 8;
-		ssi_private->dma_maxburst = 8;
+		ssi->fifo_watermark = 8;
+		ssi->dma_maxburst = 8;
 		break;
 	case 8:
 	default:
-		/*
-		 * maintain old behavior for older chips.
-		 * Keeping it the same because I don't have an older
-		 * board to test with.
-		 * I suspect this could be changed to be something to
-		 * leave some more space in the fifo.
-		 */
-		ssi_private->fifo_watermark = ssi_private->fifo_depth - 2;
-		ssi_private->dma_maxburst = ssi_private->fifo_depth - 2;
+		/* Safely use old watermark configurations for older chips */
+		ssi->fifo_watermark = ssi->fifo_depth - 2;
+		ssi->dma_maxburst = ssi->fifo_depth - 2;
 		break;
 	}
 
-	dev_set_drvdata(&pdev->dev, ssi_private);
+	dev_set_drvdata(dev, ssi);
 
-	if (ssi_private->soc->imx) {
-		ret = fsl_ssi_imx_probe(pdev, ssi_private, iomem);
+	if (ssi->soc->imx) {
+		ret = fsl_ssi_imx_probe(pdev, ssi, iomem);
 		if (ret)
 			return ret;
 	}
 
-	if (fsl_ssi_is_ac97(ssi_private)) {
-		mutex_init(&ssi_private->ac97_reg_lock);
+	if (fsl_ssi_is_ac97(ssi)) {
+		mutex_init(&ssi->ac97_reg_lock);
 		ret = snd_soc_set_ac97_ops_of_reset(&fsl_ssi_ac97_ops, pdev);
 		if (ret) {
-			dev_err(&pdev->dev, "could not set AC'97 ops\n");
+			dev_err(dev, "failed to set AC'97 ops\n");
 			goto error_ac97_ops;
 		}
 	}
 
-	ret = devm_snd_soc_register_component(&pdev->dev, &fsl_ssi_component,
-					      &ssi_private->cpu_dai_drv, 1);
+	ret = devm_snd_soc_register_component(dev, &fsl_ssi_component,
+					      &ssi->cpu_dai_drv, 1);
 	if (ret) {
-		dev_err(&pdev->dev, "failed to register DAI: %d\n", ret);
+		dev_err(dev, "failed to register DAI: %d\n", ret);
 		goto error_asoc_register;
 	}
 
-	if (ssi_private->use_dma) {
-		ret = devm_request_irq(&pdev->dev, ssi_private->irq,
-					fsl_ssi_isr, 0, dev_name(&pdev->dev),
-					ssi_private);
+	if (ssi->use_dma) {
+		ret = devm_request_irq(dev, ssi->irq, fsl_ssi_isr, 0,
+				       dev_name(dev), ssi);
 		if (ret < 0) {
-			dev_err(&pdev->dev, "could not claim irq %u\n",
-					ssi_private->irq);
+			dev_err(dev, "failed to claim irq %u\n", ssi->irq);
 			goto error_asoc_register;
 		}
 	}
 
-	ret = fsl_ssi_debugfs_create(&ssi_private->dbg_stats, &pdev->dev);
+	ret = fsl_ssi_debugfs_create(&ssi->dbg_stats, dev);
 	if (ret)
 		goto error_asoc_register;
 
-	/*
-	 * If codec-handle property is missing from SSI node, we assume
-	 * that the machine driver uses new binding which does not require
-	 * SSI driver to trigger machine driver's probe.
-	 */
+	/* Bypass it if using newer DT bindings of ASoC machine drivers */
 	if (!of_get_property(np, "codec-handle", NULL))
 		goto done;
 
-	/* Trigger the machine driver's probe function.  The platform driver
-	 * name of the machine driver is taken from /compatible property of the
-	 * device tree.  We also pass the address of the CPU DAI driver
-	 * structure.
+	/*
+	 * Backward compatible for older bindings by manually triggering the
+	 * machine driver's probe(). Use /compatible property, including the
+	 * address of CPU DAI driver structure, as the name of machine driver.
 	 */
 	sprop = of_get_property(of_find_node_by_path("/"), "compatible", NULL);
 	/* Sometimes the compatible name has a "fsl," prefix, so we strip it. */
@@ -1632,34 +1545,31 @@ static int fsl_ssi_probe(struct platform_device *pdev)
 	snprintf(name, sizeof(name), "snd-soc-%s", sprop);
 	make_lowercase(name);
 
-	ssi_private->pdev =
-		platform_device_register_data(&pdev->dev, name, 0, NULL, 0);
-	if (IS_ERR(ssi_private->pdev)) {
-		ret = PTR_ERR(ssi_private->pdev);
-		dev_err(&pdev->dev, "failed to register platform: %d\n", ret);
+	ssi->pdev = platform_device_register_data(dev, name, 0, NULL, 0);
+	if (IS_ERR(ssi->pdev)) {
+		ret = PTR_ERR(ssi->pdev);
+		dev_err(dev, "failed to register platform: %d\n", ret);
 		goto error_sound_card;
 	}
 
 done:
-	if (ssi_private->dai_fmt)
-		_fsl_ssi_set_dai_fmt(&pdev->dev, ssi_private,
-				     ssi_private->dai_fmt);
+	if (ssi->dai_fmt)
+		_fsl_ssi_set_dai_fmt(dev, ssi, ssi->dai_fmt);
 
-	if (fsl_ssi_is_ac97(ssi_private)) {
+	if (fsl_ssi_is_ac97(ssi)) {
 		u32 ssi_idx;
 
 		ret = of_property_read_u32(np, "cell-index", &ssi_idx);
 		if (ret) {
-			dev_err(&pdev->dev, "cannot get SSI index property\n");
+			dev_err(dev, "failed to get SSI index property\n");
 			goto error_sound_card;
 		}
 
-		ssi_private->pdev =
-			platform_device_register_data(NULL,
-					"ac97-codec", ssi_idx, NULL, 0);
-		if (IS_ERR(ssi_private->pdev)) {
-			ret = PTR_ERR(ssi_private->pdev);
-			dev_err(&pdev->dev,
+		ssi->pdev = platform_device_register_data(NULL, "ac97-codec",
+							  ssi_idx, NULL, 0);
+		if (IS_ERR(ssi->pdev)) {
+			ret = PTR_ERR(ssi->pdev);
+			dev_err(dev,
 				"failed to register AC97 codec platform: %d\n",
 				ret);
 			goto error_sound_card;
@@ -1669,37 +1579,35 @@ static int fsl_ssi_probe(struct platform_device *pdev)
 	return 0;
 
 error_sound_card:
-	fsl_ssi_debugfs_remove(&ssi_private->dbg_stats);
-
+	fsl_ssi_debugfs_remove(&ssi->dbg_stats);
 error_asoc_register:
-	if (fsl_ssi_is_ac97(ssi_private))
+	if (fsl_ssi_is_ac97(ssi))
 		snd_soc_set_ac97_ops(NULL);
-
 error_ac97_ops:
-	if (fsl_ssi_is_ac97(ssi_private))
-		mutex_destroy(&ssi_private->ac97_reg_lock);
+	if (fsl_ssi_is_ac97(ssi))
+		mutex_destroy(&ssi->ac97_reg_lock);
 
-	if (ssi_private->soc->imx)
-		fsl_ssi_imx_clean(pdev, ssi_private);
+	if (ssi->soc->imx)
+		fsl_ssi_imx_clean(pdev, ssi);
 
 	return ret;
 }
 
 static int fsl_ssi_remove(struct platform_device *pdev)
 {
-	struct fsl_ssi_private *ssi_private = dev_get_drvdata(&pdev->dev);
+	struct fsl_ssi *ssi = dev_get_drvdata(&pdev->dev);
 
-	fsl_ssi_debugfs_remove(&ssi_private->dbg_stats);
+	fsl_ssi_debugfs_remove(&ssi->dbg_stats);
 
-	if (ssi_private->pdev)
-		platform_device_unregister(ssi_private->pdev);
+	if (ssi->pdev)
+		platform_device_unregister(ssi->pdev);
 
-	if (ssi_private->soc->imx)
-		fsl_ssi_imx_clean(pdev, ssi_private);
+	if (ssi->soc->imx)
+		fsl_ssi_imx_clean(pdev, ssi);
 
-	if (fsl_ssi_is_ac97(ssi_private)) {
+	if (fsl_ssi_is_ac97(ssi)) {
 		snd_soc_set_ac97_ops(NULL);
-		mutex_destroy(&ssi_private->ac97_reg_lock);
+		mutex_destroy(&ssi->ac97_reg_lock);
 	}
 
 	return 0;
@@ -1708,13 +1616,11 @@ static int fsl_ssi_remove(struct platform_device *pdev)
 #ifdef CONFIG_PM_SLEEP
 static int fsl_ssi_suspend(struct device *dev)
 {
-	struct fsl_ssi_private *ssi_private = dev_get_drvdata(dev);
-	struct regmap *regs = ssi_private->regs;
+	struct fsl_ssi *ssi = dev_get_drvdata(dev);
+	struct regmap *regs = ssi->regs;
 
-	regmap_read(regs, CCSR_SSI_SFCSR,
-			&ssi_private->regcache_sfcsr);
-	regmap_read(regs, CCSR_SSI_SACNT,
-			&ssi_private->regcache_sacnt);
+	regmap_read(regs, REG_SSI_SFCSR, &ssi->regcache_sfcsr);
+	regmap_read(regs, REG_SSI_SACNT, &ssi->regcache_sacnt);
 
 	regcache_cache_only(regs, true);
 	regcache_mark_dirty(regs);
@@ -1724,17 +1630,16 @@ static int fsl_ssi_suspend(struct device *dev)
 
 static int fsl_ssi_resume(struct device *dev)
 {
-	struct fsl_ssi_private *ssi_private = dev_get_drvdata(dev);
-	struct regmap *regs = ssi_private->regs;
+	struct fsl_ssi *ssi = dev_get_drvdata(dev);
+	struct regmap *regs = ssi->regs;
 
 	regcache_cache_only(regs, false);
 
-	regmap_update_bits(regs, CCSR_SSI_SFCSR,
-			CCSR_SSI_SFCSR_RFWM1_MASK | CCSR_SSI_SFCSR_TFWM1_MASK |
-			CCSR_SSI_SFCSR_RFWM0_MASK | CCSR_SSI_SFCSR_TFWM0_MASK,
-			ssi_private->regcache_sfcsr);
-	regmap_write(regs, CCSR_SSI_SACNT,
-			ssi_private->regcache_sacnt);
+	regmap_update_bits(regs, REG_SSI_SFCSR,
+			   SSI_SFCSR_RFWM1_MASK | SSI_SFCSR_TFWM1_MASK |
+			   SSI_SFCSR_RFWM0_MASK | SSI_SFCSR_TFWM0_MASK,
+			   ssi->regcache_sfcsr);
+	regmap_write(regs, REG_SSI_SACNT, ssi->regcache_sacnt);
 
 	return regcache_sync(regs);
 }
diff --git a/sound/soc/fsl/fsl_ssi.h b/sound/soc/fsl/fsl_ssi.h
index 5065105..de2fdc5 100644
--- a/sound/soc/fsl/fsl_ssi.h
+++ b/sound/soc/fsl/fsl_ssi.h
@@ -1,5 +1,5 @@
 /*
- * fsl_ssi.h - ALSA SSI interface for the Freescale MPC8610 SoC
+ * fsl_ssi.h - ALSA SSI interface for the Freescale MPC8610 and i.MX SoC
  *
  * Author: Timur Tabi <timur@freescale.com>
  *
@@ -12,198 +12,261 @@
 #ifndef _MPC8610_I2S_H
 #define _MPC8610_I2S_H
 
-/* SSI registers */
-#define CCSR_SSI_STX0			0x00
-#define CCSR_SSI_STX1			0x04
-#define CCSR_SSI_SRX0			0x08
-#define CCSR_SSI_SRX1			0x0c
-#define CCSR_SSI_SCR			0x10
-#define CCSR_SSI_SISR			0x14
-#define CCSR_SSI_SIER			0x18
-#define CCSR_SSI_STCR			0x1c
-#define CCSR_SSI_SRCR			0x20
-#define CCSR_SSI_STCCR			0x24
-#define CCSR_SSI_SRCCR			0x28
-#define CCSR_SSI_SFCSR			0x2c
-#define CCSR_SSI_STR			0x30
-#define CCSR_SSI_SOR			0x34
-#define CCSR_SSI_SACNT			0x38
-#define CCSR_SSI_SACADD			0x3c
-#define CCSR_SSI_SACDAT			0x40
-#define CCSR_SSI_SATAG			0x44
-#define CCSR_SSI_STMSK			0x48
-#define CCSR_SSI_SRMSK			0x4c
-#define CCSR_SSI_SACCST			0x50
-#define CCSR_SSI_SACCEN			0x54
-#define CCSR_SSI_SACCDIS		0x58
+#define RX 0
+#define TX 1
 
-#define CCSR_SSI_SCR_SYNC_TX_FS		0x00001000
-#define CCSR_SSI_SCR_RFR_CLK_DIS	0x00000800
-#define CCSR_SSI_SCR_TFR_CLK_DIS	0x00000400
-#define CCSR_SSI_SCR_TCH_EN		0x00000100
-#define CCSR_SSI_SCR_SYS_CLK_EN		0x00000080
-#define CCSR_SSI_SCR_I2S_MODE_MASK	0x00000060
-#define CCSR_SSI_SCR_I2S_MODE_NORMAL	0x00000000
-#define CCSR_SSI_SCR_I2S_MODE_MASTER	0x00000020
-#define CCSR_SSI_SCR_I2S_MODE_SLAVE	0x00000040
-#define CCSR_SSI_SCR_SYN		0x00000010
-#define CCSR_SSI_SCR_NET		0x00000008
-#define CCSR_SSI_SCR_RE			0x00000004
-#define CCSR_SSI_SCR_TE			0x00000002
-#define CCSR_SSI_SCR_SSIEN		0x00000001
+/* -- SSI Register Map -- */
 
-#define CCSR_SSI_SISR_RFRC		0x01000000
-#define CCSR_SSI_SISR_TFRC		0x00800000
-#define CCSR_SSI_SISR_CMDAU		0x00040000
-#define CCSR_SSI_SISR_CMDDU		0x00020000
-#define CCSR_SSI_SISR_RXT		0x00010000
-#define CCSR_SSI_SISR_RDR1		0x00008000
-#define CCSR_SSI_SISR_RDR0		0x00004000
-#define CCSR_SSI_SISR_TDE1		0x00002000
-#define CCSR_SSI_SISR_TDE0		0x00001000
-#define CCSR_SSI_SISR_ROE1		0x00000800
-#define CCSR_SSI_SISR_ROE0		0x00000400
-#define CCSR_SSI_SISR_TUE1		0x00000200
-#define CCSR_SSI_SISR_TUE0		0x00000100
-#define CCSR_SSI_SISR_TFS		0x00000080
-#define CCSR_SSI_SISR_RFS		0x00000040
-#define CCSR_SSI_SISR_TLS		0x00000020
-#define CCSR_SSI_SISR_RLS		0x00000010
-#define CCSR_SSI_SISR_RFF1		0x00000008
-#define CCSR_SSI_SISR_RFF0		0x00000004
-#define CCSR_SSI_SISR_TFE1		0x00000002
-#define CCSR_SSI_SISR_TFE0		0x00000001
+/* SSI Transmit Data Register 0 */
+#define REG_SSI_STX0			0x00
+/* SSI Transmit Data Register 1 */
+#define REG_SSI_STX1			0x04
+/* SSI Receive Data Register 0 */
+#define REG_SSI_SRX0			0x08
+/* SSI Receive Data Register 1 */
+#define REG_SSI_SRX1			0x0c
+/* SSI Control Register */
+#define REG_SSI_SCR			0x10
+/* SSI Interrupt Status Register */
+#define REG_SSI_SISR			0x14
+/* SSI Interrupt Enable Register */
+#define REG_SSI_SIER			0x18
+/* SSI Transmit Configuration Register */
+#define REG_SSI_STCR			0x1c
+/* SSI Receive Configuration Register */
+#define REG_SSI_SRCR			0x20
+#define REG_SSI_SxCR(tx)		((tx) ? REG_SSI_STCR : REG_SSI_SRCR)
+/* SSI Transmit Clock Control Register */
+#define REG_SSI_STCCR			0x24
+/* SSI Receive Clock Control Register */
+#define REG_SSI_SRCCR			0x28
+#define REG_SSI_SxCCR(tx)		((tx) ? REG_SSI_STCCR : REG_SSI_SRCCR)
+/* SSI FIFO Control/Status Register */
+#define REG_SSI_SFCSR			0x2c
+/*
+ * SSI Test Register (Intended for debugging purposes only)
+ *
+ * Note: STR is not documented in recent IMX datasheet, but
+ * is described in IMX51 reference manual at section 56.3.3.14
+ */
+#define REG_SSI_STR			0x30
+/*
+ * SSI Option Register (Intended for internal use only)
+ *
+ * Note: SOR is not documented in recent IMX datasheet, but
+ * is described in IMX51 reference manual at section 56.3.3.15
+ */
+#define REG_SSI_SOR			0x34
+/* SSI AC97 Control Register */
+#define REG_SSI_SACNT			0x38
+/* SSI AC97 Command Address Register */
+#define REG_SSI_SACADD			0x3c
+/* SSI AC97 Command Data Register */
+#define REG_SSI_SACDAT			0x40
+/* SSI AC97 Tag Register */
+#define REG_SSI_SATAG			0x44
+/* SSI Transmit Time Slot Mask Register */
+#define REG_SSI_STMSK			0x48
+/* SSI  Receive Time Slot Mask Register */
+#define REG_SSI_SRMSK			0x4c
+#define REG_SSI_SxMSK(tx)		((tx) ? REG_SSI_STMSK : REG_SSI_SRMSK)
+/*
+ * SSI AC97 Channel Status Register
+ *
+ * The status could be changed by:
+ * 1) Writing a '1' bit at some position in SACCEN sets relevant bit in SACCST
+ * 2) Writing a '1' bit at some position in SACCDIS unsets the relevant bit
+ * 3) Receivng a '1' in SLOTREQ bit from external CODEC via AC Link
+ */
+#define REG_SSI_SACCST			0x50
+/* SSI AC97 Channel Enable Register -- Set bits in SACCST */
+#define REG_SSI_SACCEN			0x54
+/* SSI AC97 Channel Disable Register -- Clear bits in SACCST */
+#define REG_SSI_SACCDIS			0x58
 
-#define CCSR_SSI_SIER_RFRC_EN		0x01000000
-#define CCSR_SSI_SIER_TFRC_EN		0x00800000
-#define CCSR_SSI_SIER_RDMAE		0x00400000
-#define CCSR_SSI_SIER_RIE		0x00200000
-#define CCSR_SSI_SIER_TDMAE		0x00100000
-#define CCSR_SSI_SIER_TIE		0x00080000
-#define CCSR_SSI_SIER_CMDAU_EN		0x00040000
-#define CCSR_SSI_SIER_CMDDU_EN		0x00020000
-#define CCSR_SSI_SIER_RXT_EN		0x00010000
-#define CCSR_SSI_SIER_RDR1_EN		0x00008000
-#define CCSR_SSI_SIER_RDR0_EN		0x00004000
-#define CCSR_SSI_SIER_TDE1_EN		0x00002000
-#define CCSR_SSI_SIER_TDE0_EN		0x00001000
-#define CCSR_SSI_SIER_ROE1_EN		0x00000800
-#define CCSR_SSI_SIER_ROE0_EN		0x00000400
-#define CCSR_SSI_SIER_TUE1_EN		0x00000200
-#define CCSR_SSI_SIER_TUE0_EN		0x00000100
-#define CCSR_SSI_SIER_TFS_EN		0x00000080
-#define CCSR_SSI_SIER_RFS_EN		0x00000040
-#define CCSR_SSI_SIER_TLS_EN		0x00000020
-#define CCSR_SSI_SIER_RLS_EN		0x00000010
-#define CCSR_SSI_SIER_RFF1_EN		0x00000008
-#define CCSR_SSI_SIER_RFF0_EN		0x00000004
-#define CCSR_SSI_SIER_TFE1_EN		0x00000002
-#define CCSR_SSI_SIER_TFE0_EN		0x00000001
+/* -- SSI Register Field Maps -- */
 
-#define CCSR_SSI_STCR_TXBIT0		0x00000200
-#define CCSR_SSI_STCR_TFEN1		0x00000100
-#define CCSR_SSI_STCR_TFEN0		0x00000080
-#define CCSR_SSI_STCR_TFDIR		0x00000040
-#define CCSR_SSI_STCR_TXDIR		0x00000020
-#define CCSR_SSI_STCR_TSHFD		0x00000010
-#define CCSR_SSI_STCR_TSCKP		0x00000008
-#define CCSR_SSI_STCR_TFSI		0x00000004
-#define CCSR_SSI_STCR_TFSL		0x00000002
-#define CCSR_SSI_STCR_TEFS		0x00000001
+/* SSI Control Register -- REG_SSI_SCR 0x10 */
+#define SSI_SCR_SYNC_TX_FS		0x00001000
+#define SSI_SCR_RFR_CLK_DIS		0x00000800
+#define SSI_SCR_TFR_CLK_DIS		0x00000400
+#define SSI_SCR_TCH_EN			0x00000100
+#define SSI_SCR_SYS_CLK_EN		0x00000080
+#define SSI_SCR_I2S_MODE_MASK		0x00000060
+#define SSI_SCR_I2S_MODE_NORMAL		0x00000000
+#define SSI_SCR_I2S_MODE_MASTER		0x00000020
+#define SSI_SCR_I2S_MODE_SLAVE		0x00000040
+#define SSI_SCR_SYN			0x00000010
+#define SSI_SCR_NET			0x00000008
+#define SSI_SCR_I2S_NET_MASK		(SSI_SCR_NET | SSI_SCR_I2S_MODE_MASK)
+#define SSI_SCR_RE			0x00000004
+#define SSI_SCR_TE			0x00000002
+#define SSI_SCR_SSIEN			0x00000001
 
-#define CCSR_SSI_SRCR_RXEXT		0x00000400
-#define CCSR_SSI_SRCR_RXBIT0		0x00000200
-#define CCSR_SSI_SRCR_RFEN1		0x00000100
-#define CCSR_SSI_SRCR_RFEN0		0x00000080
-#define CCSR_SSI_SRCR_RFDIR		0x00000040
-#define CCSR_SSI_SRCR_RXDIR		0x00000020
-#define CCSR_SSI_SRCR_RSHFD		0x00000010
-#define CCSR_SSI_SRCR_RSCKP		0x00000008
-#define CCSR_SSI_SRCR_RFSI		0x00000004
-#define CCSR_SSI_SRCR_RFSL		0x00000002
-#define CCSR_SSI_SRCR_REFS		0x00000001
+/* SSI Interrupt Status Register -- REG_SSI_SISR 0x14 */
+#define SSI_SISR_RFRC			0x01000000
+#define SSI_SISR_TFRC			0x00800000
+#define SSI_SISR_CMDAU			0x00040000
+#define SSI_SISR_CMDDU			0x00020000
+#define SSI_SISR_RXT			0x00010000
+#define SSI_SISR_RDR1			0x00008000
+#define SSI_SISR_RDR0			0x00004000
+#define SSI_SISR_TDE1			0x00002000
+#define SSI_SISR_TDE0			0x00001000
+#define SSI_SISR_ROE1			0x00000800
+#define SSI_SISR_ROE0			0x00000400
+#define SSI_SISR_TUE1			0x00000200
+#define SSI_SISR_TUE0			0x00000100
+#define SSI_SISR_TFS			0x00000080
+#define SSI_SISR_RFS			0x00000040
+#define SSI_SISR_TLS			0x00000020
+#define SSI_SISR_RLS			0x00000010
+#define SSI_SISR_RFF1			0x00000008
+#define SSI_SISR_RFF0			0x00000004
+#define SSI_SISR_TFE1			0x00000002
+#define SSI_SISR_TFE0			0x00000001
 
-/* STCCR and SRCCR */
-#define CCSR_SSI_SxCCR_DIV2_SHIFT	18
-#define CCSR_SSI_SxCCR_DIV2		0x00040000
-#define CCSR_SSI_SxCCR_PSR_SHIFT	17
-#define CCSR_SSI_SxCCR_PSR		0x00020000
-#define CCSR_SSI_SxCCR_WL_SHIFT		13
-#define CCSR_SSI_SxCCR_WL_MASK		0x0001E000
-#define CCSR_SSI_SxCCR_WL(x) \
-	(((((x) / 2) - 1) << CCSR_SSI_SxCCR_WL_SHIFT) & CCSR_SSI_SxCCR_WL_MASK)
-#define CCSR_SSI_SxCCR_DC_SHIFT		8
-#define CCSR_SSI_SxCCR_DC_MASK		0x00001F00
-#define CCSR_SSI_SxCCR_DC(x) \
-	((((x) - 1) << CCSR_SSI_SxCCR_DC_SHIFT) & CCSR_SSI_SxCCR_DC_MASK)
-#define CCSR_SSI_SxCCR_PM_SHIFT		0
-#define CCSR_SSI_SxCCR_PM_MASK		0x000000FF
-#define CCSR_SSI_SxCCR_PM(x) \
-	((((x) - 1) << CCSR_SSI_SxCCR_PM_SHIFT) & CCSR_SSI_SxCCR_PM_MASK)
+/* SSI Interrupt Enable Register -- REG_SSI_SIER 0x18 */
+#define SSI_SIER_RFRC_EN		0x01000000
+#define SSI_SIER_TFRC_EN		0x00800000
+#define SSI_SIER_RDMAE			0x00400000
+#define SSI_SIER_RIE			0x00200000
+#define SSI_SIER_TDMAE			0x00100000
+#define SSI_SIER_TIE			0x00080000
+#define SSI_SIER_CMDAU_EN		0x00040000
+#define SSI_SIER_CMDDU_EN		0x00020000
+#define SSI_SIER_RXT_EN			0x00010000
+#define SSI_SIER_RDR1_EN		0x00008000
+#define SSI_SIER_RDR0_EN		0x00004000
+#define SSI_SIER_TDE1_EN		0x00002000
+#define SSI_SIER_TDE0_EN		0x00001000
+#define SSI_SIER_ROE1_EN		0x00000800
+#define SSI_SIER_ROE0_EN		0x00000400
+#define SSI_SIER_TUE1_EN		0x00000200
+#define SSI_SIER_TUE0_EN		0x00000100
+#define SSI_SIER_TFS_EN			0x00000080
+#define SSI_SIER_RFS_EN			0x00000040
+#define SSI_SIER_TLS_EN			0x00000020
+#define SSI_SIER_RLS_EN			0x00000010
+#define SSI_SIER_RFF1_EN		0x00000008
+#define SSI_SIER_RFF0_EN		0x00000004
+#define SSI_SIER_TFE1_EN		0x00000002
+#define SSI_SIER_TFE0_EN		0x00000001
+
+/* SSI Transmit Configuration Register -- REG_SSI_STCR 0x1C */
+#define SSI_STCR_TXBIT0			0x00000200
+#define SSI_STCR_TFEN1			0x00000100
+#define SSI_STCR_TFEN0			0x00000080
+#define SSI_STCR_TFDIR			0x00000040
+#define SSI_STCR_TXDIR			0x00000020
+#define SSI_STCR_TSHFD			0x00000010
+#define SSI_STCR_TSCKP			0x00000008
+#define SSI_STCR_TFSI			0x00000004
+#define SSI_STCR_TFSL			0x00000002
+#define SSI_STCR_TEFS			0x00000001
+
+/* SSI Receive Configuration Register -- REG_SSI_SRCR 0x20 */
+#define SSI_SRCR_RXEXT			0x00000400
+#define SSI_SRCR_RXBIT0			0x00000200
+#define SSI_SRCR_RFEN1			0x00000100
+#define SSI_SRCR_RFEN0			0x00000080
+#define SSI_SRCR_RFDIR			0x00000040
+#define SSI_SRCR_RXDIR			0x00000020
+#define SSI_SRCR_RSHFD			0x00000010
+#define SSI_SRCR_RSCKP			0x00000008
+#define SSI_SRCR_RFSI			0x00000004
+#define SSI_SRCR_RFSL			0x00000002
+#define SSI_SRCR_REFS			0x00000001
 
 /*
- * The xFCNT bits are read-only, and the xFWM bits are read/write.  Use the
- * CCSR_SSI_SFCSR_xFCNTy() macros to read the FIFO counters, and use the
- * CCSR_SSI_SFCSR_xFWMy() macros to set the watermarks.
+ * SSI Transmit Clock Control Register -- REG_SSI_STCCR 0x24
+ * SSI Receive Clock Control Register -- REG_SSI_SRCCR 0x28
  */
-#define CCSR_SSI_SFCSR_RFCNT1_SHIFT	28
-#define CCSR_SSI_SFCSR_RFCNT1_MASK	0xF0000000
-#define CCSR_SSI_SFCSR_RFCNT1(x) \
-	(((x) & CCSR_SSI_SFCSR_RFCNT1_MASK) >> CCSR_SSI_SFCSR_RFCNT1_SHIFT)
-#define CCSR_SSI_SFCSR_TFCNT1_SHIFT	24
-#define CCSR_SSI_SFCSR_TFCNT1_MASK	0x0F000000
-#define CCSR_SSI_SFCSR_TFCNT1(x) \
-	(((x) & CCSR_SSI_SFCSR_TFCNT1_MASK) >> CCSR_SSI_SFCSR_TFCNT1_SHIFT)
-#define CCSR_SSI_SFCSR_RFWM1_SHIFT	20
-#define CCSR_SSI_SFCSR_RFWM1_MASK	0x00F00000
-#define CCSR_SSI_SFCSR_RFWM1(x)	\
-	(((x) << CCSR_SSI_SFCSR_RFWM1_SHIFT) & CCSR_SSI_SFCSR_RFWM1_MASK)
-#define CCSR_SSI_SFCSR_TFWM1_SHIFT	16
-#define CCSR_SSI_SFCSR_TFWM1_MASK	0x000F0000
-#define CCSR_SSI_SFCSR_TFWM1(x)	\
-	(((x) << CCSR_SSI_SFCSR_TFWM1_SHIFT) & CCSR_SSI_SFCSR_TFWM1_MASK)
-#define CCSR_SSI_SFCSR_RFCNT0_SHIFT	12
-#define CCSR_SSI_SFCSR_RFCNT0_MASK	0x0000F000
-#define CCSR_SSI_SFCSR_RFCNT0(x) \
-	(((x) & CCSR_SSI_SFCSR_RFCNT0_MASK) >> CCSR_SSI_SFCSR_RFCNT0_SHIFT)
-#define CCSR_SSI_SFCSR_TFCNT0_SHIFT	8
-#define CCSR_SSI_SFCSR_TFCNT0_MASK	0x00000F00
-#define CCSR_SSI_SFCSR_TFCNT0(x) \
-	(((x) & CCSR_SSI_SFCSR_TFCNT0_MASK) >> CCSR_SSI_SFCSR_TFCNT0_SHIFT)
-#define CCSR_SSI_SFCSR_RFWM0_SHIFT	4
-#define CCSR_SSI_SFCSR_RFWM0_MASK	0x000000F0
-#define CCSR_SSI_SFCSR_RFWM0(x)	\
-	(((x) << CCSR_SSI_SFCSR_RFWM0_SHIFT) & CCSR_SSI_SFCSR_RFWM0_MASK)
-#define CCSR_SSI_SFCSR_TFWM0_SHIFT	0
-#define CCSR_SSI_SFCSR_TFWM0_MASK	0x0000000F
-#define CCSR_SSI_SFCSR_TFWM0(x)	\
-	(((x) << CCSR_SSI_SFCSR_TFWM0_SHIFT) & CCSR_SSI_SFCSR_TFWM0_MASK)
+#define SSI_SxCCR_DIV2_SHIFT		18
+#define SSI_SxCCR_DIV2			0x00040000
+#define SSI_SxCCR_PSR_SHIFT		17
+#define SSI_SxCCR_PSR			0x00020000
+#define SSI_SxCCR_WL_SHIFT		13
+#define SSI_SxCCR_WL_MASK		0x0001E000
+#define SSI_SxCCR_WL(x) \
+	(((((x) / 2) - 1) << SSI_SxCCR_WL_SHIFT) & SSI_SxCCR_WL_MASK)
+#define SSI_SxCCR_DC_SHIFT		8
+#define SSI_SxCCR_DC_MASK		0x00001F00
+#define SSI_SxCCR_DC(x) \
+	((((x) - 1) << SSI_SxCCR_DC_SHIFT) & SSI_SxCCR_DC_MASK)
+#define SSI_SxCCR_PM_SHIFT		0
+#define SSI_SxCCR_PM_MASK		0x000000FF
+#define SSI_SxCCR_PM(x) \
+	((((x) - 1) << SSI_SxCCR_PM_SHIFT) & SSI_SxCCR_PM_MASK)
 
-#define CCSR_SSI_STR_TEST		0x00008000
-#define CCSR_SSI_STR_RCK2TCK		0x00004000
-#define CCSR_SSI_STR_RFS2TFS		0x00002000
-#define CCSR_SSI_STR_RXSTATE(x) (((x) >> 8) & 0x1F)
-#define CCSR_SSI_STR_TXD2RXD		0x00000080
-#define CCSR_SSI_STR_TCK2RCK		0x00000040
-#define CCSR_SSI_STR_TFS2RFS		0x00000020
-#define CCSR_SSI_STR_TXSTATE(x) ((x) & 0x1F)
+/*
+ * SSI FIFO Control/Status Register -- REG_SSI_SFCSR 0x2c
+ *
+ * Tx or Rx FIFO Counter -- SSI_SFCSR_xFCNTy Read-Only
+ * Tx or Rx FIFO Watermarks -- SSI_SFCSR_xFWMy Read/Write
+ */
+#define SSI_SFCSR_RFCNT1_SHIFT		28
+#define SSI_SFCSR_RFCNT1_MASK		0xF0000000
+#define SSI_SFCSR_RFCNT1(x) \
+	(((x) & SSI_SFCSR_RFCNT1_MASK) >> SSI_SFCSR_RFCNT1_SHIFT)
+#define SSI_SFCSR_TFCNT1_SHIFT		24
+#define SSI_SFCSR_TFCNT1_MASK		0x0F000000
+#define SSI_SFCSR_TFCNT1(x) \
+	(((x) & SSI_SFCSR_TFCNT1_MASK) >> SSI_SFCSR_TFCNT1_SHIFT)
+#define SSI_SFCSR_RFWM1_SHIFT		20
+#define SSI_SFCSR_RFWM1_MASK		0x00F00000
+#define SSI_SFCSR_RFWM1(x)	\
+	(((x) << SSI_SFCSR_RFWM1_SHIFT) & SSI_SFCSR_RFWM1_MASK)
+#define SSI_SFCSR_TFWM1_SHIFT		16
+#define SSI_SFCSR_TFWM1_MASK		0x000F0000
+#define SSI_SFCSR_TFWM1(x)	\
+	(((x) << SSI_SFCSR_TFWM1_SHIFT) & SSI_SFCSR_TFWM1_MASK)
+#define SSI_SFCSR_RFCNT0_SHIFT		12
+#define SSI_SFCSR_RFCNT0_MASK		0x0000F000
+#define SSI_SFCSR_RFCNT0(x) \
+	(((x) & SSI_SFCSR_RFCNT0_MASK) >> SSI_SFCSR_RFCNT0_SHIFT)
+#define SSI_SFCSR_TFCNT0_SHIFT		8
+#define SSI_SFCSR_TFCNT0_MASK		0x00000F00
+#define SSI_SFCSR_TFCNT0(x) \
+	(((x) & SSI_SFCSR_TFCNT0_MASK) >> SSI_SFCSR_TFCNT0_SHIFT)
+#define SSI_SFCSR_RFWM0_SHIFT		4
+#define SSI_SFCSR_RFWM0_MASK		0x000000F0
+#define SSI_SFCSR_RFWM0(x)	\
+	(((x) << SSI_SFCSR_RFWM0_SHIFT) & SSI_SFCSR_RFWM0_MASK)
+#define SSI_SFCSR_TFWM0_SHIFT		0
+#define SSI_SFCSR_TFWM0_MASK		0x0000000F
+#define SSI_SFCSR_TFWM0(x)	\
+	(((x) << SSI_SFCSR_TFWM0_SHIFT) & SSI_SFCSR_TFWM0_MASK)
 
-#define CCSR_SSI_SOR_CLKOFF		0x00000040
-#define CCSR_SSI_SOR_RX_CLR		0x00000020
-#define CCSR_SSI_SOR_TX_CLR		0x00000010
-#define CCSR_SSI_SOR_INIT		0x00000008
-#define CCSR_SSI_SOR_WAIT_SHIFT		1
-#define CCSR_SSI_SOR_WAIT_MASK		0x00000006
-#define CCSR_SSI_SOR_WAIT(x) (((x) & 3) << CCSR_SSI_SOR_WAIT_SHIFT)
-#define CCSR_SSI_SOR_SYNRST 		0x00000001
+/* SSI Test Register -- REG_SSI_STR 0x30 */
+#define SSI_STR_TEST			0x00008000
+#define SSI_STR_RCK2TCK			0x00004000
+#define SSI_STR_RFS2TFS			0x00002000
+#define SSI_STR_RXSTATE(x)		(((x) >> 8) & 0x1F)
+#define SSI_STR_TXD2RXD			0x00000080
+#define SSI_STR_TCK2RCK			0x00000040
+#define SSI_STR_TFS2RFS			0x00000020
+#define SSI_STR_TXSTATE(x)		((x) & 0x1F)
 
-#define CCSR_SSI_SACNT_FRDIV(x)		(((x) & 0x3f) << 5)
-#define CCSR_SSI_SACNT_WR		0x00000010
-#define CCSR_SSI_SACNT_RD		0x00000008
-#define CCSR_SSI_SACNT_RDWR_MASK	0x00000018
-#define CCSR_SSI_SACNT_TIF		0x00000004
-#define CCSR_SSI_SACNT_FV		0x00000002
-#define CCSR_SSI_SACNT_AC97EN		0x00000001
+/* SSI Option Register -- REG_SSI_SOR 0x34 */
+#define SSI_SOR_CLKOFF			0x00000040
+#define SSI_SOR_RX_CLR			0x00000020
+#define SSI_SOR_TX_CLR			0x00000010
+#define SSI_SOR_xX_CLR(tx)		((tx) ? SSI_SOR_TX_CLR : SSI_SOR_RX_CLR)
+#define SSI_SOR_INIT			0x00000008
+#define SSI_SOR_WAIT_SHIFT		1
+#define SSI_SOR_WAIT_MASK		0x00000006
+#define SSI_SOR_WAIT(x)			(((x) & 3) << SSI_SOR_WAIT_SHIFT)
+#define SSI_SOR_SYNRST			0x00000001
+
+/* SSI AC97 Control Register -- REG_SSI_SACNT 0x38 */
+#define SSI_SACNT_FRDIV(x)		(((x) & 0x3f) << 5)
+#define SSI_SACNT_WR			0x00000010
+#define SSI_SACNT_RD			0x00000008
+#define SSI_SACNT_RDWR_MASK		0x00000018
+#define SSI_SACNT_TIF			0x00000004
+#define SSI_SACNT_FV			0x00000002
+#define SSI_SACNT_AC97EN		0x00000001
 
 
 struct device;
@@ -255,7 +318,7 @@ static inline void fsl_ssi_dbg_isr(struct fsl_ssi_dbg *stats, u32 sisr)
 }
 
 static inline int fsl_ssi_debugfs_create(struct fsl_ssi_dbg *ssi_dbg,
-		struct device *dev)
+					 struct device *dev)
 {
 	return 0;
 }
diff --git a/sound/soc/fsl/fsl_ssi_dbg.c b/sound/soc/fsl/fsl_ssi_dbg.c
index 5469ffb..7aac63e 100644
--- a/sound/soc/fsl/fsl_ssi_dbg.c
+++ b/sound/soc/fsl/fsl_ssi_dbg.c
@@ -18,86 +18,86 @@
 
 void fsl_ssi_dbg_isr(struct fsl_ssi_dbg *dbg, u32 sisr)
 {
-	if (sisr & CCSR_SSI_SISR_RFRC)
+	if (sisr & SSI_SISR_RFRC)
 		dbg->stats.rfrc++;
 
-	if (sisr & CCSR_SSI_SISR_TFRC)
+	if (sisr & SSI_SISR_TFRC)
 		dbg->stats.tfrc++;
 
-	if (sisr & CCSR_SSI_SISR_CMDAU)
+	if (sisr & SSI_SISR_CMDAU)
 		dbg->stats.cmdau++;
 
-	if (sisr & CCSR_SSI_SISR_CMDDU)
+	if (sisr & SSI_SISR_CMDDU)
 		dbg->stats.cmddu++;
 
-	if (sisr & CCSR_SSI_SISR_RXT)
+	if (sisr & SSI_SISR_RXT)
 		dbg->stats.rxt++;
 
-	if (sisr & CCSR_SSI_SISR_RDR1)
+	if (sisr & SSI_SISR_RDR1)
 		dbg->stats.rdr1++;
 
-	if (sisr & CCSR_SSI_SISR_RDR0)
+	if (sisr & SSI_SISR_RDR0)
 		dbg->stats.rdr0++;
 
-	if (sisr & CCSR_SSI_SISR_TDE1)
+	if (sisr & SSI_SISR_TDE1)
 		dbg->stats.tde1++;
 
-	if (sisr & CCSR_SSI_SISR_TDE0)
+	if (sisr & SSI_SISR_TDE0)
 		dbg->stats.tde0++;
 
-	if (sisr & CCSR_SSI_SISR_ROE1)
+	if (sisr & SSI_SISR_ROE1)
 		dbg->stats.roe1++;
 
-	if (sisr & CCSR_SSI_SISR_ROE0)
+	if (sisr & SSI_SISR_ROE0)
 		dbg->stats.roe0++;
 
-	if (sisr & CCSR_SSI_SISR_TUE1)
+	if (sisr & SSI_SISR_TUE1)
 		dbg->stats.tue1++;
 
-	if (sisr & CCSR_SSI_SISR_TUE0)
+	if (sisr & SSI_SISR_TUE0)
 		dbg->stats.tue0++;
 
-	if (sisr & CCSR_SSI_SISR_TFS)
+	if (sisr & SSI_SISR_TFS)
 		dbg->stats.tfs++;
 
-	if (sisr & CCSR_SSI_SISR_RFS)
+	if (sisr & SSI_SISR_RFS)
 		dbg->stats.rfs++;
 
-	if (sisr & CCSR_SSI_SISR_TLS)
+	if (sisr & SSI_SISR_TLS)
 		dbg->stats.tls++;
 
-	if (sisr & CCSR_SSI_SISR_RLS)
+	if (sisr & SSI_SISR_RLS)
 		dbg->stats.rls++;
 
-	if (sisr & CCSR_SSI_SISR_RFF1)
+	if (sisr & SSI_SISR_RFF1)
 		dbg->stats.rff1++;
 
-	if (sisr & CCSR_SSI_SISR_RFF0)
+	if (sisr & SSI_SISR_RFF0)
 		dbg->stats.rff0++;
 
-	if (sisr & CCSR_SSI_SISR_TFE1)
+	if (sisr & SSI_SISR_TFE1)
 		dbg->stats.tfe1++;
 
-	if (sisr & CCSR_SSI_SISR_TFE0)
+	if (sisr & SSI_SISR_TFE0)
 		dbg->stats.tfe0++;
 }
 
-/* Show the statistics of a flag only if its interrupt is enabled.  The
- * compiler will optimze this code to a no-op if the interrupt is not
- * enabled.
+/**
+ * Show the statistics of a flag only if its interrupt is enabled
+ *
+ * Compilers will optimize it to a no-op if the interrupt is disabled
  */
 #define SIER_SHOW(flag, name) \
 	do { \
-		if (CCSR_SSI_SIER_##flag) \
+		if (SSI_SIER_##flag) \
 			seq_printf(s, #name "=%u\n", ssi_dbg->stats.name); \
 	} while (0)
 
 
 /**
- * fsl_sysfs_ssi_show: display SSI statistics
+ * Display the statistics for the current SSI device
  *
- * Display the statistics for the current SSI device.  To avoid confusion,
- * we only show those counts that are enabled.
+ * To avoid confusion, only show those counts that are enabled
  */
 static int fsl_ssi_stats_show(struct seq_file *s, void *unused)
 {
@@ -147,7 +147,8 @@ int fsl_ssi_debugfs_create(struct fsl_ssi_dbg *ssi_dbg, struct device *dev)
 		return -ENOMEM;
 
 	ssi_dbg->dbg_stats = debugfs_create_file("stats", S_IRUGO,
-			ssi_dbg->dbg_dir, ssi_dbg, &fsl_ssi_stats_ops);
+						 ssi_dbg->dbg_dir, ssi_dbg,
+						 &fsl_ssi_stats_ops);
 	if (!ssi_dbg->dbg_stats) {
 		debugfs_remove(ssi_dbg->dbg_dir);
 		return -ENOMEM;
diff --git a/sound/soc/hisilicon/hi6210-i2s.c b/sound/soc/hisilicon/hi6210-i2s.c
index 0c8f86d..07a5720 100644
--- a/sound/soc/hisilicon/hi6210-i2s.c
+++ b/sound/soc/hisilicon/hi6210-i2s.c
@@ -36,7 +36,6 @@
 #include <linux/of_irq.h>
 #include <linux/mfd/syscon.h>
 #include <linux/reset-controller.h>
-#include <linux/clk.h>
 
 #include "hi6210-i2s.h"
 
diff --git a/sound/soc/intel/Kconfig b/sound/soc/intel/Kconfig
index 7b49d04..f2c9e8c 100644
--- a/sound/soc/intel/Kconfig
+++ b/sound/soc/intel/Kconfig
@@ -1,71 +1,122 @@
+config SND_SOC_INTEL_SST_TOPLEVEL
+	bool "Intel ASoC SST drivers"
+	default y
+	depends on X86 || COMPILE_TEST
+	select SND_SOC_INTEL_MACH
+	help
+	  Intel ASoC SST Platform Drivers. If you have a Intel machine that
+	  has an audio controller with a DSP and I2S or DMIC port, then
+	  enable this option by saying Y
+
+	  Note that the answer to this question doesn't directly affect the
+	  kernel: saying N will just cause the configurator to skip all
+	  the questions about Intel SST drivers.
+
+if SND_SOC_INTEL_SST_TOPLEVEL
+
 config SND_SST_IPC
 	tristate
+	# This option controls the IPC core for HiFi2 platforms
 
 config SND_SST_IPC_PCI
 	tristate
 	select SND_SST_IPC
+	# This option controls the PCI-based IPC for HiFi2 platforms
+	#  (Medfield, Merrifield).
 
 config SND_SST_IPC_ACPI
 	tristate
 	select SND_SST_IPC
-	select SND_SOC_INTEL_SST
-	select IOSF_MBI
+	# This option controls the ACPI-based IPC for HiFi2 platforms
+	# (Baytrail, Cherrytrail)
 
-config SND_SOC_INTEL_COMMON
+config SND_SOC_INTEL_SST_ACPI
 	tristate
+	# This option controls ACPI-based probing on
+	# Haswell/Broadwell/Baytrail legacy and will be set
+	# when these platforms are enabled
 
 config SND_SOC_INTEL_SST
 	tristate
-	select SND_SOC_INTEL_SST_ACPI if ACPI
 
 config SND_SOC_INTEL_SST_FIRMWARE
 	tristate
 	select DW_DMAC_CORE
-
-config SND_SOC_INTEL_SST_ACPI
-	tristate
-
-config SND_SOC_ACPI_INTEL_MATCH
-	tristate
-	select SND_SOC_ACPI if ACPI
-
-config SND_SOC_INTEL_SST_TOPLEVEL
-	tristate "Intel ASoC SST drivers"
-	depends on X86 || COMPILE_TEST
-	select SND_SOC_INTEL_MACH
-	select SND_SOC_INTEL_COMMON
-	help
-          Intel ASoC Audio Drivers. If you have a Intel machine that
-          has audio controller with a DSP and I2S or DMIC port, then
-          enable this option by saying Y or M
-          If unsure select "N".
+	# This option controls firmware download on
+	# Haswell/Broadwell/Baytrail legacy and will be set
+	# when these platforms are enabled
 
 config SND_SOC_INTEL_HASWELL
-	tristate "Intel ASoC SST driver for Haswell/Broadwell"
-	depends on SND_SOC_INTEL_SST_TOPLEVEL && SND_DMA_SGBUF
-	depends on DMADEVICES
+	tristate "Haswell/Broadwell Platforms"
+	depends on SND_DMA_SGBUF
+	depends on DMADEVICES && ACPI
 	select SND_SOC_INTEL_SST
+	select SND_SOC_INTEL_SST_ACPI
 	select SND_SOC_INTEL_SST_FIRMWARE
+	select SND_SOC_ACPI_INTEL_MATCH
+	help
+	  If you have a Intel Haswell or Broadwell platform connected to
+	  an I2S codec, then enable this option by saying Y or m. This is
+	  typically used for Chromebooks. This is a recommended option.
 
 config SND_SOC_INTEL_BAYTRAIL
-	tristate "Intel ASoC SST driver for Baytrail (legacy)"
-	depends on SND_SOC_INTEL_SST_TOPLEVEL
-	depends on DMADEVICES
+	tristate "Baytrail (legacy) Platforms"
+	depends on DMADEVICES && ACPI
 	select SND_SOC_INTEL_SST
+	select SND_SOC_INTEL_SST_ACPI
 	select SND_SOC_INTEL_SST_FIRMWARE
+	select SND_SOC_ACPI_INTEL_MATCH
+	help
+	  If you have a Intel Baytrail platform connected to an I2S codec,
+	  then enable this option by saying Y or m. This was typically used
+	  for Baytrail Chromebooks but this option is now deprecated and is
+	  not recommended, use SND_SST_ATOM_HIFI2_PLATFORM instead.
+
+config SND_SST_ATOM_HIFI2_PLATFORM_PCI
+	tristate "PCI HiFi2 (Medfield, Merrifield) Platforms"
+	depends on X86 && PCI
+	select SND_SST_IPC_PCI
+	select SND_SOC_COMPRESS
+	help
+	  If you have a Intel Medfield or Merrifield/Edison platform, then
+	  enable this option by saying Y or m. Distros will typically not
+	  enable this option: Medfield devices are not available to
+	  developers and while Merrifield/Edison can run a mainline kernel with
+	  limited functionality it will require a firmware file which
+	  is not in the standard firmware tree
 
 config SND_SST_ATOM_HIFI2_PLATFORM
-	tristate "Intel ASoC SST driver for HiFi2 platforms (*field, *trail)"
-	depends on SND_SOC_INTEL_SST_TOPLEVEL && X86
+	tristate "ACPI HiFi2 (Baytrail, Cherrytrail) Platforms"
+	depends on X86 && ACPI
+	select SND_SST_IPC_ACPI
 	select SND_SOC_COMPRESS
+	select SND_SOC_ACPI_INTEL_MATCH
+	select IOSF_MBI
+	help
+	  If you have a Intel Baytrail or Cherrytrail platform with an I2S
+	  codec, then enable this option by saying Y or m. This is a
+	  recommended option
 
 config SND_SOC_INTEL_SKYLAKE
-	tristate "Intel ASoC SST driver for SKL/BXT/KBL/GLK/CNL"
-	depends on SND_SOC_INTEL_SST_TOPLEVEL && PCI && ACPI
+	tristate "SKL/BXT/KBL/GLK/CNL... Platforms"
+	depends on PCI && ACPI
 	select SND_HDA_EXT_CORE
 	select SND_HDA_DSP_LOADER
 	select SND_SOC_TOPOLOGY
 	select SND_SOC_INTEL_SST
+	select SND_SOC_ACPI_INTEL_MATCH
+	help
+	  If you have a Intel Skylake/Broxton/ApolloLake/KabyLake/
+	  GeminiLake or CannonLake platform with the DSP enabled in the BIOS
+	  then enable this option by saying Y or m.
+
+config SND_SOC_ACPI_INTEL_MATCH
+	tristate
+	select SND_SOC_ACPI if ACPI
+	# this option controls the compilation of ACPI matching tables and
+	# helpers and is not meant to be selected by the user.
+
+endif ## SND_SOC_INTEL_SST_TOPLEVEL
 
 # ASoC codec drivers
 source "sound/soc/intel/boards/Kconfig"
diff --git a/sound/soc/intel/Makefile b/sound/soc/intel/Makefile
index b973d45..8160520f 100644
--- a/sound/soc/intel/Makefile
+++ b/sound/soc/intel/Makefile
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0
 # Core support
-obj-$(CONFIG_SND_SOC_INTEL_COMMON) += common/
+obj-$(CONFIG_SND_SOC) += common/
 
 # Platform Support
 obj-$(CONFIG_SND_SOC_INTEL_HASWELL) += haswell/
diff --git a/sound/soc/intel/atom/sst/sst_acpi.c b/sound/soc/intel/atom/sst/sst_acpi.c
index 32d6e02..6cd481b 100644
--- a/sound/soc/intel/atom/sst/sst_acpi.c
+++ b/sound/soc/intel/atom/sst/sst_acpi.c
@@ -236,6 +236,9 @@ static int sst_platform_get_resources(struct intel_sst_drv *ctx)
 	/* Find the IRQ */
 	ctx->irq_num = platform_get_irq(pdev,
 				ctx->pdata->res_info->acpi_ipc_irq_index);
+	if (ctx->irq_num <= 0)
+		return ctx->irq_num < 0 ? ctx->irq_num : -EIO;
+
 	return 0;
 }
 
diff --git a/sound/soc/intel/atom/sst/sst_stream.c b/sound/soc/intel/atom/sst/sst_stream.c
index 65e257b..7ee6aeb 100644
--- a/sound/soc/intel/atom/sst/sst_stream.c
+++ b/sound/soc/intel/atom/sst/sst_stream.c
@@ -220,10 +220,10 @@ int sst_send_byte_stream_mrfld(struct intel_sst_drv *sst_drv_ctx,
 		sst_free_block(sst_drv_ctx, block);
 out:
 	test_and_clear_bit(pvt_id, &sst_drv_ctx->pvt_id);
-	return 0;
+	return ret;
 }
 
-/*
+/**
  * sst_pause_stream - Send msg for a pausing stream
  * @str_id:	 stream ID
  *
@@ -261,7 +261,7 @@ int sst_pause_stream(struct intel_sst_drv *sst_drv_ctx, int str_id)
 		}
 	} else {
 		retval = -EBADRQC;
-		dev_dbg(sst_drv_ctx->dev, "SST DBG:BADRQC for stream\n ");
+		dev_dbg(sst_drv_ctx->dev, "SST DBG:BADRQC for stream\n");
 	}
 
 	return retval;
@@ -284,7 +284,7 @@ int sst_resume_stream(struct intel_sst_drv *sst_drv_ctx, int str_id)
 	if (!str_info)
 		return -EINVAL;
 	if (str_info->status == STREAM_RUNNING)
-			return 0;
+		return 0;
 	if (str_info->status == STREAM_PAUSED) {
 		retval = sst_prepare_and_post_msg(sst_drv_ctx, str_info->task_id,
 				IPC_CMD, IPC_IA_RESUME_STREAM_MRFLD,
diff --git a/sound/soc/intel/boards/Kconfig b/sound/soc/intel/boards/Kconfig
index 6f75470..d4e1036 100644
--- a/sound/soc/intel/boards/Kconfig
+++ b/sound/soc/intel/boards/Kconfig
@@ -1,183 +1,183 @@
-config SND_SOC_INTEL_MACH
-	tristate "Intel Audio machine drivers"
+menuconfig SND_SOC_INTEL_MACH
+	bool "Intel Machine drivers"
 	depends on SND_SOC_INTEL_SST_TOPLEVEL
-	select SND_SOC_ACPI_INTEL_MATCH if ACPI
+	help
+         Intel ASoC Machine Drivers. If you have a Intel machine that
+         has an audio controller with a DSP and I2S or DMIC port, then
+         enable this option by saying Y
+
+         Note that the answer to this question doesn't directly affect the
+         kernel: saying N will just cause the configurator to skip all
+         the questions about Intel ASoC machine drivers.
 
 if SND_SOC_INTEL_MACH
 
-config SND_MFLD_MACHINE
-	tristate "SOC Machine Audio driver for Intel Medfield MID platform"
-	depends on INTEL_SCU_IPC
-	select SND_SOC_SN95031
-	depends on SND_SST_ATOM_HIFI2_PLATFORM
-	select SND_SST_IPC_PCI
-	help
-          This adds support for ASoC machine driver for Intel(R) MID Medfield platform
-          used as alsa device in audio substem in Intel(R) MID devices
-          Say Y if you have such a device.
-          If unsure select "N".
+if SND_SOC_INTEL_HASWELL
 
 config SND_SOC_INTEL_HASWELL_MACH
-	tristate "ASoC Audio DSP support for Intel Haswell Lynxpoint"
+	tristate "Haswell Lynxpoint"
 	depends on X86_INTEL_LPSS && I2C && I2C_DESIGNWARE_PLATFORM
-	depends on SND_SOC_INTEL_HASWELL
 	select SND_SOC_RT5640
 	help
 	  This adds support for the Lynxpoint Audio DSP on Intel(R) Haswell
-	  Ultrabook platforms.
-	  Say Y if you have such a device.
+	  Ultrabook platforms. This is a recommended option.
+	  Say Y or m if you have such a device.
 	  If unsure select "N".
 
 config SND_SOC_INTEL_BDW_RT5677_MACH
-	tristate "ASoC Audio driver for Intel Broadwell with RT5677 codec"
-	depends on X86_INTEL_LPSS && GPIOLIB && I2C
-	depends on SND_SOC_INTEL_HASWELL
+	tristate "Broadwell with RT5677 codec"
+	depends on X86_INTEL_LPSS && I2C && I2C_DESIGNWARE_PLATFORM && GPIOLIB
 	select SND_SOC_RT5677
 	help
 	  This adds support for Intel Broadwell platform based boards with
-	  the RT5677 audio codec.
+	  the RT5677 audio codec. This is a recommended option.
+	  Say Y or m if you have such a device.
+	  If unsure select "N".
 
 config SND_SOC_INTEL_BROADWELL_MACH
-	tristate "ASoC Audio DSP support for Intel Broadwell Wildcatpoint"
+	tristate "Broadwell Wildcatpoint"
 	depends on X86_INTEL_LPSS && I2C && I2C_DESIGNWARE_PLATFORM
-	depends on SND_SOC_INTEL_HASWELL
 	select SND_SOC_RT286
 	help
 	  This adds support for the Wilcatpoint Audio DSP on Intel(R) Broadwell
 	  Ultrabook platforms.
-	  Say Y if you have such a device.
+	  Say Y or m if you have such a device. This is a recommended option.
 	  If unsure select "N".
+endif ## SND_SOC_INTEL_HASWELL
+
+if SND_SOC_INTEL_BAYTRAIL
 
 config SND_SOC_INTEL_BYT_MAX98090_MACH
-	tristate "ASoC Audio driver for Intel Baytrail with MAX98090 codec"
+	tristate "Baytrail with MAX98090 codec"
 	depends on X86_INTEL_LPSS && I2C
-	depends on SND_SST_IPC_ACPI = n
-	depends on SND_SOC_INTEL_BAYTRAIL
 	select SND_SOC_MAX98090
 	help
 	  This adds audio driver for Intel Baytrail platform based boards
-	  with the MAX98090 audio codec.
+	  with the MAX98090 audio codec. This driver is deprecated, use
+	  SND_SOC_INTEL_CHT_BSW_MAX98090_TI_MACH instead for better
+	  functionality.
 
 config SND_SOC_INTEL_BYT_RT5640_MACH
-	tristate "ASoC Audio driver for Intel Baytrail with RT5640 codec"
+	tristate "Baytrail with RT5640 codec"
 	depends on X86_INTEL_LPSS && I2C
-	depends on SND_SST_IPC_ACPI = n
-	depends on SND_SOC_INTEL_BAYTRAIL
 	select SND_SOC_RT5640
 	help
 	  This adds audio driver for Intel Baytrail platform based boards
 	  with the RT5640 audio codec. This driver is deprecated, use
 	  SND_SOC_INTEL_BYTCR_RT5640_MACH instead for better functionality.
 
+endif ## SND_SOC_INTEL_BAYTRAIL
+
+if SND_SST_ATOM_HIFI2_PLATFORM
+
 config SND_SOC_INTEL_BYTCR_RT5640_MACH
-        tristate "ASoC Audio driver for Intel Baytrail and Baytrail-CR with RT5640 codec"
-	depends on X86 && I2C && ACPI
+	tristate "Baytrail and Baytrail-CR with RT5640 codec"
+	depends on X86_INTEL_LPSS && I2C && ACPI
+	select SND_SOC_ACPI
 	select SND_SOC_RT5640
-	depends on SND_SST_ATOM_HIFI2_PLATFORM
-	select SND_SST_IPC_ACPI
 	help
-          This adds support for ASoC machine driver for Intel(R) Baytrail and Baytrail-CR
-          platforms with RT5640 audio codec.
-          Say Y if you have such a device.
-          If unsure select "N".
+	  This adds support for ASoC machine driver for Intel(R) Baytrail and Baytrail-CR
+	  platforms with RT5640 audio codec.
+	  Say Y or m if you have such a device. This is a recommended option.
+	  If unsure select "N".
 
 config SND_SOC_INTEL_BYTCR_RT5651_MACH
-        tristate "ASoC Audio driver for Intel Baytrail and Baytrail-CR with RT5651 codec"
-	depends on X86 && I2C && ACPI
+	tristate "Baytrail and Baytrail-CR with RT5651 codec"
+	depends on X86_INTEL_LPSS && I2C && ACPI
+	select SND_SOC_ACPI
 	select SND_SOC_RT5651
-	depends on SND_SST_ATOM_HIFI2_PLATFORM
-	select SND_SST_IPC_ACPI
 	help
-          This adds support for ASoC machine driver for Intel(R) Baytrail and Baytrail-CR
-          platforms with RT5651 audio codec.
-          Say Y if you have such a device.
-          If unsure select "N".
+	  This adds support for ASoC machine driver for Intel(R) Baytrail and Baytrail-CR
+	  platforms with RT5651 audio codec.
+	  Say Y or m if you have such a device. This is a recommended option.
+	  If unsure select "N".
 
 config SND_SOC_INTEL_CHT_BSW_RT5672_MACH
-        tristate "ASoC Audio driver for Intel Cherrytrail & Braswell with RT5672 codec"
+	tristate "Cherrytrail & Braswell with RT5672 codec"
 	depends on X86_INTEL_LPSS && I2C && ACPI
-        select SND_SOC_RT5670
-        depends on SND_SST_ATOM_HIFI2_PLATFORM
-        select SND_SST_IPC_ACPI
+	select SND_SOC_ACPI
+	select SND_SOC_RT5670
         help
           This adds support for ASoC machine driver for Intel(R) Cherrytrail & Braswell
           platforms with RT5672 audio codec.
-          Say Y if you have such a device.
+          Say Y or m if you have such a device. This is a recommended option.
           If unsure select "N".
 
 config SND_SOC_INTEL_CHT_BSW_RT5645_MACH
-	tristate "ASoC Audio driver for Intel Cherrytrail & Braswell with RT5645/5650 codec"
+	tristate "Cherrytrail & Braswell with RT5645/5650 codec"
 	depends on X86_INTEL_LPSS && I2C && ACPI
+	select SND_SOC_ACPI
 	select SND_SOC_RT5645
-	depends on SND_SST_ATOM_HIFI2_PLATFORM
-	select SND_SST_IPC_ACPI
 	help
 	  This adds support for ASoC machine driver for Intel(R) Cherrytrail & Braswell
 	  platforms with RT5645/5650 audio codec.
+	  Say Y or m if you have such a device. This is a recommended option.
 	  If unsure select "N".
 
 config SND_SOC_INTEL_CHT_BSW_MAX98090_TI_MACH
-	tristate "ASoC Audio driver for Intel Cherrytrail & Braswell with MAX98090 & TI codec"
+	tristate "Cherrytrail & Braswell with MAX98090 & TI codec"
 	depends on X86_INTEL_LPSS && I2C && ACPI
 	select SND_SOC_MAX98090
 	select SND_SOC_TS3A227E
-	depends on SND_SST_ATOM_HIFI2_PLATFORM
-	select SND_SST_IPC_ACPI
 	help
 	  This adds support for ASoC machine driver for Intel(R) Cherrytrail & Braswell
 	  platforms with MAX98090 audio codec it also can support TI jack chip as aux device.
+	  Say Y or m if you have such a device. This is a recommended option.
 	  If unsure select "N".
 
 config SND_SOC_INTEL_BYT_CHT_DA7213_MACH
-	tristate "ASoC Audio driver for Intel Baytrail & Cherrytrail with DA7212/7213 codec"
+	tristate "Baytrail & Cherrytrail with DA7212/7213 codec"
 	depends on X86_INTEL_LPSS && I2C && ACPI
+	select SND_SOC_ACPI
 	select SND_SOC_DA7213
-	depends on SND_SST_ATOM_HIFI2_PLATFORM
-	select SND_SST_IPC_ACPI
 	help
 	  This adds support for ASoC machine driver for Intel(R) Baytrail & CherryTrail
 	  platforms with DA7212/7213 audio codec.
+	  Say Y or m if you have such a device. This is a recommended option.
 	  If unsure select "N".
 
 config SND_SOC_INTEL_BYT_CHT_ES8316_MACH
-	tristate "ASoC Audio driver for Intel Baytrail & Cherrytrail with ES8316 codec"
+	tristate "Baytrail & Cherrytrail with ES8316 codec"
 	depends on X86_INTEL_LPSS && I2C && ACPI
+	select SND_SOC_ACPI
 	select SND_SOC_ES8316
-	depends on SND_SST_ATOM_HIFI2_PLATFORM
-	select SND_SST_IPC_ACPI
 	help
 	  This adds support for ASoC machine driver for Intel(R) Baytrail &
 	  Cherrytrail platforms with ES8316 audio codec.
+	  Say Y or m if you have such a device. This is a recommended option.
 	  If unsure select "N".
 
 config SND_SOC_INTEL_BYT_CHT_NOCODEC_MACH
-	tristate "ASoC Audio driver for Intel Baytrail & Cherrytrail platform with no codec (MinnowBoard MAX, Up)"
+	tristate "Baytrail & Cherrytrail platform with no codec (MinnowBoard MAX, Up)"
 	depends on X86_INTEL_LPSS && I2C && ACPI
-	depends on SND_SST_ATOM_HIFI2_PLATFORM
-	select SND_SST_IPC_ACPI
 	help
 	  This adds support for ASoC machine driver for the MinnowBoard Max or
 	  Up boards and provides access to I2S signals on the Low-Speed
-	  connector
+	  connector. This is not a recommended option outside of these cases.
+	  It is not intended to be enabled by distros by default.
+	  Say Y or m if you have such a device.
+
 	  If unsure select "N".
 
+endif ## SND_SST_ATOM_HIFI2_PLATFORM
+
+if SND_SOC_INTEL_SKYLAKE
+
 config SND_SOC_INTEL_SKL_RT286_MACH
-	tristate "ASoC Audio driver for SKL with RT286 I2S mode"
-	depends on X86 && ACPI && I2C
-	depends on SND_SOC_INTEL_SKYLAKE
+	tristate "SKL with RT286 I2S mode"
+	depends on MFD_INTEL_LPSS && I2C && ACPI
 	select SND_SOC_RT286
 	select SND_SOC_DMIC
 	select SND_SOC_HDAC_HDMI
 	help
 	   This adds support for ASoC machine driver for Skylake platforms
 	   with RT286 I2S audio codec.
-	   Say Y if you have such a device.
+	   Say Y or m if you have such a device.
 	   If unsure select "N".
 
 config SND_SOC_INTEL_SKL_NAU88L25_SSM4567_MACH
-	tristate "ASoC Audio driver for SKL with NAU88L25 and SSM4567 in I2S Mode"
-	depends on X86_INTEL_LPSS && I2C
-	depends on SND_SOC_INTEL_SKYLAKE
+	tristate "SKL with NAU88L25 and SSM4567 in I2S Mode"
+	depends on MFD_INTEL_LPSS && I2C && ACPI
 	select SND_SOC_NAU8825
 	select SND_SOC_SSM4567
 	select SND_SOC_DMIC
@@ -185,13 +185,12 @@
 	help
 	  This adds support for ASoC Onboard Codec I2S machine driver. This will
 	  create an alsa sound card for NAU88L25 + SSM4567.
-	  Say Y if you have such a device.
+	  Say Y or m if you have such a device. This is a recommended option.
 	  If unsure select "N".
 
 config SND_SOC_INTEL_SKL_NAU88L25_MAX98357A_MACH
-	tristate "ASoC Audio driver for SKL with NAU88L25 and MAX98357A in I2S Mode"
-	depends on X86_INTEL_LPSS && I2C
-	depends on SND_SOC_INTEL_SKYLAKE
+	tristate "SKL with NAU88L25 and MAX98357A in I2S Mode"
+	depends on MFD_INTEL_LPSS && I2C && ACPI
 	select SND_SOC_NAU8825
 	select SND_SOC_MAX98357A
 	select SND_SOC_DMIC
@@ -199,13 +198,12 @@
 	help
 	  This adds support for ASoC Onboard Codec I2S machine driver. This will
 	  create an alsa sound card for NAU88L25 + MAX98357A.
-	  Say Y if you have such a device.
+	  Say Y or m if you have such a device. This is a recommended option.
 	  If unsure select "N".
 
 config SND_SOC_INTEL_BXT_DA7219_MAX98357A_MACH
-	tristate "ASoC Audio driver for Broxton with DA7219 and MAX98357A in I2S Mode"
-	depends on X86 && ACPI && I2C
-	depends on SND_SOC_INTEL_SKYLAKE
+	tristate "Broxton with DA7219 and MAX98357A in I2S Mode"
+	depends on MFD_INTEL_LPSS && I2C && ACPI
 	select SND_SOC_DA7219
 	select SND_SOC_MAX98357A
 	select SND_SOC_DMIC
@@ -214,13 +212,12 @@
 	help
 	   This adds support for ASoC machine driver for Broxton-P platforms
 	   with DA7219 + MAX98357A I2S audio codec.
-	   Say Y if you have such a device.
+	   Say Y or m if you have such a device. This is a recommended option.
 	   If unsure select "N".
 
 config SND_SOC_INTEL_BXT_RT298_MACH
-	tristate "ASoC Audio driver for Broxton with RT298 I2S mode"
-	depends on X86 && ACPI && I2C
-	depends on SND_SOC_INTEL_SKYLAKE
+	tristate "Broxton with RT298 I2S mode"
+	depends on MFD_INTEL_LPSS && I2C && ACPI
 	select SND_SOC_RT298
 	select SND_SOC_DMIC
 	select SND_SOC_HDAC_HDMI
@@ -228,14 +225,12 @@
 	help
 	   This adds support for ASoC machine driver for Broxton platforms
 	   with RT286 I2S audio codec.
-	   Say Y if you have such a device.
+	   Say Y or m if you have such a device. This is a recommended option.
 	   If unsure select "N".
 
 config SND_SOC_INTEL_KBL_RT5663_MAX98927_MACH
-	tristate "ASoC Audio driver for KBL with RT5663 and MAX98927 in I2S Mode"
-	depends on X86_INTEL_LPSS && I2C
-	select SND_SOC_INTEL_SST
-	depends on SND_SOC_INTEL_SKYLAKE
+	tristate "KBL with RT5663 and MAX98927 in I2S Mode"
+	depends on MFD_INTEL_LPSS && I2C && ACPI
 	select SND_SOC_RT5663
 	select SND_SOC_MAX98927
 	select SND_SOC_DMIC
@@ -243,14 +238,13 @@
 	help
 	  This adds support for ASoC Onboard Codec I2S machine driver. This will
 	  create an alsa sound card for RT5663 + MAX98927.
-	  Say Y if you have such a device.
+	  Say Y or m if you have such a device. This is a recommended option.
 	  If unsure select "N".
 
 config SND_SOC_INTEL_KBL_RT5663_RT5514_MAX98927_MACH
-        tristate "ASoC Audio driver for KBL with RT5663, RT5514 and MAX98927 in I2S Mode"
-        depends on X86_INTEL_LPSS && I2C && SPI
-        select SND_SOC_INTEL_SST
-        depends on SND_SOC_INTEL_SKYLAKE
+        tristate "KBL with RT5663, RT5514 and MAX98927 in I2S Mode"
+        depends on MFD_INTEL_LPSS && I2C && ACPI
+        depends on SPI
         select SND_SOC_RT5663
         select SND_SOC_RT5514
         select SND_SOC_RT5514_SPI
@@ -259,7 +253,8 @@
         help
           This adds support for ASoC Onboard Codec I2S machine driver. This will
           create an alsa sound card for RT5663 + RT5514 + MAX98927.
-          Say Y if you have such a device.
+          Say Y or m if you have such a device. This is a recommended option.
           If unsure select "N".
+endif ## SND_SOC_INTEL_SKYLAKE
 
-endif
+endif ## SND_SOC_INTEL_MACH
diff --git a/sound/soc/intel/boards/bytcht_da7213.c b/sound/soc/intel/boards/bytcht_da7213.c
index c4d82ad..2179ded 100644
--- a/sound/soc/intel/boards/bytcht_da7213.c
+++ b/sound/soc/intel/boards/bytcht_da7213.c
@@ -219,7 +219,7 @@ static struct snd_soc_card bytcht_da7213_card = {
 	.num_dapm_routes = ARRAY_SIZE(audio_map),
 };
 
-static char codec_name[16]; /* i2c-<HID>:00 with HID being 8 chars */
+static char codec_name[SND_ACPI_I2C_ID_LEN];
 
 static int bytcht_da7213_probe(struct platform_device *pdev)
 {
@@ -243,7 +243,7 @@ static int bytcht_da7213_probe(struct platform_device *pdev)
 	}
 
 	/* fixup codec name based on HID */
-	i2c_name = snd_soc_acpi_find_name_from_hid(mach->id);
+	i2c_name = acpi_dev_get_first_match_name(mach->id, NULL, -1);
 	if (i2c_name) {
 		snprintf(codec_name, sizeof(codec_name),
 			"%s%s", "i2c-", i2c_name);
diff --git a/sound/soc/intel/boards/bytcht_es8316.c b/sound/soc/intel/boards/bytcht_es8316.c
index 8088396..305e7f4 100644
--- a/sound/soc/intel/boards/bytcht_es8316.c
+++ b/sound/soc/intel/boards/bytcht_es8316.c
@@ -232,15 +232,39 @@ static struct snd_soc_card byt_cht_es8316_card = {
 	.fully_routed = true,
 };
 
+static char codec_name[SND_ACPI_I2C_ID_LEN];
+
 static int snd_byt_cht_es8316_mc_probe(struct platform_device *pdev)
 {
-	int ret = 0;
 	struct byt_cht_es8316_private *priv;
+	struct snd_soc_acpi_mach *mach;
+	const char *i2c_name = NULL;
+	int dai_index = 0;
+	int i;
+	int ret = 0;
 
 	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_ATOMIC);
 	if (!priv)
 		return -ENOMEM;
 
+	mach = (&pdev->dev)->platform_data;
+	/* fix index of codec dai */
+	for (i = 0; i < ARRAY_SIZE(byt_cht_es8316_dais); i++) {
+		if (!strcmp(byt_cht_es8316_dais[i].codec_name,
+			    "i2c-ESSX8316:00")) {
+			dai_index = i;
+			break;
+		}
+	}
+
+	/* fixup codec name based on HID */
+	i2c_name = acpi_dev_get_first_match_name(mach->id, NULL, -1);
+	if (i2c_name) {
+		snprintf(codec_name, sizeof(codec_name),
+			"%s%s", "i2c-", i2c_name);
+		byt_cht_es8316_dais[dai_index].codec_name = codec_name;
+	}
+
 	/* register the soc card */
 	byt_cht_es8316_card.dev = &pdev->dev;
 	snd_soc_card_set_drvdata(&byt_cht_es8316_card, priv);
diff --git a/sound/soc/intel/boards/bytcr_rt5640.c b/sound/soc/intel/boards/bytcr_rt5640.c
index f2c0fc4..b6a1cfe 100644
--- a/sound/soc/intel/boards/bytcr_rt5640.c
+++ b/sound/soc/intel/boards/bytcr_rt5640.c
@@ -713,7 +713,7 @@ static struct snd_soc_card byt_rt5640_card = {
 	.fully_routed = true,
 };
 
-static char byt_rt5640_codec_name[16]; /* i2c-<HID>:00 with HID being 8 chars */
+static char byt_rt5640_codec_name[SND_ACPI_I2C_ID_LEN];
 static char byt_rt5640_codec_aif_name[12]; /*  = "rt5640-aif[1|2]" */
 static char byt_rt5640_cpu_dai_name[10]; /*  = "ssp[0|2]-port" */
 
@@ -762,7 +762,7 @@ static int snd_byt_rt5640_mc_probe(struct platform_device *pdev)
 	}
 
 	/* fixup codec name based on HID */
-	i2c_name = snd_soc_acpi_find_name_from_hid(mach->id);
+	i2c_name = acpi_dev_get_first_match_name(mach->id, NULL, -1);
 	if (i2c_name) {
 		snprintf(byt_rt5640_codec_name, sizeof(byt_rt5640_codec_name),
 			"%s%s", "i2c-", i2c_name);
diff --git a/sound/soc/intel/boards/bytcr_rt5651.c b/sound/soc/intel/boards/bytcr_rt5651.c
index d955836..456526a 100644
--- a/sound/soc/intel/boards/bytcr_rt5651.c
+++ b/sound/soc/intel/boards/bytcr_rt5651.c
@@ -38,6 +38,8 @@ enum {
 	BYT_RT5651_DMIC_MAP,
 	BYT_RT5651_IN1_MAP,
 	BYT_RT5651_IN2_MAP,
+	BYT_RT5651_IN1_IN2_MAP,
+	BYT_RT5651_IN3_MAP,
 };
 
 #define BYT_RT5651_MAP(quirk)	((quirk) & GENMASK(7, 0))
@@ -62,6 +64,8 @@ static void log_quirks(struct device *dev)
 		dev_info(dev, "quirk IN1_MAP enabled");
 	if (BYT_RT5651_MAP(byt_rt5651_quirk) == BYT_RT5651_IN2_MAP)
 		dev_info(dev, "quirk IN2_MAP enabled");
+	if (BYT_RT5651_MAP(byt_rt5651_quirk) == BYT_RT5651_IN3_MAP)
+		dev_info(dev, "quirk IN3_MAP enabled");
 	if (byt_rt5651_quirk & BYT_RT5651_DMIC_EN)
 		dev_info(dev, "quirk DMIC enabled");
 	if (byt_rt5651_quirk & BYT_RT5651_MCLK_EN)
@@ -127,6 +131,7 @@ static const struct snd_soc_dapm_widget byt_rt5651_widgets[] = {
 	SND_SOC_DAPM_MIC("Headset Mic", NULL),
 	SND_SOC_DAPM_MIC("Internal Mic", NULL),
 	SND_SOC_DAPM_SPK("Speaker", NULL),
+	SND_SOC_DAPM_LINE("Line In", NULL),
 	SND_SOC_DAPM_SUPPLY("Platform Clock", SND_SOC_NOPM, 0, 0,
 			    platform_clock_control, SND_SOC_DAPM_PRE_PMU |
 			    SND_SOC_DAPM_POST_PMD),
@@ -138,6 +143,7 @@ static const struct snd_soc_dapm_route byt_rt5651_audio_map[] = {
 	{"Headset Mic", NULL, "Platform Clock"},
 	{"Internal Mic", NULL, "Platform Clock"},
 	{"Speaker", NULL, "Platform Clock"},
+	{"Line In", NULL, "Platform Clock"},
 
 	{"AIF1 Playback", NULL, "ssp2 Tx"},
 	{"ssp2 Tx", NULL, "codec_out0"},
@@ -151,6 +157,9 @@ static const struct snd_soc_dapm_route byt_rt5651_audio_map[] = {
 	{"Headphone", NULL, "HPOR"},
 	{"Speaker", NULL, "LOUTL"},
 	{"Speaker", NULL, "LOUTR"},
+	{"IN2P", NULL, "Line In"},
+	{"IN2N", NULL, "Line In"},
+
 };
 
 static const struct snd_soc_dapm_route byt_rt5651_intmic_dmic_map[] = {
@@ -171,11 +180,25 @@ static const struct snd_soc_dapm_route byt_rt5651_intmic_in2_map[] = {
 	{"IN2P", NULL, "Internal Mic"},
 };
 
+static const struct snd_soc_dapm_route byt_rt5651_intmic_in1_in2_map[] = {
+	{"Internal Mic", NULL, "micbias1"},
+	{"IN1P", NULL, "Internal Mic"},
+	{"IN2P", NULL, "Internal Mic"},
+	{"IN3P", NULL, "Headset Mic"},
+};
+
+static const struct snd_soc_dapm_route byt_rt5651_intmic_in3_map[] = {
+	{"Internal Mic", NULL, "micbias1"},
+	{"IN3P", NULL, "Headset Mic"},
+	{"IN1P", NULL, "Internal Mic"},
+};
+
 static const struct snd_kcontrol_new byt_rt5651_controls[] = {
 	SOC_DAPM_PIN_SWITCH("Headphone"),
 	SOC_DAPM_PIN_SWITCH("Headset Mic"),
 	SOC_DAPM_PIN_SWITCH("Internal Mic"),
 	SOC_DAPM_PIN_SWITCH("Speaker"),
+	SOC_DAPM_PIN_SWITCH("Line In"),
 };
 
 static struct snd_soc_jack_pin bytcr_jack_pins[] = {
@@ -247,8 +270,16 @@ static const struct dmi_system_id byt_rt5651_quirk_table[] = {
 			DMI_MATCH(DMI_SYS_VENDOR, "Circuitco"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "Minnowboard Max B3 PLATFORM"),
 		},
-		.driver_data = (void *)(BYT_RT5651_DMIC_MAP |
-					BYT_RT5651_DMIC_EN),
+		.driver_data = (void *)(BYT_RT5651_IN3_MAP),
+	},
+	{
+		.callback = byt_rt5651_quirk_cb,
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "ADI"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "Minnowboard Turbot"),
+		},
+		.driver_data = (void *)(BYT_RT5651_MCLK_EN |
+					BYT_RT5651_IN3_MAP),
 	},
 	{
 		.callback = byt_rt5651_quirk_cb,
@@ -256,7 +287,8 @@ static const struct dmi_system_id byt_rt5651_quirk_table[] = {
 			DMI_MATCH(DMI_SYS_VENDOR, "KIANO"),
 			DMI_MATCH(DMI_PRODUCT_NAME, "KIANO SlimNote 14.2"),
 		},
-		.driver_data = (void *)(BYT_RT5651_IN2_MAP),
+		.driver_data = (void *)(BYT_RT5651_MCLK_EN |
+					BYT_RT5651_IN1_IN2_MAP),
 	},
 	{}
 };
@@ -281,6 +313,14 @@ static int byt_rt5651_init(struct snd_soc_pcm_runtime *runtime)
 		custom_map = byt_rt5651_intmic_in2_map;
 		num_routes = ARRAY_SIZE(byt_rt5651_intmic_in2_map);
 		break;
+	case BYT_RT5651_IN1_IN2_MAP:
+		custom_map = byt_rt5651_intmic_in1_in2_map;
+		num_routes = ARRAY_SIZE(byt_rt5651_intmic_in1_in2_map);
+		break;
+	case BYT_RT5651_IN3_MAP:
+		custom_map = byt_rt5651_intmic_in3_map;
+		num_routes = ARRAY_SIZE(byt_rt5651_intmic_in3_map);
+		break;
 	default:
 		custom_map = byt_rt5651_intmic_dmic_map;
 		num_routes = ARRAY_SIZE(byt_rt5651_intmic_dmic_map);
@@ -469,7 +509,7 @@ static struct snd_soc_card byt_rt5651_card = {
 	.fully_routed = true,
 };
 
-static char byt_rt5651_codec_name[16]; /* i2c-<HID>:00 with HID being 8 chars */
+static char byt_rt5651_codec_name[SND_ACPI_I2C_ID_LEN];
 
 static int snd_byt_rt5651_mc_probe(struct platform_device *pdev)
 {
@@ -499,7 +539,7 @@ static int snd_byt_rt5651_mc_probe(struct platform_device *pdev)
 	}
 
 	/* fixup codec name based on HID */
-	i2c_name = snd_soc_acpi_find_name_from_hid(mach->id);
+	i2c_name = acpi_dev_get_first_match_name(mach->id, NULL, -1);
 	if (i2c_name) {
 		snprintf(byt_rt5651_codec_name, sizeof(byt_rt5651_codec_name),
 			"%s%s", "i2c-", i2c_name);
diff --git a/sound/soc/intel/boards/cht_bsw_rt5645.c b/sound/soc/intel/boards/cht_bsw_rt5645.c
index 18d129ca..31641aa 100644
--- a/sound/soc/intel/boards/cht_bsw_rt5645.c
+++ b/sound/soc/intel/boards/cht_bsw_rt5645.c
@@ -49,7 +49,7 @@ struct cht_acpi_card {
 struct cht_mc_private {
 	struct snd_soc_jack jack;
 	struct cht_acpi_card *acpi_card;
-	char codec_name[16];
+	char codec_name[SND_ACPI_I2C_ID_LEN];
 	struct clk *mclk;
 };
 
@@ -118,6 +118,7 @@ static const struct snd_soc_dapm_widget cht_dapm_widgets[] = {
 	SND_SOC_DAPM_HP("Headphone", NULL),
 	SND_SOC_DAPM_MIC("Headset Mic", NULL),
 	SND_SOC_DAPM_MIC("Int Mic", NULL),
+	SND_SOC_DAPM_MIC("Int Analog Mic", NULL),
 	SND_SOC_DAPM_SPK("Ext Spk", NULL),
 	SND_SOC_DAPM_SUPPLY("Platform Clock", SND_SOC_NOPM, 0, 0,
 			platform_clock_control, SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD),
@@ -128,6 +129,8 @@ static const struct snd_soc_dapm_route cht_rt5645_audio_map[] = {
 	{"IN1N", NULL, "Headset Mic"},
 	{"DMIC L1", NULL, "Int Mic"},
 	{"DMIC R1", NULL, "Int Mic"},
+	{"IN2P", NULL, "Int Analog Mic"},
+	{"IN2N", NULL, "Int Analog Mic"},
 	{"Headphone", NULL, "HPOL"},
 	{"Headphone", NULL, "HPOR"},
 	{"Ext Spk", NULL, "SPOL"},
@@ -135,6 +138,9 @@ static const struct snd_soc_dapm_route cht_rt5645_audio_map[] = {
 	{"Headphone", NULL, "Platform Clock"},
 	{"Headset Mic", NULL, "Platform Clock"},
 	{"Int Mic", NULL, "Platform Clock"},
+	{"Int Analog Mic", NULL, "Platform Clock"},
+	{"Int Analog Mic", NULL, "micbias1"},
+	{"Int Analog Mic", NULL, "micbias2"},
 	{"Ext Spk", NULL, "Platform Clock"},
 };
 
@@ -189,6 +195,7 @@ static const struct snd_kcontrol_new cht_mc_controls[] = {
 	SOC_DAPM_PIN_SWITCH("Headphone"),
 	SOC_DAPM_PIN_SWITCH("Headset Mic"),
 	SOC_DAPM_PIN_SWITCH("Int Mic"),
+	SOC_DAPM_PIN_SWITCH("Int Analog Mic"),
 	SOC_DAPM_PIN_SWITCH("Ext Spk"),
 };
 
@@ -499,7 +506,7 @@ static struct cht_acpi_card snd_soc_cards[] = {
 	{"10EC5650", CODEC_TYPE_RT5650, &snd_soc_card_chtrt5650},
 };
 
-static char cht_rt5645_codec_name[16]; /* i2c-<HID>:00 with HID being 8 chars */
+static char cht_rt5645_codec_name[SND_ACPI_I2C_ID_LEN];
 static char cht_rt5645_codec_aif_name[12]; /*  = "rt5645-aif[1|2]" */
 static char cht_rt5645_cpu_dai_name[10]; /*  = "ssp[0|2]-port" */
 
@@ -566,7 +573,7 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
 		}
 
 	/* fixup codec name based on HID */
-	i2c_name = snd_soc_acpi_find_name_from_hid(mach->id);
+	i2c_name = acpi_dev_get_first_match_name(mach->id, NULL, -1);
 	if (i2c_name) {
 		snprintf(cht_rt5645_codec_name, sizeof(cht_rt5645_codec_name),
 			"%s%s", "i2c-", i2c_name);
diff --git a/sound/soc/intel/boards/cht_bsw_rt5672.c b/sound/soc/intel/boards/cht_bsw_rt5672.c
index f8f21ee..c14a52d 100644
--- a/sound/soc/intel/boards/cht_bsw_rt5672.c
+++ b/sound/soc/intel/boards/cht_bsw_rt5672.c
@@ -35,7 +35,7 @@
 
 struct cht_mc_private {
 	struct snd_soc_jack headset;
-	char codec_name[16];
+	char codec_name[SND_ACPI_I2C_ID_LEN];
 	struct clk *mclk;
 };
 
@@ -396,7 +396,7 @@ static int snd_cht_mc_probe(struct platform_device *pdev)
 
 	/* fixup codec name based on HID */
 	if (mach) {
-		i2c_name = snd_soc_acpi_find_name_from_hid(mach->id);
+		i2c_name = acpi_dev_get_first_match_name(mach->id, NULL, -1);
 		if (i2c_name) {
 			snprintf(drv->codec_name, sizeof(drv->codec_name),
 				 "i2c-%s", i2c_name);
diff --git a/sound/soc/intel/boards/haswell.c b/sound/soc/intel/boards/haswell.c
index 5e1ea03..3c516077 100644
--- a/sound/soc/intel/boards/haswell.c
+++ b/sound/soc/intel/boards/haswell.c
@@ -76,7 +76,7 @@ static int haswell_rt5640_hw_params(struct snd_pcm_substream *substream,
 	}
 
 	/* set correct codec filter for DAI format and clock config */
-	snd_soc_update_bits(rtd->codec, 0x83, 0xffff, 0x8000);
+	snd_soc_component_update_bits(codec_dai->component, 0x83, 0xffff, 0x8000);
 
 	return ret;
 }
diff --git a/sound/soc/intel/boards/kbl_rt5663_max98927.c b/sound/soc/intel/boards/kbl_rt5663_max98927.c
index 6dcad0a8a..bf7014c 100644
--- a/sound/soc/intel/boards/kbl_rt5663_max98927.c
+++ b/sound/soc/intel/boards/kbl_rt5663_max98927.c
@@ -225,7 +225,7 @@ static int kabylake_rt5663_codec_init(struct snd_soc_pcm_runtime *rtd)
 	}
 
 	jack = &ctx->kabylake_headset;
-	snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_MEDIA);
+	snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_PLAYPAUSE);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_1, KEY_VOICECOMMAND);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_2, KEY_VOLUMEUP);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_3, KEY_VOLUMEDOWN);
diff --git a/sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c b/sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c
index 271ae3c..90ea98f 100644
--- a/sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c
+++ b/sound/soc/intel/boards/kbl_rt5663_rt5514_max98927.c
@@ -195,7 +195,7 @@ static int kabylake_rt5663_codec_init(struct snd_soc_pcm_runtime *rtd)
 	}
 
 	jack = &ctx->kabylake_headset;
-	snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_MEDIA);
+	snd_jack_set_key(jack->jack, SND_JACK_BTN_0, KEY_PLAYPAUSE);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_1, KEY_VOICECOMMAND);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_2, KEY_VOLUMEUP);
 	snd_jack_set_key(jack->jack, SND_JACK_BTN_3, KEY_VOLUMEDOWN);
diff --git a/sound/soc/intel/boards/mfld_machine.c b/sound/soc/intel/boards/mfld_machine.c
deleted file mode 100644
index 6f44acf..0000000
--- a/sound/soc/intel/boards/mfld_machine.c
+++ /dev/null
@@ -1,428 +0,0 @@
-/*
- *  mfld_machine.c - ASoc Machine driver for Intel Medfield MID platform
- *
- *  Copyright (C) 2010 Intel Corp
- *  Author: Vinod Koul <vinod.koul@intel.com>
- *  Author: Harsha Priya <priya.harsha@intel.com>
- *  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- *
- *  This program is free software; you can redistribute it and/or modify
- *  it under the terms of the GNU General Public License as published by
- *  the Free Software Foundation; version 2 of the License.
- *
- *  This program is distributed in the hope that it will be useful, but
- *  WITHOUT ANY WARRANTY; without even the implied warranty of
- *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
- *  General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License along
- *  with this program; if not, write to the Free Software Foundation, Inc.,
- *  59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.
- *
- * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- */
-
-#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-
-#include <linux/init.h>
-#include <linux/device.h>
-#include <linux/slab.h>
-#include <linux/io.h>
-#include <linux/module.h>
-#include <sound/pcm.h>
-#include <sound/pcm_params.h>
-#include <sound/soc.h>
-#include <sound/jack.h>
-#include "../codecs/sn95031.h"
-
-#define MID_MONO 1
-#define MID_STEREO 2
-#define MID_MAX_CAP 5
-#define MFLD_JACK_INSERT 0x04
-
-enum soc_mic_bias_zones {
-	MFLD_MV_START = 0,
-	/* mic bias volutage range for Headphones*/
-	MFLD_MV_HP = 400,
-	/* mic bias volutage range for American Headset*/
-	MFLD_MV_AM_HS = 650,
-	/* mic bias volutage range for Headset*/
-	MFLD_MV_HS = 2000,
-	MFLD_MV_UNDEFINED,
-};
-
-static unsigned int	hs_switch;
-static unsigned int	lo_dac;
-static struct snd_soc_codec *mfld_codec;
-
-struct mfld_mc_private {
-	void __iomem *int_base;
-	u8 interrupt_status;
-};
-
-struct snd_soc_jack mfld_jack;
-
-/*Headset jack detection DAPM pins */
-static struct snd_soc_jack_pin mfld_jack_pins[] = {
-	{
-		.pin = "Headphones",
-		.mask = SND_JACK_HEADPHONE,
-	},
-	{
-		.pin = "AMIC1",
-		.mask = SND_JACK_MICROPHONE,
-	},
-};
-
-/* jack detection voltage zones */
-static struct snd_soc_jack_zone mfld_zones[] = {
-	{MFLD_MV_START, MFLD_MV_AM_HS, SND_JACK_HEADPHONE},
-	{MFLD_MV_AM_HS, MFLD_MV_HS, SND_JACK_HEADSET},
-};
-
-/* sound card controls */
-static const char * const headset_switch_text[] = {"Earpiece", "Headset"};
-
-static const char * const lo_text[] = {"Vibra", "Headset", "IHF", "None"};
-
-static const struct soc_enum headset_enum =
-	SOC_ENUM_SINGLE_EXT(2, headset_switch_text);
-
-static const struct soc_enum lo_enum =
-	SOC_ENUM_SINGLE_EXT(4, lo_text);
-
-static int headset_get_switch(struct snd_kcontrol *kcontrol,
-	struct snd_ctl_elem_value *ucontrol)
-{
-	ucontrol->value.enumerated.item[0] = hs_switch;
-	return 0;
-}
-
-static int headset_set_switch(struct snd_kcontrol *kcontrol,
-	struct snd_ctl_elem_value *ucontrol)
-{
-	struct snd_soc_card *card = snd_kcontrol_chip(kcontrol);
-	struct snd_soc_dapm_context *dapm = &card->dapm;
-
-	if (ucontrol->value.enumerated.item[0] == hs_switch)
-		return 0;
-
-	snd_soc_dapm_mutex_lock(dapm);
-
-	if (ucontrol->value.enumerated.item[0]) {
-		pr_debug("hs_set HS path\n");
-		snd_soc_dapm_enable_pin_unlocked(dapm, "Headphones");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "EPOUT");
-	} else {
-		pr_debug("hs_set EP path\n");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "Headphones");
-		snd_soc_dapm_enable_pin_unlocked(dapm, "EPOUT");
-	}
-
-	snd_soc_dapm_sync_unlocked(dapm);
-
-	snd_soc_dapm_mutex_unlock(dapm);
-
-	hs_switch = ucontrol->value.enumerated.item[0];
-
-	return 0;
-}
-
-static void lo_enable_out_pins(struct snd_soc_dapm_context *dapm)
-{
-	snd_soc_dapm_enable_pin_unlocked(dapm, "IHFOUTL");
-	snd_soc_dapm_enable_pin_unlocked(dapm, "IHFOUTR");
-	snd_soc_dapm_enable_pin_unlocked(dapm, "LINEOUTL");
-	snd_soc_dapm_enable_pin_unlocked(dapm, "LINEOUTR");
-	snd_soc_dapm_enable_pin_unlocked(dapm, "VIB1OUT");
-	snd_soc_dapm_enable_pin_unlocked(dapm, "VIB2OUT");
-	if (hs_switch) {
-		snd_soc_dapm_enable_pin_unlocked(dapm, "Headphones");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "EPOUT");
-	} else {
-		snd_soc_dapm_disable_pin_unlocked(dapm, "Headphones");
-		snd_soc_dapm_enable_pin_unlocked(dapm, "EPOUT");
-	}
-}
-
-static int lo_get_switch(struct snd_kcontrol *kcontrol,
-	struct snd_ctl_elem_value *ucontrol)
-{
-	ucontrol->value.enumerated.item[0] = lo_dac;
-	return 0;
-}
-
-static int lo_set_switch(struct snd_kcontrol *kcontrol,
-	struct snd_ctl_elem_value *ucontrol)
-{
-	struct snd_soc_card *card = snd_kcontrol_chip(kcontrol);
-	struct snd_soc_dapm_context *dapm = &card->dapm;
-
-	if (ucontrol->value.enumerated.item[0] == lo_dac)
-		return 0;
-
-	snd_soc_dapm_mutex_lock(dapm);
-
-	/* we dont want to work with last state of lineout so just enable all
-	 * pins and then disable pins not required
-	 */
-	lo_enable_out_pins(dapm);
-
-	switch (ucontrol->value.enumerated.item[0]) {
-	case 0:
-		pr_debug("set vibra path\n");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "VIB1OUT");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "VIB2OUT");
-		snd_soc_update_bits(mfld_codec, SN95031_LOCTL, 0x66, 0);
-		break;
-
-	case 1:
-		pr_debug("set hs  path\n");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "Headphones");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "EPOUT");
-		snd_soc_update_bits(mfld_codec, SN95031_LOCTL, 0x66, 0x22);
-		break;
-
-	case 2:
-		pr_debug("set spkr path\n");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "IHFOUTL");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "IHFOUTR");
-		snd_soc_update_bits(mfld_codec, SN95031_LOCTL, 0x66, 0x44);
-		break;
-
-	case 3:
-		pr_debug("set null path\n");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "LINEOUTL");
-		snd_soc_dapm_disable_pin_unlocked(dapm, "LINEOUTR");
-		snd_soc_update_bits(mfld_codec, SN95031_LOCTL, 0x66, 0x66);
-		break;
-	}
-
-	snd_soc_dapm_sync_unlocked(dapm);
-
-	snd_soc_dapm_mutex_unlock(dapm);
-
-	lo_dac = ucontrol->value.enumerated.item[0];
-	return 0;
-}
-
-static const struct snd_kcontrol_new mfld_snd_controls[] = {
-	SOC_ENUM_EXT("Playback Switch", headset_enum,
-			headset_get_switch, headset_set_switch),
-	SOC_ENUM_EXT("Lineout Mux", lo_enum,
-			lo_get_switch, lo_set_switch),
-};
-
-static const struct snd_soc_dapm_widget mfld_widgets[] = {
-	SND_SOC_DAPM_HP("Headphones", NULL),
-	SND_SOC_DAPM_MIC("Mic", NULL),
-};
-
-static const struct snd_soc_dapm_route mfld_map[] = {
-	{"Headphones", NULL, "HPOUTR"},
-	{"Headphones", NULL, "HPOUTL"},
-	{"Mic", NULL, "AMIC1"},
-};
-
-static void mfld_jack_check(unsigned int intr_status)
-{
-	struct mfld_jack_data jack_data;
-
-	if (!mfld_codec)
-		return;
-
-	jack_data.mfld_jack = &mfld_jack;
-	jack_data.intr_id = intr_status;
-
-	sn95031_jack_detection(mfld_codec, &jack_data);
-	/* TODO: add american headset detection post gpiolib support */
-}
-
-static int mfld_init(struct snd_soc_pcm_runtime *runtime)
-{
-	struct snd_soc_dapm_context *dapm = &runtime->card->dapm;
-	int ret_val;
-
-	/* default is earpiece pin, userspace sets it explcitly */
-	snd_soc_dapm_disable_pin(dapm, "Headphones");
-	/* default is lineout NC, userspace sets it explcitly */
-	snd_soc_dapm_disable_pin(dapm, "LINEOUTL");
-	snd_soc_dapm_disable_pin(dapm, "LINEOUTR");
-	lo_dac = 3;
-	hs_switch = 0;
-	/* we dont use linein in this so set to NC */
-	snd_soc_dapm_disable_pin(dapm, "LINEINL");
-	snd_soc_dapm_disable_pin(dapm, "LINEINR");
-
-	/* Headset and button jack detection */
-	ret_val = snd_soc_card_jack_new(runtime->card,
-			"Intel(R) MID Audio Jack", SND_JACK_HEADSET |
-			SND_JACK_BTN_0 | SND_JACK_BTN_1, &mfld_jack,
-			mfld_jack_pins, ARRAY_SIZE(mfld_jack_pins));
-	if (ret_val) {
-		pr_err("jack creation failed\n");
-		return ret_val;
-	}
-
-	ret_val = snd_soc_jack_add_zones(&mfld_jack,
-			ARRAY_SIZE(mfld_zones), mfld_zones);
-	if (ret_val) {
-		pr_err("adding jack zones failed\n");
-		return ret_val;
-	}
-
-	mfld_codec = runtime->codec;
-
-	/* we want to check if anything is inserted at boot,
-	 * so send a fake event to codec and it will read adc
-	 * to find if anything is there or not */
-	mfld_jack_check(MFLD_JACK_INSERT);
-	return ret_val;
-}
-
-static struct snd_soc_dai_link mfld_msic_dailink[] = {
-	{
-		.name = "Medfield Headset",
-		.stream_name = "Headset",
-		.cpu_dai_name = "Headset-cpu-dai",
-		.codec_dai_name = "SN95031 Headset",
-		.codec_name = "sn95031",
-		.platform_name = "sst-platform",
-		.init = mfld_init,
-	},
-	{
-		.name = "Medfield Speaker",
-		.stream_name = "Speaker",
-		.cpu_dai_name = "Speaker-cpu-dai",
-		.codec_dai_name = "SN95031 Speaker",
-		.codec_name = "sn95031",
-		.platform_name = "sst-platform",
-		.init = NULL,
-	},
-	{
-		.name = "Medfield Vibra",
-		.stream_name = "Vibra1",
-		.cpu_dai_name = "Vibra1-cpu-dai",
-		.codec_dai_name = "SN95031 Vibra1",
-		.codec_name = "sn95031",
-		.platform_name = "sst-platform",
-		.init = NULL,
-	},
-	{
-		.name = "Medfield Haptics",
-		.stream_name = "Vibra2",
-		.cpu_dai_name = "Vibra2-cpu-dai",
-		.codec_dai_name = "SN95031 Vibra2",
-		.codec_name = "sn95031",
-		.platform_name = "sst-platform",
-		.init = NULL,
-	},
-	{
-		.name = "Medfield Compress",
-		.stream_name = "Speaker",
-		.cpu_dai_name = "Compress-cpu-dai",
-		.codec_dai_name = "SN95031 Speaker",
-		.codec_name = "sn95031",
-		.platform_name = "sst-platform",
-		.init = NULL,
-	},
-};
-
-/* SoC card */
-static struct snd_soc_card snd_soc_card_mfld = {
-	.name = "medfield_audio",
-	.owner = THIS_MODULE,
-	.dai_link = mfld_msic_dailink,
-	.num_links = ARRAY_SIZE(mfld_msic_dailink),
-
-	.controls = mfld_snd_controls,
-	.num_controls = ARRAY_SIZE(mfld_snd_controls),
-	.dapm_widgets = mfld_widgets,
-	.num_dapm_widgets = ARRAY_SIZE(mfld_widgets),
-	.dapm_routes = mfld_map,
-	.num_dapm_routes = ARRAY_SIZE(mfld_map),
-};
-
-static irqreturn_t snd_mfld_jack_intr_handler(int irq, void *dev)
-{
-	struct mfld_mc_private *mc_private = (struct mfld_mc_private *) dev;
-
-	memcpy_fromio(&mc_private->interrupt_status,
-			((void *)(mc_private->int_base)),
-			sizeof(u8));
-	return IRQ_WAKE_THREAD;
-}
-
-static irqreturn_t snd_mfld_jack_detection(int irq, void *data)
-{
-	struct mfld_mc_private *mc_drv_ctx = (struct mfld_mc_private *) data;
-
-	mfld_jack_check(mc_drv_ctx->interrupt_status);
-
-	return IRQ_HANDLED;
-}
-
-static int snd_mfld_mc_probe(struct platform_device *pdev)
-{
-	int ret_val = 0, irq;
-	struct mfld_mc_private *mc_drv_ctx;
-	struct resource *irq_mem;
-
-	pr_debug("snd_mfld_mc_probe called\n");
-
-	/* retrive the irq number */
-	irq = platform_get_irq(pdev, 0);
-
-	/* audio interrupt base of SRAM location where
-	 * interrupts are stored by System FW */
-	mc_drv_ctx = devm_kzalloc(&pdev->dev, sizeof(*mc_drv_ctx), GFP_ATOMIC);
-	if (!mc_drv_ctx)
-		return -ENOMEM;
-
-	irq_mem = platform_get_resource_byname(
-				pdev, IORESOURCE_MEM, "IRQ_BASE");
-	if (!irq_mem) {
-		pr_err("no mem resource given\n");
-		return -ENODEV;
-	}
-	mc_drv_ctx->int_base = devm_ioremap_nocache(&pdev->dev, irq_mem->start,
-						    resource_size(irq_mem));
-	if (!mc_drv_ctx->int_base) {
-		pr_err("Mapping of cache failed\n");
-		return -ENOMEM;
-	}
-	/* register for interrupt */
-	ret_val = devm_request_threaded_irq(&pdev->dev, irq,
-			snd_mfld_jack_intr_handler,
-			snd_mfld_jack_detection,
-			IRQF_SHARED, pdev->dev.driver->name, mc_drv_ctx);
-	if (ret_val) {
-		pr_err("cannot register IRQ\n");
-		return ret_val;
-	}
-	/* register the soc card */
-	snd_soc_card_mfld.dev = &pdev->dev;
-	ret_val = devm_snd_soc_register_card(&pdev->dev, &snd_soc_card_mfld);
-	if (ret_val) {
-		pr_debug("snd_soc_register_card failed %d\n", ret_val);
-		return ret_val;
-	}
-	platform_set_drvdata(pdev, mc_drv_ctx);
-	pr_debug("successfully exited probe\n");
-	return 0;
-}
-
-static struct platform_driver snd_mfld_mc_driver = {
-	.driver = {
-		.name = "msic_audio",
-	},
-	.probe = snd_mfld_mc_probe,
-};
-
-module_platform_driver(snd_mfld_mc_driver);
-
-MODULE_DESCRIPTION("ASoC Intel(R) MID Machine driver");
-MODULE_AUTHOR("Vinod Koul <vinod.koul@intel.com>");
-MODULE_AUTHOR("Harsha Priya <priya.harsha@intel.com>");
-MODULE_LICENSE("GPL v2");
-MODULE_ALIAS("platform:msic-audio");
diff --git a/sound/soc/intel/common/sst-dsp.c b/sound/soc/intel/common/sst-dsp.c
index 11c0805..fd82f4b 100644
--- a/sound/soc/intel/common/sst-dsp.c
+++ b/sound/soc/intel/common/sst-dsp.c
@@ -269,7 +269,7 @@ int sst_dsp_register_poll(struct sst_dsp *ctx, u32 offset, u32 mask,
 	 */
 
 	timeout = jiffies + msecs_to_jiffies(time);
-	while (((sst_dsp_shim_read_unlocked(ctx, offset) & mask) != target)
+	while ((((reg = sst_dsp_shim_read_unlocked(ctx, offset)) & mask) != target)
 		&& time_before(jiffies, timeout)) {
 		k++;
 		if (k > 10)
@@ -278,8 +278,6 @@ int sst_dsp_register_poll(struct sst_dsp *ctx, u32 offset, u32 mask,
 		usleep_range(s, 2*s);
 	}
 
-	reg = sst_dsp_shim_read_unlocked(ctx, offset);
-
 	if ((reg & mask) == target) {
 		dev_dbg(ctx->dev, "FW Poll Status: reg=%#x %s successful\n",
 					reg, operation);
diff --git a/sound/soc/intel/skylake/bxt-sst.c b/sound/soc/intel/skylake/bxt-sst.c
index 4524211..440bca7 100644
--- a/sound/soc/intel/skylake/bxt-sst.c
+++ b/sound/soc/intel/skylake/bxt-sst.c
@@ -595,7 +595,7 @@ int bxt_sst_dsp_init(struct device *dev, void __iomem *mmio_base, int irq,
 	INIT_DELAYED_WORK(&skl->d0i3.work, bxt_set_dsp_D0i3);
 	skl->d0i3.state = SKL_DSP_D0I3_NONE;
 
-	return 0;
+	return skl_dsp_acquire_irq(sst);
 }
 EXPORT_SYMBOL_GPL(bxt_sst_dsp_init);
 
diff --git a/sound/soc/intel/skylake/cnl-sst.c b/sound/soc/intel/skylake/cnl-sst.c
index 387de38..245df10 100644
--- a/sound/soc/intel/skylake/cnl-sst.c
+++ b/sound/soc/intel/skylake/cnl-sst.c
@@ -458,7 +458,7 @@ int cnl_sst_dsp_init(struct device *dev, void __iomem *mmio_base, int irq,
 	cnl->boot_complete = false;
 	init_waitqueue_head(&cnl->boot_wait);
 
-	return 0;
+	return skl_dsp_acquire_irq(sst);
 }
 EXPORT_SYMBOL_GPL(cnl_sst_dsp_init);
 
diff --git a/sound/soc/intel/skylake/skl-i2s.h b/sound/soc/intel/skylake/skl-i2s.h
new file mode 100644
index 0000000..dcf819b
--- /dev/null
+++ b/sound/soc/intel/skylake/skl-i2s.h
@@ -0,0 +1,64 @@
+/*
+ *  skl-i2s.h - i2s blob mapping
+ *
+ *  Copyright (C) 2017 Intel Corp
+ *  Author: Subhransu S. Prusty < subhransu.s.prusty@intel.com>
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful, but
+ *  WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  General Public License for more details.
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ */
+
+#ifndef __SOUND_SOC_SKL_I2S_H
+#define __SOUND_SOC_SKL_I2S_H
+
+#define SKL_I2S_MAX_TIME_SLOTS		8
+#define SKL_MCLK_DIV_CLK_SRC_MASK	GENMASK(17, 16)
+
+#define SKL_MNDSS_DIV_CLK_SRC_MASK	GENMASK(21, 20)
+#define SKL_SHIFT(x)			(ffs(x) - 1)
+#define SKL_MCLK_DIV_RATIO_MASK		GENMASK(11, 0)
+
+struct skl_i2s_config {
+	u32 ssc0;
+	u32 ssc1;
+	u32 sscto;
+	u32 sspsp;
+	u32 sstsa;
+	u32 ssrsa;
+	u32 ssc2;
+	u32 sspsp2;
+	u32 ssc3;
+	u32 ssioc;
+} __packed;
+
+struct skl_i2s_config_mclk {
+	u32 mdivctrl;
+	u32 mdivr;
+};
+
+/**
+ * struct skl_i2s_config_blob_legacy - Structure defines I2S Gateway
+ * configuration legacy blob
+ *
+ * @gtw_attr:		Gateway attribute for the I2S Gateway
+ * @tdm_ts_group:	TDM slot mapping against channels in the Gateway.
+ * @i2s_cfg:		I2S HW registers
+ * @mclk:		MCLK clock source and divider values
+ */
+struct skl_i2s_config_blob_legacy {
+	u32 gtw_attr;
+	u32 tdm_ts_group[SKL_I2S_MAX_TIME_SLOTS];
+	struct skl_i2s_config i2s_cfg;
+	struct skl_i2s_config_mclk mclk;
+};
+
+#endif /* __SOUND_SOC_SKL_I2S_H */
diff --git a/sound/soc/intel/skylake/skl-messages.c b/sound/soc/intel/skylake/skl-messages.c
index 61b5bfa..8cbf080 100644
--- a/sound/soc/intel/skylake/skl-messages.c
+++ b/sound/soc/intel/skylake/skl-messages.c
@@ -55,6 +55,19 @@ static int skl_free_dma_buf(struct device *dev, struct snd_dma_buffer *dmab)
 	return 0;
 }
 
+#define SKL_ASTATE_PARAM_ID	4
+
+void skl_dsp_set_astate_cfg(struct skl_sst *ctx, u32 cnt, void *data)
+{
+	struct skl_ipc_large_config_msg	msg = {0};
+
+	msg.large_param_id = SKL_ASTATE_PARAM_ID;
+	msg.param_data_size = (cnt * sizeof(struct skl_astate_param) +
+				sizeof(cnt));
+
+	skl_ipc_set_large_config(&ctx->ipc, &msg, data);
+}
+
 #define NOTIFICATION_PARAM_ID 3
 #define NOTIFICATION_MASK 0xf
 
@@ -404,11 +417,20 @@ int skl_resume_dsp(struct skl *skl)
 	if (skl->skl_sst->is_first_boot == true)
 		return 0;
 
+	/* disable dynamic clock gating during fw and lib download */
+	ctx->enable_miscbdcge(ctx->dev, false);
+
 	ret = skl_dsp_wake(ctx->dsp);
+	ctx->enable_miscbdcge(ctx->dev, true);
 	if (ret < 0)
 		return ret;
 
 	skl_dsp_enable_notification(skl->skl_sst, false);
+
+	if (skl->cfg.astate_cfg != NULL) {
+		skl_dsp_set_astate_cfg(skl->skl_sst, skl->cfg.astate_cfg->count,
+					skl->cfg.astate_cfg);
+	}
 	return ret;
 }
 
diff --git a/sound/soc/intel/skylake/skl-nhlt.c b/sound/soc/intel/skylake/skl-nhlt.c
index 3eaac41..3b1d2b8 100644
--- a/sound/soc/intel/skylake/skl-nhlt.c
+++ b/sound/soc/intel/skylake/skl-nhlt.c
@@ -19,6 +19,7 @@
  */
 #include <linux/pci.h>
 #include "skl.h"
+#include "skl-i2s.h"
 
 #define NHLT_ACPI_HEADER_SIG	"NHLT"
 
@@ -43,7 +44,8 @@ struct nhlt_acpi_table *skl_nhlt_init(struct device *dev)
 	obj = acpi_evaluate_dsm(handle, &osc_guid, 1, 1, NULL);
 	if (obj && obj->type == ACPI_TYPE_BUFFER) {
 		nhlt_ptr = (struct nhlt_resource_desc  *)obj->buffer.pointer;
-		nhlt_table = (struct nhlt_acpi_table *)
+		if (nhlt_ptr->length)
+			nhlt_table = (struct nhlt_acpi_table *)
 				memremap(nhlt_ptr->min_addr, nhlt_ptr->length,
 				MEMREMAP_WB);
 		ACPI_FREE(obj);
@@ -276,3 +278,157 @@ void skl_nhlt_remove_sysfs(struct skl *skl)
 
 	sysfs_remove_file(&dev->kobj, &dev_attr_platform_id.attr);
 }
+
+/*
+ * Queries NHLT for all the fmt configuration for a particular endpoint and
+ * stores all possible rates supported in a rate table for the corresponding
+ * sclk/sclkfs.
+ */
+static void skl_get_ssp_clks(struct skl *skl, struct skl_ssp_clk *ssp_clks,
+				struct nhlt_fmt *fmt, u8 id)
+{
+	struct skl_i2s_config_blob_legacy *i2s_config;
+	struct skl_clk_parent_src *parent;
+	struct skl_ssp_clk *sclk, *sclkfs;
+	struct nhlt_fmt_cfg *fmt_cfg;
+	struct wav_fmt_ext *wav_fmt;
+	unsigned long rate = 0;
+	bool present = false;
+	int rate_index = 0;
+	u16 channels, bps;
+	u8 clk_src;
+	int i, j;
+	u32 fs;
+
+	sclk = &ssp_clks[SKL_SCLK_OFS];
+	sclkfs = &ssp_clks[SKL_SCLKFS_OFS];
+
+	if (fmt->fmt_count == 0)
+		return;
+
+	for (i = 0; i < fmt->fmt_count; i++) {
+		fmt_cfg = &fmt->fmt_config[i];
+		wav_fmt = &fmt_cfg->fmt_ext;
+
+		channels = wav_fmt->fmt.channels;
+		bps = wav_fmt->fmt.bits_per_sample;
+		fs = wav_fmt->fmt.samples_per_sec;
+
+		/*
+		 * In case of TDM configuration on a ssp, there can
+		 * be more than one blob in which channel masks are
+		 * different for each usecase for a specific rate and bps.
+		 * But the sclk rate will be generated for the total
+		 * number of channels used for that endpoint.
+		 *
+		 * So for the given fs and bps, choose blob which has
+		 * the superset of all channels for that endpoint and
+		 * derive the rate.
+		 */
+		for (j = i; j < fmt->fmt_count; j++) {
+			fmt_cfg = &fmt->fmt_config[j];
+			wav_fmt = &fmt_cfg->fmt_ext;
+			if ((fs == wav_fmt->fmt.samples_per_sec) &&
+			   (bps == wav_fmt->fmt.bits_per_sample))
+				channels = max_t(u16, channels,
+						wav_fmt->fmt.channels);
+		}
+
+		rate = channels * bps * fs;
+
+		/* check if the rate is added already to the given SSP's sclk */
+		for (j = 0; (j < SKL_MAX_CLK_RATES) &&
+			    (sclk[id].rate_cfg[j].rate != 0); j++) {
+			if (sclk[id].rate_cfg[j].rate == rate) {
+				present = true;
+				break;
+			}
+		}
+
+		/* Fill rate and parent for sclk/sclkfs */
+		if (!present) {
+			/* MCLK Divider Source Select */
+			i2s_config = (struct skl_i2s_config_blob_legacy *)
+						fmt->fmt_config[0].config.caps;
+			clk_src = ((i2s_config->mclk.mdivctrl)
+					& SKL_MNDSS_DIV_CLK_SRC_MASK) >>
+					SKL_SHIFT(SKL_MNDSS_DIV_CLK_SRC_MASK);
+
+			parent = skl_get_parent_clk(clk_src);
+
+			/*
+			 * Do not copy the config data if there is no parent
+			 * clock available for this clock source select
+			 */
+			if (!parent)
+				continue;
+
+			sclk[id].rate_cfg[rate_index].rate = rate;
+			sclk[id].rate_cfg[rate_index].config = fmt_cfg;
+			sclkfs[id].rate_cfg[rate_index].rate = rate;
+			sclkfs[id].rate_cfg[rate_index].config = fmt_cfg;
+			sclk[id].parent_name = parent->name;
+			sclkfs[id].parent_name = parent->name;
+
+			rate_index++;
+		}
+	}
+}
+
+static void skl_get_mclk(struct skl *skl, struct skl_ssp_clk *mclk,
+				struct nhlt_fmt *fmt, u8 id)
+{
+	struct skl_i2s_config_blob_legacy *i2s_config;
+	struct nhlt_specific_cfg *fmt_cfg;
+	struct skl_clk_parent_src *parent;
+	u32 clkdiv, div_ratio;
+	u8 clk_src;
+
+	fmt_cfg = &fmt->fmt_config[0].config;
+	i2s_config = (struct skl_i2s_config_blob_legacy *)fmt_cfg->caps;
+
+	/* MCLK Divider Source Select */
+	clk_src = ((i2s_config->mclk.mdivctrl) & SKL_MCLK_DIV_CLK_SRC_MASK) >>
+					SKL_SHIFT(SKL_MCLK_DIV_CLK_SRC_MASK);
+
+	clkdiv = i2s_config->mclk.mdivr & SKL_MCLK_DIV_RATIO_MASK;
+
+	/* bypass divider */
+	div_ratio = 1;
+
+	if (clkdiv != SKL_MCLK_DIV_RATIO_MASK)
+		/* Divider is 2 + clkdiv */
+		div_ratio = clkdiv + 2;
+
+	/* Calculate MCLK rate from source using div value */
+	parent = skl_get_parent_clk(clk_src);
+	if (!parent)
+		return;
+
+	mclk[id].rate_cfg[0].rate = parent->rate/div_ratio;
+	mclk[id].rate_cfg[0].config = &fmt->fmt_config[0];
+	mclk[id].parent_name = parent->name;
+}
+
+void skl_get_clks(struct skl *skl, struct skl_ssp_clk *ssp_clks)
+{
+	struct nhlt_acpi_table *nhlt = (struct nhlt_acpi_table *)skl->nhlt;
+	struct nhlt_endpoint *epnt;
+	struct nhlt_fmt *fmt;
+	int i;
+	u8 id;
+
+	epnt = (struct nhlt_endpoint *)nhlt->desc;
+	for (i = 0; i < nhlt->endpoint_count; i++) {
+		if (epnt->linktype == NHLT_LINK_SSP) {
+			id = epnt->virtual_bus_id;
+
+			fmt = (struct nhlt_fmt *)(epnt->config.caps
+					+ epnt->config.size);
+
+			skl_get_ssp_clks(skl, ssp_clks, fmt, id);
+			skl_get_mclk(skl, ssp_clks, fmt, id);
+		}
+		epnt = (struct nhlt_endpoint *)((u8 *)epnt + epnt->length);
+	}
+}
diff --git a/sound/soc/intel/skylake/skl-pcm.c b/sound/soc/intel/skylake/skl-pcm.c
index 1dd9747..e468285 100644
--- a/sound/soc/intel/skylake/skl-pcm.c
+++ b/sound/soc/intel/skylake/skl-pcm.c
@@ -537,7 +537,7 @@ static int skl_link_hw_params(struct snd_pcm_substream *substream,
 
 	snd_soc_dai_set_dma_data(dai, substream, (void *)link_dev);
 
-	link = snd_hdac_ext_bus_get_link(ebus, rtd->codec->component.name);
+	link = snd_hdac_ext_bus_get_link(ebus, codec_dai->component->name);
 	if (!link)
 		return -EINVAL;
 
@@ -620,7 +620,7 @@ static int skl_link_hw_free(struct snd_pcm_substream *substream,
 
 	link_dev->link_prepared = 0;
 
-	link = snd_hdac_ext_bus_get_link(ebus, rtd->codec->component.name);
+	link = snd_hdac_ext_bus_get_link(ebus, rtd->codec_dai->component->name);
 	if (!link)
 		return -EINVAL;
 
@@ -1343,7 +1343,11 @@ static int skl_platform_soc_probe(struct snd_soc_platform *platform)
 			return -EIO;
 		}
 
+		/* disable dynamic clock gating during fw and lib download */
+		skl->skl_sst->enable_miscbdcge(platform->dev, false);
+
 		ret = ops->init_fw(platform->dev, skl->skl_sst);
+		skl->skl_sst->enable_miscbdcge(platform->dev, true);
 		if (ret < 0) {
 			dev_err(platform->dev, "Failed to boot first fw: %d\n", ret);
 			return ret;
@@ -1351,6 +1355,12 @@ static int skl_platform_soc_probe(struct snd_soc_platform *platform)
 		skl_populate_modules(skl);
 		skl->skl_sst->update_d0i3c = skl_update_d0i3c;
 		skl_dsp_enable_notification(skl->skl_sst, false);
+
+		if (skl->cfg.astate_cfg != NULL) {
+			skl_dsp_set_astate_cfg(skl->skl_sst,
+					skl->cfg.astate_cfg->count,
+					skl->cfg.astate_cfg);
+		}
 	}
 	pm_runtime_mark_last_busy(platform->dev);
 	pm_runtime_put_autosuspend(platform->dev);
diff --git a/sound/soc/intel/skylake/skl-ssp-clk.h b/sound/soc/intel/skylake/skl-ssp-clk.h
new file mode 100644
index 0000000..c9ea840
--- /dev/null
+++ b/sound/soc/intel/skylake/skl-ssp-clk.h
@@ -0,0 +1,79 @@
+/*
+ *  skl-ssp-clk.h - Skylake ssp clock information and ipc structure
+ *
+ *  Copyright (C) 2017 Intel Corp
+ *  Author: Jaikrishna Nemallapudi <jaikrishnax.nemallapudi@intel.com>
+ *  Author: Subhransu S. Prusty <subhransu.s.prusty@intel.com>
+ *  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ *  This program is free software; you can redistribute it and/or modify
+ *  it under the terms of the GNU General Public License as published by
+ *  the Free Software Foundation; version 2 of the License.
+ *
+ *  This program is distributed in the hope that it will be useful, but
+ *  WITHOUT ANY WARRANTY; without even the implied warranty of
+ *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ *  General Public License for more details.
+ *
+ * ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ *
+ */
+
+#ifndef SOUND_SOC_SKL_SSP_CLK_H
+#define SOUND_SOC_SKL_SSP_CLK_H
+
+#define SKL_MAX_SSP		6
+/* xtal/cardinal/pll, parent of ssp clocks and mclk */
+#define SKL_MAX_CLK_SRC		3
+#define SKL_MAX_SSP_CLK_TYPES	3 /* mclk, sclk, sclkfs */
+
+#define SKL_MAX_CLK_CNT		(SKL_MAX_SSP * SKL_MAX_SSP_CLK_TYPES)
+
+/* Max number of configurations supported for each clock */
+#define SKL_MAX_CLK_RATES	10
+
+#define SKL_SCLK_OFS		SKL_MAX_SSP
+#define SKL_SCLKFS_OFS		(SKL_SCLK_OFS + SKL_MAX_SSP)
+
+enum skl_clk_type {
+	SKL_MCLK,
+	SKL_SCLK,
+	SKL_SCLK_FS,
+};
+
+enum skl_clk_src_type {
+	SKL_XTAL,
+	SKL_CARDINAL,
+	SKL_PLL,
+};
+
+struct skl_clk_parent_src {
+	u8 clk_id;
+	const char *name;
+	unsigned long rate;
+	const char *parent_name;
+};
+
+struct skl_clk_rate_cfg_table {
+	unsigned long rate;
+	void *config;
+};
+
+/*
+ * rate for mclk will be in rates[0]. For sclk and sclkfs, rates[] store
+ * all possible clocks ssp can generate for that platform.
+ */
+struct skl_ssp_clk {
+	const char *name;
+	const char *parent_name;
+	struct skl_clk_rate_cfg_table rate_cfg[SKL_MAX_CLK_RATES];
+};
+
+struct skl_clk_pdata {
+	struct skl_clk_parent_src *parent_clks;
+	int num_clks;
+	struct skl_ssp_clk *ssp_clks;
+	void *pvt_data;
+};
+
+#endif /* SOUND_SOC_SKL_SSP_CLK_H */
diff --git a/sound/soc/intel/skylake/skl-sst-dsp.c b/sound/soc/intel/skylake/skl-sst-dsp.c
index 19ee1d4..71e31ad 100644
--- a/sound/soc/intel/skylake/skl-sst-dsp.c
+++ b/sound/soc/intel/skylake/skl-sst-dsp.c
@@ -435,16 +435,22 @@ struct sst_dsp *skl_dsp_ctx_init(struct device *dev,
 			return NULL;
 	}
 
+	return sst;
+}
+
+int skl_dsp_acquire_irq(struct sst_dsp *sst)
+{
+	struct sst_dsp_device *sst_dev = sst->sst_dev;
+	int ret;
+
 	/* Register the ISR */
 	ret = request_threaded_irq(sst->irq, sst->ops->irq_handler,
 		sst_dev->thread, IRQF_SHARED, "AudioDSP", sst);
-	if (ret) {
+	if (ret)
 		dev_err(sst->dev, "unable to grab threaded IRQ %d, disabling device\n",
 			       sst->irq);
-		return NULL;
-	}
 
-	return sst;
+	return ret;
 }
 
 void skl_dsp_free(struct sst_dsp *dsp)
diff --git a/sound/soc/intel/skylake/skl-sst-dsp.h b/sound/soc/intel/skylake/skl-sst-dsp.h
index eba20d3..12fc9a7 100644
--- a/sound/soc/intel/skylake/skl-sst-dsp.h
+++ b/sound/soc/intel/skylake/skl-sst-dsp.h
@@ -206,6 +206,7 @@ int skl_cldma_wait_interruptible(struct sst_dsp *ctx);
 void skl_dsp_set_state_locked(struct sst_dsp *ctx, int state);
 struct sst_dsp *skl_dsp_ctx_init(struct device *dev,
 		struct sst_dsp_device *sst_dev, int irq);
+int skl_dsp_acquire_irq(struct sst_dsp *sst);
 bool is_skl_dsp_running(struct sst_dsp *ctx);
 
 unsigned int skl_dsp_get_enabled_cores(struct sst_dsp *ctx);
@@ -251,6 +252,9 @@ void skl_freeup_uuid_list(struct skl_sst *ctx);
 
 int skl_dsp_strip_extended_manifest(struct firmware *fw);
 void skl_dsp_enable_notification(struct skl_sst *ctx, bool enable);
+
+void skl_dsp_set_astate_cfg(struct skl_sst *ctx, u32 cnt, void *data);
+
 int skl_sst_ctx_init(struct device *dev, int irq, const char *fw_name,
 		struct skl_dsp_loader_ops dsp_ops, struct skl_sst **dsp,
 		struct sst_dsp_device *skl_dev);
diff --git a/sound/soc/intel/skylake/skl-sst-utils.c b/sound/soc/intel/skylake/skl-sst-utils.c
index 8ff8928..2ae4056 100644
--- a/sound/soc/intel/skylake/skl-sst-utils.c
+++ b/sound/soc/intel/skylake/skl-sst-utils.c
@@ -178,7 +178,8 @@ static inline int skl_pvtid_128(struct uuid_module *module)
  * skl_get_pvt_id: generate a private id for use as module id
  *
  * @ctx: driver context
- * @mconfig: module configuration data
+ * @uuid_mod: module's uuid
+ * @instance_id: module's instance id
  *
  * This generates a 128 bit private unique id for a module TYPE so that
  * module instance is unique
@@ -208,7 +209,8 @@ EXPORT_SYMBOL_GPL(skl_get_pvt_id);
  * skl_put_pvt_id: free up the private id allocated
  *
  * @ctx: driver context
- * @mconfig: module configuration data
+ * @uuid_mod: module's uuid
+ * @pvt_id: module pvt id
  *
  * This frees a 128 bit private unique id previously generated
  */
diff --git a/sound/soc/intel/skylake/skl-sst.c b/sound/soc/intel/skylake/skl-sst.c
index a436abf..5a7e41b6 100644
--- a/sound/soc/intel/skylake/skl-sst.c
+++ b/sound/soc/intel/skylake/skl-sst.c
@@ -569,7 +569,7 @@ int skl_sst_dsp_init(struct device *dev, void __iomem *mmio_base, int irq,
 
 	sst->fw_ops = skl_fw_ops;
 
-	return 0;
+	return skl_dsp_acquire_irq(sst);
 }
 EXPORT_SYMBOL_GPL(skl_sst_dsp_init);
 
diff --git a/sound/soc/intel/skylake/skl-topology.c b/sound/soc/intel/skylake/skl-topology.c
index 81923da..73af6e1 100644
--- a/sound/soc/intel/skylake/skl-topology.c
+++ b/sound/soc/intel/skylake/skl-topology.c
@@ -190,7 +190,6 @@ skl_tplg_free_pipe_mcps(struct skl *skl, struct skl_module_cfg *mconfig)
 	u8 res_idx = mconfig->res_idx;
 	struct skl_module_res *res = &mconfig->module->resources[res_idx];
 
-	res = &mconfig->module->resources[res_idx];
 	skl->resource.mcps -= res->cps;
 }
 
@@ -3056,11 +3055,13 @@ static int skl_tplg_get_int_tkn(struct device *dev,
 		struct snd_soc_tplg_vendor_value_elem *tkn_elem,
 		struct skl *skl)
 {
-	int tkn_count = 0, ret;
+	int tkn_count = 0, ret, size;
 	static int mod_idx, res_val_idx, intf_val_idx, dir, pin_idx;
 	struct skl_module_res *res = NULL;
 	struct skl_module_iface *fmt = NULL;
 	struct skl_module *mod = NULL;
+	static struct skl_astate_param *astate_table;
+	static int astate_cfg_idx, count;
 	int i;
 
 	if (skl->modules) {
@@ -3093,6 +3094,46 @@ static int skl_tplg_get_int_tkn(struct device *dev,
 		mod_idx = tkn_elem->value;
 		break;
 
+	case SKL_TKN_U32_ASTATE_COUNT:
+		if (astate_table != NULL) {
+			dev_err(dev, "More than one entry for A-State count");
+			return -EINVAL;
+		}
+
+		if (tkn_elem->value > SKL_MAX_ASTATE_CFG) {
+			dev_err(dev, "Invalid A-State count %d\n",
+				tkn_elem->value);
+			return -EINVAL;
+		}
+
+		size = tkn_elem->value * sizeof(struct skl_astate_param) +
+				sizeof(count);
+		skl->cfg.astate_cfg = devm_kzalloc(dev, size, GFP_KERNEL);
+		if (!skl->cfg.astate_cfg)
+			return -ENOMEM;
+
+		astate_table = skl->cfg.astate_cfg->astate_table;
+		count = skl->cfg.astate_cfg->count = tkn_elem->value;
+		break;
+
+	case SKL_TKN_U32_ASTATE_IDX:
+		if (tkn_elem->value >= count) {
+			dev_err(dev, "Invalid A-State index %d\n",
+				tkn_elem->value);
+			return -EINVAL;
+		}
+
+		astate_cfg_idx = tkn_elem->value;
+		break;
+
+	case SKL_TKN_U32_ASTATE_KCPS:
+		astate_table[astate_cfg_idx].kcps = tkn_elem->value;
+		break;
+
+	case SKL_TKN_U32_ASTATE_CLK_SRC:
+		astate_table[astate_cfg_idx].clk_src = tkn_elem->value;
+		break;
+
 	case SKL_TKN_U8_IN_PIN_TYPE:
 	case SKL_TKN_U8_OUT_PIN_TYPE:
 	case SKL_TKN_U8_IN_QUEUE_COUNT:
diff --git a/sound/soc/intel/skylake/skl.c b/sound/soc/intel/skylake/skl.c
index 31d8634..32ce64c 100644
--- a/sound/soc/intel/skylake/skl.c
+++ b/sound/soc/intel/skylake/skl.c
@@ -355,6 +355,7 @@ static int skl_resume(struct device *dev)
 
 		if (ebus->cmd_dma_state)
 			snd_hdac_bus_init_cmd_io(&ebus->bus);
+		ret = 0;
 	} else {
 		ret = _skl_resume(ebus);
 
@@ -435,19 +436,51 @@ static int skl_free(struct hdac_ext_bus *ebus)
 	return 0;
 }
 
-static int skl_machine_device_register(struct skl *skl, void *driver_data)
+/*
+ * For each ssp there are 3 clocks (mclk/sclk/sclkfs).
+ * e.g. for ssp0, clocks will be named as
+ *      "ssp0_mclk", "ssp0_sclk", "ssp0_sclkfs"
+ * So for skl+, there are 6 ssps, so 18 clocks will be created.
+ */
+static struct skl_ssp_clk skl_ssp_clks[] = {
+	{.name = "ssp0_mclk"}, {.name = "ssp1_mclk"}, {.name = "ssp2_mclk"},
+	{.name = "ssp3_mclk"}, {.name = "ssp4_mclk"}, {.name = "ssp5_mclk"},
+	{.name = "ssp0_sclk"}, {.name = "ssp1_sclk"}, {.name = "ssp2_sclk"},
+	{.name = "ssp3_sclk"}, {.name = "ssp4_sclk"}, {.name = "ssp5_sclk"},
+	{.name = "ssp0_sclkfs"}, {.name = "ssp1_sclkfs"},
+						{.name = "ssp2_sclkfs"},
+	{.name = "ssp3_sclkfs"}, {.name = "ssp4_sclkfs"},
+						{.name = "ssp5_sclkfs"},
+};
+
+static int skl_find_machine(struct skl *skl, void *driver_data)
 {
-	struct hdac_bus *bus = ebus_to_hbus(&skl->ebus);
-	struct platform_device *pdev;
 	struct snd_soc_acpi_mach *mach = driver_data;
-	int ret;
+	struct hdac_bus *bus = ebus_to_hbus(&skl->ebus);
+	struct skl_machine_pdata *pdata;
 
 	mach = snd_soc_acpi_find_machine(mach);
 	if (mach == NULL) {
 		dev_err(bus->dev, "No matching machine driver found\n");
 		return -ENODEV;
 	}
+
+	skl->mach = mach;
 	skl->fw_name = mach->fw_filename;
+	pdata = skl->mach->pdata;
+
+	if (mach->pdata)
+		skl->use_tplg_pcm = pdata->use_tplg_pcm;
+
+	return 0;
+}
+
+static int skl_machine_device_register(struct skl *skl)
+{
+	struct hdac_bus *bus = ebus_to_hbus(&skl->ebus);
+	struct snd_soc_acpi_mach *mach = skl->mach;
+	struct platform_device *pdev;
+	int ret;
 
 	pdev = platform_device_alloc(mach->drv_name, -1);
 	if (pdev == NULL) {
@@ -462,11 +495,8 @@ static int skl_machine_device_register(struct skl *skl, void *driver_data)
 		return -EIO;
 	}
 
-	if (mach->pdata) {
-		skl->use_tplg_pcm =
-			((struct skl_machine_pdata *)mach->pdata)->use_tplg_pcm;
+	if (mach->pdata)
 		dev_set_drvdata(&pdev->dev, mach->pdata);
-	}
 
 	skl->i2s_dev = pdev;
 
@@ -509,6 +539,74 @@ static void skl_dmic_device_unregister(struct skl *skl)
 		platform_device_unregister(skl->dmic_dev);
 }
 
+static struct skl_clk_parent_src skl_clk_src[] = {
+	{ .clk_id = SKL_XTAL, .name = "xtal" },
+	{ .clk_id = SKL_CARDINAL, .name = "cardinal", .rate = 24576000 },
+	{ .clk_id = SKL_PLL, .name = "pll", .rate = 96000000 },
+};
+
+struct skl_clk_parent_src *skl_get_parent_clk(u8 clk_id)
+{
+	unsigned int i;
+
+	for (i = 0; i < ARRAY_SIZE(skl_clk_src); i++) {
+		if (skl_clk_src[i].clk_id == clk_id)
+			return &skl_clk_src[i];
+	}
+
+	return NULL;
+}
+
+static void init_skl_xtal_rate(int pci_id)
+{
+	switch (pci_id) {
+	case 0x9d70:
+	case 0x9d71:
+		skl_clk_src[0].rate = 24000000;
+		return;
+
+	default:
+		skl_clk_src[0].rate = 19200000;
+		return;
+	}
+}
+
+static int skl_clock_device_register(struct skl *skl)
+{
+	struct platform_device_info pdevinfo = {NULL};
+	struct skl_clk_pdata *clk_pdata;
+
+	clk_pdata = devm_kzalloc(&skl->pci->dev, sizeof(*clk_pdata),
+							GFP_KERNEL);
+	if (!clk_pdata)
+		return -ENOMEM;
+
+	init_skl_xtal_rate(skl->pci->device);
+
+	clk_pdata->parent_clks = skl_clk_src;
+	clk_pdata->ssp_clks = skl_ssp_clks;
+	clk_pdata->num_clks = ARRAY_SIZE(skl_ssp_clks);
+
+	/* Query NHLT to fill the rates and parent */
+	skl_get_clks(skl, clk_pdata->ssp_clks);
+	clk_pdata->pvt_data = skl;
+
+	/* Register Platform device */
+	pdevinfo.parent = &skl->pci->dev;
+	pdevinfo.id = -1;
+	pdevinfo.name = "skl-ssp-clk";
+	pdevinfo.data = clk_pdata;
+	pdevinfo.size_data = sizeof(*clk_pdata);
+	skl->clk_dev = platform_device_register_full(&pdevinfo);
+	return PTR_ERR_OR_ZERO(skl->clk_dev);
+}
+
+static void skl_clock_device_unregister(struct skl *skl)
+{
+	if (skl->clk_dev)
+		platform_device_unregister(skl->clk_dev);
+}
+
 /*
  * Probe the given codec address
  */
@@ -615,18 +713,30 @@ static void skl_probe_work(struct work_struct *work)
 	/* create codec instances */
 	skl_codec_create(ebus);
 
+	/* register platform dai and controls */
+	err = skl_platform_register(bus->dev);
+	if (err < 0) {
+		dev_err(bus->dev, "platform register failed: %d\n", err);
+		return;
+	}
+
+	if (bus->ppcap) {
+		err = skl_machine_device_register(skl);
+		if (err < 0) {
+			dev_err(bus->dev, "machine register failed: %d\n", err);
+			goto out_err;
+		}
+	}
+
 	if (IS_ENABLED(CONFIG_SND_SOC_HDAC_HDMI)) {
 		err = snd_hdac_display_power(bus, false);
 		if (err < 0) {
 			dev_err(bus->dev, "Cannot turn off display power on i915\n");
+			skl_machine_device_unregister(skl);
 			return;
 		}
 	}
 
-	/* register platform dai and controls */
-	err = skl_platform_register(bus->dev);
-	if (err < 0)
-		return;
 	/*
 	 * we are done probing so decrement link counts
 	 */
@@ -791,18 +901,21 @@ static int skl_probe(struct pci_dev *pci,
 
 	/* check if dsp is there */
 	if (bus->ppcap) {
-		err = skl_machine_device_register(skl,
-				  (void *)pci_id->driver_data);
+		/* create device for dsp clk */
+		err = skl_clock_device_register(skl);
+		if (err < 0)
+			goto out_clk_free;
+
+		err = skl_find_machine(skl, (void *)pci_id->driver_data);
 		if (err < 0)
 			goto out_nhlt_free;
 
 		err = skl_init_dsp(skl);
 		if (err < 0) {
 			dev_dbg(bus->dev, "error failed to register dsp\n");
-			goto out_mach_free;
+			goto out_nhlt_free;
 		}
 		skl->skl_sst->enable_miscbdcge = skl_enable_miscbdcge;
-
 	}
 	if (bus->mlcap)
 		snd_hdac_ext_bus_get_ml_capabilities(ebus);
@@ -820,8 +933,8 @@ static int skl_probe(struct pci_dev *pci,
 
 out_dsp_free:
 	skl_free_dsp(skl);
-out_mach_free:
-	skl_machine_device_unregister(skl);
+out_clk_free:
+	skl_clock_device_unregister(skl);
 out_nhlt_free:
 	skl_nhlt_free(skl->nhlt);
 out_free:
@@ -872,6 +985,7 @@ static void skl_remove(struct pci_dev *pci)
 	skl_free_dsp(skl);
 	skl_machine_device_unregister(skl);
 	skl_dmic_device_unregister(skl);
+	skl_clock_device_unregister(skl);
 	skl_nhlt_remove_sysfs(skl);
 	skl_nhlt_free(skl->nhlt);
 	skl_free(ebus);
diff --git a/sound/soc/intel/skylake/skl.h b/sound/soc/intel/skylake/skl.h
index e00cde8..f411579 100644
--- a/sound/soc/intel/skylake/skl.h
+++ b/sound/soc/intel/skylake/skl.h
@@ -25,9 +25,12 @@
 #include <sound/hdaudio_ext.h>
 #include <sound/soc.h>
 #include "skl-nhlt.h"
+#include "skl-ssp-clk.h"
 
 #define SKL_SUSPEND_DELAY 2000
 
+#define SKL_MAX_ASTATE_CFG		3
+
 #define AZX_PCIREG_PGCTL		0x44
 #define AZX_PGCTL_LSRMD_MASK		(1 << 4)
 #define AZX_PCIREG_CGCTL		0x48
@@ -45,6 +48,20 @@ struct skl_dsp_resource {
 
 struct skl_debug;
 
+struct skl_astate_param {
+	u32 kcps;
+	u32 clk_src;
+};
+
+struct skl_astate_config {
+	u32 count;
+	struct skl_astate_param astate_table[0];
+};
+
+struct skl_fw_config {
+	struct skl_astate_config *astate_cfg;
+};
+
 struct skl {
 	struct hdac_ext_bus ebus;
 	struct pci_dev *pci;
@@ -52,6 +69,7 @@ struct skl {
 	unsigned int init_done:1; /* delayed init status */
 	struct platform_device *dmic_dev;
 	struct platform_device *i2s_dev;
+	struct platform_device *clk_dev;
 	struct snd_soc_platform *platform;
 	struct snd_soc_dai_driver *dais;
 
@@ -75,6 +93,8 @@ struct skl {
 	u8 nr_modules;
 	struct skl_module **modules;
 	bool use_tplg_pcm;
+	struct skl_fw_config cfg;
+	struct snd_soc_acpi_mach *mach;
 };
 
 #define skl_to_ebus(s)	(&(s)->ebus)
@@ -125,6 +145,8 @@ const struct skl_dsp_ops *skl_get_dsp_ops(int pci_id);
 void skl_update_d0i3c(struct device *dev, bool enable);
 int skl_nhlt_create_sysfs(struct skl *skl);
 void skl_nhlt_remove_sysfs(struct skl *skl);
+void skl_get_clks(struct skl *skl, struct skl_ssp_clk *ssp_clks);
+struct skl_clk_parent_src *skl_get_parent_clk(u8 clk_id);
 
 struct skl_module_cfg;
 
diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
index affa7fb..949fc3a 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.c
@@ -14,451 +14,285 @@
  * GNU General Public License for more details.
  */
 
-#include <sound/soc.h>
-#include <linux/regmap.h>
-#include <linux/pm_runtime.h>
-
 #include "mt2701-afe-common.h"
 #include "mt2701-afe-clock-ctrl.h"
 
-static const char *aud_clks[MT2701_CLOCK_NUM] = {
-	[MT2701_AUD_INFRA_SYS_AUDIO] = "infra_sys_audio_clk",
-	[MT2701_AUD_AUD_MUX1_SEL] = "top_audio_mux1_sel",
-	[MT2701_AUD_AUD_MUX2_SEL] = "top_audio_mux2_sel",
-	[MT2701_AUD_AUD_MUX1_DIV] = "top_audio_mux1_div",
-	[MT2701_AUD_AUD_MUX2_DIV] = "top_audio_mux2_div",
-	[MT2701_AUD_AUD_48K_TIMING] = "top_audio_48k_timing",
-	[MT2701_AUD_AUD_44K_TIMING] = "top_audio_44k_timing",
-	[MT2701_AUD_AUDPLL_MUX_SEL] = "top_audpll_mux_sel",
-	[MT2701_AUD_APLL_SEL] = "top_apll_sel",
-	[MT2701_AUD_AUD1PLL_98M] = "top_aud1_pll_98M",
-	[MT2701_AUD_AUD2PLL_90M] = "top_aud2_pll_90M",
-	[MT2701_AUD_HADDS2PLL_98M] = "top_hadds2_pll_98M",
-	[MT2701_AUD_HADDS2PLL_294M] = "top_hadds2_pll_294M",
-	[MT2701_AUD_AUDPLL] = "top_audpll",
-	[MT2701_AUD_AUDPLL_D4] = "top_audpll_d4",
-	[MT2701_AUD_AUDPLL_D8] = "top_audpll_d8",
-	[MT2701_AUD_AUDPLL_D16] = "top_audpll_d16",
-	[MT2701_AUD_AUDPLL_D24] = "top_audpll_d24",
-	[MT2701_AUD_AUDINTBUS] = "top_audintbus_sel",
-	[MT2701_AUD_CLK_26M] = "clk_26m",
-	[MT2701_AUD_SYSPLL1_D4] = "top_syspll1_d4",
-	[MT2701_AUD_AUD_K1_SRC_SEL] = "top_aud_k1_src_sel",
-	[MT2701_AUD_AUD_K2_SRC_SEL] = "top_aud_k2_src_sel",
-	[MT2701_AUD_AUD_K3_SRC_SEL] = "top_aud_k3_src_sel",
-	[MT2701_AUD_AUD_K4_SRC_SEL] = "top_aud_k4_src_sel",
-	[MT2701_AUD_AUD_K5_SRC_SEL] = "top_aud_k5_src_sel",
-	[MT2701_AUD_AUD_K6_SRC_SEL] = "top_aud_k6_src_sel",
-	[MT2701_AUD_AUD_K1_SRC_DIV] = "top_aud_k1_src_div",
-	[MT2701_AUD_AUD_K2_SRC_DIV] = "top_aud_k2_src_div",
-	[MT2701_AUD_AUD_K3_SRC_DIV] = "top_aud_k3_src_div",
-	[MT2701_AUD_AUD_K4_SRC_DIV] = "top_aud_k4_src_div",
-	[MT2701_AUD_AUD_K5_SRC_DIV] = "top_aud_k5_src_div",
-	[MT2701_AUD_AUD_K6_SRC_DIV] = "top_aud_k6_src_div",
-	[MT2701_AUD_AUD_I2S1_MCLK] = "top_aud_i2s1_mclk",
-	[MT2701_AUD_AUD_I2S2_MCLK] = "top_aud_i2s2_mclk",
-	[MT2701_AUD_AUD_I2S3_MCLK] = "top_aud_i2s3_mclk",
-	[MT2701_AUD_AUD_I2S4_MCLK] = "top_aud_i2s4_mclk",
-	[MT2701_AUD_AUD_I2S5_MCLK] = "top_aud_i2s5_mclk",
-	[MT2701_AUD_AUD_I2S6_MCLK] = "top_aud_i2s6_mclk",
-	[MT2701_AUD_ASM_M_SEL] = "top_asm_m_sel",
-	[MT2701_AUD_ASM_H_SEL] = "top_asm_h_sel",
-	[MT2701_AUD_UNIVPLL2_D4] = "top_univpll2_d4",
-	[MT2701_AUD_UNIVPLL2_D2] = "top_univpll2_d2",
-	[MT2701_AUD_SYSPLL_D5] = "top_syspll_d5",
+static const char *const base_clks[] = {
+	[MT2701_INFRA_SYS_AUDIO] = "infra_sys_audio_clk",
+	[MT2701_TOP_AUD_MCLK_SRC0] = "top_audio_mux1_sel",
+	[MT2701_TOP_AUD_MCLK_SRC1] = "top_audio_mux2_sel",
+	[MT2701_TOP_AUD_A1SYS] = "top_audio_a1sys_hp",
+	[MT2701_TOP_AUD_A2SYS] = "top_audio_a2sys_hp",
+	[MT2701_AUDSYS_AFE] = "audio_afe_pd",
+	[MT2701_AUDSYS_AFE_CONN] = "audio_afe_conn_pd",
+	[MT2701_AUDSYS_A1SYS] = "audio_a1sys_pd",
+	[MT2701_AUDSYS_A2SYS] = "audio_a2sys_pd",
 };
 
 int mt2701_init_clock(struct mtk_base_afe *afe)
 {
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
-	int i = 0;
+	int i;
 
-	for (i = 0; i < MT2701_CLOCK_NUM; i++) {
-		afe_priv->clocks[i] = devm_clk_get(afe->dev, aud_clks[i]);
-		if (IS_ERR(afe_priv->clocks[i])) {
-			dev_warn(afe->dev, "%s devm_clk_get %s fail\n",
-				 __func__, aud_clks[i]);
-			return PTR_ERR(aud_clks[i]);
+	for (i = 0; i < MT2701_BASE_CLK_NUM; i++) {
+		afe_priv->base_ck[i] = devm_clk_get(afe->dev, base_clks[i]);
+		if (IS_ERR(afe_priv->base_ck[i])) {
+			dev_err(afe->dev, "failed to get %s\n", base_clks[i]);
+			return PTR_ERR(afe_priv->base_ck[i]);
 		}
 	}
 
-	return 0;
-}
+	/* Get I2S related clocks */
+	for (i = 0; i < MT2701_I2S_NUM; i++) {
+		struct mt2701_i2s_path *i2s_path = &afe_priv->i2s_path[i];
+		char name[13];
 
-int mt2701_afe_enable_clock(struct mtk_base_afe *afe)
-{
-	int ret = 0;
+		snprintf(name, sizeof(name), "i2s%d_src_sel", i);
+		i2s_path->sel_ck = devm_clk_get(afe->dev, name);
+		if (IS_ERR(i2s_path->sel_ck)) {
+			dev_err(afe->dev, "failed to get %s\n", name);
+			return PTR_ERR(i2s_path->sel_ck);
+		}
 
-	ret = mt2701_turn_on_a1sys_clock(afe);
-	if (ret) {
-		dev_err(afe->dev, "%s turn_on_a1sys_clock fail %d\n",
-			__func__, ret);
-		return ret;
+		snprintf(name, sizeof(name), "i2s%d_src_div", i);
+		i2s_path->div_ck = devm_clk_get(afe->dev, name);
+		if (IS_ERR(i2s_path->div_ck)) {
+			dev_err(afe->dev, "failed to get %s\n", name);
+			return PTR_ERR(i2s_path->div_ck);
+		}
+
+		snprintf(name, sizeof(name), "i2s%d_mclk_en", i);
+		i2s_path->mclk_ck = devm_clk_get(afe->dev, name);
+		if (IS_ERR(i2s_path->mclk_ck)) {
+			dev_err(afe->dev, "failed to get %s\n", name);
+			return PTR_ERR(i2s_path->mclk_ck);
+		}
+
+		snprintf(name, sizeof(name), "i2so%d_hop_ck", i);
+		i2s_path->hop_ck[I2S_OUT] = devm_clk_get(afe->dev, name);
+		if (IS_ERR(i2s_path->hop_ck[I2S_OUT])) {
+			dev_err(afe->dev, "failed to get %s\n", name);
+			return PTR_ERR(i2s_path->hop_ck[I2S_OUT]);
+		}
+
+		snprintf(name, sizeof(name), "i2si%d_hop_ck", i);
+		i2s_path->hop_ck[I2S_IN] = devm_clk_get(afe->dev, name);
+		if (IS_ERR(i2s_path->hop_ck[I2S_IN])) {
+			dev_err(afe->dev, "failed to get %s\n", name);
+			return PTR_ERR(i2s_path->hop_ck[I2S_IN]);
+		}
+
+		snprintf(name, sizeof(name), "asrc%d_out_ck", i);
+		i2s_path->asrco_ck = devm_clk_get(afe->dev, name);
+		if (IS_ERR(i2s_path->asrco_ck)) {
+			dev_err(afe->dev, "failed to get %s\n", name);
+			return PTR_ERR(i2s_path->asrco_ck);
+		}
 	}
 
-	ret = mt2701_turn_on_a2sys_clock(afe);
-	if (ret) {
-		dev_err(afe->dev, "%s turn_on_a2sys_clock fail %d\n",
-			__func__, ret);
-		mt2701_turn_off_a1sys_clock(afe);
-		return ret;
-	}
+	/* Some platforms may support BT path */
+	afe_priv->mrgif_ck = devm_clk_get(afe->dev, "audio_mrgif_pd");
+	if (IS_ERR(afe_priv->mrgif_ck)) {
+		if (PTR_ERR(afe_priv->mrgif_ck) == -EPROBE_DEFER)
+			return -EPROBE_DEFER;
 
-	ret = mt2701_turn_on_afe_clock(afe);
-	if (ret) {
-		dev_err(afe->dev, "%s turn_on_afe_clock fail %d\n",
-			__func__, ret);
-		mt2701_turn_off_a1sys_clock(afe);
-		mt2701_turn_off_a2sys_clock(afe);
-		return ret;
+		afe_priv->mrgif_ck = NULL;
 	}
 
-	regmap_update_bits(afe->regmap, ASYS_TOP_CON,
-			   AUDIO_TOP_CON0_A1SYS_A2SYS_ON,
-			   AUDIO_TOP_CON0_A1SYS_A2SYS_ON);
-	regmap_update_bits(afe->regmap, AFE_DAC_CON0,
-			   AFE_DAC_CON0_AFE_ON,
-			   AFE_DAC_CON0_AFE_ON);
-	regmap_write(afe->regmap, PWR2_TOP_CON,
-		     PWR2_TOP_CON_INIT_VAL);
-	regmap_write(afe->regmap, PWR1_ASM_CON1,
-		     PWR1_ASM_CON1_INIT_VAL);
-	regmap_write(afe->regmap, PWR2_ASM_CON1,
-		     PWR2_ASM_CON1_INIT_VAL);
-
 	return 0;
 }
 
-void mt2701_afe_disable_clock(struct mtk_base_afe *afe)
-{
-	mt2701_turn_off_afe_clock(afe);
-	mt2701_turn_off_a1sys_clock(afe);
-	mt2701_turn_off_a2sys_clock(afe);
-	regmap_update_bits(afe->regmap, ASYS_TOP_CON,
-			   AUDIO_TOP_CON0_A1SYS_A2SYS_ON, 0);
-	regmap_update_bits(afe->regmap, AFE_DAC_CON0,
-			   AFE_DAC_CON0_AFE_ON, 0);
-}
-
-int mt2701_turn_on_a1sys_clock(struct mtk_base_afe *afe)
+int mt2701_afe_enable_i2s(struct mtk_base_afe *afe, int id, int dir)
 {
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
-	int ret = 0;
+	struct mt2701_i2s_path *i2s_path = &afe_priv->i2s_path[id];
+	int ret;
 
-	/* Set Mux */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_AUD_MUX1_SEL]);
+	ret = clk_prepare_enable(i2s_path->asrco_ck);
 	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_AUD_MUX1_SEL], ret);
-		goto A1SYS_CLK_AUD_MUX1_SEL_ERR;
+		dev_err(afe->dev, "failed to enable ASRC clock %d\n", ret);
+		return ret;
 	}
 
-	ret = clk_set_parent(afe_priv->clocks[MT2701_AUD_AUD_MUX1_SEL],
-			     afe_priv->clocks[MT2701_AUD_AUD1PLL_98M]);
+	ret = clk_prepare_enable(i2s_path->hop_ck[dir]);
 	if (ret) {
-		dev_err(afe->dev, "%s clk_set_parent %s-%s fail %d\n", __func__,
-			aud_clks[MT2701_AUD_AUD_MUX1_SEL],
-			aud_clks[MT2701_AUD_AUD1PLL_98M], ret);
-		goto A1SYS_CLK_AUD_MUX1_SEL_ERR;
-	}
-
-	/* Set Divider */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_AUD_MUX1_DIV]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__,
-			aud_clks[MT2701_AUD_AUD_MUX1_DIV],
-			ret);
-		goto A1SYS_CLK_AUD_MUX1_DIV_ERR;
-	}
-
-	ret = clk_set_rate(afe_priv->clocks[MT2701_AUD_AUD_MUX1_DIV],
-			   MT2701_AUD_AUD_MUX1_DIV_RATE);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_set_parent %s-%d fail %d\n", __func__,
-			aud_clks[MT2701_AUD_AUD_MUX1_DIV],
-			MT2701_AUD_AUD_MUX1_DIV_RATE, ret);
-		goto A1SYS_CLK_AUD_MUX1_DIV_ERR;
-	}
-
-	/* Enable clock gate */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_AUD_48K_TIMING]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_AUD_48K_TIMING], ret);
-		goto A1SYS_CLK_AUD_48K_ERR;
-	}
-
-	/* Enable infra audio */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_INFRA_SYS_AUDIO], ret);
-		goto A1SYS_CLK_INFRA_ERR;
+		dev_err(afe->dev, "failed to enable I2S clock %d\n", ret);
+		goto err_hop_ck;
 	}
 
 	return 0;
 
-A1SYS_CLK_INFRA_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
-A1SYS_CLK_AUD_48K_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_48K_TIMING]);
-A1SYS_CLK_AUD_MUX1_DIV_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_MUX1_DIV]);
-A1SYS_CLK_AUD_MUX1_SEL_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_MUX1_SEL]);
+err_hop_ck:
+	clk_disable_unprepare(i2s_path->asrco_ck);
 
 	return ret;
 }
 
-void mt2701_turn_off_a1sys_clock(struct mtk_base_afe *afe)
+void mt2701_afe_disable_i2s(struct mtk_base_afe *afe, int id, int dir)
+{
+	struct mt2701_afe_private *afe_priv = afe->platform_priv;
+	struct mt2701_i2s_path *i2s_path = &afe_priv->i2s_path[id];
+
+	clk_disable_unprepare(i2s_path->hop_ck[dir]);
+	clk_disable_unprepare(i2s_path->asrco_ck);
+}
+
+int mt2701_afe_enable_mclk(struct mtk_base_afe *afe, int id)
+{
+	struct mt2701_afe_private *afe_priv = afe->platform_priv;
+	struct mt2701_i2s_path *i2s_path = &afe_priv->i2s_path[id];
+
+	return clk_prepare_enable(i2s_path->mclk_ck);
+}
+
+void mt2701_afe_disable_mclk(struct mtk_base_afe *afe, int id)
+{
+	struct mt2701_afe_private *afe_priv = afe->platform_priv;
+	struct mt2701_i2s_path *i2s_path = &afe_priv->i2s_path[id];
+
+	clk_disable_unprepare(i2s_path->mclk_ck);
+}
+
+int mt2701_enable_btmrg_clk(struct mtk_base_afe *afe)
 {
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
 
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_48K_TIMING]);
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_MUX1_DIV]);
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_MUX1_SEL]);
+	return clk_prepare_enable(afe_priv->mrgif_ck);
 }
 
-int mt2701_turn_on_a2sys_clock(struct mtk_base_afe *afe)
-{
-	struct mt2701_afe_private *afe_priv = afe->platform_priv;
-	int ret = 0;
-
-	/* Set Mux */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_AUD_MUX2_SEL]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_AUD_MUX2_SEL], ret);
-		goto A2SYS_CLK_AUD_MUX2_SEL_ERR;
-	}
-
-	ret = clk_set_parent(afe_priv->clocks[MT2701_AUD_AUD_MUX2_SEL],
-			     afe_priv->clocks[MT2701_AUD_AUD2PLL_90M]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_set_parent %s-%s fail %d\n", __func__,
-			aud_clks[MT2701_AUD_AUD_MUX2_SEL],
-			aud_clks[MT2701_AUD_AUD2PLL_90M], ret);
-		goto A2SYS_CLK_AUD_MUX2_SEL_ERR;
-	}
-
-	/* Set Divider */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_AUD_MUX2_DIV]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_AUD_MUX2_DIV], ret);
-		goto A2SYS_CLK_AUD_MUX2_DIV_ERR;
-	}
-
-	ret = clk_set_rate(afe_priv->clocks[MT2701_AUD_AUD_MUX2_DIV],
-			   MT2701_AUD_AUD_MUX2_DIV_RATE);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_set_parent %s-%d fail %d\n", __func__,
-			aud_clks[MT2701_AUD_AUD_MUX2_DIV],
-			MT2701_AUD_AUD_MUX2_DIV_RATE, ret);
-		goto A2SYS_CLK_AUD_MUX2_DIV_ERR;
-	}
-
-	/* Enable clock gate */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_AUD_44K_TIMING]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_AUD_44K_TIMING], ret);
-		goto A2SYS_CLK_AUD_44K_ERR;
-	}
-
-	/* Enable infra audio */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_INFRA_SYS_AUDIO], ret);
-		goto A2SYS_CLK_INFRA_ERR;
-	}
-
-	return 0;
-
-A2SYS_CLK_INFRA_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
-A2SYS_CLK_AUD_44K_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_44K_TIMING]);
-A2SYS_CLK_AUD_MUX2_DIV_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_MUX2_DIV]);
-A2SYS_CLK_AUD_MUX2_SEL_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_MUX2_SEL]);
-
-	return ret;
-}
-
-void mt2701_turn_off_a2sys_clock(struct mtk_base_afe *afe)
+void mt2701_disable_btmrg_clk(struct mtk_base_afe *afe)
 {
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
 
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_44K_TIMING]);
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_MUX2_DIV]);
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUD_MUX2_SEL]);
+	clk_disable_unprepare(afe_priv->mrgif_ck);
 }
 
-int mt2701_turn_on_afe_clock(struct mtk_base_afe *afe)
+static int mt2701_afe_enable_audsys(struct mtk_base_afe *afe)
 {
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
 	int ret;
 
-	/* enable INFRA_SYS */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_INFRA_SYS_AUDIO], ret);
-		goto AFE_AUD_INFRA_ERR;
-	}
+	/* Enable infra clock gate */
+	ret = clk_prepare_enable(afe_priv->base_ck[MT2701_INFRA_SYS_AUDIO]);
+	if (ret)
+		return ret;
 
-	/* Set MT2701_AUD_AUDINTBUS to MT2701_AUD_SYSPLL1_D4 */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_AUDINTBUS]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_AUDINTBUS], ret);
-		goto AFE_AUD_AUDINTBUS_ERR;
-	}
+	/* Enable top a1sys clock gate */
+	ret = clk_prepare_enable(afe_priv->base_ck[MT2701_TOP_AUD_A1SYS]);
+	if (ret)
+		goto err_a1sys;
 
-	ret = clk_set_parent(afe_priv->clocks[MT2701_AUD_AUDINTBUS],
-			     afe_priv->clocks[MT2701_AUD_SYSPLL1_D4]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_set_parent %s-%s fail %d\n", __func__,
-			aud_clks[MT2701_AUD_AUDINTBUS],
-			aud_clks[MT2701_AUD_SYSPLL1_D4], ret);
-		goto AFE_AUD_AUDINTBUS_ERR;
-	}
+	/* Enable top a2sys clock gate */
+	ret = clk_prepare_enable(afe_priv->base_ck[MT2701_TOP_AUD_A2SYS]);
+	if (ret)
+		goto err_a2sys;
 
-	/* Set MT2701_AUD_ASM_H_SEL to MT2701_AUD_UNIVPLL2_D2 */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_ASM_H_SEL]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_ASM_H_SEL], ret);
-		goto AFE_AUD_ASM_H_ERR;
-	}
+	/* Internal clock gates */
+	ret = clk_prepare_enable(afe_priv->base_ck[MT2701_AUDSYS_AFE]);
+	if (ret)
+		goto err_afe;
 
-	ret = clk_set_parent(afe_priv->clocks[MT2701_AUD_ASM_H_SEL],
-			     afe_priv->clocks[MT2701_AUD_UNIVPLL2_D2]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_set_parent %s-%s fail %d\n", __func__,
-			aud_clks[MT2701_AUD_ASM_H_SEL],
-			aud_clks[MT2701_AUD_UNIVPLL2_D2], ret);
-		goto AFE_AUD_ASM_H_ERR;
-	}
+	ret = clk_prepare_enable(afe_priv->base_ck[MT2701_AUDSYS_A1SYS]);
+	if (ret)
+		goto err_audio_a1sys;
 
-	/* Set MT2701_AUD_ASM_M_SEL to MT2701_AUD_UNIVPLL2_D4 */
-	ret = clk_prepare_enable(afe_priv->clocks[MT2701_AUD_ASM_M_SEL]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[MT2701_AUD_ASM_M_SEL], ret);
-		goto AFE_AUD_ASM_M_ERR;
-	}
+	ret = clk_prepare_enable(afe_priv->base_ck[MT2701_AUDSYS_A2SYS]);
+	if (ret)
+		goto err_audio_a2sys;
 
-	ret = clk_set_parent(afe_priv->clocks[MT2701_AUD_ASM_M_SEL],
-			     afe_priv->clocks[MT2701_AUD_UNIVPLL2_D4]);
-	if (ret) {
-		dev_err(afe->dev, "%s clk_set_parent %s-%s fail %d\n", __func__,
-			aud_clks[MT2701_AUD_ASM_M_SEL],
-			aud_clks[MT2701_AUD_UNIVPLL2_D4], ret);
-		goto AFE_AUD_ASM_M_ERR;
-	}
-
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON0,
-			   AUDIO_TOP_CON0_PDN_AFE, 0);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON0,
-			   AUDIO_TOP_CON0_PDN_APLL_CK, 0);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   AUDIO_TOP_CON4_PDN_A1SYS, 0);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   AUDIO_TOP_CON4_PDN_A2SYS, 0);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   AUDIO_TOP_CON4_PDN_AFE_CONN, 0);
+	ret = clk_prepare_enable(afe_priv->base_ck[MT2701_AUDSYS_AFE_CONN]);
+	if (ret)
+		goto err_afe_conn;
 
 	return 0;
 
-AFE_AUD_ASM_M_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_ASM_M_SEL]);
-AFE_AUD_ASM_H_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_ASM_H_SEL]);
-AFE_AUD_AUDINTBUS_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUDINTBUS]);
-AFE_AUD_INFRA_ERR:
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
+err_afe_conn:
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_A2SYS]);
+err_audio_a2sys:
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_A1SYS]);
+err_audio_a1sys:
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_AFE]);
+err_afe:
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_TOP_AUD_A2SYS]);
+err_a2sys:
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_TOP_AUD_A1SYS]);
+err_a1sys:
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_INFRA_SYS_AUDIO]);
 
 	return ret;
 }
 
-void mt2701_turn_off_afe_clock(struct mtk_base_afe *afe)
+static void mt2701_afe_disable_audsys(struct mtk_base_afe *afe)
 {
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
 
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_INFRA_SYS_AUDIO]);
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_AFE_CONN]);
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_A2SYS]);
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_A1SYS]);
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_AUDSYS_AFE]);
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_TOP_AUD_A1SYS]);
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_TOP_AUD_A2SYS]);
+	clk_disable_unprepare(afe_priv->base_ck[MT2701_INFRA_SYS_AUDIO]);
+}
 
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_AUDINTBUS]);
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_ASM_H_SEL]);
-	clk_disable_unprepare(afe_priv->clocks[MT2701_AUD_ASM_M_SEL]);
+int mt2701_afe_enable_clock(struct mtk_base_afe *afe)
+{
+	int ret;
 
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON0,
-			   AUDIO_TOP_CON0_PDN_AFE, AUDIO_TOP_CON0_PDN_AFE);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON0,
-			   AUDIO_TOP_CON0_PDN_APLL_CK,
-			   AUDIO_TOP_CON0_PDN_APLL_CK);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   AUDIO_TOP_CON4_PDN_A1SYS,
-			   AUDIO_TOP_CON4_PDN_A1SYS);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   AUDIO_TOP_CON4_PDN_A2SYS,
-			   AUDIO_TOP_CON4_PDN_A2SYS);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   AUDIO_TOP_CON4_PDN_AFE_CONN,
-			   AUDIO_TOP_CON4_PDN_AFE_CONN);
+	/* Enable audio system */
+	ret = mt2701_afe_enable_audsys(afe);
+	if (ret) {
+		dev_err(afe->dev, "failed to enable audio system %d\n", ret);
+		return ret;
+	}
+
+	regmap_update_bits(afe->regmap, ASYS_TOP_CON,
+			   ASYS_TOP_CON_ASYS_TIMING_ON,
+			   ASYS_TOP_CON_ASYS_TIMING_ON);
+	regmap_update_bits(afe->regmap, AFE_DAC_CON0,
+			   AFE_DAC_CON0_AFE_ON,
+			   AFE_DAC_CON0_AFE_ON);
+
+	/* Configure ASRC */
+	regmap_write(afe->regmap, PWR1_ASM_CON1, PWR1_ASM_CON1_INIT_VAL);
+	regmap_write(afe->regmap, PWR2_ASM_CON1, PWR2_ASM_CON1_INIT_VAL);
+
+	return 0;
+}
+
+int mt2701_afe_disable_clock(struct mtk_base_afe *afe)
+{
+	regmap_update_bits(afe->regmap, ASYS_TOP_CON,
+			   ASYS_TOP_CON_ASYS_TIMING_ON, 0);
+	regmap_update_bits(afe->regmap, AFE_DAC_CON0,
+			   AFE_DAC_CON0_AFE_ON, 0);
+
+	mt2701_afe_disable_audsys(afe);
+
+	return 0;
 }
 
 void mt2701_mclk_configuration(struct mtk_base_afe *afe, int id, int domain,
 			       int mclk)
 {
-	struct mt2701_afe_private *afe_priv = afe->platform_priv;
+	struct mt2701_afe_private *priv = afe->platform_priv;
+	struct mt2701_i2s_path *i2s_path = &priv->i2s_path[id];
 	int ret;
-	int aud_src_div_id = MT2701_AUD_AUD_K1_SRC_DIV + id;
-	int aud_src_clk_id = MT2701_AUD_AUD_K1_SRC_SEL + id;
 
-	/* Set MCLK Kx_SRC_SEL(domain) */
-	ret = clk_prepare_enable(afe_priv->clocks[aud_src_clk_id]);
+	/* Set mclk source */
+	if (domain == 0)
+		ret = clk_set_parent(i2s_path->sel_ck,
+				     priv->base_ck[MT2701_TOP_AUD_MCLK_SRC0]);
+	else
+		ret = clk_set_parent(i2s_path->sel_ck,
+				     priv->base_ck[MT2701_TOP_AUD_MCLK_SRC1]);
+
 	if (ret)
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[aud_src_clk_id], ret);
+		dev_err(afe->dev, "failed to set domain%d mclk source %d\n",
+			domain, ret);
 
-	if (domain == 0) {
-		ret = clk_set_parent(afe_priv->clocks[aud_src_clk_id],
-				     afe_priv->clocks[MT2701_AUD_AUD_MUX1_SEL]);
-		if (ret)
-			dev_err(afe->dev, "%s clk_set_parent %s-%s fail %d\n",
-				__func__, aud_clks[aud_src_clk_id],
-				aud_clks[MT2701_AUD_AUD_MUX1_SEL], ret);
-	} else {
-		ret = clk_set_parent(afe_priv->clocks[aud_src_clk_id],
-				     afe_priv->clocks[MT2701_AUD_AUD_MUX2_SEL]);
-		if (ret)
-			dev_err(afe->dev, "%s clk_set_parent %s-%s fail %d\n",
-				__func__, aud_clks[aud_src_clk_id],
-				aud_clks[MT2701_AUD_AUD_MUX2_SEL], ret);
-	}
-	clk_disable_unprepare(afe_priv->clocks[aud_src_clk_id]);
-
-	/* Set MCLK Kx_SRC_DIV(divider) */
-	ret = clk_prepare_enable(afe_priv->clocks[aud_src_div_id]);
+	/* Set mclk divider */
+	ret = clk_set_rate(i2s_path->div_ck, mclk);
 	if (ret)
-		dev_err(afe->dev, "%s clk_prepare_enable %s fail %d\n",
-			__func__, aud_clks[aud_src_div_id], ret);
-
-	ret = clk_set_rate(afe_priv->clocks[aud_src_div_id], mclk);
-	if (ret)
-		dev_err(afe->dev, "%s clk_set_rate %s-%d fail %d\n", __func__,
-			aud_clks[aud_src_div_id], mclk, ret);
-	clk_disable_unprepare(afe_priv->clocks[aud_src_div_id]);
+		dev_err(afe->dev, "failed to set mclk divider %d\n", ret);
 }
-
-MODULE_DESCRIPTION("MT2701 afe clock control");
-MODULE_AUTHOR("Garlic Tseng <garlic.tseng@mediatek.com>");
-MODULE_LICENSE("GPL v2");
diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.h b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.h
index 6497d57..15417d9 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.h
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-clock-ctrl.h
@@ -21,16 +21,15 @@ struct mtk_base_afe;
 
 int mt2701_init_clock(struct mtk_base_afe *afe);
 int mt2701_afe_enable_clock(struct mtk_base_afe *afe);
-void mt2701_afe_disable_clock(struct mtk_base_afe *afe);
+int mt2701_afe_disable_clock(struct mtk_base_afe *afe);
 
-int mt2701_turn_on_a1sys_clock(struct mtk_base_afe *afe);
-void mt2701_turn_off_a1sys_clock(struct mtk_base_afe *afe);
+int mt2701_afe_enable_i2s(struct mtk_base_afe *afe, int id, int dir);
+void mt2701_afe_disable_i2s(struct mtk_base_afe *afe, int id, int dir);
+int mt2701_afe_enable_mclk(struct mtk_base_afe *afe, int id);
+void mt2701_afe_disable_mclk(struct mtk_base_afe *afe, int id);
 
-int mt2701_turn_on_a2sys_clock(struct mtk_base_afe *afe);
-void mt2701_turn_off_a2sys_clock(struct mtk_base_afe *afe);
-
-int mt2701_turn_on_afe_clock(struct mtk_base_afe *afe);
-void mt2701_turn_off_afe_clock(struct mtk_base_afe *afe);
+int mt2701_enable_btmrg_clk(struct mtk_base_afe *afe);
+void mt2701_disable_btmrg_clk(struct mtk_base_afe *afe);
 
 void mt2701_mclk_configuration(struct mtk_base_afe *afe, int id, int domain,
 			       int mclk);
diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-common.h b/sound/soc/mediatek/mt2701/mt2701-afe-common.h
index c19430e9..ae8ddea 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-common.h
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-common.h
@@ -16,6 +16,7 @@
 
 #ifndef _MT_2701_AFE_COMMON_H_
 #define _MT_2701_AFE_COMMON_H_
+
 #include <sound/soc.h>
 #include <linux/clk.h>
 #include <linux/regmap.h>
@@ -25,16 +26,7 @@
 #define MT2701_STREAM_DIR_NUM (SNDRV_PCM_STREAM_LAST + 1)
 #define MT2701_PLL_DOMAIN_0_RATE	98304000
 #define MT2701_PLL_DOMAIN_1_RATE	90316800
-#define MT2701_AUD_AUD_MUX1_DIV_RATE (MT2701_PLL_DOMAIN_0_RATE / 2)
-#define MT2701_AUD_AUD_MUX2_DIV_RATE (MT2701_PLL_DOMAIN_1_RATE / 2)
-
-enum {
-	MT2701_I2S_1,
-	MT2701_I2S_2,
-	MT2701_I2S_3,
-	MT2701_I2S_4,
-	MT2701_I2S_NUM,
-};
+#define MT2701_I2S_NUM	4
 
 enum {
 	MT2701_MEMIF_DL1,
@@ -62,60 +54,23 @@ enum {
 };
 
 enum {
-	MT2701_IRQ_ASYS_START,
-	MT2701_IRQ_ASYS_IRQ1 = MT2701_IRQ_ASYS_START,
+	MT2701_IRQ_ASYS_IRQ1,
 	MT2701_IRQ_ASYS_IRQ2,
 	MT2701_IRQ_ASYS_IRQ3,
 	MT2701_IRQ_ASYS_END,
 };
 
-/* 2701 clock def */
-enum audio_system_clock_type {
-	MT2701_AUD_INFRA_SYS_AUDIO,
-	MT2701_AUD_AUD_MUX1_SEL,
-	MT2701_AUD_AUD_MUX2_SEL,
-	MT2701_AUD_AUD_MUX1_DIV,
-	MT2701_AUD_AUD_MUX2_DIV,
-	MT2701_AUD_AUD_48K_TIMING,
-	MT2701_AUD_AUD_44K_TIMING,
-	MT2701_AUD_AUDPLL_MUX_SEL,
-	MT2701_AUD_APLL_SEL,
-	MT2701_AUD_AUD1PLL_98M,
-	MT2701_AUD_AUD2PLL_90M,
-	MT2701_AUD_HADDS2PLL_98M,
-	MT2701_AUD_HADDS2PLL_294M,
-	MT2701_AUD_AUDPLL,
-	MT2701_AUD_AUDPLL_D4,
-	MT2701_AUD_AUDPLL_D8,
-	MT2701_AUD_AUDPLL_D16,
-	MT2701_AUD_AUDPLL_D24,
-	MT2701_AUD_AUDINTBUS,
-	MT2701_AUD_CLK_26M,
-	MT2701_AUD_SYSPLL1_D4,
-	MT2701_AUD_AUD_K1_SRC_SEL,
-	MT2701_AUD_AUD_K2_SRC_SEL,
-	MT2701_AUD_AUD_K3_SRC_SEL,
-	MT2701_AUD_AUD_K4_SRC_SEL,
-	MT2701_AUD_AUD_K5_SRC_SEL,
-	MT2701_AUD_AUD_K6_SRC_SEL,
-	MT2701_AUD_AUD_K1_SRC_DIV,
-	MT2701_AUD_AUD_K2_SRC_DIV,
-	MT2701_AUD_AUD_K3_SRC_DIV,
-	MT2701_AUD_AUD_K4_SRC_DIV,
-	MT2701_AUD_AUD_K5_SRC_DIV,
-	MT2701_AUD_AUD_K6_SRC_DIV,
-	MT2701_AUD_AUD_I2S1_MCLK,
-	MT2701_AUD_AUD_I2S2_MCLK,
-	MT2701_AUD_AUD_I2S3_MCLK,
-	MT2701_AUD_AUD_I2S4_MCLK,
-	MT2701_AUD_AUD_I2S5_MCLK,
-	MT2701_AUD_AUD_I2S6_MCLK,
-	MT2701_AUD_ASM_M_SEL,
-	MT2701_AUD_ASM_H_SEL,
-	MT2701_AUD_UNIVPLL2_D4,
-	MT2701_AUD_UNIVPLL2_D2,
-	MT2701_AUD_SYSPLL_D5,
-	MT2701_CLOCK_NUM
+enum audio_base_clock {
+	MT2701_INFRA_SYS_AUDIO,
+	MT2701_TOP_AUD_MCLK_SRC0,
+	MT2701_TOP_AUD_MCLK_SRC1,
+	MT2701_TOP_AUD_A1SYS,
+	MT2701_TOP_AUD_A2SYS,
+	MT2701_AUDSYS_AFE,
+	MT2701_AUDSYS_AFE_CONN,
+	MT2701_AUDSYS_A1SYS,
+	MT2701_AUDSYS_A2SYS,
+	MT2701_BASE_CLK_NUM,
 };
 
 static const unsigned int mt2701_afe_backup_list[] = {
@@ -139,12 +94,8 @@ static const unsigned int mt2701_afe_backup_list[] = {
 	AFE_MEMIF_PBUF_SIZE,
 };
 
-struct snd_pcm_substream;
-struct mtk_base_irq_data;
-
 struct mt2701_i2s_data {
 	int i2s_ctrl_reg;
-	int i2s_pwn_shift;
 	int i2s_asrc_fs_shift;
 	int i2s_asrc_fs_mask;
 };
@@ -160,12 +111,18 @@ struct mt2701_i2s_path {
 	int mclk_rate;
 	int on[I2S_DIR_NUM];
 	int occupied[I2S_DIR_NUM];
-	const struct mt2701_i2s_data *i2s_data[2];
+	const struct mt2701_i2s_data *i2s_data[I2S_DIR_NUM];
+	struct clk *hop_ck[I2S_DIR_NUM];
+	struct clk *sel_ck;
+	struct clk *div_ck;
+	struct clk *mclk_ck;
+	struct clk *asrco_ck;
 };
 
 struct mt2701_afe_private {
-	struct clk *clocks[MT2701_CLOCK_NUM];
 	struct mt2701_i2s_path i2s_path[MT2701_I2S_NUM];
+	struct clk *base_ck[MT2701_BASE_CLK_NUM];
+	struct clk *mrgif_ck;
 	bool mrg_enable[MT2701_STREAM_DIR_NUM];
 };
 
diff --git a/sound/soc/mediatek/mt2701/mt2701-afe-pcm.c b/sound/soc/mediatek/mt2701/mt2701-afe-pcm.c
index 8fda182..5bc4e00 100644
--- a/sound/soc/mediatek/mt2701/mt2701-afe-pcm.c
+++ b/sound/soc/mediatek/mt2701/mt2701-afe-pcm.c
@@ -17,19 +17,16 @@
 
 #include <linux/delay.h>
 #include <linux/module.h>
+#include <linux/mfd/syscon.h>
 #include <linux/of.h>
 #include <linux/of_address.h>
 #include <linux/pm_runtime.h>
-#include <sound/soc.h>
 
 #include "mt2701-afe-common.h"
-
 #include "mt2701-afe-clock-ctrl.h"
 #include "../common/mtk-afe-platform-driver.h"
 #include "../common/mtk-afe-fe-dai.h"
 
-#define AFE_IRQ_STATUS_BITS	0xff
-
 static const struct snd_pcm_hardware mt2701_afe_hardware = {
 	.info = SNDRV_PCM_INFO_MMAP | SNDRV_PCM_INFO_INTERLEAVED
 		| SNDRV_PCM_INFO_RESUME | SNDRV_PCM_INFO_MMAP_VALID,
@@ -97,40 +94,26 @@ static int mt2701_afe_i2s_startup(struct snd_pcm_substream *substream,
 {
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
 	struct mtk_base_afe *afe = snd_soc_platform_get_drvdata(rtd->platform);
-	struct mt2701_afe_private *afe_priv = afe->platform_priv;
 	int i2s_num = mt2701_dai_num_to_i2s(afe, dai->id);
-	int clk_num = MT2701_AUD_AUD_I2S1_MCLK + i2s_num;
-	int ret = 0;
 
 	if (i2s_num < 0)
 		return i2s_num;
 
-	/* enable mclk */
-	ret = clk_prepare_enable(afe_priv->clocks[clk_num]);
-	if (ret)
-		dev_err(afe->dev, "Failed to enable mclk for I2S: %d\n",
-			i2s_num);
-
-	return ret;
+	return mt2701_afe_enable_mclk(afe, i2s_num);
 }
 
 static int mt2701_afe_i2s_path_shutdown(struct snd_pcm_substream *substream,
 					struct snd_soc_dai *dai,
+					int i2s_num,
 					int dir_invert)
 {
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
 	struct mtk_base_afe *afe = snd_soc_platform_get_drvdata(rtd->platform);
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
-	int i2s_num = mt2701_dai_num_to_i2s(afe, dai->id);
-	struct mt2701_i2s_path *i2s_path;
+	struct mt2701_i2s_path *i2s_path = &afe_priv->i2s_path[i2s_num];
 	const struct mt2701_i2s_data *i2s_data;
 	int stream_dir = substream->stream;
 
-	if (i2s_num < 0)
-		return i2s_num;
-
-	i2s_path = &afe_priv->i2s_path[i2s_num];
-
 	if (dir_invert)	{
 		if (stream_dir == SNDRV_PCM_STREAM_PLAYBACK)
 			stream_dir = SNDRV_PCM_STREAM_CAPTURE;
@@ -151,9 +134,9 @@ static int mt2701_afe_i2s_path_shutdown(struct snd_pcm_substream *substream,
 	/* disable i2s */
 	regmap_update_bits(afe->regmap, i2s_data->i2s_ctrl_reg,
 			   ASYS_I2S_CON_I2S_EN, 0);
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   1 << i2s_data->i2s_pwn_shift,
-			   1 << i2s_data->i2s_pwn_shift);
+
+	mt2701_afe_disable_i2s(afe, i2s_num, stream_dir);
+
 	return 0;
 }
 
@@ -165,7 +148,6 @@ static void mt2701_afe_i2s_shutdown(struct snd_pcm_substream *substream,
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
 	int i2s_num = mt2701_dai_num_to_i2s(afe, dai->id);
 	struct mt2701_i2s_path *i2s_path;
-	int clk_num = MT2701_AUD_AUD_I2S1_MCLK + i2s_num;
 
 	if (i2s_num < 0)
 		return;
@@ -177,37 +159,32 @@ static void mt2701_afe_i2s_shutdown(struct snd_pcm_substream *substream,
 	else
 		goto I2S_UNSTART;
 
-	mt2701_afe_i2s_path_shutdown(substream, dai, 0);
+	mt2701_afe_i2s_path_shutdown(substream, dai, i2s_num, 0);
 
 	/* need to disable i2s-out path when disable i2s-in */
 	if (substream->stream == SNDRV_PCM_STREAM_CAPTURE)
-		mt2701_afe_i2s_path_shutdown(substream, dai, 1);
+		mt2701_afe_i2s_path_shutdown(substream, dai, i2s_num, 1);
 
 I2S_UNSTART:
 	/* disable mclk */
-	clk_disable_unprepare(afe_priv->clocks[clk_num]);
+	mt2701_afe_disable_mclk(afe, i2s_num);
 }
 
 static int mt2701_i2s_path_prepare_enable(struct snd_pcm_substream *substream,
 					  struct snd_soc_dai *dai,
+					  int i2s_num,
 					  int dir_invert)
 {
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
 	struct mtk_base_afe *afe = snd_soc_platform_get_drvdata(rtd->platform);
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
-	int i2s_num = mt2701_dai_num_to_i2s(afe, dai->id);
-	struct mt2701_i2s_path *i2s_path;
+	struct mt2701_i2s_path *i2s_path = &afe_priv->i2s_path[i2s_num];
 	const struct mt2701_i2s_data *i2s_data;
 	struct snd_pcm_runtime * const runtime = substream->runtime;
 	int reg, fs, w_len = 1; /* now we support bck 64bits only */
 	int stream_dir = substream->stream;
 	unsigned int mask = 0, val = 0;
 
-	if (i2s_num < 0)
-		return i2s_num;
-
-	i2s_path = &afe_priv->i2s_path[i2s_num];
-
 	if (dir_invert) {
 		if (stream_dir == SNDRV_PCM_STREAM_PLAYBACK)
 			stream_dir = SNDRV_PCM_STREAM_CAPTURE;
@@ -251,9 +228,7 @@ static int mt2701_i2s_path_prepare_enable(struct snd_pcm_substream *substream,
 			   fs << i2s_data->i2s_asrc_fs_shift);
 
 	/* enable i2s */
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   1 << i2s_data->i2s_pwn_shift,
-			   0 << i2s_data->i2s_pwn_shift);
+	mt2701_afe_enable_i2s(afe, i2s_num, stream_dir);
 
 	/* reset i2s hw status before enable */
 	regmap_update_bits(afe->regmap, i2s_data->i2s_ctrl_reg,
@@ -300,13 +275,13 @@ static int mt2701_afe_i2s_prepare(struct snd_pcm_substream *substream,
 	mt2701_mclk_configuration(afe, i2s_num, clk_domain, mclk_rate);
 
 	if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) {
-		mt2701_i2s_path_prepare_enable(substream, dai, 0);
+		mt2701_i2s_path_prepare_enable(substream, dai, i2s_num, 0);
 	} else {
 		/* need to enable i2s-out path when enable i2s-in */
 		/* prepare for another direction "out" */
-		mt2701_i2s_path_prepare_enable(substream, dai, 1);
+		mt2701_i2s_path_prepare_enable(substream, dai, i2s_num, 1);
 		/* prepare for "in" */
-		mt2701_i2s_path_prepare_enable(substream, dai, 0);
+		mt2701_i2s_path_prepare_enable(substream, dai, i2s_num, 0);
 	}
 
 	return 0;
@@ -339,9 +314,11 @@ static int mt2701_btmrg_startup(struct snd_pcm_substream *substream,
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
 	struct mtk_base_afe *afe = snd_soc_platform_get_drvdata(rtd->platform);
 	struct mt2701_afe_private *afe_priv = afe->platform_priv;
+	int ret;
 
-	regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-			   AUDIO_TOP_CON4_PDN_MRGIF, 0);
+	ret = mt2701_enable_btmrg_clk(afe);
+	if (ret)
+		return ret;
 
 	afe_priv->mrg_enable[substream->stream] = 1;
 	return 0;
@@ -406,9 +383,7 @@ static void mt2701_btmrg_shutdown(struct snd_pcm_substream *substream,
 				   AFE_MRGIF_CON_MRG_EN, 0);
 		regmap_update_bits(afe->regmap, AFE_MRGIF_CON,
 				   AFE_MRGIF_CON_MRG_I2S_EN, 0);
-		regmap_update_bits(afe->regmap, AUDIO_TOP_CON4,
-				   AUDIO_TOP_CON4_PDN_MRGIF,
-				   AUDIO_TOP_CON4_PDN_MRGIF);
+		mt2701_disable_btmrg_clk(afe);
 	}
 	afe_priv->mrg_enable[substream->stream] = 0;
 }
@@ -574,7 +549,6 @@ static const struct snd_soc_dai_ops mt2701_single_memif_dai_ops = {
 	.hw_free	= mtk_afe_fe_hw_free,
 	.prepare	= mtk_afe_fe_prepare,
 	.trigger	= mtk_afe_fe_trigger,
-
 };
 
 static const struct snd_soc_dai_ops mt2701_dlm_memif_dai_ops = {
@@ -915,31 +889,6 @@ static const struct snd_kcontrol_new mt2701_afe_multi_ch_out_i2s4[] = {
 				    PWR2_TOP_CON, 19, 1, 0),
 };
 
-static const struct snd_kcontrol_new mt2701_afe_multi_ch_out_asrc0[] = {
-	SOC_DAPM_SINGLE_AUTODISABLE("Asrc0 out Switch", AUDIO_TOP_CON4, 14, 1,
-				    1),
-};
-
-static const struct snd_kcontrol_new mt2701_afe_multi_ch_out_asrc1[] = {
-	SOC_DAPM_SINGLE_AUTODISABLE("Asrc1 out Switch", AUDIO_TOP_CON4, 15, 1,
-				    1),
-};
-
-static const struct snd_kcontrol_new mt2701_afe_multi_ch_out_asrc2[] = {
-	SOC_DAPM_SINGLE_AUTODISABLE("Asrc2 out Switch", PWR2_TOP_CON, 6, 1,
-				    1),
-};
-
-static const struct snd_kcontrol_new mt2701_afe_multi_ch_out_asrc3[] = {
-	SOC_DAPM_SINGLE_AUTODISABLE("Asrc3 out Switch", PWR2_TOP_CON, 7, 1,
-				    1),
-};
-
-static const struct snd_kcontrol_new mt2701_afe_multi_ch_out_asrc4[] = {
-	SOC_DAPM_SINGLE_AUTODISABLE("Asrc4 out Switch", PWR2_TOP_CON, 8, 1,
-				    1),
-};
-
 static const struct snd_soc_dapm_widget mt2701_afe_pcm_widgets[] = {
 	/* inter-connections */
 	SND_SOC_DAPM_MIXER("I00", SND_SOC_NOPM, 0, 0, NULL, 0),
@@ -999,19 +948,6 @@ static const struct snd_soc_dapm_widget mt2701_afe_pcm_widgets[] = {
 	SND_SOC_DAPM_MIXER("I18I19", SND_SOC_NOPM, 0, 0,
 			   mt2701_afe_multi_ch_out_i2s3,
 			   ARRAY_SIZE(mt2701_afe_multi_ch_out_i2s3)),
-
-	SND_SOC_DAPM_MIXER("ASRC_O0", SND_SOC_NOPM, 0, 0,
-			   mt2701_afe_multi_ch_out_asrc0,
-			   ARRAY_SIZE(mt2701_afe_multi_ch_out_asrc0)),
-	SND_SOC_DAPM_MIXER("ASRC_O1", SND_SOC_NOPM, 0, 0,
-			   mt2701_afe_multi_ch_out_asrc1,
-			   ARRAY_SIZE(mt2701_afe_multi_ch_out_asrc1)),
-	SND_SOC_DAPM_MIXER("ASRC_O2", SND_SOC_NOPM, 0, 0,
-			   mt2701_afe_multi_ch_out_asrc2,
-			   ARRAY_SIZE(mt2701_afe_multi_ch_out_asrc2)),
-	SND_SOC_DAPM_MIXER("ASRC_O3", SND_SOC_NOPM, 0, 0,
-			   mt2701_afe_multi_ch_out_asrc3,
-			   ARRAY_SIZE(mt2701_afe_multi_ch_out_asrc3)),
 };
 
 static const struct snd_soc_dapm_route mt2701_afe_pcm_routes[] = {
@@ -1021,7 +957,6 @@ static const struct snd_soc_dapm_route mt2701_afe_pcm_routes[] = {
 
 	{"I2S0 Playback", NULL, "O15"},
 	{"I2S0 Playback", NULL, "O16"},
-
 	{"I2S1 Playback", NULL, "O17"},
 	{"I2S1 Playback", NULL, "O18"},
 	{"I2S2 Playback", NULL, "O19"},
@@ -1038,7 +973,6 @@ static const struct snd_soc_dapm_route mt2701_afe_pcm_routes[] = {
 
 	{"I00", NULL, "I2S0 Capture"},
 	{"I01", NULL, "I2S0 Capture"},
-
 	{"I02", NULL, "I2S1 Capture"},
 	{"I03", NULL, "I2S1 Capture"},
 	/* I02,03 link to UL2, also need to open I2S0 */
@@ -1046,15 +980,10 @@ static const struct snd_soc_dapm_route mt2701_afe_pcm_routes[] = {
 
 	{"I26", NULL, "BT Capture"},
 
-	{"ASRC_O0", "Asrc0 out Switch", "DLM"},
-	{"ASRC_O1", "Asrc1 out Switch", "DLM"},
-	{"ASRC_O2", "Asrc2 out Switch", "DLM"},
-	{"ASRC_O3", "Asrc3 out Switch", "DLM"},
-
-	{"I12I13", "Multich I2S0 Out Switch", "ASRC_O0"},
-	{"I14I15", "Multich I2S1 Out Switch", "ASRC_O1"},
-	{"I16I17", "Multich I2S2 Out Switch", "ASRC_O2"},
-	{"I18I19", "Multich I2S3 Out Switch", "ASRC_O3"},
+	{"I12I13", "Multich I2S0 Out Switch", "DLM"},
+	{"I14I15", "Multich I2S1 Out Switch", "DLM"},
+	{"I16I17", "Multich I2S2 Out Switch", "DLM"},
+	{"I18I19", "Multich I2S3 Out Switch", "DLM"},
 
 	{ "I12", NULL, "I12I13" },
 	{ "I13", NULL, "I12I13" },
@@ -1079,7 +1008,6 @@ static const struct snd_soc_dapm_route mt2701_afe_pcm_routes[] = {
 	{ "O21", "I18 Switch", "I18" },
 	{ "O22", "I19 Switch", "I19" },
 	{ "O31", "I35 Switch", "I35" },
-
 };
 
 static const struct snd_soc_component_driver mt2701_afe_pcm_dai_component = {
@@ -1386,14 +1314,12 @@ static const struct mt2701_i2s_data mt2701_i2s_data[MT2701_I2S_NUM][2] = {
 	{
 		{
 			.i2s_ctrl_reg = ASYS_I2SO1_CON,
-			.i2s_pwn_shift = 6,
 			.i2s_asrc_fs_shift = 0,
 			.i2s_asrc_fs_mask = 0x1f,
 
 		},
 		{
 			.i2s_ctrl_reg = ASYS_I2SIN1_CON,
-			.i2s_pwn_shift = 0,
 			.i2s_asrc_fs_shift = 0,
 			.i2s_asrc_fs_mask = 0x1f,
 
@@ -1402,14 +1328,12 @@ static const struct mt2701_i2s_data mt2701_i2s_data[MT2701_I2S_NUM][2] = {
 	{
 		{
 			.i2s_ctrl_reg = ASYS_I2SO2_CON,
-			.i2s_pwn_shift = 7,
 			.i2s_asrc_fs_shift = 5,
 			.i2s_asrc_fs_mask = 0x1f,
 
 		},
 		{
 			.i2s_ctrl_reg = ASYS_I2SIN2_CON,
-			.i2s_pwn_shift = 1,
 			.i2s_asrc_fs_shift = 5,
 			.i2s_asrc_fs_mask = 0x1f,
 
@@ -1418,14 +1342,12 @@ static const struct mt2701_i2s_data mt2701_i2s_data[MT2701_I2S_NUM][2] = {
 	{
 		{
 			.i2s_ctrl_reg = ASYS_I2SO3_CON,
-			.i2s_pwn_shift = 8,
 			.i2s_asrc_fs_shift = 10,
 			.i2s_asrc_fs_mask = 0x1f,
 
 		},
 		{
 			.i2s_ctrl_reg = ASYS_I2SIN3_CON,
-			.i2s_pwn_shift = 2,
 			.i2s_asrc_fs_shift = 10,
 			.i2s_asrc_fs_mask = 0x1f,
 
@@ -1434,14 +1356,12 @@ static const struct mt2701_i2s_data mt2701_i2s_data[MT2701_I2S_NUM][2] = {
 	{
 		{
 			.i2s_ctrl_reg = ASYS_I2SO4_CON,
-			.i2s_pwn_shift = 9,
 			.i2s_asrc_fs_shift = 15,
 			.i2s_asrc_fs_mask = 0x1f,
 
 		},
 		{
 			.i2s_ctrl_reg = ASYS_I2SIN4_CON,
-			.i2s_pwn_shift = 3,
 			.i2s_asrc_fs_shift = 15,
 			.i2s_asrc_fs_mask = 0x1f,
 
@@ -1449,14 +1369,6 @@ static const struct mt2701_i2s_data mt2701_i2s_data[MT2701_I2S_NUM][2] = {
 	},
 };
 
-static const struct regmap_config mt2701_afe_regmap_config = {
-	.reg_bits = 32,
-	.reg_stride = 4,
-	.val_bits = 32,
-	.max_register = AFE_END_ADDR,
-	.cache_type = REGCACHE_NONE,
-};
-
 static irqreturn_t mt2701_asys_isr(int irq_id, void *dev)
 {
 	int id;
@@ -1483,8 +1395,7 @@ static int mt2701_afe_runtime_suspend(struct device *dev)
 {
 	struct mtk_base_afe *afe = dev_get_drvdata(dev);
 
-	mt2701_afe_disable_clock(afe);
-	return 0;
+	return mt2701_afe_disable_clock(afe);
 }
 
 static int mt2701_afe_runtime_resume(struct device *dev)
@@ -1496,21 +1407,22 @@ static int mt2701_afe_runtime_resume(struct device *dev)
 
 static int mt2701_afe_pcm_dev_probe(struct platform_device *pdev)
 {
+	struct snd_soc_component *component;
 	struct mtk_base_afe *afe;
 	struct mt2701_afe_private *afe_priv;
-	struct resource *res;
 	struct device *dev;
 	int i, irq_id, ret;
 
 	afe = devm_kzalloc(&pdev->dev, sizeof(*afe), GFP_KERNEL);
 	if (!afe)
 		return -ENOMEM;
+
 	afe->platform_priv = devm_kzalloc(&pdev->dev, sizeof(*afe_priv),
 					  GFP_KERNEL);
 	if (!afe->platform_priv)
 		return -ENOMEM;
-	afe_priv = afe->platform_priv;
 
+	afe_priv = afe->platform_priv;
 	afe->dev = &pdev->dev;
 	dev = afe->dev;
 
@@ -1527,17 +1439,11 @@ static int mt2701_afe_pcm_dev_probe(struct platform_device *pdev)
 		return ret;
 	}
 
-	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
-
-	afe->base_addr = devm_ioremap_resource(&pdev->dev, res);
-
-	if (IS_ERR(afe->base_addr))
-		return PTR_ERR(afe->base_addr);
-
-	afe->regmap = devm_regmap_init_mmio(&pdev->dev, afe->base_addr,
-		&mt2701_afe_regmap_config);
-	if (IS_ERR(afe->regmap))
+	afe->regmap = syscon_node_to_regmap(dev->parent->of_node);
+	if (IS_ERR(afe->regmap)) {
+		dev_err(dev, "could not get regmap from parent\n");
 		return PTR_ERR(afe->regmap);
+	}
 
 	mutex_init(&afe->irq_alloc_lock);
 
@@ -1545,7 +1451,6 @@ static int mt2701_afe_pcm_dev_probe(struct platform_device *pdev)
 	afe->memif_size = MT2701_MEMIF_NUM;
 	afe->memif = devm_kcalloc(dev, afe->memif_size, sizeof(*afe->memif),
 				  GFP_KERNEL);
-
 	if (!afe->memif)
 		return -ENOMEM;
 
@@ -1558,7 +1463,6 @@ static int mt2701_afe_pcm_dev_probe(struct platform_device *pdev)
 	afe->irqs_size = MT2701_IRQ_ASYS_END;
 	afe->irqs = devm_kcalloc(dev, afe->irqs_size, sizeof(*afe->irqs),
 				 GFP_KERNEL);
-
 	if (!afe->irqs)
 		return -ENOMEM;
 
@@ -1573,10 +1477,15 @@ static int mt2701_afe_pcm_dev_probe(struct platform_device *pdev)
 			= &mt2701_i2s_data[i][I2S_IN];
 	}
 
+	component = kzalloc(sizeof(*component), GFP_KERNEL);
+	if (!component)
+		return -ENOMEM;
+
+	component->regmap = afe->regmap;
+
 	afe->mtk_afe_hardware = &mt2701_afe_hardware;
 	afe->memif_fs = mt2701_memif_fs;
 	afe->irq_fs = mt2701_irq_fs;
-
 	afe->reg_back_up_list = mt2701_afe_backup_list;
 	afe->reg_back_up_list_num = ARRAY_SIZE(mt2701_afe_backup_list);
 	afe->runtime_resume = mt2701_afe_runtime_resume;
@@ -1586,59 +1495,58 @@ static int mt2701_afe_pcm_dev_probe(struct platform_device *pdev)
 	ret = mt2701_init_clock(afe);
 	if (ret) {
 		dev_err(dev, "init clock error\n");
-		return ret;
+		goto err_init_clock;
 	}
 
 	platform_set_drvdata(pdev, afe);
-	pm_runtime_enable(&pdev->dev);
-	if (!pm_runtime_enabled(&pdev->dev))
-		goto err_pm_disable;
-	pm_runtime_get_sync(&pdev->dev);
 
-	ret = snd_soc_register_platform(&pdev->dev, &mtk_afe_pcm_platform);
+	pm_runtime_enable(dev);
+	if (!pm_runtime_enabled(dev)) {
+		ret = mt2701_afe_runtime_resume(dev);
+		if (ret)
+			goto err_pm_disable;
+	}
+	pm_runtime_get_sync(dev);
+
+	ret = snd_soc_register_platform(dev, &mtk_afe_pcm_platform);
 	if (ret) {
 		dev_warn(dev, "err_platform\n");
 		goto err_platform;
 	}
 
-	ret = snd_soc_register_component(&pdev->dev,
-					 &mt2701_afe_pcm_dai_component,
-					 mt2701_afe_pcm_dais,
-					 ARRAY_SIZE(mt2701_afe_pcm_dais));
+	ret = snd_soc_add_component(dev, component,
+				    &mt2701_afe_pcm_dai_component,
+				    mt2701_afe_pcm_dais,
+				    ARRAY_SIZE(mt2701_afe_pcm_dais));
 	if (ret) {
 		dev_warn(dev, "err_dai_component\n");
 		goto err_dai_component;
 	}
 
-	mt2701_afe_runtime_resume(&pdev->dev);
-
 	return 0;
 
 err_dai_component:
-	snd_soc_unregister_component(&pdev->dev);
-
+	snd_soc_unregister_platform(dev);
 err_platform:
-	snd_soc_unregister_platform(&pdev->dev);
-
+	pm_runtime_put_sync(dev);
 err_pm_disable:
-	pm_runtime_disable(&pdev->dev);
+	pm_runtime_disable(dev);
+err_init_clock:
+	kfree(component);
 
 	return ret;
 }
 
 static int mt2701_afe_pcm_dev_remove(struct platform_device *pdev)
 {
-	struct mtk_base_afe *afe = platform_get_drvdata(pdev);
-
+	pm_runtime_put_sync(&pdev->dev);
 	pm_runtime_disable(&pdev->dev);
 	if (!pm_runtime_status_suspended(&pdev->dev))
 		mt2701_afe_runtime_suspend(&pdev->dev);
-	pm_runtime_put_sync(&pdev->dev);
 
 	snd_soc_unregister_component(&pdev->dev);
 	snd_soc_unregister_platform(&pdev->dev);
-	/* disable afe clock */
-	mt2701_afe_disable_clock(afe);
+
 	return 0;
 }
 
@@ -1670,4 +1578,3 @@ module_platform_driver(mt2701_afe_pcm_driver);
 MODULE_DESCRIPTION("Mediatek ALSA SoC AFE platform driver for 2701");
 MODULE_AUTHOR("Garlic Tseng <garlic.tseng@mediatek.com>");
 MODULE_LICENSE("GPL v2");
-
diff --git a/sound/soc/mediatek/mt2701/mt2701-reg.h b/sound/soc/mediatek/mt2701/mt2701-reg.h
index bb62b1c..18e6769 100644
--- a/sound/soc/mediatek/mt2701/mt2701-reg.h
+++ b/sound/soc/mediatek/mt2701/mt2701-reg.h
@@ -17,17 +17,6 @@
 #ifndef _MT2701_REG_H_
 #define _MT2701_REG_H_
 
-#include <linux/delay.h>
-#include <linux/module.h>
-#include <linux/of.h>
-#include <linux/of_address.h>
-#include <linux/pm_runtime.h>
-#include <sound/soc.h>
-#include "mt2701-afe-common.h"
-
-/*****************************************************************************
- *                  R E G I S T E R       D E F I N I T I O N
- *****************************************************************************/
 #define AUDIO_TOP_CON0 0x0000
 #define AUDIO_TOP_CON4 0x0010
 #define AUDIO_TOP_CON5 0x0014
@@ -109,18 +98,6 @@
 #define AFE_DAI_BASE 0x1370
 #define AFE_DAI_CUR 0x137c
 
-/* AUDIO_TOP_CON0 (0x0000) */
-#define AUDIO_TOP_CON0_A1SYS_A2SYS_ON	(0x3 << 0)
-#define AUDIO_TOP_CON0_PDN_AFE		(0x1 << 2)
-#define AUDIO_TOP_CON0_PDN_APLL_CK	(0x1 << 23)
-
-/* AUDIO_TOP_CON4 (0x0010) */
-#define AUDIO_TOP_CON4_I2SO1_PWN	(0x1 << 6)
-#define AUDIO_TOP_CON4_PDN_A1SYS	(0x1 << 21)
-#define AUDIO_TOP_CON4_PDN_A2SYS	(0x1 << 22)
-#define AUDIO_TOP_CON4_PDN_AFE_CONN	(0x1 << 23)
-#define AUDIO_TOP_CON4_PDN_MRGIF	(0x1 << 25)
-
 /* AFE_DAIBT_CON0 (0x001c) */
 #define AFE_DAIBT_CON0_DAIBT_EN		(0x1 << 0)
 #define AFE_DAIBT_CON0_BT_FUNC_EN	(0x1 << 1)
@@ -137,22 +114,8 @@
 #define AFE_MRGIF_CON_I2S_MODE_MASK	(0xf << 20)
 #define AFE_MRGIF_CON_I2S_MODE_32K	(0x4 << 20)
 
-/* ASYS_I2SO1_CON (0x061c) */
-#define ASYS_I2SO1_CON_FS		(0x1f << 8)
-#define ASYS_I2SO1_CON_FS_SET(x)	((x) << 8)
-#define ASYS_I2SO1_CON_MULTI_CH		(0x1 << 16)
-#define ASYS_I2SO1_CON_SIDEGEN		(0x1 << 30)
-#define ASYS_I2SO1_CON_I2S_EN		(0x1 << 0)
-/* 0:EIAJ 1:I2S */
-#define ASYS_I2SO1_CON_I2S_MODE		(0x1 << 3)
-#define ASYS_I2SO1_CON_WIDE_MODE	(0x1 << 1)
-#define ASYS_I2SO1_CON_WIDE_MODE_SET(x)	((x) << 1)
-
-/* PWR2_TOP_CON (0x0634) */
-#define PWR2_TOP_CON_INIT_VAL		(0xffe1ffff)
-
-/* ASYS_IRQ_CLR (0x07c0) */
-#define ASYS_IRQ_CLR_ALL		(0xffffffff)
+/* ASYS_TOP_CON (0x0600) */
+#define ASYS_TOP_CON_ASYS_TIMING_ON		(0x3 << 0)
 
 /* PWR2_ASM_CON1 (0x1070) */
 #define PWR2_ASM_CON1_INIT_VAL		(0x492492)
@@ -182,5 +145,4 @@
 #define ASYS_I2S_CON_WIDE_MODE_SET(x)	((x) << 1)
 #define ASYS_I2S_IN_PHASE_FIX		(0x1 << 31)
 
-#define AFE_END_ADDR 0x15e0
 #endif
diff --git a/sound/soc/mediatek/mt8173/mt8173-afe-pcm.c b/sound/soc/mediatek/mt8173/mt8173-afe-pcm.c
index 8a643a3..c7f7f8a 100644
--- a/sound/soc/mediatek/mt8173/mt8173-afe-pcm.c
+++ b/sound/soc/mediatek/mt8173/mt8173-afe-pcm.c
@@ -1083,7 +1083,7 @@ static int mt8173_afe_init_audio_clk(struct mtk_base_afe *afe)
 static int mt8173_afe_pcm_dev_probe(struct platform_device *pdev)
 {
 	int ret, i;
-	unsigned int irq_id;
+	int irq_id;
 	struct mtk_base_afe *afe;
 	struct mt8173_afe_private *afe_priv;
 	struct resource *res;
@@ -1105,9 +1105,9 @@ static int mt8173_afe_pcm_dev_probe(struct platform_device *pdev)
 	afe->dev = &pdev->dev;
 
 	irq_id = platform_get_irq(pdev, 0);
-	if (!irq_id) {
+	if (irq_id <= 0) {
 		dev_err(afe->dev, "np %s no irq\n", afe->dev->of_node->name);
-		return -ENXIO;
+		return irq_id < 0 ? irq_id : -ENXIO;
 	}
 	ret = devm_request_irq(afe->dev, irq_id, mt8173_afe_irq_handler,
 			       0, "Afe_ISR_Handle", (void *)afe);
diff --git a/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5514.c b/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5514.c
index 99c1521..5a9a548 100644
--- a/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5514.c
+++ b/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5514.c
@@ -37,8 +37,6 @@ static const struct snd_soc_dapm_route mt8173_rt5650_rt5514_routes[] = {
 	{"Sub DMIC1R", NULL, "Int Mic"},
 	{"Headphone", NULL, "HPOL"},
 	{"Headphone", NULL, "HPOR"},
-	{"Headset Mic", NULL, "micbias1"},
-	{"Headset Mic", NULL, "micbias2"},
 	{"IN1P", NULL, "Headset Mic"},
 	{"IN1N", NULL, "Headset Mic"},
 };
diff --git a/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c b/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
index 42de84ca8..b724808 100644
--- a/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
+++ b/sound/soc/mediatek/mt8173/mt8173-rt5650-rt5676.c
@@ -40,8 +40,6 @@ static const struct snd_soc_dapm_route mt8173_rt5650_rt5676_routes[] = {
 	{"Headphone", NULL, "HPOL"},
 	{"Headphone", NULL, "HPOR"},
 	{"Headphone", NULL, "Sub AIF2TX"}, /* IF2 ADC to 5650  */
-	{"Headset Mic", NULL, "micbias1"},
-	{"Headset Mic", NULL, "micbias2"},
 	{"IN1P", NULL, "Headset Mic"},
 	{"IN1N", NULL, "Headset Mic"},
 	{"Sub AIF2RX", NULL, "Headset Mic"}, /* IF2 DAC from 5650  */
diff --git a/sound/soc/mediatek/mt8173/mt8173-rt5650.c b/sound/soc/mediatek/mt8173/mt8173-rt5650.c
index e69c141..40ebefd 100644
--- a/sound/soc/mediatek/mt8173/mt8173-rt5650.c
+++ b/sound/soc/mediatek/mt8173/mt8173-rt5650.c
@@ -51,8 +51,6 @@ static const struct snd_soc_dapm_route mt8173_rt5650_routes[] = {
 	{"DMIC R1", NULL, "Int Mic"},
 	{"Headphone", NULL, "HPOL"},
 	{"Headphone", NULL, "HPOR"},
-	{"Headset Mic", NULL, "micbias1"},
-	{"Headset Mic", NULL, "micbias2"},
 	{"IN1P", NULL, "Headset Mic"},
 	{"IN1N", NULL, "Headset Mic"},
 };
diff --git a/sound/soc/mxs/mxs-sgtl5000.c b/sound/soc/mxs/mxs-sgtl5000.c
index 2ed3240..2b3f240 100644
--- a/sound/soc/mxs/mxs-sgtl5000.c
+++ b/sound/soc/mxs/mxs-sgtl5000.c
@@ -93,6 +93,14 @@ static struct snd_soc_dai_link mxs_sgtl5000_dai[] = {
 	},
 };
 
+static const struct snd_soc_dapm_widget mxs_sgtl5000_dapm_widgets[] = {
+	SND_SOC_DAPM_MIC("Mic Jack", NULL),
+	SND_SOC_DAPM_LINE("Line In Jack", NULL),
+	SND_SOC_DAPM_HP("Headphone Jack", NULL),
+	SND_SOC_DAPM_SPK("Line Out Jack", NULL),
+	SND_SOC_DAPM_SPK("Ext Spk", NULL),
+};
+
 static struct snd_soc_card mxs_sgtl5000 = {
 	.name		= "mxs_sgtl5000",
 	.owner		= THIS_MODULE,
@@ -141,10 +149,23 @@ static int mxs_sgtl5000_probe(struct platform_device *pdev)
 
 	card->dev = &pdev->dev;
 
+	if (of_find_property(np, "audio-routing", NULL)) {
+		card->dapm_widgets = mxs_sgtl5000_dapm_widgets;
+		card->num_dapm_widgets = ARRAY_SIZE(mxs_sgtl5000_dapm_widgets);
+
+		ret = snd_soc_of_parse_audio_routing(card, "audio-routing");
+		if (ret) {
+			dev_err(&pdev->dev, "failed to parse audio-routing (%d)\n",
+				ret);
+			return ret;
+		}
+	}
+
 	ret = devm_snd_soc_register_card(&pdev->dev, card);
 	if (ret) {
-		dev_err(&pdev->dev, "snd_soc_register_card failed (%d)\n",
-			ret);
+		if (ret != -EPROBE_DEFER)
+			dev_err(&pdev->dev, "snd_soc_register_card failed (%d)\n",
+				ret);
 		return ret;
 	}
 
diff --git a/sound/soc/nuc900/nuc900-ac97.c b/sound/soc/nuc900/nuc900-ac97.c
index b6615af..81b09d7 100644
--- a/sound/soc/nuc900/nuc900-ac97.c
+++ b/sound/soc/nuc900/nuc900-ac97.c
@@ -67,7 +67,7 @@ static unsigned short nuc900_ac97_read(struct snd_ac97 *ac97,
 
 	/* polling the AC_R_FINISH */
 	while (!(AUDIO_READ(nuc900_audio->mmio + ACTL_ACCON) & AC_R_FINISH)
-								&& timeout--)
+								&& --timeout)
 		mdelay(1);
 
 	if (!timeout) {
@@ -121,7 +121,7 @@ static void nuc900_ac97_write(struct snd_ac97 *ac97, unsigned short reg,
 
 	/* polling the AC_W_FINISH */
 	while ((AUDIO_READ(nuc900_audio->mmio + ACTL_ACCON) & AC_W_FINISH)
-								&& timeout--)
+								&& --timeout)
 		mdelay(1);
 
 	if (!timeout)
@@ -345,11 +345,10 @@ static int nuc900_ac97_drvprobe(struct platform_device *pdev)
 		goto out;
 	}
 
-	nuc900_audio->irq_num = platform_get_irq(pdev, 0);
-	if (!nuc900_audio->irq_num) {
-		ret = -EBUSY;
+	ret = platform_get_irq(pdev, 0);
+	if (ret < 0)
 		goto out;
-	}
+	nuc900_audio->irq_num = ret;
 
 	nuc900_ac97_data = nuc900_audio;
 
diff --git a/sound/soc/omap/ams-delta.c b/sound/soc/omap/ams-delta.c
index d402196..cb72c1e 100644
--- a/sound/soc/omap/ams-delta.c
+++ b/sound/soc/omap/ams-delta.c
@@ -105,7 +105,7 @@ static int ams_delta_set_audio_mode(struct snd_kcontrol *kcontrol,
 	int pin, changed = 0;
 
 	/* Refuse any mode changes if we are not able to control the codec. */
-	if (!cx20442_codec->hw_write)
+	if (!cx20442_codec->component.card->pop_time)
 		return -EUNATCH;
 
 	if (ucontrol->value.enumerated.item[0] >= control->items)
@@ -345,7 +345,7 @@ static void cx81801_receive(struct tty_struct *tty,
 	if (!codec)
 		return;
 
-	if (!codec->hw_write) {
+	if (!codec->component.card->pop_time) {
 		/* First modem response, complete setup procedure */
 
 		/* Initialize timer used for config pulse generation */
diff --git a/sound/soc/qcom/apq8016_sbc.c b/sound/soc/qcom/apq8016_sbc.c
index d49adc8..7044287 100644
--- a/sound/soc/qcom/apq8016_sbc.c
+++ b/sound/soc/qcom/apq8016_sbc.c
@@ -43,7 +43,7 @@ struct apq8016_sbc_data {
 static int apq8016_sbc_dai_init(struct snd_soc_pcm_runtime *rtd)
 {
 	struct snd_soc_dai *cpu_dai = rtd->cpu_dai;
-	struct snd_soc_codec *codec;
+	struct snd_soc_component *component;
 	struct snd_soc_dai_link *dai_link = rtd->dai_link;
 	struct snd_soc_card *card = rtd->card;
 	struct apq8016_sbc_data *pdata = snd_soc_card_get_drvdata(card);
@@ -92,7 +92,7 @@ static int apq8016_sbc_dai_init(struct snd_soc_pcm_runtime *rtd)
 
 		jack = pdata->jack.jack;
 
-		snd_jack_set_key(jack, SND_JACK_BTN_0, KEY_MEDIA);
+		snd_jack_set_key(jack, SND_JACK_BTN_0, KEY_PLAYPAUSE);
 		snd_jack_set_key(jack, SND_JACK_BTN_1, KEY_VOICECOMMAND);
 		snd_jack_set_key(jack, SND_JACK_BTN_2, KEY_VOLUMEUP);
 		snd_jack_set_key(jack, SND_JACK_BTN_3, KEY_VOLUMEDOWN);
@@ -102,15 +102,15 @@ static int apq8016_sbc_dai_init(struct snd_soc_pcm_runtime *rtd)
 	for (i = 0 ; i < dai_link->num_codecs; i++) {
 		struct snd_soc_dai *dai = rtd->codec_dais[i];
 
-		codec = dai->codec;
+		component = dai->component;
 		/* Set default mclk for internal codec */
-		rval = snd_soc_codec_set_sysclk(codec, 0, 0, DEFAULT_MCLK_RATE,
+		rval = snd_soc_component_set_sysclk(component, 0, 0, DEFAULT_MCLK_RATE,
 				       SND_SOC_CLOCK_IN);
 		if (rval != 0 && rval != -ENOTSUPP) {
 			dev_warn(card->dev, "Failed to set mclk: %d\n", rval);
 			return rval;
 		}
-		rval = snd_soc_codec_set_jack(codec, &pdata->jack, NULL);
+		rval = snd_soc_component_set_jack(component, &pdata->jack, NULL);
 		if (rval != 0 && rval != -ENOTSUPP) {
 			dev_warn(card->dev, "Failed to set jack: %d\n", rval);
 			return rval;
diff --git a/sound/soc/rockchip/rk3399_gru_sound.c b/sound/soc/rockchip/rk3399_gru_sound.c
index d64fbbd..fa6cd1d 100644
--- a/sound/soc/rockchip/rk3399_gru_sound.c
+++ b/sound/soc/rockchip/rk3399_gru_sound.c
@@ -206,7 +206,8 @@ static int rockchip_sound_da7219_init(struct snd_soc_pcm_runtime *rtd)
 		return ret;
 	}
 
-	snd_jack_set_key(rockchip_sound_jack.jack, SND_JACK_BTN_0, KEY_MEDIA);
+	snd_jack_set_key(
+		rockchip_sound_jack.jack, SND_JACK_BTN_0, KEY_PLAYPAUSE);
 	snd_jack_set_key(
 		rockchip_sound_jack.jack, SND_JACK_BTN_1, KEY_VOLUMEUP);
 	snd_jack_set_key(
diff --git a/sound/soc/rockchip/rockchip_i2s.c b/sound/soc/rockchip/rockchip_i2s.c
index 908211e..950823d6 100644
--- a/sound/soc/rockchip/rockchip_i2s.c
+++ b/sound/soc/rockchip/rockchip_i2s.c
@@ -328,6 +328,7 @@ static int rockchip_i2s_hw_params(struct snd_pcm_substream *substream,
 		val |= I2S_CHN_4;
 		break;
 	case 2:
+	case 1:
 		val |= I2S_CHN_2;
 		break;
 	default:
@@ -460,7 +461,7 @@ static struct snd_soc_dai_driver rockchip_i2s_dai = {
 	},
 	.capture = {
 		.stream_name = "Capture",
-		.channels_min = 2,
+		.channels_min = 1,
 		.channels_max = 2,
 		.rates = SNDRV_PCM_RATE_8000_192000,
 		.formats = (SNDRV_PCM_FMTBIT_S8 |
@@ -504,6 +505,7 @@ static bool rockchip_i2s_rd_reg(struct device *dev, unsigned int reg)
 	case I2S_INTCR:
 	case I2S_XFER:
 	case I2S_CLR:
+	case I2S_TXDR:
 	case I2S_RXDR:
 	case I2S_FIFOLR:
 	case I2S_INTSR:
@@ -518,6 +520,9 @@ static bool rockchip_i2s_volatile_reg(struct device *dev, unsigned int reg)
 	switch (reg) {
 	case I2S_INTSR:
 	case I2S_CLR:
+	case I2S_FIFOLR:
+	case I2S_TXDR:
+	case I2S_RXDR:
 		return true;
 	default:
 		return false;
@@ -527,6 +532,8 @@ static bool rockchip_i2s_volatile_reg(struct device *dev, unsigned int reg)
 static bool rockchip_i2s_precious_reg(struct device *dev, unsigned int reg)
 {
 	switch (reg) {
+	case I2S_RXDR:
+		return true;
 	default:
 		return false;
 	}
@@ -654,7 +661,7 @@ static int rockchip_i2s_probe(struct platform_device *pdev)
 	}
 
 	if (!of_property_read_u32(node, "rockchip,capture-channels", &val)) {
-		if (val >= 2 && val <= 8)
+		if (val >= 1 && val <= 8)
 			soc_dai->capture.channels_max = val;
 	}
 
diff --git a/sound/soc/samsung/bells.c b/sound/soc/samsung/bells.c
index 34deba4..0e66cd8 100644
--- a/sound/soc/samsung/bells.c
+++ b/sound/soc/samsung/bells.c
@@ -60,13 +60,13 @@ static int bells_set_bias_level(struct snd_soc_card *card,
 {
 	struct snd_soc_pcm_runtime *rtd;
 	struct snd_soc_dai *codec_dai;
-	struct snd_soc_codec *codec;
+	struct snd_soc_component *component;
 	struct bells_drvdata *bells = card->drvdata;
 	int ret;
 
 	rtd = snd_soc_get_pcm_runtime(card, card->dai_link[DAI_DSP_CODEC].name);
 	codec_dai = rtd->codec_dai;
-	codec = codec_dai->codec;
+	component = codec_dai->component;
 
 	if (dapm->dev != codec_dai->dev)
 		return 0;
@@ -76,7 +76,7 @@ static int bells_set_bias_level(struct snd_soc_card *card,
 		if (dapm->bias_level != SND_SOC_BIAS_STANDBY)
 			break;
 
-		ret = snd_soc_codec_set_pll(codec, WM5102_FLL1,
+		ret = snd_soc_component_set_pll(component, WM5102_FLL1,
 					    ARIZONA_FLL_SRC_MCLK1,
 					    MCLK_RATE,
 					    bells->sysclk_rate);
@@ -84,7 +84,7 @@ static int bells_set_bias_level(struct snd_soc_card *card,
 			pr_err("Failed to start FLL: %d\n", ret);
 
 		if (bells->asyncclk_rate) {
-			ret = snd_soc_codec_set_pll(codec, WM5102_FLL2,
+			ret = snd_soc_component_set_pll(component, WM5102_FLL2,
 						    ARIZONA_FLL_SRC_AIF2BCLK,
 						    BCLK2_RATE,
 						    bells->asyncclk_rate);
@@ -106,27 +106,27 @@ static int bells_set_bias_level_post(struct snd_soc_card *card,
 {
 	struct snd_soc_pcm_runtime *rtd;
 	struct snd_soc_dai *codec_dai;
-	struct snd_soc_codec *codec;
+	struct snd_soc_component *component;
 	struct bells_drvdata *bells = card->drvdata;
 	int ret;
 
 	rtd = snd_soc_get_pcm_runtime(card, card->dai_link[DAI_DSP_CODEC].name);
 	codec_dai = rtd->codec_dai;
-	codec = codec_dai->codec;
+	component = codec_dai->component;
 
 	if (dapm->dev != codec_dai->dev)
 		return 0;
 
 	switch (level) {
 	case SND_SOC_BIAS_STANDBY:
-		ret = snd_soc_codec_set_pll(codec, WM5102_FLL1, 0, 0, 0);
+		ret = snd_soc_component_set_pll(component, WM5102_FLL1, 0, 0, 0);
 		if (ret < 0) {
 			pr_err("Failed to stop FLL: %d\n", ret);
 			return ret;
 		}
 
 		if (bells->asyncclk_rate) {
-			ret = snd_soc_codec_set_pll(codec, WM5102_FLL2,
+			ret = snd_soc_component_set_pll(component, WM5102_FLL2,
 						    0, 0, 0);
 			if (ret < 0) {
 				pr_err("Failed to stop FLL: %d\n", ret);
@@ -148,8 +148,8 @@ static int bells_late_probe(struct snd_soc_card *card)
 {
 	struct bells_drvdata *bells = card->drvdata;
 	struct snd_soc_pcm_runtime *rtd;
-	struct snd_soc_codec *wm0010;
-	struct snd_soc_codec *codec;
+	struct snd_soc_component *wm0010;
+	struct snd_soc_component *component;
 	struct snd_soc_dai *aif1_dai;
 	struct snd_soc_dai *aif2_dai;
 	struct snd_soc_dai *aif3_dai;
@@ -157,22 +157,22 @@ static int bells_late_probe(struct snd_soc_card *card)
 	int ret;
 
 	rtd = snd_soc_get_pcm_runtime(card, card->dai_link[DAI_AP_DSP].name);
-	wm0010 = rtd->codec;
+	wm0010 = rtd->codec_dai->component;
 
 	rtd = snd_soc_get_pcm_runtime(card, card->dai_link[DAI_DSP_CODEC].name);
-	codec = rtd->codec;
+	component = rtd->codec_dai->component;
 	aif1_dai = rtd->codec_dai;
 
-	ret = snd_soc_codec_set_sysclk(codec, ARIZONA_CLK_SYSCLK,
+	ret = snd_soc_component_set_sysclk(component, ARIZONA_CLK_SYSCLK,
 				       ARIZONA_CLK_SRC_FLL1,
 				       bells->sysclk_rate,
 				       SND_SOC_CLOCK_IN);
 	if (ret != 0) {
-		dev_err(codec->dev, "Failed to set SYSCLK: %d\n", ret);
+		dev_err(component->dev, "Failed to set SYSCLK: %d\n", ret);
 		return ret;
 	}
 
-	ret = snd_soc_codec_set_sysclk(wm0010, 0, 0, SYS_MCLK_RATE, 0);
+	ret = snd_soc_component_set_sysclk(wm0010, 0, 0, SYS_MCLK_RATE, 0);
 	if (ret != 0) {
 		dev_err(wm0010->dev, "Failed to set WM0010 clock: %d\n", ret);
 		return ret;
@@ -182,20 +182,20 @@ static int bells_late_probe(struct snd_soc_card *card)
 	if (ret != 0)
 		dev_err(aif1_dai->dev, "Failed to set AIF1 clock: %d\n", ret);
 
-	ret = snd_soc_codec_set_sysclk(codec, ARIZONA_CLK_OPCLK, 0,
+	ret = snd_soc_component_set_sysclk(component, ARIZONA_CLK_OPCLK, 0,
 				       SYS_MCLK_RATE, SND_SOC_CLOCK_OUT);
 	if (ret != 0)
-		dev_err(codec->dev, "Failed to set OPCLK: %d\n", ret);
+		dev_err(component->dev, "Failed to set OPCLK: %d\n", ret);
 
 	if (card->num_rtd == DAI_CODEC_CP)
 		return 0;
 
-	ret = snd_soc_codec_set_sysclk(codec, ARIZONA_CLK_ASYNCCLK,
+	ret = snd_soc_component_set_sysclk(component, ARIZONA_CLK_ASYNCCLK,
 				       ARIZONA_CLK_SRC_FLL2,
 				       bells->asyncclk_rate,
 				       SND_SOC_CLOCK_IN);
 	if (ret != 0) {
-		dev_err(codec->dev, "Failed to set ASYNCCLK: %d\n", ret);
+		dev_err(component->dev, "Failed to set ASYNCCLK: %d\n", ret);
 		return ret;
 	}
 
@@ -221,7 +221,7 @@ static int bells_late_probe(struct snd_soc_card *card)
 		return ret;
 	}
 
-	ret = snd_soc_codec_set_sysclk(wm9081_dai->codec, WM9081_SYSCLK_MCLK,
+	ret = snd_soc_component_set_sysclk(wm9081_dai->component, WM9081_SYSCLK_MCLK,
 				       0, SYS_MCLK_RATE, 0);
 	if (ret != 0) {
 		dev_err(wm9081_dai->dev, "Failed to set MCLK: %d\n", ret);
diff --git a/sound/soc/sh/rcar/core.c b/sound/soc/sh/rcar/core.c
index f12a88a..64d5ecb 100644
--- a/sound/soc/sh/rcar/core.c
+++ b/sound/soc/sh/rcar/core.c
@@ -197,16 +197,27 @@ int rsnd_io_is_working(struct rsnd_dai_stream *io)
 	return 0;
 }
 
-int rsnd_runtime_channel_original(struct rsnd_dai_stream *io)
+int rsnd_runtime_channel_original_with_params(struct rsnd_dai_stream *io,
+					      struct snd_pcm_hw_params *params)
 {
 	struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
 
-	return runtime->channels;
+	/*
+	 * params will be added when refine
+	 * see
+	 *	__rsnd_soc_hw_rule_rate()
+	 *	__rsnd_soc_hw_rule_channels()
+	 */
+	if (params)
+		return params_channels(params);
+	else
+		return runtime->channels;
 }
 
-int rsnd_runtime_channel_after_ctu(struct rsnd_dai_stream *io)
+int rsnd_runtime_channel_after_ctu_with_params(struct rsnd_dai_stream *io,
+					       struct snd_pcm_hw_params *params)
 {
-	int chan = rsnd_runtime_channel_original(io);
+	int chan = rsnd_runtime_channel_original_with_params(io, params);
 	struct rsnd_mod *ctu_mod = rsnd_io_to_mod_ctu(io);
 
 	if (ctu_mod) {
@@ -219,12 +230,13 @@ int rsnd_runtime_channel_after_ctu(struct rsnd_dai_stream *io)
 	return chan;
 }
 
-int rsnd_runtime_channel_for_ssi(struct rsnd_dai_stream *io)
+int rsnd_runtime_channel_for_ssi_with_params(struct rsnd_dai_stream *io,
+					     struct snd_pcm_hw_params *params)
 {
 	struct rsnd_dai *rdai = rsnd_io_to_rdai(io);
 	int chan = rsnd_io_is_play(io) ?
-		rsnd_runtime_channel_after_ctu(io) :
-		rsnd_runtime_channel_original(io);
+		rsnd_runtime_channel_after_ctu_with_params(io, params) :
+		rsnd_runtime_channel_original_with_params(io, params);
 
 	/* Use Multi SSI */
 	if (rsnd_runtime_is_ssi_multi(io))
@@ -262,10 +274,10 @@ u32 rsnd_get_adinr_bit(struct rsnd_mod *mod, struct rsnd_dai_stream *io)
 	struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
 	struct device *dev = rsnd_priv_to_dev(priv);
 
-	switch (runtime->sample_bits) {
+	switch (snd_pcm_format_width(runtime->format)) {
 	case 16:
 		return 8 << 16;
-	case 32:
+	case 24:
 		return 0 << 16;
 	}
 
@@ -282,11 +294,12 @@ u32 rsnd_get_dalign(struct rsnd_mod *mod, struct rsnd_dai_stream *io)
 	struct rsnd_mod *ssiu = rsnd_io_to_mod_ssiu(io);
 	struct rsnd_mod *target;
 	struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
-	u32 val = 0x76543210;
-	u32 mask = ~0;
 
 	/*
-	 * *Hardware* L/R and *Software* L/R are inverted.
+	 * *Hardware* L/R and *Software* L/R are inverted for 16bit data.
+	 *	    31..16 15...0
+	 *	HW: [L ch] [R ch]
+	 *	SW: [R ch] [L ch]
 	 * We need to care about inversion timing to control
 	 * Playback/Capture correctly.
 	 * The point is [DVC] needs *Hardware* L/R, [MEM] needs *Software* L/R
@@ -313,27 +326,13 @@ u32 rsnd_get_dalign(struct rsnd_mod *mod, struct rsnd_dai_stream *io)
 		target = cmd ? cmd : ssiu;
 	}
 
-	mask <<= runtime->channels * 4;
-	val = val & mask;
-
-	switch (runtime->sample_bits) {
-	case 16:
-		val |= 0x67452301 & ~mask;
-		break;
-	case 32:
-		val |= 0x76543210 & ~mask;
-		break;
-	}
-
-	/*
-	 * exchange channeles on SRC if possible,
-	 * otherwise, R/L volume settings on DVC
-	 * changes inverted channels
-	 */
-	if (mod == target)
-		return val;
-	else
+	/* Non target mod or 24bit data needs normal DALIGN */
+	if ((snd_pcm_format_width(runtime->format) != 16) ||
+	    (mod != target))
 		return 0x76543210;
+	/* Target mod needs inverted DALIGN when 16bit */
+	else
+		return 0x67452301;
 }
 
 u32 rsnd_get_busif_shift(struct rsnd_dai_stream *io, struct rsnd_mod *mod)
@@ -363,12 +362,8 @@ u32 rsnd_get_busif_shift(struct rsnd_dai_stream *io, struct rsnd_mod *mod)
 	 * HW    24bit data is located as 0x******00
 	 *
 	 */
-	switch (runtime->sample_bits) {
-	case 16:
+	if (snd_pcm_format_width(runtime->format) == 16)
 		return 0;
-	case 32:
-		break;
-	}
 
 	for (i = 0; i < ARRAY_SIZE(playback_mods); i++) {
 		tmod = rsnd_io_to_mod(io, mods[i]);
@@ -616,8 +611,6 @@ static int rsnd_soc_dai_trigger(struct snd_pcm_substream *substream, int cmd,
 	switch (cmd) {
 	case SNDRV_PCM_TRIGGER_START:
 	case SNDRV_PCM_TRIGGER_RESUME:
-		rsnd_dai_stream_init(io, substream);
-
 		ret = rsnd_dai_call(init, io, priv);
 		if (ret < 0)
 			goto dai_trigger_end;
@@ -639,7 +632,6 @@ static int rsnd_soc_dai_trigger(struct snd_pcm_substream *substream, int cmd,
 
 		ret |= rsnd_dai_call(quit, io, priv);
 
-		rsnd_dai_stream_quit(io);
 		break;
 	default:
 		ret = -EINVAL;
@@ -784,8 +776,9 @@ static int rsnd_soc_hw_rule(struct rsnd_priv *priv,
 	return snd_interval_refine(iv, &p);
 }
 
-static int rsnd_soc_hw_rule_rate(struct snd_pcm_hw_params *params,
-				 struct snd_pcm_hw_rule *rule)
+static int __rsnd_soc_hw_rule_rate(struct snd_pcm_hw_params *params,
+				   struct snd_pcm_hw_rule *rule,
+				   int is_play)
 {
 	struct snd_interval *ic_ = hw_param_interval(params, SNDRV_PCM_HW_PARAM_CHANNELS);
 	struct snd_interval *ir = hw_param_interval(params, SNDRV_PCM_HW_PARAM_RATE);
@@ -793,25 +786,37 @@ static int rsnd_soc_hw_rule_rate(struct snd_pcm_hw_params *params,
 	struct snd_soc_dai *dai = rule->private;
 	struct rsnd_dai *rdai = rsnd_dai_to_rdai(dai);
 	struct rsnd_priv *priv = rsnd_rdai_to_priv(rdai);
+	struct rsnd_dai_stream *io = is_play ? &rdai->playback : &rdai->capture;
 
 	/*
 	 * possible sampling rate limitation is same as
 	 * 2ch if it supports multi ssi
+	 * and same as 8ch if TDM 6ch (see rsnd_ssi_config_init())
 	 */
 	ic = *ic_;
-	if (1 < rsnd_rdai_ssi_lane_get(rdai)) {
-		ic.min = 2;
-		ic.max = 2;
-	}
+	ic.min =
+	ic.max = rsnd_runtime_channel_for_ssi_with_params(io, params);
 
 	return rsnd_soc_hw_rule(priv, rsnd_soc_hw_rate_list,
 				ARRAY_SIZE(rsnd_soc_hw_rate_list),
 				&ic, ir);
 }
 
+static int rsnd_soc_hw_rule_rate_playback(struct snd_pcm_hw_params *params,
+				 struct snd_pcm_hw_rule *rule)
+{
+	return __rsnd_soc_hw_rule_rate(params, rule, 1);
+}
 
-static int rsnd_soc_hw_rule_channels(struct snd_pcm_hw_params *params,
-				     struct snd_pcm_hw_rule *rule)
+static int rsnd_soc_hw_rule_rate_capture(struct snd_pcm_hw_params *params,
+					  struct snd_pcm_hw_rule *rule)
+{
+	return __rsnd_soc_hw_rule_rate(params, rule, 0);
+}
+
+static int __rsnd_soc_hw_rule_channels(struct snd_pcm_hw_params *params,
+				       struct snd_pcm_hw_rule *rule,
+				       int is_play)
 {
 	struct snd_interval *ic_ = hw_param_interval(params, SNDRV_PCM_HW_PARAM_CHANNELS);
 	struct snd_interval *ir = hw_param_interval(params, SNDRV_PCM_HW_PARAM_RATE);
@@ -819,22 +824,34 @@ static int rsnd_soc_hw_rule_channels(struct snd_pcm_hw_params *params,
 	struct snd_soc_dai *dai = rule->private;
 	struct rsnd_dai *rdai = rsnd_dai_to_rdai(dai);
 	struct rsnd_priv *priv = rsnd_rdai_to_priv(rdai);
+	struct rsnd_dai_stream *io = is_play ? &rdai->playback : &rdai->capture;
 
 	/*
 	 * possible sampling rate limitation is same as
 	 * 2ch if it supports multi ssi
+	 * and same as 8ch if TDM 6ch (see rsnd_ssi_config_init())
 	 */
 	ic = *ic_;
-	if (1 < rsnd_rdai_ssi_lane_get(rdai)) {
-		ic.min = 2;
-		ic.max = 2;
-	}
+	ic.min =
+	ic.max = rsnd_runtime_channel_for_ssi_with_params(io, params);
 
 	return rsnd_soc_hw_rule(priv, rsnd_soc_hw_channels_list,
 				ARRAY_SIZE(rsnd_soc_hw_channels_list),
 				ir, &ic);
 }
 
+static int rsnd_soc_hw_rule_channels_playback(struct snd_pcm_hw_params *params,
+					      struct snd_pcm_hw_rule *rule)
+{
+	return __rsnd_soc_hw_rule_channels(params, rule, 1);
+}
+
+static int rsnd_soc_hw_rule_channels_capture(struct snd_pcm_hw_params *params,
+					     struct snd_pcm_hw_rule *rule)
+{
+	return __rsnd_soc_hw_rule_channels(params, rule, 0);
+}
+
 static const struct snd_pcm_hardware rsnd_pcm_hardware = {
 	.info =		SNDRV_PCM_INFO_INTERLEAVED	|
 			SNDRV_PCM_INFO_MMAP		|
@@ -859,6 +876,8 @@ static int rsnd_soc_dai_startup(struct snd_pcm_substream *substream,
 	int ret;
 	int i;
 
+	rsnd_dai_stream_init(io, substream);
+
 	/*
 	 * Channel Limitation
 	 * It depends on Platform design
@@ -886,11 +905,17 @@ static int rsnd_soc_dai_startup(struct snd_pcm_substream *substream,
 	 * It depends on Clock Master Mode
 	 */
 	if (rsnd_rdai_is_clk_master(rdai)) {
+		int is_play = substream->stream == SNDRV_PCM_STREAM_PLAYBACK;
+
 		snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_RATE,
-				    rsnd_soc_hw_rule_rate, dai,
+				    is_play ? rsnd_soc_hw_rule_rate_playback :
+					      rsnd_soc_hw_rule_rate_capture,
+				    dai,
 				    SNDRV_PCM_HW_PARAM_CHANNELS, -1);
 		snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_CHANNELS,
-				    rsnd_soc_hw_rule_channels, dai,
+				    is_play ? rsnd_soc_hw_rule_channels_playback :
+					      rsnd_soc_hw_rule_channels_capture,
+				    dai,
 				    SNDRV_PCM_HW_PARAM_RATE, -1);
 	}
 
@@ -915,6 +940,8 @@ static void rsnd_soc_dai_shutdown(struct snd_pcm_substream *substream,
 	 * call rsnd_dai_call without spinlock
 	 */
 	rsnd_dai_call(nolock_stop, io, priv);
+
+	rsnd_dai_stream_quit(io);
 }
 
 static const struct snd_soc_dai_ops rsnd_soc_dai_ops = {
@@ -990,7 +1017,7 @@ static struct device_node *rsnd_dai_of_node(struct rsnd_priv *priv,
 
 static void __rsnd_dai_probe(struct rsnd_priv *priv,
 			     struct device_node *dai_np,
-			     int dai_i, int is_graph)
+			     int dai_i)
 {
 	struct device_node *playback, *capture;
 	struct rsnd_dai_stream *io_playback;
@@ -1089,13 +1116,13 @@ static int rsnd_dai_probe(struct rsnd_priv *priv)
 	dai_i = 0;
 	if (is_graph) {
 		for_each_endpoint_of_node(dai_node, dai_np) {
-			__rsnd_dai_probe(priv, dai_np, dai_i, is_graph);
+			__rsnd_dai_probe(priv, dai_np, dai_i);
 			rsnd_ssi_parse_hdmi_connection(priv, dai_np, dai_i);
 			dai_i++;
 		}
 	} else {
 		for_each_child_of_node(dai_node, dai_np)
-			__rsnd_dai_probe(priv, dai_np, dai_i++, is_graph);
+			__rsnd_dai_probe(priv, dai_np, dai_i++);
 	}
 
 	return 0;
@@ -1496,6 +1523,8 @@ static int rsnd_remove(struct platform_device *pdev)
 	};
 	int ret = 0, i;
 
+	snd_soc_disconnect_sync(&pdev->dev);
+
 	pm_runtime_disable(&pdev->dev);
 
 	for_each_rsnd_dai(rdai, priv, i) {
diff --git a/sound/soc/sh/rcar/dma.c b/sound/soc/sh/rcar/dma.c
index 4d750bdf..41de234 100644
--- a/sound/soc/sh/rcar/dma.c
+++ b/sound/soc/sh/rcar/dma.c
@@ -71,25 +71,7 @@ static struct rsnd_mod mem = {
 static void __rsnd_dmaen_complete(struct rsnd_mod *mod,
 				  struct rsnd_dai_stream *io)
 {
-	struct rsnd_priv *priv = rsnd_mod_to_priv(mod);
-	bool elapsed = false;
-	unsigned long flags;
-
-	/*
-	 * Renesas sound Gen1 needs 1 DMAC,
-	 * Gen2 needs 2 DMAC.
-	 * In Gen2 case, it are Audio-DMAC, and Audio-DMAC-peri-peri.
-	 * But, Audio-DMAC-peri-peri doesn't have interrupt,
-	 * and this driver is assuming that here.
-	 */
-	spin_lock_irqsave(&priv->lock, flags);
-
 	if (rsnd_io_is_working(io))
-		elapsed = true;
-
-	spin_unlock_irqrestore(&priv->lock, flags);
-
-	if (elapsed)
 		rsnd_dai_period_elapsed(io);
 }
 
diff --git a/sound/soc/sh/rcar/rsnd.h b/sound/soc/sh/rcar/rsnd.h
index 57cd2bc..ad65235 100644
--- a/sound/soc/sh/rcar/rsnd.h
+++ b/sound/soc/sh/rcar/rsnd.h
@@ -399,9 +399,18 @@ void rsnd_parse_connect_common(struct rsnd_dai *rdai,
 		struct device_node *playback,
 		struct device_node *capture);
 
-int rsnd_runtime_channel_original(struct rsnd_dai_stream *io);
-int rsnd_runtime_channel_after_ctu(struct rsnd_dai_stream *io);
-int rsnd_runtime_channel_for_ssi(struct rsnd_dai_stream *io);
+#define rsnd_runtime_channel_original(io) \
+	rsnd_runtime_channel_original_with_params(io, NULL)
+int rsnd_runtime_channel_original_with_params(struct rsnd_dai_stream *io,
+				struct snd_pcm_hw_params *params);
+#define rsnd_runtime_channel_after_ctu(io)			\
+	rsnd_runtime_channel_after_ctu_with_params(io, NULL)
+int rsnd_runtime_channel_after_ctu_with_params(struct rsnd_dai_stream *io,
+				struct snd_pcm_hw_params *params);
+#define rsnd_runtime_channel_for_ssi(io) \
+	rsnd_runtime_channel_for_ssi_with_params(io, NULL)
+int rsnd_runtime_channel_for_ssi_with_params(struct rsnd_dai_stream *io,
+				 struct snd_pcm_hw_params *params);
 int rsnd_runtime_is_ssi_multi(struct rsnd_dai_stream *io);
 int rsnd_runtime_is_ssi_tdm(struct rsnd_dai_stream *io);
 
diff --git a/sound/soc/sh/rcar/ssi.c b/sound/soc/sh/rcar/ssi.c
index cbf3bf3..97a9db8 100644
--- a/sound/soc/sh/rcar/ssi.c
+++ b/sound/soc/sh/rcar/ssi.c
@@ -79,8 +79,8 @@ struct rsnd_ssi {
 	int irq;
 	unsigned int usrcnt;
 
+	/* for PIO */
 	int byte_pos;
-	int period_pos;
 	int byte_per_period;
 	int next_period_byte;
 };
@@ -371,11 +371,11 @@ static void rsnd_ssi_config_init(struct rsnd_mod *mod,
 	if (rsnd_io_is_play(io))
 		cr_own |= TRMD;
 
-	switch (runtime->sample_bits) {
+	switch (snd_pcm_format_width(runtime->format)) {
 	case 16:
 		cr_own |= DWL_16;
 		break;
-	case 32:
+	case 24:
 		cr_own |= DWL_24;
 		break;
 	}
@@ -414,63 +414,6 @@ static void rsnd_ssi_register_setup(struct rsnd_mod *mod)
 					ssi->cr_en);
 }
 
-static void rsnd_ssi_pointer_init(struct rsnd_mod *mod,
-				  struct rsnd_dai_stream *io)
-{
-	struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
-	struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
-
-	ssi->byte_pos		= 0;
-	ssi->period_pos		= 0;
-	ssi->byte_per_period	= runtime->period_size *
-				  runtime->channels *
-				  samples_to_bytes(runtime, 1);
-	ssi->next_period_byte	= ssi->byte_per_period;
-}
-
-static int rsnd_ssi_pointer_offset(struct rsnd_mod *mod,
-				   struct rsnd_dai_stream *io,
-				   int additional)
-{
-	struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
-	struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
-	int pos = ssi->byte_pos + additional;
-
-	pos %= (runtime->periods * ssi->byte_per_period);
-
-	return pos;
-}
-
-static bool rsnd_ssi_pointer_update(struct rsnd_mod *mod,
-				    struct rsnd_dai_stream *io,
-				    int byte)
-{
-	struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
-	bool ret = false;
-	int byte_pos;
-
-	byte_pos = ssi->byte_pos + byte;
-
-	if (byte_pos >= ssi->next_period_byte) {
-		struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
-
-		ssi->period_pos++;
-		ssi->next_period_byte += ssi->byte_per_period;
-
-		if (ssi->period_pos >= runtime->periods) {
-			byte_pos = 0;
-			ssi->period_pos = 0;
-			ssi->next_period_byte = ssi->byte_per_period;
-		}
-
-		ret = true;
-	}
-
-	WRITE_ONCE(ssi->byte_pos, byte_pos);
-
-	return ret;
-}
-
 /*
  *	SSI mod common functions
  */
@@ -484,8 +427,6 @@ static int rsnd_ssi_init(struct rsnd_mod *mod,
 	if (!rsnd_ssi_is_run_mods(mod, io))
 		return 0;
 
-	rsnd_ssi_pointer_init(mod, io);
-
 	ssi->usrcnt++;
 
 	rsnd_mod_power_on(mod);
@@ -656,6 +597,8 @@ static int rsnd_ssi_irq(struct rsnd_mod *mod,
 	return 0;
 }
 
+static bool rsnd_ssi_pio_interrupt(struct rsnd_mod *mod,
+				   struct rsnd_dai_stream *io);
 static void __rsnd_ssi_interrupt(struct rsnd_mod *mod,
 				 struct rsnd_dai_stream *io)
 {
@@ -674,30 +617,8 @@ static void __rsnd_ssi_interrupt(struct rsnd_mod *mod,
 	status = rsnd_ssi_status_get(mod);
 
 	/* PIO only */
-	if (!is_dma && (status & DIRQ)) {
-		struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
-		u32 *buf = (u32 *)(runtime->dma_area +
-				   rsnd_ssi_pointer_offset(mod, io, 0));
-		int shift = 0;
-
-		switch (runtime->sample_bits) {
-		case 32:
-			shift = 8;
-			break;
-		}
-
-		/*
-		 * 8/16/32 data can be assesse to TDR/RDR register
-		 * directly as 32bit data
-		 * see rsnd_ssi_init()
-		 */
-		if (rsnd_io_is_play(io))
-			rsnd_mod_write(mod, SSITDR, (*buf) << shift);
-		else
-			*buf = (rsnd_mod_read(mod, SSIRDR) >> shift);
-
-		elapsed = rsnd_ssi_pointer_update(mod, io, sizeof(*buf));
-	}
+	if (!is_dma && (status & DIRQ))
+		elapsed = rsnd_ssi_pio_interrupt(mod, io);
 
 	/* DMA only */
 	if (is_dma && (status & (UIRQ | OIRQ)))
@@ -835,7 +756,71 @@ static int rsnd_ssi_common_remove(struct rsnd_mod *mod,
 	return 0;
 }
 
-static int rsnd_ssi_pointer(struct rsnd_mod *mod,
+/*
+ *	SSI PIO functions
+ */
+static bool rsnd_ssi_pio_interrupt(struct rsnd_mod *mod,
+				   struct rsnd_dai_stream *io)
+{
+	struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
+	struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
+	u32 *buf = (u32 *)(runtime->dma_area + ssi->byte_pos);
+	int shift = 0;
+	int byte_pos;
+	bool elapsed = false;
+
+	if (snd_pcm_format_width(runtime->format) == 24)
+		shift = 8;
+
+	/*
+	 * 8/16/32 data can be assesse to TDR/RDR register
+	 * directly as 32bit data
+	 * see rsnd_ssi_init()
+	 */
+	if (rsnd_io_is_play(io))
+		rsnd_mod_write(mod, SSITDR, (*buf) << shift);
+	else
+		*buf = (rsnd_mod_read(mod, SSIRDR) >> shift);
+
+	byte_pos = ssi->byte_pos + sizeof(*buf);
+
+	if (byte_pos >= ssi->next_period_byte) {
+		int period_pos = byte_pos / ssi->byte_per_period;
+
+		if (period_pos >= runtime->periods) {
+			byte_pos = 0;
+			period_pos = 0;
+		}
+
+		ssi->next_period_byte = (period_pos + 1) * ssi->byte_per_period;
+
+		elapsed = true;
+	}
+
+	WRITE_ONCE(ssi->byte_pos, byte_pos);
+
+	return elapsed;
+}
+
+static int rsnd_ssi_pio_init(struct rsnd_mod *mod,
+			     struct rsnd_dai_stream *io,
+			     struct rsnd_priv *priv)
+{
+	struct snd_pcm_runtime *runtime = rsnd_io_to_runtime(io);
+	struct rsnd_ssi *ssi = rsnd_mod_to_ssi(mod);
+
+	if (!rsnd_ssi_is_parent(mod, io)) {
+		ssi->byte_pos		= 0;
+		ssi->byte_per_period	= runtime->period_size *
+					  runtime->channels *
+					  samples_to_bytes(runtime, 1);
+		ssi->next_period_byte	= ssi->byte_per_period;
+	}
+
+	return rsnd_ssi_init(mod, io, priv);
+}
+
+static int rsnd_ssi_pio_pointer(struct rsnd_mod *mod,
 			    struct rsnd_dai_stream *io,
 			    snd_pcm_uframes_t *pointer)
 {
@@ -851,12 +836,12 @@ static struct rsnd_mod_ops rsnd_ssi_pio_ops = {
 	.name	= SSI_NAME,
 	.probe	= rsnd_ssi_common_probe,
 	.remove	= rsnd_ssi_common_remove,
-	.init	= rsnd_ssi_init,
+	.init	= rsnd_ssi_pio_init,
 	.quit	= rsnd_ssi_quit,
 	.start	= rsnd_ssi_start,
 	.stop	= rsnd_ssi_stop,
 	.irq	= rsnd_ssi_irq,
-	.pointer= rsnd_ssi_pointer,
+	.pointer = rsnd_ssi_pio_pointer,
 	.pcm_new = rsnd_ssi_pcm_new,
 	.hw_params = rsnd_ssi_hw_params,
 };
diff --git a/sound/soc/soc-acpi.c b/sound/soc/soc-acpi.c
index f21df28..3d7e1ff 100644
--- a/sound/soc/soc-acpi.c
+++ b/sound/soc/soc-acpi.c
@@ -16,79 +16,16 @@
 
 #include <sound/soc-acpi.h>
 
-static acpi_status snd_soc_acpi_find_name(acpi_handle handle, u32 level,
-				      void *context, void **ret)
-{
-	struct acpi_device *adev;
-	const char *name = NULL;
-
-	if (acpi_bus_get_device(handle, &adev))
-		return AE_OK;
-
-	if (adev->status.present && adev->status.functional) {
-		name = acpi_dev_name(adev);
-		*(const char **)ret = name;
-		return AE_CTRL_TERMINATE;
-	}
-
-	return AE_OK;
-}
-
-const char *snd_soc_acpi_find_name_from_hid(const u8 hid[ACPI_ID_LEN])
-{
-	const char *name = NULL;
-	acpi_status status;
-
-	status = acpi_get_devices(hid, snd_soc_acpi_find_name, NULL,
-				  (void **)&name);
-
-	if (ACPI_FAILURE(status) || name[0] == '\0')
-		return NULL;
-
-	return name;
-}
-EXPORT_SYMBOL_GPL(snd_soc_acpi_find_name_from_hid);
-
-static acpi_status snd_soc_acpi_mach_match(acpi_handle handle, u32 level,
-				       void *context, void **ret)
-{
-	unsigned long long sta;
-	acpi_status status;
-
-	*(bool *)context = true;
-	status = acpi_evaluate_integer(handle, "_STA", NULL, &sta);
-	if (ACPI_FAILURE(status) || !(sta & ACPI_STA_DEVICE_PRESENT))
-		*(bool *)context = false;
-
-	return AE_OK;
-}
-
-bool snd_soc_acpi_check_hid(const u8 hid[ACPI_ID_LEN])
-{
-	acpi_status status;
-	bool found = false;
-
-	status = acpi_get_devices(hid, snd_soc_acpi_mach_match, &found, NULL);
-
-	if (ACPI_FAILURE(status))
-		return false;
-
-	return found;
-}
-EXPORT_SYMBOL_GPL(snd_soc_acpi_check_hid);
-
 struct snd_soc_acpi_mach *
 snd_soc_acpi_find_machine(struct snd_soc_acpi_mach *machines)
 {
 	struct snd_soc_acpi_mach *mach;
 
 	for (mach = machines; mach->id[0]; mach++) {
-		if (snd_soc_acpi_check_hid(mach->id) == true) {
-			if (mach->machine_quirk == NULL)
-				return mach;
-
-			if (mach->machine_quirk(mach) != NULL)
-				return mach;
+		if (acpi_dev_present(mach->id, NULL, -1)) {
+			if (mach->machine_quirk)
+				mach = mach->machine_quirk(mach);
+			return mach;
 		}
 	}
 	return NULL;
@@ -163,7 +100,7 @@ struct snd_soc_acpi_mach *snd_soc_acpi_codec_list(void *arg)
 		return mach;
 
 	for (i = 0; i < codec_list->num_codecs; i++) {
-		if (snd_soc_acpi_check_hid(codec_list->codecs[i]) != true)
+		if (!acpi_dev_present(codec_list->codecs[i], NULL, -1))
 			return NULL;
 	}
 
diff --git a/sound/soc/soc-compress.c b/sound/soc/soc-compress.c
index d9b1e64..81232f4 100644
--- a/sound/soc/soc-compress.c
+++ b/sound/soc/soc-compress.c
@@ -1096,7 +1096,6 @@ static struct snd_compr_ops soc_compr_dyn_ops = {
  */
 int snd_soc_new_compress(struct snd_soc_pcm_runtime *rtd, int num)
 {
-	struct snd_soc_codec *codec = rtd->codec;
 	struct snd_soc_platform *platform = rtd->platform;
 	struct snd_soc_component *component;
 	struct snd_soc_rtdcom_list *rtdcom;
@@ -1199,8 +1198,9 @@ int snd_soc_new_compress(struct snd_soc_pcm_runtime *rtd, int num)
 	ret = snd_compress_new(rtd->card->snd_card, num, direction,
 				new_name, compr);
 	if (ret < 0) {
+		component = rtd->codec_dai->component;
 		pr_err("compress asoc: can't create compress for codec %s\n",
-			codec->component.name);
+			component->name);
 		goto compr_err;
 	}
 
diff --git a/sound/soc/soc-core.c b/sound/soc/soc-core.c
index c0edac8..e918795 100644
--- a/sound/soc/soc-core.c
+++ b/sound/soc/soc-core.c
@@ -213,7 +213,7 @@ static umode_t soc_dev_attr_is_visible(struct kobject *kobj,
 
 	if (attr == &dev_attr_pmdown_time.attr)
 		return attr->mode; /* always visible */
-	return rtd->codec ? attr->mode : 0; /* enabled only with codec */
+	return rtd->num_codecs ? attr->mode : 0; /* enabled only with codec */
 }
 
 static const struct attribute_group soc_dapm_dev_group = {
@@ -349,120 +349,84 @@ static void soc_init_codec_debugfs(struct snd_soc_component *component)
 			"ASoC: Failed to create codec register debugfs file\n");
 }
 
-static ssize_t codec_list_read_file(struct file *file, char __user *user_buf,
-				    size_t count, loff_t *ppos)
+static int codec_list_seq_show(struct seq_file *m, void *v)
 {
-	char *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
-	ssize_t len, ret = 0;
 	struct snd_soc_codec *codec;
 
-	if (!buf)
-		return -ENOMEM;
-
 	mutex_lock(&client_mutex);
 
-	list_for_each_entry(codec, &codec_list, list) {
-		len = snprintf(buf + ret, PAGE_SIZE - ret, "%s\n",
-			       codec->component.name);
-		if (len >= 0)
-			ret += len;
-		if (ret > PAGE_SIZE) {
-			ret = PAGE_SIZE;
-			break;
-		}
-	}
+	list_for_each_entry(codec, &codec_list, list)
+		seq_printf(m, "%s\n", codec->component.name);
 
 	mutex_unlock(&client_mutex);
 
-	if (ret >= 0)
-		ret = simple_read_from_buffer(user_buf, count, ppos, buf, ret);
+	return 0;
+}
 
-	kfree(buf);
-
-	return ret;
+static int codec_list_seq_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, codec_list_seq_show, NULL);
 }
 
 static const struct file_operations codec_list_fops = {
-	.read = codec_list_read_file,
-	.llseek = default_llseek,/* read accesses f_pos */
+	.open = codec_list_seq_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
 };
 
-static ssize_t dai_list_read_file(struct file *file, char __user *user_buf,
-				  size_t count, loff_t *ppos)
+static int dai_list_seq_show(struct seq_file *m, void *v)
 {
-	char *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
-	ssize_t len, ret = 0;
 	struct snd_soc_component *component;
 	struct snd_soc_dai *dai;
 
-	if (!buf)
-		return -ENOMEM;
-
 	mutex_lock(&client_mutex);
 
-	list_for_each_entry(component, &component_list, list) {
-		list_for_each_entry(dai, &component->dai_list, list) {
-			len = snprintf(buf + ret, PAGE_SIZE - ret, "%s\n",
-				dai->name);
-			if (len >= 0)
-				ret += len;
-			if (ret > PAGE_SIZE) {
-				ret = PAGE_SIZE;
-				break;
-			}
-		}
-	}
+	list_for_each_entry(component, &component_list, list)
+		list_for_each_entry(dai, &component->dai_list, list)
+			seq_printf(m, "%s\n", dai->name);
 
 	mutex_unlock(&client_mutex);
 
-	ret = simple_read_from_buffer(user_buf, count, ppos, buf, ret);
+	return 0;
+}
 
-	kfree(buf);
-
-	return ret;
+static int dai_list_seq_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, dai_list_seq_show, NULL);
 }
 
 static const struct file_operations dai_list_fops = {
-	.read = dai_list_read_file,
-	.llseek = default_llseek,/* read accesses f_pos */
+	.open = dai_list_seq_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
 };
 
-static ssize_t platform_list_read_file(struct file *file,
-				       char __user *user_buf,
-				       size_t count, loff_t *ppos)
+static int platform_list_seq_show(struct seq_file *m, void *v)
 {
-	char *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);
-	ssize_t len, ret = 0;
 	struct snd_soc_platform *platform;
 
-	if (!buf)
-		return -ENOMEM;
-
 	mutex_lock(&client_mutex);
 
-	list_for_each_entry(platform, &platform_list, list) {
-		len = snprintf(buf + ret, PAGE_SIZE - ret, "%s\n",
-			       platform->component.name);
-		if (len >= 0)
-			ret += len;
-		if (ret > PAGE_SIZE) {
-			ret = PAGE_SIZE;
-			break;
-		}
-	}
+	list_for_each_entry(platform, &platform_list, list)
+		seq_printf(m, "%s\n", platform->component.name);
 
 	mutex_unlock(&client_mutex);
 
-	ret = simple_read_from_buffer(user_buf, count, ppos, buf, ret);
+	return 0;
+}
 
-	kfree(buf);
-
-	return ret;
+static int platform_list_seq_open(struct inode *inode, struct file *file)
+{
+	return single_open(file, platform_list_seq_show, NULL);
 }
 
 static const struct file_operations platform_list_fops = {
-	.read = platform_list_read_file,
-	.llseek = default_llseek,/* read accesses f_pos */
+	.open = platform_list_seq_open,
+	.read = seq_read,
+	.llseek = seq_lseek,
+	.release = single_release,
 };
 
 static void soc_init_card_debugfs(struct snd_soc_card *card)
@@ -491,7 +455,6 @@ static void soc_cleanup_card_debugfs(struct snd_soc_card *card)
 	debugfs_remove_recursive(card->debugfs_card_root);
 }
 
-
 static void snd_soc_debugfs_init(void)
 {
 	snd_soc_debugfs_root = debugfs_create_dir("asoc", NULL);
@@ -598,6 +561,7 @@ struct snd_soc_component *snd_soc_rtdcom_lookup(struct snd_soc_pcm_runtime *rtd,
 
 	return NULL;
 }
+EXPORT_SYMBOL_GPL(snd_soc_rtdcom_lookup);
 
 struct snd_pcm_substream *snd_soc_get_dai_substream(struct snd_soc_card *card,
 		const char *dai_link, int stream)
@@ -1392,6 +1356,17 @@ static int soc_init_dai_link(struct snd_soc_card *card,
 	return 0;
 }
 
+void snd_soc_disconnect_sync(struct device *dev)
+{
+	struct snd_soc_component *component = snd_soc_lookup_component(dev, NULL);
+
+	if (!component || !component->card)
+		return;
+
+	snd_card_disconnect_sync(component->card->snd_card);
+}
+EXPORT_SYMBOL_GPL(snd_soc_disconnect_sync);
+
 /**
  * snd_soc_add_dai_link - Add a DAI link dynamically
  * @card: The ASoC card to which the DAI link is added
@@ -1945,7 +1920,9 @@ int snd_soc_runtime_set_dai_fmt(struct snd_soc_pcm_runtime *rtd,
 	}
 
 	/* Flip the polarity for the "CPU" end of a CODEC<->CODEC link */
-	if (cpu_dai->codec) {
+	/* the component which has non_legacy_dai_naming is Codec */
+	if (cpu_dai->codec ||
+	    cpu_dai->component->driver->non_legacy_dai_naming) {
 		unsigned int inv_dai_fmt;
 
 		inv_dai_fmt = dai_fmt & ~SND_SOC_DAIFMT_MASTER_MASK;
@@ -3149,7 +3126,7 @@ static struct snd_soc_dai *soc_add_dai(struct snd_soc_component *component,
 	if (!dai->driver->ops)
 		dai->driver->ops = &null_dai_ops;
 
-	list_add(&dai->list, &component->dai_list);
+	list_add_tail(&dai->list, &component->dai_list);
 	component->num_dai++;
 
 	dev_dbg(dev, "ASoC: Registered DAI '%s'\n", dai->name);
@@ -3176,8 +3153,6 @@ static int snd_soc_register_dais(struct snd_soc_component *component,
 
 	dev_dbg(dev, "ASoC: dai register %s #%zu\n", dev_name(dev), count);
 
-	component->dai_drv = dai_drv;
-
 	for (i = 0; i < count; i++) {
 
 		dai = soc_add_dai(component, dai_drv + i,
@@ -4354,6 +4329,7 @@ int snd_soc_get_dai_name(struct of_phandle_args *args,
 							     args,
 							     dai_name);
 		} else {
+			struct snd_soc_dai *dai;
 			int id = -1;
 
 			switch (args->args_count) {
@@ -4375,7 +4351,14 @@ int snd_soc_get_dai_name(struct of_phandle_args *args,
 
 			ret = 0;
 
-			*dai_name = pos->dai_drv[id].name;
+			/* find target DAI */
+			list_for_each_entry(dai, &pos->dai_list, list) {
+				if (id == 0)
+					break;
+				id--;
+			}
+
+			*dai_name = dai->driver->name;
 			if (!*dai_name)
 				*dai_name = pos->name;
 		}
diff --git a/sound/soc/soc-io.c b/sound/soc/soc-io.c
index 20340ad..2bc1c4c 100644
--- a/sound/soc/soc-io.c
+++ b/sound/soc/soc-io.c
@@ -34,6 +34,10 @@ int snd_soc_component_read(struct snd_soc_component *component,
 		ret = regmap_read(component->regmap, reg, val);
 	else if (component->read)
 		ret = component->read(component, reg, val);
+	else if (component->driver->read) {
+		*val = component->driver->read(component, reg);
+		ret = 0;
+	}
 	else
 		ret = -EIO;
 
@@ -70,6 +74,8 @@ int snd_soc_component_write(struct snd_soc_component *component,
 		return regmap_write(component->regmap, reg, val);
 	else if (component->write)
 		return component->write(component, reg, val);
+	else if (component->driver->write)
+		return component->driver->write(component, reg, val);
 	else
 		return -EIO;
 }
diff --git a/sound/soc/soc-ops.c b/sound/soc/soc-ops.c
index 500f98c..7144a51 100644
--- a/sound/soc/soc-ops.c
+++ b/sound/soc/soc-ops.c
@@ -378,7 +378,7 @@ int snd_soc_get_volsw_sx(struct snd_kcontrol *kcontrol,
 	unsigned int rshift = mc->rshift;
 	int max = mc->max;
 	int min = mc->min;
-	int mask = (1 << (fls(min + max) - 1)) - 1;
+	unsigned int mask = (1 << (fls(min + max) - 1)) - 1;
 	unsigned int val;
 	int ret;
 
@@ -423,7 +423,7 @@ int snd_soc_put_volsw_sx(struct snd_kcontrol *kcontrol,
 	unsigned int rshift = mc->rshift;
 	int max = mc->max;
 	int min = mc->min;
-	int mask = (1 << (fls(min + max) - 1)) - 1;
+	unsigned int mask = (1 << (fls(min + max) - 1)) - 1;
 	int err = 0;
 	unsigned int val, val_mask, val2 = 0;
 
diff --git a/sound/soc/soc-utils.c b/sound/soc/soc-utils.c
index e30aacb..bcd3da2 100644
--- a/sound/soc/soc-utils.c
+++ b/sound/soc/soc-utils.c
@@ -288,7 +288,7 @@ static const struct snd_soc_platform_driver dummy_platform = {
 	.ops = &dummy_dma_ops,
 };
 
-static struct snd_soc_codec_driver dummy_codec;
+static const struct snd_soc_codec_driver dummy_codec;
 
 #define STUB_RATES	SNDRV_PCM_RATE_8000_192000
 #define STUB_FORMATS	(SNDRV_PCM_FMTBIT_S8 | \
diff --git a/sound/soc/stm/Kconfig b/sound/soc/stm/Kconfig
index 3398e6c..3ad881f 100644
--- a/sound/soc/stm/Kconfig
+++ b/sound/soc/stm/Kconfig
@@ -28,4 +28,16 @@
 	help
 	  Say Y if you want to enable S/PDIF capture for STM32
 
+config SND_SOC_STM32_DFSDM
+	tristate "SoC Audio support for STM32 DFSDM"
+	depends on ARCH_STM32 || COMPILE_TEST
+	depends on SND_SOC
+	depends on STM32_DFSDM_ADC
+	select SND_SOC_GENERIC_DMAENGINE_PCM
+	select SND_SOC_DMIC
+	select IIO_BUFFER_CB
+	help
+	  Select this option to enable the STM32 Digital Filter
+	  for Sigma Delta Modulators (DFSDM) driver used
+	  in various STM32 series for digital microphone capture.
 endmenu
diff --git a/sound/soc/stm/Makefile b/sound/soc/stm/Makefile
index 5b7f0fa..3143c0b 100644
--- a/sound/soc/stm/Makefile
+++ b/sound/soc/stm/Makefile
@@ -13,3 +13,6 @@
 # SPDIFRX
 snd-soc-stm32-spdifrx-objs := stm32_spdifrx.o
 obj-$(CONFIG_SND_SOC_STM32_SPDIFRX) += snd-soc-stm32-spdifrx.o
+
+#DFSDM
+obj-$(CONFIG_SND_SOC_STM32_DFSDM) += stm32_adfsdm.o
diff --git a/sound/soc/stm/stm32_adfsdm.c b/sound/soc/stm/stm32_adfsdm.c
new file mode 100644
index 0000000..7306e3e
--- /dev/null
+++ b/sound/soc/stm/stm32_adfsdm.c
@@ -0,0 +1,347 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * This file is part of STM32 DFSDM ASoC DAI driver
+ *
+ * Copyright (C) 2017, STMicroelectronics - All Rights Reserved
+ * Authors: Arnaud Pouliquen <arnaud.pouliquen@st.com>
+ *          Olivier Moysan <olivier.moysan@st.com>
+ */
+
+#include <linux/clk.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#include <linux/iio/iio.h>
+#include <linux/iio/consumer.h>
+#include <linux/iio/adc/stm32-dfsdm-adc.h>
+
+#include <sound/pcm.h>
+#include <sound/soc.h>
+
+#define STM32_ADFSDM_DRV_NAME "stm32-adfsdm"
+
+#define DFSDM_MAX_PERIOD_SIZE	(PAGE_SIZE / 2)
+#define DFSDM_MAX_PERIODS	6
+
+struct stm32_adfsdm_priv {
+	struct snd_soc_dai_driver dai_drv;
+	struct snd_pcm_substream *substream;
+	struct device *dev;
+
+	/* IIO */
+	struct iio_channel *iio_ch;
+	struct iio_cb_buffer *iio_cb;
+	bool iio_active;
+
+	/* PCM buffer */
+	unsigned char *pcm_buff;
+	unsigned int pos;
+};
+
+static const struct snd_pcm_hardware stm32_adfsdm_pcm_hw = {
+	.info = SNDRV_PCM_INFO_INTERLEAVED | SNDRV_PCM_INFO_BLOCK_TRANSFER |
+	    SNDRV_PCM_INFO_PAUSE,
+	.formats = SNDRV_PCM_FMTBIT_S32_LE,
+
+	.rate_min = 8000,
+	.rate_max = 32000,
+
+	.channels_min = 1,
+	.channels_max = 1,
+
+	.periods_min = 2,
+	.periods_max = DFSDM_MAX_PERIODS,
+
+	.period_bytes_max = DFSDM_MAX_PERIOD_SIZE,
+	.buffer_bytes_max = DFSDM_MAX_PERIODS * DFSDM_MAX_PERIOD_SIZE
+};
+
+static void stm32_adfsdm_shutdown(struct snd_pcm_substream *substream,
+				  struct snd_soc_dai *dai)
+{
+	struct stm32_adfsdm_priv *priv = snd_soc_dai_get_drvdata(dai);
+
+	if (priv->iio_active) {
+		iio_channel_stop_all_cb(priv->iio_cb);
+		priv->iio_active = false;
+	}
+}
+
+static int stm32_adfsdm_dai_prepare(struct snd_pcm_substream *substream,
+				    struct snd_soc_dai *dai)
+{
+	struct stm32_adfsdm_priv *priv = snd_soc_dai_get_drvdata(dai);
+	int ret;
+
+	ret = iio_write_channel_attribute(priv->iio_ch,
+					  substream->runtime->rate, 0,
+					  IIO_CHAN_INFO_SAMP_FREQ);
+	if (ret < 0) {
+		dev_err(dai->dev, "%s: Failed to set %d sampling rate\n",
+			__func__, substream->runtime->rate);
+		return ret;
+	}
+
+	if (!priv->iio_active) {
+		ret = iio_channel_start_all_cb(priv->iio_cb);
+		if (!ret)
+			priv->iio_active = true;
+		else
+			dev_err(dai->dev, "%s: IIO channel start failed (%d)\n",
+				__func__, ret);
+	}
+
+	return ret;
+}
+
+static int stm32_adfsdm_set_sysclk(struct snd_soc_dai *dai, int clk_id,
+				   unsigned int freq, int dir)
+{
+	struct stm32_adfsdm_priv *priv = snd_soc_dai_get_drvdata(dai);
+	ssize_t size;
+	char str_freq[10];
+
+	dev_dbg(dai->dev, "%s: Enter for freq %d\n", __func__, freq);
+
+	/* Set IIO frequency if CODEC is master as clock comes from SPI_IN */
+
+	snprintf(str_freq, sizeof(str_freq), "%d\n", freq);
+	size = iio_write_channel_ext_info(priv->iio_ch, "spi_clk_freq",
+					  str_freq, sizeof(str_freq));
+	if (size != sizeof(str_freq)) {
+		dev_err(dai->dev, "%s: Failed to set SPI clock\n",
+			__func__);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static const struct snd_soc_dai_ops stm32_adfsdm_dai_ops = {
+	.shutdown = stm32_adfsdm_shutdown,
+	.prepare = stm32_adfsdm_dai_prepare,
+	.set_sysclk = stm32_adfsdm_set_sysclk,
+};
+
+static const struct snd_soc_dai_driver stm32_adfsdm_dai = {
+	.capture = {
+		    .channels_min = 1,
+		    .channels_max = 1,
+		    .formats = SNDRV_PCM_FMTBIT_S32_LE,
+		    .rates = (SNDRV_PCM_RATE_8000 | SNDRV_PCM_RATE_16000 |
+			      SNDRV_PCM_RATE_32000),
+		    },
+	.ops = &stm32_adfsdm_dai_ops,
+};
+
+static const struct snd_soc_component_driver stm32_adfsdm_dai_component = {
+	.name = "stm32_dfsdm_audio",
+};
+
+static int stm32_afsdm_pcm_cb(const void *data, size_t size, void *private)
+{
+	struct stm32_adfsdm_priv *priv = private;
+	struct snd_soc_pcm_runtime *rtd = priv->substream->private_data;
+	u8 *pcm_buff = priv->pcm_buff;
+	u8 *src_buff = (u8 *)data;
+	unsigned int buff_size = snd_pcm_lib_buffer_bytes(priv->substream);
+	unsigned int period_size = snd_pcm_lib_period_bytes(priv->substream);
+	unsigned int old_pos = priv->pos;
+	unsigned int cur_size = size;
+
+	dev_dbg(rtd->dev, "%s: buff_add :%p, pos = %d, size = %zu\n",
+		__func__, &pcm_buff[priv->pos], priv->pos, size);
+
+	if ((priv->pos + size) > buff_size) {
+		memcpy(&pcm_buff[priv->pos], src_buff, buff_size - priv->pos);
+		cur_size -= buff_size - priv->pos;
+		priv->pos = 0;
+	}
+
+	memcpy(&pcm_buff[priv->pos], &src_buff[size - cur_size], cur_size);
+	priv->pos = (priv->pos + cur_size) % buff_size;
+
+	if (cur_size != size || (old_pos && (old_pos % period_size < size)))
+		snd_pcm_period_elapsed(priv->substream);
+
+	return 0;
+}
+
+static int stm32_adfsdm_trigger(struct snd_pcm_substream *substream, int cmd)
+{
+	struct snd_soc_pcm_runtime *rtd = substream->private_data;
+	struct stm32_adfsdm_priv *priv =
+		snd_soc_dai_get_drvdata(rtd->cpu_dai);
+
+	switch (cmd) {
+	case SNDRV_PCM_TRIGGER_START:
+	case SNDRV_PCM_TRIGGER_RESUME:
+		priv->pos = 0;
+		return stm32_dfsdm_get_buff_cb(priv->iio_ch->indio_dev,
+					       stm32_afsdm_pcm_cb, priv);
+	case SNDRV_PCM_TRIGGER_SUSPEND:
+	case SNDRV_PCM_TRIGGER_STOP:
+		return stm32_dfsdm_release_buff_cb(priv->iio_ch->indio_dev);
+	}
+
+	return -EINVAL;
+}
+
+static int stm32_adfsdm_pcm_open(struct snd_pcm_substream *substream)
+{
+	struct snd_soc_pcm_runtime *rtd = substream->private_data;
+	struct stm32_adfsdm_priv *priv = snd_soc_dai_get_drvdata(rtd->cpu_dai);
+	int ret;
+
+	ret =  snd_soc_set_runtime_hwparams(substream, &stm32_adfsdm_pcm_hw);
+	if (!ret)
+		priv->substream = substream;
+
+	return ret;
+}
+
+static int stm32_adfsdm_pcm_close(struct snd_pcm_substream *substream)
+{
+	struct snd_soc_pcm_runtime *rtd = substream->private_data;
+	struct stm32_adfsdm_priv *priv =
+		snd_soc_dai_get_drvdata(rtd->cpu_dai);
+
+	snd_pcm_lib_free_pages(substream);
+	priv->substream = NULL;
+
+	return 0;
+}
+
+static snd_pcm_uframes_t stm32_adfsdm_pcm_pointer(
+					    struct snd_pcm_substream *substream)
+{
+	struct snd_soc_pcm_runtime *rtd = substream->private_data;
+	struct stm32_adfsdm_priv *priv =
+		snd_soc_dai_get_drvdata(rtd->cpu_dai);
+
+	return bytes_to_frames(substream->runtime, priv->pos);
+}
+
+static int stm32_adfsdm_pcm_hw_params(struct snd_pcm_substream *substream,
+				      struct snd_pcm_hw_params *params)
+{
+	struct snd_soc_pcm_runtime *rtd = substream->private_data;
+	struct stm32_adfsdm_priv *priv =
+		snd_soc_dai_get_drvdata(rtd->cpu_dai);
+	int ret;
+
+	ret =  snd_pcm_lib_malloc_pages(substream, params_buffer_bytes(params));
+	if (ret < 0)
+		return ret;
+	priv->pcm_buff = substream->runtime->dma_area;
+
+	return iio_channel_cb_set_buffer_watermark(priv->iio_cb,
+						   params_period_size(params));
+}
+
+static int stm32_adfsdm_pcm_hw_free(struct snd_pcm_substream *substream)
+{
+	snd_pcm_lib_free_pages(substream);
+
+	return 0;
+}
+
+static struct snd_pcm_ops stm32_adfsdm_pcm_ops = {
+	.open		= stm32_adfsdm_pcm_open,
+	.close		= stm32_adfsdm_pcm_close,
+	.hw_params	= stm32_adfsdm_pcm_hw_params,
+	.hw_free	= stm32_adfsdm_pcm_hw_free,
+	.trigger	= stm32_adfsdm_trigger,
+	.pointer	= stm32_adfsdm_pcm_pointer,
+};
+
+static int stm32_adfsdm_pcm_new(struct snd_soc_pcm_runtime *rtd)
+{
+	struct snd_pcm *pcm = rtd->pcm;
+	struct stm32_adfsdm_priv *priv =
+		snd_soc_dai_get_drvdata(rtd->cpu_dai);
+	unsigned int size = DFSDM_MAX_PERIODS * DFSDM_MAX_PERIOD_SIZE;
+
+	return snd_pcm_lib_preallocate_pages_for_all(pcm, SNDRV_DMA_TYPE_DEV,
+						     priv->dev, size, size);
+}
+
+static void stm32_adfsdm_pcm_free(struct snd_pcm *pcm)
+{
+	struct snd_pcm_substream *substream;
+	struct snd_soc_pcm_runtime *rtd;
+	struct stm32_adfsdm_priv *priv;
+
+	substream = pcm->streams[SNDRV_PCM_STREAM_CAPTURE].substream;
+	if (substream) {
+		rtd = substream->private_data;
+		priv = snd_soc_dai_get_drvdata(rtd->cpu_dai);
+
+		snd_pcm_lib_preallocate_free_for_all(pcm);
+	}
+}
+
+static struct snd_soc_platform_driver stm32_adfsdm_soc_platform = {
+	.ops		= &stm32_adfsdm_pcm_ops,
+	.pcm_new	= stm32_adfsdm_pcm_new,
+	.pcm_free	= stm32_adfsdm_pcm_free,
+};
+
+static const struct of_device_id stm32_adfsdm_of_match[] = {
+	{.compatible = "st,stm32h7-dfsdm-dai"},
+	{}
+};
+MODULE_DEVICE_TABLE(of, stm32_adfsdm_of_match);
+
+static int stm32_adfsdm_probe(struct platform_device *pdev)
+{
+	struct stm32_adfsdm_priv *priv;
+	int ret;
+
+	priv = devm_kzalloc(&pdev->dev, sizeof(*priv), GFP_KERNEL);
+	if (!priv)
+		return -ENOMEM;
+
+	priv->dev = &pdev->dev;
+	priv->dai_drv = stm32_adfsdm_dai;
+
+	dev_set_drvdata(&pdev->dev, priv);
+
+	ret = devm_snd_soc_register_component(&pdev->dev,
+					      &stm32_adfsdm_dai_component,
+					      &priv->dai_drv, 1);
+	if (ret < 0)
+		return ret;
+
+	/* Associate iio channel */
+	priv->iio_ch  = devm_iio_channel_get_all(&pdev->dev);
+	if (IS_ERR(priv->iio_ch))
+		return PTR_ERR(priv->iio_ch);
+
+	priv->iio_cb = iio_channel_get_all_cb(&pdev->dev, NULL, NULL);
+	if (IS_ERR(priv->iio_cb))
+		return PTR_ERR(priv->iio_cb);
+
+	ret = devm_snd_soc_register_platform(&pdev->dev,
+					     &stm32_adfsdm_soc_platform);
+	if (ret < 0)
+		dev_err(&pdev->dev, "%s: Failed to register PCM platform\n",
+			__func__);
+
+	return ret;
+}
+
+static struct platform_driver stm32_adfsdm_driver = {
+	.driver = {
+		   .name = STM32_ADFSDM_DRV_NAME,
+		   .of_match_table = stm32_adfsdm_of_match,
+		   },
+	.probe = stm32_adfsdm_probe,
+};
+
+module_platform_driver(stm32_adfsdm_driver);
+
+MODULE_DESCRIPTION("stm32 DFSDM DAI driver");
+MODULE_AUTHOR("Arnaud Pouliquen <arnaud.pouliquen@st.com>");
+MODULE_LICENSE("GPL v2");
+MODULE_ALIAS("platform:" STM32_ADFSDM_DRV_NAME);
diff --git a/sound/soc/stm/stm32_sai.c b/sound/soc/stm/stm32_sai.c
index d6f71a3..d743b7d 100644
--- a/sound/soc/stm/stm32_sai.c
+++ b/sound/soc/stm/stm32_sai.c
@@ -28,16 +28,6 @@
 
 #include "stm32_sai.h"
 
-static LIST_HEAD(sync_providers);
-static DEFINE_MUTEX(sync_mutex);
-
-struct sync_provider {
-	struct list_head link;
-	struct device_node *node;
-	int  (*sync_conf)(void *data, int synco);
-	void *data;
-};
-
 static const struct stm32_sai_conf stm32_sai_conf_f4 = {
 	.version = SAI_STM32F4,
 };
@@ -70,9 +60,8 @@ static int stm32_sai_sync_conf_client(struct stm32_sai_data *sai, int synci)
 	return 0;
 }
 
-static int stm32_sai_sync_conf_provider(void *data, int synco)
+static int stm32_sai_sync_conf_provider(struct stm32_sai_data *sai, int synco)
 {
-	struct stm32_sai_data *sai = (struct stm32_sai_data *)data;
 	u32 prev_synco;
 	int ret;
 
@@ -103,83 +92,42 @@ static int stm32_sai_sync_conf_provider(void *data, int synco)
 	return 0;
 }
 
-static int stm32_sai_set_sync_provider(struct device_node *np, int synco)
-{
-	struct sync_provider *provider;
-	int ret;
-
-	mutex_lock(&sync_mutex);
-	list_for_each_entry(provider, &sync_providers, link) {
-		if (provider->node == np) {
-			ret = provider->sync_conf(provider->data, synco);
-			mutex_unlock(&sync_mutex);
-			return ret;
-		}
-	}
-	mutex_unlock(&sync_mutex);
-
-	/* SAI sync provider not found */
-	return -ENODEV;
-}
-
-static int stm32_sai_set_sync(struct stm32_sai_data *sai,
+static int stm32_sai_set_sync(struct stm32_sai_data *sai_client,
 			      struct device_node *np_provider,
 			      int synco, int synci)
 {
+	struct platform_device *pdev = of_find_device_by_node(np_provider);
+	struct stm32_sai_data *sai_provider;
 	int ret;
 
+	if (!pdev) {
+		dev_err(&sai_client->pdev->dev,
+			"Device not found for node %s\n", np_provider->name);
+		return -ENODEV;
+	}
+
+	sai_provider = platform_get_drvdata(pdev);
+	if (!sai_provider) {
+		dev_err(&sai_client->pdev->dev,
+			"SAI sync provider data not found\n");
+		return -EINVAL;
+	}
+
 	/* Configure sync client */
-	stm32_sai_sync_conf_client(sai, synci);
+	ret = stm32_sai_sync_conf_client(sai_client, synci);
+	if (ret < 0)
+		return ret;
 
 	/* Configure sync provider */
-	ret = stm32_sai_set_sync_provider(np_provider, synco);
-
-	return ret;
-}
-
-static int stm32_sai_sync_add_provider(struct platform_device *pdev,
-				       void *data)
-{
-	struct sync_provider *sp;
-
-	sp = devm_kzalloc(&pdev->dev, sizeof(*sp), GFP_KERNEL);
-	if (!sp)
-		return -ENOMEM;
-
-	sp->node = of_node_get(pdev->dev.of_node);
-	sp->data = data;
-	sp->sync_conf = &stm32_sai_sync_conf_provider;
-
-	mutex_lock(&sync_mutex);
-	list_add(&sp->link, &sync_providers);
-	mutex_unlock(&sync_mutex);
-
-	return 0;
-}
-
-static void stm32_sai_sync_del_provider(struct device_node *np)
-{
-	struct sync_provider *sp;
-
-	mutex_lock(&sync_mutex);
-	list_for_each_entry(sp, &sync_providers, link) {
-		if (sp->node == np) {
-			list_del(&sp->link);
-			of_node_put(sp->node);
-			break;
-		}
-	}
-	mutex_unlock(&sync_mutex);
+	return stm32_sai_sync_conf_provider(sai_provider, synco);
 }
 
 static int stm32_sai_probe(struct platform_device *pdev)
 {
-	struct device_node *np = pdev->dev.of_node;
 	struct stm32_sai_data *sai;
 	struct reset_control *rst;
 	struct resource *res;
 	const struct of_device_id *of_id;
-	int ret;
 
 	sai = devm_kzalloc(&pdev->dev, sizeof(*sai), GFP_KERNEL);
 	if (!sai)
@@ -231,28 +179,11 @@ static int stm32_sai_probe(struct platform_device *pdev)
 		reset_control_deassert(rst);
 	}
 
-	ret = stm32_sai_sync_add_provider(pdev, sai);
-	if (ret < 0)
-		return ret;
-	sai->set_sync = &stm32_sai_set_sync;
-
 	sai->pdev = pdev;
+	sai->set_sync = &stm32_sai_set_sync;
 	platform_set_drvdata(pdev, sai);
 
-	ret = of_platform_populate(np, NULL, NULL, &pdev->dev);
-	if (ret < 0)
-		stm32_sai_sync_del_provider(np);
-
-	return ret;
-}
-
-static int stm32_sai_remove(struct platform_device *pdev)
-{
-	of_platform_depopulate(&pdev->dev);
-
-	stm32_sai_sync_del_provider(pdev->dev.of_node);
-
-	return 0;
+	return devm_of_platform_populate(&pdev->dev);
 }
 
 MODULE_DEVICE_TABLE(of, stm32_sai_ids);
@@ -263,7 +194,6 @@ static struct platform_driver stm32_sai_driver = {
 		.of_match_table = stm32_sai_ids,
 	},
 	.probe = stm32_sai_probe,
-	.remove = stm32_sai_remove,
 };
 
 module_platform_driver(stm32_sai_driver);
diff --git a/sound/soc/sunxi/sun4i-codec.c b/sound/soc/sunxi/sun4i-codec.c
index 5da4efe..8862816 100644
--- a/sound/soc/sunxi/sun4i-codec.c
+++ b/sound/soc/sunxi/sun4i-codec.c
@@ -590,12 +590,28 @@ static int sun4i_codec_hw_params(struct snd_pcm_substream *substream,
 					     hwrate);
 }
 
+
+static unsigned int sun4i_codec_src_rates[] = {
+	8000, 11025, 12000, 16000, 22050, 24000, 32000,
+	44100, 48000, 96000, 192000
+};
+
+
+static struct snd_pcm_hw_constraint_list sun4i_codec_constraints = {
+	.count  = ARRAY_SIZE(sun4i_codec_src_rates),
+	.list   = sun4i_codec_src_rates,
+};
+
+
 static int sun4i_codec_startup(struct snd_pcm_substream *substream,
 			       struct snd_soc_dai *dai)
 {
 	struct snd_soc_pcm_runtime *rtd = substream->private_data;
 	struct sun4i_codec *scodec = snd_soc_card_get_drvdata(rtd->card);
 
+	snd_pcm_hw_constraint_list(substream->runtime, 0,
+				SNDRV_PCM_HW_PARAM_RATE, &sun4i_codec_constraints);
+
 	/*
 	 * Stop issuing DRQ when we have room for less than 16 samples
 	 * in our TX FIFO
@@ -633,9 +649,7 @@ static struct snd_soc_dai_driver sun4i_codec_dai = {
 		.channels_max	= 2,
 		.rate_min	= 8000,
 		.rate_max	= 192000,
-		.rates		= SNDRV_PCM_RATE_8000_48000 |
-				  SNDRV_PCM_RATE_96000 |
-				  SNDRV_PCM_RATE_192000,
+		.rates		= SNDRV_PCM_RATE_CONTINUOUS,
 		.formats	= SNDRV_PCM_FMTBIT_S16_LE |
 				  SNDRV_PCM_FMTBIT_S32_LE,
 		.sig_bits	= 24,
@@ -645,11 +659,8 @@ static struct snd_soc_dai_driver sun4i_codec_dai = {
 		.channels_min	= 1,
 		.channels_max	= 2,
 		.rate_min	= 8000,
-		.rate_max	= 192000,
-		.rates		= SNDRV_PCM_RATE_8000_48000 |
-				  SNDRV_PCM_RATE_96000 |
-				  SNDRV_PCM_RATE_192000 |
-				  SNDRV_PCM_RATE_KNOT,
+		.rate_max	= 48000,
+		.rates		= SNDRV_PCM_RATE_CONTINUOUS,
 		.formats	= SNDRV_PCM_FMTBIT_S16_LE |
 				  SNDRV_PCM_FMTBIT_S32_LE,
 		.sig_bits	= 24,
@@ -1128,7 +1139,7 @@ static const struct snd_soc_component_driver sun4i_codec_component = {
 	.name = "sun4i-codec",
 };
 
-#define SUN4I_CODEC_RATES	SNDRV_PCM_RATE_8000_192000
+#define SUN4I_CODEC_RATES	SNDRV_PCM_RATE_CONTINUOUS
 #define SUN4I_CODEC_FORMATS	(SNDRV_PCM_FMTBIT_S16_LE | \
 				 SNDRV_PCM_FMTBIT_S32_LE)
 
diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c
index 04f9258..dca1143 100644
--- a/sound/soc/sunxi/sun4i-i2s.c
+++ b/sound/soc/sunxi/sun4i-i2s.c
@@ -269,10 +269,11 @@ static bool sun4i_i2s_oversample_is_valid(unsigned int oversample)
 	return false;
 }
 
-static int sun4i_i2s_set_clk_rate(struct sun4i_i2s *i2s,
+static int sun4i_i2s_set_clk_rate(struct snd_soc_dai *dai,
 				  unsigned int rate,
 				  unsigned int word_size)
 {
+	struct sun4i_i2s *i2s = snd_soc_dai_get_drvdata(dai);
 	unsigned int oversample_rate, clk_rate;
 	int bclk_div, mclk_div;
 	int ret;
@@ -300,6 +301,7 @@ static int sun4i_i2s_set_clk_rate(struct sun4i_i2s *i2s,
 		break;
 
 	default:
+		dev_err(dai->dev, "Unsupported sample rate: %u\n", rate);
 		return -EINVAL;
 	}
 
@@ -308,18 +310,25 @@ static int sun4i_i2s_set_clk_rate(struct sun4i_i2s *i2s,
 		return ret;
 
 	oversample_rate = i2s->mclk_freq / rate;
-	if (!sun4i_i2s_oversample_is_valid(oversample_rate))
+	if (!sun4i_i2s_oversample_is_valid(oversample_rate)) {
+		dev_err(dai->dev, "Unsupported oversample rate: %d\n",
+			oversample_rate);
 		return -EINVAL;
+	}
 
 	bclk_div = sun4i_i2s_get_bclk_div(i2s, oversample_rate,
 					  word_size);
-	if (bclk_div < 0)
+	if (bclk_div < 0) {
+		dev_err(dai->dev, "Unsupported BCLK divider: %d\n", bclk_div);
 		return -EINVAL;
+	}
 
 	mclk_div = sun4i_i2s_get_mclk_div(i2s, oversample_rate,
 					  clk_rate, rate);
-	if (mclk_div < 0)
+	if (mclk_div < 0) {
+		dev_err(dai->dev, "Unsupported MCLK divider: %d\n", mclk_div);
 		return -EINVAL;
+	}
 
 	/* Adjust the clock division values if needed */
 	bclk_div += i2s->variant->bclk_offset;
@@ -349,8 +358,11 @@ static int sun4i_i2s_hw_params(struct snd_pcm_substream *substream,
 	u32 width;
 
 	channels = params_channels(params);
-	if (channels != 2)
+	if (channels != 2) {
+		dev_err(dai->dev, "Unsupported number of channels: %d\n",
+			channels);
 		return -EINVAL;
+	}
 
 	if (i2s->variant->has_chcfg) {
 		regmap_update_bits(i2s->regmap, SUN8I_I2S_CHAN_CFG_REG,
@@ -382,6 +394,8 @@ static int sun4i_i2s_hw_params(struct snd_pcm_substream *substream,
 		width = DMA_SLAVE_BUSWIDTH_2_BYTES;
 		break;
 	default:
+		dev_err(dai->dev, "Unsupported physical sample width: %d\n",
+			params_physical_width(params));
 		return -EINVAL;
 	}
 	i2s->playback_dma_data.addr_width = width;
@@ -393,6 +407,8 @@ static int sun4i_i2s_hw_params(struct snd_pcm_substream *substream,
 		break;
 
 	default:
+		dev_err(dai->dev, "Unsupported sample width: %d\n",
+			params_width(params));
 		return -EINVAL;
 	}
 
@@ -401,7 +417,7 @@ static int sun4i_i2s_hw_params(struct snd_pcm_substream *substream,
 	regmap_field_write(i2s->field_fmt_sr,
 			   sr + i2s->variant->fmt_offset);
 
-	return sun4i_i2s_set_clk_rate(i2s, params_rate(params),
+	return sun4i_i2s_set_clk_rate(dai, params_rate(params),
 				      params_width(params));
 }
 
@@ -426,6 +442,8 @@ static int sun4i_i2s_set_fmt(struct snd_soc_dai *dai, unsigned int fmt)
 		val = SUN4I_I2S_FMT0_FMT_RIGHT_J;
 		break;
 	default:
+		dev_err(dai->dev, "Unsupported format: %d\n",
+			fmt & SND_SOC_DAIFMT_FORMAT_MASK);
 		return -EINVAL;
 	}
 
@@ -464,6 +482,8 @@ static int sun4i_i2s_set_fmt(struct snd_soc_dai *dai, unsigned int fmt)
 	case SND_SOC_DAIFMT_NB_NF:
 		break;
 	default:
+		dev_err(dai->dev, "Unsupported clock polarity: %d\n",
+			fmt & SND_SOC_DAIFMT_INV_MASK);
 		return -EINVAL;
 	}
 
@@ -482,6 +502,8 @@ static int sun4i_i2s_set_fmt(struct snd_soc_dai *dai, unsigned int fmt)
 			val = SUN4I_I2S_CTRL_MODE_SLAVE;
 			break;
 		default:
+			dev_err(dai->dev, "Unsupported slave setting: %d\n",
+				fmt & SND_SOC_DAIFMT_MASTER_MASK);
 			return -EINVAL;
 		}
 		regmap_update_bits(i2s->regmap, SUN4I_I2S_CTRL_REG,
@@ -504,6 +526,8 @@ static int sun4i_i2s_set_fmt(struct snd_soc_dai *dai, unsigned int fmt)
 			val = 0;
 			break;
 		default:
+			dev_err(dai->dev, "Unsupported slave setting: %d\n",
+				fmt & SND_SOC_DAIFMT_MASTER_MASK);
 			return -EINVAL;
 		}
 		regmap_update_bits(i2s->regmap, SUN4I_I2S_CTRL_REG,
@@ -897,6 +921,23 @@ static const struct sun4i_i2s_quirks sun6i_a31_i2s_quirks = {
 	.field_rxchansel	= REG_FIELD(SUN4I_I2S_RX_CHAN_SEL_REG, 0, 2),
 };
 
+static const struct sun4i_i2s_quirks sun8i_a83t_i2s_quirks = {
+	.has_reset		= true,
+	.reg_offset_txdata	= SUN8I_I2S_FIFO_TX_REG,
+	.sun4i_i2s_regmap	= &sun4i_i2s_regmap_config,
+	.field_clkdiv_mclk_en	= REG_FIELD(SUN4I_I2S_CLK_DIV_REG, 7, 7),
+	.field_fmt_wss		= REG_FIELD(SUN4I_I2S_FMT0_REG, 2, 3),
+	.field_fmt_sr		= REG_FIELD(SUN4I_I2S_FMT0_REG, 4, 5),
+	.field_fmt_bclk		= REG_FIELD(SUN4I_I2S_FMT0_REG, 6, 6),
+	.field_fmt_lrclk	= REG_FIELD(SUN4I_I2S_FMT0_REG, 7, 7),
+	.has_slave_select_bit	= true,
+	.field_fmt_mode		= REG_FIELD(SUN4I_I2S_FMT0_REG, 0, 1),
+	.field_txchanmap	= REG_FIELD(SUN4I_I2S_TX_CHAN_MAP_REG, 0, 31),
+	.field_rxchanmap	= REG_FIELD(SUN4I_I2S_RX_CHAN_MAP_REG, 0, 31),
+	.field_txchansel	= REG_FIELD(SUN4I_I2S_TX_CHAN_SEL_REG, 0, 2),
+	.field_rxchansel	= REG_FIELD(SUN4I_I2S_RX_CHAN_SEL_REG, 0, 2),
+};
+
 static const struct sun4i_i2s_quirks sun8i_h3_i2s_quirks = {
 	.has_reset		= true,
 	.reg_offset_txdata	= SUN8I_I2S_FIFO_TX_REG,
@@ -1121,6 +1162,10 @@ static const struct of_device_id sun4i_i2s_match[] = {
 		.data = &sun6i_a31_i2s_quirks,
 	},
 	{
+		.compatible = "allwinner,sun8i-a83t-i2s",
+		.data = &sun8i_a83t_i2s_quirks,
+	},
+	{
 		.compatible = "allwinner,sun8i-h3-i2s",
 		.data = &sun8i_h3_i2s_quirks,
 	},
diff --git a/sound/soc/uniphier/Kconfig b/sound/soc/uniphier/Kconfig
new file mode 100644
index 0000000..02886a4
--- /dev/null
+++ b/sound/soc/uniphier/Kconfig
@@ -0,0 +1,19 @@
+# SPDX-License-Identifier: GPL-2.0
+config SND_SOC_UNIPHIER
+	tristate "ASoC support for UniPhier"
+	depends on (ARCH_UNIPHIER || COMPILE_TEST)
+	help
+	  Say Y or M if you want to add support for the Socionext
+	  UniPhier SoC audio interfaces. You will also need to select the
+	  audio interfaces to support below.
+	  If unsure select "N".
+
+config SND_SOC_UNIPHIER_EVEA_CODEC
+	tristate "UniPhier SoC internal audio codec"
+	depends on SND_SOC_UNIPHIER
+	select REGMAP_MMIO
+	help
+	  This adds Codec driver for Socionext UniPhier LD11/20 SoC
+	  internal DAC. This driver supports Line In / Out and HeadPhone.
+	  Select Y if you use such device.
+	  If unsure select "N".
diff --git a/sound/soc/uniphier/Makefile b/sound/soc/uniphier/Makefile
new file mode 100644
index 0000000..3be00d7
--- /dev/null
+++ b/sound/soc/uniphier/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+snd-soc-uniphier-evea-objs := evea.o
+obj-$(CONFIG_SND_SOC_UNIPHIER_EVEA_CODEC) += snd-soc-uniphier-evea.o
diff --git a/sound/soc/uniphier/evea.c b/sound/soc/uniphier/evea.c
new file mode 100644
index 0000000..0cc9eff
--- /dev/null
+++ b/sound/soc/uniphier/evea.c
@@ -0,0 +1,567 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Socionext UniPhier EVEA ADC/DAC codec driver.
+ *
+ * Copyright (c) 2016-2017 Socionext Inc.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2
+ * of the License.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/clk.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/regmap.h>
+#include <linux/reset.h>
+#include <sound/pcm.h>
+#include <sound/soc.h>
+
+#define DRV_NAME        "evea"
+#define EVEA_RATES      SNDRV_PCM_RATE_48000
+#define EVEA_FORMATS    SNDRV_PCM_FMTBIT_S32_LE
+
+#define AADCPOW(n)                           (0x0078 + 0x04 * (n))
+#define   AADCPOW_AADC_POWD                   BIT(0)
+#define AHPOUTPOW                            0x0098
+#define   AHPOUTPOW_HP_ON                     BIT(4)
+#define ALINEPOW                             0x009c
+#define   ALINEPOW_LIN2_POWD                  BIT(3)
+#define   ALINEPOW_LIN1_POWD                  BIT(4)
+#define ALO1OUTPOW                           0x00a8
+#define   ALO1OUTPOW_LO1_ON                   BIT(4)
+#define ALO2OUTPOW                           0x00ac
+#define   ALO2OUTPOW_ADAC2_MUTE               BIT(0)
+#define   ALO2OUTPOW_LO2_ON                   BIT(4)
+#define AANAPOW                              0x00b8
+#define   AANAPOW_A_POWD                      BIT(4)
+#define ADACSEQ1(n)                          (0x0144 + 0x40 * (n))
+#define   ADACSEQ1_MMUTE                      BIT(1)
+#define ADACSEQ2(n)                          (0x0160 + 0x40 * (n))
+#define   ADACSEQ2_ADACIN_FIX                 BIT(0)
+#define ADAC1ODC                             0x0200
+#define   ADAC1ODC_HP_DIS_RES_MASK            GENMASK(2, 1)
+#define   ADAC1ODC_HP_DIS_RES_OFF             (0x0 << 1)
+#define   ADAC1ODC_HP_DIS_RES_ON              (0x3 << 1)
+#define   ADAC1ODC_ADAC_RAMPCLT_MASK          GENMASK(8, 7)
+#define   ADAC1ODC_ADAC_RAMPCLT_NORMAL        (0x0 << 7)
+#define   ADAC1ODC_ADAC_RAMPCLT_REDUCE        (0x1 << 7)
+
+struct evea_priv {
+	struct clk *clk, *clk_exiv;
+	struct reset_control *rst, *rst_exiv, *rst_adamv;
+	struct regmap *regmap;
+
+	int switch_lin;
+	int switch_lo;
+	int switch_hp;
+};
+
+static const struct snd_soc_dapm_widget evea_widgets[] = {
+	SND_SOC_DAPM_ADC("ADC", "Capture", SND_SOC_NOPM, 0, 0),
+	SND_SOC_DAPM_INPUT("LIN1_LP"),
+	SND_SOC_DAPM_INPUT("LIN1_RP"),
+	SND_SOC_DAPM_INPUT("LIN2_LP"),
+	SND_SOC_DAPM_INPUT("LIN2_RP"),
+	SND_SOC_DAPM_INPUT("LIN3_LP"),
+	SND_SOC_DAPM_INPUT("LIN3_RP"),
+
+	SND_SOC_DAPM_DAC("DAC", "Playback", SND_SOC_NOPM, 0, 0),
+	SND_SOC_DAPM_OUTPUT("HP1_L"),
+	SND_SOC_DAPM_OUTPUT("HP1_R"),
+	SND_SOC_DAPM_OUTPUT("LO2_L"),
+	SND_SOC_DAPM_OUTPUT("LO2_R"),
+};
+
+static const struct snd_soc_dapm_route evea_routes[] = {
+	{ "ADC", NULL, "LIN1_LP" },
+	{ "ADC", NULL, "LIN1_RP" },
+	{ "ADC", NULL, "LIN2_LP" },
+	{ "ADC", NULL, "LIN2_RP" },
+	{ "ADC", NULL, "LIN3_LP" },
+	{ "ADC", NULL, "LIN3_RP" },
+
+	{ "HP1_L", NULL, "DAC" },
+	{ "HP1_R", NULL, "DAC" },
+	{ "LO2_L", NULL, "DAC" },
+	{ "LO2_R", NULL, "DAC" },
+};
+
+static void evea_set_power_state_on(struct evea_priv *evea)
+{
+	struct regmap *map = evea->regmap;
+
+	regmap_update_bits(map, AANAPOW, AANAPOW_A_POWD,
+			   AANAPOW_A_POWD);
+
+	regmap_update_bits(map, ADAC1ODC, ADAC1ODC_HP_DIS_RES_MASK,
+			   ADAC1ODC_HP_DIS_RES_ON);
+
+	regmap_update_bits(map, ADAC1ODC, ADAC1ODC_ADAC_RAMPCLT_MASK,
+			   ADAC1ODC_ADAC_RAMPCLT_REDUCE);
+
+	regmap_update_bits(map, ADACSEQ2(0), ADACSEQ2_ADACIN_FIX, 0);
+	regmap_update_bits(map, ADACSEQ2(1), ADACSEQ2_ADACIN_FIX, 0);
+	regmap_update_bits(map, ADACSEQ2(2), ADACSEQ2_ADACIN_FIX, 0);
+}
+
+static void evea_set_power_state_off(struct evea_priv *evea)
+{
+	struct regmap *map = evea->regmap;
+
+	regmap_update_bits(map, ADAC1ODC, ADAC1ODC_HP_DIS_RES_MASK,
+			   ADAC1ODC_HP_DIS_RES_ON);
+
+	regmap_update_bits(map, ADACSEQ1(0), ADACSEQ1_MMUTE,
+			   ADACSEQ1_MMUTE);
+	regmap_update_bits(map, ADACSEQ1(1), ADACSEQ1_MMUTE,
+			   ADACSEQ1_MMUTE);
+	regmap_update_bits(map, ADACSEQ1(2), ADACSEQ1_MMUTE,
+			   ADACSEQ1_MMUTE);
+
+	regmap_update_bits(map, ALO1OUTPOW, ALO1OUTPOW_LO1_ON, 0);
+	regmap_update_bits(map, ALO2OUTPOW, ALO2OUTPOW_LO2_ON, 0);
+	regmap_update_bits(map, AHPOUTPOW, AHPOUTPOW_HP_ON, 0);
+}
+
+static int evea_update_switch_lin(struct evea_priv *evea)
+{
+	struct regmap *map = evea->regmap;
+
+	if (evea->switch_lin) {
+		regmap_update_bits(map, ALINEPOW,
+				   ALINEPOW_LIN2_POWD | ALINEPOW_LIN1_POWD,
+				   ALINEPOW_LIN2_POWD | ALINEPOW_LIN1_POWD);
+
+		regmap_update_bits(map, AADCPOW(0), AADCPOW_AADC_POWD,
+				   AADCPOW_AADC_POWD);
+		regmap_update_bits(map, AADCPOW(1), AADCPOW_AADC_POWD,
+				   AADCPOW_AADC_POWD);
+	} else {
+		regmap_update_bits(map, AADCPOW(0), AADCPOW_AADC_POWD, 0);
+		regmap_update_bits(map, AADCPOW(1), AADCPOW_AADC_POWD, 0);
+
+		regmap_update_bits(map, ALINEPOW,
+				   ALINEPOW_LIN2_POWD | ALINEPOW_LIN1_POWD, 0);
+	}
+
+	return 0;
+}
+
+static int evea_update_switch_lo(struct evea_priv *evea)
+{
+	struct regmap *map = evea->regmap;
+
+	if (evea->switch_lo) {
+		regmap_update_bits(map, ADACSEQ1(0), ADACSEQ1_MMUTE, 0);
+		regmap_update_bits(map, ADACSEQ1(2), ADACSEQ1_MMUTE, 0);
+
+		regmap_update_bits(map, ALO1OUTPOW, ALO1OUTPOW_LO1_ON,
+				   ALO1OUTPOW_LO1_ON);
+		regmap_update_bits(map, ALO2OUTPOW,
+				   ALO2OUTPOW_ADAC2_MUTE | ALO2OUTPOW_LO2_ON,
+				   ALO2OUTPOW_ADAC2_MUTE | ALO2OUTPOW_LO2_ON);
+	} else {
+		regmap_update_bits(map, ADACSEQ1(0), ADACSEQ1_MMUTE,
+				   ADACSEQ1_MMUTE);
+		regmap_update_bits(map, ADACSEQ1(2), ADACSEQ1_MMUTE,
+				   ADACSEQ1_MMUTE);
+
+		regmap_update_bits(map, ALO1OUTPOW, ALO1OUTPOW_LO1_ON, 0);
+		regmap_update_bits(map, ALO2OUTPOW,
+				   ALO2OUTPOW_ADAC2_MUTE | ALO2OUTPOW_LO2_ON,
+				   0);
+	}
+
+	return 0;
+}
+
+static int evea_update_switch_hp(struct evea_priv *evea)
+{
+	struct regmap *map = evea->regmap;
+
+	if (evea->switch_hp) {
+		regmap_update_bits(map, ADACSEQ1(1), ADACSEQ1_MMUTE, 0);
+
+		regmap_update_bits(map, AHPOUTPOW, AHPOUTPOW_HP_ON,
+				   AHPOUTPOW_HP_ON);
+
+		regmap_update_bits(map, ADAC1ODC, ADAC1ODC_HP_DIS_RES_MASK,
+				   ADAC1ODC_HP_DIS_RES_OFF);
+	} else {
+		regmap_update_bits(map, ADAC1ODC, ADAC1ODC_HP_DIS_RES_MASK,
+				   ADAC1ODC_HP_DIS_RES_ON);
+
+		regmap_update_bits(map, ADACSEQ1(1), ADACSEQ1_MMUTE,
+				   ADACSEQ1_MMUTE);
+
+		regmap_update_bits(map, AHPOUTPOW, AHPOUTPOW_HP_ON, 0);
+	}
+
+	return 0;
+}
+
+static void evea_update_switch_all(struct evea_priv *evea)
+{
+	evea_update_switch_lin(evea);
+	evea_update_switch_lo(evea);
+	evea_update_switch_hp(evea);
+}
+
+static int evea_get_switch_lin(struct snd_kcontrol *kcontrol,
+			       struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_codec *codec = snd_soc_kcontrol_codec(kcontrol);
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+
+	ucontrol->value.integer.value[0] = evea->switch_lin;
+
+	return 0;
+}
+
+static int evea_set_switch_lin(struct snd_kcontrol *kcontrol,
+			       struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_codec *codec = snd_soc_kcontrol_codec(kcontrol);
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+
+	if (evea->switch_lin == ucontrol->value.integer.value[0])
+		return 0;
+
+	evea->switch_lin = ucontrol->value.integer.value[0];
+
+	return evea_update_switch_lin(evea);
+}
+
+static int evea_get_switch_lo(struct snd_kcontrol *kcontrol,
+			      struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_codec *codec = snd_soc_kcontrol_codec(kcontrol);
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+
+	ucontrol->value.integer.value[0] = evea->switch_lo;
+
+	return 0;
+}
+
+static int evea_set_switch_lo(struct snd_kcontrol *kcontrol,
+			      struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_codec *codec = snd_soc_kcontrol_codec(kcontrol);
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+
+	if (evea->switch_lo == ucontrol->value.integer.value[0])
+		return 0;
+
+	evea->switch_lo = ucontrol->value.integer.value[0];
+
+	return evea_update_switch_lo(evea);
+}
+
+static int evea_get_switch_hp(struct snd_kcontrol *kcontrol,
+			      struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_codec *codec = snd_soc_kcontrol_codec(kcontrol);
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+
+	ucontrol->value.integer.value[0] = evea->switch_hp;
+
+	return 0;
+}
+
+static int evea_set_switch_hp(struct snd_kcontrol *kcontrol,
+			      struct snd_ctl_elem_value *ucontrol)
+{
+	struct snd_soc_codec *codec = snd_soc_kcontrol_codec(kcontrol);
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+
+	if (evea->switch_hp == ucontrol->value.integer.value[0])
+		return 0;
+
+	evea->switch_hp = ucontrol->value.integer.value[0];
+
+	return evea_update_switch_hp(evea);
+}
+
+static const struct snd_kcontrol_new eva_controls[] = {
+	SOC_SINGLE_BOOL_EXT("Line Capture Switch", 0,
+			    evea_get_switch_lin, evea_set_switch_lin),
+	SOC_SINGLE_BOOL_EXT("Line Playback Switch", 0,
+			    evea_get_switch_lo, evea_set_switch_lo),
+	SOC_SINGLE_BOOL_EXT("Headphone Playback Switch", 0,
+			    evea_get_switch_hp, evea_set_switch_hp),
+};
+
+static int evea_codec_probe(struct snd_soc_codec *codec)
+{
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+
+	evea->switch_lin = 1;
+	evea->switch_lo = 1;
+	evea->switch_hp = 1;
+
+	evea_set_power_state_on(evea);
+	evea_update_switch_all(evea);
+
+	return 0;
+}
+
+static int evea_codec_suspend(struct snd_soc_codec *codec)
+{
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+
+	evea_set_power_state_off(evea);
+
+	reset_control_assert(evea->rst_adamv);
+	reset_control_assert(evea->rst_exiv);
+	reset_control_assert(evea->rst);
+
+	clk_disable_unprepare(evea->clk_exiv);
+	clk_disable_unprepare(evea->clk);
+
+	return 0;
+}
+
+static int evea_codec_resume(struct snd_soc_codec *codec)
+{
+	struct evea_priv *evea = snd_soc_codec_get_drvdata(codec);
+	int ret;
+
+	ret = clk_prepare_enable(evea->clk);
+	if (ret)
+		return ret;
+
+	ret = clk_prepare_enable(evea->clk_exiv);
+	if (ret)
+		goto err_out_clock;
+
+	ret = reset_control_deassert(evea->rst);
+	if (ret)
+		goto err_out_clock_exiv;
+
+	ret = reset_control_deassert(evea->rst_exiv);
+	if (ret)
+		goto err_out_reset;
+
+	ret = reset_control_deassert(evea->rst_adamv);
+	if (ret)
+		goto err_out_reset_exiv;
+
+	evea_set_power_state_on(evea);
+	evea_update_switch_all(evea);
+
+	return 0;
+
+err_out_reset_exiv:
+	reset_control_assert(evea->rst_exiv);
+
+err_out_reset:
+	reset_control_assert(evea->rst);
+
+err_out_clock_exiv:
+	clk_disable_unprepare(evea->clk_exiv);
+
+err_out_clock:
+	clk_disable_unprepare(evea->clk);
+
+	return ret;
+}
+
+static struct snd_soc_codec_driver soc_codec_evea = {
+	.probe   = evea_codec_probe,
+	.suspend = evea_codec_suspend,
+	.resume  = evea_codec_resume,
+
+	.component_driver = {
+		.dapm_widgets = evea_widgets,
+		.num_dapm_widgets = ARRAY_SIZE(evea_widgets),
+		.dapm_routes = evea_routes,
+		.num_dapm_routes = ARRAY_SIZE(evea_routes),
+		.controls = eva_controls,
+		.num_controls = ARRAY_SIZE(eva_controls),
+	},
+};
+
+static struct snd_soc_dai_driver soc_dai_evea[] = {
+	{
+		.name     = DRV_NAME "-line1",
+		.playback = {
+			.stream_name  = "Line Out 1",
+			.formats      = EVEA_FORMATS,
+			.rates        = EVEA_RATES,
+			.channels_min = 2,
+			.channels_max = 2,
+		},
+		.capture = {
+			.stream_name  = "Line In 1",
+			.formats      = EVEA_FORMATS,
+			.rates        = EVEA_RATES,
+			.channels_min = 2,
+			.channels_max = 2,
+		},
+	},
+	{
+		.name     = DRV_NAME "-hp1",
+		.playback = {
+			.stream_name  = "Headphone 1",
+			.formats      = EVEA_FORMATS,
+			.rates        = EVEA_RATES,
+			.channels_min = 2,
+			.channels_max = 2,
+		},
+	},
+	{
+		.name     = DRV_NAME "-lo2",
+		.playback = {
+			.stream_name  = "Line Out 2",
+			.formats      = EVEA_FORMATS,
+			.rates        = EVEA_RATES,
+			.channels_min = 2,
+			.channels_max = 2,
+		},
+	},
+};
+
+static const struct regmap_config evea_regmap_config = {
+	.reg_bits      = 32,
+	.reg_stride    = 4,
+	.val_bits      = 32,
+	.max_register  = 0xffc,
+	.cache_type    = REGCACHE_NONE,
+};
+
+static int evea_probe(struct platform_device *pdev)
+{
+	struct evea_priv *evea;
+	struct resource *res;
+	void __iomem *preg;
+	int ret;
+
+	evea = devm_kzalloc(&pdev->dev, sizeof(struct evea_priv), GFP_KERNEL);
+	if (!evea)
+		return -ENOMEM;
+
+	evea->clk = devm_clk_get(&pdev->dev, "evea");
+	if (IS_ERR(evea->clk))
+		return PTR_ERR(evea->clk);
+
+	evea->clk_exiv = devm_clk_get(&pdev->dev, "exiv");
+	if (IS_ERR(evea->clk_exiv))
+		return PTR_ERR(evea->clk_exiv);
+
+	evea->rst = devm_reset_control_get_shared(&pdev->dev, "evea");
+	if (IS_ERR(evea->rst))
+		return PTR_ERR(evea->rst);
+
+	evea->rst_exiv = devm_reset_control_get_shared(&pdev->dev, "exiv");
+	if (IS_ERR(evea->rst_exiv))
+		return PTR_ERR(evea->rst_exiv);
+
+	res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+	preg = devm_ioremap_resource(&pdev->dev, res);
+	if (IS_ERR(preg))
+		return PTR_ERR(preg);
+
+	evea->regmap = devm_regmap_init_mmio(&pdev->dev, preg,
+					     &evea_regmap_config);
+	if (IS_ERR(evea->regmap))
+		return PTR_ERR(evea->regmap);
+
+	ret = clk_prepare_enable(evea->clk);
+	if (ret)
+		return ret;
+
+	ret = clk_prepare_enable(evea->clk_exiv);
+	if (ret)
+		goto err_out_clock;
+
+	ret = reset_control_deassert(evea->rst);
+	if (ret)
+		goto err_out_clock_exiv;
+
+	ret = reset_control_deassert(evea->rst_exiv);
+	if (ret)
+		goto err_out_reset;
+
+	/* ADAMV will hangup if EXIV reset is asserted */
+	evea->rst_adamv = devm_reset_control_get_shared(&pdev->dev, "adamv");
+	if (IS_ERR(evea->rst_adamv)) {
+		ret = PTR_ERR(evea->rst_adamv);
+		goto err_out_reset_exiv;
+	}
+
+	ret = reset_control_deassert(evea->rst_adamv);
+	if (ret)
+		goto err_out_reset_exiv;
+
+	platform_set_drvdata(pdev, evea);
+
+	ret = snd_soc_register_codec(&pdev->dev, &soc_codec_evea,
+				     soc_dai_evea, ARRAY_SIZE(soc_dai_evea));
+	if (ret)
+		goto err_out_reset_adamv;
+
+	return 0;
+
+err_out_reset_adamv:
+	reset_control_assert(evea->rst_adamv);
+
+err_out_reset_exiv:
+	reset_control_assert(evea->rst_exiv);
+
+err_out_reset:
+	reset_control_assert(evea->rst);
+
+err_out_clock_exiv:
+	clk_disable_unprepare(evea->clk_exiv);
+
+err_out_clock:
+	clk_disable_unprepare(evea->clk);
+
+	return ret;
+}
+
+static int evea_remove(struct platform_device *pdev)
+{
+	struct evea_priv *evea = platform_get_drvdata(pdev);
+
+	snd_soc_unregister_codec(&pdev->dev);
+
+	reset_control_assert(evea->rst_adamv);
+	reset_control_assert(evea->rst_exiv);
+	reset_control_assert(evea->rst);
+
+	clk_disable_unprepare(evea->clk_exiv);
+	clk_disable_unprepare(evea->clk);
+
+	return 0;
+}
+
+static const struct of_device_id evea_of_match[] = {
+	{ .compatible = "socionext,uniphier-evea", },
+	{}
+};
+MODULE_DEVICE_TABLE(of, evea_of_match);
+
+static struct platform_driver evea_codec_driver = {
+	.driver = {
+		.name = DRV_NAME,
+		.of_match_table = of_match_ptr(evea_of_match),
+	},
+	.probe  = evea_probe,
+	.remove = evea_remove,
+};
+module_platform_driver(evea_codec_driver);
+
+MODULE_AUTHOR("Katsuhiro Suzuki <suzuki.katsuhiro@socionext.com>");
+MODULE_DESCRIPTION("UniPhier EVEA codec driver");
+MODULE_LICENSE("GPL v2");
diff --git a/sound/soc/ux500/mop500.c b/sound/soc/ux500/mop500.c
index 070a688..c60a577 100644
--- a/sound/soc/ux500/mop500.c
+++ b/sound/soc/ux500/mop500.c
@@ -163,3 +163,7 @@ static struct platform_driver snd_soc_mop500_driver = {
 };
 
 module_platform_driver(snd_soc_mop500_driver);
+
+MODULE_LICENSE("GPL v2");
+MODULE_DESCRIPTION("ASoC MOP500 board driver");
+MODULE_AUTHOR("Ola Lilja");
diff --git a/sound/soc/ux500/ux500_pcm.c b/sound/soc/ux500/ux500_pcm.c
index f12c01d..d35ba77 100644
--- a/sound/soc/ux500/ux500_pcm.c
+++ b/sound/soc/ux500/ux500_pcm.c
@@ -165,3 +165,8 @@ int ux500_pcm_unregister_platform(struct platform_device *pdev)
 	return 0;
 }
 EXPORT_SYMBOL_GPL(ux500_pcm_unregister_platform);
+
+MODULE_AUTHOR("Ola Lilja");
+MODULE_AUTHOR("Roger Nilsson");
+MODULE_DESCRIPTION("ASoC UX500 driver");
+MODULE_LICENSE("GPL v2");
diff --git a/sound/sound_core.c b/sound/sound_core.c
index 99b73c6..b4efb22 100644
--- a/sound/sound_core.c
+++ b/sound/sound_core.c
@@ -119,13 +119,6 @@ struct sound_unit
 	char name[32];
 };
 
-#ifdef CONFIG_SOUND_MSNDCLAS
-extern int msnd_classic_init(void);
-#endif
-#ifdef CONFIG_SOUND_MSNDPIN
-extern int msnd_pinnacle_init(void);
-#endif
-
 /*
  * By default, OSS sound_core claims full legacy minor range (0-255)
  * of SOUND_MAJOR to trap open attempts to any sound minor and
@@ -452,26 +445,6 @@ int register_sound_mixer(const struct file_operations *fops, int dev)
 
 EXPORT_SYMBOL(register_sound_mixer);
 
-/**
- *	register_sound_midi - register a midi device
- *	@fops: File operations for the driver
- *	@dev: Unit number to allocate
- *
- *	Allocate a midi device. Unit is the number of the midi device requested.
- *	Pass -1 to request the next free midi unit.
- *
- *	Return: On success, the allocated number is returned. On failure,
- *	a negative error code is returned.
- */
-
-int register_sound_midi(const struct file_operations *fops, int dev)
-{
-	return sound_insert_unit(&chains[2], fops, dev, 2, 130,
-				 "midi", S_IRUSR | S_IWUSR, NULL);
-}
-
-EXPORT_SYMBOL(register_sound_midi);
-
 /*
  *	DSP's are registered as a triple. Register only one and cheat
  *	in open - see below.
@@ -533,21 +506,6 @@ void unregister_sound_mixer(int unit)
 EXPORT_SYMBOL(unregister_sound_mixer);
 
 /**
- *	unregister_sound_midi - unregister a midi device
- *	@unit: unit number to allocate
- *
- *	Release a sound device that was allocated with register_sound_midi().
- *	The unit passed is the return value from the register function.
- */
-
-void unregister_sound_midi(int unit)
-{
-	sound_remove_unit(&chains[2], unit);
-}
-
-EXPORT_SYMBOL(unregister_sound_midi);
-
-/**
  *	unregister_sound_dsp - unregister a DSP device
  *	@unit: unit number to allocate
  *
diff --git a/sound/usb/card.c b/sound/usb/card.c
index 23d1d23..8018d56 100644
--- a/sound/usb/card.c
+++ b/sound/usb/card.c
@@ -585,15 +585,24 @@ static int usb_audio_probe(struct usb_interface *intf,
 		 * now look for an empty slot and create a new card instance
 		 */
 		for (i = 0; i < SNDRV_CARDS; i++)
-			if (enable[i] && ! usb_chip[i] &&
+			if (!usb_chip[i] &&
 			    (vid[i] == -1 || vid[i] == USB_ID_VENDOR(id)) &&
 			    (pid[i] == -1 || pid[i] == USB_ID_PRODUCT(id))) {
-				err = snd_usb_audio_create(intf, dev, i, quirk,
-							   id, &chip);
-				if (err < 0)
+				if (enable[i]) {
+					err = snd_usb_audio_create(intf, dev, i, quirk,
+								   id, &chip);
+					if (err < 0)
+						goto __error;
+					chip->pm_intf = intf;
+					break;
+				} else if (vid[i] != -1 || pid[i] != -1) {
+					dev_info(&dev->dev,
+						 "device (%04x:%04x) is disabled\n",
+						 USB_ID_VENDOR(id),
+						 USB_ID_PRODUCT(id));
+					err = -ENOENT;
 					goto __error;
-				chip->pm_intf = intf;
-				break;
+				}
 			}
 		if (!chip) {
 			dev_err(&dev->dev, "no available usb audio device\n");
diff --git a/sound/usb/mixer.c b/sound/usb/mixer.c
index 2b4ceda..9afb8ab 100644
--- a/sound/usb/mixer.c
+++ b/sound/usb/mixer.c
@@ -656,10 +656,14 @@ static int get_term_name(struct mixer_build *state, struct usb_audio_term *iterm
 			 unsigned char *name, int maxlen, int term_only)
 {
 	struct iterm_name_combo *names;
+	int len;
 
-	if (iterm->name)
-		return snd_usb_copy_string_desc(state, iterm->name,
+	if (iterm->name) {
+		len = snd_usb_copy_string_desc(state, iterm->name,
 						name, maxlen);
+		if (len)
+			return len;
+	}
 
 	/* virtual type - not a real terminal */
 	if (iterm->type >> 16) {
diff --git a/sound/usb/mixer_quirks.c b/sound/usb/mixer_quirks.c
index e1e7ce9..e6359d3 100644
--- a/sound/usb/mixer_quirks.c
+++ b/sound/usb/mixer_quirks.c
@@ -27,6 +27,7 @@
  *   Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307 USA
  */
 
+#include <linux/hid.h>
 #include <linux/init.h>
 #include <linux/slab.h>
 #include <linux/usb.h>
@@ -1721,6 +1722,83 @@ static int snd_microii_controls_create(struct usb_mixer_interface *mixer)
 	return 0;
 }
 
+/* Creative Sound Blaster E1 */
+
+static int snd_soundblaster_e1_switch_get(struct snd_kcontrol *kcontrol,
+					  struct snd_ctl_elem_value *ucontrol)
+{
+	ucontrol->value.integer.value[0] = kcontrol->private_value;
+	return 0;
+}
+
+static int snd_soundblaster_e1_switch_update(struct usb_mixer_interface *mixer,
+					     unsigned char state)
+{
+	struct snd_usb_audio *chip = mixer->chip;
+	int err;
+	unsigned char buff[2];
+
+	buff[0] = 0x02;
+	buff[1] = state ? 0x02 : 0x00;
+
+	err = snd_usb_lock_shutdown(chip);
+	if (err < 0)
+		return err;
+	err = snd_usb_ctl_msg(chip->dev,
+			usb_sndctrlpipe(chip->dev, 0), HID_REQ_SET_REPORT,
+			USB_TYPE_CLASS | USB_RECIP_INTERFACE | USB_DIR_OUT,
+			0x0202, 3, buff, 2);
+	snd_usb_unlock_shutdown(chip);
+	return err;
+}
+
+static int snd_soundblaster_e1_switch_put(struct snd_kcontrol *kcontrol,
+					  struct snd_ctl_elem_value *ucontrol)
+{
+	struct usb_mixer_elem_list *list = snd_kcontrol_chip(kcontrol);
+	unsigned char value = !!ucontrol->value.integer.value[0];
+	int err;
+
+	if (kcontrol->private_value == value)
+		return 0;
+	kcontrol->private_value = value;
+	err = snd_soundblaster_e1_switch_update(list->mixer, value);
+	return err < 0 ? err : 1;
+}
+
+static int snd_soundblaster_e1_switch_resume(struct usb_mixer_elem_list *list)
+{
+	return snd_soundblaster_e1_switch_update(list->mixer,
+						 list->kctl->private_value);
+}
+
+static int snd_soundblaster_e1_switch_info(struct snd_kcontrol *kcontrol,
+					   struct snd_ctl_elem_info *uinfo)
+{
+	static const char *const texts[2] = {
+		"Mic", "Aux"
+	};
+
+	return snd_ctl_enum_info(uinfo, 1, ARRAY_SIZE(texts), texts);
+}
+
+static struct snd_kcontrol_new snd_soundblaster_e1_input_switch = {
+	.iface = SNDRV_CTL_ELEM_IFACE_MIXER,
+	.name = "Input Source",
+	.info = snd_soundblaster_e1_switch_info,
+	.get = snd_soundblaster_e1_switch_get,
+	.put = snd_soundblaster_e1_switch_put,
+	.private_value = 0,
+};
+
+static int snd_soundblaster_e1_switch_create(struct usb_mixer_interface *mixer)
+{
+	return add_single_ctl_with_resume(mixer, 0,
+					  snd_soundblaster_e1_switch_resume,
+					  &snd_soundblaster_e1_input_switch,
+					  NULL);
+}
+
 int snd_usb_mixer_apply_create_quirk(struct usb_mixer_interface *mixer)
 {
 	int err = 0;
@@ -1802,6 +1880,10 @@ int snd_usb_mixer_apply_create_quirk(struct usb_mixer_interface *mixer)
 	case USB_ID(0x1235, 0x800c): /* Focusrite Scarlett 18i20 */
 		err = snd_scarlett_controls_create(mixer);
 		break;
+
+	case USB_ID(0x041e, 0x323b): /* Creative Sound Blaster E1 */
+		err = snd_soundblaster_e1_switch_create(mixer);
+		break;
 	}
 
 	return err;
diff --git a/sound/usb/quirks-table.h b/sound/usb/quirks-table.h
index 8a59d47..5025204 100644
--- a/sound/usb/quirks-table.h
+++ b/sound/usb/quirks-table.h
@@ -3277,4 +3277,52 @@ AU0828_DEVICE(0x2040, 0x7270, "Hauppauge", "HVR-950Q"),
 	}
 },
 
+{
+	/*
+	 * Nura's first gen headphones use Cambridge Silicon Radio's vendor
+	 * ID, but it looks like the product ID actually is only for Nura.
+	 * The capture interface does not work at all (even on Windows),
+	 * and only the 48 kHz sample rate works for the playback interface.
+	 */
+	USB_DEVICE(0x0a12, 0x1243),
+	.driver_info = (unsigned long) &(const struct snd_usb_audio_quirk) {
+		.ifnum = QUIRK_ANY_INTERFACE,
+		.type = QUIRK_COMPOSITE,
+		.data = (const struct snd_usb_audio_quirk[]) {
+			{
+				.ifnum = 0,
+				.type = QUIRK_AUDIO_STANDARD_MIXER,
+			},
+			/* Capture */
+			{
+				.ifnum = 1,
+				.type = QUIRK_IGNORE_INTERFACE,
+			},
+			/* Playback */
+			{
+				.ifnum = 2,
+				.type = QUIRK_AUDIO_FIXED_ENDPOINT,
+				.data = &(const struct audioformat) {
+					.formats = SNDRV_PCM_FMTBIT_S16_LE,
+					.channels = 2,
+					.iface = 2,
+					.altsetting = 1,
+					.altset_idx = 1,
+					.attributes = UAC_EP_CS_ATTR_FILL_MAX |
+						UAC_EP_CS_ATTR_SAMPLE_RATE,
+					.endpoint = 0x03,
+					.ep_attr = USB_ENDPOINT_XFER_ISOC,
+					.rates = SNDRV_PCM_RATE_48000,
+					.rate_min = 48000,
+					.rate_max = 48000,
+					.nr_rates = 1,
+					.rate_table = (unsigned int[]) {
+						48000
+					}
+				}
+			},
+		}
+	}
+},
+
 #undef USB_DEVICE_VENDOR_SPEC
diff --git a/tools/arch/alpha/include/uapi/asm/errno.h b/tools/arch/alpha/include/uapi/asm/errno.h
new file mode 100644
index 0000000..3d265f6
--- /dev/null
+++ b/tools/arch/alpha/include/uapi/asm/errno.h
@@ -0,0 +1,128 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ALPHA_ERRNO_H
+#define _ALPHA_ERRNO_H
+
+#include <asm-generic/errno-base.h>
+
+#undef	EAGAIN			/* 11 in errno-base.h */
+
+#define	EDEADLK		11	/* Resource deadlock would occur */
+
+#define	EAGAIN		35	/* Try again */
+#define	EWOULDBLOCK	EAGAIN	/* Operation would block */
+#define	EINPROGRESS	36	/* Operation now in progress */
+#define	EALREADY	37	/* Operation already in progress */
+#define	ENOTSOCK	38	/* Socket operation on non-socket */
+#define	EDESTADDRREQ	39	/* Destination address required */
+#define	EMSGSIZE	40	/* Message too long */
+#define	EPROTOTYPE	41	/* Protocol wrong type for socket */
+#define	ENOPROTOOPT	42	/* Protocol not available */
+#define	EPROTONOSUPPORT	43	/* Protocol not supported */
+#define	ESOCKTNOSUPPORT	44	/* Socket type not supported */
+#define	EOPNOTSUPP	45	/* Operation not supported on transport endpoint */
+#define	EPFNOSUPPORT	46	/* Protocol family not supported */
+#define	EAFNOSUPPORT	47	/* Address family not supported by protocol */
+#define	EADDRINUSE	48	/* Address already in use */
+#define	EADDRNOTAVAIL	49	/* Cannot assign requested address */
+#define	ENETDOWN	50	/* Network is down */
+#define	ENETUNREACH	51	/* Network is unreachable */
+#define	ENETRESET	52	/* Network dropped connection because of reset */
+#define	ECONNABORTED	53	/* Software caused connection abort */
+#define	ECONNRESET	54	/* Connection reset by peer */
+#define	ENOBUFS		55	/* No buffer space available */
+#define	EISCONN		56	/* Transport endpoint is already connected */
+#define	ENOTCONN	57	/* Transport endpoint is not connected */
+#define	ESHUTDOWN	58	/* Cannot send after transport endpoint shutdown */
+#define	ETOOMANYREFS	59	/* Too many references: cannot splice */
+#define	ETIMEDOUT	60	/* Connection timed out */
+#define	ECONNREFUSED	61	/* Connection refused */
+#define	ELOOP		62	/* Too many symbolic links encountered */
+#define	ENAMETOOLONG	63	/* File name too long */
+#define	EHOSTDOWN	64	/* Host is down */
+#define	EHOSTUNREACH	65	/* No route to host */
+#define	ENOTEMPTY	66	/* Directory not empty */
+
+#define	EUSERS		68	/* Too many users */
+#define	EDQUOT		69	/* Quota exceeded */
+#define	ESTALE		70	/* Stale file handle */
+#define	EREMOTE		71	/* Object is remote */
+
+#define	ENOLCK		77	/* No record locks available */
+#define	ENOSYS		78	/* Function not implemented */
+
+#define	ENOMSG		80	/* No message of desired type */
+#define	EIDRM		81	/* Identifier removed */
+#define	ENOSR		82	/* Out of streams resources */
+#define	ETIME		83	/* Timer expired */
+#define	EBADMSG		84	/* Not a data message */
+#define	EPROTO		85	/* Protocol error */
+#define	ENODATA		86	/* No data available */
+#define	ENOSTR		87	/* Device not a stream */
+
+#define	ENOPKG		92	/* Package not installed */
+
+#define	EILSEQ		116	/* Illegal byte sequence */
+
+/* The following are just random noise.. */
+#define	ECHRNG		88	/* Channel number out of range */
+#define	EL2NSYNC	89	/* Level 2 not synchronized */
+#define	EL3HLT		90	/* Level 3 halted */
+#define	EL3RST		91	/* Level 3 reset */
+
+#define	ELNRNG		93	/* Link number out of range */
+#define	EUNATCH		94	/* Protocol driver not attached */
+#define	ENOCSI		95	/* No CSI structure available */
+#define	EL2HLT		96	/* Level 2 halted */
+#define	EBADE		97	/* Invalid exchange */
+#define	EBADR		98	/* Invalid request descriptor */
+#define	EXFULL		99	/* Exchange full */
+#define	ENOANO		100	/* No anode */
+#define	EBADRQC		101	/* Invalid request code */
+#define	EBADSLT		102	/* Invalid slot */
+
+#define	EDEADLOCK	EDEADLK
+
+#define	EBFONT		104	/* Bad font file format */
+#define	ENONET		105	/* Machine is not on the network */
+#define	ENOLINK		106	/* Link has been severed */
+#define	EADV		107	/* Advertise error */
+#define	ESRMNT		108	/* Srmount error */
+#define	ECOMM		109	/* Communication error on send */
+#define	EMULTIHOP	110	/* Multihop attempted */
+#define	EDOTDOT		111	/* RFS specific error */
+#define	EOVERFLOW	112	/* Value too large for defined data type */
+#define	ENOTUNIQ	113	/* Name not unique on network */
+#define	EBADFD		114	/* File descriptor in bad state */
+#define	EREMCHG		115	/* Remote address changed */
+
+#define	EUCLEAN		117	/* Structure needs cleaning */
+#define	ENOTNAM		118	/* Not a XENIX named type file */
+#define	ENAVAIL		119	/* No XENIX semaphores available */
+#define	EISNAM		120	/* Is a named type file */
+#define	EREMOTEIO	121	/* Remote I/O error */
+
+#define	ELIBACC		122	/* Can not access a needed shared library */
+#define	ELIBBAD		123	/* Accessing a corrupted shared library */
+#define	ELIBSCN		124	/* .lib section in a.out corrupted */
+#define	ELIBMAX		125	/* Attempting to link in too many shared libraries */
+#define	ELIBEXEC	126	/* Cannot exec a shared library directly */
+#define	ERESTART	127	/* Interrupted system call should be restarted */
+#define	ESTRPIPE	128	/* Streams pipe error */
+
+#define ENOMEDIUM	129	/* No medium found */
+#define EMEDIUMTYPE	130	/* Wrong medium type */
+#define	ECANCELED	131	/* Operation Cancelled */
+#define	ENOKEY		132	/* Required key not available */
+#define	EKEYEXPIRED	133	/* Key has expired */
+#define	EKEYREVOKED	134	/* Key has been revoked */
+#define	EKEYREJECTED	135	/* Key was rejected by service */
+
+/* for robust mutexes */
+#define	EOWNERDEAD	136	/* Owner died */
+#define	ENOTRECOVERABLE	137	/* State not recoverable */
+
+#define	ERFKILL		138	/* Operation not possible due to RF-kill */
+
+#define EHWPOISON	139	/* Memory page has hardware error */
+
+#endif
diff --git a/tools/arch/mips/include/asm/errno.h b/tools/arch/mips/include/asm/errno.h
new file mode 100644
index 0000000..21d91cdf
--- /dev/null
+++ b/tools/arch/mips/include/asm/errno.h
@@ -0,0 +1,17 @@
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 1995, 1999, 2001, 2002 by Ralf Baechle
+ */
+#ifndef _ASM_ERRNO_H
+#define _ASM_ERRNO_H
+
+#include <uapi/asm/errno.h>
+
+
+/* The biggest error number defined here or in <linux/errno.h>. */
+#define EMAXERRNO	1133
+
+#endif /* _ASM_ERRNO_H */
diff --git a/tools/arch/mips/include/uapi/asm/errno.h b/tools/arch/mips/include/uapi/asm/errno.h
new file mode 100644
index 0000000..2fb714e
--- /dev/null
+++ b/tools/arch/mips/include/uapi/asm/errno.h
@@ -0,0 +1,130 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * This file is subject to the terms and conditions of the GNU General Public
+ * License.  See the file "COPYING" in the main directory of this archive
+ * for more details.
+ *
+ * Copyright (C) 1995, 1999, 2001, 2002 by Ralf Baechle
+ */
+#ifndef _UAPI_ASM_ERRNO_H
+#define _UAPI_ASM_ERRNO_H
+
+/*
+ * These error numbers are intended to be MIPS ABI compatible
+ */
+
+#include <asm-generic/errno-base.h>
+
+#define ENOMSG		35	/* No message of desired type */
+#define EIDRM		36	/* Identifier removed */
+#define ECHRNG		37	/* Channel number out of range */
+#define EL2NSYNC	38	/* Level 2 not synchronized */
+#define EL3HLT		39	/* Level 3 halted */
+#define EL3RST		40	/* Level 3 reset */
+#define ELNRNG		41	/* Link number out of range */
+#define EUNATCH		42	/* Protocol driver not attached */
+#define ENOCSI		43	/* No CSI structure available */
+#define EL2HLT		44	/* Level 2 halted */
+#define EDEADLK		45	/* Resource deadlock would occur */
+#define ENOLCK		46	/* No record locks available */
+#define EBADE		50	/* Invalid exchange */
+#define EBADR		51	/* Invalid request descriptor */
+#define EXFULL		52	/* Exchange full */
+#define ENOANO		53	/* No anode */
+#define EBADRQC		54	/* Invalid request code */
+#define EBADSLT		55	/* Invalid slot */
+#define EDEADLOCK	56	/* File locking deadlock error */
+#define EBFONT		59	/* Bad font file format */
+#define ENOSTR		60	/* Device not a stream */
+#define ENODATA		61	/* No data available */
+#define ETIME		62	/* Timer expired */
+#define ENOSR		63	/* Out of streams resources */
+#define ENONET		64	/* Machine is not on the network */
+#define ENOPKG		65	/* Package not installed */
+#define EREMOTE		66	/* Object is remote */
+#define ENOLINK		67	/* Link has been severed */
+#define EADV		68	/* Advertise error */
+#define ESRMNT		69	/* Srmount error */
+#define ECOMM		70	/* Communication error on send */
+#define EPROTO		71	/* Protocol error */
+#define EDOTDOT		73	/* RFS specific error */
+#define EMULTIHOP	74	/* Multihop attempted */
+#define EBADMSG		77	/* Not a data message */
+#define ENAMETOOLONG	78	/* File name too long */
+#define EOVERFLOW	79	/* Value too large for defined data type */
+#define ENOTUNIQ	80	/* Name not unique on network */
+#define EBADFD		81	/* File descriptor in bad state */
+#define EREMCHG		82	/* Remote address changed */
+#define ELIBACC		83	/* Can not access a needed shared library */
+#define ELIBBAD		84	/* Accessing a corrupted shared library */
+#define ELIBSCN		85	/* .lib section in a.out corrupted */
+#define ELIBMAX		86	/* Attempting to link in too many shared libraries */
+#define ELIBEXEC	87	/* Cannot exec a shared library directly */
+#define EILSEQ		88	/* Illegal byte sequence */
+#define ENOSYS		89	/* Function not implemented */
+#define ELOOP		90	/* Too many symbolic links encountered */
+#define ERESTART	91	/* Interrupted system call should be restarted */
+#define ESTRPIPE	92	/* Streams pipe error */
+#define ENOTEMPTY	93	/* Directory not empty */
+#define EUSERS		94	/* Too many users */
+#define ENOTSOCK	95	/* Socket operation on non-socket */
+#define EDESTADDRREQ	96	/* Destination address required */
+#define EMSGSIZE	97	/* Message too long */
+#define EPROTOTYPE	98	/* Protocol wrong type for socket */
+#define ENOPROTOOPT	99	/* Protocol not available */
+#define EPROTONOSUPPORT 120	/* Protocol not supported */
+#define ESOCKTNOSUPPORT 121	/* Socket type not supported */
+#define EOPNOTSUPP	122	/* Operation not supported on transport endpoint */
+#define EPFNOSUPPORT	123	/* Protocol family not supported */
+#define EAFNOSUPPORT	124	/* Address family not supported by protocol */
+#define EADDRINUSE	125	/* Address already in use */
+#define EADDRNOTAVAIL	126	/* Cannot assign requested address */
+#define ENETDOWN	127	/* Network is down */
+#define ENETUNREACH	128	/* Network is unreachable */
+#define ENETRESET	129	/* Network dropped connection because of reset */
+#define ECONNABORTED	130	/* Software caused connection abort */
+#define ECONNRESET	131	/* Connection reset by peer */
+#define ENOBUFS		132	/* No buffer space available */
+#define EISCONN		133	/* Transport endpoint is already connected */
+#define ENOTCONN	134	/* Transport endpoint is not connected */
+#define EUCLEAN		135	/* Structure needs cleaning */
+#define ENOTNAM		137	/* Not a XENIX named type file */
+#define ENAVAIL		138	/* No XENIX semaphores available */
+#define EISNAM		139	/* Is a named type file */
+#define EREMOTEIO	140	/* Remote I/O error */
+#define EINIT		141	/* Reserved */
+#define EREMDEV		142	/* Error 142 */
+#define ESHUTDOWN	143	/* Cannot send after transport endpoint shutdown */
+#define ETOOMANYREFS	144	/* Too many references: cannot splice */
+#define ETIMEDOUT	145	/* Connection timed out */
+#define ECONNREFUSED	146	/* Connection refused */
+#define EHOSTDOWN	147	/* Host is down */
+#define EHOSTUNREACH	148	/* No route to host */
+#define EWOULDBLOCK	EAGAIN	/* Operation would block */
+#define EALREADY	149	/* Operation already in progress */
+#define EINPROGRESS	150	/* Operation now in progress */
+#define ESTALE		151	/* Stale file handle */
+#define ECANCELED	158	/* AIO operation canceled */
+
+/*
+ * These error are Linux extensions.
+ */
+#define ENOMEDIUM	159	/* No medium found */
+#define EMEDIUMTYPE	160	/* Wrong medium type */
+#define ENOKEY		161	/* Required key not available */
+#define EKEYEXPIRED	162	/* Key has expired */
+#define EKEYREVOKED	163	/* Key has been revoked */
+#define EKEYREJECTED	164	/* Key was rejected by service */
+
+/* for robust mutexes */
+#define EOWNERDEAD	165	/* Owner died */
+#define ENOTRECOVERABLE 166	/* State not recoverable */
+
+#define ERFKILL		167	/* Operation not possible due to RF-kill */
+
+#define EHWPOISON	168	/* Memory page has hardware error */
+
+#define EDQUOT		1133	/* Quota exceeded */
+
+
+#endif /* _UAPI_ASM_ERRNO_H */
diff --git a/tools/arch/parisc/include/uapi/asm/errno.h b/tools/arch/parisc/include/uapi/asm/errno.h
new file mode 100644
index 0000000..fc0df35
--- /dev/null
+++ b/tools/arch/parisc/include/uapi/asm/errno.h
@@ -0,0 +1,128 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _PARISC_ERRNO_H
+#define _PARISC_ERRNO_H
+
+#include <asm-generic/errno-base.h>
+
+#define	ENOMSG		35	/* No message of desired type */
+#define	EIDRM		36	/* Identifier removed */
+#define	ECHRNG		37	/* Channel number out of range */
+#define	EL2NSYNC	38	/* Level 2 not synchronized */
+#define	EL3HLT		39	/* Level 3 halted */
+#define	EL3RST		40	/* Level 3 reset */
+#define	ELNRNG		41	/* Link number out of range */
+#define	EUNATCH		42	/* Protocol driver not attached */
+#define	ENOCSI		43	/* No CSI structure available */
+#define	EL2HLT		44	/* Level 2 halted */
+#define	EDEADLK		45	/* Resource deadlock would occur */
+#define	EDEADLOCK	EDEADLK
+#define	ENOLCK		46	/* No record locks available */
+#define	EILSEQ		47	/* Illegal byte sequence */
+
+#define	ENONET		50	/* Machine is not on the network */
+#define	ENODATA		51	/* No data available */
+#define	ETIME		52	/* Timer expired */
+#define	ENOSR		53	/* Out of streams resources */
+#define	ENOSTR		54	/* Device not a stream */
+#define	ENOPKG		55	/* Package not installed */
+
+#define	ENOLINK		57	/* Link has been severed */
+#define	EADV		58	/* Advertise error */
+#define	ESRMNT		59	/* Srmount error */
+#define	ECOMM		60	/* Communication error on send */
+#define	EPROTO		61	/* Protocol error */
+
+#define	EMULTIHOP	64	/* Multihop attempted */
+
+#define	EDOTDOT		66	/* RFS specific error */
+#define	EBADMSG		67	/* Not a data message */
+#define	EUSERS		68	/* Too many users */
+#define	EDQUOT		69	/* Quota exceeded */
+#define	ESTALE		70	/* Stale file handle */
+#define	EREMOTE		71	/* Object is remote */
+#define	EOVERFLOW	72	/* Value too large for defined data type */
+
+/* these errnos are defined by Linux but not HPUX. */
+
+#define	EBADE		160	/* Invalid exchange */
+#define	EBADR		161	/* Invalid request descriptor */
+#define	EXFULL		162	/* Exchange full */
+#define	ENOANO		163	/* No anode */
+#define	EBADRQC		164	/* Invalid request code */
+#define	EBADSLT		165	/* Invalid slot */
+#define	EBFONT		166	/* Bad font file format */
+#define	ENOTUNIQ	167	/* Name not unique on network */
+#define	EBADFD		168	/* File descriptor in bad state */
+#define	EREMCHG		169	/* Remote address changed */
+#define	ELIBACC		170	/* Can not access a needed shared library */
+#define	ELIBBAD		171	/* Accessing a corrupted shared library */
+#define	ELIBSCN		172	/* .lib section in a.out corrupted */
+#define	ELIBMAX		173	/* Attempting to link in too many shared libraries */
+#define	ELIBEXEC	174	/* Cannot exec a shared library directly */
+#define	ERESTART	175	/* Interrupted system call should be restarted */
+#define	ESTRPIPE	176	/* Streams pipe error */
+#define	EUCLEAN		177	/* Structure needs cleaning */
+#define	ENOTNAM		178	/* Not a XENIX named type file */
+#define	ENAVAIL		179	/* No XENIX semaphores available */
+#define	EISNAM		180	/* Is a named type file */
+#define	EREMOTEIO	181	/* Remote I/O error */
+#define	ENOMEDIUM	182	/* No medium found */
+#define	EMEDIUMTYPE	183	/* Wrong medium type */
+#define	ENOKEY		184	/* Required key not available */
+#define	EKEYEXPIRED	185	/* Key has expired */
+#define	EKEYREVOKED	186	/* Key has been revoked */
+#define	EKEYREJECTED	187	/* Key was rejected by service */
+
+/* We now return you to your regularly scheduled HPUX. */
+
+#define ENOSYM		215	/* symbol does not exist in executable */
+#define	ENOTSOCK	216	/* Socket operation on non-socket */
+#define	EDESTADDRREQ	217	/* Destination address required */
+#define	EMSGSIZE	218	/* Message too long */
+#define	EPROTOTYPE	219	/* Protocol wrong type for socket */
+#define	ENOPROTOOPT	220	/* Protocol not available */
+#define	EPROTONOSUPPORT	221	/* Protocol not supported */
+#define	ESOCKTNOSUPPORT	222	/* Socket type not supported */
+#define	EOPNOTSUPP	223	/* Operation not supported on transport endpoint */
+#define	EPFNOSUPPORT	224	/* Protocol family not supported */
+#define	EAFNOSUPPORT	225	/* Address family not supported by protocol */
+#define	EADDRINUSE	226	/* Address already in use */
+#define	EADDRNOTAVAIL	227	/* Cannot assign requested address */
+#define	ENETDOWN	228	/* Network is down */
+#define	ENETUNREACH	229	/* Network is unreachable */
+#define	ENETRESET	230	/* Network dropped connection because of reset */
+#define	ECONNABORTED	231	/* Software caused connection abort */
+#define	ECONNRESET	232	/* Connection reset by peer */
+#define	ENOBUFS		233	/* No buffer space available */
+#define	EISCONN		234	/* Transport endpoint is already connected */
+#define	ENOTCONN	235	/* Transport endpoint is not connected */
+#define	ESHUTDOWN	236	/* Cannot send after transport endpoint shutdown */
+#define	ETOOMANYREFS	237	/* Too many references: cannot splice */
+#define	ETIMEDOUT	238	/* Connection timed out */
+#define	ECONNREFUSED	239	/* Connection refused */
+#define	EREFUSED	ECONNREFUSED	/* for HP's NFS apparently */
+#define	EREMOTERELEASE	240	/* Remote peer released connection */
+#define	EHOSTDOWN	241	/* Host is down */
+#define	EHOSTUNREACH	242	/* No route to host */
+
+#define	EALREADY	244	/* Operation already in progress */
+#define	EINPROGRESS	245	/* Operation now in progress */
+#define	EWOULDBLOCK	EAGAIN	/* Operation would block (Not HPUX compliant) */
+#define	ENOTEMPTY	247	/* Directory not empty */
+#define	ENAMETOOLONG	248	/* File name too long */
+#define	ELOOP		249	/* Too many symbolic links encountered */
+#define	ENOSYS		251	/* Function not implemented */
+
+#define ENOTSUP		252	/* Function not implemented (POSIX.4 / HPUX) */
+#define ECANCELLED	253	/* aio request was canceled before complete (POSIX.4 / HPUX) */
+#define ECANCELED	ECANCELLED	/* SuSv3 and Solaris wants one 'L' */
+
+/* for robust mutexes */
+#define EOWNERDEAD	254	/* Owner died */
+#define ENOTRECOVERABLE	255	/* State not recoverable */
+
+#define	ERFKILL		256	/* Operation not possible due to RF-kill */
+
+#define EHWPOISON	257	/* Memory page has hardware error */
+
+#endif
diff --git a/tools/arch/powerpc/include/uapi/asm/errno.h b/tools/arch/powerpc/include/uapi/asm/errno.h
new file mode 100644
index 0000000..cc79856
--- /dev/null
+++ b/tools/arch/powerpc/include/uapi/asm/errno.h
@@ -0,0 +1,10 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_POWERPC_ERRNO_H
+#define _ASM_POWERPC_ERRNO_H
+
+#include <asm-generic/errno.h>
+
+#undef	EDEADLOCK
+#define	EDEADLOCK	58	/* File locking deadlock error */
+
+#endif	/* _ASM_POWERPC_ERRNO_H */
diff --git a/tools/arch/s390/include/uapi/asm/unistd.h b/tools/arch/s390/include/uapi/asm/unistd.h
new file mode 100644
index 0000000..7251209
--- /dev/null
+++ b/tools/arch/s390/include/uapi/asm/unistd.h
@@ -0,0 +1,412 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ *  S390 version
+ *
+ *  Derived from "include/asm-i386/unistd.h"
+ */
+
+#ifndef _UAPI_ASM_S390_UNISTD_H_
+#define _UAPI_ASM_S390_UNISTD_H_
+
+/*
+ * This file contains the system call numbers.
+ */
+
+#define __NR_exit                 1
+#define __NR_fork                 2
+#define __NR_read                 3
+#define __NR_write                4
+#define __NR_open                 5
+#define __NR_close                6
+#define __NR_restart_syscall	  7
+#define __NR_creat                8
+#define __NR_link                 9
+#define __NR_unlink              10
+#define __NR_execve              11
+#define __NR_chdir               12
+#define __NR_mknod               14
+#define __NR_chmod               15
+#define __NR_lseek               19
+#define __NR_getpid              20
+#define __NR_mount               21
+#define __NR_umount              22
+#define __NR_ptrace              26
+#define __NR_alarm               27
+#define __NR_pause               29
+#define __NR_utime               30
+#define __NR_access              33
+#define __NR_nice                34
+#define __NR_sync                36
+#define __NR_kill                37
+#define __NR_rename              38
+#define __NR_mkdir               39
+#define __NR_rmdir               40
+#define __NR_dup                 41
+#define __NR_pipe                42
+#define __NR_times               43
+#define __NR_brk                 45
+#define __NR_signal              48
+#define __NR_acct                51
+#define __NR_umount2             52
+#define __NR_ioctl               54
+#define __NR_fcntl               55
+#define __NR_setpgid             57
+#define __NR_umask               60
+#define __NR_chroot              61
+#define __NR_ustat               62
+#define __NR_dup2                63
+#define __NR_getppid             64
+#define __NR_getpgrp             65
+#define __NR_setsid              66
+#define __NR_sigaction           67
+#define __NR_sigsuspend          72
+#define __NR_sigpending          73
+#define __NR_sethostname         74
+#define __NR_setrlimit           75
+#define __NR_getrusage           77
+#define __NR_gettimeofday        78
+#define __NR_settimeofday        79
+#define __NR_symlink             83
+#define __NR_readlink            85
+#define __NR_uselib              86
+#define __NR_swapon              87
+#define __NR_reboot              88
+#define __NR_readdir             89
+#define __NR_mmap                90
+#define __NR_munmap              91
+#define __NR_truncate            92
+#define __NR_ftruncate           93
+#define __NR_fchmod              94
+#define __NR_getpriority         96
+#define __NR_setpriority         97
+#define __NR_statfs              99
+#define __NR_fstatfs            100
+#define __NR_socketcall         102
+#define __NR_syslog             103
+#define __NR_setitimer          104
+#define __NR_getitimer          105
+#define __NR_stat               106
+#define __NR_lstat              107
+#define __NR_fstat              108
+#define __NR_lookup_dcookie     110
+#define __NR_vhangup            111
+#define __NR_idle               112
+#define __NR_wait4              114
+#define __NR_swapoff            115
+#define __NR_sysinfo            116
+#define __NR_ipc                117
+#define __NR_fsync              118
+#define __NR_sigreturn          119
+#define __NR_clone              120
+#define __NR_setdomainname      121
+#define __NR_uname              122
+#define __NR_adjtimex           124
+#define __NR_mprotect           125
+#define __NR_sigprocmask        126
+#define __NR_create_module      127
+#define __NR_init_module        128
+#define __NR_delete_module      129
+#define __NR_get_kernel_syms    130
+#define __NR_quotactl           131
+#define __NR_getpgid            132
+#define __NR_fchdir             133
+#define __NR_bdflush            134
+#define __NR_sysfs              135
+#define __NR_personality        136
+#define __NR_afs_syscall        137 /* Syscall for Andrew File System */
+#define __NR_getdents           141
+#define __NR_flock              143
+#define __NR_msync              144
+#define __NR_readv              145
+#define __NR_writev             146
+#define __NR_getsid             147
+#define __NR_fdatasync          148
+#define __NR__sysctl            149
+#define __NR_mlock              150
+#define __NR_munlock            151
+#define __NR_mlockall           152
+#define __NR_munlockall         153
+#define __NR_sched_setparam             154
+#define __NR_sched_getparam             155
+#define __NR_sched_setscheduler         156
+#define __NR_sched_getscheduler         157
+#define __NR_sched_yield                158
+#define __NR_sched_get_priority_max     159
+#define __NR_sched_get_priority_min     160
+#define __NR_sched_rr_get_interval      161
+#define __NR_nanosleep          162
+#define __NR_mremap             163
+#define __NR_query_module       167
+#define __NR_poll               168
+#define __NR_nfsservctl         169
+#define __NR_prctl              172
+#define __NR_rt_sigreturn       173
+#define __NR_rt_sigaction       174
+#define __NR_rt_sigprocmask     175
+#define __NR_rt_sigpending      176
+#define __NR_rt_sigtimedwait    177
+#define __NR_rt_sigqueueinfo    178
+#define __NR_rt_sigsuspend      179
+#define __NR_pread64            180
+#define __NR_pwrite64           181
+#define __NR_getcwd             183
+#define __NR_capget             184
+#define __NR_capset             185
+#define __NR_sigaltstack        186
+#define __NR_sendfile           187
+#define __NR_getpmsg		188
+#define __NR_putpmsg		189
+#define __NR_vfork		190
+#define __NR_pivot_root         217
+#define __NR_mincore            218
+#define __NR_madvise            219
+#define __NR_getdents64		220
+#define __NR_readahead		222
+#define __NR_setxattr		224
+#define __NR_lsetxattr		225
+#define __NR_fsetxattr		226
+#define __NR_getxattr		227
+#define __NR_lgetxattr		228
+#define __NR_fgetxattr		229
+#define __NR_listxattr		230
+#define __NR_llistxattr		231
+#define __NR_flistxattr		232
+#define __NR_removexattr	233
+#define __NR_lremovexattr	234
+#define __NR_fremovexattr	235
+#define __NR_gettid		236
+#define __NR_tkill		237
+#define __NR_futex		238
+#define __NR_sched_setaffinity	239
+#define __NR_sched_getaffinity	240
+#define __NR_tgkill		241
+/* Number 242 is reserved for tux */
+#define __NR_io_setup		243
+#define __NR_io_destroy		244
+#define __NR_io_getevents	245
+#define __NR_io_submit		246
+#define __NR_io_cancel		247
+#define __NR_exit_group		248
+#define __NR_epoll_create	249
+#define __NR_epoll_ctl		250
+#define __NR_epoll_wait		251
+#define __NR_set_tid_address	252
+#define __NR_fadvise64		253
+#define __NR_timer_create	254
+#define __NR_timer_settime	255
+#define __NR_timer_gettime	256
+#define __NR_timer_getoverrun	257
+#define __NR_timer_delete	258
+#define __NR_clock_settime	259
+#define __NR_clock_gettime	260
+#define __NR_clock_getres	261
+#define __NR_clock_nanosleep	262
+/* Number 263 is reserved for vserver */
+#define __NR_statfs64		265
+#define __NR_fstatfs64		266
+#define __NR_remap_file_pages	267
+#define __NR_mbind		268
+#define __NR_get_mempolicy	269
+#define __NR_set_mempolicy	270
+#define __NR_mq_open		271
+#define __NR_mq_unlink		272
+#define __NR_mq_timedsend	273
+#define __NR_mq_timedreceive	274
+#define __NR_mq_notify		275
+#define __NR_mq_getsetattr	276
+#define __NR_kexec_load		277
+#define __NR_add_key		278
+#define __NR_request_key	279
+#define __NR_keyctl		280
+#define __NR_waitid		281
+#define __NR_ioprio_set		282
+#define __NR_ioprio_get		283
+#define __NR_inotify_init	284
+#define __NR_inotify_add_watch	285
+#define __NR_inotify_rm_watch	286
+#define __NR_migrate_pages	287
+#define __NR_openat		288
+#define __NR_mkdirat		289
+#define __NR_mknodat		290
+#define __NR_fchownat		291
+#define __NR_futimesat		292
+#define __NR_unlinkat		294
+#define __NR_renameat		295
+#define __NR_linkat		296
+#define __NR_symlinkat		297
+#define __NR_readlinkat		298
+#define __NR_fchmodat		299
+#define __NR_faccessat		300
+#define __NR_pselect6		301
+#define __NR_ppoll		302
+#define __NR_unshare		303
+#define __NR_set_robust_list	304
+#define __NR_get_robust_list	305
+#define __NR_splice		306
+#define __NR_sync_file_range	307
+#define __NR_tee		308
+#define __NR_vmsplice		309
+#define __NR_move_pages		310
+#define __NR_getcpu		311
+#define __NR_epoll_pwait	312
+#define __NR_utimes		313
+#define __NR_fallocate		314
+#define __NR_utimensat		315
+#define __NR_signalfd		316
+#define __NR_timerfd		317
+#define __NR_eventfd		318
+#define __NR_timerfd_create	319
+#define __NR_timerfd_settime	320
+#define __NR_timerfd_gettime	321
+#define __NR_signalfd4		322
+#define __NR_eventfd2		323
+#define __NR_inotify_init1	324
+#define __NR_pipe2		325
+#define __NR_dup3		326
+#define __NR_epoll_create1	327
+#define	__NR_preadv		328
+#define	__NR_pwritev		329
+#define __NR_rt_tgsigqueueinfo	330
+#define __NR_perf_event_open	331
+#define __NR_fanotify_init	332
+#define __NR_fanotify_mark	333
+#define __NR_prlimit64		334
+#define __NR_name_to_handle_at	335
+#define __NR_open_by_handle_at	336
+#define __NR_clock_adjtime	337
+#define __NR_syncfs		338
+#define __NR_setns		339
+#define __NR_process_vm_readv	340
+#define __NR_process_vm_writev	341
+#define __NR_s390_runtime_instr 342
+#define __NR_kcmp		343
+#define __NR_finit_module	344
+#define __NR_sched_setattr	345
+#define __NR_sched_getattr	346
+#define __NR_renameat2		347
+#define __NR_seccomp		348
+#define __NR_getrandom		349
+#define __NR_memfd_create	350
+#define __NR_bpf		351
+#define __NR_s390_pci_mmio_write	352
+#define __NR_s390_pci_mmio_read		353
+#define __NR_execveat		354
+#define __NR_userfaultfd	355
+#define __NR_membarrier		356
+#define __NR_recvmmsg		357
+#define __NR_sendmmsg		358
+#define __NR_socket		359
+#define __NR_socketpair		360
+#define __NR_bind		361
+#define __NR_connect		362
+#define __NR_listen		363
+#define __NR_accept4		364
+#define __NR_getsockopt		365
+#define __NR_setsockopt		366
+#define __NR_getsockname	367
+#define __NR_getpeername	368
+#define __NR_sendto		369
+#define __NR_sendmsg		370
+#define __NR_recvfrom		371
+#define __NR_recvmsg		372
+#define __NR_shutdown		373
+#define __NR_mlock2		374
+#define __NR_copy_file_range	375
+#define __NR_preadv2		376
+#define __NR_pwritev2		377
+#define __NR_s390_guarded_storage	378
+#define __NR_statx		379
+#define __NR_s390_sthyi		380
+#define NR_syscalls 381
+
+/* 
+ * There are some system calls that are not present on 64 bit, some
+ * have a different name although they do the same (e.g. __NR_chown32
+ * is __NR_chown on 64 bit).
+ */
+#ifndef __s390x__
+
+#define __NR_time		 13
+#define __NR_lchown		 16
+#define __NR_setuid		 23
+#define __NR_getuid		 24
+#define __NR_stime		 25
+#define __NR_setgid		 46
+#define __NR_getgid		 47
+#define __NR_geteuid		 49
+#define __NR_getegid		 50
+#define __NR_setreuid		 70
+#define __NR_setregid		 71
+#define __NR_getrlimit		 76
+#define __NR_getgroups		 80
+#define __NR_setgroups		 81
+#define __NR_fchown		 95
+#define __NR_ioperm		101
+#define __NR_setfsuid		138
+#define __NR_setfsgid		139
+#define __NR__llseek		140
+#define __NR__newselect 	142
+#define __NR_setresuid		164
+#define __NR_getresuid		165
+#define __NR_setresgid		170
+#define __NR_getresgid		171
+#define __NR_chown		182
+#define __NR_ugetrlimit		191	/* SuS compliant getrlimit */
+#define __NR_mmap2		192
+#define __NR_truncate64		193
+#define __NR_ftruncate64	194
+#define __NR_stat64		195
+#define __NR_lstat64		196
+#define __NR_fstat64		197
+#define __NR_lchown32		198
+#define __NR_getuid32		199
+#define __NR_getgid32		200
+#define __NR_geteuid32		201
+#define __NR_getegid32		202
+#define __NR_setreuid32		203
+#define __NR_setregid32		204
+#define __NR_getgroups32	205
+#define __NR_setgroups32	206
+#define __NR_fchown32		207
+#define __NR_setresuid32	208
+#define __NR_getresuid32	209
+#define __NR_setresgid32	210
+#define __NR_getresgid32	211
+#define __NR_chown32		212
+#define __NR_setuid32		213
+#define __NR_setgid32		214
+#define __NR_setfsuid32		215
+#define __NR_setfsgid32		216
+#define __NR_fcntl64		221
+#define __NR_sendfile64		223
+#define __NR_fadvise64_64	264
+#define __NR_fstatat64		293
+
+#else
+
+#define __NR_select		142
+#define __NR_getrlimit		191	/* SuS compliant getrlimit */
+#define __NR_lchown  		198
+#define __NR_getuid  		199
+#define __NR_getgid  		200
+#define __NR_geteuid  		201
+#define __NR_getegid  		202
+#define __NR_setreuid  		203
+#define __NR_setregid  		204
+#define __NR_getgroups  	205
+#define __NR_setgroups  	206
+#define __NR_fchown  		207
+#define __NR_setresuid  	208
+#define __NR_getresuid  	209
+#define __NR_setresgid  	210
+#define __NR_getresgid  	211
+#define __NR_chown  		212
+#define __NR_setuid  		213
+#define __NR_setgid  		214
+#define __NR_setfsuid  		215
+#define __NR_setfsgid  		216
+#define __NR_newfstatat		293
+
+#endif
+
+#endif /* _UAPI_ASM_S390_UNISTD_H_ */
diff --git a/tools/arch/sparc/include/uapi/asm/errno.h b/tools/arch/sparc/include/uapi/asm/errno.h
new file mode 100644
index 0000000..81a732b
--- /dev/null
+++ b/tools/arch/sparc/include/uapi/asm/errno.h
@@ -0,0 +1,118 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _SPARC_ERRNO_H
+#define _SPARC_ERRNO_H
+
+/* These match the SunOS error numbering scheme. */
+
+#include <asm-generic/errno-base.h>
+
+#define	EWOULDBLOCK	EAGAIN	/* Operation would block */
+#define	EINPROGRESS	36	/* Operation now in progress */
+#define	EALREADY	37	/* Operation already in progress */
+#define	ENOTSOCK	38	/* Socket operation on non-socket */
+#define	EDESTADDRREQ	39	/* Destination address required */
+#define	EMSGSIZE	40	/* Message too long */
+#define	EPROTOTYPE	41	/* Protocol wrong type for socket */
+#define	ENOPROTOOPT	42	/* Protocol not available */
+#define	EPROTONOSUPPORT	43	/* Protocol not supported */
+#define	ESOCKTNOSUPPORT	44	/* Socket type not supported */
+#define	EOPNOTSUPP	45	/* Op not supported on transport endpoint */
+#define	EPFNOSUPPORT	46	/* Protocol family not supported */
+#define	EAFNOSUPPORT	47	/* Address family not supported by protocol */
+#define	EADDRINUSE	48	/* Address already in use */
+#define	EADDRNOTAVAIL	49	/* Cannot assign requested address */
+#define	ENETDOWN	50	/* Network is down */
+#define	ENETUNREACH	51	/* Network is unreachable */
+#define	ENETRESET	52	/* Net dropped connection because of reset */
+#define	ECONNABORTED	53	/* Software caused connection abort */
+#define	ECONNRESET	54	/* Connection reset by peer */
+#define	ENOBUFS		55	/* No buffer space available */
+#define	EISCONN		56	/* Transport endpoint is already connected */
+#define	ENOTCONN	57	/* Transport endpoint is not connected */
+#define	ESHUTDOWN	58	/* No send after transport endpoint shutdown */
+#define	ETOOMANYREFS	59	/* Too many references: cannot splice */
+#define	ETIMEDOUT	60	/* Connection timed out */
+#define	ECONNREFUSED	61	/* Connection refused */
+#define	ELOOP		62	/* Too many symbolic links encountered */
+#define	ENAMETOOLONG	63	/* File name too long */
+#define	EHOSTDOWN	64	/* Host is down */
+#define	EHOSTUNREACH	65	/* No route to host */
+#define	ENOTEMPTY	66	/* Directory not empty */
+#define EPROCLIM        67      /* SUNOS: Too many processes */
+#define	EUSERS		68	/* Too many users */
+#define	EDQUOT		69	/* Quota exceeded */
+#define	ESTALE		70	/* Stale file handle */
+#define	EREMOTE		71	/* Object is remote */
+#define	ENOSTR		72	/* Device not a stream */
+#define	ETIME		73	/* Timer expired */
+#define	ENOSR		74	/* Out of streams resources */
+#define	ENOMSG		75	/* No message of desired type */
+#define	EBADMSG		76	/* Not a data message */
+#define	EIDRM		77	/* Identifier removed */
+#define	EDEADLK		78	/* Resource deadlock would occur */
+#define	ENOLCK		79	/* No record locks available */
+#define	ENONET		80	/* Machine is not on the network */
+#define ERREMOTE        81      /* SunOS: Too many lvls of remote in path */
+#define	ENOLINK		82	/* Link has been severed */
+#define	EADV		83	/* Advertise error */
+#define	ESRMNT		84	/* Srmount error */
+#define	ECOMM		85      /* Communication error on send */
+#define	EPROTO		86	/* Protocol error */
+#define	EMULTIHOP	87	/* Multihop attempted */
+#define	EDOTDOT		88	/* RFS specific error */
+#define	EREMCHG		89	/* Remote address changed */
+#define	ENOSYS		90	/* Function not implemented */
+
+/* The rest have no SunOS equivalent. */
+#define	ESTRPIPE	91	/* Streams pipe error */
+#define	EOVERFLOW	92	/* Value too large for defined data type */
+#define	EBADFD		93	/* File descriptor in bad state */
+#define	ECHRNG		94	/* Channel number out of range */
+#define	EL2NSYNC	95	/* Level 2 not synchronized */
+#define	EL3HLT		96	/* Level 3 halted */
+#define	EL3RST		97	/* Level 3 reset */
+#define	ELNRNG		98	/* Link number out of range */
+#define	EUNATCH		99	/* Protocol driver not attached */
+#define	ENOCSI		100	/* No CSI structure available */
+#define	EL2HLT		101	/* Level 2 halted */
+#define	EBADE		102	/* Invalid exchange */
+#define	EBADR		103	/* Invalid request descriptor */
+#define	EXFULL		104	/* Exchange full */
+#define	ENOANO		105	/* No anode */
+#define	EBADRQC		106	/* Invalid request code */
+#define	EBADSLT		107	/* Invalid slot */
+#define	EDEADLOCK	108	/* File locking deadlock error */
+#define	EBFONT		109	/* Bad font file format */
+#define	ELIBEXEC	110	/* Cannot exec a shared library directly */
+#define	ENODATA		111	/* No data available */
+#define	ELIBBAD		112	/* Accessing a corrupted shared library */
+#define	ENOPKG		113	/* Package not installed */
+#define	ELIBACC		114	/* Can not access a needed shared library */
+#define	ENOTUNIQ	115	/* Name not unique on network */
+#define	ERESTART	116	/* Interrupted syscall should be restarted */
+#define	EUCLEAN		117	/* Structure needs cleaning */
+#define	ENOTNAM		118	/* Not a XENIX named type file */
+#define	ENAVAIL		119	/* No XENIX semaphores available */
+#define	EISNAM		120	/* Is a named type file */
+#define	EREMOTEIO	121	/* Remote I/O error */
+#define	EILSEQ		122	/* Illegal byte sequence */
+#define	ELIBMAX		123	/* Atmpt to link in too many shared libs */
+#define	ELIBSCN		124	/* .lib section in a.out corrupted */
+
+#define	ENOMEDIUM	125	/* No medium found */
+#define	EMEDIUMTYPE	126	/* Wrong medium type */
+#define	ECANCELED	127	/* Operation Cancelled */
+#define	ENOKEY		128	/* Required key not available */
+#define	EKEYEXPIRED	129	/* Key has expired */
+#define	EKEYREVOKED	130	/* Key has been revoked */
+#define	EKEYREJECTED	131	/* Key was rejected by service */
+
+/* for robust mutexes */
+#define	EOWNERDEAD	132	/* Owner died */
+#define	ENOTRECOVERABLE	133	/* State not recoverable */
+
+#define	ERFKILL		134	/* Operation not possible due to RF-kill */
+
+#define EHWPOISON	135	/* Memory page has hardware error */
+
+#endif
diff --git a/tools/arch/x86/include/asm/cpufeatures.h b/tools/arch/x86/include/asm/cpufeatures.h
index 800104c..21ac898 100644
--- a/tools/arch/x86/include/asm/cpufeatures.h
+++ b/tools/arch/x86/include/asm/cpufeatures.h
@@ -197,11 +197,12 @@
 #define X86_FEATURE_CAT_L3		( 7*32+ 4) /* Cache Allocation Technology L3 */
 #define X86_FEATURE_CAT_L2		( 7*32+ 5) /* Cache Allocation Technology L2 */
 #define X86_FEATURE_CDP_L3		( 7*32+ 6) /* Code and Data Prioritization L3 */
+#define X86_FEATURE_INVPCID_SINGLE	( 7*32+ 7) /* Effectively INVPCID && CR4.PCIDE=1 */
 
 #define X86_FEATURE_HW_PSTATE		( 7*32+ 8) /* AMD HW-PState */
 #define X86_FEATURE_PROC_FEEDBACK	( 7*32+ 9) /* AMD ProcFeedbackInterface */
 #define X86_FEATURE_SME			( 7*32+10) /* AMD Secure Memory Encryption */
-
+#define X86_FEATURE_PTI			( 7*32+11) /* Kernel Page Table Isolation enabled */
 #define X86_FEATURE_INTEL_PPIN		( 7*32+14) /* Intel Processor Inventory Number */
 #define X86_FEATURE_INTEL_PT		( 7*32+15) /* Intel Processor Trace */
 #define X86_FEATURE_AVX512_4VNNIW	( 7*32+16) /* AVX-512 Neural Network Instructions */
@@ -340,5 +341,6 @@
 #define X86_BUG_SWAPGS_FENCE		X86_BUG(11) /* SWAPGS without input dep on GS */
 #define X86_BUG_MONITOR			X86_BUG(12) /* IPI required to wake up remote CPU */
 #define X86_BUG_AMD_E400		X86_BUG(13) /* CPU is among the affected by Erratum 400 */
+#define X86_BUG_CPU_MELTDOWN		X86_BUG(14) /* CPU is affected by meltdown attack and needs kernel page table isolation */
 
 #endif /* _ASM_X86_CPUFEATURES_H */
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index 14d6d50..b027633 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -50,6 +50,12 @@
 # define DISABLE_LA57	(1<<(X86_FEATURE_LA57 & 31))
 #endif
 
+#ifdef CONFIG_PAGE_TABLE_ISOLATION
+# define DISABLE_PTI		0
+#else
+# define DISABLE_PTI		(1 << (X86_FEATURE_PTI & 31))
+#endif
+
 /*
  * Make sure to add features to the correct mask
  */
@@ -60,7 +66,7 @@
 #define DISABLED_MASK4	(DISABLE_PCID)
 #define DISABLED_MASK5	0
 #define DISABLED_MASK6	0
-#define DISABLED_MASK7	0
+#define DISABLED_MASK7	(DISABLE_PTI)
 #define DISABLED_MASK8	0
 #define DISABLED_MASK9	(DISABLE_MPX)
 #define DISABLED_MASK10	0
diff --git a/tools/arch/x86/include/uapi/asm/errno.h b/tools/arch/x86/include/uapi/asm/errno.h
new file mode 100644
index 0000000..4c82b50
--- /dev/null
+++ b/tools/arch/x86/include/uapi/asm/errno.h
@@ -0,0 +1 @@
+#include <asm-generic/errno.h>
diff --git a/tools/build/Makefile.feature b/tools/build/Makefile.feature
index c71a05b..c378f00 100644
--- a/tools/build/Makefile.feature
+++ b/tools/build/Makefile.feature
@@ -56,6 +56,7 @@
         libunwind-arm                   \
         libunwind-aarch64               \
         pthread-attr-setaffinity-np     \
+        pthread-barrier     		\
         stackprotector-all              \
         timerfd                         \
         libdw-dwarf-unwind              \
@@ -65,7 +66,8 @@
         bpf                             \
         sched_getcpu			\
         sdt				\
-        setns
+        setns				\
+        libopencsd
 
 # FEATURE_TESTS_BASIC + FEATURE_TESTS_EXTRA is the complete list
 # of all feature tests
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 9698264..59585fe 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -37,6 +37,7 @@
          test-libunwind-debug-frame-arm.bin     \
          test-libunwind-debug-frame-aarch64.bin \
          test-pthread-attr-setaffinity-np.bin   \
+         test-pthread-barrier.bin		\
          test-stackprotector-all.bin            \
          test-timerfd.bin                       \
          test-libdw-dwarf-unwind.bin            \
@@ -51,7 +52,8 @@
          test-cxx.bin                           \
          test-jvmti.bin				\
          test-sched_getcpu.bin			\
-         test-setns.bin
+         test-setns.bin				\
+         test-libopencsd.bin
 
 FILES := $(addprefix $(OUTPUT),$(FILES))
 
@@ -79,6 +81,9 @@
 $(OUTPUT)test-pthread-attr-setaffinity-np.bin:
 	$(BUILD) -D_GNU_SOURCE -lpthread
 
+$(OUTPUT)test-pthread-barrier.bin:
+	$(BUILD) -lpthread
+
 $(OUTPUT)test-stackprotector-all.bin:
 	$(BUILD) -fstack-protector-all
 
@@ -100,6 +105,10 @@
 $(OUTPUT)test-setns.bin:
 	$(BUILD)
 
+$(OUTPUT)test-libopencsd.bin:
+	$(BUILD) # -lopencsd_c_api -lopencsd provided by
+		 # $(FEATURE_CHECK_LDFLAGS-libopencsd)
+
 DWARFLIBS := -ldw
 ifeq ($(findstring -static,${LDFLAGS}),-static)
 DWARFLIBS += -lelf -lebl -lz -llzma -lbz2
diff --git a/tools/build/feature/test-all.c b/tools/build/feature/test-all.c
index 4112702..8dc20a6 100644
--- a/tools/build/feature/test-all.c
+++ b/tools/build/feature/test-all.c
@@ -118,6 +118,10 @@
 # include "test-pthread-attr-setaffinity-np.c"
 #undef main
 
+#define main main_test_pthread_barrier
+# include "test-pthread-barrier.c"
+#undef main
+
 #define main main_test_sched_getcpu
 # include "test-sched_getcpu.c"
 #undef main
@@ -158,6 +162,10 @@
 # include "test-setns.c"
 #undef main
 
+#define main main_test_libopencsd
+# include "test-libopencsd.c"
+#undef main
+
 int main(int argc, char *argv[])
 {
 	main_test_libpython();
@@ -187,6 +195,7 @@ int main(int argc, char *argv[])
 	main_test_sync_compare_and_swap(argc, argv);
 	main_test_zlib();
 	main_test_pthread_attr_setaffinity_np();
+	main_test_pthread_barrier();
 	main_test_lzma();
 	main_test_get_cpuid();
 	main_test_bpf();
@@ -194,6 +203,7 @@ int main(int argc, char *argv[])
 	main_test_sched_getcpu();
 	main_test_sdt();
 	main_test_setns();
+	main_test_libopencsd();
 
 	return 0;
 }
diff --git a/tools/build/feature/test-libopencsd.c b/tools/build/feature/test-libopencsd.c
new file mode 100644
index 0000000..5ff1246
--- /dev/null
+++ b/tools/build/feature/test-libopencsd.c
@@ -0,0 +1,8 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <opencsd/c_api/opencsd_c_api.h>
+
+int main(void)
+{
+	(void)ocsd_get_version();
+	return 0;
+}
diff --git a/tools/build/feature/test-pthread-barrier.c b/tools/build/feature/test-pthread-barrier.c
new file mode 100644
index 0000000..0558d93
--- /dev/null
+++ b/tools/build/feature/test-pthread-barrier.c
@@ -0,0 +1,12 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <stdint.h>
+#include <pthread.h>
+
+int main(void)
+{
+	pthread_barrier_t barrier;
+
+	pthread_barrier_init(&barrier, NULL, 1);
+	pthread_barrier_wait(&barrier);
+	return pthread_barrier_destroy(&barrier);
+}
diff --git a/tools/include/uapi/asm-generic/errno-base.h b/tools/include/uapi/asm-generic/errno-base.h
new file mode 100644
index 0000000..9653140
--- /dev/null
+++ b/tools/include/uapi/asm-generic/errno-base.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_GENERIC_ERRNO_BASE_H
+#define _ASM_GENERIC_ERRNO_BASE_H
+
+#define	EPERM		 1	/* Operation not permitted */
+#define	ENOENT		 2	/* No such file or directory */
+#define	ESRCH		 3	/* No such process */
+#define	EINTR		 4	/* Interrupted system call */
+#define	EIO		 5	/* I/O error */
+#define	ENXIO		 6	/* No such device or address */
+#define	E2BIG		 7	/* Argument list too long */
+#define	ENOEXEC		 8	/* Exec format error */
+#define	EBADF		 9	/* Bad file number */
+#define	ECHILD		10	/* No child processes */
+#define	EAGAIN		11	/* Try again */
+#define	ENOMEM		12	/* Out of memory */
+#define	EACCES		13	/* Permission denied */
+#define	EFAULT		14	/* Bad address */
+#define	ENOTBLK		15	/* Block device required */
+#define	EBUSY		16	/* Device or resource busy */
+#define	EEXIST		17	/* File exists */
+#define	EXDEV		18	/* Cross-device link */
+#define	ENODEV		19	/* No such device */
+#define	ENOTDIR		20	/* Not a directory */
+#define	EISDIR		21	/* Is a directory */
+#define	EINVAL		22	/* Invalid argument */
+#define	ENFILE		23	/* File table overflow */
+#define	EMFILE		24	/* Too many open files */
+#define	ENOTTY		25	/* Not a typewriter */
+#define	ETXTBSY		26	/* Text file busy */
+#define	EFBIG		27	/* File too large */
+#define	ENOSPC		28	/* No space left on device */
+#define	ESPIPE		29	/* Illegal seek */
+#define	EROFS		30	/* Read-only file system */
+#define	EMLINK		31	/* Too many links */
+#define	EPIPE		32	/* Broken pipe */
+#define	EDOM		33	/* Math argument out of domain of func */
+#define	ERANGE		34	/* Math result not representable */
+
+#endif
diff --git a/tools/include/uapi/asm-generic/errno.h b/tools/include/uapi/asm-generic/errno.h
new file mode 100644
index 0000000..cf9c51a
--- /dev/null
+++ b/tools/include/uapi/asm-generic/errno.h
@@ -0,0 +1,123 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+#ifndef _ASM_GENERIC_ERRNO_H
+#define _ASM_GENERIC_ERRNO_H
+
+#include <asm-generic/errno-base.h>
+
+#define	EDEADLK		35	/* Resource deadlock would occur */
+#define	ENAMETOOLONG	36	/* File name too long */
+#define	ENOLCK		37	/* No record locks available */
+
+/*
+ * This error code is special: arch syscall entry code will return
+ * -ENOSYS if users try to call a syscall that doesn't exist.  To keep
+ * failures of syscalls that really do exist distinguishable from
+ * failures due to attempts to use a nonexistent syscall, syscall
+ * implementations should refrain from returning -ENOSYS.
+ */
+#define	ENOSYS		38	/* Invalid system call number */
+
+#define	ENOTEMPTY	39	/* Directory not empty */
+#define	ELOOP		40	/* Too many symbolic links encountered */
+#define	EWOULDBLOCK	EAGAIN	/* Operation would block */
+#define	ENOMSG		42	/* No message of desired type */
+#define	EIDRM		43	/* Identifier removed */
+#define	ECHRNG		44	/* Channel number out of range */
+#define	EL2NSYNC	45	/* Level 2 not synchronized */
+#define	EL3HLT		46	/* Level 3 halted */
+#define	EL3RST		47	/* Level 3 reset */
+#define	ELNRNG		48	/* Link number out of range */
+#define	EUNATCH		49	/* Protocol driver not attached */
+#define	ENOCSI		50	/* No CSI structure available */
+#define	EL2HLT		51	/* Level 2 halted */
+#define	EBADE		52	/* Invalid exchange */
+#define	EBADR		53	/* Invalid request descriptor */
+#define	EXFULL		54	/* Exchange full */
+#define	ENOANO		55	/* No anode */
+#define	EBADRQC		56	/* Invalid request code */
+#define	EBADSLT		57	/* Invalid slot */
+
+#define	EDEADLOCK	EDEADLK
+
+#define	EBFONT		59	/* Bad font file format */
+#define	ENOSTR		60	/* Device not a stream */
+#define	ENODATA		61	/* No data available */
+#define	ETIME		62	/* Timer expired */
+#define	ENOSR		63	/* Out of streams resources */
+#define	ENONET		64	/* Machine is not on the network */
+#define	ENOPKG		65	/* Package not installed */
+#define	EREMOTE		66	/* Object is remote */
+#define	ENOLINK		67	/* Link has been severed */
+#define	EADV		68	/* Advertise error */
+#define	ESRMNT		69	/* Srmount error */
+#define	ECOMM		70	/* Communication error on send */
+#define	EPROTO		71	/* Protocol error */
+#define	EMULTIHOP	72	/* Multihop attempted */
+#define	EDOTDOT		73	/* RFS specific error */
+#define	EBADMSG		74	/* Not a data message */
+#define	EOVERFLOW	75	/* Value too large for defined data type */
+#define	ENOTUNIQ	76	/* Name not unique on network */
+#define	EBADFD		77	/* File descriptor in bad state */
+#define	EREMCHG		78	/* Remote address changed */
+#define	ELIBACC		79	/* Can not access a needed shared library */
+#define	ELIBBAD		80	/* Accessing a corrupted shared library */
+#define	ELIBSCN		81	/* .lib section in a.out corrupted */
+#define	ELIBMAX		82	/* Attempting to link in too many shared libraries */
+#define	ELIBEXEC	83	/* Cannot exec a shared library directly */
+#define	EILSEQ		84	/* Illegal byte sequence */
+#define	ERESTART	85	/* Interrupted system call should be restarted */
+#define	ESTRPIPE	86	/* Streams pipe error */
+#define	EUSERS		87	/* Too many users */
+#define	ENOTSOCK	88	/* Socket operation on non-socket */
+#define	EDESTADDRREQ	89	/* Destination address required */
+#define	EMSGSIZE	90	/* Message too long */
+#define	EPROTOTYPE	91	/* Protocol wrong type for socket */
+#define	ENOPROTOOPT	92	/* Protocol not available */
+#define	EPROTONOSUPPORT	93	/* Protocol not supported */
+#define	ESOCKTNOSUPPORT	94	/* Socket type not supported */
+#define	EOPNOTSUPP	95	/* Operation not supported on transport endpoint */
+#define	EPFNOSUPPORT	96	/* Protocol family not supported */
+#define	EAFNOSUPPORT	97	/* Address family not supported by protocol */
+#define	EADDRINUSE	98	/* Address already in use */
+#define	EADDRNOTAVAIL	99	/* Cannot assign requested address */
+#define	ENETDOWN	100	/* Network is down */
+#define	ENETUNREACH	101	/* Network is unreachable */
+#define	ENETRESET	102	/* Network dropped connection because of reset */
+#define	ECONNABORTED	103	/* Software caused connection abort */
+#define	ECONNRESET	104	/* Connection reset by peer */
+#define	ENOBUFS		105	/* No buffer space available */
+#define	EISCONN		106	/* Transport endpoint is already connected */
+#define	ENOTCONN	107	/* Transport endpoint is not connected */
+#define	ESHUTDOWN	108	/* Cannot send after transport endpoint shutdown */
+#define	ETOOMANYREFS	109	/* Too many references: cannot splice */
+#define	ETIMEDOUT	110	/* Connection timed out */
+#define	ECONNREFUSED	111	/* Connection refused */
+#define	EHOSTDOWN	112	/* Host is down */
+#define	EHOSTUNREACH	113	/* No route to host */
+#define	EALREADY	114	/* Operation already in progress */
+#define	EINPROGRESS	115	/* Operation now in progress */
+#define	ESTALE		116	/* Stale file handle */
+#define	EUCLEAN		117	/* Structure needs cleaning */
+#define	ENOTNAM		118	/* Not a XENIX named type file */
+#define	ENAVAIL		119	/* No XENIX semaphores available */
+#define	EISNAM		120	/* Is a named type file */
+#define	EREMOTEIO	121	/* Remote I/O error */
+#define	EDQUOT		122	/* Quota exceeded */
+
+#define	ENOMEDIUM	123	/* No medium found */
+#define	EMEDIUMTYPE	124	/* Wrong medium type */
+#define	ECANCELED	125	/* Operation Canceled */
+#define	ENOKEY		126	/* Required key not available */
+#define	EKEYEXPIRED	127	/* Key has expired */
+#define	EKEYREVOKED	128	/* Key has been revoked */
+#define	EKEYREJECTED	129	/* Key was rejected by service */
+
+/* for robust mutexes */
+#define	EOWNERDEAD	130	/* Owner died */
+#define	ENOTRECOVERABLE	131	/* State not recoverable */
+
+#define ERFKILL		132	/* Operation not possible due to RF-kill */
+
+#define EHWPOISON	133	/* Memory page has hardware error */
+
+#endif
diff --git a/tools/include/uapi/linux/perf_event.h b/tools/include/uapi/linux/perf_event.h
index b9a4953..c77c9a2 100644
--- a/tools/include/uapi/linux/perf_event.h
+++ b/tools/include/uapi/linux/perf_event.h
@@ -612,9 +612,12 @@ struct perf_event_mmap_page {
  */
 #define PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT	(1 << 12)
 /*
- * PERF_RECORD_MISC_MMAP_DATA and PERF_RECORD_MISC_COMM_EXEC are used on
- * different events so can reuse the same bit position.
- * Ditto PERF_RECORD_MISC_SWITCH_OUT.
+ * Following PERF_RECORD_MISC_* are used on different
+ * events, so can reuse the same bit position:
+ *
+ *   PERF_RECORD_MISC_MMAP_DATA  - PERF_RECORD_MMAP* events
+ *   PERF_RECORD_MISC_COMM_EXEC  - PERF_RECORD_COMM event
+ *   PERF_RECORD_MISC_SWITCH_OUT - PERF_RECORD_SWITCH* events
  */
 #define PERF_RECORD_MISC_MMAP_DATA		(1 << 13)
 #define PERF_RECORD_MISC_COMM_EXEC		(1 << 13)
@@ -864,6 +867,7 @@ enum perf_event_type {
 	 *	struct perf_event_header	header;
 	 *	u32				pid;
 	 *	u32				tid;
+	 *	struct sample_id		sample_id;
 	 * };
 	 */
 	PERF_RECORD_ITRACE_START		= 12,
diff --git a/tools/lib/traceevent/event-parse.c b/tools/lib/traceevent/event-parse.c
index 7ce724f..e5f2acb 100644
--- a/tools/lib/traceevent/event-parse.c
+++ b/tools/lib/traceevent/event-parse.c
@@ -1094,7 +1094,7 @@ static enum event_type __read_token(char **tok)
 		if (strcmp(*tok, "LOCAL_PR_FMT") == 0) {
 			free(*tok);
 			*tok = NULL;
-			return force_token("\"\%s\" ", tok);
+			return force_token("\"%s\" ", tok);
 		} else if (strcmp(*tok, "STA_PR_FMT") == 0) {
 			free(*tok);
 			*tok = NULL;
@@ -3970,6 +3970,11 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
 				val &= ~fval;
 			}
 		}
+		if (val) {
+			if (print && arg->flags.delim)
+				trace_seq_puts(s, arg->flags.delim);
+			trace_seq_printf(s, "0x%llx", val);
+		}
 		break;
 	case PRINT_SYMBOL:
 		val = eval_num_arg(data, size, event, arg->symbol.field);
@@ -3980,6 +3985,8 @@ static void print_str_arg(struct trace_seq *s, void *data, int size,
 				break;
 			}
 		}
+		if (!flag)
+			trace_seq_printf(s, "0x%llx", val);
 		break;
 	case PRINT_HEX:
 	case PRINT_HEX_STR:
@@ -4293,6 +4300,26 @@ static struct print_arg *make_bprint_args(char *fmt, void *data, int size, struc
 				goto process_again;
 			case 'p':
 				ls = 1;
+				if (isalnum(ptr[1])) {
+					ptr++;
+					/* Check for special pointers */
+					switch (*ptr) {
+					case 's':
+					case 'S':
+					case 'f':
+					case 'F':
+						break;
+					default:
+						/*
+						 * Older kernels do not process
+						 * dereferenced pointers.
+						 * Only process if the pointer
+						 * value is a printable.
+						 */
+						if (isprint(*(char *)bptr))
+							goto process_string;
+					}
+				}
 				/* fall through */
 			case 'd':
 			case 'u':
@@ -4345,6 +4372,7 @@ static struct print_arg *make_bprint_args(char *fmt, void *data, int size, struc
 
 				break;
 			case 's':
+ process_string:
 				arg = alloc_arg();
 				if (!arg) {
 					do_warning_event(event, "%s(%d): not enough memory!",
@@ -4949,21 +4977,27 @@ static void pretty_print(struct trace_seq *s, void *data, int size, struct event
 				else
 					ls = 2;
 
-				if (*(ptr+1) == 'F' || *(ptr+1) == 'f' ||
-				    *(ptr+1) == 'S' || *(ptr+1) == 's') {
+				if (isalnum(ptr[1]))
 					ptr++;
+
+				if (arg->type == PRINT_BSTRING) {
+					trace_seq_puts(s, arg->string.string);
+					break;
+				}
+
+				if (*ptr == 'F' || *ptr == 'f' ||
+				    *ptr == 'S' || *ptr == 's') {
 					show_func = *ptr;
-				} else if (*(ptr+1) == 'M' || *(ptr+1) == 'm') {
-					print_mac_arg(s, *(ptr+1), data, size, event, arg);
-					ptr++;
+				} else if (*ptr == 'M' || *ptr == 'm') {
+					print_mac_arg(s, *ptr, data, size, event, arg);
 					arg = arg->next;
 					break;
-				} else if (*(ptr+1) == 'I' || *(ptr+1) == 'i') {
+				} else if (*ptr == 'I' || *ptr == 'i') {
 					int n;
 
-					n = print_ip_arg(s, ptr+1, data, size, event, arg);
+					n = print_ip_arg(s, ptr, data, size, event, arg);
 					if (n > 0) {
-						ptr += n;
+						ptr += n - 1;
 						arg = arg->next;
 						break;
 					}
@@ -5532,8 +5566,14 @@ void pevent_print_event(struct pevent *pevent, struct trace_seq *s,
 
 	event = pevent_find_event_by_record(pevent, record);
 	if (!event) {
-		do_warning("ug! no event found for type %d",
-			   trace_parse_common_type(pevent, record->data));
+		int i;
+		int type = trace_parse_common_type(pevent, record->data);
+
+		do_warning("ug! no event found for type %d", type);
+		trace_seq_printf(s, "[UNKNOWN TYPE %d]", type);
+		for (i = 0; i < record->size; i++)
+			trace_seq_printf(s, " %02x",
+					 ((unsigned char *)record->data)[i]);
 		return;
 	}
 
diff --git a/tools/lib/traceevent/event-plugin.c b/tools/lib/traceevent/event-plugin.c
index a16756a..d542cb6 100644
--- a/tools/lib/traceevent/event-plugin.c
+++ b/tools/lib/traceevent/event-plugin.c
@@ -120,12 +120,12 @@ char **traceevent_plugin_list_options(void)
 		for (op = reg->options; op->name; op++) {
 			char *alias = op->plugin_alias ? op->plugin_alias : op->file;
 			char **temp = list;
+			int ret;
 
-			name = malloc(strlen(op->name) + strlen(alias) + 2);
-			if (!name)
+			ret = asprintf(&name, "%s:%s", alias, op->name);
+			if (ret < 0)
 				goto err;
 
-			sprintf(name, "%s:%s", alias, op->name);
 			list = realloc(list, count + 2);
 			if (!list) {
 				list = temp;
@@ -290,17 +290,14 @@ load_plugin(struct pevent *pevent, const char *path,
 	const char *alias;
 	char *plugin;
 	void *handle;
+	int ret;
 
-	plugin = malloc(strlen(path) + strlen(file) + 2);
-	if (!plugin) {
+	ret = asprintf(&plugin, "%s/%s", path, file);
+	if (ret < 0) {
 		warning("could not allocate plugin memory\n");
 		return;
 	}
 
-	strcpy(plugin, path);
-	strcat(plugin, "/");
-	strcat(plugin, file);
-
 	handle = dlopen(plugin, RTLD_NOW | RTLD_GLOBAL);
 	if (!handle) {
 		warning("could not load plugin '%s'\n%s\n",
@@ -391,6 +388,7 @@ load_plugins(struct pevent *pevent, const char *suffix,
 	char *home;
 	char *path;
 	char *envdir;
+	int ret;
 
 	if (pevent->flags & PEVENT_DISABLE_PLUGINS)
 		return;
@@ -421,16 +419,12 @@ load_plugins(struct pevent *pevent, const char *suffix,
 	if (!home)
 		return;
 
-	path = malloc(strlen(home) + strlen(LOCAL_PLUGIN_DIR) + 2);
-	if (!path) {
+	ret = asprintf(&path, "%s/%s", home, LOCAL_PLUGIN_DIR);
+	if (ret < 0) {
 		warning("could not allocate plugin memory\n");
 		return;
 	}
 
-	strcpy(path, home);
-	strcat(path, "/");
-	strcat(path, LOCAL_PLUGIN_DIR);
-
 	load_plugins_dir(pevent, suffix, path, load_plugin, data);
 
 	free(path);
diff --git a/tools/lib/traceevent/kbuffer-parse.c b/tools/lib/traceevent/kbuffer-parse.c
index c94e364..ca424b1 100644
--- a/tools/lib/traceevent/kbuffer-parse.c
+++ b/tools/lib/traceevent/kbuffer-parse.c
@@ -24,8 +24,8 @@
 
 #include "kbuffer.h"
 
-#define MISSING_EVENTS (1 << 31)
-#define MISSING_STORED (1 << 30)
+#define MISSING_EVENTS (1UL << 31)
+#define MISSING_STORED (1UL << 30)
 
 #define COMMIT_MASK ((1 << 27) - 1)
 
diff --git a/tools/lib/traceevent/parse-filter.c b/tools/lib/traceevent/parse-filter.c
index 315df0a..431e8b3 100644
--- a/tools/lib/traceevent/parse-filter.c
+++ b/tools/lib/traceevent/parse-filter.c
@@ -287,12 +287,10 @@ find_event(struct pevent *pevent, struct event_list **events,
 		sys_name = NULL;
 	}
 
-	reg = malloc(strlen(event_name) + 3);
-	if (reg == NULL)
+	ret = asprintf(&reg, "^%s$", event_name);
+	if (ret < 0)
 		return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 
-	sprintf(reg, "^%s$", event_name);
-
 	ret = regcomp(&ereg, reg, REG_ICASE|REG_NOSUB);
 	free(reg);
 
@@ -300,13 +298,12 @@ find_event(struct pevent *pevent, struct event_list **events,
 		return PEVENT_ERRNO__INVALID_EVENT_NAME;
 
 	if (sys_name) {
-		reg = malloc(strlen(sys_name) + 3);
-		if (reg == NULL) {
+		ret = asprintf(&reg, "^%s$", sys_name);
+		if (ret < 0) {
 			regfree(&ereg);
 			return PEVENT_ERRNO__MEM_ALLOC_FAILED;
 		}
 
-		sprintf(reg, "^%s$", sys_name);
 		ret = regcomp(&sreg, reg, REG_ICASE|REG_NOSUB);
 		free(reg);
 		if (ret) {
@@ -1634,6 +1631,7 @@ int pevent_filter_clear_trivial(struct event_filter *filter,
 		case FILTER_TRIVIAL_FALSE:
 			if (filter_type->filter->boolean.value)
 				continue;
+			break;
 		case FILTER_TRIVIAL_TRUE:
 			if (!filter_type->filter->boolean.value)
 				continue;
@@ -1879,17 +1877,25 @@ static const char *get_field_str(struct filter_arg *arg, struct pevent_record *r
 	struct pevent *pevent;
 	unsigned long long addr;
 	const char *val = NULL;
+	unsigned int size;
 	char hex[64];
 
 	/* If the field is not a string convert it */
 	if (arg->str.field->flags & FIELD_IS_STRING) {
 		val = record->data + arg->str.field->offset;
+		size = arg->str.field->size;
+
+		if (arg->str.field->flags & FIELD_IS_DYNAMIC) {
+			addr = *(unsigned int *)val;
+			val = record->data + (addr & 0xffff);
+			size = addr >> 16;
+		}
 
 		/*
 		 * We need to copy the data since we can't be sure the field
 		 * is null terminated.
 		 */
-		if (*(val + arg->str.field->size - 1)) {
+		if (*(val + size - 1)) {
 			/* copy it */
 			memcpy(arg->str.buffer, val, arg->str.field->size);
 			/* the buffer is already NULL terminated */
diff --git a/tools/objtool/Makefile b/tools/objtool/Makefile
index ae0272f..e6acc28 100644
--- a/tools/objtool/Makefile
+++ b/tools/objtool/Makefile
@@ -46,7 +46,7 @@
 	@$(MAKE) $(build)=objtool
 
 $(OBJTOOL): $(LIBSUBCMD) $(OBJTOOL_IN)
-	@./sync-check.sh
+	@$(CONFIG_SHELL) ./sync-check.sh
 	$(QUIET_LINK)$(CC) $(OBJTOOL_IN) $(LDFLAGS) -o $@
 
 
diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index 9b341584..f40d46e 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -428,6 +428,40 @@ static void add_ignores(struct objtool_file *file)
 }
 
 /*
+ * FIXME: For now, just ignore any alternatives which add retpolines.  This is
+ * a temporary hack, as it doesn't allow ORC to unwind from inside a retpoline.
+ * But it at least allows objtool to understand the control flow *around* the
+ * retpoline.
+ */
+static int add_nospec_ignores(struct objtool_file *file)
+{
+	struct section *sec;
+	struct rela *rela;
+	struct instruction *insn;
+
+	sec = find_section_by_name(file->elf, ".rela.discard.nospec");
+	if (!sec)
+		return 0;
+
+	list_for_each_entry(rela, &sec->rela_list, list) {
+		if (rela->sym->type != STT_SECTION) {
+			WARN("unexpected relocation symbol type in %s", sec->name);
+			return -1;
+		}
+
+		insn = find_insn(file, rela->sym->sec, rela->addend);
+		if (!insn) {
+			WARN("bad .discard.nospec entry");
+			return -1;
+		}
+
+		insn->ignore_alts = true;
+	}
+
+	return 0;
+}
+
+/*
  * Find the destination instructions for all jumps.
  */
 static int add_jump_destinations(struct objtool_file *file)
@@ -456,6 +490,13 @@ static int add_jump_destinations(struct objtool_file *file)
 		} else if (rela->sym->sec->idx) {
 			dest_sec = rela->sym->sec;
 			dest_off = rela->sym->sym.st_value + rela->addend + 4;
+		} else if (strstr(rela->sym->name, "_indirect_thunk_")) {
+			/*
+			 * Retpoline jumps are really dynamic jumps in
+			 * disguise, so convert them accordingly.
+			 */
+			insn->type = INSN_JUMP_DYNAMIC;
+			continue;
 		} else {
 			/* sibling call */
 			insn->jump_dest = 0;
@@ -502,11 +543,18 @@ static int add_call_destinations(struct objtool_file *file)
 			dest_off = insn->offset + insn->len + insn->immediate;
 			insn->call_dest = find_symbol_by_offset(insn->sec,
 								dest_off);
+			/*
+			 * FIXME: Thanks to retpolines, it's now considered
+			 * normal for a function to call within itself.  So
+			 * disable this warning for now.
+			 */
+#if 0
 			if (!insn->call_dest) {
 				WARN_FUNC("can't find call dest symbol at offset 0x%lx",
 					  insn->sec, insn->offset, dest_off);
 				return -1;
 			}
+#endif
 		} else if (rela->sym->type == STT_SECTION) {
 			insn->call_dest = find_symbol_by_offset(rela->sym->sec,
 								rela->addend+4);
@@ -671,12 +719,6 @@ static int add_special_section_alts(struct objtool_file *file)
 		return ret;
 
 	list_for_each_entry_safe(special_alt, tmp, &special_alts, list) {
-		alt = malloc(sizeof(*alt));
-		if (!alt) {
-			WARN("malloc failed");
-			ret = -1;
-			goto out;
-		}
 
 		orig_insn = find_insn(file, special_alt->orig_sec,
 				      special_alt->orig_off);
@@ -687,6 +729,10 @@ static int add_special_section_alts(struct objtool_file *file)
 			goto out;
 		}
 
+		/* Ignore retpoline alternatives. */
+		if (orig_insn->ignore_alts)
+			continue;
+
 		new_insn = NULL;
 		if (!special_alt->group || special_alt->new_len) {
 			new_insn = find_insn(file, special_alt->new_sec,
@@ -712,6 +758,13 @@ static int add_special_section_alts(struct objtool_file *file)
 				goto out;
 		}
 
+		alt = malloc(sizeof(*alt));
+		if (!alt) {
+			WARN("malloc failed");
+			ret = -1;
+			goto out;
+		}
+
 		alt->insn = new_insn;
 		list_add_tail(&alt->list, &orig_insn->alts);
 
@@ -1028,6 +1081,10 @@ static int decode_sections(struct objtool_file *file)
 
 	add_ignores(file);
 
+	ret = add_nospec_ignores(file);
+	if (ret)
+		return ret;
+
 	ret = add_jump_destinations(file);
 	if (ret)
 		return ret;
diff --git a/tools/objtool/check.h b/tools/objtool/check.h
index 47d9ea7..dbadb30 100644
--- a/tools/objtool/check.h
+++ b/tools/objtool/check.h
@@ -44,7 +44,7 @@ struct instruction {
 	unsigned int len;
 	unsigned char type;
 	unsigned long immediate;
-	bool alt_group, visited, dead_end, ignore, hint, save, restore;
+	bool alt_group, visited, dead_end, ignore, hint, save, restore, ignore_alts;
 	struct symbol *call_dest;
 	struct instruction *jump_dest;
 	struct list_head alts;
diff --git a/tools/objtool/elf.c b/tools/objtool/elf.c
index 2446015..c1c3386 100644
--- a/tools/objtool/elf.c
+++ b/tools/objtool/elf.c
@@ -26,6 +26,7 @@
 #include <stdlib.h>
 #include <string.h>
 #include <unistd.h>
+#include <errno.h>
 
 #include "elf.h"
 #include "warn.h"
@@ -358,7 +359,8 @@ struct elf *elf_open(const char *name, int flags)
 
 	elf->fd = open(name, flags);
 	if (elf->fd == -1) {
-		perror("open");
+		fprintf(stderr, "objtool: Can't open '%s': %s\n",
+			name, strerror(errno));
 		goto err;
 	}
 
diff --git a/tools/perf/Build b/tools/perf/Build
index b48ca40..e5232d5 100644
--- a/tools/perf/Build
+++ b/tools/perf/Build
@@ -25,7 +25,7 @@
 perf-y += builtin-version.o
 perf-y += builtin-c2c.o
 
-perf-$(CONFIG_AUDIT) += builtin-trace.o
+perf-$(CONFIG_TRACE) += builtin-trace.o
 perf-$(CONFIG_LIBELF) += builtin-probe.o
 
 perf-y += bench/
@@ -50,6 +50,6 @@
 libperf-y += arch/
 libperf-y += ui/
 libperf-y += scripts/
-libperf-$(CONFIG_AUDIT) += trace/beauty/
+libperf-$(CONFIG_TRACE) += trace/beauty/
 
 gtk-y += ui/gtk/
diff --git a/tools/perf/Documentation/perf-buildid-cache.txt b/tools/perf/Documentation/perf-buildid-cache.txt
index 8468100..73c2650 100644
--- a/tools/perf/Documentation/perf-buildid-cache.txt
+++ b/tools/perf/Documentation/perf-buildid-cache.txt
@@ -24,6 +24,9 @@
 -a::
 --add=::
         Add specified file to the cache.
+-f::
+--force::
+	Don't complain, do it.
 -k::
 --kcore::
         Add specified kcore file to the cache. For the current host that is
diff --git a/tools/perf/Documentation/perf-evlist.txt b/tools/perf/Documentation/perf-evlist.txt
index 6f7200f..c0a6640 100644
--- a/tools/perf/Documentation/perf-evlist.txt
+++ b/tools/perf/Documentation/perf-evlist.txt
@@ -20,6 +20,10 @@
 --input=::
         Input file name. (default: perf.data unless stdin is a fifo)
 
+-f::
+--force::
+	Don't complain, do it.
+
 -F::
 --freq=::
 	Show just the sample frequency used for each event.
diff --git a/tools/perf/Documentation/perf-inject.txt b/tools/perf/Documentation/perf-inject.txt
index 87b2588..a64d658 100644
--- a/tools/perf/Documentation/perf-inject.txt
+++ b/tools/perf/Documentation/perf-inject.txt
@@ -60,6 +60,10 @@
 	found in the jitdumps files captured in the input perf.data file. Use this option
 	if you are monitoring environment using JIT runtimes, such as Java, DART or V8.
 
+-f::
+--force::
+	Don't complain, do it.
+
 SEE ALSO
 --------
 linkperf:perf-record[1], linkperf:perf-report[1], linkperf:perf-archive[1]
diff --git a/tools/perf/Documentation/perf-lock.txt b/tools/perf/Documentation/perf-lock.txt
index ab25be2..74d7745 100644
--- a/tools/perf/Documentation/perf-lock.txt
+++ b/tools/perf/Documentation/perf-lock.txt
@@ -42,6 +42,10 @@
 --dump-raw-trace::
         Dump raw trace in ASCII.
 
+-f::
+--force::
+	Don't complan, do it.
+
 REPORT OPTIONS
 --------------
 
diff --git a/tools/perf/Documentation/perf-probe.txt b/tools/perf/Documentation/perf-probe.txt
index d7e4869..b6866a0 100644
--- a/tools/perf/Documentation/perf-probe.txt
+++ b/tools/perf/Documentation/perf-probe.txt
@@ -170,7 +170,7 @@
      or,
      sdt_PROVIDER:SDTEVENT
 
-'EVENT' specifies the name of new event, if omitted, it will be set the name of the probed function. You can also specify a group name by 'GROUP', if omitted, set 'probe' is used for kprobe and 'probe_<bin>' is used for uprobe.
+'EVENT' specifies the name of new event, if omitted, it will be set the name of the probed function, and for return probes, a "\_\_return" suffix is automatically added to the function name. You can also specify a group name by 'GROUP', if omitted, set 'probe' is used for kprobe and 'probe_<bin>' is used for uprobe.
 Note that using existing group name can conflict with other events. Especially, using the group name reserved for kernel modules can hide embedded events in the
 modules.
 'FUNC' specifies a probed function name, and it may have one of the following options; '+OFFS' is the offset from function entry address in bytes, ':RLN' is the relative-line number from function entry line, and '%return' means that it probes function return. And ';PTN' means lazy matching pattern (see LAZY MATCHING). Note that ';PTN' must be the end of the probe point definition.  In addition, '@SRC' specifies a source file which has that function.
@@ -182,6 +182,14 @@
 For details of the SDT, see below.
 https://sourceware.org/gdb/onlinedocs/gdb/Static-Probe-Points.html
 
+ESCAPED CHARACTER
+-----------------
+
+In the probe syntax, '=', '@', '+', ':' and ';' are treated as a special character. You can use a backslash ('\') to escape the special characters.
+This is useful if you need to probe on a specific versioned symbols, like @GLIBC_... suffixes, or also you need to specify a source file which includes the special characters.
+Note that usually single backslash is consumed by shell, so you might need to pass double backslash (\\) or wrapping with single quotes (\'AAA\@BBB').
+See EXAMPLES how it is used.
+
 PROBE ARGUMENT
 --------------
 Each probe argument follows below syntax.
@@ -277,6 +285,14 @@
 
  ./perf probe --target-ns <target pid> -x /usr/lib/jvm/java-1.8.0-openjdk-1.8.0.121-0.b13.el7_3.x86_64/jre/lib/amd64/server/libjvm.so %sdt_hotspot:thread__sleep__end
 
+Add a probe on specific versioned symbol by backslash escape
+
+ ./perf probe -x /lib64/libc-2.25.so 'malloc_get_state\@GLIBC_2.2.5'
+
+Add a probe in a source file using special characters by backslash escape
+
+ ./perf probe -x /opt/test/a.out 'foo\+bar.c:4'
+
 
 SEE ALSO
 --------
diff --git a/tools/perf/Documentation/perf-record.txt b/tools/perf/Documentation/perf-record.txt
index 5a626ef..3eea6de 100644
--- a/tools/perf/Documentation/perf-record.txt
+++ b/tools/perf/Documentation/perf-record.txt
@@ -430,6 +430,9 @@
 --timestamp-filename
 Append timestamp to output file name.
 
+--timestamp-boundary::
+Record timestamp boundary (time of first/last samples).
+
 --switch-output[=mode]::
 Generate multiple perf.data files, timestamp prefixed, switching to a new one
 based on 'mode' value:
diff --git a/tools/perf/Documentation/perf-report.txt b/tools/perf/Documentation/perf-report.txt
index ddde2b5..907e505 100644
--- a/tools/perf/Documentation/perf-report.txt
+++ b/tools/perf/Documentation/perf-report.txt
@@ -402,6 +402,26 @@
 	stop time is not given (i.e, time string is 'x.y,') then analysis goes
 	to end of file.
 
+	Also support time percent with multiple time range. Time string is
+	'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
+
+	For example:
+	Select the second 10% time slice:
+
+	  perf report --time 10%/2
+
+	Select from 0% to 10% time slice:
+
+	  perf report --time 0%-10%
+
+	Select the first and second 10% time slices:
+
+	  perf report --time 10%/1,10%/2
+
+	Select from 0% to 10% and 30% to 40% slices:
+
+	  perf report --time 0%-10%,30%-40%
+
 --itrace::
 	Options for decoding instruction tracing data. The options are:
 
@@ -437,8 +457,23 @@
 	will be printed. Each entry is function name or file/line. Enabled by
 	default, disable with --no-inline.
 
+--mmaps::
+	Show --tasks output plus mmap information in a format similar to
+	/proc/<PID>/maps.
+
+	Please note that not all mmaps are stored, options affecting which ones
+	are include 'perf record --data', for instance.
+
+--stats::
+	Display overall events statistics without any further processing.
+	(like the one at the end of the perf report -D command)
+
+--tasks::
+	Display monitored tasks stored in perf data. Displaying pid/tid/ppid
+	plus the command string aligned to distinguish parent and child tasks.
+
 include::callchain-overhead-calculation.txt[]
 
 SEE ALSO
 --------
-linkperf:perf-stat[1], linkperf:perf-annotate[1]
+linkperf:perf-stat[1], linkperf:perf-annotate[1], linkperf:perf-record[1]
diff --git a/tools/perf/Documentation/perf-sched.txt b/tools/perf/Documentation/perf-sched.txt
index 55b6733..c7e50f2 100644
--- a/tools/perf/Documentation/perf-sched.txt
+++ b/tools/perf/Documentation/perf-sched.txt
@@ -74,6 +74,10 @@
 --dump-raw-trace=::
         Display verbose dump of the sched data.
 
+-f::
+--force::
+	Don't complain, do it.
+
 OPTIONS for 'perf sched map'
 ----------------------------
 
diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 2811fcf..7730c1d 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -117,7 +117,7 @@
         Comma separated list of fields to print. Options are:
         comm, tid, pid, time, cpu, event, trace, ip, sym, dso, addr, symoff,
         srcline, period, iregs, uregs, brstack, brstacksym, flags, bpf-output, brstackinsn,
-        brstackoff, callindent, insn, insnlen, synth, phys_addr.
+        brstackoff, callindent, insn, insnlen, synth, phys_addr, metric, misc.
         Field list can be prepended with the type, trace, sw or hw,
         to indicate to which event type the field list applies.
         e.g., -F sw:comm,tid,time,ip,sym  and -F trace:time,cpu,trace
@@ -217,6 +217,32 @@
 
 	The brstackoff field will print an offset into a specific dso/binary.
 
+	With the metric option perf script can compute metrics for
+	sampling periods, similar to perf stat. This requires
+	specifying a group with multiple metrics with the :S option
+	for perf record. perf will sample on the first event, and
+	compute metrics for all the events in the group. Please note
+	that the metric computed is averaged over the whole sampling
+	period, not just for the sample point.
+
+	For sample events it's possible to display misc field with -F +misc option,
+	following letters are displayed for each bit:
+
+	  PERF_RECORD_MISC_KERNEL        K
+	  PERF_RECORD_MISC_USER          U
+	  PERF_RECORD_MISC_HYPERVISOR    H
+	  PERF_RECORD_MISC_GUEST_KERNEL  G
+	  PERF_RECORD_MISC_GUEST_USER    g
+	  PERF_RECORD_MISC_MMAP_DATA*    M
+	  PERF_RECORD_MISC_COMM_EXEC     E
+	  PERF_RECORD_MISC_SWITCH_OUT    S
+
+	  $ perf script -F +misc ...
+	   sched-messaging  1414 K     28690.636582:       4590 cycles ...
+	   sched-messaging  1407 U     28690.636600:     325620 cycles ...
+	   sched-messaging  1414 K     28690.636608:      19473 cycles ...
+	  misc field ___________/
+
 -k::
 --vmlinux=<file>::
         vmlinux pathname
@@ -274,6 +300,9 @@
 	Display context switch events i.e. events of type PERF_RECORD_SWITCH or
 	PERF_RECORD_SWITCH_CPU_WIDE.
 
+--show-lost-events
+	Display lost events i.e. events of type PERF_RECORD_LOST.
+
 --demangle::
 	Demangle symbol names to human readable form. It's enabled by default,
 	disable with --no-demangle.
@@ -321,6 +350,22 @@
 	stop time is not given (i.e, time string is 'x.y,') then analysis goes
 	to end of file.
 
+	Also support time percent with multipe time range. Time string is
+	'a%/n,b%/m,...' or 'a%-b%,c%-%d,...'.
+
+	For example:
+	Select the second 10% time slice:
+	perf script --time 10%/2
+
+	Select from 0% to 10% time slice:
+	perf script --time 0%-10%
+
+	Select the first and second 10% time slices:
+	perf script --time 10%/1,10%/2
+
+	Select from 0% to 10% and 30% to 40% slices:
+	perf script --time 0%-10%,30%-40%
+
 --max-blocks::
 	Set the maximum number of program blocks to print with brstackasm for
 	each sample.
diff --git a/tools/perf/Documentation/perf-timechart.txt b/tools/perf/Documentation/perf-timechart.txt
index df98d1c..ef0c756 100644
--- a/tools/perf/Documentation/perf-timechart.txt
+++ b/tools/perf/Documentation/perf-timechart.txt
@@ -50,7 +50,9 @@
 -p::
 --process::
         Select the processes to display, by name or PID
-
+-f::
+--force::
+	Don't complain, do it.
 --symfs=<directory>::
         Look for files with symbols relative to this directory.
 -n::
diff --git a/tools/perf/Documentation/perf-top.txt b/tools/perf/Documentation/perf-top.txt
index 4353262..8a32cc7 100644
--- a/tools/perf/Documentation/perf-top.txt
+++ b/tools/perf/Documentation/perf-top.txt
@@ -268,6 +268,12 @@
 [S]::
 	Stop annotation, return to full profile display.
 
+[K]::
+	Hide kernel symbols.
+
+[U]::
+	Hide user symbols.
+
 [z]::
 	Toggle event count zeroing across display updates.
 
diff --git a/tools/perf/Documentation/perf-trace.txt b/tools/perf/Documentation/perf-trace.txt
index d53bea6..33a88e9 100644
--- a/tools/perf/Documentation/perf-trace.txt
+++ b/tools/perf/Documentation/perf-trace.txt
@@ -86,18 +86,18 @@
 In per-thread mode with inheritance mode on (default), Events are captured only when
 the thread executes on the designated CPUs. Default is to monitor all CPUs.
 
---duration:
+--duration::
 	Show only events that had a duration greater than N.M ms.
 
---sched:
+--sched::
 	Accrue thread runtime and provide a summary at the end of the session.
 
--i
---input
+-i::
+--input::
 	Process events from a given perf data file.
 
--T
---time
+-T::
+--time::
 	Print full timestamp rather time relative to first sample.
 
 --comm::
@@ -117,6 +117,10 @@
 	Show tool stats such as number of times fd->pathname was discovered thru
 	hooking the open syscall return + vfs_getname or via reading /proc/pid/fd, etc.
 
+-f::
+--force::
+	Don't complain, do it.
+
 -F=[all|min|maj]::
 --pf=[all|min|maj]::
 	Trace pagefaults. Optionally, you can specify whether you want minor,
@@ -159,6 +163,10 @@
         Implies '--call-graph dwarf' when --call-graph not present on the
         command line, on systems where DWARF unwinding was built in.
 
+--print-sample::
+	Print the PERF_RECORD_SAMPLE PERF_SAMPLE_ info for the
+	raw_syscalls:sys_{enter,exit} tracepoints, for debugging.
+
 --proc-map-timeout::
 	When processing pre-existing threads /proc/XXX/mmap, it may take a long time,
 	because the file may be huge. A time out is needed in such cases.
diff --git a/tools/perf/Documentation/perf.data-file-format.txt b/tools/perf/Documentation/perf.data-file-format.txt
index e90c59c..f7d85e8 100644
--- a/tools/perf/Documentation/perf.data-file-format.txt
+++ b/tools/perf/Documentation/perf.data-file-format.txt
@@ -238,6 +238,33 @@
 	struct auxtrace_index_entry entries[PERF_AUXTRACE_INDEX_ENTRY_COUNT];
 };
 
+	HEADER_STAT = 19,
+
+This is merely a flag signifying that the data section contains data
+recorded from perf stat record.
+
+	HEADER_CACHE = 20,
+
+Description of the cache hierarchy. Based on the Linux sysfs format
+in /sys/devices/system/cpu/cpu*/cache/
+
+	u32 version	Currently always 1
+	u32 number_of_cache_levels
+
+struct {
+	u32	level;
+	u32	line_size;
+	u32	sets;
+	u32	ways;
+	struct perf_header_string type;
+	struct perf_header_string size;
+	struct perf_header_string map;
+}[number_of_cache_levels];
+
+	HEADER_SAMPLE_TIME = 21,
+
+Two uint64_t for the time of first sample and the time of last sample.
+
 	other bits are reserved and should ignored for now
 	HEADER_FEAT_BITS	= 256,
 
diff --git a/tools/perf/Documentation/tips.txt b/tools/perf/Documentation/tips.txt
index db0ca30..849599f 100644
--- a/tools/perf/Documentation/tips.txt
+++ b/tools/perf/Documentation/tips.txt
@@ -32,3 +32,5 @@
 System-wide collection from all CPUs: perf record -a
 Show current config key-value pairs: perf config --list
 Show user configuration overrides: perf config --user --list
+To add Node.js USDT(User-Level Statically Defined Tracing): perf buildid-cache --add `which node`
+To report cacheline events from previous recording: perf c2c report
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config
index 0294bfb..0dfdaa9 100644
--- a/tools/perf/Makefile.config
+++ b/tools/perf/Makefile.config
@@ -22,6 +22,7 @@
 $(call detected_var,SRCARCH)
 
 NO_PERF_REGS := 1
+NO_SYSCALL_TABLE := 1
 
 # Additional ARCH settings for ppc
 ifeq ($(SRCARCH),powerpc)
@@ -33,7 +34,8 @@
 ifeq ($(SRCARCH),x86)
   $(call detected,CONFIG_X86)
   ifeq (${IS_64_BIT}, 1)
-    CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT -DHAVE_SYSCALL_TABLE -I$(OUTPUT)arch/x86/include/generated
+    NO_SYSCALL_TABLE := 0
+    CFLAGS += -DHAVE_ARCH_X86_64_SUPPORT -I$(OUTPUT)arch/x86/include/generated
     ARCH_INCLUDE = ../../arch/x86/lib/memcpy_64.S ../../arch/x86/lib/memset_64.S
     LIBUNWIND_LIBS = -lunwind-x86_64 -lunwind -llzma
     $(call detected,CONFIG_X86_64)
@@ -55,12 +57,18 @@
 
 ifeq ($(ARCH),s390)
   NO_PERF_REGS := 0
+  NO_SYSCALL_TABLE := 0
+  CFLAGS += -fPIC -I$(OUTPUT)arch/s390/include/generated
 endif
 
 ifeq ($(NO_PERF_REGS),0)
   $(call detected,CONFIG_PERF_REGS)
 endif
 
+ifneq ($(NO_SYSCALL_TABLE),1)
+  CFLAGS += -DHAVE_SYSCALL_TABLE
+endif
+
 # So far there's only x86 and arm libdw unwind support merged in perf.
 # Disable it on all other architectures in case libdw unwind
 # support is detected in system. Add supported architectures
@@ -97,6 +105,16 @@
 FEATURE_CHECK_CFLAGS-libunwind-debug-frame = $(LIBUNWIND_CFLAGS)
 FEATURE_CHECK_LDFLAGS-libunwind-debug-frame = $(LIBUNWIND_LDFLAGS) $(LIBUNWIND_LIBS)
 
+ifdef CSINCLUDES
+  LIBOPENCSD_CFLAGS := -I$(CSINCLUDES)
+endif
+OPENCSDLIBS := -lopencsd_c_api -lopencsd
+ifdef CSLIBS
+  LIBOPENCSD_LDFLAGS := -L$(CSLIBS)
+endif
+FEATURE_CHECK_CFLAGS-libopencsd := $(LIBOPENCSD_CFLAGS)
+FEATURE_CHECK_LDFLAGS-libopencsd := $(LIBOPENCSD_LDFLAGS) $(OPENCSDLIBS)
+
 ifeq ($(NO_PERF_REGS),0)
   CFLAGS += -DHAVE_PERF_REGS_SUPPORT
 endif
@@ -265,6 +283,10 @@
   CFLAGS += -DHAVE_PTHREAD_ATTR_SETAFFINITY_NP
 endif
 
+ifeq ($(feature-pthread-barrier), 1)
+  CFLAGS += -DHAVE_PTHREAD_BARRIER
+endif
+
 ifndef NO_BIONIC
   $(call feature_check,bionic)
   ifeq ($(feature-bionic), 1)
@@ -341,6 +363,21 @@
   $(call detected,CONFIG_SETNS)
 endif
 
+ifndef NO_CORESIGHT
+  ifeq ($(feature-libopencsd), 1)
+    CFLAGS += -DHAVE_CSTRACE_SUPPORT $(LIBOPENCSD_CFLAGS)
+    LDFLAGS += $(LIBOPENCSD_LDFLAGS)
+    EXTLIBS += $(OPENCSDLIBS)
+    $(call detected,CONFIG_LIBOPENCSD)
+    ifdef CSTRACE_RAW
+      CFLAGS += -DCS_DEBUG_RAW
+      ifeq (${CSTRACE_RAW}, packed)
+        CFLAGS += -DCS_RAW_PACKED
+      endif
+    endif
+  endif
+endif
+
 ifndef NO_LIBELF
   CFLAGS += -DHAVE_LIBELF_SUPPORT
   EXTLIBS += -lelf
@@ -519,14 +556,18 @@
   EXTLIBS += $(EXTLIBS_LIBUNWIND)
 endif
 
-ifndef NO_LIBAUDIT
-  ifneq ($(feature-libaudit), 1)
-    msg := $(warning No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev);
-    NO_LIBAUDIT := 1
-  else
-    CFLAGS += -DHAVE_LIBAUDIT_SUPPORT
-    EXTLIBS += -laudit
-    $(call detected,CONFIG_AUDIT)
+ifeq ($(NO_SYSCALL_TABLE),0)
+  $(call detected,CONFIG_TRACE)
+else
+  ifndef NO_LIBAUDIT
+    ifneq ($(feature-libaudit), 1)
+      msg := $(warning No libaudit.h found, disables 'trace' tool, please install audit-libs-devel or libaudit-dev);
+      NO_LIBAUDIT := 1
+    else
+      CFLAGS += -DHAVE_LIBAUDIT_SUPPORT
+      EXTLIBS += -laudit
+      $(call detected,CONFIG_TRACE)
+    endif
   endif
 endif
 
@@ -768,7 +809,7 @@
   NO_PERF_READ_VDSOX32 := 1
 endif
 
-ifdef LIBBABELTRACE
+ifndef NO_LIBBABELTRACE
   $(call feature_check,libbabeltrace)
   ifeq ($(feature-libbabeltrace), 1)
     CFLAGS += -DHAVE_LIBBABELTRACE_SUPPORT $(LIBBABELTRACE_CFLAGS)
@@ -935,6 +976,10 @@
 endef
 
 ifeq ($(VF),1)
+  # Display EXTRA features which are detected manualy
+  # from here with feature_check call and thus cannot
+  # be partof global state output.
+  $(foreach feat,$(FEATURE_TESTS_EXTRA),$(call feature_print_status,$(feat),))
   $(call print_var,prefix)
   $(call print_var,bindir)
   $(call print_var,libdir)
diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf
index 68cf136..9b0351d 100644
--- a/tools/perf/Makefile.perf
+++ b/tools/perf/Makefile.perf
@@ -77,7 +77,7 @@
 #
 # Define NO_ZLIB if you do not want to support compressed kernel modules
 #
-# Define LIBBABELTRACE if you DO want libbabeltrace support
+# Define NO_LIBBABELTRACE if you do not want libbabeltrace support
 # for CTF data format.
 #
 # Define NO_LZMA if you do not want to support compressed (xz) kernel modules
@@ -98,6 +98,8 @@
 # When selected, pass LLVM_CONFIG=/path/to/llvm-config to `make' if
 # llvm-config is not in $PATH.
 
+# Define NO_CORESIGHT if you do not want support for CoreSight trace decoding.
+
 # As per kernel Makefile, avoid funny character set dependencies
 unexport LC_ALL
 LC_COLLATE=C
@@ -462,6 +464,13 @@
 $(prctl_option_array): $(prctl_hdr_dir)/prctl.h $(prctl_option_tbl)
 	$(Q)$(SHELL) '$(prctl_option_tbl)' $(prctl_hdr_dir) > $@
 
+arch_errno_name_array := $(beauty_outdir)/arch_errno_name_array.c
+arch_errno_hdr_dir := $(srctree)/tools
+arch_errno_tbl := $(srctree)/tools/perf/trace/beauty/arch_errno_names.sh
+
+$(arch_errno_name_array): $(arch_errno_tbl)
+	$(Q)$(SHELL) '$(arch_errno_tbl)' $(CC) $(arch_errno_hdr_dir) > $@
+
 all: shell_compatibility_test $(ALL_PROGRAMS) $(LANG_BINDINGS) $(OTHER_PROGRAMS)
 
 $(OUTPUT)python/perf.so: $(PYTHON_EXT_SRCS) $(PYTHON_EXT_DEPS) $(LIBTRACEEVENT_DYNAMIC_LIST)
@@ -565,7 +574,8 @@
 	$(vhost_virtio_ioctl_array) \
 	$(madvise_behavior_array) \
 	$(perf_ioctl_array) \
-	$(prctl_option_array)
+	$(prctl_option_array) \
+	$(arch_errno_name_array)
 
 $(OUTPUT)%.o: %.c prepare FORCE
 	$(Q)$(MAKE) -f $(srctree)/tools/build/Makefile.build dir=$(build-dir) $@
@@ -847,7 +857,8 @@
 		$(OUTPUT)$(kcmp_type_array) \
 		$(OUTPUT)$(vhost_virtio_ioctl_array) \
 		$(OUTPUT)$(perf_ioctl_array) \
-		$(OUTPUT)$(prctl_option_array)
+		$(OUTPUT)$(prctl_option_array) \
+		$(OUTPUT)$(arch_errno_name_array)
 	$(QUIET_SUBDIR0)Documentation $(QUIET_SUBDIR1) clean
 
 #
diff --git a/tools/perf/arch/arm/util/auxtrace.c b/tools/perf/arch/arm/util/auxtrace.c
index 8edf2cb..2323581 100644
--- a/tools/perf/arch/arm/util/auxtrace.c
+++ b/tools/perf/arch/arm/util/auxtrace.c
@@ -22,6 +22,42 @@
 #include "../../util/evlist.h"
 #include "../../util/pmu.h"
 #include "cs-etm.h"
+#include "arm-spe.h"
+
+static struct perf_pmu **find_all_arm_spe_pmus(int *nr_spes, int *err)
+{
+	struct perf_pmu **arm_spe_pmus = NULL;
+	int ret, i, nr_cpus = sysconf(_SC_NPROCESSORS_CONF);
+	/* arm_spe_xxxxxxxxx\0 */
+	char arm_spe_pmu_name[sizeof(ARM_SPE_PMU_NAME) + 10];
+
+	arm_spe_pmus = zalloc(sizeof(struct perf_pmu *) * nr_cpus);
+	if (!arm_spe_pmus) {
+		pr_err("spes alloc failed\n");
+		*err = -ENOMEM;
+		return NULL;
+	}
+
+	for (i = 0; i < nr_cpus; i++) {
+		ret = sprintf(arm_spe_pmu_name, "%s%d", ARM_SPE_PMU_NAME, i);
+		if (ret < 0) {
+			pr_err("sprintf failed\n");
+			*err = -ENOMEM;
+			return NULL;
+		}
+
+		arm_spe_pmus[*nr_spes] = perf_pmu__find(arm_spe_pmu_name);
+		if (arm_spe_pmus[*nr_spes]) {
+			pr_debug2("%s %d: arm_spe_pmu %d type %d name %s\n",
+				 __func__, __LINE__, *nr_spes,
+				 arm_spe_pmus[*nr_spes]->type,
+				 arm_spe_pmus[*nr_spes]->name);
+			(*nr_spes)++;
+		}
+	}
+
+	return arm_spe_pmus;
+}
 
 struct auxtrace_record
 *auxtrace_record__init(struct perf_evlist *evlist, int *err)
@@ -29,22 +65,51 @@ struct auxtrace_record
 	struct perf_pmu	*cs_etm_pmu;
 	struct perf_evsel *evsel;
 	bool found_etm = false;
+	bool found_spe = false;
+	static struct perf_pmu **arm_spe_pmus = NULL;
+	static int nr_spes = 0;
+	int i;
+
+	if (!evlist)
+		return NULL;
 
 	cs_etm_pmu = perf_pmu__find(CORESIGHT_ETM_PMU_NAME);
 
-	if (evlist) {
-		evlist__for_each_entry(evlist, evsel) {
-			if (cs_etm_pmu &&
-			    evsel->attr.type == cs_etm_pmu->type)
-				found_etm = true;
+	if (!arm_spe_pmus)
+		arm_spe_pmus = find_all_arm_spe_pmus(&nr_spes, err);
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (cs_etm_pmu &&
+		    evsel->attr.type == cs_etm_pmu->type)
+			found_etm = true;
+
+		if (!nr_spes)
+			continue;
+
+		for (i = 0; i < nr_spes; i++) {
+			if (evsel->attr.type == arm_spe_pmus[i]->type) {
+				found_spe = true;
+				break;
+			}
 		}
 	}
 
+	if (found_etm && found_spe) {
+		pr_err("Concurrent ARM Coresight ETM and SPE operation not currently supported\n");
+		*err = -EOPNOTSUPP;
+		return NULL;
+	}
+
 	if (found_etm)
 		return cs_etm_record_init(err);
 
+#if defined(__aarch64__)
+	if (found_spe)
+		return arm_spe_recording_init(err, arm_spe_pmus[i]);
+#endif
+
 	/*
-	 * Clear 'err' even if we haven't found a cs_etm event - that way perf
+	 * Clear 'err' even if we haven't found an event - that way perf
 	 * record can still be used even if tracers aren't present.  The NULL
 	 * return value will take care of telling the infrastructure HW tracing
 	 * isn't available.
diff --git a/tools/perf/arch/arm/util/pmu.c b/tools/perf/arch/arm/util/pmu.c
index 98d6739..ac4dffc 100644
--- a/tools/perf/arch/arm/util/pmu.c
+++ b/tools/perf/arch/arm/util/pmu.c
@@ -20,6 +20,7 @@
 #include <linux/perf_event.h>
 
 #include "cs-etm.h"
+#include "arm-spe.h"
 #include "../../util/pmu.h"
 
 struct perf_event_attr
@@ -30,7 +31,12 @@ struct perf_event_attr
 		/* add ETM default config here */
 		pmu->selectable = true;
 		pmu->set_drv_config = cs_etm_set_drv_config;
+#if defined(__aarch64__)
+	} else if (strstarts(pmu->name, ARM_SPE_PMU_NAME)) {
+		return arm_spe_pmu_default_config(pmu);
+#endif
 	}
+
 #endif
 	return NULL;
 }
diff --git a/tools/perf/arch/arm64/util/Build b/tools/perf/arch/arm64/util/Build
index cef6fb3..c0b8dfe 100644
--- a/tools/perf/arch/arm64/util/Build
+++ b/tools/perf/arch/arm64/util/Build
@@ -1,6 +1,9 @@
+libperf-y += header.o
+libperf-y += sym-handling.o
 libperf-$(CONFIG_DWARF)     += dwarf-regs.o
 libperf-$(CONFIG_LOCAL_LIBUNWIND) += unwind-libunwind.o
 
 libperf-$(CONFIG_AUXTRACE) += ../../arm/util/pmu.o \
 			      ../../arm/util/auxtrace.o \
-			      ../../arm/util/cs-etm.o
+			      ../../arm/util/cs-etm.o \
+			      arm-spe.o
diff --git a/tools/perf/arch/arm64/util/arm-spe.c b/tools/perf/arch/arm64/util/arm-spe.c
new file mode 100644
index 0000000..1120e39
--- /dev/null
+++ b/tools/perf/arch/arm64/util/arm-spe.c
@@ -0,0 +1,225 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <linux/log2.h>
+#include <time.h>
+
+#include "../../util/cpumap.h"
+#include "../../util/evsel.h"
+#include "../../util/evlist.h"
+#include "../../util/session.h"
+#include "../../util/util.h"
+#include "../../util/pmu.h"
+#include "../../util/debug.h"
+#include "../../util/auxtrace.h"
+#include "../../util/arm-spe.h"
+
+#define KiB(x) ((x) * 1024)
+#define MiB(x) ((x) * 1024 * 1024)
+
+struct arm_spe_recording {
+	struct auxtrace_record		itr;
+	struct perf_pmu			*arm_spe_pmu;
+	struct perf_evlist		*evlist;
+};
+
+static size_t
+arm_spe_info_priv_size(struct auxtrace_record *itr __maybe_unused,
+		       struct perf_evlist *evlist __maybe_unused)
+{
+	return ARM_SPE_AUXTRACE_PRIV_SIZE;
+}
+
+static int arm_spe_info_fill(struct auxtrace_record *itr,
+			     struct perf_session *session,
+			     struct auxtrace_info_event *auxtrace_info,
+			     size_t priv_size)
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
+
+	if (priv_size != ARM_SPE_AUXTRACE_PRIV_SIZE)
+		return -EINVAL;
+
+	if (!session->evlist->nr_mmaps)
+		return -EINVAL;
+
+	auxtrace_info->type = PERF_AUXTRACE_ARM_SPE;
+	auxtrace_info->priv[ARM_SPE_PMU_TYPE] = arm_spe_pmu->type;
+
+	return 0;
+}
+
+static int arm_spe_recording_options(struct auxtrace_record *itr,
+				     struct perf_evlist *evlist,
+				     struct record_opts *opts)
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+	struct perf_pmu *arm_spe_pmu = sper->arm_spe_pmu;
+	struct perf_evsel *evsel, *arm_spe_evsel = NULL;
+	bool privileged = geteuid() == 0 || perf_event_paranoid() < 0;
+	struct perf_evsel *tracking_evsel;
+	int err;
+
+	sper->evlist = evlist;
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (evsel->attr.type == arm_spe_pmu->type) {
+			if (arm_spe_evsel) {
+				pr_err("There may be only one " ARM_SPE_PMU_NAME "x event\n");
+				return -EINVAL;
+			}
+			evsel->attr.freq = 0;
+			evsel->attr.sample_period = 1;
+			arm_spe_evsel = evsel;
+			opts->full_auxtrace = true;
+		}
+	}
+
+	if (!opts->full_auxtrace)
+		return 0;
+
+	/* We are in full trace mode but '-m,xyz' wasn't specified */
+	if (opts->full_auxtrace && !opts->auxtrace_mmap_pages) {
+		if (privileged) {
+			opts->auxtrace_mmap_pages = MiB(4) / page_size;
+		} else {
+			opts->auxtrace_mmap_pages = KiB(128) / page_size;
+			if (opts->mmap_pages == UINT_MAX)
+				opts->mmap_pages = KiB(256) / page_size;
+		}
+	}
+
+	/* Validate auxtrace_mmap_pages */
+	if (opts->auxtrace_mmap_pages) {
+		size_t sz = opts->auxtrace_mmap_pages * (size_t)page_size;
+		size_t min_sz = KiB(8);
+
+		if (sz < min_sz || !is_power_of_2(sz)) {
+			pr_err("Invalid mmap size for ARM SPE: must be at least %zuKiB and a power of 2\n",
+			       min_sz / 1024);
+			return -EINVAL;
+		}
+	}
+
+
+	/*
+	 * To obtain the auxtrace buffer file descriptor, the auxtrace event
+	 * must come first.
+	 */
+	perf_evlist__to_front(evlist, arm_spe_evsel);
+
+	perf_evsel__set_sample_bit(arm_spe_evsel, CPU);
+	perf_evsel__set_sample_bit(arm_spe_evsel, TIME);
+	perf_evsel__set_sample_bit(arm_spe_evsel, TID);
+
+	/* Add dummy event to keep tracking */
+	err = parse_events(evlist, "dummy:u", NULL);
+	if (err)
+		return err;
+
+	tracking_evsel = perf_evlist__last(evlist);
+	perf_evlist__set_tracking_event(evlist, tracking_evsel);
+
+	tracking_evsel->attr.freq = 0;
+	tracking_evsel->attr.sample_period = 1;
+	perf_evsel__set_sample_bit(tracking_evsel, TIME);
+	perf_evsel__set_sample_bit(tracking_evsel, CPU);
+	perf_evsel__reset_sample_bit(tracking_evsel, BRANCH_STACK);
+
+	return 0;
+}
+
+static u64 arm_spe_reference(struct auxtrace_record *itr __maybe_unused)
+{
+	struct timespec ts;
+
+	clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
+
+	return ts.tv_sec ^ ts.tv_nsec;
+}
+
+static void arm_spe_recording_free(struct auxtrace_record *itr)
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+
+	free(sper);
+}
+
+static int arm_spe_read_finish(struct auxtrace_record *itr, int idx)
+{
+	struct arm_spe_recording *sper =
+			container_of(itr, struct arm_spe_recording, itr);
+	struct perf_evsel *evsel;
+
+	evlist__for_each_entry(sper->evlist, evsel) {
+		if (evsel->attr.type == sper->arm_spe_pmu->type)
+			return perf_evlist__enable_event_idx(sper->evlist,
+							     evsel, idx);
+	}
+	return -EINVAL;
+}
+
+struct auxtrace_record *arm_spe_recording_init(int *err,
+					       struct perf_pmu *arm_spe_pmu)
+{
+	struct arm_spe_recording *sper;
+
+	if (!arm_spe_pmu) {
+		*err = -ENODEV;
+		return NULL;
+	}
+
+	sper = zalloc(sizeof(struct arm_spe_recording));
+	if (!sper) {
+		*err = -ENOMEM;
+		return NULL;
+	}
+
+	sper->arm_spe_pmu = arm_spe_pmu;
+	sper->itr.recording_options = arm_spe_recording_options;
+	sper->itr.info_priv_size = arm_spe_info_priv_size;
+	sper->itr.info_fill = arm_spe_info_fill;
+	sper->itr.free = arm_spe_recording_free;
+	sper->itr.reference = arm_spe_reference;
+	sper->itr.read_finish = arm_spe_read_finish;
+	sper->itr.alignment = 0;
+
+	return &sper->itr;
+}
+
+struct perf_event_attr
+*arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu)
+{
+	struct perf_event_attr *attr;
+
+	attr = zalloc(sizeof(struct perf_event_attr));
+	if (!attr) {
+		pr_err("arm_spe default config cannot allocate a perf_event_attr\n");
+		return NULL;
+	}
+
+	/*
+	 * If kernel driver doesn't advertise a minimum,
+	 * use max allowable by PMSIDR_EL1.INTERVAL
+	 */
+	if (perf_pmu__scan_file(arm_spe_pmu, "caps/min_interval", "%llu",
+				  &attr->sample_period) != 1) {
+		pr_debug("arm_spe driver doesn't advertise a min. interval. Using 4096\n");
+		attr->sample_period = 4096;
+	}
+
+	arm_spe_pmu->selectable = true;
+	arm_spe_pmu->is_uncore = false;
+
+	return attr;
+}
diff --git a/tools/perf/arch/arm64/util/header.c b/tools/perf/arch/arm64/util/header.c
new file mode 100644
index 0000000..534cd25
--- /dev/null
+++ b/tools/perf/arch/arm64/util/header.c
@@ -0,0 +1,65 @@
+#include <stdio.h>
+#include <stdlib.h>
+#include <api/fs/fs.h>
+#include "header.h"
+
+#define MIDR "/regs/identification/midr_el1"
+#define MIDR_SIZE 19
+#define MIDR_REVISION_MASK      0xf
+#define MIDR_VARIANT_SHIFT      20
+#define MIDR_VARIANT_MASK       (0xf << MIDR_VARIANT_SHIFT)
+
+char *get_cpuid_str(struct perf_pmu *pmu)
+{
+	char *buf = NULL;
+	char path[PATH_MAX];
+	const char *sysfs = sysfs__mountpoint();
+	int cpu;
+	u64 midr = 0;
+	struct cpu_map *cpus;
+	FILE *file;
+
+	if (!sysfs || !pmu || !pmu->cpus)
+		return NULL;
+
+	buf = malloc(MIDR_SIZE);
+	if (!buf)
+		return NULL;
+
+	/* read midr from list of cpus mapped to this pmu */
+	cpus = cpu_map__get(pmu->cpus);
+	for (cpu = 0; cpu < cpus->nr; cpu++) {
+		scnprintf(path, PATH_MAX, "%s/devices/system/cpu/cpu%d"MIDR,
+				sysfs, cpus->map[cpu]);
+
+		file = fopen(path, "r");
+		if (!file) {
+			pr_debug("fopen failed for file %s\n", path);
+			continue;
+		}
+
+		if (!fgets(buf, MIDR_SIZE, file)) {
+			fclose(file);
+			continue;
+		}
+		fclose(file);
+
+		/* Ignore/clear Variant[23:20] and
+		 * Revision[3:0] of MIDR
+		 */
+		midr = strtoul(buf, NULL, 16);
+		midr &= (~(MIDR_VARIANT_MASK | MIDR_REVISION_MASK));
+		scnprintf(buf, MIDR_SIZE, "0x%016lx", midr);
+		/* got midr break loop */
+		break;
+	}
+
+	if (!midr) {
+		pr_err("failed to get cpuid string for PMU %s\n", pmu->name);
+		free(buf);
+		buf = NULL;
+	}
+
+	cpu_map__put(cpus);
+	return buf;
+}
diff --git a/tools/perf/arch/arm64/util/sym-handling.c b/tools/perf/arch/arm64/util/sym-handling.c
new file mode 100644
index 0000000..0051b1e
--- /dev/null
+++ b/tools/perf/arch/arm64/util/sym-handling.c
@@ -0,0 +1,22 @@
+/*
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License, version 2, as
+ * published by the Free Software Foundation.
+ *
+ * Copyright (C) 2015 Naveen N. Rao, IBM Corporation
+ */
+
+#include "debug.h"
+#include "symbol.h"
+#include "map.h"
+#include "probe-event.h"
+#include "probe-file.h"
+
+#ifdef HAVE_LIBELF_SUPPORT
+bool elf__needs_adjust_symbols(GElf_Ehdr ehdr)
+{
+	return ehdr.e_type == ET_EXEC ||
+	       ehdr.e_type == ET_REL ||
+	       ehdr.e_type == ET_DYN;
+}
+#endif
diff --git a/tools/perf/arch/common.c b/tools/perf/arch/common.c
index 8c0cfeb..c6f3735 100644
--- a/tools/perf/arch/common.c
+++ b/tools/perf/arch/common.c
@@ -1,12 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <stdio.h>
-#include <sys/utsname.h>
 #include "common.h"
+#include "../util/env.h"
 #include "../util/util.h"
 #include "../util/debug.h"
 
-#include "sane_ctype.h"
-
 const char *const arm_triplets[] = {
 	"arm-eabi-",
 	"arm-linux-androideabi-",
@@ -120,55 +118,19 @@ static int lookup_triplets(const char *const *triplets, const char *name)
 	return -1;
 }
 
-/*
- * Return architecture name in a normalized form.
- * The conversion logic comes from the Makefile.
- */
-const char *normalize_arch(char *arch)
-{
-	if (!strcmp(arch, "x86_64"))
-		return "x86";
-	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
-		return "x86";
-	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
-		return "sparc";
-	if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
-		return "arm64";
-	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
-		return "arm";
-	if (!strncmp(arch, "s390", 4))
-		return "s390";
-	if (!strncmp(arch, "parisc", 6))
-		return "parisc";
-	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
-		return "powerpc";
-	if (!strncmp(arch, "mips", 4))
-		return "mips";
-	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
-		return "sh";
-
-	return arch;
-}
-
 static int perf_env__lookup_binutils_path(struct perf_env *env,
 					  const char *name, const char **path)
 {
 	int idx;
-	const char *arch, *cross_env;
-	struct utsname uts;
+	const char *arch = perf_env__arch(env), *cross_env;
 	const char *const *path_list;
 	char *buf = NULL;
 
-	arch = normalize_arch(env->arch);
-
-	if (uname(&uts) < 0)
-		goto out;
-
 	/*
 	 * We don't need to try to find objdump path for native system.
 	 * Just use default binutils path (e.g.: "objdump").
 	 */
-	if (!strcmp(normalize_arch(uts.machine), arch))
+	if (!strcmp(perf_env__arch(NULL), arch))
 		goto out;
 
 	cross_env = getenv("CROSS_COMPILE");
diff --git a/tools/perf/arch/common.h b/tools/perf/arch/common.h
index a154650..2d875baa 100644
--- a/tools/perf/arch/common.h
+++ b/tools/perf/arch/common.h
@@ -7,6 +7,5 @@
 extern const char *objdump_path;
 
 int perf_env__lookup_objdump(struct perf_env *env);
-const char *normalize_arch(char *arch);
 
 #endif /* ARCH_PERF_COMMON_H */
diff --git a/tools/perf/arch/powerpc/util/header.c b/tools/perf/arch/powerpc/util/header.c
index 7a4cf80..0b24266 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -35,7 +35,7 @@ get_cpuid(char *buffer, size_t sz)
 }
 
 char *
-get_cpuid_str(void)
+get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	char *bufp;
 
diff --git a/tools/perf/arch/powerpc/util/sym-handling.c b/tools/perf/arch/powerpc/util/sym-handling.c
index 9c4e23d..53d83d7 100644
--- a/tools/perf/arch/powerpc/util/sym-handling.c
+++ b/tools/perf/arch/powerpc/util/sym-handling.c
@@ -64,6 +64,14 @@ int arch__compare_symbol_names_n(const char *namea, const char *nameb,
 
 	return strncmp(namea, nameb, n);
 }
+
+const char *arch__normalize_symbol_name(const char *name)
+{
+	/* Skip over initial dot */
+	if (name && *name == '.')
+		name++;
+	return name;
+}
 #endif
 
 #if defined(_CALL_ELF) && _CALL_ELF == 2
diff --git a/tools/perf/arch/s390/Makefile b/tools/perf/arch/s390/Makefile
index 09ba923..48228de 100644
--- a/tools/perf/arch/s390/Makefile
+++ b/tools/perf/arch/s390/Makefile
@@ -3,3 +3,24 @@
 endif
 HAVE_KVM_STAT_SUPPORT := 1
 PERF_HAVE_ARCH_REGS_QUERY_REGISTER_OFFSET := 1
+
+#
+# Syscall table generation for perf
+#
+
+out    := $(OUTPUT)arch/s390/include/generated/asm
+header := $(out)/syscalls_64.c
+sysdef := $(srctree)/tools/arch/s390/include/uapi/asm/unistd.h
+sysprf := $(srctree)/tools/perf/arch/s390/entry/syscalls/
+systbl := $(sysprf)/mksyscalltbl
+
+# Create output directory if not already present
+_dummy := $(shell [ -d '$(out)' ] || mkdir -p '$(out)')
+
+$(header): $(sysdef) $(systbl)
+	$(Q)$(SHELL) '$(systbl)' '$(CC)' $(sysdef) > $@
+
+clean::
+	$(call QUIET_CLEAN, s390) $(RM) $(header)
+
+archheaders: $(header)
diff --git a/tools/perf/arch/s390/annotate/instructions.c b/tools/perf/arch/s390/annotate/instructions.c
index e0e466c..8c72b44 100644
--- a/tools/perf/arch/s390/annotate/instructions.c
+++ b/tools/perf/arch/s390/annotate/instructions.c
@@ -18,7 +18,8 @@ static struct ins_ops *s390__associate_ins_ops(struct arch *arch, const char *na
 	if (!strcmp(name, "br"))
 		ops = &ret_ops;
 
-	arch__associate_ins_ops(arch, name, ops);
+	if (ops)
+		arch__associate_ins_ops(arch, name, ops);
 	return ops;
 }
 
diff --git a/tools/perf/arch/s390/entry/syscalls/mksyscalltbl b/tools/perf/arch/s390/entry/syscalls/mksyscalltbl
new file mode 100755
index 0000000..7fa0d0a
--- /dev/null
+++ b/tools/perf/arch/s390/entry/syscalls/mksyscalltbl
@@ -0,0 +1,36 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Generate system call table for perf
+#
+#
+# Copyright IBM Corp. 2017
+# Author(s):  Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
+#
+
+gcc=$1
+input=$2
+
+if ! test -r $input; then
+	echo "Could not read input file" >&2
+	exit 1
+fi
+
+create_table()
+{
+	local max_nr
+
+	echo 'static const char *syscalltbl_s390_64[] = {'
+	while read sc nr; do
+		printf '\t[%d] = "%s",\n' $nr $sc
+		max_nr=$nr
+	done
+	echo '};'
+	echo "#define SYSCALLTBL_S390_64_MAX_ID $max_nr"
+}
+
+
+$gcc -m64 -E -dM -x c  $input	       \
+	|sed -ne 's/^#define __NR_//p' \
+	|sort -t' ' -k2 -nu	       \
+	|create_table
diff --git a/tools/perf/arch/x86/tests/perf-time-to-tsc.c b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
index b59678e..06abe81 100644
--- a/tools/perf/arch/x86/tests/perf-time-to-tsc.c
+++ b/tools/perf/arch/x86/tests/perf-time-to-tsc.c
@@ -84,7 +84,7 @@ int test__perf_time_to_tsc(struct test *test __maybe_unused, int subtest __maybe
 
 	CHECK__(perf_evlist__open(evlist));
 
-	CHECK__(perf_evlist__mmap(evlist, UINT_MAX, false));
+	CHECK__(perf_evlist__mmap(evlist, UINT_MAX));
 
 	pc = evlist->mmap[0].base;
 	ret = perf_read_tsc_conversion(pc, &tc);
diff --git a/tools/perf/arch/x86/util/header.c b/tools/perf/arch/x86/util/header.c
index 33027c5..fb0d71a 100644
--- a/tools/perf/arch/x86/util/header.c
+++ b/tools/perf/arch/x86/util/header.c
@@ -66,11 +66,11 @@ get_cpuid(char *buffer, size_t sz)
 }
 
 char *
-get_cpuid_str(void)
+get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	char *buf = malloc(128);
 
-	if (__get_cpuid(buf, 128, "%s-%u-%X$") < 0) {
+	if (buf && __get_cpuid(buf, 128, "%s-%u-%X$") < 0) {
 		free(buf);
 		return NULL;
 	}
diff --git a/tools/perf/arch/x86/util/unwind-libunwind.c b/tools/perf/arch/x86/util/unwind-libunwind.c
index 9c917f8..05920e3 100644
--- a/tools/perf/arch/x86/util/unwind-libunwind.c
+++ b/tools/perf/arch/x86/util/unwind-libunwind.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 
-#ifndef REMOTE_UNWIND_LIBUNWIND
 #include <errno.h>
+#ifndef REMOTE_UNWIND_LIBUNWIND
 #include <libunwind.h>
 #include "perf_regs.h"
 #include "../../util/unwind.h"
diff --git a/tools/perf/bench/futex-hash.c b/tools/perf/bench/futex-hash.c
index 58ae6ed..9aa3a67 100644
--- a/tools/perf/bench/futex-hash.c
+++ b/tools/perf/bench/futex-hash.c
@@ -24,9 +24,9 @@
 #include <subcmd/parse-options.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
-#include <sys/time.h>
 
 static unsigned int nthreads = 0;
 static unsigned int nsecs    = 10;
@@ -118,11 +118,12 @@ static void print_summary(void)
 int bench_futex_hash(int argc, const char **argv)
 {
 	int ret = 0;
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	struct sigaction act;
-	unsigned int i, ncpus;
+	unsigned int i;
 	pthread_attr_t thread_attr;
 	struct worker *worker = NULL;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options, bench_futex_hash_usage, 0);
 	if (argc) {
@@ -130,14 +131,16 @@ int bench_futex_hash(int argc, const char **argv)
 		exit(EXIT_FAILURE);
 	}
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		goto errmem;
 
 	sigfillset(&act.sa_mask);
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
 	if (!nthreads) /* default to the number of CPUs */
-		nthreads = ncpus;
+		nthreads = cpu->nr;
 
 	worker = calloc(nthreads, sizeof(*worker));
 	if (!worker)
@@ -163,10 +166,10 @@ int bench_futex_hash(int argc, const char **argv)
 		if (!worker[i].futex)
 			goto errmem;
 
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu);
+		ret = pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset);
 		if (ret)
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
@@ -217,6 +220,7 @@ int bench_futex_hash(int argc, const char **argv)
 	print_summary();
 
 	free(worker);
+	free(cpu);
 	return ret;
 errmem:
 	err(EXIT_FAILURE, "calloc");
diff --git a/tools/perf/bench/futex-lock-pi.c b/tools/perf/bench/futex-lock-pi.c
index 08653ae..8e9c475 100644
--- a/tools/perf/bench/futex-lock-pi.c
+++ b/tools/perf/bench/futex-lock-pi.c
@@ -15,6 +15,7 @@
 #include <errno.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <stdlib.h>
@@ -32,7 +33,7 @@ static struct worker *worker;
 static unsigned int nsecs = 10;
 static bool silent = false, multi = false;
 static bool done = false, fshared = false;
-static unsigned int ncpus, nthreads = 0;
+static unsigned int nthreads = 0;
 static int futex_flag = 0;
 struct timeval start, end, runtime;
 static pthread_mutex_t thread_lock;
@@ -113,9 +114,10 @@ static void *workerfn(void *arg)
 	return NULL;
 }
 
-static void create_threads(struct worker *w, pthread_attr_t thread_attr)
+static void create_threads(struct worker *w, pthread_attr_t thread_attr,
+			   struct cpu_map *cpu)
 {
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	unsigned int i;
 
 	threads_starting = nthreads;
@@ -130,10 +132,10 @@ static void create_threads(struct worker *w, pthread_attr_t thread_attr)
 		} else
 			worker[i].futex = &global_futex;
 
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
 		if (pthread_create(&w[i].thread, &thread_attr, workerfn, &worker[i]))
@@ -147,19 +149,22 @@ int bench_futex_lock_pi(int argc, const char **argv)
 	unsigned int i;
 	struct sigaction act;
 	pthread_attr_t thread_attr;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options, bench_futex_lock_pi_usage, 0);
 	if (argc)
 		goto err;
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		err(EXIT_FAILURE, "calloc");
 
 	sigfillset(&act.sa_mask);
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
 	if (!nthreads)
-		nthreads = ncpus;
+		nthreads = cpu->nr;
 
 	worker = calloc(nthreads, sizeof(*worker));
 	if (!worker)
@@ -180,7 +185,7 @@ int bench_futex_lock_pi(int argc, const char **argv)
 	pthread_attr_init(&thread_attr);
 	gettimeofday(&start, NULL);
 
-	create_threads(worker, thread_attr);
+	create_threads(worker, thread_attr, cpu);
 	pthread_attr_destroy(&thread_attr);
 
 	pthread_mutex_lock(&thread_lock);
diff --git a/tools/perf/bench/futex-requeue.c b/tools/perf/bench/futex-requeue.c
index 1058c19..fc692ef 100644
--- a/tools/perf/bench/futex-requeue.c
+++ b/tools/perf/bench/futex-requeue.c
@@ -22,6 +22,7 @@
 #include <errno.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <stdlib.h>
@@ -40,7 +41,7 @@ static bool done = false, silent = false, fshared = false;
 static pthread_mutex_t thread_lock;
 static pthread_cond_t thread_parent, thread_worker;
 static struct stats requeuetime_stats, requeued_stats;
-static unsigned int ncpus, threads_starting, nthreads = 0;
+static unsigned int threads_starting, nthreads = 0;
 static int futex_flag = 0;
 
 static const struct option options[] = {
@@ -83,19 +84,19 @@ static void *workerfn(void *arg __maybe_unused)
 }
 
 static void block_threads(pthread_t *w,
-			  pthread_attr_t thread_attr)
+			  pthread_attr_t thread_attr, struct cpu_map *cpu)
 {
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	unsigned int i;
 
 	threads_starting = nthreads;
 
 	/* create and block all threads */
 	for (i = 0; i < nthreads; i++) {
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
 		if (pthread_create(&w[i], &thread_attr, workerfn, NULL))
@@ -116,19 +117,22 @@ int bench_futex_requeue(int argc, const char **argv)
 	unsigned int i, j;
 	struct sigaction act;
 	pthread_attr_t thread_attr;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options, bench_futex_requeue_usage, 0);
 	if (argc)
 		goto err;
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		err(EXIT_FAILURE, "cpu_map__new");
 
 	sigfillset(&act.sa_mask);
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
 	if (!nthreads)
-		nthreads = ncpus;
+		nthreads = cpu->nr;
 
 	worker = calloc(nthreads, sizeof(*worker));
 	if (!worker)
@@ -156,7 +160,7 @@ int bench_futex_requeue(int argc, const char **argv)
 		struct timeval start, end, runtime;
 
 		/* create, launch & block all threads */
-		block_threads(worker, thread_attr);
+		block_threads(worker, thread_attr, cpu);
 
 		/* make sure all threads are already blocked */
 		pthread_mutex_lock(&thread_lock);
diff --git a/tools/perf/bench/futex-wake-parallel.c b/tools/perf/bench/futex-wake-parallel.c
index b4732da..69d8fdc 100644
--- a/tools/perf/bench/futex-wake-parallel.c
+++ b/tools/perf/bench/futex-wake-parallel.c
@@ -7,7 +7,17 @@
  * for each individual thread to service its share of work. Ultimately
  * it can be used to measure futex_wake() changes.
  */
+#include "bench.h"
+#include <linux/compiler.h>
+#include "../util/debug.h"
 
+#ifndef HAVE_PTHREAD_BARRIER
+int bench_futex_wake_parallel(int argc __maybe_unused, const char **argv __maybe_unused)
+{
+	pr_err("%s: pthread_barrier_t unavailable, disabling this test...\n", __func__);
+	return 0;
+}
+#else /* HAVE_PTHREAD_BARRIER */
 /* For the CLR_() macros */
 #include <string.h>
 #include <pthread.h>
@@ -15,12 +25,11 @@
 #include <signal.h>
 #include "../util/stat.h"
 #include <subcmd/parse-options.h>
-#include <linux/compiler.h>
 #include <linux/kernel.h>
 #include <linux/time64.h>
 #include <errno.h>
-#include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <stdlib.h>
@@ -42,8 +51,9 @@ static bool done = false, silent = false, fshared = false;
 static unsigned int nblocked_threads = 0, nwaking_threads = 0;
 static pthread_mutex_t thread_lock;
 static pthread_cond_t thread_parent, thread_worker;
+static pthread_barrier_t barrier;
 static struct stats waketime_stats, wakeup_stats;
-static unsigned int ncpus, threads_starting;
+static unsigned int threads_starting;
 static int futex_flag = 0;
 
 static const struct option options[] = {
@@ -64,6 +74,8 @@ static void *waking_workerfn(void *arg)
 	struct thread_data *waker = (struct thread_data *) arg;
 	struct timeval start, end;
 
+	pthread_barrier_wait(&barrier);
+
 	gettimeofday(&start, NULL);
 
 	waker->nwoken = futex_wake(&futex, nwakes, futex_flag);
@@ -84,6 +96,8 @@ static void wakeup_threads(struct thread_data *td, pthread_attr_t thread_attr)
 
 	pthread_attr_setdetachstate(&thread_attr, PTHREAD_CREATE_JOINABLE);
 
+	pthread_barrier_init(&barrier, NULL, nwaking_threads + 1);
+
 	/* create and block all threads */
 	for (i = 0; i < nwaking_threads; i++) {
 		/*
@@ -96,9 +110,13 @@ static void wakeup_threads(struct thread_data *td, pthread_attr_t thread_attr)
 			err(EXIT_FAILURE, "pthread_create");
 	}
 
+	pthread_barrier_wait(&barrier);
+
 	for (i = 0; i < nwaking_threads; i++)
 		if (pthread_join(td[i].worker, NULL))
 			err(EXIT_FAILURE, "pthread_join");
+
+	pthread_barrier_destroy(&barrier);
 }
 
 static void *blocked_workerfn(void *arg __maybe_unused)
@@ -119,19 +137,20 @@ static void *blocked_workerfn(void *arg __maybe_unused)
 	return NULL;
 }
 
-static void block_threads(pthread_t *w, pthread_attr_t thread_attr)
+static void block_threads(pthread_t *w, pthread_attr_t thread_attr,
+			  struct cpu_map *cpu)
 {
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	unsigned int i;
 
 	threads_starting = nblocked_threads;
 
 	/* create and block all threads */
 	for (i = 0; i < nblocked_threads; i++) {
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
 		if (pthread_create(&w[i], &thread_attr, blocked_workerfn, NULL))
@@ -205,6 +224,7 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 	struct sigaction act;
 	pthread_attr_t thread_attr;
 	struct thread_data *waking_worker;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options,
 			     bench_futex_wake_parallel_usage, 0);
@@ -217,9 +237,12 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 	act.sa_sigaction = toggle_done;
 	sigaction(SIGINT, &act, NULL);
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		err(EXIT_FAILURE, "calloc");
+
 	if (!nblocked_threads)
-		nblocked_threads = ncpus;
+		nblocked_threads = cpu->nr;
 
 	/* some sanity checks */
 	if (nwaking_threads > nblocked_threads || !nwaking_threads)
@@ -259,7 +282,7 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 			err(EXIT_FAILURE, "calloc");
 
 		/* create, launch & block all threads */
-		block_threads(blocked_worker, thread_attr);
+		block_threads(blocked_worker, thread_attr, cpu);
 
 		/* make sure all threads are already blocked */
 		pthread_mutex_lock(&thread_lock);
@@ -297,3 +320,4 @@ int bench_futex_wake_parallel(int argc, const char **argv)
 	free(blocked_worker);
 	return ret;
 }
+#endif /* HAVE_PTHREAD_BARRIER */
diff --git a/tools/perf/bench/futex-wake.c b/tools/perf/bench/futex-wake.c
index 8c5c0b6..e8181ad 100644
--- a/tools/perf/bench/futex-wake.c
+++ b/tools/perf/bench/futex-wake.c
@@ -22,6 +22,7 @@
 #include <errno.h>
 #include "bench.h"
 #include "futex.h"
+#include "cpumap.h"
 
 #include <err.h>
 #include <stdlib.h>
@@ -89,19 +90,19 @@ static void print_summary(void)
 }
 
 static void block_threads(pthread_t *w,
-			  pthread_attr_t thread_attr)
+			  pthread_attr_t thread_attr, struct cpu_map *cpu)
 {
-	cpu_set_t cpu;
+	cpu_set_t cpuset;
 	unsigned int i;
 
 	threads_starting = nthreads;
 
 	/* create and block all threads */
 	for (i = 0; i < nthreads; i++) {
-		CPU_ZERO(&cpu);
-		CPU_SET(i % ncpus, &cpu);
+		CPU_ZERO(&cpuset);
+		CPU_SET(cpu->map[i % cpu->nr], &cpuset);
 
-		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpu))
+		if (pthread_attr_setaffinity_np(&thread_attr, sizeof(cpu_set_t), &cpuset))
 			err(EXIT_FAILURE, "pthread_attr_setaffinity_np");
 
 		if (pthread_create(&w[i], &thread_attr, workerfn, NULL))
@@ -122,6 +123,7 @@ int bench_futex_wake(int argc, const char **argv)
 	unsigned int i, j;
 	struct sigaction act;
 	pthread_attr_t thread_attr;
+	struct cpu_map *cpu;
 
 	argc = parse_options(argc, argv, options, bench_futex_wake_usage, 0);
 	if (argc) {
@@ -129,7 +131,9 @@ int bench_futex_wake(int argc, const char **argv)
 		exit(EXIT_FAILURE);
 	}
 
-	ncpus = sysconf(_SC_NPROCESSORS_ONLN);
+	cpu = cpu_map__new(NULL);
+	if (!cpu)
+		err(EXIT_FAILURE, "calloc");
 
 	sigfillset(&act.sa_mask);
 	act.sa_sigaction = toggle_done;
@@ -161,7 +165,7 @@ int bench_futex_wake(int argc, const char **argv)
 		struct timeval start, end, runtime;
 
 		/* create, launch & block all threads */
-		block_threads(worker, thread_attr);
+		block_threads(worker, thread_attr, cpu);
 
 		/* make sure all threads are already blocked */
 		pthread_mutex_lock(&thread_lock);
diff --git a/tools/perf/builtin-buildid-cache.c b/tools/perf/builtin-buildid-cache.c
index 3d354ba..41db2cb 100644
--- a/tools/perf/builtin-buildid-cache.c
+++ b/tools/perf/builtin-buildid-cache.c
@@ -325,8 +325,8 @@ int cmd_buildid_cache(int argc, const char **argv)
 		   "file", "kcore file to add"),
 	OPT_STRING('r', "remove", &remove_name_list_str, "file list",
 		    "file(s) to remove"),
-	OPT_STRING('p', "purge", &purge_name_list_str, "path list",
-		    "path(s) to remove (remove old caches too)"),
+	OPT_STRING('p', "purge", &purge_name_list_str, "file list",
+		    "file(s) to remove (remove old caches too)"),
 	OPT_STRING('M', "missing", &missing_filename, "file",
 		   "to find missing build ids in the cache"),
 	OPT_BOOLEAN('f', "force", &force, "don't complain, do it"),
diff --git a/tools/perf/builtin-c2c.c b/tools/perf/builtin-c2c.c
index 17855c4..c0815a3 100644
--- a/tools/perf/builtin-c2c.c
+++ b/tools/perf/builtin-c2c.c
@@ -27,13 +27,10 @@
 #include "sort.h"
 #include "tool.h"
 #include "data.h"
-#include "sort.h"
 #include "event.h"
 #include "evlist.h"
 #include "evsel.h"
-#include <asm/bug.h>
 #include "ui/browsers/hists.h"
-#include "evlist.h"
 #include "thread.h"
 
 struct c2c_hists {
@@ -2224,9 +2221,9 @@ static int perf_c2c__browse_cacheline(struct hist_entry *he)
 	struct hist_browser *browser;
 	int key = -1;
 	const char help[] =
-	" ENTER         Togle callchains (if present) \n"
-	" n             Togle Node details info \n"
-	" s             Togle full lenght of symbol and source line columns \n"
+	" ENTER         Toggle callchains (if present) \n"
+	" n             Toggle Node details info \n"
+	" s             Toggle full length of symbol and source line columns \n"
 	" q             Return back to cacheline list \n";
 
 	/* Display compact version first. */
@@ -2303,7 +2300,7 @@ static int perf_c2c__hists_browse(struct hists *hists)
 	int key = -1;
 	const char help[] =
 	" d             Display cacheline details \n"
-	" ENTER         Togle callchains (if present) \n"
+	" ENTER         Toggle callchains (if present) \n"
 	" q             Quit \n";
 
 	browser = perf_c2c_browser__new(hists);
@@ -2393,9 +2390,10 @@ static int setup_callchain(struct perf_evlist *evlist)
 	enum perf_call_graph_mode mode = CALLCHAIN_NONE;
 
 	if ((sample_type & PERF_SAMPLE_REGS_USER) &&
-	    (sample_type & PERF_SAMPLE_STACK_USER))
+	    (sample_type & PERF_SAMPLE_STACK_USER)) {
 		mode = CALLCHAIN_DWARF;
-	else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+		dwarf_callchain_users = true;
+	} else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
 		mode = CALLCHAIN_LBR;
 	else if (sample_type & PERF_SAMPLE_CALLCHAIN)
 		mode = CALLCHAIN_FP;
diff --git a/tools/perf/builtin-help.c b/tools/perf/builtin-help.c
index a0f7ed2..4aca13f 100644
--- a/tools/perf/builtin-help.c
+++ b/tools/perf/builtin-help.c
@@ -439,7 +439,7 @@ int cmd_help(int argc, const char **argv)
 #ifdef HAVE_LIBELF_SUPPORT
 		"probe",
 #endif
-#ifdef HAVE_LIBAUDIT_SUPPORT
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
 		"trace",
 #endif
 	NULL };
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c
index 16a2854..40fe919 100644
--- a/tools/perf/builtin-inject.c
+++ b/tools/perf/builtin-inject.c
@@ -536,8 +536,7 @@ static int perf_inject__sched_stat(struct perf_tool *tool,
 	sample_sw.period = sample->period;
 	sample_sw.time	 = sample->time;
 	perf_event__synthesize_sample(event_sw, evsel->attr.sample_type,
-				      evsel->attr.read_format, &sample_sw,
-				      false);
+				      evsel->attr.read_format, &sample_sw);
 	build_id__mark_dso_hit(tool, event_sw, &sample_sw, evsel, machine);
 	return perf_event__repipe(tool, event_sw, &sample_sw, machine);
 }
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 0c36f2a..55d919d 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -26,6 +26,9 @@
 #include <sys/timerfd.h>
 #endif
 #include <sys/time.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
 
 #include <linux/kernel.h>
 #include <linux/time64.h>
@@ -741,20 +744,20 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
 				   u64 *mmap_time)
 {
 	union perf_event *event;
-	struct perf_sample sample;
+	u64 timestamp;
 	s64 n = 0;
 	int err;
 
 	*mmap_time = ULLONG_MAX;
 	while ((event = perf_evlist__mmap_read(kvm->evlist, idx)) != NULL) {
-		err = perf_evlist__parse_sample(kvm->evlist, event, &sample);
+		err = perf_evlist__parse_sample_timestamp(kvm->evlist, event, &timestamp);
 		if (err) {
 			perf_evlist__mmap_consume(kvm->evlist, idx);
 			pr_err("Failed to parse sample\n");
 			return -1;
 		}
 
-		err = perf_session__queue_event(kvm->session, event, &sample, 0);
+		err = perf_session__queue_event(kvm->session, event, timestamp, 0);
 		/*
 		 * FIXME: Here we can't consume the event, as perf_session__queue_event will
 		 *        point to it, and it'll get possibly overwritten by the kernel.
@@ -768,7 +771,7 @@ static s64 perf_kvm__mmap_read_idx(struct perf_kvm_stat *kvm, int idx,
 
 		/* save time stamp of our first sample for this mmap */
 		if (n == 0)
-			*mmap_time = sample.time;
+			*mmap_time = timestamp;
 
 		/* limit events per mmap handled all at once */
 		n++;
@@ -1044,7 +1047,7 @@ static int kvm_live_open_events(struct perf_kvm_stat *kvm)
 		goto out;
 	}
 
-	if (perf_evlist__mmap(evlist, kvm->opts.mmap_pages, false) < 0) {
+	if (perf_evlist__mmap(evlist, kvm->opts.mmap_pages) < 0) {
 		ui__error("Failed to mmap the events: %s\n",
 			  str_error_r(errno, sbuf, sizeof(sbuf)));
 		perf_evlist__close(evlist);
diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index 0032559..65681a1 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -51,7 +51,6 @@
 #include <signal.h>
 #include <sys/mman.h>
 #include <sys/wait.h>
-#include <asm/bug.h>
 #include <linux/time64.h>
 
 struct switch_output {
@@ -79,6 +78,7 @@ struct record {
 	bool			no_buildid_cache_set;
 	bool			buildid_all;
 	bool			timestamp_filename;
+	bool			timestamp_boundary;
 	struct switch_output	switch_output;
 	unsigned long long	samples;
 };
@@ -301,7 +301,7 @@ static int record__mmap_evlist(struct record *rec,
 	struct record_opts *opts = &rec->opts;
 	char msg[512];
 
-	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages, false,
+	if (perf_evlist__mmap_ex(evlist, opts->mmap_pages,
 				 opts->auxtrace_mmap_pages,
 				 opts->auxtrace_snapshot_mode) < 0) {
 		if (errno == EPERM) {
@@ -372,6 +372,8 @@ static int record__open(struct record *rec)
 			ui__error("%s\n", msg);
 			goto out;
 		}
+
+		pos->supported = true;
 	}
 
 	if (perf_evlist__apply_filters(evlist, &pos)) {
@@ -408,8 +410,15 @@ static int process_sample_event(struct perf_tool *tool,
 {
 	struct record *rec = container_of(tool, struct record, tool);
 
-	rec->samples++;
+	if (rec->evlist->first_sample_time == 0)
+		rec->evlist->first_sample_time = sample->time;
 
+	rec->evlist->last_sample_time = sample->time;
+
+	if (rec->buildid_all)
+		return 0;
+
+	rec->samples++;
 	return build_id__mark_dso_hit(tool, event, sample, evsel, machine);
 }
 
@@ -434,9 +443,11 @@ static int process_buildids(struct record *rec)
 
 	/*
 	 * If --buildid-all is given, it marks all DSO regardless of hits,
-	 * so no need to process samples.
+	 * so no need to process samples. But if timestamp_boundary is enabled,
+	 * it still needs to walk on all samples to get the timestamps of
+	 * first/last samples.
 	 */
-	if (rec->buildid_all)
+	if (rec->buildid_all && !rec->timestamp_boundary)
 		rec->tool.sample = NULL;
 
 	return perf_session__process_events(session);
@@ -477,7 +488,7 @@ static struct perf_event_header finished_round_event = {
 };
 
 static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evlist,
-				    bool backward)
+				    bool overwrite)
 {
 	u64 bytes_written = rec->bytes_written;
 	int i;
@@ -487,18 +498,18 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (!evlist)
 		return 0;
 
-	maps = backward ? evlist->backward_mmap : evlist->mmap;
+	maps = overwrite ? evlist->overwrite_mmap : evlist->mmap;
 	if (!maps)
 		return 0;
 
-	if (backward && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING)
+	if (overwrite && evlist->bkw_mmap_state != BKW_MMAP_DATA_PENDING)
 		return 0;
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		struct auxtrace_mmap *mm = &maps[i].auxtrace_mmap;
 
 		if (maps[i].base) {
-			if (perf_mmap__push(&maps[i], evlist->overwrite, backward, rec, record__pushfn) != 0) {
+			if (perf_mmap__push(&maps[i], overwrite, rec, record__pushfn) != 0) {
 				rc = -1;
 				goto out;
 			}
@@ -518,7 +529,7 @@ static int record__mmap_read_evlist(struct record *rec, struct perf_evlist *evli
 	if (bytes_written != rec->bytes_written)
 		rc = record__write(rec, &finished_round_event, sizeof(finished_round_event));
 
-	if (backward)
+	if (overwrite)
 		perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_EMPTY);
 out:
 	return rc;
@@ -690,8 +701,8 @@ perf_evlist__pick_pc(struct perf_evlist *evlist)
 	if (evlist) {
 		if (evlist->mmap && evlist->mmap[0].base)
 			return evlist->mmap[0].base;
-		if (evlist->backward_mmap && evlist->backward_mmap[0].base)
-			return evlist->backward_mmap[0].base;
+		if (evlist->overwrite_mmap && evlist->overwrite_mmap[0].base)
+			return evlist->overwrite_mmap[0].base;
 	}
 	return NULL;
 }
@@ -784,6 +795,28 @@ static int record__synthesize(struct record *rec, bool tail)
 					 perf_event__synthesize_guest_os, tool);
 	}
 
+	err = perf_event__synthesize_extra_attr(&rec->tool,
+						rec->evlist,
+						process_synthesized_event,
+						data->is_pipe);
+	if (err)
+		goto out;
+
+	err = perf_event__synthesize_thread_map2(&rec->tool, rec->evlist->threads,
+						 process_synthesized_event,
+						NULL);
+	if (err < 0) {
+		pr_err("Couldn't synthesize thread map.\n");
+		return err;
+	}
+
+	err = perf_event__synthesize_cpu_map(&rec->tool, rec->evlist->cpus,
+					     process_synthesized_event, NULL);
+	if (err < 0) {
+		pr_err("Couldn't synthesize cpu map.\n");
+		return err;
+	}
+
 	err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->threads,
 					    process_synthesized_event, opts->sample_address,
 					    opts->proc_map_timeout, 1);
@@ -1598,6 +1631,8 @@ static struct option __record_options[] = {
 		    "Record build-id of all DSOs regardless of hits"),
 	OPT_BOOLEAN(0, "timestamp-filename", &record.timestamp_filename,
 		    "append timestamp to output filename"),
+	OPT_BOOLEAN(0, "timestamp-boundary", &record.timestamp_boundary,
+		    "Record timestamp boundary (time of first/last samples)"),
 	OPT_STRING_OPTARG_SET(0, "switch-output", &record.switch_output.str,
 			  &record.switch_output.set, "signal,size,time",
 			  "Switch output when receive SIGUSR2 or cross size,time threshold",
@@ -1781,8 +1816,8 @@ int cmd_record(int argc, const char **argv)
 		goto out;
 	}
 
-	/* Enable ignoring missing threads when -u option is defined. */
-	rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX;
+	/* Enable ignoring missing threads when -u/-p option is defined. */
+	rec->opts.ignore_missing_thread = rec->opts.target.uid != UINT_MAX || rec->opts.target.pid;
 
 	err = -ENOMEM;
 	if (perf_evlist__create_maps(rec->evlist, &rec->opts.target) < 0)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c
index af5dd03..42a52dc 100644
--- a/tools/perf/builtin-report.c
+++ b/tools/perf/builtin-report.c
@@ -15,6 +15,7 @@
 #include "util/color.h"
 #include <linux/list.h>
 #include <linux/rbtree.h>
+#include <linux/err.h>
 #include "util/symbol.h"
 #include "util/callchain.h"
 #include "util/values.h"
@@ -51,6 +52,7 @@
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <unistd.h>
+#include <linux/mman.h>
 
 struct report {
 	struct perf_tool	tool;
@@ -60,6 +62,9 @@ struct report {
 	bool			show_threads;
 	bool			inverted_callchain;
 	bool			mem_mode;
+	bool			stats_mode;
+	bool			tasks_mode;
+	bool			mmaps_mode;
 	bool			header;
 	bool			header_only;
 	bool			nonany_branch_mode;
@@ -69,7 +74,9 @@ struct report {
 	const char		*cpu_list;
 	const char		*symbol_filter_str;
 	const char		*time_str;
-	struct perf_time_interval ptime;
+	struct perf_time_interval *ptime_range;
+	int			range_size;
+	int			range_num;
 	float			min_percent;
 	u64			nr_entries;
 	u64			queue_size;
@@ -162,12 +169,28 @@ static int hist_iter__branch_callback(struct hist_entry_iter *iter,
 	struct hist_entry *he = iter->he;
 	struct report *rep = arg;
 	struct branch_info *bi;
+	struct perf_sample *sample = iter->sample;
+	struct perf_evsel *evsel = iter->evsel;
+	int err;
+
+	if (!ui__has_annotation())
+		return 0;
+
+	hist__account_cycles(sample->branch_stack, al, sample,
+			     rep->nonany_branch_mode);
 
 	bi = he->branch_info;
+	err = addr_map_symbol__inc_samples(&bi->from, sample, evsel->idx);
+	if (err)
+		goto out;
+
+	err = addr_map_symbol__inc_samples(&bi->to, sample, evsel->idx);
+
 	branch_type_count(&rep->brtype_stat, &bi->flags,
 			  bi->from.addr, bi->to.addr);
 
-	return 0;
+out:
+	return err;
 }
 
 static int process_sample_event(struct perf_tool *tool,
@@ -186,8 +209,10 @@ static int process_sample_event(struct perf_tool *tool,
 	};
 	int ret = 0;
 
-	if (perf_time__skip_sample(&rep->ptime, sample->time))
+	if (perf_time__ranges_skip_sample(rep->ptime_range, rep->range_num,
+					  sample->time)) {
 		return 0;
+	}
 
 	if (machine__resolve(machine, &al, sample) < 0) {
 		pr_debug("problem processing %d event, skipping it.\n",
@@ -312,9 +337,10 @@ static int report__setup_sample_type(struct report *rep)
 
 	if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
 		if ((sample_type & PERF_SAMPLE_REGS_USER) &&
-		    (sample_type & PERF_SAMPLE_STACK_USER))
+		    (sample_type & PERF_SAMPLE_STACK_USER)) {
 			callchain_param.record_mode = CALLCHAIN_DWARF;
-		else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+			dwarf_callchain_users = true;
+		} else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
 			callchain_param.record_mode = CALLCHAIN_LBR;
 		else
 			callchain_param.record_mode = CALLCHAIN_FP;
@@ -377,6 +403,9 @@ static size_t hists__fprintf_nr_sample_events(struct hists *hists, struct report
 	if (evname != NULL)
 		ret += fprintf(fp, " of event '%s'", evname);
 
+	if (rep->time_str)
+		ret += fprintf(fp, " (time slices: %s)", rep->time_str);
+
 	if (symbol_conf.show_ref_callgraph &&
 	    strstr(evname, "call-graph=no")) {
 		ret += fprintf(fp, ", show reference callgraph");
@@ -567,6 +596,174 @@ static void report__output_resort(struct report *rep)
 	ui_progress__finish();
 }
 
+static void stats_setup(struct report *rep)
+{
+	memset(&rep->tool, 0, sizeof(rep->tool));
+	rep->tool.no_warn = true;
+}
+
+static int stats_print(struct report *rep)
+{
+	struct perf_session *session = rep->session;
+
+	perf_session__fprintf_nr_events(session, stdout);
+	return 0;
+}
+
+static void tasks_setup(struct report *rep)
+{
+	memset(&rep->tool, 0, sizeof(rep->tool));
+	if (rep->mmaps_mode) {
+		rep->tool.mmap = perf_event__process_mmap;
+		rep->tool.mmap2 = perf_event__process_mmap2;
+	}
+	rep->tool.comm = perf_event__process_comm;
+	rep->tool.exit = perf_event__process_exit;
+	rep->tool.fork = perf_event__process_fork;
+	rep->tool.no_warn = true;
+}
+
+struct task {
+	struct thread		*thread;
+	struct list_head	 list;
+	struct list_head	 children;
+};
+
+static struct task *tasks_list(struct task *task, struct machine *machine)
+{
+	struct thread *parent_thread, *thread = task->thread;
+	struct task   *parent_task;
+
+	/* Already listed. */
+	if (!list_empty(&task->list))
+		return NULL;
+
+	/* Last one in the chain. */
+	if (thread->ppid == -1)
+		return task;
+
+	parent_thread = machine__find_thread(machine, -1, thread->ppid);
+	if (!parent_thread)
+		return ERR_PTR(-ENOENT);
+
+	parent_task = thread__priv(parent_thread);
+	list_add_tail(&task->list, &parent_task->children);
+	return tasks_list(parent_task, machine);
+}
+
+static size_t maps__fprintf_task(struct maps *maps, int indent, FILE *fp)
+{
+	size_t printed = 0;
+	struct rb_node *nd;
+
+	for (nd = rb_first(&maps->entries); nd; nd = rb_next(nd)) {
+		struct map *map = rb_entry(nd, struct map, rb_node);
+
+		printed += fprintf(fp, "%*s  %" PRIx64 "-%" PRIx64 " %c%c%c%c %08" PRIx64 " %" PRIu64 " %s\n",
+				   indent, "", map->start, map->end,
+				   map->prot & PROT_READ ? 'r' : '-',
+				   map->prot & PROT_WRITE ? 'w' : '-',
+				   map->prot & PROT_EXEC ? 'x' : '-',
+				   map->flags & MAP_SHARED ? 's' : 'p',
+				   map->pgoff,
+				   map->ino, map->dso->name);
+	}
+
+	return printed;
+}
+
+static int map_groups__fprintf_task(struct map_groups *mg, int indent, FILE *fp)
+{
+	int printed = 0, i;
+	for (i = 0; i < MAP__NR_TYPES; ++i)
+		printed += maps__fprintf_task(&mg->maps[i], indent, fp);
+	return printed;
+}
+
+static void task__print_level(struct task *task, FILE *fp, int level)
+{
+	struct thread *thread = task->thread;
+	struct task *child;
+	int comm_indent = fprintf(fp, "  %8d %8d %8d |%*s",
+				  thread->pid_, thread->tid, thread->ppid,
+				  level, "");
+
+	fprintf(fp, "%s\n", thread__comm_str(thread));
+
+	map_groups__fprintf_task(thread->mg, comm_indent, fp);
+
+	if (!list_empty(&task->children)) {
+		list_for_each_entry(child, &task->children, list)
+			task__print_level(child, fp, level + 1);
+	}
+}
+
+static int tasks_print(struct report *rep, FILE *fp)
+{
+	struct perf_session *session = rep->session;
+	struct machine      *machine = &session->machines.host;
+	struct task *tasks, *task;
+	unsigned int nr = 0, itask = 0, i;
+	struct rb_node *nd;
+	LIST_HEAD(list);
+
+	/*
+	 * No locking needed while accessing machine->threads,
+	 * because --tasks is single threaded command.
+	 */
+
+	/* Count all the threads. */
+	for (i = 0; i < THREADS__TABLE_SIZE; i++)
+		nr += machine->threads[i].nr;
+
+	tasks = malloc(sizeof(*tasks) * nr);
+	if (!tasks)
+		return -ENOMEM;
+
+	for (i = 0; i < THREADS__TABLE_SIZE; i++) {
+		struct threads *threads = &machine->threads[i];
+
+		for (nd = rb_first(&threads->entries); nd; nd = rb_next(nd)) {
+			task = tasks + itask++;
+
+			task->thread = rb_entry(nd, struct thread, rb_node);
+			INIT_LIST_HEAD(&task->children);
+			INIT_LIST_HEAD(&task->list);
+			thread__set_priv(task->thread, task);
+		}
+	}
+
+	/*
+	 * Iterate every task down to the unprocessed parent
+	 * and link all in task children list. Task with no
+	 * parent is added into 'list'.
+	 */
+	for (itask = 0; itask < nr; itask++) {
+		task = tasks + itask;
+
+		if (!list_empty(&task->list))
+			continue;
+
+		task = tasks_list(task, machine);
+		if (IS_ERR(task)) {
+			pr_err("Error: failed to process tasks\n");
+			free(tasks);
+			return PTR_ERR(task);
+		}
+
+		if (task)
+			list_add_tail(&task->list, &list);
+	}
+
+	fprintf(fp, "# %8s %8s %8s  %s\n", "pid", "tid", "ppid", "comm");
+
+	list_for_each_entry(task, &list, list)
+		task__print_level(task, fp, 0);
+
+	free(tasks);
+	return 0;
+}
+
 static int __cmd_report(struct report *rep)
 {
 	int ret;
@@ -598,12 +795,24 @@ static int __cmd_report(struct report *rep)
 		return ret;
 	}
 
+	if (rep->stats_mode)
+		stats_setup(rep);
+
+	if (rep->tasks_mode)
+		tasks_setup(rep);
+
 	ret = perf_session__process_events(session);
 	if (ret) {
 		ui__error("failed to process sample\n");
 		return ret;
 	}
 
+	if (rep->stats_mode)
+		return stats_print(rep);
+
+	if (rep->tasks_mode)
+		return tasks_print(rep, stdout);
+
 	report__warn_kptr_restrict(rep);
 
 	evlist__for_each_entry(session->evlist, pos)
@@ -760,6 +969,9 @@ int cmd_report(int argc, const char **argv)
 	OPT_BOOLEAN('q', "quiet", &quiet, "Do not show any message"),
 	OPT_BOOLEAN('D', "dump-raw-trace", &dump_trace,
 		    "dump raw trace in ASCII"),
+	OPT_BOOLEAN(0, "stats", &report.stats_mode, "Display event stats"),
+	OPT_BOOLEAN(0, "tasks", &report.tasks_mode, "Display recorded tasks"),
+	OPT_BOOLEAN(0, "mmaps", &report.mmaps_mode, "Display recorded tasks memory maps"),
 	OPT_STRING('k', "vmlinux", &symbol_conf.vmlinux_name,
 		   "file", "vmlinux pathname"),
 	OPT_STRING(0, "kallsyms", &symbol_conf.kallsyms_name,
@@ -907,6 +1119,9 @@ int cmd_report(int argc, const char **argv)
 		report.symbol_filter_str = argv[0];
 	}
 
+	if (report.mmaps_mode)
+		report.tasks_mode = true;
+
 	if (quiet)
 		perf_quiet_option();
 
@@ -921,13 +1136,6 @@ int cmd_report(int argc, const char **argv)
 		return -EINVAL;
 	}
 
-	if (report.use_stdio)
-		use_browser = 0;
-	else if (report.use_tui)
-		use_browser = 1;
-	else if (report.use_gtk)
-		use_browser = 2;
-
 	if (report.inverted_callchain)
 		callchain_param.order = ORDER_CALLER;
 	if (symbol_conf.cumulate_callchain && !callchain_param.order_set)
@@ -1014,6 +1222,13 @@ int cmd_report(int argc, const char **argv)
 		perf_hpp_list.need_collapse = true;
 	}
 
+	if (report.use_stdio)
+		use_browser = 0;
+	else if (report.use_tui)
+		use_browser = 1;
+	else if (report.use_gtk)
+		use_browser = 2;
+
 	/* Force tty output for header output and per-thread stat. */
 	if (report.header || report.header_only || report.show_threads)
 		use_browser = 0;
@@ -1021,6 +1236,12 @@ int cmd_report(int argc, const char **argv)
 		report.tool.show_feat_hdr = SHOW_FEAT_HEADER;
 	if (report.show_full_info)
 		report.tool.show_feat_hdr = SHOW_FEAT_HEADER_FULL_INFO;
+	if (report.stats_mode || report.tasks_mode)
+		use_browser = 0;
+	if (report.stats_mode && report.tasks_mode) {
+		pr_err("Error: --tasks and --mmaps can't be used together with --stats\n");
+		goto error;
+	}
 
 	if (strcmp(input_name, "-") != 0)
 		setup_browser(true);
@@ -1043,7 +1264,8 @@ int cmd_report(int argc, const char **argv)
 			ret = 0;
 			goto error;
 		}
-	} else if (use_browser == 0 && !quiet) {
+	} else if (use_browser == 0 && !quiet &&
+		   !report.stats_mode && !report.tasks_mode) {
 		fputs("# To display the perf.data header info, please use --header/--header-only options.\n#\n",
 		      stdout);
 	}
@@ -1077,9 +1299,36 @@ int cmd_report(int argc, const char **argv)
 	if (symbol__init(&session->header.env) < 0)
 		goto error;
 
-	if (perf_time__parse_str(&report.ptime, report.time_str) != 0) {
-		pr_err("Invalid time string\n");
-		return -EINVAL;
+	report.ptime_range = perf_time__range_alloc(report.time_str,
+						    &report.range_size);
+	if (!report.ptime_range) {
+		ret = -ENOMEM;
+		goto error;
+	}
+
+	if (perf_time__parse_str(report.ptime_range, report.time_str) != 0) {
+		if (session->evlist->first_sample_time == 0 &&
+		    session->evlist->last_sample_time == 0) {
+			pr_err("HINT: no first/last sample time found in perf data.\n"
+			       "Please use latest perf binary to execute 'perf record'\n"
+			       "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
+			ret = -EINVAL;
+			goto error;
+		}
+
+		report.range_num = perf_time__percent_parse_str(
+					report.ptime_range, report.range_size,
+					report.time_str,
+					session->evlist->first_sample_time,
+					session->evlist->last_sample_time);
+
+		if (report.range_num < 0) {
+			pr_err("Invalid time string\n");
+			ret = -EINVAL;
+			goto error;
+		}
+	} else {
+		report.range_num = 1;
 	}
 
 	sort__setup_elide(stdout);
@@ -1092,6 +1341,8 @@ int cmd_report(int argc, const char **argv)
 		ret = 0;
 
 error:
+	zfree(&report.ptime_range);
+
 	perf_session__delete(session);
 	return ret;
 }
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index 9b43bda..ab19a6e 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -22,9 +22,11 @@
 #include "util/cpumap.h"
 #include "util/thread_map.h"
 #include "util/stat.h"
+#include "util/color.h"
 #include "util/string2.h"
 #include "util/thread-stack.h"
 #include "util/time-utils.h"
+#include "util/path.h"
 #include "print_binary.h"
 #include <linux/bitmap.h>
 #include <linux/kernel.h>
@@ -40,6 +42,7 @@
 #include <sys/param.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <fcntl.h>
 #include <unistd.h>
 
 #include "sane_ctype.h"
@@ -90,6 +93,8 @@ enum perf_output_field {
 	PERF_OUTPUT_SYNTH           = 1U << 25,
 	PERF_OUTPUT_PHYS_ADDR       = 1U << 26,
 	PERF_OUTPUT_UREGS	    = 1U << 27,
+	PERF_OUTPUT_METRIC	    = 1U << 28,
+	PERF_OUTPUT_MISC            = 1U << 29,
 };
 
 struct output_option {
@@ -124,6 +129,8 @@ struct output_option {
 	{.str = "brstackoff", .field = PERF_OUTPUT_BRSTACKOFF},
 	{.str = "synth", .field = PERF_OUTPUT_SYNTH},
 	{.str = "phys_addr", .field = PERF_OUTPUT_PHYS_ADDR},
+	{.str = "metric", .field = PERF_OUTPUT_METRIC},
+	{.str = "misc", .field = PERF_OUTPUT_MISC},
 };
 
 enum {
@@ -215,12 +222,20 @@ struct perf_evsel_script {
        char *filename;
        FILE *fp;
        u64  samples;
+       /* For metric output */
+       u64  val;
+       int  gnum;
 };
 
+static inline struct perf_evsel_script *evsel_script(struct perf_evsel *evsel)
+{
+	return (struct perf_evsel_script *)evsel->priv;
+}
+
 static struct perf_evsel_script *perf_evsel_script__new(struct perf_evsel *evsel,
 							struct perf_data *data)
 {
-	struct perf_evsel_script *es = malloc(sizeof(*es));
+	struct perf_evsel_script *es = zalloc(sizeof(*es));
 
 	if (es != NULL) {
 		if (asprintf(&es->filename, "%s.%s.dump", data->file.path, perf_evsel__name(evsel)) < 0)
@@ -228,7 +243,6 @@ static struct perf_evsel_script *perf_evsel_script__new(struct perf_evsel *evsel
 		es->fp = fopen(es->filename, "w");
 		if (es->fp == NULL)
 			goto out_free_filename;
-		es->samples = 0;
 	}
 
 	return es;
@@ -423,11 +437,6 @@ static int perf_evsel__check_attr(struct perf_evsel *evsel,
 					   PERF_OUTPUT_CPU, allow_user_set))
 		return -EINVAL;
 
-	if (PRINT_FIELD(PERIOD) &&
-		perf_evsel__check_stype(evsel, PERF_SAMPLE_PERIOD, "PERIOD",
-					PERF_OUTPUT_PERIOD))
-		return -EINVAL;
-
 	if (PRINT_FIELD(IREGS) &&
 		perf_evsel__check_stype(evsel, PERF_SAMPLE_REGS_INTR, "IREGS",
 					PERF_OUTPUT_IREGS))
@@ -588,7 +597,8 @@ static int perf_sample__fprintf_uregs(struct perf_sample *sample,
 
 static int perf_sample__fprintf_start(struct perf_sample *sample,
 				      struct thread *thread,
-				      struct perf_evsel *evsel, FILE *fp)
+				      struct perf_evsel *evsel,
+				      u32 type, FILE *fp)
 {
 	struct perf_event_attr *attr = &evsel->attr;
 	unsigned long secs;
@@ -618,6 +628,47 @@ static int perf_sample__fprintf_start(struct perf_sample *sample,
 			printed += fprintf(fp, "[%03d] ", sample->cpu);
 	}
 
+	if (PRINT_FIELD(MISC)) {
+		int ret = 0;
+
+		#define has(m) \
+			(sample->misc & PERF_RECORD_MISC_##m) == PERF_RECORD_MISC_##m
+
+		if (has(KERNEL))
+			ret += fprintf(fp, "K");
+		if (has(USER))
+			ret += fprintf(fp, "U");
+		if (has(HYPERVISOR))
+			ret += fprintf(fp, "H");
+		if (has(GUEST_KERNEL))
+			ret += fprintf(fp, "G");
+		if (has(GUEST_USER))
+			ret += fprintf(fp, "g");
+
+		switch (type) {
+		case PERF_RECORD_MMAP:
+		case PERF_RECORD_MMAP2:
+			if (has(MMAP_DATA))
+				ret += fprintf(fp, "M");
+			break;
+		case PERF_RECORD_COMM:
+			if (has(COMM_EXEC))
+				ret += fprintf(fp, "E");
+			break;
+		case PERF_RECORD_SWITCH:
+		case PERF_RECORD_SWITCH_CPU_WIDE:
+			if (has(SWITCH_OUT))
+				ret += fprintf(fp, "S");
+		default:
+			break;
+		}
+
+		#undef has
+
+		ret += fprintf(fp, "%*s", 6 - ret, " ");
+		printed += ret;
+	}
+
 	if (PRINT_FIELD(TIME)) {
 		nsecs = sample->time;
 		secs = nsecs / NSEC_PER_SEC;
@@ -1437,13 +1488,16 @@ struct perf_script {
 	bool			show_mmap_events;
 	bool			show_switch_events;
 	bool			show_namespace_events;
+	bool			show_lost_events;
 	bool			allocated;
 	bool			per_event_dump;
 	struct cpu_map		*cpus;
 	struct thread_map	*threads;
 	int			name_width;
 	const char              *time_str;
-	struct perf_time_interval ptime;
+	struct perf_time_interval *ptime_range;
+	int			range_size;
+	int			range_num;
 };
 
 static int perf_evlist__max_name_len(struct perf_evlist *evlist)
@@ -1477,6 +1531,88 @@ static int data_src__fprintf(u64 data_src, FILE *fp)
 	return fprintf(fp, "%-*s", maxlen, out);
 }
 
+struct metric_ctx {
+	struct perf_sample	*sample;
+	struct thread		*thread;
+	struct perf_evsel	*evsel;
+	FILE 			*fp;
+};
+
+static void script_print_metric(void *ctx, const char *color,
+			        const char *fmt,
+			        const char *unit, double val)
+{
+	struct metric_ctx *mctx = ctx;
+
+	if (!fmt)
+		return;
+	perf_sample__fprintf_start(mctx->sample, mctx->thread, mctx->evsel,
+				   PERF_RECORD_SAMPLE, mctx->fp);
+	fputs("\tmetric: ", mctx->fp);
+	if (color)
+		color_fprintf(mctx->fp, color, fmt, val);
+	else
+		printf(fmt, val);
+	fprintf(mctx->fp, " %s\n", unit);
+}
+
+static void script_new_line(void *ctx)
+{
+	struct metric_ctx *mctx = ctx;
+
+	perf_sample__fprintf_start(mctx->sample, mctx->thread, mctx->evsel,
+				   PERF_RECORD_SAMPLE, mctx->fp);
+	fputs("\tmetric: ", mctx->fp);
+}
+
+static void perf_sample__fprint_metric(struct perf_script *script,
+				       struct thread *thread,
+				       struct perf_evsel *evsel,
+				       struct perf_sample *sample,
+				       FILE *fp)
+{
+	struct perf_stat_output_ctx ctx = {
+		.print_metric = script_print_metric,
+		.new_line = script_new_line,
+		.ctx = &(struct metric_ctx) {
+				.sample = sample,
+				.thread = thread,
+				.evsel  = evsel,
+				.fp     = fp,
+			 },
+		.force_header = false,
+	};
+	struct perf_evsel *ev2;
+	static bool init;
+	u64 val;
+
+	if (!init) {
+		perf_stat__init_shadow_stats();
+		init = true;
+	}
+	if (!evsel->stats)
+		perf_evlist__alloc_stats(script->session->evlist, false);
+	if (evsel_script(evsel->leader)->gnum++ == 0)
+		perf_stat__reset_shadow_stats();
+	val = sample->period * evsel->scale;
+	perf_stat__update_shadow_stats(evsel,
+				       val,
+				       sample->cpu,
+				       &rt_stat);
+	evsel_script(evsel)->val = val;
+	if (evsel_script(evsel->leader)->gnum == evsel->leader->nr_members) {
+		for_each_group_member (ev2, evsel->leader) {
+			perf_stat__print_shadow_stats(ev2,
+						      evsel_script(ev2)->val,
+						      sample->cpu,
+						      &ctx,
+						      NULL,
+						      &rt_stat);
+		}
+		evsel_script(evsel->leader)->gnum = 0;
+	}
+}
+
 static void process_event(struct perf_script *script,
 			  struct perf_sample *sample, struct perf_evsel *evsel,
 			  struct addr_location *al,
@@ -1493,7 +1629,8 @@ static void process_event(struct perf_script *script,
 
 	++es->samples;
 
-	perf_sample__fprintf_start(sample, thread, evsel, fp);
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_SAMPLE, fp);
 
 	if (PRINT_FIELD(PERIOD))
 		fprintf(fp, "%10" PRIu64 " ", sample->period);
@@ -1564,6 +1701,9 @@ static void process_event(struct perf_script *script,
 	if (PRINT_FIELD(PHYS_ADDR))
 		fprintf(fp, "%16" PRIx64, sample->phys_addr);
 	fprintf(fp, "\n");
+
+	if (PRINT_FIELD(METRIC))
+		perf_sample__fprint_metric(script, thread, evsel, sample, fp);
 }
 
 static struct scripting_ops	*scripting_ops;
@@ -1643,8 +1783,10 @@ static int process_sample_event(struct perf_tool *tool,
 	struct perf_script *scr = container_of(tool, struct perf_script, tool);
 	struct addr_location al;
 
-	if (perf_time__skip_sample(&scr->ptime, sample->time))
+	if (perf_time__ranges_skip_sample(scr->ptime_range, scr->range_num,
+					  sample->time)) {
 		return 0;
+	}
 
 	if (debug_mode) {
 		if (sample->time < last_timestamp) {
@@ -1737,7 +1879,8 @@ static int process_comm_event(struct perf_tool *tool,
 		sample->tid = event->comm.tid;
 		sample->pid = event->comm.pid;
 	}
-	perf_sample__fprintf_start(sample, thread, evsel, stdout);
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_COMM, stdout);
 	perf_event__fprintf(event, stdout);
 	ret = 0;
 out:
@@ -1772,7 +1915,8 @@ static int process_namespaces_event(struct perf_tool *tool,
 		sample->tid = event->namespaces.tid;
 		sample->pid = event->namespaces.pid;
 	}
-	perf_sample__fprintf_start(sample, thread, evsel, stdout);
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_NAMESPACES, stdout);
 	perf_event__fprintf(event, stdout);
 	ret = 0;
 out:
@@ -1805,7 +1949,8 @@ static int process_fork_event(struct perf_tool *tool,
 		sample->tid = event->fork.tid;
 		sample->pid = event->fork.pid;
 	}
-	perf_sample__fprintf_start(sample, thread, evsel, stdout);
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_FORK, stdout);
 	perf_event__fprintf(event, stdout);
 	thread__put(thread);
 
@@ -1834,7 +1979,8 @@ static int process_exit_event(struct perf_tool *tool,
 		sample->tid = event->fork.tid;
 		sample->pid = event->fork.pid;
 	}
-	perf_sample__fprintf_start(sample, thread, evsel, stdout);
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_EXIT, stdout);
 	perf_event__fprintf(event, stdout);
 
 	if (perf_event__process_exit(tool, event, sample, machine) < 0)
@@ -1869,7 +2015,8 @@ static int process_mmap_event(struct perf_tool *tool,
 		sample->tid = event->mmap.tid;
 		sample->pid = event->mmap.pid;
 	}
-	perf_sample__fprintf_start(sample, thread, evsel, stdout);
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_MMAP, stdout);
 	perf_event__fprintf(event, stdout);
 	thread__put(thread);
 	return 0;
@@ -1900,7 +2047,8 @@ static int process_mmap2_event(struct perf_tool *tool,
 		sample->tid = event->mmap2.tid;
 		sample->pid = event->mmap2.pid;
 	}
-	perf_sample__fprintf_start(sample, thread, evsel, stdout);
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_MMAP2, stdout);
 	perf_event__fprintf(event, stdout);
 	thread__put(thread);
 	return 0;
@@ -1926,7 +2074,31 @@ static int process_switch_event(struct perf_tool *tool,
 		return -1;
 	}
 
-	perf_sample__fprintf_start(sample, thread, evsel, stdout);
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_SWITCH, stdout);
+	perf_event__fprintf(event, stdout);
+	thread__put(thread);
+	return 0;
+}
+
+static int
+process_lost_event(struct perf_tool *tool,
+		   union perf_event *event,
+		   struct perf_sample *sample,
+		   struct machine *machine)
+{
+	struct perf_script *script = container_of(tool, struct perf_script, tool);
+	struct perf_session *session = script->session;
+	struct perf_evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
+	struct thread *thread;
+
+	thread = machine__findnew_thread(machine, sample->pid,
+					 sample->tid);
+	if (thread == NULL)
+		return -1;
+
+	perf_sample__fprintf_start(sample, thread, evsel,
+				   PERF_RECORD_LOST, stdout);
 	perf_event__fprintf(event, stdout);
 	thread__put(thread);
 	return 0;
@@ -2026,6 +2198,8 @@ static int __cmd_script(struct perf_script *script)
 		script->tool.context_switch = process_switch_event;
 	if (script->show_namespace_events)
 		script->tool.namespaces = process_namespaces_event;
+	if (script->show_lost_events)
+		script->tool.lost = process_lost_event;
 
 	if (perf_script__setup_per_event_dump(script)) {
 		pr_err("Couldn't create the per event dump files\n");
@@ -2311,19 +2485,6 @@ static int parse_output_fields(const struct option *opt __maybe_unused,
 	return rc;
 }
 
-/* Helper function for filesystems that return a dent->d_type DT_UNKNOWN */
-static int is_directory(const char *base_path, const struct dirent *dent)
-{
-	char path[PATH_MAX];
-	struct stat st;
-
-	sprintf(path, "%s/%s", base_path, dent->d_name);
-	if (stat(path, &st))
-		return 0;
-
-	return S_ISDIR(st.st_mode);
-}
-
 #define for_each_lang(scripts_path, scripts_dir, lang_dirent)		\
 	while ((lang_dirent = readdir(scripts_dir)) != NULL)		\
 		if ((lang_dirent->d_type == DT_DIR ||			\
@@ -2758,9 +2919,10 @@ static void script__setup_sample_type(struct perf_script *script)
 
 	if (symbol_conf.use_callchain || symbol_conf.cumulate_callchain) {
 		if ((sample_type & PERF_SAMPLE_REGS_USER) &&
-		    (sample_type & PERF_SAMPLE_STACK_USER))
+		    (sample_type & PERF_SAMPLE_STACK_USER)) {
 			callchain_param.record_mode = CALLCHAIN_DWARF;
-		else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
+			dwarf_callchain_users = true;
+		} else if (sample_type & PERF_SAMPLE_BRANCH_STACK)
 			callchain_param.record_mode = CALLCHAIN_LBR;
 		else
 			callchain_param.record_mode = CALLCHAIN_FP;
@@ -2975,6 +3137,8 @@ int cmd_script(int argc, const char **argv)
 		    "Show context switch events (if recorded)"),
 	OPT_BOOLEAN('\0', "show-namespace-events", &script.show_namespace_events,
 		    "Show namespace events (if recorded)"),
+	OPT_BOOLEAN('\0', "show-lost-events", &script.show_lost_events,
+		    "Show lost events (if recorded)"),
 	OPT_BOOLEAN('\0', "per-event-dump", &script.per_event_dump,
 		    "Dump trace output to files named by the monitored events"),
 	OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
@@ -3281,18 +3445,46 @@ int cmd_script(int argc, const char **argv)
 	if (err < 0)
 		goto out_delete;
 
-	/* needs to be parsed after looking up reference time */
-	if (perf_time__parse_str(&script.ptime, script.time_str) != 0) {
-		pr_err("Invalid time string\n");
-		err = -EINVAL;
+	script.ptime_range = perf_time__range_alloc(script.time_str,
+						    &script.range_size);
+	if (!script.ptime_range) {
+		err = -ENOMEM;
 		goto out_delete;
 	}
 
+	/* needs to be parsed after looking up reference time */
+	if (perf_time__parse_str(script.ptime_range, script.time_str) != 0) {
+		if (session->evlist->first_sample_time == 0 &&
+		    session->evlist->last_sample_time == 0) {
+			pr_err("HINT: no first/last sample time found in perf data.\n"
+			       "Please use latest perf binary to execute 'perf record'\n"
+			       "(if '--buildid-all' is enabled, please set '--timestamp-boundary').\n");
+			err = -EINVAL;
+			goto out_delete;
+		}
+
+		script.range_num = perf_time__percent_parse_str(
+					script.ptime_range, script.range_size,
+					script.time_str,
+					session->evlist->first_sample_time,
+					session->evlist->last_sample_time);
+
+		if (script.range_num < 0) {
+			pr_err("Invalid time string\n");
+			err = -EINVAL;
+			goto out_delete;
+		}
+	} else {
+		script.range_num = 1;
+	}
+
 	err = __cmd_script(&script);
 
 	flush_scripting();
 
 out_delete:
+	zfree(&script.ptime_range);
+
 	perf_evlist__free_stats(session->evlist);
 	perf_session__delete(session);
 
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c
index 59af5a8..98bf9d3 100644
--- a/tools/perf/builtin-stat.c
+++ b/tools/perf/builtin-stat.c
@@ -63,7 +63,6 @@
 #include "util/group.h"
 #include "util/session.h"
 #include "util/tool.h"
-#include "util/group.h"
 #include "util/string2.h"
 #include "util/metricgroup.h"
 #include "asm/bug.h"
@@ -214,8 +213,13 @@ static inline void diff_timespec(struct timespec *r, struct timespec *a,
 
 static void perf_stat__reset_stats(void)
 {
+	int i;
+
 	perf_evlist__reset_stats(evsel_list);
 	perf_stat__reset_shadow_stats();
+
+	for (i = 0; i < stat_config.stats_num; i++)
+		perf_stat__reset_shadow_per_stat(&stat_config.stats[i]);
 }
 
 static int create_perf_stat_counter(struct perf_evsel *evsel)
@@ -272,7 +276,7 @@ static int create_perf_stat_counter(struct perf_evsel *evsel)
 			attr->enable_on_exec = 1;
 	}
 
-	if (target__has_cpu(&target))
+	if (target__has_cpu(&target) && !target__has_per_thread(&target))
 		return perf_evsel__open_per_cpu(evsel, perf_evsel__cpus(evsel));
 
 	return perf_evsel__open_per_thread(evsel, evsel_list->threads);
@@ -335,7 +339,7 @@ static int read_counter(struct perf_evsel *counter)
 	int nthreads = thread_map__nr(evsel_list->threads);
 	int ncpus, cpu, thread;
 
-	if (target__has_cpu(&target))
+	if (target__has_cpu(&target) && !target__has_per_thread(&target))
 		ncpus = perf_evsel__nr_cpus(counter);
 	else
 		ncpus = 1;
@@ -458,19 +462,8 @@ static void workload_exec_failed_signal(int signo __maybe_unused, siginfo_t *inf
 	workload_exec_errno = info->si_value.sival_int;
 }
 
-static bool has_unit(struct perf_evsel *counter)
-{
-	return counter->unit && *counter->unit;
-}
-
-static bool has_scale(struct perf_evsel *counter)
-{
-	return counter->scale != 1;
-}
-
 static int perf_stat_synthesize_config(bool is_pipe)
 {
-	struct perf_evsel *counter;
 	int err;
 
 	if (is_pipe) {
@@ -482,53 +475,10 @@ static int perf_stat_synthesize_config(bool is_pipe)
 		}
 	}
 
-	/*
-	 * Synthesize other events stuff not carried within
-	 * attr event - unit, scale, name
-	 */
-	evlist__for_each_entry(evsel_list, counter) {
-		if (!counter->supported)
-			continue;
-
-		/*
-		 * Synthesize unit and scale only if it's defined.
-		 */
-		if (has_unit(counter)) {
-			err = perf_event__synthesize_event_update_unit(NULL, counter, process_synthesized_event);
-			if (err < 0) {
-				pr_err("Couldn't synthesize evsel unit.\n");
-				return err;
-			}
-		}
-
-		if (has_scale(counter)) {
-			err = perf_event__synthesize_event_update_scale(NULL, counter, process_synthesized_event);
-			if (err < 0) {
-				pr_err("Couldn't synthesize evsel scale.\n");
-				return err;
-			}
-		}
-
-		if (counter->own_cpus) {
-			err = perf_event__synthesize_event_update_cpus(NULL, counter, process_synthesized_event);
-			if (err < 0) {
-				pr_err("Couldn't synthesize evsel scale.\n");
-				return err;
-			}
-		}
-
-		/*
-		 * Name is needed only for pipe output,
-		 * perf.data carries event names.
-		 */
-		if (is_pipe) {
-			err = perf_event__synthesize_event_update_name(NULL, counter, process_synthesized_event);
-			if (err < 0) {
-				pr_err("Couldn't synthesize evsel name.\n");
-				return err;
-			}
-		}
-	}
+	err = perf_event__synthesize_extra_attr(NULL,
+						evsel_list,
+						process_synthesized_event,
+						is_pipe);
 
 	err = perf_event__synthesize_thread_map2(NULL, evsel_list->threads,
 						process_synthesized_event,
@@ -1151,7 +1101,8 @@ static void abs_printout(int id, int nr, struct perf_evsel *evsel, double avg)
 }
 
 static void printout(int id, int nr, struct perf_evsel *counter, double uval,
-		     char *prefix, u64 run, u64 ena, double noise)
+		     char *prefix, u64 run, u64 ena, double noise,
+		     struct runtime_stat *st)
 {
 	struct perf_stat_output_ctx out;
 	struct outstate os = {
@@ -1244,7 +1195,7 @@ static void printout(int id, int nr, struct perf_evsel *counter, double uval,
 
 	perf_stat__print_shadow_stats(counter, uval,
 				first_shadow_cpu(counter, id),
-				&out, &metric_events);
+				&out, &metric_events, st);
 	if (!csv_output && !metric_only) {
 		print_noise(counter, noise);
 		print_running(run, ena);
@@ -1268,7 +1219,8 @@ static void aggr_update_shadow(void)
 				val += perf_counts(counter->counts, cpu, 0)->val;
 			}
 			perf_stat__update_shadow_stats(counter, val,
-						       first_shadow_cpu(counter, id));
+					first_shadow_cpu(counter, id),
+					&rt_stat);
 		}
 	}
 }
@@ -1388,7 +1340,8 @@ static void print_aggr(char *prefix)
 				fprintf(output, "%s", prefix);
 
 			uval = val * counter->scale;
-			printout(id, nr, counter, uval, prefix, run, ena, 1.0);
+			printout(id, nr, counter, uval, prefix, run, ena, 1.0,
+				 &rt_stat);
 			if (!metric_only)
 				fputc('\n', output);
 		}
@@ -1397,13 +1350,24 @@ static void print_aggr(char *prefix)
 	}
 }
 
-static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
+static int cmp_val(const void *a, const void *b)
 {
-	FILE *output = stat_config.output;
-	int nthreads = thread_map__nr(counter->threads);
-	int ncpus = cpu_map__nr(counter->cpus);
-	int cpu, thread;
+	return ((struct perf_aggr_thread_value *)b)->val -
+		((struct perf_aggr_thread_value *)a)->val;
+}
+
+static struct perf_aggr_thread_value *sort_aggr_thread(
+					struct perf_evsel *counter,
+					int nthreads, int ncpus,
+					int *ret)
+{
+	int cpu, thread, i = 0;
 	double uval;
+	struct perf_aggr_thread_value *buf;
+
+	buf = calloc(nthreads, sizeof(struct perf_aggr_thread_value));
+	if (!buf)
+		return NULL;
 
 	for (thread = 0; thread < nthreads; thread++) {
 		u64 ena = 0, run = 0, val = 0;
@@ -1414,13 +1378,63 @@ static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
 			run += perf_counts(counter->counts, cpu, thread)->run;
 		}
 
+		uval = val * counter->scale;
+
+		/*
+		 * Skip value 0 when enabling --per-thread globally,
+		 * otherwise too many 0 output.
+		 */
+		if (uval == 0.0 && target__has_per_thread(&target))
+			continue;
+
+		buf[i].counter = counter;
+		buf[i].id = thread;
+		buf[i].uval = uval;
+		buf[i].val = val;
+		buf[i].run = run;
+		buf[i].ena = ena;
+		i++;
+	}
+
+	qsort(buf, i, sizeof(struct perf_aggr_thread_value), cmp_val);
+
+	if (ret)
+		*ret = i;
+
+	return buf;
+}
+
+static void print_aggr_thread(struct perf_evsel *counter, char *prefix)
+{
+	FILE *output = stat_config.output;
+	int nthreads = thread_map__nr(counter->threads);
+	int ncpus = cpu_map__nr(counter->cpus);
+	int thread, sorted_threads, id;
+	struct perf_aggr_thread_value *buf;
+
+	buf = sort_aggr_thread(counter, nthreads, ncpus, &sorted_threads);
+	if (!buf) {
+		perror("cannot sort aggr thread");
+		return;
+	}
+
+	for (thread = 0; thread < sorted_threads; thread++) {
 		if (prefix)
 			fprintf(output, "%s", prefix);
 
-		uval = val * counter->scale;
-		printout(thread, 0, counter, uval, prefix, run, ena, 1.0);
+		id = buf[thread].id;
+		if (stat_config.stats)
+			printout(id, 0, buf[thread].counter, buf[thread].uval,
+				 prefix, buf[thread].run, buf[thread].ena, 1.0,
+				 &stat_config.stats[id]);
+		else
+			printout(id, 0, buf[thread].counter, buf[thread].uval,
+				 prefix, buf[thread].run, buf[thread].ena, 1.0,
+				 &rt_stat);
 		fputc('\n', output);
 	}
+
+	free(buf);
 }
 
 struct caggr_data {
@@ -1455,7 +1469,8 @@ static void print_counter_aggr(struct perf_evsel *counter, char *prefix)
 		fprintf(output, "%s", prefix);
 
 	uval = cd.avg * counter->scale;
-	printout(-1, 0, counter, uval, prefix, cd.avg_running, cd.avg_enabled, cd.avg);
+	printout(-1, 0, counter, uval, prefix, cd.avg_running, cd.avg_enabled,
+		 cd.avg, &rt_stat);
 	if (!metric_only)
 		fprintf(output, "\n");
 }
@@ -1494,7 +1509,8 @@ static void print_counter(struct perf_evsel *counter, char *prefix)
 			fprintf(output, "%s", prefix);
 
 		uval = val * counter->scale;
-		printout(cpu, 0, counter, uval, prefix, run, ena, 1.0);
+		printout(cpu, 0, counter, uval, prefix, run, ena, 1.0,
+			 &rt_stat);
 
 		fputc('\n', output);
 	}
@@ -1526,7 +1542,8 @@ static void print_no_aggr_metric(char *prefix)
 			run = perf_counts(counter->counts, cpu, 0)->run;
 
 			uval = val * counter->scale;
-			printout(cpu, 0, counter, uval, prefix, run, ena, 1.0);
+			printout(cpu, 0, counter, uval, prefix, run, ena, 1.0,
+				 &rt_stat);
 		}
 		fputc('\n', stat_config.output);
 	}
@@ -1582,7 +1599,8 @@ static void print_metric_headers(const char *prefix, bool no_indent)
 		perf_stat__print_shadow_stats(counter, 0,
 					      0,
 					      &out,
-					      &metric_events);
+					      &metric_events,
+					      &rt_stat);
 	}
 	fputc('\n', stat_config.output);
 }
@@ -2541,6 +2559,35 @@ int process_cpu_map_event(struct perf_tool *tool,
 	return set_maps(st);
 }
 
+static int runtime_stat_new(struct perf_stat_config *config, int nthreads)
+{
+	int i;
+
+	config->stats = calloc(nthreads, sizeof(struct runtime_stat));
+	if (!config->stats)
+		return -1;
+
+	config->stats_num = nthreads;
+
+	for (i = 0; i < nthreads; i++)
+		runtime_stat__init(&config->stats[i]);
+
+	return 0;
+}
+
+static void runtime_stat_delete(struct perf_stat_config *config)
+{
+	int i;
+
+	if (!config->stats)
+		return;
+
+	for (i = 0; i < config->stats_num; i++)
+		runtime_stat__exit(&config->stats[i]);
+
+	free(config->stats);
+}
+
 static const char * const stat_report_usage[] = {
 	"perf stat report [<options>]",
 	NULL,
@@ -2750,12 +2797,16 @@ int cmd_stat(int argc, const char **argv)
 		run_count = 1;
 	}
 
-	if ((stat_config.aggr_mode == AGGR_THREAD) && !target__has_task(&target)) {
-		fprintf(stderr, "The --per-thread option is only available "
-			"when monitoring via -p -t options.\n");
-		parse_options_usage(NULL, stat_options, "p", 1);
-		parse_options_usage(NULL, stat_options, "t", 1);
-		goto out;
+	if ((stat_config.aggr_mode == AGGR_THREAD) &&
+		!target__has_task(&target)) {
+		if (!target.system_wide || target.cpu_list) {
+			fprintf(stderr, "The --per-thread option is only "
+				"available when monitoring via -p -t -a "
+				"options or only --per-thread.\n");
+			parse_options_usage(NULL, stat_options, "p", 1);
+			parse_options_usage(NULL, stat_options, "t", 1);
+			goto out;
+		}
 	}
 
 	/*
@@ -2779,6 +2830,9 @@ int cmd_stat(int argc, const char **argv)
 
 	target__validate(&target);
 
+	if ((stat_config.aggr_mode == AGGR_THREAD) && (target.system_wide))
+		target.per_thread = true;
+
 	if (perf_evlist__create_maps(evsel_list, &target) < 0) {
 		if (target__has_task(&target)) {
 			pr_err("Problems finding threads of monitor\n");
@@ -2796,8 +2850,15 @@ int cmd_stat(int argc, const char **argv)
 	 * Initialize thread_map with comm names,
 	 * so we could print it out on output.
 	 */
-	if (stat_config.aggr_mode == AGGR_THREAD)
+	if (stat_config.aggr_mode == AGGR_THREAD) {
 		thread_map__read_comms(evsel_list->threads);
+		if (target.system_wide) {
+			if (runtime_stat_new(&stat_config,
+				thread_map__nr(evsel_list->threads))) {
+				goto out;
+			}
+		}
+	}
 
 	if (interval && interval < 100) {
 		if (interval < 10) {
@@ -2887,5 +2948,8 @@ int cmd_stat(int argc, const char **argv)
 		sysfs__write_int(FREEZE_ON_SMI_PATH, 0);
 
 	perf_evlist__delete(evsel_list);
+
+	runtime_stat_delete(&stat_config);
+
 	return status;
 }
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 9e0d264..c6ccda5 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -99,6 +99,7 @@ static void perf_top__resize(struct perf_top *top)
 
 static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
 {
+	struct perf_evsel *evsel = hists_to_evsel(he->hists);
 	struct symbol *sym;
 	struct annotation *notes;
 	struct map *map;
@@ -137,7 +138,7 @@ static int perf_top__parse_source(struct perf_top *top, struct hist_entry *he)
 		return err;
 	}
 
-	err = symbol__disassemble(sym, map, NULL, 0, NULL, NULL);
+	err = symbol__annotate(sym, map, evsel, 0, NULL);
 	if (err == 0) {
 out_assign:
 		top->sym_filter_entry = he;
@@ -229,6 +230,7 @@ static void perf_top__record_precise_ip(struct perf_top *top,
 static void perf_top__show_details(struct perf_top *top)
 {
 	struct hist_entry *he = top->sym_filter_entry;
+	struct perf_evsel *evsel = hists_to_evsel(he->hists);
 	struct annotation *notes;
 	struct symbol *symbol;
 	int more;
@@ -241,6 +243,8 @@ static void perf_top__show_details(struct perf_top *top)
 
 	pthread_mutex_lock(&notes->lock);
 
+	symbol__calc_percent(symbol, evsel);
+
 	if (notes->src == NULL)
 		goto out_unlock;
 
@@ -412,7 +416,7 @@ static void perf_top__print_mapped_keys(struct perf_top *top)
 	fprintf(stdout, "\t[S]     stop annotation.\n");
 
 	fprintf(stdout,
-		"\t[K]     hide kernel_symbols symbols.     \t(%s)\n",
+		"\t[K]     hide kernel symbols.             \t(%s)\n",
 		top->hide_kernel_symbols ? "yes" : "no");
 	fprintf(stdout,
 		"\t[U]     hide user symbols.               \t(%s)\n",
@@ -903,7 +907,7 @@ static int perf_top__start_counters(struct perf_top *top)
 		}
 	}
 
-	if (perf_evlist__mmap(evlist, opts->mmap_pages, false) < 0) {
+	if (perf_evlist__mmap(evlist, opts->mmap_pages) < 0) {
 		ui__error("Failed to mmap with %d (%s)\n",
 			    errno, str_error_r(errno, msg, sizeof(msg)));
 		goto out_err;
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index 84debdb..17d11de 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -21,6 +21,7 @@
 #include "builtin.h"
 #include "util/color.h"
 #include "util/debug.h"
+#include "util/env.h"
 #include "util/event.h"
 #include "util/evlist.h"
 #include <subcmd/exec-cmd.h>
@@ -45,18 +46,17 @@
 
 #include <errno.h>
 #include <inttypes.h>
-#include <libaudit.h> /* FIXME: Still needed for audit_errno_to_name */
 #include <poll.h>
 #include <signal.h>
 #include <stdlib.h>
 #include <string.h>
 #include <linux/err.h>
 #include <linux/filter.h>
-#include <linux/audit.h>
 #include <linux/kernel.h>
 #include <linux/random.h>
 #include <linux/stringify.h>
 #include <linux/time64.h>
+#include <fcntl.h>
 
 #include "sane_ctype.h"
 
@@ -111,6 +111,7 @@ struct trace {
 	bool			summary;
 	bool			summary_only;
 	bool			show_comm;
+	bool			print_sample;
 	bool			show_tool_stats;
 	bool			trace_syscalls;
 	bool			kernel_syscallchains;
@@ -545,9 +546,10 @@ static size_t syscall_arg__scnprintf_getrandom_flags(char *bf, size_t size,
 	  { .scnprintf	= SCA_STRARRAY, \
 	    .parm	= &strarray__##array, }
 
+#include "trace/beauty/arch_errno_names.c"
 #include "trace/beauty/eventfd.c"
-#include "trace/beauty/flock.c"
 #include "trace/beauty/futex_op.c"
+#include "trace/beauty/futex_val3.c"
 #include "trace/beauty/mmap.c"
 #include "trace/beauty/mode_t.c"
 #include "trace/beauty/msg_flags.c"
@@ -610,7 +612,8 @@ static struct syscall_fmt {
 	{ .name	    = "fstat", .alias = "newfstat", },
 	{ .name	    = "fstatat", .alias = "newfstatat", },
 	{ .name	    = "futex",
-	  .arg = { [1] = { .scnprintf = SCA_FUTEX_OP, /* op */ }, }, },
+	  .arg = { [1] = { .scnprintf = SCA_FUTEX_OP, /* op */ },
+		   [5] = { .scnprintf = SCA_FUTEX_VAL3, /* val3 */ }, }, },
 	{ .name	    = "futimesat",
 	  .arg = { [0] = { .scnprintf = SCA_FDAT, /* fd */ }, }, },
 	{ .name	    = "getitimer",
@@ -622,6 +625,7 @@ static struct syscall_fmt {
 	  .arg = { [2] = { .scnprintf = SCA_GETRANDOM_FLAGS, /* flags */ }, }, },
 	{ .name	    = "getrlimit",
 	  .arg = { [0] = STRARRAY(resource, rlimit_resources), }, },
+	{ .name	    = "gettid",	    .errpid = true, },
 	{ .name	    = "ioctl",
 	  .arg = {
 #if defined(__i386__) || defined(__x86_64__)
@@ -819,7 +823,7 @@ static size_t fprintf_duration(unsigned long t, bool calculated, FILE *fp)
 	size_t printed = fprintf(fp, "(");
 
 	if (!calculated)
-		printed += fprintf(fp, "     ?   ");
+		printed += fprintf(fp, "         ");
 	else if (duration >= 1.0)
 		printed += color_fprintf(fp, PERF_COLOR_RED, "%6.3f ms", duration);
 	else if (duration >= 0.01)
@@ -1554,10 +1558,9 @@ static void thread__update_stats(struct thread_trace *ttrace,
 	update_stats(stats, duration);
 }
 
-static int trace__printf_interrupted_entry(struct trace *trace, struct perf_sample *sample)
+static int trace__printf_interrupted_entry(struct trace *trace)
 {
 	struct thread_trace *ttrace;
-	u64 duration;
 	size_t printed;
 
 	if (trace->current == NULL)
@@ -1568,15 +1571,30 @@ static int trace__printf_interrupted_entry(struct trace *trace, struct perf_samp
 	if (!ttrace->entry_pending)
 		return 0;
 
-	duration = sample->time - ttrace->entry_time;
-
-	printed  = trace__fprintf_entry_head(trace, trace->current, duration, true, ttrace->entry_time, trace->output);
+	printed  = trace__fprintf_entry_head(trace, trace->current, 0, false, ttrace->entry_time, trace->output);
 	printed += fprintf(trace->output, "%-70s) ...\n", ttrace->entry_str);
 	ttrace->entry_pending = false;
 
 	return printed;
 }
 
+static int trace__fprintf_sample(struct trace *trace, struct perf_evsel *evsel,
+				 struct perf_sample *sample, struct thread *thread)
+{
+	int printed = 0;
+
+	if (trace->print_sample) {
+		double ts = (double)sample->time / NSEC_PER_MSEC;
+
+		printed += fprintf(trace->output, "%22s %10.3f %s %d/%d [%d]\n",
+				   perf_evsel__name(evsel), ts,
+				   thread__comm_str(thread),
+				   sample->pid, sample->tid, sample->cpu);
+	}
+
+	return printed;
+}
+
 static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel,
 			    union perf_event *event __maybe_unused,
 			    struct perf_sample *sample)
@@ -1597,6 +1615,8 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel,
 	if (ttrace == NULL)
 		goto out_put;
 
+	trace__fprintf_sample(trace, evsel, sample, thread);
+
 	args = perf_evsel__sc_tp_ptr(evsel, args, sample);
 
 	if (ttrace->entry_str == NULL) {
@@ -1606,7 +1626,7 @@ static int trace__sys_enter(struct trace *trace, struct perf_evsel *evsel,
 	}
 
 	if (!(trace->duration_filter || trace->summary_only || trace->min_stack))
-		trace__printf_interrupted_entry(trace, sample);
+		trace__printf_interrupted_entry(trace);
 
 	ttrace->entry_time = sample->time;
 	msg = ttrace->entry_str;
@@ -1643,7 +1663,7 @@ static int trace__resolve_callchain(struct trace *trace, struct perf_evsel *evse
 	struct addr_location al;
 
 	if (machine__resolve(trace->host, &al, sample) < 0 ||
-	    thread__resolve_callchain(al.thread, cursor, evsel, sample, NULL, NULL, trace->max_stack))
+	    thread__resolve_callchain(al.thread, cursor, evsel, sample, NULL, NULL, evsel->attr.sample_max_stack))
 		return -1;
 
 	return 0;
@@ -1659,6 +1679,14 @@ static int trace__fprintf_callchain(struct trace *trace, struct perf_sample *sam
 	return sample__fprintf_callchain(sample, 38, print_opts, &callchain_cursor, trace->output);
 }
 
+static const char *errno_to_name(struct perf_evsel *evsel, int err)
+{
+	struct perf_env *env = perf_evsel__env(evsel);
+	const char *arch_name = perf_env__arch(env);
+
+	return arch_syscalls__strerrno(arch_name, err);
+}
+
 static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
 			   union perf_event *event __maybe_unused,
 			   struct perf_sample *sample)
@@ -1679,6 +1707,8 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
 	if (ttrace == NULL)
 		goto out_put;
 
+	trace__fprintf_sample(trace, evsel, sample, thread);
+
 	if (trace->summary)
 		thread__update_stats(ttrace, id, sample);
 
@@ -1729,7 +1759,7 @@ static int trace__sys_exit(struct trace *trace, struct perf_evsel *evsel,
 errno_print: {
 		char bf[STRERR_BUFSIZE];
 		const char *emsg = str_error_r(-ret, bf, sizeof(bf)),
-			   *e = audit_errno_to_name(-ret);
+			   *e = errno_to_name(evsel, -ret);
 
 		fprintf(trace->output, ") = -1 %s %s", e, emsg);
 	}
@@ -1910,7 +1940,7 @@ static int trace__event_handler(struct trace *trace, struct perf_evsel *evsel,
 		}
 	}
 
-	trace__printf_interrupted_entry(trace, sample);
+	trace__printf_interrupted_entry(trace);
 	trace__fprintf_tstamp(trace, sample->time, trace->output);
 
 	if (trace->trace_syscalls)
@@ -2221,6 +2251,9 @@ static int trace__add_syscall_newtp(struct trace *trace)
 	if (perf_evsel__init_sc_tp_uint_field(sys_exit, ret))
 		goto out_delete_sys_exit;
 
+	perf_evsel__config_callchain(sys_enter, &trace->opts, &callchain_param);
+	perf_evsel__config_callchain(sys_exit, &trace->opts, &callchain_param);
+
 	perf_evlist__add(evlist, sys_enter);
 	perf_evlist__add(evlist, sys_exit);
 
@@ -2317,6 +2350,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		pgfault_maj = perf_evsel__new_pgfault(PERF_COUNT_SW_PAGE_FAULTS_MAJ);
 		if (pgfault_maj == NULL)
 			goto out_error_mem;
+		perf_evsel__config_callchain(pgfault_maj, &trace->opts, &callchain_param);
 		perf_evlist__add(evlist, pgfault_maj);
 	}
 
@@ -2324,6 +2358,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		pgfault_min = perf_evsel__new_pgfault(PERF_COUNT_SW_PAGE_FAULTS_MIN);
 		if (pgfault_min == NULL)
 			goto out_error_mem;
+		perf_evsel__config_callchain(pgfault_min, &trace->opts, &callchain_param);
 		perf_evlist__add(evlist, pgfault_min);
 	}
 
@@ -2344,45 +2379,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 		goto out_delete_evlist;
 	}
 
-	perf_evlist__config(evlist, &trace->opts, NULL);
-
-	if (callchain_param.enabled) {
-		bool use_identifier = false;
-
-		if (trace->syscalls.events.sys_exit) {
-			perf_evsel__config_callchain(trace->syscalls.events.sys_exit,
-						     &trace->opts, &callchain_param);
-			use_identifier = true;
-		}
-
-		if (pgfault_maj) {
-			perf_evsel__config_callchain(pgfault_maj, &trace->opts, &callchain_param);
-			use_identifier = true;
-		}
-
-		if (pgfault_min) {
-			perf_evsel__config_callchain(pgfault_min, &trace->opts, &callchain_param);
-			use_identifier = true;
-		}
-
-		if (use_identifier) {
-		       /*
-			* Now we have evsels with different sample_ids, use
-			* PERF_SAMPLE_IDENTIFIER to map from sample to evsel
-			* from a fixed position in each ring buffer record.
-			*
-			* As of this the changeset introducing this comment, this
-			* isn't strictly needed, as the fields that can come before
-			* PERF_SAMPLE_ID are all used, but we'll probably disable
-			* some of those for things like copying the payload of
-			* pointer syscall arguments, and for vfs_getname we don't
-			* need PERF_SAMPLE_ADDR and PERF_SAMPLE_IP, so do this
-			* here as a warning we need to use PERF_SAMPLE_IDENTIFIER.
-			*/
-			perf_evlist__set_sample_bit(evlist, IDENTIFIER);
-			perf_evlist__reset_sample_bit(evlist, ID);
-		}
-	}
+	perf_evlist__config(evlist, &trace->opts, &callchain_param);
 
 	signal(SIGCHLD, sig_handler);
 	signal(SIGINT, sig_handler);
@@ -2437,7 +2434,7 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	if (err < 0)
 		goto out_error_apply_filters;
 
-	err = perf_evlist__mmap(evlist, trace->opts.mmap_pages, false);
+	err = perf_evlist__mmap(evlist, trace->opts.mmap_pages);
 	if (err < 0)
 		goto out_error_mmap;
 
@@ -2455,6 +2452,18 @@ static int trace__run(struct trace *trace, int argc, const char **argv)
 	trace->multiple_threads = thread_map__pid(evlist->threads, 0) == -1 ||
 				  evlist->threads->nr > 1 ||
 				  perf_evlist__first(evlist)->attr.inherit;
+
+	/*
+	 * Now that we already used evsel->attr to ask the kernel to setup the
+	 * events, lets reuse evsel->attr.sample_max_stack as the limit in
+	 * trace__resolve_callchain(), allowing per-event max-stack settings
+	 * to override an explicitely set --max-stack global setting.
+	 */
+	evlist__for_each_entry(evlist, evsel) {
+		if ((evsel->attr.sample_type & PERF_SAMPLE_CALLCHAIN) &&
+		    evsel->attr.sample_max_stack == 0)
+			evsel->attr.sample_max_stack = trace->max_stack;
+	}
 again:
 	before = trace->nr_events;
 
@@ -3046,6 +3055,8 @@ int cmd_trace(int argc, const char **argv)
 		     "Set the maximum stack depth when parsing the callchain, "
 		     "anything beyond the specified depth will be ignored. "
 		     "Default: kernel.perf_event_max_stack or " __stringify(PERF_MAX_STACK_DEPTH)),
+	OPT_BOOLEAN(0, "print-sample", &trace.print_sample,
+			"print the PERF_RECORD_SAMPLE PERF_SAMPLE_ info, for debugging"),
 	OPT_UINTEGER(0, "proc-map-timeout", &trace.opts.proc_map_timeout,
 			"per thread proc mmap processing timeout in ms"),
 	OPT_UINTEGER('D', "delay", &trace.opts.initial_delay,
@@ -3097,8 +3108,9 @@ int cmd_trace(int argc, const char **argv)
 	}
 
 #ifdef HAVE_DWARF_UNWIND_SUPPORT
-	if ((trace.min_stack || max_stack_user_set) && !callchain_param.enabled && trace.trace_syscalls)
+	if ((trace.min_stack || max_stack_user_set) && !callchain_param.enabled) {
 		record_opts__parse_callchain(&trace.opts, &callchain_param, "dwarf", false);
+	}
 #endif
 
 	if (callchain_param.enabled) {
diff --git a/tools/perf/check-headers.sh b/tools/perf/check-headers.sh
index 3e64f10..51abdb0 100755
--- a/tools/perf/check-headers.sh
+++ b/tools/perf/check-headers.sh
@@ -33,21 +33,30 @@
 arch/s390/include/uapi/asm/kvm_perf.h
 arch/s390/include/uapi/asm/ptrace.h
 arch/s390/include/uapi/asm/sie.h
+arch/s390/include/uapi/asm/unistd.h
 arch/arm/include/uapi/asm/kvm.h
 arch/arm64/include/uapi/asm/kvm.h
+arch/alpha/include/uapi/asm/errno.h
+arch/mips/include/asm/errno.h
+arch/mips/include/uapi/asm/errno.h
+arch/parisc/include/uapi/asm/errno.h
+arch/powerpc/include/uapi/asm/errno.h
+arch/sparc/include/uapi/asm/errno.h
+arch/x86/include/uapi/asm/errno.h
 include/asm-generic/bitops/arch_hweight.h
 include/asm-generic/bitops/const_hweight.h
 include/asm-generic/bitops/__fls.h
 include/asm-generic/bitops/fls.h
 include/asm-generic/bitops/fls64.h
 include/linux/coresight-pmu.h
+include/uapi/asm-generic/errno.h
+include/uapi/asm-generic/errno-base.h
 include/uapi/asm-generic/ioctls.h
 include/uapi/asm-generic/mman-common.h
 '
 
 check () {
   file=$1
-  opts="--ignore-blank-lines --ignore-space-change"
 
   shift
   while [ -n "$*" ]; do
diff --git a/tools/perf/perf-completion.sh b/tools/perf/perf-completion.sh
index 345f5d6..fdf75d4 100644
--- a/tools/perf/perf-completion.sh
+++ b/tools/perf/perf-completion.sh
@@ -162,8 +162,37 @@
 	# List possible events for -e option
 	elif [[ $prev == @("-e"|"--event") &&
 		$prev_skip_opts == @(record|stat|top) ]]; then
-		evts=$($cmd list --raw-dump)
-		__perfcomp_colon "$evts" "$cur"
+
+		local cur1=${COMP_WORDS[COMP_CWORD]}
+		local raw_evts=$($cmd list --raw-dump)
+		local arr s tmp result
+
+		if [[ "$cur1" == */* && ${cur1#*/} =~ ^[A-Z] ]]; then
+			OLD_IFS="$IFS"
+			IFS=" "
+			arr=($raw_evts)
+			IFS="$OLD_IFS"
+
+			for s in ${arr[@]}
+			do
+				if [[ "$s" == *cpu/* ]]; then
+					tmp=${s#*cpu/}
+					result=$result" ""cpu/"${tmp^^}
+				else
+					result=$result" "$s
+				fi
+			done
+
+			evts=${result}" "$(ls /sys/bus/event_source/devices/cpu/events)
+		else
+			evts=${raw_evts}" "$(ls /sys/bus/event_source/devices/cpu/events)
+		fi
+
+		if [[ "$cur1" == , ]]; then
+			__perfcomp_colon "$evts" ""
+		else
+			__perfcomp_colon "$evts" "$cur1"
+		fi
 	else
 		# List subcommands for perf commands
 		if [[ $prev_skip_opts == @(kvm|kmem|mem|lock|sched|
@@ -246,11 +275,21 @@
 type perf &>/dev/null &&
 _perf()
 {
+	if [[ "$COMP_WORDBREAKS" != *,* ]]; then
+		COMP_WORDBREAKS="${COMP_WORDBREAKS},"
+		export COMP_WORDBREAKS
+	fi
+
+	if [[ "$COMP_WORDBREAKS" == *:* ]]; then
+		COMP_WORDBREAKS="${COMP_WORDBREAKS/:/}"
+		export COMP_WORDBREAKS
+	fi
+
 	local cur words cword prev
 	if [ $preload_get_comp_words_by_ref = "true" ]; then
-		_get_comp_words_by_ref -n =: cur words cword prev
+		_get_comp_words_by_ref -n =:, cur words cword prev
 	else
-		__perf_get_comp_words_by_ref -n =: cur words cword prev
+		__perf_get_comp_words_by_ref -n =:, cur words cword prev
 	fi
 	__perf_main
 } &&
diff --git a/tools/perf/perf.c b/tools/perf/perf.c
index 62b1351..1b3fc8e 100644
--- a/tools/perf/perf.c
+++ b/tools/perf/perf.c
@@ -73,7 +73,7 @@ static struct cmd_struct commands[] = {
 	{ "lock",	cmd_lock,	0 },
 	{ "kvm",	cmd_kvm,	0 },
 	{ "test",	cmd_test,	0 },
-#ifdef HAVE_LIBAUDIT_SUPPORT
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
 	{ "trace",	cmd_trace,	0 },
 #endif
 	{ "inject",	cmd_inject,	0 },
@@ -485,7 +485,7 @@ int main(int argc, const char **argv)
 		argv[0] = cmd;
 	}
 	if (strstarts(cmd, "trace")) {
-#ifdef HAVE_LIBAUDIT_SUPPORT
+#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)
 		setup_path();
 		argv[0] = "trace";
 		return cmd_trace(argc, argv);
diff --git a/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json b/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json
new file mode 100644
index 0000000..2db45c4
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/cavium/thunderx2-imp-def.json
@@ -0,0 +1,62 @@
+[
+    {
+        "PublicDescription": "Attributable Level 1 data cache access, read",
+        "EventCode": "0x40",
+        "EventName": "l1d_cache_rd",
+        "BriefDescription": "L1D cache read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache access, write ",
+        "EventCode": "0x41",
+        "EventName": "l1d_cache_wr",
+        "BriefDescription": "L1D cache write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache refill, read",
+        "EventCode": "0x42",
+        "EventName": "l1d_cache_refill_rd",
+        "BriefDescription": "L1D cache refill read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data cache refill, write",
+        "EventCode": "0x43",
+        "EventName": "l1d_cache_refill_wr",
+        "BriefDescription": "L1D refill write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data TLB refill, read",
+        "EventCode": "0x4C",
+        "EventName": "l1d_tlb_refill_rd",
+        "BriefDescription": "L1D tlb refill read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data TLB refill, write",
+        "EventCode": "0x4D",
+        "EventName": "l1d_tlb_refill_wr",
+        "BriefDescription": "L1D tlb refill write",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data or unified TLB access, read",
+        "EventCode": "0x4E",
+        "EventName": "l1d_tlb_rd",
+        "BriefDescription": "L1D tlb read",
+    },
+    {
+        "PublicDescription": "Attributable Level 1 data or unified TLB access, write",
+        "EventCode": "0x4F",
+        "EventName": "l1d_tlb_wr",
+        "BriefDescription": "L1D tlb write",
+    },
+    {
+        "PublicDescription": "Bus access read",
+        "EventCode": "0x60",
+        "EventName": "bus_access_rd",
+        "BriefDescription": "Bus access read",
+   },
+   {
+        "PublicDescription": "Bus access write",
+        "EventCode": "0x61",
+        "EventName": "bus_access_wr",
+        "BriefDescription": "Bus access write",
+   }
+]
diff --git a/tools/perf/pmu-events/arch/arm64/mapfile.csv b/tools/perf/pmu-events/arch/arm64/mapfile.csv
new file mode 100644
index 0000000..219d675
--- /dev/null
+++ b/tools/perf/pmu-events/arch/arm64/mapfile.csv
@@ -0,0 +1,15 @@
+# Format:
+#	MIDR,Version,JSON/file/pathname,Type
+#
+# where
+#	MIDR	Processor version
+#		Variant[23:20] and Revision [3:0] should be zero.
+#	Version could be used to track version of of JSON file
+#		but currently unused.
+#	JSON/file/pathname is the path to JSON file, relative
+#		to tools/perf/pmu-events/arch/arm64/.
+#	Type is core, uncore etc
+#
+#
+#Family-model,Version,Filename,EventType
+0x00000000420f5160,v1,cavium,core
diff --git a/tools/perf/pmu-events/arch/powerpc/mapfile.csv b/tools/perf/pmu-events/arch/powerpc/mapfile.csv
index a0f3a11..229150e 100644
--- a/tools/perf/pmu-events/arch/powerpc/mapfile.csv
+++ b/tools/perf/pmu-events/arch/powerpc/mapfile.csv
@@ -13,13 +13,5 @@
 #
 
 # Power8 entries
-004b0000,1,power8,core
-004b0201,1,power8,core
-004c0000,1,power8,core
-004d0000,1,power8,core
-004d0100,1,power8,core
-004d0200,1,power8,core
-004c0100,1,power8,core
-004e0100,1,power9,core
-004e0200,1,power9,core
-004e1200,1,power9,core
+004[bcd][[:xdigit:]]{4},1,power8,core
+004e[[:xdigit:]]{4},1,power9,core
diff --git a/tools/perf/pmu-events/arch/powerpc/power9/cache.json b/tools/perf/pmu-events/arch/powerpc/power9/cache.json
index 18f6645f..7945c51 100644
--- a/tools/perf/pmu-events/arch/powerpc/power9/cache.json
+++ b/tools/perf/pmu-events/arch/powerpc/power9/cache.json
@@ -125,11 +125,6 @@
     "BriefDescription": "Finish stall because the NTF instruction was a larx waiting to be satisfied"
   },
   {,
-    "EventCode": "0x3006C",
-    "EventName": "PM_RUN_CYC_SMT2_MODE",
-    "BriefDescription": "Cycles in which this thread's run latch is set and the core is in SMT2 mode"
-  },
-  {,
     "EventCode": "0x1C058",
     "EventName": "PM_DTLB_MISS_16G",
     "BriefDescription": "Data TLB Miss page size 16G"
diff --git a/tools/perf/pmu-events/arch/powerpc/power9/frontend.json b/tools/perf/pmu-events/arch/powerpc/power9/frontend.json
index c63a919..bd8361b 100644
--- a/tools/perf/pmu-events/arch/powerpc/power9/frontend.json
+++ b/tools/perf/pmu-events/arch/powerpc/power9/frontend.json
@@ -1,10 +1,5 @@
 [
   {,
-    "EventCode": "0x3E15C",
-    "EventName": "PM_MRK_L2_TM_ST_ABORT_SISTER",
-    "BriefDescription": "TM marked store abort for this thread"
-  },
-  {,
     "EventCode": "0x25044",
     "EventName": "PM_IPTEG_FROM_L31_MOD",
     "BriefDescription": "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a instruction side request"
@@ -369,4 +364,4 @@
     "EventName": "PM_IPTEG_FROM_L31_ECO_MOD",
     "BriefDescription": "A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a instruction side request"
   }
-]
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/powerpc/power9/marked.json b/tools/perf/pmu-events/arch/powerpc/power9/marked.json
index b9df54f..22f9f32 100644
--- a/tools/perf/pmu-events/arch/powerpc/power9/marked.json
+++ b/tools/perf/pmu-events/arch/powerpc/power9/marked.json
@@ -1,10 +1,5 @@
 [
   {,
-    "EventCode": "0x3C052",
-    "EventName": "PM_DATA_SYS_PUMP_MPRED",
-    "BriefDescription": "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for a demand load"
-  },
-  {,
     "EventCode": "0x3013E",
     "EventName": "PM_MRK_STALL_CMPLU_CYC",
     "BriefDescription": "Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC)"
@@ -255,6 +250,11 @@
     "BriefDescription": "A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache"
   },
   {,
+    "EventCode": "0x3C052",
+    "EventName": "PM_DATA_SYS_PUMP_MPRED",
+    "BriefDescription": "Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for a demand load"
+  },
+  {,
     "EventCode": "0x4D142",
     "EventName": "PM_MRK_DATA_FROM_L3",
     "BriefDescription": "The processor's data cache was reloaded from local core's L3 due to a marked load"
@@ -435,21 +435,6 @@
     "BriefDescription": "ITLB Reloaded. Counts 1 per ITLB miss for HPT but multiple for radix depending on number of levels traveresed"
   },
   {,
-    "EventCode": "0x2D024",
-    "EventName": "PM_RADIX_PWC_L2_HIT",
-    "BriefDescription": "A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache."
-  },
-  {,
-    "EventCode": "0x3F056",
-    "EventName": "PM_RADIX_PWC_L3_HIT",
-    "BriefDescription": "A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache."
-  },
-  {,
-    "EventCode": "0x4E014",
-    "EventName": "PM_TM_TX_PASS_RUN_INST",
-    "BriefDescription": "Run instructions spent in successful transactions"
-  },
-  {,
     "EventCode": "0x1E044",
     "EventName": "PM_DPTEG_FROM_L3_NO_CONFLICT",
     "BriefDescription": "A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included"
@@ -644,4 +629,4 @@
     "EventName": "PM_MRK_BR_MPRED_CMPL",
     "BriefDescription": "Marked Branch Mispredicted"
   }
-]
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/powerpc/power9/other.json b/tools/perf/pmu-events/arch/powerpc/power9/other.json
index 54cc3be..5ce3129 100644
--- a/tools/perf/pmu-events/arch/powerpc/power9/other.json
+++ b/tools/perf/pmu-events/arch/powerpc/power9/other.json
@@ -80,6 +80,11 @@
     "BriefDescription": "A radix translation attempt missed in the TLB and all levels of page walk cache."
   },
   {,
+    "EventCode": "0x26882",
+    "EventName": "PM_L2_DC_INV",
+    "BriefDescription": "D-cache invalidates sent over the reload bus to the core"
+  },
+  {,
     "EventCode": "0x24048",
     "EventName": "PM_INST_FROM_LMEM",
     "BriefDescription": "The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch)"
@@ -95,11 +100,6 @@
     "BriefDescription": "Number of TM transactions that passed"
   },
   {,
-    "EventCode": "0xD1A0",
-    "EventName": "PM_MRK_LSU_FLUSH_LHS",
-    "BriefDescription": "Effective Address alias flush : no EA match but Real Address match.  If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed"
-  },
-  {,
     "EventCode": "0xF088",
     "EventName": "PM_LSU0_STORE_REJECT",
     "BriefDescription": "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met"
@@ -127,7 +127,7 @@
   {,
     "EventCode": "0xD08C",
     "EventName": "PM_LSU2_LDMX_FIN",
-    "BriefDescription": "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491):  The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region. This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56])"
+    "BriefDescription": "New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491):  The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region.  This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56])."
   },
   {,
     "EventCode": "0x300F8",
@@ -205,11 +205,6 @@
     "BriefDescription": "Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load"
   },
   {,
-    "EventCode": "0xF0B4",
-    "EventName": "PM_DC_PREF_CONS_ALLOC",
-    "BriefDescription": "Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch"
-  },
-  {,
     "EventCode": "0xF894",
     "EventName": "PM_LSU3_L1_CAM_CANCEL",
     "BriefDescription": "ls3 l1 tm cam cancel"
@@ -220,21 +215,11 @@
     "BriefDescription": "Dispatch Flush: TLBIE"
   },
   {,
-    "EventCode": "0xD1A4",
-    "EventName": "PM_MRK_LSU_FLUSH_SAO",
-    "BriefDescription": "A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush"
-  },
-  {,
     "EventCode": "0x4E11E",
     "EventName": "PM_MRK_DATA_FROM_DMEM_CYC",
     "BriefDescription": "Duration in cycles to reload from another chip's memory on the same Node or Group (Distant) due to a marked load"
   },
   {,
-    "EventCode": "0x5894",
-    "EventName": "PM_LWSYNC",
-    "BriefDescription": "Lwsync instruction decoded and transferred"
-  },
-  {,
     "EventCode": "0x14156",
     "EventName": "PM_MRK_DATA_FROM_L2_CYC",
     "BriefDescription": "Duration in cycles to reload from local core's L2 due to a marked load"
@@ -245,11 +230,6 @@
     "BriefDescription": "Read clearing SC"
   },
   {,
-    "EventCode": "0x50A0",
-    "EventName": "PM_HWSYNC",
-    "BriefDescription": "Hwsync instruction decoded and transferred"
-  },
-  {,
     "EventCode": "0x168B0",
     "EventName": "PM_L3_P1_NODE_PUMP",
     "BriefDescription": "L3 PF sent with nodal scope port 1, counts even retried requests"
@@ -265,6 +245,11 @@
     "BriefDescription": "The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load"
   },
   {,
+    "EventCode": "0x468AE",
+    "EventName": "PM_L3_P3_CO_RTY",
+    "BriefDescription": "L3 CO received retry port 3 (memory only), every retry counted"
+  },
+  {,
     "EventCode": "0x460A8",
     "EventName": "PM_SN_HIT",
     "BriefDescription": "Any port snooper hit L3.  Up to 4 can happen in a cycle but we only count 1"
@@ -280,11 +265,6 @@
     "BriefDescription": "Prefetch stream allocated by the hardware prefetch mechanism"
   },
   {,
-    "EventCode": "0xF0BC",
-    "EventName": "PM_LS2_UNALIGNED_ST",
-    "BriefDescription": "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.  If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty"
-  },
-  {,
     "EventCode": "0xD0AC",
     "EventName": "PM_SRQ_SYNC_CYC",
     "BriefDescription": "A sync is in the S2Q (edge detect to count)"
@@ -380,26 +360,11 @@
     "BriefDescription": "Cycles in which this thread's run latch is set and the core is in SMT4 mode"
   },
   {,
-    "EventCode": "0x5088",
-    "EventName": "PM_DECODE_FUSION_OP_PRESERV",
-    "BriefDescription": "Destructive op operand preservation"
-  },
-  {,
     "EventCode": "0x1D14E",
     "EventName": "PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC",
     "BriefDescription": "Duration in cycles to reload either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load"
   },
   {,
-    "EventCode": "0x509C",
-    "EventName": "PM_FORCED_NOP",
-    "BriefDescription": "Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time"
-  },
-  {,
-    "EventCode": "0xC098",
-    "EventName": "PM_LS2_UNALIGNED_LD",
-    "BriefDescription": "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.  If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty"
-  },
-  {,
     "EventCode": "0x20058",
     "EventName": "PM_DARQ1_10_12_ENTRIES",
     "BriefDescription": "Cycles in which 10 or  more DARQ1 entries (out of 12) are in use"
@@ -435,11 +400,6 @@
     "BriefDescription": "All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met"
   },
   {,
-    "EventCode": "0x4505E",
-    "EventName": "PM_FLOP_CMPL",
-    "BriefDescription": "Floating Point Operation Finished"
-  },
-  {,
     "EventCode": "0x1D144",
     "EventName": "PM_MRK_DATA_FROM_L3_DISP_CONFLICT",
     "BriefDescription": "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load"
@@ -480,14 +440,9 @@
     "BriefDescription": "XL-form branch was mispredicted due to the predicted target address missing from EAT.  The EAT forces a mispredict in this case since there is no predicated target to validate.  This is a rare case that may occur when the EAT is full and a branch is issued"
   },
   {,
-    "EventCode": "0xC094",
-    "EventName": "PM_LS0_UNALIGNED_LD",
-    "BriefDescription": "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.  If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty"
-  },
-  {,
-    "EventCode": "0xF8BC",
-    "EventName": "PM_LS3_UNALIGNED_ST",
-    "BriefDescription": "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.  If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty"
+    "EventCode": "0x460AE",
+    "EventName": "PM_L3_P2_CO_RTY",
+    "BriefDescription": "L3 CO received retry port 2 (memory only), every retry counted"
   },
   {,
     "EventCode": "0x58B0",
@@ -505,11 +460,6 @@
     "BriefDescription": "TM Store (fav or non-fav) ran into conflict (failed)"
   },
   {,
-    "EventCode": "0xD998",
-    "EventName": "PM_MRK_LSU_FLUSH_EMSH",
-    "BriefDescription": "An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address"
-  },
-  {,
     "EventCode": "0xF8A0",
     "EventName": "PM_NON_DATA_STORE",
     "BriefDescription": "All ops that drain from s2q to L2 and contain no data"
@@ -525,11 +475,6 @@
     "BriefDescription": "Unconditional Branch Completed. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was covenrted to a Resolve."
   },
   {,
-    "EventCode": "0x1F056",
-    "EventName": "PM_RADIX_PWC_L1_HIT",
-    "BriefDescription": "A radix translation attempt missed in the TLB and only the first level page walk cache was a hit."
-  },
-  {,
     "EventCode": "0xF8A8",
     "EventName": "PM_DC_PREF_FUZZY_CONF",
     "BriefDescription": "A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software.Fuzzy stream confirm (out of order effects, or pf cant keep up)"
@@ -545,6 +490,11 @@
     "BriefDescription": "Load tm L1 miss"
   },
   {,
+    "EventCode": "0xC880",
+    "EventName": "PM_LS1_LD_VECTOR_FIN",
+    "BriefDescription": ""
+  },
+  {,
     "EventCode": "0x2894",
     "EventName": "PM_TM_OUTER_TEND",
     "BriefDescription": "Completion time outer tend"
@@ -565,21 +515,11 @@
     "BriefDescription": "Marked derat reload (miss) for any page size"
   },
   {,
-    "EventCode": "0x160A0",
-    "EventName": "PM_L3_PF_MISS_L3",
-    "BriefDescription": "L3 PF missed in L3"
-  },
-  {,
     "EventCode": "0x1C04A",
     "EventName": "PM_DATA_FROM_RL2L3_SHR",
     "BriefDescription": "The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote), as this chip due to a demand load"
   },
   {,
-    "EventCode": "0xD99C",
-    "EventName": "PM_MRK_LSU_FLUSH_UE",
-    "BriefDescription": "Correctable ECC error on reload data, reported at critical data forward time"
-  },
-  {,
     "EventCode": "0x268B0",
     "EventName": "PM_L3_P1_GRP_PUMP",
     "BriefDescription": "L3 PF sent with grp scope port 1, counts even retried requests"
@@ -630,11 +570,6 @@
     "BriefDescription": "addrs only req to L2 only on the first one,Indication that Load footprint is not expanding"
   },
   {,
-    "EventCode": "0x5884",
-    "EventName": "PM_DECODE_LANES_NOT_AVAIL",
-    "BriefDescription": "Decode has something to transmit but dispatch lanes are not available"
-  },
-  {,
     "EventCode": "0x3C042",
     "EventName": "PM_DATA_FROM_L3_DISP_CONFLICT",
     "BriefDescription": "The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load"
@@ -690,9 +625,9 @@
     "BriefDescription": "False LHS match detected"
   },
   {,
-    "EventCode": "0xD9A4",
-    "EventName": "PM_MRK_LSU_FLUSH_LARX_STCX",
-    "BriefDescription": "A larx is flushed because an older larx has an LMQ reservation for the same thread.  A stcx is flushed because an older stcx is in the LMQ.  The flush happens when the older larx/stcx relaunches"
+    "EventCode": "0xF0B0",
+    "EventName": "PM_L3_LD_PREF",
+    "BriefDescription": "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest"
   },
   {,
     "EventCode": "0x4D012",
@@ -715,9 +650,9 @@
     "BriefDescription": "All successful Ld/St dispatches for this thread that were an L2 miss (excludes i_l2mru_tch_reqs)"
   },
   {,
-    "EventCode": "0xF8B8",
-    "EventName": "PM_LS1_UNALIGNED_ST",
-    "BriefDescription": "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.  If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty"
+    "EventCode": "0x160A0",
+    "EventName": "PM_L3_PF_MISS_L3",
+    "BriefDescription": "L3 PF missed in L3"
   },
   {,
     "EventCode": "0x408C",
@@ -765,11 +700,6 @@
     "BriefDescription": "Completion time nested tend"
   },
   {,
-    "EventCode": "0x36084",
-    "EventName": "PM_L2_RCST_DISP",
-    "BriefDescription": "All D-side store dispatch attempts for this thread"
-  },
-  {,
     "EventCode": "0x368A0",
     "EventName": "PM_L3_PF_OFF_CHIP_CACHE",
     "BriefDescription": "L3 PF from Off chip cache"
@@ -830,11 +760,6 @@
     "BriefDescription": "Rotating sample of 16 snoop valids"
   },
   {,
-    "EventCode": "0x16084",
-    "EventName": "PM_L2_RCLD_DISP",
-    "BriefDescription": "All I-or-D side load dispatch attempts for this thread (excludes i_l2mru_tch_reqs)"
-  },
-  {,
     "EventCode": "0x1608C",
     "EventName": "PM_RC0_BUSY",
     "BriefDescription": "RC mach 0 Busy. Used by PMU to sample ave RC lifetime (mach0 used as sample point)"
@@ -842,7 +767,7 @@
   {,
     "EventCode": "0x36082",
     "EventName": "PM_L2_LD_DISP",
-    "BriefDescription": "All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs)."
+    "BriefDescription": "All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs)"
   },
   {,
     "EventCode": "0xF8B0",
@@ -905,11 +830,6 @@
     "BriefDescription": "Instruction prefetch requests"
   },
   {,
-    "EventCode": "0xC898",
-    "EventName": "PM_LS3_UNALIGNED_LD",
-    "BriefDescription": "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.  If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty"
-  },
-  {,
     "EventCode": "0x488C",
     "EventName": "PM_IC_PREF_WRITE",
     "BriefDescription": "Instruction prefetch written into IL1"
@@ -1017,7 +937,7 @@
   {,
     "EventCode": "0x3E05E",
     "EventName": "PM_L3_CO_MEPF",
-    "BriefDescription": "L3 castouts in Mepf state for this thread"
+    "BriefDescription": "L3 CO of line in Mep state (includes casthrough to memory).  The Mepf state indicates that a line was brought in to satisfy an L3 prefetch request"
   },
   {,
     "EventCode": "0x460A2",
@@ -1205,11 +1125,6 @@
     "BriefDescription": "Non transactional conflict from LSU, gets reported to TEXASR"
   },
   {,
-    "EventCode": "0xD198",
-    "EventName": "PM_MRK_LSU_FLUSH_ATOMIC",
-    "BriefDescription": "Quad-word loads (lq) are considered atomic because they always span at least 2 slices.  If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed"
-  },
-  {,
     "EventCode": "0x201E0",
     "EventName": "PM_MRK_DATA_FROM_MEMORY",
     "BriefDescription": "The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load"
@@ -1295,11 +1210,6 @@
     "BriefDescription": "Ict empty for this thread due to dispatch holds because the History Buffer was full. Could be GPR/VSR/VMR/FPR/CR/XVF; CR; XVF (XER/VSCR/FPSCR)"
   },
   {,
-    "EventCode": "0xC894",
-    "EventName": "PM_LS1_UNALIGNED_LD",
-    "BriefDescription": "Load instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the load of that size.  If the load wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty"
-  },
-  {,
     "EventCode": "0x360A2",
     "EventName": "PM_L3_L2_CO_HIT",
     "BriefDescription": "L2 CO hits"
@@ -1325,11 +1235,6 @@
     "BriefDescription": "L2 Castouts - Shared (Tx,Sx)"
   },
   {,
-    "EventCode": "0xD884",
-    "EventName": "PM_LSU3_SET_MPRED",
-    "BriefDescription": "Set prediction(set-p) miss.  The entry was not found in the Set prediction table"
-  },
-  {,
     "EventCode": "0x26092",
     "EventName": "PM_L2_LD_MISS_64B",
     "BriefDescription": "All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B(i.e., M=1)"
@@ -1362,12 +1267,12 @@
   {,
     "EventCode": "0xD8A8",
     "EventName": "PM_ISLB_MISS",
-    "BriefDescription": "Instruction SLB miss - Total of all segment sizes"
+    "BriefDescription": "Instruction SLB Miss - Total of all segment sizes"
   },
   {,
-    "EventCode": "0xD19C",
-    "EventName": "PM_MRK_LSU_FLUSH_RELAUNCH_MISS",
-    "BriefDescription": "If a load that has already returned data and has to relaunch for any reason then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent"
+    "EventCode": "0x368AE",
+    "EventName": "PM_L3_P1_CO_RTY",
+    "BriefDescription": "L3 CO received retry port 1 (memory only), every retry counted"
   },
   {,
     "EventCode": "0x260A2",
@@ -1385,6 +1290,11 @@
     "BriefDescription": "Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin. This is a short delay, and it includes ROT"
   },
   {,
+    "EventCode": "0xC084",
+    "EventName": "PM_LS2_LD_VECTOR_FIN",
+    "BriefDescription": ""
+  },
+  {,
     "EventCode": "0x1608E",
     "EventName": "PM_ST_CAUSED_FAIL",
     "BriefDescription": "Non-TM Store caused any thread to fail"
@@ -1410,11 +1320,6 @@
     "BriefDescription": "Continuous 16 cycle (2to1) window where this signals rotates thru sampling each CO machine busy. PMU uses this wave to then do 16 cyc count to sample total number of machs running"
   },
   {,
-    "EventCode": "0xD084",
-    "EventName": "PM_LSU2_SET_MPRED",
-    "BriefDescription": "Set prediction(set-p) miss.  The entry was not found in the Set prediction table"
-  },
-  {,
     "EventCode": "0x48B8",
     "EventName": "PM_BR_MPRED_TAKEN_TA",
     "BriefDescription": "Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack.  Only XL-form branches that resolved Taken set this event."
@@ -1450,29 +1355,24 @@
     "BriefDescription": "A demand load referenced a line in an active strided prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software."
   },
   {,
+    "EventCode": "0x36084",
+    "EventName": "PM_L2_RCST_DISP",
+    "BriefDescription": "All D-side store dispatch attempts for this thread"
+  },
+  {,
     "EventCode": "0x45054",
     "EventName": "PM_FMA_CMPL",
     "BriefDescription": "two flops operation completed (fmadd, fnmadd, fmsub, fnmsub) Scalar instructions only. "
   },
   {,
-    "EventCode": "0x5090",
-    "EventName": "PM_SHL_ST_DISABLE",
-    "BriefDescription": "Store-Hit-Load Table Read Hit with entry Disabled (entry was disabled due to the entry shown to not prevent the flush)"
-  },
-  {,
     "EventCode": "0x201E8",
     "EventName": "PM_THRESH_EXC_512",
     "BriefDescription": "Threshold counter exceeded a value of 512"
   },
   {,
-    "EventCode": "0x5084",
-    "EventName": "PM_DECODE_FUSION_EXT_ADD",
-    "BriefDescription": "32-bit extended addition"
-  },
-  {,
     "EventCode": "0x36080",
     "EventName": "PM_L2_INST",
-    "BriefDescription": "All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs)."
+    "BriefDescription": "All successful I-side dispatches for this thread   (excludes i_l2mru_tch reqs)"
   },
   {,
     "EventCode": "0x3504C",
@@ -1555,21 +1455,11 @@
     "BriefDescription": "Memory Read With Intent to Modify for this thread"
   },
   {,
-    "EventCode": "0x26882",
-    "EventName": "PM_L2_DC_INV",
-    "BriefDescription": "D-cache invalidates sent over the reload bus to the core"
-  },
-  {,
     "EventCode": "0xC090",
     "EventName": "PM_LSU_STCX",
     "BriefDescription": "STCX sent to nest, i.e. total"
   },
   {,
-    "EventCode": "0xD080",
-    "EventName": "PM_LSU0_SET_MPRED",
-    "BriefDescription": "Set prediction(set-p) miss.  The entry was not found in the Set prediction table"
-  },
-  {,
     "EventCode": "0x2C120",
     "EventName": "PM_MRK_DATA_FROM_L2_NO_CONFLICT",
     "BriefDescription": "The processor's data cache was reloaded from local core's L2 without conflict due to a marked load"
@@ -1610,11 +1500,6 @@
     "BriefDescription": "A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a instruction side request"
   },
   {,
-    "EventCode": "0xD9A0",
-    "EventName": "PM_MRK_LSU_FLUSH_LHL_SHL",
-    "BriefDescription": "The instruction was flushed because of a sequential load/store consistency.  If a load or store hits on an older load that has either been snooped (for loads) or has stale data (for stores)."
-  },
-  {,
     "EventCode": "0x35042",
     "EventName": "PM_IPTEG_FROM_L3_DISP_CONFLICT",
     "BriefDescription": "A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a instruction side request"
@@ -1692,7 +1577,7 @@
   {,
     "EventCode": "0x2001A",
     "EventName": "PM_NTC_ALL_FIN",
-    "BriefDescription": "Cycles after all instructions have finished to group completed"
+    "BriefDescription": "Cycles after instruction finished to instruction completed."
   },
   {,
     "EventCode": "0x3005A",
@@ -1710,6 +1595,11 @@
     "BriefDescription": "ls1 l1 tm cam cancel"
   },
   {,
+    "EventCode": "0x268AE",
+    "EventName": "PM_L3_P3_PF_RTY",
+    "BriefDescription": "L3 PF received retry port 3, every retry counted"
+  },
+  {,
     "EventCode": "0xE884",
     "EventName": "PM_LS1_ERAT_MISS_PREF",
     "BriefDescription": "LS1 Erat miss due to prefetch"
@@ -1742,7 +1632,7 @@
   {,
     "EventCode": "0x160B6",
     "EventName": "PM_L3_WI0_BUSY",
-    "BriefDescription": "Rotating sample of 8 WI valid"
+    "BriefDescription": "Rotating sample of 8 WI valid (duplicate)"
   },
   {,
     "EventCode": "0x368AC",
@@ -1790,9 +1680,9 @@
     "BriefDescription": "L2 guess system (VGS or RNS) and guess was correct (ie data beyond-group)"
   },
   {,
-    "EventCode": "0x589C",
-    "EventName": "PM_PTESYNC",
-    "BriefDescription": "ptesync instruction counted when the instruction is decoded and transmitted"
+    "EventCode": "0x260AE",
+    "EventName": "PM_L3_P2_PF_RTY",
+    "BriefDescription": "L3 PF received retry port 2, every retry counted"
   },
   {,
     "EventCode": "0x26086",
@@ -1825,6 +1715,11 @@
     "BriefDescription": "Store-Hit-Load Table Read Hit with entry Enabled"
   },
   {,
+    "EventCode": "0x46882",
+    "EventName": "PM_L2_ST_HIT",
+    "BriefDescription": "All successful D-side store dispatches for this thread that were L2 hits"
+  },
+  {,
     "EventCode": "0x360AC",
     "EventName": "PM_L3_SN0_BUSY",
     "BriefDescription": "Lifetime, sample of snooper machine 0 valid"
@@ -1845,11 +1740,6 @@
     "BriefDescription": "All successful D-Side Store dispatches that were an L2 miss for this thread"
   },
   {,
-    "EventCode": "0xF8B4",
-    "EventName": "PM_DC_PREF_XCONS_ALLOC",
-    "BriefDescription": "Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch"
-  },
-  {,
     "EventCode": "0x35048",
     "EventName": "PM_IPTEG_FROM_DL2L3_SHR",
     "BriefDescription": "A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant), as this chip due to a instruction side request"
@@ -1970,11 +1860,6 @@
     "BriefDescription": "Cycles thread running at priority level 2 or 3"
   },
   {,
-    "EventCode": "0x10134",
-    "EventName": "PM_MRK_ST_DONE_L2",
-    "BriefDescription": "marked store completed in L2 ( RC machine done)"
-  },
-  {,
     "EventCode": "0x368B2",
     "EventName": "PM_L3_GRP_GUESS_WRONG_HIGH",
     "BriefDescription": "Initial scope=group (GS or NNS) but data from local node. Prediction too high"
@@ -2005,11 +1890,6 @@
     "BriefDescription": "L2 guess grp (GS or NNS) and guess was not correct (ie data on-chip OR beyond-group)"
   },
   {,
-    "EventCode": "0x368AE",
-    "EventName": "PM_L3_P1_CO_RTY",
-    "BriefDescription": "L3 CO received retry port 1 (memory only), every retry counted"
-  },
-  {,
     "EventCode": "0xC0AC",
     "EventName": "PM_LSU_FLUSH_EMSH",
     "BriefDescription": "An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address"
@@ -2035,11 +1915,6 @@
     "BriefDescription": "RC requests that were on group (aka nodel) pump attempts"
   },
   {,
-    "EventCode": "0xF0B0",
-    "EventName": "PM_L3_LD_PREF",
-    "BriefDescription": "L3 load prefetch, sourced from a hardware or software stream, was sent to the nest"
-  },
-  {,
     "EventCode": "0x16080",
     "EventName": "PM_L2_LD",
     "BriefDescription": "All successful D-side Load dispatches for this thread (L2 miss + L2 hits)"
@@ -2050,6 +1925,11 @@
     "BriefDescription": "Math flop instruction completed"
   },
   {,
+    "EventCode": "0xC080",
+    "EventName": "PM_LS0_LD_VECTOR_FIN",
+    "BriefDescription": ""
+  },
+  {,
     "EventCode": "0x368B0",
     "EventName": "PM_L3_P1_SYS_PUMP",
     "BriefDescription": "L3 PF sent with sys scope port 1, counts even retried requests"
@@ -2120,11 +2000,6 @@
     "BriefDescription": "Conditional Branch Completed in which the HW correctly predicted the direction as taken.  Counted at completion time"
   },
   {,
-    "EventCode": "0xF0B8",
-    "EventName": "PM_LS0_UNALIGNED_ST",
-    "BriefDescription": "Store instructions whose data crosses a double-word boundary, which causes it to require an additional slice than than what normally would be required of the Store of that size.  If the Store wraps from slice 3 to slice 0, thee is an additional 3-cycle penalty"
-  },
-  {,
     "EventCode": "0x20132",
     "EventName": "PM_MRK_DFU_FIN",
     "BriefDescription": "Decimal Unit marked Instruction Finish"
@@ -2140,6 +2015,11 @@
     "BriefDescription": "Effective Address alias flush : no EA match but Real Address match.  If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed"
   },
   {,
+    "EventCode": "0x16084",
+    "EventName": "PM_L2_RCLD_DISP",
+    "BriefDescription": "All I-or-D side load dispatch attempts for this thread (excludes i_l2mru_tch_reqs)"
+  },
+  {,
     "EventCode": "0x3F150",
     "EventName": "PM_MRK_ST_DRAIN_TO_L2DISP_CYC",
     "BriefDescription": "cycles to drain st from core to L2"
@@ -2225,11 +2105,6 @@
     "BriefDescription": "Prefetch Canceled due to page boundary"
   },
   {,
-    "EventCode": "0xF09C",
-    "EventName": "PM_SLB_TABLEWALK_CYC",
-    "BriefDescription": "Cycles when a tablewalk is pending on this thread on the SLB table"
-  },
-  {,
     "EventCode": "0x460AA",
     "EventName": "PM_L3_P0_CO_L31",
     "BriefDescription": "L3 CO to L3.1 (LCO) port 0 with or without data"
@@ -2247,10 +2122,10 @@
   {,
     "EventCode": "0x46082",
     "EventName": "PM_L2_ST_DISP",
-    "BriefDescription": "All successful D-side store dispatches for this thread "
+    "BriefDescription": "All successful D-side store dispatches for this thread (L2 miss + L2 hits)"
   },
   {,
-    "EventCode": "0x4609E",
+    "EventCode": "0x36880",
     "EventName": "PM_L2_INST_MISS",
     "BriefDescription": "All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs)"
   },
@@ -2340,9 +2215,9 @@
     "BriefDescription": "All ISU rejects"
   },
   {,
-    "EventCode": "0x46882",
-    "EventName": "PM_L2_ST_HIT",
-    "BriefDescription": "All successful D-side store dispatches for this thread that were L2 hits"
+    "EventCode": "0xC884",
+    "EventName": "PM_LS3_LD_VECTOR_FIN",
+    "BriefDescription": ""
   },
   {,
     "EventCode": "0x360A8",
@@ -2360,11 +2235,6 @@
     "BriefDescription": "Asserts when a i=1 store op is sent to the nest. No record of issue pipe (LS0/LS1) is maintained so this is for both pipes. Probably don't need separate LS0 and LS1"
   },
   {,
-    "EventCode": "0xD880",
-    "EventName": "PM_LSU1_SET_MPRED",
-    "BriefDescription": "Set prediction(set-p) miss.  The entry was not found in the Set prediction table"
-  },
-  {,
     "EventCode": "0xD0B8",
     "EventName": "PM_LSU_LMQ_FULL_CYC",
     "BriefDescription": "Counts the number of cycles the LMQ is full"
@@ -2389,4 +2259,4 @@
     "EventName": "PM_L3_PF_USAGE",
     "BriefDescription": "Rotating sample of 32 PF actives"
   }
-]
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/powerpc/power9/pipeline.json b/tools/perf/pmu-events/arch/powerpc/power9/pipeline.json
index bc2db63..5af1abb 100644
--- a/tools/perf/pmu-events/arch/powerpc/power9/pipeline.json
+++ b/tools/perf/pmu-events/arch/powerpc/power9/pipeline.json
@@ -125,6 +125,11 @@
     "BriefDescription": "Overflow from counter 5"
   },
   {,
+    "EventCode": "0x4505E",
+    "EventName": "PM_FLOP_CMPL",
+    "BriefDescription": "Floating Point Operation Finished"
+  },
+  {,
     "EventCode": "0x2C018",
     "EventName": "PM_CMPLU_STALL_DMISS_L21_L31",
     "BriefDescription": "Completion stall by Dcache miss which resolved on chip ( excluding local L2/L3)"
@@ -390,11 +395,6 @@
     "BriefDescription": "Ict empty for this thread due to branch mispred"
   },
   {,
-    "EventCode": "0x3405E",
-    "EventName": "PM_IFETCH_THROTTLE",
-    "BriefDescription": "Cycles in which Instruction fetch throttle was active."
-  },
-  {,
     "EventCode": "0x1F148",
     "EventName": "PM_MRK_DPTEG_FROM_ON_CHIP_CACHE",
     "BriefDescription": "A Page Table Entry was loaded into the TLB either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request.. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included"
@@ -422,7 +422,7 @@
   {,
     "EventCode": "0xD0A8",
     "EventName": "PM_DSLB_MISS",
-    "BriefDescription": "Data SLB Miss - Total of all segment sizes"
+    "BriefDescription": "gate_and(sd_pc_c0_comp_valid AND sd_pc_c0_comp_thread(0:1)=tid,sd_pc_c0_comp_ppc_count(0:3)) + gate_and(sd_pc_c1_comp_valid AND sd_pc_c1_comp_thread(0:1)=tid,sd_pc_c1_comp_ppc_count(0:3))"
   },
   {,
     "EventCode": "0x4C058",
@@ -549,4 +549,4 @@
     "EventName": "PM_MRK_DATA_FROM_L21_SHR_CYC",
     "BriefDescription": "Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load"
   }
-]
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/powerpc/power9/pmc.json b/tools/perf/pmu-events/arch/powerpc/power9/pmc.json
index 3ef8a10..d0b89f9 100644
--- a/tools/perf/pmu-events/arch/powerpc/power9/pmc.json
+++ b/tools/perf/pmu-events/arch/powerpc/power9/pmc.json
@@ -119,4 +119,4 @@
     "EventName": "PM_1FLOP_CMPL",
     "BriefDescription": "one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed"
   }
-]
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/powerpc/power9/translation.json b/tools/perf/pmu-events/arch/powerpc/power9/translation.json
index 8c0f120..bc8e03d 100644
--- a/tools/perf/pmu-events/arch/powerpc/power9/translation.json
+++ b/tools/perf/pmu-events/arch/powerpc/power9/translation.json
@@ -90,11 +90,6 @@
     "BriefDescription": "stcx failed"
   },
   {,
-    "EventCode": "0x20112",
-    "EventName": "PM_MRK_NTF_FIN",
-    "BriefDescription": "Marked next to finish instruction finished"
-  },
-  {,
     "EventCode": "0x300F0",
     "EventName": "PM_ST_MISS_L1",
     "BriefDescription": "Store Missed L1"
diff --git a/tools/perf/pmu-events/arch/x86/broadwell/cache.json b/tools/perf/pmu-events/arch/x86/broadwell/cache.json
index 73688a9..bba3152 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/cache.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/cache.json
@@ -10,13 +10,30 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of demand Data Read requests that hit L2 cache. Only not rejected loads are counted.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
-        "UMask": "0x41",
-        "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT",
+        "UMask": "0x22",
+        "EventName": "L2_RQSTS.RFO_MISS",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Demand Data Read requests that hit L2 cache",
+        "BriefDescription": "RFO requests that miss L2 cache.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x24",
+        "EventName": "L2_RQSTS.CODE_RD_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 cache misses when fetching instructions.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x27",
+        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand requests that miss L2 cache.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -30,6 +47,43 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3f",
+        "EventName": "L2_RQSTS.MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "All requests that miss L2 cache.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts the number of demand Data Read requests that hit L2 cache. Only not rejected loads are counted.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x41",
+        "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand Data Read requests that hit L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x42",
+        "EventName": "L2_RQSTS.RFO_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "RFO requests that hit L2 cache.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x44",
+        "EventName": "L2_RQSTS.CODE_RD_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the number of requests from the L2 hardware prefetchers that hit L2 cache. L3 prefetch new types.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -70,6 +124,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe7",
+        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand requests to L2 cache.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the total number of requests from the L2 hardware prefetchers.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -80,6 +143,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xff",
+        "EventName": "L2_RQSTS.REFERENCES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "All L2 requests.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the number of WB requests that hit L2 cache.",
         "EventCode": "0x27",
         "Counter": "0,1,2,3",
@@ -131,6 +203,27 @@
         "CounterHTOff": "2"
     },
     {
+        "EventCode": "0x48",
+        "Counter": "2",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
+        "CounterMask": "1",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0x48",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "L1D_PEND_MISS.FB_FULL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.",
         "EventCode": "0x51",
         "Counter": "0,1,2,3",
@@ -152,7 +245,30 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The \"Offcore outstanding\" state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "PublicDescription": "This event counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "Errata": "BDM76",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "Errata": "BDM76",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The Offcore outstanding state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -174,6 +290,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "This event counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The Offcore outstanding state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "Errata": "BDM76",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the number of offcore outstanding cacheable Core Data Read transactions in the super queue every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
@@ -185,18 +313,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "Errata": "BDM76",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "This event counts cycles when offcore outstanding cacheable Core Data Read transactions are present in the super queue. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
@@ -209,18 +325,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The \"Offcore outstanding\" state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "Errata": "BDM76",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "This event counts the number of cycles when the L1D is locked. It is a superset of the 0x1 mask (BUS_LOCK_CLOCKS.BUS_LOCK_DURATION).",
         "EventCode": "0x63",
         "Counter": "0,1,2,3",
@@ -261,7 +365,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the demand and prefetch data reads. All Core Data Reads include cacheable \"Demands\" and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type.",
+        "PublicDescription": "This event counts the demand and prefetch data reads. All Core Data Reads include cacheable Demands and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type.",
         "EventCode": "0xB0",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -281,152 +385,161 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xB7, 0xBB",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts load uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x11",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops that miss the STLB.",
+        "BriefDescription": "Retired load uops that miss the STLB. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts store uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts store uops true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x12",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store uops that miss the STLB.",
+        "BriefDescription": "Retired store uops that miss the STLB. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts load uops with locked access retired to the architected path.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops with locked access retired to the architected path.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x21",
         "Errata": "BDM35",
         "EventName": "MEM_UOPS_RETIRED.LOCK_LOADS",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired load uops with locked access.",
+        "BriefDescription": "Retired load uops with locked access. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x41",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired load uops that split across a cacheline boundary.(Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x42",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This event counts load uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement. This event also counts SW prefetches.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event ?ounts AVX-256bit load/store double-pump memory uops as a single uop at retirement. This event also counts SW prefetches.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x81",
         "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired load uops.",
+        "BriefDescription": "All retired load uops. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This event counts store uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts store uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event ?ounts AVX-256bit load/store double-pump memory uops as a single uop at retirement.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x82",
         "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired store uops.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the nearest-level (L1) cache.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load. This event also counts SW prefetches independent of the actual data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data source were hits in the nearest-level (L1) cache.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load. This event also counts SW prefetches independent of the actual data source.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_HIT",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Retired load uops with L1 cache hits as data sources.",
+        "BriefDescription": "Retired load uops with L1 cache hits as data sources. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the mid-level (L2) cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were hits in the mid-level (L2) cache.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "Errata": "BDM35",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_HIT",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops with L2 cache hits as data sources.",
+        "BriefDescription": "Retired load uops with L2 cache hits as data sources. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were data hits in the last-level (L3) cache without snoops required.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were data hits in the last-level (L3) cache without snoops required.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "Errata": "BDM100",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L3_HIT",
         "SampleAfterValue": "50021",
-        "BriefDescription": "Retired load uops which data sources were data hits in L3 without snoops required.",
+        "BriefDescription": "Hit in last-level (L3) cache. Excludes Unknown data-source. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were misses in the nearest-level (L1) cache. Counting excludes unknown and UC data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were misses in the nearest-level (L1) cache. Counting excludes unknown and UC data source.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops misses in L1 cache as data sources.",
+        "BriefDescription": "Retired load uops misses in L1 cache as data sources. Uses PEBS.",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were misses in the mid-level (L2) cache. Counting excludes unknown and UC data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were misses in the mid-level (L2) cache. Counting excludes unknown and UC data source.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_MISS",
         "SampleAfterValue": "50021",
-        "BriefDescription": "Miss in mid-level (L2) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Retired load uops with L2 cache misses as data sources. Uses PEBS.",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
@@ -438,84 +551,83 @@
         "Errata": "BDM100, BDE70",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L3_MISS",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Miss in last-level (L3) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Miss in last-level (L3) cache. Excludes Unknown data-source. (Precise Event - PEBS).",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were load uops missed L1 but hit a fill buffer due to a preceding miss to the same cache line with the data not ready.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were load uops missed L1 but hit a fill buffer due to a preceding miss to the same cache line with the data not ready.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x40",
         "EventName": "MEM_LOAD_UOPS_RETIRED.HIT_LFB",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready.",
+        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were L3 Hit and a cross-core snoop missed in the on-pkg core cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were L3 Hit and a cross-core snoop missed in the on-pkg core cache.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "Errata": "BDM100",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache.",
+        "BriefDescription": "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were L3 hit and a cross-core snoop hit in the on-pkg core cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were L3 hit and a cross-core snoop hit in the on-pkg core cache.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "Errata": "BDM100",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache.",
+        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were HitM responses from a core on same socket (shared L3).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were HitM responses from a core on same socket (shared L3).",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "Errata": "BDM100",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3.",
+        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the last-level (L3) cache without snoops required.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were hits in the last-level (L3) cache without snoops required.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
         "Errata": "BDM100",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops which data sources were hits in L3 without snoops required.",
+        "BriefDescription": "Retired load uops which data sources were hits in L3 without snoops required. (Precise Event - PEBS)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uop whose Data Source was: local DRAM either Snoop not needed or Snoop Miss (RspI).",
+        "PublicDescription": "This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches. This is a precise event.",
         "EventCode": "0xD3",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "Errata": "BDE70, BDM100",
         "EventName": "MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Data from local DRAM either Snoop not needed or Snoop Miss (RspI)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
@@ -659,119 +771,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x42",
-        "EventName": "L2_RQSTS.RFO_HIT",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests that hit L2 cache.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x22",
-        "EventName": "L2_RQSTS.RFO_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests that miss L2 cache.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x44",
-        "EventName": "L2_RQSTS.CODE_RD_HIT",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x24",
-        "EventName": "L2_RQSTS.CODE_RD_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 cache misses when fetching instructions.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x27",
-        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand requests that miss L2 cache.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe7",
-        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand requests to L2 cache.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3f",
-        "EventName": "L2_RQSTS.MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "All requests that miss L2 cache.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xff",
-        "EventName": "L2_RQSTS.REFERENCES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "All L2 requests.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "Errata": "BDM76",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x48",
-        "Counter": "2",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
-        "CounterMask": "1",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0x48",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "L1D_PEND_MISS.FB_FULL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
+        "PublicDescription": "Counts demand data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010001 ",
         "Counter": "0,1,2,3",
@@ -784,6 +784,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020001 ",
         "Counter": "0,1,2,3",
@@ -796,6 +797,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020001 ",
         "Counter": "0,1,2,3",
@@ -808,6 +810,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020001 ",
         "Counter": "0,1,2,3",
@@ -820,6 +823,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020001 ",
         "Counter": "0,1,2,3",
@@ -832,6 +836,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020001 ",
         "Counter": "0,1,2,3",
@@ -844,6 +849,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020001 ",
         "Counter": "0,1,2,3",
@@ -856,6 +862,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0001 ",
         "Counter": "0,1,2,3",
@@ -868,6 +875,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0001 ",
         "Counter": "0,1,2,3",
@@ -880,6 +888,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0001 ",
         "Counter": "0,1,2,3",
@@ -892,6 +901,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0001 ",
         "Counter": "0,1,2,3",
@@ -904,6 +914,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0001 ",
         "Counter": "0,1,2,3",
@@ -916,6 +927,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0001 ",
         "Counter": "0,1,2,3",
@@ -928,6 +940,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010002 ",
         "Counter": "0,1,2,3",
@@ -940,6 +953,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0002 ",
         "Counter": "0,1,2,3",
@@ -952,6 +966,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0002 ",
         "Counter": "0,1,2,3",
@@ -964,6 +979,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0002 ",
         "Counter": "0,1,2,3",
@@ -976,6 +992,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0002 ",
         "Counter": "0,1,2,3",
@@ -988,6 +1005,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0002 ",
         "Counter": "0,1,2,3",
@@ -1000,6 +1018,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0002 ",
         "Counter": "0,1,2,3",
@@ -1012,6 +1031,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010004 ",
         "Counter": "0,1,2,3",
@@ -1024,6 +1044,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020004 ",
         "Counter": "0,1,2,3",
@@ -1036,6 +1057,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020004 ",
         "Counter": "0,1,2,3",
@@ -1048,6 +1070,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020004 ",
         "Counter": "0,1,2,3",
@@ -1060,6 +1083,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020004 ",
         "Counter": "0,1,2,3",
@@ -1072,6 +1096,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020004 ",
         "Counter": "0,1,2,3",
@@ -1084,6 +1109,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020004 ",
         "Counter": "0,1,2,3",
@@ -1096,6 +1122,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0004 ",
         "Counter": "0,1,2,3",
@@ -1108,6 +1135,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0004 ",
         "Counter": "0,1,2,3",
@@ -1120,6 +1148,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0004 ",
         "Counter": "0,1,2,3",
@@ -1132,6 +1161,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0004 ",
         "Counter": "0,1,2,3",
@@ -1144,6 +1174,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0004 ",
         "Counter": "0,1,2,3",
@@ -1156,6 +1187,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0004 ",
         "Counter": "0,1,2,3",
@@ -1168,6 +1200,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010008 ",
         "Counter": "0,1,2,3",
@@ -1180,6 +1213,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020008 ",
         "Counter": "0,1,2,3",
@@ -1192,6 +1226,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020008 ",
         "Counter": "0,1,2,3",
@@ -1204,6 +1239,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020008 ",
         "Counter": "0,1,2,3",
@@ -1216,6 +1252,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020008 ",
         "Counter": "0,1,2,3",
@@ -1228,6 +1265,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020008 ",
         "Counter": "0,1,2,3",
@@ -1240,6 +1278,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020008 ",
         "Counter": "0,1,2,3",
@@ -1252,6 +1291,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0008 ",
         "Counter": "0,1,2,3",
@@ -1264,6 +1304,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0008 ",
         "Counter": "0,1,2,3",
@@ -1276,6 +1317,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0008 ",
         "Counter": "0,1,2,3",
@@ -1288,6 +1330,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0008 ",
         "Counter": "0,1,2,3",
@@ -1300,6 +1343,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0008 ",
         "Counter": "0,1,2,3",
@@ -1312,6 +1356,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0008 ",
         "Counter": "0,1,2,3",
@@ -1324,6 +1369,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010010 ",
         "Counter": "0,1,2,3",
@@ -1336,6 +1382,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020010 ",
         "Counter": "0,1,2,3",
@@ -1348,6 +1395,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020010 ",
         "Counter": "0,1,2,3",
@@ -1360,6 +1408,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020010 ",
         "Counter": "0,1,2,3",
@@ -1372,6 +1421,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020010 ",
         "Counter": "0,1,2,3",
@@ -1384,6 +1434,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020010 ",
         "Counter": "0,1,2,3",
@@ -1396,6 +1447,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020010 ",
         "Counter": "0,1,2,3",
@@ -1408,6 +1460,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0010 ",
         "Counter": "0,1,2,3",
@@ -1420,6 +1473,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0010 ",
         "Counter": "0,1,2,3",
@@ -1432,6 +1486,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0010 ",
         "Counter": "0,1,2,3",
@@ -1444,6 +1499,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0010 ",
         "Counter": "0,1,2,3",
@@ -1456,6 +1512,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0010 ",
         "Counter": "0,1,2,3",
@@ -1468,6 +1525,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0010 ",
         "Counter": "0,1,2,3",
@@ -1480,6 +1538,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010020 ",
         "Counter": "0,1,2,3",
@@ -1492,6 +1551,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020020 ",
         "Counter": "0,1,2,3",
@@ -1504,6 +1564,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020020 ",
         "Counter": "0,1,2,3",
@@ -1516,6 +1577,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020020 ",
         "Counter": "0,1,2,3",
@@ -1528,6 +1590,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020020 ",
         "Counter": "0,1,2,3",
@@ -1540,6 +1603,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020020 ",
         "Counter": "0,1,2,3",
@@ -1552,6 +1616,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020020 ",
         "Counter": "0,1,2,3",
@@ -1564,6 +1629,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0020 ",
         "Counter": "0,1,2,3",
@@ -1576,6 +1642,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0020 ",
         "Counter": "0,1,2,3",
@@ -1588,6 +1655,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0020 ",
         "Counter": "0,1,2,3",
@@ -1600,6 +1668,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0020 ",
         "Counter": "0,1,2,3",
@@ -1612,6 +1681,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0020 ",
         "Counter": "0,1,2,3",
@@ -1624,6 +1694,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0020 ",
         "Counter": "0,1,2,3",
@@ -1636,6 +1707,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010040 ",
         "Counter": "0,1,2,3",
@@ -1648,6 +1720,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020040 ",
         "Counter": "0,1,2,3",
@@ -1660,6 +1733,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020040 ",
         "Counter": "0,1,2,3",
@@ -1672,6 +1746,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020040 ",
         "Counter": "0,1,2,3",
@@ -1684,6 +1759,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020040 ",
         "Counter": "0,1,2,3",
@@ -1696,6 +1772,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020040 ",
         "Counter": "0,1,2,3",
@@ -1708,6 +1785,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020040 ",
         "Counter": "0,1,2,3",
@@ -1720,6 +1798,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0040 ",
         "Counter": "0,1,2,3",
@@ -1732,6 +1811,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0040 ",
         "Counter": "0,1,2,3",
@@ -1744,6 +1824,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0040 ",
         "Counter": "0,1,2,3",
@@ -1756,6 +1837,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0040 ",
         "Counter": "0,1,2,3",
@@ -1768,6 +1850,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0040 ",
         "Counter": "0,1,2,3",
@@ -1780,6 +1863,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0040 ",
         "Counter": "0,1,2,3",
@@ -1792,6 +1876,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010080 ",
         "Counter": "0,1,2,3",
@@ -1804,6 +1889,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020080 ",
         "Counter": "0,1,2,3",
@@ -1816,6 +1902,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020080 ",
         "Counter": "0,1,2,3",
@@ -1828,6 +1915,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020080 ",
         "Counter": "0,1,2,3",
@@ -1840,6 +1928,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020080 ",
         "Counter": "0,1,2,3",
@@ -1852,6 +1941,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020080 ",
         "Counter": "0,1,2,3",
@@ -1864,6 +1954,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020080 ",
         "Counter": "0,1,2,3",
@@ -1876,6 +1967,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0080 ",
         "Counter": "0,1,2,3",
@@ -1888,6 +1980,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0080 ",
         "Counter": "0,1,2,3",
@@ -1900,6 +1993,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0080 ",
         "Counter": "0,1,2,3",
@@ -1912,6 +2006,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0080 ",
         "Counter": "0,1,2,3",
@@ -1924,6 +2019,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0080 ",
         "Counter": "0,1,2,3",
@@ -1936,6 +2032,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0080 ",
         "Counter": "0,1,2,3",
@@ -1948,6 +2045,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010100 ",
         "Counter": "0,1,2,3",
@@ -1960,6 +2058,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020100 ",
         "Counter": "0,1,2,3",
@@ -1972,6 +2071,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020100 ",
         "Counter": "0,1,2,3",
@@ -1984,6 +2084,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020100 ",
         "Counter": "0,1,2,3",
@@ -1996,6 +2097,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020100 ",
         "Counter": "0,1,2,3",
@@ -2008,6 +2110,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020100 ",
         "Counter": "0,1,2,3",
@@ -2020,6 +2123,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020100 ",
         "Counter": "0,1,2,3",
@@ -2032,6 +2136,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0100 ",
         "Counter": "0,1,2,3",
@@ -2044,6 +2149,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0100 ",
         "Counter": "0,1,2,3",
@@ -2056,6 +2162,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0100 ",
         "Counter": "0,1,2,3",
@@ -2068,6 +2175,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0100 ",
         "Counter": "0,1,2,3",
@@ -2080,6 +2188,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0100 ",
         "Counter": "0,1,2,3",
@@ -2092,6 +2201,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0100 ",
         "Counter": "0,1,2,3",
@@ -2104,6 +2214,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010200 ",
         "Counter": "0,1,2,3",
@@ -2116,6 +2227,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020200 ",
         "Counter": "0,1,2,3",
@@ -2128,6 +2240,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020200 ",
         "Counter": "0,1,2,3",
@@ -2140,6 +2253,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020200 ",
         "Counter": "0,1,2,3",
@@ -2152,6 +2266,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020200 ",
         "Counter": "0,1,2,3",
@@ -2164,6 +2279,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020200 ",
         "Counter": "0,1,2,3",
@@ -2176,6 +2292,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020200 ",
         "Counter": "0,1,2,3",
@@ -2188,6 +2305,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0200 ",
         "Counter": "0,1,2,3",
@@ -2200,6 +2318,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0200 ",
         "Counter": "0,1,2,3",
@@ -2212,6 +2331,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0200 ",
         "Counter": "0,1,2,3",
@@ -2224,6 +2344,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0200 ",
         "Counter": "0,1,2,3",
@@ -2236,6 +2357,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0200 ",
         "Counter": "0,1,2,3",
@@ -2248,6 +2370,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0200 ",
         "Counter": "0,1,2,3",
@@ -2260,6 +2383,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000018000 ",
         "Counter": "0,1,2,3",
@@ -2272,6 +2396,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080028000 ",
         "Counter": "0,1,2,3",
@@ -2284,6 +2409,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100028000 ",
         "Counter": "0,1,2,3",
@@ -2296,6 +2422,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200028000 ",
         "Counter": "0,1,2,3",
@@ -2308,6 +2435,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400028000 ",
         "Counter": "0,1,2,3",
@@ -2320,6 +2448,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000028000 ",
         "Counter": "0,1,2,3",
@@ -2332,6 +2461,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80028000 ",
         "Counter": "0,1,2,3",
@@ -2344,6 +2474,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c8000 ",
         "Counter": "0,1,2,3",
@@ -2356,6 +2487,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c8000 ",
         "Counter": "0,1,2,3",
@@ -2368,6 +2500,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c8000 ",
         "Counter": "0,1,2,3",
@@ -2380,6 +2513,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c8000 ",
         "Counter": "0,1,2,3",
@@ -2392,6 +2526,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c8000 ",
         "Counter": "0,1,2,3",
@@ -2404,6 +2539,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c8000 ",
         "Counter": "0,1,2,3",
@@ -2416,6 +2552,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010090 ",
         "Counter": "0,1,2,3",
@@ -2428,6 +2565,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020090 ",
         "Counter": "0,1,2,3",
@@ -2440,6 +2578,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020090 ",
         "Counter": "0,1,2,3",
@@ -2452,6 +2591,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020090 ",
         "Counter": "0,1,2,3",
@@ -2464,6 +2604,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020090 ",
         "Counter": "0,1,2,3",
@@ -2476,6 +2617,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020090 ",
         "Counter": "0,1,2,3",
@@ -2488,6 +2630,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020090 ",
         "Counter": "0,1,2,3",
@@ -2500,6 +2643,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0090 ",
         "Counter": "0,1,2,3",
@@ -2512,6 +2656,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0090 ",
         "Counter": "0,1,2,3",
@@ -2524,6 +2669,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0090 ",
         "Counter": "0,1,2,3",
@@ -2536,6 +2682,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0090 ",
         "Counter": "0,1,2,3",
@@ -2548,6 +2695,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0090 ",
         "Counter": "0,1,2,3",
@@ -2560,6 +2708,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0090 ",
         "Counter": "0,1,2,3",
@@ -2572,6 +2721,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010120 ",
         "Counter": "0,1,2,3",
@@ -2584,6 +2734,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020120 ",
         "Counter": "0,1,2,3",
@@ -2596,6 +2747,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020120 ",
         "Counter": "0,1,2,3",
@@ -2608,6 +2760,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020120 ",
         "Counter": "0,1,2,3",
@@ -2620,6 +2773,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020120 ",
         "Counter": "0,1,2,3",
@@ -2632,6 +2786,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020120 ",
         "Counter": "0,1,2,3",
@@ -2644,6 +2799,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020120 ",
         "Counter": "0,1,2,3",
@@ -2656,6 +2812,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0120 ",
         "Counter": "0,1,2,3",
@@ -2668,6 +2825,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0120 ",
         "Counter": "0,1,2,3",
@@ -2680,6 +2838,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0120 ",
         "Counter": "0,1,2,3",
@@ -2692,6 +2851,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0120 ",
         "Counter": "0,1,2,3",
@@ -2704,6 +2864,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0120 ",
         "Counter": "0,1,2,3",
@@ -2716,6 +2877,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0120 ",
         "Counter": "0,1,2,3",
@@ -2728,6 +2890,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010240 ",
         "Counter": "0,1,2,3",
@@ -2740,6 +2903,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020240 ",
         "Counter": "0,1,2,3",
@@ -2752,6 +2916,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020240 ",
         "Counter": "0,1,2,3",
@@ -2764,6 +2929,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020240 ",
         "Counter": "0,1,2,3",
@@ -2776,6 +2942,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020240 ",
         "Counter": "0,1,2,3",
@@ -2788,6 +2955,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020240 ",
         "Counter": "0,1,2,3",
@@ -2800,6 +2968,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020240 ",
         "Counter": "0,1,2,3",
@@ -2812,6 +2981,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0240 ",
         "Counter": "0,1,2,3",
@@ -2824,6 +2994,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0240 ",
         "Counter": "0,1,2,3",
@@ -2836,6 +3007,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0240 ",
         "Counter": "0,1,2,3",
@@ -2848,6 +3020,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0240 ",
         "Counter": "0,1,2,3",
@@ -2860,6 +3033,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0240 ",
         "Counter": "0,1,2,3",
@@ -2872,6 +3046,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0240 ",
         "Counter": "0,1,2,3",
@@ -2884,6 +3059,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010091 ",
         "Counter": "0,1,2,3",
@@ -2896,6 +3072,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020091 ",
         "Counter": "0,1,2,3",
@@ -2908,6 +3085,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020091 ",
         "Counter": "0,1,2,3",
@@ -2920,6 +3098,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020091 ",
         "Counter": "0,1,2,3",
@@ -2932,6 +3111,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020091 ",
         "Counter": "0,1,2,3",
@@ -2944,6 +3124,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020091 ",
         "Counter": "0,1,2,3",
@@ -2956,6 +3137,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020091 ",
         "Counter": "0,1,2,3",
@@ -2968,6 +3150,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0091 ",
         "Counter": "0,1,2,3",
@@ -2980,6 +3163,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0091 ",
         "Counter": "0,1,2,3",
@@ -2992,6 +3176,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0091 ",
         "Counter": "0,1,2,3",
@@ -3004,6 +3189,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0091 ",
         "Counter": "0,1,2,3",
@@ -3016,6 +3202,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0091 ",
         "Counter": "0,1,2,3",
@@ -3028,6 +3215,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0091 ",
         "Counter": "0,1,2,3",
@@ -3040,6 +3228,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010122 ",
         "Counter": "0,1,2,3",
@@ -3052,6 +3241,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020122 ",
         "Counter": "0,1,2,3",
@@ -3064,6 +3254,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020122 ",
         "Counter": "0,1,2,3",
@@ -3076,6 +3267,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020122 ",
         "Counter": "0,1,2,3",
@@ -3088,6 +3280,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020122 ",
         "Counter": "0,1,2,3",
@@ -3100,6 +3293,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020122 ",
         "Counter": "0,1,2,3",
@@ -3112,6 +3306,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f80020122 ",
         "Counter": "0,1,2,3",
@@ -3124,6 +3319,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00803c0122 ",
         "Counter": "0,1,2,3",
@@ -3136,6 +3332,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01003c0122 ",
         "Counter": "0,1,2,3",
@@ -3148,6 +3345,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02003c0122 ",
         "Counter": "0,1,2,3",
@@ -3160,6 +3358,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0122 ",
         "Counter": "0,1,2,3",
@@ -3172,6 +3371,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0122 ",
         "Counter": "0,1,2,3",
@@ -3184,6 +3384,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0122 ",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/broadwell/floating-point.json b/tools/perf/pmu-events/arch/x86/broadwell/floating-point.json
index 102bfb8..689d478 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/floating-point.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/floating-point.json
@@ -1,6 +1,6 @@
 [
     {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of transitions from AVX-256 to legacy SSE when penalty is applicable.",
+        "PublicDescription": "This event counts the number of transitions from AVX-256 to legacy SSE when penalty is applicable.",
         "EventCode": "0xC1",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -11,7 +11,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of transitions from legacy SSE to AVX-256 when penalty is applicable.",
+        "PublicDescription": "This event counts the number of transitions from legacy SSE to AVX-256 when penalty is applicable.",
         "EventCode": "0xC1",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
@@ -22,7 +22,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PEBS": "1",
         "EventCode": "0xC7",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -32,7 +31,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PEBS": "1",
         "EventCode": "0xC7",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -42,7 +40,15 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PEBS": "1",
+        "EventCode": "0xC7",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "EventName": "FP_ARITH_INST_RETIRED.SCALAR",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of SSE/AVX computational scalar floating-point instructions retired. Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0xC7",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -52,7 +58,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PEBS": "1",
         "EventCode": "0xC7",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -62,7 +67,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PEBS": "1",
         "EventCode": "0xC7",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
@@ -72,7 +76,43 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of x87 floating point (FP) micro-code assist (numeric overflow/underflow, inexact result) when the output value (destination register) is invalid.",
+        "EventCode": "0xC7",
+        "Counter": "0,1,2,3",
+        "UMask": "0x15",
+        "EventName": "FP_ARITH_INST_RETIRED.DOUBLE",
+        "SampleAfterValue": "2000006",
+        "BriefDescription": "Number of SSE/AVX computational double precision floating-point instructions retired. Applies to SSE* and AVX*scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.  ?.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xc7",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired.  Each count represents 8 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC7",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2a",
+        "EventName": "FP_ARITH_INST_RETIRED.SINGLE",
+        "SampleAfterValue": "2000005",
+        "BriefDescription": "Number of SSE/AVX computational single precision floating-point instructions retired. Applies to SSE* and AVX*scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. ?.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC7",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3c",
+        "EventName": "FP_ARITH_INST_RETIRED.PACKED",
+        "SampleAfterValue": "2000004",
+        "BriefDescription": "Number of SSE/AVX computational packed floating-point instructions retired. Applies to SSE* and AVX*, packed, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PublicDescription": "This event counts the number of x87 floating point (FP) micro-code assist (numeric overflow/underflow, inexact result) when the output value (destination register) is invalid.",
         "EventCode": "0xCA",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -82,7 +122,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts x87 floating point (FP) micro-code assist (invalid operation, denormal operand, SNaN operand) when the input value (one of the source operands to an FP instruction) is invalid.",
+        "PublicDescription": "This event counts x87 floating point (FP) micro-code assist (invalid operation, denormal operand, SNaN operand) when the input value (one of the source operands to an FP instruction) is invalid.",
         "EventCode": "0xCA",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -92,7 +132,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of SSE* floating point (FP) micro-code assist (numeric overflow/underflow) when the output value (destination register) is invalid. Counting covers only cases involving penalties that require micro-code assist intervention.",
+        "PublicDescription": "This event counts the number of SSE* floating point (FP) micro-code assist (numeric overflow/underflow) when the output value (destination register) is invalid. Counting covers only cases involving penalties that require micro-code assist intervention.",
         "EventCode": "0xCA",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -102,7 +142,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts any input SSE* FP assist - invalid operation, denormal operand, dividing by zero, SNaN operand. Counting includes only cases involving penalties that required micro-code assist intervention.",
+        "PublicDescription": "This event counts any input SSE* FP assist - invalid operation, denormal operand, dividing by zero, SNaN operand. Counting includes only cases involving penalties that required micro-code assist intervention.",
         "EventCode": "0xCA",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
@@ -121,51 +161,5 @@
         "BriefDescription": "Cycles with any input/output SSE or FP assist",
         "CounterMask": "1",
         "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PEBS": "1",
-        "EventCode": "0xc7",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired.  Each count represents 8 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "EventName": "FP_ARITH_INST_RETIRED.SCALAR",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of SSE/AVX computational scalar floating-point instructions retired. Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3c",
-        "EventName": "FP_ARITH_INST_RETIRED.PACKED",
-        "SampleAfterValue": "2000004",
-        "BriefDescription": "Number of SSE/AVX computational packed floating-point instructions retired. Applies to SSE* and AVX*, packed, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2a",
-        "EventName": "FP_ARITH_INST_RETIRED.SINGLE",
-        "SampleAfterValue": "2000005",
-        "BriefDescription": "Number of SSE/AVX computational single precision floating-point instructions retired. Applies to SSE* and AVX*scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element. ?.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
-        "Counter": "0,1,2,3",
-        "UMask": "0x15",
-        "EventName": "FP_ARITH_INST_RETIRED.DOUBLE",
-        "SampleAfterValue": "2000006",
-        "BriefDescription": "Number of SSE/AVX computational double precision floating-point instructions retired. Applies to SSE* and AVX*scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.  ?.",
-        "CounterHTOff": "0,1,2,3"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwell/frontend.json b/tools/perf/pmu-events/arch/x86/broadwell/frontend.json
index b0cdf1f..7142c76 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/frontend.json
@@ -10,7 +10,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -20,58 +20,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "IDQ.DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "IDQ.MS_DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "IDQ.MS_MITE_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the total number of uops delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -82,7 +31,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "IDQ.DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -93,7 +52,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "IDQ.MS_DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
@@ -104,7 +73,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
@@ -116,7 +85,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x18",
@@ -127,7 +96,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of cycles  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of cycles  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x18",
@@ -138,7 +107,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "IDQ.MS_MITE_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x24",
@@ -149,7 +128,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of cycles  uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of cycles  uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x24",
@@ -160,7 +139,39 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the total number of uops delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EdgeDetect": "1",
+        "EventName": "IDQ.MS_SWITCHES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x3c",
@@ -200,7 +211,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding ?4 ? x? when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when:\n a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread;\n b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions); \n c. Instruction Decode Queue (IDQ) delivers four uops.",
+        "PublicDescription": "This event counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding 4  x when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when:\n a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread;\n b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions); \n c. Instruction Decode Queue (IDQ) delivers four uops.",
         "EventCode": "0x9C",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -263,7 +274,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "This event counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. \nMM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.\nPenalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 0?2 cycles.",
+        "PublicDescription": "This event counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. \nMM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.\nPenalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 02 cycles.",
         "EventCode": "0xAB",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -271,16 +282,5 @@
         "SampleAfterValue": "2000003",
         "BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EdgeDetect": "1",
-        "EventName": "IDQ.MS_SWITCHES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwell/memory.json b/tools/perf/pmu-events/arch/x86/broadwell/memory.json
index ff5416d..c9154ce 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/memory.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/memory.json
@@ -90,7 +90,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Unfriendly TSX abort triggered by  a flowmarker.",
         "EventCode": "0x5d",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -170,13 +169,13 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Number of times HLE abort was triggered.",
+        "PublicDescription": "Number of times HLE abort was triggered (PEBS).",
         "EventCode": "0xc8",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "HLE_RETIRED.ABORTED",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times HLE abort was triggered",
+        "BriefDescription": "Number of times HLE abort was triggered (PEBS)",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -251,13 +250,13 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Number of times RTM abort was triggered .",
+        "PublicDescription": "Number of times RTM abort was triggered (PEBS).",
         "EventCode": "0xc9",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "RTM_RETIRED.ABORTED",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times RTM abort was triggered",
+        "BriefDescription": "Number of times RTM abort was triggered (PEBS)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -431,6 +430,7 @@
         "CounterHTOff": "3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020001 ",
         "Counter": "0,1,2,3",
@@ -443,6 +443,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0001 ",
         "Counter": "0,1,2,3",
@@ -455,6 +456,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000001 ",
         "Counter": "0,1,2,3",
@@ -467,6 +469,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000001 ",
         "Counter": "0,1,2,3",
@@ -479,6 +482,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000001 ",
         "Counter": "0,1,2,3",
@@ -491,6 +495,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000001 ",
         "Counter": "0,1,2,3",
@@ -503,6 +508,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000001 ",
         "Counter": "0,1,2,3",
@@ -515,6 +521,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000001 ",
         "Counter": "0,1,2,3",
@@ -527,6 +534,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000001 ",
         "Counter": "0,1,2,3",
@@ -539,6 +547,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000001 ",
         "Counter": "0,1,2,3",
@@ -551,6 +560,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000001 ",
         "Counter": "0,1,2,3",
@@ -563,6 +573,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000001 ",
         "Counter": "0,1,2,3",
@@ -575,6 +586,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000001 ",
         "Counter": "0,1,2,3",
@@ -587,6 +599,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0002 ",
         "Counter": "0,1,2,3",
@@ -599,6 +612,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000002 ",
         "Counter": "0,1,2,3",
@@ -611,6 +625,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000002 ",
         "Counter": "0,1,2,3",
@@ -623,6 +638,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000002 ",
         "Counter": "0,1,2,3",
@@ -635,6 +651,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000002 ",
         "Counter": "0,1,2,3",
@@ -647,6 +664,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000002 ",
         "Counter": "0,1,2,3",
@@ -659,6 +677,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020004 ",
         "Counter": "0,1,2,3",
@@ -671,6 +690,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0004 ",
         "Counter": "0,1,2,3",
@@ -683,6 +703,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000004 ",
         "Counter": "0,1,2,3",
@@ -695,6 +716,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000004 ",
         "Counter": "0,1,2,3",
@@ -707,6 +729,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000004 ",
         "Counter": "0,1,2,3",
@@ -719,6 +742,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000004 ",
         "Counter": "0,1,2,3",
@@ -731,6 +755,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000004 ",
         "Counter": "0,1,2,3",
@@ -743,6 +768,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000004 ",
         "Counter": "0,1,2,3",
@@ -755,6 +781,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000004 ",
         "Counter": "0,1,2,3",
@@ -767,6 +794,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000004 ",
         "Counter": "0,1,2,3",
@@ -779,6 +807,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000004 ",
         "Counter": "0,1,2,3",
@@ -791,6 +820,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000004 ",
         "Counter": "0,1,2,3",
@@ -803,6 +833,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000004 ",
         "Counter": "0,1,2,3",
@@ -815,6 +846,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020008 ",
         "Counter": "0,1,2,3",
@@ -827,6 +859,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0008 ",
         "Counter": "0,1,2,3",
@@ -839,6 +872,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000008 ",
         "Counter": "0,1,2,3",
@@ -851,6 +885,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000008 ",
         "Counter": "0,1,2,3",
@@ -863,6 +898,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000008 ",
         "Counter": "0,1,2,3",
@@ -875,6 +911,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000008 ",
         "Counter": "0,1,2,3",
@@ -887,6 +924,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000008 ",
         "Counter": "0,1,2,3",
@@ -899,6 +937,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000008 ",
         "Counter": "0,1,2,3",
@@ -911,6 +950,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000008 ",
         "Counter": "0,1,2,3",
@@ -923,6 +963,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000008 ",
         "Counter": "0,1,2,3",
@@ -935,6 +976,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000008 ",
         "Counter": "0,1,2,3",
@@ -947,6 +989,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts writebacks (modified to exclusive) that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000008 ",
         "Counter": "0,1,2,3",
@@ -959,6 +1002,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000008 ",
         "Counter": "0,1,2,3",
@@ -971,6 +1015,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020010 ",
         "Counter": "0,1,2,3",
@@ -983,6 +1028,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0010 ",
         "Counter": "0,1,2,3",
@@ -995,6 +1041,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000010 ",
         "Counter": "0,1,2,3",
@@ -1007,6 +1054,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000010 ",
         "Counter": "0,1,2,3",
@@ -1019,6 +1067,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000010 ",
         "Counter": "0,1,2,3",
@@ -1031,6 +1080,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000010 ",
         "Counter": "0,1,2,3",
@@ -1043,6 +1093,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000010 ",
         "Counter": "0,1,2,3",
@@ -1055,6 +1106,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000010 ",
         "Counter": "0,1,2,3",
@@ -1067,6 +1119,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000010 ",
         "Counter": "0,1,2,3",
@@ -1079,6 +1132,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000010 ",
         "Counter": "0,1,2,3",
@@ -1091,6 +1145,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000010 ",
         "Counter": "0,1,2,3",
@@ -1103,6 +1158,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000010 ",
         "Counter": "0,1,2,3",
@@ -1115,6 +1171,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000010 ",
         "Counter": "0,1,2,3",
@@ -1127,6 +1184,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020020 ",
         "Counter": "0,1,2,3",
@@ -1139,6 +1197,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0020 ",
         "Counter": "0,1,2,3",
@@ -1151,6 +1210,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000020 ",
         "Counter": "0,1,2,3",
@@ -1163,6 +1223,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000020 ",
         "Counter": "0,1,2,3",
@@ -1175,6 +1236,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000020 ",
         "Counter": "0,1,2,3",
@@ -1187,6 +1249,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000020 ",
         "Counter": "0,1,2,3",
@@ -1199,6 +1262,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000020 ",
         "Counter": "0,1,2,3",
@@ -1211,6 +1275,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000020 ",
         "Counter": "0,1,2,3",
@@ -1223,6 +1288,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000020 ",
         "Counter": "0,1,2,3",
@@ -1235,6 +1301,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000020 ",
         "Counter": "0,1,2,3",
@@ -1247,6 +1314,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000020 ",
         "Counter": "0,1,2,3",
@@ -1259,6 +1327,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000020 ",
         "Counter": "0,1,2,3",
@@ -1271,6 +1340,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000020 ",
         "Counter": "0,1,2,3",
@@ -1283,6 +1353,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020040 ",
         "Counter": "0,1,2,3",
@@ -1295,6 +1366,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0040 ",
         "Counter": "0,1,2,3",
@@ -1307,6 +1379,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000040 ",
         "Counter": "0,1,2,3",
@@ -1319,6 +1392,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000040 ",
         "Counter": "0,1,2,3",
@@ -1331,6 +1405,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000040 ",
         "Counter": "0,1,2,3",
@@ -1343,6 +1418,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000040 ",
         "Counter": "0,1,2,3",
@@ -1355,6 +1431,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000040 ",
         "Counter": "0,1,2,3",
@@ -1367,6 +1444,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000040 ",
         "Counter": "0,1,2,3",
@@ -1379,6 +1457,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000040 ",
         "Counter": "0,1,2,3",
@@ -1391,6 +1470,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000040 ",
         "Counter": "0,1,2,3",
@@ -1403,6 +1483,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000040 ",
         "Counter": "0,1,2,3",
@@ -1415,6 +1496,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000040 ",
         "Counter": "0,1,2,3",
@@ -1427,6 +1509,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000040 ",
         "Counter": "0,1,2,3",
@@ -1439,6 +1522,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020080 ",
         "Counter": "0,1,2,3",
@@ -1451,6 +1535,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0080 ",
         "Counter": "0,1,2,3",
@@ -1463,6 +1548,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000080 ",
         "Counter": "0,1,2,3",
@@ -1475,6 +1561,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000080 ",
         "Counter": "0,1,2,3",
@@ -1487,6 +1574,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000080 ",
         "Counter": "0,1,2,3",
@@ -1499,6 +1587,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000080 ",
         "Counter": "0,1,2,3",
@@ -1511,6 +1600,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000080 ",
         "Counter": "0,1,2,3",
@@ -1523,6 +1613,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000080 ",
         "Counter": "0,1,2,3",
@@ -1535,6 +1626,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000080 ",
         "Counter": "0,1,2,3",
@@ -1547,6 +1639,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000080 ",
         "Counter": "0,1,2,3",
@@ -1559,6 +1652,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000080 ",
         "Counter": "0,1,2,3",
@@ -1571,6 +1665,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000080 ",
         "Counter": "0,1,2,3",
@@ -1583,6 +1678,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000080 ",
         "Counter": "0,1,2,3",
@@ -1595,6 +1691,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020100 ",
         "Counter": "0,1,2,3",
@@ -1607,6 +1704,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0100 ",
         "Counter": "0,1,2,3",
@@ -1619,6 +1717,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000100 ",
         "Counter": "0,1,2,3",
@@ -1631,6 +1730,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000100 ",
         "Counter": "0,1,2,3",
@@ -1643,6 +1743,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000100 ",
         "Counter": "0,1,2,3",
@@ -1655,6 +1756,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000100 ",
         "Counter": "0,1,2,3",
@@ -1667,6 +1769,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000100 ",
         "Counter": "0,1,2,3",
@@ -1679,6 +1782,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000100 ",
         "Counter": "0,1,2,3",
@@ -1691,6 +1795,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000100 ",
         "Counter": "0,1,2,3",
@@ -1703,6 +1808,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000100 ",
         "Counter": "0,1,2,3",
@@ -1715,6 +1821,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000100 ",
         "Counter": "0,1,2,3",
@@ -1727,6 +1834,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000100 ",
         "Counter": "0,1,2,3",
@@ -1739,6 +1847,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000100 ",
         "Counter": "0,1,2,3",
@@ -1751,6 +1860,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020200 ",
         "Counter": "0,1,2,3",
@@ -1763,6 +1873,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0200 ",
         "Counter": "0,1,2,3",
@@ -1775,6 +1886,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000200 ",
         "Counter": "0,1,2,3",
@@ -1787,6 +1899,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000200 ",
         "Counter": "0,1,2,3",
@@ -1799,6 +1912,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000200 ",
         "Counter": "0,1,2,3",
@@ -1811,6 +1925,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000200 ",
         "Counter": "0,1,2,3",
@@ -1823,6 +1938,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000200 ",
         "Counter": "0,1,2,3",
@@ -1835,6 +1951,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000200 ",
         "Counter": "0,1,2,3",
@@ -1847,6 +1964,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000200 ",
         "Counter": "0,1,2,3",
@@ -1859,6 +1977,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000200 ",
         "Counter": "0,1,2,3",
@@ -1871,6 +1990,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000200 ",
         "Counter": "0,1,2,3",
@@ -1883,6 +2003,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000200 ",
         "Counter": "0,1,2,3",
@@ -1895,6 +2016,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000200 ",
         "Counter": "0,1,2,3",
@@ -1907,6 +2029,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000028000 ",
         "Counter": "0,1,2,3",
@@ -1919,6 +2042,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c8000 ",
         "Counter": "0,1,2,3",
@@ -1931,6 +2055,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084008000 ",
         "Counter": "0,1,2,3",
@@ -1943,6 +2068,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104008000 ",
         "Counter": "0,1,2,3",
@@ -1955,6 +2081,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204008000 ",
         "Counter": "0,1,2,3",
@@ -1967,6 +2094,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404008000 ",
         "Counter": "0,1,2,3",
@@ -1979,6 +2107,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004008000 ",
         "Counter": "0,1,2,3",
@@ -1991,6 +2120,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004008000 ",
         "Counter": "0,1,2,3",
@@ -2003,6 +2133,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84008000 ",
         "Counter": "0,1,2,3",
@@ -2015,6 +2146,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc008000 ",
         "Counter": "0,1,2,3",
@@ -2027,6 +2159,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c008000 ",
         "Counter": "0,1,2,3",
@@ -2039,6 +2172,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts any other requests that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c008000 ",
         "Counter": "0,1,2,3",
@@ -2051,6 +2185,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c008000 ",
         "Counter": "0,1,2,3",
@@ -2063,6 +2198,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020090 ",
         "Counter": "0,1,2,3",
@@ -2075,6 +2211,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0090 ",
         "Counter": "0,1,2,3",
@@ -2087,6 +2224,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000090 ",
         "Counter": "0,1,2,3",
@@ -2099,6 +2237,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000090 ",
         "Counter": "0,1,2,3",
@@ -2111,6 +2250,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000090 ",
         "Counter": "0,1,2,3",
@@ -2123,6 +2263,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000090 ",
         "Counter": "0,1,2,3",
@@ -2135,6 +2276,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000090 ",
         "Counter": "0,1,2,3",
@@ -2147,6 +2289,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000090 ",
         "Counter": "0,1,2,3",
@@ -2159,6 +2302,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000090 ",
         "Counter": "0,1,2,3",
@@ -2171,6 +2315,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000090 ",
         "Counter": "0,1,2,3",
@@ -2183,6 +2328,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000090 ",
         "Counter": "0,1,2,3",
@@ -2195,6 +2341,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch data reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000090 ",
         "Counter": "0,1,2,3",
@@ -2207,6 +2354,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000090 ",
         "Counter": "0,1,2,3",
@@ -2219,6 +2367,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020120 ",
         "Counter": "0,1,2,3",
@@ -2231,6 +2380,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0120 ",
         "Counter": "0,1,2,3",
@@ -2243,6 +2393,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000120 ",
         "Counter": "0,1,2,3",
@@ -2255,6 +2406,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000120 ",
         "Counter": "0,1,2,3",
@@ -2267,6 +2419,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000120 ",
         "Counter": "0,1,2,3",
@@ -2279,6 +2432,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000120 ",
         "Counter": "0,1,2,3",
@@ -2291,6 +2445,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000120 ",
         "Counter": "0,1,2,3",
@@ -2303,6 +2458,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000120 ",
         "Counter": "0,1,2,3",
@@ -2315,6 +2471,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000120 ",
         "Counter": "0,1,2,3",
@@ -2327,6 +2484,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000120 ",
         "Counter": "0,1,2,3",
@@ -2339,6 +2497,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000120 ",
         "Counter": "0,1,2,3",
@@ -2351,6 +2510,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch RFOs that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000120 ",
         "Counter": "0,1,2,3",
@@ -2363,6 +2523,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000120 ",
         "Counter": "0,1,2,3",
@@ -2375,6 +2536,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020240 ",
         "Counter": "0,1,2,3",
@@ -2387,6 +2549,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0240 ",
         "Counter": "0,1,2,3",
@@ -2399,6 +2562,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000240 ",
         "Counter": "0,1,2,3",
@@ -2411,6 +2575,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000240 ",
         "Counter": "0,1,2,3",
@@ -2423,6 +2588,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000240 ",
         "Counter": "0,1,2,3",
@@ -2435,6 +2601,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000240 ",
         "Counter": "0,1,2,3",
@@ -2447,6 +2614,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000240 ",
         "Counter": "0,1,2,3",
@@ -2459,6 +2627,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000240 ",
         "Counter": "0,1,2,3",
@@ -2471,6 +2640,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000240 ",
         "Counter": "0,1,2,3",
@@ -2483,6 +2653,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000240 ",
         "Counter": "0,1,2,3",
@@ -2495,6 +2666,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000240 ",
         "Counter": "0,1,2,3",
@@ -2507,6 +2679,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch code reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000240 ",
         "Counter": "0,1,2,3",
@@ -2519,6 +2692,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000240 ",
         "Counter": "0,1,2,3",
@@ -2531,6 +2705,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020091 ",
         "Counter": "0,1,2,3",
@@ -2543,6 +2718,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0091 ",
         "Counter": "0,1,2,3",
@@ -2555,6 +2731,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000091 ",
         "Counter": "0,1,2,3",
@@ -2567,6 +2744,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000091 ",
         "Counter": "0,1,2,3",
@@ -2579,6 +2757,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000091 ",
         "Counter": "0,1,2,3",
@@ -2591,6 +2770,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000091 ",
         "Counter": "0,1,2,3",
@@ -2603,6 +2783,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000091 ",
         "Counter": "0,1,2,3",
@@ -2615,6 +2796,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000091 ",
         "Counter": "0,1,2,3",
@@ -2627,6 +2809,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000091 ",
         "Counter": "0,1,2,3",
@@ -2639,6 +2822,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000091 ",
         "Counter": "0,1,2,3",
@@ -2651,6 +2835,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000091 ",
         "Counter": "0,1,2,3",
@@ -2663,6 +2848,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000091 ",
         "Counter": "0,1,2,3",
@@ -2675,6 +2861,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000091 ",
         "Counter": "0,1,2,3",
@@ -2687,6 +2874,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2000020122 ",
         "Counter": "0,1,2,3",
@@ -2699,6 +2887,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the target was non-DRAM system address. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x20003c0122 ",
         "Counter": "0,1,2,3",
@@ -2711,6 +2900,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000122 ",
         "Counter": "0,1,2,3",
@@ -2723,6 +2913,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000122 ",
         "Counter": "0,1,2,3",
@@ -2735,6 +2926,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000122 ",
         "Counter": "0,1,2,3",
@@ -2747,6 +2939,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000122 ",
         "Counter": "0,1,2,3",
@@ -2759,6 +2952,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000122 ",
         "Counter": "0,1,2,3",
@@ -2771,6 +2965,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x2004000122 ",
         "Counter": "0,1,2,3",
@@ -2783,6 +2978,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f84000122 ",
         "Counter": "0,1,2,3",
@@ -2795,6 +2991,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 with no details on snoop-related information. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000122 ",
         "Counter": "0,1,2,3",
@@ -2807,6 +3004,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000122 ",
         "Counter": "0,1,2,3",
@@ -2819,6 +3017,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 with a snoop miss response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000122 ",
         "Counter": "0,1,2,3",
@@ -2831,6 +3030,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000122 ",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/broadwell/other.json b/tools/perf/pmu-events/arch/x86/broadwell/other.json
index edf14f0..4f829c5 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/other.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/other.json
@@ -10,16 +10,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts unhalted core cycles during which the thread is in rings 1, 2, or 3.",
-        "EventCode": "0x5C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "CPL_CYCLES.RING123",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "This event counts when there is a transition from ring 1,2 or 3 to ring0.",
         "EventCode": "0x5C",
         "Counter": "0,1,2,3",
@@ -32,6 +22,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "This event counts unhalted core cycles during which the thread is in rings 1, 2, or 3.",
+        "EventCode": "0x5C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPL_CYCLES.RING123",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts cycles in which the L1 and L2 are locked due to a UC lock or split lock. A lock is asserted in case of locked memory access, due to noncacheable memory, locked operation that spans two cache lines, or a page walk from the noncacheable page table. L1D and L2 locks have a very high performance penalty and it is highly recommended to avoid such access.",
         "EventCode": "0x63",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/broadwell/pipeline.json b/tools/perf/pmu-events/arch/x86/broadwell/pipeline.json
index 78913ae..97c5d07 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/pipeline.json
@@ -2,32 +2,42 @@
     {
         "PublicDescription": "This event counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. \nNotes: INST_RETIRED.ANY is counted by a designated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. INST_RETIRED.ANY_P is counted by a programmable counter and it is an architectural performance event. \nCounting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "UMask": "0x1",
         "EventName": "INST_RETIRED.ANY",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Instructions retired from execution.",
-        "CounterHTOff": "Fixed counter 1"
+        "CounterHTOff": "Fixed counter 0"
     },
     {
         "PublicDescription": "This event counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
+        "Counter": "Fixed counter 1",
         "UMask": "0x2",
         "EventName": "CPU_CLK_UNHALTED.THREAD",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Core cycles when the thread is not in halt state",
-        "CounterHTOff": "Fixed counter 2"
+        "CounterHTOff": "Fixed counter 1"
+    },
+    {
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 1",
+        "UMask": "0x2",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "CounterHTOff": "Fixed counter 1"
     },
     {
         "PublicDescription": "This event counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. \nNote: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'.  This event is clocked by base clock (100 Mhz) on Sandy Bridge. The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'.  After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 3",
+        "Counter": "Fixed counter 2",
         "UMask": "0x3",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
         "PublicDescription": "This event counts how many times the load operation got the true Block-on-Store blocking code preventing store forwarding. This includes cases when:\n - preceding store conflicts with the load (incomplete overlap);\n - store forwarding is impossible due to u-arch limitations;\n - preceding lock RMW operations are not forwarded;\n - store has the no-forward bit set (uncacheable/page-split/masked stores);\n - all-blocking stores are used (mostly, fences and port I/O);\nand others.\nThe most common case is a load blocked due to its address range overlapping with a preceding smaller uncompleted store. Note: This event does not take into account cases of out-of-SW-control (for example, SbTailHit), unknown physical STA, and cases of blocking loads on store due to being non-WB memory type or a lock. These cases are covered by other events.\nSee the table of not supported store forwards in the Optimization Guide.",
@@ -59,6 +69,28 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles checkpoints in Resource Allocation Table (RAT) are recovering from JEClear or machine clear.",
+        "EventCode": "0x0D",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "EventName": "INT_MISC.RECOVERY_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread (e.g. misprediction or memory nuke)",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x0D",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "AnyThread": "1",
+        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the number of cycles during which Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the current thread. This also includes the cycles during which the Allocator is serving another thread.",
         "EventCode": "0x0D",
         "Counter": "0,1,2,3",
@@ -69,17 +101,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles checkpoints in Resource Allocation Table (RAT) are recovering from JEClear or machine clear.",
-        "EventCode": "0x0D",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "EventName": "INT_MISC.RECOVERY_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "This event counts the number of Uops issued by the Resource Allocation Table (RAT) to the reservation station (RS).",
         "EventCode": "0x0E",
         "Counter": "0,1,2,3",
@@ -90,6 +111,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "This event counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.",
+        "EventCode": "0x0E",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_ISSUED.STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PublicDescription": "Number of flags-merge uops being allocated. Such uops considered perf sensitive\n added by GSR u-arch.",
         "EventCode": "0x0E",
         "Counter": "0,1,2,3",
@@ -118,18 +151,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.",
-        "EventCode": "0x0E",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_ISSUED.STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
         "PublicDescription": "This event counts the number of the divide operations executed. Uses edge-detect and a cmask value of 1 on ARITH.FPU_DIV_ACTIVE to get the number of the divide operations executed.",
         "EventCode": "0x14",
         "Counter": "0,1,2,3",
@@ -140,6 +161,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Thread cycles when thread is not in halt state",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This is a fixed-frequency event programmed to general counters. It counts when the core is unhalted at 100 Mhz.",
         "EventCode": "0x3C",
         "Counter": "0,1,2,3",
@@ -150,6 +191,36 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate).",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x3c",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -159,6 +230,15 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts all not software-prefetch load dispatches that hit the fill buffer (FB) allocated for the software prefetch. It can also be incremented by some lock instructions. So it should only be used with profiling so that the locks can be excluded by asm inspection of the nearby instructions.",
         "EventCode": "0x4c",
         "Counter": "0,1,2,3",
@@ -225,6 +305,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x5E",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "RS_EVENTS.EMPTY_END",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts stalls occured due to changing prefix length (66, 67 or REX.W when they change the length of the decoded instruction). Occurrences counting is proportional to the number of prefixes in a 16B-line. This may result in the following penalties: three-cycle penalty for each LCP in a 16-byte chunk.",
         "EventCode": "0x87",
         "Counter": "0,1,2,3",
@@ -405,6 +497,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x89",
+        "Counter": "0,1,2,3",
+        "UMask": "0xa0",
+        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts both taken and not taken speculative and retired mispredicted macro conditional branch instructions.",
         "EventCode": "0x89",
         "Counter": "0,1,2,3",
@@ -435,6 +536,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "This event counts the number of micro-operations cancelled after they were dispatched from the scheduler to the execution units when the total number of physical register read ports across all dispatch ports exceeds the read bandwidth of the physical register file.  The SIMD_PRF subevent applies to the following instructions: VDPPS, DPPS, VPCMPESTRI, PCMPESTRI, VPCMPESTRM, PCMPESTRM, VFMADD*, VFMADDSUB*, VFMSUB*, VMSUBADD*, VFNMADD*, VFNMSUB*.  See the Broadwell Optimization Guide for more information.",
+        "EventCode": "0xA0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "EventName": "UOP_DISPATCHES_CANCELLED.SIMD_PRF",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Micro-op dispatches cancelled due to insufficient SIMD physical register file read ports",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -445,6 +556,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 0.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_0",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 0",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -455,6 +586,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 1.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -465,6 +616,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_2",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 2",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -475,6 +646,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -485,6 +676,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 4.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_4",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -495,6 +706,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 5.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_5",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 5",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -505,6 +736,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 6.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_6",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -515,6 +766,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_7",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 7",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts resource-related stall cycles. Reasons for stalls can be as follows:\n - *any* u-arch structure got full (LB, SB, RS, ROB, BOB, LM, Physical Register Reclaim Table (PRRT), or Physical History Table (PHT) slots)\n - *any* u-arch structure got empty (like INT/SIMD FreeLists)\n - FPU control word (FPCW), MXCSR\nand others. This counts cycles that the pipeline backend blocked uop delivery from the front end.",
         "EventCode": "0xA2",
         "Counter": "0,1,2,3",
@@ -566,15 +837,14 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Counts number of cycles the CPU has at least one pending  demand load request missing the L1 data cache.",
         "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0x8",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "CounterMask": "8",
-        "CounterHTOff": "2"
+        "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "PublicDescription": "Counts number of cycles the CPU has at least one pending  demand load request (that is cycles with non-completed load waiting for its data from memory subsystem).",
@@ -588,17 +858,37 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
+        "CounterMask": "2",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PublicDescription": "Counts number of cycles nothing is executed on any execution port.",
         "EventCode": "0xA3",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "CYCLE_ACTIVITY.CYCLES_NO_EXECUTE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Total execution stalls",
+        "BriefDescription": "This event increments by 1 for every cycle where there was no execute for this thread.",
         "CounterMask": "4",
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Total execution stalls.",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts number of cycles nothing is executed on any execution port, while there was at least one pending demand* load request missing the L2 cache.(as a footprint) * includes also L1 HW prefetch requests that may or may not be required by demands.",
         "EventCode": "0xA3",
         "Counter": "0,1,2,3",
@@ -610,6 +900,16 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x5",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
+        "CounterMask": "5",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts number of cycles nothing is executed on any execution port, while there was at least one pending demand load request.",
         "EventCode": "0xA3",
         "Counter": "0,1,2,3",
@@ -621,6 +921,37 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x6",
+        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts number of cycles the CPU has at least one pending  demand load request missing the L1 data cache.",
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0x8",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "CounterMask": "8",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0x8",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "CounterMask": "8",
+        "CounterHTOff": "2"
+    },
+    {
         "PublicDescription": "Counts number of cycles nothing is executed on any execution port, while there was at least one pending demand load request missing the L1 data cache.",
         "EventCode": "0xA3",
         "Counter": "2",
@@ -632,7 +963,16 @@
         "CounterHTOff": "2"
     },
     {
-        "PublicDescription": "Number of Uops delivered by the LSD. ",
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0xc",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
+        "CounterMask": "12",
+        "CounterHTOff": "2"
+    },
+    {
         "EventCode": "0xA8",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -642,6 +982,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA8",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LSD.CYCLES_4_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA8",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LSD.CYCLES_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Number of uops to be executed per-thread each cycle.",
         "EventCode": "0xB1",
         "Counter": "0,1,2,3",
@@ -652,16 +1012,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Number of uops executed from any thread.",
-        "EventCode": "0xB1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_EXECUTED.CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of uops executed on the core.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "This event counts cycles during which no uops were dispatched from the Reservation Station (RS) per thread.",
         "EventCode": "0xB1",
         "Invert": "1",
@@ -674,375 +1024,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "This event counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).",
-        "EventCode": "0xC0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "Errata": "BDM61",
-        "EventName": "INST_RETIRED.ANY_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
-        "EventCode": "0xC0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "INST_RETIRED.X87",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "FP operations  retired. X87 FP operations that have no exceptions:",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts instructions retired.",
-        "EventCode": "0xC0",
-        "Counter": "1",
-        "UMask": "0x1",
-        "Errata": "BDM11, BDM55",
-        "EventName": "INST_RETIRED.PREC_DIST",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
-        "CounterHTOff": "1"
-    },
-    {
-        "EventCode": "0xC1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This event counts all actually retired uops. Counting increments by two for micro-fused uops, and by one for macro-fused and other uops. Maximal increment value for one cycle is eight.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.ALL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Actually retired uops.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7",
-        "Data_LA": "1"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of retirement slots used.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Retirement slots used.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts cycles without actually retired uops.",
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.",
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with less than 10 actually retired uops.",
-        "CounterMask": "10",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "This event counts both thread-specific (TS) and all-thread (AT) nukes.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "MACHINE_CLEARS.CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts self-modifying code (SMC) detected, which causes a machine clear.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "MACHINE_CLEARS.SMC",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Self-modifying code (SMC) detected.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Maskmov false fault - counts number of time ucode passes through Maskmov flow due to instruction's mask being 0 while the flow was completed without raising a fault.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "MACHINE_CLEARS.MASKMOV",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts conditional branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BR_INST_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Conditional branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts both direct and indirect near call instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "BR_INST_RETIRED.NEAR_CALL",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Direct and indirect near call instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts all (macro) branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts return instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Return instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts not taken branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Not taken branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts taken branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Taken branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts far branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "Errata": "BDW98",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Far branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "Errata": "BDW98",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired. (Precise Event - PEBS)",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted conditional branch instructions retired.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted conditional branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts all mispredicted macro branch instructions retired.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All mispredicted macro branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted return instructions retired.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "BR_MISP_RETIRED.RET",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "This event counts the number of mispredicted ret instructions retired. Non PEBS",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted macro branch instructions retired. (Precise Event - PEBS)",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "This event counts cases of saving new LBR records by hardware. This assumes proper enabling of LBRs and takes into account LBR filtering done by the LBR_SELECT register.",
-        "EventCode": "0xCC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Count cases of saving new LBR",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Thread cycles when thread is not in halt state",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x89",
-        "Counter": "0,1,2,3",
-        "UMask": "0xa0",
-        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 0.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 1.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 4.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 5.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 6.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Number of near branch instructions retired that were mispredicted and taken.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xB1",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -1083,256 +1064,13 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xe6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1f",
-        "EventName": "BACLEARS.ANY",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0x8",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "CounterMask": "8",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
+        "PublicDescription": "Number of uops executed from any thread.",
+        "EventCode": "0xB1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "EventName": "UOPS_EXECUTED.CORE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
-        "CounterMask": "2",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Total execution stalls.",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0xc",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
-        "CounterMask": "12",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x5",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
-        "CounterMask": "5",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x6",
-        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "MACHINE_CLEARS.COUNT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of machine clears (nukes) of any type.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LSD.CYCLES_4_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x5E",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "RS_EVENTS.EMPTY_END",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LSD.CYCLES_ACTIVE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_0",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 0",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_1",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_2",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 2",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_4",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_5",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 5",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_7",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 7",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of micro-operations cancelled after they were dispatched from the scheduler to the execution units when the total number of physical register read ports across all dispatch ports exceeds the read bandwidth of the physical register file.  The SIMD_PRF subevent applies to the following instructions: VDPPS, DPPS, VPCMPESTRI, PCMPESTRI, VPCMPESTRM, PCMPESTRM, VFMADD*, VFMADDSUB*, VFMSUB*, VMSUBADD*, VFNMADD*, VFNMSUB*.  See the Broadwell Optimization Guide for more information.",
-        "EventCode": "0xA0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "EventName": "UOP_DISPATCHES_CANCELLED.SIMD_PRF",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Micro-op dispatches cancelled due to insufficient SIMD physical register file read ports",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
-        "UMask": "0x2",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x0D",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "AnyThread": "1",
-        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
-        "CounterMask": "1",
+        "BriefDescription": "Number of uops executed on the core.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -1386,32 +1124,304 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate).",
-        "EventCode": "0x3C",
+        "PublicDescription": "This event counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).",
+        "EventCode": "0xC0",
         "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "UMask": "0x0",
+        "Errata": "BDM61",
+        "EventName": "INST_RETIRED.ANY_P",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
+        "PEBS": "2",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts instructions retired.",
+        "EventCode": "0xC0",
+        "Counter": "1",
         "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "Errata": "BDM11, BDM55",
+        "EventName": "INST_RETIRED.PREC_DIST",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
+        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
+        "CounterHTOff": "1"
     },
     {
-        "EventCode": "0x3C",
+        "PublicDescription": "This event counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
+        "EventCode": "0xC0",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "EventName": "INST_RETIRED.X87",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "BriefDescription": "FP operations  retired. X87 FP operations that have no exceptions:",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts all actually retired uops. Counting increments by two for micro-fused uops, and by one for macro-fused and other uops. Maximal increment value for one cycle is eight.",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.ALL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Actually retired uops. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7",
+        "Data_LA": "1"
+    },
+    {
+        "PublicDescription": "This event counts cycles without actually retired uops.",
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.",
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with less than 10 actually retired uops.",
+        "CounterMask": "10",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts the number of retirement slots used.",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Retirement slots used. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts both thread-specific (TS) and all-thread (AT) nukes.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "MACHINE_CLEARS.CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "MACHINE_CLEARS.COUNT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts self-modifying code (SMC) detected, which causes a machine clear.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "MACHINE_CLEARS.SMC",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Self-modifying code (SMC) detected.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Maskmov false fault - counts number of time ucode passes through Maskmov flow due to instruction's mask being 0 while the flow was completed without raising a fault.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "MACHINE_CLEARS.MASKMOV",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts all (macro) branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts conditional branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_INST_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Conditional branch instructions retired. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts both direct and indirect near call instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Direct and indirect near call instructions retired. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts both direct and indirect macro near call instructions retired (captured in ring 3).",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL_R3",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Direct and indirect macro near call instructions retired (captured in ring 3). (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "Errata": "BDW98",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts return instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Return instructions retired. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts not taken branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Not taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts taken branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Taken branch instructions retired. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts far branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "Errata": "BDW98",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Far branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts all mispredicted macro branch instructions retired.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts mispredicted conditional branch instructions retired.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted conditional branch instructions retired. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted macro branch instructions retired. (Precise Event - PEBS)",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts mispredicted return instructions retired.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "BR_MISP_RETIRED.RET",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "This event counts the number of mispredicted ret instructions retired.(Precise Event)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "Number of near branch instructions retired that were mispredicted and taken. (Precise Event - PEBS).",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken. (Precise Event - PEBS).",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts cases of saving new LBR records by hardware. This assumes proper enabling of LBRs and takes into account LBR filtering done by the LBR_SELECT register.",
+        "EventCode": "0xCC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Count cases of saving new LBR",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xe6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1f",
+        "EventName": "BACLEARS.ANY",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwell/virtual-memory.json b/tools/perf/pmu-events/arch/x86/broadwell/virtual-memory.json
index 4301e6f..2a015e4c 100644
--- a/tools/perf/pmu-events/arch/x86/broadwell/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/broadwell/virtual-memory.json
@@ -44,6 +44,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "Errata": "BDM69",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the number of cycles while PMH is busy with the page walk.",
         "EventCode": "0x08",
         "Counter": "0,1,2,3",
@@ -73,6 +83,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x60",
+        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts store misses in all DTLB levels that cause page walks of any page size (4K/2M/4M/1G).",
         "EventCode": "0x49",
         "Counter": "0,1,2,3",
@@ -117,6 +136,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "Errata": "BDM69",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the number of cycles while PMH is busy with the page walk.",
         "EventCode": "0x49",
         "Counter": "0,1,2,3",
@@ -146,6 +175,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x60",
+        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts cycles for an extended page table walk. The Extended Page directory cache differs from standard TLB caches by the operating system that use it. Virtual machine operating systems use the extended page directory cache, while guest operating systems use the standard TLB caches.",
         "EventCode": "0x4F",
         "Counter": "0,1,2,3",
@@ -200,6 +238,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x85",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "Errata": "BDM69",
+        "EventName": "ITLB_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Misses in all ITLB levels that cause completed page walks.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the number of cycles while PMH is busy with the page walk.",
         "EventCode": "0x85",
         "Counter": "0,1,2,3",
@@ -229,6 +277,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x85",
+        "Counter": "0,1,2,3",
+        "UMask": "0x60",
+        "EventName": "ITLB_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts the number of flushes of the big or small ITLB pages. Counting include both TLB Flush (covering all sets) and TLB Set Clear (set-specific).",
         "EventCode": "0xAE",
         "Counter": "0,1,2,3",
@@ -251,16 +308,6 @@
     {
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
-        "UMask": "0x21",
-        "Errata": "BDM69, BDM98",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of ITLB page walker hits in the L1+FB.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
         "UMask": "0x12",
         "Errata": "BDM69, BDM98",
         "EventName": "PAGE_WALKER_LOADS.DTLB_L2",
@@ -271,16 +318,6 @@
     {
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
-        "UMask": "0x22",
-        "Errata": "BDM69, BDM98",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of ITLB page walker hits in the L2.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
         "UMask": "0x14",
         "Errata": "BDM69, BDM98",
         "EventName": "PAGE_WALKER_LOADS.DTLB_L3",
@@ -291,16 +328,6 @@
     {
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
-        "UMask": "0x24",
-        "Errata": "BDM69, BDM98",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
         "UMask": "0x18",
         "Errata": "BDM69, BDM98",
         "EventName": "PAGE_WALKER_LOADS.DTLB_MEMORY",
@@ -309,6 +336,36 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x21",
+        "Errata": "BDM69, BDM98",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of ITLB page walker hits in the L1+FB.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x22",
+        "Errata": "BDM69, BDM98",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of ITLB page walker hits in the L2.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x24",
+        "Errata": "BDM69, BDM98",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PublicDescription": "This event counts the number of DTLB flush attempts of the thread-specific entries.",
         "EventCode": "0xBD",
         "Counter": "0,1,2,3",
@@ -327,62 +384,5 @@
         "SampleAfterValue": "100007",
         "BriefDescription": "STLB flush attempts",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "Errata": "BDM69",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x60",
-        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "Errata": "BDM69",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x60",
-        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "Errata": "BDM69",
-        "EventName": "ITLB_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Misses in all ITLB levels that cause completed page walks.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "Counter": "0,1,2,3",
-        "UMask": "0x60",
-        "EventName": "ITLB_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/cache.json b/tools/perf/pmu-events/arch/x86/broadwellde/cache.json
index 36fe398..bf243fe 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellde/cache.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/cache.json
@@ -11,11 +11,28 @@
     },
     {
         "EventCode": "0x24",
-        "UMask": "0x41",
-        "BriefDescription": "Demand Data Read requests that hit L2 cache",
+        "UMask": "0x22",
+        "BriefDescription": "RFO requests that miss L2 cache.",
         "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT",
-        "PublicDescription": "This event counts the number of demand Data Read requests that hit L2 cache. Only not rejected loads are counted.",
+        "EventName": "L2_RQSTS.RFO_MISS",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x24",
+        "BriefDescription": "L2 cache misses when fetching instructions.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.CODE_RD_MISS",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x27",
+        "BriefDescription": "Demand requests that miss L2 cache.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
         "SampleAfterValue": "200003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -31,6 +48,43 @@
     },
     {
         "EventCode": "0x24",
+        "UMask": "0x3f",
+        "BriefDescription": "All requests that miss L2 cache.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.MISS",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x41",
+        "BriefDescription": "Demand Data Read requests that hit L2 cache",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT",
+        "PublicDescription": "This event counts the number of demand Data Read requests that hit L2 cache. Only not rejected loads are counted.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x42",
+        "BriefDescription": "RFO requests that hit L2 cache.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.RFO_HIT",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x44",
+        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.CODE_RD_HIT",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
         "UMask": "0x50",
         "BriefDescription": "L2 prefetch requests that hit L2 cache",
         "Counter": "0,1,2,3",
@@ -71,6 +125,15 @@
     },
     {
         "EventCode": "0x24",
+        "UMask": "0xe7",
+        "BriefDescription": "Demand requests to L2 cache.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
         "UMask": "0xf8",
         "BriefDescription": "Requests from L2 hardware prefetchers",
         "Counter": "0,1,2,3",
@@ -80,6 +143,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x24",
+        "UMask": "0xff",
+        "BriefDescription": "All L2 requests.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.REFERENCES",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x27",
         "UMask": "0x50",
         "BriefDescription": "Not rejected writebacks that hit L2 cache",
@@ -131,6 +203,27 @@
         "CounterHTOff": "2"
     },
     {
+        "EventCode": "0x48",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
+        "Counter": "2",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
+        "AnyThread": "1",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0x48",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
+        "Counter": "0,1,2,3",
+        "EventName": "L1D_PEND_MISS.FB_FULL",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x51",
         "UMask": "0x1",
         "BriefDescription": "L1D data line replacements",
@@ -153,12 +246,35 @@
     },
     {
         "EventCode": "0x60",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
+        "CounterMask": "1",
+        "Errata": "BDM76",
+        "PublicDescription": "This event counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
+        "CounterMask": "6",
+        "Errata": "BDM76",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
         "UMask": "0x2",
         "BriefDescription": "Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
         "Errata": "BDM76",
-        "PublicDescription": "This event counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The \"Offcore outstanding\" state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "PublicDescription": "This event counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The Offcore outstanding state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -175,6 +291,18 @@
     },
     {
         "EventCode": "0x60",
+        "UMask": "0x4",
+        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
+        "CounterMask": "1",
+        "Errata": "BDM76",
+        "PublicDescription": "This event counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The Offcore outstanding state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
         "UMask": "0x8",
         "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
         "Counter": "0,1,2,3",
@@ -186,18 +314,6 @@
     },
     {
         "EventCode": "0x60",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
-        "CounterMask": "1",
-        "Errata": "BDM76",
-        "PublicDescription": "This event counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x60",
         "UMask": "0x8",
         "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore",
         "Counter": "0,1,2,3",
@@ -209,18 +325,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x60",
-        "UMask": "0x4",
-        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
-        "CounterMask": "1",
-        "Errata": "BDM76",
-        "PublicDescription": "This event counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The \"Offcore outstanding\" state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0x63",
         "UMask": "0x2",
         "BriefDescription": "Cycles when L1D is locked",
@@ -266,7 +370,7 @@
         "BriefDescription": "Demand and prefetch data reads",
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_REQUESTS.ALL_DATA_RD",
-        "PublicDescription": "This event counts the demand and prefetch data reads. All Core Data Reads include cacheable \"Demands\" and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type.",
+        "PublicDescription": "This event counts the demand and prefetch data reads. All Core Data Reads include cacheable Demands and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -281,26 +385,35 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0xD0",
         "UMask": "0x11",
-        "BriefDescription": "Retired load uops that miss the STLB.",
+        "BriefDescription": "Retired load uops that miss the STLB. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_LOADS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts load uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x12",
-        "BriefDescription": "Retired store uops that miss the STLB.",
+        "BriefDescription": "Retired store uops that miss the STLB. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_STORES",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts store uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts store uops true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
         "SampleAfterValue": "100003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -308,37 +421,37 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x21",
-        "BriefDescription": "Retired load uops with locked access.",
+        "BriefDescription": "Retired load uops with locked access. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.LOCK_LOADS",
         "Errata": "BDM35",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts load uops with locked access retired to the architected path.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops with locked access retired to the architected path.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x41",
-        "BriefDescription": "Retired load uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired load uops that split across a cacheline boundary.(Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x42",
-        "BriefDescription": "Retired store uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
         "SampleAfterValue": "100003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -346,24 +459,24 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x81",
-        "BriefDescription": "All retired load uops.",
+        "BriefDescription": "All retired load uops. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
-        "PublicDescription": "This event counts load uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement. This event also counts SW prefetches.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event ?ounts AVX-256bit load/store double-pump memory uops as a single uop at retirement. This event also counts SW prefetches.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x82",
-        "BriefDescription": "All retired store uops.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
-        "PublicDescription": "This event counts store uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts store uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event ?ounts AVX-256bit load/store double-pump memory uops as a single uop at retirement.",
         "SampleAfterValue": "2000003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -371,69 +484,69 @@
     {
         "EventCode": "0xD1",
         "UMask": "0x1",
-        "BriefDescription": "Retired load uops with L1 cache hits as data sources.",
+        "BriefDescription": "Retired load uops with L1 cache hits as data sources. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_HIT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the nearest-level (L1) cache.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load. This event also counts SW prefetches independent of the actual data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data source were hits in the nearest-level (L1) cache.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load. This event also counts SW prefetches independent of the actual data source.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x2",
-        "BriefDescription": "Retired load uops with L2 cache hits as data sources.",
+        "BriefDescription": "Retired load uops with L2 cache hits as data sources. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_HIT",
         "Errata": "BDM35",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the mid-level (L2) cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were hits in the mid-level (L2) cache.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x4",
-        "BriefDescription": "Retired load uops which data sources were data hits in L3 without snoops required.",
+        "BriefDescription": "Hit in last-level (L3) cache. Excludes Unknown data-source. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L3_HIT",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were data hits in the last-level (L3) cache without snoops required.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were data hits in the last-level (L3) cache without snoops required.",
         "SampleAfterValue": "50021",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x8",
-        "BriefDescription": "Retired load uops misses in L1 cache as data sources.",
+        "BriefDescription": "Retired load uops misses in L1 cache as data sources. Uses PEBS.",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were misses in the nearest-level (L1) cache. Counting excludes unknown and UC data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were misses in the nearest-level (L1) cache. Counting excludes unknown and UC data source.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x10",
-        "BriefDescription": "Miss in mid-level (L2) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Retired load uops with L2 cache misses as data sources. Uses PEBS.",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_MISS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were misses in the mid-level (L2) cache. Counting excludes unknown and UC data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were misses in the mid-level (L2) cache. Counting excludes unknown and UC data source.",
         "SampleAfterValue": "50021",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x20",
-        "BriefDescription": "Miss in last-level (L3) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Miss in last-level (L3) cache. Excludes Unknown data-source. (Precise Event - PEBS).",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -445,77 +558,112 @@
     {
         "EventCode": "0xD1",
         "UMask": "0x40",
-        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready.",
+        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.HIT_LFB",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were load uops missed L1 but hit a fill buffer due to a preceding miss to the same cache line with the data not ready.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were load uops missed L1 but hit a fill buffer due to a preceding miss to the same cache line with the data not ready.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x1",
-        "BriefDescription": "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache.",
+        "BriefDescription": "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were L3 Hit and a cross-core snoop missed in the on-pkg core cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were L3 Hit and a cross-core snoop missed in the on-pkg core cache.",
         "SampleAfterValue": "20011",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x2",
-        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache.",
+        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were L3 hit and a cross-core snoop hit in the on-pkg core cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were L3 hit and a cross-core snoop hit in the on-pkg core cache.",
         "SampleAfterValue": "20011",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x4",
-        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3.",
+        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were HitM responses from a core on same socket (shared L3).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were HitM responses from a core on same socket (shared L3).",
         "SampleAfterValue": "20011",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x8",
-        "BriefDescription": "Retired load uops which data sources were hits in L3 without snoops required.",
+        "BriefDescription": "Retired load uops which data sources were hits in L3 without snoops required. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the last-level (L3) cache without snoops required.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were hits in the last-level (L3) cache without snoops required.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD3",
         "UMask": "0x1",
-        "BriefDescription": "Data from local DRAM either Snoop not needed or Snoop Miss (RspI)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM",
         "Errata": "BDE70, BDM100",
-        "PublicDescription": "Retired load uop whose Data Source was: local DRAM either Snoop not needed or Snoop Miss (RspI).",
+        "PublicDescription": "This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches. This is a precise event.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xD3",
+        "UMask": "0x4",
+        "BriefDescription": "Retired load uop whose Data Source was: remote DRAM either Snoop not needed or Snoop Miss (RspI) (Precise Event)",
+        "Data_LA": "1",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_DRAM",
+        "Errata": "BDE70",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xD3",
+        "UMask": "0x10",
+        "BriefDescription": "Retired load uop whose Data Source was: Remote cache HITM (Precise Event)",
+        "Data_LA": "1",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_HITM",
+        "Errata": "BDE70",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xD3",
+        "UMask": "0x20",
+        "BriefDescription": "Retired load uop whose Data Source was: forwarded from remote cache (Precise Event)",
+        "Data_LA": "1",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "MEM_LOAD_UOPS_L3_MISS_RETIRED.REMOTE_FWD",
+        "Errata": "BDE70",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
@@ -657,118 +805,5 @@
         "PublicDescription": "This event counts the number of split locks in the super queue.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x42",
-        "BriefDescription": "RFO requests that hit L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.RFO_HIT",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x22",
-        "BriefDescription": "RFO requests that miss L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.RFO_MISS",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x44",
-        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.CODE_RD_HIT",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x24",
-        "BriefDescription": "L2 cache misses when fetching instructions.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.CODE_RD_MISS",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x27",
-        "BriefDescription": "Demand requests that miss L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0xe7",
-        "BriefDescription": "Demand requests to L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x3f",
-        "BriefDescription": "All requests that miss L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.MISS",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0xff",
-        "BriefDescription": "All L2 requests.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.REFERENCES",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "UMask": "0x1",
-        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_RESPONSE",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x60",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
-        "CounterMask": "6",
-        "Errata": "BDM76",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x48",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
-        "Counter": "2",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
-        "AnyThread": "1",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0x48",
-        "UMask": "0x2",
-        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
-        "Counter": "0,1,2,3",
-        "EventName": "L1D_PEND_MISS.FB_FULL",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/floating-point.json b/tools/perf/pmu-events/arch/x86/broadwellde/floating-point.json
index 4ae1ea2..d7b9d9c 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellde/floating-point.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/floating-point.json
@@ -6,7 +6,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OTHER_ASSISTS.AVX_TO_SSE",
         "Errata": "BDM30",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of transitions from AVX-256 to legacy SSE when penalty is applicable.",
+        "PublicDescription": "This event counts the number of transitions from AVX-256 to legacy SSE when penalty is applicable.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -17,7 +17,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OTHER_ASSISTS.SSE_TO_AVX",
         "Errata": "BDM30",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of transitions from legacy SSE to AVX-256 when penalty is applicable.",
+        "PublicDescription": "This event counts the number of transitions from legacy SSE to AVX-256 when penalty is applicable.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -25,7 +25,6 @@
         "EventCode": "0xC7",
         "UMask": "0x1",
         "BriefDescription": "Number of SSE/AVX computational scalar double precision floating-point instructions retired.  Each count represents 1 computation. Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB.  FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "FP_ARITH_INST_RETIRED.SCALAR_DOUBLE",
         "SampleAfterValue": "2000003",
@@ -35,7 +34,6 @@
         "EventCode": "0xC7",
         "UMask": "0x2",
         "BriefDescription": "Number of SSE/AVX computational scalar single precision floating-point instructions retired.  Each count represents 1 computation. Applies to SSE* and AVX* scalar single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT FM(N)ADD/SUB.  FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "FP_ARITH_INST_RETIRED.SCALAR_SINGLE",
         "SampleAfterValue": "2000003",
@@ -43,97 +41,6 @@
     },
     {
         "EventCode": "0xC7",
-        "UMask": "0x4",
-        "BriefDescription": "Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired.  Each count represents 2 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
-        "UMask": "0x8",
-        "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
-        "UMask": "0x10",
-        "BriefDescription": "Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x2",
-        "BriefDescription": "Number of X87 assists due to output value.",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.X87_OUTPUT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of x87 floating point (FP) micro-code assist (numeric overflow/underflow, inexact result) when the output value (destination register) is invalid.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x4",
-        "BriefDescription": "Number of X87 assists due to input value.",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.X87_INPUT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts x87 floating point (FP) micro-code assist (invalid operation, denormal operand, SNaN operand) when the input value (one of the source operands to an FP instruction) is invalid.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x8",
-        "BriefDescription": "Number of SIMD FP assists due to Output values",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.SIMD_OUTPUT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of SSE* floating point (FP) micro-code assist (numeric overflow/underflow) when the output value (destination register) is invalid. Counting covers only cases involving penalties that require micro-code assist intervention.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x10",
-        "BriefDescription": "Number of SIMD FP assists due to input values",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.SIMD_INPUT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts any input SSE* FP assist - invalid operation, denormal operand, dividing by zero, SNaN operand. Counting includes only cases involving penalties that required micro-code assist intervention.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x1e",
-        "BriefDescription": "Cycles with any input/output SSE or FP assist",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.ANY",
-        "CounterMask": "1",
-        "PublicDescription": "This event counts cycles with any input and output SSE or x87 FP assist. If an input and output assist are detected on the same cycle the event increments by 1.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xc7",
-        "UMask": "0x20",
-        "BriefDescription": "Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired.  Each count represents 8 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
         "UMask": "0x3",
         "BriefDescription": "Number of SSE/AVX computational scalar floating-point instructions retired. Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "Counter": "0,1,2,3",
@@ -143,11 +50,47 @@
     },
     {
         "EventCode": "0xC7",
-        "UMask": "0x3c",
-        "BriefDescription": "Number of SSE/AVX computational packed floating-point instructions retired. Applies to SSE* and AVX*, packed, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "UMask": "0x4",
+        "BriefDescription": "Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired.  Each count represents 2 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.PACKED",
-        "SampleAfterValue": "2000004",
+        "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC7",
+        "UMask": "0x8",
+        "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC7",
+        "UMask": "0x10",
+        "BriefDescription": "Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC7",
+        "UMask": "0x15",
+        "BriefDescription": "Number of SSE/AVX computational double precision floating-point instructions retired. Applies to SSE* and AVX*scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.  ?.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ARITH_INST_RETIRED.DOUBLE",
+        "SampleAfterValue": "2000006",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xc7",
+        "UMask": "0x20",
+        "BriefDescription": "Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired.  Each count represents 8 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE",
+        "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -161,11 +104,62 @@
     },
     {
         "EventCode": "0xC7",
-        "UMask": "0x15",
-        "BriefDescription": "Number of SSE/AVX computational double precision floating-point instructions retired. Applies to SSE* and AVX*scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.  ?.",
+        "UMask": "0x3c",
+        "BriefDescription": "Number of SSE/AVX computational packed floating-point instructions retired. Applies to SSE* and AVX*, packed, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.DOUBLE",
-        "SampleAfterValue": "2000006",
+        "EventName": "FP_ARITH_INST_RETIRED.PACKED",
+        "SampleAfterValue": "2000004",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x2",
+        "BriefDescription": "Number of X87 assists due to output value.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.X87_OUTPUT",
+        "PublicDescription": "This event counts the number of x87 floating point (FP) micro-code assist (numeric overflow/underflow, inexact result) when the output value (destination register) is invalid.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x4",
+        "BriefDescription": "Number of X87 assists due to input value.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.X87_INPUT",
+        "PublicDescription": "This event counts x87 floating point (FP) micro-code assist (invalid operation, denormal operand, SNaN operand) when the input value (one of the source operands to an FP instruction) is invalid.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x8",
+        "BriefDescription": "Number of SIMD FP assists due to Output values",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.SIMD_OUTPUT",
+        "PublicDescription": "This event counts the number of SSE* floating point (FP) micro-code assist (numeric overflow/underflow) when the output value (destination register) is invalid. Counting covers only cases involving penalties that require micro-code assist intervention.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x10",
+        "BriefDescription": "Number of SIMD FP assists due to input values",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.SIMD_INPUT",
+        "PublicDescription": "This event counts any input SSE* FP assist - invalid operation, denormal operand, dividing by zero, SNaN operand. Counting includes only cases involving penalties that required micro-code assist intervention.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x1e",
+        "BriefDescription": "Cycles with any input/output SSE or FP assist",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.ANY",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles with any input and output SSE or x87 FP assist. If an input and output assist are detected on the same cycle the event increments by 1.",
+        "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/frontend.json b/tools/perf/pmu-events/arch/x86/broadwellde/frontend.json
index 06bf0a4..72781e1 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellde/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/frontend.json
@@ -15,58 +15,7 @@
         "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path",
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MITE_UOPS",
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x8",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.DSB_UOPS",
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x10",
-        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_DSB_UOPS",
-        "PublicDescription": "This event counts the number of uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x20",
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_MITE_UOPS",
-        "PublicDescription": "This event counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_UOPS",
-        "PublicDescription": "This event counts the total number of uops delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_CYCLES",
-        "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -77,7 +26,17 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MITE_CYCLES",
         "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x8",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.DSB_UOPS",
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -88,7 +47,17 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.DSB_CYCLES",
         "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x10",
+        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_DSB_UOPS",
+        "PublicDescription": "This event counts the number of uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -99,7 +68,7 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MS_DSB_CYCLES",
         "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -111,7 +80,7 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MS_DSB_OCCUR",
         "CounterMask": "1",
-        "PublicDescription": "This event counts the number of deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -122,7 +91,7 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
         "CounterMask": "4",
-        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -133,7 +102,17 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS",
         "CounterMask": "1",
-        "PublicDescription": "This event counts the number of cycles  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of cycles  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x20",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_MITE_UOPS",
+        "PublicDescription": "This event counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -144,7 +123,7 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.ALL_MITE_CYCLES_4_UOPS",
         "CounterMask": "4",
-        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -155,7 +134,39 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.ALL_MITE_CYCLES_ANY_UOPS",
         "CounterMask": "1",
-        "PublicDescription": "This event counts the number of cycles  uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of cycles  uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_UOPS",
+        "PublicDescription": "This event counts the total number of uops delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EdgeDetect": "1",
+        "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_SWITCHES",
+        "CounterMask": "1",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -165,7 +176,7 @@
         "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path",
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MITE_ALL_UOPS",
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -205,7 +216,7 @@
         "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled",
         "Counter": "0,1,2,3",
         "EventName": "IDQ_UOPS_NOT_DELIVERED.CORE",
-        "PublicDescription": "This event counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding ?4 ? x? when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when:\n a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread;\n b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions); \n c. Instruction Decode Queue (IDQ) delivers four uops.",
+        "PublicDescription": "This event counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding 4  x when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when:\n a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread;\n b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions); \n c. Instruction Decode Queue (IDQ) delivers four uops.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -268,18 +279,7 @@
         "BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.",
         "Counter": "0,1,2,3",
         "EventName": "DSB2MITE_SWITCHES.PENALTY_CYCLES",
-        "PublicDescription": "This event counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. \nMM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.\nPenalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 0?2 cycles.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_SWITCHES",
-        "CounterMask": "1",
+        "PublicDescription": "This event counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. \nMM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.\nPenalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 02 cycles.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/memory.json b/tools/perf/pmu-events/arch/x86/broadwellde/memory.json
index cfa1e58..e44f73c 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellde/memory.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/memory.json
@@ -95,7 +95,6 @@
         "BriefDescription": "Counts the number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort.",
         "Counter": "0,1,2,3",
         "EventName": "TX_EXEC.MISC1",
-        "PublicDescription": "Unfriendly TSX abort triggered by  a flowmarker.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -171,11 +170,11 @@
     {
         "EventCode": "0xc8",
         "UMask": "0x4",
-        "BriefDescription": "Number of times HLE abort was triggered",
+        "BriefDescription": "Number of times HLE abort was triggered (PEBS)",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "HLE_RETIRED.ABORTED",
-        "PublicDescription": "Number of times HLE abort was triggered.",
+        "PublicDescription": "Number of times HLE abort was triggered (PEBS).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -252,11 +251,11 @@
     {
         "EventCode": "0xc9",
         "UMask": "0x4",
-        "BriefDescription": "Number of times RTM abort was triggered",
+        "BriefDescription": "Number of times RTM abort was triggered (PEBS)",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "RTM_RETIRED.ABORTED",
-        "PublicDescription": "Number of times RTM abort was triggered .",
+        "PublicDescription": "Number of times RTM abort was triggered (PEBS).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/other.json b/tools/perf/pmu-events/arch/x86/broadwellde/other.json
index 718fcb1..4475249 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellde/other.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/other.json
@@ -10,16 +10,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x5C",
-        "UMask": "0x2",
-        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
-        "Counter": "0,1,2,3",
-        "EventName": "CPL_CYCLES.RING123",
-        "PublicDescription": "This event counts unhalted core cycles during which the thread is in rings 1, 2, or 3.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EdgeDetect": "1",
         "EventCode": "0x5C",
         "UMask": "0x1",
@@ -32,6 +22,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x5C",
+        "UMask": "0x2",
+        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
+        "Counter": "0,1,2,3",
+        "EventName": "CPL_CYCLES.RING123",
+        "PublicDescription": "This event counts unhalted core cycles during which the thread is in rings 1, 2, or 3.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x63",
         "UMask": "0x1",
         "BriefDescription": "Cycles when L1 and L2 are locked due to UC or split lock",
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/pipeline.json b/tools/perf/pmu-events/arch/x86/broadwellde/pipeline.json
index 02b4e10..920c89d 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellde/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/pipeline.json
@@ -3,31 +3,41 @@
         "EventCode": "0x00",
         "UMask": "0x1",
         "BriefDescription": "Instructions retired from execution.",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "EventName": "INST_RETIRED.ANY",
         "PublicDescription": "This event counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. \nNotes: INST_RETIRED.ANY is counted by a designated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. INST_RETIRED.ANY_P is counted by a programmable counter and it is an architectural performance event. \nCounting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.",
         "SampleAfterValue": "2000003",
+        "CounterHTOff": "Fixed counter 0"
+    },
+    {
+        "EventCode": "0x00",
+        "UMask": "0x2",
+        "BriefDescription": "Core cycles when the thread is not in halt state",
+        "Counter": "Fixed counter 1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD",
+        "PublicDescription": "This event counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
+        "SampleAfterValue": "2000003",
         "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "UMask": "0x2",
-        "BriefDescription": "Core cycles when the thread is not in halt state",
-        "Counter": "Fixed counter 2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD",
-        "PublicDescription": "This event counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "Counter": "Fixed counter 1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
+        "AnyThread": "1",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 2"
+        "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "UMask": "0x3",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "Counter": "Fixed counter 3",
+        "Counter": "Fixed counter 2",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "PublicDescription": "This event counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. \nNote: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'.  This event is clocked by base clock (100 Mhz) on Sandy Bridge. The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'.  After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case.",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
         "EventCode": "0x03",
@@ -60,22 +70,33 @@
     },
     {
         "EventCode": "0x0D",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the thread",
+        "UMask": "0x3",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread (e.g. misprediction or memory nuke)",
         "Counter": "0,1,2,3",
-        "EventName": "INT_MISC.RAT_STALL_CYCLES",
-        "PublicDescription": "This event counts the number of cycles during which Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the current thread. This also includes the cycles during which the Allocator is serving another thread.",
+        "EventName": "INT_MISC.RECOVERY_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "Cycles checkpoints in Resource Allocation Table (RAT) are recovering from JEClear or machine clear.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x0D",
         "UMask": "0x3",
-        "BriefDescription": "Number of cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
         "Counter": "0,1,2,3",
-        "EventName": "INT_MISC.RECOVERY_CYCLES",
+        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
+        "AnyThread": "1",
         "CounterMask": "1",
-        "PublicDescription": "Cycles checkpoints in Resource Allocation Table (RAT) are recovering from JEClear or machine clear.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x0D",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the thread",
+        "Counter": "0,1,2,3",
+        "EventName": "INT_MISC.RAT_STALL_CYCLES",
+        "PublicDescription": "This event counts the number of cycles during which Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the current thread. This also includes the cycles during which the Allocator is serving another thread.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -90,6 +111,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "Invert": "1",
+        "EventCode": "0x0E",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_ISSUED.STALL_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0x0E",
         "UMask": "0x10",
         "BriefDescription": "Number of flags-merge uops being allocated. Such uops considered perf sensitive; added by GSR u-arch.",
@@ -118,18 +151,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "Invert": "1",
-        "EventCode": "0x0E",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_ISSUED.STALL_CYCLES",
-        "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
         "EventCode": "0x14",
         "UMask": "0x1",
         "BriefDescription": "Cycles when divider is busy executing divide operations",
@@ -141,6 +162,26 @@
     },
     {
         "EventCode": "0x3C",
+        "UMask": "0x0",
+        "BriefDescription": "Thread cycles when thread is not in halt state",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
+        "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x0",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
         "UMask": "0x1",
         "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
         "Counter": "0,1,2,3",
@@ -150,6 +191,36 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "PublicDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x3c",
         "UMask": "0x2",
         "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
@@ -159,6 +230,15 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0x3C",
+        "UMask": "0x2",
+        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x4c",
         "UMask": "0x1",
         "BriefDescription": "Not software-prefetch load dispatches that hit FB allocated for software prefetch",
@@ -225,6 +305,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EdgeDetect": "1",
+        "Invert": "1",
+        "EventCode": "0x5E",
+        "UMask": "0x1",
+        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
+        "Counter": "0,1,2,3",
+        "EventName": "RS_EVENTS.EMPTY_END",
+        "CounterMask": "1",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x87",
         "UMask": "0x1",
         "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
@@ -406,6 +498,15 @@
     },
     {
         "EventCode": "0x89",
+        "UMask": "0xa0",
+        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x89",
         "UMask": "0xc1",
         "BriefDescription": "Speculative and retired mispredicted macro conditional branches",
         "Counter": "0,1,2,3",
@@ -435,6 +536,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA0",
+        "UMask": "0x3",
+        "BriefDescription": "Micro-op dispatches cancelled due to insufficient SIMD physical register file read ports",
+        "Counter": "0,1,2,3",
+        "EventName": "UOP_DISPATCHES_CANCELLED.SIMD_PRF",
+        "PublicDescription": "This event counts the number of micro-operations cancelled after they were dispatched from the scheduler to the execution units when the total number of physical register read ports across all dispatch ports exceeds the read bandwidth of the physical register file.  The SIMD_PRF subevent applies to the following instructions: VDPPS, DPPS, VPCMPESTRI, PCMPESTRI, VPCMPESTRM, PCMPESTRM, VFMADD*, VFMADDSUB*, VFMSUB*, VMSUBADD*, VFNMADD*, VFNMSUB*.  See the Broadwell Optimization Guide for more information.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0xA1",
         "UMask": "0x1",
         "BriefDescription": "Cycles per thread when uops are executed in port 0",
@@ -446,6 +557,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 0.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles per thread when uops are executed in port 0",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_0",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x2",
         "BriefDescription": "Cycles per thread when uops are executed in port 1",
         "Counter": "0,1,2,3",
@@ -456,6 +587,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 1.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles per thread when uops are executed in port 1",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_1",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x4",
         "BriefDescription": "Cycles per thread when uops are executed in port 2",
         "Counter": "0,1,2,3",
@@ -466,6 +617,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x4",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x4",
+        "BriefDescription": "Cycles per thread when uops are executed in port 2",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_2",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x8",
         "BriefDescription": "Cycles per thread when uops are executed in port 3",
         "Counter": "0,1,2,3",
@@ -476,6 +647,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles per thread when uops are executed in port 3",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_3",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x10",
         "BriefDescription": "Cycles per thread when uops are executed in port 4",
         "Counter": "0,1,2,3",
@@ -486,6 +677,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x10",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 4.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x10",
+        "BriefDescription": "Cycles per thread when uops are executed in port 4",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_4",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x20",
         "BriefDescription": "Cycles per thread when uops are executed in port 5",
         "Counter": "0,1,2,3",
@@ -496,6 +707,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x20",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 5.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x20",
+        "BriefDescription": "Cycles per thread when uops are executed in port 5",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_5",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x40",
         "BriefDescription": "Cycles per thread when uops are executed in port 6",
         "Counter": "0,1,2,3",
@@ -506,6 +737,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x40",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 6.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x40",
+        "BriefDescription": "Cycles per thread when uops are executed in port 6",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_6",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x80",
         "BriefDescription": "Cycles per thread when uops are executed in port 7",
         "Counter": "0,1,2,3",
@@ -515,6 +766,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "UMask": "0x80",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x80",
+        "BriefDescription": "Cycles per thread when uops are executed in port 7",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_7",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xA2",
         "UMask": "0x1",
         "BriefDescription": "Resource-related stall cycles",
@@ -567,14 +838,13 @@
     },
     {
         "EventCode": "0xA3",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "Counter": "2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
-        "CounterMask": "8",
-        "PublicDescription": "Counts number of cycles the CPU has at least one pending  demand load request missing the L1 data cache.",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
+        "CounterMask": "1",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0xA3",
@@ -589,8 +859,18 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "CounterMask": "2",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0x4",
-        "BriefDescription": "Total execution stalls",
+        "BriefDescription": "This event increments by 1 for every cycle where there was no execute for this thread.",
         "Counter": "0,1,2,3",
         "EventName": "CYCLE_ACTIVITY.CYCLES_NO_EXECUTE",
         "CounterMask": "4",
@@ -600,6 +880,16 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x4",
+        "BriefDescription": "Total execution stalls.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
+        "CounterMask": "4",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0x5",
         "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
         "Counter": "0,1,2,3",
@@ -611,6 +901,16 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x5",
+        "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
+        "CounterMask": "5",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0x6",
         "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
         "Counter": "0,1,2,3",
@@ -622,6 +922,37 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x6",
+        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
+        "CounterMask": "6",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "Counter": "2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "CounterMask": "8",
+        "PublicDescription": "Counts number of cycles the CPU has at least one pending  demand load request missing the L1 data cache.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0xA3",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "Counter": "2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
+        "CounterMask": "8",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0xc",
         "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
         "Counter": "2",
@@ -632,12 +963,41 @@
         "CounterHTOff": "2"
     },
     {
+        "EventCode": "0xA3",
+        "UMask": "0xc",
+        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
+        "Counter": "2",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
+        "CounterMask": "12",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
         "EventCode": "0xA8",
         "UMask": "0x1",
         "BriefDescription": "Number of Uops delivered by the LSD.",
         "Counter": "0,1,2,3",
         "EventName": "LSD.UOPS",
-        "PublicDescription": "Number of Uops delivered by the LSD. ",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA8",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+        "Counter": "0,1,2,3",
+        "EventName": "LSD.CYCLES_4_UOPS",
+        "CounterMask": "4",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA8",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
+        "Counter": "0,1,2,3",
+        "EventName": "LSD.CYCLES_ACTIVE",
+        "CounterMask": "1",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -652,16 +1012,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0xB1",
-        "UMask": "0x2",
-        "BriefDescription": "Number of uops executed on the core.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED.CORE",
-        "PublicDescription": "Number of uops executed from any thread.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "Invert": "1",
         "EventCode": "0xB1",
         "UMask": "0x1",
@@ -674,375 +1024,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xC0",
-        "UMask": "0x0",
-        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
-        "Counter": "0,1,2,3",
-        "EventName": "INST_RETIRED.ANY_P",
-        "Errata": "BDM61",
-        "PublicDescription": "This event counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC0",
-        "UMask": "0x2",
-        "BriefDescription": "FP operations  retired. X87 FP operations that have no exceptions:",
-        "Counter": "0,1,2,3",
-        "EventName": "INST_RETIRED.X87",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC0",
-        "UMask": "0x1",
-        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
-        "PEBS": "2",
-        "Counter": "1",
-        "EventName": "INST_RETIRED.PREC_DIST",
-        "Errata": "BDM11, BDM55",
-        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts instructions retired.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "1"
-    },
-    {
-        "EventCode": "0xC1",
-        "UMask": "0x40",
-        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
-        "Counter": "0,1,2,3",
-        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Actually retired uops.",
-        "Data_LA": "1",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.ALL",
-        "PublicDescription": "This event counts all actually retired uops. Counting increments by two for micro-fused uops, and by one for macro-fused and other uops. Maximal increment value for one cycle is eight.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "UMask": "0x2",
-        "BriefDescription": "Retirement slots used.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of retirement slots used.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "Invert": "1",
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.STALL_CYCLES",
-        "CounterMask": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts cycles without actually retired uops.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "Invert": "1",
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with less than 10 actually retired uops.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
-        "CounterMask": "10",
-        "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.CYCLES",
-        "PublicDescription": "This event counts both thread-specific (TS) and all-thread (AT) nukes.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x4",
-        "BriefDescription": "Self-modifying code (SMC) detected.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.SMC",
-        "PublicDescription": "This event counts self-modifying code (SMC) detected, which causes a machine clear.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x20",
-        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.MASKMOV",
-        "PublicDescription": "Maskmov false fault - counts number of time ucode passes through Maskmov flow due to instruction's mask being 0 while the flow was completed without raising a fault.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x1",
-        "BriefDescription": "Conditional branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.CONDITIONAL",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts conditional branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x2",
-        "BriefDescription": "Direct and indirect near call instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_CALL",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts both direct and indirect near call instructions retired.",
-        "SampleAfterValue": "100007",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x0",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
-        "PublicDescription": "This event counts all (macro) branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x8",
-        "BriefDescription": "Return instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts return instructions retired.",
-        "SampleAfterValue": "100007",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x10",
-        "BriefDescription": "Not taken branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts not taken branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x20",
-        "BriefDescription": "Taken branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts taken branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x40",
-        "BriefDescription": "Far branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "Errata": "BDW98",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts far branch instructions retired.",
-        "SampleAfterValue": "100007",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x4",
-        "BriefDescription": "All (macro) branch instructions retired. (Precise Event - PEBS)",
-        "PEBS": "2",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
-        "Errata": "BDW98",
-        "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x1",
-        "BriefDescription": "Mispredicted conditional branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted conditional branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x0",
-        "BriefDescription": "All mispredicted macro branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
-        "PublicDescription": "This event counts all mispredicted macro branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x8",
-        "BriefDescription": "This event counts the number of mispredicted ret instructions retired. Non PEBS",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.RET",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted return instructions retired.",
-        "SampleAfterValue": "100007",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x4",
-        "BriefDescription": "Mispredicted macro branch instructions retired. (Precise Event - PEBS)",
-        "PEBS": "2",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
-        "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xCC",
-        "UMask": "0x20",
-        "BriefDescription": "Count cases of saving new LBR",
-        "Counter": "0,1,2,3",
-        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
-        "PublicDescription": "This event counts cases of saving new LBR records by hardware. This assumes proper enabling of LBRs and takes into account LBR filtering done by the LBR_SELECT register.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x0",
-        "BriefDescription": "Thread cycles when thread is not in halt state",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
-        "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x89",
-        "UMask": "0xa0",
-        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 0.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x2",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 1.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x4",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x10",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 4.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x20",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 5.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x40",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 6.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x80",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x20",
-        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
-        "PublicDescription": "Number of near branch instructions retired that were mispredicted and taken.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xB1",
         "UMask": "0x1",
         "BriefDescription": "Cycles where at least 1 uop was executed per-thread.",
@@ -1083,255 +1064,12 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xe6",
-        "UMask": "0x1f",
-        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
-        "Counter": "0,1,2,3",
-        "EventName": "BACLEARS.ANY",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "Counter": "2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
-        "CounterMask": "8",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.",
-        "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
+        "EventCode": "0xB1",
         "UMask": "0x2",
-        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
+        "BriefDescription": "Number of uops executed on the core.",
         "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
-        "CounterMask": "2",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x4",
-        "BriefDescription": "Total execution stalls.",
-        "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
-        "CounterMask": "4",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0xc",
-        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
-        "Counter": "2",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
-        "CounterMask": "12",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x5",
-        "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
-        "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
-        "CounterMask": "5",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x6",
-        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
-        "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
-        "CounterMask": "6",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "EventCode": "0xC3",
-        "UMask": "0x1",
-        "BriefDescription": "Number of machine clears (nukes) of any type.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.COUNT",
-        "CounterMask": "1",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
-        "Counter": "0,1,2,3",
-        "EventName": "LSD.CYCLES_4_UOPS",
-        "CounterMask": "4",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "Invert": "1",
-        "EventCode": "0x5E",
-        "UMask": "0x1",
-        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
-        "Counter": "0,1,2,3",
-        "EventName": "RS_EVENTS.EMPTY_END",
-        "CounterMask": "1",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
-        "Counter": "0,1,2,3",
-        "EventName": "LSD.CYCLES_ACTIVE",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles per thread when uops are executed in port 0",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_0",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x2",
-        "BriefDescription": "Cycles per thread when uops are executed in port 1",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_1",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x4",
-        "BriefDescription": "Cycles per thread when uops are executed in port 2",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_2",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles per thread when uops are executed in port 3",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_3",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x10",
-        "BriefDescription": "Cycles per thread when uops are executed in port 4",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_4",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x20",
-        "BriefDescription": "Cycles per thread when uops are executed in port 5",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_5",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x40",
-        "BriefDescription": "Cycles per thread when uops are executed in port 6",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_6",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x80",
-        "BriefDescription": "Cycles per thread when uops are executed in port 7",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_7",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA0",
-        "UMask": "0x3",
-        "BriefDescription": "Micro-op dispatches cancelled due to insufficient SIMD physical register file read ports",
-        "Counter": "0,1,2,3",
-        "EventName": "UOP_DISPATCHES_CANCELLED.SIMD_PRF",
-        "PublicDescription": "This event counts the number of micro-operations cancelled after they were dispatched from the scheduler to the execution units when the total number of physical register read ports across all dispatch ports exceeds the read bandwidth of the physical register file.  The SIMD_PRF subevent applies to the following instructions: VDPPS, DPPS, VPCMPESTRI, PCMPESTRI, VPCMPESTRM, PCMPESTRM, VFMADD*, VFMADDSUB*, VFMSUB*, VMSUBADD*, VFNMADD*, VFNMSUB*.  See the Broadwell Optimization Guide for more information.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x00",
-        "UMask": "0x2",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "Counter": "Fixed counter 2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x0",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x0D",
-        "UMask": "0x3",
-        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
-        "Counter": "0,1,2,3",
-        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
-        "AnyThread": "1",
-        "CounterMask": "1",
+        "EventName": "UOPS_EXECUTED.CORE",
+        "PublicDescription": "Number of uops executed from any thread.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -1386,32 +1124,304 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
-        "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "EventCode": "0xC0",
+        "UMask": "0x0",
+        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
         "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
-        "PublicDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate).",
+        "EventName": "INST_RETIRED.ANY_P",
+        "Errata": "BDM61",
+        "PublicDescription": "This event counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC0",
         "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
-        "AnyThread": "1",
+        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
+        "PEBS": "2",
+        "Counter": "1",
+        "EventName": "INST_RETIRED.PREC_DIST",
+        "Errata": "BDM11, BDM55",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts instructions retired.",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
+        "CounterHTOff": "1"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC0",
         "UMask": "0x2",
-        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "BriefDescription": "FP operations  retired. X87 FP operations that have no exceptions:",
         "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "EventName": "INST_RETIRED.X87",
+        "PublicDescription": "This event counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC1",
+        "UMask": "0x40",
+        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
+        "Counter": "0,1,2,3",
+        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Actually retired uops. (Precise Event - PEBS)",
+        "Data_LA": "1",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.ALL",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts all actually retired uops. Counting increments by two for micro-fused uops, and by one for macro-fused and other uops. Maximal increment value for one cycle is eight.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "Invert": "1",
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.STALL_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles without actually retired uops.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Invert": "1",
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with less than 10 actually retired uops.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
+        "CounterMask": "10",
+        "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "UMask": "0x2",
+        "BriefDescription": "Retirement slots used. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts the number of retirement slots used.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.CYCLES",
+        "PublicDescription": "This event counts both thread-specific (TS) and all-thread (AT) nukes.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EdgeDetect": "1",
+        "EventCode": "0xC3",
+        "UMask": "0x1",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.COUNT",
+        "CounterMask": "1",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x4",
+        "BriefDescription": "Self-modifying code (SMC) detected.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.SMC",
+        "PublicDescription": "This event counts self-modifying code (SMC) detected, which causes a machine clear.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x20",
+        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.MASKMOV",
+        "PublicDescription": "Maskmov false fault - counts number of time ucode passes through Maskmov flow due to instruction's mask being 0 while the flow was completed without raising a fault.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x0",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "PublicDescription": "This event counts all (macro) branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x1",
+        "BriefDescription": "Conditional branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.CONDITIONAL",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts conditional branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x2",
+        "BriefDescription": "Direct and indirect near call instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts both direct and indirect near call instructions retired.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x2",
+        "BriefDescription": "Direct and indirect macro near call instructions retired (captured in ring 3). (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL_R3",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts both direct and indirect macro near call instructions retired (captured in ring 3).",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x4",
+        "BriefDescription": "All (macro) branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "2",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
+        "Errata": "BDW98",
+        "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x8",
+        "BriefDescription": "Return instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts return instructions retired.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x10",
+        "BriefDescription": "Not taken branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
+        "PublicDescription": "This event counts not taken branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x20",
+        "BriefDescription": "Taken branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts taken branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x40",
+        "BriefDescription": "Far branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
+        "Errata": "BDW98",
+        "PublicDescription": "This event counts far branch instructions retired.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x0",
+        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "PublicDescription": "This event counts all mispredicted macro branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x1",
+        "BriefDescription": "Mispredicted conditional branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts mispredicted conditional branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x4",
+        "BriefDescription": "Mispredicted macro branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "2",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
+        "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x8",
+        "BriefDescription": "This event counts the number of mispredicted ret instructions retired.(Precise Event)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.RET",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts mispredicted return instructions retired.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x20",
+        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken. (Precise Event - PEBS).",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "PublicDescription": "Number of near branch instructions retired that were mispredicted and taken. (Precise Event - PEBS).",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCC",
+        "UMask": "0x20",
+        "BriefDescription": "Count cases of saving new LBR",
+        "Counter": "0,1,2,3",
+        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
+        "PublicDescription": "This event counts cases of saving new LBR records by hardware. This assumes proper enabling of LBRs and takes into account LBR filtering done by the LBR_SELECT register.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xe6",
+        "UMask": "0x1f",
+        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
+        "Counter": "0,1,2,3",
+        "EventName": "BACLEARS.ANY",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwellde/virtual-memory.json b/tools/perf/pmu-events/arch/x86/broadwellde/virtual-memory.json
index 5ce8b67..7d79c70 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellde/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellde/virtual-memory.json
@@ -45,6 +45,16 @@
     },
     {
         "EventCode": "0x08",
+        "UMask": "0xe",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
+        "Errata": "BDM69",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x08",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -73,6 +83,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x08",
+        "UMask": "0x60",
+        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x49",
         "UMask": "0x1",
         "BriefDescription": "Store misses in all DTLB levels that cause page walks",
@@ -118,6 +137,16 @@
     },
     {
         "EventCode": "0x49",
+        "UMask": "0xe",
+        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
+        "Errata": "BDM69",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x49",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -146,6 +175,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x49",
+        "UMask": "0x60",
+        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x4F",
         "UMask": "0x10",
         "BriefDescription": "Cycle count for an Extended Page table walk.",
@@ -201,6 +239,16 @@
     },
     {
         "EventCode": "0x85",
+        "UMask": "0xe",
+        "BriefDescription": "Misses in all ITLB levels that cause completed page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "ITLB_MISSES.WALK_COMPLETED",
+        "Errata": "BDM69",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x85",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -229,6 +277,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x85",
+        "UMask": "0x60",
+        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "ITLB_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xAE",
         "UMask": "0x1",
         "BriefDescription": "Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.",
@@ -250,16 +307,6 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x21",
-        "BriefDescription": "Number of ITLB page walker hits in the L1+FB.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
-        "Errata": "BDM69, BDM98",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
         "UMask": "0x12",
         "BriefDescription": "Number of DTLB page walker hits in the L2.",
         "Counter": "0,1,2,3",
@@ -270,16 +317,6 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x22",
-        "BriefDescription": "Number of ITLB page walker hits in the L2.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
-        "Errata": "BDM69, BDM98",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
         "UMask": "0x14",
         "BriefDescription": "Number of DTLB page walker hits in the L3 + XSNP.",
         "Counter": "0,1,2,3",
@@ -290,20 +327,40 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x24",
-        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP.",
+        "UMask": "0x18",
+        "BriefDescription": "Number of DTLB page walker hits in Memory.",
         "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
+        "EventName": "PAGE_WALKER_LOADS.DTLB_MEMORY",
         "Errata": "BDM69, BDM98",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x18",
-        "BriefDescription": "Number of DTLB page walker hits in Memory.",
+        "UMask": "0x21",
+        "BriefDescription": "Number of ITLB page walker hits in the L1+FB.",
         "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.DTLB_MEMORY",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
+        "Errata": "BDM69, BDM98",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x22",
+        "BriefDescription": "Number of ITLB page walker hits in the L2.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
+        "Errata": "BDM69, BDM98",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x24",
+        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
         "Errata": "BDM69, BDM98",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
@@ -327,62 +384,5 @@
         "PublicDescription": "This event counts the number of any STLB flush attempts (such as entire, VPID, PCID, InvPage, CR3 write, and so on).",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "UMask": "0xe",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
-        "Errata": "BDM69",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "UMask": "0x60",
-        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "UMask": "0xe",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
-        "Errata": "BDM69",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "UMask": "0x60",
-        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "UMask": "0xe",
-        "BriefDescription": "Misses in all ITLB levels that cause completed page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "ITLB_MISSES.WALK_COMPLETED",
-        "Errata": "BDM69",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "UMask": "0x60",
-        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "ITLB_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/cache.json b/tools/perf/pmu-events/arch/x86/broadwellx/cache.json
index d1d0438..bf0c512 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellx/cache.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/cache.json
@@ -11,11 +11,28 @@
     },
     {
         "EventCode": "0x24",
-        "UMask": "0x41",
-        "BriefDescription": "Demand Data Read requests that hit L2 cache",
+        "UMask": "0x22",
+        "BriefDescription": "RFO requests that miss L2 cache.",
         "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT",
-        "PublicDescription": "This event counts the number of demand Data Read requests that hit L2 cache. Only not rejected loads are counted.",
+        "EventName": "L2_RQSTS.RFO_MISS",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x24",
+        "BriefDescription": "L2 cache misses when fetching instructions.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.CODE_RD_MISS",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x27",
+        "BriefDescription": "Demand requests that miss L2 cache.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
         "SampleAfterValue": "200003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -31,6 +48,43 @@
     },
     {
         "EventCode": "0x24",
+        "UMask": "0x3f",
+        "BriefDescription": "All requests that miss L2 cache.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.MISS",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x41",
+        "BriefDescription": "Demand Data Read requests that hit L2 cache",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT",
+        "PublicDescription": "This event counts the number of demand Data Read requests that hit L2 cache. Only not rejected loads are counted.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x42",
+        "BriefDescription": "RFO requests that hit L2 cache.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.RFO_HIT",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x44",
+        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.CODE_RD_HIT",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
         "UMask": "0x50",
         "BriefDescription": "L2 prefetch requests that hit L2 cache",
         "Counter": "0,1,2,3",
@@ -71,6 +125,15 @@
     },
     {
         "EventCode": "0x24",
+        "UMask": "0xe7",
+        "BriefDescription": "Demand requests to L2 cache.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
         "UMask": "0xf8",
         "BriefDescription": "Requests from L2 hardware prefetchers",
         "Counter": "0,1,2,3",
@@ -80,6 +143,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x24",
+        "UMask": "0xff",
+        "BriefDescription": "All L2 requests.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.REFERENCES",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x27",
         "UMask": "0x50",
         "BriefDescription": "Not rejected writebacks that hit L2 cache",
@@ -131,6 +203,27 @@
         "CounterHTOff": "2"
     },
     {
+        "EventCode": "0x48",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
+        "Counter": "2",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
+        "AnyThread": "1",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0x48",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
+        "Counter": "0,1,2,3",
+        "EventName": "L1D_PEND_MISS.FB_FULL",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x51",
         "UMask": "0x1",
         "BriefDescription": "L1D data line replacements",
@@ -153,12 +246,35 @@
     },
     {
         "EventCode": "0x60",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
+        "CounterMask": "1",
+        "Errata": "BDM76",
+        "PublicDescription": "This event counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
+        "CounterMask": "6",
+        "Errata": "BDM76",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
         "UMask": "0x2",
         "BriefDescription": "Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
         "Errata": "BDM76",
-        "PublicDescription": "This event counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The \"Offcore outstanding\" state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "PublicDescription": "This event counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The Offcore outstanding state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -175,6 +291,18 @@
     },
     {
         "EventCode": "0x60",
+        "UMask": "0x4",
+        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
+        "CounterMask": "1",
+        "Errata": "BDM76",
+        "PublicDescription": "This event counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The Offcore outstanding state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
         "UMask": "0x8",
         "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
         "Counter": "0,1,2,3",
@@ -186,18 +314,6 @@
     },
     {
         "EventCode": "0x60",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
-        "CounterMask": "1",
-        "Errata": "BDM76",
-        "PublicDescription": "This event counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x60",
         "UMask": "0x8",
         "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore",
         "Counter": "0,1,2,3",
@@ -209,18 +325,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x60",
-        "UMask": "0x4",
-        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
-        "CounterMask": "1",
-        "Errata": "BDM76",
-        "PublicDescription": "This event counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The \"Offcore outstanding\" state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0x63",
         "UMask": "0x2",
         "BriefDescription": "Cycles when L1D is locked",
@@ -266,7 +370,7 @@
         "BriefDescription": "Demand and prefetch data reads",
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_REQUESTS.ALL_DATA_RD",
-        "PublicDescription": "This event counts the demand and prefetch data reads. All Core Data Reads include cacheable \"Demands\" and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type.",
+        "PublicDescription": "This event counts the demand and prefetch data reads. All Core Data Reads include cacheable Demands and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -281,26 +385,35 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0xD0",
         "UMask": "0x11",
-        "BriefDescription": "Retired load uops that miss the STLB.",
+        "BriefDescription": "Retired load uops that miss the STLB. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_LOADS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts load uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x12",
-        "BriefDescription": "Retired store uops that miss the STLB.",
+        "BriefDescription": "Retired store uops that miss the STLB. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_STORES",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts store uops with true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts store uops true STLB miss retired to the architected path. True STLB miss is an uop triggering page walk that gets completed without blocks, and later gets retired. This page walk can end up with or without a fault.",
         "SampleAfterValue": "100003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -308,37 +421,37 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x21",
-        "BriefDescription": "Retired load uops with locked access.",
+        "BriefDescription": "Retired load uops with locked access. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.LOCK_LOADS",
         "Errata": "BDM35",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts load uops with locked access retired to the architected path.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops with locked access retired to the architected path.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x41",
-        "BriefDescription": "Retired load uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired load uops that split across a cacheline boundary.(Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted load uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x42",
-        "BriefDescription": "Retired store uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts line-splitted store uops retired to the architected path. A line split is across 64B cache-line which includes a page split (4K).",
         "SampleAfterValue": "100003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -346,24 +459,24 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x81",
-        "BriefDescription": "All retired load uops.",
+        "BriefDescription": "All retired load uops. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
-        "PublicDescription": "This event counts load uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement. This event also counts SW prefetches.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts load uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event ?ounts AVX-256bit load/store double-pump memory uops as a single uop at retirement. This event also counts SW prefetches.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x82",
-        "BriefDescription": "All retired store uops.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
-        "PublicDescription": "This event counts store uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event counts AVX-256bit load/store double-pump memory uops as a single uop at retirement.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts store uops retired to the architected path with a filter on bits 0 and 1 applied.\nNote: This event ?ounts AVX-256bit load/store double-pump memory uops as a single uop at retirement.",
         "SampleAfterValue": "2000003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -371,69 +484,69 @@
     {
         "EventCode": "0xD1",
         "UMask": "0x1",
-        "BriefDescription": "Retired load uops with L1 cache hits as data sources.",
+        "BriefDescription": "Retired load uops with L1 cache hits as data sources. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_HIT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the nearest-level (L1) cache.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load. This event also counts SW prefetches independent of the actual data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data source were hits in the nearest-level (L1) cache.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load. This event also counts SW prefetches independent of the actual data source.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x2",
-        "BriefDescription": "Retired load uops with L2 cache hits as data sources.",
+        "BriefDescription": "Retired load uops with L2 cache hits as data sources. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_HIT",
         "Errata": "BDM35",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the mid-level (L2) cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were hits in the mid-level (L2) cache.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x4",
-        "BriefDescription": "Retired load uops which data sources were data hits in L3 without snoops required.",
+        "BriefDescription": "Hit in last-level (L3) cache. Excludes Unknown data-source. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L3_HIT",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were data hits in the last-level (L3) cache without snoops required.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were data hits in the last-level (L3) cache without snoops required.",
         "SampleAfterValue": "50021",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x8",
-        "BriefDescription": "Retired load uops misses in L1 cache as data sources.",
+        "BriefDescription": "Retired load uops misses in L1 cache as data sources. Uses PEBS.",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were misses in the nearest-level (L1) cache. Counting excludes unknown and UC data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were misses in the nearest-level (L1) cache. Counting excludes unknown and UC data source.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x10",
-        "BriefDescription": "Miss in mid-level (L2) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Retired load uops with L2 cache misses as data sources. Uses PEBS.",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_MISS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were misses in the mid-level (L2) cache. Counting excludes unknown and UC data source.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were misses in the mid-level (L2) cache. Counting excludes unknown and UC data source.",
         "SampleAfterValue": "50021",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x20",
-        "BriefDescription": "Miss in last-level (L3) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Miss in last-level (L3) cache. Excludes Unknown data-source. (Precise Event - PEBS).",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -445,84 +558,83 @@
     {
         "EventCode": "0xD1",
         "UMask": "0x40",
-        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready.",
+        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.HIT_LFB",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were load uops missed L1 but hit a fill buffer due to a preceding miss to the same cache line with the data not ready.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were load uops missed L1 but hit a fill buffer due to a preceding miss to the same cache line with the data not ready.\nNote: Only two data-sources of L1/FB are applicable for AVX-256bit  even though the corresponding AVX load could be serviced by a deeper level in the memory hierarchy. Data source is reported for the Low-half load.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x1",
-        "BriefDescription": "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache.",
+        "BriefDescription": "Retired load uops which data sources were L3 hit and cross-core snoop missed in on-pkg core cache. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_MISS",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were L3 Hit and a cross-core snoop missed in the on-pkg core cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were L3 Hit and a cross-core snoop missed in the on-pkg core cache.",
         "SampleAfterValue": "20011",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x2",
-        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache.",
+        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were L3 hit and a cross-core snoop hit in the on-pkg core cache.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were L3 hit and a cross-core snoop hit in the on-pkg core cache.",
         "SampleAfterValue": "20011",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x4",
-        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3.",
+        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were HitM responses from a core on same socket (shared L3).",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were HitM responses from a core on same socket (shared L3).",
         "SampleAfterValue": "20011",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x8",
-        "BriefDescription": "Retired load uops which data sources were hits in L3 without snoops required.",
+        "BriefDescription": "Retired load uops which data sources were hits in L3 without snoops required. (Precise Event - PEBS)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_NONE",
         "Errata": "BDM100",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts retired load uops which data sources were hits in the last-level (L3) cache without snoops required.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts retired load uops which data sources were hits in the last-level (L3) cache without snoops required.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD3",
         "UMask": "0x1",
-        "BriefDescription": "Data from local DRAM either Snoop not needed or Snoop Miss (RspI)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM",
         "Errata": "BDE70, BDM100",
-        "PublicDescription": "Retired load uop whose Data Source was: local DRAM either Snoop not needed or Snoop Miss (RspI).",
+        "PublicDescription": "This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches. This is a precise event.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD3",
         "UMask": "0x4",
-        "BriefDescription": "Retired load uop whose Data Source was: remote DRAM either Snoop not needed or Snoop Miss (RspI)",
+        "BriefDescription": "Retired load uop whose Data Source was: remote DRAM either Snoop not needed or Snoop Miss (RspI) (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -534,7 +646,7 @@
     {
         "EventCode": "0xD3",
         "UMask": "0x10",
-        "BriefDescription": "Retired load uop whose Data Source was: Remote cache HITM",
+        "BriefDescription": "Retired load uop whose Data Source was: Remote cache HITM (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -546,7 +658,7 @@
     {
         "EventCode": "0xD3",
         "UMask": "0x20",
-        "BriefDescription": "Retired load uop whose Data Source was: forwarded from remote cache",
+        "BriefDescription": "Retired load uop whose Data Source was: forwarded from remote cache (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -695,119 +807,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x24",
-        "UMask": "0x42",
-        "BriefDescription": "RFO requests that hit L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.RFO_HIT",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x22",
-        "BriefDescription": "RFO requests that miss L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.RFO_MISS",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x44",
-        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.CODE_RD_HIT",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x24",
-        "BriefDescription": "L2 cache misses when fetching instructions.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.CODE_RD_MISS",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x27",
-        "BriefDescription": "Demand requests that miss L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0xe7",
-        "BriefDescription": "Demand requests to L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x3f",
-        "BriefDescription": "All requests that miss L2 cache.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.MISS",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0xff",
-        "BriefDescription": "All L2 requests.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.REFERENCES",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "UMask": "0x1",
-        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_RESPONSE",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x60",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
-        "CounterMask": "6",
-        "Errata": "BDM76",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x48",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
-        "Counter": "2",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
-        "AnyThread": "1",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0x48",
-        "UMask": "0x2",
-        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
-        "Counter": "0,1,2,3",
-        "EventName": "L1D_PEND_MISS.FB_FULL",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "Offcore": "1",
         "EventCode": "0xB7, 0xBB",
         "UMask": "0x1",
@@ -816,6 +815,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_REQUESTS.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all requests that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -828,6 +828,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -840,6 +841,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -852,6 +854,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -864,6 +867,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -876,6 +880,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -888,6 +893,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -900,6 +906,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -912,6 +919,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -924,6 +932,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_RFO.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -936,6 +945,20 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts all demand data writes (RFOs) that hit in the L3",
+        "MSRValue": "0x3f803c0002",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     }
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/floating-point.json b/tools/perf/pmu-events/arch/x86/broadwellx/floating-point.json
index 4ae1ea2..d7b9d9c 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellx/floating-point.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/floating-point.json
@@ -6,7 +6,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OTHER_ASSISTS.AVX_TO_SSE",
         "Errata": "BDM30",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of transitions from AVX-256 to legacy SSE when penalty is applicable.",
+        "PublicDescription": "This event counts the number of transitions from AVX-256 to legacy SSE when penalty is applicable.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -17,7 +17,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OTHER_ASSISTS.SSE_TO_AVX",
         "Errata": "BDM30",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of transitions from legacy SSE to AVX-256 when penalty is applicable.",
+        "PublicDescription": "This event counts the number of transitions from legacy SSE to AVX-256 when penalty is applicable.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -25,7 +25,6 @@
         "EventCode": "0xC7",
         "UMask": "0x1",
         "BriefDescription": "Number of SSE/AVX computational scalar double precision floating-point instructions retired.  Each count represents 1 computation. Applies to SSE* and AVX* scalar double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT FM(N)ADD/SUB.  FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "FP_ARITH_INST_RETIRED.SCALAR_DOUBLE",
         "SampleAfterValue": "2000003",
@@ -35,7 +34,6 @@
         "EventCode": "0xC7",
         "UMask": "0x2",
         "BriefDescription": "Number of SSE/AVX computational scalar single precision floating-point instructions retired.  Each count represents 1 computation. Applies to SSE* and AVX* scalar single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT FM(N)ADD/SUB.  FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "FP_ARITH_INST_RETIRED.SCALAR_SINGLE",
         "SampleAfterValue": "2000003",
@@ -43,97 +41,6 @@
     },
     {
         "EventCode": "0xC7",
-        "UMask": "0x4",
-        "BriefDescription": "Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired.  Each count represents 2 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
-        "UMask": "0x8",
-        "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
-        "UMask": "0x10",
-        "BriefDescription": "Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x2",
-        "BriefDescription": "Number of X87 assists due to output value.",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.X87_OUTPUT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of x87 floating point (FP) micro-code assist (numeric overflow/underflow, inexact result) when the output value (destination register) is invalid.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x4",
-        "BriefDescription": "Number of X87 assists due to input value.",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.X87_INPUT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts x87 floating point (FP) micro-code assist (invalid operation, denormal operand, SNaN operand) when the input value (one of the source operands to an FP instruction) is invalid.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x8",
-        "BriefDescription": "Number of SIMD FP assists due to Output values",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.SIMD_OUTPUT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of SSE* floating point (FP) micro-code assist (numeric overflow/underflow) when the output value (destination register) is invalid. Counting covers only cases involving penalties that require micro-code assist intervention.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x10",
-        "BriefDescription": "Number of SIMD FP assists due to input values",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.SIMD_INPUT",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts any input SSE* FP assist - invalid operation, denormal operand, dividing by zero, SNaN operand. Counting includes only cases involving penalties that required micro-code assist intervention.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xCA",
-        "UMask": "0x1e",
-        "BriefDescription": "Cycles with any input/output SSE or FP assist",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ASSIST.ANY",
-        "CounterMask": "1",
-        "PublicDescription": "This event counts cycles with any input and output SSE or x87 FP assist. If an input and output assist are detected on the same cycle the event increments by 1.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xc7",
-        "UMask": "0x20",
-        "BriefDescription": "Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired.  Each count represents 8 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC7",
         "UMask": "0x3",
         "BriefDescription": "Number of SSE/AVX computational scalar floating-point instructions retired. Applies to SSE* and AVX* scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT FM(N)ADD/SUB. FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "Counter": "0,1,2,3",
@@ -143,11 +50,47 @@
     },
     {
         "EventCode": "0xC7",
-        "UMask": "0x3c",
-        "BriefDescription": "Number of SSE/AVX computational packed floating-point instructions retired. Applies to SSE* and AVX*, packed, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "UMask": "0x4",
+        "BriefDescription": "Number of SSE/AVX computational 128-bit packed double precision floating-point instructions retired.  Each count represents 2 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.PACKED",
-        "SampleAfterValue": "2000004",
+        "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_DOUBLE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC7",
+        "UMask": "0x8",
+        "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC7",
+        "UMask": "0x10",
+        "BriefDescription": "Number of SSE/AVX computational 256-bit packed double precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed double precision floating-point instructions: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_DOUBLE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC7",
+        "UMask": "0x15",
+        "BriefDescription": "Number of SSE/AVX computational double precision floating-point instructions retired. Applies to SSE* and AVX*scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.  ?.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ARITH_INST_RETIRED.DOUBLE",
+        "SampleAfterValue": "2000006",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xc7",
+        "UMask": "0x20",
+        "BriefDescription": "Number of SSE/AVX computational 256-bit packed single precision floating-point instructions retired.  Each count represents 8 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ARITH_INST_RETIRED.256B_PACKED_SINGLE",
+        "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -161,11 +104,62 @@
     },
     {
         "EventCode": "0xC7",
-        "UMask": "0x15",
-        "BriefDescription": "Number of SSE/AVX computational double precision floating-point instructions retired. Applies to SSE* and AVX*scalar, double and single precision floating-point: ADD SUB MUL DIV MIN MAX SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.  ?.",
+        "UMask": "0x3c",
+        "BriefDescription": "Number of SSE/AVX computational packed floating-point instructions retired. Applies to SSE* and AVX*, packed, double and single precision floating-point: ADD SUB MUL DIV MIN MAX RSQRT RCP SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "Counter": "0,1,2,3",
-        "EventName": "FP_ARITH_INST_RETIRED.DOUBLE",
-        "SampleAfterValue": "2000006",
+        "EventName": "FP_ARITH_INST_RETIRED.PACKED",
+        "SampleAfterValue": "2000004",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x2",
+        "BriefDescription": "Number of X87 assists due to output value.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.X87_OUTPUT",
+        "PublicDescription": "This event counts the number of x87 floating point (FP) micro-code assist (numeric overflow/underflow, inexact result) when the output value (destination register) is invalid.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x4",
+        "BriefDescription": "Number of X87 assists due to input value.",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.X87_INPUT",
+        "PublicDescription": "This event counts x87 floating point (FP) micro-code assist (invalid operation, denormal operand, SNaN operand) when the input value (one of the source operands to an FP instruction) is invalid.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x8",
+        "BriefDescription": "Number of SIMD FP assists due to Output values",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.SIMD_OUTPUT",
+        "PublicDescription": "This event counts the number of SSE* floating point (FP) micro-code assist (numeric overflow/underflow) when the output value (destination register) is invalid. Counting covers only cases involving penalties that require micro-code assist intervention.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x10",
+        "BriefDescription": "Number of SIMD FP assists due to input values",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.SIMD_INPUT",
+        "PublicDescription": "This event counts any input SSE* FP assist - invalid operation, denormal operand, dividing by zero, SNaN operand. Counting includes only cases involving penalties that required micro-code assist intervention.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCA",
+        "UMask": "0x1e",
+        "BriefDescription": "Cycles with any input/output SSE or FP assist",
+        "Counter": "0,1,2,3",
+        "EventName": "FP_ASSIST.ANY",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles with any input and output SSE or x87 FP assist. If an input and output assist are detected on the same cycle the event increments by 1.",
+        "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/frontend.json b/tools/perf/pmu-events/arch/x86/broadwellx/frontend.json
index 06bf0a4..72781e1 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellx/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/frontend.json
@@ -15,58 +15,7 @@
         "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path",
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MITE_UOPS",
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x8",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.DSB_UOPS",
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x10",
-        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_DSB_UOPS",
-        "PublicDescription": "This event counts the number of uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x20",
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_MITE_UOPS",
-        "PublicDescription": "This event counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_UOPS",
-        "PublicDescription": "This event counts the total number of uops delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_CYCLES",
-        "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may \"bypass\" the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -77,7 +26,17 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MITE_CYCLES",
         "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x8",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.DSB_UOPS",
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -88,7 +47,17 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.DSB_CYCLES",
         "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x10",
+        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_DSB_UOPS",
+        "PublicDescription": "This event counts the number of uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -99,7 +68,7 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MS_DSB_CYCLES",
         "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -111,7 +80,7 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MS_DSB_OCCUR",
         "CounterMask": "1",
-        "PublicDescription": "This event counts the number of deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while the Microcode Sequencer (MS) is busy. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of deliveries to Instruction Decode Queue (IDQ) initiated by Decode Stream Buffer (DSB) while the Microcode Sequencer (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -122,7 +91,7 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
         "CounterMask": "4",
-        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -133,7 +102,17 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS",
         "CounterMask": "1",
-        "PublicDescription": "This event counts the number of cycles  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may \"bypass\" the IDQ.",
+        "PublicDescription": "This event counts the number of cycles  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may bypass the IDQ.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x20",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_MITE_UOPS",
+        "PublicDescription": "This event counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -144,7 +123,7 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.ALL_MITE_CYCLES_4_UOPS",
         "CounterMask": "4",
-        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -155,7 +134,39 @@
         "Counter": "0,1,2,3",
         "EventName": "IDQ.ALL_MITE_CYCLES_ANY_UOPS",
         "CounterMask": "1",
-        "PublicDescription": "This event counts the number of cycles  uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of cycles  uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_UOPS",
+        "PublicDescription": "This event counts the total number of uops delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may bypass the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EdgeDetect": "1",
+        "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_SWITCHES",
+        "CounterMask": "1",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -165,7 +176,7 @@
         "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path",
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MITE_ALL_UOPS",
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may \"bypass\" the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may bypass the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -205,7 +216,7 @@
         "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled",
         "Counter": "0,1,2,3",
         "EventName": "IDQ_UOPS_NOT_DELIVERED.CORE",
-        "PublicDescription": "This event counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding ?4 ? x? when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when:\n a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread;\n b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions); \n c. Instruction Decode Queue (IDQ) delivers four uops.",
+        "PublicDescription": "This event counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding 4  x when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when:\n a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread;\n b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions); \n c. Instruction Decode Queue (IDQ) delivers four uops.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -268,18 +279,7 @@
         "BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.",
         "Counter": "0,1,2,3",
         "EventName": "DSB2MITE_SWITCHES.PENALTY_CYCLES",
-        "PublicDescription": "This event counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. \nMM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.\nPenalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 0?2 cycles.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_SWITCHES",
-        "CounterMask": "1",
+        "PublicDescription": "This event counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. \nMM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.\nPenalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 02 cycles.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/memory.json b/tools/perf/pmu-events/arch/x86/broadwellx/memory.json
index 1204ea8..d79a5cf 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellx/memory.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/memory.json
@@ -95,7 +95,6 @@
         "BriefDescription": "Counts the number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort.",
         "Counter": "0,1,2,3",
         "EventName": "TX_EXEC.MISC1",
-        "PublicDescription": "Unfriendly TSX abort triggered by  a flowmarker.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -171,11 +170,11 @@
     {
         "EventCode": "0xc8",
         "UMask": "0x4",
-        "BriefDescription": "Number of times HLE abort was triggered",
+        "BriefDescription": "Number of times HLE abort was triggered (PEBS)",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "HLE_RETIRED.ABORTED",
-        "PublicDescription": "Number of times HLE abort was triggered.",
+        "PublicDescription": "Number of times HLE abort was triggered (PEBS).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -252,11 +251,11 @@
     {
         "EventCode": "0xc9",
         "UMask": "0x4",
-        "BriefDescription": "Number of times RTM abort was triggered",
+        "BriefDescription": "Number of times RTM abort was triggered (PEBS)",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "RTM_RETIRED.ABORTED",
-        "PublicDescription": "Number of times RTM abort was triggered .",
+        "PublicDescription": "Number of times RTM abort was triggered (PEBS).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -439,6 +438,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_REQUESTS.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all requests that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -451,6 +451,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and clean or shared data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -463,6 +464,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and the modified data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -475,6 +477,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.REMOTE_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and the data is returned from remote dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -487,6 +490,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -499,6 +503,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -511,6 +516,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch code reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -523,6 +529,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -535,6 +542,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -547,6 +555,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -559,6 +568,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -571,6 +581,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the modified data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -583,6 +594,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.REMOTE_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from remote dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -595,6 +607,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -607,6 +620,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -619,6 +633,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -631,6 +646,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_RFO.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -643,6 +659,20 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the modified data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts all demand data writes (RFOs) that miss in the L3",
+        "MSRValue": "0x3fbfc00002",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_MISS.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     }
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/other.json b/tools/perf/pmu-events/arch/x86/broadwellx/other.json
index 718fcb1..4475249 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellx/other.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/other.json
@@ -10,16 +10,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x5C",
-        "UMask": "0x2",
-        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
-        "Counter": "0,1,2,3",
-        "EventName": "CPL_CYCLES.RING123",
-        "PublicDescription": "This event counts unhalted core cycles during which the thread is in rings 1, 2, or 3.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EdgeDetect": "1",
         "EventCode": "0x5C",
         "UMask": "0x1",
@@ -32,6 +22,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x5C",
+        "UMask": "0x2",
+        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
+        "Counter": "0,1,2,3",
+        "EventName": "CPL_CYCLES.RING123",
+        "PublicDescription": "This event counts unhalted core cycles during which the thread is in rings 1, 2, or 3.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x63",
         "UMask": "0x1",
         "BriefDescription": "Cycles when L1 and L2 are locked due to UC or split lock",
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/pipeline.json b/tools/perf/pmu-events/arch/x86/broadwellx/pipeline.json
index 02b4e10..920c89d 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellx/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/pipeline.json
@@ -3,31 +3,41 @@
         "EventCode": "0x00",
         "UMask": "0x1",
         "BriefDescription": "Instructions retired from execution.",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "EventName": "INST_RETIRED.ANY",
         "PublicDescription": "This event counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. \nNotes: INST_RETIRED.ANY is counted by a designated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. INST_RETIRED.ANY_P is counted by a programmable counter and it is an architectural performance event. \nCounting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.",
         "SampleAfterValue": "2000003",
+        "CounterHTOff": "Fixed counter 0"
+    },
+    {
+        "EventCode": "0x00",
+        "UMask": "0x2",
+        "BriefDescription": "Core cycles when the thread is not in halt state",
+        "Counter": "Fixed counter 1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD",
+        "PublicDescription": "This event counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
+        "SampleAfterValue": "2000003",
         "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "UMask": "0x2",
-        "BriefDescription": "Core cycles when the thread is not in halt state",
-        "Counter": "Fixed counter 2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD",
-        "PublicDescription": "This event counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "Counter": "Fixed counter 1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
+        "AnyThread": "1",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 2"
+        "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "UMask": "0x3",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "Counter": "Fixed counter 3",
+        "Counter": "Fixed counter 2",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "PublicDescription": "This event counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. \nNote: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'.  This event is clocked by base clock (100 Mhz) on Sandy Bridge. The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'.  After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case.",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
         "EventCode": "0x03",
@@ -60,22 +70,33 @@
     },
     {
         "EventCode": "0x0D",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the thread",
+        "UMask": "0x3",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread (e.g. misprediction or memory nuke)",
         "Counter": "0,1,2,3",
-        "EventName": "INT_MISC.RAT_STALL_CYCLES",
-        "PublicDescription": "This event counts the number of cycles during which Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the current thread. This also includes the cycles during which the Allocator is serving another thread.",
+        "EventName": "INT_MISC.RECOVERY_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "Cycles checkpoints in Resource Allocation Table (RAT) are recovering from JEClear or machine clear.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x0D",
         "UMask": "0x3",
-        "BriefDescription": "Number of cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
         "Counter": "0,1,2,3",
-        "EventName": "INT_MISC.RECOVERY_CYCLES",
+        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
+        "AnyThread": "1",
         "CounterMask": "1",
-        "PublicDescription": "Cycles checkpoints in Resource Allocation Table (RAT) are recovering from JEClear or machine clear.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x0D",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the thread",
+        "Counter": "0,1,2,3",
+        "EventName": "INT_MISC.RAT_STALL_CYCLES",
+        "PublicDescription": "This event counts the number of cycles during which Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for the current thread. This also includes the cycles during which the Allocator is serving another thread.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -90,6 +111,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "Invert": "1",
+        "EventCode": "0x0E",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_ISSUED.STALL_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0x0E",
         "UMask": "0x10",
         "BriefDescription": "Number of flags-merge uops being allocated. Such uops considered perf sensitive; added by GSR u-arch.",
@@ -118,18 +151,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "Invert": "1",
-        "EventCode": "0x0E",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_ISSUED.STALL_CYCLES",
-        "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
         "EventCode": "0x14",
         "UMask": "0x1",
         "BriefDescription": "Cycles when divider is busy executing divide operations",
@@ -141,6 +162,26 @@
     },
     {
         "EventCode": "0x3C",
+        "UMask": "0x0",
+        "BriefDescription": "Thread cycles when thread is not in halt state",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
+        "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x0",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
         "UMask": "0x1",
         "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
         "Counter": "0,1,2,3",
@@ -150,6 +191,36 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "PublicDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x3c",
         "UMask": "0x2",
         "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
@@ -159,6 +230,15 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0x3C",
+        "UMask": "0x2",
+        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x4c",
         "UMask": "0x1",
         "BriefDescription": "Not software-prefetch load dispatches that hit FB allocated for software prefetch",
@@ -225,6 +305,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EdgeDetect": "1",
+        "Invert": "1",
+        "EventCode": "0x5E",
+        "UMask": "0x1",
+        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
+        "Counter": "0,1,2,3",
+        "EventName": "RS_EVENTS.EMPTY_END",
+        "CounterMask": "1",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x87",
         "UMask": "0x1",
         "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
@@ -406,6 +498,15 @@
     },
     {
         "EventCode": "0x89",
+        "UMask": "0xa0",
+        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x89",
         "UMask": "0xc1",
         "BriefDescription": "Speculative and retired mispredicted macro conditional branches",
         "Counter": "0,1,2,3",
@@ -435,6 +536,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA0",
+        "UMask": "0x3",
+        "BriefDescription": "Micro-op dispatches cancelled due to insufficient SIMD physical register file read ports",
+        "Counter": "0,1,2,3",
+        "EventName": "UOP_DISPATCHES_CANCELLED.SIMD_PRF",
+        "PublicDescription": "This event counts the number of micro-operations cancelled after they were dispatched from the scheduler to the execution units when the total number of physical register read ports across all dispatch ports exceeds the read bandwidth of the physical register file.  The SIMD_PRF subevent applies to the following instructions: VDPPS, DPPS, VPCMPESTRI, PCMPESTRI, VPCMPESTRM, PCMPESTRM, VFMADD*, VFMADDSUB*, VFMSUB*, VMSUBADD*, VFNMADD*, VFNMSUB*.  See the Broadwell Optimization Guide for more information.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0xA1",
         "UMask": "0x1",
         "BriefDescription": "Cycles per thread when uops are executed in port 0",
@@ -446,6 +557,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 0.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles per thread when uops are executed in port 0",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_0",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x2",
         "BriefDescription": "Cycles per thread when uops are executed in port 1",
         "Counter": "0,1,2,3",
@@ -456,6 +587,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 1.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles per thread when uops are executed in port 1",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_1",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x4",
         "BriefDescription": "Cycles per thread when uops are executed in port 2",
         "Counter": "0,1,2,3",
@@ -466,6 +617,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x4",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x4",
+        "BriefDescription": "Cycles per thread when uops are executed in port 2",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_2",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x8",
         "BriefDescription": "Cycles per thread when uops are executed in port 3",
         "Counter": "0,1,2,3",
@@ -476,6 +647,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles per thread when uops are executed in port 3",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_3",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x10",
         "BriefDescription": "Cycles per thread when uops are executed in port 4",
         "Counter": "0,1,2,3",
@@ -486,6 +677,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x10",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 4.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x10",
+        "BriefDescription": "Cycles per thread when uops are executed in port 4",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_4",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x20",
         "BriefDescription": "Cycles per thread when uops are executed in port 5",
         "Counter": "0,1,2,3",
@@ -496,6 +707,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x20",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 5.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x20",
+        "BriefDescription": "Cycles per thread when uops are executed in port 5",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_5",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x40",
         "BriefDescription": "Cycles per thread when uops are executed in port 6",
         "Counter": "0,1,2,3",
@@ -506,6 +737,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x40",
+        "BriefDescription": "Cycles per core when uops are exectuted in port 6.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x40",
+        "BriefDescription": "Cycles per thread when uops are executed in port 6",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_6",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x80",
         "BriefDescription": "Cycles per thread when uops are executed in port 7",
         "Counter": "0,1,2,3",
@@ -515,6 +766,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "UMask": "0x80",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x80",
+        "BriefDescription": "Cycles per thread when uops are executed in port 7",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_7",
+        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xA2",
         "UMask": "0x1",
         "BriefDescription": "Resource-related stall cycles",
@@ -567,14 +838,13 @@
     },
     {
         "EventCode": "0xA3",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "Counter": "2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
-        "CounterMask": "8",
-        "PublicDescription": "Counts number of cycles the CPU has at least one pending  demand load request missing the L1 data cache.",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
+        "CounterMask": "1",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0xA3",
@@ -589,8 +859,18 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "CounterMask": "2",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0x4",
-        "BriefDescription": "Total execution stalls",
+        "BriefDescription": "This event increments by 1 for every cycle where there was no execute for this thread.",
         "Counter": "0,1,2,3",
         "EventName": "CYCLE_ACTIVITY.CYCLES_NO_EXECUTE",
         "CounterMask": "4",
@@ -600,6 +880,16 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x4",
+        "BriefDescription": "Total execution stalls.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
+        "CounterMask": "4",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0x5",
         "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
         "Counter": "0,1,2,3",
@@ -611,6 +901,16 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x5",
+        "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
+        "CounterMask": "5",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0x6",
         "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
         "Counter": "0,1,2,3",
@@ -622,6 +922,37 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x6",
+        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
+        "Counter": "0,1,2,3",
+        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
+        "CounterMask": "6",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "Counter": "2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "CounterMask": "8",
+        "PublicDescription": "Counts number of cycles the CPU has at least one pending  demand load request missing the L1 data cache.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0xA3",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "Counter": "2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
+        "CounterMask": "8",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0xc",
         "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
         "Counter": "2",
@@ -632,12 +963,41 @@
         "CounterHTOff": "2"
     },
     {
+        "EventCode": "0xA3",
+        "UMask": "0xc",
+        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
+        "Counter": "2",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
+        "CounterMask": "12",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
         "EventCode": "0xA8",
         "UMask": "0x1",
         "BriefDescription": "Number of Uops delivered by the LSD.",
         "Counter": "0,1,2,3",
         "EventName": "LSD.UOPS",
-        "PublicDescription": "Number of Uops delivered by the LSD. ",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA8",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+        "Counter": "0,1,2,3",
+        "EventName": "LSD.CYCLES_4_UOPS",
+        "CounterMask": "4",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA8",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
+        "Counter": "0,1,2,3",
+        "EventName": "LSD.CYCLES_ACTIVE",
+        "CounterMask": "1",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -652,16 +1012,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0xB1",
-        "UMask": "0x2",
-        "BriefDescription": "Number of uops executed on the core.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED.CORE",
-        "PublicDescription": "Number of uops executed from any thread.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "Invert": "1",
         "EventCode": "0xB1",
         "UMask": "0x1",
@@ -674,375 +1024,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xC0",
-        "UMask": "0x0",
-        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
-        "Counter": "0,1,2,3",
-        "EventName": "INST_RETIRED.ANY_P",
-        "Errata": "BDM61",
-        "PublicDescription": "This event counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC0",
-        "UMask": "0x2",
-        "BriefDescription": "FP operations  retired. X87 FP operations that have no exceptions:",
-        "Counter": "0,1,2,3",
-        "EventName": "INST_RETIRED.X87",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC0",
-        "UMask": "0x1",
-        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
-        "PEBS": "2",
-        "Counter": "1",
-        "EventName": "INST_RETIRED.PREC_DIST",
-        "Errata": "BDM11, BDM55",
-        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts instructions retired.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "1"
-    },
-    {
-        "EventCode": "0xC1",
-        "UMask": "0x40",
-        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
-        "Counter": "0,1,2,3",
-        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Actually retired uops.",
-        "Data_LA": "1",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.ALL",
-        "PublicDescription": "This event counts all actually retired uops. Counting increments by two for micro-fused uops, and by one for macro-fused and other uops. Maximal increment value for one cycle is eight.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "UMask": "0x2",
-        "BriefDescription": "Retirement slots used.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of retirement slots used.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "Invert": "1",
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.STALL_CYCLES",
-        "CounterMask": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts cycles without actually retired uops.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "Invert": "1",
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with less than 10 actually retired uops.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
-        "CounterMask": "10",
-        "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.CYCLES",
-        "PublicDescription": "This event counts both thread-specific (TS) and all-thread (AT) nukes.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x4",
-        "BriefDescription": "Self-modifying code (SMC) detected.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.SMC",
-        "PublicDescription": "This event counts self-modifying code (SMC) detected, which causes a machine clear.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x20",
-        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.MASKMOV",
-        "PublicDescription": "Maskmov false fault - counts number of time ucode passes through Maskmov flow due to instruction's mask being 0 while the flow was completed without raising a fault.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x1",
-        "BriefDescription": "Conditional branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.CONDITIONAL",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts conditional branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x2",
-        "BriefDescription": "Direct and indirect near call instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_CALL",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts both direct and indirect near call instructions retired.",
-        "SampleAfterValue": "100007",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x0",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
-        "PublicDescription": "This event counts all (macro) branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x8",
-        "BriefDescription": "Return instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts return instructions retired.",
-        "SampleAfterValue": "100007",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x10",
-        "BriefDescription": "Not taken branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts not taken branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x20",
-        "BriefDescription": "Taken branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts taken branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x40",
-        "BriefDescription": "Far branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "Errata": "BDW98",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts far branch instructions retired.",
-        "SampleAfterValue": "100007",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x4",
-        "BriefDescription": "All (macro) branch instructions retired. (Precise Event - PEBS)",
-        "PEBS": "2",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
-        "Errata": "BDW98",
-        "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x1",
-        "BriefDescription": "Mispredicted conditional branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted conditional branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x0",
-        "BriefDescription": "All mispredicted macro branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
-        "PublicDescription": "This event counts all mispredicted macro branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x8",
-        "BriefDescription": "This event counts the number of mispredicted ret instructions retired. Non PEBS",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.RET",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted return instructions retired.",
-        "SampleAfterValue": "100007",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x4",
-        "BriefDescription": "Mispredicted macro branch instructions retired. (Precise Event - PEBS)",
-        "PEBS": "2",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
-        "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xCC",
-        "UMask": "0x20",
-        "BriefDescription": "Count cases of saving new LBR",
-        "Counter": "0,1,2,3",
-        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
-        "PublicDescription": "This event counts cases of saving new LBR records by hardware. This assumes proper enabling of LBRs and takes into account LBR filtering done by the LBR_SELECT register.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x0",
-        "BriefDescription": "Thread cycles when thread is not in halt state",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
-        "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x89",
-        "UMask": "0xa0",
-        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 0.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x2",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 1.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x4",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x10",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 4.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x20",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 5.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x40",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 6.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x80",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x20",
-        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
-        "PublicDescription": "Number of near branch instructions retired that were mispredicted and taken.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xB1",
         "UMask": "0x1",
         "BriefDescription": "Cycles where at least 1 uop was executed per-thread.",
@@ -1083,255 +1064,12 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xe6",
-        "UMask": "0x1f",
-        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
-        "Counter": "0,1,2,3",
-        "EventName": "BACLEARS.ANY",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "Counter": "2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
-        "CounterMask": "8",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.",
-        "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
+        "EventCode": "0xB1",
         "UMask": "0x2",
-        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
+        "BriefDescription": "Number of uops executed on the core.",
         "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
-        "CounterMask": "2",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x4",
-        "BriefDescription": "Total execution stalls.",
-        "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
-        "CounterMask": "4",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0xc",
-        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
-        "Counter": "2",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
-        "CounterMask": "12",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x5",
-        "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
-        "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
-        "CounterMask": "5",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "UMask": "0x6",
-        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
-        "Counter": "0,1,2,3",
-        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
-        "CounterMask": "6",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "EventCode": "0xC3",
-        "UMask": "0x1",
-        "BriefDescription": "Number of machine clears (nukes) of any type.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.COUNT",
-        "CounterMask": "1",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
-        "Counter": "0,1,2,3",
-        "EventName": "LSD.CYCLES_4_UOPS",
-        "CounterMask": "4",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "Invert": "1",
-        "EventCode": "0x5E",
-        "UMask": "0x1",
-        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
-        "Counter": "0,1,2,3",
-        "EventName": "RS_EVENTS.EMPTY_END",
-        "CounterMask": "1",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
-        "Counter": "0,1,2,3",
-        "EventName": "LSD.CYCLES_ACTIVE",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles per thread when uops are executed in port 0",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_0",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x2",
-        "BriefDescription": "Cycles per thread when uops are executed in port 1",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_1",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x4",
-        "BriefDescription": "Cycles per thread when uops are executed in port 2",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_2",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles per thread when uops are executed in port 3",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_3",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x10",
-        "BriefDescription": "Cycles per thread when uops are executed in port 4",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_4",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x20",
-        "BriefDescription": "Cycles per thread when uops are executed in port 5",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_5",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x40",
-        "BriefDescription": "Cycles per thread when uops are executed in port 6",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_6",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x80",
-        "BriefDescription": "Cycles per thread when uops are executed in port 7",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_7",
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA0",
-        "UMask": "0x3",
-        "BriefDescription": "Micro-op dispatches cancelled due to insufficient SIMD physical register file read ports",
-        "Counter": "0,1,2,3",
-        "EventName": "UOP_DISPATCHES_CANCELLED.SIMD_PRF",
-        "PublicDescription": "This event counts the number of micro-operations cancelled after they were dispatched from the scheduler to the execution units when the total number of physical register read ports across all dispatch ports exceeds the read bandwidth of the physical register file.  The SIMD_PRF subevent applies to the following instructions: VDPPS, DPPS, VPCMPESTRI, PCMPESTRI, VPCMPESTRM, PCMPESTRM, VFMADD*, VFMADDSUB*, VFMSUB*, VMSUBADD*, VFNMADD*, VFNMSUB*.  See the Broadwell Optimization Guide for more information.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x00",
-        "UMask": "0x2",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "Counter": "Fixed counter 2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x0",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x0D",
-        "UMask": "0x3",
-        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
-        "Counter": "0,1,2,3",
-        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
-        "AnyThread": "1",
-        "CounterMask": "1",
+        "EventName": "UOPS_EXECUTED.CORE",
+        "PublicDescription": "Number of uops executed from any thread.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -1386,32 +1124,304 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
-        "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "EventCode": "0xC0",
+        "UMask": "0x0",
+        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
         "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
-        "PublicDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate).",
+        "EventName": "INST_RETIRED.ANY_P",
+        "Errata": "BDM61",
+        "PublicDescription": "This event counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC0",
         "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
-        "AnyThread": "1",
+        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
+        "PEBS": "2",
+        "Counter": "1",
+        "EventName": "INST_RETIRED.PREC_DIST",
+        "Errata": "BDM11, BDM55",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts instructions retired.",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
+        "CounterHTOff": "1"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC0",
         "UMask": "0x2",
-        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "BriefDescription": "FP operations  retired. X87 FP operations that have no exceptions:",
         "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "EventName": "INST_RETIRED.X87",
+        "PublicDescription": "This event counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC1",
+        "UMask": "0x40",
+        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
+        "Counter": "0,1,2,3",
+        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Actually retired uops. (Precise Event - PEBS)",
+        "Data_LA": "1",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.ALL",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts all actually retired uops. Counting increments by two for micro-fused uops, and by one for macro-fused and other uops. Maximal increment value for one cycle is eight.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "Invert": "1",
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.STALL_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles without actually retired uops.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Invert": "1",
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with less than 10 actually retired uops.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
+        "CounterMask": "10",
+        "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "UMask": "0x2",
+        "BriefDescription": "Retirement slots used. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts the number of retirement slots used.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.CYCLES",
+        "PublicDescription": "This event counts both thread-specific (TS) and all-thread (AT) nukes.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EdgeDetect": "1",
+        "EventCode": "0xC3",
+        "UMask": "0x1",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.COUNT",
+        "CounterMask": "1",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x4",
+        "BriefDescription": "Self-modifying code (SMC) detected.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.SMC",
+        "PublicDescription": "This event counts self-modifying code (SMC) detected, which causes a machine clear.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x20",
+        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.MASKMOV",
+        "PublicDescription": "Maskmov false fault - counts number of time ucode passes through Maskmov flow due to instruction's mask being 0 while the flow was completed without raising a fault.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x0",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "PublicDescription": "This event counts all (macro) branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x1",
+        "BriefDescription": "Conditional branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.CONDITIONAL",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts conditional branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x2",
+        "BriefDescription": "Direct and indirect near call instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts both direct and indirect near call instructions retired.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x2",
+        "BriefDescription": "Direct and indirect macro near call instructions retired (captured in ring 3). (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL_R3",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts both direct and indirect macro near call instructions retired (captured in ring 3).",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x4",
+        "BriefDescription": "All (macro) branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "2",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
+        "Errata": "BDW98",
+        "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x8",
+        "BriefDescription": "Return instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts return instructions retired.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x10",
+        "BriefDescription": "Not taken branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
+        "PublicDescription": "This event counts not taken branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x20",
+        "BriefDescription": "Taken branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts taken branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x40",
+        "BriefDescription": "Far branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
+        "Errata": "BDW98",
+        "PublicDescription": "This event counts far branch instructions retired.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x0",
+        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "PublicDescription": "This event counts all mispredicted macro branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x1",
+        "BriefDescription": "Mispredicted conditional branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts mispredicted conditional branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x4",
+        "BriefDescription": "Mispredicted macro branch instructions retired. (Precise Event - PEBS)",
+        "PEBS": "2",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
+        "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x8",
+        "BriefDescription": "This event counts the number of mispredicted ret instructions retired.(Precise Event)",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.RET",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts mispredicted return instructions retired.",
+        "SampleAfterValue": "100007",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x20",
+        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken. (Precise Event - PEBS).",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "PublicDescription": "Number of near branch instructions retired that were mispredicted and taken. (Precise Event - PEBS).",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCC",
+        "UMask": "0x20",
+        "BriefDescription": "Count cases of saving new LBR",
+        "Counter": "0,1,2,3",
+        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
+        "PublicDescription": "This event counts cases of saving new LBR records by hardware. This assumes proper enabling of LBRs and takes into account LBR filtering done by the LBR_SELECT register.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xe6",
+        "UMask": "0x1f",
+        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
+        "Counter": "0,1,2,3",
+        "EventName": "BACLEARS.ANY",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/broadwellx/virtual-memory.json b/tools/perf/pmu-events/arch/x86/broadwellx/virtual-memory.json
index 5ce8b67..7d79c70 100644
--- a/tools/perf/pmu-events/arch/x86/broadwellx/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/broadwellx/virtual-memory.json
@@ -45,6 +45,16 @@
     },
     {
         "EventCode": "0x08",
+        "UMask": "0xe",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
+        "Errata": "BDM69",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x08",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -73,6 +83,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x08",
+        "UMask": "0x60",
+        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x49",
         "UMask": "0x1",
         "BriefDescription": "Store misses in all DTLB levels that cause page walks",
@@ -118,6 +137,16 @@
     },
     {
         "EventCode": "0x49",
+        "UMask": "0xe",
+        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
+        "Errata": "BDM69",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x49",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -146,6 +175,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x49",
+        "UMask": "0x60",
+        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x4F",
         "UMask": "0x10",
         "BriefDescription": "Cycle count for an Extended Page table walk.",
@@ -201,6 +239,16 @@
     },
     {
         "EventCode": "0x85",
+        "UMask": "0xe",
+        "BriefDescription": "Misses in all ITLB levels that cause completed page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "ITLB_MISSES.WALK_COMPLETED",
+        "Errata": "BDM69",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x85",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -229,6 +277,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x85",
+        "UMask": "0x60",
+        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks.",
+        "Counter": "0,1,2,3",
+        "EventName": "ITLB_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xAE",
         "UMask": "0x1",
         "BriefDescription": "Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.",
@@ -250,16 +307,6 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x21",
-        "BriefDescription": "Number of ITLB page walker hits in the L1+FB.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
-        "Errata": "BDM69, BDM98",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
         "UMask": "0x12",
         "BriefDescription": "Number of DTLB page walker hits in the L2.",
         "Counter": "0,1,2,3",
@@ -270,16 +317,6 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x22",
-        "BriefDescription": "Number of ITLB page walker hits in the L2.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
-        "Errata": "BDM69, BDM98",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
         "UMask": "0x14",
         "BriefDescription": "Number of DTLB page walker hits in the L3 + XSNP.",
         "Counter": "0,1,2,3",
@@ -290,20 +327,40 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x24",
-        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP.",
+        "UMask": "0x18",
+        "BriefDescription": "Number of DTLB page walker hits in Memory.",
         "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
+        "EventName": "PAGE_WALKER_LOADS.DTLB_MEMORY",
         "Errata": "BDM69, BDM98",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x18",
-        "BriefDescription": "Number of DTLB page walker hits in Memory.",
+        "UMask": "0x21",
+        "BriefDescription": "Number of ITLB page walker hits in the L1+FB.",
         "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.DTLB_MEMORY",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
+        "Errata": "BDM69, BDM98",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x22",
+        "BriefDescription": "Number of ITLB page walker hits in the L2.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
+        "Errata": "BDM69, BDM98",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x24",
+        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
         "Errata": "BDM69, BDM98",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
@@ -327,62 +384,5 @@
         "PublicDescription": "This event counts the number of any STLB flush attempts (such as entire, VPID, PCID, InvPage, CR3 write, and so on).",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "UMask": "0xe",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
-        "Errata": "BDM69",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "UMask": "0x60",
-        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "UMask": "0xe",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
-        "Errata": "BDM69",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "UMask": "0x60",
-        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "UMask": "0xe",
-        "BriefDescription": "Misses in all ITLB levels that cause completed page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "ITLB_MISSES.WALK_COMPLETED",
-        "Errata": "BDM69",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "UMask": "0x60",
-        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks.",
-        "Counter": "0,1,2,3",
-        "EventName": "ITLB_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/goldmont/cache.json b/tools/perf/pmu-events/arch/x86/goldmont/cache.json
index 4e02e1e..f8bbe08 100644
--- a/tools/perf/pmu-events/arch/x86/goldmont/cache.json
+++ b/tools/perf/pmu-events/arch/x86/goldmont/cache.json
@@ -1,23 +1,13 @@
 [
     {
         "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts the number of demand and prefetch transactions that the L2 XQ rejects due to a full or near full condition which likely indicates back pressure from the intra-die interconnect (IDI) fabric. The XQ may reject transactions from the L2Q (non-cacheable requests), L2 misses and L2 write-back victims.",
-        "EventCode": "0x30",
+        "PublicDescription": "Counts memory requests originating from the core that miss in the L2 cache.",
+        "EventCode": "0x2E",
         "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "L2_REJECT_XQ.ALL",
+        "UMask": "0x41",
+        "EventName": "LONGEST_LAT_CACHE.MISS",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Requests rejected by the XQ"
-    },
-    {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts the number of demand and L1 prefetcher requests rejected by the L2Q due to a full or nearly full condition which likely indicates back pressure from L2Q. It also counts requests that would have gone directly to the XQ, but are rejected due to a full or nearly full condition, indicating back pressure from the IDI link. The L2Q may also reject transactions from a core to insure fairness between cores, or to delay a core's dirty eviction when the address conflicts with incoming external snoops.",
-        "EventCode": "0x31",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "CORE_REJECT_L2Q.ALL",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Requests rejected by the L2Q "
+        "BriefDescription": "L2 cache request misses"
     },
     {
         "CollectPEBSRecord": "1",
@@ -31,61 +21,57 @@
     },
     {
         "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts memory requests originating from the core that miss in the L2 cache.",
-        "EventCode": "0x2E",
+        "PublicDescription": "Counts the number of demand and prefetch transactions that the L2 XQ rejects due to a full or near full condition which likely indicates back pressure from the intra-die interconnect (IDI) fabric. The XQ may reject transactions from the L2Q (non-cacheable requests), L2 misses and L2 write-back victims.",
+        "EventCode": "0x30",
         "Counter": "0,1,2,3",
-        "UMask": "0x41",
-        "EventName": "LONGEST_LAT_CACHE.MISS",
+        "UMask": "0x0",
+        "EventName": "L2_REJECT_XQ.ALL",
         "SampleAfterValue": "200003",
-        "BriefDescription": "L2 cache request misses"
+        "BriefDescription": "Requests rejected by the XQ"
     },
     {
         "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts cycles that an ICache miss is outstanding, and instruction fetch is stalled.  That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes, while an Icache miss outstanding.  Note this event is not the same as cycles to retrieve an instruction due to an Icache miss.  Rather, it is the part of the Instruction Cache (ICache) miss time where no bytes are available for the decoder.",
+        "PublicDescription": "Counts the number of demand and L1 prefetcher requests rejected by the L2Q due to a full or nearly full condition which likely indicates back pressure from L2Q. It also counts requests that would have gone directly to the XQ, but are rejected due to a full or nearly full condition, indicating back pressure from the IDI link. The L2Q may also reject transactions from a core to ensure fairness between cores, or to delay a core's dirty eviction when the address conflicts with incoming external snoops.",
+        "EventCode": "0x31",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "CORE_REJECT_L2Q.ALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Requests rejected by the L2Q"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts when a modified (dirty) cache line is evicted from the data L1 cache and needs to be written back to memory.  No count will occur if the evicted line is clean, and hence does not require a writeback.",
+        "EventCode": "0x51",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "DL1.DIRTY_EVICTION",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L1 Cache evictions for dirty data"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts cycles that fetch is stalled due to an outstanding ICache miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ICache miss.  Note: this event is not the same as the total number of cycles spent retrieving instruction cache lines from the memory hierarchy.",
         "EventCode": "0x86",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "EventName": "FETCH_STALL.ICACHE_FILL_PENDING_CYCLES",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Cycles where code-fetch is stalled and an ICache miss is outstanding.  This is not the same as an ICache Miss."
+        "BriefDescription": "Cycles code-fetch stalled due to an outstanding ICache miss."
     },
     {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts the number of load uops retired.",
-        "EventCode": "0xD0",
+        "CollectPEBSRecord": "1",
+        "EventCode": "0xB7",
         "Counter": "0,1,2,3",
-        "UMask": "0x81",
-        "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Load uops retired (Precise event capable)"
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)"
     },
     {
         "PEBS": "2",
         "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts the number of store uops retired.",
-        "EventCode": "0xD0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x82",
-        "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Store uops retired (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts the number of memory uops retired that is either a loads or a store or both.",
-        "EventCode": "0xD0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x83",
-        "EventName": "MEM_UOPS_RETIRED.ALL",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Memory uops retired (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts locked memory uops retired.  This includes \"regular\" locks and bus locks. (To specifically count bus locks only, see the Offcore response event.)  A locked access is one with a lock prefix, or an exchange to memory.  See the SDM for a complete description of which memory load accesses are locks.",
+        "PublicDescription": "Counts locked memory uops retired.  This includes regular locks and bus locks. (To specifically count bus locks only, see the Offcore response event.)  A locked access is one with a lock prefix, or an exchange to memory.  See the SDM for a complete description of which memory load accesses are locks.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x21",
@@ -129,6 +115,39 @@
     {
         "PEBS": "2",
         "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of load uops retired.",
+        "EventCode": "0xD0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x81",
+        "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Load uops retired (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of store uops retired.",
+        "EventCode": "0xD0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x82",
+        "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Store uops retired (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of memory uops retired that is either a loads or a store or both.",
+        "EventCode": "0xD0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x83",
+        "EventName": "MEM_UOPS_RETIRED.ALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Memory uops retired (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
         "PublicDescription": "Counts load uops retired that hit the L1 data cache.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
@@ -140,17 +159,6 @@
     {
         "PEBS": "2",
         "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts load uops retired that miss the L1 data cache.",
-        "EventCode": "0xD1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Load uops retired that missed L1 data cache (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
         "PublicDescription": "Counts load uops retired that hit in the L2 cache.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
@@ -162,6 +170,17 @@
     {
         "PEBS": "2",
         "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts load uops retired that miss the L1 data cache.",
+        "EventCode": "0xD1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Load uops retired that missed L1 data cache (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
         "PublicDescription": "Counts load uops retired that miss in the L2 cache.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
@@ -205,24 +224,20 @@
     },
     {
         "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts when a modified (dirty) cache line is evicted from the data L1 cache and needs to be written back to memory.  No count will occur if the evicted line is clean, and hence does not require a writeback.",
-        "EventCode": "0x51",
+        "PublicDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x40000032b7 ",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
-        "EventName": "DL1.DIRTY_EVICTION",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L1 Cache evictions for dirty data"
+        "EventName": "OFFCORE_RESPONSE.ANY_READ.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
     },
     {
         "CollectPEBSRecord": "1",
-        "EventCode": "0xB7",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)"
-    },
-    {
+        "PublicDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x36000032b7 ",
         "Counter": "0,1,2,3",
@@ -234,6 +249,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x10000032b7 ",
         "Counter": "0,1,2,3",
@@ -245,6 +262,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x04000032b7 ",
         "Counter": "0,1,2,3",
@@ -256,6 +275,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x02000032b7 ",
         "Counter": "0,1,2,3",
@@ -267,6 +288,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x00000432b7 ",
         "Counter": "0,1,2,3",
@@ -278,6 +301,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x00000132b7 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_READ.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000022 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_RFO.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000022 ",
         "Counter": "0,1,2,3",
@@ -289,6 +340,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000000022 ",
         "Counter": "0,1,2,3",
@@ -300,6 +353,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400000022 ",
         "Counter": "0,1,2,3",
@@ -311,6 +366,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200000022 ",
         "Counter": "0,1,2,3",
@@ -322,6 +379,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000040022 ",
         "Counter": "0,1,2,3",
@@ -333,6 +392,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010022 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_RFO.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000003091",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data reads (demand & prefetch) that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads (demand & prefetch) that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600003091",
         "Counter": "0,1,2,3",
@@ -344,6 +431,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000003091",
         "Counter": "0,1,2,3",
@@ -355,6 +444,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads (demand & prefetch) that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400003091",
         "Counter": "0,1,2,3",
@@ -366,6 +457,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads (demand & prefetch) that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200003091",
         "Counter": "0,1,2,3",
@@ -377,6 +470,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads (demand & prefetch) that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000043091",
         "Counter": "0,1,2,3",
@@ -388,6 +483,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads (demand & prefetch) that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000013091",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data reads (demand & prefetch) that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads generated by L1 or L2 prefetchers that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000003010 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_PF_DATA_RD.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data reads generated by L1 or L2 prefetchers that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads generated by L1 or L2 prefetchers that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600003010 ",
         "Counter": "0,1,2,3",
@@ -399,6 +522,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads generated by L1 or L2 prefetchers that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000003010 ",
         "Counter": "0,1,2,3",
@@ -410,6 +535,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads generated by L1 or L2 prefetchers that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400003010 ",
         "Counter": "0,1,2,3",
@@ -421,6 +548,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads generated by L1 or L2 prefetchers that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200003010 ",
         "Counter": "0,1,2,3",
@@ -432,6 +561,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads generated by L1 or L2 prefetchers that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000043010 ",
         "Counter": "0,1,2,3",
@@ -443,6 +574,47 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads generated by L1 or L2 prefetchers that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000013010 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_PF_DATA_RD.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data reads generated by L1 or L2 prefetchers that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts requests to the uncore subsystem that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000008000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_REQUEST.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts requests to the uncore subsystem that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts requests to the uncore subsystem that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x3600008000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_REQUEST.L2_MISS.ANY",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts requests to the uncore subsystem that miss the L2 cache.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts requests to the uncore subsystem that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000008000 ",
         "Counter": "0,1,2,3",
@@ -454,6 +626,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts requests to the uncore subsystem that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400008000 ",
         "Counter": "0,1,2,3",
@@ -465,6 +639,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts requests to the uncore subsystem that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200008000 ",
         "Counter": "0,1,2,3",
@@ -476,6 +652,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts requests to the uncore subsystem that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000048000 ",
         "Counter": "0,1,2,3",
@@ -487,6 +665,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts requests to the uncore subsystem that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000018000 ",
         "Counter": "0,1,2,3",
@@ -498,6 +678,21 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000004800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600004800 ",
         "Counter": "0,1,2,3",
@@ -509,6 +704,47 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x1000004800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L2_MISS.HITM_OTHER_CORE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0400004800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L2_MISS.HIT_OTHER_CORE_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0200004800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that true miss for the L2 cache with a snoop miss in the other processor module. ",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000044800 ",
         "Counter": "0,1,2,3",
@@ -520,6 +756,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000014800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000004000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_STREAMING_STORES.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600004000 ",
         "Counter": "0,1,2,3",
@@ -531,6 +795,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000004000 ",
         "Counter": "0,1,2,3",
@@ -542,6 +808,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400004000 ",
         "Counter": "0,1,2,3",
@@ -553,6 +821,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200004000 ",
         "Counter": "0,1,2,3",
@@ -564,6 +834,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000044000 ",
         "Counter": "0,1,2,3",
@@ -575,6 +847,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000014000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_STREAMING_STORES.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000002000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600002000 ",
         "Counter": "0,1,2,3",
@@ -586,6 +886,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000002000 ",
         "Counter": "0,1,2,3",
@@ -597,6 +899,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400002000 ",
         "Counter": "0,1,2,3",
@@ -608,6 +912,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200002000 ",
         "Counter": "0,1,2,3",
@@ -619,6 +925,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000042000 ",
         "Counter": "0,1,2,3",
@@ -630,6 +938,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000012000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache lines requests by software prefetch instructions that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000001000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.SW_PREFETCH.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cache lines requests by software prefetch instructions that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache lines requests by software prefetch instructions that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600001000 ",
         "Counter": "0,1,2,3",
@@ -641,6 +977,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache lines requests by software prefetch instructions that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000001000 ",
         "Counter": "0,1,2,3",
@@ -652,6 +990,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache lines requests by software prefetch instructions that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400001000 ",
         "Counter": "0,1,2,3",
@@ -663,6 +1003,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache lines requests by software prefetch instructions that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200001000 ",
         "Counter": "0,1,2,3",
@@ -674,6 +1016,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache lines requests by software prefetch instructions that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000041000 ",
         "Counter": "0,1,2,3",
@@ -685,6 +1029,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache lines requests by software prefetch instructions that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000011000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.SW_PREFETCH.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cache lines requests by software prefetch instructions that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.FULL_STREAMING_STORES.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000800 ",
         "Counter": "0,1,2,3",
@@ -696,6 +1068,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000000800 ",
         "Counter": "0,1,2,3",
@@ -707,6 +1081,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400000800 ",
         "Counter": "0,1,2,3",
@@ -718,6 +1094,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200000800 ",
         "Counter": "0,1,2,3",
@@ -729,6 +1107,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000040800 ",
         "Counter": "0,1,2,3",
@@ -740,6 +1120,99 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.FULL_STREAMING_STORES.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts bus lock and split lock requests that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000400 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.BUS_LOCKS.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts bus lock and split lock requests that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts bus lock and split lock requests that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x3600000400 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.BUS_LOCKS.L2_MISS.ANY",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts bus lock and split lock requests that miss the L2 cache.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts bus lock and split lock requests that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x1000000400 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.BUS_LOCKS.L2_MISS.HITM_OTHER_CORE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts bus lock and split lock requests that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts bus lock and split lock requests that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0400000400 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.BUS_LOCKS.L2_MISS.HIT_OTHER_CORE_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts bus lock and split lock requests that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts bus lock and split lock requests that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0200000400 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.BUS_LOCKS.L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts bus lock and split lock requests that true miss for the L2 cache with a snoop miss in the other processor module. ",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts bus lock and split lock requests that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000040400 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.BUS_LOCKS.L2_HIT",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts bus lock and split lock requests that hit the L2 cache.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts bus lock and split lock requests that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000010400 ",
         "Counter": "0,1,2,3",
@@ -751,6 +1224,112 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts code reads in uncacheable (UC) memory region that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000200 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.UC_CODE_RD.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts code reads in uncacheable (UC) memory region that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts code reads in uncacheable (UC) memory region that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x3600000200 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.UC_CODE_RD.L2_MISS.ANY",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts code reads in uncacheable (UC) memory region that miss the L2 cache.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts code reads in uncacheable (UC) memory region that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x1000000200 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.UC_CODE_RD.L2_MISS.HITM_OTHER_CORE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts code reads in uncacheable (UC) memory region that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts code reads in uncacheable (UC) memory region that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0400000200 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.UC_CODE_RD.L2_MISS.HIT_OTHER_CORE_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts code reads in uncacheable (UC) memory region that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts code reads in uncacheable (UC) memory region that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0200000200 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.UC_CODE_RD.L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts code reads in uncacheable (UC) memory region that true miss for the L2 cache with a snoop miss in the other processor module. ",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts code reads in uncacheable (UC) memory region that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000040200 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.UC_CODE_RD.L2_HIT",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts code reads in uncacheable (UC) memory region that hit the L2 cache.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts code reads in uncacheable (UC) memory region that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010200 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.UC_CODE_RD.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts code reads in uncacheable (UC) memory region that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000100 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_WRITES.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000100 ",
         "Counter": "0,1,2,3",
@@ -762,6 +1341,86 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x1000000100 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_WRITES.L2_MISS.HITM_OTHER_CORE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0400000100 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_WRITES.L2_MISS.HIT_OTHER_CORE_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0200000100 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_WRITES.L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that true miss for the L2 cache with a snoop miss in the other processor module. ",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000040100 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_WRITES.L2_HIT",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that hit the L2 cache.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010100 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_WRITES.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000080 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000080 ",
         "Counter": "0,1,2,3",
@@ -773,6 +1432,86 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x1000000080 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.L2_MISS.HITM_OTHER_CORE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0400000080 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.L2_MISS.HIT_OTHER_CORE_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0200000080 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.L2_MISS.SNOOP_MISS_OR_NO_SNOOP_NEEDED",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that true miss for the L2 cache with a snoop miss in the other processor module. ",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000040080 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.L2_HIT",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that hit the L2 cache.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010080 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000020 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000020 ",
         "Counter": "0,1,2,3",
@@ -784,6 +1523,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000000020 ",
         "Counter": "0,1,2,3",
@@ -795,6 +1536,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400000020 ",
         "Counter": "0,1,2,3",
@@ -806,6 +1549,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200000020 ",
         "Counter": "0,1,2,3",
@@ -817,6 +1562,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000040020 ",
         "Counter": "0,1,2,3",
@@ -828,6 +1575,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010020 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000010 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000010 ",
         "Counter": "0,1,2,3",
@@ -839,6 +1614,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000000010 ",
         "Counter": "0,1,2,3",
@@ -850,6 +1627,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400000010 ",
         "Counter": "0,1,2,3",
@@ -861,6 +1640,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200000010 ",
         "Counter": "0,1,2,3",
@@ -872,6 +1653,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000040010 ",
         "Counter": "0,1,2,3",
@@ -883,6 +1666,34 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010010 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x4000000008 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.COREWB.OUTSTANDING",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that are outstanding, per cycle, from the time of the L2 miss to when any response is received.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000008 ",
         "Counter": "0,1,2,3",
@@ -894,6 +1705,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000000008 ",
         "Counter": "0,1,2,3",
@@ -905,6 +1718,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400000008 ",
         "Counter": "0,1,2,3",
@@ -916,6 +1731,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200000008 ",
         "Counter": "0,1,2,3",
@@ -927,6 +1744,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000040008 ",
         "Counter": "0,1,2,3",
@@ -938,6 +1757,21 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010008 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.COREWB.ANY_RESPONSE",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x4000000004 ",
         "Counter": "0,1,2,3",
@@ -949,6 +1783,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000004 ",
         "Counter": "0,1,2,3",
@@ -960,6 +1796,21 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x1000000004 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L2_MISS.HITM_OTHER_CORE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400000004 ",
         "Counter": "0,1,2,3",
@@ -971,6 +1822,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200000004 ",
         "Counter": "0,1,2,3",
@@ -982,6 +1835,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000040004 ",
         "Counter": "0,1,2,3",
@@ -993,6 +1848,21 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010004 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x4000000002 ",
         "Counter": "0,1,2,3",
@@ -1004,6 +1874,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000002 ",
         "Counter": "0,1,2,3",
@@ -1015,6 +1887,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000000002 ",
         "Counter": "0,1,2,3",
@@ -1026,6 +1900,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400000002 ",
         "Counter": "0,1,2,3",
@@ -1037,6 +1913,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200000002 ",
         "Counter": "0,1,2,3",
@@ -1048,6 +1926,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000040002 ",
         "Counter": "0,1,2,3",
@@ -1059,6 +1939,21 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010002 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand cacheable data reads of full cache lines that are outstanding, per cycle, from the time of the L2 miss to when any response is received. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x4000000001 ",
         "Counter": "0,1,2,3",
@@ -1070,6 +1965,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand cacheable data reads of full cache lines that miss the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x3600000001 ",
         "Counter": "0,1,2,3",
@@ -1081,6 +1978,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand cacheable data reads of full cache lines that miss the L2 cache with a snoop hit in the other processor module, data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x1000000001 ",
         "Counter": "0,1,2,3",
@@ -1092,6 +1991,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand cacheable data reads of full cache lines that miss the L2 cache with a snoop hit in the other processor module, no data forwarding is required. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0400000001 ",
         "Counter": "0,1,2,3",
@@ -1103,6 +2004,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand cacheable data reads of full cache lines that true miss for the L2 cache with a snoop miss in the other processor module.  Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0200000001 ",
         "Counter": "0,1,2,3",
@@ -1114,6 +2017,8 @@
         "Offcore": "1"
     },
     {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand cacheable data reads of full cache lines that hit the L2 cache. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
         "EventCode": "0xB7",
         "MSRValue": "0x0000040001 ",
         "Counter": "0,1,2,3",
@@ -1123,5 +2028,18 @@
         "SampleAfterValue": "100007",
         "BriefDescription": "Counts demand cacheable data reads of full cache lines that hit the L2 cache.",
         "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand cacheable data reads of full cache lines that have any transaction responses from the uncore subsystem. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x0000010001 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand cacheable data reads of full cache lines that have any transaction responses from the uncore subsystem.",
+        "Offcore": "1"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/goldmont/memory.json b/tools/perf/pmu-events/arch/x86/goldmont/memory.json
index ac8b0d3..690cebd 100644
--- a/tools/perf/pmu-events/arch/x86/goldmont/memory.json
+++ b/tools/perf/pmu-events/arch/x86/goldmont/memory.json
@@ -1,15 +1,5 @@
 [
     {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts machine clears due to memory ordering issues.  This occurs when a snoop request happens and the machine is uncertain if memory ordering will be preserved - as another core is in the process of modifying the data.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "MACHINE_CLEARS.MEMORY_ORDERING",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Machine clears due to memory ordering issue"
-    },
-    {
         "PEBS": "2",
         "CollectPEBSRecord": "2",
         "PublicDescription": "Counts when a memory load of a uop spans a page boundary (a split) is retired.",
@@ -30,5 +20,275 @@
         "EventName": "MISALIGN_MEM_REF.STORE_PAGE_SPLIT",
         "SampleAfterValue": "200003",
         "BriefDescription": "Store uops that split a page (Precise event capable)"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts machine clears due to memory ordering issues.  This occurs when a snoop request happens and the machine is uncertain if memory ordering will be preserved as another core is in the process of modifying the data.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "MACHINE_CLEARS.MEMORY_ORDERING",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Machine clears due to memory ordering issue"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x20000032b7 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_READ.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data read, code read, and read for ownership (RFO) requests (demand & prefetch) that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000022 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_RFO.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts reads for ownership (RFO) requests (demand & prefetch) that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads (demand & prefetch) that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000003091",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_DATA_RD.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data reads (demand & prefetch) that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data reads generated by L1 or L2 prefetchers that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000003010 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_PF_DATA_RD.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data reads generated by L1 or L2 prefetchers that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts requests to the uncore subsystem that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000008000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.ANY_REQUEST.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts requests to the uncore subsystem that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000004800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts any data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000004000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_STREAMING_STORES.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts partial cache line data writes to uncacheable write combining (USWC) memory region  that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000002000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L1_DATA_RD.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cache line reads generated by hardware L1 data cache prefetcher that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cache lines requests by software prefetch instructions that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000001000 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.SW_PREFETCH.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cache lines requests by software prefetch instructions that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000800 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.FULL_STREAMING_STORES.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts full cache line data writes to uncacheable write combining (USWC) memory region and full cache-line non-temporal writes that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts bus lock and split lock requests that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000400 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.BUS_LOCKS.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts bus lock and split lock requests that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts code reads in uncacheable (UC) memory region that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000200 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.UC_CODE_RD.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts code reads in uncacheable (UC) memory region that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000100 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_WRITES.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of demand write requests (RFO) generated by a write to partial data cache line, including the writes to uncacheable (UC) and write through (WT), and write protected (WP) types of memory that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000080 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PARTIAL_READS.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand data partial reads, including data in uncacheable (UC) or uncacheable write combining (USWC) memory types that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000020 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts reads for ownership (RFO) requests generated by L2 prefetcher that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000010 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts data cacheline reads generated by hardware L2 cache prefetcher that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000008 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.COREWB.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of writeback transactions caused by L1 or L2 cache evictions that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000004 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand instruction cacheline and I-side prefetch requests that miss the instruction cache that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000002 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand reads for ownership (RFO) requests generated by a write to full data cache line that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts demand cacheable data reads of full cache lines that miss the L2 cache and targets non-DRAM system address. Requires MSR_OFFCORE_RESP[0,1] to specify request type and response. (duplicated for both MSRs)",
+        "EventCode": "0xB7",
+        "MSRValue": "0x2000000001 ",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L2_MISS.NON_DRAM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts demand cacheable data reads of full cache lines that miss the L2 cache and targets non-DRAM system address.",
+        "Offcore": "1"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/goldmont/other.json b/tools/perf/pmu-events/arch/x86/goldmont/other.json
index df25ca9..959cadd 100644
--- a/tools/perf/pmu-events/arch/x86/goldmont/other.json
+++ b/tools/perf/pmu-events/arch/x86/goldmont/other.json
@@ -1,6 +1,36 @@
 [
     {
         "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts cycles that fetch is stalled due to any reason. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes.  This will include cycles due to an ITLB miss, ICache miss and other events.",
+        "EventCode": "0x86",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "FETCH_STALL.ALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Cycles code-fetch stalled due to any reason."
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts cycles that fetch is stalled due to an outstanding ITLB miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ITLB miss.  Note: this event is not the same as page walk cycles to retrieve an instruction translation.",
+        "EventCode": "0x86",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "FETCH_STALL.ITLB_FILL_PENDING_CYCLES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Cycles code-fetch stalled due to an outstanding ITLB miss."
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of issue slots per core cycle that were not consumed by the backend due to either a full resource  in the backend (RESOURCE_FULL) or due to the processor recovering from some event (RECOVERY).",
+        "EventCode": "0xCA",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "ISSUE_SLOTS_NOT_CONSUMED.ANY",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Unfilled issue slots per cycle"
+    },
+    {
+        "CollectPEBSRecord": "1",
         "PublicDescription": "Counts the number of issue slots per core cycle that were not consumed because of a full resource in the backend.  Including but not limited to resources such as the Re-order Buffer (ROB), reservation stations (RS), load/store buffers, physical registers, or any other needed machine resource that is currently unavailable.   Note that uops must be available for consumption in order for this event to fire.  If a uop is not available (Instruction Queue is empty), this event will not count.",
         "EventCode": "0xCA",
         "Counter": "0,1,2,3",
@@ -20,24 +50,24 @@
         "BriefDescription": "Unfilled issue slots per cycle to recover"
     },
     {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts the number of issue slots per core cycle that were not consumed by the backend due to either a full resource  in the backend (RESOURCE_FULL) or due to the processor recovering from some event (RECOVERY).",
-        "EventCode": "0xCA",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "ISSUE_SLOTS_NOT_CONSUMED.ANY",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Unfilled issue slots per cycle"
-    },
-    {
         "CollectPEBSRecord": "2",
         "PublicDescription": "Counts hardware interrupts received by the processor.",
         "EventCode": "0xCB",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "HW_INTERRUPTS.RECEIVED",
+        "SampleAfterValue": "203",
+        "BriefDescription": "Hardware interrupts received"
+    },
+    {
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts the number of core cycles during which interrupts are masked (disabled). Increments by 1 each core cycle that EFLAGS.IF is 0, regardless of whether interrupts are pending or not.",
+        "EventCode": "0xCB",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "HW_INTERRUPTS.MASKED",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Hardware interrupts received (Precise event capable)"
+        "BriefDescription": "Cycles hardware interrupts are masked"
     },
     {
         "CollectPEBSRecord": "2",
@@ -47,6 +77,6 @@
         "UMask": "0x4",
         "EventName": "HW_INTERRUPTS.PENDING_AND_MASKED",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Cycles pending interrupts are masked (Precise event capable)"
+        "BriefDescription": "Cycles pending interrupts are masked"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/goldmont/pipeline.json b/tools/perf/pmu-events/arch/x86/goldmont/pipeline.json
index 07f0004..254788a 100644
--- a/tools/perf/pmu-events/arch/x86/goldmont/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/goldmont/pipeline.json
@@ -1,168 +1,136 @@
 [
     {
+        "PublicDescription": "Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers.  This event uses fixed counter 0.  You cannot collect a PEBs record for this event.",
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 0",
+        "UMask": "0x1",
+        "EventName": "INST_RETIRED.ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Instructions retired (Fixed event)"
+    },
+    {
+        "PublicDescription": "Counts the number of core cycles while the core is not in a halt state.  The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time.  This event uses fixed counter 1.  You cannot collect a PEBs record for this event.",
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 1",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_UNHALTED.CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when core is not halted  (Fixed event)"
+    },
+    {
+        "PublicDescription": "Counts the number of reference cycles that the core is not in a halt state. The core enters the halt state when it is running the HLT instruction.  In mobile systems the core frequency may change from time.  This event is not affected by core frequency changes but counts as if the core is running at the maximum frequency all the time.  This event uses fixed counter 2.  You cannot collect a PEBs record for this event.",
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 2",
+        "UMask": "0x3",
+        "EventName": "CPU_CLK_UNHALTED.REF_TSC",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when core is not halted  (Fixed event)"
+    },
+    {
         "PEBS": "2",
         "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts branch instructions retired for all branch types.  This is an architectural performance event.",
-        "EventCode": "0xC4",
+        "PublicDescription": "Counts a load blocked from using a store forward, but did not occur because the store data was not available at the right time.  The forward might occur subsequently when the data is available.",
+        "EventCode": "0x03",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LD_BLOCKS.DATA_UNKNOWN",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Loads blocked due to store data not ready (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts a load blocked from using a store forward because of an address/size mismatch, only one of the loads blocked from each store will be counted.",
+        "EventCode": "0x03",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "LD_BLOCKS.STORE_FORWARD",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Loads blocked due to store forward restriction (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts loads that block because their address modulo 4K matches a pending store.",
+        "EventCode": "0x03",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "LD_BLOCKS.4K_ALIAS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Loads blocked because address has 4k partial address false dependence (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts loads blocked because they are unable to find their physical address in the micro TLB (UTLB).",
+        "EventCode": "0x03",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "LD_BLOCKS.UTLB_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Loads blocked because address in not in the UTLB (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts anytime a load that retires is blocked for any reason.",
+        "EventCode": "0x03",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "LD_BLOCKS.ALL_BLOCK",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Loads blocked (Precise event capable)"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts uops issued by the front end and allocated into the back end of the machine.  This event counts uops that retire as well as uops that were speculatively executed but didn't retire. The sort of speculative uops that might be counted includes, but is not limited to those uops issued in the shadow of a miss-predicted branch, those uops that are inserted during an assist (such as for a denormal floating point result), and (previously allocated) uops that might be canceled during a machine clear.",
+        "EventCode": "0x0E",
         "Counter": "0,1,2,3",
         "UMask": "0x0",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "EventName": "UOPS_ISSUED.ANY",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Retired branch instructions (Precise event capable)"
+        "BriefDescription": "Uops issued to the back end per cycle"
     },
     {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts retired Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired, including both when the branch was taken and when it was not taken.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x7e",
-        "EventName": "BR_INST_RETIRED.JCC",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired conditional branch instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired that were taken and does not count when the Jcc branch instruction were not taken.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0xfe",
-        "EventName": "BR_INST_RETIRED.TAKEN_JCC",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired conditional branch instructions that were taken (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts near CALL branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0xf9",
-        "EventName": "BR_INST_RETIRED.CALL",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired near call instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts near relative CALL branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0xfd",
-        "EventName": "BR_INST_RETIRED.REL_CALL",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired near relative call instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts near indirect CALL branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0xfb",
-        "EventName": "BR_INST_RETIRED.IND_CALL",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired near indirect call instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts near return branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0xf7",
-        "EventName": "BR_INST_RETIRED.RETURN",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired near return instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts near indirect call or near indirect jmp branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0xeb",
-        "EventName": "BR_INST_RETIRED.NON_RETURN_IND",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired instructions of near indirect Jmp or call (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts far branch instructions retired.  This includes far jump, far call and return, and Interrupt call and return.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0xbf",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired far branch instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts mispredicted branch instructions retired including all branch types.",
-        "EventCode": "0xC5",
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Core cycles when core is not halted.  This event uses a (_P)rogrammable general purpose performance counter.",
+        "EventCode": "0x3C",
         "Counter": "0,1,2,3",
         "UMask": "0x0",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "EventName": "CPU_CLK_UNHALTED.CORE_P",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when core is not halted"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Reference cycles when core is not halted.  This event uses a programmable general purpose performance counter.",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CPU_CLK_UNHALTED.REF",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when core is not halted"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "This event used to measure front-end inefficiencies. I.e. when front-end of the machine is not delivering uops to the back-end and the back-end has is not stalled. This event can be used to identify if the machine is truly front-end bound.  When this event occurs, it is an indication that the front-end of the machine is operating at less than its theoretical peak performance. Background: We can think of the processor pipeline as being divided into 2 broader parts: Front-end and Back-end. Front-end is responsible for fetching the instruction, decoding into uops in machine understandable format and putting them into a uop queue to be consumed by back end. The back-end then takes these uops, allocates the required resources.  When all resources are ready, uops are executed. If the back-end is not ready to accept uops from the front-end, then we do not want to count these as front-end bottlenecks.  However, whenever we have bottlenecks in the back-end, we will have allocation unit stalls and eventually forcing the front-end to wait until the back-end is ready to receive more uops. This event counts only when back-end is requesting more uops and front-end is not able to provide them. When 3 uops are requested and no uops are delivered, the event counts 3. When 3 are requested, and only 1 is delivered, the event counts 2. When only 2 are delivered, the event counts 1. Alternatively stated, the event will not count if 3 uops are delivered, or if the back end is stalled and not requesting any uops at all.  Counts indicate missed opportunities for the front-end to deliver a uop to the back end. Some examples of conditions that cause front-end efficiencies are: ICache misses, ITLB misses, and decoder restrictions that limit the front-end bandwidth. Known Issues: Some uops require multiple allocation slots.  These uops will not be charged as a front end 'not delivered' opportunity, and will be regarded as a back end problem. For example, the INC instruction has one uop that requires 2 issue slots.  A stream of INC instructions will not count as UOPS_NOT_DELIVERED, even though only one instruction can be issued per clock.  The low uop issue rate for a stream of INC instructions is considered to be a back end issue.",
+        "EventCode": "0x9C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "UOPS_NOT_DELIVERED.ANY",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Retired mispredicted branch instructions (Precise event capable)"
+        "BriefDescription": "Uops requested but not-delivered to the back-end per cycle"
     },
     {
         "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts mispredicted retired Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired, including both when the branch was supposed to be taken and when it was not supposed to be taken (but the processor predicted the opposite condition).",
-        "EventCode": "0xC5",
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The event continues counting during hardware interrupts, traps, and inside interrupt handlers.  This is an architectural performance event.  This event uses a (_P)rogrammable general purpose performance counter. *This event is Precise Event capable:  The EventingRIP field in the PEBS record is precise to the address of the instruction which caused the event.  Note: Because PEBS records can be collected only on IA32_PMC0, only one event can use the PEBS facility at a time.",
+        "EventCode": "0xC0",
         "Counter": "0,1,2,3",
-        "UMask": "0x7e",
-        "EventName": "BR_MISP_RETIRED.JCC",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired mispredicted conditional branch instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts mispredicted retired Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired that were supposed to be taken but the processor predicted that it would not be taken.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0xfe",
-        "EventName": "BR_MISP_RETIRED.TAKEN_JCC",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired mispredicted conditional branch instructions that were taken (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts mispredicted near indirect CALL branch instructions retired, where the target address taken was not what the processor predicted.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0xfb",
-        "EventName": "BR_MISP_RETIRED.IND_CALL",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired mispredicted near indirect call instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts mispredicted near RET branch instructions retired, where the return address taken was not what the processor predicted.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0xf7",
-        "EventName": "BR_MISP_RETIRED.RETURN",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired mispredicted near return instructions (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts mispredicted branch instructions retired that were near indirect call or near indirect jmp, where the target address taken was not what the processor predicted.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0xeb",
-        "EventName": "BR_MISP_RETIRED.NON_RETURN_IND",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired mispredicted instructions of near indirect Jmp or near indirect call. (Precise event capable)"
+        "UMask": "0x0",
+        "EventName": "INST_RETIRED.ANY_P",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Instructions retired (Precise event capable)"
     },
     {
         "PEBS": "2",
@@ -187,8 +155,40 @@
         "BriefDescription": "MS uops retired (Precise event capable)"
     },
     {
+        "PEBS": "2",
         "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts the number of times that the processor detects that a program is writing to a code section and has to perform a machine clear because of that modification.  Self-modifying code (SMC) causes a severe penalty in all Intel? architecture processors.",
+        "PublicDescription": "Counts the number of floating point divide uops retired.",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "UOPS_RETIRED.FPDIV",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Floating point divide uops retired. (Precise Event Capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of integer divide uops retired.",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "UOPS_RETIRED.IDIV",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Integer divide uops retired. (Precise Event Capable)"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts machine clears for any reason.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "MACHINE_CLEARS.ALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "All machine clears"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts the number of times that the processor detects that a program is writing to a code section and has to perform a machine clear because of that modification.  Self-modifying code (SMC) causes a severe penalty in all Intel architecture processors.",
         "EventCode": "0xC3",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -217,45 +217,180 @@
         "BriefDescription": "Machine clears due to memory disambiguation"
     },
     {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts machine clears for any reason.",
-        "EventCode": "0xC3",
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts branch instructions retired for all branch types.  This is an architectural performance event.",
+        "EventCode": "0xC4",
         "Counter": "0,1,2,3",
         "UMask": "0x0",
-        "EventName": "MACHINE_CLEARS.ALL",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
         "SampleAfterValue": "200003",
-        "BriefDescription": "All machine clears"
+        "BriefDescription": "Retired branch instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts retired Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired, including both when the branch was taken and when it was not taken.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x7e",
+        "EventName": "BR_INST_RETIRED.JCC",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired conditional branch instructions (Precise event capable)"
     },
     {
         "PEBS": "2",
         "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The event continues counting during hardware interrupts, traps, and inside interrupt handlers.  This is an architectural performance event.  This event uses a (_P)rogrammable general purpose performance counter. *This event is Precise Event capable:  The EventingRIP field in the PEBS record is precise to the address of the instruction which caused the event.  Note: Because PEBS records can be collected only on IA32_PMC0, only one event can use the PEBS facility at a time.",
-        "EventCode": "0xC0",
+        "PublicDescription": "Counts the number of taken branch instructions retired.",
+        "EventCode": "0xC4",
         "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "INST_RETIRED.ANY_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Instructions retired (Precise event capable)"
+        "UMask": "0x80",
+        "EventName": "BR_INST_RETIRED.ALL_TAKEN_BRANCHES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired taken branch instructions (Precise event capable)"
     },
     {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "This event used to measure front-end inefficiencies. I.e. when front-end of the machine is not delivering uops to the back-end and the back-end has is not stalled. This event can be used to identify if the machine is truly front-end bound.  When this event occurs, it is an indication that the front-end of the machine is operating at less than its theoretical peak performance. Background: We can think of the processor pipeline as being divided into 2 broader parts: Front-end and Back-end. Front-end is responsible for fetching the instruction, decoding into uops in machine understandable format and putting them into a uop queue to be consumed by back end. The back-end then takes these uops, allocates the required resources.  When all resources are ready, uops are executed. If the back-end is not ready to accept uops from the front-end, then we do not want to count these as front-end bottlenecks.  However, whenever we have bottlenecks in the back-end, we will have allocation unit stalls and eventually forcing the front-end to wait until the back-end is ready to receive more uops. This event counts only when back-end is requesting more uops and front-end is not able to provide them. When 3 uops are requested and no uops are delivered, the event counts 3. When 3 are requested, and only 1 is delivered, the event counts 2. When only 2 are delivered, the event counts 1. Alternatively stated, the event will not count if 3 uops are delivered, or if the back end is stalled and not requesting any uops at all.  Counts indicate missed opportunities for the front-end to deliver a uop to the back end. Some examples of conditions that cause front-end efficiencies are: ICache misses, ITLB misses, and decoder restrictions that limit the front-end bandwidth. Known Issues: Some uops require multiple allocation slots.  These uops will not be charged as a front end 'not delivered' opportunity, and will be regarded as a back end problem. For example, the INC instruction has one uop that requires 2 issue slots.  A stream of INC instructions will not count as UOPS_NOT_DELIVERED, even though only one instruction can be issued per clock.  The low uop issue rate for a stream of INC instructions is considered to be a back end issue.",
-        "EventCode": "0x9C",
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts far branch instructions retired.  This includes far jump, far call and return, and Interrupt call and return.",
+        "EventCode": "0xC4",
         "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "UOPS_NOT_DELIVERED.ANY",
+        "UMask": "0xbf",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Uops requested but not-delivered to the back-end per cycle"
+        "BriefDescription": "Retired far branch instructions (Precise event capable)"
     },
     {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts uops issued by the front end and allocated into the back end of the machine.  This event counts uops that retire as well as uops that were speculatively executed but didn't retire. The sort of speculative uops that might be counted includes, but is not limited to those uops issued in the shadow of a miss-predicted branch, those uops that are inserted during an assist (such as for a denormal floating point result), and (previously allocated) uops that might be canceled during a machine clear.",
-        "EventCode": "0x0E",
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts near indirect call or near indirect jmp branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0xeb",
+        "EventName": "BR_INST_RETIRED.NON_RETURN_IND",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired instructions of near indirect Jmp or call (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts near return branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0xf7",
+        "EventName": "BR_INST_RETIRED.RETURN",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired near return instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts near CALL branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0xf9",
+        "EventName": "BR_INST_RETIRED.CALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired near call instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts near indirect CALL branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0xfb",
+        "EventName": "BR_INST_RETIRED.IND_CALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired near indirect call instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts near relative CALL branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0xfd",
+        "EventName": "BR_INST_RETIRED.REL_CALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired near relative call instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired that were taken and does not count when the Jcc branch instruction were not taken.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0xfe",
+        "EventName": "BR_INST_RETIRED.TAKEN_JCC",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired conditional branch instructions that were taken (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts mispredicted branch instructions retired including all branch types.",
+        "EventCode": "0xC5",
         "Counter": "0,1,2,3",
         "UMask": "0x0",
-        "EventName": "UOPS_ISSUED.ANY",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Uops issued to the back end per cycle"
+        "BriefDescription": "Retired mispredicted branch instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts mispredicted retired Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired, including both when the branch was supposed to be taken and when it was not supposed to be taken (but the processor predicted the opposite condition).",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x7e",
+        "EventName": "BR_MISP_RETIRED.JCC",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired mispredicted conditional branch instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts mispredicted branch instructions retired that were near indirect call or near indirect jmp, where the target address taken was not what the processor predicted.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0xeb",
+        "EventName": "BR_MISP_RETIRED.NON_RETURN_IND",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired mispredicted instructions of near indirect Jmp or near indirect call. (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts mispredicted near RET branch instructions retired, where the return address taken was not what the processor predicted.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0xf7",
+        "EventName": "BR_MISP_RETIRED.RETURN",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired mispredicted near return instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts mispredicted near indirect CALL branch instructions retired, where the target address taken was not what the processor predicted.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0xfb",
+        "EventName": "BR_MISP_RETIRED.IND_CALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired mispredicted near indirect call instructions (Precise event capable)"
+    },
+    {
+        "PEBS": "2",
+        "CollectPEBSRecord": "2",
+        "PublicDescription": "Counts mispredicted retired Jcc (Jump on Conditional Code/Jump if Condition is Met) branch instructions retired that were supposed to be taken but the processor predicted that it would not be taken.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0xfe",
+        "EventName": "BR_MISP_RETIRED.TAKEN_JCC",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Retired mispredicted conditional branch instructions that were taken (Precise event capable)"
     },
     {
         "CollectPEBSRecord": "1",
@@ -288,53 +423,6 @@
         "BriefDescription": "Cycles the FP divide unit is busy"
     },
     {
-        "PublicDescription": "Counts the number of instructions that retire execution. For instructions that consist of multiple uops, this event counts the retirement of the last uop of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers.  This event uses fixed counter 0.  You cannot collect a PEBs record for this event.",
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 1",
-        "UMask": "0x1",
-        "EventName": "INST_RETIRED.ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Instructions retired (Fixed event)"
-    },
-    {
-        "PublicDescription": "Counts the number of core cycles while the core is not in a halt state.  The core enters the halt state when it is running the HLT instruction. In mobile systems the core frequency may change from time to time. For this reason this event may have a changing ratio with regards to time.  This event uses fixed counter 1.  You cannot collect a PEBs record for this event.",
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
-        "UMask": "0x2",
-        "EventName": "CPU_CLK_UNHALTED.CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when core is not halted  (Fixed event)"
-    },
-    {
-        "PublicDescription": "Counts the number of reference cycles that the core is not in a halt state. The core enters the halt state when it is running the HLT instruction.  In mobile systems the core frequency may change from time.  This event is not affected by core frequency changes but counts as if the core is running at the maximum frequency all the time.  This event uses fixed counter 2.  You cannot collect a PEBs record for this event.",
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 3",
-        "UMask": "0x3",
-        "EventName": "CPU_CLK_UNHALTED.REF_TSC",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when core is not halted  (Fixed event)"
-    },
-    {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Core cycles when core is not halted.  This event uses a (_P)rogrammable general purpose performance counter.",
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "CPU_CLK_UNHALTED.CORE_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when core is not halted"
-    },
-    {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Reference cycles when core is not halted.  This event uses a (_P)rogrammable general purpose performance counter.",
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CPU_CLK_UNHALTED.REF",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when core is not halted"
-    },
-    {
         "CollectPEBSRecord": "1",
         "PublicDescription": "Counts the number of times a BACLEAR is signaled for any reason, including, but not limited to indirect branch/call,  Jcc (Jump on Conditional Code/Jump if Condition is Met) branch, unconditional branch/call, and returns.",
         "EventCode": "0xE6",
@@ -363,71 +451,5 @@
         "EventName": "BACLEARS.COND",
         "SampleAfterValue": "200003",
         "BriefDescription": "BACLEARs asserted for conditional branch"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts anytime a load that retires is blocked for any reason.",
-        "EventCode": "0x03",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "LD_BLOCKS.ALL_BLOCK",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Loads blocked (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts loads blocked because they are unable to find their physical address in the micro TLB (UTLB).",
-        "EventCode": "0x03",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "LD_BLOCKS.UTLB_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Loads blocked because address in not in the UTLB (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts a load blocked from using a store forward because of an address/size mismatch, only one of the loads blocked from each store will be counted.",
-        "EventCode": "0x03",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "LD_BLOCKS.STORE_FORWARD",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Loads blocked due to store forward restriction (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts a load blocked from using a store forward, but did not occur because the store data was not available at the right time.  The forward might occur subsequently when the data is available.",
-        "EventCode": "0x03",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LD_BLOCKS.DATA_UNKNOWN",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Loads blocked due to store data not ready (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "2",
-        "PublicDescription": "Counts loads that block because their address modulo 4K matches a pending store.",
-        "EventCode": "0x03",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "LD_BLOCKS.4K_ALIAS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Loads blocked because address has 4k partial address false dependence (Precise event capable)"
-    },
-    {
-        "PEBS": "2",
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts the number of taken branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "BR_INST_RETIRED.ALL_TAKEN_BRANCHES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Retired taken branch instructions (Precise event capable)"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/goldmont/virtual-memory.json b/tools/perf/pmu-events/arch/x86/goldmont/virtual-memory.json
index 3202c44..9805198 100644
--- a/tools/perf/pmu-events/arch/x86/goldmont/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/goldmont/virtual-memory.json
@@ -1,6 +1,36 @@
 [
     {
         "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts every core cycle when a Data-side (walks due to a data operation) page walk is in progress.",
+        "EventCode": "0x05",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "PAGE_WALKS.D_SIDE_CYCLES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Duration of D-side page-walks in cycles"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts every core cycle when a Instruction-side (walks due to an instruction fetch) page walk is in progress.",
+        "EventCode": "0x05",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "PAGE_WALKS.I_SIDE_CYCLES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Duration of I-side pagewalks in cycles"
+    },
+    {
+        "CollectPEBSRecord": "1",
+        "PublicDescription": "Counts every core cycle a page-walk is in progress due to either a data memory operation or an instruction fetch.",
+        "EventCode": "0x05",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "EventName": "PAGE_WALKS.CYCLES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Duration of page-walks in cycles"
+    },
+    {
+        "CollectPEBSRecord": "1",
         "PublicDescription": "Counts the number of times the machine was unable to find a translation in the Instruction Translation Lookaside Buffer (ITLB) for a linear address of an instruction fetch.  It counts when new translation are filled into the ITLB.  The event is speculative in nature, but will not count translations (page walks) that are begun and not finished, or translations that are finished but not filled into the ITLB.",
         "EventCode": "0x81",
         "Counter": "0,1,2,3",
@@ -41,35 +71,5 @@
         "EventName": "MEM_UOPS_RETIRED.DTLB_MISS",
         "SampleAfterValue": "200003",
         "BriefDescription": "Memory uops retired that missed the DTLB (Precise event capable)"
-    },
-    {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts every core cycle when a Data-side (walks due to a data operation) page walk is in progress.",
-        "EventCode": "0x05",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "PAGE_WALKS.D_SIDE_CYCLES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Duration of D-side page-walks in cycles"
-    },
-    {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts every core cycle when a Instruction-side (walks due to an instruction fetch) page walk is in progress.",
-        "EventCode": "0x05",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "PAGE_WALKS.I_SIDE_CYCLES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Duration of I-side pagewalks in cycles"
-    },
-    {
-        "CollectPEBSRecord": "1",
-        "PublicDescription": "Counts every core cycle a page-walk is in progress due to either a data memory operation or an instruction fetch.",
-        "EventCode": "0x05",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "EventName": "PAGE_WALKS.CYCLES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Duration of page-walks in cycles"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/haswell/cache.json b/tools/perf/pmu-events/arch/x86/haswell/cache.json
index bfb5ebf..da4d6dd 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/cache.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/cache.json
@@ -11,6 +11,58 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts the number of store RFO requests that miss the L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x22",
+        "EventName": "L2_RQSTS.RFO_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "RFO requests that miss L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of instruction fetches that missed the L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x24",
+        "EventName": "L2_RQSTS.CODE_RD_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 cache misses when fetching instructions",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Demand requests that miss L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x27",
+        "Errata": "HSD78",
+        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand requests that miss L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts all L2 HW prefetcher requests that missed L2.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "L2_RQSTS.L2_PF_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 prefetch requests that miss L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "All requests that missed L2.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3f",
+        "Errata": "HSD78",
+        "EventName": "L2_RQSTS.MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "All requests that miss L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Demand data read requests that hit L2 cache.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -22,13 +74,23 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Counts all L2 HW prefetcher requests that missed L2.",
+        "PublicDescription": "Counts the number of store RFO requests that hit the L2 cache.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "L2_RQSTS.L2_PF_MISS",
+        "UMask": "0x42",
+        "EventName": "L2_RQSTS.RFO_HIT",
         "SampleAfterValue": "200003",
-        "BriefDescription": "L2 prefetch requests that miss L2 cache",
+        "BriefDescription": "RFO requests that hit L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of instruction fetches that hit the L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x44",
+        "EventName": "L2_RQSTS.CODE_RD_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -73,6 +135,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Demand requests to L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe7",
+        "Errata": "HSD78",
+        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand requests to L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts all L2 HW prefetcher requests.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -83,6 +156,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "All requests to L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xff",
+        "Errata": "HSD78",
+        "EventName": "L2_RQSTS.REFERENCES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "All L2 requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Not rejected writebacks that hit L2 cache.",
         "EventCode": "0x27",
         "Counter": "0,1,2,3",
@@ -124,6 +208,27 @@
     },
     {
         "EventCode": "0x48",
+        "Counter": "2",
+        "UMask": "0x1",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with L1D load Misses outstanding.",
+        "CounterMask": "1",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0x48",
+        "Counter": "2",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
+        "CounterMask": "1",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0x48",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "EventName": "L1D_PEND_MISS.REQUEST_FB_FULL",
@@ -133,13 +238,13 @@
     },
     {
         "EventCode": "0x48",
-        "Counter": "2",
-        "UMask": "0x1",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "L1D_PEND_MISS.FB_FULL",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with L1D load Misses outstanding.",
+        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
         "CounterMask": "1",
-        "CounterHTOff": "2"
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "PublicDescription": "This event counts when new data lines are brought into the L1 Data cache, which cause other lines to be evicted from the cache.",
@@ -163,6 +268,28 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "Errata": "HSD78, HSD62, HSD61",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "Errata": "HSD78, HSD62, HSD61",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Offcore outstanding Demand code Read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
@@ -185,6 +312,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "Errata": "HSD62, HSD61",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Offcore outstanding cacheable data read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
@@ -198,17 +336,6 @@
     {
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "Errata": "HSD78, HSD62, HSD61",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
         "UMask": "0x8",
         "Errata": "HSD62, HSD61",
         "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD",
@@ -218,17 +345,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "Errata": "HSD62, HSD61",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles in which the L1D is locked.",
         "EventCode": "0x63",
         "Counter": "0,1,2,3",
@@ -289,6 +405,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xB7, 0xBB",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PEBS": "1",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
@@ -296,7 +421,7 @@
         "Errata": "HSD29, HSM30",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops that miss the STLB.",
+        "BriefDescription": "Retired load uops that miss the STLB. (precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
@@ -308,7 +433,7 @@
         "Errata": "HSD29, HSM30",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store uops that miss the STLB.",
+        "BriefDescription": "Retired store uops that miss the STLB. (precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
@@ -321,31 +446,33 @@
         "Errata": "HSD76, HSD29, HSM30",
         "EventName": "MEM_UOPS_RETIRED.LOCK_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops with locked access.",
+        "BriefDescription": "Retired load uops with locked access. (precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "This event counts load uops retired which had memory addresses spilt across 2 cache lines. A line split is across 64B cache-lines which may include a page split (4K). This is a precise event.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x41",
         "Errata": "HSD29, HSM30",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired load uops that split across a cacheline boundary. (precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "This event counts store uops retired which had memory addresses spilt across 2 cache lines. A line split is across 64B cache-lines which may include a page split (4K). This is a precise event.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x42",
         "Errata": "HSD29, HSM30",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
@@ -358,19 +485,20 @@
         "Errata": "HSD29, HSM30",
         "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired load uops.",
+        "BriefDescription": "All retired load uops. (precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "This event counts all store uops retired. This is a precise event.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x82",
         "Errata": "HSD29, HSM30",
         "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired store uops.",
+        "BriefDescription": "All retired store uops. (precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
@@ -401,20 +529,20 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops with L3 cache hits as data sources.",
+        "PublicDescription": "This event counts retired load uops in which data sources were data hits in the L3 cache without snoops required. This does not include hardware prefetches. This is a precise event.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "Errata": "HSD74, HSD29, HSD25, HSM26, HSM30",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L3_HIT",
         "SampleAfterValue": "50021",
-        "BriefDescription": "Retired load uops which data sources were data hits in L3 without snoops required.",
+        "BriefDescription": "Miss in last-level (L3) cache. Excludes Unknown data-source.",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops missed L1 cache as data sources.",
+        "PublicDescription": "This event counts retired load uops in which data sources missed in the L1 cache. This does not include hardware prefetches. This is a precise event.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -427,20 +555,18 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops missed L2. Unknown data source excluded.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
         "Errata": "HSD29, HSM30",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_MISS",
         "SampleAfterValue": "50021",
-        "BriefDescription": "Miss in mid-level (L2) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Retired load uops with L2 cache misses as data sources.",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops missed L3. Excludes unknown data source .",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x20",
@@ -477,25 +603,27 @@
     },
     {
         "PEBS": "1",
+        "PublicDescription": "This event counts retired load uops that hit in the L3 cache, but required a cross-core snoop which resulted in a HIT in an on-pkg core cache. This does not include hardware prefetches. This is a precise event.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "Errata": "HSD29, HSD25, HSM26, HSM30",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache.",
+        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache. ",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "This event counts retired load uops that hit in the L3 cache, but required a cross-core snoop which resulted in a HITM (hit modified) in an on-pkg core cache. This does not include hardware prefetches. This is a precise event.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "Errata": "HSD29, HSD25, HSM26, HSM30",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3.",
+        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3. ",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
@@ -513,14 +641,13 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches.",
+        "PublicDescription": "This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches. This is a precise event.",
         "EventCode": "0xD3",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "Errata": "HSD74, HSD29, HSD25, HSM30",
         "EventName": "MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Data from local DRAM either Snoop not needed or Snoop Miss (RspI)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
@@ -665,6 +792,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "",
         "EventCode": "0xf4",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
@@ -674,131 +802,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Counts the number of store RFO requests that hit the L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x42",
-        "EventName": "L2_RQSTS.RFO_HIT",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests that hit L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts the number of store RFO requests that miss the L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x22",
-        "EventName": "L2_RQSTS.RFO_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests that miss L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of instruction fetches that hit the L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x44",
-        "EventName": "L2_RQSTS.CODE_RD_HIT",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of instruction fetches that missed the L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x24",
-        "EventName": "L2_RQSTS.CODE_RD_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 cache misses when fetching instructions",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Demand requests that miss L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x27",
-        "Errata": "HSD78",
-        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand requests that miss L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Demand requests to L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe7",
-        "Errata": "HSD78",
-        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand requests to L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "All requests that missed L2.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3f",
-        "Errata": "HSD78",
-        "EventName": "L2_RQSTS.MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "All requests that miss L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "All requests to L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xff",
-        "Errata": "HSD78",
-        "EventName": "L2_RQSTS.REFERENCES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "All L2 requests",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "Errata": "HSD78, HSD62, HSD61",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x48",
-        "Counter": "2",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
-        "CounterMask": "1",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0x48",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "L1D_PEND_MISS.FB_FULL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
+        "PublicDescription": "Counts all requests that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c8fff",
         "Counter": "0,1,2,3",
@@ -811,6 +815,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c07f7",
         "Counter": "0,1,2,3",
@@ -823,6 +828,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c07f7",
         "Counter": "0,1,2,3",
@@ -835,6 +841,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0244",
         "Counter": "0,1,2,3",
@@ -847,6 +854,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0122",
         "Counter": "0,1,2,3",
@@ -859,6 +867,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0122",
         "Counter": "0,1,2,3",
@@ -871,6 +880,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0091",
         "Counter": "0,1,2,3",
@@ -883,6 +893,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0091",
         "Counter": "0,1,2,3",
@@ -895,6 +906,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0200",
         "Counter": "0,1,2,3",
@@ -907,6 +919,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs  that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0100",
         "Counter": "0,1,2,3",
@@ -919,6 +932,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0080",
         "Counter": "0,1,2,3",
@@ -931,6 +945,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0040",
         "Counter": "0,1,2,3",
@@ -943,6 +958,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0020",
         "Counter": "0,1,2,3",
@@ -955,6 +971,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0010",
         "Counter": "0,1,2,3",
@@ -967,6 +984,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0004",
         "Counter": "0,1,2,3",
@@ -979,6 +997,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0004",
         "Counter": "0,1,2,3",
@@ -991,6 +1010,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0002",
         "Counter": "0,1,2,3",
@@ -1003,6 +1023,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0002",
         "Counter": "0,1,2,3",
@@ -1015,6 +1036,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10003c0001",
         "Counter": "0,1,2,3",
@@ -1027,6 +1049,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04003c0001",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/haswell/floating-point.json b/tools/perf/pmu-events/arch/x86/haswell/floating-point.json
index 1732fa4..f9843e5 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/floating-point.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/floating-point.json
@@ -20,6 +20,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Note that a whole rep string only counts AVX_INST.ALL once.",
+        "EventCode": "0xC6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x7",
+        "EventName": "AVX_INSTS.ALL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Approximate counts of AVX & AVX2 256-bit instructions, including non-arithmetic instructions, loads, and stores.  May count non-AVX instructions that employ 256-bit operations, including (but not necessarily limited to) rep string instructions that use 256-bit loads and stores for optimized performance, XSAVE* and XRSTOR*, and operations that transition the x87 FPU data registers between x87 and MMX.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Number of X87 FP assists due to output values.",
         "EventCode": "0xCA",
         "Counter": "0,1,2,3",
@@ -69,15 +79,5 @@
         "BriefDescription": "Cycles with any input/output SSE or FP assist",
         "CounterMask": "1",
         "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Note that a whole rep string only counts AVX_INST.ALL once.",
-        "EventCode": "0xC6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x7",
-        "EventName": "AVX_INSTS.ALL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Approximate counts of AVX & AVX2 256-bit instructions, including non-arithmetic instructions, loads, and stores.  May count non-AVX instructions that employ 256-bit operations, including (but not necessarily limited to) rep string instructions that use 256-bit loads and stores for optimized performance, XSAVE* and XRSTOR*, and operations that transition the x87 FPU data registers between x87 and MMX.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/haswell/frontend.json b/tools/perf/pmu-events/arch/x86/haswell/frontend.json
index 57a1ce4..c0a5bed 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/frontend.json
@@ -21,57 +21,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "IDQ.DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "IDQ.MS_DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "IDQ.MS_MITE_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts uops delivered by the Front-end with the assistance of the microcode sequencer.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles during which the microcode sequencer assisted the Front-end in delivering uops.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -82,6 +31,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "IDQ.DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -92,6 +51,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "IDQ.MS_DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
@@ -135,6 +104,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "IDQ.MS_MITE_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts cycles MITE is delivered four uops. Set Cmask = 4.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -157,6 +136,38 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "This event counts uops delivered by the Front-end with the assistance of the microcode sequencer.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event counts cycles during which the microcode sequencer assisted the Front-end in delivering uops.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EdgeDetect": "1",
+        "EventName": "IDQ.MS_SWITCHES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Number of uops delivered to IDQ from any path.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -195,6 +206,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x80",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "ICACHE.IFDATA_STALL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction-cache miss.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event count the number of undelivered (unallocated) uops from the Front-end to the Resource Allocation Table (RAT) while the Back-end of the processor is not stalled. The Front-end can allocate up to 4 uops per cycle so this event can increment 0-4 times per cycle depending on the number of unallocated uops. This event is counted on a per-core basis.",
         "EventCode": "0x9C",
         "Counter": "0,1,2,3",
@@ -270,25 +290,5 @@
         "SampleAfterValue": "2000003",
         "BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EdgeDetect": "1",
-        "EventName": "IDQ.MS_SWITCHES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x80",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "ICACHE.IFDATA_STALL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction-cache miss.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/haswell/memory.json b/tools/perf/pmu-events/arch/x86/haswell/memory.json
index aab981b..e5f9fa6 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/memory.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/memory.json
@@ -401,6 +401,7 @@
         "CounterHTOff": "3"
     },
     {
+        "PublicDescription": "Counts all requests that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc08fff",
         "Counter": "0,1,2,3",
@@ -413,6 +414,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01004007f7",
         "Counter": "0,1,2,3",
@@ -425,6 +427,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc007f7",
         "Counter": "0,1,2,3",
@@ -437,6 +440,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch code reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100400244",
         "Counter": "0,1,2,3",
@@ -449,6 +453,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00244",
         "Counter": "0,1,2,3",
@@ -461,6 +466,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100400122",
         "Counter": "0,1,2,3",
@@ -473,6 +479,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00122",
         "Counter": "0,1,2,3",
@@ -485,6 +492,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100400091",
         "Counter": "0,1,2,3",
@@ -497,6 +505,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand & prefetch data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00091",
         "Counter": "0,1,2,3",
@@ -509,6 +518,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00200",
         "Counter": "0,1,2,3",
@@ -521,6 +531,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs  that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00100",
         "Counter": "0,1,2,3",
@@ -533,6 +544,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00080",
         "Counter": "0,1,2,3",
@@ -545,6 +557,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00040",
         "Counter": "0,1,2,3",
@@ -557,6 +570,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00020",
         "Counter": "0,1,2,3",
@@ -569,6 +583,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00010",
         "Counter": "0,1,2,3",
@@ -581,6 +596,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100400004",
         "Counter": "0,1,2,3",
@@ -593,6 +609,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00004",
         "Counter": "0,1,2,3",
@@ -605,6 +622,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100400002",
         "Counter": "0,1,2,3",
@@ -617,6 +635,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00002",
         "Counter": "0,1,2,3",
@@ -629,6 +648,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100400001",
         "Counter": "0,1,2,3",
@@ -641,6 +661,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00001",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/haswell/other.json b/tools/perf/pmu-events/arch/x86/haswell/other.json
index 85d6a14..8a4d898 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/other.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/other.json
@@ -10,16 +10,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Unhalted core cycles when the thread is not in ring 0.",
-        "EventCode": "0x5C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "CPL_CYCLES.RING123",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0x5C",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -31,6 +21,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Unhalted core cycles when the thread is not in ring 0.",
+        "EventCode": "0x5C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPL_CYCLES.RING123",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles in which the L1D and L2 are locked, due to a UC lock or split lock.",
         "EventCode": "0x63",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/haswell/pipeline.json b/tools/perf/pmu-events/arch/x86/haswell/pipeline.json
index 0099848..a4dcfce 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/pipeline.json
@@ -2,33 +2,43 @@
     {
         "PublicDescription": "This event counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. INST_RETIRED.ANY is counted by a designated fixed counter, leaving the programmable counters available for other events. Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "UMask": "0x1",
         "Errata": "HSD140, HSD143",
         "EventName": "INST_RETIRED.ANY",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Instructions retired from execution.",
-        "CounterHTOff": "Fixed counter 1"
+        "CounterHTOff": "Fixed counter 0"
     },
     {
         "PublicDescription": "This event counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
+        "Counter": "Fixed counter 1",
         "UMask": "0x2",
         "EventName": "CPU_CLK_UNHALTED.THREAD",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Core cycles when the thread is not in halt state.",
-        "CounterHTOff": "Fixed counter 2"
+        "CounterHTOff": "Fixed counter 1"
+    },
+    {
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 1",
+        "UMask": "0x2",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "CounterHTOff": "Fixed counter 1"
     },
     {
         "PublicDescription": "This event counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 3",
+        "Counter": "Fixed counter 2",
         "UMask": "0x3",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
         "PublicDescription": "This event counts loads that followed a store to the same address, where the data could not be forwarded inside the pipeline from the store to the load.  The most common reason why store forwarding would be blocked is when a load's address range overlaps with a preceding smaller uncompleted store. The penalty for blocked store forwarding is that the load must wait for the store to write its value to the cache before it can be issued.",
@@ -67,7 +77,19 @@
         "UMask": "0x3",
         "EventName": "INT_MISC.RECOVERY_CYCLES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread (e.g. misprediction or memory nuke)",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
+        "EventCode": "0x0D",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "AnyThread": "1",
+        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke)",
         "CounterMask": "1",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -82,6 +104,29 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x0E",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_ISSUED.STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0x0E",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "UOPS_ISSUED.CORE_STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for all threads.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PublicDescription": "Number of flags-merge uops allocated. Such uops add delay.",
         "EventCode": "0x0E",
         "Counter": "0,1,2,3",
@@ -112,29 +157,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x0E",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_ISSUED.STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x0E",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "UOPS_ISSUED.CORE_STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for all threads.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
         "EventCode": "0x14",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -144,6 +166,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Thread cycles when thread is not in halt state",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Increments at the frequency of XCLK (100 MHz) when not halted.",
         "EventCode": "0x3C",
         "Counter": "0,1,2,3",
@@ -154,6 +196,38 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Reference cycles when the thread is unhalted. (counts at 100 MHz rate)",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x3c",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -163,6 +237,15 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Non-SW-prefetch load dispatches that hit fill buffer allocated for S/W prefetch.",
         "EventCode": "0x4c",
         "Counter": "0,1,2,3",
@@ -233,6 +316,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x5E",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "RS_EVENTS.EMPTY_END",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts cycles where the decoder is stalled on an instruction with a length changing prefix (LCP).",
         "EventCode": "0x87",
         "Counter": "0,1,2,3",
@@ -409,6 +504,15 @@
     {
         "EventCode": "0x89",
         "Counter": "0,1,2,3",
+        "UMask": "0xa0",
+        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x89",
+        "Counter": "0,1,2,3",
         "UMask": "0xc1",
         "EventName": "BR_MISP_EXEC.ALL_CONDITIONAL",
         "SampleAfterValue": "200003",
@@ -445,6 +549,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles per core when uops are exectuted in port 0.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are executed in port 0.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_0",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 0.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles which a uop is dispatched on port 1 in this thread.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -455,6 +579,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles per core when uops are exectuted in port 1.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are executed in port 1.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 1.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles which a uop is dispatched on port 2 in this thread.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -465,6 +609,25 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_2",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 2.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles which a uop is dispatched on port 3 in this thread.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -475,6 +638,25 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 3.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles which a uop is dispatched on port 4 in this thread.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -485,6 +667,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles per core when uops are exectuted in port 4.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are executed in port 4.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 4.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles which a uop is dispatched on port 5 in this thread.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -495,6 +697,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles per core when uops are exectuted in port 5.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are executed in port 5.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 5.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles which a uop is dispatched on port 6 in this thread.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -505,6 +727,26 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles per core when uops are exectuted in port 6.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are executed in port 6.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_6",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 6.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles which a uop is dispatched on port 7 in this thread.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -515,6 +757,25 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "AnyThread": "1",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_7",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 7.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles allocation is stalled due to resource related reason.",
         "EventCode": "0xA2",
         "Counter": "0,1,2,3",
@@ -566,17 +827,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles with pending L1 data cache miss loads. Set Cmask=8 to count cycle.",
-        "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0x8",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with pending L1 cache miss loads.",
-        "CounterMask": "8",
-        "CounterHTOff": "2"
-    },
-    {
         "PublicDescription": "Cycles with pending memory loads. Set Cmask=2 to count cycle.",
         "EventCode": "0xA3",
         "Counter": "0,1,2,3",
@@ -594,7 +844,7 @@
         "UMask": "0x4",
         "EventName": "CYCLE_ACTIVITY.CYCLES_NO_EXECUTE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Total execution stalls",
+        "BriefDescription": "This event increments by 1 for every cycle where there was no execute for this thread.",
         "CounterMask": "4",
         "CounterHTOff": "0,1,2,3"
     },
@@ -621,6 +871,17 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Cycles with pending L1 data cache miss loads. Set Cmask=8 to count cycle.",
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0x8",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with pending L1 cache miss loads.",
+        "CounterMask": "8",
+        "CounterHTOff": "2"
+    },
+    {
         "PublicDescription": "Execution stalls due to L1 data cache miss loads. Set Cmask=0CH.",
         "EventCode": "0xA3",
         "Counter": "2",
@@ -642,14 +903,23 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Counts total number of uops to be executed per-core each cycle.",
-        "EventCode": "0xB1",
+        "EventCode": "0xA8",
         "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "Errata": "HSD30, HSM31",
-        "EventName": "UOPS_EXECUTED.CORE",
+        "UMask": "0x1",
+        "EventName": "LSD.CYCLES_ACTIVE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of uops executed on the core.",
+        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA8",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LSD.CYCLES_4_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+        "CounterMask": "4",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -665,368 +935,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of instructions at retirement.",
-        "EventCode": "0xC0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "Errata": "HSD11, HSD140",
-        "EventName": "INST_RETIRED.ANY_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
-        "EventCode": "0xC0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "INST_RETIRED.X87",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "FP operations retired. X87 FP operations that have no exceptions: Counts also flows that have several X87 or flows that use X87 uops in the exception handling.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution.",
-        "EventCode": "0xC0",
-        "Counter": "1",
-        "UMask": "0x1",
-        "Errata": "HSD140",
-        "EventName": "INST_RETIRED.PREC_DIST",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
-        "CounterHTOff": "1"
-    },
-    {
-        "PublicDescription": "Number of microcode assists invoked by HW upon uop writeback.",
-        "EventCode": "0xC1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of micro-ops retired. Use Cmask=1 and invert to count active cycles or stalled cycles.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.ALL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Actually retired uops.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7",
-        "Data_LA": "1"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This event counts the number of retirement slots used each cycle.  There are potentially 4 slots that can be used each cycle - meaning, 4 uops or 4 instructions could retire each cycle.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Retirement slots used.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with less than 10 actually retired uops.",
-        "CounterMask": "10",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "UOPS_RETIRED.CORE_STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "MACHINE_CLEARS.CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event is incremented when self-modifying code (SMC) is detected, which causes a machine clear.  Machine clears can have a significant performance impact if they are happening frequently.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "MACHINE_CLEARS.SMC",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Self-modifying code (SMC) detected.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "MACHINE_CLEARS.MASKMOV",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of conditional branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BR_INST_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Conditional branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "BR_INST_RETIRED.NEAR_CALL",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Direct and indirect near call instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Branch instructions at retirement.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of near return instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Return instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts the number of not taken branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Not taken branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Number of near taken branches retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Taken branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of far branches retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Far branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PEBS": "1",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted conditional branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Mispredicted branch instructions at retirement.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All mispredicted macro branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "This event counts all mispredicted branch instructions retired. This is a precise event.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted macro branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Count cases of saving new LBR records by hardware.",
-        "EventCode": "0xCC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Count cases of saving new LBR",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.",
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Thread cycles when thread is not in halt state",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x89",
-        "Counter": "0,1,2,3",
-        "UMask": "0xa0",
-        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 0.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 1.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 4.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 5.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 6.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "AnyThread": "1",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Number of near branch instructions retired that were taken but mispredicted.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "This events counts the cycles where at least one uop was executed. It is counted per thread.",
         "EventCode": "0xB1",
         "Counter": "0,1,2,3",
@@ -1074,171 +982,14 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of front end re-steers due to BPU misprediction.",
-        "EventCode": "0xe6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1f",
-        "EventName": "BACLEARS.ANY",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "MACHINE_CLEARS.COUNT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of machine clears (nukes) of any type.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LSD.CYCLES_ACTIVE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LSD.CYCLES_4_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x5E",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "RS_EVENTS.EMPTY_END",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_0",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 0.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
+        "PublicDescription": "Counts total number of uops to be executed per-core each cycle.",
+        "EventCode": "0xB1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
+        "Errata": "HSD30, HSM31",
+        "EventName": "UOPS_EXECUTED.CORE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 1.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_2",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 2.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 3.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 4.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 5.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 6.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_7",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 7.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
-        "UMask": "0x2",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
-        "EventCode": "0x0D",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "AnyThread": "1",
-        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke)",
-        "CounterMask": "1",
+        "BriefDescription": "Number of uops executed on the core.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -1297,33 +1048,291 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Reference cycles when the thread is unhalted. (counts at 100 MHz rate)",
-        "EventCode": "0x3C",
+        "PublicDescription": "Number of instructions at retirement.",
+        "EventCode": "0xC0",
         "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "UMask": "0x0",
+        "Errata": "HSD11, HSD140",
+        "EventName": "INST_RETIRED.ANY_P",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "EventCode": "0x3C",
+        "PEBS": "2",
+        "PublicDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution.",
+        "EventCode": "0xC0",
+        "Counter": "1",
+        "UMask": "0x1",
+        "Errata": "HSD140",
+        "EventName": "INST_RETIRED.PREC_DIST",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
+        "CounterHTOff": "1"
+    },
+    {
+        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
+        "EventCode": "0xC0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "INST_RETIRED.X87",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "FP operations retired. X87 FP operations that have no exceptions: Counts also flows that have several X87 or flows that use X87 uops in the exception handling.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of microcode assists invoked by HW upon uop writeback.",
+        "EventCode": "0xC1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.ALL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Actually retired uops.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7",
+        "Data_LA": "1"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with less than 10 actually retired uops.",
+        "CounterMask": "10",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "EventName": "UOPS_RETIRED.CORE_STALL_CYCLES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate)",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Retirement slots used.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "MACHINE_CLEARS.CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "MACHINE_CLEARS.COUNT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This event is incremented when self-modifying code (SMC) is detected, which causes a machine clear.  Machine clears can have a significant performance impact if they are happening frequently.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "MACHINE_CLEARS.SMC",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Self-modifying code (SMC) detected.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "MACHINE_CLEARS.MASKMOV",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Branch instructions at retirement.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_INST_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Conditional branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Direct and indirect near call instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL_R3",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Direct and indirect macro near call instructions retired (captured in ring 3).",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Return instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of not taken branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Not taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of far branches retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Far branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Mispredicted branch instructions at retirement.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted conditional branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "PublicDescription": "This event counts all mispredicted branch instructions retired. This is a precise event.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Count cases of saving new LBR records by hardware.",
+        "EventCode": "0xCC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "BriefDescription": "Count cases of saving new LBR",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of front end re-steers due to BPU misprediction.",
+        "EventCode": "0xe6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1f",
+        "EventName": "BACLEARS.ANY",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/haswell/virtual-memory.json b/tools/perf/pmu-events/arch/x86/haswell/virtual-memory.json
index ce80a08..777b500 100644
--- a/tools/perf/pmu-events/arch/x86/haswell/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/haswell/virtual-memory.json
@@ -39,6 +39,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Completed page walks in any TLB of any page size due to demand load misses.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts cycles when the  page miss handler (PMH) is servicing page walks caused by DTLB load misses.",
         "EventCode": "0x08",
         "Counter": "0,1,2,3",
@@ -69,6 +79,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Number of cache load STLB hits. No page walk.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x60",
+        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "DTLB demand load misses with low part of linear-to-physical address translation missed.",
         "EventCode": "0x08",
         "Counter": "0,1,2,3",
@@ -118,6 +138,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Completed page walks due to store miss in any TLB levels of any page size (4K/2M/4M/1G).",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts cycles when the  page miss handler (PMH) is servicing page walks caused by DTLB store misses.",
         "EventCode": "0x49",
         "Counter": "0,1,2,3",
@@ -148,6 +178,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x60",
+        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "DTLB store misses with low part of linear-to-physical address translation missed.",
         "EventCode": "0x49",
         "Counter": "0,1,2,3",
@@ -206,6 +246,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Completed page walks in ITLB of any page size.",
+        "EventCode": "0x85",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "EventName": "ITLB_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Misses in all ITLB levels that cause completed page walks",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "This event counts cycles when the  page miss handler (PMH) is servicing page walks caused by ITLB misses.",
         "EventCode": "0x85",
         "Counter": "0,1,2,3",
@@ -236,6 +286,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "ITLB misses that hit STLB. No page walk.",
+        "EventCode": "0x85",
+        "Counter": "0,1,2,3",
+        "UMask": "0x60",
+        "EventName": "ITLB_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts the number of ITLB flushes, includes 4k/2M/4M pages.",
         "EventCode": "0xae",
         "Counter": "0,1,2,3",
@@ -256,34 +316,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of ITLB page walker loads that hit in the L1+FB.",
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x21",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of ITLB page walker hits in the L1+FB",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x41",
-        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L1",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L1 and FB.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x81",
-        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L1",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L1 and FB.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
         "PublicDescription": "Number of DTLB page walker loads that hit in the L2.",
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
@@ -294,34 +326,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of ITLB page walker loads that hit in the L2.",
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x22",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of ITLB page walker hits in the L2",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x42",
-        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L2",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L2.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x82",
-        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L2",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L2.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
         "PublicDescription": "Number of DTLB page walker loads that hit in the L3.",
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
@@ -333,35 +337,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of ITLB page walker loads that hit in the L3.",
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x24",
-        "Errata": "HSD25",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x44",
-        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L3.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x84",
-        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L2.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
         "PublicDescription": "Number of DTLB page walker loads from memory.",
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
@@ -373,6 +348,37 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Number of ITLB page walker loads that hit in the L1+FB.",
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x21",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of ITLB page walker hits in the L1+FB",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PublicDescription": "Number of ITLB page walker loads that hit in the L2.",
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x22",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of ITLB page walker hits in the L2",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PublicDescription": "Number of ITLB page walker loads that hit in the L3.",
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x24",
+        "Errata": "HSD25",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PublicDescription": "Number of ITLB page walker loads from memory.",
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
@@ -386,6 +392,33 @@
     {
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
+        "UMask": "0x41",
+        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L1 and FB.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x42",
+        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L2",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L2.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x44",
+        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L3.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
         "UMask": "0x48",
         "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_MEMORY",
         "SampleAfterValue": "2000003",
@@ -395,6 +428,33 @@
     {
         "EventCode": "0xBC",
         "Counter": "0,1,2,3",
+        "UMask": "0x81",
+        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L1 and FB.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x82",
+        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L2",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L2.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x84",
+        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L2.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "Counter": "0,1,2,3",
         "UMask": "0x88",
         "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_MEMORY",
         "SampleAfterValue": "2000003",
@@ -420,65 +480,5 @@
         "SampleAfterValue": "100003",
         "BriefDescription": "STLB flush attempts",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Completed page walks in any TLB of any page size due to demand load misses.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of cache load STLB hits. No page walk.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x60",
-        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Completed page walks due to store miss in any TLB levels of any page size (4K/2M/4M/1G).",
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x60",
-        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Completed page walks in ITLB of any page size.",
-        "EventCode": "0x85",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "EventName": "ITLB_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Misses in all ITLB levels that cause completed page walks",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "ITLB misses that hit STLB. No page walk.",
-        "EventCode": "0x85",
-        "Counter": "0,1,2,3",
-        "UMask": "0x60",
-        "EventName": "ITLB_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/cache.json b/tools/perf/pmu-events/arch/x86/haswellx/cache.json
index f1bae08..b2fbd61 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/cache.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/cache.json
@@ -12,6 +12,58 @@
     },
     {
         "EventCode": "0x24",
+        "UMask": "0x22",
+        "BriefDescription": "RFO requests that miss L2 cache",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.RFO_MISS",
+        "PublicDescription": "Counts the number of store RFO requests that miss the L2 cache.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x24",
+        "BriefDescription": "L2 cache misses when fetching instructions",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.CODE_RD_MISS",
+        "PublicDescription": "Number of instruction fetches that missed the L2 cache.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x27",
+        "BriefDescription": "Demand requests that miss L2 cache",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
+        "Errata": "HSD78",
+        "PublicDescription": "Demand requests that miss L2 cache.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x30",
+        "BriefDescription": "L2 prefetch requests that miss L2 cache",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.L2_PF_MISS",
+        "PublicDescription": "Counts all L2 HW prefetcher requests that missed L2.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x3f",
+        "BriefDescription": "All requests that miss L2 cache",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.MISS",
+        "Errata": "HSD78",
+        "PublicDescription": "All requests that missed L2.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
         "UMask": "0x41",
         "BriefDescription": "Demand Data Read requests that hit L2 cache",
         "Counter": "0,1,2,3",
@@ -23,11 +75,21 @@
     },
     {
         "EventCode": "0x24",
-        "UMask": "0x30",
-        "BriefDescription": "L2 prefetch requests that miss L2 cache",
+        "UMask": "0x42",
+        "BriefDescription": "RFO requests that hit L2 cache",
         "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.L2_PF_MISS",
-        "PublicDescription": "Counts all L2 HW prefetcher requests that missed L2.",
+        "EventName": "L2_RQSTS.RFO_HIT",
+        "PublicDescription": "Counts the number of store RFO requests that hit the L2 cache.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
+        "UMask": "0x44",
+        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.CODE_RD_HIT",
+        "PublicDescription": "Number of instruction fetches that hit the L2 cache.",
         "SampleAfterValue": "200003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -74,6 +136,17 @@
     },
     {
         "EventCode": "0x24",
+        "UMask": "0xe7",
+        "BriefDescription": "Demand requests to L2 cache",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
+        "Errata": "HSD78",
+        "PublicDescription": "Demand requests to L2 cache.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x24",
         "UMask": "0xf8",
         "BriefDescription": "Requests from L2 hardware prefetchers",
         "Counter": "0,1,2,3",
@@ -83,6 +156,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x24",
+        "UMask": "0xff",
+        "BriefDescription": "All L2 requests",
+        "Counter": "0,1,2,3",
+        "EventName": "L2_RQSTS.REFERENCES",
+        "Errata": "HSD78",
+        "PublicDescription": "All requests to L2 cache.",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x27",
         "UMask": "0x50",
         "BriefDescription": "Not rejected writebacks that hit L2 cache",
@@ -124,6 +208,27 @@
     },
     {
         "EventCode": "0x48",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with L1D load Misses outstanding.",
+        "Counter": "2",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0x48",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
+        "Counter": "2",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
+        "AnyThread": "1",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0x48",
         "UMask": "0x2",
         "BriefDescription": "Number of times a request needed a FB entry but there was no entry available for it. That is the FB unavailability was dominant reason for blocking the request. A request includes cacheable/uncacheable demands that is load, store or SW prefetch. HWP are e.",
         "Counter": "0,1,2,3",
@@ -133,13 +238,13 @@
     },
     {
         "EventCode": "0x48",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with L1D load Misses outstanding.",
-        "Counter": "2",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
+        "Counter": "0,1,2,3",
+        "EventName": "L1D_PEND_MISS.FB_FULL",
         "CounterMask": "1",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x51",
@@ -164,6 +269,28 @@
     },
     {
         "EventCode": "0x60",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
+        "CounterMask": "1",
+        "Errata": "HSD78, HSD62, HSD61",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
+        "CounterMask": "6",
+        "Errata": "HSD78, HSD62, HSD61",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
         "UMask": "0x2",
         "BriefDescription": "Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
         "Counter": "0,1,2,3",
@@ -186,6 +313,17 @@
     },
     {
         "EventCode": "0x60",
+        "UMask": "0x4",
+        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle.",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
+        "CounterMask": "1",
+        "Errata": "HSD62, HSD61",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
         "UMask": "0x8",
         "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
         "Counter": "0,1,2,3",
@@ -197,17 +335,6 @@
     },
     {
         "EventCode": "0x60",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
-        "CounterMask": "1",
-        "Errata": "HSD78, HSD62, HSD61",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x60",
         "UMask": "0x8",
         "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
         "Counter": "0,1,2,3",
@@ -218,17 +345,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x60",
-        "UMask": "0x4",
-        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle.",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
-        "CounterMask": "1",
-        "Errata": "HSD62, HSD61",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0x63",
         "UMask": "0x2",
         "BriefDescription": "Cycles when L1D is locked",
@@ -289,9 +405,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0xD0",
         "UMask": "0x11",
-        "BriefDescription": "Retired load uops that miss the STLB.",
+        "BriefDescription": "Retired load uops that miss the STLB. (precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -303,20 +428,20 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x12",
-        "BriefDescription": "Retired store uops that miss the STLB.",
+        "BriefDescription": "Retired store uops that miss the STLB. (precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_STORES",
         "Errata": "HSD29, HSM30",
-        "SampleAfterValue": "100003",
         "L1_Hit_Indication": "1",
+        "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x21",
-        "BriefDescription": "Retired load uops with locked access.",
+        "BriefDescription": "Retired load uops with locked access. (precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -328,32 +453,34 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x41",
-        "BriefDescription": "Retired load uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired load uops that split across a cacheline boundary. (precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS",
         "Errata": "HSD29, HSM30",
+        "PublicDescription": "This event counts load uops retired which had memory addresses spilt across 2 cache lines. A line split is across 64B cache-lines which may include a page split (4K). This is a precise event.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x42",
-        "BriefDescription": "Retired store uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES",
         "Errata": "HSD29, HSM30",
-        "SampleAfterValue": "100003",
         "L1_Hit_Indication": "1",
+        "PublicDescription": "This event counts store uops retired which had memory addresses spilt across 2 cache lines. A line split is across 64B cache-lines which may include a page split (4K). This is a precise event.",
+        "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x81",
-        "BriefDescription": "All retired load uops.",
+        "BriefDescription": "All retired load uops. (precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -365,14 +492,15 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x82",
-        "BriefDescription": "All retired store uops.",
+        "BriefDescription": "All retired store uops. (precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
         "Errata": "HSD29, HSM30",
-        "SampleAfterValue": "2000003",
         "L1_Hit_Indication": "1",
+        "PublicDescription": "This event counts all store uops retired. This is a precise event.",
+        "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -402,13 +530,13 @@
     {
         "EventCode": "0xD1",
         "UMask": "0x4",
-        "BriefDescription": "Retired load uops which data sources were data hits in L3 without snoops required.",
+        "BriefDescription": "Miss in last-level (L3) cache. Excludes Unknown data-source.",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L3_HIT",
         "Errata": "HSD74, HSD29, HSD25, HSM26, HSM30",
-        "PublicDescription": "Retired load uops with L3 cache hits as data sources.",
+        "PublicDescription": "This event counts retired load uops in which data sources were data hits in the L3 cache without snoops required. This does not include hardware prefetches. This is a precise event.",
         "SampleAfterValue": "50021",
         "CounterHTOff": "0,1,2,3"
     },
@@ -421,20 +549,19 @@
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
         "Errata": "HSM30",
-        "PublicDescription": "Retired load uops missed L1 cache as data sources.",
+        "PublicDescription": "This event counts retired load uops in which data sources missed in the L1 cache. This does not include hardware prefetches. This is a precise event.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD1",
         "UMask": "0x10",
-        "BriefDescription": "Miss in mid-level (L2) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Retired load uops with L2 cache misses as data sources.",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_MISS",
         "Errata": "HSD29, HSM30",
-        "PublicDescription": "Retired load uops missed L2. Unknown data source excluded.",
         "SampleAfterValue": "50021",
         "CounterHTOff": "0,1,2,3"
     },
@@ -447,7 +574,6 @@
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L3_MISS",
         "Errata": "HSD74, HSD29, HSD25, HSM26, HSM30",
-        "PublicDescription": "Retired load uops missed L3. Excludes unknown data source .",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -478,24 +604,26 @@
     {
         "EventCode": "0xD2",
         "UMask": "0x2",
-        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache.",
+        "BriefDescription": "Retired load uops which data sources were L3 and cross-core snoop hits in on-pkg core cache. ",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HIT",
         "Errata": "HSD29, HSD25, HSM26, HSM30",
+        "PublicDescription": "This event counts retired load uops that hit in the L3 cache, but required a cross-core snoop which resulted in a HIT in an on-pkg core cache. This does not include hardware prefetches. This is a precise event.",
         "SampleAfterValue": "20011",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD2",
         "UMask": "0x4",
-        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3.",
+        "BriefDescription": "Retired load uops which data sources were HitM responses from shared L3. ",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_HIT_RETIRED.XSNP_HITM",
         "Errata": "HSD29, HSD25, HSM26, HSM30",
+        "PublicDescription": "This event counts retired load uops that hit in the L3 cache, but required a cross-core snoop which resulted in a HITM (hit modified) in an on-pkg core cache. This does not include hardware prefetches. This is a precise event.",
         "SampleAfterValue": "20011",
         "CounterHTOff": "0,1,2,3"
     },
@@ -514,20 +642,19 @@
     {
         "EventCode": "0xD3",
         "UMask": "0x1",
-        "BriefDescription": "Data from local DRAM either Snoop not needed or Snoop Miss (RspI)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_UOPS_L3_MISS_RETIRED.LOCAL_DRAM",
         "Errata": "HSD74, HSD29, HSD25, HSM30",
-        "PublicDescription": "This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches.",
+        "PublicDescription": "This event counts retired load uops where the data came from local DRAM. This does not include hardware prefetches. This is a precise event.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD3",
         "UMask": "0x4",
-        "BriefDescription": "Retired load uop whose Data Source was: remote DRAM either Snoop not needed or Snoop Miss (RspI)",
+        "BriefDescription": "Retired load uop whose Data Source was: remote DRAM either Snoop not needed or Snoop Miss (RspI) (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -539,7 +666,7 @@
     {
         "EventCode": "0xD3",
         "UMask": "0x10",
-        "BriefDescription": "Retired load uop whose Data Source was: Remote cache HITM",
+        "BriefDescription": "Retired load uop whose Data Source was: Remote cache HITM (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -551,7 +678,7 @@
     {
         "EventCode": "0xD3",
         "UMask": "0x20",
-        "BriefDescription": "Retired load uop whose Data Source was: forwarded from remote cache",
+        "BriefDescription": "Retired load uop whose Data Source was: forwarded from remote cache (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -706,135 +833,11 @@
         "BriefDescription": "Split locks in SQ",
         "Counter": "0,1,2,3",
         "EventName": "SQ_MISC.SPLIT_LOCK",
+        "PublicDescription": "",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x24",
-        "UMask": "0x42",
-        "BriefDescription": "RFO requests that hit L2 cache",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.RFO_HIT",
-        "PublicDescription": "Counts the number of store RFO requests that hit the L2 cache.",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x22",
-        "BriefDescription": "RFO requests that miss L2 cache",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.RFO_MISS",
-        "PublicDescription": "Counts the number of store RFO requests that miss the L2 cache.",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x44",
-        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.CODE_RD_HIT",
-        "PublicDescription": "Number of instruction fetches that hit the L2 cache.",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x24",
-        "BriefDescription": "L2 cache misses when fetching instructions",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.CODE_RD_MISS",
-        "PublicDescription": "Number of instruction fetches that missed the L2 cache.",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x27",
-        "BriefDescription": "Demand requests that miss L2 cache",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
-        "Errata": "HSD78",
-        "PublicDescription": "Demand requests that miss L2 cache.",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0xe7",
-        "BriefDescription": "Demand requests to L2 cache",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
-        "Errata": "HSD78",
-        "PublicDescription": "Demand requests to L2 cache.",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0x3f",
-        "BriefDescription": "All requests that miss L2 cache",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.MISS",
-        "Errata": "HSD78",
-        "PublicDescription": "All requests that missed L2.",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "UMask": "0xff",
-        "BriefDescription": "All L2 requests",
-        "Counter": "0,1,2,3",
-        "EventName": "L2_RQSTS.REFERENCES",
-        "Errata": "HSD78",
-        "PublicDescription": "All requests to L2 cache.",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "UMask": "0x1",
-        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_RESPONSE",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x60",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
-        "Counter": "0,1,2,3",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
-        "CounterMask": "6",
-        "Errata": "HSD78, HSD62, HSD61",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x48",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
-        "Counter": "2",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
-        "AnyThread": "1",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0x48",
-        "UMask": "0x2",
-        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
-        "Counter": "0,1,2,3",
-        "EventName": "L1D_PEND_MISS.FB_FULL",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "Offcore": "1",
         "EventCode": "0xB7, 0xBB",
         "UMask": "0x1",
@@ -843,6 +846,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -855,6 +859,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -867,6 +872,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -879,6 +885,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -891,6 +898,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -903,6 +911,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -915,6 +924,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -927,6 +937,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -939,6 +950,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_CODE_RD.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -951,6 +963,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_DATA_RD.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -963,6 +976,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_RFO.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -975,6 +989,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -987,6 +1002,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -999,6 +1015,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1011,6 +1028,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1023,6 +1041,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1035,6 +1054,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1047,6 +1067,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1059,6 +1080,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1071,6 +1093,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_REQUESTS.LLC_HIT.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all requests that hit in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     }
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/floating-point.json b/tools/perf/pmu-events/arch/x86/haswellx/floating-point.json
index 6282aed..bc08cc1 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/floating-point.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/floating-point.json
@@ -20,6 +20,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xC6",
+        "UMask": "0x7",
+        "BriefDescription": "Approximate counts of AVX & AVX2 256-bit instructions, including non-arithmetic instructions, loads, and stores.  May count non-AVX instructions that employ 256-bit operations, including (but not necessarily limited to) rep string instructions that use 256-bit loads and stores for optimized performance, XSAVE* and XRSTOR*, and operations that transition the x87 FPU data registers between x87 and MMX.",
+        "Counter": "0,1,2,3",
+        "EventName": "AVX_INSTS.ALL",
+        "PublicDescription": "Note that a whole rep string only counts AVX_INST.ALL once.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xCA",
         "UMask": "0x2",
         "BriefDescription": "Number of X87 assists due to output value.",
@@ -69,15 +79,5 @@
         "PublicDescription": "Cycles with any input/output SSE* or FP assists.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC6",
-        "UMask": "0x7",
-        "BriefDescription": "Approximate counts of AVX & AVX2 256-bit instructions, including non-arithmetic instructions, loads, and stores.  May count non-AVX instructions that employ 256-bit operations, including (but not necessarily limited to) rep string instructions that use 256-bit loads and stores for optimized performance, XSAVE* and XRSTOR*, and operations that transition the x87 FPU data registers between x87 and MMX.",
-        "Counter": "0,1,2,3",
-        "EventName": "AVX_INSTS.ALL",
-        "PublicDescription": "Note that a whole rep string only counts AVX_INST.ALL once.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/frontend.json b/tools/perf/pmu-events/arch/x86/haswellx/frontend.json
index 2d0c7aa..a4d9f1fc 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/frontend.json
@@ -22,57 +22,6 @@
     },
     {
         "EventCode": "0x79",
-        "UMask": "0x8",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.DSB_UOPS",
-        "PublicDescription": "Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x10",
-        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_DSB_UOPS",
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x20",
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_MITE_UOPS",
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_UOPS",
-        "PublicDescription": "This event counts uops delivered by the Front-end with the assistance of the microcode sequencer.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_CYCLES",
-        "CounterMask": "1",
-        "PublicDescription": "This event counts cycles during which the microcode sequencer assisted the Front-end in delivering uops.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x79",
         "UMask": "0x4",
         "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path.",
         "Counter": "0,1,2,3",
@@ -84,6 +33,16 @@
     {
         "EventCode": "0x79",
         "UMask": "0x8",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.DSB_UOPS",
+        "PublicDescription": "Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x8",
         "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path.",
         "Counter": "0,1,2,3",
         "EventName": "IDQ.DSB_CYCLES",
@@ -94,6 +53,16 @@
     {
         "EventCode": "0x79",
         "UMask": "0x10",
+        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_DSB_UOPS",
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x10",
         "BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy.",
         "Counter": "0,1,2,3",
         "EventName": "IDQ.MS_DSB_CYCLES",
@@ -136,6 +105,16 @@
     },
     {
         "EventCode": "0x79",
+        "UMask": "0x20",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_MITE_UOPS",
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
         "UMask": "0x24",
         "BriefDescription": "Cycles MITE is delivering 4 Uops",
         "Counter": "0,1,2,3",
@@ -158,6 +137,38 @@
     },
     {
         "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_UOPS",
+        "PublicDescription": "This event counts uops delivered by the Front-end with the assistance of the microcode sequencer.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_CYCLES",
+        "CounterMask": "1",
+        "PublicDescription": "This event counts cycles during which the microcode sequencer assisted the Front-end in delivering uops.  Microcode assists are used for complex instructions or scenarios that can't be handled by the standard decoder.  Using other instructions, if possible, will usually improve performance.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EdgeDetect": "1",
+        "EventCode": "0x79",
+        "UMask": "0x30",
+        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
+        "Counter": "0,1,2,3",
+        "EventName": "IDQ.MS_SWITCHES",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x79",
         "UMask": "0x3c",
         "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path",
         "Counter": "0,1,2,3",
@@ -195,6 +206,15 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x80",
+        "UMask": "0x4",
+        "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction-cache miss.",
+        "Counter": "0,1,2,3",
+        "EventName": "ICACHE.IFDATA_STALL",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x9C",
         "UMask": "0x1",
         "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled",
@@ -270,25 +290,5 @@
         "EventName": "DSB2MITE_SWITCHES.PENALTY_CYCLES",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "EventCode": "0x79",
-        "UMask": "0x30",
-        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
-        "Counter": "0,1,2,3",
-        "EventName": "IDQ.MS_SWITCHES",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x80",
-        "UMask": "0x4",
-        "BriefDescription": "Cycles where a code fetch is stalled due to L1 instruction-cache miss.",
-        "Counter": "0,1,2,3",
-        "EventName": "ICACHE.IFDATA_STALL",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/memory.json b/tools/perf/pmu-events/arch/x86/haswellx/memory.json
index 0886cc0..56b0f24 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/memory.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/memory.json
@@ -409,6 +409,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts demand data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -421,6 +422,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -433,6 +435,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -445,6 +448,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -457,6 +461,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.LLC_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the modified data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -469,6 +474,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -481,6 +487,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -493,6 +500,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -505,6 +513,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -517,6 +526,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_CODE_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -529,6 +539,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_DATA_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -541,6 +552,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_RFO.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -553,6 +565,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_LLC_CODE_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts prefetch (that bring data to LLC only) code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -565,6 +578,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -577,6 +591,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -589,6 +604,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.REMOTE_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from remote dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -601,6 +617,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the modified data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -613,6 +630,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.LLC_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -625,6 +643,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -637,6 +656,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -649,6 +669,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch code reads that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -661,6 +682,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_CODE_RD.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all demand & prefetch code reads that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -673,6 +695,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -685,6 +708,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.LOCAL_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and the data is returned from local dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -697,6 +721,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.REMOTE_DRAM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and the data is returned from remote dram Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -709,6 +734,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and the modified data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -721,6 +747,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_READS.LLC_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all data/code/rfo reads (demand & prefetch) that miss the L3 and clean or shared data is transferred from remote cache Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -733,6 +760,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_REQUESTS.LLC_MISS.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts all requests that miss in the L3 Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     }
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/other.json b/tools/perf/pmu-events/arch/x86/haswellx/other.json
index 4e1b6ce..800e65d 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/other.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/other.json
@@ -10,16 +10,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x5C",
-        "UMask": "0x2",
-        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
-        "Counter": "0,1,2,3",
-        "EventName": "CPL_CYCLES.RING123",
-        "PublicDescription": "Unhalted core cycles when the thread is not in ring 0.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EdgeDetect": "1",
         "EventCode": "0x5C",
         "UMask": "0x1",
@@ -31,6 +21,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x5C",
+        "UMask": "0x2",
+        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
+        "Counter": "0,1,2,3",
+        "EventName": "CPL_CYCLES.RING123",
+        "PublicDescription": "Unhalted core cycles when the thread is not in ring 0.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x63",
         "UMask": "0x1",
         "BriefDescription": "Cycles when L1 and L2 are locked due to UC or split lock",
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/pipeline.json b/tools/perf/pmu-events/arch/x86/haswellx/pipeline.json
index c3a163d..8a18bfe 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/pipeline.json
@@ -3,32 +3,42 @@
         "EventCode": "0x00",
         "UMask": "0x1",
         "BriefDescription": "Instructions retired from execution.",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "EventName": "INST_RETIRED.ANY",
         "Errata": "HSD140, HSD143",
         "PublicDescription": "This event counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. INST_RETIRED.ANY is counted by a designated fixed counter, leaving the programmable counters available for other events. Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.",
         "SampleAfterValue": "2000003",
+        "CounterHTOff": "Fixed counter 0"
+    },
+    {
+        "EventCode": "0x00",
+        "UMask": "0x2",
+        "BriefDescription": "Core cycles when the thread is not in halt state.",
+        "Counter": "Fixed counter 1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD",
+        "PublicDescription": "This event counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.",
+        "SampleAfterValue": "2000003",
         "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "UMask": "0x2",
-        "BriefDescription": "Core cycles when the thread is not in halt state.",
-        "Counter": "Fixed counter 2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD",
-        "PublicDescription": "This event counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "Counter": "Fixed counter 1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
+        "AnyThread": "1",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 2"
+        "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "UMask": "0x3",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "Counter": "Fixed counter 3",
+        "Counter": "Fixed counter 2",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "PublicDescription": "This event counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state.",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
         "EventCode": "0x03",
@@ -63,7 +73,7 @@
     {
         "EventCode": "0x0D",
         "UMask": "0x3",
-        "BriefDescription": "Number of cycles waiting for the checkpoints in Resource Allocation Table (RAT) to be recovered after Nuke due to all other cases except JEClear (e.g. whenever a ucode assist is needed like SSE exception, memory disambiguation, etc...)",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for this thread (e.g. misprediction or memory nuke)",
         "Counter": "0,1,2,3",
         "EventName": "INT_MISC.RECOVERY_CYCLES",
         "CounterMask": "1",
@@ -72,6 +82,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x0D",
+        "UMask": "0x3",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke)",
+        "Counter": "0,1,2,3",
+        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
+        "AnyThread": "1",
+        "CounterMask": "1",
+        "PublicDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x0E",
         "UMask": "0x1",
         "BriefDescription": "Uops that Resource Allocation Table (RAT) issues to Reservation Station (RS)",
@@ -82,6 +104,29 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "Invert": "1",
+        "EventCode": "0x0E",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_ISSUED.STALL_CYCLES",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Invert": "1",
+        "EventCode": "0x0E",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for all threads.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_ISSUED.CORE_STALL_CYCLES",
+        "AnyThread": "1",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "EventCode": "0x0E",
         "UMask": "0x10",
         "BriefDescription": "Number of flags-merge uops being allocated. Such uops considered perf sensitive; added by GSR u-arch.",
@@ -112,29 +157,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "Invert": "1",
-        "EventCode": "0x0E",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_ISSUED.STALL_CYCLES",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "Invert": "1",
-        "EventCode": "0x0E",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for all threads.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_ISSUED.CORE_STALL_CYCLES",
-        "AnyThread": "1",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
         "EventCode": "0x14",
         "UMask": "0x2",
         "BriefDescription": "Any uop executed by the Divider. (This includes all divide uops, sqrt, ...)",
@@ -145,6 +167,26 @@
     },
     {
         "EventCode": "0x3C",
+        "UMask": "0x0",
+        "BriefDescription": "Thread cycles when thread is not in halt state",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
+        "PublicDescription": "Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x0",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
         "UMask": "0x1",
         "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
         "Counter": "0,1,2,3",
@@ -154,6 +196,38 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate)",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
+        "AnyThread": "1",
+        "PublicDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "PublicDescription": "Reference cycles when the thread is unhalted. (counts at 100 MHz rate)",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "UMask": "0x1",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate)",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "AnyThread": "1",
+        "PublicDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x3c",
         "UMask": "0x2",
         "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
@@ -163,6 +237,15 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0x3C",
+        "UMask": "0x2",
+        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "Counter": "0,1,2,3",
+        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x4c",
         "UMask": "0x1",
         "BriefDescription": "Not software-prefetch load dispatches that hit FB allocated for software prefetch",
@@ -233,6 +316,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EdgeDetect": "1",
+        "Invert": "1",
+        "EventCode": "0x5E",
+        "UMask": "0x1",
+        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
+        "Counter": "0,1,2,3",
+        "EventName": "RS_EVENTS.EMPTY_END",
+        "CounterMask": "1",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x87",
         "UMask": "0x1",
         "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
@@ -408,6 +503,15 @@
     },
     {
         "EventCode": "0x89",
+        "UMask": "0xa0",
+        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
+        "SampleAfterValue": "200003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x89",
         "UMask": "0xc1",
         "BriefDescription": "Speculative and retired mispredicted macro conditional branches.",
         "Counter": "0,1,2,3",
@@ -446,6 +550,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles per core when uops are executed in port 0.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
+        "AnyThread": "1",
+        "PublicDescription": "Cycles per core when uops are exectuted in port 0.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles per thread when uops are executed in port 0.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_0",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x2",
         "BriefDescription": "Cycles per thread when uops are executed in port 1",
         "Counter": "0,1,2,3",
@@ -456,6 +580,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles per core when uops are executed in port 1.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
+        "AnyThread": "1",
+        "PublicDescription": "Cycles per core when uops are exectuted in port 1.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x2",
+        "BriefDescription": "Cycles per thread when uops are executed in port 1.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x4",
         "BriefDescription": "Cycles per thread when uops are executed in port 2",
         "Counter": "0,1,2,3",
@@ -466,6 +610,25 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x4",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x4",
+        "BriefDescription": "Cycles per thread when uops are executed in port 2.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_2",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x8",
         "BriefDescription": "Cycles per thread when uops are executed in port 3",
         "Counter": "0,1,2,3",
@@ -476,6 +639,25 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles per thread when uops are executed in port 3.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x10",
         "BriefDescription": "Cycles per thread when uops are executed in port 4",
         "Counter": "0,1,2,3",
@@ -486,6 +668,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x10",
+        "BriefDescription": "Cycles per core when uops are executed in port 4.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
+        "AnyThread": "1",
+        "PublicDescription": "Cycles per core when uops are exectuted in port 4.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x10",
+        "BriefDescription": "Cycles per thread when uops are executed in port 4.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x20",
         "BriefDescription": "Cycles per thread when uops are executed in port 5",
         "Counter": "0,1,2,3",
@@ -496,6 +698,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x20",
+        "BriefDescription": "Cycles per core when uops are executed in port 5.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
+        "AnyThread": "1",
+        "PublicDescription": "Cycles per core when uops are exectuted in port 5.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x20",
+        "BriefDescription": "Cycles per thread when uops are executed in port 5.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x40",
         "BriefDescription": "Cycles per thread when uops are executed in port 6",
         "Counter": "0,1,2,3",
@@ -506,6 +728,26 @@
     },
     {
         "EventCode": "0xA1",
+        "UMask": "0x40",
+        "BriefDescription": "Cycles per core when uops are executed in port 6.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
+        "AnyThread": "1",
+        "PublicDescription": "Cycles per core when uops are exectuted in port 6.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x40",
+        "BriefDescription": "Cycles per thread when uops are executed in port 6.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_6",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
         "UMask": "0x80",
         "BriefDescription": "Cycles per thread when uops are executed in port 7",
         "Counter": "0,1,2,3",
@@ -515,6 +757,25 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xA1",
+        "UMask": "0x80",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
+        "AnyThread": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA1",
+        "UMask": "0x80",
+        "BriefDescription": "Cycles per thread when uops are executed in port 7.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_7",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xA2",
         "UMask": "0x1",
         "BriefDescription": "Resource-related stall cycles",
@@ -567,17 +828,6 @@
     },
     {
         "EventCode": "0xA3",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles with pending L1 cache miss loads.",
-        "Counter": "2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
-        "CounterMask": "8",
-        "PublicDescription": "Cycles with pending L1 data cache miss loads. Set Cmask=8 to count cycle.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
         "UMask": "0x2",
         "BriefDescription": "Cycles with pending memory loads.",
         "Counter": "0,1,2,3",
@@ -590,7 +840,7 @@
     {
         "EventCode": "0xA3",
         "UMask": "0x4",
-        "BriefDescription": "Total execution stalls",
+        "BriefDescription": "This event increments by 1 for every cycle where there was no execute for this thread.",
         "Counter": "0,1,2,3",
         "EventName": "CYCLE_ACTIVITY.CYCLES_NO_EXECUTE",
         "CounterMask": "4",
@@ -622,6 +872,17 @@
     },
     {
         "EventCode": "0xA3",
+        "UMask": "0x8",
+        "BriefDescription": "Cycles with pending L1 cache miss loads.",
+        "Counter": "2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "CounterMask": "8",
+        "PublicDescription": "Cycles with pending L1 data cache miss loads. Set Cmask=8 to count cycle.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0xA3",
         "UMask": "0xc",
         "BriefDescription": "Execution stalls due to L1 data cache misses",
         "Counter": "2",
@@ -642,13 +903,22 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0xB1",
-        "UMask": "0x2",
-        "BriefDescription": "Number of uops executed on the core.",
+        "EventCode": "0xA8",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
         "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED.CORE",
-        "Errata": "HSD30, HSM31",
-        "PublicDescription": "Counts total number of uops to be executed per-core each cycle.",
+        "EventName": "LSD.CYCLES_ACTIVE",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA8",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+        "Counter": "0,1,2,3",
+        "EventName": "LSD.CYCLES_4_UOPS",
+        "CounterMask": "4",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -665,368 +935,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xC0",
-        "UMask": "0x0",
-        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
-        "Counter": "0,1,2,3",
-        "EventName": "INST_RETIRED.ANY_P",
-        "Errata": "HSD11, HSD140",
-        "PublicDescription": "Number of instructions at retirement.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC0",
-        "UMask": "0x2",
-        "BriefDescription": "FP operations retired. X87 FP operations that have no exceptions: Counts also flows that have several X87 or flows that use X87 uops in the exception handling.",
-        "Counter": "0,1,2,3",
-        "EventName": "INST_RETIRED.X87",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC0",
-        "UMask": "0x1",
-        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
-        "PEBS": "2",
-        "Counter": "1",
-        "EventName": "INST_RETIRED.PREC_DIST",
-        "Errata": "HSD140",
-        "PublicDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "1"
-    },
-    {
-        "EventCode": "0xC1",
-        "UMask": "0x40",
-        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
-        "Counter": "0,1,2,3",
-        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
-        "PublicDescription": "Number of microcode assists invoked by HW upon uop writeback.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Actually retired uops.",
-        "Data_LA": "1",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.ALL",
-        "PublicDescription": "Counts the number of micro-ops retired. Use Cmask=1 and invert to count active cycles or stalled cycles.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "UMask": "0x2",
-        "BriefDescription": "Retirement slots used.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
-        "PublicDescription": "This event counts the number of retirement slots used each cycle.  There are potentially 4 slots that can be used each cycle - meaning, 4 uops or 4 instructions could retire each cycle.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "Invert": "1",
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.STALL_CYCLES",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "Invert": "1",
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles with less than 10 actually retired uops.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
-        "CounterMask": "10",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "Invert": "1",
-        "EventCode": "0xC2",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_RETIRED.CORE_STALL_CYCLES",
-        "AnyThread": "1",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.CYCLES",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x4",
-        "BriefDescription": "Self-modifying code (SMC) detected.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.SMC",
-        "PublicDescription": "This event is incremented when self-modifying code (SMC) is detected, which causes a machine clear.  Machine clears can have a significant performance impact if they are happening frequently.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "UMask": "0x20",
-        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.MASKMOV",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x1",
-        "BriefDescription": "Conditional branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.CONDITIONAL",
-        "PublicDescription": "Counts the number of conditional branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x2",
-        "BriefDescription": "Direct and indirect near call instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_CALL",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x0",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
-        "PublicDescription": "Branch instructions at retirement.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x8",
-        "BriefDescription": "Return instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
-        "PublicDescription": "Counts the number of near return instructions retired.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x10",
-        "BriefDescription": "Not taken branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
-        "PublicDescription": "Counts the number of not taken branch instructions retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x20",
-        "BriefDescription": "Taken branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
-        "PublicDescription": "Number of near taken branches retired.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x40",
-        "BriefDescription": "Far branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "PublicDescription": "Number of far branches retired.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC4",
-        "UMask": "0x4",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "PEBS": "2",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x1",
-        "BriefDescription": "Mispredicted conditional branch instructions retired.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x0",
-        "BriefDescription": "All mispredicted macro branch instructions retired.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
-        "PublicDescription": "Mispredicted branch instructions at retirement.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x4",
-        "BriefDescription": "Mispredicted macro branch instructions retired. ",
-        "PEBS": "2",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
-        "PublicDescription": "This event counts all mispredicted branch instructions retired. This is a precise event.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xCC",
-        "UMask": "0x20",
-        "BriefDescription": "Count cases of saving new LBR",
-        "Counter": "0,1,2,3",
-        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
-        "PublicDescription": "Count cases of saving new LBR records by hardware.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x0",
-        "BriefDescription": "Thread cycles when thread is not in halt state",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
-        "PublicDescription": "Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x89",
-        "UMask": "0xa0",
-        "BriefDescription": "Taken speculative and retired mispredicted indirect calls.",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_EXEC.TAKEN_INDIRECT_NEAR_CALL",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 0.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_0_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x2",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 1.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_1_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x4",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 2.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_2_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 3.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_3_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x10",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 4.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_4_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x20",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 5.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_5_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x40",
-        "BriefDescription": "Cycles per core when uops are exectuted in port 6.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_6_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x80",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 7.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_EXECUTED_PORT.PORT_7_CORE",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC5",
-        "UMask": "0x20",
-        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
-        "PEBS": "1",
-        "Counter": "0,1,2,3",
-        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
-        "PublicDescription": "Number of near branch instructions retired that were taken but mispredicted.",
-        "SampleAfterValue": "400009",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xB1",
         "UMask": "0x1",
         "BriefDescription": "Cycles where at least 1 uop was executed per-thread",
@@ -1074,170 +982,13 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xe6",
-        "UMask": "0x1f",
-        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
-        "Counter": "0,1,2,3",
-        "EventName": "BACLEARS.ANY",
-        "PublicDescription": "Number of front end re-steers due to BPU misprediction.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "EventCode": "0xC3",
-        "UMask": "0x1",
-        "BriefDescription": "Number of machine clears (nukes) of any type.",
-        "Counter": "0,1,2,3",
-        "EventName": "MACHINE_CLEARS.COUNT",
-        "CounterMask": "1",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
-        "Counter": "0,1,2,3",
-        "EventName": "LSD.CYCLES_ACTIVE",
-        "CounterMask": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
-        "Counter": "0,1,2,3",
-        "EventName": "LSD.CYCLES_4_UOPS",
-        "CounterMask": "4",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EdgeDetect": "1",
-        "Invert": "1",
-        "EventCode": "0x5E",
-        "UMask": "0x1",
-        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
-        "Counter": "0,1,2,3",
-        "EventName": "RS_EVENTS.EMPTY_END",
-        "CounterMask": "1",
-        "SampleAfterValue": "200003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x1",
-        "BriefDescription": "Cycles per thread when uops are executed in port 0.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_0",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
+        "EventCode": "0xB1",
         "UMask": "0x2",
-        "BriefDescription": "Cycles per thread when uops are executed in port 1.",
+        "BriefDescription": "Number of uops executed on the core.",
         "Counter": "0,1,2,3",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x4",
-        "BriefDescription": "Cycles per thread when uops are executed in port 2.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_2",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x8",
-        "BriefDescription": "Cycles per thread when uops are executed in port 3.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x10",
-        "BriefDescription": "Cycles per thread when uops are executed in port 4.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x20",
-        "BriefDescription": "Cycles per thread when uops are executed in port 5.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x40",
-        "BriefDescription": "Cycles per thread when uops are executed in port 6.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_6",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA1",
-        "UMask": "0x80",
-        "BriefDescription": "Cycles per thread when uops are executed in port 7.",
-        "Counter": "0,1,2,3",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_7",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x00",
-        "UMask": "0x2",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "Counter": "Fixed counter 2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x0",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
-        "AnyThread": "1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate)",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
-        "AnyThread": "1",
-        "PublicDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x0D",
-        "UMask": "0x3",
-        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke)",
-        "Counter": "0,1,2,3",
-        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
-        "AnyThread": "1",
-        "CounterMask": "1",
-        "PublicDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
+        "EventName": "UOPS_EXECUTED.CORE",
+        "Errata": "HSD30, HSM31",
+        "PublicDescription": "Counts total number of uops to be executed per-core each cycle.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -1297,33 +1048,291 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
-        "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "EventCode": "0xC0",
+        "UMask": "0x0",
+        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
         "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
-        "PublicDescription": "Reference cycles when the thread is unhalted. (counts at 100 MHz rate)",
+        "EventName": "INST_RETIRED.ANY_P",
+        "Errata": "HSD11, HSD140",
+        "PublicDescription": "Number of instructions at retirement.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC0",
         "UMask": "0x1",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate)",
-        "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
-        "AnyThread": "1",
-        "PublicDescription": "Reference cycles when the at least one thread on the physical core is unhalted (counts at 100 MHz rate).",
+        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
+        "PEBS": "2",
+        "Counter": "1",
+        "EventName": "INST_RETIRED.PREC_DIST",
+        "Errata": "HSD140",
+        "PublicDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution.",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
+        "CounterHTOff": "1"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC0",
         "UMask": "0x2",
-        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "BriefDescription": "FP operations retired. X87 FP operations that have no exceptions: Counts also flows that have several X87 or flows that use X87 uops in the exception handling.",
         "Counter": "0,1,2,3",
-        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "EventName": "INST_RETIRED.X87",
+        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts FP operations retired. For X87 FP operations that have no exceptions counting also includes flows that have several X87, or flows that use X87 uops in the exception handling.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC1",
+        "UMask": "0x40",
+        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
+        "Counter": "0,1,2,3",
+        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
+        "PublicDescription": "Number of microcode assists invoked by HW upon uop writeback.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Actually retired uops.",
+        "Data_LA": "1",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.ALL",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "Invert": "1",
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.STALL_CYCLES",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Invert": "1",
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles with less than 10 actually retired uops.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
+        "CounterMask": "10",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Invert": "1",
+        "EventCode": "0xC2",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.CORE_STALL_CYCLES",
+        "AnyThread": "1",
+        "CounterMask": "1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "UMask": "0x2",
+        "BriefDescription": "Retirement slots used.",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x1",
+        "BriefDescription": "Cycles there was a Nuke. Account for both thread-specific and All Thread Nukes.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.CYCLES",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EdgeDetect": "1",
+        "EventCode": "0xC3",
+        "UMask": "0x1",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.COUNT",
+        "CounterMask": "1",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x4",
+        "BriefDescription": "Self-modifying code (SMC) detected.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.SMC",
+        "PublicDescription": "This event is incremented when self-modifying code (SMC) is detected, which causes a machine clear.  Machine clears can have a significant performance impact if they are happening frequently.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "UMask": "0x20",
+        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "Counter": "0,1,2,3",
+        "EventName": "MACHINE_CLEARS.MASKMOV",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x0",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "PublicDescription": "Branch instructions at retirement.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x1",
+        "BriefDescription": "Conditional branch instructions retired.",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x2",
+        "BriefDescription": "Direct and indirect near call instructions retired.",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x2",
+        "BriefDescription": "Direct and indirect macro near call instructions retired (captured in ring 3).",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL_R3",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x4",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "PEBS": "2",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x8",
+        "BriefDescription": "Return instructions retired.",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x10",
+        "BriefDescription": "Not taken branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
+        "PublicDescription": "Counts the number of not taken branch instructions retired.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x20",
+        "BriefDescription": "Taken branch instructions retired.",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC4",
+        "UMask": "0x40",
+        "BriefDescription": "Far branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
+        "PublicDescription": "Number of far branches retired.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x0",
+        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "PublicDescription": "Mispredicted branch instructions at retirement.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x1",
+        "BriefDescription": "Mispredicted conditional branch instructions retired.",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x4",
+        "BriefDescription": "Mispredicted macro branch instructions retired.",
+        "PEBS": "2",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
+        "PublicDescription": "This event counts all mispredicted branch instructions retired. This is a precise event.",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC5",
+        "UMask": "0x20",
+        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
+        "PEBS": "1",
+        "Counter": "0,1,2,3",
+        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xCC",
+        "UMask": "0x20",
+        "BriefDescription": "Count cases of saving new LBR",
+        "Counter": "0,1,2,3",
+        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
+        "PublicDescription": "Count cases of saving new LBR records by hardware.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xe6",
+        "UMask": "0x1f",
+        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
+        "Counter": "0,1,2,3",
+        "EventName": "BACLEARS.ANY",
+        "PublicDescription": "Number of front end re-steers due to BPU misprediction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/haswellx/virtual-memory.json b/tools/perf/pmu-events/arch/x86/haswellx/virtual-memory.json
index 9c00f8e..168df55 100644
--- a/tools/perf/pmu-events/arch/x86/haswellx/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/haswellx/virtual-memory.json
@@ -40,6 +40,16 @@
     },
     {
         "EventCode": "0x08",
+        "UMask": "0xe",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
+        "PublicDescription": "Completed page walks in any TLB of any page size due to demand load misses.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x08",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -70,6 +80,16 @@
     },
     {
         "EventCode": "0x08",
+        "UMask": "0x60",
+        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
+        "PublicDescription": "Number of cache load STLB hits. No page walk.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x08",
         "UMask": "0x80",
         "BriefDescription": "DTLB demand load misses with low part of linear-to-physical address translation missed",
         "Counter": "0,1,2,3",
@@ -119,6 +139,16 @@
     },
     {
         "EventCode": "0x49",
+        "UMask": "0xe",
+        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
+        "PublicDescription": "Completed page walks due to store miss in any TLB levels of any page size (4K/2M/4M/1G).",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x49",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -149,6 +179,16 @@
     },
     {
         "EventCode": "0x49",
+        "UMask": "0x60",
+        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks",
+        "Counter": "0,1,2,3",
+        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
+        "PublicDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x49",
         "UMask": "0x80",
         "BriefDescription": "DTLB store misses with low part of linear-to-physical address translation missed",
         "Counter": "0,1,2,3",
@@ -207,6 +247,16 @@
     },
     {
         "EventCode": "0x85",
+        "UMask": "0xe",
+        "BriefDescription": "Misses in all ITLB levels that cause completed page walks",
+        "Counter": "0,1,2,3",
+        "EventName": "ITLB_MISSES.WALK_COMPLETED",
+        "PublicDescription": "Completed page walks in ITLB of any page size.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x85",
         "UMask": "0x10",
         "BriefDescription": "Cycles when PMH is busy with page walks",
         "Counter": "0,1,2,3",
@@ -236,6 +286,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x85",
+        "UMask": "0x60",
+        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks",
+        "Counter": "0,1,2,3",
+        "EventName": "ITLB_MISSES.STLB_HIT",
+        "PublicDescription": "ITLB misses that hit STLB. No page walk.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xae",
         "UMask": "0x1",
         "BriefDescription": "Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.",
@@ -257,34 +317,6 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x21",
-        "BriefDescription": "Number of ITLB page walker hits in the L1+FB",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
-        "PublicDescription": "Number of ITLB page walker loads that hit in the L1+FB.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "UMask": "0x41",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L1 and FB.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "UMask": "0x81",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L1 and FB.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L1",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
         "UMask": "0x12",
         "BriefDescription": "Number of DTLB page walker hits in the L2",
         "Counter": "0,1,2,3",
@@ -295,34 +327,6 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x22",
-        "BriefDescription": "Number of ITLB page walker hits in the L2",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
-        "PublicDescription": "Number of ITLB page walker loads that hit in the L2.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "UMask": "0x42",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L2.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L2",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "UMask": "0x82",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L2.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L2",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
         "UMask": "0x14",
         "BriefDescription": "Number of DTLB page walker hits in the L3 + XSNP",
         "Counter": "0,1,2,3",
@@ -334,35 +338,6 @@
     },
     {
         "EventCode": "0xBC",
-        "UMask": "0x24",
-        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
-        "Errata": "HSD25",
-        "PublicDescription": "Number of ITLB page walker loads that hit in the L3.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "UMask": "0x44",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L3.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L3",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
-        "UMask": "0x84",
-        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L2.",
-        "Counter": "0,1,2,3",
-        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L3",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xBC",
         "UMask": "0x18",
         "BriefDescription": "Number of DTLB page walker hits in Memory",
         "Counter": "0,1,2,3",
@@ -374,6 +349,37 @@
     },
     {
         "EventCode": "0xBC",
+        "UMask": "0x21",
+        "BriefDescription": "Number of ITLB page walker hits in the L1+FB",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L1",
+        "PublicDescription": "Number of ITLB page walker loads that hit in the L1+FB.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x22",
+        "BriefDescription": "Number of ITLB page walker hits in the L2",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L2",
+        "PublicDescription": "Number of ITLB page walker loads that hit in the L2.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x24",
+        "BriefDescription": "Number of ITLB page walker hits in the L3 + XSNP",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.ITLB_L3",
+        "Errata": "HSD25",
+        "PublicDescription": "Number of ITLB page walker loads that hit in the L3.",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
         "UMask": "0x28",
         "BriefDescription": "Number of ITLB page walker hits in Memory",
         "Counter": "0,1,2,3",
@@ -385,6 +391,33 @@
     },
     {
         "EventCode": "0xBC",
+        "UMask": "0x41",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L1 and FB.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x42",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L2.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L2",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x44",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in the L3.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.EPT_DTLB_L3",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
         "UMask": "0x48",
         "BriefDescription": "Counts the number of Extended Page Table walks from the DTLB that hit in memory.",
         "Counter": "0,1,2,3",
@@ -394,6 +427,33 @@
     },
     {
         "EventCode": "0xBC",
+        "UMask": "0x81",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L1 and FB.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L1",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x82",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L2.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L2",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
+        "UMask": "0x84",
+        "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in the L2.",
+        "Counter": "0,1,2,3",
+        "EventName": "PAGE_WALKER_LOADS.EPT_ITLB_L3",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xBC",
         "UMask": "0x88",
         "BriefDescription": "Counts the number of Extended Page Table walks from the ITLB that hit in memory.",
         "Counter": "0,1,2,3",
@@ -420,65 +480,5 @@
         "PublicDescription": "Count number of STLB flush attempts.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "UMask": "0xe",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
-        "PublicDescription": "Completed page walks in any TLB of any page size due to demand load misses.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "UMask": "0x60",
-        "BriefDescription": "Load operations that miss the first DTLB level but hit the second and do not cause page walks",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
-        "PublicDescription": "Number of cache load STLB hits. No page walk.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "UMask": "0xe",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
-        "PublicDescription": "Completed page walks due to store miss in any TLB levels of any page size (4K/2M/4M/1G).",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "UMask": "0x60",
-        "BriefDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks",
-        "Counter": "0,1,2,3",
-        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
-        "PublicDescription": "Store operations that miss the first TLB level but hit the second and do not cause page walks.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "UMask": "0xe",
-        "BriefDescription": "Misses in all ITLB levels that cause completed page walks",
-        "Counter": "0,1,2,3",
-        "EventName": "ITLB_MISSES.WALK_COMPLETED",
-        "PublicDescription": "Completed page walks in ITLB of any page size.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "UMask": "0x60",
-        "BriefDescription": "Operations that miss the first ITLB level but hit the second and do not cause any page walks",
-        "Counter": "0,1,2,3",
-        "EventName": "ITLB_MISSES.STLB_HIT",
-        "PublicDescription": "ITLB misses that hit STLB. No page walk.",
-        "SampleAfterValue": "100003",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/ivybridge/cache.json b/tools/perf/pmu-events/arch/x86/ivybridge/cache.json
index f1ee6d4..999a01b 100644
--- a/tools/perf/pmu-events/arch/x86/ivybridge/cache.json
+++ b/tools/perf/pmu-events/arch/x86/ivybridge/cache.json
@@ -10,6 +10,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts any demand and L1 HW prefetch data load requests to L2.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "EventName": "L2_RQSTS.ALL_DEMAND_DATA_RD",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand Data Read requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "RFO requests that hit L2 cache.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -30,6 +40,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts all L2 store RFO requests.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xc",
+        "EventName": "L2_RQSTS.ALL_RFO",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "RFO requests to L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Number of instruction fetches that hit the L2 cache.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -50,6 +70,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts all L2 code requests.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "L2_RQSTS.ALL_CODE_RD",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 code requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts all L2 HW prefetcher requests that hit L2.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -70,36 +100,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Counts any demand and L1 HW prefetch data load requests to L2.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "EventName": "L2_RQSTS.ALL_DEMAND_DATA_RD",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand Data Read requests",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts all L2 store RFO requests.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xc",
-        "EventName": "L2_RQSTS.ALL_RFO",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests to L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts all L2 code requests.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "L2_RQSTS.ALL_CODE_RD",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 code requests",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Counts all L2 HW prefetcher requests.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -219,6 +219,29 @@
         "CounterHTOff": "2"
     },
     {
+        "PublicDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
+        "EventCode": "0x48",
+        "Counter": "2",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core",
+        "CounterMask": "1",
+        "CounterHTOff": "2"
+    },
+    {
+        "PublicDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
+        "EventCode": "0x48",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "L1D_PEND_MISS.FB_FULL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts the number of lines brought into the L1 data cache.",
         "EventCode": "0x51",
         "Counter": "0,1,2,3",
@@ -239,36 +262,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Offcore outstanding Demand Code Read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Offcore outstanding RFO store transactions in SQ to uncore. Set Cmask=1 to count cycles.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding RFO store transactions in SuperQueue (SQ), queue to uncore",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Offcore outstanding cacheable data read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
@@ -280,14 +273,24 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
+        "PublicDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore",
-        "CounterMask": "1",
+        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Offcore outstanding Demand Code Read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -302,6 +305,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Offcore outstanding RFO store transactions in SQ to uncore. Set Cmask=1 to count cycles.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding RFO store transactions in SuperQueue (SQ), queue to uncore",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
@@ -313,6 +326,27 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Offcore outstanding cacheable data read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles in which the L1D is locked.",
         "EventCode": "0x63",
         "Counter": "0,1,2,3",
@@ -379,7 +413,7 @@
         "UMask": "0x11",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops that miss the STLB.",
+        "BriefDescription": "Retired load uops that miss the STLB. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -389,7 +423,7 @@
         "UMask": "0x12",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store uops that miss the STLB.",
+        "BriefDescription": "Retired store uops that miss the STLB. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -399,7 +433,7 @@
         "UMask": "0x21",
         "EventName": "MEM_UOPS_RETIRED.LOCK_LOADS",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired load uops with locked access.",
+        "BriefDescription": "Retired load uops with locked access. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -409,7 +443,7 @@
         "UMask": "0x41",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired load uops that split across a cacheline boundary. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -419,7 +453,7 @@
         "UMask": "0x42",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -429,7 +463,7 @@
         "UMask": "0x81",
         "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired load uops.",
+        "BriefDescription": "All retired load uops. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -439,67 +473,61 @@
         "UMask": "0x82",
         "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired store uops.",
+        "BriefDescription": "All retired store uops. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops with L1 cache hits as data sources.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_HIT",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Retired load uops with L1 cache hits as data sources. ",
+        "BriefDescription": "Retired load uops with L1 cache hits as data sources.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops with L2 cache hits as data sources.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_HIT",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops with L2 cache hits as data sources. ",
+        "BriefDescription": "Retired load uops with L2 cache hits as data sources.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was LLC hit with no snoop required.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "MEM_LOAD_UOPS_RETIRED.LLC_HIT",
         "SampleAfterValue": "50021",
-        "BriefDescription": "Retired load uops which data sources were data hits in LLC without snoops required. ",
+        "BriefDescription": "Retired load uops which data sources were data hits in LLC without snoops required.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source followed an L1 miss.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops which data sources following L1 data-cache miss",
+        "BriefDescription": "Retired load uops which data sources following L1 data-cache miss.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops that missed L2, excluding unknown sources.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_MISS",
         "SampleAfterValue": "50021",
-        "BriefDescription": "Miss in mid-level (L2) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Retired load uops with L2 cache misses as data sources.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source is LLC miss.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x20",
@@ -510,61 +538,56 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x40",
         "EventName": "MEM_LOAD_UOPS_RETIRED.HIT_LFB",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. ",
+        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was an on-package core cache LLC hit and cross-core snoop missed.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache. ",
+        "BriefDescription": "Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was an on-package LLC hit and cross-core snoop hits.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "EventName": "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache. ",
+        "BriefDescription": "Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was an on-package core cache with HitM responses.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were HitM responses from shared LLC. ",
+        "BriefDescription": "Retired load uops which data sources were HitM responses from shared LLC.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was LLC hit with no snoop required.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
         "EventName": "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops which data sources were hits in LLC without snoops required. ",
+        "BriefDescription": "Retired load uops which data sources were hits in LLC without snoops required.",
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Retired load uop whose Data Source was: local DRAM either Snoop not needed or Snoop Miss (RspI)",
+        "PublicDescription": "Retired load uops whose data source was local memory (cross-socket snoop not needed or missed).",
         "EventCode": "0xD3",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -753,50 +776,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Retired load uops whose data source was local memory (cross-socket snoop not needed or missed).",
-        "EventCode": "0xD3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Retired load uops which data sources missed LLC but serviced from local dram.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
-        "EventCode": "0x48",
-        "Counter": "2",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core",
-        "CounterMask": "1",
-        "CounterHTOff": "2"
-    },
-    {
-        "PublicDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
-        "EventCode": "0x48",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "L1D_PEND_MISS.FB_FULL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3f803c0244",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/ivybridge/frontend.json b/tools/perf/pmu-events/arch/x86/ivybridge/frontend.json
index de72b84..efaa949 100644
--- a/tools/perf/pmu-events/arch/x86/ivybridge/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/ivybridge/frontend.json
@@ -20,57 +20,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "IDQ.DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "IDQ.MS_DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "IDQ.MS_MITE_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ from MS by either DSB or MITE. Set Cmask = 1 to count cycles.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -82,6 +31,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "IDQ.DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -93,6 +52,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "IDQ.MS_DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -138,6 +107,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "IDQ.MS_MITE_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts cycles MITE is delivered four uops. Set Cmask = 4.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -160,6 +139,39 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ from MS by either DSB or MITE. Set Cmask = 1 to count cycles.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EdgeDetect": "1",
+        "EventName": "IDQ.MS_SWITCHES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Number of uops delivered to IDQ from any path.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -206,7 +218,7 @@
         "UMask": "0x1",
         "EventName": "IDQ_UOPS_NOT_DELIVERED.CORE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled ",
+        "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -289,17 +301,5 @@
         "SampleAfterValue": "2000003",
         "BriefDescription": "Cycles when Decode Stream Buffer (DSB) fill encounter more than 3 Decode Stream Buffer (DSB) lines",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EdgeDetect": "1",
-        "EventName": "IDQ.MS_SWITCHES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/ivybridge/memory.json b/tools/perf/pmu-events/arch/x86/ivybridge/memory.json
index e1c6a1d..a74d54f 100644
--- a/tools/perf/pmu-events/arch/x86/ivybridge/memory.json
+++ b/tools/perf/pmu-events/arch/x86/ivybridge/memory.json
@@ -39,18 +39,6 @@
     },
     {
         "PEBS": "2",
-        "EventCode": "0xCD",
-        "Counter": "3",
-        "UMask": "0x2",
-        "EventName": "MEM_TRANS_RETIRED.PRECISE_STORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Sample stores and collect precise store operation via PEBS record. PMC3 only.",
-        "PRECISE_STORE": "1",
-        "TakenAlone": "1",
-        "CounterHTOff": "3"
-    },
-    {
-        "PEBS": "2",
         "PublicDescription": "Loads with latency value being above 4.",
         "EventCode": "0xCD",
         "MSRValue": "0x4",
@@ -162,6 +150,18 @@
         "CounterHTOff": "3"
     },
     {
+        "PEBS": "2",
+        "EventCode": "0xCD",
+        "Counter": "3",
+        "UMask": "0x2",
+        "EventName": "MEM_TRANS_RETIRED.PRECISE_STORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Sample stores and collect precise store operation via PEBS record. PMC3 only.",
+        "PRECISE_STORE": "1",
+        "TakenAlone": "1",
+        "CounterHTOff": "3"
+    },
+    {
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x300400244",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/ivybridge/other.json b/tools/perf/pmu-events/arch/x86/ivybridge/other.json
index 9c2dd05..4eb83ee 100644
--- a/tools/perf/pmu-events/arch/x86/ivybridge/other.json
+++ b/tools/perf/pmu-events/arch/x86/ivybridge/other.json
@@ -10,16 +10,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Unhalted core cycles when the thread is not in ring 0.",
-        "EventCode": "0x5C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "CPL_CYCLES.RING123",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Number of intervals between processor halts while thread is in ring 0.",
         "EventCode": "0x5C",
         "Counter": "0,1,2,3",
@@ -32,6 +22,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Unhalted core cycles when the thread is not in ring 0.",
+        "EventCode": "0x5C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPL_CYCLES.RING123",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles in which the L1D and L2 are locked, due to a UC lock or split lock.",
         "EventCode": "0x63",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/ivybridge/pipeline.json b/tools/perf/pmu-events/arch/x86/ivybridge/pipeline.json
index 2145c28..0afbfd9 100644
--- a/tools/perf/pmu-events/arch/x86/ivybridge/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/ivybridge/pipeline.json
@@ -1,30 +1,41 @@
 [
     {
         "EventCode": "0x00",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "UMask": "0x1",
         "EventName": "INST_RETIRED.ANY",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Instructions retired from execution.",
+        "CounterHTOff": "Fixed counter 0"
+    },
+    {
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 1",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_UNHALTED.THREAD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when the thread is not in halt state.",
+        "CounterHTOff": "Fixed counter 1"
+    },
+    {
+        "PublicDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 1",
+        "UMask": "0x2",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state",
         "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "Counter": "Fixed counter 2",
-        "UMask": "0x2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when the thread is not in halt state.",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 3",
         "UMask": "0x3",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
         "PublicDescription": "Loads blocked by overlapping with store buffer that cannot be forwarded.",
@@ -78,6 +89,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x0D",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "AnyThread": "1",
+        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Increments each cycle the # of Uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any= 1to count stalled cycles of this core.",
         "EventCode": "0x0E",
         "Counter": "0,1,2,3",
@@ -175,6 +197,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Increments at the frequency of XCLK (100 MHz) when not halted.",
         "EventCode": "0x3C",
         "Counter": "0,1,2,3",
@@ -187,6 +220,36 @@
     {
         "EventCode": "0x3C",
         "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted. (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Reference cycles when the thread is unhalted. (counts at 100 MHz rate)",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted. (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
         "UMask": "0x2",
         "EventName": "CPU_CLK_THREAD_UNHALTED.ONE_THREAD_ACTIVE",
         "SampleAfterValue": "2000003",
@@ -194,6 +257,15 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Non-SW-prefetch load dispatches that hit fill buffer allocated for S/W prefetch.",
         "EventCode": "0x4C",
         "Counter": "0,1,2,3",
@@ -216,24 +288,6 @@
     {
         "EventCode": "0x58",
         "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "MOVE_ELIMINATION.INT_NOT_ELIMINATED",
-        "SampleAfterValue": "1000003",
-        "BriefDescription": "Number of integer Move Elimination candidate uops that were not eliminated.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x58",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "MOVE_ELIMINATION.SIMD_NOT_ELIMINATED",
-        "SampleAfterValue": "1000003",
-        "BriefDescription": "Number of SIMD Move Elimination candidate uops that were not eliminated.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x58",
-        "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "MOVE_ELIMINATION.INT_ELIMINATED",
         "SampleAfterValue": "1000003",
@@ -250,6 +304,24 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x58",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "MOVE_ELIMINATION.INT_NOT_ELIMINATED",
+        "SampleAfterValue": "1000003",
+        "BriefDescription": "Number of integer Move Elimination candidate uops that were not eliminated.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x58",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "MOVE_ELIMINATION.SIMD_NOT_ELIMINATED",
+        "SampleAfterValue": "1000003",
+        "BriefDescription": "Number of SIMD Move Elimination candidate uops that were not eliminated.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles the RS is empty for the thread.",
         "EventCode": "0x5E",
         "Counter": "0,1,2,3",
@@ -260,6 +332,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x5E",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "RS_EVENTS.EMPTY_END",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x87",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -498,36 +582,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles which a Uop is dispatched on port 1.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are dispatched to port 1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles which a Uop is dispatched on port 4.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are dispatched to port 4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles which a Uop is dispatched on port 5.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are dispatched to port 5",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles per core when uops are dispatched to port 0.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -539,6 +593,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles which a Uop is dispatched on port 1.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are dispatched to port 1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles per core when uops are dispatched to port 1.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -550,28 +614,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles per core when uops are dispatched to port 4.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "AnyThread": "1",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_4_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles per core when uops are dispatched to port 5.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "AnyThread": "1",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_5_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 5",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles which a Uop is dispatched on port 2.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -582,16 +624,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles which a Uop is dispatched on port 3.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when load or STA uops are dispatched to port 3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
         "UMask": "0xc",
@@ -602,6 +634,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles which a Uop is dispatched on port 3.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when load or STA uops are dispatched to port 3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles per core when load or STA uops are dispatched to port 3.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -613,6 +655,48 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles which a Uop is dispatched on port 4.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are dispatched to port 4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles per core when uops are dispatched to port 4.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "AnyThread": "1",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_4_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles which a Uop is dispatched on port 5.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are dispatched to port 5",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles per core when uops are dispatched to port 5.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "AnyThread": "1",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_5_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 5",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles Allocation is stalled due to Resource Related reason.",
         "EventCode": "0xA2",
         "Counter": "0,1,2,3",
@@ -662,15 +746,14 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles with pending L1 cache miss loads. Set AnyThread to count per core.",
         "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0x8",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with pending L1 cache miss loads.",
-        "CounterMask": "8",
-        "CounterHTOff": "2"
+        "BriefDescription": "Cycles while L2 cache miss load* is outstanding.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "PublicDescription": "Cycles with pending memory loads. Set AnyThread to count per core.",
@@ -684,13 +767,33 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
+        "CounterMask": "2",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PublicDescription": "Total execution stalls.",
         "EventCode": "0xA3",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "CYCLE_ACTIVITY.CYCLES_NO_EXECUTE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Total execution stalls",
+        "BriefDescription": "This event increments by 1 for every cycle where there was no execute for this thread.",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Total execution stalls.",
         "CounterMask": "4",
         "CounterHTOff": "0,1,2,3"
     },
@@ -708,6 +811,16 @@
     {
         "EventCode": "0xA3",
         "Counter": "0,1,2,3",
+        "UMask": "0x5",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L2 cache miss load* is outstanding.",
+        "CounterMask": "5",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
         "UMask": "0x6",
         "EventName": "CYCLE_ACTIVITY.STALLS_LDM_PENDING",
         "SampleAfterValue": "2000003",
@@ -716,6 +829,37 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x6",
+        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PublicDescription": "Cycles with pending L1 cache miss loads. Set AnyThread to count per core.",
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0x8",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with pending L1 cache miss loads.",
+        "CounterMask": "8",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0x8",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "CounterMask": "8",
+        "CounterHTOff": "2"
+    },
+    {
         "PublicDescription": "Execution stalls due to L1 data cache miss loads. Set Cmask=0CH.",
         "EventCode": "0xA3",
         "Counter": "2",
@@ -727,6 +871,16 @@
         "CounterHTOff": "2"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0xc",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
+        "CounterMask": "12",
+        "CounterHTOff": "2"
+    },
+    {
         "EventCode": "0xA8",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -747,6 +901,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+        "EventCode": "0xA8",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LSD.CYCLES_4_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts total number of uops to be executed per-thread each cycle. Set Cmask = 1, INV =1 to count stall cycles.",
         "EventCode": "0xB1",
         "Counter": "0,1,2,3",
@@ -757,16 +922,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Counts total number of uops to be executed per-core each cycle.",
-        "EventCode": "0xB1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_EXECUTED.CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of uops executed on the core.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xB1",
         "Invert": "1",
         "Counter": "0,1,2,3",
@@ -778,258 +933,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of instructions at retirement.",
-        "EventCode": "0xC0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "INST_RETIRED.ANY_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution.",
-        "EventCode": "0xC0",
-        "Counter": "1",
-        "UMask": "0x1",
-        "EventName": "INST_RETIRED.PREC_DIST",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
-        "CounterHTOff": "1"
-    },
-    {
-        "EventCode": "0xC1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of micro-ops retired, Use cmask=1 and invert to count active cycles or stalled cycles.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.ALL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Actually retired uops. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of retirement slots used each cycle.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Retirement slots used. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with less than 10 actually retired uops.",
-        "CounterMask": "10",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "UOPS_RETIRED.CORE_STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Number of self-modifying-code machine clears detected.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "MACHINE_CLEARS.SMC",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Self-modifying code (SMC) detected.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "MACHINE_CLEARS.MASKMOV",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of conditional branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BR_INST_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Conditional branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Direct and indirect near call instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "BR_INST_RETIRED.NEAR_CALL",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Direct and indirect near call instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Branch instructions at retirement.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of near return instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Return instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts the number of not taken branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Not taken branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Number of near taken branches retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Taken branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of far branches retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Far branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Mispredicted conditional branch instructions retired.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted conditional branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Mispredicted branch instructions at retirement.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All mispredicted macro branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Mispredicted taken branch instructions retired.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted macro branch instructions retired.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Count cases of saving new LBR records by hardware.",
-        "EventCode": "0xCC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Count cases of saving new LBR",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of front end re-steers due to BPU misprediction.",
-        "EventCode": "0xE6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1f",
-        "EventName": "BACLEARS.ANY",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles where at least 1 uop was executed per-thread.",
         "EventCode": "0xB1",
         "Counter": "0,1,2,3",
@@ -1074,150 +977,13 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x5E",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "RS_EVENTS.EMPTY_END",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "MACHINE_CLEARS.COUNT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of machine clears (nukes) of any type.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
-        "EventCode": "0xA8",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LSD.CYCLES_4_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0x8",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "CounterMask": "8",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L2 cache miss load* is outstanding.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
+        "PublicDescription": "Counts total number of uops to be executed per-core each cycle.",
+        "EventCode": "0xB1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "EventName": "UOPS_EXECUTED.CORE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
-        "CounterMask": "2",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Total execution stalls.",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0xc",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
-        "CounterMask": "12",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x5",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L2 cache miss load* is outstanding.",
-        "CounterMask": "5",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x6",
-        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
-        "UMask": "0x2",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "PublicDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted. (counts at 100 MHz rate)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x0D",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "AnyThread": "1",
-        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
-        "CounterMask": "1",
+        "BriefDescription": "Number of uops executed on the core.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -1276,32 +1042,268 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Reference cycles when the thread is unhalted. (counts at 100 MHz rate)",
-        "EventCode": "0x3C",
+        "PublicDescription": "Number of instructions at retirement.",
+        "EventCode": "0xC0",
         "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "UMask": "0x0",
+        "EventName": "INST_RETIRED.ANY_P",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
+        "PEBS": "2",
+        "PublicDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution.",
+        "EventCode": "0xC0",
+        "Counter": "1",
+        "UMask": "0x1",
+        "EventName": "INST_RETIRED.PREC_DIST",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
+        "CounterHTOff": "1"
+    },
+    {
+        "EventCode": "0xC1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.ALL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Retired uops.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with less than 10 actually retired uops.",
+        "CounterMask": "10",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "EventName": "UOPS_RETIRED.CORE_STALL_CYCLES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted. (counts at 100 MHz rate)",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Retirement slots used.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "MACHINE_CLEARS.COUNT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of self-modifying-code machine clears detected.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "MACHINE_CLEARS.SMC",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Self-modifying code (SMC) detected.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "MACHINE_CLEARS.MASKMOV",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Branch instructions at retirement.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_INST_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Conditional branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Direct and indirect near call instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL_R3",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Direct and indirect macro near call instructions retired (captured in ring 3).",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Return instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of not taken branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Not taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of far branches retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Far branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Mispredicted branch instructions at retirement.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted conditional branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Count cases of saving new LBR records by hardware.",
+        "EventCode": "0xCC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "BriefDescription": "Count cases of saving new LBR",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of front end re-steers due to BPU misprediction.",
+        "EventCode": "0xE6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1f",
+        "EventName": "BACLEARS.ANY",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/ivybridge/virtual-memory.json b/tools/perf/pmu-events/arch/x86/ivybridge/virtual-memory.json
index f036f53..f243551 100644
--- a/tools/perf/pmu-events/arch/x86/ivybridge/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/ivybridge/virtual-memory.json
@@ -1,5 +1,35 @@
 [
     {
+        "PublicDescription": "Misses in all TLB levels that cause a page walk of any page size from demand loads.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x81",
+        "EventName": "DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes an page walk of any page size.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Misses in all TLB levels that caused page walk completed of any size by demand loads.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x82",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycle PMH is busy with a walk due to demand loads.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x84",
+        "EventName": "DTLB_LOAD_MISSES.WALK_DURATION",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Demand load cycles page miss handler (PMH) is busy with this walk.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x08",
         "Counter": "0,1,2,3",
         "UMask": "0x88",
@@ -146,35 +176,5 @@
         "SampleAfterValue": "100007",
         "BriefDescription": "STLB flush attempts",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Misses in all TLB levels that cause a page walk of any page size from demand loads.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x81",
-        "EventName": "DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes an page walk of any page size.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Misses in all TLB levels that caused page walk completed of any size by demand loads.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x82",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycle PMH is busy with a walk due to demand loads.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x84",
-        "EventName": "DTLB_LOAD_MISSES.WALK_DURATION",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Demand load cycles page miss handler (PMH) is busy with this walk.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/cache.json b/tools/perf/pmu-events/arch/x86/ivytown/cache.json
index ff27a62..6dad3ad 100644
--- a/tools/perf/pmu-events/arch/x86/ivytown/cache.json
+++ b/tools/perf/pmu-events/arch/x86/ivytown/cache.json
@@ -10,6 +10,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts any demand and L1 HW prefetch data load requests to L2.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "EventName": "L2_RQSTS.ALL_DEMAND_DATA_RD",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand Data Read requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "RFO requests that hit L2 cache.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -30,6 +40,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts all L2 store RFO requests.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xc",
+        "EventName": "L2_RQSTS.ALL_RFO",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "RFO requests to L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Number of instruction fetches that hit the L2 cache.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -50,6 +70,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts all L2 code requests.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "L2_RQSTS.ALL_CODE_RD",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 code requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts all L2 HW prefetcher requests that hit L2.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -70,36 +100,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Counts any demand and L1 HW prefetch data load requests to L2.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "EventName": "L2_RQSTS.ALL_DEMAND_DATA_RD",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand Data Read requests",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts all L2 store RFO requests.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xc",
-        "EventName": "L2_RQSTS.ALL_RFO",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests to L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts all L2 code requests.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "L2_RQSTS.ALL_CODE_RD",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 code requests",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Counts all L2 HW prefetcher requests.",
         "EventCode": "0x24",
         "Counter": "0,1,2,3",
@@ -219,6 +219,29 @@
         "CounterHTOff": "2"
     },
     {
+        "PublicDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
+        "EventCode": "0x48",
+        "Counter": "2",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core",
+        "CounterMask": "1",
+        "CounterHTOff": "2"
+    },
+    {
+        "PublicDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
+        "EventCode": "0x48",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "L1D_PEND_MISS.FB_FULL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts the number of lines brought into the L1 data cache.",
         "EventCode": "0x51",
         "Counter": "0,1,2,3",
@@ -239,36 +262,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Offcore outstanding Demand Code Read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Offcore outstanding RFO store transactions in SQ to uncore. Set Cmask=1 to count cycles.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding RFO store transactions in SuperQueue (SQ), queue to uncore",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Offcore outstanding cacheable data read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
@@ -280,14 +273,24 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
+        "PublicDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore",
-        "CounterMask": "1",
+        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Offcore outstanding Demand Code Read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding code reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -302,6 +305,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Offcore outstanding RFO store transactions in SQ to uncore. Set Cmask=1 to count cycles.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding RFO store transactions in SuperQueue (SQ), queue to uncore",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle.",
         "EventCode": "0x60",
         "Counter": "0,1,2,3",
@@ -313,6 +326,27 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Offcore outstanding cacheable data read transactions in SQ to uncore. Set Cmask=1 to count cycles.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles in which the L1D is locked.",
         "EventCode": "0x63",
         "Counter": "0,1,2,3",
@@ -379,7 +413,7 @@
         "UMask": "0x11",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops that miss the STLB.",
+        "BriefDescription": "Retired load uops that miss the STLB. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -389,7 +423,7 @@
         "UMask": "0x12",
         "EventName": "MEM_UOPS_RETIRED.STLB_MISS_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store uops that miss the STLB.",
+        "BriefDescription": "Retired store uops that miss the STLB. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -399,7 +433,7 @@
         "UMask": "0x21",
         "EventName": "MEM_UOPS_RETIRED.LOCK_LOADS",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired load uops with locked access.",
+        "BriefDescription": "Retired load uops with locked access. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -409,7 +443,7 @@
         "UMask": "0x41",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired load uops that split across a cacheline boundary. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -419,7 +453,7 @@
         "UMask": "0x42",
         "EventName": "MEM_UOPS_RETIRED.SPLIT_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store uops that split across a cacheline boundary.",
+        "BriefDescription": "Retired store uops that split across a cacheline boundary. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -429,7 +463,7 @@
         "UMask": "0x81",
         "EventName": "MEM_UOPS_RETIRED.ALL_LOADS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired load uops.",
+        "BriefDescription": "All retired load uops. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -439,67 +473,61 @@
         "UMask": "0x82",
         "EventName": "MEM_UOPS_RETIRED.ALL_STORES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired store uops.",
+        "BriefDescription": "All retired store uops. (Precise Event)",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops with L1 cache hits as data sources.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_HIT",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Retired load uops with L1 cache hits as data sources. ",
+        "BriefDescription": "Retired load uops with L1 cache hits as data sources.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops with L2 cache hits as data sources.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_HIT",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops with L2 cache hits as data sources. ",
+        "BriefDescription": "Retired load uops with L2 cache hits as data sources.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was LLC hit with no snoop required.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "MEM_LOAD_UOPS_RETIRED.LLC_HIT",
         "SampleAfterValue": "50021",
-        "BriefDescription": "Retired load uops which data sources were data hits in LLC without snoops required. ",
+        "BriefDescription": "Retired load uops which data sources were data hits in LLC without snoops required.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source followed an L1 miss.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L1_MISS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops which data sources following L1 data-cache miss",
+        "BriefDescription": "Retired load uops which data sources following L1 data-cache miss.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops that missed L2, excluding unknown sources.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
         "EventName": "MEM_LOAD_UOPS_RETIRED.L2_MISS",
         "SampleAfterValue": "50021",
-        "BriefDescription": "Miss in mid-level (L2) cache. Excludes Unknown data-source.",
+        "BriefDescription": "Retired load uops with L2 cache misses as data sources.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source is LLC miss.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x20",
@@ -510,67 +538,61 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x40",
         "EventName": "MEM_LOAD_UOPS_RETIRED.HIT_LFB",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. ",
+        "BriefDescription": "Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was an on-package core cache LLC hit and cross-core snoop missed.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache. ",
+        "BriefDescription": "Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was an on-package LLC hit and cross-core snoop hits.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
         "EventName": "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache. ",
+        "BriefDescription": "Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was an on-package core cache with HitM responses.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM",
         "SampleAfterValue": "20011",
-        "BriefDescription": "Retired load uops which data sources were HitM responses from shared LLC. ",
+        "BriefDescription": "Retired load uops which data sources were HitM responses from shared LLC.",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load uops whose data source was LLC hit with no snoop required.",
         "EventCode": "0xD2",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
         "EventName": "MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load uops which data sources were hits in LLC without snoops required. ",
+        "BriefDescription": "Retired load uops which data sources were hits in LLC without snoops required.",
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Retired load uop whose Data Source was: local DRAM either Snoop not needed or Snoop Miss (RspI)",
         "EventCode": "0xD3",
         "Counter": "0,1,2,3",
-        "UMask": "0x1",
+        "UMask": "0x3",
         "EventName": "MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired load uops which data sources missed LLC but serviced from local dram.",
+        "BriefDescription": "Retired load uops whose data source was local DRAM (Snoop not needed, Snoop Miss, or Snoop Hit data not forwarded).",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -780,40 +802,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
-        "EventCode": "0x48",
-        "Counter": "2",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core",
-        "CounterMask": "1",
-        "CounterHTOff": "2"
-    },
-    {
-        "PublicDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability.",
-        "EventCode": "0x48",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "L1D_PEND_MISS.FB_FULL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles a demand request was blocked due to Fill Buffers inavailability",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x4003c0091",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/frontend.json b/tools/perf/pmu-events/arch/x86/ivytown/frontend.json
index de72b84..efaa949 100644
--- a/tools/perf/pmu-events/arch/x86/ivytown/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/ivytown/frontend.json
@@ -20,57 +20,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "IDQ.DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "IDQ.MS_DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "IDQ.MS_MITE_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Increment each cycle # of uops delivered to IDQ from MS by either DSB or MITE. Set Cmask = 1 to count cycles.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -82,6 +31,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "IDQ.DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -93,6 +52,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "IDQ.MS_DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by Decode Stream Buffer (DSB) that are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -138,6 +107,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "IDQ.MS_MITE_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts cycles MITE is delivered four uops. Set Cmask = 4.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -160,6 +139,39 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Increment each cycle # of uops delivered to IDQ from MS by either DSB or MITE. Set Cmask = 1 to count cycles.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EdgeDetect": "1",
+        "EventName": "IDQ.MS_SWITCHES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Number of uops delivered to IDQ from any path.",
         "EventCode": "0x79",
         "Counter": "0,1,2,3",
@@ -206,7 +218,7 @@
         "UMask": "0x1",
         "EventName": "IDQ_UOPS_NOT_DELIVERED.CORE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled ",
+        "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled",
         "CounterHTOff": "0,1,2,3"
     },
     {
@@ -289,17 +301,5 @@
         "SampleAfterValue": "2000003",
         "BriefDescription": "Cycles when Decode Stream Buffer (DSB) fill encounter more than 3 Decode Stream Buffer (DSB) lines",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EdgeDetect": "1",
-        "EventName": "IDQ.MS_SWITCHES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/memory.json b/tools/perf/pmu-events/arch/x86/ivytown/memory.json
index 437d98f..3a7b86a 100644
--- a/tools/perf/pmu-events/arch/x86/ivytown/memory.json
+++ b/tools/perf/pmu-events/arch/x86/ivytown/memory.json
@@ -30,18 +30,6 @@
     },
     {
         "PEBS": "2",
-        "EventCode": "0xCD",
-        "Counter": "3",
-        "UMask": "0x2",
-        "EventName": "MEM_TRANS_RETIRED.PRECISE_STORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Sample stores and collect precise store operation via PEBS record. PMC3 only.",
-        "PRECISE_STORE": "1",
-        "TakenAlone": "1",
-        "CounterHTOff": "3"
-    },
-    {
-        "PEBS": "2",
         "PublicDescription": "Loads with latency value being above 4.",
         "EventCode": "0xCD",
         "MSRValue": "0x4",
@@ -153,6 +141,18 @@
         "CounterHTOff": "3"
     },
     {
+        "PEBS": "2",
+        "EventCode": "0xCD",
+        "Counter": "3",
+        "UMask": "0x2",
+        "EventName": "MEM_TRANS_RETIRED.PRECISE_STORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Sample stores and collect precise store operation via PEBS record. PMC3 only.",
+        "PRECISE_STORE": "1",
+        "TakenAlone": "1",
+        "CounterHTOff": "3"
+    },
+    {
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fffc00244",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/other.json b/tools/perf/pmu-events/arch/x86/ivytown/other.json
index 9c2dd05..4eb83ee 100644
--- a/tools/perf/pmu-events/arch/x86/ivytown/other.json
+++ b/tools/perf/pmu-events/arch/x86/ivytown/other.json
@@ -10,16 +10,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Unhalted core cycles when the thread is not in ring 0.",
-        "EventCode": "0x5C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "CPL_CYCLES.RING123",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Number of intervals between processor halts while thread is in ring 0.",
         "EventCode": "0x5C",
         "Counter": "0,1,2,3",
@@ -32,6 +22,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Unhalted core cycles when the thread is not in ring 0.",
+        "EventCode": "0x5C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPL_CYCLES.RING123",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Unhalted core cycles when thread is in rings 1, 2, or 3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles in which the L1D and L2 are locked, due to a UC lock or split lock.",
         "EventCode": "0x63",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/pipeline.json b/tools/perf/pmu-events/arch/x86/ivytown/pipeline.json
index 2145c28..0afbfd9 100644
--- a/tools/perf/pmu-events/arch/x86/ivytown/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/ivytown/pipeline.json
@@ -1,30 +1,41 @@
 [
     {
         "EventCode": "0x00",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "UMask": "0x1",
         "EventName": "INST_RETIRED.ANY",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Instructions retired from execution.",
+        "CounterHTOff": "Fixed counter 0"
+    },
+    {
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 1",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_UNHALTED.THREAD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when the thread is not in halt state.",
+        "CounterHTOff": "Fixed counter 1"
+    },
+    {
+        "PublicDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 1",
+        "UMask": "0x2",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state",
         "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "Counter": "Fixed counter 2",
-        "UMask": "0x2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when the thread is not in halt state.",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 3",
         "UMask": "0x3",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
         "PublicDescription": "Loads blocked by overlapping with store buffer that cannot be forwarded.",
@@ -78,6 +89,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x0D",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3",
+        "AnyThread": "1",
+        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Increments each cycle the # of Uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any= 1to count stalled cycles of this core.",
         "EventCode": "0x0E",
         "Counter": "0,1,2,3",
@@ -175,6 +197,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Increments at the frequency of XCLK (100 MHz) when not halted.",
         "EventCode": "0x3C",
         "Counter": "0,1,2,3",
@@ -187,6 +220,36 @@
     {
         "EventCode": "0x3C",
         "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted. (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Reference cycles when the thread is unhalted. (counts at 100 MHz rate)",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted. (counts at 100 MHz rate)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
         "UMask": "0x2",
         "EventName": "CPU_CLK_THREAD_UNHALTED.ONE_THREAD_ACTIVE",
         "SampleAfterValue": "2000003",
@@ -194,6 +257,15 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Non-SW-prefetch load dispatches that hit fill buffer allocated for S/W prefetch.",
         "EventCode": "0x4C",
         "Counter": "0,1,2,3",
@@ -216,24 +288,6 @@
     {
         "EventCode": "0x58",
         "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "MOVE_ELIMINATION.INT_NOT_ELIMINATED",
-        "SampleAfterValue": "1000003",
-        "BriefDescription": "Number of integer Move Elimination candidate uops that were not eliminated.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x58",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "MOVE_ELIMINATION.SIMD_NOT_ELIMINATED",
-        "SampleAfterValue": "1000003",
-        "BriefDescription": "Number of SIMD Move Elimination candidate uops that were not eliminated.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x58",
-        "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "MOVE_ELIMINATION.INT_ELIMINATED",
         "SampleAfterValue": "1000003",
@@ -250,6 +304,24 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x58",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "MOVE_ELIMINATION.INT_NOT_ELIMINATED",
+        "SampleAfterValue": "1000003",
+        "BriefDescription": "Number of integer Move Elimination candidate uops that were not eliminated.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x58",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "MOVE_ELIMINATION.SIMD_NOT_ELIMINATED",
+        "SampleAfterValue": "1000003",
+        "BriefDescription": "Number of SIMD Move Elimination candidate uops that were not eliminated.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles the RS is empty for the thread.",
         "EventCode": "0x5E",
         "Counter": "0,1,2,3",
@@ -260,6 +332,18 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x5E",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "RS_EVENTS.EMPTY_END",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x87",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -498,36 +582,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles which a Uop is dispatched on port 1.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are dispatched to port 1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles which a Uop is dispatched on port 4.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are dispatched to port 4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles which a Uop is dispatched on port 5.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are dispatched to port 5",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles per core when uops are dispatched to port 0.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -539,6 +593,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles which a Uop is dispatched on port 1.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are dispatched to port 1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles per core when uops are dispatched to port 1.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -550,28 +614,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles per core when uops are dispatched to port 4.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "AnyThread": "1",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_4_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles per core when uops are dispatched to port 5.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "AnyThread": "1",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_5_CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per core when uops are dispatched to port 5",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles which a Uop is dispatched on port 2.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -582,16 +624,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles which a Uop is dispatched on port 3.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when load or STA uops are dispatched to port 3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
         "UMask": "0xc",
@@ -602,6 +634,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles which a Uop is dispatched on port 3.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when load or STA uops are dispatched to port 3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles per core when load or STA uops are dispatched to port 3.",
         "EventCode": "0xA1",
         "Counter": "0,1,2,3",
@@ -613,6 +655,48 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles which a Uop is dispatched on port 4.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are dispatched to port 4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles per core when uops are dispatched to port 4.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "AnyThread": "1",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_4_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles which a Uop is dispatched on port 5.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are dispatched to port 5",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles per core when uops are dispatched to port 5.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "AnyThread": "1",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_5_CORE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per core when uops are dispatched to port 5",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Cycles Allocation is stalled due to Resource Related reason.",
         "EventCode": "0xA2",
         "Counter": "0,1,2,3",
@@ -662,15 +746,14 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Cycles with pending L1 cache miss loads. Set AnyThread to count per core.",
         "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0x8",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with pending L1 cache miss loads.",
-        "CounterMask": "8",
-        "CounterHTOff": "2"
+        "BriefDescription": "Cycles while L2 cache miss load* is outstanding.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "PublicDescription": "Cycles with pending memory loads. Set AnyThread to count per core.",
@@ -684,13 +767,33 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
+        "CounterMask": "2",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PublicDescription": "Total execution stalls.",
         "EventCode": "0xA3",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
         "EventName": "CYCLE_ACTIVITY.CYCLES_NO_EXECUTE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Total execution stalls",
+        "BriefDescription": "This event increments by 1 for every cycle where there was no execute for this thread.",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Total execution stalls.",
         "CounterMask": "4",
         "CounterHTOff": "0,1,2,3"
     },
@@ -708,6 +811,16 @@
     {
         "EventCode": "0xA3",
         "Counter": "0,1,2,3",
+        "UMask": "0x5",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L2 cache miss load* is outstanding.",
+        "CounterMask": "5",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
         "UMask": "0x6",
         "EventName": "CYCLE_ACTIVITY.STALLS_LDM_PENDING",
         "SampleAfterValue": "2000003",
@@ -716,6 +829,37 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x6",
+        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PublicDescription": "Cycles with pending L1 cache miss loads. Set AnyThread to count per core.",
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0x8",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_PENDING",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with pending L1 cache miss loads.",
+        "CounterMask": "8",
+        "CounterHTOff": "2"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0x8",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "CounterMask": "8",
+        "CounterHTOff": "2"
+    },
+    {
         "PublicDescription": "Execution stalls due to L1 data cache miss loads. Set Cmask=0CH.",
         "EventCode": "0xA3",
         "Counter": "2",
@@ -727,6 +871,16 @@
         "CounterHTOff": "2"
     },
     {
+        "EventCode": "0xA3",
+        "Counter": "2",
+        "UMask": "0xc",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
+        "CounterMask": "12",
+        "CounterHTOff": "2"
+    },
+    {
         "EventCode": "0xA8",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -747,6 +901,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+        "EventCode": "0xA8",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LSD.CYCLES_4_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Counts total number of uops to be executed per-thread each cycle. Set Cmask = 1, INV =1 to count stall cycles.",
         "EventCode": "0xB1",
         "Counter": "0,1,2,3",
@@ -757,16 +922,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Counts total number of uops to be executed per-core each cycle.",
-        "EventCode": "0xB1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_EXECUTED.CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of uops executed on the core.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xB1",
         "Invert": "1",
         "Counter": "0,1,2,3",
@@ -778,258 +933,6 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of instructions at retirement.",
-        "EventCode": "0xC0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "INST_RETIRED.ANY_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution.",
-        "EventCode": "0xC0",
-        "Counter": "1",
-        "UMask": "0x1",
-        "EventName": "INST_RETIRED.PREC_DIST",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
-        "CounterHTOff": "1"
-    },
-    {
-        "EventCode": "0xC1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of micro-ops retired, Use cmask=1 and invert to count active cycles or stalled cycles.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.ALL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Actually retired uops. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of retirement slots used each cycle.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Retirement slots used. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with less than 10 actually retired uops.",
-        "CounterMask": "10",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "UOPS_RETIRED.CORE_STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Number of self-modifying-code machine clears detected.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "MACHINE_CLEARS.SMC",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Self-modifying code (SMC) detected.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "MACHINE_CLEARS.MASKMOV",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of conditional branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BR_INST_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Conditional branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Direct and indirect near call instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "BR_INST_RETIRED.NEAR_CALL",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Direct and indirect near call instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Branch instructions at retirement.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Counts the number of near return instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Return instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Counts the number of not taken branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Not taken branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Number of near taken branches retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Taken branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of far branches retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Far branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Mispredicted conditional branch instructions retired.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted conditional branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Mispredicted branch instructions at retirement.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All mispredicted macro branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "Mispredicted taken branch instructions retired.",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted macro branch instructions retired.",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Count cases of saving new LBR records by hardware.",
-        "EventCode": "0xCC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Count cases of saving new LBR",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of front end re-steers due to BPU misprediction.",
-        "EventCode": "0xE6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1f",
-        "EventName": "BACLEARS.ANY",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PublicDescription": "Cycles where at least 1 uop was executed per-thread.",
         "EventCode": "0xB1",
         "Counter": "0,1,2,3",
@@ -1074,150 +977,13 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x5E",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "RS_EVENTS.EMPTY_END",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate Frontend Latency Bound issues.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "MACHINE_CLEARS.COUNT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of machine clears (nukes) of any type.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
-        "EventCode": "0xA8",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LSD.CYCLES_4_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0x8",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "CounterMask": "8",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L2 cache miss load* is outstanding.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
+        "PublicDescription": "Counts total number of uops to be executed per-core each cycle.",
+        "EventCode": "0xB1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "EventName": "UOPS_EXECUTED.CORE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
-        "CounterMask": "2",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Total execution stalls.",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "2",
-        "UMask": "0xc",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
-        "CounterMask": "12",
-        "CounterHTOff": "2"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x5",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L2 cache miss load* is outstanding.",
-        "CounterMask": "5",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x6",
-        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
-        "UMask": "0x2",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "PublicDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted. (counts at 100 MHz rate)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x0D",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3",
-        "AnyThread": "1",
-        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
-        "CounterMask": "1",
+        "BriefDescription": "Number of uops executed on the core.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -1276,32 +1042,268 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Reference cycles when the thread is unhalted. (counts at 100 MHz rate)",
-        "EventCode": "0x3C",
+        "PublicDescription": "Number of instructions at retirement.",
+        "EventCode": "0xC0",
         "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "UMask": "0x0",
+        "EventName": "INST_RETIRED.ANY_P",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the thread is unhalted (counts at 100 MHz rate)",
+        "BriefDescription": "Number of instructions retired. General Counter   - architectural event",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
+        "PEBS": "2",
+        "PublicDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution.",
+        "EventCode": "0xC0",
+        "Counter": "1",
+        "UMask": "0x1",
+        "EventName": "INST_RETIRED.PREC_DIST",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
+        "CounterHTOff": "1"
+    },
+    {
+        "EventCode": "0xC1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "EventName": "OTHER_ASSISTS.ANY_WB_ASSIST",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of times any microcode assist is invoked by HW upon uop writeback.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.ALL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Retired uops.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with less than 10 actually retired uops.",
+        "CounterMask": "10",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "EventCode": "0xC2",
+        "Invert": "1",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "EventName": "UOPS_RETIRED.CORE_STALL_CYCLES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Reference cycles when the at least one thread on the physical core is unhalted. (counts at 100 MHz rate)",
+        "BriefDescription": "Cycles without actually retired uops.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Retirement slots used.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "MACHINE_CLEARS.COUNT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of self-modifying-code machine clears detected.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "MACHINE_CLEARS.SMC",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Self-modifying code (SMC) detected.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "MACHINE_CLEARS.MASKMOV",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "This event counts the number of executed Intel AVX masked load operations that refer to an illegal address range with the mask bits set to 0.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Branch instructions at retirement.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_INST_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Conditional branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Direct and indirect near call instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL_R3",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Direct and indirect macro near call instructions retired (captured in ring 3).",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Return instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of not taken branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Not taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of far branches retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Far branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Mispredicted branch instructions at retirement.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted conditional branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "number of near branch instructions retired that were mispredicted and taken.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Count cases of saving new LBR records by hardware.",
+        "EventCode": "0xCC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Count XClk pulses when this thread is unhalted and the other thread is halted.",
+        "BriefDescription": "Count cases of saving new LBR",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of front end re-steers due to BPU misprediction.",
+        "EventCode": "0xE6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1f",
+        "EventName": "BACLEARS.ANY",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/ivytown/virtual-memory.json b/tools/perf/pmu-events/arch/x86/ivytown/virtual-memory.json
index c8de548..4645e9d 100644
--- a/tools/perf/pmu-events/arch/x86/ivytown/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/ivytown/virtual-memory.json
@@ -1,5 +1,15 @@
 [
     {
+        "PublicDescription": "Misses in all TLB levels that cause a page walk of any page size from demand loads.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x81",
+        "EventName": "DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes an page walk of any page size.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x08",
         "Counter": "0,1,2,3",
         "UMask": "0x82",
@@ -9,6 +19,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Misses in all TLB levels that caused page walk completed of any size by demand loads.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x82",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x08",
         "Counter": "0,1,2,3",
         "UMask": "0x84",
@@ -18,6 +38,16 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycle PMH is busy with a walk due to demand loads.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x84",
+        "EventName": "DTLB_LOAD_MISSES.WALK_DURATION",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Demand load cycles page miss handler (PMH) is busy with this walk.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x08",
         "Counter": "0,1,2,3",
         "UMask": "0x88",
@@ -164,35 +194,5 @@
         "SampleAfterValue": "100007",
         "BriefDescription": "STLB flush attempts",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Misses in all TLB levels that cause a page walk of any page size from demand loads.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x81",
-        "EventName": "DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes an page walk of any page size.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Misses in all TLB levels that caused page walk completed of any size by demand loads.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x82",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes of any page size.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycle PMH is busy with a walk due to demand loads.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x84",
-        "EventName": "DTLB_LOAD_MISSES.WALK_DURATION",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Demand load cycles page miss handler (PMH) is busy with this walk.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/mapfile.csv b/tools/perf/pmu-events/arch/x86/mapfile.csv
index fe1a2c4..93656f2 100644
--- a/tools/perf/pmu-events/arch/x86/mapfile.csv
+++ b/tools/perf/pmu-events/arch/x86/mapfile.csv
@@ -23,10 +23,7 @@
 GenuineIntel-6-1F,v2,nehalemep,core
 GenuineIntel-6-1A,v2,nehalemep,core
 GenuineIntel-6-2E,v2,nehalemex,core
-GenuineIntel-6-4E,v24,skylake,core
-GenuineIntel-6-5E,v24,skylake,core
-GenuineIntel-6-8E,v24,skylake,core
-GenuineIntel-6-9E,v24,skylake,core
+GenuineIntel-6-[4589]E,v24,skylake,core
 GenuineIntel-6-37,v13,silvermont,core
 GenuineIntel-6-4D,v13,silvermont,core
 GenuineIntel-6-4C,v13,silvermont,core
diff --git a/tools/perf/pmu-events/arch/x86/silvermont/cache.json b/tools/perf/pmu-events/arch/x86/silvermont/cache.json
index 0bd1bc5..82be7d1 100644
--- a/tools/perf/pmu-events/arch/x86/silvermont/cache.json
+++ b/tools/perf/pmu-events/arch/x86/silvermont/cache.json
@@ -36,12 +36,13 @@
         "BriefDescription": "L2 cache request misses"
     },
     {
+        "PublicDescription": "Counts cycles that fetch is stalled due to an outstanding ICache miss. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes due to an ICache miss.  Note: this event is not the same as the total number of cycles spent retrieving instruction cache lines from the memory hierarchy.\r\nCounts cycles that fetch is stalled due to any reason. That is, the decoder queue is able to accept bytes, but the fetch unit is unable to provide bytes.  This will include cycles due to an ITLB miss, ICache miss and other events. \r\n",
         "EventCode": "0x86",
         "Counter": "0,1",
         "UMask": "0x4",
         "EventName": "FETCH_STALL.ICACHE_FILL_PENDING_CYCLES",
         "SampleAfterValue": "200003",
-        "BriefDescription": "Counts the number of cycles the NIP stalls because of an icache miss. This is a cumulative count of cycles the NIP stalled for all icache misses."
+        "BriefDescription": "Cycles code-fetch stalled due to an outstanding ICache miss."
     },
     {
         "PEBS": "1",
diff --git a/tools/perf/pmu-events/arch/x86/skylake/cache.json b/tools/perf/pmu-events/arch/x86/skylake/cache.json
index 0551a9b..54bfe9e 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/cache.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/cache.json
@@ -1,23 +1,423 @@
 [
     {
+        "PublicDescription": "Counts the number of demand Data Read requests that miss L2 cache. Only not rejected loads are counted.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x21",
+        "EventName": "L2_RQSTS.DEMAND_DATA_RD_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand Data Read miss L2, no rejects",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the RFO (Read-for-Ownership) requests that miss L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x22",
+        "EventName": "L2_RQSTS.RFO_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "RFO requests that miss L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts L2 cache misses when fetching instructions.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x24",
+        "EventName": "L2_RQSTS.CODE_RD_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 cache misses when fetching instructions",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Demand requests that miss L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x27",
+        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand requests that miss L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x38",
+        "EventName": "L2_RQSTS.PF_MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "All requests that miss L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3f",
+        "EventName": "L2_RQSTS.MISS",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "All requests that miss L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of demand Data Read requests that hit L2 cache. Only non rejected loads are counted.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x41",
+        "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand Data Read requests that hit L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the RFO (Read-for-Ownership) requests that hit L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x42",
+        "EventName": "L2_RQSTS.RFO_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "RFO requests that hit L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts L2 cache hits when fetching instructions, code reads.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0x44",
+        "EventName": "L2_RQSTS.CODE_RD_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that hit L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xd8",
+        "EventName": "L2_RQSTS.PF_HIT",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that hit L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of demand Data Read requests (including requests from L1D hardware prefetchers). These loads may hit or miss L2 cache. Only non rejected loads are counted.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe1",
+        "EventName": "L2_RQSTS.ALL_DEMAND_DATA_RD",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand Data Read requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the total number of RFO (read for ownership) requests to L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe2",
+        "EventName": "L2_RQSTS.ALL_RFO",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "RFO requests to L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the total number of L2 code requests.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe4",
+        "EventName": "L2_RQSTS.ALL_CODE_RD",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "L2 code requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Demand requests to L2 cache.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe7",
+        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Demand requests to L2 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the total number of requests from the L2 hardware prefetchers.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xf8",
+        "EventName": "L2_RQSTS.ALL_PF",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "All L2 requests.",
+        "EventCode": "0x24",
+        "Counter": "0,1,2,3",
+        "UMask": "0xff",
+        "EventName": "L2_RQSTS.REFERENCES",
+        "SampleAfterValue": "200003",
+        "BriefDescription": "All L2 requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts core-originated cacheable requests that miss the L3 cache (Longest Latency cache). Requests include data and code reads, Reads-for-Ownership (RFOs), speculative accesses and hardware prefetches from L1 and L2. It does not include all misses to the L3.",
+        "EventCode": "0x2E",
+        "Counter": "0,1,2,3",
+        "UMask": "0x41",
+        "Errata": "SKL057",
+        "EventName": "LONGEST_LAT_CACHE.MISS",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Core-originated cacheable demand requests missed L3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts core-originated cacheable requests to the  L3 cache (Longest Latency cache). Requests include data and code reads, Reads-for-Ownership (RFOs), speculative accesses and hardware prefetches from L1 and L2.  It does not include all accesses to the L3.",
+        "EventCode": "0x2E",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4f",
+        "Errata": "SKL057",
+        "EventName": "LONGEST_LAT_CACHE.REFERENCE",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Core-originated cacheable demand requests that refer to L3",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts duration of L1D miss outstanding, that is each cycle number of Fill Buffers (FB) outstanding required by Demand Reads. FB either is held by demand loads, or it is held by non-demand loads and gets hit at least once by demand. The valid outstanding interval is defined until the FB deallocation by one of the following ways: from FB allocation, if FB is allocated by demand from the demand Hit FB, if it is allocated by hardware or software prefetch.Note: In the L1D, a Demand Read contains cacheable or noncacheable demand loads, including ones causing cache-line splits and reads due to page walks resulted from any request type.",
+        "EventCode": "0x48",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "L1D_PEND_MISS.PENDING",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "L1D miss outstandings duration in cycles",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts duration of L1D miss outstanding in cycles.",
+        "EventCode": "0x48",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with L1D load Misses outstanding.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x48",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of times a request needed a FB (Fill Buffer) entry but there was no entry available for it. A request includes cacheable/uncacheable demands that are load, store or SW prefetch instructions.",
+        "EventCode": "0x48",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "L1D_PEND_MISS.FB_FULL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of times a request needed a FB entry but there was no entry available for it. That is the FB unavailability was dominant reason for blocking the request. A request includes cacheable/uncacheable demands that is load, store or SW prefetch.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.",
+        "EventCode": "0x51",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "L1D.REPLACEMENT",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "L1D data line replacements",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of offcore outstanding Demand Data Read transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor. See the corresponding Umask under OFFCORE_REQUESTS.Note: A prefetch promoted to Demand is counted from the promotion point.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding Demand Data Read transactions in uncore queue.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of offcore outstanding RFO (store) transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of offcore outstanding cacheable Core Data Read transactions in the super queue every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles when offcore outstanding cacheable Core Data Read transactions are present in the super queue. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the Demand Data Read requests sent to uncore. Use it in conjunction with OFFCORE_REQUESTS_OUTSTANDING to determine average latency in the uncore.",
+        "EventCode": "0xB0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_REQUESTS.DEMAND_DATA_RD",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand Data Read requests sent to uncore",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts both cacheable and non-cacheable code read requests.",
+        "EventCode": "0xB0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "OFFCORE_REQUESTS.DEMAND_CODE_RD",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Cacheable and noncachaeble code read requests",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the demand RFO (read for ownership) requests including regular RFOs, locks, ItoM.",
+        "EventCode": "0xB0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "OFFCORE_REQUESTS.DEMAND_RFO",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand RFO requests including regular RFOs, locks, ItoM",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the demand and prefetch data reads. All Core Data Reads include cacheable 'Demands' and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type.",
+        "EventCode": "0xB0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "OFFCORE_REQUESTS.ALL_DATA_RD",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand and prefetch data reads",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts memory transactions reached the super queue including requests initiated by the core, all L3 prefetches, page walks, etc..",
+        "EventCode": "0xB0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x80",
+        "EventName": "OFFCORE_REQUESTS.ALL_REQUESTS",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Any memory transaction that reached the SQ.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of cases when the offcore requests buffer cannot take more entries for the core. This can happen when the superqueue does not contain eligible entries, or when L1D writeback pending FIFO requests is full.Note: Writeback pending FIFO has six entries.",
+        "EventCode": "0xB2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_REQUESTS_BUFFER.SQ_FULL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Offcore requests buffer cannot take more entries for this thread core.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "EventCode": "0xB7, 0xBB",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "OFFCORE_RESPONSE",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
         "PEBS": "1",
+        "PublicDescription": "Retired load instructions that miss the STLB.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x11",
         "EventName": "MEM_INST_RETIRED.STLB_MISS_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load instructions that miss the STLB.",
+        "BriefDescription": "Retired load instructions that miss the STLB. (Precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "Retired store instructions that miss the STLB.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x12",
         "EventName": "MEM_INST_RETIRED.STLB_MISS_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store instructions that miss the STLB.",
+        "BriefDescription": "Retired store instructions that miss the STLB. (Precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
@@ -29,7 +429,7 @@
         "UMask": "0x21",
         "EventName": "MEM_INST_RETIRED.LOCK_LOADS",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired load instructions with locked access.",
+        "BriefDescription": "Retired load instructions with locked access. (Precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
@@ -40,7 +440,7 @@
         "UMask": "0x41",
         "EventName": "MEM_INST_RETIRED.SPLIT_LOADS",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired load instructions that split across a cacheline boundary.",
+        "BriefDescription": "Retired load instructions that split across a cacheline boundary. (Precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
@@ -51,7 +451,7 @@
         "UMask": "0x42",
         "EventName": "MEM_INST_RETIRED.SPLIT_STORES",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Retired store instructions that split across a cacheline boundary.",
+        "BriefDescription": "Retired store instructions that split across a cacheline boundary. (Precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
@@ -63,25 +463,26 @@
         "UMask": "0x81",
         "EventName": "MEM_INST_RETIRED.ALL_LOADS",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired load instructions.",
+        "BriefDescription": "All retired load instructions. (Precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "All retired store instructions.",
         "EventCode": "0xD0",
         "Counter": "0,1,2,3",
         "UMask": "0x82",
         "EventName": "MEM_INST_RETIRED.ALL_STORES",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "All retired store instructions.",
+        "BriefDescription": "All retired store instructions. (Precise Event)",
         "CounterHTOff": "0,1,2,3",
         "Data_LA": "1",
         "L1_Hit_Indication": "1"
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load instructions with L1 cache hits as data sources.",
+        "PublicDescription": "Counts retired load instructions with at least one uop that hit in the L1 data cache. This event includes all SW prefetches and lock instructions regardless of the data source.\r\n",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -117,7 +518,7 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load instructions missed L1 cache as data sources.",
+        "PublicDescription": "Counts retired load instructions with at least one uop that missed in the L1 cache.",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -153,7 +554,7 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Retired load instructions which data sources were load missed L1 but hit FB due to preceding miss to the same cache line with data not ready.",
+        "PublicDescription": "Counts retired load instructions with at least one uop was load missed in L1 but hit FB (Fill Buffers) due to preceding miss to the same cache line with data not ready. \r\n",
         "EventCode": "0xD1",
         "Counter": "0,1,2,3",
         "UMask": "0x40",
@@ -222,169 +623,7 @@
         "Data_LA": "1"
     },
     {
-        "PublicDescription": "This event counts L1D data line replacements including opportunistic replacements, and replacements that require stall-for-replace or block-for-replace.",
-        "EventCode": "0x51",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "L1D.REPLACEMENT",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "L1D data line replacements",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts duration of L1D miss outstanding, that is each cycle number of Fill Buffers (FB) outstanding required by Demand Reads. FB either is held by demand loads, or it is held by non-demand loads and gets hit at least once by demand. The valid outstanding interval is defined until the FB deallocation by one of the following ways: from FB allocation, if FB is allocated by demand\n from the demand Hit FB, if it is allocated by hardware or software prefetch.\nNote: In the L1D, a Demand Read contains cacheable or noncacheable demand loads, including ones causing cache-line splits and reads due to page walks resulted from any request type.",
-        "EventCode": "0x48",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "L1D_PEND_MISS.PENDING",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "L1D miss outstandings duration in cycles",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x48",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "L1D_PEND_MISS.FB_FULL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times a request needed a FB entry but there was no entry available for it. That is the FB unavailability was dominant reason for blocking the request. A request includes cacheable/uncacheable demands that is load, store or SW prefetch.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts duration of L1D miss outstanding in cycles.",
-        "EventCode": "0x48",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with L1D load Misses outstanding.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the Demand Data Read requests sent to uncore. Use it in conjunction with OFFCORE_REQUESTS_OUTSTANDING to determine average latency in the uncore.",
-        "EventCode": "0xB0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_REQUESTS.DEMAND_DATA_RD",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand Data Read requests sent to uncore",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts both cacheable and noncachaeble code read requests.",
-        "EventCode": "0xB0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "OFFCORE_REQUESTS.DEMAND_CODE_RD",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Cacheable and noncachaeble code read requests",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the demand RFO (read for ownership) requests including regular RFOs, locks, ItoM.",
-        "EventCode": "0xB0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "OFFCORE_REQUESTS.DEMAND_RFO",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand RFO requests including regular RFOs, locks, ItoM",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the demand and prefetch data reads. All Core Data Reads include cacheable 'Demands' and L2 prefetchers (not L3 prefetchers). Counting also covers reads due to page walks resulted from any request type.",
-        "EventCode": "0xB0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "OFFCORE_REQUESTS.ALL_DATA_RD",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand and prefetch data reads",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts memory transactions reached the super queue including requests initiated by the core, all L3 prefetches, page walks, and so on.",
-        "EventCode": "0xB0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "OFFCORE_REQUESTS.ALL_REQUESTS",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Any memory transaction that reached the SQ.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of offcore outstanding Demand Data Read transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor. See the corresponding Umask under OFFCORE_REQUESTS.\nNote: A prefetch promoted to Demand is counted from the promotion point.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding Demand Data Read transactions in uncore queue.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of offcore outstanding RFO (store) transactions in the super queue (SQ) every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore, every cycle",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of offcore outstanding cacheable Core Data Read transactions in the super queue every cycle. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore outstanding cacheable Core Data Read transactions in SuperQueue (SQ), queue to uncore",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles when offcore outstanding Demand Data Read transactions are present in the super queue (SQ). A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation).",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when offcore outstanding Demand Data Read transactions are present in SuperQueue (SQ), queue to uncore",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles when offcore outstanding cacheable Core Data Read transactions are present in the super queue. A transaction is considered to be in the Offcore outstanding state between L2 miss and transaction completion sent to requestor (SQ de-allocation). See corresponding Umask under OFFCORE_REQUESTS.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when offcore outstanding cacheable Core Data Read transactions are present in SuperQueue (SQ), queue to uncore.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of cases when the offcore requests buffer cannot take more entries for the core. This can happen when the superqueue does not contain eligible entries, or when L1D writeback pending FIFO requests is full.\nNote: Writeback pending FIFO has six entries.",
-        "EventCode": "0xB2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_REQUESTS_BUFFER.SQ_FULL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Offcore requests buffer cannot take more entries for this thread core.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts L2 writebacks that access L2 cache.",
+        "PublicDescription": "Counts L2 writebacks that access L2 cache.",
         "EventCode": "0xF0",
         "Counter": "0,1,2,3",
         "UMask": "0x40",
@@ -394,204 +633,13 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts core-originated cacheable demand requests that miss the last level cache (LLC). Demand requests include loads, RFOs, and hardware prefetches from L1D, and instruction fetches from IFU.",
-        "EventCode": "0x2E",
+        "PublicDescription": "Counts the number of L2 cache lines filling the L2. Counting does not cover rejects.",
+        "EventCode": "0xF1",
         "Counter": "0,1,2,3",
-        "UMask": "0x41",
-        "Errata": "SKL057",
-        "EventName": "LONGEST_LAT_CACHE.MISS",
+        "UMask": "0x1f",
+        "EventName": "L2_LINES_IN.ALL",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Core-originated cacheable demand requests missed L3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts core-originated cacheable demand requests that refer to the last level cache (LLC). Demand requests include loads, RFOs, and hardware prefetches from L1D, and instruction fetches from IFU.",
-        "EventCode": "0x2E",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4f",
-        "Errata": "SKL057",
-        "EventName": "LONGEST_LAT_CACHE.REFERENCE",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Core-originated cacheable demand requests that refer to L3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of cache line split locks sent to the uncore.",
-        "EventCode": "0xF4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "SQ_MISC.SPLIT_LOCK",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of cache line split locks sent to uncore.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
-        "EventCode": "0xB7, 0xBB",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "PublicDescription": "This event counts the number of demand Data Read requests that miss L2 cache. Only not rejected loads are counted.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x21",
-        "EventName": "L2_RQSTS.DEMAND_DATA_RD_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand Data Read miss L2, no rejects",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of demand Data Read requests that hit L2 cache. Only not rejected loads are counted.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x41",
-        "EventName": "L2_RQSTS.DEMAND_DATA_RD_HIT",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand Data Read requests that hit L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of demand Data Read requests (including requests from L1D hardware prefetchers). These loads may hit or miss L2 cache. Only non rejected loads are counted.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe1",
-        "EventName": "L2_RQSTS.ALL_DEMAND_DATA_RD",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand Data Read requests",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the total number of RFO (read for ownership) requests to L2 cache. L2 RFO requests include both L1D demand RFO misses as well as L1D RFO prefetches.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe2",
-        "EventName": "L2_RQSTS.ALL_RFO",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests to L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the total number of L2 code requests.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe4",
-        "EventName": "L2_RQSTS.ALL_CODE_RD",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 code requests",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the total number of requests from the L2 hardware prefetchers.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xf8",
-        "EventName": "L2_RQSTS.ALL_PF",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x38",
-        "EventName": "L2_RQSTS.PF_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that miss L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that hit L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xd8",
-        "EventName": "L2_RQSTS.PF_HIT",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Requests from the L1/L2/L3 hardware prefetchers or Load software prefetches that hit L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "RFO requests that hit L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x42",
-        "EventName": "L2_RQSTS.RFO_HIT",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests that hit L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "RFO requests that miss L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x22",
-        "EventName": "L2_RQSTS.RFO_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "RFO requests that miss L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x44",
-        "EventName": "L2_RQSTS.CODE_RD_HIT",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 cache hits when fetching instructions, code reads.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "L2 cache misses when fetching instructions.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x24",
-        "EventName": "L2_RQSTS.CODE_RD_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "L2 cache misses when fetching instructions",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Demand requests that miss L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x27",
-        "EventName": "L2_RQSTS.ALL_DEMAND_MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand requests that miss L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Demand requests to L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe7",
-        "EventName": "L2_RQSTS.ALL_DEMAND_REFERENCES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "Demand requests to L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "All requests that miss L2 cache.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3f",
-        "EventName": "L2_RQSTS.MISS",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "All requests that miss L2 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "All L2 requests.",
-        "EventCode": "0x24",
-        "Counter": "0,1,2,3",
-        "UMask": "0xff",
-        "EventName": "L2_RQSTS.REFERENCES",
-        "SampleAfterValue": "200003",
-        "BriefDescription": "All L2 requests",
+        "BriefDescription": "L2 cache lines filling L2",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -623,59 +671,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of L2 cache lines filling the L2. Counting does not cover rejects.",
-        "EventCode": "0xF1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1f",
-        "EventName": "L2_LINES_IN.ALL",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "L2 cache lines filling L2",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_CODE_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of offcore outstanding demand rfo Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_DEMAND_RFO",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with offcore outstanding demand rfo reads transactions in SuperQueue (SQ), queue to uncore.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x48",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "L1D_PEND_MISS.PENDING_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with L1D load Misses outstanding from any thread on physical core.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD_GE_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with at least 6 offcore outstanding Demand Data Read transactions in uncore queue.",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "EventCode": "0xF2",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -685,3102 +680,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0408000 ",
+        "PublicDescription": "Counts the number of cache line split locks sent to the uncore.",
+        "EventCode": "0xF4",
         "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L4_HIT_LOCAL_L4.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
+        "UMask": "0x10",
+        "EventName": "SQ_MISC.SPLIT_LOCK",
         "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L4_HIT_LOCAL_L4 & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
+        "BriefDescription": "Number of cache line split locks sent to uncore.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000408000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L4_HIT_LOCAL_L4.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L4_HIT_LOCAL_L4 & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400408000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L4_HIT_LOCAL_L4.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L4_HIT_LOCAL_L4 & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200408000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L4_HIT_LOCAL_L4.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L4_HIT_LOCAL_L4 & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100408000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L4_HIT_LOCAL_L4.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L4_HIT_LOCAL_L4 & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080408000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L4_HIT_LOCAL_L4.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L4_HIT_LOCAL_L4 & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040408000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L4_HIT_LOCAL_L4.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L4_HIT_LOCAL_L4 & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc01c8000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x10001c8000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x04001c8000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts any other requests that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x02001c8000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts any other requests that hit in the L3 and the snoops sent to sibling cores return clean response.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x01001c8000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts any other requests that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00801c8000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00401c8000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0108000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_S.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_S & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000108000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_S.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_S & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400108000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_S.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_S & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200108000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_S.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_S & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100108000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_S.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_S & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080108000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_S.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_S & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040108000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_S.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_S & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0088000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_E.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_E & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000088000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_E.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_E & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400088000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_E.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_E & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200088000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_E.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_E & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100088000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_E.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_E & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080088000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_E.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_E & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040088000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_E.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_E & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0048000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_M.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_M & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000048000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_M.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_M & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400048000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_M.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_M & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200048000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_M.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_M & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100048000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_M.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_M & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080048000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_M.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_M & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040048000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_M.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_M & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0028000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.SUPPLIER_NONE.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & SUPPLIER_NONE & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000028000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.SUPPLIER_NONE.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & SUPPLIER_NONE & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400028000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.SUPPLIER_NONE.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & SUPPLIER_NONE & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200028000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.SUPPLIER_NONE.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & SUPPLIER_NONE & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100028000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.SUPPLIER_NONE.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & SUPPLIER_NONE & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080028000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.SUPPLIER_NONE.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & SUPPLIER_NONE & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040028000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.SUPPLIER_NONE.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & SUPPLIER_NONE & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0000018000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.ANY_RESPONSE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts any other requests that have any response type.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0400800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L4_HIT_LOCAL_L4.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L4_HIT_LOCAL_L4 & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000400800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L4_HIT_LOCAL_L4.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L4_HIT_LOCAL_L4 & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400400800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L4_HIT_LOCAL_L4.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L4_HIT_LOCAL_L4 & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200400800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L4_HIT_LOCAL_L4.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L4_HIT_LOCAL_L4 & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100400800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L4_HIT_LOCAL_L4.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L4_HIT_LOCAL_L4 & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080400800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L4_HIT_LOCAL_L4.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L4_HIT_LOCAL_L4 & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040400800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L4_HIT_LOCAL_L4.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L4_HIT_LOCAL_L4 & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc01c0800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x10001c0800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x04001c0800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts streaming stores that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x02001c0800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts streaming stores that hit in the L3 and the snoops sent to sibling cores return clean response.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x01001c0800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts streaming stores that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00801c0800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00401c0800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0100800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_S.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_S & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000100800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_S.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_S & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400100800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_S.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_S & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200100800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_S.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_S & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100100800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_S.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_S & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080100800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_S.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_S & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040100800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_S.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_S & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0080800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_E.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_E & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000080800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_E.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_E & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400080800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_E.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_E & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200080800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_E.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_E & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100080800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_E.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_E & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080080800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_E.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_E & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040080800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_E.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_E & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0040800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_M.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_M & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000040800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_M.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_M & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400040800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_M.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_M & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200040800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_M.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_M & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100040800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_M.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_M & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080040800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_M.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_M & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040040800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_M.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_M & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0020800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.SUPPLIER_NONE.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & SUPPLIER_NONE & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000020800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.SUPPLIER_NONE.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & SUPPLIER_NONE & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400020800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.SUPPLIER_NONE.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & SUPPLIER_NONE & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200020800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.SUPPLIER_NONE.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & SUPPLIER_NONE & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100020800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.SUPPLIER_NONE.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & SUPPLIER_NONE & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080020800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.SUPPLIER_NONE.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & SUPPLIER_NONE & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040020800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.SUPPLIER_NONE.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & SUPPLIER_NONE & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0000010800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.ANY_RESPONSE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts streaming stores that have any response type.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0400100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L4_HIT_LOCAL_L4.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L4_HIT_LOCAL_L4 & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000400100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L4_HIT_LOCAL_L4.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L4_HIT_LOCAL_L4 & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400400100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L4_HIT_LOCAL_L4.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L4_HIT_LOCAL_L4 & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200400100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L4_HIT_LOCAL_L4.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L4_HIT_LOCAL_L4 & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100400100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L4_HIT_LOCAL_L4.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L4_HIT_LOCAL_L4 & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080400100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L4_HIT_LOCAL_L4.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L4_HIT_LOCAL_L4 & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040400100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L4_HIT_LOCAL_L4.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L4_HIT_LOCAL_L4 & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc01c0100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x10001c0100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x04001c0100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x02001c0100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoops sent to sibling cores return clean response.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x01001c0100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00801c0100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00401c0100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0100100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_S.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_S & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000100100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_S.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_S & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400100100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_S.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_S & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200100100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_S.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_S & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100100100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_S.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_S & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080100100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_S.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_S & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040100100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_S.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_S & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0080100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_E.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_E & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000080100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_E.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_E & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400080100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_E.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_E & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200080100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_E.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_E & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100080100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_E.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_E & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080080100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_E.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_E & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040080100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_E.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_E & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0040100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_M.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_M & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000040100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_M.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_M & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400040100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_M.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_M & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200040100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_M.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_M & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100040100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_M.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_M & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080040100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_M.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_M & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040040100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_M.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_M & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0020100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.SUPPLIER_NONE.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & SUPPLIER_NONE & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000020100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.SUPPLIER_NONE.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & SUPPLIER_NONE & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400020100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.SUPPLIER_NONE.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & SUPPLIER_NONE & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200020100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.SUPPLIER_NONE.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & SUPPLIER_NONE & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100020100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.SUPPLIER_NONE.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & SUPPLIER_NONE & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080020100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.SUPPLIER_NONE.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & SUPPLIER_NONE & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040020100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.SUPPLIER_NONE.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & SUPPLIER_NONE & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0000010100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.ANY_RESPONSE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all prefetch (that bring data to LLC only) RFOs that have any response type.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0400080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L4_HIT_LOCAL_L4.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L4_HIT_LOCAL_L4 & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000400080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L4_HIT_LOCAL_L4.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L4_HIT_LOCAL_L4 & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400400080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L4_HIT_LOCAL_L4.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L4_HIT_LOCAL_L4 & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200400080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L4_HIT_LOCAL_L4.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L4_HIT_LOCAL_L4 & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100400080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L4_HIT_LOCAL_L4.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L4_HIT_LOCAL_L4 & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080400080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L4_HIT_LOCAL_L4.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L4_HIT_LOCAL_L4 & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040400080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L4_HIT_LOCAL_L4.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L4_HIT_LOCAL_L4 & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc01c0080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x10001c0080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x04001c0080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x02001c0080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoops sent to sibling cores return clean response.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x01001c0080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00801c0080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00401c0080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0100080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_S.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_S & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000100080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_S.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_S & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400100080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_S.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_S & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200100080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_S.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_S & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100100080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_S.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_S & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080100080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_S.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_S & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040100080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_S.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_S & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0080080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_E.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_E & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000080080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_E.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_E & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400080080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_E.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_E & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200080080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_E.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_E & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100080080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_E.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_E & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080080080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_E.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_E & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040080080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_E.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_E & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0040080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_M.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_M & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000040080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_M.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_M & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400040080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_M.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_M & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200040080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_M.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_M & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100040080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_M.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_M & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080040080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_M.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_M & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040040080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_M.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_M & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0020080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.SUPPLIER_NONE.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & SUPPLIER_NONE & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000020080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.SUPPLIER_NONE.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & SUPPLIER_NONE & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400020080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.SUPPLIER_NONE.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & SUPPLIER_NONE & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200020080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.SUPPLIER_NONE.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & SUPPLIER_NONE & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100020080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.SUPPLIER_NONE.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & SUPPLIER_NONE & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080020080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.SUPPLIER_NONE.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & SUPPLIER_NONE & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040020080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.SUPPLIER_NONE.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & SUPPLIER_NONE & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0000010080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.ANY_RESPONSE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all prefetch (that bring data to LLC only) data reads that have any response type.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0400004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L4_HIT_LOCAL_L4.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L4_HIT_LOCAL_L4 & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000400004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L4_HIT_LOCAL_L4.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L4_HIT_LOCAL_L4 & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400400004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L4_HIT_LOCAL_L4.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L4_HIT_LOCAL_L4 & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200400004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L4_HIT_LOCAL_L4.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L4_HIT_LOCAL_L4 & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100400004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L4_HIT_LOCAL_L4.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L4_HIT_LOCAL_L4 & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080400004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L4_HIT_LOCAL_L4.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L4_HIT_LOCAL_L4 & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040400004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L4_HIT_LOCAL_L4.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L4_HIT_LOCAL_L4 & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc01c0004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x10001c0004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x04001c0004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all demand code reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x02001c0004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all demand code reads that hit in the L3 and the snoops sent to sibling cores return clean response.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x01001c0004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all demand code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00801c0004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00401c0004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0100004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_S.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_S & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000100004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_S.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_S & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400100004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_S.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_S & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200100004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_S.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_S & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100100004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_S.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_S & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080100004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_S.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_S & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040100004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_S.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_S & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0080004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_E.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_E & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000080004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_E.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_E & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400080004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_E.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_E & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200080004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_E.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_E & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100080004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_E.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_E & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080080004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_E.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_E & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040080004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_E.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_E & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0040004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_M.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_M & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000040004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_M.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_M & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400040004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_M.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_M & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200040004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_M.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_M & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100040004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_M.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_M & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080040004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_M.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_M & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040040004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_M.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_M & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0020004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.SUPPLIER_NONE.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & SUPPLIER_NONE & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000020004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.SUPPLIER_NONE.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & SUPPLIER_NONE & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400020004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.SUPPLIER_NONE.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & SUPPLIER_NONE & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200020004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.SUPPLIER_NONE.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & SUPPLIER_NONE & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100020004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.SUPPLIER_NONE.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & SUPPLIER_NONE & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080020004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.SUPPLIER_NONE.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & SUPPLIER_NONE & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040020004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.SUPPLIER_NONE.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & SUPPLIER_NONE & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0000010004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.ANY_RESPONSE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all demand code reads that have any response type.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0400002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L4_HIT_LOCAL_L4.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L4_HIT_LOCAL_L4 & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000400002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L4_HIT_LOCAL_L4.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L4_HIT_LOCAL_L4 & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400400002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L4_HIT_LOCAL_L4.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L4_HIT_LOCAL_L4 & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200400002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L4_HIT_LOCAL_L4.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L4_HIT_LOCAL_L4 & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100400002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L4_HIT_LOCAL_L4.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L4_HIT_LOCAL_L4 & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080400002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L4_HIT_LOCAL_L4.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L4_HIT_LOCAL_L4 & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040400002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L4_HIT_LOCAL_L4.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L4_HIT_LOCAL_L4 & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc01c0002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x10001c0002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x04001c0002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x02001c0002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoops sent to sibling cores return clean response.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x01001c0002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all demand data writes (RFOs) that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00801c0002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00401c0002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0100002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_S.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_S & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000100002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_S.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_S & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400100002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_S.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_S & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200100002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_S.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_S & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100100002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_S.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_S & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080100002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_S.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_S & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040100002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_S.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_S & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0080002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_E.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_E & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000080002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_E.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_E & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400080002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_E.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_E & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200080002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_E.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_E & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100080002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_E.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_E & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080080002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_E.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_E & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040080002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_E.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_E & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0040002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_M.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_M & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000040002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_M.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_M & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400040002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_M.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_M & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200040002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_M.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_M & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100040002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_M.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_M & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080040002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_M.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_M & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040040002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_M.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_M & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0020002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.SUPPLIER_NONE.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & SUPPLIER_NONE & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000020002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.SUPPLIER_NONE.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & SUPPLIER_NONE & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400020002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.SUPPLIER_NONE.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & SUPPLIER_NONE & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200020002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.SUPPLIER_NONE.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & SUPPLIER_NONE & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100020002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.SUPPLIER_NONE.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & SUPPLIER_NONE & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080020002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.SUPPLIER_NONE.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & SUPPLIER_NONE & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040020002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.SUPPLIER_NONE.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & SUPPLIER_NONE & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0000010002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.ANY_RESPONSE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts all demand data writes (RFOs) that have any response type.",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fc0400001 ",
         "Counter": "0,1,2,3",
@@ -3793,6 +703,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000400001 ",
         "Counter": "0,1,2,3",
@@ -3805,6 +716,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400400001 ",
         "Counter": "0,1,2,3",
@@ -3817,6 +729,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200400001 ",
         "Counter": "0,1,2,3",
@@ -3829,6 +742,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100400001 ",
         "Counter": "0,1,2,3",
@@ -3841,6 +755,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080400001 ",
         "Counter": "0,1,2,3",
@@ -3853,18 +768,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040400001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L4_HIT_LOCAL_L4.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L4_HIT_LOCAL_L4 & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fc01c0001 ",
         "Counter": "0,1,2,3",
@@ -3877,6 +781,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x10001c0001 ",
         "Counter": "0,1,2,3",
@@ -3889,6 +794,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoops to sibling cores hit in either E/S state and the line is not forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x04001c0001 ",
         "Counter": "0,1,2,3",
@@ -3901,6 +807,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoops sent to sibling cores return clean response. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x02001c0001 ",
         "Counter": "0,1,2,3",
@@ -3913,6 +820,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts demand data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x01001c0001 ",
         "Counter": "0,1,2,3",
@@ -3925,6 +833,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00801c0001 ",
         "Counter": "0,1,2,3",
@@ -3937,270 +846,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00401c0001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0100001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_S.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_S & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000100001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_S.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_S & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400100001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_S.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_S & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200100001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_S.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_S & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100100001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_S.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_S & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080100001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_S.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_S & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040100001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_S.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_S & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0080001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_E.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_E & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000080001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_E.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_E & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400080001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_E.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_E & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200080001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_E.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_E & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100080001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_E.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_E & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080080001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_E.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_E & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040080001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_E.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_E & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc0040001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_M.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_M & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1000040001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_M.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_M & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0400040001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_M.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_M & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0200040001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_M.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_M & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0100040001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_M.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_M & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0080040001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_M.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_M & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040040001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_M.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_M & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fc0020001 ",
         "Counter": "0,1,2,3",
@@ -4213,6 +859,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1000020001 ",
         "Counter": "0,1,2,3",
@@ -4225,6 +872,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0400020001 ",
         "Counter": "0,1,2,3",
@@ -4237,6 +885,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0200020001 ",
         "Counter": "0,1,2,3",
@@ -4249,6 +898,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0100020001 ",
         "Counter": "0,1,2,3",
@@ -4261,6 +911,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0080020001 ",
         "Counter": "0,1,2,3",
@@ -4273,18 +924,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0040020001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.SUPPLIER_NONE.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & SUPPLIER_NONE & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
+        "PublicDescription": "Counts demand data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0000010001 ",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/skylake/floating-point.json b/tools/perf/pmu-events/arch/x86/skylake/floating-point.json
index 3c6b59a..213dd62 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/floating-point.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/floating-point.json
@@ -27,13 +27,12 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "EventCode": "0xC7",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
         "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.  ",
+        "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -55,7 +54,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts cycles with any input and output SSE or x87 FP assist. If an input and output assist are detected on the same cycle the event increments by 1.",
+        "PublicDescription": "Counts cycles with any input and output SSE or x87 FP assist. If an input and output assist are detected on the same cycle the event increments by 1.",
         "EventCode": "0xCA",
         "Counter": "0,1,2,3",
         "UMask": "0x1e",
diff --git a/tools/perf/pmu-events/arch/x86/skylake/frontend.json b/tools/perf/pmu-events/arch/x86/skylake/frontend.json
index e697dbd..578dff5b 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/frontend.json
@@ -1,5 +1,146 @@
 [
     {
+        "PublicDescription": "Counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "IDQ.MITE_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "IDQ.MITE_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may 'bypass' the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "IDQ.DSB_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may 'bypass' the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "IDQ.DSB_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "IDQ.MS_DSB_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of cycles 4 uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x18",
+        "EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of cycles uops were delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Count includes uops that may 'bypass' the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x18",
+        "EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "IDQ.MS_MITE_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of cycles 4 uops were delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. Counting includes uops that may 'bypass' the IDQ. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x24",
+        "EventName": "IDQ.ALL_MITE_CYCLES_4_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles MITE is delivering 4 Uops",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of cycles uops were delivered to the Instruction Decode Queue (IDQ) from the MITE (legacy decode pipeline) path. Counting includes uops that may 'bypass' the IDQ. During these cycles uops are not being delivered from the Decode Stream Buffer (DSB).",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x24",
+        "EventName": "IDQ.ALL_MITE_CYCLES_ANY_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles MITE is delivering any Uop",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EdgeDetect": "1",
+        "EventName": "IDQ.MS_SWITCHES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the total number of uops delivered by the Microcode Sequencer (MS). Any instruction over 4 uops will be delivered by the MS. Some instructions such as transcendentals may additionally generate uops from the MS.",
+        "EventCode": "0x79",
+        "Counter": "0,1,2,3",
+        "UMask": "0x30",
+        "EventName": "IDQ.MS_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles where a code line fetch is stalled due to an L1 instruction cache miss. The legacy decode pipeline works at a 16 Byte granularity.",
         "EventCode": "0x80",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -36,125 +177,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "IDQ.MITE_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from MITE path",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may 'bypass' the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "IDQ.DSB_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may 'bypass' the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "IDQ.MS_MITE_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops initiated by MITE and delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may 'bypass' the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "IDQ.MITE_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from MITE path",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles during which uops are being delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may 'bypass' the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "IDQ.DSB_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when uops are being delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles during which uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while the Microcode Sequencer (MS) is busy. Counting includes uops that may 'bypass' the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "IDQ.MS_DSB_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when uops initiated by Decode Stream Buffer (DSB) are being delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may 'bypass' the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x18",
-        "EventName": "IDQ.ALL_DSB_CYCLES_4_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering 4 Uops",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of cycles  uops were  delivered to Instruction Decode Queue (IDQ) from the Decode Stream Buffer (DSB) path. Counting includes uops that may 'bypass' the IDQ.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x18",
-        "EventName": "IDQ.ALL_DSB_CYCLES_ANY_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles Decode Stream Buffer (DSB) is delivering any Uop",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of cycles 4  uops were  delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x24",
-        "EventName": "IDQ.ALL_MITE_CYCLES_4_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles MITE is delivering 4 Uops",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of cycles  uops were delivered to Instruction Decode Queue (IDQ) from the MITE path. Counting includes uops that may 'bypass' the IDQ. This also means that uops are not being delivered from the Decode Stream Buffer (DSB).",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x24",
-        "EventName": "IDQ.ALL_MITE_CYCLES_ANY_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles MITE is delivering any Uop",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding ?4 ? x? when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when:\n a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread\n\n b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions)\n \n c. Instruction Decode Queue (IDQ) delivers four uops.",
+        "PublicDescription": "Counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding 4  x when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when: a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread. b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions).  c. Instruction Decode Queue (IDQ) delivers four uops.",
         "EventCode": "0x9C",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -164,7 +187,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles when no uops are delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core =4.",
+        "PublicDescription": "Counts, on the per-thread basis, cycles when no uops are delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core =4.",
         "EventCode": "0x9C",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -175,7 +198,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles when less than 1 uop is  delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core >=3.",
+        "PublicDescription": "Counts, on the per-thread basis, cycles when less than 1 uop is delivered to Resource Allocation Table (RAT). IDQ_Uops_Not_Delivered.core >= 3.",
         "EventCode": "0x9C",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -186,6 +209,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles with less than 2 uops delivered by the front-end.",
         "EventCode": "0x9C",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -196,6 +220,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Cycles with less than 3 uops delivered by the front-end.",
         "EventCode": "0x9C",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -217,7 +242,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. \nMM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.\nPenalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 0?2 cycles.",
+        "PublicDescription": "Counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. MM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.Penalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 02 cycles.",
         "EventCode": "0xAB",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -228,6 +253,7 @@
     },
     {
         "PEBS": "1",
+        "PublicDescription": "Counts retired Instructions that experienced DSB (Decode stream buffer i.e. the decoded instruction-cache) miss. \r\n",
         "EventCode": "0xC6",
         "MSRValue": "0x11",
         "Counter": "0,1,2,3",
@@ -235,7 +261,7 @@
         "EventName": "FRONTEND_RETIRED.DSB_MISS",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired Instructions who experienced decode stream buffer (DSB - the decoded instruction-cache) miss.",
+        "BriefDescription": "Retired Instructions who experienced decode stream buffer (DSB - the decoded instruction-cache) miss. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -248,7 +274,7 @@
         "EventName": "FRONTEND_RETIRED.L1I_MISS",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired Instructions who experienced Instruction L1 Cache true miss.",
+        "BriefDescription": "Retired Instructions who experienced Instruction L1 Cache true miss. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -261,12 +287,13 @@
         "EventName": "FRONTEND_RETIRED.L2_MISS",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired Instructions who experienced Instruction L2 Cache true miss.",
+        "BriefDescription": "Retired Instructions who experienced Instruction L2 Cache true miss. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "Counts retired Instructions that experienced iTLB (Instruction TLB) true miss.",
         "EventCode": "0xC6",
         "MSRValue": "0x14",
         "Counter": "0,1,2,3",
@@ -274,12 +301,13 @@
         "EventName": "FRONTEND_RETIRED.ITLB_MISS",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired Instructions who experienced iTLB true miss.",
+        "BriefDescription": "Retired Instructions who experienced iTLB true miss. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "Counts retired Instructions that experienced STLB (2nd level TLB) true miss.",
         "EventCode": "0xC6",
         "MSRValue": "0x15",
         "Counter": "0,1,2,3",
@@ -287,7 +315,7 @@
         "EventName": "FRONTEND_RETIRED.STLB_MISS",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired Instructions who experienced STLB (2nd level TLB) true miss.",
+        "BriefDescription": "Retired Instructions who experienced STLB (2nd level TLB) true miss. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -300,7 +328,7 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_2",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 2 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 2 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -313,7 +341,7 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_2_BUBBLES_GE_2",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 2 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 2 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -326,34 +354,13 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_4",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EdgeDetect": "1",
-        "EventName": "IDQ.MS_SWITCHES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of switches from DSB (Decode Stream Buffer) or MITE (legacy decode pipeline) to the Microcode Sequencer",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the total number of uops delivered to Instruction Decode Queue (IDQ) while the Microcode Sequenser (MS) is busy. Counting includes uops that may 'bypass' the IDQ. Uops maybe initiated by Decode Stream Buffer (DSB) or MITE.",
-        "EventCode": "0x79",
-        "Counter": "0,1,2,3",
-        "UMask": "0x30",
-        "EventName": "IDQ.MS_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops delivered to Instruction Decode Queue (IDQ) while Microcode Sequenser (MS) is busy",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PEBS": "1",
+        "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 8 cycles. During this period the front-end delivered no uops. \r\n",
         "EventCode": "0xC6",
         "MSRValue": "0x400806",
         "Counter": "0,1,2,3",
@@ -367,6 +374,7 @@
     },
     {
         "PEBS": "1",
+        "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 16 cycles. During this period the front-end delivered no uops.\r\n",
         "EventCode": "0xC6",
         "MSRValue": "0x401006",
         "Counter": "0,1,2,3",
@@ -374,12 +382,13 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_16",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "Counts retired instructions that are delivered to the back-end  after a front-end stall of at least 32 cycles. During this period the front-end delivered no uops.\r\n",
         "EventCode": "0xC6",
         "MSRValue": "0x402006",
         "Counter": "0,1,2,3",
@@ -387,7 +396,7 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_32",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -400,7 +409,7 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_64",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -413,7 +422,7 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_128",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -426,7 +435,7 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_256",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -439,12 +448,13 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_512",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "PEBS": "1",
+        "PublicDescription": "Counts retired instructions that are delivered to the back-end after the front-end had at least 1 bubble-slot for a period of 2 cycles. A bubble-slot is an empty issue-pipeline slot while there was no RAT stall.\r\n",
         "EventCode": "0xC6",
         "MSRValue": "0x100206",
         "Counter": "0,1,2,3",
@@ -452,7 +462,7 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_2_BUBBLES_GE_1",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     },
@@ -465,7 +475,7 @@
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_2_BUBBLES_GE_3",
         "MSRIndex": "0x3F7",
         "SampleAfterValue": "100007",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 3 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 3 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall. Precise Event.",
         "TakenAlone": "1",
         "CounterHTOff": "0,1,2,3"
     }
diff --git a/tools/perf/pmu-events/arch/x86/skylake/memory.json b/tools/perf/pmu-events/arch/x86/skylake/memory.json
index d7fd5b0..3bd8b71 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/memory.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/memory.json
@@ -1,6 +1,74 @@
 [
     {
-        "PublicDescription": "Unfriendly TSX abort triggered by  a flowmarker.",
+        "PublicDescription": "Number of times a TSX line had a cache conflict.",
+        "EventCode": "0x54",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "TX_MEM.ABORT_CONFLICT",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x54",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "TX_MEM.ABORT_CAPACITY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of times a transactional abort was signaled due to a data capacity limitation for transactional reads or writes.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of times a TSX Abort was triggered due to a non-release/commit store to lock.",
+        "EventCode": "0x54",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "TX_MEM.ABORT_HLE_STORE_TO_ELIDED_LOCK",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of times a HLE transactional region aborted due to a non XRELEASE prefixed instruction writing to an elided lock in the elision buffer",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of times a TSX Abort was triggered due to commit but Lock Buffer not empty.",
+        "EventCode": "0x54",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_NOT_EMPTY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of times an HLE transactional execution aborted due to NoAllocatedElisionBuffer being non-zero.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of times a TSX Abort was triggered due to release/commit but data and address mismatch.",
+        "EventCode": "0x54",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_MISMATCH",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of times an HLE transactional execution aborted due to XRELEASE lock not satisfying the address and value requirements in the elision buffer",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of times a TSX Abort was triggered due to attempting an unsupported alignment from Lock Buffer.",
+        "EventCode": "0x54",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of times an HLE transactional execution aborted due to an unsupported read alignment from the elision buffer.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of times we could not allocate Lock Buffer.",
+        "EventCode": "0x54",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "TX_MEM.HLE_ELISION_BUFFER_FULL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of times HLE lock could not be elided due to ElisionBufferAvailable being zero.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x5d",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -10,7 +78,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Unfriendly TSX abort triggered by  a vzeroupper instruction.",
+        "PublicDescription": "Unfriendly TSX abort triggered by a vzeroupper instruction.",
         "EventCode": "0x5d",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -50,7 +118,77 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Number of times we entered an HLE region\n does not count nested transactions.",
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.L3_MISS_DEMAND_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts number of Offcore outstanding Demand Data Read requests that miss L3 cache in the superQ every cycle.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_L3_MISS_DEMAND_DATA_RD",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with at least 1 Demand Data Read requests who miss L3 cache in the superQ.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x60",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.L3_MISS_DEMAND_DATA_RD_GE_6",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with at least 6 Demand Data Read requests that miss L3 cache in the superQ.",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L3_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while L3 cache miss demand load is outstanding.",
+        "CounterMask": "2",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x6",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L3_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L3 cache miss demand load is outstanding.",
+        "CounterMask": "6",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Demand Data Read requests who miss L3 cache.",
+        "EventCode": "0xB0",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand Data Read requests who miss L3 cache",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of memory ordering Machine Clears detected. Memory Ordering Machine Clears can result from one of the following:a. memory disambiguation,b. external snoop, orc. cross SMT-HW-thread snoop (stores) hitting load buffer.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "Errata": "SKL089",
+        "EventName": "MACHINE_CLEARS.MEMORY_ORDERING",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts the number of machine clears due to memory order conflicts.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of times we entered an HLE region. Does not count nested transactions.",
         "EventCode": "0xC8",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -71,7 +209,7 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Number of times HLE abort was triggered.",
+        "PublicDescription": "Number of times HLE abort was triggered. (PEBS)",
         "EventCode": "0xC8",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -99,13 +237,12 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.).",
         "EventCode": "0xC8",
         "Counter": "0,1,2,3",
         "UMask": "0x20",
         "EventName": "HLE_RETIRED.ABORTED_UNFRIENDLY",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.). ",
+        "BriefDescription": "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.).",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -128,7 +265,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Number of times we entered an RTM region\n does not count nested transactions.",
+        "PublicDescription": "Number of times we entered an RTM region. Does not count nested transactions.",
         "EventCode": "0xC9",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -149,7 +286,7 @@
     },
     {
         "PEBS": "1",
-        "PublicDescription": "Number of times RTM abort was triggered.",
+        "PublicDescription": "Number of times RTM abort was triggered. (PEBS)",
         "EventCode": "0xC9",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -208,17 +345,6 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of memory ordering Machine Clears detected. Memory Ordering Machine Clears can result from one of the following:\n1. memory disambiguation,\n2. external snoop, or\n3. cross SMT-HW-thread snoop (stores) hitting load buffer.",
-        "EventCode": "0xC3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "Errata": "SKL089",
-        "EventName": "MACHINE_CLEARS.MEMORY_ORDERING",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Counts the number of machine clears due to memory order conflicts.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
         "PEBS": "2",
         "PublicDescription": "Counts loads when the latency from first dispatch to completion is greater than 4 cycles.  Reported latency may be longer than just the memory latency.",
         "EventCode": "0xCD",
@@ -331,1718 +457,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "PublicDescription": "Number of times a TSX line had a cache conflict.",
-        "EventCode": "0x54",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "TX_MEM.ABORT_CONFLICT",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x54",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "TX_MEM.ABORT_CAPACITY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times a transactional abort was signaled due to a data capacity limitation for transactional reads or writes.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of times a TSX Abort was triggered due to a non-release/commit store to lock.",
-        "EventCode": "0x54",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "TX_MEM.ABORT_HLE_STORE_TO_ELIDED_LOCK",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times a HLE transactional region aborted due to a non XRELEASE prefixed instruction writing to an elided lock in the elision buffer",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of times a TSX Abort was triggered due to commit but Lock Buffer not empty.",
-        "EventCode": "0x54",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_NOT_EMPTY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times an HLE transactional execution aborted due to NoAllocatedElisionBuffer being non-zero.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of times a TSX Abort was triggered due to release/commit but data and address mismatch.",
-        "EventCode": "0x54",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_MISMATCH",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times an HLE transactional execution aborted due to XRELEASE lock not satisfying the address and value requirements in the elision buffer",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of times a TSX Abort was triggered due to attempting an unsupported alignment from Lock Buffer.",
-        "EventCode": "0x54",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "TX_MEM.ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times an HLE transactional execution aborted due to an unsupported read alignment from the elision buffer.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of times we could not allocate Lock Buffer.",
-        "EventCode": "0x54",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "TX_MEM.HLE_ELISION_BUFFER_FULL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of times HLE lock could not be elided due to ElisionBufferAvailable being zero.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Demand Data Read requests who miss L3 cache.",
-        "EventCode": "0xB0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "OFFCORE_REQUESTS.L3_MISS_DEMAND_DATA_RD",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand Data Read requests who miss L3 cache",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.L3_MISS_DEMAND_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts number of Offcore outstanding Demand Data Read requests that miss L3 cache in the superQ every cycle.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L3_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L3 cache miss demand load is outstanding.",
-        "CounterMask": "2",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x6",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L3_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L3 cache miss demand load is outstanding.",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.CYCLES_WITH_L3_MISS_DEMAND_DATA_RD",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with at least 1 Demand Data Read requests who miss L3 cache in the superQ.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x60",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "OFFCORE_REQUESTS_OUTSTANDING.L3_MISS_DEMAND_DATA_RD_GE_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with at least 6 Demand Data Read requests that miss L3 cache in the superQ.",
-        "CounterMask": "6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3ffc008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x203c008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x103c008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x043c008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x023c008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x013c008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00bc008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x007c008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc4008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS_LOCAL_DRAM & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2004008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS_LOCAL_DRAM & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1004008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS_LOCAL_DRAM & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0404008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS_LOCAL_DRAM & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0204008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS_LOCAL_DRAM & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0104008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS_LOCAL_DRAM & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0084008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS_LOCAL_DRAM & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0044008000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_MISS_LOCAL_DRAM & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000408000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L4_HIT_LOCAL_L4.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L4_HIT_LOCAL_L4 & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x20001c8000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000108000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_S.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_S & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000088000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_E.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_E & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000048000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT_M.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & L3_HIT_M & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000028000 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.OTHER.SUPPLIER_NONE.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "OTHER & SUPPLIER_NONE & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3ffc000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x203c000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x103c000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x043c000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x023c000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x013c000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00bc000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x007c000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc4000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS_LOCAL_DRAM.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS_LOCAL_DRAM & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2004000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS_LOCAL_DRAM.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS_LOCAL_DRAM & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1004000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS_LOCAL_DRAM.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS_LOCAL_DRAM & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0404000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS_LOCAL_DRAM.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS_LOCAL_DRAM & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0204000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS_LOCAL_DRAM.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS_LOCAL_DRAM & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0104000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS_LOCAL_DRAM.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS_LOCAL_DRAM & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0084000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS_LOCAL_DRAM.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS_LOCAL_DRAM & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0044000800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_MISS_LOCAL_DRAM.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_MISS_LOCAL_DRAM & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000400800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L4_HIT_LOCAL_L4.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L4_HIT_LOCAL_L4 & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x20001c0800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000100800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_S.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_S & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000080800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_E.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_E & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000040800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.L3_HIT_M.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & L3_HIT_M & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000020800 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.STREAMING_STORES.SUPPLIER_NONE.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "STREAMING_STORES & SUPPLIER_NONE & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3ffc000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x203c000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x103c000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x043c000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x023c000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x013c000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00bc000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x007c000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc4000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS_LOCAL_DRAM & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2004000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS_LOCAL_DRAM & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1004000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS_LOCAL_DRAM & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0404000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS_LOCAL_DRAM & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0204000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS_LOCAL_DRAM & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0104000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS_LOCAL_DRAM & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0084000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS_LOCAL_DRAM & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0044000100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_MISS_LOCAL_DRAM & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000400100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L4_HIT_LOCAL_L4.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L4_HIT_LOCAL_L4 & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x20001c0100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000100100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_S.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_S & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000080100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_E.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_E & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000040100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT_M.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & L3_HIT_M & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000020100 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.SUPPLIER_NONE.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_RFO & SUPPLIER_NONE & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3ffc000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x203c000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x103c000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x043c000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x023c000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x013c000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00bc000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x007c000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc4000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS_LOCAL_DRAM & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2004000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS_LOCAL_DRAM & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1004000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS_LOCAL_DRAM & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0404000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS_LOCAL_DRAM & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0204000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS_LOCAL_DRAM & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0104000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS_LOCAL_DRAM & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0084000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS_LOCAL_DRAM & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0044000080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_MISS_LOCAL_DRAM & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000400080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L4_HIT_LOCAL_L4.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L4_HIT_LOCAL_L4 & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x20001c0080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000100080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_S.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_S & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000080080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_E.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_E & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000040080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT_M.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & L3_HIT_M & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000020080 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.SUPPLIER_NONE.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "PF_L3_DATA_RD & SUPPLIER_NONE & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3ffc000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x203c000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x103c000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x043c000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x023c000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x013c000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00bc000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x007c000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc4000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS_LOCAL_DRAM & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2004000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS_LOCAL_DRAM & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1004000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS_LOCAL_DRAM & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0404000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS_LOCAL_DRAM & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0204000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS_LOCAL_DRAM & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0104000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS_LOCAL_DRAM & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0084000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS_LOCAL_DRAM & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0044000004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_MISS_LOCAL_DRAM & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000400004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L4_HIT_LOCAL_L4.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L4_HIT_LOCAL_L4 & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x20001c0004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000100004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_S.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_S & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000080004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_E.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_E & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000040004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT_M.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & L3_HIT_M & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000020004 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.SUPPLIER_NONE.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_CODE_RD & SUPPLIER_NONE & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3ffc000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x203c000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x103c000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x043c000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x023c000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x013c000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x00bc000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x007c000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x3fc4000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.ANY_SNOOP",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS_LOCAL_DRAM & ANY_SNOOP",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2004000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS_LOCAL_DRAM & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x1004000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SNOOP_HITM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS_LOCAL_DRAM & SNOOP_HITM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0404000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SNOOP_HIT_NO_FWD",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS_LOCAL_DRAM & SNOOP_HIT_NO_FWD",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0204000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS_LOCAL_DRAM & SNOOP_MISS",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0104000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SNOOP_NOT_NEEDED",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS_LOCAL_DRAM & SNOOP_NOT_NEEDED",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0084000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SNOOP_NONE",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS_LOCAL_DRAM & SNOOP_NONE",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0044000002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_MISS_LOCAL_DRAM & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000400002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L4_HIT_LOCAL_L4.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L4_HIT_LOCAL_L4 & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x20001c0002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000100002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_S.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_S & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000080002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_E.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_E & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000040002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT_M.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & L3_HIT_M & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000020002 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.SUPPLIER_NONE.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_RFO & SUPPLIER_NONE & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3ffc000001 ",
         "Counter": "0,1,2,3",
@@ -2055,18 +470,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x203c000001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_MISS & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x103c000001 ",
         "Counter": "0,1,2,3",
@@ -2079,6 +483,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x043c000001 ",
         "Counter": "0,1,2,3",
@@ -2091,6 +496,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x023c000001 ",
         "Counter": "0,1,2,3",
@@ -2103,6 +509,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x013c000001 ",
         "Counter": "0,1,2,3",
@@ -2115,6 +522,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x00bc000001 ",
         "Counter": "0,1,2,3",
@@ -2127,18 +535,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x007c000001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_MISS & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x3fc4000001 ",
         "Counter": "0,1,2,3",
@@ -2151,18 +548,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2004000001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_MISS_LOCAL_DRAM & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x1004000001 ",
         "Counter": "0,1,2,3",
@@ -2175,6 +561,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0404000001 ",
         "Counter": "0,1,2,3",
@@ -2187,6 +574,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0204000001 ",
         "Counter": "0,1,2,3",
@@ -2199,6 +587,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0104000001 ",
         "Counter": "0,1,2,3",
@@ -2211,6 +600,7 @@
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "EventCode": "0xB7, 0xBB",
         "MSRValue": "0x0084000001 ",
         "Counter": "0,1,2,3",
@@ -2221,89 +611,5 @@
         "BriefDescription": "DEMAND_DATA_RD & L3_MISS_LOCAL_DRAM & SNOOP_NONE",
         "Offcore": "1",
         "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x0044000001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS_LOCAL_DRAM.SPL_HIT",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_MISS_LOCAL_DRAM & SPL_HIT",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000400001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L4_HIT_LOCAL_L4.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L4_HIT_LOCAL_L4 & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x20001c0001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000100001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_S.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_S & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000080001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_E.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_E & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000040001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT_M.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & L3_HIT_M & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0xB7, 0xBB",
-        "MSRValue": "0x2000020001 ",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.SUPPLIER_NONE.SNOOP_NON_DRAM",
-        "MSRIndex": "0x1a6,0x1a7",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "DEMAND_DATA_RD & SUPPLIER_NONE & SNOOP_NON_DRAM",
-        "Offcore": "1",
-        "CounterHTOff": "0,1,2,3"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/skylake/other.json b/tools/perf/pmu-events/arch/x86/skylake/other.json
index cfdc323..84a316d 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/other.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/other.json
@@ -1,11 +1,47 @@
 [
     {
-        "PublicDescription": "This event counts the number of hardware interruptions received by the processor.",
+        "EventCode": "0x32",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "SW_PREFETCH_ACCESS.NTA",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of PREFETCHNTA instructions executed.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x32",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "SW_PREFETCH_ACCESS.T0",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of PREFETCHT0 instructions executed.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x32",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "SW_PREFETCH_ACCESS.T1_T2",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of PREFETCHT1 or PREFETCHT2 instructions executed.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x32",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "SW_PREFETCH_ACCESS.PREFETCHW",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of PREFETCHW instructions executed.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of hardware interruptions received by the processor.",
         "EventCode": "0xCB",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
         "EventName": "HW_INTERRUPTS.RECEIVED",
-        "SampleAfterValue": "100003",
+        "SampleAfterValue": "203",
         "BriefDescription": "Number of hardware interrupts received by the processor.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
diff --git a/tools/perf/pmu-events/arch/x86/skylake/pipeline.json b/tools/perf/pmu-events/arch/x86/skylake/pipeline.json
index 0f7adb8..bc6d2af 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/pipeline.json
@@ -1,74 +1,76 @@
 [
     {
-        "PublicDescription": "This event counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. \nNotes: INST_RETIRED.ANY is counted by a designated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. INST_RETIRED.ANY_P is counted by a programmable counter and it is an architectural performance event. \nCounting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.",
+        "PublicDescription": "Counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, Counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. Notes: INST_RETIRED.ANY is counted by a designated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. INST_RETIRED.ANY_P is counted by a programmable counter and it is an architectural performance event. Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "UMask": "0x1",
         "EventName": "INST_RETIRED.ANY",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Instructions retired from execution.",
-        "CounterHTOff": "Fixed counter 1"
+        "CounterHTOff": "Fixed counter 0"
     },
     {
-        "PublicDescription": "This event counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
+        "PublicDescription": "Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
+        "Counter": "Fixed counter 1",
         "UMask": "0x2",
         "EventName": "CPU_CLK_UNHALTED.THREAD",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Core cycles when the thread is not in halt state",
-        "CounterHTOff": "Fixed counter 2"
+        "CounterHTOff": "Fixed counter 1"
     },
     {
-        "PublicDescription": "This event counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. Note: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'.  The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'.  After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case.",
         "EventCode": "0x00",
-        "Counter": "Fixed counter 3",
+        "Counter": "Fixed counter 1",
+        "UMask": "0x2",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "CounterHTOff": "Fixed counter 1"
+    },
+    {
+        "PublicDescription": "Counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. Note: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'.  The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'.  After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case.",
+        "EventCode": "0x00",
+        "Counter": "Fixed counter 2",
         "UMask": "0x3",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "SampleAfterValue": "2000003",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
-        "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
-        "EventCode": "0x3C",
+        "PublicDescription": "Counts how many times the load operation got the true Block-on-Store blocking code preventing store forwarding. This includes cases when:a. preceding store conflicts with the load (incomplete overlap),b. store forwarding is impossible due to u-arch limitations,c. preceding lock RMW operations are not forwarded,d. store has the no-forward bit set (uncacheable/page-split/masked stores),e. all-blocking stores are used (mostly, fences and port I/O), and others.The most common case is a load blocked due to its address range overlapping with a preceding smaller uncompleted store. Note: This event does not take into account cases of out-of-SW-control (for example, SbTailHit), unknown physical STA, and cases of blocking loads on store due to being non-WB memory type or a lock. These cases are covered by other events. See the table of not supported store forwards in the Optimization Guide.",
+        "EventCode": "0x03",
         "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Thread cycles when thread is not in halt state",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xE6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "BACLEARS.ANY",
+        "UMask": "0x2",
+        "EventName": "LD_BLOCKS.STORE_FORWARD",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
+        "BriefDescription": "Loads blocked by overlapping with store buffer that cannot be forwarded .",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0xA8",
+        "PublicDescription": "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.",
+        "EventCode": "0x03",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "LD_BLOCKS.NO_SR",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts false dependencies in MOB when the partial comparison upon loose net check and dependency was resolved by the Enhanced Loose net mechanism. This may not result in high performance penalties. Loose net checks can fail when loads and stores are 4k aliased.",
+        "EventCode": "0x07",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
-        "EventName": "LSD.UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of Uops delivered by the LSD.",
+        "EventName": "LD_BLOCKS_PARTIAL.ADDRESS_ALIAS",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "False dependencies in MOB due to partial compare on address.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts stalls occured due to changing prefix length (66, 67 or REX.W when they change the length of the decoded instruction). Occurrences counting is proportional to the number of prefixes in a 16B-line. This may result in the following penalties: three-cycle penalty for each LCP in a 16-byte chunk.",
-        "EventCode": "0x87",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "ILD_STALL.LCP",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Cycles checkpoints in Resource Allocation Table (RAT) are recovering from JEClear or machine clear.",
+        "PublicDescription": "Core cycles the Resource allocator was stalled due to recovery from an earlier branch misprediction or machine clear event.",
         "EventCode": "0x0D",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -80,6 +82,16 @@
     {
         "EventCode": "0x0D",
         "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x0D",
+        "Counter": "0,1,2,3",
         "UMask": "0x80",
         "EventName": "INT_MISC.CLEAR_RESTEER_CYCLES",
         "SampleAfterValue": "2000003",
@@ -87,27 +99,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts resource-related stall cycles. Reasons for stalls can be as follows:\n - *any* u-arch structure got full (LB, SB, RS, ROB, BOB, LM, Physical Register Reclaim Table (PRRT), or Physical History Table (PHT) slots)\n - *any* u-arch structure got empty (like INT/SIMD FreeLists)\n - FPU control word (FPCW), MXCSR\nand others. This counts cycles that the pipeline backend blocked uop delivery from the front end.",
-        "EventCode": "0xA2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "RESOURCE_STALLS.ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Resource-related stall cycles",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts stall cycles caused by the store buffer (SB) overflow (excluding draining from synch). This counts cycles that the pipeline backend blocked uop delivery from the front end.",
-        "EventCode": "0xA2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "RESOURCE_STALLS.SB",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles stalled due to no store buffers available. (not including draining form sync).",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of Uops issued by the Resource Allocation Table (RAT) to the reservation station (RS).",
+        "PublicDescription": "Counts the number of uops that the Resource Allocation Table (RAT) issues to the Reservation Station (RS).",
         "EventCode": "0x0E",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -117,6 +109,28 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.",
+        "EventCode": "0x0E",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_ISSUED.STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of Blend Uops issued by the Resource Allocation Table (RAT) to the reservation station (RS) in order to preserve upper bits of vector registers. Starting with the Skylake microarchitecture, these Blend uops are needed since every Intel SSE instruction executed in Dirty Upper State needs to preserve bits 128-255 of the destination register. For more information, refer to Mixing Intel AVX and Intel SSE Code section of the Optimization Guide.",
+        "EventCode": "0x0E",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_ISSUED.VECTOR_WIDTH_MISMATCH",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Uops inserted at issue-stage in order to preserve upper bits of vector registers.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0x0E",
         "Counter": "0,1,2,3",
         "UMask": "0x20",
@@ -126,19 +140,115 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts cycles during which the Resource Allocation Table (RAT) does not issue any Uops to the reservation station (RS) for the current thread.",
-        "EventCode": "0x0E",
-        "Invert": "1",
+        "EventCode": "0x14",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
-        "EventName": "UOPS_ISSUED.STALL_CYCLES",
+        "EventName": "ARITH.DIVIDER_ACTIVE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when Resource Allocation Table (RAT) does not issue Uops to Reservation Station (RS) for the thread",
+        "BriefDescription": "Cycles when divide unit is busy executing divide or square root operations. Accounts for integer and floating-point operations.",
         "CounterMask": "1",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts cycles during which the reservation station (RS) is empty for the thread.\nNote: In ST-mode, not active thread should drive 0. This is usually caused by severely costly branch mispredictions, or allocator/FE issues.",
+        "PublicDescription": "This is an architectural event that counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. For this reason, this event may have a changing ratio with regards to wall clock time.",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Thread cycles when thread is not in halt state",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts when the Current Privilege Level (CPL) transitions from ring 1, 2 or 3 to ring 0 (Kernel).",
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EdgeDetect": "1",
+        "EventName": "CPU_CLK_UNHALTED.RING0_TRANS",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts when there is a transition from ring 1, 2 or 3 to ring 0.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK",
+        "SampleAfterValue": "2503",
+        "BriefDescription": "Core crystal clock cycles when the thread is unhalted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2503",
+        "BriefDescription": "Core crystal clock cycles when at least one thread on the physical core is unhalted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
+        "SampleAfterValue": "2503",
+        "BriefDescription": "Core crystal clock cycles when the thread is unhalted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "AnyThread": "1",
+        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
+        "SampleAfterValue": "2503",
+        "BriefDescription": "Core crystal clock cycles when at least one thread on the physical core is unhalted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_THREAD_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Core crystal clock cycles when this thread is unhalted and the other thread is halted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x3C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
+        "SampleAfterValue": "2503",
+        "BriefDescription": "Core crystal clock cycles when this thread is unhalted and the other thread is halted.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts all not software-prefetch load dispatches that hit the fill buffer (FB) allocated for the software prefetch. It can also be incremented by some lock instructions. So it should only be used with profiling so that the locks can be excluded by ASM (Assembly File) inspection of the nearby instructions.",
+        "EventCode": "0x4C",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LOAD_HIT_PRE.SW_PF",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles during which the reservation station (RS) is empty for the thread.; Note: In ST-mode, not active thread should drive 0. This is usually caused by severely costly branch mispredictions, or allocator/FE issues.",
         "EventCode": "0x5E",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -148,6 +258,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts end of periods where the Reservation Station (RS) was empty. Could be useful to precisely locate front-end Latency Bound issues.",
         "EventCode": "0x5E",
         "Invert": "1",
         "Counter": "0,1,2,3",
@@ -160,242 +271,277 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Increments when an entry is added to the Last Branch Record (LBR) array (or removed from the array in case of RETURNs in call stack mode). The event requires LBR enable via IA32_DEBUGCTL MSR and branch type selection via MSR_LBR_SELECT.",
-        "EventCode": "0xCC",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Increments whenever there is an update to the LBR array.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of machine clears (nukes) of any type.",
-        "EventCode": "0xC3",
+        "PublicDescription": "Counts cycles that the Instruction Length decoder (ILD) stalls occurred due to dynamically changing prefix length of the decoded instruction (by operand size prefix instruction 0x66, address size prefix instruction 0x67 or REX.W for Intel64). Count is proportional to the number of prefixes in a 16B-line. This may result in a three-cycle penalty for each LCP (Length changing prefix) in a 16-byte chunk.",
+        "EventCode": "0x87",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
-        "EdgeDetect": "1",
-        "EventName": "MACHINE_CLEARS.COUNT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of machine clears (nukes) of any type. ",
-        "CounterMask": "1",
+        "EventName": "ILD_STALL.LCP",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Stalls caused by changing prefix length of the instruction.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts self-modifying code (SMC) detected, which causes a machine clear.",
-        "EventCode": "0xC3",
+        "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 0.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_0",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 0",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 1.",
+        "EventCode": "0xA1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 2.",
+        "EventCode": "0xA1",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
-        "EventName": "MACHINE_CLEARS.SMC",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Self-modifying code (SMC) detected.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).",
-        "EventCode": "0xC0",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "Errata": "SKL091, SKL044",
-        "EventName": "INST_RETIRED.ANY_P",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_2",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of instructions retired. General Counter - architectural event",
+        "BriefDescription": "Cycles per thread when uops are executed in port 2",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PEBS": "2",
-        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts instructions retired.",
-        "EventCode": "0xC0",
-        "Counter": "1",
-        "UMask": "0x1",
-        "Errata": "SKL091, SKL044",
-        "EventName": "INST_RETIRED.PREC_DIST",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
-        "CounterHTOff": "1"
-    },
-    {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts the number of retirement slots used.",
-        "EventCode": "0xC2",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Retirement slots used.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts cycles without actually retired uops.",
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.STALL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles without actually retired uops.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.",
-        "EventCode": "0xC2",
-        "Invert": "1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles with less than 10 actually retired uops.",
-        "CounterMask": "10",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts conditional branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "Errata": "SKL091",
-        "EventName": "BR_INST_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Conditional branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts both direct and indirect near call instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "Errata": "SKL091",
-        "EventName": "BR_INST_RETIRED.NEAR_CALL",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Direct and indirect near call instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts all (macro) branch instructions retired.",
-        "EventCode": "0xC4",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "Errata": "SKL091",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts return instructions retired.",
-        "EventCode": "0xC4",
+        "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 3.",
+        "EventCode": "0xA1",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
-        "Errata": "SKL091",
-        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Return instructions retired.",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 3",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts not taken branch instructions retired.",
-        "EventCode": "0xC4",
+        "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 4.",
+        "EventCode": "0xA1",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
-        "Errata": "SKL091",
-        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Not taken branch instructions retired.",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 4",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts taken branch instructions retired.",
-        "EventCode": "0xC4",
+        "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 5.",
+        "EventCode": "0xA1",
         "Counter": "0,1,2,3",
         "UMask": "0x20",
-        "Errata": "SKL091",
-        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Taken branch instructions retired.",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 5",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts far branch instructions retired.",
-        "EventCode": "0xC4",
+        "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 6.",
+        "EventCode": "0xA1",
         "Counter": "0,1,2,3",
         "UMask": "0x40",
-        "Errata": "SKL091",
-        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Far branch instructions retired.",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_6",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 6",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PEBS": "2",
-        "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.",
-        "EventCode": "0xC4",
+        "PublicDescription": "Counts, on the per-thread basis, cycles during which at least one uop is dispatched from the Reservation Station (RS) to port 7.",
+        "EventCode": "0xA1",
         "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "Errata": "SKL091",
-        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All (macro) branch instructions retired. ",
-        "CounterHTOff": "0,1,2,3"
+        "UMask": "0x80",
+        "EventName": "UOPS_DISPATCHED_PORT.PORT_7",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles per thread when uops are executed in port 7",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PEBS": "1",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted conditional branch instructions retired.",
-        "EventCode": "0xC5",
+        "PublicDescription": "Counts resource-related stall cycles. Reasons for stalls can be as follows:a. *any* u-arch structure got full (LB, SB, RS, ROB, BOB, LM, Physical Register Reclaim Table (PRRT), or Physical History Table (PHT) slots).b. *any* u-arch structure got empty (like INT/SIMD FreeLists).c. FPU control word (FPCW), MXCSR.and others. This counts cycles that the pipeline back-end blocked uop delivery from the front-end.",
+        "EventCode": "0xA2",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
-        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted conditional branch instructions retired.",
+        "EventName": "RESOURCE_STALLS.ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Resource-related stall cycles",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PEBS": "1",
-        "PublicDescription": "This event counts both taken and not taken retired mispredicted direct and indirect near calls, including both register and memory indirect.",
-        "EventCode": "0xC5",
+        "PublicDescription": "Counts allocation stall cycles caused by the store buffer (SB) being full. This counts cycles that the pipeline back-end blocked uop delivery from the front-end.",
+        "EventCode": "0xA2",
         "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "BR_MISP_RETIRED.NEAR_CALL",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted direct and indirect near call instructions retired.",
+        "UMask": "0x8",
+        "EventName": "RESOURCE_STALLS.SB",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles stalled due to no store buffers available. (not including draining form sync).",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts all mispredicted macro branch instructions retired.",
-        "EventCode": "0xC5",
+        "EventCode": "0xA3",
         "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "UMask": "0x1",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.",
+        "CounterMask": "1",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PEBS": "1",
-        "EventCode": "0xC5",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Number of near branch instructions retired that were mispredicted and taken.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.",
-        "EventCode": "0xC5",
+        "EventCode": "0xA3",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
-        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
-        "SampleAfterValue": "400009",
-        "BriefDescription": "Mispredicted macro branch instructions retired. ",
+        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Total execution stalls.",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x5",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
+        "CounterMask": "5",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
+        "CounterMask": "8",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0xc",
+        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
+        "CounterMask": "12",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
+        "CounterMask": "16",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x14",
+        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
+        "CounterMask": "20",
         "CounterHTOff": "0,1,2,3"
     },
     {
+        "PublicDescription": "Counts cycles during which no uops were executed on all ports and Reservation Station (RS) was not empty.",
+        "EventCode": "0xA6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "EXE_ACTIVITY.EXE_BOUND_0_PORTS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles where no uops were executed, the Reservation Station was not empty, the Store Buffer was full and there was no outstanding load.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles during which a total of 1 uop was executed on all ports and Reservation Station (RS) was not empty.",
+        "EventCode": "0xA6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "EXE_ACTIVITY.1_PORTS_UTIL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles total of 1 uop is executed on all ports and Reservation Station was not empty.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles during which a total of 2 uops were executed on all ports and Reservation Station (RS) was not empty.",
+        "EventCode": "0xA6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "EXE_ACTIVITY.2_PORTS_UTIL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles total of 2 uops are executed on all ports and Reservation Station was not empty.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles total of 3 uops are executed on all ports and Reservation Station (RS) was not empty.",
+        "EventCode": "0xA6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "EXE_ACTIVITY.3_PORTS_UTIL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles total of 3 uops are executed on all ports and Reservation Station was not empty.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles total of 4 uops are executed on all ports and Reservation Station (RS) was not empty.",
+        "EventCode": "0xA6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "EXE_ACTIVITY.4_PORTS_UTIL",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles total of 4 uops are executed on all ports and Reservation Station was not empty.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xA6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "EventName": "EXE_ACTIVITY.BOUND_ON_STORES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles where the Store Buffer was full and no outstanding load.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Number of uops delivered to the back-end by the LSD(Loop Stream Detector).",
+        "EventCode": "0xA8",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LSD.UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of Uops delivered by the LSD.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the cycles when at least one uop is delivered by the LSD (Loop-stream detector).",
+        "EventCode": "0xA8",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LSD.CYCLES_ACTIVE",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the cycles when 4 uops are delivered by the LSD (Loop-stream detector).",
+        "EventCode": "0xA8",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "LSD.CYCLES_4_UOPS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
+        "CounterMask": "4",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "PublicDescription": "Number of uops to be executed per-thread each cycle.",
         "EventCode": "0xB1",
         "Counter": "0,1,2,3",
@@ -406,26 +552,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "Number of uops executed from any thread.",
-        "EventCode": "0xB1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_EXECUTED.CORE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of uops executed on the core.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xB1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "UOPS_EXECUTED.X87",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts the number of x87 uops dispatched.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts cycles during which no uops were dispatched from the Reservation Station (RS) per thread.",
+        "PublicDescription": "Counts cycles during which no uops were dispatched from the Reservation Station (RS) per thread.",
         "EventCode": "0xB1",
         "Invert": "1",
         "Counter": "0,1,2,3",
@@ -481,368 +608,13 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0xA6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "EXE_ACTIVITY.EXE_BOUND_0_PORTS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles where no uops were executed, the Reservation Station was not empty, the Store Buffer was full and there was no outstanding load.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA6",
+        "PublicDescription": "Number of uops executed from any thread.",
+        "EventCode": "0xB1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "EXE_ACTIVITY.1_PORTS_UTIL",
+        "EventName": "UOPS_EXECUTED.CORE",
         "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles total of 1 uop is executed on all ports and Reservation Station was not empty.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "EXE_ACTIVITY.2_PORTS_UTIL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles total of 2 uops are executed on all ports and Reservation Station was not empty.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "EXE_ACTIVITY.3_PORTS_UTIL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles total of 3 uops are executed on all ports and Reservation Station was not empty.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "EXE_ACTIVITY.4_PORTS_UTIL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles total of 4 uops are executed on all ports and Reservation Station was not empty.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA6",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "EXE_ACTIVITY.BOUND_ON_STORES",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles where the Store Buffer was full and no outstanding load.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 0.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_0",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 0",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 1.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_1",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 2.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_2",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 2",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 3.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_3",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 3",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 4.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_4",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 5.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_5",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 5",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 6.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x40",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_6",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 6",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts, on the per-thread basis, cycles during which uops are dispatched from the Reservation Station (RS) to port 7.",
-        "EventCode": "0xA1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x80",
-        "EventName": "UOPS_DISPATCHED_PORT.PORT_7",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles per thread when uops are executed in port 7",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "CYCLE_ACTIVITY.STALLS_TOTAL",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Total execution stalls.",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L1D_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L1 cache miss demand load is outstanding.",
-        "CounterMask": "8",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0xc",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L1D_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L1 cache miss demand load is outstanding.",
-        "CounterMask": "12",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts all not software-prefetch load dispatches that hit the fill buffer (FB) allocated for the software prefetch. It can also be incremented by some lock instructions. So it should only be used with profiling so that the locks can be excluded by asm inspection of the nearby instructions.",
-        "EventCode": "0x4C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LOAD_HIT_PRE.SW_PF",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Demand load dispatches that hit L1D fill buffer (FB) allocated for software prefetch.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts how many times the load operation got the true Block-on-Store blocking code preventing store forwarding. This includes cases when:\n - preceding store conflicts with the load (incomplete overlap)\n\n - store forwarding is impossible due to u-arch limitations\n\n - preceding lock RMW operations are not forwarded\n\n - store has the no-forward bit set (uncacheable/page-split/masked stores)\n\n - all-blocking stores are used (mostly, fences and port I/O)\n\nand others.\nThe most common case is a load blocked due to its address range overlapping with a preceding smaller uncompleted store. Note: This event does not take into account cases of out-of-SW-control (for example, SbTailHit), unknown physical STA, and cases of blocking loads on store due to being non-WB memory type or a lock. These cases are covered by other events.\nSee the table of not supported store forwards in the Optimization Guide.",
-        "EventCode": "0x03",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "LD_BLOCKS.STORE_FORWARD",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Loads blocked by overlapping with store buffer that cannot be forwarded .",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.",
-        "EventCode": "0x03",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "LD_BLOCKS.NO_SR",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts false dependencies in MOB when the partial comparison upon loose net check and dependency was resolved by the Enhanced Loose net mechanism. This may not result in high performance penalties. Loose net checks can fail when loads and stores are 4k aliased.",
-        "EventCode": "0x07",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LD_BLOCKS_PARTIAL.ADDRESS_ALIAS",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "False dependencies in MOB due to partial compare on address.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_L2_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while L2 cache miss demand load is outstanding.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x5",
-        "EventName": "CYCLE_ACTIVITY.STALLS_L2_MISS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while L2 cache miss demand load is outstanding.",
-        "CounterMask": "5",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "CYCLE_ACTIVITY.CYCLES_MEM_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles while memory subsystem has an outstanding load.",
-        "CounterMask": "16",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA3",
-        "Counter": "0,1,2,3",
-        "UMask": "0x14",
-        "EventName": "CYCLE_ACTIVITY.STALLS_MEM_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Execution stalls while memory subsystem has an outstanding load.",
-        "CounterMask": "20",
-        "CounterHTOff": "0,1,2,3"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK",
-        "SampleAfterValue": "2503",
-        "BriefDescription": "Core crystal clock cycles when the thread is unhalted.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.ONE_THREAD_ACTIVE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core crystal clock cycles when this thread is unhalted and the other thread is halted.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PEBS": "2",
-        "PublicDescription": "Number of cycles using an always true condition applied to  PEBS instructions retired event. (inst_ret< 16)",
-        "EventCode": "0xC0",
-        "Invert": "1",
-        "Counter": "0,2,3",
-        "UMask": "0x1",
-        "Errata": "SKL091, SKL044",
-        "EventName": "INST_RETIRED.TOTAL_CYCLES_PS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Number of cycles using always true condition applied to  PEBS instructions retired event.",
-        "CounterMask": "10",
-        "CounterHTOff": "0,2,3"
-    },
-    {
-        "EventCode": "0x14",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "ARITH.DIVIDER_ACTIVE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles when divide unit is busy executing divide or square root operations. Accounts for integer and floating-point operations.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LSD.CYCLES_ACTIVE",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles Uops delivered by the LSD, but didn't come from the decoder.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xA8",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "LSD.CYCLES_4_UOPS",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Cycles 4 Uops delivered by the LSD, but didn't come from the decoder.",
-        "CounterMask": "4",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0xC1",
-        "Counter": "0,1,2,3",
-        "UMask": "0x3f",
-        "EventName": "OTHER_ASSISTS.ANY",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Number of times a microcode assist is invoked by HW other than FP-assist. Examples include AD (page Access Dirty) and AVX* related assists.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of Blend Uops issued by the Resource Allocation Table (RAT) to the reservation station (RS) in order to preserve upper bits of vector registers. Starting the Skylake microarchitecture, these Blend uops are needed since every Intel SSE instruction executed in Dirty Upper State needs to preserve bits 128-255 of the destination register.\r\nFor more information, refer to ?Mixing Intel AVX and Intel SSE Code? section of the Optimization Guide.",
-        "EventCode": "0x0E",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "UOPS_ISSUED.VECTOR_WIDTH_MISMATCH",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Uops inserted at issue-stage in order to preserve upper bits of vector registers.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x00",
-        "Counter": "Fixed counter 2",
-        "UMask": "0x2",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x0",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.THREAD_P_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_THREAD_UNHALTED.REF_XCLK_ANY",
-        "SampleAfterValue": "2503",
-        "BriefDescription": "Core crystal clock cycles when at least one thread on the physical core is unhalted.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x0D",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "INT_MISC.RECOVERY_CYCLES_ANY",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Core cycles the allocator was stalled due to recovery from earlier clear event for any thread running on the physical core (e.g. misprediction or memory nuke).",
+        "BriefDescription": "Number of uops executed on the core.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -897,43 +669,282 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts when the Current Privilege Level (CPL) transitions from ring 1, 2 or 3 to ring 0 (Kernel).",
-        "EventCode": "0x3C",
+        "PublicDescription": "Counts the number of x87 uops executed.",
+        "EventCode": "0xB1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "UOPS_EXECUTED.X87",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts the number of x87 uops dispatched.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of instructions (EOMs) retired. Counting covers macro-fused instructions individually (that is, increments by two).",
+        "EventCode": "0xC0",
         "Counter": "0,1,2,3",
         "UMask": "0x0",
-        "EdgeDetect": "1",
-        "EventName": "CPU_CLK_UNHALTED.RING0_TRANS",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Counts when there is a transition from ring 1, 2 or 3 to ring 0.",
+        "Errata": "SKL091, SKL044",
+        "EventName": "INST_RETIRED.ANY_P",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of instructions retired. General Counter - architectural event",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "PublicDescription": "A version of INST_RETIRED that allows for a more unbiased distribution of samples across instructions retired. It utilizes the Precise Distribution of Instructions Retired (PDIR) feature to mitigate some bias in how retired instructions get sampled.",
+        "EventCode": "0xC0",
+        "Counter": "1",
+        "UMask": "0x1",
+        "Errata": "SKL091, SKL044",
+        "EventName": "INST_RETIRED.PREC_DIST",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution",
+        "CounterHTOff": "1"
+    },
+    {
+        "PEBS": "2",
+        "PublicDescription": "Number of cycles using an always true condition applied to  PEBS instructions retired event. (inst_ret< 16)",
+        "EventCode": "0xC0",
+        "Invert": "1",
+        "Counter": "0,2,3",
+        "UMask": "0x1",
+        "Errata": "SKL091, SKL044",
+        "EventName": "INST_RETIRED.TOTAL_CYCLES_PS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Number of cycles using always true condition applied to  PEBS instructions retired event.",
+        "CounterMask": "10",
+        "CounterHTOff": "0,2,3"
+    },
+    {
+        "EventCode": "0xC1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x3f",
+        "EventName": "OTHER_ASSISTS.ANY",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of times a microcode assist is invoked by HW other than FP-assist. Examples include AD (page Access Dirty) and AVX* related assists.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the retirement slots used.",
+        "EventCode": "0xC2",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_RETIRED.RETIRE_SLOTS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Retirement slots used.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts cycles without actually retired uops.",
+        "EventCode": "0xC2",
+        "Invert": "1",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "UOPS_RETIRED.STALL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles without actually retired uops.",
         "CounterMask": "1",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK",
-        "SampleAfterValue": "2503",
-        "BriefDescription": "Core crystal clock cycles when the thread is unhalted.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "AnyThread": "1",
-        "EventName": "CPU_CLK_UNHALTED.REF_XCLK_ANY",
-        "SampleAfterValue": "2503",
-        "BriefDescription": "Core crystal clock cycles when at least one thread on the physical core is unhalted.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x3C",
+        "PublicDescription": "Number of cycles using always true condition (uops_ret < 16) applied to non PEBS uops retired event.",
+        "EventCode": "0xC2",
+        "Invert": "1",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
-        "EventName": "CPU_CLK_UNHALTED.ONE_THREAD_ACTIVE",
-        "SampleAfterValue": "2503",
-        "BriefDescription": "Core crystal clock cycles when this thread is unhalted and the other thread is halted.",
+        "EventName": "UOPS_RETIRED.TOTAL_CYCLES",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Cycles with less than 10 actually retired uops.",
+        "CounterMask": "10",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EdgeDetect": "1",
+        "EventName": "MACHINE_CLEARS.COUNT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts self-modifying code (SMC) detected, which causes a machine clear.",
+        "EventCode": "0xC3",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "MACHINE_CLEARS.SMC",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Self-modifying code (SMC) detected.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts all (macro) branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "Errata": "SKL091",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts conditional branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "Errata": "SKL091",
+        "EventName": "BR_INST_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Conditional branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts both direct and indirect near call instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "Errata": "SKL091",
+        "EventName": "BR_INST_RETIRED.NEAR_CALL",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Direct and indirect near call instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "PublicDescription": "This is a precise version of BR_INST_RETIRED.ALL_BRANCHES that counts all (macro) branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "Errata": "SKL091",
+        "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All (macro) branch instructions retired.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts return instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "Errata": "SKL091",
+        "EventName": "BR_INST_RETIRED.NEAR_RETURN",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Return instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts not taken branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "Errata": "SKL091",
+        "EventName": "BR_INST_RETIRED.NOT_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Not taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts taken branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "Errata": "SKL091",
+        "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Taken branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts far branch instructions retired.",
+        "EventCode": "0xC4",
+        "Counter": "0,1,2,3",
+        "UMask": "0x40",
+        "Errata": "SKL091",
+        "EventName": "BR_INST_RETIRED.FAR_BRANCH",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Counts the number of far branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts all the retired branch instructions that were mispredicted by the processor. A branch misprediction occurs when the processor incorrectly predicts the destination of the branch.  When the misprediction is discovered at execution, all the instructions executed in the wrong (speculative) path must be discarded, and the processor must start fetching from the correct path.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x0",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "All mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts mispredicted conditional branch instructions retired.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BR_MISP_RETIRED.CONDITIONAL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted conditional branch instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "This event counts both taken and not taken retired mispredicted direct and indirect near calls, including both register and memory indirect.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "BR_MISP_RETIRED.NEAR_CALL",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted direct and indirect near call instructions retired.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PEBS": "2",
+        "PublicDescription": "This is a precise version of BR_MISP_RETIRED.ALL_BRANCHES that counts all mispredicted macro branch instructions retired.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Mispredicted macro branch instructions retired.",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "PEBS": "1",
+        "PublicDescription": "Number of near branch instructions retired that were mispredicted and taken.",
+        "EventCode": "0xC5",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "SampleAfterValue": "400009",
+        "BriefDescription": "Number of near branch instructions retired that were mispredicted and taken. ",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Increments when an entry is added to the Last Branch Record (LBR) array (or removed from the array in case of RETURNs in call stack mode). The event requires LBR enable via IA32_DEBUGCTL MSR and branch type selection via MSR_LBR_SELECT.",
+        "EventCode": "0xCC",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "ROB_MISC_EVENTS.LBR_INSERTS",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Increments whenever there is an update to the LBR array.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts the number of times the front-end is resteered when it finds a branch instruction in a fetch line. This occurs for the first time a branch instruction is fetched or when the branch is not tracked by the BPU (Branch Prediction Unit) anymore.",
+        "EventCode": "0xE6",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "BACLEARS.ANY",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Counts the total number when the front end is resteered, mainly when the BPU cannot provide a correct prediction and this is corrected by other branch handling mechanisms at the front end.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/skylake/virtual-memory.json b/tools/perf/pmu-events/arch/x86/skylake/virtual-memory.json
index 02f32cb..2bcba7d 100644
--- a/tools/perf/pmu-events/arch/x86/skylake/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/skylake/virtual-memory.json
@@ -1,15 +1,168 @@
 [
     {
-        "PublicDescription": "This event counts the number of flushes of the big or small ITLB pages. Counting include both TLB Flush (covering all sets) and TLB Set Clear (set-specific).",
-        "EventCode": "0xAE",
+        "PublicDescription": "Counts demand data loads that caused a page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels, but the walk need not have completed.",
+        "EventCode": "0x08",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
-        "EventName": "ITLB.ITLB_FLUSH",
-        "SampleAfterValue": "100007",
-        "BriefDescription": "Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.",
+        "EventName": "DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Load misses in all DTLB levels that cause page walks",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts page walks completed due to demand data loads whose address translations missed in the TLB and were mapped to 4K pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_4K",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Page walk completed due to a demand data load to a 4K page",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts page walks completed due to demand data loads whose address translations missed in the TLB and were mapped to 2M/4M pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Page walk completed due to a demand data load to a 2M/4M page",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts page walks completed due to demand data loads whose address translations missed in the TLB and were mapped to 4K pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_1G",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Page walk completed due to a demand data load to a 1G page",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts demand data loads that caused a completed page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels. The page walk can end with or without a fault.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Load miss in all TLB levels causes a page walk that completes. (All page sizes)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake microarchitecture.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "DTLB_LOAD_MISSES.WALK_PENDING",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles when at least one PMH (Page Miss Handler) is busy with a page walk for a load.",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "DTLB_LOAD_MISSES.WALK_ACTIVE",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a load. EPT page walk duration are excluded in Skylake.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts loads that miss the DTLB (Data TLB) and hit the STLB (Second level TLB).",
+        "EventCode": "0x08",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Loads that miss the DTLB and hit the STLB.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts demand data stores that caused a page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels, but the walk need not have completed.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x1",
+        "EventName": "DTLB_STORE_MISSES.MISS_CAUSES_A_WALK",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Store misses in all DTLB levels that cause page walks",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 4K pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x2",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_4K",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Page walk completed due to a demand data store to a 4K page",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 2M/4M pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x4",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Page walk completed due to a demand data store to a 2M/4M page",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 1G pages.  The page walks can end with or without a page fault.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x8",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_1G",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Page walk completed due to a demand data store to a 1G page",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts demand data stores that caused a completed page walk of any page size (4K/2M/4M/1G). This implies it missed in all TLB levels. The page walk can end with or without a fault.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Store misses in all TLB levels causes a page walk that completes. (All page sizes)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake microarchitecture.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "DTLB_STORE_MISSES.WALK_PENDING",
+        "SampleAfterValue": "2000003",
+        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles when at least one PMH (Page Miss Handler) is busy with a page walk for a store.",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "DTLB_STORE_MISSES.WALK_ACTIVE",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a store. EPT page walk duration are excluded in Skylake.",
+        "CounterMask": "1",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Stores that miss the DTLB (Data TLB) and hit the STLB (2nd Level TLB).",
+        "EventCode": "0x49",
+        "Counter": "0,1,2,3",
+        "UMask": "0x20",
+        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Stores that miss the DTLB and hit the STLB.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts cycles for each PMH (Page Miss Handler) that is busy with an EPT (Extended Page Table) walk for any request type.",
         "EventCode": "0x4F",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
@@ -19,7 +172,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts store misses in all DTLB levels that cause page walks of any page size (4K/2M/4M/1G).",
+        "PublicDescription": "Counts page walks of any page size (4K/2M/4M/1G) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB, but the walk need not have completed.",
         "EventCode": "0x85",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -29,7 +182,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts store misses in all DTLB levels that cause a completed page walk (4K page size). The page walk can end with or without a fault.",
+        "PublicDescription": "Counts completed page walks (4K page size) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault.",
         "EventCode": "0x85",
         "Counter": "0,1,2,3",
         "UMask": "0x2",
@@ -39,7 +192,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts store misses in all DTLB levels that cause a completed page walk (2M and 4M page sizes). The page walk can end with or without a fault.",
+        "PublicDescription": "Counts code misses in all ITLB levels that caused a completed page walk (2M and 4M page sizes). The page walk can end with or without a fault.",
         "EventCode": "0x85",
         "Counter": "0,1,2,3",
         "UMask": "0x4",
@@ -49,7 +202,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts store misses in all DTLB levels that cause a completed page walk (1G  page size). The page walk can end with or without a fault.",
+        "PublicDescription": "Counts store misses in all DTLB levels that cause a completed page walk (1G page size). The page walk can end with or without a fault.",
         "EventCode": "0x85",
         "Counter": "0,1,2,3",
         "UMask": "0x8",
@@ -59,12 +212,34 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "PublicDescription": "Counts completed page walks (2M and 4M page sizes) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault.",
+        "EventCode": "0x85",
+        "Counter": "0,1,2,3",
+        "UMask": "0xe",
+        "EventName": "ITLB_MISSES.WALK_COMPLETED",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Code miss in all TLB levels causes a page walk that completes. (All page sizes)",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Counts 1 per cycle for each PMH (Page Miss Handler) that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake michroarchitecture.",
         "EventCode": "0x85",
         "Counter": "0,1,2,3",
         "UMask": "0x10",
         "EventName": "ITLB_MISSES.WALK_PENDING",
         "SampleAfterValue": "100003",
-        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake. ",
+        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake.",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "PublicDescription": "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request. EPT page walk duration are excluded in Skylake microarchitecture.",
+        "EventCode": "0x85",
+        "Counter": "0,1,2,3",
+        "UMask": "0x10",
+        "EventName": "ITLB_MISSES.WALK_ACTIVE",
+        "SampleAfterValue": "100003",
+        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request. EPT page walk duration are excluded in Skylake.",
+        "CounterMask": "1",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
@@ -77,123 +252,17 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts load misses in all DTLB levels that cause page walks of any page size (4K/2M/4M/1G).",
-        "EventCode": "0x08",
+        "PublicDescription": "Counts the number of flushes of the big or small ITLB pages. Counting include both TLB Flush (covering all sets) and TLB Set Clear (set-specific).",
+        "EventCode": "0xAE",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
-        "EventName": "DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Load misses in all DTLB levels that cause page walks",
+        "EventName": "ITLB.ITLB_FLUSH",
+        "SampleAfterValue": "100007",
+        "BriefDescription": "Flushing of the Instruction TLB (ITLB) pages, includes 4k/2M/4M pages.",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts load misses in all DTLB levels that cause a completed page walk (4K page size). The page walk can end with or without a fault.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_4K",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes (4K).",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts load misses in all DTLB levels that cause a completed page walk (2M and 4M page sizes). The page walk can end with or without a fault.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes (2M/4M).",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts load misses in all DTLB levels that cause a completed page walk (1G  page size). The page walk can end with or without a fault.",
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_1G",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Load miss in all TLB levels causes a page walk that completes. (1G)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "DTLB_LOAD_MISSES.WALK_PENDING",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "DTLB_LOAD_MISSES.STLB_HIT",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Loads that miss the DTLB and hit the STLB.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts store misses in all DTLB levels that cause page walks of any page size (4K/2M/4M/1G).",
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x1",
-        "EventName": "DTLB_STORE_MISSES.MISS_CAUSES_A_WALK",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store misses in all DTLB levels that cause page walks",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts store misses in all DTLB levels that cause a completed page walk (4K page size). The page walk can end with or without a fault.",
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x2",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_4K",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store miss in all TLB levels causes a page walk that completes. (4K)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts store misses in all DTLB levels that cause a completed page walk (2M and 4M page sizes). The page walk can end with or without a fault.",
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x4",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks (2M/4M)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts store misses in all DTLB levels that cause a completed page walk (1G  page size). The page walk can end with or without a fault.",
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x8",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_1G",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks (1G)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "DTLB_STORE_MISSES.WALK_PENDING",
-        "SampleAfterValue": "2000003",
-        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake. ",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x20",
-        "EventName": "DTLB_STORE_MISSES.STLB_HIT",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Stores that miss the DTLB and hit the STLB.",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "PublicDescription": "This event counts the number of DTLB flush attempts of the thread-specific entries.",
+        "PublicDescription": "Counts the number of DTLB flush attempts of the thread-specific entries.",
         "EventCode": "0xBD",
         "Counter": "0,1,2,3",
         "UMask": "0x1",
@@ -203,7 +272,7 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
-        "PublicDescription": "This event counts the number of any STLB flush attempts (such as entire, VPID, PCID, InvPage, CR3 write, and so on).",
+        "PublicDescription": "Counts the number of any STLB flush attempts (such as entire, VPID, PCID, InvPage, CR3 write, etc.).",
         "EventCode": "0xBD",
         "Counter": "0,1,2,3",
         "UMask": "0x20",
@@ -211,62 +280,5 @@
         "SampleAfterValue": "100007",
         "BriefDescription": "STLB flush attempts",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "EventName": "ITLB_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Code miss in all TLB levels causes a page walk that completes. (All page sizes)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Load miss in all TLB levels causes a page walk that completes. (All page sizes)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0xe",
-        "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Store misses in all TLB levels causes a page walk that completes. (All page sizes)",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x49",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "DTLB_STORE_MISSES.WALK_ACTIVE",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a store. EPT page walk duration are excluded in Skylake. ",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x08",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "DTLB_LOAD_MISSES.WALK_ACTIVE",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a load. EPT page walk duration are excluded in Skylake. ",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
-    },
-    {
-        "EventCode": "0x85",
-        "Counter": "0,1,2,3",
-        "UMask": "0x10",
-        "EventName": "ITLB_MISSES.WALK_ACTIVE",
-        "SampleAfterValue": "100003",
-        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for code (instruction fetch) request. EPT page walk duration are excluded in Skylake.",
-        "CounterMask": "1",
-        "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
 ]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/cache.json b/tools/perf/pmu-events/arch/x86/skylakex/cache.json
index b5bc742..5c99408 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/cache.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/cache.json
@@ -265,7 +265,7 @@
     {
         "EventCode": "0x60",
         "UMask": "0x2",
-        "BriefDescription": "Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle. ",
+        "BriefDescription": "Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.",
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD",
         "PublicDescription": "Counts the number of offcore outstanding Code Reads transactions in the super queue every cycle. The 'Offcore outstanding' state of the transaction lasts from the L2 miss until the sending transaction completion to requestor (SQ deallocation). See the corresponding Umask under OFFCORE_REQUESTS.",
@@ -398,22 +398,24 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x11",
-        "BriefDescription": "Retired load instructions that miss the STLB.",
+        "BriefDescription": "Retired load instructions that miss the STLB. (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_INST_RETIRED.STLB_MISS_LOADS",
+        "PublicDescription": "Retired load instructions that miss the STLB.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x12",
-        "BriefDescription": "Retired store instructions that miss the STLB.",
+        "BriefDescription": "Retired store instructions that miss the STLB. (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_INST_RETIRED.STLB_MISS_STORES",
+        "PublicDescription": "Retired store instructions that miss the STLB.",
         "SampleAfterValue": "100003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -421,7 +423,7 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x21",
-        "BriefDescription": "Retired load instructions with locked access.",
+        "BriefDescription": "Retired load instructions with locked access. (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -432,24 +434,22 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x41",
-        "BriefDescription": "Retired load instructions that split across a cacheline boundary.",
+        "BriefDescription": "Retired load instructions that split across a cacheline boundary. (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_INST_RETIRED.SPLIT_LOADS",
-        "PublicDescription": "Counts retired load instructions that split across a cacheline boundary.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
     {
         "EventCode": "0xD0",
         "UMask": "0x42",
-        "BriefDescription": "Retired store instructions that split across a cacheline boundary.",
+        "BriefDescription": "Retired store instructions that split across a cacheline boundary. (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_INST_RETIRED.SPLIT_STORES",
-        "PublicDescription": "Counts retired store instructions that split across a cacheline boundary.",
         "SampleAfterValue": "100003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -457,7 +457,7 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x81",
-        "BriefDescription": "All retired load instructions.",
+        "BriefDescription": "All retired load instructions. (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
@@ -468,11 +468,12 @@
     {
         "EventCode": "0xD0",
         "UMask": "0x82",
-        "BriefDescription": "All retired store instructions.",
+        "BriefDescription": "All retired store instructions. (Precise Event)",
         "Data_LA": "1",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_INST_RETIRED.ALL_STORES",
+        "PublicDescription": "All retired store instructions.",
         "SampleAfterValue": "2000003",
         "L1_Hit_Indication": "1",
         "CounterHTOff": "0,1,2,3"
@@ -485,7 +486,7 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_RETIRED.L1_HIT",
-        "PublicDescription": "Counts retired load instructions with at least one uop that hit in the L1 data cache. This event includes all SW prefetches and lock instructions regardless of the data source.",
+        "PublicDescription": "Counts retired load instructions with at least one uop that hit in the L1 data cache. This event includes all SW prefetches and lock instructions regardless of the data source.\r\n",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -509,7 +510,7 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_RETIRED.L3_HIT",
-        "PublicDescription": "Counts retired load instructions with at least one uop that hit in the L3 cache. ",
+        "PublicDescription": "Retired load instructions with L3 cache hits as data sources.",
         "SampleAfterValue": "50021",
         "CounterHTOff": "0,1,2,3"
     },
@@ -545,7 +546,7 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_RETIRED.L3_MISS",
-        "PublicDescription": "Counts retired load instructions with at least one uop that missed in the L3 cache. ",
+        "PublicDescription": "Retired load instructions missed L3 cache as data sources.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
@@ -557,7 +558,7 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_RETIRED.FB_HIT",
-        "PublicDescription": "Counts retired load instructions with at least one uop was load missed in L1 but hit FB (Fill Buffers) due to preceding miss to the same cache line with data not ready. ",
+        "PublicDescription": "Counts retired load instructions with at least one uop was load missed in L1 but hit FB (Fill Buffers) due to preceding miss to the same cache line with data not ready. \r\n",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
@@ -616,7 +617,6 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_L3_MISS_RETIRED.LOCAL_DRAM",
-        "PublicDescription": "Retired load instructions which data sources missed L3 but serviced from local DRAM.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
@@ -639,7 +639,6 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_L3_MISS_RETIRED.REMOTE_HITM",
-        "PublicDescription": "Retired load instructions whose data sources was remote HITM.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
@@ -648,9 +647,9 @@
         "UMask": "0x8",
         "BriefDescription": "Retired load instructions whose data sources was forwarded from a remote cache",
         "Data_LA": "1",
+        "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "MEM_LOAD_L3_MISS_RETIRED.REMOTE_FWD",
-        "PublicDescription": "Retired load instructions whose data sources was forwarded from a remote cache.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
     },
@@ -697,7 +696,7 @@
     {
         "EventCode": "0xF2",
         "UMask": "0x2",
-        "BriefDescription": "Counts the number of lines that are evicted by L2 cache when triggered by an L2 cache fill. Those lines can be either in modified state or clean state. Modified lines may either be written back to L3 or directly written to memory and not allocated in L3.  Clean lines may either be allocated in L3 or dropped ",
+        "BriefDescription": "Counts the number of lines that are evicted by L2 cache when triggered by an L2 cache fill. Those lines can be either in modified state or clean state. Modified lines may either be written back to L3 or directly written to memory and not allocated in L3.  Clean lines may either be allocated in L3 or dropped",
         "Counter": "0,1,2,3",
         "EventName": "L2_LINES_OUT.NON_SILENT",
         "PublicDescription": "Counts the number of lines that are evicted by L2 cache when triggered by an L2 cache fill. Those lines can be either in modified state or clean state. Modified lines may either be written back to L3 or directly written to memory and not allocated in L3.  Clean lines may either be allocated in L3 or dropped.",
@@ -742,7 +741,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts demand data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -755,7 +754,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts demand data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -768,7 +767,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -781,7 +780,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -794,7 +793,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts demand data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -807,7 +806,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts demand data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -820,7 +819,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand data writes (RFOs) that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -833,7 +832,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -846,7 +845,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -859,7 +858,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -872,7 +871,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -885,7 +884,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand data writes (RFOs) that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -898,7 +897,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand code reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -911,7 +910,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -924,7 +923,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -937,7 +936,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -950,7 +949,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand code reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -963,7 +962,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand code reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -976,7 +975,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -989,7 +988,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1002,7 +1001,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1015,7 +1014,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1028,7 +1027,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1041,7 +1040,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1054,7 +1053,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1067,7 +1066,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1080,7 +1079,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1093,7 +1092,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1106,7 +1105,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1119,7 +1118,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1132,7 +1131,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1145,7 +1144,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1158,7 +1157,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1171,7 +1170,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1184,7 +1183,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1197,7 +1196,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1210,7 +1209,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1223,7 +1222,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1236,7 +1235,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1249,7 +1248,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1262,7 +1261,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1275,7 +1274,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1288,7 +1287,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1301,7 +1300,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1314,7 +1313,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1327,7 +1326,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1340,7 +1339,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1353,7 +1352,85 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that have any response type.",
+        "MSRValue": "0x0000018000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.ANY_RESPONSE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.",
+        "MSRValue": "0x01003c8000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.NO_SNOOP_NEEDED",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.",
+        "MSRValue": "0x04003c8000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.HIT_OTHER_CORE_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "OTHER & L3_HIT & SNOOP_HIT_WITH_FWD",
+        "MSRValue": "0x08003c8000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.SNOOP_HIT_WITH_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.",
+        "MSRValue": "0x10003c8000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.HITM_OTHER_CORE",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that hit in the L3.",
+        "MSRValue": "0x3f803c8000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_HIT.ANY_SNOOP",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1366,7 +1443,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1379,7 +1456,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1392,7 +1469,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1405,7 +1482,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1418,7 +1495,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1431,7 +1508,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all prefetch data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1444,7 +1521,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch RFOs that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1457,7 +1534,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1470,7 +1547,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1483,7 +1560,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1496,7 +1573,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1509,7 +1586,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts prefetch RFOs that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1522,7 +1599,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch data reads that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1535,7 +1612,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1548,7 +1625,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1561,7 +1638,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1574,7 +1651,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1587,7 +1664,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch data reads that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1600,7 +1677,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.ANY_RESPONSE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that have any response type.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch RFOs that have any response type. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1613,7 +1690,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.NO_SNOOP_NEEDED",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and sibling core snoops are not needed as either the core-valid bit is not set or the shared line is present in multiple cores. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1626,7 +1703,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.HIT_OTHER_CORE_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1639,7 +1716,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.SNOOP_HIT_WITH_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "tbd; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "tbd Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1652,7 +1729,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.HITM_OTHER_CORE",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3 and the snoop to one of the sibling cores hits the line in M state and the line is forwarded. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1665,7 +1742,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_HIT.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3.; Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "PublicDescription": "Counts all demand & prefetch RFOs that hit in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     }
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/floating-point.json b/tools/perf/pmu-events/arch/x86/skylakex/floating-point.json
index 1c09a32..286ed1a 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/floating-point.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/floating-point.json
@@ -29,10 +29,9 @@
     {
         "EventCode": "0xC7",
         "UMask": "0x8",
-        "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.  ",
+        "BriefDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "Counter": "0,1,2,3",
         "EventName": "FP_ARITH_INST_RETIRED.128B_PACKED_SINGLE",
-        "PublicDescription": "Number of SSE/AVX computational 128-bit packed single precision floating-point instructions retired.  Each count represents 4 computations. Applies to SSE* and AVX* packed single precision floating-point instructions: ADD SUB MUL DIV MIN MAX RCP RSQRT SQRT DPP FM(N)ADD/SUB.  DPP and FM(N)ADD/SUB instructions count twice as they perform multiple calculations per element.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/frontend.json b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
index 40abc08..403a4f8 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/frontend.json
@@ -182,7 +182,7 @@
         "BriefDescription": "Uops not delivered to Resource Allocation Table (RAT) per thread when backend of the machine is not stalled",
         "Counter": "0,1,2,3",
         "EventName": "IDQ_UOPS_NOT_DELIVERED.CORE",
-        "PublicDescription": "Counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding \u201c4 \u2013 x\u201d when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when: a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread. b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions).  c. Instruction Decode Queue (IDQ) delivers four uops.",
+        "PublicDescription": "Counts the number of uops not delivered to Resource Allocation Table (RAT) per thread adding 4  x when Resource Allocation Table (RAT) is not stalled and Instruction Decode Queue (IDQ) delivers x uops to Resource Allocation Table (RAT) (where x belongs to {0,1,2,3}). Counting does not cover cases when: a. IDQ-Resource Allocation Table (RAT) pipe serves the other thread. b. Resource Allocation Table (RAT) is stalled for the thread (including uop drops and clear BE conditions).  c. Instruction Decode Queue (IDQ) delivers four uops.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -247,20 +247,20 @@
         "BriefDescription": "Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.",
         "Counter": "0,1,2,3",
         "EventName": "DSB2MITE_SWITCHES.PENALTY_CYCLES",
-        "PublicDescription": "Counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. MM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.Penalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 0\u20132 cycles.",
+        "PublicDescription": "Counts Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles. These cycles do not include uops routed through because of the switch itself, for example, when Instruction Decode Queue (IDQ) pre-allocation is unavailable, or Instruction Decode Queue (IDQ) is full. SBD-to-MITE switch true penalty cycles happen after the merge mux (MM) receives Decode Stream Buffer (DSB) Sync-indication until receiving the first MITE uop. MM is placed before Instruction Decode Queue (IDQ) to merge uops being fed from the MITE and Decode Stream Buffer (DSB) paths. Decode Stream Buffer (DSB) inserts the Sync-indication whenever a Decode Stream Buffer (DSB)-to-MITE switch occurs.Penalty: A Decode Stream Buffer (DSB) hit followed by a Decode Stream Buffer (DSB) miss can cost up to six cycles in which no uops are delivered to the IDQ. Most often, such switches from the Decode Stream Buffer (DSB) to the legacy pipeline cost 02 cycles.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired Instructions who experienced decode stream buffer (DSB - the decoded instruction-cache) miss.",
+        "BriefDescription": "Retired Instructions who experienced decode stream buffer (DSB - the decoded instruction-cache) miss. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x11",
         "Counter": "0,1,2,3",
         "EventName": "FRONTEND_RETIRED.DSB_MISS",
         "MSRIndex": "0x3F7",
-        "PublicDescription": "Counts retired Instructions that experienced DSB (Decode stream buffer i.e. the decoded instruction-cache) miss. ",
+        "PublicDescription": "Counts retired Instructions that experienced DSB (Decode stream buffer i.e. the decoded instruction-cache) miss. \r\n",
         "TakenAlone": "1",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
@@ -268,7 +268,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired Instructions who experienced Instruction L1 Cache true miss.",
+        "BriefDescription": "Retired Instructions who experienced Instruction L1 Cache true miss. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x12",
         "Counter": "0,1,2,3",
@@ -281,7 +281,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired Instructions who experienced Instruction L2 Cache true miss.",
+        "BriefDescription": "Retired Instructions who experienced Instruction L2 Cache true miss. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x13",
         "Counter": "0,1,2,3",
@@ -294,7 +294,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired Instructions who experienced iTLB true miss.",
+        "BriefDescription": "Retired Instructions who experienced iTLB true miss. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x14",
         "Counter": "0,1,2,3",
@@ -308,13 +308,13 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired Instructions who experienced STLB (2nd level TLB) true miss.",
+        "BriefDescription": "Retired Instructions who experienced STLB (2nd level TLB) true miss. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x15",
         "Counter": "0,1,2,3",
         "EventName": "FRONTEND_RETIRED.STLB_MISS",
         "MSRIndex": "0x3F7",
-        "PublicDescription": "Counts retired Instructions that experienced STLB (2nd level TLB) true miss. ",
+        "PublicDescription": "Counts retired Instructions that experienced STLB (2nd level TLB) true miss.",
         "TakenAlone": "1",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
@@ -322,7 +322,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 2 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 2 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x400206",
         "Counter": "0,1,2,3",
@@ -335,7 +335,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 2 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 2 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x200206",
         "Counter": "0,1,2,3",
@@ -348,7 +348,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 4 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x400406",
         "Counter": "0,1,2,3",
@@ -367,7 +367,7 @@
         "Counter": "0,1,2,3",
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_8",
         "MSRIndex": "0x3F7",
-        "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 8 cycles. During this period the front-end delivered no uops.",
+        "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 8 cycles. During this period the front-end delivered no uops. \r\n",
         "TakenAlone": "1",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
@@ -375,13 +375,13 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 16 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x401006",
         "Counter": "0,1,2,3",
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_16",
         "MSRIndex": "0x3F7",
-        "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 16 cycles. During this period the front-end delivered no uops.",
+        "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 16 cycles. During this period the front-end delivered no uops.\r\n",
         "TakenAlone": "1",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
@@ -389,13 +389,13 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 32 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x402006",
         "Counter": "0,1,2,3",
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_32",
         "MSRIndex": "0x3F7",
-        "PublicDescription": "Counts retired instructions that are delivered to the back-end after a front-end stall of at least 32 cycles. During this period the front-end delivered no uops.",
+        "PublicDescription": "Counts retired instructions that are delivered to the back-end  after a front-end stall of at least 32 cycles. During this period the front-end delivered no uops.\r\n",
         "TakenAlone": "1",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
@@ -403,7 +403,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 64 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x404006",
         "Counter": "0,1,2,3",
@@ -416,7 +416,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 128 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x408006",
         "Counter": "0,1,2,3",
@@ -429,7 +429,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 256 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x410006",
         "Counter": "0,1,2,3",
@@ -442,7 +442,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end delivered no uops for a period of 512 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x420006",
         "Counter": "0,1,2,3",
@@ -455,13 +455,13 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 1 bubble-slot for a period of 2 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x100206",
         "Counter": "0,1,2,3",
         "EventName": "FRONTEND_RETIRED.LATENCY_GE_2_BUBBLES_GE_1",
         "MSRIndex": "0x3F7",
-        "PublicDescription": "Counts retired instructions that are delivered to the back-end after the front-end had at least 1 bubble-slot for a period of 2 cycles. A bubble-slot is an empty issue-pipeline slot while there was no RAT stall.",
+        "PublicDescription": "Counts retired instructions that are delivered to the back-end after the front-end had at least 1 bubble-slot for a period of 2 cycles. A bubble-slot is an empty issue-pipeline slot while there was no RAT stall.\r\n",
         "TakenAlone": "1",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3"
@@ -469,7 +469,7 @@
     {
         "EventCode": "0xC6",
         "UMask": "0x1",
-        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 3 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall.",
+        "BriefDescription": "Retired instructions that are fetched after an interval where the front-end had at least 3 bubble-slots for a period of 2 cycles which was not interrupted by a back-end stall. Precise Event.",
         "PEBS": "1",
         "MSRValue": "0x300206",
         "Counter": "0,1,2,3",
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/memory.json b/tools/perf/pmu-events/arch/x86/skylakex/memory.json
index ca22a22..e7f1aa3 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/memory.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/memory.json
@@ -214,7 +214,7 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "HLE_RETIRED.ABORTED",
-        "PublicDescription": "Number of times HLE abort was triggered.",
+        "PublicDescription": "Number of times HLE abort was triggered. (PEBS)",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -239,10 +239,9 @@
     {
         "EventCode": "0xC8",
         "UMask": "0x20",
-        "BriefDescription": "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.). ",
+        "BriefDescription": "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.).",
         "Counter": "0,1,2,3",
         "EventName": "HLE_RETIRED.ABORTED_UNFRIENDLY",
-        "PublicDescription": "Number of times an HLE execution aborted due to HLE-unfriendly instructions and certain unfriendly events (such as AD assists etc.).",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -292,7 +291,7 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "RTM_RETIRED.ABORTED",
-        "PublicDescription": "Number of times RTM abort was triggered.",
+        "PublicDescription": "Number of times RTM abort was triggered. (PEBS)",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -466,7 +465,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that miss in the L3. ",
+        "PublicDescription": "Counts demand data reads that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -479,7 +478,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts demand data reads that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -492,7 +491,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts demand data reads that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -505,7 +504,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -518,7 +517,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -531,7 +530,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts demand data reads that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -544,7 +543,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that miss in the L3. ",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -557,7 +556,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -570,7 +569,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -583,7 +582,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -596,7 +595,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -609,7 +608,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts all demand data writes (RFOs) that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -622,7 +621,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that miss in the L3. ",
+        "PublicDescription": "Counts all demand code reads that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -635,7 +634,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts all demand code reads that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -648,7 +647,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts all demand code reads that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -661,7 +660,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -674,7 +673,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -687,7 +686,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.DEMAND_CODE_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts all demand code reads that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -700,7 +699,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss in the L3. ",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -713,7 +712,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -726,7 +725,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -739,7 +738,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -752,7 +751,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -765,7 +764,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts prefetch (that bring data to L2) data reads that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -778,7 +777,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss in the L3. ",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -791,7 +790,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -804,7 +803,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -817,7 +816,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -830,7 +829,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -843,7 +842,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L2_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to L2) RFOs that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -856,7 +855,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss in the L3. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -869,7 +868,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -882,7 +881,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -895,7 +894,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -908,7 +907,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -921,7 +920,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) data reads that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -934,7 +933,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss in the L3. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -947,7 +946,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -960,7 +959,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -973,7 +972,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -986,7 +985,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -999,7 +998,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L3_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts all prefetch (that bring data to LLC only) RFOs that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1012,7 +1011,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss in the L3. ",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1025,7 +1024,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1038,7 +1037,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1051,7 +1050,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1064,7 +1063,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1077,7 +1076,85 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.PF_L1D_AND_SW.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts L1 data cache hardware prefetch requests and software prefetch requests that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that miss in the L3.",
+        "MSRValue": "0x3fbc008000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.ANY_SNOOP",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that miss the L3 and clean or shared data is transferred from remote cache.",
+        "MSRValue": "0x083fc08000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.REMOTE_HIT_FORWARD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that miss the L3 and the modified data is transferred from remote cache.",
+        "MSRValue": "0x103fc08000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.REMOTE_HITM",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that miss the L3 and the data is returned from local or remote dram.",
+        "MSRValue": "0x063fc08000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS.SNOOP_MISS_OR_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that miss the L3 and the data is returned from remote dram.",
+        "MSRValue": "0x063b808000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
+        "SampleAfterValue": "100003",
+        "CounterHTOff": "0,1,2,3"
+    },
+    {
+        "Offcore": "1",
+        "EventCode": "0xB7, 0xBB",
+        "UMask": "0x1",
+        "BriefDescription": "Counts any other requests that miss the L3 and the data is returned from local dram.",
+        "MSRValue": "0x0604008000 ",
+        "Counter": "0,1,2,3",
+        "EventName": "OFFCORE_RESPONSE.OTHER.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
+        "MSRIndex": "0x1a6,0x1a7",
+        "PublicDescription": "Counts any other requests that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1090,7 +1167,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that miss in the L3. ",
+        "PublicDescription": "Counts all prefetch data reads that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1103,7 +1180,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts all prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1116,7 +1193,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts all prefetch data reads that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1129,7 +1206,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1142,7 +1219,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1155,7 +1232,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts all prefetch data reads that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1168,7 +1245,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that miss in the L3. ",
+        "PublicDescription": "Counts prefetch RFOs that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1181,7 +1258,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts prefetch RFOs that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1194,7 +1271,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts prefetch RFOs that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1207,7 +1284,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1220,7 +1297,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1233,7 +1310,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_PF_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts prefetch RFOs that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1246,7 +1323,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that miss in the L3. ",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1259,7 +1336,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1272,7 +1349,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1285,7 +1362,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1298,7 +1375,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1311,7 +1388,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_DATA_RD.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local dram. ",
+        "PublicDescription": "Counts all demand & prefetch data reads that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1324,7 +1401,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.ANY_SNOOP",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that miss in the L3. ",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss in the L3. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1337,7 +1414,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.REMOTE_HIT_FORWARD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and clean or shared data is transferred from remote cache. ",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and clean or shared data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1350,7 +1427,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.REMOTE_HITM",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the modified data is transferred from remote cache. ",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the modified data is transferred from remote cache. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1363,7 +1440,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local or remote dram. ",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local or remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1376,7 +1453,7 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS_REMOTE_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from remote dram. ",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from remote dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     },
@@ -1389,8 +1466,8 @@
         "Counter": "0,1,2,3",
         "EventName": "OFFCORE_RESPONSE.ALL_RFO.L3_MISS_LOCAL_DRAM.SNOOP_MISS_OR_NO_FWD",
         "MSRIndex": "0x1a6,0x1a7",
-        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local dram.",
+        "PublicDescription": "Counts all demand & prefetch RFOs that miss the L3 and the data is returned from local dram. Offcore response can be programmed only with a specific pair of event select and counter MSR, and with specific event codes and predefine mask bit value in a dedicated MSR to specify attributes of the offcore transaction.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3"
     }
-]
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/other.json b/tools/perf/pmu-events/arch/x86/skylakex/other.json
index 70243b0..778a541 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/other.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/other.json
@@ -40,6 +40,42 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0x32",
+        "UMask": "0x1",
+        "BriefDescription": "Number of PREFETCHNTA instructions executed.",
+        "Counter": "0,1,2,3",
+        "EventName": "SW_PREFETCH_ACCESS.NTA",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x32",
+        "UMask": "0x2",
+        "BriefDescription": "Number of PREFETCHT0 instructions executed.",
+        "Counter": "0,1,2,3",
+        "EventName": "SW_PREFETCH_ACCESS.T0",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x32",
+        "UMask": "0x4",
+        "BriefDescription": "Number of PREFETCHT1 or PREFETCHT2 instructions executed.",
+        "Counter": "0,1,2,3",
+        "EventName": "SW_PREFETCH_ACCESS.T1_T2",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0x32",
+        "UMask": "0x8",
+        "BriefDescription": "Number of PREFETCHW instructions executed.",
+        "Counter": "0,1,2,3",
+        "EventName": "SW_PREFETCH_ACCESS.PREFETCHW",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xCB",
         "UMask": "0x1",
         "BriefDescription": "Number of hardware interrupts received by the processor.",
@@ -50,6 +86,62 @@
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
+        "EventCode": "0xEF",
+        "UMask": "0x1",
+        "Counter": "0,1,2,3",
+        "EventName": "CORE_SNOOP_RESPONSE.RSP_IHITI",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xEF",
+        "UMask": "0x2",
+        "Counter": "0,1,2,3",
+        "EventName": "CORE_SNOOP_RESPONSE.RSP_IHITFSE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xEF",
+        "UMask": "0x4",
+        "Counter": "0,1,2,3",
+        "EventName": "CORE_SNOOP_RESPONSE.RSP_SHITFSE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xEF",
+        "UMask": "0x8",
+        "Counter": "0,1,2,3",
+        "EventName": "CORE_SNOOP_RESPONSE.RSP_SFWDM",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xEF",
+        "UMask": "0x10",
+        "Counter": "0,1,2,3",
+        "EventName": "CORE_SNOOP_RESPONSE.RSP_IFWDM",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xEF",
+        "UMask": "0x20",
+        "Counter": "0,1,2,3",
+        "EventName": "CORE_SNOOP_RESPONSE.RSP_IFWDFE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
+        "EventCode": "0xEF",
+        "UMask": "0x40",
+        "Counter": "0,1,2,3",
+        "EventName": "CORE_SNOOP_RESPONSE.RSP_SFWDFE",
+        "SampleAfterValue": "2000003",
+        "CounterHTOff": "0,1,2,3,4,5,6,7"
+    },
+    {
         "EventCode": "0xFE",
         "UMask": "0x2",
         "BriefDescription": "Counts number of cache lines that are allocated and written back to L3 with the intention that they are more likely to be reused shortly",
@@ -69,4 +161,4 @@
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     }
-]
+]
\ No newline at end of file
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
index 0895d1e..f99f7ae 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/pipeline.json
@@ -3,41 +3,41 @@
         "EventCode": "0x00",
         "UMask": "0x1",
         "BriefDescription": "Instructions retired from execution.",
-        "Counter": "Fixed counter 1",
+        "Counter": "Fixed counter 0",
         "EventName": "INST_RETIRED.ANY",
         "PublicDescription": "Counts the number of instructions retired from execution. For instructions that consist of multiple micro-ops, Counts the retirement of the last micro-op of the instruction. Counting continues during hardware interrupts, traps, and inside interrupt handlers. Notes: INST_RETIRED.ANY is counted by a designated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. INST_RETIRED.ANY_P is counted by a programmable counter and it is an architectural performance event. Counting: Faulting executions of GETSEC/VM entry/VM Exit/MWait will not count as retired instructions.",
         "SampleAfterValue": "2000003",
+        "CounterHTOff": "Fixed counter 0"
+    },
+    {
+        "EventCode": "0x00",
+        "UMask": "0x2",
+        "BriefDescription": "Core cycles when the thread is not in halt state",
+        "Counter": "Fixed counter 1",
+        "EventName": "CPU_CLK_UNHALTED.THREAD",
+        "PublicDescription": "Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
+        "SampleAfterValue": "2000003",
         "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "UMask": "0x2",
-        "BriefDescription": "Core cycles when the thread is not in halt state",
-        "Counter": "Fixed counter 2",
-        "EventName": "CPU_CLK_UNHALTED.THREAD",
-        "PublicDescription": "Counts the number of core cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. This event is a component in many key event ratios. The core frequency may change from time to time due to transitions associated with Enhanced Intel SpeedStep Technology or TM2. For this reason this event may have a changing ratio with regards to time. When the core frequency is constant, this event can approximate elapsed time while the core was not in the halt state. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events.",
-        "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 2"
-    },
-    {
-        "EventCode": "0x00",
-        "UMask": "0x2",
         "BriefDescription": "Core cycles when at least one thread on the physical core is not in halt state.",
-        "Counter": "Fixed counter 2",
+        "Counter": "Fixed counter 1",
         "EventName": "CPU_CLK_UNHALTED.THREAD_ANY",
         "AnyThread": "1",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 2"
+        "CounterHTOff": "Fixed counter 1"
     },
     {
         "EventCode": "0x00",
         "UMask": "0x3",
         "BriefDescription": "Reference cycles when the core is not in halt state.",
-        "Counter": "Fixed counter 3",
+        "Counter": "Fixed counter 2",
         "EventName": "CPU_CLK_UNHALTED.REF_TSC",
         "PublicDescription": "Counts the number of reference cycles when the core is not in a halt state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (for example, P states, TM2 transitions) but has the same incrementing frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state. This event has a constant ratio with the CPU_CLK_UNHALTED.REF_XCLK event. It is counted on a dedicated fixed counter, leaving the four (eight when Hyperthreading is disabled) programmable counters available for other events. Note: On all current platforms this event stops counting during 'throttling (TM)' states duty off periods the processor is 'halted'.  The counter update is done at a lower clock rate then the core clock the overflow status bit for this counter may appear 'sticky'.  After the counter has overflowed and software clears the overflow status bit and resets the counter to less than MAX. The reset value to the counter is not clocked immediately so the overflow status bit will flip 'high (1)' and generate another PMI (if enabled) after which the reset value gets clocked into the counter. Therefore, software will get the interrupt, read the overflow status bit '1 for bit 34 while the counter value is less than MAX. Software should ignore this case.",
         "SampleAfterValue": "2000003",
-        "CounterHTOff": "Fixed counter 3"
+        "CounterHTOff": "Fixed counter 2"
     },
     {
         "EventCode": "0x03",
@@ -126,7 +126,7 @@
         "BriefDescription": "Uops inserted at issue-stage in order to preserve upper bits of vector registers.",
         "Counter": "0,1,2,3",
         "EventName": "UOPS_ISSUED.VECTOR_WIDTH_MISMATCH",
-        "PublicDescription": "Counts the number of Blend Uops issued by the Resource Allocation Table (RAT) to the reservation station (RS) in order to preserve upper bits of vector registers. Starting with the Skylake microarchitecture, these Blend uops are needed since every Intel SSE instruction executed in Dirty Upper State needs to preserve bits 128-255 of the destination register. For more information, refer to \u201cMixing Intel AVX and Intel SSE Code\u201d section of the Optimization Guide.",
+        "PublicDescription": "Counts the number of Blend Uops issued by the Resource Allocation Table (RAT) to the reservation station (RS) in order to preserve upper bits of vector registers. Starting with the Skylake microarchitecture, these Blend uops are needed since every Intel SSE instruction executed in Dirty Upper State needs to preserve bits 128-255 of the destination register. For more information, refer to Mixing Intel AVX and Intel SSE Code section of the Optimization Guide.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -762,11 +762,10 @@
         "EdgeDetect": "1",
         "EventCode": "0xC3",
         "UMask": "0x1",
-        "BriefDescription": "Number of machine clears (nukes) of any type. ",
+        "BriefDescription": "Number of machine clears (nukes) of any type.",
         "Counter": "0,1,2,3",
         "EventName": "MACHINE_CLEARS.COUNT",
         "CounterMask": "1",
-        "PublicDescription": "Number of machine clears (nukes) of any type.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -799,7 +798,7 @@
         "Counter": "0,1,2,3",
         "EventName": "BR_INST_RETIRED.CONDITIONAL",
         "Errata": "SKL091",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts conditional branch instructions retired.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts conditional branch instructions retired.",
         "SampleAfterValue": "400009",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -811,14 +810,14 @@
         "Counter": "0,1,2,3",
         "EventName": "BR_INST_RETIRED.NEAR_CALL",
         "Errata": "SKL091",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts both direct and indirect near call instructions retired.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts both direct and indirect near call instructions retired.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0xC4",
         "UMask": "0x4",
-        "BriefDescription": "All (macro) branch instructions retired. ",
+        "BriefDescription": "All (macro) branch instructions retired.",
         "PEBS": "2",
         "Counter": "0,1,2,3",
         "EventName": "BR_INST_RETIRED.ALL_BRANCHES_PEBS",
@@ -835,7 +834,7 @@
         "Counter": "0,1,2,3",
         "EventName": "BR_INST_RETIRED.NEAR_RETURN",
         "Errata": "SKL091",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts return instructions retired.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts return instructions retired.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -858,19 +857,19 @@
         "Counter": "0,1,2,3",
         "EventName": "BR_INST_RETIRED.NEAR_TAKEN",
         "Errata": "SKL091",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts taken branch instructions retired.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts taken branch instructions retired.",
         "SampleAfterValue": "400009",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0xC4",
         "UMask": "0x40",
-        "BriefDescription": "Far branch instructions retired.",
+        "BriefDescription": "Counts the number of far branch instructions retired.",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "BR_INST_RETIRED.FAR_BRANCH",
         "Errata": "SKL091",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts far branch instructions retired.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts far branch instructions retired.",
         "SampleAfterValue": "100007",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -891,7 +890,7 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "BR_MISP_RETIRED.CONDITIONAL",
-        "PublicDescription": "This is a non-precise version (that is, does not use PEBS) of the event that counts mispredicted conditional branch instructions retired.",
+        "PublicDescription": "This is a precise version (that is, uses PEBS) of the event that counts mispredicted conditional branch instructions retired.",
         "SampleAfterValue": "400009",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -902,14 +901,14 @@
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "BR_MISP_RETIRED.NEAR_CALL",
-        "PublicDescription": "Counts both taken and not taken retired mispredicted direct and indirect near calls, including both register and memory indirect.",
+        "PublicDescription": "This event counts both taken and not taken retired mispredicted direct and indirect near calls, including both register and memory indirect.",
         "SampleAfterValue": "400009",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0xC5",
         "UMask": "0x4",
-        "BriefDescription": "Mispredicted macro branch instructions retired. ",
+        "BriefDescription": "Mispredicted macro branch instructions retired.",
         "PEBS": "2",
         "Counter": "0,1,2,3",
         "EventName": "BR_MISP_RETIRED.ALL_BRANCHES_PEBS",
@@ -920,10 +919,11 @@
     {
         "EventCode": "0xC5",
         "UMask": "0x20",
-        "BriefDescription": "Number of near branch instructions retired that were mispredicted and taken.",
+        "BriefDescription": "Number of near branch instructions retired that were mispredicted and taken. ",
         "PEBS": "1",
         "Counter": "0,1,2,3",
         "EventName": "BR_MISP_RETIRED.NEAR_TAKEN",
+        "PublicDescription": "Number of near branch instructions retired that were mispredicted and taken.",
         "SampleAfterValue": "400009",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
diff --git a/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json b/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json
index 70750da..7f466c9 100644
--- a/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json
+++ b/tools/perf/pmu-events/arch/x86/skylakex/virtual-memory.json
@@ -12,30 +12,30 @@
     {
         "EventCode": "0x08",
         "UMask": "0x2",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes (4K).",
+        "BriefDescription": "Page walk completed due to a demand data load to a 4K page",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_4K",
-        "PublicDescription": "Counts demand data loads that caused a completed page walk (4K page size). This implies it missed in all TLB levels. The page walk can end with or without a fault.",
+        "PublicDescription": "Counts page walks completed due to demand data loads whose address translations missed in the TLB and were mapped to 4K pages.  The page walks can end with or without a page fault.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x08",
         "UMask": "0x4",
-        "BriefDescription": "Demand load Miss in all translation lookaside buffer (TLB) levels causes a page walk that completes (2M/4M).",
+        "BriefDescription": "Page walk completed due to a demand data load to a 2M/4M page",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M",
-        "PublicDescription": "Counts demand data loads that caused a completed page walk (2M and 4M page sizes). This implies it missed in all TLB levels. The page walk can end with or without a fault.",
+        "PublicDescription": "Counts page walks completed due to demand data loads whose address translations missed in the TLB and were mapped to 2M/4M pages.  The page walks can end with or without a page fault.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x08",
         "UMask": "0x8",
-        "BriefDescription": "Load miss in all TLB levels causes a page walk that completes. (1G)",
+        "BriefDescription": "Page walk completed due to a demand data load to a 1G page",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_LOAD_MISSES.WALK_COMPLETED_1G",
-        "PublicDescription": "Counts load misses in all DTLB levels that cause a completed page walk (1G page size). The page walk can end with or without a fault.",
+        "PublicDescription": "Counts page walks completed due to demand data loads whose address translations missed in the TLB and were mapped to 4K pages.  The page walks can end with or without a page fault.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -52,17 +52,17 @@
     {
         "EventCode": "0x08",
         "UMask": "0x10",
-        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake. ",
+        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake.",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_LOAD_MISSES.WALK_PENDING",
-        "PublicDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake microarchitecture. ",
+        "PublicDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a load. EPT page walk duration are excluded in Skylake microarchitecture.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x08",
         "UMask": "0x10",
-        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a load. EPT page walk duration are excluded in Skylake. ",
+        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a load. EPT page walk duration are excluded in Skylake.",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_LOAD_MISSES.WALK_ACTIVE",
         "CounterMask": "1",
@@ -93,30 +93,30 @@
     {
         "EventCode": "0x49",
         "UMask": "0x2",
-        "BriefDescription": "Store miss in all TLB levels causes a page walk that completes. (4K)",
+        "BriefDescription": "Page walk completed due to a demand data store to a 4K page",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_4K",
-        "PublicDescription": "Counts demand data stores that caused a completed page walk (4K page size). This implies it missed in all TLB levels. The page walk can end with or without a fault.",
+        "PublicDescription": "Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 4K pages.  The page walks can end with or without a page fault.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x49",
         "UMask": "0x4",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks (2M/4M)",
+        "BriefDescription": "Page walk completed due to a demand data store to a 2M/4M page",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M",
-        "PublicDescription": "Counts demand data stores that caused a completed page walk (2M and 4M page sizes). This implies it missed in all TLB levels. The page walk can end with or without a fault.",
+        "PublicDescription": "Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 2M/4M pages.  The page walks can end with or without a page fault.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x49",
         "UMask": "0x8",
-        "BriefDescription": "Store misses in all DTLB levels that cause completed page walks (1G)",
+        "BriefDescription": "Page walk completed due to a demand data store to a 1G page",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_STORE_MISSES.WALK_COMPLETED_1G",
-        "PublicDescription": "Counts store misses in all DTLB levels that cause a completed page walk (1G page size). The page walk can end with or without a fault.",
+        "PublicDescription": "Counts page walks completed due to demand data stores whose address translations missed in the TLB and were mapped to 1G pages.  The page walks can end with or without a page fault.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -133,17 +133,17 @@
     {
         "EventCode": "0x49",
         "UMask": "0x10",
-        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake. ",
+        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake.",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_STORE_MISSES.WALK_PENDING",
-        "PublicDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake microarchitecture. ",
+        "PublicDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for a store. EPT page walk duration are excluded in Skylake microarchitecture.",
         "SampleAfterValue": "2000003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
     {
         "EventCode": "0x49",
         "UMask": "0x10",
-        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a store. EPT page walk duration are excluded in Skylake. ",
+        "BriefDescription": "Cycles when at least one PMH is busy with a page walk for a store. EPT page walk duration are excluded in Skylake.",
         "Counter": "0,1,2,3",
         "EventName": "DTLB_STORE_MISSES.WALK_ACTIVE",
         "CounterMask": "1",
@@ -197,7 +197,7 @@
         "BriefDescription": "Code miss in all TLB levels causes a page walk that completes. (2M/4M)",
         "Counter": "0,1,2,3",
         "EventName": "ITLB_MISSES.WALK_COMPLETED_2M_4M",
-        "PublicDescription": "Counts completed page walks of any page size (4K/2M/4M/1G) caused by a code fetch. This implies it missed in the ITLB and further levels of TLB. The page walk can end with or without a fault.",
+        "PublicDescription": "Counts code misses in all ITLB levels that caused a completed page walk (2M and 4M page sizes). The page walk can end with or without a fault.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
@@ -224,10 +224,10 @@
     {
         "EventCode": "0x85",
         "UMask": "0x10",
-        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake. ",
+        "BriefDescription": "Counts 1 per cycle for each PMH that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake.",
         "Counter": "0,1,2,3",
         "EventName": "ITLB_MISSES.WALK_PENDING",
-        "PublicDescription": "Counts 1 per cycle for each PMH (Page Miss Handler) that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake michroarchitecture. ",
+        "PublicDescription": "Counts 1 per cycle for each PMH (Page Miss Handler) that is busy with a page walk for an instruction fetch request. EPT page walk duration are excluded in Skylake michroarchitecture.",
         "SampleAfterValue": "100003",
         "CounterHTOff": "0,1,2,3,4,5,6,7"
     },
diff --git a/tools/perf/pmu-events/jevents.c b/tools/perf/pmu-events/jevents.c
index 9eb7047..b578aa2 100644
--- a/tools/perf/pmu-events/jevents.c
+++ b/tools/perf/pmu-events/jevents.c
@@ -116,6 +116,43 @@ static void fixdesc(char *s)
 		*e = 0;
 }
 
+/* Add escapes for '\' so they are proper C strings. */
+static char *fixregex(char *s)
+{
+	int len = 0;
+	int esc_count = 0;
+	char *fixed = NULL;
+	char *p, *q;
+
+	/* Count the number of '\' in string */
+	for (p = s; *p; p++) {
+		++len;
+		if (*p == '\\')
+			++esc_count;
+	}
+
+	if (esc_count == 0)
+		return s;
+
+	/* allocate space for a new string */
+	fixed = (char *) malloc(len + 1);
+	if (!fixed)
+		return NULL;
+
+	/* copy over the characters */
+	q = fixed;
+	for (p = s; *p; p++) {
+		if (*p == '\\') {
+			*q = '\\';
+			++q;
+		}
+		*q = *p;
+		++q;
+	}
+	*q = '\0';
+	return fixed;
+}
+
 static struct msrmap {
 	const char *num;
 	const char *pname;
@@ -648,7 +685,7 @@ static int process_mapfile(FILE *outfp, char *fpath)
 		}
 		line[strlen(line)-1] = '\0';
 
-		cpuid = strtok_r(p, ",", &save);
+		cpuid = fixregex(strtok_r(p, ",", &save));
 		version = strtok_r(NULL, ",", &save);
 		fname = strtok_r(NULL, ",", &save);
 		type = strtok_r(NULL, ",", &save);
diff --git a/tools/perf/scripts/python/bin/mem-phys-addr-record b/tools/perf/scripts/python/bin/mem-phys-addr-record
new file mode 100644
index 0000000..5a87512
--- /dev/null
+++ b/tools/perf/scripts/python/bin/mem-phys-addr-record
@@ -0,0 +1,19 @@
+#!/bin/bash
+
+#
+# Profiling physical memory by all retired load instructions/uops event
+# MEM_INST_RETIRED.ALL_LOADS or MEM_UOPS_RETIRED.ALL_LOADS
+#
+
+load=`perf list | grep mem_inst_retired.all_loads`
+if [ -z "$load" ]; then
+	load=`perf list | grep mem_uops_retired.all_loads`
+fi
+if [ -z "$load" ]; then
+	echo "There is no event to count all retired load instructions/uops."
+	exit 1
+fi
+
+arg=$(echo $load | tr -d ' ')
+arg="$arg:P"
+perf record --phys-data -e $arg $@
diff --git a/tools/perf/scripts/python/bin/mem-phys-addr-report b/tools/perf/scripts/python/bin/mem-phys-addr-report
new file mode 100644
index 0000000..3f2b847
--- /dev/null
+++ b/tools/perf/scripts/python/bin/mem-phys-addr-report
@@ -0,0 +1,3 @@
+#!/bin/bash
+# description: resolve physical address samples
+perf script $@ -s "$PERF_EXEC_PATH"/scripts/python/mem-phys-addr.py
diff --git a/tools/perf/scripts/python/mem-phys-addr.py b/tools/perf/scripts/python/mem-phys-addr.py
new file mode 100644
index 0000000..ebee2c5
--- /dev/null
+++ b/tools/perf/scripts/python/mem-phys-addr.py
@@ -0,0 +1,95 @@
+# mem-phys-addr.py: Resolve physical address samples
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2018, Intel Corporation.
+
+from __future__ import division
+import os
+import sys
+import struct
+import re
+import bisect
+import collections
+
+sys.path.append(os.environ['PERF_EXEC_PATH'] + \
+	'/scripts/python/Perf-Trace-Util/lib/Perf/Trace')
+
+#physical address ranges for System RAM
+system_ram = []
+#physical address ranges for Persistent Memory
+pmem = []
+#file object for proc iomem
+f = None
+#Count for each type of memory
+load_mem_type_cnt = collections.Counter()
+#perf event name
+event_name = None
+
+def parse_iomem():
+	global f
+	f = open('/proc/iomem', 'r')
+	for i, j in enumerate(f):
+		m = re.split('-|:',j,2)
+		if m[2].strip() == 'System RAM':
+			system_ram.append(long(m[0], 16))
+			system_ram.append(long(m[1], 16))
+		if m[2].strip() == 'Persistent Memory':
+			pmem.append(long(m[0], 16))
+			pmem.append(long(m[1], 16))
+
+def print_memory_type():
+	print "Event: %s" % (event_name)
+	print "%-40s  %10s  %10s\n" % ("Memory type", "count", "percentage"),
+	print "%-40s  %10s  %10s\n" % ("----------------------------------------", \
+					"-----------", "-----------"),
+	total = sum(load_mem_type_cnt.values())
+	for mem_type, count in sorted(load_mem_type_cnt.most_common(), \
+					key = lambda(k, v): (v, k), reverse = True):
+		print "%-40s  %10d  %10.1f%%\n" % (mem_type, count, 100 * count / total),
+
+def trace_begin():
+	parse_iomem()
+
+def trace_end():
+	print_memory_type()
+	f.close()
+
+def is_system_ram(phys_addr):
+	#/proc/iomem is sorted
+	position = bisect.bisect(system_ram, phys_addr)
+	if position % 2 == 0:
+		return False
+	return True
+
+def is_persistent_mem(phys_addr):
+	position = bisect.bisect(pmem, phys_addr)
+	if position % 2 == 0:
+		return False
+	return True
+
+def find_memory_type(phys_addr):
+	if phys_addr == 0:
+		return "N/A"
+	if is_system_ram(phys_addr):
+		return "System RAM"
+
+	if is_persistent_mem(phys_addr):
+		return "Persistent Memory"
+
+	#slow path, search all
+	f.seek(0, 0)
+	for j in f:
+		m = re.split('-|:',j,2)
+		if long(m[0], 16) <= phys_addr <= long(m[1], 16):
+			return m[2]
+	return "N/A"
+
+def process_event(param_dict):
+	name       = param_dict["ev_name"]
+	sample     = param_dict["sample"]
+	phys_addr  = sample["phys_addr"]
+
+	global event_name
+	if event_name == None:
+		event_name = name
+	load_mem_type_cnt[find_memory_type(phys_addr)] += 1
diff --git a/tools/perf/tests/attr.c b/tools/perf/tests/attr.c
index 0e1367f..97f64ad 100644
--- a/tools/perf/tests/attr.c
+++ b/tools/perf/tests/attr.c
@@ -124,6 +124,12 @@ static int store_event(struct perf_event_attr *attr, pid_t pid, int cpu,
 	WRITE_ASS(exclude_guest,  "d");
 	WRITE_ASS(exclude_callchain_kernel, "d");
 	WRITE_ASS(exclude_callchain_user, "d");
+	WRITE_ASS(mmap2,	  "d");
+	WRITE_ASS(comm_exec,	  "d");
+	WRITE_ASS(context_switch, "d");
+	WRITE_ASS(write_backward, "d");
+	WRITE_ASS(namespaces,	  "d");
+	WRITE_ASS(use_clockid,    "d");
 	WRITE_ASS(wakeup_events, PRIu32);
 	WRITE_ASS(bp_type, PRIu32);
 	WRITE_ASS(config1, "llu");
diff --git a/tools/perf/tests/backward-ring-buffer.c b/tools/perf/tests/backward-ring-buffer.c
index 71b9a0b..4035d43 100644
--- a/tools/perf/tests/backward-ring-buffer.c
+++ b/tools/perf/tests/backward-ring-buffer.c
@@ -33,8 +33,8 @@ static int count_samples(struct perf_evlist *evlist, int *sample_count,
 	for (i = 0; i < evlist->nr_mmaps; i++) {
 		union perf_event *event;
 
-		perf_mmap__read_catchup(&evlist->backward_mmap[i]);
-		while ((event = perf_mmap__read_backward(&evlist->backward_mmap[i])) != NULL) {
+		perf_mmap__read_catchup(&evlist->overwrite_mmap[i]);
+		while ((event = perf_mmap__read_backward(&evlist->overwrite_mmap[i])) != NULL) {
 			const u32 type = event->header.type;
 
 			switch (type) {
@@ -59,7 +59,7 @@ static int do_test(struct perf_evlist *evlist, int mmap_pages,
 	int err;
 	char sbuf[STRERR_BUFSIZE];
 
-	err = perf_evlist__mmap(evlist, mmap_pages, true);
+	err = perf_evlist__mmap(evlist, mmap_pages);
 	if (err < 0) {
 		pr_debug("perf_evlist__mmap: %s\n",
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/bp_signal.c b/tools/perf/tests/bp_signal.c
index 335b695..a467615 100644
--- a/tools/perf/tests/bp_signal.c
+++ b/tools/perf/tests/bp_signal.c
@@ -296,7 +296,7 @@ bool test__bp_signal_is_supported(void)
  * instruction breakpoint using the perf event interface.
  * Once it's there we can release this.
  */
-#ifdef __powerpc__
+#if defined(__powerpc__) || defined(__s390x__)
 	return false;
 #else
 	return true;
diff --git a/tools/perf/tests/bpf-script-example.c b/tools/perf/tests/bpf-script-example.c
index 268e5f8..e4123c1 100644
--- a/tools/perf/tests/bpf-script-example.c
+++ b/tools/perf/tests/bpf-script-example.c
@@ -31,8 +31,8 @@ struct bpf_map_def SEC("maps") flip_table = {
 	.max_entries = 1,
 };
 
-SEC("func=SyS_epoll_wait")
-int bpf_func__SyS_epoll_wait(void *ctx)
+SEC("func=SyS_epoll_pwait")
+int bpf_func__SyS_epoll_pwait(void *ctx)
 {
 	int ind =0;
 	int *flag = bpf_map_lookup_elem(&flip_table, &ind);
diff --git a/tools/perf/tests/bpf.c b/tools/perf/tests/bpf.c
index 34c22cd..e8399be 100644
--- a/tools/perf/tests/bpf.c
+++ b/tools/perf/tests/bpf.c
@@ -3,6 +3,7 @@
 #include <sys/epoll.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <fcntl.h>
 #include <util/util.h>
 #include <util/bpf-loader.h>
 #include <util/evlist.h>
@@ -19,13 +20,13 @@
 
 #ifdef HAVE_LIBBPF_SUPPORT
 
-static int epoll_wait_loop(void)
+static int epoll_pwait_loop(void)
 {
 	int i;
 
 	/* Should fail NR_ITERS times */
 	for (i = 0; i < NR_ITERS; i++)
-		epoll_wait(-(i + 1), NULL, 0, 0);
+		epoll_pwait(-(i + 1), NULL, 0, 0, NULL);
 	return 0;
 }
 
@@ -63,46 +64,41 @@ static struct {
 	bool	pin;
 } bpf_testcase_table[] = {
 	{
-		LLVM_TESTCASE_BASE,
-		"Basic BPF filtering",
-		"[basic_bpf_test]",
-		"fix 'perf test LLVM' first",
-		"load bpf object failed",
-		&epoll_wait_loop,
-		(NR_ITERS + 1) / 2,
-		false,
+		.prog_id	  = LLVM_TESTCASE_BASE,
+		.desc		  = "Basic BPF filtering",
+		.name		  = "[basic_bpf_test]",
+		.msg_compile_fail = "fix 'perf test LLVM' first",
+		.msg_load_fail	  = "load bpf object failed",
+		.target_func	  = &epoll_pwait_loop,
+		.expect_result	  = (NR_ITERS + 1) / 2,
 	},
 	{
-		LLVM_TESTCASE_BASE,
-		"BPF pinning",
-		"[bpf_pinning]",
-		"fix kbuild first",
-		"check your vmlinux setting?",
-		&epoll_wait_loop,
-		(NR_ITERS + 1) / 2,
-		true,
+		.prog_id	  = LLVM_TESTCASE_BASE,
+		.desc		  = "BPF pinning",
+		.name		  = "[bpf_pinning]",
+		.msg_compile_fail = "fix kbuild first",
+		.msg_load_fail	  = "check your vmlinux setting?",
+		.target_func	  = &epoll_pwait_loop,
+		.expect_result	  = (NR_ITERS + 1) / 2,
+		.pin 		  = true,
 	},
 #ifdef HAVE_BPF_PROLOGUE
 	{
-		LLVM_TESTCASE_BPF_PROLOGUE,
-		"BPF prologue generation",
-		"[bpf_prologue_test]",
-		"fix kbuild first",
-		"check your vmlinux setting?",
-		&llseek_loop,
-		(NR_ITERS + 1) / 4,
-		false,
+		.prog_id	  = LLVM_TESTCASE_BPF_PROLOGUE,
+		.desc		  = "BPF prologue generation",
+		.name		  = "[bpf_prologue_test]",
+		.msg_compile_fail = "fix kbuild first",
+		.msg_load_fail	  = "check your vmlinux setting?",
+		.target_func	  = &llseek_loop,
+		.expect_result	  = (NR_ITERS + 1) / 4,
 	},
 #endif
 	{
-		LLVM_TESTCASE_BPF_RELOCATION,
-		"BPF relocation checker",
-		"[bpf_relocation_test]",
-		"fix 'perf test LLVM' first",
-		"libbpf error when dealing with relocation",
-		NULL,
-		0,
-		false,
+		.prog_id	  = LLVM_TESTCASE_BPF_RELOCATION,
+		.desc		  = "BPF relocation checker",
+		.name		  = "[bpf_relocation_test]",
+		.msg_compile_fail = "fix 'perf test LLVM' first",
+		.msg_load_fail	  = "libbpf error when dealing with relocation",
 	},
 };
 
@@ -167,7 +163,7 @@ static int do_test(struct bpf_object *obj, int (*func)(void),
 		goto out_delete_evlist;
 	}
 
-	err = perf_evlist__mmap(evlist, opts.mmap_pages, false);
+	err = perf_evlist__mmap(evlist, opts.mmap_pages);
 	if (err < 0) {
 		pr_debug("perf_evlist__mmap: %s\n",
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
@@ -190,7 +186,7 @@ static int do_test(struct bpf_object *obj, int (*func)(void),
 	}
 
 	if (count != expect) {
-		pr_debug("BPF filter result incorrect\n");
+		pr_debug("BPF filter result incorrect, expected %d, got %d samples\n", expect, count);
 		goto out_delete_evlist;
 	}
 
diff --git a/tools/perf/tests/builtin-test.c b/tools/perf/tests/builtin-test.c
index 766573e..fafa014 100644
--- a/tools/perf/tests/builtin-test.c
+++ b/tools/perf/tests/builtin-test.c
@@ -411,9 +411,9 @@ static const char *shell_test__description(char *description, size_t size,
 	return description ? trim(description + 1) : NULL;
 }
 
-#define for_each_shell_test(dir, ent)		\
+#define for_each_shell_test(dir, base, ent)	\
 	while ((ent = readdir(dir)) != NULL)	\
-		if (ent->d_type == DT_REG && ent->d_name[0] != '.')
+		if (!is_directory(base, ent))
 
 static const char *shell_tests__dir(char *path, size_t size)
 {
@@ -452,7 +452,7 @@ static int shell_tests__max_desc_width(void)
 	if (!dir)
 		return -1;
 
-	for_each_shell_test(dir, ent) {
+	for_each_shell_test(dir, path, ent) {
 		char bf[256];
 		const char *desc = shell_test__description(bf, sizeof(bf), path, ent->d_name);
 
@@ -504,7 +504,7 @@ static int run_shell_tests(int argc, const char *argv[], int i, int width)
 	if (!dir)
 		return -1;
 
-	for_each_shell_test(dir, ent) {
+	for_each_shell_test(dir, st.dir, ent) {
 		int curr = i++;
 		char desc[256];
 		struct test test = {
@@ -614,7 +614,7 @@ static int perf_test__list_shell(int argc, const char **argv, int i)
 	if (!dir)
 		return -1;
 
-	for_each_shell_test(dir, ent) {
+	for_each_shell_test(dir, path, ent) {
 		int curr = i++;
 		char bf[256];
 		struct test t = {
diff --git a/tools/perf/tests/code-reading.c b/tools/perf/tests/code-reading.c
index fcc8984..3bf7b14 100644
--- a/tools/perf/tests/code-reading.c
+++ b/tools/perf/tests/code-reading.c
@@ -639,7 +639,7 @@ static int do_test_code_reading(bool try_kcore)
 		break;
 	}
 
-	ret = perf_evlist__mmap(evlist, UINT_MAX, false);
+	ret = perf_evlist__mmap(evlist, UINT_MAX);
 	if (ret < 0) {
 		pr_debug("perf_evlist__mmap failed\n");
 		goto out_put;
diff --git a/tools/perf/tests/dwarf-unwind.c b/tools/perf/tests/dwarf-unwind.c
index ac40e05..26041896 100644
--- a/tools/perf/tests/dwarf-unwind.c
+++ b/tools/perf/tests/dwarf-unwind.c
@@ -173,6 +173,7 @@ int test__dwarf_unwind(struct test *test __maybe_unused, int subtest __maybe_unu
 	}
 
 	callchain_param.record_mode = CALLCHAIN_DWARF;
+	dwarf_callchain_users = true;
 
 	if (init_live_machine(machine)) {
 		pr_err("Could not init machine\n");
diff --git a/tools/perf/tests/keep-tracking.c b/tools/perf/tests/keep-tracking.c
index 842d336..c465309 100644
--- a/tools/perf/tests/keep-tracking.c
+++ b/tools/perf/tests/keep-tracking.c
@@ -95,7 +95,7 @@ int test__keep_tracking(struct test *test __maybe_unused, int subtest __maybe_un
 		goto out_err;
 	}
 
-	CHECK__(perf_evlist__mmap(evlist, UINT_MAX, false));
+	CHECK__(perf_evlist__mmap(evlist, UINT_MAX));
 
 	/*
 	 * First, test that a 'comm' event can be found when the event is
diff --git a/tools/perf/tests/mmap-basic.c b/tools/perf/tests/mmap-basic.c
index 5a8bf31..c0e971d 100644
--- a/tools/perf/tests/mmap-basic.c
+++ b/tools/perf/tests/mmap-basic.c
@@ -94,7 +94,7 @@ int test__basic_mmap(struct test *test __maybe_unused, int subtest __maybe_unuse
 		expected_nr_events[i] = 1 + rand() % 127;
 	}
 
-	if (perf_evlist__mmap(evlist, 128, true) < 0) {
+	if (perf_evlist__mmap(evlist, 128) < 0) {
 		pr_debug("failed to mmap events: %d (%s)\n", errno,
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
 		goto out_delete_evlist;
diff --git a/tools/perf/tests/openat-syscall-tp-fields.c b/tools/perf/tests/openat-syscall-tp-fields.c
index d9619d2..4351926 100644
--- a/tools/perf/tests/openat-syscall-tp-fields.c
+++ b/tools/perf/tests/openat-syscall-tp-fields.c
@@ -1,5 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <linux/err.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
 #include "perf.h"
 #include "evlist.h"
 #include "evsel.h"
@@ -64,7 +67,7 @@ int test__syscall_openat_tp_fields(struct test *test __maybe_unused, int subtest
 		goto out_delete_evlist;
 	}
 
-	err = perf_evlist__mmap(evlist, UINT_MAX, false);
+	err = perf_evlist__mmap(evlist, UINT_MAX);
 	if (err < 0) {
 		pr_debug("perf_evlist__mmap: %s\n",
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/parse-events.c b/tools/perf/tests/parse-events.c
index f067961..18b0644 100644
--- a/tools/perf/tests/parse-events.c
+++ b/tools/perf/tests/parse-events.c
@@ -13,7 +13,6 @@
 #include <unistd.h>
 #include <linux/kernel.h>
 #include <linux/hw_breakpoint.h>
-#include <api/fs/fs.h>
 #include <api/fs/tracing_path.h>
 
 #define PERF_TP_SAMPLE_TYPE (PERF_SAMPLE_RAW | PERF_SAMPLE_TIME | \
diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c
index c34904d..0afafab 100644
--- a/tools/perf/tests/perf-record.c
+++ b/tools/perf/tests/perf-record.c
@@ -141,7 +141,7 @@ int test__PERF_RECORD(struct test *test __maybe_unused, int subtest __maybe_unus
 	 * fds in the same CPU to be injected in the same mmap ring buffer
 	 * (using ioctl(PERF_EVENT_IOC_SET_OUTPUT)).
 	 */
-	err = perf_evlist__mmap(evlist, opts.mmap_pages, false);
+	err = perf_evlist__mmap(evlist, opts.mmap_pages);
 	if (err < 0) {
 		pr_debug("perf_evlist__mmap: %s\n",
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/sample-parsing.c b/tools/perf/tests/sample-parsing.c
index 3ec6302..0e2d00d 100644
--- a/tools/perf/tests/sample-parsing.c
+++ b/tools/perf/tests/sample-parsing.c
@@ -248,7 +248,7 @@ static int do_test(u64 sample_type, u64 sample_regs, u64 read_format)
 	event->header.size = sz;
 
 	err = perf_event__synthesize_sample(event, sample_type, read_format,
-					    &sample, false);
+					    &sample);
 	if (err) {
 		pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
 			 "perf_event__synthesize_sample", sample_type, err);
diff --git a/tools/perf/tests/shell/trace+probe_vfs_getname.sh b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
index 2a9ef08..55ad979 100755
--- a/tools/perf/tests/shell/trace+probe_vfs_getname.sh
+++ b/tools/perf/tests/shell/trace+probe_vfs_getname.sh
@@ -17,10 +17,9 @@
 file=$(mktemp /tmp/temporary_file.XXXXX)
 
 trace_open_vfs_getname() {
-	test "$(uname -m)" = s390x && { svc="openat"; txt="dfd: +CWD, +"; }
-
-	perf trace -e ${svc:-open} touch $file 2>&1 | \
-	egrep " +[0-9]+\.[0-9]+ +\( +[0-9]+\.[0-9]+ ms\): +touch\/[0-9]+ ${svc:-open}\(${txt}filename: +${file}, +flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, +mode: +IRUGO\|IWUGO\) += +[0-9]+$"
+	evts=$(echo $(perf list syscalls:sys_enter_open* |& egrep 'open(at)? ' | sed -r 's/.*sys_enter_([a-z]+) +\[.*$/\1/') | sed 's/ /,/')
+	perf trace -e $evts touch $file 2>&1 | \
+	egrep " +[0-9]+\.[0-9]+ +\( +[0-9]+\.[0-9]+ ms\): +touch\/[0-9]+ open(at)?\((dfd: +CWD, +)?filename: +${file}, +flags: CREAT\|NOCTTY\|NONBLOCK\|WRONLY, +mode: +IRUGO\|IWUGO\) += +[0-9]+$"
 }
 
 
diff --git a/tools/perf/tests/sw-clock.c b/tools/perf/tests/sw-clock.c
index 725a196..f6c72f9 100644
--- a/tools/perf/tests/sw-clock.c
+++ b/tools/perf/tests/sw-clock.c
@@ -78,7 +78,7 @@ static int __test__sw_clock_freq(enum perf_sw_ids clock_id)
 		goto out_delete_evlist;
 	}
 
-	err = perf_evlist__mmap(evlist, 128, true);
+	err = perf_evlist__mmap(evlist, 128);
 	if (err < 0) {
 		pr_debug("failed to mmap event: %d (%s)\n", errno,
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
diff --git a/tools/perf/tests/switch-tracking.c b/tools/perf/tests/switch-tracking.c
index 7d3f4bf..33e0029 100644
--- a/tools/perf/tests/switch-tracking.c
+++ b/tools/perf/tests/switch-tracking.c
@@ -449,7 +449,7 @@ int test__switch_tracking(struct test *test __maybe_unused, int subtest __maybe_
 		goto out;
 	}
 
-	err = perf_evlist__mmap(evlist, UINT_MAX, false);
+	err = perf_evlist__mmap(evlist, UINT_MAX);
 	if (err) {
 		pr_debug("perf_evlist__mmap failed!\n");
 		goto out_err;
diff --git a/tools/perf/tests/task-exit.c b/tools/perf/tests/task-exit.c
index 89c8e16..01b62b8 100644
--- a/tools/perf/tests/task-exit.c
+++ b/tools/perf/tests/task-exit.c
@@ -101,7 +101,7 @@ int test__task_exit(struct test *test __maybe_unused, int subtest __maybe_unused
 		goto out_delete_evlist;
 	}
 
-	if (perf_evlist__mmap(evlist, 128, true) < 0) {
+	if (perf_evlist__mmap(evlist, 128) < 0) {
 		pr_debug("failed to mmap events: %d (%s)\n", errno,
 			 str_error_r(errno, sbuf, sizeof(sbuf)));
 		goto out_delete_evlist;
diff --git a/tools/perf/tests/thread-map.c b/tools/perf/tests/thread-map.c
index dbcb6a1..4de1939 100644
--- a/tools/perf/tests/thread-map.c
+++ b/tools/perf/tests/thread-map.c
@@ -105,7 +105,7 @@ int test__thread_map_remove(struct test *test __maybe_unused, int subtest __mayb
 	TEST_ASSERT_VAL("failed to allocate map string",
 			asprintf(&str, "%d,%d", getpid(), getppid()) >= 0);
 
-	threads = thread_map__new_str(str, NULL, 0);
+	threads = thread_map__new_str(str, NULL, 0, false);
 
 	TEST_ASSERT_VAL("failed to allocate thread_map",
 			threads);
diff --git a/tools/perf/trace/beauty/Build b/tools/perf/trace/beauty/Build
index 066bbf0..66330d4 100644
--- a/tools/perf/trace/beauty/Build
+++ b/tools/perf/trace/beauty/Build
@@ -1,5 +1,6 @@
 libperf-y += clone.o
 libperf-y += fcntl.o
+libperf-y += flock.o
 ifeq ($(SRCARCH),$(filter $(SRCARCH),x86))
 libperf-y += ioctl.o
 endif
diff --git a/tools/perf/trace/beauty/arch_errno_names.c b/tools/perf/trace/beauty/arch_errno_names.c
new file mode 100644
index 0000000..ede031c
--- /dev/null
+++ b/tools/perf/trace/beauty/arch_errno_names.c
@@ -0,0 +1 @@
+#include "trace/beauty/generated/arch_errno_name_array.c"
diff --git a/tools/perf/trace/beauty/arch_errno_names.sh b/tools/perf/trace/beauty/arch_errno_names.sh
new file mode 100755
index 0000000..22c9fc9
--- /dev/null
+++ b/tools/perf/trace/beauty/arch_errno_names.sh
@@ -0,0 +1,100 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0
+#
+# Generate C file mapping errno codes to errno names.
+#
+# Copyright IBM Corp. 2018
+# Author(s):  Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
+
+gcc="$1"
+toolsdir="$2"
+include_path="-I$toolsdir/include/uapi"
+
+arch_string()
+{
+	echo "$1" |sed -e 'y/- /__/' |tr '[[:upper:]]' '[[:lower:]]'
+}
+
+asm_errno_file()
+{
+	local arch="$1"
+	local header
+
+	header="$toolsdir/arch/$arch/include/uapi/asm/errno.h"
+	if test -r "$header"; then
+		echo "$header"
+	else
+		echo "$toolsdir/include/uapi/asm-generic/errno.h"
+	fi
+}
+
+create_errno_lookup_func()
+{
+	local arch=$(arch_string "$1")
+	local nr name
+
+	cat <<EoFuncBegin
+static const char *errno_to_name__$arch(int err)
+{
+	switch (err) {
+EoFuncBegin
+
+	while read name nr; do
+		printf '\tcase %d: return "%s";\n' $nr $name
+	done
+
+	cat <<EoFuncEnd
+	default:
+		return "(unknown)";
+	}
+}
+
+EoFuncEnd
+}
+
+process_arch()
+{
+	local arch="$1"
+	local asm_errno=$(asm_errno_file "$arch")
+
+	$gcc $include_path -E -dM -x c $asm_errno \
+		|grep -hE '^#define[[:blank:]]+(E[^[:blank:]]+)[[:blank:]]+([[:digit:]]+).*' \
+		|awk '{ print $2","$3; }' \
+		|sort -t, -k2 -nu \
+		|IFS=, create_errno_lookup_func "$arch"
+}
+
+create_arch_errno_table_func()
+{
+	local archlist="$1"
+	local default="$2"
+	local arch
+
+	printf 'const char *arch_syscalls__strerrno(const char *arch, int err)\n'
+	printf '{\n'
+	for arch in $archlist; do
+		printf '\tif (!strcmp(arch, "%s"))\n' $(arch_string "$arch")
+		printf '\t\treturn errno_to_name__%s(err);\n' $(arch_string "$arch")
+	done
+	printf '\treturn errno_to_name__%s(err);\n' $(arch_string "$default")
+	printf '}\n'
+}
+
+cat <<EoHEADER
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include <string.h>
+
+EoHEADER
+
+# Create list of architectures and ignore those that do not appear
+# in tools/perf/arch
+archlist=""
+for arch in $(find $toolsdir/arch -maxdepth 1 -mindepth 1 -type d -printf "%f\n" | grep -v x86 | sort); do
+	test -d arch/$arch && archlist="$archlist $arch"
+done
+
+for arch in x86 $archlist generic; do
+	process_arch "$arch"
+done
+create_arch_errno_table_func "x86 $archlist" "generic"
diff --git a/tools/perf/trace/beauty/beauty.h b/tools/perf/trace/beauty/beauty.h
index a6dfd04..984a504 100644
--- a/tools/perf/trace/beauty/beauty.h
+++ b/tools/perf/trace/beauty/beauty.h
@@ -79,6 +79,9 @@ size_t syscall_arg__scnprintf_fcntl_cmd(char *bf, size_t size, struct syscall_ar
 size_t syscall_arg__scnprintf_fcntl_arg(char *bf, size_t size, struct syscall_arg *arg);
 #define SCA_FCNTL_ARG syscall_arg__scnprintf_fcntl_arg
 
+size_t syscall_arg__scnprintf_flock(char *bf, size_t size, struct syscall_arg *arg);
+#define SCA_FLOCK syscall_arg__scnprintf_flock
+
 size_t syscall_arg__scnprintf_ioctl_cmd(char *bf, size_t size, struct syscall_arg *arg);
 #define SCA_IOCTL_CMD syscall_arg__scnprintf_ioctl_cmd
 
@@ -114,4 +117,6 @@ size_t open__scnprintf_flags(unsigned long flags, char *bf, size_t size);
 void syscall_arg__set_ret_scnprintf(struct syscall_arg *arg,
 				    size_t (*ret_scnprintf)(char *bf, size_t size, struct syscall_arg *arg));
 
+const char *arch_syscalls__strerrno(const char *arch, int err);
+
 #endif /* _PERF_TRACE_BEAUTY_H */
diff --git a/tools/perf/trace/beauty/flock.c b/tools/perf/trace/beauty/flock.c
index f9707f5..c4ff6ad 100644
--- a/tools/perf/trace/beauty/flock.c
+++ b/tools/perf/trace/beauty/flock.c
@@ -1,5 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
-#include <fcntl.h>
+
+#include "trace/beauty/beauty.h"
+#include <linux/kernel.h>
+#include <uapi/linux/fcntl.h>
 
 #ifndef LOCK_MAND
 #define LOCK_MAND	 32
@@ -17,8 +20,7 @@
 #define LOCK_RW		192
 #endif
 
-static size_t syscall_arg__scnprintf_flock(char *bf, size_t size,
-					   struct syscall_arg *arg)
+size_t syscall_arg__scnprintf_flock(char *bf, size_t size, struct syscall_arg *arg)
 {
 	int printed = 0, op = arg->val;
 
@@ -45,5 +47,3 @@ static size_t syscall_arg__scnprintf_flock(char *bf, size_t size,
 
 	return printed;
 }
-
-#define SCA_FLOCK syscall_arg__scnprintf_flock
diff --git a/tools/perf/trace/beauty/futex_val3.c b/tools/perf/trace/beauty/futex_val3.c
new file mode 100644
index 0000000..26f6b32
--- /dev/null
+++ b/tools/perf/trace/beauty/futex_val3.c
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/futex.h>
+
+#ifndef FUTEX_BITSET_MATCH_ANY
+#define FUTEX_BITSET_MATCH_ANY 0xffffffff
+#endif
+
+static size_t syscall_arg__scnprintf_futex_val3(char *bf, size_t size, struct syscall_arg *arg)
+{
+	unsigned int bitset = arg->val;
+
+	if (bitset == FUTEX_BITSET_MATCH_ANY)
+		return scnprintf(bf, size, "MATCH_ANY");
+
+	return scnprintf(bf, size, "%#xd", bitset);
+}
+
+#define SCA_FUTEX_VAL3  syscall_arg__scnprintf_futex_val3
diff --git a/tools/perf/ui/browsers/annotate.c b/tools/perf/ui/browsers/annotate.c
index 8f7f59d..2864279 100644
--- a/tools/perf/ui/browsers/annotate.c
+++ b/tools/perf/ui/browsers/annotate.c
@@ -25,16 +25,10 @@ struct disasm_line_samples {
 #define IPC_WIDTH 6
 #define CYCLES_WIDTH 6
 
-struct browser_disasm_line {
-	struct rb_node			rb_node;
-	u32				idx;
-	int				idx_asm;
-	int				jump_sources;
-	/*
-	 * actual length of this array is saved on the nr_events field
-	 * of the struct annotate_browser
-	 */
-	struct disasm_line_samples	samples[1];
+struct browser_line {
+	u32	idx;
+	int	idx_asm;
+	int	jump_sources;
 };
 
 static struct annotate_browser_opt {
@@ -53,39 +47,43 @@ static struct annotate_browser_opt {
 struct arch;
 
 struct annotate_browser {
-	struct ui_browser b;
-	struct rb_root	  entries;
-	struct rb_node	  *curr_hot;
-	struct disasm_line  *selection;
-	struct disasm_line  **offsets;
-	struct arch	    *arch;
-	int		    nr_events;
-	u64		    start;
-	int		    nr_asm_entries;
-	int		    nr_entries;
-	int		    max_jump_sources;
-	int		    nr_jumps;
-	bool		    searching_backwards;
-	bool		    have_cycles;
-	u8		    addr_width;
-	u8		    jumps_width;
-	u8		    target_width;
-	u8		    min_addr_width;
-	u8		    max_addr_width;
-	char		    search_bf[128];
+	struct ui_browser	    b;
+	struct rb_root		    entries;
+	struct rb_node		   *curr_hot;
+	struct annotation_line	   *selection;
+	struct annotation_line	  **offsets;
+	struct arch		   *arch;
+	int			    nr_events;
+	u64			    start;
+	int			    nr_asm_entries;
+	int			    nr_entries;
+	int			    max_jump_sources;
+	int			    nr_jumps;
+	bool			    searching_backwards;
+	bool			    have_cycles;
+	u8			    addr_width;
+	u8			    jumps_width;
+	u8			    target_width;
+	u8			    min_addr_width;
+	u8			    max_addr_width;
+	char			    search_bf[128];
 };
 
-static inline struct browser_disasm_line *disasm_line__browser(struct disasm_line *dl)
+static inline struct browser_line *browser_line(struct annotation_line *al)
 {
-	return (struct browser_disasm_line *)(dl + 1);
+	void *ptr = al;
+
+	ptr = container_of(al, struct disasm_line, al);
+	return ptr - sizeof(struct browser_line);
 }
 
 static bool disasm_line__filter(struct ui_browser *browser __maybe_unused,
 				void *entry)
 {
 	if (annotate_browser__opts.hide_src_code) {
-		struct disasm_line *dl = list_entry(entry, struct disasm_line, node);
-		return dl->offset == -1;
+		struct annotation_line *al = list_entry(entry, struct annotation_line, node);
+
+		return al->offset == -1;
 	}
 
 	return false;
@@ -120,11 +118,37 @@ static int annotate_browser__cycles_width(struct annotate_browser *ab)
 	return ab->have_cycles ? IPC_WIDTH + CYCLES_WIDTH : 0;
 }
 
+static void disasm_line__write(struct disasm_line *dl, struct ui_browser *browser,
+			       char *bf, size_t size)
+{
+	if (dl->ins.ops && dl->ins.ops->scnprintf) {
+		if (ins__is_jump(&dl->ins)) {
+			bool fwd = dl->ops.target.offset > dl->al.offset;
+
+			ui_browser__write_graph(browser, fwd ? SLSMG_DARROW_CHAR :
+							    SLSMG_UARROW_CHAR);
+			SLsmg_write_char(' ');
+		} else if (ins__is_call(&dl->ins)) {
+			ui_browser__write_graph(browser, SLSMG_RARROW_CHAR);
+			SLsmg_write_char(' ');
+		} else if (ins__is_ret(&dl->ins)) {
+			ui_browser__write_graph(browser, SLSMG_LARROW_CHAR);
+			SLsmg_write_char(' ');
+		} else {
+			ui_browser__write_nstring(browser, " ", 2);
+		}
+	} else {
+		ui_browser__write_nstring(browser, " ", 2);
+	}
+
+	disasm_line__scnprintf(dl, bf, size, !annotate_browser__opts.use_offset);
+}
+
 static void annotate_browser__write(struct ui_browser *browser, void *entry, int row)
 {
 	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
-	struct disasm_line *dl = list_entry(entry, struct disasm_line, node);
-	struct browser_disasm_line *bdl = disasm_line__browser(dl);
+	struct annotation_line *al = list_entry(entry, struct annotation_line, node);
+	struct browser_line *bl = browser_line(al);
 	bool current_entry = ui_browser__is_current_entry(browser, row);
 	bool change_color = (!annotate_browser__opts.hide_src_code &&
 			     (!current_entry || (browser->use_navkeypressed &&
@@ -137,32 +161,32 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 	bool show_title = false;
 
 	for (i = 0; i < ab->nr_events; i++) {
-		if (bdl->samples[i].percent > percent_max)
-			percent_max = bdl->samples[i].percent;
+		if (al->samples[i].percent > percent_max)
+			percent_max = al->samples[i].percent;
 	}
 
-	if ((row == 0) && (dl->offset == -1 || percent_max == 0.0)) {
+	if ((row == 0) && (al->offset == -1 || percent_max == 0.0)) {
 		if (ab->have_cycles) {
-			if (dl->ipc == 0.0 && dl->cycles == 0)
+			if (al->ipc == 0.0 && al->cycles == 0)
 				show_title = true;
 		} else
 			show_title = true;
 	}
 
-	if (dl->offset != -1 && percent_max != 0.0) {
+	if (al->offset != -1 && percent_max != 0.0) {
 		for (i = 0; i < ab->nr_events; i++) {
 			ui_browser__set_percent_color(browser,
-						bdl->samples[i].percent,
+						al->samples[i].percent,
 						current_entry);
 			if (annotate_browser__opts.show_total_period) {
 				ui_browser__printf(browser, "%11" PRIu64 " ",
-						   bdl->samples[i].he.period);
+						   al->samples[i].he.period);
 			} else if (annotate_browser__opts.show_nr_samples) {
 				ui_browser__printf(browser, "%6" PRIu64 " ",
-						   bdl->samples[i].he.nr_samples);
+						   al->samples[i].he.nr_samples);
 			} else {
 				ui_browser__printf(browser, "%6.2f ",
-						   bdl->samples[i].percent);
+						   al->samples[i].percent);
 			}
 		}
 	} else {
@@ -177,16 +201,16 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 		}
 	}
 	if (ab->have_cycles) {
-		if (dl->ipc)
-			ui_browser__printf(browser, "%*.2f ", IPC_WIDTH - 1, dl->ipc);
+		if (al->ipc)
+			ui_browser__printf(browser, "%*.2f ", IPC_WIDTH - 1, al->ipc);
 		else if (!show_title)
 			ui_browser__write_nstring(browser, " ", IPC_WIDTH);
 		else
 			ui_browser__printf(browser, "%*s ", IPC_WIDTH - 1, "IPC");
 
-		if (dl->cycles)
+		if (al->cycles)
 			ui_browser__printf(browser, "%*" PRIu64 " ",
-					   CYCLES_WIDTH - 1, dl->cycles);
+					   CYCLES_WIDTH - 1, al->cycles);
 		else if (!show_title)
 			ui_browser__write_nstring(browser, " ", CYCLES_WIDTH);
 		else
@@ -199,19 +223,19 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 	if (!browser->navkeypressed)
 		width += 1;
 
-	if (!*dl->line)
+	if (!*al->line)
 		ui_browser__write_nstring(browser, " ", width - pcnt_width - cycles_width);
-	else if (dl->offset == -1) {
-		if (dl->line_nr && annotate_browser__opts.show_linenr)
+	else if (al->offset == -1) {
+		if (al->line_nr && annotate_browser__opts.show_linenr)
 			printed = scnprintf(bf, sizeof(bf), "%-*d ",
-					ab->addr_width + 1, dl->line_nr);
+					ab->addr_width + 1, al->line_nr);
 		else
 			printed = scnprintf(bf, sizeof(bf), "%*s  ",
 				    ab->addr_width, " ");
 		ui_browser__write_nstring(browser, bf, printed);
-		ui_browser__write_nstring(browser, dl->line, width - printed - pcnt_width - cycles_width + 1);
+		ui_browser__write_nstring(browser, al->line, width - printed - pcnt_width - cycles_width + 1);
 	} else {
-		u64 addr = dl->offset;
+		u64 addr = al->offset;
 		int color = -1;
 
 		if (!annotate_browser__opts.use_offset)
@@ -220,13 +244,13 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 		if (!annotate_browser__opts.use_offset) {
 			printed = scnprintf(bf, sizeof(bf), "%" PRIx64 ": ", addr);
 		} else {
-			if (bdl->jump_sources) {
+			if (bl->jump_sources) {
 				if (annotate_browser__opts.show_nr_jumps) {
 					int prev;
 					printed = scnprintf(bf, sizeof(bf), "%*d ",
 							    ab->jumps_width,
-							    bdl->jump_sources);
-					prev = annotate_browser__set_jumps_percent_color(ab, bdl->jump_sources,
+							    bl->jump_sources);
+					prev = annotate_browser__set_jumps_percent_color(ab, bl->jump_sources,
 											 current_entry);
 					ui_browser__write_nstring(browser, bf, printed);
 					ui_browser__set_color(browser, prev);
@@ -245,32 +269,14 @@ static void annotate_browser__write(struct ui_browser *browser, void *entry, int
 		ui_browser__write_nstring(browser, bf, printed);
 		if (change_color)
 			ui_browser__set_color(browser, color);
-		if (dl->ins.ops && dl->ins.ops->scnprintf) {
-			if (ins__is_jump(&dl->ins)) {
-				bool fwd = dl->ops.target.offset > dl->offset;
 
-				ui_browser__write_graph(browser, fwd ? SLSMG_DARROW_CHAR :
-								    SLSMG_UARROW_CHAR);
-				SLsmg_write_char(' ');
-			} else if (ins__is_call(&dl->ins)) {
-				ui_browser__write_graph(browser, SLSMG_RARROW_CHAR);
-				SLsmg_write_char(' ');
-			} else if (ins__is_ret(&dl->ins)) {
-				ui_browser__write_graph(browser, SLSMG_LARROW_CHAR);
-				SLsmg_write_char(' ');
-			} else {
-				ui_browser__write_nstring(browser, " ", 2);
-			}
-		} else {
-			ui_browser__write_nstring(browser, " ", 2);
-		}
+		disasm_line__write(disasm_line(al), browser, bf, sizeof(bf));
 
-		disasm_line__scnprintf(dl, bf, sizeof(bf), !annotate_browser__opts.use_offset);
 		ui_browser__write_nstring(browser, bf, width - pcnt_width - cycles_width - 3 - printed);
 	}
 
 	if (current_entry)
-		ab->selection = dl;
+		ab->selection = al;
 }
 
 static bool disasm_line__is_valid_jump(struct disasm_line *dl, struct symbol *sym)
@@ -286,7 +292,7 @@ static bool disasm_line__is_valid_jump(struct disasm_line *dl, struct symbol *sy
 
 static bool is_fused(struct annotate_browser *ab, struct disasm_line *cursor)
 {
-	struct disasm_line *pos = list_prev_entry(cursor, node);
+	struct disasm_line *pos = list_prev_entry(cursor, al.node);
 	const char *name;
 
 	if (!pos)
@@ -306,8 +312,9 @@ static bool is_fused(struct annotate_browser *ab, struct disasm_line *cursor)
 static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 {
 	struct annotate_browser *ab = container_of(browser, struct annotate_browser, b);
-	struct disasm_line *cursor = ab->selection, *target;
-	struct browser_disasm_line *btarget, *bcursor;
+	struct disasm_line *cursor = disasm_line(ab->selection);
+	struct annotation_line *target;
+	struct browser_line *btarget, *bcursor;
 	unsigned int from, to;
 	struct map_symbol *ms = ab->b.priv;
 	struct symbol *sym = ms->sym;
@@ -321,11 +328,9 @@ static void annotate_browser__draw_current_jump(struct ui_browser *browser)
 		return;
 
 	target = ab->offsets[cursor->ops.target.offset];
-	if (!target)
-		return;
 
-	bcursor = disasm_line__browser(cursor);
-	btarget = disasm_line__browser(target);
+	bcursor = browser_line(&cursor->al);
+	btarget = browser_line(target);
 
 	if (annotate_browser__opts.hide_src_code) {
 		from = bcursor->idx_asm;
@@ -361,12 +366,11 @@ static unsigned int annotate_browser__refresh(struct ui_browser *browser)
 	return ret;
 }
 
-static int disasm__cmp(struct browser_disasm_line *a,
-		       struct browser_disasm_line *b, int nr_pcnt)
+static int disasm__cmp(struct annotation_line *a, struct annotation_line *b)
 {
 	int i;
 
-	for (i = 0; i < nr_pcnt; i++) {
+	for (i = 0; i < a->samples_nr; i++) {
 		if (a->samples[i].percent == b->samples[i].percent)
 			continue;
 		return a->samples[i].percent < b->samples[i].percent;
@@ -374,28 +378,27 @@ static int disasm__cmp(struct browser_disasm_line *a,
 	return 0;
 }
 
-static void disasm_rb_tree__insert(struct rb_root *root, struct browser_disasm_line *bdl,
-				   int nr_events)
+static void disasm_rb_tree__insert(struct rb_root *root, struct annotation_line *al)
 {
 	struct rb_node **p = &root->rb_node;
 	struct rb_node *parent = NULL;
-	struct browser_disasm_line *l;
+	struct annotation_line *l;
 
 	while (*p != NULL) {
 		parent = *p;
-		l = rb_entry(parent, struct browser_disasm_line, rb_node);
+		l = rb_entry(parent, struct annotation_line, rb_node);
 
-		if (disasm__cmp(bdl, l, nr_events))
+		if (disasm__cmp(al, l))
 			p = &(*p)->rb_left;
 		else
 			p = &(*p)->rb_right;
 	}
-	rb_link_node(&bdl->rb_node, parent, p);
-	rb_insert_color(&bdl->rb_node, root);
+	rb_link_node(&al->rb_node, parent, p);
+	rb_insert_color(&al->rb_node, root);
 }
 
 static void annotate_browser__set_top(struct annotate_browser *browser,
-				      struct disasm_line *pos, u32 idx)
+				      struct annotation_line *pos, u32 idx)
 {
 	unsigned back;
 
@@ -404,7 +407,7 @@ static void annotate_browser__set_top(struct annotate_browser *browser,
 	browser->b.top_idx = browser->b.index = idx;
 
 	while (browser->b.top_idx != 0 && back != 0) {
-		pos = list_entry(pos->node.prev, struct disasm_line, node);
+		pos = list_entry(pos->node.prev, struct annotation_line, node);
 
 		if (disasm_line__filter(&browser->b, &pos->node))
 			continue;
@@ -420,12 +423,13 @@ static void annotate_browser__set_top(struct annotate_browser *browser,
 static void annotate_browser__set_rb_top(struct annotate_browser *browser,
 					 struct rb_node *nd)
 {
-	struct browser_disasm_line *bpos;
-	struct disasm_line *pos;
+	struct browser_line *bpos;
+	struct annotation_line *pos;
 	u32 idx;
 
-	bpos = rb_entry(nd, struct browser_disasm_line, rb_node);
-	pos = ((struct disasm_line *)bpos) - 1;
+	pos = rb_entry(nd, struct annotation_line, rb_node);
+	bpos = browser_line(pos);
+
 	idx = bpos->idx;
 	if (annotate_browser__opts.hide_src_code)
 		idx = bpos->idx_asm;
@@ -439,46 +443,35 @@ static void annotate_browser__calc_percent(struct annotate_browser *browser,
 	struct map_symbol *ms = browser->b.priv;
 	struct symbol *sym = ms->sym;
 	struct annotation *notes = symbol__annotation(sym);
-	struct disasm_line *pos, *next;
-	s64 len = symbol__size(sym);
+	struct disasm_line *pos;
 
 	browser->entries = RB_ROOT;
 
 	pthread_mutex_lock(&notes->lock);
 
-	list_for_each_entry(pos, &notes->src->source, node) {
-		struct browser_disasm_line *bpos = disasm_line__browser(pos);
-		const char *path = NULL;
+	symbol__calc_percent(sym, evsel);
+
+	list_for_each_entry(pos, &notes->src->source, al.node) {
 		double max_percent = 0.0;
 		int i;
 
-		if (pos->offset == -1) {
-			RB_CLEAR_NODE(&bpos->rb_node);
+		if (pos->al.offset == -1) {
+			RB_CLEAR_NODE(&pos->al.rb_node);
 			continue;
 		}
 
-		next = disasm__get_next_ip_line(&notes->src->source, pos);
+		for (i = 0; i < pos->al.samples_nr; i++) {
+			struct annotation_data *sample = &pos->al.samples[i];
 
-		for (i = 0; i < browser->nr_events; i++) {
-			struct sym_hist_entry sample;
-
-			bpos->samples[i].percent = disasm__calc_percent(notes,
-						evsel->idx + i,
-						pos->offset,
-						next ? next->offset : len,
-						&path, &sample);
-			bpos->samples[i].he = sample;
-
-			if (max_percent < bpos->samples[i].percent)
-				max_percent = bpos->samples[i].percent;
+			if (max_percent < sample->percent)
+				max_percent = sample->percent;
 		}
 
-		if (max_percent < 0.01 && pos->ipc == 0) {
-			RB_CLEAR_NODE(&bpos->rb_node);
+		if (max_percent < 0.01 && pos->al.ipc == 0) {
+			RB_CLEAR_NODE(&pos->al.rb_node);
 			continue;
 		}
-		disasm_rb_tree__insert(&browser->entries, bpos,
-				       browser->nr_events);
+		disasm_rb_tree__insert(&browser->entries, &pos->al);
 	}
 	pthread_mutex_unlock(&notes->lock);
 
@@ -487,38 +480,38 @@ static void annotate_browser__calc_percent(struct annotate_browser *browser,
 
 static bool annotate_browser__toggle_source(struct annotate_browser *browser)
 {
-	struct disasm_line *dl;
-	struct browser_disasm_line *bdl;
+	struct annotation_line *al;
+	struct browser_line *bl;
 	off_t offset = browser->b.index - browser->b.top_idx;
 
 	browser->b.seek(&browser->b, offset, SEEK_CUR);
-	dl = list_entry(browser->b.top, struct disasm_line, node);
-	bdl = disasm_line__browser(dl);
+	al = list_entry(browser->b.top, struct annotation_line, node);
+	bl = browser_line(al);
 
 	if (annotate_browser__opts.hide_src_code) {
-		if (bdl->idx_asm < offset)
-			offset = bdl->idx;
+		if (bl->idx_asm < offset)
+			offset = bl->idx;
 
 		browser->b.nr_entries = browser->nr_entries;
 		annotate_browser__opts.hide_src_code = false;
 		browser->b.seek(&browser->b, -offset, SEEK_CUR);
-		browser->b.top_idx = bdl->idx - offset;
-		browser->b.index = bdl->idx;
+		browser->b.top_idx = bl->idx - offset;
+		browser->b.index = bl->idx;
 	} else {
-		if (bdl->idx_asm < 0) {
+		if (bl->idx_asm < 0) {
 			ui_helpline__puts("Only available for assembly lines.");
 			browser->b.seek(&browser->b, -offset, SEEK_CUR);
 			return false;
 		}
 
-		if (bdl->idx_asm < offset)
-			offset = bdl->idx_asm;
+		if (bl->idx_asm < offset)
+			offset = bl->idx_asm;
 
 		browser->b.nr_entries = browser->nr_asm_entries;
 		annotate_browser__opts.hide_src_code = true;
 		browser->b.seek(&browser->b, -offset, SEEK_CUR);
-		browser->b.top_idx = bdl->idx_asm - offset;
-		browser->b.index = bdl->idx_asm;
+		browser->b.top_idx = bl->idx_asm - offset;
+		browser->b.index = bl->idx_asm;
 	}
 
 	return true;
@@ -543,7 +536,7 @@ static bool annotate_browser__callq(struct annotate_browser *browser,
 				    struct hist_browser_timer *hbt)
 {
 	struct map_symbol *ms = browser->b.priv;
-	struct disasm_line *dl = browser->selection;
+	struct disasm_line *dl = disasm_line(browser->selection);
 	struct annotation *notes;
 	struct addr_map_symbol target = {
 		.map = ms->map,
@@ -589,10 +582,10 @@ struct disasm_line *annotate_browser__find_offset(struct annotate_browser *brows
 	struct disasm_line *pos;
 
 	*idx = 0;
-	list_for_each_entry(pos, &notes->src->source, node) {
-		if (pos->offset == offset)
+	list_for_each_entry(pos, &notes->src->source, al.node) {
+		if (pos->al.offset == offset)
 			return pos;
-		if (!disasm_line__filter(&browser->b, &pos->node))
+		if (!disasm_line__filter(&browser->b, &pos->al.node))
 			++*idx;
 	}
 
@@ -601,7 +594,7 @@ struct disasm_line *annotate_browser__find_offset(struct annotate_browser *brows
 
 static bool annotate_browser__jump(struct annotate_browser *browser)
 {
-	struct disasm_line *dl = browser->selection;
+	struct disasm_line *dl = disasm_line(browser->selection);
 	u64 offset;
 	s64 idx;
 
@@ -615,29 +608,29 @@ static bool annotate_browser__jump(struct annotate_browser *browser)
 		return true;
 	}
 
-	annotate_browser__set_top(browser, dl, idx);
+	annotate_browser__set_top(browser, &dl->al, idx);
 
 	return true;
 }
 
 static
-struct disasm_line *annotate_browser__find_string(struct annotate_browser *browser,
+struct annotation_line *annotate_browser__find_string(struct annotate_browser *browser,
 					  char *s, s64 *idx)
 {
 	struct map_symbol *ms = browser->b.priv;
 	struct symbol *sym = ms->sym;
 	struct annotation *notes = symbol__annotation(sym);
-	struct disasm_line *pos = browser->selection;
+	struct annotation_line *al = browser->selection;
 
 	*idx = browser->b.index;
-	list_for_each_entry_continue(pos, &notes->src->source, node) {
-		if (disasm_line__filter(&browser->b, &pos->node))
+	list_for_each_entry_continue(al, &notes->src->source, node) {
+		if (disasm_line__filter(&browser->b, &al->node))
 			continue;
 
 		++*idx;
 
-		if (pos->line && strstr(pos->line, s) != NULL)
-			return pos;
+		if (al->line && strstr(al->line, s) != NULL)
+			return al;
 	}
 
 	return NULL;
@@ -645,38 +638,38 @@ struct disasm_line *annotate_browser__find_string(struct annotate_browser *brows
 
 static bool __annotate_browser__search(struct annotate_browser *browser)
 {
-	struct disasm_line *dl;
+	struct annotation_line *al;
 	s64 idx;
 
-	dl = annotate_browser__find_string(browser, browser->search_bf, &idx);
-	if (dl == NULL) {
+	al = annotate_browser__find_string(browser, browser->search_bf, &idx);
+	if (al == NULL) {
 		ui_helpline__puts("String not found!");
 		return false;
 	}
 
-	annotate_browser__set_top(browser, dl, idx);
+	annotate_browser__set_top(browser, al, idx);
 	browser->searching_backwards = false;
 	return true;
 }
 
 static
-struct disasm_line *annotate_browser__find_string_reverse(struct annotate_browser *browser,
+struct annotation_line *annotate_browser__find_string_reverse(struct annotate_browser *browser,
 						  char *s, s64 *idx)
 {
 	struct map_symbol *ms = browser->b.priv;
 	struct symbol *sym = ms->sym;
 	struct annotation *notes = symbol__annotation(sym);
-	struct disasm_line *pos = browser->selection;
+	struct annotation_line *al = browser->selection;
 
 	*idx = browser->b.index;
-	list_for_each_entry_continue_reverse(pos, &notes->src->source, node) {
-		if (disasm_line__filter(&browser->b, &pos->node))
+	list_for_each_entry_continue_reverse(al, &notes->src->source, node) {
+		if (disasm_line__filter(&browser->b, &al->node))
 			continue;
 
 		--*idx;
 
-		if (pos->line && strstr(pos->line, s) != NULL)
-			return pos;
+		if (al->line && strstr(al->line, s) != NULL)
+			return al;
 	}
 
 	return NULL;
@@ -684,16 +677,16 @@ struct disasm_line *annotate_browser__find_string_reverse(struct annotate_browse
 
 static bool __annotate_browser__search_reverse(struct annotate_browser *browser)
 {
-	struct disasm_line *dl;
+	struct annotation_line *al;
 	s64 idx;
 
-	dl = annotate_browser__find_string_reverse(browser, browser->search_bf, &idx);
-	if (dl == NULL) {
+	al = annotate_browser__find_string_reverse(browser, browser->search_bf, &idx);
+	if (al == NULL) {
 		ui_helpline__puts("String not found!");
 		return false;
 	}
 
-	annotate_browser__set_top(browser, dl, idx);
+	annotate_browser__set_top(browser, al, idx);
 	browser->searching_backwards = true;
 	return true;
 }
@@ -899,13 +892,16 @@ static int annotate_browser__run(struct annotate_browser *browser,
 			continue;
 		case K_ENTER:
 		case K_RIGHT:
+		{
+			struct disasm_line *dl = disasm_line(browser->selection);
+
 			if (browser->selection == NULL)
 				ui_helpline__puts("Huh? No selection. Report to linux-kernel@vger.kernel.org");
 			else if (browser->selection->offset == -1)
 				ui_helpline__puts("Actions are only available for assembly lines.");
-			else if (!browser->selection->ins.ops)
+			else if (!dl->ins.ops)
 				goto show_sup_ins;
-			else if (ins__is_ret(&browser->selection->ins))
+			else if (ins__is_ret(&dl->ins))
 				goto out;
 			else if (!(annotate_browser__jump(browser) ||
 				     annotate_browser__callq(browser, evsel, hbt))) {
@@ -913,6 +909,7 @@ static int annotate_browser__run(struct annotate_browser *browser,
 				ui_helpline__puts("Actions are only available for function call/return & jump/branch instructions.");
 			}
 			continue;
+		}
 		case 't':
 			if (annotate_browser__opts.show_total_period) {
 				annotate_browser__opts.show_total_period = false;
@@ -990,10 +987,10 @@ static void count_and_fill(struct annotate_browser *browser, u64 start, u64 end,
 			return;
 
 		for (offset = start; offset <= end; offset++) {
-			struct disasm_line *dl = browser->offsets[offset];
+			struct annotation_line *al = browser->offsets[offset];
 
-			if (dl)
-				dl->ipc = ipc;
+			if (al)
+				al->ipc = ipc;
 		}
 	}
 }
@@ -1018,13 +1015,13 @@ static void annotate__compute_ipc(struct annotate_browser *browser, size_t size,
 
 		ch = &notes->src->cycles_hist[offset];
 		if (ch && ch->cycles) {
-			struct disasm_line *dl;
+			struct annotation_line *al;
 
 			if (ch->have_start)
 				count_and_fill(browser, ch->start, offset, ch);
-			dl = browser->offsets[offset];
-			if (dl && ch->num_aggr)
-				dl->cycles = ch->cycles_aggr / ch->num_aggr;
+			al = browser->offsets[offset];
+			if (al && ch->num_aggr)
+				al->cycles = ch->cycles_aggr / ch->num_aggr;
 			browser->have_cycles = true;
 		}
 	}
@@ -1043,23 +1040,27 @@ static void annotate_browser__mark_jump_targets(struct annotate_browser *browser
 		return;
 
 	for (offset = 0; offset < size; ++offset) {
-		struct disasm_line *dl = browser->offsets[offset], *dlt;
-		struct browser_disasm_line *bdlt;
+		struct annotation_line *al = browser->offsets[offset];
+		struct disasm_line *dl;
+		struct browser_line *blt;
+
+		dl = disasm_line(al);
 
 		if (!disasm_line__is_valid_jump(dl, sym))
 			continue;
 
-		dlt = browser->offsets[dl->ops.target.offset];
+		al = browser->offsets[dl->ops.target.offset];
+
 		/*
  		 * FIXME: Oops, no jump target? Buggy disassembler? Or do we
  		 * have to adjust to the previous offset?
  		 */
-		if (dlt == NULL)
+		if (al == NULL)
 			continue;
 
-		bdlt = disasm_line__browser(dlt);
-		if (++bdlt->jump_sources > browser->max_jump_sources)
-			browser->max_jump_sources = bdlt->jump_sources;
+		blt = browser_line(al);
+		if (++blt->jump_sources > browser->max_jump_sources)
+			browser->max_jump_sources = blt->jump_sources;
 
 		++browser->nr_jumps;
 	}
@@ -1078,7 +1079,7 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
 			 struct perf_evsel *evsel,
 			 struct hist_browser_timer *hbt)
 {
-	struct disasm_line *pos, *n;
+	struct annotation_line *al;
 	struct annotation *notes;
 	size_t size;
 	struct map_symbol ms = {
@@ -1097,7 +1098,6 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
 	};
 	int ret = -1, err;
 	int nr_pcnt = 1;
-	size_t sizeof_bdl = sizeof(struct browser_disasm_line);
 
 	if (sym == NULL)
 		return -1;
@@ -1107,21 +1107,16 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
 	if (map->dso->annotate_warned)
 		return -1;
 
-	browser.offsets = zalloc(size * sizeof(struct disasm_line *));
+	browser.offsets = zalloc(size * sizeof(struct annotation_line *));
 	if (browser.offsets == NULL) {
 		ui__error("Not enough memory!");
 		return -1;
 	}
 
-	if (perf_evsel__is_group_event(evsel)) {
+	if (perf_evsel__is_group_event(evsel))
 		nr_pcnt = evsel->nr_members;
-		sizeof_bdl += sizeof(struct disasm_line_samples) *
-		  (nr_pcnt - 1);
-	}
 
-	err = symbol__disassemble(sym, map, perf_evsel__env_arch(evsel),
-				  sizeof_bdl, &browser.arch,
-				  perf_evsel__env_cpuid(evsel));
+	err = symbol__annotate(sym, map, evsel, sizeof(struct browser_line), &browser.arch);
 	if (err) {
 		char msg[BUFSIZ];
 		symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
@@ -1129,20 +1124,22 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
 		goto out_free_offsets;
 	}
 
+	symbol__calc_percent(sym, evsel);
+
 	ui_helpline__push("Press ESC to exit");
 
 	notes = symbol__annotation(sym);
 	browser.start = map__rip_2objdump(map, sym->start);
 
-	list_for_each_entry(pos, &notes->src->source, node) {
-		struct browser_disasm_line *bpos;
-		size_t line_len = strlen(pos->line);
+	list_for_each_entry(al, &notes->src->source, node) {
+		struct browser_line *bpos;
+		size_t line_len = strlen(al->line);
 
 		if (browser.b.width < line_len)
 			browser.b.width = line_len;
-		bpos = disasm_line__browser(pos);
+		bpos = browser_line(al);
 		bpos->idx = browser.nr_entries++;
-		if (pos->offset != -1) {
+		if (al->offset != -1) {
 			bpos->idx_asm = browser.nr_asm_entries++;
 			/*
 			 * FIXME: short term bandaid to cope with assembly
@@ -1151,8 +1148,8 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
 			 *
 			 * E.g. copy_user_generic_unrolled
  			 */
-			if (pos->offset < (s64)size)
-				browser.offsets[pos->offset] = pos;
+			if (al->offset < (s64)size)
+				browser.offsets[al->offset] = al;
 		} else
 			bpos->idx_asm = -1;
 	}
@@ -1174,10 +1171,8 @@ int symbol__tui_annotate(struct symbol *sym, struct map *map,
 	annotate_browser__update_addr_width(&browser);
 
 	ret = annotate_browser__run(&browser, evsel, hbt);
-	list_for_each_entry_safe(pos, n, &notes->src->source, node) {
-		list_del(&pos->node);
-		disasm_line__free(pos);
-	}
+
+	annotated_source__purge(notes->src);
 
 out_free_offsets:
 	free(browser.offsets);
diff --git a/tools/perf/ui/gtk/annotate.c b/tools/perf/ui/gtk/annotate.c
index fc7a2e1..aeeaf15 100644
--- a/tools/perf/ui/gtk/annotate.c
+++ b/tools/perf/ui/gtk/annotate.c
@@ -31,14 +31,14 @@ static int perf_gtk__get_percent(char *buf, size_t size, struct symbol *sym,
 
 	strcpy(buf, "");
 
-	if (dl->offset == (s64) -1)
+	if (dl->al.offset == (s64) -1)
 		return 0;
 
 	symhist = annotation__histogram(symbol__annotation(sym), evidx);
-	if (!symbol_conf.event_group && !symhist->addr[dl->offset].nr_samples)
+	if (!symbol_conf.event_group && !symhist->addr[dl->al.offset].nr_samples)
 		return 0;
 
-	percent = 100.0 * symhist->addr[dl->offset].nr_samples / symhist->nr_samples;
+	percent = 100.0 * symhist->addr[dl->al.offset].nr_samples / symhist->nr_samples;
 
 	markup = perf_gtk__get_percent_color(percent);
 	if (markup)
@@ -57,16 +57,16 @@ static int perf_gtk__get_offset(char *buf, size_t size, struct symbol *sym,
 
 	strcpy(buf, "");
 
-	if (dl->offset == (s64) -1)
+	if (dl->al.offset == (s64) -1)
 		return 0;
 
-	return scnprintf(buf, size, "%"PRIx64, start + dl->offset);
+	return scnprintf(buf, size, "%"PRIx64, start + dl->al.offset);
 }
 
 static int perf_gtk__get_line(char *buf, size_t size, struct disasm_line *dl)
 {
 	int ret = 0;
-	char *line = g_markup_escape_text(dl->line, -1);
+	char *line = g_markup_escape_text(dl->al.line, -1);
 	const char *markup = "<span fgcolor='gray'>";
 
 	strcpy(buf, "");
@@ -74,7 +74,7 @@ static int perf_gtk__get_line(char *buf, size_t size, struct disasm_line *dl)
 	if (!line)
 		return 0;
 
-	if (dl->offset != (s64) -1)
+	if (dl->al.offset != (s64) -1)
 		markup = NULL;
 
 	if (markup)
@@ -119,7 +119,7 @@ static int perf_gtk__annotate_symbol(GtkWidget *window, struct symbol *sym,
 	gtk_tree_view_set_model(GTK_TREE_VIEW(view), GTK_TREE_MODEL(store));
 	g_object_unref(GTK_TREE_MODEL(store));
 
-	list_for_each_entry(pos, &notes->src->source, node) {
+	list_for_each_entry(pos, &notes->src->source, al.node) {
 		GtkTreeIter iter;
 		int ret = 0;
 
@@ -148,8 +148,8 @@ static int perf_gtk__annotate_symbol(GtkWidget *window, struct symbol *sym,
 
 	gtk_container_add(GTK_CONTAINER(window), view);
 
-	list_for_each_entry_safe(pos, n, &notes->src->source, node) {
-		list_del(&pos->node);
+	list_for_each_entry_safe(pos, n, &notes->src->source, al.node) {
+		list_del(&pos->al.node);
 		disasm_line__free(pos);
 	}
 
@@ -169,8 +169,7 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
 	if (map->dso->annotate_warned)
 		return -1;
 
-	err = symbol__disassemble(sym, map, perf_evsel__env_arch(evsel),
-				  0, NULL, NULL);
+	err = symbol__annotate(sym, map, evsel, 0, NULL);
 	if (err) {
 		char msg[BUFSIZ];
 		symbol__strerror_disassemble(sym, map, err, msg, sizeof(msg));
@@ -178,6 +177,8 @@ static int symbol__gtk_annotate(struct symbol *sym, struct map *map,
 		return -1;
 	}
 
+	symbol__calc_percent(sym, evsel);
+
 	if (perf_gtk__is_active_context(pgctx)) {
 		window = pgctx->main_window;
 		notebook = pgctx->notebook;
diff --git a/tools/perf/util/Build b/tools/perf/util/Build
index a3de791..ea0a452 100644
--- a/tools/perf/util/Build
+++ b/tools/perf/util/Build
@@ -44,7 +44,7 @@
 libperf-y += map.o
 libperf-y += pstack.o
 libperf-y += session.o
-libperf-$(CONFIG_AUDIT) += syscalltbl.o
+libperf-$(CONFIG_TRACE) += syscalltbl.o
 libperf-y += ordered-events.o
 libperf-y += namespaces.o
 libperf-y += comm.o
@@ -86,6 +86,14 @@
 libperf-$(CONFIG_AUXTRACE) += intel-pt-decoder/
 libperf-$(CONFIG_AUXTRACE) += intel-pt.o
 libperf-$(CONFIG_AUXTRACE) += intel-bts.o
+libperf-$(CONFIG_AUXTRACE) += arm-spe.o
+libperf-$(CONFIG_AUXTRACE) += arm-spe-pkt-decoder.o
+
+ifdef CONFIG_LIBOPENCSD
+libperf-$(CONFIG_AUXTRACE) += cs-etm.o
+libperf-$(CONFIG_AUXTRACE) += cs-etm-decoder/
+endif
+
 libperf-y += parse-branch-options.o
 libperf-y += dump-insn.o
 libperf-y += parse-regs-options.o
diff --git a/tools/perf/util/annotate.c b/tools/perf/util/annotate.c
index 3369c78..28b233c 100644
--- a/tools/perf/util/annotate.c
+++ b/tools/perf/util/annotate.c
@@ -26,7 +26,6 @@
 #include <pthread.h>
 #include <linux/bitops.h>
 #include <linux/kernel.h>
-#include <sys/utsname.h>
 
 #include "sane_ctype.h"
 
@@ -322,6 +321,8 @@ static int comment__symbol(char *raw, char *comment, u64 *addrp, char **namep)
 		return 0;
 
 	*addrp = strtoull(comment, &endptr, 16);
+	if (endptr == comment)
+		return 0;
 	name = strchr(endptr, '<');
 	if (name == NULL)
 		return -1;
@@ -435,8 +436,8 @@ static int mov__parse(struct arch *arch, struct ins_operands *ops, struct map *m
 		return 0;
 
 	comment = ltrim(comment);
-	comment__symbol(ops->source.raw, comment, &ops->source.addr, &ops->source.name);
-	comment__symbol(ops->target.raw, comment, &ops->target.addr, &ops->target.name);
+	comment__symbol(ops->source.raw, comment + 1, &ops->source.addr, &ops->source.name);
+	comment__symbol(ops->target.raw, comment + 1, &ops->target.addr, &ops->target.name);
 
 	return 0;
 
@@ -480,7 +481,7 @@ static int dec__parse(struct arch *arch __maybe_unused, struct ins_operands *ops
 		return 0;
 
 	comment = ltrim(comment);
-	comment__symbol(ops->target.raw, comment, &ops->target.addr, &ops->target.name);
+	comment__symbol(ops->target.raw, comment + 1, &ops->target.addr, &ops->target.name);
 
 	return 0;
 }
@@ -878,32 +879,99 @@ static int disasm_line__parse(char *line, const char **namep, char **rawp)
 	return -1;
 }
 
-static struct disasm_line *disasm_line__new(s64 offset, char *line,
-					    size_t privsize, int line_nr,
-					    struct arch *arch,
-					    struct map *map)
-{
-	struct disasm_line *dl = zalloc(sizeof(*dl) + privsize);
+struct annotate_args {
+	size_t			 privsize;
+	struct arch		*arch;
+	struct map		*map;
+	struct perf_evsel	*evsel;
+	s64			 offset;
+	char			*line;
+	int			 line_nr;
+};
 
-	if (dl != NULL) {
-		dl->offset = offset;
-		dl->line = strdup(line);
-		dl->line_nr = line_nr;
-		if (dl->line == NULL)
+static void annotation_line__delete(struct annotation_line *al)
+{
+	void *ptr = (void *) al - al->privsize;
+
+	free_srcline(al->path);
+	zfree(&al->line);
+	free(ptr);
+}
+
+/*
+ * Allocating the annotation line data with following
+ * structure:
+ *
+ *    --------------------------------------
+ *    private space | struct annotation_line
+ *    --------------------------------------
+ *
+ * Size of the private space is stored in 'struct annotation_line'.
+ *
+ */
+static struct annotation_line *
+annotation_line__new(struct annotate_args *args, size_t privsize)
+{
+	struct annotation_line *al;
+	struct perf_evsel *evsel = args->evsel;
+	size_t size = privsize + sizeof(*al);
+	int nr = 1;
+
+	if (perf_evsel__is_group_event(evsel))
+		nr = evsel->nr_members;
+
+	size += sizeof(al->samples[0]) * nr;
+
+	al = zalloc(size);
+	if (al) {
+		al = (void *) al + privsize;
+		al->privsize   = privsize;
+		al->offset     = args->offset;
+		al->line       = strdup(args->line);
+		al->line_nr    = args->line_nr;
+		al->samples_nr = nr;
+	}
+
+	return al;
+}
+
+/*
+ * Allocating the disasm annotation line data with
+ * following structure:
+ *
+ *    ------------------------------------------------------------
+ *    privsize space | struct disasm_line | struct annotation_line
+ *    ------------------------------------------------------------
+ *
+ * We have 'struct annotation_line' member as last member
+ * of 'struct disasm_line' to have an easy access.
+ *
+ */
+static struct disasm_line *disasm_line__new(struct annotate_args *args)
+{
+	struct disasm_line *dl = NULL;
+	struct annotation_line *al;
+	size_t privsize = args->privsize + offsetof(struct disasm_line, al);
+
+	al = annotation_line__new(args, privsize);
+	if (al != NULL) {
+		dl = disasm_line(al);
+
+		if (dl->al.line == NULL)
 			goto out_delete;
 
-		if (offset != -1) {
-			if (disasm_line__parse(dl->line, &dl->ins.name, &dl->ops.raw) < 0)
+		if (args->offset != -1) {
+			if (disasm_line__parse(dl->al.line, &dl->ins.name, &dl->ops.raw) < 0)
 				goto out_free_line;
 
-			disasm_line__init_ins(dl, arch, map);
+			disasm_line__init_ins(dl, args->arch, args->map);
 		}
 	}
 
 	return dl;
 
 out_free_line:
-	zfree(&dl->line);
+	zfree(&dl->al.line);
 out_delete:
 	free(dl);
 	return NULL;
@@ -911,14 +979,13 @@ static struct disasm_line *disasm_line__new(s64 offset, char *line,
 
 void disasm_line__free(struct disasm_line *dl)
 {
-	zfree(&dl->line);
 	if (dl->ins.ops && dl->ins.ops->free)
 		dl->ins.ops->free(&dl->ops);
 	else
 		ins__delete(&dl->ops);
 	free((void *)dl->ins.name);
 	dl->ins.name = NULL;
-	free(dl);
+	annotation_line__delete(&dl->al);
 }
 
 int disasm_line__scnprintf(struct disasm_line *dl, char *bf, size_t size, bool raw)
@@ -929,12 +996,13 @@ int disasm_line__scnprintf(struct disasm_line *dl, char *bf, size_t size, bool r
 	return ins__scnprintf(&dl->ins, bf, size, &dl->ops);
 }
 
-static void disasm__add(struct list_head *head, struct disasm_line *line)
+static void annotation_line__add(struct annotation_line *al, struct list_head *head)
 {
-	list_add_tail(&line->node, head);
+	list_add_tail(&al->node, head);
 }
 
-struct disasm_line *disasm__get_next_ip_line(struct list_head *head, struct disasm_line *pos)
+struct annotation_line *
+annotation_line__next(struct annotation_line *pos, struct list_head *head)
 {
 	list_for_each_entry_continue(pos, head, node)
 		if (pos->offset >= 0)
@@ -943,50 +1011,6 @@ struct disasm_line *disasm__get_next_ip_line(struct list_head *head, struct disa
 	return NULL;
 }
 
-double disasm__calc_percent(struct annotation *notes, int evidx, s64 offset,
-			    s64 end, const char **path, struct sym_hist_entry *sample)
-{
-	struct source_line *src_line = notes->src->lines;
-	double percent = 0.0;
-
-	sample->nr_samples = sample->period = 0;
-
-	if (src_line) {
-		size_t sizeof_src_line = sizeof(*src_line) +
-				sizeof(src_line->samples) * (src_line->nr_pcnt - 1);
-
-		while (offset < end) {
-			src_line = (void *)notes->src->lines +
-					(sizeof_src_line * offset);
-
-			if (*path == NULL)
-				*path = src_line->path;
-
-			percent += src_line->samples[evidx].percent;
-			sample->nr_samples += src_line->samples[evidx].nr;
-			offset++;
-		}
-	} else {
-		struct sym_hist *h = annotation__histogram(notes, evidx);
-		unsigned int hits = 0;
-		u64 period = 0;
-
-		while (offset < end) {
-			hits   += h->addr[offset].nr_samples;
-			period += h->addr[offset].period;
-			++offset;
-		}
-
-		if (h->nr_samples) {
-			sample->period	   = period;
-			sample->nr_samples = hits;
-			percent = 100.0 * hits / h->nr_samples;
-		}
-	}
-
-	return percent;
-}
-
 static const char *annotate__address_color(struct block_range *br)
 {
 	double cov = block_range__coverage(br);
@@ -1069,50 +1093,39 @@ static void annotate__branch_printf(struct block_range *br, u64 addr)
 	}
 }
 
-
-static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 start,
-		      struct perf_evsel *evsel, u64 len, int min_pcnt, int printed,
-		      int max_lines, struct disasm_line *queue)
+static int disasm_line__print(struct disasm_line *dl, u64 start, int addr_fmt_width)
 {
+	s64 offset = dl->al.offset;
+	const u64 addr = start + offset;
+	struct block_range *br;
+
+	br = block_range__find(addr);
+	color_fprintf(stdout, annotate__address_color(br), "  %*" PRIx64 ":", addr_fmt_width, addr);
+	color_fprintf(stdout, annotate__asm_color(br), "%s", dl->al.line);
+	annotate__branch_printf(br, addr);
+	return 0;
+}
+
+static int
+annotation_line__print(struct annotation_line *al, struct symbol *sym, u64 start,
+		       struct perf_evsel *evsel, u64 len, int min_pcnt, int printed,
+		       int max_lines, struct annotation_line *queue, int addr_fmt_width)
+{
+	struct disasm_line *dl = container_of(al, struct disasm_line, al);
 	static const char *prev_line;
 	static const char *prev_color;
 
-	if (dl->offset != -1) {
-		const char *path = NULL;
-		double percent, max_percent = 0.0;
-		double *ppercents = &percent;
-		struct sym_hist_entry sample;
-		struct sym_hist_entry *psamples = &sample;
+	if (al->offset != -1) {
+		double max_percent = 0.0;
 		int i, nr_percent = 1;
 		const char *color;
 		struct annotation *notes = symbol__annotation(sym);
-		s64 offset = dl->offset;
-		const u64 addr = start + offset;
-		struct disasm_line *next;
-		struct block_range *br;
 
-		next = disasm__get_next_ip_line(&notes->src->source, dl);
+		for (i = 0; i < al->samples_nr; i++) {
+			struct annotation_data *sample = &al->samples[i];
 
-		if (perf_evsel__is_group_event(evsel)) {
-			nr_percent = evsel->nr_members;
-			ppercents = calloc(nr_percent, sizeof(double));
-			psamples = calloc(nr_percent, sizeof(struct sym_hist_entry));
-			if (ppercents == NULL || psamples == NULL) {
-				return -1;
-			}
-		}
-
-		for (i = 0; i < nr_percent; i++) {
-			percent = disasm__calc_percent(notes,
-					notes->src->lines ? i : evsel->idx + i,
-					offset,
-					next ? next->offset : (s64) len,
-					&path, &sample);
-
-			ppercents[i] = percent;
-			psamples[i] = sample;
-			if (percent > max_percent)
-				max_percent = percent;
+			if (sample->percent > max_percent)
+				max_percent = sample->percent;
 		}
 
 		if (max_percent < min_pcnt)
@@ -1123,10 +1136,10 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st
 
 		if (queue != NULL) {
 			list_for_each_entry_from(queue, &notes->src->source, node) {
-				if (queue == dl)
+				if (queue == al)
 					break;
-				disasm_line__print(queue, sym, start, evsel, len,
-						    0, 0, 1, NULL);
+				annotation_line__print(queue, sym, start, evsel, len,
+						       0, 0, 1, NULL, addr_fmt_width);
 			}
 		}
 
@@ -1137,44 +1150,34 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st
 		 * the same color than the percentage. Don't print it
 		 * twice for close colored addr with the same filename:line
 		 */
-		if (path) {
-			if (!prev_line || strcmp(prev_line, path)
+		if (al->path) {
+			if (!prev_line || strcmp(prev_line, al->path)
 				       || color != prev_color) {
-				color_fprintf(stdout, color, " %s", path);
-				prev_line = path;
+				color_fprintf(stdout, color, " %s", al->path);
+				prev_line = al->path;
 				prev_color = color;
 			}
 		}
 
 		for (i = 0; i < nr_percent; i++) {
-			percent = ppercents[i];
-			sample = psamples[i];
-			color = get_percent_color(percent);
+			struct annotation_data *sample = &al->samples[i];
+
+			color = get_percent_color(sample->percent);
 
 			if (symbol_conf.show_total_period)
 				color_fprintf(stdout, color, " %11" PRIu64,
-					      sample.period);
+					      sample->he.period);
 			else if (symbol_conf.show_nr_samples)
 				color_fprintf(stdout, color, " %7" PRIu64,
-					      sample.nr_samples);
+					      sample->he.nr_samples);
 			else
-				color_fprintf(stdout, color, " %7.2f", percent);
+				color_fprintf(stdout, color, " %7.2f", sample->percent);
 		}
 
-		printf(" :	");
+		printf(" : ");
 
-		br = block_range__find(addr);
-		color_fprintf(stdout, annotate__address_color(br), "  %" PRIx64 ":", addr);
-		color_fprintf(stdout, annotate__asm_color(br), "%s", dl->line);
-		annotate__branch_printf(br, addr);
+		disasm_line__print(dl, start, addr_fmt_width);
 		printf("\n");
-
-		if (ppercents != &percent)
-			free(ppercents);
-
-		if (psamples != &sample)
-			free(psamples);
-
 	} else if (max_lines && printed >= max_lines)
 		return 1;
 	else {
@@ -1186,10 +1189,10 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st
 		if (perf_evsel__is_group_event(evsel))
 			width *= evsel->nr_members;
 
-		if (!*dl->line)
+		if (!*al->line)
 			printf(" %*s:\n", width, " ");
 		else
-			printf(" %*s:	%s\n", width, " ", dl->line);
+			printf(" %*s:     %*s %s\n", width, " ", addr_fmt_width, " ", al->line);
 	}
 
 	return 0;
@@ -1215,11 +1218,11 @@ static int disasm_line__print(struct disasm_line *dl, struct symbol *sym, u64 st
  * means that it's not a disassembly line so should be treated differently.
  * The ops.raw part will be parsed further according to type of the instruction.
  */
-static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
-				      struct arch *arch,
-				      FILE *file, size_t privsize,
+static int symbol__parse_objdump_line(struct symbol *sym, FILE *file,
+				      struct annotate_args *args,
 				      int *line_nr)
 {
+	struct map *map = args->map;
 	struct annotation *notes = symbol__annotation(sym);
 	struct disasm_line *dl;
 	char *line = NULL, *parsed_line, *tmp, *tmp2;
@@ -1263,7 +1266,11 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
 			parsed_line = tmp2 + 1;
 	}
 
-	dl = disasm_line__new(offset, parsed_line, privsize, *line_nr, arch, map);
+	args->offset  = offset;
+	args->line    = parsed_line;
+	args->line_nr = *line_nr;
+
+	dl = disasm_line__new(args);
 	free(line);
 	(*line_nr)++;
 
@@ -1288,7 +1295,7 @@ static int symbol__parse_objdump_line(struct symbol *sym, struct map *map,
 			dl->ops.target.name = strdup(target.sym->name);
 	}
 
-	disasm__add(&notes->src->source, dl);
+	annotation_line__add(&dl->al, &notes->src->source);
 
 	return 0;
 }
@@ -1305,19 +1312,19 @@ static void delete_last_nop(struct symbol *sym)
 	struct disasm_line *dl;
 
 	while (!list_empty(list)) {
-		dl = list_entry(list->prev, struct disasm_line, node);
+		dl = list_entry(list->prev, struct disasm_line, al.node);
 
 		if (dl->ins.ops) {
 			if (dl->ins.ops != &nop_ops)
 				return;
 		} else {
-			if (!strstr(dl->line, " nop ") &&
-			    !strstr(dl->line, " nopl ") &&
-			    !strstr(dl->line, " nopw "))
+			if (!strstr(dl->al.line, " nop ") &&
+			    !strstr(dl->al.line, " nopl ") &&
+			    !strstr(dl->al.line, " nopw "))
 				return;
 		}
 
-		list_del(&dl->node);
+		list_del(&dl->al.node);
 		disasm_line__free(dl);
 	}
 }
@@ -1412,25 +1419,11 @@ static int dso__disassemble_filename(struct dso *dso, char *filename, size_t fil
 	return 0;
 }
 
-static const char *annotate__norm_arch(const char *arch_name)
+static int symbol__disassemble(struct symbol *sym, struct annotate_args *args)
 {
-	struct utsname uts;
-
-	if (!arch_name) { /* Assume we are annotating locally. */
-		if (uname(&uts) < 0)
-			return NULL;
-		arch_name = uts.machine;
-	}
-	return normalize_arch((char *)arch_name);
-}
-
-int symbol__disassemble(struct symbol *sym, struct map *map,
-			const char *arch_name, size_t privsize,
-			struct arch **parch, char *cpuid)
-{
+	struct map *map = args->map;
 	struct dso *dso = map->dso;
 	char command[PATH_MAX * 2];
-	struct arch *arch = NULL;
 	FILE *file;
 	char symfs_filename[PATH_MAX];
 	struct kcore_extract kce;
@@ -1444,25 +1437,6 @@ int symbol__disassemble(struct symbol *sym, struct map *map,
 	if (err)
 		return err;
 
-	arch_name = annotate__norm_arch(arch_name);
-	if (!arch_name)
-		return -1;
-
-	arch = arch__find(arch_name);
-	if (arch == NULL)
-		return -ENOTSUP;
-
-	if (parch)
-		*parch = arch;
-
-	if (arch->init) {
-		err = arch->init(arch, cpuid);
-		if (err) {
-			pr_err("%s: failed to initialize %s arch priv area\n", __func__, arch->name);
-			return err;
-		}
-	}
-
 	pr_debug("%s: filename=%s, sym=%s, start=%#" PRIx64 ", end=%#" PRIx64 "\n", __func__,
 		 symfs_filename, sym->name, map->unmap_ip(map, sym->start),
 		 map->unmap_ip(map, sym->end));
@@ -1546,8 +1520,7 @@ int symbol__disassemble(struct symbol *sym, struct map *map,
 		 * can associate it with the instructions till the next one.
 		 * See disasm_line__new() and struct disasm_line::line_nr.
 		 */
-		if (symbol__parse_objdump_line(sym, map, arch, file, privsize,
-			    &lineno) < 0)
+		if (symbol__parse_objdump_line(sym, file, args, &lineno) < 0)
 			break;
 		nline++;
 	}
@@ -1580,21 +1553,110 @@ int symbol__disassemble(struct symbol *sym, struct map *map,
 	goto out_remove_tmp;
 }
 
-static void insert_source_line(struct rb_root *root, struct source_line *src_line)
+static void calc_percent(struct sym_hist *hist,
+			 struct annotation_data *sample,
+			 s64 offset, s64 end)
 {
-	struct source_line *iter;
+	unsigned int hits = 0;
+	u64 period = 0;
+
+	while (offset < end) {
+		hits   += hist->addr[offset].nr_samples;
+		period += hist->addr[offset].period;
+		++offset;
+	}
+
+	if (hist->nr_samples) {
+		sample->he.period     = period;
+		sample->he.nr_samples = hits;
+		sample->percent = 100.0 * hits / hist->nr_samples;
+	}
+}
+
+static void annotation__calc_percent(struct annotation *notes,
+				     struct perf_evsel *evsel, s64 len)
+{
+	struct annotation_line *al, *next;
+
+	list_for_each_entry(al, &notes->src->source, node) {
+		s64 end;
+		int i;
+
+		if (al->offset == -1)
+			continue;
+
+		next = annotation_line__next(al, &notes->src->source);
+		end  = next ? next->offset : len;
+
+		for (i = 0; i < al->samples_nr; i++) {
+			struct annotation_data *sample;
+			struct sym_hist *hist;
+
+			hist   = annotation__histogram(notes, evsel->idx + i);
+			sample = &al->samples[i];
+
+			calc_percent(hist, sample, al->offset, end);
+		}
+	}
+}
+
+void symbol__calc_percent(struct symbol *sym, struct perf_evsel *evsel)
+{
+	struct annotation *notes = symbol__annotation(sym);
+
+	annotation__calc_percent(notes, evsel, symbol__size(sym));
+}
+
+int symbol__annotate(struct symbol *sym, struct map *map,
+		     struct perf_evsel *evsel, size_t privsize,
+		     struct arch **parch)
+{
+	struct annotate_args args = {
+		.privsize	= privsize,
+		.map		= map,
+		.evsel		= evsel,
+	};
+	struct perf_env *env = perf_evsel__env(evsel);
+	const char *arch_name = perf_env__arch(env);
+	struct arch *arch;
+	int err;
+
+	if (!arch_name)
+		return -1;
+
+	args.arch = arch = arch__find(arch_name);
+	if (arch == NULL)
+		return -ENOTSUP;
+
+	if (parch)
+		*parch = arch;
+
+	if (arch->init) {
+		err = arch->init(arch, env ? env->cpuid : NULL);
+		if (err) {
+			pr_err("%s: failed to initialize %s arch priv area\n", __func__, arch->name);
+			return err;
+		}
+	}
+
+	return symbol__disassemble(sym, &args);
+}
+
+static void insert_source_line(struct rb_root *root, struct annotation_line *al)
+{
+	struct annotation_line *iter;
 	struct rb_node **p = &root->rb_node;
 	struct rb_node *parent = NULL;
 	int i, ret;
 
 	while (*p != NULL) {
 		parent = *p;
-		iter = rb_entry(parent, struct source_line, node);
+		iter = rb_entry(parent, struct annotation_line, rb_node);
 
-		ret = strcmp(iter->path, src_line->path);
+		ret = strcmp(iter->path, al->path);
 		if (ret == 0) {
-			for (i = 0; i < src_line->nr_pcnt; i++)
-				iter->samples[i].percent_sum += src_line->samples[i].percent;
+			for (i = 0; i < al->samples_nr; i++)
+				iter->samples[i].percent_sum += al->samples[i].percent;
 			return;
 		}
 
@@ -1604,18 +1666,18 @@ static void insert_source_line(struct rb_root *root, struct source_line *src_lin
 			p = &(*p)->rb_right;
 	}
 
-	for (i = 0; i < src_line->nr_pcnt; i++)
-		src_line->samples[i].percent_sum = src_line->samples[i].percent;
+	for (i = 0; i < al->samples_nr; i++)
+		al->samples[i].percent_sum = al->samples[i].percent;
 
-	rb_link_node(&src_line->node, parent, p);
-	rb_insert_color(&src_line->node, root);
+	rb_link_node(&al->rb_node, parent, p);
+	rb_insert_color(&al->rb_node, root);
 }
 
-static int cmp_source_line(struct source_line *a, struct source_line *b)
+static int cmp_source_line(struct annotation_line *a, struct annotation_line *b)
 {
 	int i;
 
-	for (i = 0; i < a->nr_pcnt; i++) {
+	for (i = 0; i < a->samples_nr; i++) {
 		if (a->samples[i].percent_sum == b->samples[i].percent_sum)
 			continue;
 		return a->samples[i].percent_sum > b->samples[i].percent_sum;
@@ -1624,135 +1686,47 @@ static int cmp_source_line(struct source_line *a, struct source_line *b)
 	return 0;
 }
 
-static void __resort_source_line(struct rb_root *root, struct source_line *src_line)
+static void __resort_source_line(struct rb_root *root, struct annotation_line *al)
 {
-	struct source_line *iter;
+	struct annotation_line *iter;
 	struct rb_node **p = &root->rb_node;
 	struct rb_node *parent = NULL;
 
 	while (*p != NULL) {
 		parent = *p;
-		iter = rb_entry(parent, struct source_line, node);
+		iter = rb_entry(parent, struct annotation_line, rb_node);
 
-		if (cmp_source_line(src_line, iter))
+		if (cmp_source_line(al, iter))
 			p = &(*p)->rb_left;
 		else
 			p = &(*p)->rb_right;
 	}
 
-	rb_link_node(&src_line->node, parent, p);
-	rb_insert_color(&src_line->node, root);
+	rb_link_node(&al->rb_node, parent, p);
+	rb_insert_color(&al->rb_node, root);
 }
 
 static void resort_source_line(struct rb_root *dest_root, struct rb_root *src_root)
 {
-	struct source_line *src_line;
+	struct annotation_line *al;
 	struct rb_node *node;
 
 	node = rb_first(src_root);
 	while (node) {
 		struct rb_node *next;
 
-		src_line = rb_entry(node, struct source_line, node);
+		al = rb_entry(node, struct annotation_line, rb_node);
 		next = rb_next(node);
 		rb_erase(node, src_root);
 
-		__resort_source_line(dest_root, src_line);
+		__resort_source_line(dest_root, al);
 		node = next;
 	}
 }
 
-static void symbol__free_source_line(struct symbol *sym, int len)
-{
-	struct annotation *notes = symbol__annotation(sym);
-	struct source_line *src_line = notes->src->lines;
-	size_t sizeof_src_line;
-	int i;
-
-	sizeof_src_line = sizeof(*src_line) +
-			  (sizeof(src_line->samples) * (src_line->nr_pcnt - 1));
-
-	for (i = 0; i < len; i++) {
-		free_srcline(src_line->path);
-		src_line = (void *)src_line + sizeof_src_line;
-	}
-
-	zfree(&notes->src->lines);
-}
-
-/* Get the filename:line for the colored entries */
-static int symbol__get_source_line(struct symbol *sym, struct map *map,
-				   struct perf_evsel *evsel,
-				   struct rb_root *root, int len)
-{
-	u64 start;
-	int i, k;
-	int evidx = evsel->idx;
-	struct source_line *src_line;
-	struct annotation *notes = symbol__annotation(sym);
-	struct sym_hist *h = annotation__histogram(notes, evidx);
-	struct rb_root tmp_root = RB_ROOT;
-	int nr_pcnt = 1;
-	u64 nr_samples = h->nr_samples;
-	size_t sizeof_src_line = sizeof(struct source_line);
-
-	if (perf_evsel__is_group_event(evsel)) {
-		for (i = 1; i < evsel->nr_members; i++) {
-			h = annotation__histogram(notes, evidx + i);
-			nr_samples += h->nr_samples;
-		}
-		nr_pcnt = evsel->nr_members;
-		sizeof_src_line += (nr_pcnt - 1) * sizeof(src_line->samples);
-	}
-
-	if (!nr_samples)
-		return 0;
-
-	src_line = notes->src->lines = calloc(len, sizeof_src_line);
-	if (!notes->src->lines)
-		return -1;
-
-	start = map__rip_2objdump(map, sym->start);
-
-	for (i = 0; i < len; i++) {
-		u64 offset;
-		double percent_max = 0.0;
-
-		src_line->nr_pcnt = nr_pcnt;
-
-		for (k = 0; k < nr_pcnt; k++) {
-			double percent = 0.0;
-
-			h = annotation__histogram(notes, evidx + k);
-			nr_samples = h->addr[i].nr_samples;
-			if (h->nr_samples)
-				percent = 100.0 * nr_samples / h->nr_samples;
-
-			if (percent > percent_max)
-				percent_max = percent;
-			src_line->samples[k].percent = percent;
-			src_line->samples[k].nr = nr_samples;
-		}
-
-		if (percent_max <= 0.5)
-			goto next;
-
-		offset = start + i;
-		src_line->path = get_srcline(map->dso, offset, NULL,
-					     false, true);
-		insert_source_line(&tmp_root, src_line);
-
-	next:
-		src_line = (void *)src_line + sizeof_src_line;
-	}
-
-	resort_source_line(root, &tmp_root);
-	return 0;
-}
-
 static void print_summary(struct rb_root *root, const char *filename)
 {
-	struct source_line *src_line;
+	struct annotation_line *al;
 	struct rb_node *node;
 
 	printf("\nSorted summary for file %s\n", filename);
@@ -1770,9 +1744,9 @@ static void print_summary(struct rb_root *root, const char *filename)
 		char *path;
 		int i;
 
-		src_line = rb_entry(node, struct source_line, node);
-		for (i = 0; i < src_line->nr_pcnt; i++) {
-			percent = src_line->samples[i].percent_sum;
+		al = rb_entry(node, struct annotation_line, rb_node);
+		for (i = 0; i < al->samples_nr; i++) {
+			percent = al->samples[i].percent_sum;
 			color = get_percent_color(percent);
 			color_fprintf(stdout, color, " %7.2f", percent);
 
@@ -1780,7 +1754,7 @@ static void print_summary(struct rb_root *root, const char *filename)
 				percent_max = percent;
 		}
 
-		path = src_line->path;
+		path = al->path;
 		color = get_percent_color(percent_max);
 		color_fprintf(stdout, color, " %s\n", path);
 
@@ -1801,6 +1775,19 @@ static void symbol__annotate_hits(struct symbol *sym, struct perf_evsel *evsel)
 	printf("%*s: %" PRIu64 "\n", BITS_PER_LONG / 2, "h->nr_samples", h->nr_samples);
 }
 
+static int annotated_source__addr_fmt_width(struct list_head *lines, u64 start)
+{
+	char bf[32];
+	struct annotation_line *line;
+
+	list_for_each_entry_reverse(line, lines, node) {
+		if (line->offset != -1)
+			return scnprintf(bf, sizeof(bf), "%" PRIx64, start + line->offset);
+	}
+
+	return 0;
+}
+
 int symbol__annotate_printf(struct symbol *sym, struct map *map,
 			    struct perf_evsel *evsel, bool full_paths,
 			    int min_pcnt, int max_lines, int context)
@@ -1811,9 +1798,9 @@ int symbol__annotate_printf(struct symbol *sym, struct map *map,
 	const char *evsel_name = perf_evsel__name(evsel);
 	struct annotation *notes = symbol__annotation(sym);
 	struct sym_hist *h = annotation__histogram(notes, evsel->idx);
-	struct disasm_line *pos, *queue = NULL;
+	struct annotation_line *pos, *queue = NULL;
 	u64 start = map__rip_2objdump(map, sym->start);
-	int printed = 2, queue_len = 0;
+	int printed = 2, queue_len = 0, addr_fmt_width;
 	int more = 0;
 	u64 len;
 	int width = symbol_conf.show_total_period ? 12 : 8;
@@ -1844,15 +1831,21 @@ int symbol__annotate_printf(struct symbol *sym, struct map *map,
 	if (verbose > 0)
 		symbol__annotate_hits(sym, evsel);
 
+	addr_fmt_width = annotated_source__addr_fmt_width(&notes->src->source, start);
+
 	list_for_each_entry(pos, &notes->src->source, node) {
+		int err;
+
 		if (context && queue == NULL) {
 			queue = pos;
 			queue_len = 0;
 		}
 
-		switch (disasm_line__print(pos, sym, start, evsel, len,
-					    min_pcnt, printed, max_lines,
-					    queue)) {
+		err = annotation_line__print(pos, sym, start, evsel, len,
+					     min_pcnt, printed, max_lines,
+					     queue, addr_fmt_width);
+
+		switch (err) {
 		case 0:
 			++printed;
 			if (context) {
@@ -1907,13 +1900,13 @@ void symbol__annotate_decay_histogram(struct symbol *sym, int evidx)
 	}
 }
 
-void disasm__purge(struct list_head *head)
+void annotated_source__purge(struct annotated_source *as)
 {
-	struct disasm_line *pos, *n;
+	struct annotation_line *al, *n;
 
-	list_for_each_entry_safe(pos, n, head, node) {
-		list_del(&pos->node);
-		disasm_line__free(pos);
+	list_for_each_entry_safe(al, n, &as->source, node) {
+		list_del(&al->node);
+		disasm_line__free(disasm_line(al));
 	}
 }
 
@@ -1921,10 +1914,10 @@ static size_t disasm_line__fprintf(struct disasm_line *dl, FILE *fp)
 {
 	size_t printed;
 
-	if (dl->offset == -1)
-		return fprintf(fp, "%s\n", dl->line);
+	if (dl->al.offset == -1)
+		return fprintf(fp, "%s\n", dl->al.line);
 
-	printed = fprintf(fp, "%#" PRIx64 " %s", dl->offset, dl->ins.name);
+	printed = fprintf(fp, "%#" PRIx64 " %s", dl->al.offset, dl->ins.name);
 
 	if (dl->ops.raw[0] != '\0') {
 		printed += fprintf(fp, "%.*s %s\n", 6 - (int)printed, " ",
@@ -1939,38 +1932,73 @@ size_t disasm__fprintf(struct list_head *head, FILE *fp)
 	struct disasm_line *pos;
 	size_t printed = 0;
 
-	list_for_each_entry(pos, head, node)
+	list_for_each_entry(pos, head, al.node)
 		printed += disasm_line__fprintf(pos, fp);
 
 	return printed;
 }
 
+static void annotation__calc_lines(struct annotation *notes, struct map *map,
+				  struct rb_root *root, u64 start)
+{
+	struct annotation_line *al;
+	struct rb_root tmp_root = RB_ROOT;
+
+	list_for_each_entry(al, &notes->src->source, node) {
+		double percent_max = 0.0;
+		int i;
+
+		for (i = 0; i < al->samples_nr; i++) {
+			struct annotation_data *sample;
+
+			sample = &al->samples[i];
+
+			if (sample->percent > percent_max)
+				percent_max = sample->percent;
+		}
+
+		if (percent_max <= 0.5)
+			continue;
+
+		al->path = get_srcline(map->dso, start + al->offset, NULL,
+				       false, true, start + al->offset);
+		insert_source_line(&tmp_root, al);
+	}
+
+	resort_source_line(root, &tmp_root);
+}
+
+static void symbol__calc_lines(struct symbol *sym, struct map *map,
+			      struct rb_root *root)
+{
+	struct annotation *notes = symbol__annotation(sym);
+	u64 start = map__rip_2objdump(map, sym->start);
+
+	annotation__calc_lines(notes, map, root, start);
+}
+
 int symbol__tty_annotate(struct symbol *sym, struct map *map,
 			 struct perf_evsel *evsel, bool print_lines,
 			 bool full_paths, int min_pcnt, int max_lines)
 {
 	struct dso *dso = map->dso;
 	struct rb_root source_line = RB_ROOT;
-	u64 len;
 
-	if (symbol__disassemble(sym, map, perf_evsel__env_arch(evsel),
-				0, NULL, NULL) < 0)
+	if (symbol__annotate(sym, map, evsel, 0, NULL) < 0)
 		return -1;
 
-	len = symbol__size(sym);
+	symbol__calc_percent(sym, evsel);
 
 	if (print_lines) {
 		srcline_full_filename = full_paths;
-		symbol__get_source_line(sym, map, evsel, &source_line, len);
+		symbol__calc_lines(sym, map, &source_line);
 		print_summary(&source_line, dso->long_name);
 	}
 
 	symbol__annotate_printf(sym, map, evsel, full_paths,
 				min_pcnt, max_lines, 0);
-	if (print_lines)
-		symbol__free_source_line(sym, len);
 
-	disasm__purge(&symbol__annotation(sym)->src->source);
+	annotated_source__purge(symbol__annotation(sym)->src);
 
 	return 0;
 }
diff --git a/tools/perf/util/annotate.h b/tools/perf/util/annotate.h
index f6ba356..ce42744 100644
--- a/tools/perf/util/annotate.h
+++ b/tools/perf/util/annotate.h
@@ -59,33 +59,55 @@ bool ins__is_fused(struct arch *arch, const char *ins1, const char *ins2);
 
 struct annotation;
 
-struct disasm_line {
-	struct list_head    node;
-	s64		    offset;
-	char		    *line;
-	struct ins	    ins;
-	int		    line_nr;
-	float		    ipc;
-	u64		    cycles;
-	struct ins_operands ops;
+struct sym_hist_entry {
+	u64		nr_samples;
+	u64		period;
 };
 
+struct annotation_data {
+	double			 percent;
+	double			 percent_sum;
+	struct sym_hist_entry	 he;
+};
+
+struct annotation_line {
+	struct list_head	 node;
+	struct rb_node		 rb_node;
+	s64			 offset;
+	char			*line;
+	int			 line_nr;
+	float			 ipc;
+	u64			 cycles;
+	size_t			 privsize;
+	char			*path;
+	int			 samples_nr;
+	struct annotation_data	 samples[0];
+};
+
+struct disasm_line {
+	struct ins		 ins;
+	struct ins_operands	 ops;
+
+	/* This needs to be at the end. */
+	struct annotation_line	 al;
+};
+
+static inline struct disasm_line *disasm_line(struct annotation_line *al)
+{
+	return al ? container_of(al, struct disasm_line, al) : NULL;
+}
+
 static inline bool disasm_line__has_offset(const struct disasm_line *dl)
 {
 	return dl->ops.target.offset_avail;
 }
 
-struct sym_hist_entry {
-	u64		nr_samples;
-	u64		period;
-};
-
 void disasm_line__free(struct disasm_line *dl);
-struct disasm_line *disasm__get_next_ip_line(struct list_head *head, struct disasm_line *pos);
+struct annotation_line *
+annotation_line__next(struct annotation_line *pos, struct list_head *head);
 int disasm_line__scnprintf(struct disasm_line *dl, char *bf, size_t size, bool raw);
 size_t disasm__fprintf(struct list_head *head, FILE *fp);
-double disasm__calc_percent(struct annotation *notes, int evidx, s64 offset,
-			    s64 end, const char **path, struct sym_hist_entry *sample);
+void symbol__calc_percent(struct symbol *sym, struct perf_evsel *evsel);
 
 struct sym_hist {
 	u64		      nr_samples;
@@ -104,19 +126,6 @@ struct cyc_hist {
 	u16	reset;
 };
 
-struct source_line_samples {
-	double		percent;
-	double		percent_sum;
-	u64		nr;
-};
-
-struct source_line {
-	struct rb_node	node;
-	char		*path;
-	int		nr_pcnt;
-	struct source_line_samples samples[1];
-};
-
 /** struct annotated_source - symbols with hits have this attached as in sannotation
  *
  * @histogram: Array of addr hit histograms per event being monitored
@@ -132,7 +141,6 @@ struct source_line {
  */
 struct annotated_source {
 	struct list_head   source;
-	struct source_line *lines;
 	int    		   nr_histograms;
 	size_t		   sizeof_sym_hist;
 	struct cyc_hist	   *cycles_hist;
@@ -169,9 +177,9 @@ int hist_entry__inc_addr_samples(struct hist_entry *he, struct perf_sample *samp
 int symbol__alloc_hist(struct symbol *sym);
 void symbol__annotate_zero_histograms(struct symbol *sym);
 
-int symbol__disassemble(struct symbol *sym, struct map *map,
-			const char *arch_name, size_t privsize,
-			struct arch **parch, char *cpuid);
+int symbol__annotate(struct symbol *sym, struct map *map,
+		     struct perf_evsel *evsel, size_t privsize,
+		     struct arch **parch);
 
 enum symbol_disassemble_errno {
 	SYMBOL_ANNOTATE_ERRNO__SUCCESS		= 0,
@@ -198,7 +206,7 @@ int symbol__annotate_printf(struct symbol *sym, struct map *map,
 			    int min_pcnt, int max_lines, int context);
 void symbol__annotate_zero_histogram(struct symbol *sym, int evidx);
 void symbol__annotate_decay_histogram(struct symbol *sym, int evidx);
-void disasm__purge(struct list_head *head);
+void annotated_source__purge(struct annotated_source *as);
 
 bool ui__has_annotation(void);
 
diff --git a/tools/perf/util/arm-spe-pkt-decoder.c b/tools/perf/util/arm-spe-pkt-decoder.c
new file mode 100644
index 0000000..b94001b
--- /dev/null
+++ b/tools/perf/util/arm-spe-pkt-decoder.c
@@ -0,0 +1,462 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#include <stdio.h>
+#include <string.h>
+#include <endian.h>
+#include <byteswap.h>
+
+#include "arm-spe-pkt-decoder.h"
+
+#define BIT(n)		(1ULL << (n))
+
+#define NS_FLAG		BIT(63)
+#define EL_FLAG		(BIT(62) | BIT(61))
+
+#define SPE_HEADER0_PAD			0x0
+#define SPE_HEADER0_END			0x1
+#define SPE_HEADER0_ADDRESS		0x30 /* address packet (short) */
+#define SPE_HEADER0_ADDRESS_MASK	0x38
+#define SPE_HEADER0_COUNTER		0x18 /* counter packet (short) */
+#define SPE_HEADER0_COUNTER_MASK	0x38
+#define SPE_HEADER0_TIMESTAMP		0x71
+#define SPE_HEADER0_TIMESTAMP		0x71
+#define SPE_HEADER0_EVENTS		0x2
+#define SPE_HEADER0_EVENTS_MASK		0xf
+#define SPE_HEADER0_SOURCE		0x3
+#define SPE_HEADER0_SOURCE_MASK		0xf
+#define SPE_HEADER0_CONTEXT		0x24
+#define SPE_HEADER0_CONTEXT_MASK	0x3c
+#define SPE_HEADER0_OP_TYPE		0x8
+#define SPE_HEADER0_OP_TYPE_MASK	0x3c
+#define SPE_HEADER1_ALIGNMENT		0x0
+#define SPE_HEADER1_ADDRESS		0xb0 /* address packet (extended) */
+#define SPE_HEADER1_ADDRESS_MASK	0xf8
+#define SPE_HEADER1_COUNTER		0x98 /* counter packet (extended) */
+#define SPE_HEADER1_COUNTER_MASK	0xf8
+
+#if __BYTE_ORDER == __BIG_ENDIAN
+#define le16_to_cpu bswap_16
+#define le32_to_cpu bswap_32
+#define le64_to_cpu bswap_64
+#define memcpy_le64(d, s, n) do { \
+	memcpy((d), (s), (n));    \
+	*(d) = le64_to_cpu(*(d)); \
+} while (0)
+#else
+#define le16_to_cpu
+#define le32_to_cpu
+#define le64_to_cpu
+#define memcpy_le64 memcpy
+#endif
+
+static const char * const arm_spe_packet_name[] = {
+	[ARM_SPE_PAD]		= "PAD",
+	[ARM_SPE_END]		= "END",
+	[ARM_SPE_TIMESTAMP]	= "TS",
+	[ARM_SPE_ADDRESS]	= "ADDR",
+	[ARM_SPE_COUNTER]	= "LAT",
+	[ARM_SPE_CONTEXT]	= "CONTEXT",
+	[ARM_SPE_OP_TYPE]	= "OP-TYPE",
+	[ARM_SPE_EVENTS]	= "EVENTS",
+	[ARM_SPE_DATA_SOURCE]	= "DATA-SOURCE",
+};
+
+const char *arm_spe_pkt_name(enum arm_spe_pkt_type type)
+{
+	return arm_spe_packet_name[type];
+}
+
+/* return ARM SPE payload size from its encoding,
+ * which is in bits 5:4 of the byte.
+ * 00 : byte
+ * 01 : halfword (2)
+ * 10 : word (4)
+ * 11 : doubleword (8)
+ */
+static int payloadlen(unsigned char byte)
+{
+	return 1 << ((byte & 0x30) >> 4);
+}
+
+static int arm_spe_get_payload(const unsigned char *buf, size_t len,
+			       struct arm_spe_pkt *packet)
+{
+	size_t payload_len = payloadlen(buf[0]);
+
+	if (len < 1 + payload_len)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	buf++;
+
+	switch (payload_len) {
+	case 1: packet->payload = *(uint8_t *)buf; break;
+	case 2: packet->payload = le16_to_cpu(*(uint16_t *)buf); break;
+	case 4: packet->payload = le32_to_cpu(*(uint32_t *)buf); break;
+	case 8: packet->payload = le64_to_cpu(*(uint64_t *)buf); break;
+	default: return ARM_SPE_BAD_PACKET;
+	}
+
+	return 1 + payload_len;
+}
+
+static int arm_spe_get_pad(struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_PAD;
+	return 1;
+}
+
+static int arm_spe_get_alignment(const unsigned char *buf, size_t len,
+				 struct arm_spe_pkt *packet)
+{
+	unsigned int alignment = 1 << ((buf[0] & 0xf) + 1);
+
+	if (len < alignment)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	packet->type = ARM_SPE_PAD;
+	return alignment - (((uintptr_t)buf) & (alignment - 1));
+}
+
+static int arm_spe_get_end(struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_END;
+	return 1;
+}
+
+static int arm_spe_get_timestamp(const unsigned char *buf, size_t len,
+				 struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_TIMESTAMP;
+	return arm_spe_get_payload(buf, len, packet);
+}
+
+static int arm_spe_get_events(const unsigned char *buf, size_t len,
+			      struct arm_spe_pkt *packet)
+{
+	int ret = arm_spe_get_payload(buf, len, packet);
+
+	packet->type = ARM_SPE_EVENTS;
+
+	/* we use index to identify Events with a less number of
+	 * comparisons in arm_spe_pkt_desc(): E.g., the LLC-ACCESS,
+	 * LLC-REFILL, and REMOTE-ACCESS events are identified iff
+	 * index > 1.
+	 */
+	packet->index = ret - 1;
+
+	return ret;
+}
+
+static int arm_spe_get_data_source(const unsigned char *buf, size_t len,
+				   struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_DATA_SOURCE;
+	return arm_spe_get_payload(buf, len, packet);
+}
+
+static int arm_spe_get_context(const unsigned char *buf, size_t len,
+			       struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_CONTEXT;
+	packet->index = buf[0] & 0x3;
+
+	return arm_spe_get_payload(buf, len, packet);
+}
+
+static int arm_spe_get_op_type(const unsigned char *buf, size_t len,
+			       struct arm_spe_pkt *packet)
+{
+	packet->type = ARM_SPE_OP_TYPE;
+	packet->index = buf[0] & 0x3;
+	return arm_spe_get_payload(buf, len, packet);
+}
+
+static int arm_spe_get_counter(const unsigned char *buf, size_t len,
+			       const unsigned char ext_hdr, struct arm_spe_pkt *packet)
+{
+	if (len < 2)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	packet->type = ARM_SPE_COUNTER;
+	if (ext_hdr)
+		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
+	else
+		packet->index = buf[0] & 0x7;
+
+	packet->payload = le16_to_cpu(*(uint16_t *)(buf + 1));
+
+	return 1 + ext_hdr + 2;
+}
+
+static int arm_spe_get_addr(const unsigned char *buf, size_t len,
+			    const unsigned char ext_hdr, struct arm_spe_pkt *packet)
+{
+	if (len < 8)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	packet->type = ARM_SPE_ADDRESS;
+	if (ext_hdr)
+		packet->index = ((buf[0] & 0x3) << 3) | (buf[1] & 0x7);
+	else
+		packet->index = buf[0] & 0x7;
+
+	memcpy_le64(&packet->payload, buf + 1, 8);
+
+	return 1 + ext_hdr + 8;
+}
+
+static int arm_spe_do_get_packet(const unsigned char *buf, size_t len,
+				 struct arm_spe_pkt *packet)
+{
+	unsigned int byte;
+
+	memset(packet, 0, sizeof(struct arm_spe_pkt));
+
+	if (!len)
+		return ARM_SPE_NEED_MORE_BYTES;
+
+	byte = buf[0];
+	if (byte == SPE_HEADER0_PAD)
+		return arm_spe_get_pad(packet);
+	else if (byte == SPE_HEADER0_END) /* no timestamp at end of record */
+		return arm_spe_get_end(packet);
+	else if (byte & 0xc0 /* 0y11xxxxxx */) {
+		if (byte & 0x80) {
+			if ((byte & SPE_HEADER0_ADDRESS_MASK) == SPE_HEADER0_ADDRESS)
+				return arm_spe_get_addr(buf, len, 0, packet);
+			if ((byte & SPE_HEADER0_COUNTER_MASK) == SPE_HEADER0_COUNTER)
+				return arm_spe_get_counter(buf, len, 0, packet);
+		} else
+			if (byte == SPE_HEADER0_TIMESTAMP)
+				return arm_spe_get_timestamp(buf, len, packet);
+			else if ((byte & SPE_HEADER0_EVENTS_MASK) == SPE_HEADER0_EVENTS)
+				return arm_spe_get_events(buf, len, packet);
+			else if ((byte & SPE_HEADER0_SOURCE_MASK) == SPE_HEADER0_SOURCE)
+				return arm_spe_get_data_source(buf, len, packet);
+			else if ((byte & SPE_HEADER0_CONTEXT_MASK) == SPE_HEADER0_CONTEXT)
+				return arm_spe_get_context(buf, len, packet);
+			else if ((byte & SPE_HEADER0_OP_TYPE_MASK) == SPE_HEADER0_OP_TYPE)
+				return arm_spe_get_op_type(buf, len, packet);
+	} else if ((byte & 0xe0) == 0x20 /* 0y001xxxxx */) {
+		/* 16-bit header */
+		byte = buf[1];
+		if (byte == SPE_HEADER1_ALIGNMENT)
+			return arm_spe_get_alignment(buf, len, packet);
+		else if ((byte & SPE_HEADER1_ADDRESS_MASK) == SPE_HEADER1_ADDRESS)
+			return arm_spe_get_addr(buf, len, 1, packet);
+		else if ((byte & SPE_HEADER1_COUNTER_MASK) == SPE_HEADER1_COUNTER)
+			return arm_spe_get_counter(buf, len, 1, packet);
+	}
+
+	return ARM_SPE_BAD_PACKET;
+}
+
+int arm_spe_get_packet(const unsigned char *buf, size_t len,
+		       struct arm_spe_pkt *packet)
+{
+	int ret;
+
+	ret = arm_spe_do_get_packet(buf, len, packet);
+	/* put multiple consecutive PADs on the same line, up to
+	 * the fixed-width output format of 16 bytes per line.
+	 */
+	if (ret > 0 && packet->type == ARM_SPE_PAD) {
+		while (ret < 16 && len > (size_t)ret && !buf[ret])
+			ret += 1;
+	}
+	return ret;
+}
+
+int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf,
+		     size_t buf_len)
+{
+	int ret, ns, el, idx = packet->index;
+	unsigned long long payload = packet->payload;
+	const char *name = arm_spe_pkt_name(packet->type);
+
+	switch (packet->type) {
+	case ARM_SPE_BAD:
+	case ARM_SPE_PAD:
+	case ARM_SPE_END:
+		return snprintf(buf, buf_len, "%s", name);
+	case ARM_SPE_EVENTS: {
+		size_t blen = buf_len;
+
+		ret = 0;
+		ret = snprintf(buf, buf_len, "EV");
+		buf += ret;
+		blen -= ret;
+		if (payload & 0x1) {
+			ret = snprintf(buf, buf_len, " EXCEPTION-GEN");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x2) {
+			ret = snprintf(buf, buf_len, " RETIRED");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x4) {
+			ret = snprintf(buf, buf_len, " L1D-ACCESS");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x8) {
+			ret = snprintf(buf, buf_len, " L1D-REFILL");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x10) {
+			ret = snprintf(buf, buf_len, " TLB-ACCESS");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x20) {
+			ret = snprintf(buf, buf_len, " TLB-REFILL");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x40) {
+			ret = snprintf(buf, buf_len, " NOT-TAKEN");
+			buf += ret;
+			blen -= ret;
+		}
+		if (payload & 0x80) {
+			ret = snprintf(buf, buf_len, " MISPRED");
+			buf += ret;
+			blen -= ret;
+		}
+		if (idx > 1) {
+			if (payload & 0x100) {
+				ret = snprintf(buf, buf_len, " LLC-ACCESS");
+				buf += ret;
+				blen -= ret;
+			}
+			if (payload & 0x200) {
+				ret = snprintf(buf, buf_len, " LLC-REFILL");
+				buf += ret;
+				blen -= ret;
+			}
+			if (payload & 0x400) {
+				ret = snprintf(buf, buf_len, " REMOTE-ACCESS");
+				buf += ret;
+				blen -= ret;
+			}
+		}
+		if (ret < 0)
+			return ret;
+		blen -= ret;
+		return buf_len - blen;
+	}
+	case ARM_SPE_OP_TYPE:
+		switch (idx) {
+		case 0:	return snprintf(buf, buf_len, "%s", payload & 0x1 ?
+					"COND-SELECT" : "INSN-OTHER");
+		case 1:	{
+			size_t blen = buf_len;
+
+			if (payload & 0x1)
+				ret = snprintf(buf, buf_len, "ST");
+			else
+				ret = snprintf(buf, buf_len, "LD");
+			buf += ret;
+			blen -= ret;
+			if (payload & 0x2) {
+				if (payload & 0x4) {
+					ret = snprintf(buf, buf_len, " AT");
+					buf += ret;
+					blen -= ret;
+				}
+				if (payload & 0x8) {
+					ret = snprintf(buf, buf_len, " EXCL");
+					buf += ret;
+					blen -= ret;
+				}
+				if (payload & 0x10) {
+					ret = snprintf(buf, buf_len, " AR");
+					buf += ret;
+					blen -= ret;
+				}
+			} else if (payload & 0x4) {
+				ret = snprintf(buf, buf_len, " SIMD-FP");
+				buf += ret;
+				blen -= ret;
+			}
+			if (ret < 0)
+				return ret;
+			blen -= ret;
+			return buf_len - blen;
+		}
+		case 2:	{
+			size_t blen = buf_len;
+
+			ret = snprintf(buf, buf_len, "B");
+			buf += ret;
+			blen -= ret;
+			if (payload & 0x1) {
+				ret = snprintf(buf, buf_len, " COND");
+				buf += ret;
+				blen -= ret;
+			}
+			if (payload & 0x2) {
+				ret = snprintf(buf, buf_len, " IND");
+				buf += ret;
+				blen -= ret;
+			}
+			if (ret < 0)
+				return ret;
+			blen -= ret;
+			return buf_len - blen;
+			}
+		default: return 0;
+		}
+	case ARM_SPE_DATA_SOURCE:
+	case ARM_SPE_TIMESTAMP:
+		return snprintf(buf, buf_len, "%s %lld", name, payload);
+	case ARM_SPE_ADDRESS:
+		switch (idx) {
+		case 0:
+		case 1: ns = !!(packet->payload & NS_FLAG);
+			el = (packet->payload & EL_FLAG) >> 61;
+			payload &= ~(0xffULL << 56);
+			return snprintf(buf, buf_len, "%s 0x%llx el%d ns=%d",
+				        (idx == 1) ? "TGT" : "PC", payload, el, ns);
+		case 2:	return snprintf(buf, buf_len, "VA 0x%llx", payload);
+		case 3:	ns = !!(packet->payload & NS_FLAG);
+			payload &= ~(0xffULL << 56);
+			return snprintf(buf, buf_len, "PA 0x%llx ns=%d",
+					payload, ns);
+		default: return 0;
+		}
+	case ARM_SPE_CONTEXT:
+		return snprintf(buf, buf_len, "%s 0x%lx el%d", name,
+				(unsigned long)payload, idx + 1);
+	case ARM_SPE_COUNTER: {
+		size_t blen = buf_len;
+
+		ret = snprintf(buf, buf_len, "%s %d ", name,
+			       (unsigned short)payload);
+		buf += ret;
+		blen -= ret;
+		switch (idx) {
+		case 0:	ret = snprintf(buf, buf_len, "TOT"); break;
+		case 1:	ret = snprintf(buf, buf_len, "ISSUE"); break;
+		case 2:	ret = snprintf(buf, buf_len, "XLAT"); break;
+		default: ret = 0;
+		}
+		if (ret < 0)
+			return ret;
+		blen -= ret;
+		return buf_len - blen;
+	}
+	default:
+		break;
+	}
+
+	return snprintf(buf, buf_len, "%s 0x%llx (%d)",
+			name, payload, packet->index);
+}
diff --git a/tools/perf/util/arm-spe-pkt-decoder.h b/tools/perf/util/arm-spe-pkt-decoder.h
new file mode 100644
index 0000000..d786ef6
--- /dev/null
+++ b/tools/perf/util/arm-spe-pkt-decoder.h
@@ -0,0 +1,43 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#ifndef INCLUDE__ARM_SPE_PKT_DECODER_H__
+#define INCLUDE__ARM_SPE_PKT_DECODER_H__
+
+#include <stddef.h>
+#include <stdint.h>
+
+#define ARM_SPE_PKT_DESC_MAX		256
+
+#define ARM_SPE_NEED_MORE_BYTES		-1
+#define ARM_SPE_BAD_PACKET		-2
+
+enum arm_spe_pkt_type {
+	ARM_SPE_BAD,
+	ARM_SPE_PAD,
+	ARM_SPE_END,
+	ARM_SPE_TIMESTAMP,
+	ARM_SPE_ADDRESS,
+	ARM_SPE_COUNTER,
+	ARM_SPE_CONTEXT,
+	ARM_SPE_OP_TYPE,
+	ARM_SPE_EVENTS,
+	ARM_SPE_DATA_SOURCE,
+};
+
+struct arm_spe_pkt {
+	enum arm_spe_pkt_type	type;
+	unsigned char		index;
+	uint64_t		payload;
+};
+
+const char *arm_spe_pkt_name(enum arm_spe_pkt_type);
+
+int arm_spe_get_packet(const unsigned char *buf, size_t len,
+		       struct arm_spe_pkt *packet);
+
+int arm_spe_pkt_desc(const struct arm_spe_pkt *packet, char *buf, size_t len);
+#endif
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c
new file mode 100644
index 0000000..6067267
--- /dev/null
+++ b/tools/perf/util/arm-spe.c
@@ -0,0 +1,231 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#include <endian.h>
+#include <errno.h>
+#include <byteswap.h>
+#include <inttypes.h>
+#include <linux/kernel.h>
+#include <linux/types.h>
+#include <linux/bitops.h>
+#include <linux/log2.h>
+
+#include "cpumap.h"
+#include "color.h"
+#include "evsel.h"
+#include "evlist.h"
+#include "machine.h"
+#include "session.h"
+#include "util.h"
+#include "thread.h"
+#include "debug.h"
+#include "auxtrace.h"
+#include "arm-spe.h"
+#include "arm-spe-pkt-decoder.h"
+
+struct arm_spe {
+	struct auxtrace			auxtrace;
+	struct auxtrace_queues		queues;
+	struct auxtrace_heap		heap;
+	u32				auxtrace_type;
+	struct perf_session		*session;
+	struct machine			*machine;
+	u32				pmu_type;
+};
+
+struct arm_spe_queue {
+	struct arm_spe		*spe;
+	unsigned int		queue_nr;
+	struct auxtrace_buffer	*buffer;
+	bool			on_heap;
+	bool			done;
+	pid_t			pid;
+	pid_t			tid;
+	int			cpu;
+};
+
+static void arm_spe_dump(struct arm_spe *spe __maybe_unused,
+			 unsigned char *buf, size_t len)
+{
+	struct arm_spe_pkt packet;
+	size_t pos = 0;
+	int ret, pkt_len, i;
+	char desc[ARM_SPE_PKT_DESC_MAX];
+	const char *color = PERF_COLOR_BLUE;
+
+	color_fprintf(stdout, color,
+		      ". ... ARM SPE data: size %zu bytes\n",
+		      len);
+
+	while (len) {
+		ret = arm_spe_get_packet(buf, len, &packet);
+		if (ret > 0)
+			pkt_len = ret;
+		else
+			pkt_len = 1;
+		printf(".");
+		color_fprintf(stdout, color, "  %08x: ", pos);
+		for (i = 0; i < pkt_len; i++)
+			color_fprintf(stdout, color, " %02x", buf[i]);
+		for (; i < 16; i++)
+			color_fprintf(stdout, color, "   ");
+		if (ret > 0) {
+			ret = arm_spe_pkt_desc(&packet, desc,
+					       ARM_SPE_PKT_DESC_MAX);
+			if (ret > 0)
+				color_fprintf(stdout, color, " %s\n", desc);
+		} else {
+			color_fprintf(stdout, color, " Bad packet!\n");
+		}
+		pos += pkt_len;
+		buf += pkt_len;
+		len -= pkt_len;
+	}
+}
+
+static void arm_spe_dump_event(struct arm_spe *spe, unsigned char *buf,
+			       size_t len)
+{
+	printf(".\n");
+	arm_spe_dump(spe, buf, len);
+}
+
+static int arm_spe_process_event(struct perf_session *session __maybe_unused,
+				 union perf_event *event __maybe_unused,
+				 struct perf_sample *sample __maybe_unused,
+				 struct perf_tool *tool __maybe_unused)
+{
+	return 0;
+}
+
+static int arm_spe_process_auxtrace_event(struct perf_session *session,
+					  union perf_event *event,
+					  struct perf_tool *tool __maybe_unused)
+{
+	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
+					     auxtrace);
+	struct auxtrace_buffer *buffer;
+	off_t data_offset;
+	int fd = perf_data__fd(session->data);
+	int err;
+
+	if (perf_data__is_pipe(session->data)) {
+		data_offset = 0;
+	} else {
+		data_offset = lseek(fd, 0, SEEK_CUR);
+		if (data_offset == -1)
+			return -errno;
+	}
+
+	err = auxtrace_queues__add_event(&spe->queues, session, event,
+					 data_offset, &buffer);
+	if (err)
+		return err;
+
+	/* Dump here now we have copied a piped trace out of the pipe */
+	if (dump_trace) {
+		if (auxtrace_buffer__get_data(buffer, fd)) {
+			arm_spe_dump_event(spe, buffer->data,
+					     buffer->size);
+			auxtrace_buffer__put_data(buffer);
+		}
+	}
+
+	return 0;
+}
+
+static int arm_spe_flush(struct perf_session *session __maybe_unused,
+			 struct perf_tool *tool __maybe_unused)
+{
+	return 0;
+}
+
+static void arm_spe_free_queue(void *priv)
+{
+	struct arm_spe_queue *speq = priv;
+
+	if (!speq)
+		return;
+	free(speq);
+}
+
+static void arm_spe_free_events(struct perf_session *session)
+{
+	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
+					     auxtrace);
+	struct auxtrace_queues *queues = &spe->queues;
+	unsigned int i;
+
+	for (i = 0; i < queues->nr_queues; i++) {
+		arm_spe_free_queue(queues->queue_array[i].priv);
+		queues->queue_array[i].priv = NULL;
+	}
+	auxtrace_queues__free(queues);
+}
+
+static void arm_spe_free(struct perf_session *session)
+{
+	struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe,
+					     auxtrace);
+
+	auxtrace_heap__free(&spe->heap);
+	arm_spe_free_events(session);
+	session->auxtrace = NULL;
+	free(spe);
+}
+
+static const char * const arm_spe_info_fmts[] = {
+	[ARM_SPE_PMU_TYPE]		= "  PMU Type           %"PRId64"\n",
+};
+
+static void arm_spe_print_info(u64 *arr)
+{
+	if (!dump_trace)
+		return;
+
+	fprintf(stdout, arm_spe_info_fmts[ARM_SPE_PMU_TYPE], arr[ARM_SPE_PMU_TYPE]);
+}
+
+int arm_spe_process_auxtrace_info(union perf_event *event,
+				  struct perf_session *session)
+{
+	struct auxtrace_info_event *auxtrace_info = &event->auxtrace_info;
+	size_t min_sz = sizeof(u64) * ARM_SPE_PMU_TYPE;
+	struct arm_spe *spe;
+	int err;
+
+	if (auxtrace_info->header.size < sizeof(struct auxtrace_info_event) +
+					min_sz)
+		return -EINVAL;
+
+	spe = zalloc(sizeof(struct arm_spe));
+	if (!spe)
+		return -ENOMEM;
+
+	err = auxtrace_queues__init(&spe->queues);
+	if (err)
+		goto err_free;
+
+	spe->session = session;
+	spe->machine = &session->machines.host; /* No kvm support */
+	spe->auxtrace_type = auxtrace_info->type;
+	spe->pmu_type = auxtrace_info->priv[ARM_SPE_PMU_TYPE];
+
+	spe->auxtrace.process_event = arm_spe_process_event;
+	spe->auxtrace.process_auxtrace_event = arm_spe_process_auxtrace_event;
+	spe->auxtrace.flush_events = arm_spe_flush;
+	spe->auxtrace.free_events = arm_spe_free_events;
+	spe->auxtrace.free = arm_spe_free;
+	session->auxtrace = &spe->auxtrace;
+
+	arm_spe_print_info(&auxtrace_info->priv[0]);
+
+	return 0;
+
+err_free:
+	free(spe);
+	return err;
+}
diff --git a/tools/perf/util/arm-spe.h b/tools/perf/util/arm-spe.h
new file mode 100644
index 0000000..98d3235
--- /dev/null
+++ b/tools/perf/util/arm-spe.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Arm Statistical Profiling Extensions (SPE) support
+ * Copyright (c) 2017-2018, Arm Ltd.
+ */
+
+#ifndef INCLUDE__PERF_ARM_SPE_H__
+#define INCLUDE__PERF_ARM_SPE_H__
+
+#define ARM_SPE_PMU_NAME "arm_spe_"
+
+enum {
+	ARM_SPE_PMU_TYPE,
+	ARM_SPE_PER_CPU_MMAPS,
+	ARM_SPE_AUXTRACE_PRIV_MAX,
+};
+
+#define ARM_SPE_AUXTRACE_PRIV_SIZE (ARM_SPE_AUXTRACE_PRIV_MAX * sizeof(u64))
+
+union perf_event;
+struct perf_session;
+struct perf_pmu;
+
+struct auxtrace_record *arm_spe_recording_init(int *err,
+					       struct perf_pmu *arm_spe_pmu);
+
+int arm_spe_process_auxtrace_info(union perf_event *event,
+				  struct perf_session *session);
+
+struct perf_event_attr *arm_spe_pmu_default_config(struct perf_pmu *arm_spe_pmu);
+#endif
diff --git a/tools/perf/util/auxtrace.c b/tools/perf/util/auxtrace.c
index a3349141..9faf3b5 100644
--- a/tools/perf/util/auxtrace.c
+++ b/tools/perf/util/auxtrace.c
@@ -31,9 +31,6 @@
 #include <sys/param.h>
 #include <stdlib.h>
 #include <stdio.h>
-#include <string.h>
-#include <limits.h>
-#include <errno.h>
 #include <linux/list.h>
 
 #include "../perf.h"
@@ -55,8 +52,10 @@
 #include "debug.h"
 #include <subcmd/parse-options.h>
 
+#include "cs-etm.h"
 #include "intel-pt.h"
 #include "intel-bts.h"
+#include "arm-spe.h"
 
 #include "sane_ctype.h"
 #include "symbol/kallsyms.h"
@@ -913,7 +912,10 @@ int perf_event__process_auxtrace_info(struct perf_tool *tool __maybe_unused,
 		return intel_pt_process_auxtrace_info(event, session);
 	case PERF_AUXTRACE_INTEL_BTS:
 		return intel_bts_process_auxtrace_info(event, session);
+	case PERF_AUXTRACE_ARM_SPE:
+		return arm_spe_process_auxtrace_info(event, session);
 	case PERF_AUXTRACE_CS_ETM:
+		return cs_etm__process_auxtrace_info(event, session);
 	case PERF_AUXTRACE_UNKNOWN:
 	default:
 		return -EINVAL;
diff --git a/tools/perf/util/auxtrace.h b/tools/perf/util/auxtrace.h
index d19e11b..453c148 100644
--- a/tools/perf/util/auxtrace.h
+++ b/tools/perf/util/auxtrace.h
@@ -43,6 +43,7 @@ enum auxtrace_type {
 	PERF_AUXTRACE_INTEL_PT,
 	PERF_AUXTRACE_INTEL_BTS,
 	PERF_AUXTRACE_CS_ETM,
+	PERF_AUXTRACE_ARM_SPE,
 };
 
 enum itrace_period_type {
diff --git a/tools/perf/util/bpf-loader.c b/tools/perf/util/bpf-loader.c
index 72c107f..af7ad81 100644
--- a/tools/perf/util/bpf-loader.c
+++ b/tools/perf/util/bpf-loader.c
@@ -94,7 +94,7 @@ struct bpf_object *bpf__prepare_load(const char *filename, bool source)
 		err = perf_clang__compile_bpf(filename, &obj_buf, &obj_buf_sz);
 		perf_clang__cleanup();
 		if (err) {
-			pr_warning("bpf: builtin compilation failed: %d, try external compiler\n", err);
+			pr_debug("bpf: builtin compilation failed: %d, try external compiler\n", err);
 			err = llvm__compile_bpf(filename, &obj_buf, &obj_buf_sz);
 			if (err)
 				return ERR_PTR(-BPF_LOADER_ERRNO__COMPILE);
@@ -1533,7 +1533,7 @@ int bpf__apply_obj_config(void)
 			(strcmp("__bpf_stdout__", 	\
 				bpf_map__name(pos)) == 0))
 
-int bpf__setup_stdout(struct perf_evlist *evlist __maybe_unused)
+int bpf__setup_stdout(struct perf_evlist *evlist)
 {
 	struct bpf_map_priv *tmpl_priv = NULL;
 	struct bpf_object *obj, *tmp;
diff --git a/tools/perf/util/callchain.c b/tools/perf/util/callchain.c
index 082505d0..32ef7bd 100644
--- a/tools/perf/util/callchain.c
+++ b/tools/perf/util/callchain.c
@@ -37,6 +37,15 @@ struct callchain_param callchain_param = {
 	CALLCHAIN_PARAM_DEFAULT
 };
 
+/*
+ * Are there any events usind DWARF callchains?
+ *
+ * I.e.
+ *
+ * -e cycles/call-graph=dwarf/
+ */
+bool dwarf_callchain_users;
+
 struct callchain_param callchain_param_default = {
 	CALLCHAIN_PARAM_DEFAULT
 };
@@ -265,6 +274,7 @@ int parse_callchain_record(const char *arg, struct callchain_param *param)
 			ret = 0;
 			param->record_mode = CALLCHAIN_DWARF;
 			param->dump_size = default_stack_dump_size;
+			dwarf_callchain_users = true;
 
 			tok = strtok_r(NULL, ",", &saveptr);
 			if (tok) {
diff --git a/tools/perf/util/callchain.h b/tools/perf/util/callchain.h
index b79ef24..154560b 100644
--- a/tools/perf/util/callchain.h
+++ b/tools/perf/util/callchain.h
@@ -89,6 +89,8 @@ enum chain_value {
 	CCVAL_COUNT,
 };
 
+extern bool dwarf_callchain_users;
+
 struct callchain_param {
 	bool			enabled;
 	enum perf_call_graph_mode record_mode;
diff --git a/tools/perf/util/cgroup.c b/tools/perf/util/cgroup.c
index d9ffc1e..984f691 100644
--- a/tools/perf/util/cgroup.c
+++ b/tools/perf/util/cgroup.c
@@ -6,6 +6,9 @@
 #include "cgroup.h"
 #include "evlist.h"
 #include <linux/stringify.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
 
 int nr_cgroups;
 
diff --git a/tools/perf/util/cs-etm-decoder/Build b/tools/perf/util/cs-etm-decoder/Build
new file mode 100644
index 0000000..bc22c39
--- /dev/null
+++ b/tools/perf/util/cs-etm-decoder/Build
@@ -0,0 +1 @@
+libperf-$(CONFIG_AUXTRACE) += cs-etm-decoder.o
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
new file mode 100644
index 0000000..1fb0184
--- /dev/null
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.c
@@ -0,0 +1,513 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright(C) 2015-2018 Linaro Limited.
+ *
+ * Author: Tor Jeremiassen <tor@ti.com>
+ * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
+ */
+
+#include <linux/err.h>
+#include <linux/list.h>
+#include <stdlib.h>
+#include <opencsd/c_api/opencsd_c_api.h>
+#include <opencsd/etmv4/trc_pkt_types_etmv4.h>
+#include <opencsd/ocsd_if_types.h>
+
+#include "cs-etm.h"
+#include "cs-etm-decoder.h"
+#include "intlist.h"
+#include "util.h"
+
+#define MAX_BUFFER 1024
+
+/* use raw logging */
+#ifdef CS_DEBUG_RAW
+#define CS_LOG_RAW_FRAMES
+#ifdef CS_RAW_PACKED
+#define CS_RAW_DEBUG_FLAGS (OCSD_DFRMTR_UNPACKED_RAW_OUT | \
+			    OCSD_DFRMTR_PACKED_RAW_OUT)
+#else
+#define CS_RAW_DEBUG_FLAGS (OCSD_DFRMTR_UNPACKED_RAW_OUT)
+#endif
+#endif
+
+struct cs_etm_decoder {
+	void *data;
+	void (*packet_printer)(const char *msg);
+	bool trace_on;
+	dcd_tree_handle_t dcd_tree;
+	cs_etm_mem_cb_type mem_access;
+	ocsd_datapath_resp_t prev_return;
+	u32 packet_count;
+	u32 head;
+	u32 tail;
+	struct cs_etm_packet packet_buffer[MAX_BUFFER];
+};
+
+static u32
+cs_etm_decoder__mem_access(const void *context,
+			   const ocsd_vaddr_t address,
+			   const ocsd_mem_space_acc_t mem_space __maybe_unused,
+			   const u32 req_size,
+			   u8 *buffer)
+{
+	struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context;
+
+	return decoder->mem_access(decoder->data,
+				   address,
+				   req_size,
+				   buffer);
+}
+
+int cs_etm_decoder__add_mem_access_cb(struct cs_etm_decoder *decoder,
+				      u64 start, u64 end,
+				      cs_etm_mem_cb_type cb_func)
+{
+	decoder->mem_access = cb_func;
+
+	if (ocsd_dt_add_callback_mem_acc(decoder->dcd_tree, start, end,
+					 OCSD_MEM_SPACE_ANY,
+					 cs_etm_decoder__mem_access, decoder))
+		return -1;
+
+	return 0;
+}
+
+int cs_etm_decoder__reset(struct cs_etm_decoder *decoder)
+{
+	ocsd_datapath_resp_t dp_ret;
+
+	dp_ret = ocsd_dt_process_data(decoder->dcd_tree, OCSD_OP_RESET,
+				      0, 0, NULL, NULL);
+	if (OCSD_DATA_RESP_IS_FATAL(dp_ret))
+		return -1;
+
+	return 0;
+}
+
+int cs_etm_decoder__get_packet(struct cs_etm_decoder *decoder,
+			       struct cs_etm_packet *packet)
+{
+	if (!decoder || !packet)
+		return -EINVAL;
+
+	/* Nothing to do, might as well just return */
+	if (decoder->packet_count == 0)
+		return 0;
+
+	*packet = decoder->packet_buffer[decoder->head];
+
+	decoder->head = (decoder->head + 1) & (MAX_BUFFER - 1);
+
+	decoder->packet_count--;
+
+	return 1;
+}
+
+static void cs_etm_decoder__gen_etmv4_config(struct cs_etm_trace_params *params,
+					     ocsd_etmv4_cfg *config)
+{
+	config->reg_configr = params->etmv4.reg_configr;
+	config->reg_traceidr = params->etmv4.reg_traceidr;
+	config->reg_idr0 = params->etmv4.reg_idr0;
+	config->reg_idr1 = params->etmv4.reg_idr1;
+	config->reg_idr2 = params->etmv4.reg_idr2;
+	config->reg_idr8 = params->etmv4.reg_idr8;
+	config->reg_idr9 = 0;
+	config->reg_idr10 = 0;
+	config->reg_idr11 = 0;
+	config->reg_idr12 = 0;
+	config->reg_idr13 = 0;
+	config->arch_ver = ARCH_V8;
+	config->core_prof = profile_CortexA;
+}
+
+static void cs_etm_decoder__print_str_cb(const void *p_context,
+					 const char *msg,
+					 const int str_len)
+{
+	if (p_context && str_len)
+		((struct cs_etm_decoder *)p_context)->packet_printer(msg);
+}
+
+static int
+cs_etm_decoder__init_def_logger_printing(struct cs_etm_decoder_params *d_params,
+					 struct cs_etm_decoder *decoder)
+{
+	int ret = 0;
+
+	if (d_params->packet_printer == NULL)
+		return -1;
+
+	decoder->packet_printer = d_params->packet_printer;
+
+	/*
+	 * Set up a library default logger to process any printers
+	 * (packet/raw frame) we add later.
+	 */
+	ret = ocsd_def_errlog_init(OCSD_ERR_SEV_ERROR, 1);
+	if (ret != 0)
+		return -1;
+
+	/* no stdout / err / file output */
+	ret = ocsd_def_errlog_config_output(C_API_MSGLOGOUT_FLG_NONE, NULL);
+	if (ret != 0)
+		return -1;
+
+	/*
+	 * Set the string CB for the default logger, passes strings to
+	 * perf print logger.
+	 */
+	ret = ocsd_def_errlog_set_strprint_cb(decoder->dcd_tree,
+					      (void *)decoder,
+					      cs_etm_decoder__print_str_cb);
+	if (ret != 0)
+		ret = -1;
+
+	return 0;
+}
+
+#ifdef CS_LOG_RAW_FRAMES
+static void
+cs_etm_decoder__init_raw_frame_logging(struct cs_etm_decoder_params *d_params,
+				       struct cs_etm_decoder *decoder)
+{
+	/* Only log these during a --dump operation */
+	if (d_params->operation == CS_ETM_OPERATION_PRINT) {
+		/* set up a library default logger to process the
+		 *  raw frame printer we add later
+		 */
+		ocsd_def_errlog_init(OCSD_ERR_SEV_ERROR, 1);
+
+		/* no stdout / err / file output */
+		ocsd_def_errlog_config_output(C_API_MSGLOGOUT_FLG_NONE, NULL);
+
+		/* set the string CB for the default logger,
+		 * passes strings to perf print logger.
+		 */
+		ocsd_def_errlog_set_strprint_cb(decoder->dcd_tree,
+						(void *)decoder,
+						cs_etm_decoder__print_str_cb);
+
+		/* use the built in library printer for the raw frames */
+		ocsd_dt_set_raw_frame_printer(decoder->dcd_tree,
+					      CS_RAW_DEBUG_FLAGS);
+	}
+}
+#else
+static void
+cs_etm_decoder__init_raw_frame_logging(
+		struct cs_etm_decoder_params *d_params __maybe_unused,
+		struct cs_etm_decoder *decoder __maybe_unused)
+{
+}
+#endif
+
+static int cs_etm_decoder__create_packet_printer(struct cs_etm_decoder *decoder,
+						 const char *decoder_name,
+						 void *trace_config)
+{
+	u8 csid;
+
+	if (ocsd_dt_create_decoder(decoder->dcd_tree, decoder_name,
+				   OCSD_CREATE_FLG_PACKET_PROC,
+				   trace_config, &csid))
+		return -1;
+
+	if (ocsd_dt_set_pkt_protocol_printer(decoder->dcd_tree, csid, 0))
+		return -1;
+
+	return 0;
+}
+
+static int
+cs_etm_decoder__create_etm_packet_printer(struct cs_etm_trace_params *t_params,
+					  struct cs_etm_decoder *decoder)
+{
+	const char *decoder_name;
+	ocsd_etmv4_cfg trace_config_etmv4;
+	void *trace_config;
+
+	switch (t_params->protocol) {
+	case CS_ETM_PROTO_ETMV4i:
+		cs_etm_decoder__gen_etmv4_config(t_params, &trace_config_etmv4);
+		decoder_name = OCSD_BUILTIN_DCD_ETMV4I;
+		trace_config = &trace_config_etmv4;
+		break;
+	default:
+		return -1;
+	}
+
+	return cs_etm_decoder__create_packet_printer(decoder,
+						     decoder_name,
+						     trace_config);
+}
+
+static void cs_etm_decoder__clear_buffer(struct cs_etm_decoder *decoder)
+{
+	int i;
+
+	decoder->head = 0;
+	decoder->tail = 0;
+	decoder->packet_count = 0;
+	for (i = 0; i < MAX_BUFFER; i++) {
+		decoder->packet_buffer[i].start_addr = 0xdeadbeefdeadbeefUL;
+		decoder->packet_buffer[i].end_addr   = 0xdeadbeefdeadbeefUL;
+		decoder->packet_buffer[i].exc	     = false;
+		decoder->packet_buffer[i].exc_ret    = false;
+		decoder->packet_buffer[i].cpu	     = INT_MIN;
+	}
+}
+
+static ocsd_datapath_resp_t
+cs_etm_decoder__buffer_packet(struct cs_etm_decoder *decoder,
+			      const ocsd_generic_trace_elem *elem,
+			      const u8 trace_chan_id,
+			      enum cs_etm_sample_type sample_type)
+{
+	u32 et = 0;
+	struct int_node *inode = NULL;
+
+	if (decoder->packet_count >= MAX_BUFFER - 1)
+		return OCSD_RESP_FATAL_SYS_ERR;
+
+	/* Search the RB tree for the cpu associated with this traceID */
+	inode = intlist__find(traceid_list, trace_chan_id);
+	if (!inode)
+		return OCSD_RESP_FATAL_SYS_ERR;
+
+	et = decoder->tail;
+	decoder->packet_buffer[et].sample_type = sample_type;
+	decoder->packet_buffer[et].start_addr = elem->st_addr;
+	decoder->packet_buffer[et].end_addr = elem->en_addr;
+	decoder->packet_buffer[et].exc = false;
+	decoder->packet_buffer[et].exc_ret = false;
+	decoder->packet_buffer[et].cpu = *((int *)inode->priv);
+
+	/* Wrap around if need be */
+	et = (et + 1) & (MAX_BUFFER - 1);
+
+	decoder->tail = et;
+	decoder->packet_count++;
+
+	if (decoder->packet_count == MAX_BUFFER - 1)
+		return OCSD_RESP_WAIT;
+
+	return OCSD_RESP_CONT;
+}
+
+static ocsd_datapath_resp_t cs_etm_decoder__gen_trace_elem_printer(
+				const void *context,
+				const ocsd_trc_index_t indx __maybe_unused,
+				const u8 trace_chan_id __maybe_unused,
+				const ocsd_generic_trace_elem *elem)
+{
+	ocsd_datapath_resp_t resp = OCSD_RESP_CONT;
+	struct cs_etm_decoder *decoder = (struct cs_etm_decoder *) context;
+
+	switch (elem->elem_type) {
+	case OCSD_GEN_TRC_ELEM_UNKNOWN:
+		break;
+	case OCSD_GEN_TRC_ELEM_NO_SYNC:
+		decoder->trace_on = false;
+		break;
+	case OCSD_GEN_TRC_ELEM_TRACE_ON:
+		decoder->trace_on = true;
+		break;
+	case OCSD_GEN_TRC_ELEM_INSTR_RANGE:
+		resp = cs_etm_decoder__buffer_packet(decoder, elem,
+						     trace_chan_id,
+						     CS_ETM_RANGE);
+		break;
+	case OCSD_GEN_TRC_ELEM_EXCEPTION:
+		decoder->packet_buffer[decoder->tail].exc = true;
+		break;
+	case OCSD_GEN_TRC_ELEM_EXCEPTION_RET:
+		decoder->packet_buffer[decoder->tail].exc_ret = true;
+		break;
+	case OCSD_GEN_TRC_ELEM_PE_CONTEXT:
+	case OCSD_GEN_TRC_ELEM_EO_TRACE:
+	case OCSD_GEN_TRC_ELEM_ADDR_NACC:
+	case OCSD_GEN_TRC_ELEM_TIMESTAMP:
+	case OCSD_GEN_TRC_ELEM_CYCLE_COUNT:
+	case OCSD_GEN_TRC_ELEM_ADDR_UNKNOWN:
+	case OCSD_GEN_TRC_ELEM_EVENT:
+	case OCSD_GEN_TRC_ELEM_SWTRACE:
+	case OCSD_GEN_TRC_ELEM_CUSTOM:
+	default:
+		break;
+	}
+
+	return resp;
+}
+
+static int cs_etm_decoder__create_etm_packet_decoder(
+					struct cs_etm_trace_params *t_params,
+					struct cs_etm_decoder *decoder)
+{
+	const char *decoder_name;
+	ocsd_etmv4_cfg trace_config_etmv4;
+	void *trace_config;
+	u8 csid;
+
+	switch (t_params->protocol) {
+	case CS_ETM_PROTO_ETMV4i:
+		cs_etm_decoder__gen_etmv4_config(t_params, &trace_config_etmv4);
+		decoder_name = OCSD_BUILTIN_DCD_ETMV4I;
+		trace_config = &trace_config_etmv4;
+		break;
+	default:
+		return -1;
+	}
+
+	if (ocsd_dt_create_decoder(decoder->dcd_tree,
+				     decoder_name,
+				     OCSD_CREATE_FLG_FULL_DECODER,
+				     trace_config, &csid))
+		return -1;
+
+	if (ocsd_dt_set_gen_elem_outfn(decoder->dcd_tree,
+				       cs_etm_decoder__gen_trace_elem_printer,
+				       decoder))
+		return -1;
+
+	return 0;
+}
+
+static int
+cs_etm_decoder__create_etm_decoder(struct cs_etm_decoder_params *d_params,
+				   struct cs_etm_trace_params *t_params,
+				   struct cs_etm_decoder *decoder)
+{
+	if (d_params->operation == CS_ETM_OPERATION_PRINT)
+		return cs_etm_decoder__create_etm_packet_printer(t_params,
+								 decoder);
+	else if (d_params->operation == CS_ETM_OPERATION_DECODE)
+		return cs_etm_decoder__create_etm_packet_decoder(t_params,
+								 decoder);
+
+	return -1;
+}
+
+struct cs_etm_decoder *
+cs_etm_decoder__new(int num_cpu, struct cs_etm_decoder_params *d_params,
+		    struct cs_etm_trace_params t_params[])
+{
+	struct cs_etm_decoder *decoder;
+	ocsd_dcd_tree_src_t format;
+	u32 flags;
+	int i, ret;
+
+	if ((!t_params) || (!d_params))
+		return NULL;
+
+	decoder = zalloc(sizeof(*decoder));
+
+	if (!decoder)
+		return NULL;
+
+	decoder->data = d_params->data;
+	decoder->prev_return = OCSD_RESP_CONT;
+	cs_etm_decoder__clear_buffer(decoder);
+	format = (d_params->formatted ? OCSD_TRC_SRC_FRAME_FORMATTED :
+					 OCSD_TRC_SRC_SINGLE);
+	flags = 0;
+	flags |= (d_params->fsyncs ? OCSD_DFRMTR_HAS_FSYNCS : 0);
+	flags |= (d_params->hsyncs ? OCSD_DFRMTR_HAS_HSYNCS : 0);
+	flags |= (d_params->frame_aligned ? OCSD_DFRMTR_FRAME_MEM_ALIGN : 0);
+
+	/*
+	 * Drivers may add barrier frames when used with perf, set up to
+	 * handle this. Barriers const of FSYNC packet repeated 4 times.
+	 */
+	flags |= OCSD_DFRMTR_RESET_ON_4X_FSYNC;
+
+	/* Create decode tree for the data source */
+	decoder->dcd_tree = ocsd_create_dcd_tree(format, flags);
+
+	if (decoder->dcd_tree == 0)
+		goto err_free_decoder;
+
+	/* init library print logging support */
+	ret = cs_etm_decoder__init_def_logger_printing(d_params, decoder);
+	if (ret != 0)
+		goto err_free_decoder_tree;
+
+	/* init raw frame logging if required */
+	cs_etm_decoder__init_raw_frame_logging(d_params, decoder);
+
+	for (i = 0; i < num_cpu; i++) {
+		ret = cs_etm_decoder__create_etm_decoder(d_params,
+							 &t_params[i],
+							 decoder);
+		if (ret != 0)
+			goto err_free_decoder_tree;
+	}
+
+	return decoder;
+
+err_free_decoder_tree:
+	ocsd_destroy_dcd_tree(decoder->dcd_tree);
+err_free_decoder:
+	free(decoder);
+	return NULL;
+}
+
+int cs_etm_decoder__process_data_block(struct cs_etm_decoder *decoder,
+				       u64 indx, const u8 *buf,
+				       size_t len, size_t *consumed)
+{
+	int ret = 0;
+	ocsd_datapath_resp_t cur = OCSD_RESP_CONT;
+	ocsd_datapath_resp_t prev_return = decoder->prev_return;
+	size_t processed = 0;
+	u32 count;
+
+	while (processed < len) {
+		if (OCSD_DATA_RESP_IS_WAIT(prev_return)) {
+			cur = ocsd_dt_process_data(decoder->dcd_tree,
+						   OCSD_OP_FLUSH,
+						   0,
+						   0,
+						   NULL,
+						   NULL);
+		} else if (OCSD_DATA_RESP_IS_CONT(prev_return)) {
+			cur = ocsd_dt_process_data(decoder->dcd_tree,
+						   OCSD_OP_DATA,
+						   indx + processed,
+						   len - processed,
+						   &buf[processed],
+						   &count);
+			processed += count;
+		} else {
+			ret = -EINVAL;
+			break;
+		}
+
+		/*
+		 * Return to the input code if the packet buffer is full.
+		 * Flushing will get done once the packet buffer has been
+		 * processed.
+		 */
+		if (OCSD_DATA_RESP_IS_WAIT(cur))
+			break;
+
+		prev_return = cur;
+	}
+
+	decoder->prev_return = cur;
+	*consumed = processed;
+
+	return ret;
+}
+
+void cs_etm_decoder__free(struct cs_etm_decoder *decoder)
+{
+	if (!decoder)
+		return;
+
+	ocsd_destroy_dcd_tree(decoder->dcd_tree);
+	decoder->dcd_tree = NULL;
+	free(decoder);
+}
diff --git a/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
new file mode 100644
index 0000000..3d2e620
--- /dev/null
+++ b/tools/perf/util/cs-etm-decoder/cs-etm-decoder.h
@@ -0,0 +1,105 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright(C) 2015-2018 Linaro Limited.
+ *
+ * Author: Tor Jeremiassen <tor@ti.com>
+ * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
+ */
+
+#ifndef INCLUDE__CS_ETM_DECODER_H__
+#define INCLUDE__CS_ETM_DECODER_H__
+
+#include <linux/types.h>
+#include <stdio.h>
+
+struct cs_etm_decoder;
+
+struct cs_etm_buffer {
+	const unsigned char *buf;
+	size_t len;
+	u64 offset;
+	u64 ref_timestamp;
+};
+
+enum cs_etm_sample_type {
+	CS_ETM_RANGE = 1 << 0,
+};
+
+struct cs_etm_packet {
+	enum cs_etm_sample_type sample_type;
+	u64 start_addr;
+	u64 end_addr;
+	u8 exc;
+	u8 exc_ret;
+	int cpu;
+};
+
+struct cs_etm_queue;
+
+typedef u32 (*cs_etm_mem_cb_type)(struct cs_etm_queue *, u64,
+				  size_t, u8 *);
+
+struct cs_etmv4_trace_params {
+	u32 reg_idr0;
+	u32 reg_idr1;
+	u32 reg_idr2;
+	u32 reg_idr8;
+	u32 reg_configr;
+	u32 reg_traceidr;
+};
+
+struct cs_etm_trace_params {
+	int protocol;
+	union {
+		struct cs_etmv4_trace_params etmv4;
+	};
+};
+
+struct cs_etm_decoder_params {
+	int operation;
+	void (*packet_printer)(const char *msg);
+	cs_etm_mem_cb_type mem_acc_cb;
+	u8 formatted;
+	u8 fsyncs;
+	u8 hsyncs;
+	u8 frame_aligned;
+	void *data;
+};
+
+/*
+ * The following enums are indexed starting with 1 to align with the
+ * open source coresight trace decoder library.
+ */
+enum {
+	CS_ETM_PROTO_ETMV3 = 1,
+	CS_ETM_PROTO_ETMV4i,
+	CS_ETM_PROTO_ETMV4d,
+};
+
+enum {
+	CS_ETM_OPERATION_PRINT = 1,
+	CS_ETM_OPERATION_DECODE,
+};
+
+int cs_etm_decoder__process_data_block(struct cs_etm_decoder *decoder,
+				       u64 indx, const u8 *buf,
+				       size_t len, size_t *consumed);
+
+struct cs_etm_decoder *
+cs_etm_decoder__new(int num_cpu,
+		    struct cs_etm_decoder_params *d_params,
+		    struct cs_etm_trace_params t_params[]);
+
+void cs_etm_decoder__free(struct cs_etm_decoder *decoder);
+
+int cs_etm_decoder__add_mem_access_cb(struct cs_etm_decoder *decoder,
+				      u64 start, u64 end,
+				      cs_etm_mem_cb_type cb_func);
+
+int cs_etm_decoder__get_packet(struct cs_etm_decoder *decoder,
+			       struct cs_etm_packet *packet);
+
+int cs_etm_decoder__reset(struct cs_etm_decoder *decoder);
+
+#endif /* INCLUDE__CS_ETM_DECODER_H__ */
diff --git a/tools/perf/util/cs-etm.c b/tools/perf/util/cs-etm.c
new file mode 100644
index 0000000..b9f0a53
--- /dev/null
+++ b/tools/perf/util/cs-etm.c
@@ -0,0 +1,1023 @@
+/*
+ * SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright(C) 2015-2018 Linaro Limited.
+ *
+ * Author: Tor Jeremiassen <tor@ti.com>
+ * Author: Mathieu Poirier <mathieu.poirier@linaro.org>
+ */
+
+#include <linux/bitops.h>
+#include <linux/err.h>
+#include <linux/kernel.h>
+#include <linux/log2.h>
+#include <linux/types.h>
+
+#include <stdlib.h>
+
+#include "auxtrace.h"
+#include "color.h"
+#include "cs-etm.h"
+#include "cs-etm-decoder/cs-etm-decoder.h"
+#include "debug.h"
+#include "evlist.h"
+#include "intlist.h"
+#include "machine.h"
+#include "map.h"
+#include "perf.h"
+#include "thread.h"
+#include "thread_map.h"
+#include "thread-stack.h"
+#include "util.h"
+
+#define MAX_TIMESTAMP (~0ULL)
+
+struct cs_etm_auxtrace {
+	struct auxtrace auxtrace;
+	struct auxtrace_queues queues;
+	struct auxtrace_heap heap;
+	struct itrace_synth_opts synth_opts;
+	struct perf_session *session;
+	struct machine *machine;
+	struct thread *unknown_thread;
+
+	u8 timeless_decoding;
+	u8 snapshot_mode;
+	u8 data_queued;
+	u8 sample_branches;
+
+	int num_cpu;
+	u32 auxtrace_type;
+	u64 branches_sample_type;
+	u64 branches_id;
+	u64 **metadata;
+	u64 kernel_start;
+	unsigned int pmu_type;
+};
+
+struct cs_etm_queue {
+	struct cs_etm_auxtrace *etm;
+	struct thread *thread;
+	struct cs_etm_decoder *decoder;
+	struct auxtrace_buffer *buffer;
+	const struct cs_etm_state *state;
+	union perf_event *event_buf;
+	unsigned int queue_nr;
+	pid_t pid, tid;
+	int cpu;
+	u64 time;
+	u64 timestamp;
+	u64 offset;
+};
+
+static int cs_etm__update_queues(struct cs_etm_auxtrace *etm);
+static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
+					   pid_t tid, u64 time_);
+
+static void cs_etm__packet_dump(const char *pkt_string)
+{
+	const char *color = PERF_COLOR_BLUE;
+	int len = strlen(pkt_string);
+
+	if (len && (pkt_string[len-1] == '\n'))
+		color_fprintf(stdout, color, "	%s", pkt_string);
+	else
+		color_fprintf(stdout, color, "	%s\n", pkt_string);
+
+	fflush(stdout);
+}
+
+static void cs_etm__dump_event(struct cs_etm_auxtrace *etm,
+			       struct auxtrace_buffer *buffer)
+{
+	int i, ret;
+	const char *color = PERF_COLOR_BLUE;
+	struct cs_etm_decoder_params d_params;
+	struct cs_etm_trace_params *t_params;
+	struct cs_etm_decoder *decoder;
+	size_t buffer_used = 0;
+
+	fprintf(stdout, "\n");
+	color_fprintf(stdout, color,
+		     ". ... CoreSight ETM Trace data: size %zu bytes\n",
+		     buffer->size);
+
+	/* Use metadata to fill in trace parameters for trace decoder */
+	t_params = zalloc(sizeof(*t_params) * etm->num_cpu);
+	for (i = 0; i < etm->num_cpu; i++) {
+		t_params[i].protocol = CS_ETM_PROTO_ETMV4i;
+		t_params[i].etmv4.reg_idr0 = etm->metadata[i][CS_ETMV4_TRCIDR0];
+		t_params[i].etmv4.reg_idr1 = etm->metadata[i][CS_ETMV4_TRCIDR1];
+		t_params[i].etmv4.reg_idr2 = etm->metadata[i][CS_ETMV4_TRCIDR2];
+		t_params[i].etmv4.reg_idr8 = etm->metadata[i][CS_ETMV4_TRCIDR8];
+		t_params[i].etmv4.reg_configr =
+					etm->metadata[i][CS_ETMV4_TRCCONFIGR];
+		t_params[i].etmv4.reg_traceidr =
+					etm->metadata[i][CS_ETMV4_TRCTRACEIDR];
+	}
+
+	/* Set decoder parameters to simply print the trace packets */
+	d_params.packet_printer = cs_etm__packet_dump;
+	d_params.operation = CS_ETM_OPERATION_PRINT;
+	d_params.formatted = true;
+	d_params.fsyncs = false;
+	d_params.hsyncs = false;
+	d_params.frame_aligned = true;
+
+	decoder = cs_etm_decoder__new(etm->num_cpu, &d_params, t_params);
+
+	zfree(&t_params);
+
+	if (!decoder)
+		return;
+	do {
+		size_t consumed;
+
+		ret = cs_etm_decoder__process_data_block(
+				decoder, buffer->offset,
+				&((u8 *)buffer->data)[buffer_used],
+				buffer->size - buffer_used, &consumed);
+		if (ret)
+			break;
+
+		buffer_used += consumed;
+	} while (buffer_used < buffer->size);
+
+	cs_etm_decoder__free(decoder);
+}
+
+static int cs_etm__flush_events(struct perf_session *session,
+				struct perf_tool *tool)
+{
+	int ret;
+	struct cs_etm_auxtrace *etm = container_of(session->auxtrace,
+						   struct cs_etm_auxtrace,
+						   auxtrace);
+	if (dump_trace)
+		return 0;
+
+	if (!tool->ordered_events)
+		return -EINVAL;
+
+	if (!etm->timeless_decoding)
+		return -EINVAL;
+
+	ret = cs_etm__update_queues(etm);
+
+	if (ret < 0)
+		return ret;
+
+	return cs_etm__process_timeless_queues(etm, -1, MAX_TIMESTAMP - 1);
+}
+
+static void cs_etm__free_queue(void *priv)
+{
+	struct cs_etm_queue *etmq = priv;
+
+	free(etmq);
+}
+
+static void cs_etm__free_events(struct perf_session *session)
+{
+	unsigned int i;
+	struct cs_etm_auxtrace *aux = container_of(session->auxtrace,
+						   struct cs_etm_auxtrace,
+						   auxtrace);
+	struct auxtrace_queues *queues = &aux->queues;
+
+	for (i = 0; i < queues->nr_queues; i++) {
+		cs_etm__free_queue(queues->queue_array[i].priv);
+		queues->queue_array[i].priv = NULL;
+	}
+
+	auxtrace_queues__free(queues);
+}
+
+static void cs_etm__free(struct perf_session *session)
+{
+	int i;
+	struct int_node *inode, *tmp;
+	struct cs_etm_auxtrace *aux = container_of(session->auxtrace,
+						   struct cs_etm_auxtrace,
+						   auxtrace);
+	cs_etm__free_events(session);
+	session->auxtrace = NULL;
+
+	/* First remove all traceID/CPU# nodes for the RB tree */
+	intlist__for_each_entry_safe(inode, tmp, traceid_list)
+		intlist__remove(traceid_list, inode);
+	/* Then the RB tree itself */
+	intlist__delete(traceid_list);
+
+	for (i = 0; i < aux->num_cpu; i++)
+		zfree(&aux->metadata[i]);
+
+	zfree(&aux->metadata);
+	zfree(&aux);
+}
+
+static u32 cs_etm__mem_access(struct cs_etm_queue *etmq, u64 address,
+			      size_t size, u8 *buffer)
+{
+	u8  cpumode;
+	u64 offset;
+	int len;
+	struct	 thread *thread;
+	struct	 machine *machine;
+	struct	 addr_location al;
+
+	if (!etmq)
+		return -1;
+
+	machine = etmq->etm->machine;
+	if (address >= etmq->etm->kernel_start)
+		cpumode = PERF_RECORD_MISC_KERNEL;
+	else
+		cpumode = PERF_RECORD_MISC_USER;
+
+	thread = etmq->thread;
+	if (!thread) {
+		if (cpumode != PERF_RECORD_MISC_KERNEL)
+			return -EINVAL;
+		thread = etmq->etm->unknown_thread;
+	}
+
+	thread__find_addr_map(thread, cpumode, MAP__FUNCTION, address, &al);
+
+	if (!al.map || !al.map->dso)
+		return 0;
+
+	if (al.map->dso->data.status == DSO_DATA_STATUS_ERROR &&
+	    dso__data_status_seen(al.map->dso, DSO_DATA_STATUS_SEEN_ITRACE))
+		return 0;
+
+	offset = al.map->map_ip(al.map, address);
+
+	map__load(al.map);
+
+	len = dso__data_read_offset(al.map->dso, machine, offset, buffer, size);
+
+	if (len <= 0)
+		return 0;
+
+	return len;
+}
+
+static struct cs_etm_queue *cs_etm__alloc_queue(struct cs_etm_auxtrace *etm,
+						unsigned int queue_nr)
+{
+	int i;
+	struct cs_etm_decoder_params d_params;
+	struct cs_etm_trace_params  *t_params;
+	struct cs_etm_queue *etmq;
+
+	etmq = zalloc(sizeof(*etmq));
+	if (!etmq)
+		return NULL;
+
+	etmq->event_buf = malloc(PERF_SAMPLE_MAX_SIZE);
+	if (!etmq->event_buf)
+		goto out_free;
+
+	etmq->etm = etm;
+	etmq->queue_nr = queue_nr;
+	etmq->pid = -1;
+	etmq->tid = -1;
+	etmq->cpu = -1;
+
+	/* Use metadata to fill in trace parameters for trace decoder */
+	t_params = zalloc(sizeof(*t_params) * etm->num_cpu);
+
+	if (!t_params)
+		goto out_free;
+
+	for (i = 0; i < etm->num_cpu; i++) {
+		t_params[i].protocol = CS_ETM_PROTO_ETMV4i;
+		t_params[i].etmv4.reg_idr0 = etm->metadata[i][CS_ETMV4_TRCIDR0];
+		t_params[i].etmv4.reg_idr1 = etm->metadata[i][CS_ETMV4_TRCIDR1];
+		t_params[i].etmv4.reg_idr2 = etm->metadata[i][CS_ETMV4_TRCIDR2];
+		t_params[i].etmv4.reg_idr8 = etm->metadata[i][CS_ETMV4_TRCIDR8];
+		t_params[i].etmv4.reg_configr =
+					etm->metadata[i][CS_ETMV4_TRCCONFIGR];
+		t_params[i].etmv4.reg_traceidr =
+					etm->metadata[i][CS_ETMV4_TRCTRACEIDR];
+	}
+
+	/* Set decoder parameters to simply print the trace packets */
+	d_params.packet_printer = cs_etm__packet_dump;
+	d_params.operation = CS_ETM_OPERATION_DECODE;
+	d_params.formatted = true;
+	d_params.fsyncs = false;
+	d_params.hsyncs = false;
+	d_params.frame_aligned = true;
+	d_params.data = etmq;
+
+	etmq->decoder = cs_etm_decoder__new(etm->num_cpu, &d_params, t_params);
+
+	zfree(&t_params);
+
+	if (!etmq->decoder)
+		goto out_free;
+
+	/*
+	 * Register a function to handle all memory accesses required by
+	 * the trace decoder library.
+	 */
+	if (cs_etm_decoder__add_mem_access_cb(etmq->decoder,
+					      0x0L, ((u64) -1L),
+					      cs_etm__mem_access))
+		goto out_free_decoder;
+
+	etmq->offset = 0;
+
+	return etmq;
+
+out_free_decoder:
+	cs_etm_decoder__free(etmq->decoder);
+out_free:
+	zfree(&etmq->event_buf);
+	free(etmq);
+
+	return NULL;
+}
+
+static int cs_etm__setup_queue(struct cs_etm_auxtrace *etm,
+			       struct auxtrace_queue *queue,
+			       unsigned int queue_nr)
+{
+	struct cs_etm_queue *etmq = queue->priv;
+
+	if (list_empty(&queue->head) || etmq)
+		return 0;
+
+	etmq = cs_etm__alloc_queue(etm, queue_nr);
+
+	if (!etmq)
+		return -ENOMEM;
+
+	queue->priv = etmq;
+
+	if (queue->cpu != -1)
+		etmq->cpu = queue->cpu;
+
+	etmq->tid = queue->tid;
+
+	return 0;
+}
+
+static int cs_etm__setup_queues(struct cs_etm_auxtrace *etm)
+{
+	unsigned int i;
+	int ret;
+
+	for (i = 0; i < etm->queues.nr_queues; i++) {
+		ret = cs_etm__setup_queue(etm, &etm->queues.queue_array[i], i);
+		if (ret)
+			return ret;
+	}
+
+	return 0;
+}
+
+static int cs_etm__update_queues(struct cs_etm_auxtrace *etm)
+{
+	if (etm->queues.new_data) {
+		etm->queues.new_data = false;
+		return cs_etm__setup_queues(etm);
+	}
+
+	return 0;
+}
+
+static int
+cs_etm__get_trace(struct cs_etm_buffer *buff, struct cs_etm_queue *etmq)
+{
+	struct auxtrace_buffer *aux_buffer = etmq->buffer;
+	struct auxtrace_buffer *old_buffer = aux_buffer;
+	struct auxtrace_queue *queue;
+
+	queue = &etmq->etm->queues.queue_array[etmq->queue_nr];
+
+	aux_buffer = auxtrace_buffer__next(queue, aux_buffer);
+
+	/* If no more data, drop the previous auxtrace_buffer and return */
+	if (!aux_buffer) {
+		if (old_buffer)
+			auxtrace_buffer__drop_data(old_buffer);
+		buff->len = 0;
+		return 0;
+	}
+
+	etmq->buffer = aux_buffer;
+
+	/* If the aux_buffer doesn't have data associated, try to load it */
+	if (!aux_buffer->data) {
+		/* get the file desc associated with the perf data file */
+		int fd = perf_data__fd(etmq->etm->session->data);
+
+		aux_buffer->data = auxtrace_buffer__get_data(aux_buffer, fd);
+		if (!aux_buffer->data)
+			return -ENOMEM;
+	}
+
+	/* If valid, drop the previous buffer */
+	if (old_buffer)
+		auxtrace_buffer__drop_data(old_buffer);
+
+	buff->offset = aux_buffer->offset;
+	buff->len = aux_buffer->size;
+	buff->buf = aux_buffer->data;
+
+	buff->ref_timestamp = aux_buffer->reference;
+
+	return buff->len;
+}
+
+static void  cs_etm__set_pid_tid_cpu(struct cs_etm_auxtrace *etm,
+				     struct auxtrace_queue *queue)
+{
+	struct cs_etm_queue *etmq = queue->priv;
+
+	/* CPU-wide tracing isn't supported yet */
+	if (queue->tid == -1)
+		return;
+
+	if ((!etmq->thread) && (etmq->tid != -1))
+		etmq->thread = machine__find_thread(etm->machine, -1,
+						    etmq->tid);
+
+	if (etmq->thread) {
+		etmq->pid = etmq->thread->pid_;
+		if (queue->cpu == -1)
+			etmq->cpu = etmq->thread->cpu;
+	}
+}
+
+/*
+ * The cs etm packet encodes an instruction range between a branch target
+ * and the next taken branch. Generate sample accordingly.
+ */
+static int cs_etm__synth_branch_sample(struct cs_etm_queue *etmq,
+				       struct cs_etm_packet *packet)
+{
+	int ret = 0;
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	struct perf_sample sample = {.ip = 0,};
+	union perf_event *event = etmq->event_buf;
+	u64 start_addr = packet->start_addr;
+	u64 end_addr = packet->end_addr;
+
+	event->sample.header.type = PERF_RECORD_SAMPLE;
+	event->sample.header.misc = PERF_RECORD_MISC_USER;
+	event->sample.header.size = sizeof(struct perf_event_header);
+
+	sample.ip = start_addr;
+	sample.pid = etmq->pid;
+	sample.tid = etmq->tid;
+	sample.addr = end_addr;
+	sample.id = etmq->etm->branches_id;
+	sample.stream_id = etmq->etm->branches_id;
+	sample.period = 1;
+	sample.cpu = packet->cpu;
+	sample.flags = 0;
+	sample.cpumode = PERF_RECORD_MISC_USER;
+
+	ret = perf_session__deliver_synth_event(etm->session, event, &sample);
+
+	if (ret)
+		pr_err(
+		"CS ETM Trace: failed to deliver instruction event, error %d\n",
+		ret);
+
+	return ret;
+}
+
+struct cs_etm_synth {
+	struct perf_tool dummy_tool;
+	struct perf_session *session;
+};
+
+static int cs_etm__event_synth(struct perf_tool *tool,
+			       union perf_event *event,
+			       struct perf_sample *sample __maybe_unused,
+			       struct machine *machine __maybe_unused)
+{
+	struct cs_etm_synth *cs_etm_synth =
+		      container_of(tool, struct cs_etm_synth, dummy_tool);
+
+	return perf_session__deliver_synth_event(cs_etm_synth->session,
+						 event, NULL);
+}
+
+static int cs_etm__synth_event(struct perf_session *session,
+			       struct perf_event_attr *attr, u64 id)
+{
+	struct cs_etm_synth cs_etm_synth;
+
+	memset(&cs_etm_synth, 0, sizeof(struct cs_etm_synth));
+	cs_etm_synth.session = session;
+
+	return perf_event__synthesize_attr(&cs_etm_synth.dummy_tool, attr, 1,
+					   &id, cs_etm__event_synth);
+}
+
+static int cs_etm__synth_events(struct cs_etm_auxtrace *etm,
+				struct perf_session *session)
+{
+	struct perf_evlist *evlist = session->evlist;
+	struct perf_evsel *evsel;
+	struct perf_event_attr attr;
+	bool found = false;
+	u64 id;
+	int err;
+
+	evlist__for_each_entry(evlist, evsel) {
+		if (evsel->attr.type == etm->pmu_type) {
+			found = true;
+			break;
+		}
+	}
+
+	if (!found) {
+		pr_debug("No selected events with CoreSight Trace data\n");
+		return 0;
+	}
+
+	memset(&attr, 0, sizeof(struct perf_event_attr));
+	attr.size = sizeof(struct perf_event_attr);
+	attr.type = PERF_TYPE_HARDWARE;
+	attr.sample_type = evsel->attr.sample_type & PERF_SAMPLE_MASK;
+	attr.sample_type |= PERF_SAMPLE_IP | PERF_SAMPLE_TID |
+			    PERF_SAMPLE_PERIOD;
+	if (etm->timeless_decoding)
+		attr.sample_type &= ~(u64)PERF_SAMPLE_TIME;
+	else
+		attr.sample_type |= PERF_SAMPLE_TIME;
+
+	attr.exclude_user = evsel->attr.exclude_user;
+	attr.exclude_kernel = evsel->attr.exclude_kernel;
+	attr.exclude_hv = evsel->attr.exclude_hv;
+	attr.exclude_host = evsel->attr.exclude_host;
+	attr.exclude_guest = evsel->attr.exclude_guest;
+	attr.sample_id_all = evsel->attr.sample_id_all;
+	attr.read_format = evsel->attr.read_format;
+
+	/* create new id val to be a fixed offset from evsel id */
+	id = evsel->id[0] + 1000000000;
+
+	if (!id)
+		id = 1;
+
+	if (etm->synth_opts.branches) {
+		attr.config = PERF_COUNT_HW_BRANCH_INSTRUCTIONS;
+		attr.sample_period = 1;
+		attr.sample_type |= PERF_SAMPLE_ADDR;
+		err = cs_etm__synth_event(session, &attr, id);
+		if (err)
+			return err;
+		etm->sample_branches = true;
+		etm->branches_sample_type = attr.sample_type;
+		etm->branches_id = id;
+	}
+
+	return 0;
+}
+
+static int cs_etm__sample(struct cs_etm_queue *etmq)
+{
+	int ret;
+	struct cs_etm_packet packet;
+
+	while (1) {
+		ret = cs_etm_decoder__get_packet(etmq->decoder, &packet);
+		if (ret <= 0)
+			return ret;
+
+		/*
+		 * If the packet contains an instruction range, generate an
+		 * instruction sequence event.
+		 */
+		if (packet.sample_type & CS_ETM_RANGE)
+			cs_etm__synth_branch_sample(etmq, &packet);
+	}
+
+	return 0;
+}
+
+static int cs_etm__run_decoder(struct cs_etm_queue *etmq)
+{
+	struct cs_etm_auxtrace *etm = etmq->etm;
+	struct cs_etm_buffer buffer;
+	size_t buffer_used, processed;
+	int err = 0;
+
+	if (!etm->kernel_start)
+		etm->kernel_start = machine__kernel_start(etm->machine);
+
+	/* Go through each buffer in the queue and decode them one by one */
+more:
+	buffer_used = 0;
+	memset(&buffer, 0, sizeof(buffer));
+	err = cs_etm__get_trace(&buffer, etmq);
+	if (err <= 0)
+		return err;
+	/*
+	 * We cannot assume consecutive blocks in the data file are contiguous,
+	 * reset the decoder to force re-sync.
+	 */
+	err = cs_etm_decoder__reset(etmq->decoder);
+	if (err != 0)
+		return err;
+
+	/* Run trace decoder until buffer consumed or end of trace */
+	do {
+		processed = 0;
+
+		err = cs_etm_decoder__process_data_block(
+						etmq->decoder,
+						etmq->offset,
+						&buffer.buf[buffer_used],
+						buffer.len - buffer_used,
+						&processed);
+
+		if (err)
+			return err;
+
+		etmq->offset += processed;
+		buffer_used += processed;
+
+		/*
+		 * Nothing to do with an error condition, let's hope the next
+		 * chunk will be better.
+		 */
+		err = cs_etm__sample(etmq);
+	} while (buffer.len > buffer_used);
+
+goto more;
+
+	return err;
+}
+
+static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
+					   pid_t tid, u64 time_)
+{
+	unsigned int i;
+	struct auxtrace_queues *queues = &etm->queues;
+
+	for (i = 0; i < queues->nr_queues; i++) {
+		struct auxtrace_queue *queue = &etm->queues.queue_array[i];
+		struct cs_etm_queue *etmq = queue->priv;
+
+		if (etmq && ((tid == -1) || (etmq->tid == tid))) {
+			etmq->time = time_;
+			cs_etm__set_pid_tid_cpu(etm, queue);
+			cs_etm__run_decoder(etmq);
+		}
+	}
+
+	return 0;
+}
+
+static int cs_etm__process_event(struct perf_session *session,
+				 union perf_event *event,
+				 struct perf_sample *sample,
+				 struct perf_tool *tool)
+{
+	int err = 0;
+	u64 timestamp;
+	struct cs_etm_auxtrace *etm = container_of(session->auxtrace,
+						   struct cs_etm_auxtrace,
+						   auxtrace);
+
+	if (dump_trace)
+		return 0;
+
+	if (!tool->ordered_events) {
+		pr_err("CoreSight ETM Trace requires ordered events\n");
+		return -EINVAL;
+	}
+
+	if (!etm->timeless_decoding)
+		return -EINVAL;
+
+	if (sample->time && (sample->time != (u64) -1))
+		timestamp = sample->time;
+	else
+		timestamp = 0;
+
+	if (timestamp || etm->timeless_decoding) {
+		err = cs_etm__update_queues(etm);
+		if (err)
+			return err;
+	}
+
+	if (event->header.type == PERF_RECORD_EXIT)
+		return cs_etm__process_timeless_queues(etm,
+						       event->fork.tid,
+						       sample->time);
+
+	return 0;
+}
+
+static int cs_etm__process_auxtrace_event(struct perf_session *session,
+					  union perf_event *event,
+					  struct perf_tool *tool __maybe_unused)
+{
+	struct cs_etm_auxtrace *etm = container_of(session->auxtrace,
+						   struct cs_etm_auxtrace,
+						   auxtrace);
+	if (!etm->data_queued) {
+		struct auxtrace_buffer *buffer;
+		off_t  data_offset;
+		int fd = perf_data__fd(session->data);
+		bool is_pipe = perf_data__is_pipe(session->data);
+		int err;
+
+		if (is_pipe)
+			data_offset = 0;
+		else {
+			data_offset = lseek(fd, 0, SEEK_CUR);
+			if (data_offset == -1)
+				return -errno;
+		}
+
+		err = auxtrace_queues__add_event(&etm->queues, session,
+						 event, data_offset, &buffer);
+		if (err)
+			return err;
+
+		if (dump_trace)
+			if (auxtrace_buffer__get_data(buffer, fd)) {
+				cs_etm__dump_event(etm, buffer);
+				auxtrace_buffer__put_data(buffer);
+			}
+	}
+
+	return 0;
+}
+
+static bool cs_etm__is_timeless_decoding(struct cs_etm_auxtrace *etm)
+{
+	struct perf_evsel *evsel;
+	struct perf_evlist *evlist = etm->session->evlist;
+	bool timeless_decoding = true;
+
+	/*
+	 * Circle through the list of event and complain if we find one
+	 * with the time bit set.
+	 */
+	evlist__for_each_entry(evlist, evsel) {
+		if ((evsel->attr.sample_type & PERF_SAMPLE_TIME))
+			timeless_decoding = false;
+	}
+
+	return timeless_decoding;
+}
+
+static const char * const cs_etm_global_header_fmts[] = {
+	[CS_HEADER_VERSION_0]	= "	Header version		       %llx\n",
+	[CS_PMU_TYPE_CPUS]	= "	PMU type/num cpus	       %llx\n",
+	[CS_ETM_SNAPSHOT]	= "	Snapshot		       %llx\n",
+};
+
+static const char * const cs_etm_priv_fmts[] = {
+	[CS_ETM_MAGIC]		= "	Magic number		       %llx\n",
+	[CS_ETM_CPU]		= "	CPU			       %lld\n",
+	[CS_ETM_ETMCR]		= "	ETMCR			       %llx\n",
+	[CS_ETM_ETMTRACEIDR]	= "	ETMTRACEIDR		       %llx\n",
+	[CS_ETM_ETMCCER]	= "	ETMCCER			       %llx\n",
+	[CS_ETM_ETMIDR]		= "	ETMIDR			       %llx\n",
+};
+
+static const char * const cs_etmv4_priv_fmts[] = {
+	[CS_ETM_MAGIC]		= "	Magic number		       %llx\n",
+	[CS_ETM_CPU]		= "	CPU			       %lld\n",
+	[CS_ETMV4_TRCCONFIGR]	= "	TRCCONFIGR		       %llx\n",
+	[CS_ETMV4_TRCTRACEIDR]	= "	TRCTRACEIDR		       %llx\n",
+	[CS_ETMV4_TRCIDR0]	= "	TRCIDR0			       %llx\n",
+	[CS_ETMV4_TRCIDR1]	= "	TRCIDR1			       %llx\n",
+	[CS_ETMV4_TRCIDR2]	= "	TRCIDR2			       %llx\n",
+	[CS_ETMV4_TRCIDR8]	= "	TRCIDR8			       %llx\n",
+	[CS_ETMV4_TRCAUTHSTATUS] = "	TRCAUTHSTATUS		       %llx\n",
+};
+
+static void cs_etm__print_auxtrace_info(u64 *val, int num)
+{
+	int i, j, cpu = 0;
+
+	for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
+		fprintf(stdout, cs_etm_global_header_fmts[i], val[i]);
+
+	for (i = CS_HEADER_VERSION_0_MAX; cpu < num; cpu++) {
+		if (val[i] == __perf_cs_etmv3_magic)
+			for (j = 0; j < CS_ETM_PRIV_MAX; j++, i++)
+				fprintf(stdout, cs_etm_priv_fmts[j], val[i]);
+		else if (val[i] == __perf_cs_etmv4_magic)
+			for (j = 0; j < CS_ETMV4_PRIV_MAX; j++, i++)
+				fprintf(stdout, cs_etmv4_priv_fmts[j], val[i]);
+		else
+			/* failure.. return */
+			return;
+	}
+}
+
+int cs_etm__process_auxtrace_info(union perf_event *event,
+				  struct perf_session *session)
+{
+	struct auxtrace_info_event *auxtrace_info = &event->auxtrace_info;
+	struct cs_etm_auxtrace *etm = NULL;
+	struct int_node *inode;
+	unsigned int pmu_type;
+	int event_header_size = sizeof(struct perf_event_header);
+	int info_header_size;
+	int total_size = auxtrace_info->header.size;
+	int priv_size = 0;
+	int num_cpu;
+	int err = 0, idx = -1;
+	int i, j, k;
+	u64 *ptr, *hdr = NULL;
+	u64 **metadata = NULL;
+
+	/*
+	 * sizeof(auxtrace_info_event::type) +
+	 * sizeof(auxtrace_info_event::reserved) == 8
+	 */
+	info_header_size = 8;
+
+	if (total_size < (event_header_size + info_header_size))
+		return -EINVAL;
+
+	priv_size = total_size - event_header_size - info_header_size;
+
+	/* First the global part */
+	ptr = (u64 *) auxtrace_info->priv;
+
+	/* Look for version '0' of the header */
+	if (ptr[0] != 0)
+		return -EINVAL;
+
+	hdr = zalloc(sizeof(*hdr) * CS_HEADER_VERSION_0_MAX);
+	if (!hdr)
+		return -ENOMEM;
+
+	/* Extract header information - see cs-etm.h for format */
+	for (i = 0; i < CS_HEADER_VERSION_0_MAX; i++)
+		hdr[i] = ptr[i];
+	num_cpu = hdr[CS_PMU_TYPE_CPUS] & 0xffffffff;
+	pmu_type = (unsigned int) ((hdr[CS_PMU_TYPE_CPUS] >> 32) &
+				    0xffffffff);
+
+	/*
+	 * Create an RB tree for traceID-CPU# tuple. Since the conversion has
+	 * to be made for each packet that gets decoded, optimizing access in
+	 * anything other than a sequential array is worth doing.
+	 */
+	traceid_list = intlist__new(NULL);
+	if (!traceid_list) {
+		err = -ENOMEM;
+		goto err_free_hdr;
+	}
+
+	metadata = zalloc(sizeof(*metadata) * num_cpu);
+	if (!metadata) {
+		err = -ENOMEM;
+		goto err_free_traceid_list;
+	}
+
+	/*
+	 * The metadata is stored in the auxtrace_info section and encodes
+	 * the configuration of the ARM embedded trace macrocell which is
+	 * required by the trace decoder to properly decode the trace due
+	 * to its highly compressed nature.
+	 */
+	for (j = 0; j < num_cpu; j++) {
+		if (ptr[i] == __perf_cs_etmv3_magic) {
+			metadata[j] = zalloc(sizeof(*metadata[j]) *
+					     CS_ETM_PRIV_MAX);
+			if (!metadata[j]) {
+				err = -ENOMEM;
+				goto err_free_metadata;
+			}
+			for (k = 0; k < CS_ETM_PRIV_MAX; k++)
+				metadata[j][k] = ptr[i + k];
+
+			/* The traceID is our handle */
+			idx = metadata[j][CS_ETM_ETMTRACEIDR];
+			i += CS_ETM_PRIV_MAX;
+		} else if (ptr[i] == __perf_cs_etmv4_magic) {
+			metadata[j] = zalloc(sizeof(*metadata[j]) *
+					     CS_ETMV4_PRIV_MAX);
+			if (!metadata[j]) {
+				err = -ENOMEM;
+				goto err_free_metadata;
+			}
+			for (k = 0; k < CS_ETMV4_PRIV_MAX; k++)
+				metadata[j][k] = ptr[i + k];
+
+			/* The traceID is our handle */
+			idx = metadata[j][CS_ETMV4_TRCTRACEIDR];
+			i += CS_ETMV4_PRIV_MAX;
+		}
+
+		/* Get an RB node for this CPU */
+		inode = intlist__findnew(traceid_list, idx);
+
+		/* Something went wrong, no need to continue */
+		if (!inode) {
+			err = PTR_ERR(inode);
+			goto err_free_metadata;
+		}
+
+		/*
+		 * The node for that CPU should not be taken.
+		 * Back out if that's the case.
+		 */
+		if (inode->priv) {
+			err = -EINVAL;
+			goto err_free_metadata;
+		}
+		/* All good, associate the traceID with the CPU# */
+		inode->priv = &metadata[j][CS_ETM_CPU];
+	}
+
+	/*
+	 * Each of CS_HEADER_VERSION_0_MAX, CS_ETM_PRIV_MAX and
+	 * CS_ETMV4_PRIV_MAX mark how many double words are in the
+	 * global metadata, and each cpu's metadata respectively.
+	 * The following tests if the correct number of double words was
+	 * present in the auxtrace info section.
+	 */
+	if (i * 8 != priv_size) {
+		err = -EINVAL;
+		goto err_free_metadata;
+	}
+
+	etm = zalloc(sizeof(*etm));
+
+	if (!etm) {
+		err = -ENOMEM;
+		goto err_free_metadata;
+	}
+
+	err = auxtrace_queues__init(&etm->queues);
+	if (err)
+		goto err_free_etm;
+
+	etm->session = session;
+	etm->machine = &session->machines.host;
+
+	etm->num_cpu = num_cpu;
+	etm->pmu_type = pmu_type;
+	etm->snapshot_mode = (hdr[CS_ETM_SNAPSHOT] != 0);
+	etm->metadata = metadata;
+	etm->auxtrace_type = auxtrace_info->type;
+	etm->timeless_decoding = cs_etm__is_timeless_decoding(etm);
+
+	etm->auxtrace.process_event = cs_etm__process_event;
+	etm->auxtrace.process_auxtrace_event = cs_etm__process_auxtrace_event;
+	etm->auxtrace.flush_events = cs_etm__flush_events;
+	etm->auxtrace.free_events = cs_etm__free_events;
+	etm->auxtrace.free = cs_etm__free;
+	session->auxtrace = &etm->auxtrace;
+
+	if (dump_trace) {
+		cs_etm__print_auxtrace_info(auxtrace_info->priv, num_cpu);
+		return 0;
+	}
+
+	if (session->itrace_synth_opts && session->itrace_synth_opts->set) {
+		etm->synth_opts = *session->itrace_synth_opts;
+	} else {
+		itrace_synth_opts__set_default(&etm->synth_opts);
+		etm->synth_opts.callchain = false;
+	}
+
+	err = cs_etm__synth_events(etm, session);
+	if (err)
+		goto err_free_queues;
+
+	err = auxtrace_queues__process_index(&etm->queues, session);
+	if (err)
+		goto err_free_queues;
+
+	etm->data_queued = etm->queues.populated;
+
+	return 0;
+
+err_free_queues:
+	auxtrace_queues__free(&etm->queues);
+	session->auxtrace = NULL;
+err_free_etm:
+	zfree(&etm);
+err_free_metadata:
+	/* No need to check @metadata[j], free(NULL) is supported */
+	for (j = 0; j < num_cpu; j++)
+		free(metadata[j]);
+	zfree(&metadata);
+err_free_traceid_list:
+	intlist__delete(traceid_list);
+err_free_hdr:
+	zfree(&hdr);
+
+	return -EINVAL;
+}
diff --git a/tools/perf/util/cs-etm.h b/tools/perf/util/cs-etm.h
index 3cc6bc3..5864d5d 100644
--- a/tools/perf/util/cs-etm.h
+++ b/tools/perf/util/cs-etm.h
@@ -18,6 +18,9 @@
 #ifndef INCLUDE__UTIL_PERF_CS_ETM_H__
 #define INCLUDE__UTIL_PERF_CS_ETM_H__
 
+#include "util/event.h"
+#include "util/session.h"
+
 /* Versionning header in case things need tro change in the future.  That way
  * decoding of old snapshot is still possible.
  */
@@ -61,6 +64,9 @@ enum {
 	CS_ETMV4_PRIV_MAX,
 };
 
+/* RB tree for quick conversion between traceID and CPUs */
+struct intlist *traceid_list;
+
 #define KiB(x) ((x) * 1024)
 #define MiB(x) ((x) * 1024 * 1024)
 
@@ -71,4 +77,16 @@ static const u64 __perf_cs_etmv4_magic   = 0x4040404040404040ULL;
 #define CS_ETMV3_PRIV_SIZE (CS_ETM_PRIV_MAX * sizeof(u64))
 #define CS_ETMV4_PRIV_SIZE (CS_ETMV4_PRIV_MAX * sizeof(u64))
 
+#ifdef HAVE_CSTRACE_SUPPORT
+int cs_etm__process_auxtrace_info(union perf_event *event,
+				  struct perf_session *session);
+#else
+static inline int
+cs_etm__process_auxtrace_info(union perf_event *event __maybe_unused,
+			      struct perf_session *session __maybe_unused)
+{
+	return -1;
+}
+#endif
+
 #endif
diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c
index 48094fd..d8cfc19 100644
--- a/tools/perf/util/data.c
+++ b/tools/perf/util/data.c
@@ -12,16 +12,6 @@
 #include "util.h"
 #include "debug.h"
 
-#ifndef O_CLOEXEC
-#ifdef __sparc__
-#define O_CLOEXEC	0x400000
-#elif defined(__alpha__) || defined(__hppa__)
-#define O_CLOEXEC	010000000
-#else
-#define O_CLOEXEC	02000000
-#endif
-#endif
-
 static bool check_pipe(struct perf_data *data)
 {
 	struct stat st;
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c
index d5b6f7f..36ef45b 100644
--- a/tools/perf/util/dso.c
+++ b/tools/perf/util/dso.c
@@ -446,7 +446,7 @@ static int do_open(char *name)
 	char sbuf[STRERR_BUFSIZE];
 
 	do {
-		fd = open(name, O_RDONLY);
+		fd = open(name, O_RDONLY|O_CLOEXEC);
 		if (fd >= 0)
 			return fd;
 
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c
index 6276b34..6d31186 100644
--- a/tools/perf/util/env.c
+++ b/tools/perf/util/env.c
@@ -1,8 +1,10 @@
 // SPDX-License-Identifier: GPL-2.0
 #include "cpumap.h"
 #include "env.h"
+#include "sane_ctype.h"
 #include "util.h"
 #include <errno.h>
+#include <sys/utsname.h>
 
 struct perf_env perf_env;
 
@@ -93,3 +95,48 @@ void cpu_cache_level__free(struct cpu_cache_level *cache)
 	free(cache->map);
 	free(cache->size);
 }
+
+/*
+ * Return architecture name in a normalized form.
+ * The conversion logic comes from the Makefile.
+ */
+static const char *normalize_arch(char *arch)
+{
+	if (!strcmp(arch, "x86_64"))
+		return "x86";
+	if (arch[0] == 'i' && arch[2] == '8' && arch[3] == '6')
+		return "x86";
+	if (!strcmp(arch, "sun4u") || !strncmp(arch, "sparc", 5))
+		return "sparc";
+	if (!strcmp(arch, "aarch64") || !strcmp(arch, "arm64"))
+		return "arm64";
+	if (!strncmp(arch, "arm", 3) || !strcmp(arch, "sa110"))
+		return "arm";
+	if (!strncmp(arch, "s390", 4))
+		return "s390";
+	if (!strncmp(arch, "parisc", 6))
+		return "parisc";
+	if (!strncmp(arch, "powerpc", 7) || !strncmp(arch, "ppc", 3))
+		return "powerpc";
+	if (!strncmp(arch, "mips", 4))
+		return "mips";
+	if (!strncmp(arch, "sh", 2) && isdigit(arch[2]))
+		return "sh";
+
+	return arch;
+}
+
+const char *perf_env__arch(struct perf_env *env)
+{
+	struct utsname uts;
+	char *arch_name;
+
+	if (!env) { /* Assume local operation */
+		if (uname(&uts) < 0)
+			return NULL;
+		arch_name = uts.machine;
+	} else
+		arch_name = env->arch;
+
+	return normalize_arch(arch_name);
+}
diff --git a/tools/perf/util/env.h b/tools/perf/util/env.h
index 1eb35b1..bf970f5 100644
--- a/tools/perf/util/env.h
+++ b/tools/perf/util/env.h
@@ -65,4 +65,6 @@ int perf_env__set_cmdline(struct perf_env *env, int argc, const char *argv[]);
 int perf_env__read_cpu_topology_map(struct perf_env *env);
 
 void cpu_cache_level__free(struct cpu_cache_level *cache);
+
+const char *perf_env__arch(struct perf_env *env);
 #endif /* __PERF_ENV_H */
diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 97a8ef9..44e603c 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -1435,6 +1435,11 @@ size_t perf_event__fprintf_switch(union perf_event *event, FILE *fp)
 		       event->context_switch.next_prev_tid);
 }
 
+static size_t perf_event__fprintf_lost(union perf_event *event, FILE *fp)
+{
+	return fprintf(fp, " lost %" PRIu64 "\n", event->lost.lost);
+}
+
 size_t perf_event__fprintf(union perf_event *event, FILE *fp)
 {
 	size_t ret = fprintf(fp, "PERF_RECORD_%s",
@@ -1467,6 +1472,9 @@ size_t perf_event__fprintf(union perf_event *event, FILE *fp)
 	case PERF_RECORD_SWITCH_CPU_WIDE:
 		ret += perf_event__fprintf_switch(event, fp);
 		break;
+	case PERF_RECORD_LOST:
+		ret += perf_event__fprintf_lost(event, fp);
+		break;
 	default:
 		ret += fprintf(fp, "\n");
 	}
diff --git a/tools/perf/util/event.h b/tools/perf/util/event.h
index 1ae95ef..0f79474 100644
--- a/tools/perf/util/event.h
+++ b/tools/perf/util/event.h
@@ -205,6 +205,7 @@ struct perf_sample {
 	u32 flags;
 	u16 insn_len;
 	u8  cpumode;
+	u16 misc;
 	char insn[MAX_INSN];
 	void *raw_data;
 	struct ip_callchain *callchain;
@@ -774,8 +775,7 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 				     u64 read_format);
 int perf_event__synthesize_sample(union perf_event *event, u64 type,
 				  u64 read_format,
-				  const struct perf_sample *sample,
-				  bool swapped);
+				  const struct perf_sample *sample);
 
 pid_t perf_event__synthesize_comm(struct perf_tool *tool,
 				  union perf_event *event, pid_t pid,
diff --git a/tools/perf/util/evlist.c b/tools/perf/util/evlist.c
index b62e523..ac35cd2 100644
--- a/tools/perf/util/evlist.c
+++ b/tools/perf/util/evlist.c
@@ -25,6 +25,7 @@
 #include "parse-events.h"
 #include <subcmd/parse-options.h>
 
+#include <fcntl.h>
 #include <sys/ioctl.h>
 #include <sys/mman.h>
 
@@ -125,7 +126,7 @@ static void perf_evlist__purge(struct perf_evlist *evlist)
 void perf_evlist__exit(struct perf_evlist *evlist)
 {
 	zfree(&evlist->mmap);
-	zfree(&evlist->backward_mmap);
+	zfree(&evlist->overwrite_mmap);
 	fdarray__exit(&evlist->pollfd);
 }
 
@@ -675,11 +676,11 @@ static int perf_evlist__set_paused(struct perf_evlist *evlist, bool value)
 {
 	int i;
 
-	if (!evlist->backward_mmap)
+	if (!evlist->overwrite_mmap)
 		return 0;
 
 	for (i = 0; i < evlist->nr_mmaps; i++) {
-		int fd = evlist->backward_mmap[i].fd;
+		int fd = evlist->overwrite_mmap[i].fd;
 		int err;
 
 		if (fd < 0)
@@ -711,7 +712,7 @@ union perf_event *perf_evlist__mmap_read_forward(struct perf_evlist *evlist, int
 	 * No need for read-write ring buffer: kernel stop outputting when
 	 * it hit md->prev (perf_mmap__consume()).
 	 */
-	return perf_mmap__read_forward(md, evlist->overwrite);
+	return perf_mmap__read_forward(md);
 }
 
 union perf_event *perf_evlist__mmap_read_backward(struct perf_evlist *evlist, int idx)
@@ -738,7 +739,7 @@ void perf_evlist__mmap_read_catchup(struct perf_evlist *evlist, int idx)
 
 void perf_evlist__mmap_consume(struct perf_evlist *evlist, int idx)
 {
-	perf_mmap__consume(&evlist->mmap[idx], evlist->overwrite);
+	perf_mmap__consume(&evlist->mmap[idx], false);
 }
 
 static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
@@ -749,16 +750,16 @@ static void perf_evlist__munmap_nofree(struct perf_evlist *evlist)
 		for (i = 0; i < evlist->nr_mmaps; i++)
 			perf_mmap__munmap(&evlist->mmap[i]);
 
-	if (evlist->backward_mmap)
+	if (evlist->overwrite_mmap)
 		for (i = 0; i < evlist->nr_mmaps; i++)
-			perf_mmap__munmap(&evlist->backward_mmap[i]);
+			perf_mmap__munmap(&evlist->overwrite_mmap[i]);
 }
 
 void perf_evlist__munmap(struct perf_evlist *evlist)
 {
 	perf_evlist__munmap_nofree(evlist);
 	zfree(&evlist->mmap);
-	zfree(&evlist->backward_mmap);
+	zfree(&evlist->overwrite_mmap);
 }
 
 static struct perf_mmap *perf_evlist__alloc_mmap(struct perf_evlist *evlist)
@@ -800,7 +801,7 @@ perf_evlist__should_poll(struct perf_evlist *evlist __maybe_unused,
 
 static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 				       struct mmap_params *mp, int cpu_idx,
-				       int thread, int *_output, int *_output_backward)
+				       int thread, int *_output, int *_output_overwrite)
 {
 	struct perf_evsel *evsel;
 	int revent;
@@ -812,18 +813,20 @@ static int perf_evlist__mmap_per_evsel(struct perf_evlist *evlist, int idx,
 		int fd;
 		int cpu;
 
+		mp->prot = PROT_READ | PROT_WRITE;
 		if (evsel->attr.write_backward) {
-			output = _output_backward;
-			maps = evlist->backward_mmap;
+			output = _output_overwrite;
+			maps = evlist->overwrite_mmap;
 
 			if (!maps) {
 				maps = perf_evlist__alloc_mmap(evlist);
 				if (!maps)
 					return -1;
-				evlist->backward_mmap = maps;
+				evlist->overwrite_mmap = maps;
 				if (evlist->bkw_mmap_state == BKW_MMAP_NOTREADY)
 					perf_evlist__toggle_bkw_mmap(evlist, BKW_MMAP_RUNNING);
 			}
+			mp->prot &= ~PROT_WRITE;
 		}
 
 		if (evsel->system_wide && thread)
@@ -884,14 +887,14 @@ static int perf_evlist__mmap_per_cpu(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per cpu\n");
 	for (cpu = 0; cpu < nr_cpus; cpu++) {
 		int output = -1;
-		int output_backward = -1;
+		int output_overwrite = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, cpu,
 					      true);
 
 		for (thread = 0; thread < nr_threads; thread++) {
 			if (perf_evlist__mmap_per_evsel(evlist, cpu, mp, cpu,
-							thread, &output, &output_backward))
+							thread, &output, &output_overwrite))
 				goto out_unmap;
 		}
 	}
@@ -912,13 +915,13 @@ static int perf_evlist__mmap_per_thread(struct perf_evlist *evlist,
 	pr_debug2("perf event ring buffer mmapped per thread\n");
 	for (thread = 0; thread < nr_threads; thread++) {
 		int output = -1;
-		int output_backward = -1;
+		int output_overwrite = -1;
 
 		auxtrace_mmap_params__set_idx(&mp->auxtrace_mp, evlist, thread,
 					      false);
 
 		if (perf_evlist__mmap_per_evsel(evlist, thread, mp, 0, thread,
-						&output, &output_backward))
+						&output, &output_overwrite))
 			goto out_unmap;
 	}
 
@@ -1052,15 +1055,18 @@ int perf_evlist__parse_mmap_pages(const struct option *opt, const char *str,
  * Return: %0 on success, negative error code otherwise.
  */
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
+			 unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite)
 {
 	struct perf_evsel *evsel;
 	const struct cpu_map *cpus = evlist->cpus;
 	const struct thread_map *threads = evlist->threads;
-	struct mmap_params mp = {
-		.prot = PROT_READ | (overwrite ? 0 : PROT_WRITE),
-	};
+	/*
+	 * Delay setting mp.prot: set it before calling perf_mmap__mmap.
+	 * Its value is decided by evsel's write_backward.
+	 * So &mp should not be passed through const pointer.
+	 */
+	struct mmap_params mp;
 
 	if (!evlist->mmap)
 		evlist->mmap = perf_evlist__alloc_mmap(evlist);
@@ -1070,7 +1076,6 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	if (evlist->pollfd.entries == NULL && perf_evlist__alloc_pollfd(evlist) < 0)
 		return -ENOMEM;
 
-	evlist->overwrite = overwrite;
 	evlist->mmap_len = perf_evlist__mmap_size(pages);
 	pr_debug("mmap size %zuB\n", evlist->mmap_len);
 	mp.mask = evlist->mmap_len - page_size - 1;
@@ -1091,10 +1096,9 @@ int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
 	return perf_evlist__mmap_per_cpu(evlist, &mp);
 }
 
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite)
+int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages)
 {
-	return perf_evlist__mmap_ex(evlist, pages, overwrite, 0, false);
+	return perf_evlist__mmap_ex(evlist, pages, 0, false);
 }
 
 int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
@@ -1102,7 +1106,8 @@ int perf_evlist__create_maps(struct perf_evlist *evlist, struct target *target)
 	struct cpu_map *cpus;
 	struct thread_map *threads;
 
-	threads = thread_map__new_str(target->pid, target->tid, target->uid);
+	threads = thread_map__new_str(target->pid, target->tid, target->uid,
+				      target->per_thread);
 
 	if (!threads)
 		return -1;
@@ -1582,6 +1587,17 @@ int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *even
 	return perf_evsel__parse_sample(evsel, event, sample);
 }
 
+int perf_evlist__parse_sample_timestamp(struct perf_evlist *evlist,
+					union perf_event *event,
+					u64 *timestamp)
+{
+	struct perf_evsel *evsel = perf_evlist__event2evsel(evlist, event);
+
+	if (!evsel)
+		return -EFAULT;
+	return perf_evsel__parse_sample_timestamp(evsel, event, timestamp);
+}
+
 size_t perf_evlist__fprintf(struct perf_evlist *evlist, FILE *fp)
 {
 	struct perf_evsel *evsel;
@@ -1739,13 +1755,13 @@ void perf_evlist__toggle_bkw_mmap(struct perf_evlist *evlist,
 		RESUME,
 	} action = NONE;
 
-	if (!evlist->backward_mmap)
+	if (!evlist->overwrite_mmap)
 		return;
 
 	switch (old_state) {
 	case BKW_MMAP_NOTREADY: {
 		if (state != BKW_MMAP_RUNNING)
-			goto state_err;;
+			goto state_err;
 		break;
 	}
 	case BKW_MMAP_RUNNING: {
diff --git a/tools/perf/util/evlist.h b/tools/perf/util/evlist.h
index 491f695..75f8e0a 100644
--- a/tools/perf/util/evlist.h
+++ b/tools/perf/util/evlist.h
@@ -7,7 +7,6 @@
 #include <linux/refcount.h>
 #include <linux/list.h>
 #include <api/fd/array.h>
-#include <fcntl.h>
 #include <stdio.h>
 #include "../perf.h"
 #include "event.h"
@@ -31,7 +30,6 @@ struct perf_evlist {
 	int		 nr_entries;
 	int		 nr_groups;
 	int		 nr_mmaps;
-	bool		 overwrite;
 	bool		 enabled;
 	bool		 has_user_cpus;
 	size_t		 mmap_len;
@@ -45,12 +43,14 @@ struct perf_evlist {
 	} workload;
 	struct fdarray	 pollfd;
 	struct perf_mmap *mmap;
-	struct perf_mmap *backward_mmap;
+	struct perf_mmap *overwrite_mmap;
 	struct thread_map *threads;
 	struct cpu_map	  *cpus;
 	struct perf_evsel *selected;
 	struct events_stats stats;
 	struct perf_env	*env;
+	u64		first_sample_time;
+	u64		last_sample_time;
 };
 
 struct perf_evsel_str_handler {
@@ -169,10 +169,9 @@ int perf_evlist__parse_mmap_pages(const struct option *opt,
 unsigned long perf_event_mlock_kb_in_pages(void);
 
 int perf_evlist__mmap_ex(struct perf_evlist *evlist, unsigned int pages,
-			 bool overwrite, unsigned int auxtrace_pages,
+			 unsigned int auxtrace_pages,
 			 bool auxtrace_overwrite);
-int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages,
-		      bool overwrite);
+int perf_evlist__mmap(struct perf_evlist *evlist, unsigned int pages);
 void perf_evlist__munmap(struct perf_evlist *evlist);
 
 size_t perf_evlist__mmap_size(unsigned long pages);
@@ -205,6 +204,10 @@ u16 perf_evlist__id_hdr_size(struct perf_evlist *evlist);
 int perf_evlist__parse_sample(struct perf_evlist *evlist, union perf_event *event,
 			      struct perf_sample *sample);
 
+int perf_evlist__parse_sample_timestamp(struct perf_evlist *evlist,
+					union perf_event *event,
+					u64 *timestamp);
+
 bool perf_evlist__valid_sample_type(struct perf_evlist *evlist);
 bool perf_evlist__valid_sample_id_all(struct perf_evlist *evlist);
 bool perf_evlist__valid_read_format(struct perf_evlist *evlist);
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c
index d5fbcf8..66fa451 100644
--- a/tools/perf/util/evsel.c
+++ b/tools/perf/util/evsel.c
@@ -36,6 +36,7 @@
 #include "debug.h"
 #include "trace-event.h"
 #include "stat.h"
+#include "memswap.h"
 #include "util/parse-branch-options.h"
 
 #include "sane_ctype.h"
@@ -650,9 +651,9 @@ int perf_evsel__group_desc(struct perf_evsel *evsel, char *buf, size_t size)
 	return ret;
 }
 
-void perf_evsel__config_callchain(struct perf_evsel *evsel,
-				  struct record_opts *opts,
-				  struct callchain_param *param)
+static void __perf_evsel__config_callchain(struct perf_evsel *evsel,
+					   struct record_opts *opts,
+					   struct callchain_param *param)
 {
 	bool function = perf_evsel__is_function_event(evsel);
 	struct perf_event_attr *attr = &evsel->attr;
@@ -698,6 +699,14 @@ void perf_evsel__config_callchain(struct perf_evsel *evsel,
 	}
 }
 
+void perf_evsel__config_callchain(struct perf_evsel *evsel,
+				  struct record_opts *opts,
+				  struct callchain_param *param)
+{
+	if (param->enabled)
+		return __perf_evsel__config_callchain(evsel, opts, param);
+}
+
 static void
 perf_evsel__reset_callgraph(struct perf_evsel *evsel,
 			    struct callchain_param *param)
@@ -717,19 +726,19 @@ perf_evsel__reset_callgraph(struct perf_evsel *evsel,
 }
 
 static void apply_config_terms(struct perf_evsel *evsel,
-			       struct record_opts *opts)
+			       struct record_opts *opts, bool track)
 {
 	struct perf_evsel_config_term *term;
 	struct list_head *config_terms = &evsel->config_terms;
 	struct perf_event_attr *attr = &evsel->attr;
-	struct callchain_param param;
+	/* callgraph default */
+	struct callchain_param param = {
+		.record_mode = callchain_param.record_mode,
+	};
 	u32 dump_size = 0;
 	int max_stack = 0;
 	const char *callgraph_buf = NULL;
 
-	/* callgraph default */
-	param.record_mode = callchain_param.record_mode;
-
 	list_for_each_entry(term, config_terms, list) {
 		switch (term->type) {
 		case PERF_EVSEL__CONFIG_TERM_PERIOD:
@@ -779,6 +788,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 		case PERF_EVSEL__CONFIG_TERM_OVERWRITE:
 			attr->write_backward = term->val.overwrite ? 1 : 0;
 			break;
+		case PERF_EVSEL__CONFIG_TERM_DRV_CFG:
+			break;
 		default:
 			break;
 		}
@@ -786,6 +797,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 
 	/* User explicitly set per-event callgraph, clear the old setting and reset. */
 	if ((callgraph_buf != NULL) || (dump_size > 0) || max_stack) {
+		bool sample_address = false;
+
 		if (max_stack) {
 			param.max_stack = max_stack;
 			if (callgraph_buf == NULL)
@@ -805,6 +818,8 @@ static void apply_config_terms(struct perf_evsel *evsel,
 					       evsel->name);
 					return;
 				}
+				if (param.record_mode == CALLCHAIN_DWARF)
+					sample_address = true;
 			}
 		}
 		if (dump_size > 0) {
@@ -817,8 +832,14 @@ static void apply_config_terms(struct perf_evsel *evsel,
 			perf_evsel__reset_callgraph(evsel, &callchain_param);
 
 		/* set perf-event callgraph */
-		if (param.enabled)
+		if (param.enabled) {
+			if (sample_address) {
+				perf_evsel__set_sample_bit(evsel, ADDR);
+				perf_evsel__set_sample_bit(evsel, DATA_SRC);
+				evsel->attr.mmap_data = track;
+			}
 			perf_evsel__config_callchain(evsel, opts, &param);
+		}
 	}
 }
 
@@ -1049,7 +1070,7 @@ void perf_evsel__config(struct perf_evsel *evsel, struct record_opts *opts,
 	 * Apply event specific term settings,
 	 * it overloads any global configuration.
 	 */
-	apply_config_terms(evsel, opts);
+	apply_config_terms(evsel, opts, track);
 
 	evsel->ignore_missing_thread = opts->ignore_missing_thread;
 }
@@ -1574,6 +1595,7 @@ int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 	PRINT_ATTRf(use_clockid, p_unsigned);
 	PRINT_ATTRf(context_switch, p_unsigned);
 	PRINT_ATTRf(write_backward, p_unsigned);
+	PRINT_ATTRf(namespaces, p_unsigned);
 
 	PRINT_ATTRn("{ wakeup_events, wakeup_watermark }", wakeup_events, p_unsigned);
 	PRINT_ATTRf(bp_type, p_unsigned);
@@ -1596,10 +1618,46 @@ static int __open_attr__fprintf(FILE *fp, const char *name, const char *val,
 	return fprintf(fp, "  %-32s %s\n", name, val);
 }
 
+static void perf_evsel__remove_fd(struct perf_evsel *pos,
+				  int nr_cpus, int nr_threads,
+				  int thread_idx)
+{
+	for (int cpu = 0; cpu < nr_cpus; cpu++)
+		for (int thread = thread_idx; thread < nr_threads - 1; thread++)
+			FD(pos, cpu, thread) = FD(pos, cpu, thread + 1);
+}
+
+static int update_fds(struct perf_evsel *evsel,
+		      int nr_cpus, int cpu_idx,
+		      int nr_threads, int thread_idx)
+{
+	struct perf_evsel *pos;
+
+	if (cpu_idx >= nr_cpus || thread_idx >= nr_threads)
+		return -EINVAL;
+
+	evlist__for_each_entry(evsel->evlist, pos) {
+		nr_cpus = pos != evsel ? nr_cpus : cpu_idx;
+
+		perf_evsel__remove_fd(pos, nr_cpus, nr_threads, thread_idx);
+
+		/*
+		 * Since fds for next evsel has not been created,
+		 * there is no need to iterate whole event list.
+		 */
+		if (pos == evsel)
+			break;
+	}
+	return 0;
+}
+
 static bool ignore_missing_thread(struct perf_evsel *evsel,
+				  int nr_cpus, int cpu,
 				  struct thread_map *threads,
 				  int thread, int err)
 {
+	pid_t ignore_pid = thread_map__pid(threads, thread);
+
 	if (!evsel->ignore_missing_thread)
 		return false;
 
@@ -1615,11 +1673,18 @@ static bool ignore_missing_thread(struct perf_evsel *evsel,
 	if (threads->nr == 1)
 		return false;
 
+	/*
+	 * We should remove fd for missing_thread first
+	 * because thread_map__remove() will decrease threads->nr.
+	 */
+	if (update_fds(evsel, nr_cpus, cpu, threads->nr, thread))
+		return false;
+
 	if (thread_map__remove(threads, thread))
 		return false;
 
 	pr_warning("WARNING: Ignored open failure for pid %d\n",
-		   thread_map__pid(threads, thread));
+		   ignore_pid);
 	return true;
 }
 
@@ -1724,7 +1789,7 @@ int perf_evsel__open(struct perf_evsel *evsel, struct cpu_map *cpus,
 			if (fd < 0) {
 				err = -errno;
 
-				if (ignore_missing_thread(evsel, threads, thread, err)) {
+				if (ignore_missing_thread(evsel, cpus->nr, cpu, threads, thread, err)) {
 					/*
 					 * We just removed 1 thread, so take a step
 					 * back on thread index and lower the upper
@@ -1960,6 +2025,20 @@ static inline bool overflow(const void *endp, u16 max_size, const void *offset,
 #define OVERFLOW_CHECK_u64(offset) \
 	OVERFLOW_CHECK(offset, sizeof(u64), sizeof(u64))
 
+static int
+perf_event__check_size(union perf_event *event, unsigned int sample_size)
+{
+	/*
+	 * The evsel's sample_size is based on PERF_SAMPLE_MASK which includes
+	 * up to PERF_SAMPLE_PERIOD.  After that overflow() must be used to
+	 * check the format does not go past the end of the event.
+	 */
+	if (sample_size + sizeof(event->header) > event->header.size)
+		return -EFAULT;
+
+	return 0;
+}
+
 int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 			     struct perf_sample *data)
 {
@@ -1981,6 +2060,9 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 	data->stream_id = data->id = data->time = -1ULL;
 	data->period = evsel->attr.sample_period;
 	data->cpumode = event->header.misc & PERF_RECORD_MISC_CPUMODE_MASK;
+	data->misc    = event->header.misc;
+	data->id = -1ULL;
+	data->data_src = PERF_MEM_DATA_SRC_NONE;
 
 	if (event->header.type != PERF_RECORD_SAMPLE) {
 		if (!evsel->attr.sample_id_all)
@@ -1990,15 +2072,9 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 
 	array = event->sample.array;
 
-	/*
-	 * The evsel's sample_size is based on PERF_SAMPLE_MASK which includes
-	 * up to PERF_SAMPLE_PERIOD.  After that overflow() must be used to
-	 * check the format does not go past the end of the event.
-	 */
-	if (evsel->sample_size + sizeof(event->header) > event->header.size)
+	if (perf_event__check_size(event, evsel->sample_size))
 		return -EFAULT;
 
-	data->id = -1ULL;
 	if (type & PERF_SAMPLE_IDENTIFIER) {
 		data->id = *array;
 		array++;
@@ -2028,7 +2104,6 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		array++;
 	}
 
-	data->addr = 0;
 	if (type & PERF_SAMPLE_ADDR) {
 		data->addr = *array;
 		array++;
@@ -2120,14 +2195,27 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 	if (type & PERF_SAMPLE_RAW) {
 		OVERFLOW_CHECK_u64(array);
 		u.val64 = *array;
-		if (WARN_ONCE(swapped,
-			      "Endianness of raw data not corrected!\n")) {
-			/* undo swap of u64, then swap on individual u32s */
+
+		/*
+		 * Undo swap of u64, then swap on individual u32s,
+		 * get the size of the raw area and undo all of the
+		 * swap. The pevent interface handles endianity by
+		 * itself.
+		 */
+		if (swapped) {
 			u.val64 = bswap_64(u.val64);
 			u.val32[0] = bswap_32(u.val32[0]);
 			u.val32[1] = bswap_32(u.val32[1]);
 		}
 		data->raw_size = u.val32[0];
+
+		/*
+		 * The raw data is aligned on 64bits including the
+		 * u32 size, so it's safe to use mem_bswap_64.
+		 */
+		if (swapped)
+			mem_bswap_64((void *) array, data->raw_size);
+
 		array = (void *)array + sizeof(u32);
 
 		OVERFLOW_CHECK(array, data->raw_size, max_size);
@@ -2192,14 +2280,12 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 		array++;
 	}
 
-	data->data_src = PERF_MEM_DATA_SRC_NONE;
 	if (type & PERF_SAMPLE_DATA_SRC) {
 		OVERFLOW_CHECK_u64(array);
 		data->data_src = *array;
 		array++;
 	}
 
-	data->transaction = 0;
 	if (type & PERF_SAMPLE_TRANSACTION) {
 		OVERFLOW_CHECK_u64(array);
 		data->transaction = *array;
@@ -2232,6 +2318,50 @@ int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 	return 0;
 }
 
+int perf_evsel__parse_sample_timestamp(struct perf_evsel *evsel,
+				       union perf_event *event,
+				       u64 *timestamp)
+{
+	u64 type = evsel->attr.sample_type;
+	const u64 *array;
+
+	if (!(type & PERF_SAMPLE_TIME))
+		return -1;
+
+	if (event->header.type != PERF_RECORD_SAMPLE) {
+		struct perf_sample data = {
+			.time = -1ULL,
+		};
+
+		if (!evsel->attr.sample_id_all)
+			return -1;
+		if (perf_evsel__parse_id_sample(evsel, event, &data))
+			return -1;
+
+		*timestamp = data.time;
+		return 0;
+	}
+
+	array = event->sample.array;
+
+	if (perf_event__check_size(event, evsel->sample_size))
+		return -EFAULT;
+
+	if (type & PERF_SAMPLE_IDENTIFIER)
+		array++;
+
+	if (type & PERF_SAMPLE_IP)
+		array++;
+
+	if (type & PERF_SAMPLE_TID)
+		array++;
+
+	if (type & PERF_SAMPLE_TIME)
+		*timestamp = *array;
+
+	return 0;
+}
+
 size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 				     u64 read_format)
 {
@@ -2342,8 +2472,7 @@ size_t perf_event__sample_event_size(const struct perf_sample *sample, u64 type,
 
 int perf_event__synthesize_sample(union perf_event *event, u64 type,
 				  u64 read_format,
-				  const struct perf_sample *sample,
-				  bool swapped)
+				  const struct perf_sample *sample)
 {
 	u64 *array;
 	size_t sz;
@@ -2368,15 +2497,6 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 	if (type & PERF_SAMPLE_TID) {
 		u.val32[0] = sample->pid;
 		u.val32[1] = sample->tid;
-		if (swapped) {
-			/*
-			 * Inverse of what is done in perf_evsel__parse_sample
-			 */
-			u.val32[0] = bswap_32(u.val32[0]);
-			u.val32[1] = bswap_32(u.val32[1]);
-			u.val64 = bswap_64(u.val64);
-		}
-
 		*array = u.val64;
 		array++;
 	}
@@ -2403,13 +2523,7 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 
 	if (type & PERF_SAMPLE_CPU) {
 		u.val32[0] = sample->cpu;
-		if (swapped) {
-			/*
-			 * Inverse of what is done in perf_evsel__parse_sample
-			 */
-			u.val32[0] = bswap_32(u.val32[0]);
-			u.val64 = bswap_64(u.val64);
-		}
+		u.val32[1] = 0;
 		*array = u.val64;
 		array++;
 	}
@@ -2456,15 +2570,6 @@ int perf_event__synthesize_sample(union perf_event *event, u64 type,
 
 	if (type & PERF_SAMPLE_RAW) {
 		u.val32[0] = sample->raw_size;
-		if (WARN_ONCE(swapped,
-			      "Endianness of raw data not corrected!\n")) {
-			/*
-			 * Inverse of what is done in perf_evsel__parse_sample
-			 */
-			u.val32[0] = bswap_32(u.val32[0]);
-			u.val32[1] = bswap_32(u.val32[1]);
-			u.val64 = bswap_64(u.val64);
-		}
 		*array = u.val64;
 		array = (void *)array + sizeof(u32);
 
@@ -2743,8 +2848,9 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 		break;
 	case EOPNOTSUPP:
 		if (evsel->attr.sample_period != 0)
-			return scnprintf(msg, size, "%s",
-	"PMU Hardware doesn't support sampling/overflow-interrupts.");
+			return scnprintf(msg, size,
+	"%s: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'",
+					 perf_evsel__name(evsel));
 		if (evsel->attr.precise_ip)
 			return scnprintf(msg, size, "%s",
 	"\'precise\' request may not be supported. Try removing 'p' modifier.");
@@ -2781,16 +2887,9 @@ int perf_evsel__open_strerror(struct perf_evsel *evsel, struct target *target,
 			 perf_evsel__name(evsel));
 }
 
-char *perf_evsel__env_arch(struct perf_evsel *evsel)
+struct perf_env *perf_evsel__env(struct perf_evsel *evsel)
 {
-	if (evsel && evsel->evlist && evsel->evlist->env)
-		return evsel->evlist->env->arch;
-	return NULL;
-}
-
-char *perf_evsel__env_cpuid(struct perf_evsel *evsel)
-{
-	if (evsel && evsel->evlist && evsel->evlist->env)
-		return evsel->evlist->env->cpuid;
+	if (evsel && evsel->evlist)
+		return evsel->evlist->env;
 	return NULL;
 }
diff --git a/tools/perf/util/evsel.h b/tools/perf/util/evsel.h
index 157f49e..846e416 100644
--- a/tools/perf/util/evsel.h
+++ b/tools/perf/util/evsel.h
@@ -38,7 +38,7 @@ struct cgroup_sel;
  * It is allocated within event parsing and attached to
  * perf_evsel::config_terms list head.
 */
-enum {
+enum term_type {
 	PERF_EVSEL__CONFIG_TERM_PERIOD,
 	PERF_EVSEL__CONFIG_TERM_FREQ,
 	PERF_EVSEL__CONFIG_TERM_TIME,
@@ -49,12 +49,11 @@ enum {
 	PERF_EVSEL__CONFIG_TERM_OVERWRITE,
 	PERF_EVSEL__CONFIG_TERM_DRV_CFG,
 	PERF_EVSEL__CONFIG_TERM_BRANCH,
-	PERF_EVSEL__CONFIG_TERM_MAX,
 };
 
 struct perf_evsel_config_term {
 	struct list_head	list;
-	int	type;
+	enum term_type	type;
 	union {
 		u64	period;
 		u64	freq;
@@ -339,6 +338,10 @@ static inline int perf_evsel__read_on_cpu_scaled(struct perf_evsel *evsel,
 int perf_evsel__parse_sample(struct perf_evsel *evsel, union perf_event *event,
 			     struct perf_sample *sample);
 
+int perf_evsel__parse_sample_timestamp(struct perf_evsel *evsel,
+				       union perf_event *event,
+				       u64 *timestamp);
+
 static inline struct perf_evsel *perf_evsel__next(struct perf_evsel *evsel)
 {
 	return list_entry(evsel->node.next, struct perf_evsel, node);
@@ -443,7 +446,6 @@ typedef int (*attr__fprintf_f)(FILE *, const char *, const char *, void *);
 int perf_event_attr__fprintf(FILE *fp, struct perf_event_attr *attr,
 			     attr__fprintf_f attr__fprintf, void *priv);
 
-char *perf_evsel__env_arch(struct perf_evsel *evsel);
-char *perf_evsel__env_cpuid(struct perf_evsel *evsel);
+struct perf_env *perf_evsel__env(struct perf_evsel *evsel);
 
 #endif /* __PERF_EVSEL_H */
diff --git a/tools/perf/util/generate-cmdlist.sh b/tools/perf/util/generate-cmdlist.sh
index 9bbcec4..ff17920 100755
--- a/tools/perf/util/generate-cmdlist.sh
+++ b/tools/perf/util/generate-cmdlist.sh
@@ -38,7 +38,7 @@
 done
 echo "#endif /* HAVE_LIBELF_SUPPORT */"
 
-echo "#ifdef HAVE_LIBAUDIT_SUPPORT"
+echo "#if defined(HAVE_LIBAUDIT_SUPPORT) || defined(HAVE_SYSCALL_TABLE)"
 sed -n -e 's/^perf-\([^ 	]*\)[ 	].* audit*/\1/p' command-list.txt |
 sort |
 while read cmd
diff --git a/tools/perf/util/header.c b/tools/perf/util/header.c
index 7c0e9d5..a326e0d 100644
--- a/tools/perf/util/header.c
+++ b/tools/perf/util/header.c
@@ -15,9 +15,8 @@
 #include <linux/bitops.h>
 #include <linux/stringify.h>
 #include <sys/stat.h>
-#include <sys/types.h>
 #include <sys/utsname.h>
-#include <unistd.h>
+#include <linux/time64.h>
 
 #include "evlist.h"
 #include "evsel.h"
@@ -37,6 +36,7 @@
 #include <api/fs/fs.h>
 #include "asm/bug.h"
 #include "tool.h"
+#include "time-utils.h"
 
 #include "sane_ctype.h"
 
@@ -1182,6 +1182,20 @@ static int write_stat(struct feat_fd *ff __maybe_unused,
 	return 0;
 }
 
+static int write_sample_time(struct feat_fd *ff,
+			     struct perf_evlist *evlist)
+{
+	int ret;
+
+	ret = do_write(ff, &evlist->first_sample_time,
+		       sizeof(evlist->first_sample_time));
+	if (ret < 0)
+		return ret;
+
+	return do_write(ff, &evlist->last_sample_time,
+			sizeof(evlist->last_sample_time));
+}
+
 static void print_hostname(struct feat_fd *ff, FILE *fp)
 {
 	fprintf(fp, "# hostname : %s\n", ff->ph->env.hostname);
@@ -1507,6 +1521,28 @@ static void print_group_desc(struct feat_fd *ff, FILE *fp)
 	}
 }
 
+static void print_sample_time(struct feat_fd *ff, FILE *fp)
+{
+	struct perf_session *session;
+	char time_buf[32];
+	double d;
+
+	session = container_of(ff->ph, struct perf_session, header);
+
+	timestamp__scnprintf_usec(session->evlist->first_sample_time,
+				  time_buf, sizeof(time_buf));
+	fprintf(fp, "# time of first sample : %s\n", time_buf);
+
+	timestamp__scnprintf_usec(session->evlist->last_sample_time,
+				  time_buf, sizeof(time_buf));
+	fprintf(fp, "# time of last sample : %s\n", time_buf);
+
+	d = (double)(session->evlist->last_sample_time -
+		session->evlist->first_sample_time) / NSEC_PER_MSEC;
+
+	fprintf(fp, "# sample duration : %10.3f ms\n", d);
+}
+
 static int __event_process_build_id(struct build_id_event *bev,
 				    char *filename,
 				    struct perf_session *session)
@@ -2148,6 +2184,27 @@ static int process_cache(struct feat_fd *ff, void *data __maybe_unused)
 	return -1;
 }
 
+static int process_sample_time(struct feat_fd *ff, void *data __maybe_unused)
+{
+	struct perf_session *session;
+	u64 first_sample_time, last_sample_time;
+	int ret;
+
+	session = container_of(ff->ph, struct perf_session, header);
+
+	ret = do_read_u64(ff, &first_sample_time);
+	if (ret)
+		return -1;
+
+	ret = do_read_u64(ff, &last_sample_time);
+	if (ret)
+		return -1;
+
+	session->evlist->first_sample_time = first_sample_time;
+	session->evlist->last_sample_time = last_sample_time;
+	return 0;
+}
+
 struct feature_ops {
 	int (*write)(struct feat_fd *ff, struct perf_evlist *evlist);
 	void (*print)(struct feat_fd *ff, FILE *fp);
@@ -2205,6 +2262,7 @@ static const struct feature_ops feat_ops[HEADER_LAST_FEATURE] = {
 	FEAT_OPN(AUXTRACE,	auxtrace,	false),
 	FEAT_OPN(STAT,		stat,		false),
 	FEAT_OPN(CACHE,		cache,		true),
+	FEAT_OPR(SAMPLE_TIME,	sample_time,	false),
 };
 
 struct header_print_data {
@@ -3258,6 +3316,74 @@ int perf_event__synthesize_attrs(struct perf_tool *tool,
 	return err;
 }
 
+static bool has_unit(struct perf_evsel *counter)
+{
+	return counter->unit && *counter->unit;
+}
+
+static bool has_scale(struct perf_evsel *counter)
+{
+	return counter->scale != 1;
+}
+
+int perf_event__synthesize_extra_attr(struct perf_tool *tool,
+				      struct perf_evlist *evsel_list,
+				      perf_event__handler_t process,
+				      bool is_pipe)
+{
+	struct perf_evsel *counter;
+	int err;
+
+	/*
+	 * Synthesize other events stuff not carried within
+	 * attr event - unit, scale, name
+	 */
+	evlist__for_each_entry(evsel_list, counter) {
+		if (!counter->supported)
+			continue;
+
+		/*
+		 * Synthesize unit and scale only if it's defined.
+		 */
+		if (has_unit(counter)) {
+			err = perf_event__synthesize_event_update_unit(tool, counter, process);
+			if (err < 0) {
+				pr_err("Couldn't synthesize evsel unit.\n");
+				return err;
+			}
+		}
+
+		if (has_scale(counter)) {
+			err = perf_event__synthesize_event_update_scale(tool, counter, process);
+			if (err < 0) {
+				pr_err("Couldn't synthesize evsel counter.\n");
+				return err;
+			}
+		}
+
+		if (counter->own_cpus) {
+			err = perf_event__synthesize_event_update_cpus(tool, counter, process);
+			if (err < 0) {
+				pr_err("Couldn't synthesize evsel cpus.\n");
+				return err;
+			}
+		}
+
+		/*
+		 * Name is needed only for pipe output,
+		 * perf.data carries event names.
+		 */
+		if (is_pipe) {
+			err = perf_event__synthesize_event_update_name(tool, counter, process);
+			if (err < 0) {
+				pr_err("Couldn't synthesize evsel name.\n");
+				return err;
+			}
+		}
+	}
+	return 0;
+}
+
 int perf_event__process_attr(struct perf_tool *tool __maybe_unused,
 			     union perf_event *event,
 			     struct perf_evlist **pevlist)
diff --git a/tools/perf/util/header.h b/tools/perf/util/header.h
index 29ccbfd..f28aaaa 100644
--- a/tools/perf/util/header.h
+++ b/tools/perf/util/header.h
@@ -9,6 +9,7 @@
 #include <linux/types.h>
 #include "event.h"
 #include "env.h"
+#include "pmu.h"
 
 enum {
 	HEADER_RESERVED		= 0,	/* always cleared */
@@ -34,6 +35,7 @@ enum {
 	HEADER_AUXTRACE,
 	HEADER_STAT,
 	HEADER_CACHE,
+	HEADER_SAMPLE_TIME,
 	HEADER_LAST_FEATURE,
 	HEADER_FEAT_BITS	= 256,
 };
@@ -107,6 +109,11 @@ int perf_event__synthesize_features(struct perf_tool *tool,
 				    struct perf_evlist *evlist,
 				    perf_event__handler_t process);
 
+int perf_event__synthesize_extra_attr(struct perf_tool *tool,
+				      struct perf_evlist *evsel_list,
+				      perf_event__handler_t process,
+				      bool is_pipe);
+
 int perf_event__process_feature(struct perf_tool *tool,
 				union perf_event *event,
 				struct perf_session *session);
@@ -166,5 +173,5 @@ int write_padded(struct feat_fd *fd, const void *bf,
  */
 int get_cpuid(char *buffer, size_t sz);
 
-char *get_cpuid_str(void);
+char *get_cpuid_str(struct perf_pmu *pmu __maybe_unused);
 #endif /* __PERF_HEADER_H */
diff --git a/tools/perf/util/intel-bts.c b/tools/perf/util/intel-bts.c
index 5325e65..72db274 100644
--- a/tools/perf/util/intel-bts.c
+++ b/tools/perf/util/intel-bts.c
@@ -67,7 +67,6 @@ struct intel_bts {
 	u64				branches_sample_type;
 	u64				branches_id;
 	size_t				branches_event_size;
-	bool				synth_needs_swap;
 	unsigned long			num_events;
 };
 
@@ -303,8 +302,7 @@ static int intel_bts_synth_branch_sample(struct intel_bts_queue *btsq,
 		event.sample.header.size = bts->branches_event_size;
 		ret = perf_event__synthesize_sample(&event,
 						    bts->branches_sample_type,
-						    0, &sample,
-						    bts->synth_needs_swap);
+						    0, &sample);
 		if (ret)
 			return ret;
 	}
@@ -841,8 +839,6 @@ static int intel_bts_synth_events(struct intel_bts *bts,
 				__perf_evsel__sample_size(attr.sample_type);
 	}
 
-	bts->synth_needs_swap = evsel->needs_swap;
-
 	return 0;
 }
 
diff --git a/tools/perf/util/intel-pt-decoder/Build b/tools/perf/util/intel-pt-decoder/Build
index 10e0814..1b704fb 100644
--- a/tools/perf/util/intel-pt-decoder/Build
+++ b/tools/perf/util/intel-pt-decoder/Build
@@ -11,15 +11,21 @@
 
 $(OUTPUT)util/intel-pt-decoder/intel-pt-insn-decoder.o: util/intel-pt-decoder/intel-pt-insn-decoder.c util/intel-pt-decoder/inat.c $(OUTPUT)util/intel-pt-decoder/inat-tables.c
 	@(diff -I 2>&1 | grep -q 'option requires an argument' && \
-	test -d ../../kernel -a -d ../../tools -a -d ../perf && (( \
-	diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null && \
-	diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null && \
-	diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null && \
-	diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null && \
-	diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null && \
-	diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null && \
-	diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) \
-	|| echo "Warning: Intel PT: x86 instruction decoder differs from kernel" >&2 )) || true
+	test -d ../../kernel -a -d ../../tools -a -d ../perf && ( \
+	((diff -B -I'^#include' util/intel-pt-decoder/insn.c ../../arch/x86/lib/insn.c >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder C file at 'tools/perf/util/intel-pt-decoder/insn.c' differs from latest version at 'arch/x86/lib/insn.c'" >&2)) && \
+	((diff -B -I'^#include' util/intel-pt-decoder/inat.c ../../arch/x86/lib/inat.c >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder C file at 'tools/perf/util/intel-pt-decoder/inat.c' differs from latest version at 'arch/x86/lib/inat.c'" >&2)) && \
+	((diff -B util/intel-pt-decoder/x86-opcode-map.txt ../../arch/x86/lib/x86-opcode-map.txt >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder map file at 'tools/perf/util/intel-pt-decoder/x86-opcode-map.txt' differs from latest version at 'arch/x86/lib/x86-opcode-map.txt'" >&2)) && \
+	((diff -B util/intel-pt-decoder/gen-insn-attr-x86.awk ../../arch/x86/tools/gen-insn-attr-x86.awk >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder script at 'tools/perf/util/intel-pt-decoder/gen-insn-attr-x86.awk' differs from latest version at 'arch/x86/tools/gen-insn-attr-x86.awk'" >&2)) && \
+	((diff -B -I'^#include' util/intel-pt-decoder/insn.h ../../arch/x86/include/asm/insn.h >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/insn.h' differs from latest version at 'arch/x86/include/asm/insn.h'" >&2)) && \
+	((diff -B -I'^#include' util/intel-pt-decoder/inat.h ../../arch/x86/include/asm/inat.h >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/inat.h' differs from latest version at 'arch/x86/include/asm/inat.h'" >&2)) && \
+	((diff -B -I'^#include' util/intel-pt-decoder/inat_types.h ../../arch/x86/include/asm/inat_types.h >/dev/null) || \
+	(echo "Warning: Intel PT: x86 instruction decoder header at 'tools/perf/util/intel-pt-decoder/inat_types.h' differs from latest version at 'arch/x86/include/asm/inat_types.h'" >&2)))) || true
 	$(call rule_mkdir)
 	$(call if_changed_dep,cc_o_c)
 
diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index 23f9ba6..3773d9c 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -104,8 +104,6 @@ struct intel_pt {
 	u64 pwrx_id;
 	u64 cbr_id;
 
-	bool synth_needs_swap;
-
 	u64 tsc_bit;
 	u64 mtc_bit;
 	u64 mtc_freq_bits;
@@ -1101,11 +1099,10 @@ static void intel_pt_prep_b_sample(struct intel_pt *pt,
 }
 
 static int intel_pt_inject_event(union perf_event *event,
-				 struct perf_sample *sample, u64 type,
-				 bool swapped)
+				 struct perf_sample *sample, u64 type)
 {
 	event->header.size = perf_event__sample_event_size(sample, type, 0);
-	return perf_event__synthesize_sample(event, type, 0, sample, swapped);
+	return perf_event__synthesize_sample(event, type, 0, sample);
 }
 
 static inline int intel_pt_opt_inject(struct intel_pt *pt,
@@ -1115,7 +1112,7 @@ static inline int intel_pt_opt_inject(struct intel_pt *pt,
 	if (!pt->synth_opts.inject)
 		return 0;
 
-	return intel_pt_inject_event(event, sample, type, pt->synth_needs_swap);
+	return intel_pt_inject_event(event, sample, type);
 }
 
 static int intel_pt_deliver_synth_b_event(struct intel_pt *pt,
@@ -2329,8 +2326,6 @@ static int intel_pt_synth_events(struct intel_pt *pt,
 		id += 1;
 	}
 
-	pt->synth_needs_swap = evsel->needs_swap;
-
 	return 0;
 }
 
diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 270f322..b05a674 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -1726,7 +1726,7 @@ static char *callchain_srcline(struct map *map, struct symbol *sym, u64 ip)
 		bool show_addr = callchain_param.key == CCKEY_ADDRESS;
 
 		srcline = get_srcline(map->dso, map__rip_2objdump(map, ip),
-				      sym, show_sym, show_addr);
+				      sym, show_sym, show_addr, ip);
 		srcline__tree_insert(&map->dso->srclines, ip, srcline);
 	}
 
@@ -2204,7 +2204,7 @@ int thread__resolve_callchain(struct thread *thread,
 {
 	int ret = 0;
 
-	callchain_cursor_reset(&callchain_cursor);
+	callchain_cursor_reset(cursor);
 
 	if (callchain_param.order == ORDER_CALLEE) {
 		ret = thread__resolve_callchain_sample(thread, cursor,
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c
index 6d40efd..8fe5703 100644
--- a/tools/perf/util/map.c
+++ b/tools/perf/util/map.c
@@ -419,7 +419,7 @@ int map__fprintf_srcline(struct map *map, u64 addr, const char *prefix,
 	if (map && map->dso) {
 		srcline = get_srcline(map->dso,
 				      map__rip_2objdump(map, addr), NULL,
-				      true, true);
+				      true, true, addr);
 		if (srcline != SRCLINE_UNKNOWN)
 			ret = fprintf(fp, "%s%s", prefix, srcline);
 		free_srcline(srcline);
diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 0ddd9c1..1ddc3d1 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -20,12 +20,10 @@
 #include "pmu.h"
 #include "expr.h"
 #include "rblist.h"
-#include "pmu.h"
 #include <string.h>
 #include <stdbool.h>
 #include <errno.h>
 #include "pmu-events/pmu-events.h"
-#include "strbuf.h"
 #include "strlist.h"
 #include <assert.h>
 #include <ctype.h>
@@ -38,6 +36,10 @@ struct metric_event *metricgroup__lookup(struct rblist *metric_events,
 	struct metric_event me = {
 		.evsel = evsel
 	};
+
+	if (!metric_events)
+		return NULL;
+
 	nd = rblist__find(metric_events, &me);
 	if (nd)
 		return container_of(nd, struct metric_event, nd);
@@ -270,7 +272,7 @@ static void metricgroup__print_strlist(struct strlist *metrics, bool raw)
 void metricgroup__print(bool metrics, bool metricgroups, char *filter,
 			bool raw)
 {
-	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_events_map *map = perf_pmu__find_map(NULL);
 	struct pmu_event *pe;
 	int i;
 	struct rblist groups;
@@ -368,7 +370,7 @@ void metricgroup__print(bool metrics, bool metricgroups, char *filter,
 static int metricgroup__add_metric(const char *metric, struct strbuf *events,
 				   struct list_head *group_list)
 {
-	struct pmu_events_map *map = perf_pmu__find_map();
+	struct pmu_events_map *map = perf_pmu__find_map(NULL);
 	struct pmu_event *pe;
 	int ret = -EINVAL;
 	int i, j;
diff --git a/tools/perf/util/mmap.c b/tools/perf/util/mmap.c
index 9fe5f9c..05076e6 100644
--- a/tools/perf/util/mmap.c
+++ b/tools/perf/util/mmap.c
@@ -21,33 +21,13 @@ size_t perf_mmap__mmap_len(struct perf_mmap *map)
 }
 
 /* When check_messup is true, 'end' must points to a good entry */
-static union perf_event *perf_mmap__read(struct perf_mmap *map, bool check_messup,
+static union perf_event *perf_mmap__read(struct perf_mmap *map,
 					 u64 start, u64 end, u64 *prev)
 {
 	unsigned char *data = map->base + page_size;
 	union perf_event *event = NULL;
 	int diff = end - start;
 
-	if (check_messup) {
-		/*
-		 * If we're further behind than half the buffer, there's a chance
-		 * the writer will bite our tail and mess up the samples under us.
-		 *
-		 * If we somehow ended up ahead of the 'end', we got messed up.
-		 *
-		 * In either case, truncate and restart at 'end'.
-		 */
-		if (diff > map->mask / 2 || diff < 0) {
-			fprintf(stderr, "WARNING: failed to keep up with mmap data.\n");
-
-			/*
-			 * 'end' points to a known good entry, start there.
-			 */
-			start = end;
-			diff = 0;
-		}
-	}
-
 	if (diff >= (int)sizeof(event->header)) {
 		size_t size;
 
@@ -89,7 +69,7 @@ static union perf_event *perf_mmap__read(struct perf_mmap *map, bool check_messu
 	return event;
 }
 
-union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup)
+union perf_event *perf_mmap__read_forward(struct perf_mmap *map)
 {
 	u64 head;
 	u64 old = map->prev;
@@ -102,7 +82,7 @@ union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_mess
 
 	head = perf_mmap__read_head(map);
 
-	return perf_mmap__read(map, check_messup, old, head, &map->prev);
+	return perf_mmap__read(map, old, head, &map->prev);
 }
 
 union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
@@ -138,7 +118,7 @@ union perf_event *perf_mmap__read_backward(struct perf_mmap *map)
 	else
 		end = head + map->mask + 1;
 
-	return perf_mmap__read(map, false, start, end, &map->prev);
+	return perf_mmap__read(map, start, end, &map->prev);
 }
 
 void perf_mmap__read_catchup(struct perf_mmap *map)
@@ -254,18 +234,18 @@ int perf_mmap__mmap(struct perf_mmap *map, struct mmap_params *mp, int fd)
 	return 0;
 }
 
-static int backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
+static int overwrite_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64 *end)
 {
 	struct perf_event_header *pheader;
 	u64 evt_head = head;
 	int size = mask + 1;
 
-	pr_debug2("backward_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
+	pr_debug2("overwrite_rb_find_range: buf=%p, head=%"PRIx64"\n", buf, head);
 	pheader = (struct perf_event_header *)(buf + (head & mask));
 	*start = head;
 	while (true) {
 		if (evt_head - head >= (unsigned int)size) {
-			pr_debug("Finished reading backward ring buffer: rewind\n");
+			pr_debug("Finished reading overwrite ring buffer: rewind\n");
 			if (evt_head - head > (unsigned int)size)
 				evt_head -= pheader->size;
 			*end = evt_head;
@@ -275,7 +255,7 @@ static int backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64
 		pheader = (struct perf_event_header *)(buf + (evt_head & mask));
 
 		if (pheader->size == 0) {
-			pr_debug("Finished reading backward ring buffer: get start\n");
+			pr_debug("Finished reading overwrite ring buffer: get start\n");
 			*end = evt_head;
 			return 0;
 		}
@@ -287,19 +267,7 @@ static int backward_rb_find_range(void *buf, int mask, u64 head, u64 *start, u64
 	return -1;
 }
 
-static int rb_find_range(void *data, int mask, u64 head, u64 old,
-			 u64 *start, u64 *end, bool backward)
-{
-	if (!backward) {
-		*start = old;
-		*end = head;
-		return 0;
-	}
-
-	return backward_rb_find_range(data, mask, head, start, end);
-}
-
-int perf_mmap__push(struct perf_mmap *md, bool overwrite, bool backward,
+int perf_mmap__push(struct perf_mmap *md, bool overwrite,
 		    void *to, int push(void *to, void *buf, size_t size))
 {
 	u64 head = perf_mmap__read_head(md);
@@ -310,19 +278,28 @@ int perf_mmap__push(struct perf_mmap *md, bool overwrite, bool backward,
 	void *buf;
 	int rc = 0;
 
-	if (rb_find_range(data, md->mask, head, old, &start, &end, backward))
-		return -1;
+	start = overwrite ? head : old;
+	end = overwrite ? old : head;
 
 	if (start == end)
 		return 0;
 
 	size = end - start;
 	if (size > (unsigned long)(md->mask) + 1) {
-		WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
+		if (!overwrite) {
+			WARN_ONCE(1, "failed to keep up with mmap data. (warn only once)\n");
 
-		md->prev = head;
-		perf_mmap__consume(md, overwrite || backward);
-		return 0;
+			md->prev = head;
+			perf_mmap__consume(md, overwrite);
+			return 0;
+		}
+
+		/*
+		 * Backward ring buffer is full. We still have a chance to read
+		 * most of data from it.
+		 */
+		if (overwrite_rb_find_range(data, md->mask, head, &start, &end))
+			return -1;
 	}
 
 	if ((start & md->mask) + size != (end & md->mask)) {
@@ -346,7 +323,7 @@ int perf_mmap__push(struct perf_mmap *md, bool overwrite, bool backward,
 	}
 
 	md->prev = head;
-	perf_mmap__consume(md, overwrite || backward);
+	perf_mmap__consume(md, overwrite);
 out:
 	return rc;
 }
diff --git a/tools/perf/util/mmap.h b/tools/perf/util/mmap.h
index 3a5cb5a..e43d7b5 100644
--- a/tools/perf/util/mmap.h
+++ b/tools/perf/util/mmap.h
@@ -86,10 +86,10 @@ static inline void perf_mmap__write_tail(struct perf_mmap *md, u64 tail)
 	pc->data_tail = tail;
 }
 
-union perf_event *perf_mmap__read_forward(struct perf_mmap *map, bool check_messup);
+union perf_event *perf_mmap__read_forward(struct perf_mmap *map);
 union perf_event *perf_mmap__read_backward(struct perf_mmap *map);
 
-int perf_mmap__push(struct perf_mmap *md, bool overwrite, bool backward,
+int perf_mmap__push(struct perf_mmap *md, bool backward,
 		    void *to, int push(void *to, void *buf, size_t size));
 
 size_t perf_mmap__mmap_len(struct perf_mmap *map);
diff --git a/tools/perf/util/ordered-events.c b/tools/perf/util/ordered-events.c
index 8e09fd2..bad9e02 100644
--- a/tools/perf/util/ordered-events.c
+++ b/tools/perf/util/ordered-events.c
@@ -157,9 +157,8 @@ void ordered_events__delete(struct ordered_events *oe, struct ordered_event *eve
 }
 
 int ordered_events__queue(struct ordered_events *oe, union perf_event *event,
-			  struct perf_sample *sample, u64 file_offset)
+			  u64 timestamp, u64 file_offset)
 {
-	u64 timestamp = sample->time;
 	struct ordered_event *oevent;
 
 	if (!timestamp || timestamp == ~0ULL)
diff --git a/tools/perf/util/ordered-events.h b/tools/perf/util/ordered-events.h
index 96e5292..8c7a294 100644
--- a/tools/perf/util/ordered-events.h
+++ b/tools/perf/util/ordered-events.h
@@ -45,7 +45,7 @@ struct ordered_events {
 };
 
 int ordered_events__queue(struct ordered_events *oe, union perf_event *event,
-			  struct perf_sample *sample, u64 file_offset);
+			  u64 timestamp, u64 file_offset);
 void ordered_events__delete(struct ordered_events *oe, struct ordered_event *event);
 int ordered_events__flush(struct ordered_events *oe, enum oe_flush how);
 void ordered_events__init(struct ordered_events *oe, ordered_events__deliver_t deliver);
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c
index 1703167..34589c4 100644
--- a/tools/perf/util/parse-events.c
+++ b/tools/perf/util/parse-events.c
@@ -4,6 +4,9 @@
 #include <dirent.h>
 #include <errno.h>
 #include <sys/ioctl.h>
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
 #include <sys/param.h>
 #include "term.h"
 #include "../perf.h"
diff --git a/tools/perf/util/path.c b/tools/perf/util/path.c
index 933f5c6..ca56ba2 100644
--- a/tools/perf/util/path.c
+++ b/tools/perf/util/path.c
@@ -18,6 +18,7 @@
 #include <stdio.h>
 #include <sys/types.h>
 #include <sys/stat.h>
+#include <dirent.h>
 #include <unistd.h>
 
 static char bad_path[] = "/bad-path/";
@@ -77,3 +78,16 @@ bool is_regular_file(const char *file)
 
 	return S_ISREG(st.st_mode);
 }
+
+/* Helper function for filesystems that return a dent->d_type DT_UNKNOWN */
+bool is_directory(const char *base_path, const struct dirent *dent)
+{
+	char path[PATH_MAX];
+	struct stat st;
+
+	sprintf(path, "%s/%s", base_path, dent->d_name);
+	if (stat(path, &st))
+		return false;
+
+	return S_ISDIR(st.st_mode);
+}
diff --git a/tools/perf/util/path.h b/tools/perf/util/path.h
index 14a254a..f014f90 100644
--- a/tools/perf/util/path.h
+++ b/tools/perf/util/path.h
@@ -2,9 +2,12 @@
 #ifndef _PERF_PATH_H
 #define _PERF_PATH_H
 
+struct dirent;
+
 int path__join(char *bf, size_t size, const char *path1, const char *path2);
 int path__join3(char *bf, size_t size, const char *path1, const char *path2, const char *path3);
 
 bool is_regular_file(const char *file);
+bool is_directory(const char *base_path, const struct dirent *dent);
 
 #endif /* _PERF_PATH_H */
diff --git a/tools/perf/util/pmu.c b/tools/perf/util/pmu.c
index 80fb159..57e38fd 100644
--- a/tools/perf/util/pmu.c
+++ b/tools/perf/util/pmu.c
@@ -12,6 +12,7 @@
 #include <dirent.h>
 #include <api/fs/fs.h>
 #include <locale.h>
+#include <regex.h>
 #include "util.h"
 #include "pmu.h"
 #include "parse-events.h"
@@ -537,17 +538,45 @@ static bool pmu_is_uncore(const char *name)
 }
 
 /*
+ *  PMU CORE devices have different name other than cpu in sysfs on some
+ *  platforms. looking for possible sysfs files to identify as core device.
+ */
+static int is_pmu_core(const char *name)
+{
+	struct stat st;
+	char path[PATH_MAX];
+	const char *sysfs = sysfs__mountpoint();
+
+	if (!sysfs)
+		return 0;
+
+	/* Look for cpu sysfs (x86 and others) */
+	scnprintf(path, PATH_MAX, "%s/bus/event_source/devices/cpu", sysfs);
+	if ((stat(path, &st) == 0) &&
+			(strncmp(name, "cpu", strlen("cpu")) == 0))
+		return 1;
+
+	/* Look for cpu sysfs (specific to arm) */
+	scnprintf(path, PATH_MAX, "%s/bus/event_source/devices/%s/cpus",
+				sysfs, name);
+	if (stat(path, &st) == 0)
+		return 1;
+
+	return 0;
+}
+
+/*
  * Return the CPU id as a raw string.
  *
  * Each architecture should provide a more precise id string that
  * can be use to match the architecture's "mapfile".
  */
-char * __weak get_cpuid_str(void)
+char * __weak get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 {
 	return NULL;
 }
 
-static char *perf_pmu__getcpuid(void)
+static char *perf_pmu__getcpuid(struct perf_pmu *pmu)
 {
 	char *cpuid;
 	static bool printed;
@@ -556,7 +585,7 @@ static char *perf_pmu__getcpuid(void)
 	if (cpuid)
 		cpuid = strdup(cpuid);
 	if (!cpuid)
-		cpuid = get_cpuid_str();
+		cpuid = get_cpuid_str(pmu);
 	if (!cpuid)
 		return NULL;
 
@@ -567,22 +596,45 @@ static char *perf_pmu__getcpuid(void)
 	return cpuid;
 }
 
-struct pmu_events_map *perf_pmu__find_map(void)
+struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu)
 {
 	struct pmu_events_map *map;
-	char *cpuid = perf_pmu__getcpuid();
+	char *cpuid = perf_pmu__getcpuid(pmu);
 	int i;
 
+	/* on some platforms which uses cpus map, cpuid can be NULL for
+	 * PMUs other than CORE PMUs.
+	 */
+	if (!cpuid)
+		return NULL;
+
 	i = 0;
 	for (;;) {
+		regex_t re;
+		regmatch_t pmatch[1];
+		int match;
+
 		map = &pmu_events_map[i++];
 		if (!map->table) {
 			map = NULL;
 			break;
 		}
 
-		if (!strcmp(map->cpuid, cpuid))
+		if (regcomp(&re, map->cpuid, REG_EXTENDED) != 0) {
+			/* Warn unable to generate match particular string. */
+			pr_info("Invalid regular expression %s\n", map->cpuid);
 			break;
+		}
+
+		match = !regexec(&re, cpuid, 1, pmatch, 0);
+		regfree(&re);
+		if (match) {
+			size_t match_len = (pmatch[0].rm_eo - pmatch[0].rm_so);
+
+			/* Verify the entire string matched. */
+			if (match_len == strlen(cpuid))
+				break;
+		}
 	}
 	free(cpuid);
 	return map;
@@ -593,13 +645,14 @@ struct pmu_events_map *perf_pmu__find_map(void)
  * to the current running CPU. Then, add all PMU events from that table
  * as aliases.
  */
-static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
+static void pmu_add_cpu_aliases(struct list_head *head, struct perf_pmu *pmu)
 {
 	int i;
 	struct pmu_events_map *map;
 	struct pmu_event *pe;
+	const char *name = pmu->name;
 
-	map = perf_pmu__find_map();
+	map = perf_pmu__find_map(pmu);
 	if (!map)
 		return;
 
@@ -608,7 +661,6 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 	 */
 	i = 0;
 	while (1) {
-		const char *pname;
 
 		pe = &map->table[i++];
 		if (!pe->name) {
@@ -617,9 +669,13 @@ static void pmu_add_cpu_aliases(struct list_head *head, const char *name)
 			break;
 		}
 
-		pname = pe->pmu ? pe->pmu : "cpu";
-		if (strncmp(pname, name, strlen(pname)))
-			continue;
+		if (!is_pmu_core(name)) {
+			/* check for uncore devices */
+			if (pe->pmu == NULL)
+				continue;
+			if (strncmp(pe->pmu, name, strlen(pe->pmu)))
+				continue;
+		}
 
 		/* need type casts to override 'const' */
 		__perf_pmu__new_alias(head, NULL, (char *)pe->name,
@@ -661,21 +717,20 @@ static struct perf_pmu *pmu_lookup(const char *name)
 	if (pmu_aliases(name, &aliases))
 		return NULL;
 
-	pmu_add_cpu_aliases(&aliases, name);
 	pmu = zalloc(sizeof(*pmu));
 	if (!pmu)
 		return NULL;
 
 	pmu->cpus = pmu_cpumask(name);
-
+	pmu->name = strdup(name);
+	pmu->type = type;
 	pmu->is_uncore = pmu_is_uncore(name);
+	pmu_add_cpu_aliases(&aliases, pmu);
 
 	INIT_LIST_HEAD(&pmu->format);
 	INIT_LIST_HEAD(&pmu->aliases);
 	list_splice(&format, &pmu->format);
 	list_splice(&aliases, &pmu->aliases);
-	pmu->name = strdup(name);
-	pmu->type = type;
 	list_add_tail(&pmu->list, &pmus);
 
 	pmu->default_config = perf_pmu__get_default_config(pmu);
diff --git a/tools/perf/util/pmu.h b/tools/perf/util/pmu.h
index 27c75e6..76fecec 100644
--- a/tools/perf/util/pmu.h
+++ b/tools/perf/util/pmu.h
@@ -92,6 +92,6 @@ int perf_pmu__test(void);
 
 struct perf_event_attr *perf_pmu__get_default_config(struct perf_pmu *pmu);
 
-struct pmu_events_map *perf_pmu__find_map(void);
+struct pmu_events_map *perf_pmu__find_map(struct perf_pmu *pmu);
 
 #endif /* __PMU_H */
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c
index b7aaf9b..e1dbc98 100644
--- a/tools/perf/util/probe-event.c
+++ b/tools/perf/util/probe-event.c
@@ -1325,27 +1325,30 @@ static int parse_perf_probe_event_name(char **arg, struct perf_probe_event *pev)
 {
 	char *ptr;
 
-	ptr = strchr(*arg, ':');
+	ptr = strpbrk_esc(*arg, ":");
 	if (ptr) {
 		*ptr = '\0';
 		if (!pev->sdt && !is_c_func_name(*arg))
 			goto ng_name;
-		pev->group = strdup(*arg);
+		pev->group = strdup_esc(*arg);
 		if (!pev->group)
 			return -ENOMEM;
 		*arg = ptr + 1;
 	} else
 		pev->group = NULL;
-	if (!pev->sdt && !is_c_func_name(*arg)) {
+
+	pev->event = strdup_esc(*arg);
+	if (pev->event == NULL)
+		return -ENOMEM;
+
+	if (!pev->sdt && !is_c_func_name(pev->event)) {
+		zfree(&pev->event);
 ng_name:
+		zfree(&pev->group);
 		semantic_error("%s is bad for event name -it must "
 			       "follow C symbol-naming rule.\n", *arg);
 		return -EINVAL;
 	}
-	pev->event = strdup(*arg);
-	if (pev->event == NULL)
-		return -ENOMEM;
-
 	return 0;
 }
 
@@ -1373,7 +1376,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
 			arg++;
 	}
 
-	ptr = strpbrk(arg, ";=@+%");
+	ptr = strpbrk_esc(arg, ";=@+%");
 	if (pev->sdt) {
 		if (ptr) {
 			if (*ptr != '@') {
@@ -1387,7 +1390,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
 				pev->target = build_id_cache__origname(tmp);
 				free(tmp);
 			} else
-				pev->target = strdup(ptr + 1);
+				pev->target = strdup_esc(ptr + 1);
 			if (!pev->target)
 				return -ENOMEM;
 			*ptr = '\0';
@@ -1421,13 +1424,14 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
 	 *
 	 * Otherwise, we consider arg to be a function specification.
 	 */
-	if (!strpbrk(arg, "+@%") && (ptr = strpbrk(arg, ";:")) != NULL) {
+	if (!strpbrk_esc(arg, "+@%")) {
+		ptr = strpbrk_esc(arg, ";:");
 		/* This is a file spec if it includes a '.' before ; or : */
-		if (memchr(arg, '.', ptr - arg))
+		if (ptr && memchr(arg, '.', ptr - arg))
 			file_spec = true;
 	}
 
-	ptr = strpbrk(arg, ";:+@%");
+	ptr = strpbrk_esc(arg, ";:+@%");
 	if (ptr) {
 		nc = *ptr;
 		*ptr++ = '\0';
@@ -1436,7 +1440,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
 	if (arg[0] == '\0')
 		tmp = NULL;
 	else {
-		tmp = strdup(arg);
+		tmp = strdup_esc(arg);
 		if (tmp == NULL)
 			return -ENOMEM;
 	}
@@ -1469,12 +1473,12 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
 		arg = ptr;
 		c = nc;
 		if (c == ';') {	/* Lazy pattern must be the last part */
-			pp->lazy_line = strdup(arg);
+			pp->lazy_line = strdup(arg); /* let leave escapes */
 			if (pp->lazy_line == NULL)
 				return -ENOMEM;
 			break;
 		}
-		ptr = strpbrk(arg, ";:+@%");
+		ptr = strpbrk_esc(arg, ";:+@%");
 		if (ptr) {
 			nc = *ptr;
 			*ptr++ = '\0';
@@ -1501,7 +1505,7 @@ static int parse_perf_probe_point(char *arg, struct perf_probe_event *pev)
 				semantic_error("SRC@SRC is not allowed.\n");
 				return -EINVAL;
 			}
-			pp->file = strdup(arg);
+			pp->file = strdup_esc(arg);
 			if (pp->file == NULL)
 				return -ENOMEM;
 			break;
@@ -2573,7 +2577,8 @@ int show_perf_probe_events(struct strfilter *filter)
 }
 
 static int get_new_event_name(char *buf, size_t len, const char *base,
-			      struct strlist *namelist, bool allow_suffix)
+			      struct strlist *namelist, bool ret_event,
+			      bool allow_suffix)
 {
 	int i, ret;
 	char *p, *nbase;
@@ -2584,13 +2589,13 @@ static int get_new_event_name(char *buf, size_t len, const char *base,
 	if (!nbase)
 		return -ENOMEM;
 
-	/* Cut off the dot suffixes (e.g. .const, .isra)*/
-	p = strchr(nbase, '.');
+	/* Cut off the dot suffixes (e.g. .const, .isra) and version suffixes */
+	p = strpbrk(nbase, ".@");
 	if (p && p != nbase)
 		*p = '\0';
 
 	/* Try no suffix number */
-	ret = e_snprintf(buf, len, "%s", nbase);
+	ret = e_snprintf(buf, len, "%s%s", nbase, ret_event ? "__return" : "");
 	if (ret < 0) {
 		pr_debug("snprintf() failed: %d\n", ret);
 		goto out;
@@ -2625,6 +2630,14 @@ static int get_new_event_name(char *buf, size_t len, const char *base,
 
 out:
 	free(nbase);
+
+	/* Final validation */
+	if (ret >= 0 && !is_c_func_name(buf)) {
+		pr_warning("Internal error: \"%s\" is an invalid event name.\n",
+			   buf);
+		ret = -EINVAL;
+	}
+
 	return ret;
 }
 
@@ -2681,8 +2694,8 @@ static int probe_trace_event__set_name(struct probe_trace_event *tev,
 		group = PERFPROBE_GROUP;
 
 	/* Get an unused new event name */
-	ret = get_new_event_name(buf, 64, event,
-				 namelist, allow_suffix);
+	ret = get_new_event_name(buf, 64, event, namelist,
+				 tev->point.retprobe, allow_suffix);
 	if (ret < 0)
 		return ret;
 
@@ -2792,16 +2805,40 @@ static int find_probe_functions(struct map *map, char *name,
 	int found = 0;
 	struct symbol *sym;
 	struct rb_node *tmp;
+	const char *norm, *ver;
+	char *buf = NULL;
+	bool cut_version = true;
 
 	if (map__load(map) < 0)
 		return 0;
 
+	/* If user gives a version, don't cut off the version from symbols */
+	if (strchr(name, '@'))
+		cut_version = false;
+
 	map__for_each_symbol(map, sym, tmp) {
-		if (strglobmatch(sym->name, name)) {
+		norm = arch__normalize_symbol_name(sym->name);
+		if (!norm)
+			continue;
+
+		if (cut_version) {
+			/* We don't care about default symbol or not */
+			ver = strchr(norm, '@');
+			if (ver) {
+				buf = strndup(norm, ver - norm);
+				if (!buf)
+					return -ENOMEM;
+				norm = buf;
+			}
+		}
+
+		if (strglobmatch(norm, name)) {
 			found++;
 			if (syms && found < probe_conf.max_probes)
 				syms[found - 1] = sym;
 		}
+		if (buf)
+			zfree(&buf);
 	}
 
 	return found;
@@ -2847,7 +2884,7 @@ static int find_probe_trace_events_from_map(struct perf_probe_event *pev,
 	 * same name but different addresses, this lists all the symbols.
 	 */
 	num_matched_functions = find_probe_functions(map, pp->function, syms);
-	if (num_matched_functions == 0) {
+	if (num_matched_functions <= 0) {
 		pr_err("Failed to find symbol %s in %s\n", pp->function,
 			pev->target ? : "kernel");
 		ret = -ENOENT;
diff --git a/tools/perf/util/python-ext-sources b/tools/perf/util/python-ext-sources
index b4f2f06..7aa0ea6 100644
--- a/tools/perf/util/python-ext-sources
+++ b/tools/perf/util/python-ext-sources
@@ -10,6 +10,7 @@
 util/evlist.c
 util/evsel.c
 util/cpumap.c
+util/memswap.c
 util/mmap.c
 util/namespaces.c
 ../lib/bitmap.c
diff --git a/tools/perf/util/python.c b/tools/perf/util/python.c
index 8e49d9c..b1e999b 100644
--- a/tools/perf/util/python.c
+++ b/tools/perf/util/python.c
@@ -864,7 +864,7 @@ static PyObject *pyrf_evlist__mmap(struct pyrf_evlist *pevlist,
 					 &pages, &overwrite))
 		return NULL;
 
-	if (perf_evlist__mmap(evlist, pages, overwrite) < 0) {
+	if (perf_evlist__mmap(evlist, pages) < 0) {
 		PyErr_SetFromErrno(PyExc_OSError);
 		return NULL;
 	}
diff --git a/tools/perf/util/rblist.c b/tools/perf/util/rblist.c
index 0dfe27d..0efc325 100644
--- a/tools/perf/util/rblist.c
+++ b/tools/perf/util/rblist.c
@@ -101,16 +101,21 @@ void rblist__init(struct rblist *rblist)
 	return;
 }
 
+void rblist__exit(struct rblist *rblist)
+{
+	struct rb_node *pos, *next = rb_first(&rblist->entries);
+
+	while (next) {
+		pos = next;
+		next = rb_next(pos);
+		rblist__remove_node(rblist, pos);
+	}
+}
+
 void rblist__delete(struct rblist *rblist)
 {
 	if (rblist != NULL) {
-		struct rb_node *pos, *next = rb_first(&rblist->entries);
-
-		while (next) {
-			pos = next;
-			next = rb_next(pos);
-			rblist__remove_node(rblist, pos);
-		}
+		rblist__exit(rblist);
 		free(rblist);
 	}
 }
diff --git a/tools/perf/util/rblist.h b/tools/perf/util/rblist.h
index 4c8638a..76df15c 100644
--- a/tools/perf/util/rblist.h
+++ b/tools/perf/util/rblist.h
@@ -29,6 +29,7 @@ struct rblist {
 };
 
 void rblist__init(struct rblist *rblist);
+void rblist__exit(struct rblist *rblist);
 void rblist__delete(struct rblist *rblist);
 int rblist__add_node(struct rblist *rblist, const void *new_entry);
 void rblist__remove_node(struct rblist *rblist, struct rb_node *rb_node);
diff --git a/tools/perf/util/scripting-engines/trace-event-python.c b/tools/perf/util/scripting-engines/trace-event-python.c
index c7187f0..ea07088 100644
--- a/tools/perf/util/scripting-engines/trace-event-python.c
+++ b/tools/perf/util/scripting-engines/trace-event-python.c
@@ -43,7 +43,6 @@
 #include "../db-export.h"
 #include "../thread-stack.h"
 #include "../trace-event.h"
-#include "../machine.h"
 #include "../call-path.h"
 #include "thread_map.h"
 #include "cpumap.h"
@@ -500,6 +499,8 @@ static PyObject *get_perf_sample_dict(struct perf_sample *sample,
 			PyLong_FromUnsignedLongLong(sample->time));
 	pydict_set_item_string_decref(dict_sample, "period",
 			PyLong_FromUnsignedLongLong(sample->period));
+	pydict_set_item_string_decref(dict_sample, "phys_addr",
+			PyLong_FromUnsignedLongLong(sample->phys_addr));
 	set_sample_read_in_dict(dict_sample, sample, evsel);
 	pydict_set_item_string_decref(dict, "sample", dict_sample);
 
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index 5c41231..c71ced7 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -27,7 +27,6 @@
 
 static int perf_session__deliver_event(struct perf_session *session,
 				       union perf_event *event,
-				       struct perf_sample *sample,
 				       struct perf_tool *tool,
 				       u64 file_offset);
 
@@ -107,17 +106,10 @@ static void perf_session__set_comm_exec(struct perf_session *session)
 static int ordered_events__deliver_event(struct ordered_events *oe,
 					 struct ordered_event *event)
 {
-	struct perf_sample sample;
 	struct perf_session *session = container_of(oe, struct perf_session,
 						    ordered_events);
-	int ret = perf_evlist__parse_sample(session->evlist, event->event, &sample);
 
-	if (ret) {
-		pr_err("Can't parse sample, err = %d\n", ret);
-		return ret;
-	}
-
-	return perf_session__deliver_event(session, event->event, &sample,
+	return perf_session__deliver_event(session, event->event,
 					   session->tool, event->file_offset);
 }
 
@@ -873,9 +865,9 @@ static int process_finished_round(struct perf_tool *tool __maybe_unused,
 }
 
 int perf_session__queue_event(struct perf_session *s, union perf_event *event,
-			      struct perf_sample *sample, u64 file_offset)
+			      u64 timestamp, u64 file_offset)
 {
-	return ordered_events__queue(&s->ordered_events, event, sample, file_offset);
+	return ordered_events__queue(&s->ordered_events, event, timestamp, file_offset);
 }
 
 static void callchain__lbr_callstack_printf(struct perf_sample *sample)
@@ -1328,20 +1320,26 @@ static int machines__deliver_event(struct machines *machines,
 
 static int perf_session__deliver_event(struct perf_session *session,
 				       union perf_event *event,
-				       struct perf_sample *sample,
 				       struct perf_tool *tool,
 				       u64 file_offset)
 {
+	struct perf_sample sample;
 	int ret;
 
-	ret = auxtrace__process_event(session, event, sample, tool);
+	ret = perf_evlist__parse_sample(session->evlist, event, &sample);
+	if (ret) {
+		pr_err("Can't parse sample, err = %d\n", ret);
+		return ret;
+	}
+
+	ret = auxtrace__process_event(session, event, &sample, tool);
 	if (ret < 0)
 		return ret;
 	if (ret > 0)
 		return 0;
 
 	return machines__deliver_event(&session->machines, session->evlist,
-				       event, sample, tool, file_offset);
+				       event, &sample, tool, file_offset);
 }
 
 static s64 perf_session__process_user_event(struct perf_session *session,
@@ -1350,10 +1348,11 @@ static s64 perf_session__process_user_event(struct perf_session *session,
 {
 	struct ordered_events *oe = &session->ordered_events;
 	struct perf_tool *tool = session->tool;
+	struct perf_sample sample = { .time = 0, };
 	int fd = perf_data__fd(session->data);
 	int err;
 
-	dump_event(session->evlist, event, file_offset, NULL);
+	dump_event(session->evlist, event, file_offset, &sample);
 
 	/* These events are processed right away */
 	switch (event->header.type) {
@@ -1495,7 +1494,6 @@ static s64 perf_session__process_event(struct perf_session *session,
 {
 	struct perf_evlist *evlist = session->evlist;
 	struct perf_tool *tool = session->tool;
-	struct perf_sample sample;
 	int ret;
 
 	if (session->header.needs_swap)
@@ -1509,21 +1507,19 @@ static s64 perf_session__process_event(struct perf_session *session,
 	if (event->header.type >= PERF_RECORD_USER_TYPE_START)
 		return perf_session__process_user_event(session, event, file_offset);
 
-	/*
-	 * For all kernel events we get the sample data
-	 */
-	ret = perf_evlist__parse_sample(evlist, event, &sample);
-	if (ret)
-		return ret;
-
 	if (tool->ordered_events) {
-		ret = perf_session__queue_event(session, event, &sample, file_offset);
+		u64 timestamp = -1ULL;
+
+		ret = perf_evlist__parse_sample_timestamp(evlist, event, &timestamp);
+		if (ret && ret != -1)
+			return ret;
+
+		ret = perf_session__queue_event(session, event, timestamp, file_offset);
 		if (ret != -ETIME)
 			return ret;
 	}
 
-	return perf_session__deliver_event(session, event, &sample, tool,
-					   file_offset);
+	return perf_session__deliver_event(session, event, tool, file_offset);
 }
 
 void perf_event_header__bswap(struct perf_event_header *hdr)
@@ -1777,7 +1773,8 @@ static int __perf_session__process_pipe_events(struct perf_session *session)
 	err = perf_session__flush_thread_stacks(session);
 out_err:
 	free(buf);
-	perf_session__warn_about_errors(session);
+	if (!tool->no_warn)
+		perf_session__warn_about_errors(session);
 	ordered_events__free(&session->ordered_events);
 	auxtrace__free_events(session);
 	return err;
@@ -1933,7 +1930,8 @@ static int __perf_session__process_events(struct perf_session *session,
 	err = perf_session__flush_thread_stacks(session);
 out_err:
 	ui_progress__finish();
-	perf_session__warn_about_errors(session);
+	if (!tool->no_warn)
+		perf_session__warn_about_errors(session);
 	/*
 	 * We may switching perf.data output, make ordered_events
 	 * reusable.
diff --git a/tools/perf/util/session.h b/tools/perf/util/session.h
index da1434a..da40b4b 100644
--- a/tools/perf/util/session.h
+++ b/tools/perf/util/session.h
@@ -53,7 +53,7 @@ int perf_session__peek_event(struct perf_session *session, off_t file_offset,
 int perf_session__process_events(struct perf_session *session);
 
 int perf_session__queue_event(struct perf_session *s, union perf_event *event,
-			      struct perf_sample *sample, u64 file_offset);
+			      u64 timestamp, u64 file_offset);
 
 void perf_tool__fill_defaults(struct perf_tool *tool);
 
diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c
index a00eacd..2da4d04 100644
--- a/tools/perf/util/sort.c
+++ b/tools/perf/util/sort.c
@@ -336,7 +336,7 @@ char *hist_entry__get_srcline(struct hist_entry *he)
 		return SRCLINE_UNKNOWN;
 
 	return get_srcline(map->dso, map__rip_2objdump(map, he->ip),
-			   he->ms.sym, true, true);
+			   he->ms.sym, true, true, he->ip);
 }
 
 static int64_t
@@ -380,7 +380,8 @@ sort__srcline_from_cmp(struct hist_entry *left, struct hist_entry *right)
 					   map__rip_2objdump(map,
 							     left->branch_info->from.al_addr),
 							 left->branch_info->from.sym,
-							 true, true);
+							 true, true,
+							 left->branch_info->from.al_addr);
 	}
 	if (!right->branch_info->srcline_from) {
 		struct map *map = right->branch_info->from.map;
@@ -391,7 +392,8 @@ sort__srcline_from_cmp(struct hist_entry *left, struct hist_entry *right)
 					     map__rip_2objdump(map,
 							       right->branch_info->from.al_addr),
 						     right->branch_info->from.sym,
-						     true, true);
+						     true, true,
+						     right->branch_info->from.al_addr);
 	}
 	return strcmp(right->branch_info->srcline_from, left->branch_info->srcline_from);
 }
@@ -423,7 +425,8 @@ sort__srcline_to_cmp(struct hist_entry *left, struct hist_entry *right)
 					   map__rip_2objdump(map,
 							     left->branch_info->to.al_addr),
 							 left->branch_info->from.sym,
-							 true, true);
+							 true, true,
+							 left->branch_info->to.al_addr);
 	}
 	if (!right->branch_info->srcline_to) {
 		struct map *map = right->branch_info->to.map;
@@ -434,7 +437,8 @@ sort__srcline_to_cmp(struct hist_entry *left, struct hist_entry *right)
 					     map__rip_2objdump(map,
 							       right->branch_info->to.al_addr),
 						     right->branch_info->to.sym,
-						     true, true);
+						     true, true,
+						     right->branch_info->to.al_addr);
 	}
 	return strcmp(right->branch_info->srcline_to, left->branch_info->srcline_to);
 }
@@ -465,7 +469,7 @@ static char *hist_entry__get_srcfile(struct hist_entry *e)
 		return no_srcfile;
 
 	sf = __get_srcline(map->dso, map__rip_2objdump(map, e->ip),
-			 e->ms.sym, false, true, true);
+			 e->ms.sym, false, true, true, e->ip);
 	if (!strcmp(sf, SRCLINE_UNKNOWN))
 		return no_srcfile;
 	p = strchr(sf, ':');
@@ -2883,10 +2887,10 @@ static int setup_output_list(struct perf_hpp_list *list, char *str)
 			tok; tok = strtok_r(NULL, ", ", &tmp)) {
 		ret = output_field_add(list, tok);
 		if (ret == -EINVAL) {
-			pr_err("Invalid --fields key: `%s'", tok);
+			ui__error("Invalid --fields key: `%s'", tok);
 			break;
 		} else if (ret == -ESRCH) {
-			pr_err("Unknown --fields key: `%s'", tok);
+			ui__error("Unknown --fields key: `%s'", tok);
 			break;
 		}
 	}
diff --git a/tools/perf/util/srcline.c b/tools/perf/util/srcline.c
index d19f05c..3c21fd0 100644
--- a/tools/perf/util/srcline.c
+++ b/tools/perf/util/srcline.c
@@ -496,7 +496,8 @@ static struct inline_node *addr2inlines(const char *dso_name, u64 addr,
 #define A2L_FAIL_LIMIT 123
 
 char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
-		  bool show_sym, bool show_addr, bool unwind_inlines)
+		  bool show_sym, bool show_addr, bool unwind_inlines,
+		  u64 ip)
 {
 	char *file = NULL;
 	unsigned line = 0;
@@ -536,7 +537,7 @@ char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
 
 	if (sym) {
 		if (asprintf(&srcline, "%s+%" PRIu64, show_sym ? sym->name : "",
-					addr - sym->start) < 0)
+					ip - sym->start) < 0)
 			return SRCLINE_UNKNOWN;
 	} else if (asprintf(&srcline, "%s[%" PRIx64 "]", dso->short_name, addr) < 0)
 		return SRCLINE_UNKNOWN;
@@ -550,9 +551,9 @@ void free_srcline(char *srcline)
 }
 
 char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
-		  bool show_sym, bool show_addr)
+		  bool show_sym, bool show_addr, u64 ip)
 {
-	return __get_srcline(dso, addr, sym, show_sym, show_addr, false);
+	return __get_srcline(dso, addr, sym, show_sym, show_addr, false, ip);
 }
 
 struct srcline_node {
diff --git a/tools/perf/util/srcline.h b/tools/perf/util/srcline.h
index 847b708..b2bb5502 100644
--- a/tools/perf/util/srcline.h
+++ b/tools/perf/util/srcline.h
@@ -11,9 +11,10 @@ struct symbol;
 
 extern bool srcline_full_filename;
 char *get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
-		  bool show_sym, bool show_addr);
+		  bool show_sym, bool show_addr, u64 ip);
 char *__get_srcline(struct dso *dso, u64 addr, struct symbol *sym,
-		  bool show_sym, bool show_addr, bool unwind_inlines);
+		  bool show_sym, bool show_addr, bool unwind_inlines,
+		  u64 ip);
 void free_srcline(char *srcline);
 
 /* insert the srcline into the DSO, which will take ownership */
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 855e35c..594d14a 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -9,17 +9,6 @@
 #include "expr.h"
 #include "metricgroup.h"
 
-enum {
-	CTX_BIT_USER	= 1 << 0,
-	CTX_BIT_KERNEL	= 1 << 1,
-	CTX_BIT_HV	= 1 << 2,
-	CTX_BIT_HOST	= 1 << 3,
-	CTX_BIT_IDLE	= 1 << 4,
-	CTX_BIT_MAX	= 1 << 5,
-};
-
-#define NUM_CTX CTX_BIT_MAX
-
 /*
  * AGGR_GLOBAL: Use CPU 0
  * AGGR_SOCKET: Use first CPU of socket
@@ -27,36 +16,18 @@ enum {
  * AGGR_NONE: Use matching CPU
  * AGGR_THREAD: Not supported?
  */
-static struct stats runtime_nsecs_stats[MAX_NR_CPUS];
-static struct stats runtime_cycles_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_stalled_cycles_front_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_stalled_cycles_back_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_branches_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_cacherefs_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_l1_dcache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_l1_icache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_ll_cache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_itlb_cache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_dtlb_cache_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_cycles_in_tx_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_transaction_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_elision_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_total_slots[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_slots_issued[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_slots_retired[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_fetch_bubbles[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_topdown_recovery_bubbles[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_smi_num_stats[NUM_CTX][MAX_NR_CPUS];
-static struct stats runtime_aperf_stats[NUM_CTX][MAX_NR_CPUS];
-static struct rblist runtime_saved_values;
 static bool have_frontend_stalled;
 
+struct runtime_stat rt_stat;
 struct stats walltime_nsecs_stats;
 
 struct saved_value {
 	struct rb_node rb_node;
 	struct perf_evsel *evsel;
+	enum stat_type type;
+	int ctx;
 	int cpu;
+	struct runtime_stat *stat;
 	struct stats stats;
 };
 
@@ -69,6 +40,30 @@ static int saved_value_cmp(struct rb_node *rb_node, const void *entry)
 
 	if (a->cpu != b->cpu)
 		return a->cpu - b->cpu;
+
+	/*
+	 * Previously the rbtree was used to link generic metrics.
+	 * The keys were evsel/cpu. Now the rbtree is extended to support
+	 * per-thread shadow stats. For shadow stats case, the keys
+	 * are cpu/type/ctx/stat (evsel is NULL). For generic metrics
+	 * case, the keys are still evsel/cpu (type/ctx/stat are 0 or NULL).
+	 */
+	if (a->type != b->type)
+		return a->type - b->type;
+
+	if (a->ctx != b->ctx)
+		return a->ctx - b->ctx;
+
+	if (a->evsel == NULL && b->evsel == NULL) {
+		if (a->stat == b->stat)
+			return 0;
+
+		if ((char *)a->stat < (char *)b->stat)
+			return -1;
+
+		return 1;
+	}
+
 	if (a->evsel == b->evsel)
 		return 0;
 	if ((char *)a->evsel < (char *)b->evsel)
@@ -87,34 +82,66 @@ static struct rb_node *saved_value_new(struct rblist *rblist __maybe_unused,
 	return &nd->rb_node;
 }
 
+static void saved_value_delete(struct rblist *rblist __maybe_unused,
+			       struct rb_node *rb_node)
+{
+	struct saved_value *v;
+
+	BUG_ON(!rb_node);
+	v = container_of(rb_node, struct saved_value, rb_node);
+	free(v);
+}
+
 static struct saved_value *saved_value_lookup(struct perf_evsel *evsel,
 					      int cpu,
-					      bool create)
+					      bool create,
+					      enum stat_type type,
+					      int ctx,
+					      struct runtime_stat *st)
 {
+	struct rblist *rblist;
 	struct rb_node *nd;
 	struct saved_value dm = {
 		.cpu = cpu,
 		.evsel = evsel,
+		.type = type,
+		.ctx = ctx,
+		.stat = st,
 	};
-	nd = rblist__find(&runtime_saved_values, &dm);
+
+	rblist = &st->value_list;
+
+	nd = rblist__find(rblist, &dm);
 	if (nd)
 		return container_of(nd, struct saved_value, rb_node);
 	if (create) {
-		rblist__add_node(&runtime_saved_values, &dm);
-		nd = rblist__find(&runtime_saved_values, &dm);
+		rblist__add_node(rblist, &dm);
+		nd = rblist__find(rblist, &dm);
 		if (nd)
 			return container_of(nd, struct saved_value, rb_node);
 	}
 	return NULL;
 }
 
+void runtime_stat__init(struct runtime_stat *st)
+{
+	struct rblist *rblist = &st->value_list;
+
+	rblist__init(rblist);
+	rblist->node_cmp = saved_value_cmp;
+	rblist->node_new = saved_value_new;
+	rblist->node_delete = saved_value_delete;
+}
+
+void runtime_stat__exit(struct runtime_stat *st)
+{
+	rblist__exit(&st->value_list);
+}
+
 void perf_stat__init_shadow_stats(void)
 {
 	have_frontend_stalled = pmu_have_event("cpu", "stalled-cycles-frontend");
-	rblist__init(&runtime_saved_values);
-	runtime_saved_values.node_cmp = saved_value_cmp;
-	runtime_saved_values.node_new = saved_value_new;
-	/* No delete for now */
+	runtime_stat__init(&rt_stat);
 }
 
 static int evsel_context(struct perf_evsel *evsel)
@@ -135,36 +162,13 @@ static int evsel_context(struct perf_evsel *evsel)
 	return ctx;
 }
 
-void perf_stat__reset_shadow_stats(void)
+static void reset_stat(struct runtime_stat *st)
 {
+	struct rblist *rblist;
 	struct rb_node *pos, *next;
 
-	memset(runtime_nsecs_stats, 0, sizeof(runtime_nsecs_stats));
-	memset(runtime_cycles_stats, 0, sizeof(runtime_cycles_stats));
-	memset(runtime_stalled_cycles_front_stats, 0, sizeof(runtime_stalled_cycles_front_stats));
-	memset(runtime_stalled_cycles_back_stats, 0, sizeof(runtime_stalled_cycles_back_stats));
-	memset(runtime_branches_stats, 0, sizeof(runtime_branches_stats));
-	memset(runtime_cacherefs_stats, 0, sizeof(runtime_cacherefs_stats));
-	memset(runtime_l1_dcache_stats, 0, sizeof(runtime_l1_dcache_stats));
-	memset(runtime_l1_icache_stats, 0, sizeof(runtime_l1_icache_stats));
-	memset(runtime_ll_cache_stats, 0, sizeof(runtime_ll_cache_stats));
-	memset(runtime_itlb_cache_stats, 0, sizeof(runtime_itlb_cache_stats));
-	memset(runtime_dtlb_cache_stats, 0, sizeof(runtime_dtlb_cache_stats));
-	memset(runtime_cycles_in_tx_stats, 0,
-			sizeof(runtime_cycles_in_tx_stats));
-	memset(runtime_transaction_stats, 0,
-		sizeof(runtime_transaction_stats));
-	memset(runtime_elision_stats, 0, sizeof(runtime_elision_stats));
-	memset(&walltime_nsecs_stats, 0, sizeof(walltime_nsecs_stats));
-	memset(runtime_topdown_total_slots, 0, sizeof(runtime_topdown_total_slots));
-	memset(runtime_topdown_slots_retired, 0, sizeof(runtime_topdown_slots_retired));
-	memset(runtime_topdown_slots_issued, 0, sizeof(runtime_topdown_slots_issued));
-	memset(runtime_topdown_fetch_bubbles, 0, sizeof(runtime_topdown_fetch_bubbles));
-	memset(runtime_topdown_recovery_bubbles, 0, sizeof(runtime_topdown_recovery_bubbles));
-	memset(runtime_smi_num_stats, 0, sizeof(runtime_smi_num_stats));
-	memset(runtime_aperf_stats, 0, sizeof(runtime_aperf_stats));
-
-	next = rb_first(&runtime_saved_values.entries);
+	rblist = &st->value_list;
+	next = rb_first(&rblist->entries);
 	while (next) {
 		pos = next;
 		next = rb_next(pos);
@@ -174,13 +178,35 @@ void perf_stat__reset_shadow_stats(void)
 	}
 }
 
+void perf_stat__reset_shadow_stats(void)
+{
+	reset_stat(&rt_stat);
+	memset(&walltime_nsecs_stats, 0, sizeof(walltime_nsecs_stats));
+}
+
+void perf_stat__reset_shadow_per_stat(struct runtime_stat *st)
+{
+	reset_stat(st);
+}
+
+static void update_runtime_stat(struct runtime_stat *st,
+				enum stat_type type,
+				int ctx, int cpu, u64 count)
+{
+	struct saved_value *v = saved_value_lookup(NULL, cpu, true,
+						   type, ctx, st);
+
+	if (v)
+		update_stats(&v->stats, count);
+}
+
 /*
  * Update various tracking values we maintain to print
  * more semantic information such as miss/hit ratios,
  * instruction rates, etc:
  */
 void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
-				    int cpu)
+				    int cpu, struct runtime_stat *st)
 {
 	int ctx = evsel_context(counter);
 
@@ -188,50 +214,58 @@ void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
 
 	if (perf_evsel__match(counter, SOFTWARE, SW_TASK_CLOCK) ||
 	    perf_evsel__match(counter, SOFTWARE, SW_CPU_CLOCK))
-		update_stats(&runtime_nsecs_stats[cpu], count);
+		update_runtime_stat(st, STAT_NSECS, 0, cpu, count);
 	else if (perf_evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
-		update_stats(&runtime_cycles_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_CYCLES, ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
-		update_stats(&runtime_cycles_in_tx_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_CYCLES_IN_TX, ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, TRANSACTION_START))
-		update_stats(&runtime_transaction_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_TRANSACTION, ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, ELISION_START))
-		update_stats(&runtime_elision_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_ELISION, ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_TOTAL_SLOTS))
-		update_stats(&runtime_topdown_total_slots[ctx][cpu], count);
+		update_runtime_stat(st, STAT_TOPDOWN_TOTAL_SLOTS,
+				    ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_ISSUED))
-		update_stats(&runtime_topdown_slots_issued[ctx][cpu], count);
+		update_runtime_stat(st, STAT_TOPDOWN_SLOTS_ISSUED,
+				    ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_SLOTS_RETIRED))
-		update_stats(&runtime_topdown_slots_retired[ctx][cpu], count);
+		update_runtime_stat(st, STAT_TOPDOWN_SLOTS_RETIRED,
+				    ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_FETCH_BUBBLES))
-		update_stats(&runtime_topdown_fetch_bubbles[ctx][cpu], count);
+		update_runtime_stat(st, STAT_TOPDOWN_FETCH_BUBBLES,
+				    ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, TOPDOWN_RECOVERY_BUBBLES))
-		update_stats(&runtime_topdown_recovery_bubbles[ctx][cpu], count);
+		update_runtime_stat(st, STAT_TOPDOWN_RECOVERY_BUBBLES,
+				    ctx, cpu, count);
 	else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
-		update_stats(&runtime_stalled_cycles_front_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_STALLED_CYCLES_FRONT,
+				    ctx, cpu, count);
 	else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_BACKEND))
-		update_stats(&runtime_stalled_cycles_back_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_STALLED_CYCLES_BACK,
+				    ctx, cpu, count);
 	else if (perf_evsel__match(counter, HARDWARE, HW_BRANCH_INSTRUCTIONS))
-		update_stats(&runtime_branches_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_BRANCHES, ctx, cpu, count);
 	else if (perf_evsel__match(counter, HARDWARE, HW_CACHE_REFERENCES))
-		update_stats(&runtime_cacherefs_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_CACHEREFS, ctx, cpu, count);
 	else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_L1D))
-		update_stats(&runtime_l1_dcache_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_L1_DCACHE, ctx, cpu, count);
 	else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_L1I))
-		update_stats(&runtime_ll_cache_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_L1_ICACHE, ctx, cpu, count);
 	else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_LL))
-		update_stats(&runtime_ll_cache_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_LL_CACHE, ctx, cpu, count);
 	else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_DTLB))
-		update_stats(&runtime_dtlb_cache_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_DTLB_CACHE, ctx, cpu, count);
 	else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
-		update_stats(&runtime_itlb_cache_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_ITLB_CACHE, ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, SMI_NUM))
-		update_stats(&runtime_smi_num_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_SMI_NUM, ctx, cpu, count);
 	else if (perf_stat_evsel__is(counter, APERF))
-		update_stats(&runtime_aperf_stats[ctx][cpu], count);
+		update_runtime_stat(st, STAT_APERF, ctx, cpu, count);
 
 	if (counter->collect_stat) {
-		struct saved_value *v = saved_value_lookup(counter, cpu, true);
+		struct saved_value *v = saved_value_lookup(counter, cpu, true,
+							   STAT_NONE, 0, st);
 		update_stats(&v->stats, count);
 	}
 }
@@ -352,15 +386,40 @@ void perf_stat__collect_metric_expr(struct perf_evlist *evsel_list)
 	}
 }
 
+static double runtime_stat_avg(struct runtime_stat *st,
+			       enum stat_type type, int ctx, int cpu)
+{
+	struct saved_value *v;
+
+	v = saved_value_lookup(NULL, cpu, false, type, ctx, st);
+	if (!v)
+		return 0.0;
+
+	return avg_stats(&v->stats);
+}
+
+static double runtime_stat_n(struct runtime_stat *st,
+			     enum stat_type type, int ctx, int cpu)
+{
+	struct saved_value *v;
+
+	v = saved_value_lookup(NULL, cpu, false, type, ctx, st);
+	if (!v)
+		return 0.0;
+
+	return v->stats.n;
+}
+
 static void print_stalled_cycles_frontend(int cpu,
 					  struct perf_evsel *evsel, double avg,
-					  struct perf_stat_output_ctx *out)
+					  struct perf_stat_output_ctx *out,
+					  struct runtime_stat *st)
 {
 	double total, ratio = 0.0;
 	const char *color;
 	int ctx = evsel_context(evsel);
 
-	total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+	total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -376,13 +435,14 @@ static void print_stalled_cycles_frontend(int cpu,
 
 static void print_stalled_cycles_backend(int cpu,
 					 struct perf_evsel *evsel, double avg,
-					 struct perf_stat_output_ctx *out)
+					 struct perf_stat_output_ctx *out,
+					 struct runtime_stat *st)
 {
 	double total, ratio = 0.0;
 	const char *color;
 	int ctx = evsel_context(evsel);
 
-	total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+	total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -395,13 +455,14 @@ static void print_stalled_cycles_backend(int cpu,
 static void print_branch_misses(int cpu,
 				struct perf_evsel *evsel,
 				double avg,
-				struct perf_stat_output_ctx *out)
+				struct perf_stat_output_ctx *out,
+				struct runtime_stat *st)
 {
 	double total, ratio = 0.0;
 	const char *color;
 	int ctx = evsel_context(evsel);
 
-	total = avg_stats(&runtime_branches_stats[ctx][cpu]);
+	total = runtime_stat_avg(st, STAT_BRANCHES, ctx, cpu);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -414,13 +475,15 @@ static void print_branch_misses(int cpu,
 static void print_l1_dcache_misses(int cpu,
 				   struct perf_evsel *evsel,
 				   double avg,
-				   struct perf_stat_output_ctx *out)
+				   struct perf_stat_output_ctx *out,
+				   struct runtime_stat *st)
+
 {
 	double total, ratio = 0.0;
 	const char *color;
 	int ctx = evsel_context(evsel);
 
-	total = avg_stats(&runtime_l1_dcache_stats[ctx][cpu]);
+	total = runtime_stat_avg(st, STAT_L1_DCACHE, ctx, cpu);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -433,13 +496,15 @@ static void print_l1_dcache_misses(int cpu,
 static void print_l1_icache_misses(int cpu,
 				   struct perf_evsel *evsel,
 				   double avg,
-				   struct perf_stat_output_ctx *out)
+				   struct perf_stat_output_ctx *out,
+				   struct runtime_stat *st)
+
 {
 	double total, ratio = 0.0;
 	const char *color;
 	int ctx = evsel_context(evsel);
 
-	total = avg_stats(&runtime_l1_icache_stats[ctx][cpu]);
+	total = runtime_stat_avg(st, STAT_L1_ICACHE, ctx, cpu);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -451,13 +516,14 @@ static void print_l1_icache_misses(int cpu,
 static void print_dtlb_cache_misses(int cpu,
 				    struct perf_evsel *evsel,
 				    double avg,
-				    struct perf_stat_output_ctx *out)
+				    struct perf_stat_output_ctx *out,
+				    struct runtime_stat *st)
 {
 	double total, ratio = 0.0;
 	const char *color;
 	int ctx = evsel_context(evsel);
 
-	total = avg_stats(&runtime_dtlb_cache_stats[ctx][cpu]);
+	total = runtime_stat_avg(st, STAT_DTLB_CACHE, ctx, cpu);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -469,13 +535,14 @@ static void print_dtlb_cache_misses(int cpu,
 static void print_itlb_cache_misses(int cpu,
 				    struct perf_evsel *evsel,
 				    double avg,
-				    struct perf_stat_output_ctx *out)
+				    struct perf_stat_output_ctx *out,
+				    struct runtime_stat *st)
 {
 	double total, ratio = 0.0;
 	const char *color;
 	int ctx = evsel_context(evsel);
 
-	total = avg_stats(&runtime_itlb_cache_stats[ctx][cpu]);
+	total = runtime_stat_avg(st, STAT_ITLB_CACHE, ctx, cpu);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -487,13 +554,14 @@ static void print_itlb_cache_misses(int cpu,
 static void print_ll_cache_misses(int cpu,
 				  struct perf_evsel *evsel,
 				  double avg,
-				  struct perf_stat_output_ctx *out)
+				  struct perf_stat_output_ctx *out,
+				  struct runtime_stat *st)
 {
 	double total, ratio = 0.0;
 	const char *color;
 	int ctx = evsel_context(evsel);
 
-	total = avg_stats(&runtime_ll_cache_stats[ctx][cpu]);
+	total = runtime_stat_avg(st, STAT_LL_CACHE, ctx, cpu);
 
 	if (total)
 		ratio = avg / total * 100.0;
@@ -551,68 +619,72 @@ static double sanitize_val(double x)
 	return x;
 }
 
-static double td_total_slots(int ctx, int cpu)
+static double td_total_slots(int ctx, int cpu, struct runtime_stat *st)
 {
-	return avg_stats(&runtime_topdown_total_slots[ctx][cpu]);
+	return runtime_stat_avg(st, STAT_TOPDOWN_TOTAL_SLOTS, ctx, cpu);
 }
 
-static double td_bad_spec(int ctx, int cpu)
+static double td_bad_spec(int ctx, int cpu, struct runtime_stat *st)
 {
 	double bad_spec = 0;
 	double total_slots;
 	double total;
 
-	total = avg_stats(&runtime_topdown_slots_issued[ctx][cpu]) -
-		avg_stats(&runtime_topdown_slots_retired[ctx][cpu]) +
-		avg_stats(&runtime_topdown_recovery_bubbles[ctx][cpu]);
-	total_slots = td_total_slots(ctx, cpu);
+	total = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_ISSUED, ctx, cpu) -
+		runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED, ctx, cpu) +
+		runtime_stat_avg(st, STAT_TOPDOWN_RECOVERY_BUBBLES, ctx, cpu);
+
+	total_slots = td_total_slots(ctx, cpu, st);
 	if (total_slots)
 		bad_spec = total / total_slots;
 	return sanitize_val(bad_spec);
 }
 
-static double td_retiring(int ctx, int cpu)
+static double td_retiring(int ctx, int cpu, struct runtime_stat *st)
 {
 	double retiring = 0;
-	double total_slots = td_total_slots(ctx, cpu);
-	double ret_slots = avg_stats(&runtime_topdown_slots_retired[ctx][cpu]);
+	double total_slots = td_total_slots(ctx, cpu, st);
+	double ret_slots = runtime_stat_avg(st, STAT_TOPDOWN_SLOTS_RETIRED,
+					    ctx, cpu);
 
 	if (total_slots)
 		retiring = ret_slots / total_slots;
 	return retiring;
 }
 
-static double td_fe_bound(int ctx, int cpu)
+static double td_fe_bound(int ctx, int cpu, struct runtime_stat *st)
 {
 	double fe_bound = 0;
-	double total_slots = td_total_slots(ctx, cpu);
-	double fetch_bub = avg_stats(&runtime_topdown_fetch_bubbles[ctx][cpu]);
+	double total_slots = td_total_slots(ctx, cpu, st);
+	double fetch_bub = runtime_stat_avg(st, STAT_TOPDOWN_FETCH_BUBBLES,
+					    ctx, cpu);
 
 	if (total_slots)
 		fe_bound = fetch_bub / total_slots;
 	return fe_bound;
 }
 
-static double td_be_bound(int ctx, int cpu)
+static double td_be_bound(int ctx, int cpu, struct runtime_stat *st)
 {
-	double sum = (td_fe_bound(ctx, cpu) +
-		      td_bad_spec(ctx, cpu) +
-		      td_retiring(ctx, cpu));
+	double sum = (td_fe_bound(ctx, cpu, st) +
+		      td_bad_spec(ctx, cpu, st) +
+		      td_retiring(ctx, cpu, st));
 	if (sum == 0)
 		return 0;
 	return sanitize_val(1.0 - sum);
 }
 
 static void print_smi_cost(int cpu, struct perf_evsel *evsel,
-			   struct perf_stat_output_ctx *out)
+			   struct perf_stat_output_ctx *out,
+			   struct runtime_stat *st)
 {
 	double smi_num, aperf, cycles, cost = 0.0;
 	int ctx = evsel_context(evsel);
 	const char *color = NULL;
 
-	smi_num = avg_stats(&runtime_smi_num_stats[ctx][cpu]);
-	aperf = avg_stats(&runtime_aperf_stats[ctx][cpu]);
-	cycles = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+	smi_num = runtime_stat_avg(st, STAT_SMI_NUM, ctx, cpu);
+	aperf = runtime_stat_avg(st, STAT_APERF, ctx, cpu);
+	cycles = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
 
 	if ((cycles == 0) || (aperf == 0))
 		return;
@@ -632,7 +704,8 @@ static void generic_metric(const char *metric_expr,
 			   const char *metric_name,
 			   double avg,
 			   int cpu,
-			   struct perf_stat_output_ctx *out)
+			   struct perf_stat_output_ctx *out,
+			   struct runtime_stat *st)
 {
 	print_metric_t print_metric = out->print_metric;
 	struct parse_ctx pctx;
@@ -651,7 +724,8 @@ static void generic_metric(const char *metric_expr,
 			stats = &walltime_nsecs_stats;
 			scale = 1e-9;
 		} else {
-			v = saved_value_lookup(metric_events[i], cpu, false);
+			v = saved_value_lookup(metric_events[i], cpu, false,
+					       STAT_NONE, 0, st);
 			if (!v)
 				break;
 			stats = &v->stats;
@@ -679,7 +753,8 @@ static void generic_metric(const char *metric_expr,
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
 				   struct perf_stat_output_ctx *out,
-				   struct rblist *metric_events)
+				   struct rblist *metric_events,
+				   struct runtime_stat *st)
 {
 	void *ctxp = out->ctx;
 	print_metric_t print_metric = out->print_metric;
@@ -690,7 +765,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 	int num = 1;
 
 	if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
-		total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+		total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
+
 		if (total) {
 			ratio = avg / total;
 			print_metric(ctxp, NULL, "%7.2f ",
@@ -698,8 +774,13 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		} else {
 			print_metric(ctxp, NULL, NULL, "insn per cycle", 0);
 		}
-		total = avg_stats(&runtime_stalled_cycles_front_stats[ctx][cpu]);
-		total = max(total, avg_stats(&runtime_stalled_cycles_back_stats[ctx][cpu]));
+
+		total = runtime_stat_avg(st, STAT_STALLED_CYCLES_FRONT,
+					 ctx, cpu);
+
+		total = max(total, runtime_stat_avg(st,
+						    STAT_STALLED_CYCLES_BACK,
+						    ctx, cpu));
 
 		if (total && avg) {
 			out->new_line(ctxp);
@@ -712,8 +793,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				     "stalled cycles per insn", 0);
 		}
 	} else if (perf_evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
-		if (runtime_branches_stats[ctx][cpu].n != 0)
-			print_branch_misses(cpu, evsel, avg, out);
+		if (runtime_stat_n(st, STAT_BRANCHES, ctx, cpu) != 0)
+			print_branch_misses(cpu, evsel, avg, out, st);
 		else
 			print_metric(ctxp, NULL, NULL, "of all branches", 0);
 	} else if (
@@ -721,8 +802,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		evsel->attr.config ==  ( PERF_COUNT_HW_CACHE_L1D |
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
-		if (runtime_l1_dcache_stats[ctx][cpu].n != 0)
-			print_l1_dcache_misses(cpu, evsel, avg, out);
+
+		if (runtime_stat_n(st, STAT_L1_DCACHE, ctx, cpu) != 0)
+			print_l1_dcache_misses(cpu, evsel, avg, out, st);
 		else
 			print_metric(ctxp, NULL, NULL, "of all L1-dcache hits", 0);
 	} else if (
@@ -730,8 +812,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		evsel->attr.config ==  ( PERF_COUNT_HW_CACHE_L1I |
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
-		if (runtime_l1_icache_stats[ctx][cpu].n != 0)
-			print_l1_icache_misses(cpu, evsel, avg, out);
+
+		if (runtime_stat_n(st, STAT_L1_ICACHE, ctx, cpu) != 0)
+			print_l1_icache_misses(cpu, evsel, avg, out, st);
 		else
 			print_metric(ctxp, NULL, NULL, "of all L1-icache hits", 0);
 	} else if (
@@ -739,8 +822,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		evsel->attr.config ==  ( PERF_COUNT_HW_CACHE_DTLB |
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
-		if (runtime_dtlb_cache_stats[ctx][cpu].n != 0)
-			print_dtlb_cache_misses(cpu, evsel, avg, out);
+
+		if (runtime_stat_n(st, STAT_DTLB_CACHE, ctx, cpu) != 0)
+			print_dtlb_cache_misses(cpu, evsel, avg, out, st);
 		else
 			print_metric(ctxp, NULL, NULL, "of all dTLB cache hits", 0);
 	} else if (
@@ -748,8 +832,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		evsel->attr.config ==  ( PERF_COUNT_HW_CACHE_ITLB |
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
-		if (runtime_itlb_cache_stats[ctx][cpu].n != 0)
-			print_itlb_cache_misses(cpu, evsel, avg, out);
+
+		if (runtime_stat_n(st, STAT_ITLB_CACHE, ctx, cpu) != 0)
+			print_itlb_cache_misses(cpu, evsel, avg, out, st);
 		else
 			print_metric(ctxp, NULL, NULL, "of all iTLB cache hits", 0);
 	} else if (
@@ -757,27 +842,28 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		evsel->attr.config ==  ( PERF_COUNT_HW_CACHE_LL |
 					((PERF_COUNT_HW_CACHE_OP_READ) << 8) |
 					 ((PERF_COUNT_HW_CACHE_RESULT_MISS) << 16))) {
-		if (runtime_ll_cache_stats[ctx][cpu].n != 0)
-			print_ll_cache_misses(cpu, evsel, avg, out);
+
+		if (runtime_stat_n(st, STAT_LL_CACHE, ctx, cpu) != 0)
+			print_ll_cache_misses(cpu, evsel, avg, out, st);
 		else
 			print_metric(ctxp, NULL, NULL, "of all LL-cache hits", 0);
 	} else if (perf_evsel__match(evsel, HARDWARE, HW_CACHE_MISSES)) {
-		total = avg_stats(&runtime_cacherefs_stats[ctx][cpu]);
+		total = runtime_stat_avg(st, STAT_CACHEREFS, ctx, cpu);
 
 		if (total)
 			ratio = avg * 100 / total;
 
-		if (runtime_cacherefs_stats[ctx][cpu].n != 0)
+		if (runtime_stat_n(st, STAT_CACHEREFS, ctx, cpu) != 0)
 			print_metric(ctxp, NULL, "%8.3f %%",
 				     "of all cache refs", ratio);
 		else
 			print_metric(ctxp, NULL, NULL, "of all cache refs", 0);
 	} else if (perf_evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) {
-		print_stalled_cycles_frontend(cpu, evsel, avg, out);
+		print_stalled_cycles_frontend(cpu, evsel, avg, out, st);
 	} else if (perf_evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND)) {
-		print_stalled_cycles_backend(cpu, evsel, avg, out);
+		print_stalled_cycles_backend(cpu, evsel, avg, out, st);
 	} else if (perf_evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
-		total = avg_stats(&runtime_nsecs_stats[cpu]);
+		total = runtime_stat_avg(st, STAT_NSECS, 0, cpu);
 
 		if (total) {
 			ratio = avg / total;
@@ -786,7 +872,8 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 			print_metric(ctxp, NULL, NULL, "Ghz", 0);
 		}
 	} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX)) {
-		total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
+		total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
+
 		if (total)
 			print_metric(ctxp, NULL,
 					"%7.2f%%", "transactional cycles",
@@ -795,8 +882,9 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 			print_metric(ctxp, NULL, NULL, "transactional cycles",
 				     0);
 	} else if (perf_stat_evsel__is(evsel, CYCLES_IN_TX_CP)) {
-		total = avg_stats(&runtime_cycles_stats[ctx][cpu]);
-		total2 = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
+		total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
+		total2 = runtime_stat_avg(st, STAT_CYCLES_IN_TX, ctx, cpu);
+
 		if (total2 < avg)
 			total2 = avg;
 		if (total)
@@ -805,19 +893,21 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		else
 			print_metric(ctxp, NULL, NULL, "aborted cycles", 0);
 	} else if (perf_stat_evsel__is(evsel, TRANSACTION_START)) {
-		total = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
+		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX,
+					 ctx, cpu);
 
 		if (avg)
 			ratio = total / avg;
 
-		if (runtime_cycles_in_tx_stats[ctx][cpu].n != 0)
+		if (runtime_stat_n(st, STAT_CYCLES_IN_TX, ctx, cpu) != 0)
 			print_metric(ctxp, NULL, "%8.0f",
 				     "cycles / transaction", ratio);
 		else
 			print_metric(ctxp, NULL, NULL, "cycles / transaction",
-				     0);
+				      0);
 	} else if (perf_stat_evsel__is(evsel, ELISION_START)) {
-		total = avg_stats(&runtime_cycles_in_tx_stats[ctx][cpu]);
+		total = runtime_stat_avg(st, STAT_CYCLES_IN_TX,
+					 ctx, cpu);
 
 		if (avg)
 			ratio = total / avg;
@@ -831,28 +921,28 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		else
 			print_metric(ctxp, NULL, NULL, "CPUs utilized", 0);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_FETCH_BUBBLES)) {
-		double fe_bound = td_fe_bound(ctx, cpu);
+		double fe_bound = td_fe_bound(ctx, cpu, st);
 
 		if (fe_bound > 0.2)
 			color = PERF_COLOR_RED;
 		print_metric(ctxp, color, "%8.1f%%", "frontend bound",
 				fe_bound * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_RETIRED)) {
-		double retiring = td_retiring(ctx, cpu);
+		double retiring = td_retiring(ctx, cpu, st);
 
 		if (retiring > 0.7)
 			color = PERF_COLOR_GREEN;
 		print_metric(ctxp, color, "%8.1f%%", "retiring",
 				retiring * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_RECOVERY_BUBBLES)) {
-		double bad_spec = td_bad_spec(ctx, cpu);
+		double bad_spec = td_bad_spec(ctx, cpu, st);
 
 		if (bad_spec > 0.1)
 			color = PERF_COLOR_RED;
 		print_metric(ctxp, color, "%8.1f%%", "bad speculation",
 				bad_spec * 100.);
 	} else if (perf_stat_evsel__is(evsel, TOPDOWN_SLOTS_ISSUED)) {
-		double be_bound = td_be_bound(ctx, cpu);
+		double be_bound = td_be_bound(ctx, cpu, st);
 		const char *name = "backend bound";
 		static int have_recovery_bubbles = -1;
 
@@ -865,19 +955,19 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 
 		if (be_bound > 0.2)
 			color = PERF_COLOR_RED;
-		if (td_total_slots(ctx, cpu) > 0)
+		if (td_total_slots(ctx, cpu, st) > 0)
 			print_metric(ctxp, color, "%8.1f%%", name,
 					be_bound * 100.);
 		else
 			print_metric(ctxp, NULL, NULL, name, 0);
 	} else if (evsel->metric_expr) {
 		generic_metric(evsel->metric_expr, evsel->metric_events, evsel->name,
-				evsel->metric_name, avg, cpu, out);
-	} else if (runtime_nsecs_stats[cpu].n != 0) {
+				evsel->metric_name, avg, cpu, out, st);
+	} else if (runtime_stat_n(st, STAT_NSECS, 0, cpu) != 0) {
 		char unit = 'M';
 		char unit_buf[10];
 
-		total = avg_stats(&runtime_nsecs_stats[cpu]);
+		total = runtime_stat_avg(st, STAT_NSECS, 0, cpu);
 
 		if (total)
 			ratio = 1000.0 * avg / total;
@@ -888,7 +978,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 		snprintf(unit_buf, sizeof(unit_buf), "%c/sec", unit);
 		print_metric(ctxp, NULL, "%8.3f", unit_buf, ratio);
 	} else if (perf_stat_evsel__is(evsel, SMI_NUM)) {
-		print_smi_cost(cpu, evsel, out);
+		print_smi_cost(cpu, evsel, out, st);
 	} else {
 		num = 0;
 	}
@@ -901,7 +991,7 @@ void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				out->new_line(ctxp);
 			generic_metric(mexp->metric_expr, mexp->metric_events,
 					evsel->name, mexp->metric_name,
-					avg, cpu, out);
+					avg, cpu, out, st);
 		}
 	}
 	if (num == 0)
diff --git a/tools/perf/util/stat.c b/tools/perf/util/stat.c
index 151e9ef..32235657 100644
--- a/tools/perf/util/stat.c
+++ b/tools/perf/util/stat.c
@@ -278,9 +278,16 @@ process_counter_values(struct perf_stat_config *config, struct perf_evsel *evsel
 			perf_evsel__compute_deltas(evsel, cpu, thread, count);
 		perf_counts_values__scale(count, config->scale, NULL);
 		if (config->aggr_mode == AGGR_NONE)
-			perf_stat__update_shadow_stats(evsel, count->val, cpu);
-		if (config->aggr_mode == AGGR_THREAD)
-			perf_stat__update_shadow_stats(evsel, count->val, 0);
+			perf_stat__update_shadow_stats(evsel, count->val, cpu,
+						       &rt_stat);
+		if (config->aggr_mode == AGGR_THREAD) {
+			if (config->stats)
+				perf_stat__update_shadow_stats(evsel,
+					count->val, 0, &config->stats[thread]);
+			else
+				perf_stat__update_shadow_stats(evsel,
+					count->val, 0, &rt_stat);
+		}
 		break;
 	case AGGR_GLOBAL:
 		aggr->val += count->val;
@@ -362,7 +369,7 @@ int perf_stat_process_counter(struct perf_stat_config *config,
 	/*
 	 * Save the full runtime - to allow normalization during printout:
 	 */
-	perf_stat__update_shadow_stats(counter, *count, 0);
+	perf_stat__update_shadow_stats(counter, *count, 0, &rt_stat);
 
 	return 0;
 }
diff --git a/tools/perf/util/stat.h b/tools/perf/util/stat.h
index eefca5c..dbc6f71 100644
--- a/tools/perf/util/stat.h
+++ b/tools/perf/util/stat.h
@@ -5,6 +5,7 @@
 #include <linux/types.h>
 #include <stdio.h>
 #include "xyarray.h"
+#include "rblist.h"
 
 struct stats
 {
@@ -43,11 +44,54 @@ enum aggr_mode {
 	AGGR_UNSET,
 };
 
+enum {
+	CTX_BIT_USER	= 1 << 0,
+	CTX_BIT_KERNEL	= 1 << 1,
+	CTX_BIT_HV	= 1 << 2,
+	CTX_BIT_HOST	= 1 << 3,
+	CTX_BIT_IDLE	= 1 << 4,
+	CTX_BIT_MAX	= 1 << 5,
+};
+
+#define NUM_CTX CTX_BIT_MAX
+
+enum stat_type {
+	STAT_NONE = 0,
+	STAT_NSECS,
+	STAT_CYCLES,
+	STAT_STALLED_CYCLES_FRONT,
+	STAT_STALLED_CYCLES_BACK,
+	STAT_BRANCHES,
+	STAT_CACHEREFS,
+	STAT_L1_DCACHE,
+	STAT_L1_ICACHE,
+	STAT_LL_CACHE,
+	STAT_ITLB_CACHE,
+	STAT_DTLB_CACHE,
+	STAT_CYCLES_IN_TX,
+	STAT_TRANSACTION,
+	STAT_ELISION,
+	STAT_TOPDOWN_TOTAL_SLOTS,
+	STAT_TOPDOWN_SLOTS_ISSUED,
+	STAT_TOPDOWN_SLOTS_RETIRED,
+	STAT_TOPDOWN_FETCH_BUBBLES,
+	STAT_TOPDOWN_RECOVERY_BUBBLES,
+	STAT_SMI_NUM,
+	STAT_APERF,
+	STAT_MAX
+};
+
+struct runtime_stat {
+	struct rblist value_list;
+};
+
 struct perf_stat_config {
 	enum aggr_mode	aggr_mode;
 	bool		scale;
 	FILE		*output;
 	unsigned int	interval;
+	struct runtime_stat *stats;
+	int		stats_num;
 };
 
 void update_stats(struct stats *stats, u64 val);
@@ -67,6 +111,15 @@ static inline void init_stats(struct stats *stats)
 struct perf_evsel;
 struct perf_evlist;
 
+struct perf_aggr_thread_value {
+	struct perf_evsel *counter;
+	int id;
+	double uval;
+	u64 val;
+	u64 run;
+	u64 ena;
+};
+
 bool __perf_evsel_stat__is(struct perf_evsel *evsel,
 			   enum perf_stat_evsel_id id);
 
@@ -75,16 +128,20 @@ bool __perf_evsel_stat__is(struct perf_evsel *evsel,
 
 void perf_stat_evsel_id_init(struct perf_evsel *evsel);
 
+extern struct runtime_stat rt_stat;
 extern struct stats walltime_nsecs_stats;
 
 typedef void (*print_metric_t)(void *ctx, const char *color, const char *unit,
 			       const char *fmt, double val);
 typedef void (*new_line_t )(void *ctx);
 
+void runtime_stat__init(struct runtime_stat *st);
+void runtime_stat__exit(struct runtime_stat *st);
 void perf_stat__init_shadow_stats(void);
 void perf_stat__reset_shadow_stats(void);
+void perf_stat__reset_shadow_per_stat(struct runtime_stat *st);
 void perf_stat__update_shadow_stats(struct perf_evsel *counter, u64 count,
-				    int cpu);
+				    int cpu, struct runtime_stat *st);
 struct perf_stat_output_ctx {
 	void *ctx;
 	print_metric_t print_metric;
@@ -92,11 +149,11 @@ struct perf_stat_output_ctx {
 	bool force_header;
 };
 
-struct rblist;
 void perf_stat__print_shadow_stats(struct perf_evsel *evsel,
 				   double avg, int cpu,
 				   struct perf_stat_output_ctx *out,
-				   struct rblist *metric_events);
+				   struct rblist *metric_events,
+				   struct runtime_stat *st);
 void perf_stat__collect_metric_expr(struct perf_evlist *);
 
 int perf_evlist__alloc_stats(struct perf_evlist *evlist, bool alloc_raw);
diff --git a/tools/perf/util/string.c b/tools/perf/util/string.c
index aaa08ee..d8bfd0c 100644
--- a/tools/perf/util/string.c
+++ b/tools/perf/util/string.c
@@ -396,3 +396,49 @@ char *asprintf_expr_inout_ints(const char *var, bool in, size_t nints, int *ints
 	free(expr);
 	return NULL;
 }
+
+/* Like strpbrk(), but not break if it is right after a backslash (escaped) */
+char *strpbrk_esc(char *str, const char *stopset)
+{
+	char *ptr;
+
+	do {
+		ptr = strpbrk(str, stopset);
+		if (ptr == str ||
+		    (ptr == str + 1 && *(ptr - 1) != '\\'))
+			break;
+		str = ptr + 1;
+	} while (ptr && *(ptr - 1) == '\\' && *(ptr - 2) != '\\');
+
+	return ptr;
+}
+
+/* Like strdup, but do not copy a single backslash */
+char *strdup_esc(const char *str)
+{
+	char *s, *d, *p, *ret = strdup(str);
+
+	if (!ret)
+		return NULL;
+
+	d = strchr(ret, '\\');
+	if (!d)
+		return ret;
+
+	s = d + 1;
+	do {
+		if (*s == '\0') {
+			*d = '\0';
+			break;
+		}
+		p = strchr(s + 1, '\\');
+		if (p) {
+			memmove(d, s, p - s);
+			d += p - s;
+			s = p + 1;
+		} else
+			memmove(d, s, strlen(s) + 1);
+	} while (p);
+
+	return ret;
+}
diff --git a/tools/perf/util/string2.h b/tools/perf/util/string2.h
index ee14ca5..4c68a09 100644
--- a/tools/perf/util/string2.h
+++ b/tools/perf/util/string2.h
@@ -39,5 +39,7 @@ static inline char *asprintf_expr_not_in_ints(const char *var, size_t nints, int
 	return asprintf_expr_inout_ints(var, false, nints, ints);
 }
 
+char *strpbrk_esc(char *str, const char *stopset);
+char *strdup_esc(const char *str);
 
 #endif /* PERF_STRING_H */
diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 1b67a86..cc065d4 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -94,6 +94,11 @@ static int prefix_underscores_count(const char *str)
 	return tail - str;
 }
 
+const char * __weak arch__normalize_symbol_name(const char *name)
+{
+	return name;
+}
+
 int __weak arch__compare_symbol_names(const char *namea, const char *nameb)
 {
 	return strcmp(namea, nameb);
diff --git a/tools/perf/util/symbol.h b/tools/perf/util/symbol.h
index a4f0075..0563f33 100644
--- a/tools/perf/util/symbol.h
+++ b/tools/perf/util/symbol.h
@@ -349,6 +349,7 @@ bool elf__needs_adjust_symbols(GElf_Ehdr ehdr);
 void arch__sym_update(struct symbol *s, GElf_Sym *sym);
 #endif
 
+const char *arch__normalize_symbol_name(const char *name);
 #define SYMBOL_A 0
 #define SYMBOL_B 1
 
diff --git a/tools/perf/util/syscalltbl.c b/tools/perf/util/syscalltbl.c
index 6eea7cf..303bdb8 100644
--- a/tools/perf/util/syscalltbl.c
+++ b/tools/perf/util/syscalltbl.c
@@ -26,6 +26,10 @@
 #include <asm/syscalls_64.c>
 const int syscalltbl_native_max_id = SYSCALLTBL_x86_64_MAX_ID;
 static const char **syscalltbl_native = syscalltbl_x86_64;
+#elif defined(__s390x__)
+#include <asm/syscalls_64.c>
+const int syscalltbl_native_max_id = SYSCALLTBL_S390_64_MAX_ID;
+static const char **syscalltbl_native = syscalltbl_s390_64;
 #endif
 
 struct syscall {
diff --git a/tools/perf/util/target.h b/tools/perf/util/target.h
index 446aa7a..6ef01a8 100644
--- a/tools/perf/util/target.h
+++ b/tools/perf/util/target.h
@@ -64,6 +64,11 @@ static inline bool target__none(struct target *target)
 	return !target__has_task(target) && !target__has_cpu(target);
 }
 
+static inline bool target__has_per_thread(struct target *target)
+{
+	return target->system_wide && target->per_thread;
+}
+
 static inline bool target__uses_dummy_map(struct target *target)
 {
 	bool use_dummy = false;
@@ -73,6 +78,8 @@ static inline bool target__uses_dummy_map(struct target *target)
 	else if (target__has_task(target) ||
 	         (!target__has_cpu(target) && !target->uses_mmap))
 		use_dummy = true;
+	else if (target__has_per_thread(target))
+		use_dummy = true;
 
 	return use_dummy;
 }
diff --git a/tools/perf/util/thread_map.c b/tools/perf/util/thread_map.c
index be0d5a7..3e1038f 100644
--- a/tools/perf/util/thread_map.c
+++ b/tools/perf/util/thread_map.c
@@ -92,7 +92,7 @@ struct thread_map *thread_map__new_by_tid(pid_t tid)
 	return threads;
 }
 
-struct thread_map *thread_map__new_by_uid(uid_t uid)
+static struct thread_map *__thread_map__new_all_cpus(uid_t uid)
 {
 	DIR *proc;
 	int max_threads = 32, items, i;
@@ -113,7 +113,6 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
 	while ((dirent = readdir(proc)) != NULL) {
 		char *end;
 		bool grow = false;
-		struct stat st;
 		pid_t pid = strtol(dirent->d_name, &end, 10);
 
 		if (*end) /* only interested in proper numerical dirents */
@@ -121,11 +120,12 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
 
 		snprintf(path, sizeof(path), "/proc/%s", dirent->d_name);
 
-		if (stat(path, &st) != 0)
-			continue;
+		if (uid != UINT_MAX) {
+			struct stat st;
 
-		if (st.st_uid != uid)
-			continue;
+			if (stat(path, &st) != 0 || st.st_uid != uid)
+				continue;
+		}
 
 		snprintf(path, sizeof(path), "/proc/%d/task", pid);
 		items = scandir(path, &namelist, filter, NULL);
@@ -178,6 +178,16 @@ struct thread_map *thread_map__new_by_uid(uid_t uid)
 	goto out_closedir;
 }
 
+struct thread_map *thread_map__new_all_cpus(void)
+{
+	return __thread_map__new_all_cpus(UINT_MAX);
+}
+
+struct thread_map *thread_map__new_by_uid(uid_t uid)
+{
+	return __thread_map__new_all_cpus(uid);
+}
+
 struct thread_map *thread_map__new(pid_t pid, pid_t tid, uid_t uid)
 {
 	if (pid != -1)
@@ -313,7 +323,7 @@ struct thread_map *thread_map__new_by_tid_str(const char *tid_str)
 }
 
 struct thread_map *thread_map__new_str(const char *pid, const char *tid,
-				       uid_t uid)
+				       uid_t uid, bool per_thread)
 {
 	if (pid)
 		return thread_map__new_by_pid_str(pid);
@@ -321,6 +331,9 @@ struct thread_map *thread_map__new_str(const char *pid, const char *tid,
 	if (!tid && uid != UINT_MAX)
 		return thread_map__new_by_uid(uid);
 
+	if (per_thread)
+		return thread_map__new_all_cpus();
+
 	return thread_map__new_by_tid_str(tid);
 }
 
diff --git a/tools/perf/util/thread_map.h b/tools/perf/util/thread_map.h
index f158039..0a806b9 100644
--- a/tools/perf/util/thread_map.h
+++ b/tools/perf/util/thread_map.h
@@ -23,6 +23,7 @@ struct thread_map *thread_map__new_dummy(void);
 struct thread_map *thread_map__new_by_pid(pid_t pid);
 struct thread_map *thread_map__new_by_tid(pid_t tid);
 struct thread_map *thread_map__new_by_uid(uid_t uid);
+struct thread_map *thread_map__new_all_cpus(void);
 struct thread_map *thread_map__new(pid_t pid, pid_t tid, uid_t uid);
 struct thread_map *thread_map__new_event(struct thread_map_event *event);
 
@@ -30,7 +31,7 @@ struct thread_map *thread_map__get(struct thread_map *map);
 void thread_map__put(struct thread_map *map);
 
 struct thread_map *thread_map__new_str(const char *pid,
-		const char *tid, uid_t uid);
+		const char *tid, uid_t uid, bool per_thread);
 
 struct thread_map *thread_map__new_by_tid_str(const char *tid_str);
 
diff --git a/tools/perf/util/time-utils.c b/tools/perf/util/time-utils.c
index 81927d0..6193b46 100644
--- a/tools/perf/util/time-utils.c
+++ b/tools/perf/util/time-utils.c
@@ -6,6 +6,7 @@
 #include <time.h>
 #include <errno.h>
 #include <inttypes.h>
+#include <math.h>
 
 #include "perf.h"
 #include "debug.h"
@@ -60,11 +61,10 @@ static int parse_timestr_sec_nsec(struct perf_time_interval *ptime,
 	return 0;
 }
 
-int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+static int split_start_end(char **start, char **end, const char *ostr, char ch)
 {
 	char *start_str, *end_str;
 	char *d, *str;
-	int rc = 0;
 
 	if (ostr == NULL || *ostr == '\0')
 		return 0;
@@ -74,25 +74,35 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
 	if (str == NULL)
 		return -ENOMEM;
 
-	ptime->start = 0;
-	ptime->end = 0;
-
-	/* str has the format: <start>,<stop>
-	 * variations: <start>,
-	 *             ,<stop>
-	 *             ,
-	 */
 	start_str = str;
-	d = strchr(start_str, ',');
+	d = strchr(start_str, ch);
 	if (d) {
 		*d = '\0';
 		++d;
 	}
 	end_str = d;
 
+	*start = start_str;
+	*end = end_str;
+
+	return 0;
+}
+
+int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
+{
+	char *start_str = NULL, *end_str;
+	int rc;
+
+	rc = split_start_end(&start_str, &end_str, ostr, ',');
+	if (rc || !start_str)
+		return rc;
+
+	ptime->start = 0;
+	ptime->end = 0;
+
 	rc = parse_timestr_sec_nsec(ptime, start_str, end_str);
 
-	free(str);
+	free(start_str);
 
 	/* make sure end time is after start time if it was given */
 	if (rc == 0 && ptime->end && ptime->end < ptime->start)
@@ -104,6 +114,245 @@ int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr)
 	return rc;
 }
 
+static int parse_percent(double *pcnt, char *str)
+{
+	char *c, *endptr;
+	double d;
+
+	c = strchr(str, '%');
+	if (c)
+		*c = '\0';
+	else
+		return -1;
+
+	d = strtod(str, &endptr);
+	if (endptr != str + strlen(str))
+		return -1;
+
+	*pcnt = d / 100.0;
+	return 0;
+}
+
+static int percent_slash_split(char *str, struct perf_time_interval *ptime,
+			       u64 start, u64 end)
+{
+	char *p, *end_str;
+	double pcnt, start_pcnt, end_pcnt;
+	u64 total = end - start;
+	int i;
+
+	/*
+	 * Example:
+	 * 10%/2: select the second 10% slice and the third 10% slice
+	 */
+
+	/* We can modify this string since the original one is copied */
+	p = strchr(str, '/');
+	if (!p)
+		return -1;
+
+	*p = '\0';
+	if (parse_percent(&pcnt, str) < 0)
+		return -1;
+
+	p++;
+	i = (int)strtol(p, &end_str, 10);
+	if (*end_str)
+		return -1;
+
+	if (pcnt <= 0.0)
+		return -1;
+
+	start_pcnt = pcnt * (i - 1);
+	end_pcnt = pcnt * i;
+
+	if (start_pcnt < 0.0 || start_pcnt > 1.0 ||
+	    end_pcnt < 0.0 || end_pcnt > 1.0) {
+		return -1;
+	}
+
+	ptime->start = start + round(start_pcnt * total);
+	ptime->end = start + round(end_pcnt * total);
+
+	return 0;
+}
+
+static int percent_dash_split(char *str, struct perf_time_interval *ptime,
+			      u64 start, u64 end)
+{
+	char *start_str = NULL, *end_str;
+	double start_pcnt, end_pcnt;
+	u64 total = end - start;
+	int ret;
+
+	/*
+	 * Example: 0%-10%
+	 */
+
+	ret = split_start_end(&start_str, &end_str, str, '-');
+	if (ret || !start_str)
+		return ret;
+
+	if ((parse_percent(&start_pcnt, start_str) != 0) ||
+	    (parse_percent(&end_pcnt, end_str) != 0)) {
+		free(start_str);
+		return -1;
+	}
+
+	free(start_str);
+
+	if (start_pcnt < 0.0 || start_pcnt > 1.0 ||
+	    end_pcnt < 0.0 || end_pcnt > 1.0 ||
+	    start_pcnt > end_pcnt) {
+		return -1;
+	}
+
+	ptime->start = start + round(start_pcnt * total);
+	ptime->end = start + round(end_pcnt * total);
+
+	return 0;
+}
+
+typedef int (*time_pecent_split)(char *, struct perf_time_interval *,
+				 u64 start, u64 end);
+
+static int percent_comma_split(struct perf_time_interval *ptime_buf, int num,
+			       const char *ostr, u64 start, u64 end,
+			       time_pecent_split func)
+{
+	char *str, *p1, *p2;
+	int len, ret, i = 0;
+
+	str = strdup(ostr);
+	if (str == NULL)
+		return -ENOMEM;
+
+	len = strlen(str);
+	p1 = str;
+
+	while (p1 < str + len) {
+		if (i >= num) {
+			free(str);
+			return -1;
+		}
+
+		p2 = strchr(p1, ',');
+		if (p2)
+			*p2 = '\0';
+
+		ret = (func)(p1, &ptime_buf[i], start, end);
+		if (ret < 0) {
+			free(str);
+			return -1;
+		}
+
+		pr_debug("start time %d: %" PRIu64 ", ", i, ptime_buf[i].start);
+		pr_debug("end time %d: %" PRIu64 "\n", i, ptime_buf[i].end);
+
+		i++;
+
+		if (p2)
+			p1 = p2 + 1;
+		else
+			break;
+	}
+
+	free(str);
+	return i;
+}
+
+static int one_percent_convert(struct perf_time_interval *ptime_buf,
+			       const char *ostr, u64 start, u64 end, char *c)
+{
+	char *str;
+	int len = strlen(ostr), ret;
+
+	/*
+	 * c points to '%'.
+	 * '%' should be the last character
+	 */
+	if (ostr + len - 1 != c)
+		return -1;
+
+	/*
+	 * Construct a string like "xx%/1"
+	 */
+	str = malloc(len + 3);
+	if (str == NULL)
+		return -ENOMEM;
+
+	memcpy(str, ostr, len);
+	strcpy(str + len, "/1");
+
+	ret = percent_slash_split(str, ptime_buf, start, end);
+	if (ret == 0)
+		ret = 1;
+
+	free(str);
+	return ret;
+}
+
+int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
+				 const char *ostr, u64 start, u64 end)
+{
+	char *c;
+
+	/*
+	 * ostr example:
+	 * 10%/2,10%/3: select the second 10% slice and the third 10% slice
+	 * 0%-10%,30%-40%: multiple time range
+	 * 50%: just one percent
+	 */
+
+	memset(ptime_buf, 0, sizeof(*ptime_buf) * num);
+
+	c = strchr(ostr, '/');
+	if (c) {
+		return percent_comma_split(ptime_buf, num, ostr, start,
+					   end, percent_slash_split);
+	}
+
+	c = strchr(ostr, '-');
+	if (c) {
+		return percent_comma_split(ptime_buf, num, ostr, start,
+					   end, percent_dash_split);
+	}
+
+	c = strchr(ostr, '%');
+	if (c)
+		return one_percent_convert(ptime_buf, ostr, start, end, c);
+
+	return -1;
+}
+
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size)
+{
+	const char *p1, *p2;
+	int i = 1;
+	struct perf_time_interval *ptime;
+
+	/*
+	 * At least allocate one time range.
+	 */
+	if (!ostr)
+		goto alloc;
+
+	p1 = ostr;
+	while (p1 < ostr + strlen(ostr)) {
+		p2 = strchr(p1, ',');
+		if (!p2)
+			break;
+
+		p1 = p2 + 1;
+		i++;
+	}
+
+alloc:
+	*size = i;
+	ptime = calloc(i, sizeof(*ptime));
+	return ptime;
+}
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 {
 	/* if time is not set don't drop sample */
@@ -119,6 +368,34 @@ bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp)
 	return false;
 }
 
+bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
+				   int num, u64 timestamp)
+{
+	struct perf_time_interval *ptime;
+	int i;
+
+	if ((timestamp == 0) || (num == 0))
+		return false;
+
+	if (num == 1)
+		return perf_time__skip_sample(&ptime_buf[0], timestamp);
+
+	/*
+	 * start/end of multiple time ranges must be valid.
+	 */
+	for (i = 0; i < num; i++) {
+		ptime = &ptime_buf[i];
+
+		if (timestamp >= ptime->start &&
+		    ((timestamp < ptime->end && i < num - 1) ||
+		     (timestamp <= ptime->end && i == num - 1))) {
+			break;
+		}
+	}
+
+	return (i == num) ? true : false;
+}
+
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz)
 {
 	u64  sec = timestamp / NSEC_PER_SEC;
diff --git a/tools/perf/util/time-utils.h b/tools/perf/util/time-utils.h
index 15b475c..70b177d 100644
--- a/tools/perf/util/time-utils.h
+++ b/tools/perf/util/time-utils.h
@@ -13,8 +13,16 @@ int parse_nsec_time(const char *str, u64 *ptime);
 
 int perf_time__parse_str(struct perf_time_interval *ptime, const char *ostr);
 
+int perf_time__percent_parse_str(struct perf_time_interval *ptime_buf, int num,
+				 const char *ostr, u64 start, u64 end);
+
+struct perf_time_interval *perf_time__range_alloc(const char *ostr, int *size);
+
 bool perf_time__skip_sample(struct perf_time_interval *ptime, u64 timestamp);
 
+bool perf_time__ranges_skip_sample(struct perf_time_interval *ptime_buf,
+				   int num, u64 timestamp);
+
 int timestamp__scnprintf_usec(u64 timestamp, char *buf, size_t sz);
 
 int fetch_current_timestamp(char *buf, size_t sz);
diff --git a/tools/perf/util/tool.h b/tools/perf/util/tool.h
index 2532b55..183c914 100644
--- a/tools/perf/util/tool.h
+++ b/tools/perf/util/tool.h
@@ -76,6 +76,7 @@ struct perf_tool {
 	bool		ordered_events;
 	bool		ordering_requires_timestamps;
 	bool		namespace_events;
+	bool		no_warn;
 	enum show_feature_header show_feat_hdr;
 };
 
diff --git a/tools/perf/util/unwind-libunwind-local.c b/tools/perf/util/unwind-libunwind-local.c
index 7a42f70..af87304 100644
--- a/tools/perf/util/unwind-libunwind-local.c
+++ b/tools/perf/util/unwind-libunwind-local.c
@@ -631,9 +631,8 @@ static unw_accessors_t accessors = {
 
 static int _unwind__prepare_access(struct thread *thread)
 {
-	if (callchain_param.record_mode != CALLCHAIN_DWARF)
+	if (!dwarf_callchain_users)
 		return 0;
-
 	thread->addr_space = unw_create_addr_space(&accessors, 0);
 	if (!thread->addr_space) {
 		pr_err("unwind: Can't create unwind address space.\n");
@@ -646,17 +645,15 @@ static int _unwind__prepare_access(struct thread *thread)
 
 static void _unwind__flush_access(struct thread *thread)
 {
-	if (callchain_param.record_mode != CALLCHAIN_DWARF)
+	if (!dwarf_callchain_users)
 		return;
-
 	unw_flush_cache(thread->addr_space, 0, 0);
 }
 
 static void _unwind__finish_access(struct thread *thread)
 {
-	if (callchain_param.record_mode != CALLCHAIN_DWARF)
+	if (!dwarf_callchain_users)
 		return;
-
 	unw_destroy_addr_space(thread->addr_space);
 }
 
diff --git a/tools/perf/util/unwind-libunwind.c b/tools/perf/util/unwind-libunwind.c
index 647a1e6..b029a5e 100644
--- a/tools/perf/util/unwind-libunwind.c
+++ b/tools/perf/util/unwind-libunwind.c
@@ -3,7 +3,7 @@
 #include "thread.h"
 #include "session.h"
 #include "debug.h"
-#include "arch/common.h"
+#include "env.h"
 
 struct unwind_libunwind_ops __weak *local_unwind_libunwind_ops;
 struct unwind_libunwind_ops __weak *x86_32_unwind_libunwind_ops;
@@ -39,7 +39,7 @@ int unwind__prepare_access(struct thread *thread, struct map *map,
 	if (dso_type == DSO__TYPE_UNKNOWN)
 		return 0;
 
-	arch = normalize_arch(thread->mg->machine->env->arch);
+	arch = perf_env__arch(thread->mg->machine->env);
 
 	if (!strcmp(arch, "x86")) {
 		if (dso_type != DSO__TYPE_64BIT)
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index a789f95..443892d 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -210,7 +210,7 @@ static int copyfile_offset(int ifd, loff_t off_in, int ofd, loff_t off_out, u64
 
 		size -= ret;
 		off_in += ret;
-		off_out -= ret;
+		off_out += ret;
 	}
 	munmap(ptr, off_in + size);
 
diff --git a/tools/perf/util/util.h b/tools/perf/util/util.h
index 0143450..9496365 100644
--- a/tools/perf/util/util.h
+++ b/tools/perf/util/util.h
@@ -68,4 +68,14 @@ extern bool perf_singlethreaded;
 void perf_set_singlethreaded(void);
 void perf_set_multithreaded(void);
 
+#ifndef O_CLOEXEC
+#ifdef __sparc__
+#define O_CLOEXEC      0x400000
+#elif defined(__alpha__) || defined(__hppa__)
+#define O_CLOEXEC      010000000
+#else
+#define O_CLOEXEC      02000000
+#endif
+#endif
+
 #endif /* GIT_COMPAT_UTIL_H */
diff --git a/tools/power/acpi/tools/acpidump/apmain.c b/tools/power/acpi/tools/acpidump/apmain.c
index 22c3b4e..be418fb 100644
--- a/tools/power/acpi/tools/acpidump/apmain.c
+++ b/tools/power/acpi/tools/acpidump/apmain.c
@@ -79,7 +79,7 @@ struct ap_dump_action action_table[AP_MAX_ACTIONS];
 u32 current_action = 0;
 
 #define AP_UTILITY_NAME             "ACPI Binary Table Dump Utility"
-#define AP_SUPPORTED_OPTIONS        "?a:bc:f:hn:o:r:svxz"
+#define AP_SUPPORTED_OPTIONS        "?a:bc:f:hn:o:r:sv^xz"
 
 /******************************************************************************
  *
@@ -100,6 +100,7 @@ static void ap_display_usage(void)
 	ACPI_OPTION("-r <Address>", "Dump tables from specified RSDP");
 	ACPI_OPTION("-s", "Print table summaries only");
 	ACPI_OPTION("-v", "Display version information");
+	ACPI_OPTION("-vd", "Display build date and time");
 	ACPI_OPTION("-z", "Verbose mode");
 
 	ACPI_USAGE_TEXT("\nTable Options:\n");
@@ -231,10 +232,29 @@ static int ap_do_options(int argc, char **argv)
 			}
 			continue;
 
-		case 'v':	/* Revision/version */
+		case 'v':	/* -v: (Version): signon already emitted, just exit */
 
-			acpi_os_printf(ACPI_COMMON_SIGNON(AP_UTILITY_NAME));
-			return (1);
+			switch (acpi_gbl_optarg[0]) {
+			case '^':	/* -v: (Version) */
+
+				fprintf(stderr,
+					ACPI_COMMON_SIGNON(AP_UTILITY_NAME));
+				return (1);
+
+			case 'd':
+
+				fprintf(stderr,
+					ACPI_COMMON_SIGNON(AP_UTILITY_NAME));
+				printf(ACPI_COMMON_BUILD_TIME);
+				return (1);
+
+			default:
+
+				printf("Unknown option: -v%s\n",
+				       acpi_gbl_optarg);
+				return (-1);
+			}
+			break;
 
 		case 'z':	/* Verbose mode */
 
diff --git a/tools/power/cpupower/lib/cpufreq.h b/tools/power/cpupower/lib/cpufreq.h
index 3b005c3..60beaf5 100644
--- a/tools/power/cpupower/lib/cpufreq.h
+++ b/tools/power/cpupower/lib/cpufreq.h
@@ -11,10 +11,6 @@
  *  but WITHOUT ANY WARRANTY; without even the implied warranty of
  *  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
  *  GNU General Public License for more details.
- *
- *  You should have received a copy of the GNU General Public License
- *  along with this program; if not, write to the Free Software
- *  Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
  */
 
 #ifndef __CPUPOWER_CPUFREQ_H__
diff --git a/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py b/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py
index 0b24dd9..29f50d4 100755
--- a/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py
+++ b/tools/power/x86/intel_pstate_tracer/intel_pstate_tracer.py
@@ -411,6 +411,16 @@
         print('IO error setting trace buffer size ')
         quit()
 
+def free_trace_buffer():
+    """ Free the trace buffer memory """
+
+    try:
+       open('/sys/kernel/debug/tracing/buffer_size_kb'
+                 , 'w').write("1")
+    except:
+        print('IO error setting trace buffer size ')
+        quit()
+
 def read_trace_data(filename):
     """ Read and parse trace data """
 
@@ -583,4 +593,9 @@
     for f in files:
         fix_ownership(f)
 
+clear_trace_file()
+# Free the memory
+if interval:
+    free_trace_buffer()
+
 os.chdir('../../')
diff --git a/tools/testing/selftests/bpf/test_align.c b/tools/testing/selftests/bpf/test_align.c
index 8591c89..471bbbd 100644
--- a/tools/testing/selftests/bpf/test_align.c
+++ b/tools/testing/selftests/bpf/test_align.c
@@ -474,27 +474,7 @@ static struct bpf_align_test tests[] = {
 		.result = REJECT,
 		.matches = {
 			{4, "R5=pkt(id=0,off=0,r=0,imm=0)"},
-			/* ptr & 0x40 == either 0 or 0x40 */
-			{5, "R5=inv(id=0,umax_value=64,var_off=(0x0; 0x40))"},
-			/* ptr << 2 == unknown, (4n) */
-			{7, "R5=inv(id=0,smax_value=9223372036854775804,umax_value=18446744073709551612,var_off=(0x0; 0xfffffffffffffffc))"},
-			/* (4n) + 14 == (4n+2).  We blow our bounds, because
-			 * the add could overflow.
-			 */
-			{8, "R5=inv(id=0,var_off=(0x2; 0xfffffffffffffffc))"},
-			/* Checked s>=0 */
-			{10, "R5=inv(id=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc))"},
-			/* packet pointer + nonnegative (4n+2) */
-			{12, "R6=pkt(id=1,off=0,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc))"},
-			{14, "R4=pkt(id=1,off=4,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc))"},
-			/* NET_IP_ALIGN + (4n+2) == (4n), alignment is fine.
-			 * We checked the bounds, but it might have been able
-			 * to overflow if the packet pointer started in the
-			 * upper half of the address space.
-			 * So we did not get a 'range' on R6, and the access
-			 * attempt will fail.
-			 */
-			{16, "R6=pkt(id=1,off=0,r=0,umin_value=2,umax_value=9223372036854775806,var_off=(0x2; 0x7ffffffffffffffc))"},
+			/* R5 bitwise operator &= on pointer prohibited */
 		}
 	},
 	{
diff --git a/tools/testing/selftests/bpf/test_verifier.c b/tools/testing/selftests/bpf/test_verifier.c
index b510174..5ed4175 100644
--- a/tools/testing/selftests/bpf/test_verifier.c
+++ b/tools/testing/selftests/bpf/test_verifier.c
@@ -273,6 +273,46 @@ static struct bpf_test tests[] = {
 		.result = REJECT,
 	},
 	{
+		"arsh32 on imm",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 1),
+			BPF_ALU32_IMM(BPF_ARSH, BPF_REG_0, 5),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "BPF_ARSH not supported for 32 bit ALU",
+	},
+	{
+		"arsh32 on reg",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 1),
+			BPF_MOV64_IMM(BPF_REG_1, 5),
+			BPF_ALU32_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "BPF_ARSH not supported for 32 bit ALU",
+	},
+	{
+		"arsh64 on imm",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 1),
+			BPF_ALU64_IMM(BPF_ARSH, BPF_REG_0, 5),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+	},
+	{
+		"arsh64 on reg",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 1),
+			BPF_MOV64_IMM(BPF_REG_1, 5),
+			BPF_ALU64_REG(BPF_ARSH, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+	},
+	{
 		"no bpf_exit",
 		.insns = {
 			BPF_ALU64_REG(BPF_MOV, BPF_REG_0, BPF_REG_2),
@@ -2553,6 +2593,29 @@ static struct bpf_test tests[] = {
 		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
 	},
 	{
+		"context stores via ST",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_ST_MEM(BPF_DW, BPF_REG_1, offsetof(struct __sk_buff, mark), 0),
+			BPF_EXIT_INSN(),
+		},
+		.errstr = "BPF_ST stores into R1 context is not allowed",
+		.result = REJECT,
+		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	},
+	{
+		"context stores via XADD",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_RAW_INSN(BPF_STX | BPF_XADD | BPF_W, BPF_REG_1,
+				     BPF_REG_0, offsetof(struct __sk_buff, mark), 0),
+			BPF_EXIT_INSN(),
+		},
+		.errstr = "BPF_XADD stores into R1 context is not allowed",
+		.result = REJECT,
+		.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	},
+	{
 		"direct packet access: test1",
 		.insns = {
 			BPF_LDX_MEM(BPF_W, BPF_REG_2, BPF_REG_1,
@@ -4272,7 +4335,8 @@ static struct bpf_test tests[] = {
 		.fixup_map1 = { 2 },
 		.errstr_unpriv = "R2 leaks addr into mem",
 		.result_unpriv = REJECT,
-		.result = ACCEPT,
+		.result = REJECT,
+		.errstr = "BPF_XADD stores into R1 context is not allowed",
 	},
 	{
 		"leak pointer into ctx 2",
@@ -4286,7 +4350,8 @@ static struct bpf_test tests[] = {
 		},
 		.errstr_unpriv = "R10 leaks addr into mem",
 		.result_unpriv = REJECT,
-		.result = ACCEPT,
+		.result = REJECT,
+		.errstr = "BPF_XADD stores into R1 context is not allowed",
 	},
 	{
 		"leak pointer into ctx 3",
@@ -6667,7 +6732,7 @@ static struct bpf_test tests[] = {
 			BPF_JMP_IMM(BPF_JA, 0, 0, -7),
 		},
 		.fixup_map1 = { 4 },
-		.errstr = "unbounded min value",
+		.errstr = "R0 invalid mem access 'inv'",
 		.result = REJECT,
 	},
 	{
@@ -8569,6 +8634,127 @@ static struct bpf_test tests[] = {
 		.flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
 	},
 	{
+		"check deducing bounds from const, 1",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 1),
+			BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 1, 0),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "R0 tried to subtract pointer from scalar",
+	},
+	{
+		"check deducing bounds from const, 2",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 1),
+			BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 1, 1),
+			BPF_EXIT_INSN(),
+			BPF_JMP_IMM(BPF_JSLE, BPF_REG_0, 1, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_0),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+	},
+	{
+		"check deducing bounds from const, 3",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JSLE, BPF_REG_0, 0, 0),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "R0 tried to subtract pointer from scalar",
+	},
+	{
+		"check deducing bounds from const, 4",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JSLE, BPF_REG_0, 0, 1),
+			BPF_EXIT_INSN(),
+			BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 0, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_0),
+			BPF_EXIT_INSN(),
+		},
+		.result = ACCEPT,
+	},
+	{
+		"check deducing bounds from const, 5",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 0, 1),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "R0 tried to subtract pointer from scalar",
+	},
+	{
+		"check deducing bounds from const, 6",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 0, 1),
+			BPF_EXIT_INSN(),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "R0 tried to subtract pointer from scalar",
+	},
+	{
+		"check deducing bounds from const, 7",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, ~0),
+			BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 0, 0),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_1, BPF_REG_0),
+			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
+				    offsetof(struct __sk_buff, mark)),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "dereference of modified ctx ptr",
+	},
+	{
+		"check deducing bounds from const, 8",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, ~0),
+			BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 0, 1),
+			BPF_ALU64_REG(BPF_ADD, BPF_REG_1, BPF_REG_0),
+			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1,
+				    offsetof(struct __sk_buff, mark)),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "dereference of modified ctx ptr",
+	},
+	{
+		"check deducing bounds from const, 9",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JSGE, BPF_REG_0, 0, 0),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "R0 tried to subtract pointer from scalar",
+	},
+	{
+		"check deducing bounds from const, 10",
+		.insns = {
+			BPF_MOV64_IMM(BPF_REG_0, 0),
+			BPF_JMP_IMM(BPF_JSLE, BPF_REG_0, 0, 0),
+			/* Marks reg as unknown. */
+			BPF_ALU64_IMM(BPF_NEG, BPF_REG_0, 0),
+			BPF_ALU64_REG(BPF_SUB, BPF_REG_0, BPF_REG_1),
+			BPF_EXIT_INSN(),
+		},
+		.result = REJECT,
+		.errstr = "math between ctx pointer and register with unbounded min value is not allowed",
+	},
+	{
 		"bpf_exit with invalid return code. test1",
 		.insns = {
 			BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_1, 0),
diff --git a/tools/testing/selftests/ptp/testptp.c b/tools/testing/selftests/ptp/testptp.c
index 5d2eae1..a5d8f0a 100644
--- a/tools/testing/selftests/ptp/testptp.c
+++ b/tools/testing/selftests/ptp/testptp.c
@@ -60,9 +60,7 @@ static int clock_adjtime(clockid_t id, struct timex *tx)
 static clockid_t get_clockid(int fd)
 {
 #define CLOCKFD 3
-#define FD_TO_CLOCKID(fd)	((~(clockid_t) (fd) << 3) | CLOCKFD)
-
-	return FD_TO_CLOCKID(fd);
+	return (((unsigned int) ~fd) << 3) | CLOCKFD;
 }
 
 static void handle_alarm(int s)
diff --git a/tools/testing/selftests/rcutorture/bin/config2frag.sh b/tools/testing/selftests/rcutorture/bin/config2frag.sh
deleted file mode 100755
index 56f51ae..0000000
--- a/tools/testing/selftests/rcutorture/bin/config2frag.sh
+++ /dev/null
@@ -1,25 +0,0 @@
-#!/bin/bash
-# Usage: config2frag.sh < .config > configfrag
-#
-# Converts the "# CONFIG_XXX is not set" to "CONFIG_XXX=n" so that the
-# resulting file becomes a legitimate Kconfig fragment.
-#
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 2 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program; if not, you can access it online at
-# http://www.gnu.org/licenses/gpl-2.0.html.
-#
-# Copyright (C) IBM Corporation, 2013
-#
-# Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
-
-LANG=C sed -e 's/^# CONFIG_\([a-zA-Z0-9_]*\) is not set$/CONFIG_\1=n/'
diff --git a/tools/testing/selftests/rcutorture/bin/configinit.sh b/tools/testing/selftests/rcutorture/bin/configinit.sh
index 51f66a7..c15f270 100755
--- a/tools/testing/selftests/rcutorture/bin/configinit.sh
+++ b/tools/testing/selftests/rcutorture/bin/configinit.sh
@@ -51,7 +51,7 @@
 			mkdir $builddir
 		fi
 	else
-		echo Bad build directory: \"$builddir\"
+		echo Bad build directory: \"$buildloc\"
 		exit 2
 	fi
 fi
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-build.sh b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
index fb66d01..34d1267 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-build.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-build.sh
@@ -29,11 +29,6 @@
 	exit 1
 fi
 builddir=${2}
-if test -z "$builddir" -o ! -d "$builddir" -o ! -w "$builddir"
-then
-	echo "kvm-build.sh :$builddir: Not a writable directory, cannot build into it"
-	exit 1
-fi
 
 T=${TMPDIR-/tmp}/test-linux.sh.$$
 trap 'rm -rf $T' 0
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
index 43f7640..2de92f4 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-lock.sh
@@ -23,7 +23,7 @@
 # Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
 
 i="$1"
-if test -d $i
+if test -d "$i" -a -r "$i"
 then
 	:
 else
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
index 559e01a..c2e1bb6 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcu.sh
@@ -23,14 +23,14 @@
 # Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
 
 i="$1"
-if test -d $i
+if test -d "$i" -a -r "$i"
 then
 	:
 else
 	echo Unreadable results directory: $i
 	exit 1
 fi
-. tools/testing/selftests/rcutorture/bin/functions.sh
+. functions.sh
 
 configfile=`echo $i | sed -e 's/^.*\///'`
 ngps=`grep ver: $i/console.log 2> /dev/null | tail -1 | sed -e 's/^.* ver: //' -e 's/ .*$//'`
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh
index f79b0e9..963f712 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf-ftrace.sh
@@ -26,7 +26,7 @@
 # Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
 
 i="$1"
-. tools/testing/selftests/rcutorture/bin/functions.sh
+. functions.sh
 
 if test "`grep -c 'rcu_exp_grace_period.*start' < $i/console.log`" -lt 100
 then
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
index 8f3121a..ccebf77 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck-rcuperf.sh
@@ -23,7 +23,7 @@
 # Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
 
 i="$1"
-if test -d $i
+if test -d "$i" -a -r "$i"
 then
 	:
 else
@@ -31,7 +31,7 @@
 	exit 1
 fi
 PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH
-. tools/testing/selftests/rcutorture/bin/functions.sh
+. functions.sh
 
 if kvm-recheck-rcuperf-ftrace.sh $i
 then
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh b/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
index f659346..f7e988f 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-recheck.sh
@@ -25,7 +25,7 @@
 # Authors: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
 
 PATH=`pwd`/tools/testing/selftests/rcutorture/bin:$PATH; export PATH
-. tools/testing/selftests/rcutorture/bin/functions.sh
+. functions.sh
 for rd in "$@"
 do
 	firsttime=1
diff --git a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
index ab14b97..1b78a12 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm-test-1-run.sh
@@ -42,7 +42,7 @@
 trap 'rm -rf $T' 0
 mkdir $T
 
-. $KVM/bin/functions.sh
+. functions.sh
 . $CONFIGFRAG/ver_functions.sh
 
 config_template=${1}
@@ -154,9 +154,7 @@
 vcpus=`identify_qemu_vcpus`
 if test $cpu_count -gt $vcpus
 then
-	echo CPU count limited from $cpu_count to $vcpus
-	touch $resdir/Warnings
-	echo CPU count limited from $cpu_count to $vcpus >> $resdir/Warnings
+	echo CPU count limited from $cpu_count to $vcpus | tee -a $resdir/Warnings
 	cpu_count=$vcpus
 fi
 qemu_args="`specify_qemu_cpus "$QEMU" "$qemu_args" "$cpu_count"`"
diff --git a/tools/testing/selftests/rcutorture/bin/kvm.sh b/tools/testing/selftests/rcutorture/bin/kvm.sh
index ccd49e9..7d1f607 100755
--- a/tools/testing/selftests/rcutorture/bin/kvm.sh
+++ b/tools/testing/selftests/rcutorture/bin/kvm.sh
@@ -1,8 +1,7 @@
 #!/bin/bash
 #
 # Run a series of 14 tests under KVM.  These are not particularly
-# well-selected or well-tuned, but are the current set.  Run from the
-# top level of the source tree.
+# well-selected or well-tuned, but are the current set.
 #
 # Edit the definitions below to set the locations of the various directories,
 # as well as the test duration.
@@ -34,6 +33,8 @@
 trap 'rm -rf $T' 0
 mkdir $T
 
+cd `dirname $scriptname`/../../../../../
+
 dur=$((30*60))
 dryrun=""
 KVM="`pwd`/tools/testing/selftests/rcutorture"; export KVM
@@ -70,7 +71,7 @@
 	echo "       --kmake-arg kernel-make-arguments"
 	echo "       --mac nn:nn:nn:nn:nn:nn"
 	echo "       --no-initrd"
-	echo "       --qemu-args qemu-system-..."
+	echo "       --qemu-args qemu-arguments"
 	echo "       --qemu-cmd qemu-system-..."
 	echo "       --results absolute-pathname"
 	echo "       --torture rcu"
@@ -150,7 +151,7 @@
 		TORTURE_INITRD=""; export TORTURE_INITRD
 		;;
 	--qemu-args|--qemu-arg)
-		checkarg --qemu-args "-qemu args" $# "$2" '^-' '^error'
+		checkarg --qemu-args "(qemu arguments)" $# "$2" '^-' '^error'
 		TORTURE_QEMU_ARG="$2"
 		shift
 		;;
@@ -238,7 +239,6 @@
 }
 
 END {
-	alldone = 0;
 	batch = 0;
 	nc = -1;
 
@@ -331,8 +331,7 @@
 # Dump out the scripting required to run one test batch.
 function dump(first, pastlast, batchnum)
 {
-	print "echo ----Start batch " batchnum ": `date`";
-	print "echo ----Start batch " batchnum ": `date` >> " rd "/log";
+	print "echo ----Start batch " batchnum ": `date` | tee -a " rd "log";
 	print "needqemurun="
 	jn=1
 	for (j = first; j < pastlast; j++) {
@@ -349,21 +348,18 @@
 			ovf = "-ovf";
 		else
 			ovf = "";
-		print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date`";
-		print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` >> " rd "/log";
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Starting build. `date` | tee -a " rd "log";
 		print "rm -f " builddir ".*";
 		print "touch " builddir ".wait";
 		print "mkdir " builddir " > /dev/null 2>&1 || :";
 		print "mkdir " rd cfr[jn] " || :";
 		print "kvm-test-1-run.sh " CONFIGDIR cf[j], builddir, rd cfr[jn], dur " \"" TORTURE_QEMU_ARG "\" \"" TORTURE_BOOTARGS "\" > " rd cfr[jn]  "/kvm-test-1-run.sh.out 2>&1 &"
-		print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date`";
-		print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` >> " rd "/log";
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Waiting for build to complete. `date` | tee -a " rd "log";
 		print "while test -f " builddir ".wait"
 		print "do"
 		print "\tsleep 1"
 		print "done"
-		print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date`";
-		print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date` >> " rd "/log";
+		print "echo ", cfr[jn], cpusr[jn] ovf ": Build complete. `date` | tee -a " rd "log";
 		jn++;
 	}
 	for (j = 1; j < jn; j++) {
@@ -371,8 +367,7 @@
 		print "rm -f " builddir ".ready"
 		print "if test -f \"" rd cfr[j] "/builtkernel\""
 		print "then"
-		print "\techo ----", cfr[j], cpusr[j] ovf ": Kernel present. `date`";
-		print "\techo ----", cfr[j], cpusr[j] ovf ": Kernel present. `date` >> " rd "/log";
+		print "\techo ----", cfr[j], cpusr[j] ovf ": Kernel present. `date` | tee -a " rd "log";
 		print "\tneedqemurun=1"
 		print "fi"
 	}
@@ -386,31 +381,26 @@
 		njitter = ja[1];
 	if (TORTURE_BUILDONLY && njitter != 0) {
 		njitter = 0;
-		print "echo Build-only run, so suppressing jitter >> " rd "/log"
+		print "echo Build-only run, so suppressing jitter | tee -a " rd "log"
 	}
 	if (TORTURE_BUILDONLY) {
 		print "needqemurun="
 	}
 	print "if test -n \"$needqemurun\""
 	print "then"
-	print "\techo ---- Starting kernels. `date`";
-	print "\techo ---- Starting kernels. `date` >> " rd "/log";
+	print "\techo ---- Starting kernels. `date` | tee -a " rd "log";
 	for (j = 0; j < njitter; j++)
 		print "\tjitter.sh " j " " dur " " ja[2] " " ja[3] "&"
 	print "\twait"
-	print "\techo ---- All kernel runs complete. `date`";
-	print "\techo ---- All kernel runs complete. `date` >> " rd "/log";
+	print "\techo ---- All kernel runs complete. `date` | tee -a " rd "log";
 	print "else"
 	print "\twait"
-	print "\techo ---- No kernel runs. `date`";
-	print "\techo ---- No kernel runs. `date` >> " rd "/log";
+	print "\techo ---- No kernel runs. `date` | tee -a " rd "log";
 	print "fi"
 	for (j = 1; j < jn; j++) {
 		builddir=KVM "/b" j
-		print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results:";
-		print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results: >> " rd "/log";
-		print "cat " rd cfr[j]  "/kvm-test-1-run.sh.out";
-		print "cat " rd cfr[j]  "/kvm-test-1-run.sh.out >> " rd "/log";
+		print "echo ----", cfr[j], cpusr[j] ovf ": Build/run results: | tee -a " rd "log";
+		print "cat " rd cfr[j]  "/kvm-test-1-run.sh.out | tee -a " rd "log";
 	}
 }
 
diff --git a/tools/testing/selftests/rcutorture/bin/parse-torture.sh b/tools/testing/selftests/rcutorture/bin/parse-torture.sh
index f12c389..5987e50 100755
--- a/tools/testing/selftests/rcutorture/bin/parse-torture.sh
+++ b/tools/testing/selftests/rcutorture/bin/parse-torture.sh
@@ -55,7 +55,7 @@
 	exit
 fi
 
-grep --binary-files=text 'torture:.*ver:' $file | grep --binary-files=text -v '(null)' | sed -e 's/^(initramfs)[^]]*] //' -e 's/^\[[^]]*] //' |
+grep --binary-files=text 'torture:.*ver:' $file | egrep --binary-files=text -v '\(null\)|rtc: 000000000* ' | sed -e 's/^(initramfs)[^]]*] //' -e 's/^\[[^]]*] //' |
 awk '
 BEGIN	{
 	ver = 0;
diff --git a/tools/testing/selftests/rcutorture/configs/lock/ver_functions.sh b/tools/testing/selftests/rcutorture/configs/lock/ver_functions.sh
index 252aae6..80eb646 100644
--- a/tools/testing/selftests/rcutorture/configs/lock/ver_functions.sh
+++ b/tools/testing/selftests/rcutorture/configs/lock/ver_functions.sh
@@ -38,6 +38,5 @@
 	echo $1 `locktorture_param_onoff "$1" "$2"` \
 		locktorture.stat_interval=15 \
 		locktorture.shutdown_secs=$3 \
-		locktorture.torture_runnable=1 \
 		locktorture.verbose=1
 }
diff --git a/tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh b/tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh
index ffb85ed..24ec910 100644
--- a/tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh
+++ b/tools/testing/selftests/rcutorture/configs/rcu/ver_functions.sh
@@ -51,7 +51,6 @@
 		`rcutorture_param_n_barrier_cbs "$1"` \
 		rcutorture.stat_interval=15 \
 		rcutorture.shutdown_secs=$3 \
-		rcutorture.torture_runnable=1 \
 		rcutorture.test_no_idle_hz=1 \
 		rcutorture.verbose=1
 }
diff --git a/tools/testing/selftests/rcutorture/configs/rcuperf/ver_functions.sh b/tools/testing/selftests/rcutorture/configs/rcuperf/ver_functions.sh
index 34f2a1b..b960311 100644
--- a/tools/testing/selftests/rcutorture/configs/rcuperf/ver_functions.sh
+++ b/tools/testing/selftests/rcutorture/configs/rcuperf/ver_functions.sh
@@ -46,7 +46,6 @@
 per_version_boot_params () {
 	echo $1 `rcuperf_param_nreaders "$1"` \
 		`rcuperf_param_nwriters "$1"` \
-		rcuperf.perf_runnable=1 \
 		rcuperf.shutdown=1 \
 		rcuperf.verbose=1
 }
diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile
index 939a337..5d4f10a 100644
--- a/tools/testing/selftests/x86/Makefile
+++ b/tools/testing/selftests/x86/Makefile
@@ -7,7 +7,7 @@
 
 TARGETS_C_BOTHBITS := single_step_syscall sysret_ss_attrs syscall_nt ptrace_syscall test_mremap_vdso \
 			check_initial_reg_state sigreturn ldt_gdt iopl mpx-mini-test ioperm \
-			protection_keys test_vdso
+			protection_keys test_vdso test_vsyscall
 TARGETS_C_32BIT_ONLY := entry_from_vm86 syscall_arg_fault test_syscall_vdso unwind_vdso \
 			test_FCMOV test_FCOMI test_FISTTP \
 			vdso_restorer
diff --git a/tools/testing/selftests/x86/test_vsyscall.c b/tools/testing/selftests/x86/test_vsyscall.c
new file mode 100644
index 0000000..7a744fa
--- /dev/null
+++ b/tools/testing/selftests/x86/test_vsyscall.c
@@ -0,0 +1,500 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#define _GNU_SOURCE
+
+#include <stdio.h>
+#include <sys/time.h>
+#include <time.h>
+#include <stdlib.h>
+#include <sys/syscall.h>
+#include <unistd.h>
+#include <dlfcn.h>
+#include <string.h>
+#include <inttypes.h>
+#include <signal.h>
+#include <sys/ucontext.h>
+#include <errno.h>
+#include <err.h>
+#include <sched.h>
+#include <stdbool.h>
+#include <setjmp.h>
+
+#ifdef __x86_64__
+# define VSYS(x) (x)
+#else
+# define VSYS(x) 0
+#endif
+
+#ifndef SYS_getcpu
+# ifdef __x86_64__
+#  define SYS_getcpu 309
+# else
+#  define SYS_getcpu 318
+# endif
+#endif
+
+static void sethandler(int sig, void (*handler)(int, siginfo_t *, void *),
+		       int flags)
+{
+	struct sigaction sa;
+	memset(&sa, 0, sizeof(sa));
+	sa.sa_sigaction = handler;
+	sa.sa_flags = SA_SIGINFO | flags;
+	sigemptyset(&sa.sa_mask);
+	if (sigaction(sig, &sa, 0))
+		err(1, "sigaction");
+}
+
+/* vsyscalls and vDSO */
+bool should_read_vsyscall = false;
+
+typedef long (*gtod_t)(struct timeval *tv, struct timezone *tz);
+gtod_t vgtod = (gtod_t)VSYS(0xffffffffff600000);
+gtod_t vdso_gtod;
+
+typedef int (*vgettime_t)(clockid_t, struct timespec *);
+vgettime_t vdso_gettime;
+
+typedef long (*time_func_t)(time_t *t);
+time_func_t vtime = (time_func_t)VSYS(0xffffffffff600400);
+time_func_t vdso_time;
+
+typedef long (*getcpu_t)(unsigned *, unsigned *, void *);
+getcpu_t vgetcpu = (getcpu_t)VSYS(0xffffffffff600800);
+getcpu_t vdso_getcpu;
+
+static void init_vdso(void)
+{
+	void *vdso = dlopen("linux-vdso.so.1", RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD);
+	if (!vdso)
+		vdso = dlopen("linux-gate.so.1", RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD);
+	if (!vdso) {
+		printf("[WARN]\tfailed to find vDSO\n");
+		return;
+	}
+
+	vdso_gtod = (gtod_t)dlsym(vdso, "__vdso_gettimeofday");
+	if (!vdso_gtod)
+		printf("[WARN]\tfailed to find gettimeofday in vDSO\n");
+
+	vdso_gettime = (vgettime_t)dlsym(vdso, "__vdso_clock_gettime");
+	if (!vdso_gettime)
+		printf("[WARN]\tfailed to find clock_gettime in vDSO\n");
+
+	vdso_time = (time_func_t)dlsym(vdso, "__vdso_time");
+	if (!vdso_time)
+		printf("[WARN]\tfailed to find time in vDSO\n");
+
+	vdso_getcpu = (getcpu_t)dlsym(vdso, "__vdso_getcpu");
+	if (!vdso_getcpu) {
+		/* getcpu() was never wired up in the 32-bit vDSO. */
+		printf("[%s]\tfailed to find getcpu in vDSO\n",
+		       sizeof(long) == 8 ? "WARN" : "NOTE");
+	}
+}
+
+static int init_vsys(void)
+{
+#ifdef __x86_64__
+	int nerrs = 0;
+	FILE *maps;
+	char line[128];
+	bool found = false;
+
+	maps = fopen("/proc/self/maps", "r");
+	if (!maps) {
+		printf("[WARN]\tCould not open /proc/self/maps -- assuming vsyscall is r-x\n");
+		should_read_vsyscall = true;
+		return 0;
+	}
+
+	while (fgets(line, sizeof(line), maps)) {
+		char r, x;
+		void *start, *end;
+		char name[128];
+		if (sscanf(line, "%p-%p %c-%cp %*x %*x:%*x %*u %s",
+			   &start, &end, &r, &x, name) != 5)
+			continue;
+
+		if (strcmp(name, "[vsyscall]"))
+			continue;
+
+		printf("\tvsyscall map: %s", line);
+
+		if (start != (void *)0xffffffffff600000 ||
+		    end != (void *)0xffffffffff601000) {
+			printf("[FAIL]\taddress range is nonsense\n");
+			nerrs++;
+		}
+
+		printf("\tvsyscall permissions are %c-%c\n", r, x);
+		should_read_vsyscall = (r == 'r');
+		if (x != 'x') {
+			vgtod = NULL;
+			vtime = NULL;
+			vgetcpu = NULL;
+		}
+
+		found = true;
+		break;
+	}
+
+	fclose(maps);
+
+	if (!found) {
+		printf("\tno vsyscall map in /proc/self/maps\n");
+		should_read_vsyscall = false;
+		vgtod = NULL;
+		vtime = NULL;
+		vgetcpu = NULL;
+	}
+
+	return nerrs;
+#else
+	return 0;
+#endif
+}
+
+/* syscalls */
+static inline long sys_gtod(struct timeval *tv, struct timezone *tz)
+{
+	return syscall(SYS_gettimeofday, tv, tz);
+}
+
+static inline int sys_clock_gettime(clockid_t id, struct timespec *ts)
+{
+	return syscall(SYS_clock_gettime, id, ts);
+}
+
+static inline long sys_time(time_t *t)
+{
+	return syscall(SYS_time, t);
+}
+
+static inline long sys_getcpu(unsigned * cpu, unsigned * node,
+			      void* cache)
+{
+	return syscall(SYS_getcpu, cpu, node, cache);
+}
+
+static jmp_buf jmpbuf;
+
+static void sigsegv(int sig, siginfo_t *info, void *ctx_void)
+{
+	siglongjmp(jmpbuf, 1);
+}
+
+static double tv_diff(const struct timeval *a, const struct timeval *b)
+{
+	return (double)(a->tv_sec - b->tv_sec) +
+		(double)((int)a->tv_usec - (int)b->tv_usec) * 1e-6;
+}
+
+static int check_gtod(const struct timeval *tv_sys1,
+		      const struct timeval *tv_sys2,
+		      const struct timezone *tz_sys,
+		      const char *which,
+		      const struct timeval *tv_other,
+		      const struct timezone *tz_other)
+{
+	int nerrs = 0;
+	double d1, d2;
+
+	if (tz_other && (tz_sys->tz_minuteswest != tz_other->tz_minuteswest || tz_sys->tz_dsttime != tz_other->tz_dsttime)) {
+		printf("[FAIL] %s tz mismatch\n", which);
+		nerrs++;
+	}
+
+	d1 = tv_diff(tv_other, tv_sys1);
+	d2 = tv_diff(tv_sys2, tv_other); 
+	printf("\t%s time offsets: %lf %lf\n", which, d1, d2);
+
+	if (d1 < 0 || d2 < 0) {
+		printf("[FAIL]\t%s time was inconsistent with the syscall\n", which);
+		nerrs++;
+	} else {
+		printf("[OK]\t%s gettimeofday()'s timeval was okay\n", which);
+	}
+
+	return nerrs;
+}
+
+static int test_gtod(void)
+{
+	struct timeval tv_sys1, tv_sys2, tv_vdso, tv_vsys;
+	struct timezone tz_sys, tz_vdso, tz_vsys;
+	long ret_vdso = -1;
+	long ret_vsys = -1;
+	int nerrs = 0;
+
+	printf("[RUN]\ttest gettimeofday()\n");
+
+	if (sys_gtod(&tv_sys1, &tz_sys) != 0)
+		err(1, "syscall gettimeofday");
+	if (vdso_gtod)
+		ret_vdso = vdso_gtod(&tv_vdso, &tz_vdso);
+	if (vgtod)
+		ret_vsys = vgtod(&tv_vsys, &tz_vsys);
+	if (sys_gtod(&tv_sys2, &tz_sys) != 0)
+		err(1, "syscall gettimeofday");
+
+	if (vdso_gtod) {
+		if (ret_vdso == 0) {
+			nerrs += check_gtod(&tv_sys1, &tv_sys2, &tz_sys, "vDSO", &tv_vdso, &tz_vdso);
+		} else {
+			printf("[FAIL]\tvDSO gettimeofday() failed: %ld\n", ret_vdso);
+			nerrs++;
+		}
+	}
+
+	if (vgtod) {
+		if (ret_vsys == 0) {
+			nerrs += check_gtod(&tv_sys1, &tv_sys2, &tz_sys, "vsyscall", &tv_vsys, &tz_vsys);
+		} else {
+			printf("[FAIL]\tvsys gettimeofday() failed: %ld\n", ret_vsys);
+			nerrs++;
+		}
+	}
+
+	return nerrs;
+}
+
+static int test_time(void) {
+	int nerrs = 0;
+
+	printf("[RUN]\ttest time()\n");
+	long t_sys1, t_sys2, t_vdso = 0, t_vsys = 0;
+	long t2_sys1 = -1, t2_sys2 = -1, t2_vdso = -1, t2_vsys = -1;
+	t_sys1 = sys_time(&t2_sys1);
+	if (vdso_time)
+		t_vdso = vdso_time(&t2_vdso);
+	if (vtime)
+		t_vsys = vtime(&t2_vsys);
+	t_sys2 = sys_time(&t2_sys2);
+	if (t_sys1 < 0 || t_sys1 != t2_sys1 || t_sys2 < 0 || t_sys2 != t2_sys2) {
+		printf("[FAIL]\tsyscall failed (ret1:%ld output1:%ld ret2:%ld output2:%ld)\n", t_sys1, t2_sys1, t_sys2, t2_sys2);
+		nerrs++;
+		return nerrs;
+	}
+
+	if (vdso_time) {
+		if (t_vdso < 0 || t_vdso != t2_vdso) {
+			printf("[FAIL]\tvDSO failed (ret:%ld output:%ld)\n", t_vdso, t2_vdso);
+			nerrs++;
+		} else if (t_vdso < t_sys1 || t_vdso > t_sys2) {
+			printf("[FAIL]\tvDSO returned the wrong time (%ld %ld %ld)\n", t_sys1, t_vdso, t_sys2);
+			nerrs++;
+		} else {
+			printf("[OK]\tvDSO time() is okay\n");
+		}
+	}
+
+	if (vtime) {
+		if (t_vsys < 0 || t_vsys != t2_vsys) {
+			printf("[FAIL]\tvsyscall failed (ret:%ld output:%ld)\n", t_vsys, t2_vsys);
+			nerrs++;
+		} else if (t_vsys < t_sys1 || t_vsys > t_sys2) {
+			printf("[FAIL]\tvsyscall returned the wrong time (%ld %ld %ld)\n", t_sys1, t_vsys, t_sys2);
+			nerrs++;
+		} else {
+			printf("[OK]\tvsyscall time() is okay\n");
+		}
+	}
+
+	return nerrs;
+}
+
+static int test_getcpu(int cpu)
+{
+	int nerrs = 0;
+	long ret_sys, ret_vdso = -1, ret_vsys = -1;
+
+	printf("[RUN]\tgetcpu() on CPU %d\n", cpu);
+
+	cpu_set_t cpuset;
+	CPU_ZERO(&cpuset);
+	CPU_SET(cpu, &cpuset);
+	if (sched_setaffinity(0, sizeof(cpuset), &cpuset) != 0) {
+		printf("[SKIP]\tfailed to force CPU %d\n", cpu);
+		return nerrs;
+	}
+
+	unsigned cpu_sys, cpu_vdso, cpu_vsys, node_sys, node_vdso, node_vsys;
+	unsigned node = 0;
+	bool have_node = false;
+	ret_sys = sys_getcpu(&cpu_sys, &node_sys, 0);
+	if (vdso_getcpu)
+		ret_vdso = vdso_getcpu(&cpu_vdso, &node_vdso, 0);
+	if (vgetcpu)
+		ret_vsys = vgetcpu(&cpu_vsys, &node_vsys, 0);
+
+	if (ret_sys == 0) {
+		if (cpu_sys != cpu) {
+			printf("[FAIL]\tsyscall reported CPU %hu but should be %d\n", cpu_sys, cpu);
+			nerrs++;
+		}
+
+		have_node = true;
+		node = node_sys;
+	}
+
+	if (vdso_getcpu) {
+		if (ret_vdso) {
+			printf("[FAIL]\tvDSO getcpu() failed\n");
+			nerrs++;
+		} else {
+			if (!have_node) {
+				have_node = true;
+				node = node_vdso;
+			}
+
+			if (cpu_vdso != cpu) {
+				printf("[FAIL]\tvDSO reported CPU %hu but should be %d\n", cpu_vdso, cpu);
+				nerrs++;
+			} else {
+				printf("[OK]\tvDSO reported correct CPU\n");
+			}
+
+			if (node_vdso != node) {
+				printf("[FAIL]\tvDSO reported node %hu but should be %hu\n", node_vdso, node);
+				nerrs++;
+			} else {
+				printf("[OK]\tvDSO reported correct node\n");
+			}
+		}
+	}
+
+	if (vgetcpu) {
+		if (ret_vsys) {
+			printf("[FAIL]\tvsyscall getcpu() failed\n");
+			nerrs++;
+		} else {
+			if (!have_node) {
+				have_node = true;
+				node = node_vsys;
+			}
+
+			if (cpu_vsys != cpu) {
+				printf("[FAIL]\tvsyscall reported CPU %hu but should be %d\n", cpu_vsys, cpu);
+				nerrs++;
+			} else {
+				printf("[OK]\tvsyscall reported correct CPU\n");
+			}
+
+			if (node_vsys != node) {
+				printf("[FAIL]\tvsyscall reported node %hu but should be %hu\n", node_vsys, node);
+				nerrs++;
+			} else {
+				printf("[OK]\tvsyscall reported correct node\n");
+			}
+		}
+	}
+
+	return nerrs;
+}
+
+static int test_vsys_r(void)
+{
+#ifdef __x86_64__
+	printf("[RUN]\tChecking read access to the vsyscall page\n");
+	bool can_read;
+	if (sigsetjmp(jmpbuf, 1) == 0) {
+		*(volatile int *)0xffffffffff600000;
+		can_read = true;
+	} else {
+		can_read = false;
+	}
+
+	if (can_read && !should_read_vsyscall) {
+		printf("[FAIL]\tWe have read access, but we shouldn't\n");
+		return 1;
+	} else if (!can_read && should_read_vsyscall) {
+		printf("[FAIL]\tWe don't have read access, but we should\n");
+		return 1;
+	} else {
+		printf("[OK]\tgot expected result\n");
+	}
+#endif
+
+	return 0;
+}
+
+
+#ifdef __x86_64__
+#define X86_EFLAGS_TF (1UL << 8)
+static volatile sig_atomic_t num_vsyscall_traps;
+
+static unsigned long get_eflags(void)
+{
+	unsigned long eflags;
+	asm volatile ("pushfq\n\tpopq %0" : "=rm" (eflags));
+	return eflags;
+}
+
+static void set_eflags(unsigned long eflags)
+{
+	asm volatile ("pushq %0\n\tpopfq" : : "rm" (eflags) : "flags");
+}
+
+static void sigtrap(int sig, siginfo_t *info, void *ctx_void)
+{
+	ucontext_t *ctx = (ucontext_t *)ctx_void;
+	unsigned long ip = ctx->uc_mcontext.gregs[REG_RIP];
+
+	if (((ip ^ 0xffffffffff600000UL) & ~0xfffUL) == 0)
+		num_vsyscall_traps++;
+}
+
+static int test_native_vsyscall(void)
+{
+	time_t tmp;
+	bool is_native;
+
+	if (!vtime)
+		return 0;
+
+	printf("[RUN]\tchecking for native vsyscall\n");
+	sethandler(SIGTRAP, sigtrap, 0);
+	set_eflags(get_eflags() | X86_EFLAGS_TF);
+	vtime(&tmp);
+	set_eflags(get_eflags() & ~X86_EFLAGS_TF);
+
+	/*
+	 * If vsyscalls are emulated, we expect a single trap in the
+	 * vsyscall page -- the call instruction will trap with RIP
+	 * pointing to the entry point before emulation takes over.
+	 * In native mode, we expect two traps, since whatever code
+	 * the vsyscall page contains will be more than just a ret
+	 * instruction.
+	 */
+	is_native = (num_vsyscall_traps > 1);
+
+	printf("\tvsyscalls are %s (%d instructions in vsyscall page)\n",
+	       (is_native ? "native" : "emulated"),
+	       (int)num_vsyscall_traps);
+
+	return 0;
+}
+#endif
+
+int main(int argc, char **argv)
+{
+	int nerrs = 0;
+
+	init_vdso();
+	nerrs += init_vsys();
+
+	nerrs += test_gtod();
+	nerrs += test_time();
+	nerrs += test_getcpu(0);
+	nerrs += test_getcpu(1);
+
+	sethandler(SIGSEGV, sigsegv, 0);
+	nerrs += test_vsys_r();
+
+#ifdef __x86_64__
+	nerrs += test_native_vsyscall();
+#endif
+
+	return nerrs ? 1 : 0;
+}
diff --git a/virt/kvm/arm/arm.c b/virt/kvm/arm/arm.c
index 2e43f9d..08464b2 100644
--- a/virt/kvm/arm/arm.c
+++ b/virt/kvm/arm/arm.c
@@ -53,8 +53,8 @@
 __asm__(".arch_extension	virt");
 #endif
 
+DEFINE_PER_CPU(kvm_cpu_context_t, kvm_host_cpu_state);
 static DEFINE_PER_CPU(unsigned long, kvm_arm_hyp_stack_page);
-static kvm_cpu_context_t __percpu *kvm_host_cpu_state;
 
 /* Per-CPU variable containing the currently running vcpu. */
 static DEFINE_PER_CPU(struct kvm_vcpu *, kvm_arm_running_vcpu);
@@ -354,7 +354,7 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 	}
 
 	vcpu->cpu = cpu;
-	vcpu->arch.host_cpu_context = this_cpu_ptr(kvm_host_cpu_state);
+	vcpu->arch.host_cpu_context = this_cpu_ptr(&kvm_host_cpu_state);
 
 	kvm_arm_set_running_vcpu(vcpu);
 	kvm_vgic_load(vcpu);
@@ -509,7 +509,7 @@ static void update_vttbr(struct kvm *kvm)
 	pgd_phys = virt_to_phys(kvm->arch.pgd);
 	BUG_ON(pgd_phys & ~VTTBR_BADDR_MASK);
 	vmid = ((u64)(kvm->arch.vmid) << VTTBR_VMID_SHIFT) & VTTBR_VMID_MASK(kvm_vmid_bits);
-	kvm->arch.vttbr = pgd_phys | vmid;
+	kvm->arch.vttbr = kvm_phys_to_vttbr(pgd_phys) | vmid;
 
 	spin_unlock(&kvm_vmid_lock);
 }
@@ -704,9 +704,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		 */
 		trace_kvm_entry(*vcpu_pc(vcpu));
 		guest_enter_irqoff();
+		if (has_vhe())
+			kvm_arm_vhe_guest_enter();
 
 		ret = kvm_call_hyp(__kvm_vcpu_run, vcpu);
 
+		if (has_vhe())
+			kvm_arm_vhe_guest_exit();
 		vcpu->mode = OUTSIDE_GUEST_MODE;
 		vcpu->stat.exits++;
 		/*
@@ -759,6 +763,9 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		guest_exit();
 		trace_kvm_exit(ret, kvm_vcpu_trap_get_class(vcpu), *vcpu_pc(vcpu));
 
+		/* Exit types that need handling before we can be preempted */
+		handle_exit_early(vcpu, run, ret);
+
 		preempt_enable();
 
 		ret = handle_exit(vcpu, run, ret);
@@ -1158,7 +1165,7 @@ static void cpu_init_hyp_mode(void *dummy)
 	pgd_ptr = kvm_mmu_get_httbr();
 	stack_page = __this_cpu_read(kvm_arm_hyp_stack_page);
 	hyp_stack_ptr = stack_page + PAGE_SIZE;
-	vector_ptr = (unsigned long)kvm_ksym_ref(__kvm_hyp_vector);
+	vector_ptr = (unsigned long)kvm_get_hyp_vector();
 
 	__cpu_init_hyp_mode(pgd_ptr, hyp_stack_ptr, vector_ptr);
 	__cpu_init_stage2();
@@ -1272,19 +1279,8 @@ static inline void hyp_cpu_pm_exit(void)
 }
 #endif
 
-static void teardown_common_resources(void)
-{
-	free_percpu(kvm_host_cpu_state);
-}
-
 static int init_common_resources(void)
 {
-	kvm_host_cpu_state = alloc_percpu(kvm_cpu_context_t);
-	if (!kvm_host_cpu_state) {
-		kvm_err("Cannot allocate host CPU state\n");
-		return -ENOMEM;
-	}
-
 	/* set size of VMID supported by CPU */
 	kvm_vmid_bits = kvm_get_vmid_bits();
 	kvm_info("%d-bit VMID\n", kvm_vmid_bits);
@@ -1403,6 +1399,12 @@ static int init_hyp_mode(void)
 		goto out_err;
 	}
 
+	err = kvm_map_vectors();
+	if (err) {
+		kvm_err("Cannot map vectors\n");
+		goto out_err;
+	}
+
 	/*
 	 * Map the Hyp stack pages
 	 */
@@ -1420,7 +1422,7 @@ static int init_hyp_mode(void)
 	for_each_possible_cpu(cpu) {
 		kvm_cpu_context_t *cpu_ctxt;
 
-		cpu_ctxt = per_cpu_ptr(kvm_host_cpu_state, cpu);
+		cpu_ctxt = per_cpu_ptr(&kvm_host_cpu_state, cpu);
 		err = create_hyp_mappings(cpu_ctxt, cpu_ctxt + 1, PAGE_HYP);
 
 		if (err) {
@@ -1544,7 +1546,6 @@ int kvm_arch_init(void *opaque)
 	if (!in_hyp_mode)
 		teardown_hyp_mode();
 out_err:
-	teardown_common_resources();
 	return err;
 }
 
diff --git a/virt/kvm/arm/mmu.c b/virt/kvm/arm/mmu.c
index b4b69c2..f8eaf86 100644
--- a/virt/kvm/arm/mmu.c
+++ b/virt/kvm/arm/mmu.c
@@ -621,7 +621,7 @@ static int create_hyp_pud_mappings(pgd_t *pgd, unsigned long start,
 	return 0;
 }
 
-static int __create_hyp_mappings(pgd_t *pgdp,
+static int __create_hyp_mappings(pgd_t *pgdp, unsigned long ptrs_per_pgd,
 				 unsigned long start, unsigned long end,
 				 unsigned long pfn, pgprot_t prot)
 {
@@ -634,7 +634,7 @@ static int __create_hyp_mappings(pgd_t *pgdp,
 	addr = start & PAGE_MASK;
 	end = PAGE_ALIGN(end);
 	do {
-		pgd = pgdp + pgd_index(addr);
+		pgd = pgdp + ((addr >> PGDIR_SHIFT) & (ptrs_per_pgd - 1));
 
 		if (pgd_none(*pgd)) {
 			pud = pud_alloc_one(NULL, addr);
@@ -697,8 +697,8 @@ int create_hyp_mappings(void *from, void *to, pgprot_t prot)
 		int err;
 
 		phys_addr = kvm_kaddr_to_phys(from + virt_addr - start);
-		err = __create_hyp_mappings(hyp_pgd, virt_addr,
-					    virt_addr + PAGE_SIZE,
+		err = __create_hyp_mappings(hyp_pgd, PTRS_PER_PGD,
+					    virt_addr, virt_addr + PAGE_SIZE,
 					    __phys_to_pfn(phys_addr),
 					    prot);
 		if (err)
@@ -729,7 +729,7 @@ int create_hyp_io_mappings(void *from, void *to, phys_addr_t phys_addr)
 	if (!is_vmalloc_addr(from) || !is_vmalloc_addr(to - 1))
 		return -EINVAL;
 
-	return __create_hyp_mappings(hyp_pgd, start, end,
+	return __create_hyp_mappings(hyp_pgd, PTRS_PER_PGD, start, end,
 				     __phys_to_pfn(phys_addr), PAGE_HYP_DEVICE);
 }
 
@@ -1310,7 +1310,7 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 		return -EFAULT;
 	}
 
-	if (is_vm_hugetlb_page(vma) && !logging_active) {
+	if (vma_kernel_pagesize(vma) == PMD_SIZE && !logging_active) {
 		hugetlb = true;
 		gfn = (fault_ipa & PMD_MASK) >> PAGE_SHIFT;
 	} else {
@@ -1735,7 +1735,7 @@ static int kvm_map_idmap_text(pgd_t *pgd)
 	int err;
 
 	/* Create the idmap in the boot page tables */
-	err = 	__create_hyp_mappings(pgd,
+	err = 	__create_hyp_mappings(pgd, __kvm_idmap_ptrs_per_pgd(),
 				      hyp_idmap_start, hyp_idmap_end,
 				      __phys_to_pfn(hyp_idmap_start),
 				      PAGE_HYP_EXEC);
diff --git a/virt/kvm/arm/vgic/vgic-init.c b/virt/kvm/arm/vgic/vgic-init.c
index 6231012..743ca5c 100644
--- a/virt/kvm/arm/vgic/vgic-init.c
+++ b/virt/kvm/arm/vgic/vgic-init.c
@@ -285,9 +285,11 @@ int vgic_init(struct kvm *kvm)
 	if (ret)
 		goto out;
 
-	ret = vgic_v4_init(kvm);
-	if (ret)
-		goto out;
+	if (vgic_has_its(kvm)) {
+		ret = vgic_v4_init(kvm);
+		if (ret)
+			goto out;
+	}
 
 	kvm_for_each_vcpu(i, vcpu, kvm)
 		kvm_vgic_vcpu_enable(vcpu);
diff --git a/virt/kvm/arm/vgic/vgic-v4.c b/virt/kvm/arm/vgic/vgic-v4.c
index 4a37292..bc42651 100644
--- a/virt/kvm/arm/vgic/vgic-v4.c
+++ b/virt/kvm/arm/vgic/vgic-v4.c
@@ -118,7 +118,7 @@ int vgic_v4_init(struct kvm *kvm)
 	struct kvm_vcpu *vcpu;
 	int i, nr_vcpus, ret;
 
-	if (!vgic_supports_direct_msis(kvm))
+	if (!kvm_vgic_global_state.has_gicv4)
 		return 0; /* Nothing to see here... move along. */
 
 	if (dist->its_vm.vpes)